[Gluster-devel] Too many open files
Brent A Nelson
brent at phys.ufl.edu
Thu Apr 5 05:09:16 UTC 2007
Awesome!
Thanks,
Brent
On Wed, 4 Apr 2007, Anand Avati wrote:
> Brent,
> thank you so much for the effort of sending the output!
> From the log it is clear that the leaked fds are all for directories. Indeed
> there was an issue with the releasedir() call not reaching all the nodes. The
> fix should be committed to tla today.
>
> Thanks!!
>
> avati
>
>
>
> On Wed, Apr 04, 2007 at 09:18:48PM -0400, Brent A Nelson wrote:
>> I avoided restarting, as this issue would take a while to reproduce.
>>
>> jupiter01 and jupiter02 are mirrors of each other. All performance
>> translators are in use, except for writebehind (due to the mtime bug).
>>
>> jupiter01:
>> ls -l /proc/26466/fd |wc
>> 65536 655408 7358168
>> See attached for ls -l output.
>>
>> jupiter02:
>> ls -l /proc/3651/fd |wc
>> ls -l /proc/3651/fd
>> total 11
>> lrwx------ 1 root root 64 2007-04-04 20:43 0 -> /dev/null
>> lrwx------ 1 root root 64 2007-04-04 20:43 1 -> /dev/null
>> lrwx------ 1 root root 64 2007-04-04 20:43 10 -> socket:[2565251]
>> lrwx------ 1 root root 64 2007-04-04 20:43 2 -> /dev/null
>> l-wx------ 1 root root 64 2007-04-04 20:43 3 ->
>> /var/log/glusterfs/glusterfsd.log
>> lrwx------ 1 root root 64 2007-04-04 20:43 4 -> socket:[2255275]
>> lrwx------ 1 root root 64 2007-04-04 20:43 5 -> socket:[2249710]
>> lr-x------ 1 root root 64 2007-04-04 20:43 6 -> eventpoll:[2249711]
>> lrwx------ 1 root root 64 2007-04-04 20:43 7 -> socket:[2255306]
>> lr-x------ 1 root root 64 2007-04-04 20:43 8 ->
>> /etc/glusterfs/glusterfs-client.vol
>> lr-x------ 1 root root 64 2007-04-04 20:43 9 ->
>> /etc/glusterfs/glusterfs-client.vol
>>
>> Note that it looks like all those extra directories listed on jupiter01
>> were locally rsynced from jupiter01's Lustre filesystems onto the
>> glusterfs client on jupiter01. A very large rsync from a different
>> machine to jupiter02 didn't go nuts.
>>
>> Thanks,
>>
>> Brent
>>
>> On Wed, 4 Apr 2007, Anand Avati wrote:
>>
>>> Brent,
>>> I hope the system is still in the same state, so we can dig some info out.
>>> To verify that it is a file descriptor leak, can you please run this
>>> test: on the server, run ps ax and get the PID of glusterfsd, then do
>>> an ls -l on /proc/<pid>/fd/ and mail the output of that. That
>>> should give a precise idea of what is happening.
>>> If the system has already been reset out of that state, please send us the
>>> spec file you are using and the commands you ran (for the major jobs,
>>> like the heavy rsync) so that we can try to reproduce the error in our
>>> setup.
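
[Editor's note: the diagnostic described above can be wrapped in a small helper. This is a sketch, not anything from the thread; the `count_fds` name is mine, and it simply wraps the same /proc listing avati asks for.]

```shell
# count_fds PID: print the number of open file descriptors for a process,
# read from /proc/<pid>/fd as in the thread. (Helper name is hypothetical.)
count_fds() {
  ls /proc/"$1"/fd | wc -l
}

# Against a running server (glusterfsd may not be running on this machine):
#   pid=$(pidof glusterfsd)       # or find it via: ps ax | grep glusterfsd
#   count_fds "$pid"
#   ls -l /proc/"$pid"/fd         # this is the listing avati asked to be mailed

# Demonstrate on the current shell instead:
count_fds $$
```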
>>>
>>> regards,
>>> avati
>>>
>>>
>>> On Wed, Apr 04, 2007 at 01:12:33PM -0400, Brent A Nelson wrote:
>>>> I put a 2-node GlusterFS mirror into use internally yesterday, as
>>>> GlusterFS was looking pretty solid, and I rsynced a whole bunch of stuff
>>>> to it. Today, however, an ls on any of the three clients gives me:
>>>>
>>>> ls: /backup: Too many open files
>>>>
>>>> It looks like glusterfsd hit a limit. Is this a bug (glusterfs/glusterfsd
>>>> forgetting to close files; essentially, a file descriptor leak), or do I
>>>> just need to increase the limit somewhere?
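
[Editor's note: for the "increase the limit somewhere" half of the question, these generic Linux commands, not glusterfs-specific, show where the per-process descriptor limit lives. The 65536 count seen on jupiter01 is consistent with a process that has exhausted a raised limit rather than the usual 1024 default, which pointed at a leak rather than a tuning problem.]

```shell
# Limits behind a "Too many open files" error on Linux:
ulimit -Sn                 # per-process soft limit on open fds
ulimit -Hn                 # per-process hard limit
cat /proc/sys/fs/file-max  # system-wide file table ceiling
# A process (or the shell that launches glusterfsd) may raise its own
# soft limit up to the hard limit:
ulimit -Sn "$(ulimit -Hn)" 2>/dev/null || true
```

Raising the limit only delays the failure if the daemon is leaking descriptors, which is what the /proc listing later in the thread confirmed.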
>>>>
>>>> Thanks,
>>>>
>>>> Brent
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at nongnu.org
>>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>>
>>>
>>> --
>>> Shaw's Principle:
>>> Build a system that even a fool can use,
>>> and only a fool will want to use it.
>>>
>
>
>
> --
> Shaw's Principle:
> Build a system that even a fool can use,
> and only a fool will want to use it.
>