[Gluster-devel] About file descriptor leak in glusterfsd daemon after network failure

Wed Aug 20 11:16:16 UTC 2014

Hi gluster-devel team,

We are running a 2-replica volume across 2 servers. One of our service
daemons opens a file in the volume and locks it with 'flock'. We can see
that each glusterfsd daemon opens the replica file on its own server
(visible in /proc/pid/fd). When we unplug the network cable of one server
for about 10 minutes and then plug it back in, we find that glusterfsd
opens a 'NEW' file descriptor while still holding the old one, which was
opened on the first file access.
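
For anyone trying to observe the same thing, here is a minimal sketch of
counting how many descriptors a process holds on one file by reading
/proc/<pid>/fd. The helper name fds_for_path is hypothetical, and it
assumes a Linux /proc filesystem and enough privileges to read the target
process's fd directory (usually root for glusterfsd).

```python
import os


def fds_for_path(pid, target):
    """Return the fd numbers of process `pid` whose /proc symlink equals `target`."""
    fd_dir = f"/proc/{pid}/fd"
    matches = []
    for name in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd may have been closed between listdir() and readlink().
            continue
        if link == target:
            matches.append(int(name))
    return sorted(matches)
```

Calling it with the glusterfsd pid and the absolute path of the replica
file on the brick (both placeholders to be filled in for a given setup)
before and after the cable pull should show whether a second descriptor
appears and whether one remains after the client exits.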

Then we stop our service daemon, but the glusterfsd on the server whose
cable was re-plugged only closes the new fd and leaves the old fd open. We
think this may be an fd leak. When we restart our service daemon, it tries
to flock the same file and the flock fails. The errno is "Resource
temporarily unavailable" (EAGAIN).
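
The failure mode can be illustrated with a small sketch. It assumes the
service takes a non-blocking exclusive lock (LOCK_EX | LOCK_NB), which the
post does not state but which would match the reported errno. The helper
name try_flock is hypothetical. A lock still held through another open
file description, such as a descriptor that was never closed, makes the
second attempt fail with EAGAIN.

```python
import errno
import fcntl
import os


def try_flock(path):
    """Open `path` and attempt a non-blocking exclusive flock.

    Returns (fd, None) when the lock is acquired, or (None, errno_value)
    when flock() fails; the descriptor is closed on failure.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        return None, exc.errno
    return fd, None
```

On a local filesystem, calling try_flock twice on the same path without
closing the first descriptor makes the second call return errno.EAGAIN.
Closing the first descriptor releases the lock and lets the next attempt
succeed.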

However, this situation does not reproduce every time, although it occurs
often. We are still looking into the source code of glusterfsd, but it is
not an easy job, so we would like to ask for some help here. Here are our
questions:

1. Has this issue been solved? Or is it a known issue?
2. Does anyone know the file descriptor maintenance logic in
glusterfsd (server side)? When is an fd closed, and when is it kept open?

Thank you very much.

-- 
Best regards,
Jaden Liang