[Gluster-users] EBADFD with large number of concurrent files

Tue Aug 7 09:18:24 UTC 2012

Hi Brian,

Can you please provide the client (mount) log files?

Also, if possible, can you take state dumps and attach them to the mail:
"gluster volume statedump <volname>"
The output will be files in /tmp/<brick-path>.<pid>.dump.x

with regards,
Shishir

----- Original Message -----
From: "Brian Candler" <B.Candler at pobox.com>
To: gluster-users at gluster.org
Sent: Tuesday, August 7, 2012 3:13:24 AM
Subject: [Gluster-users] EBADFD with large number of concurrent files

I have an application with 48 processes, each of which opens
1000 files (different files for all 48 processes).  They are opened onto a
distributed gluster volume, distributed between two nodes.

It works initially, but after a while, some of the processes abort. perror
prints "File descriptor in bad state" (I think this means EBADFD)
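
One way to confirm which errno a perror string corresponds to is to search
the platform's error table for the message text. The following is a minimal
sketch, not part of the application; the helper name errno_names_for_message
is made up for illustration, and the result depends on the C library in use.

```python
import errno
import os


def errno_names_for_message(message):
    """Return the symbolic errno names whose strerror() text equals message."""
    matches = []
    for code, name in sorted(errno.errorcode.items()):
        if os.strerror(code) == message:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # The string reported by perror in the failing processes.
    print(errno_names_for_message("File descriptor in bad state"))
```

On a Linux system with glibc this should list EBADFD rather than EBADF,
whose message is "Bad file descriptor".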

This is with glusterfs 3.3.0 under Ubuntu 12.04 (both the storage nodes and
the application servers)

Looking on the two backend bricks, each has two glusterfsd processes.  On
both bricks, the one with the lower pid has 24168 open FDs
(ls /proc/<pid>/fd | wc -l), and also 1.5-2.5GB of RSS.  So it's pretty
clear that glusterfsd keeps one open file handle per file opened by the
client. That's pretty reasonable.
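
The same count can be taken repeatedly for several pids to watch how it
changes over time. This is a rough sketch assuming a Linux-style /proc
layout; the function names and the proc_root parameter are illustrative
only, and reading another user's process normally requires root.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd, like `ls /proc/<pid>/fd | wc -l`."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_counts(pids, proc_root="/proc"):
    """Map each pid to its open-FD count, or None if it cannot be read."""
    counts = {}
    for pid in pids:
        try:
            counts[pid] = count_open_fds(pid, proc_root)
        except (FileNotFoundError, PermissionError):
            # The process has exited or we lack permission to inspect it.
            counts[pid] = None
    return counts


if __name__ == "__main__":
    # Example: inspect the current process only.
    print(fd_counts([os.getpid()]))
```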

I don't think I'm hitting a system limit for this:

# cat /proc/sys/fs/file-max
808870

and it's clearly working for the first few minutes.  So I wonder if anyone
has any other suggestions for why EBADFD is getting returned after a while?
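
Note that /proc/sys/fs/file-max is the system-wide ceiling; Linux also
enforces a separate per-process limit, shown on the "Max open files" line of
/proc/<pid>/limits. Whether that limit matters here is not established; the
sketch below only shows how it could be read for a given glusterfsd pid. The
function names are illustrative.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' line of /proc/<pid>/limits.

    Values are returned as strings because they may be 'unlimited'.
    Returns None if the line is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Expected fields: Max open files <soft> <hard> files
            fields = line.split()
            return fields[3], fields[4]
    return None


def max_open_files_for_pid(pid, proc_root="/proc"):
    """Read and parse the limits file of a running process."""
    with open(f"{proc_root}/{pid}/limits") as f:
        return parse_max_open_files(f.read())


if __name__ == "__main__":
    import os
    print(max_open_files_for_pid(os.getpid()))
```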

Thanks,

Brian.