[Bugs] [Bug 1528571] New: glusterd Too many open files

bugzilla at redhat.com bugzilla at redhat.com
Fri Dec 22 07:47:32 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1528571

            Bug ID: 1528571
           Summary: glusterd Too many open files
           Product: GlusterFS
           Version: 3.10
         Component: glusterd
          Assignee: bugs at gluster.org
          Reporter: stefan_bo at 163.com
                CC: bugs at gluster.org



Description of problem:
The maximum number of open files for glusterd cannot be changed.

Version-Release number of selected component (if applicable):
3.10.3 (CentOS7.3) production environment
3.12.3 (CentOS7.3) test environment

How reproducible:
After installing the glusterd service on CentOS, increase the kernel parameters
`fs.file-max` and `fs.nr_open`, increase LimitNOFILE, start the glusterd
service, and check /proc/pid/limits.

Steps to Reproduce:
1. yum install -y glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma
2. sysctl -w fs.file-max=4194304
3. sysctl -w fs.nr_open=2097152
4. make sure /usr/lib/systemd/system/glusterd.service contains LimitNOFILE=2097152
5. systemctl start glusterd
6. check the max open files of each gluster process with `cat /proc/pid/limits`
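
Step 6 can also be scripted. The following is a minimal Python sketch,
assuming the standard Linux layout of /proc/pid/limits (row name, soft limit,
hard limit, units). The function names `parse_max_open_files` and
`read_max_open_files` are hypothetical and not part of GlusterFS.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Each value is an int, or the string 'unlimited'.
    Returns None if the row is not present.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len(prefix):].split()
            soft, hard = fields[0], fields[1]
            convert = lambda v: v if v == "unlimited" else int(v)
            return convert(soft), convert(hard)
    return None


def read_max_open_files(pid):
    """Read and parse the limits file of a running process (hypothetical helper)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())
```

Calling `read_max_open_files` once per gluster PID gives the soft and hard
limits as a pair, which is easier to compare against LimitNOFILE than the raw
`cat` output.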

Actual results:
[root at gfs1 ~]# ps -ef|grep gluster
root       23602       1  0 15:13 ?        00:00:01 /usr/sbin/glusterd -p
/var/run/glusterd.pid --log-level DEBUG
root       23613       1  0 15:13 ?        00:00:01 /usr/sbin/glusterfs -s
localhost --volfile-id gluster/glustershd -p
/var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log
-S /var/run/gluster/187faae78e098adf3f277b8be93e2e7f.socket --xlator-option
*replicate*.node-uuid=9728e5d2-5135-47bb-ab7a-d186be9e804a
root       23622       1  0 15:13 ?        00:00:01 /usr/sbin/glusterfs -s
localhost --volfile-id gluster/quotad -p /var/run/gluster/quotad/quotad.pid -l
/var/log/glusterfs/quotad.log -S
/var/run/gluster/89152d51ca7da373002a84bcab99e3e2.socket --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off
root       23631       1  0 15:13 ?        00:00:00 /usr/sbin/glusterfsd -s
172.17.18.61 --volfile-id douyu-volume.172.17.18.61.opt-gluster-data -p
/var/run/gluster/vols/douyu-volume/172.17.18.61-opt-gluster-data.pid -S
/var/run/gluster/53e32eeabb2d21c972cfca0e3fd17e49.socket --brick-name
/opt/gluster/data -l /var/log/glusterfs/bricks/opt-gluster-data.log
--xlator-option *-posix.glusterd-uuid=9728e5d2-5135-47bb-ab7a-d186be9e804a
--brick-port 49152 --xlator-option douyu-volume-server.listen-port=49152

[root at gfs1 ~]# cat /proc/23602/limits |grep "open files"
Max open files            65536                65536                files     
[root at gfs1 ~]# cat /proc/23613/limits |grep "open files"
Max open files            65536                65536                files     
[root at gfs1 ~]# cat /proc/23631/limits |grep "open files"
Max open files            1048576              1048576              files     
[root at gfs1 ~]# cat /proc/23622/limits |grep "open files"
Max open files            65536                65536                files

Expected results:
Max open files            2097152              2097152             files

Additional info:
No matter what fs.file-max and fs.nr_open are set to, the output is the same:
1. glusterd max open files is always 65536
2. glusterfs max open files is always 65536
3. glusterfsd max open files is always 1048576


My problem is that self-heal on one gluster brick opens a huge number of file
descriptors and does not release them.
I can see that /proc/pid/fd/ holds almost 1048576 fds.
The brick data path has more than 3,000,000 files.
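
To watch how close a brick process is to its limit, the entries in
/proc/pid/fd can be counted and compared with the soft limit. This is a minimal
Python sketch, not a GlusterFS tool. The names `count_open_fds` and
`fd_headroom` and the `proc_root` parameter are assumptions made for
illustration. Reading another process's fd directory normally requires root or
the same user.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries under <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_headroom(open_fds, soft_limit):
    """Descriptors still available before the soft limit is reached."""
    return soft_limit - open_fds
```

A headroom close to zero for the glusterfsd brick process would match the
symptom described above.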

Eventually `Too many open files` errors appear, until the health check process
cannot open the health_check file; after the last health check fails, the brick
goes down. `/var/log/glusterfs/bricks/{data_path}.log` shows:
[2017-12-22 05:39:00.399010] E [MSGID: 115056]
[server-rpc-fops.c:627:server_readdir_cbk] 0-douyu-volume-server: 544076318:
READDIR -2 (bf4ac1aa-4b4c-48e7-9f57-a3b9e848ffce), client:
Dy-JXQ-4-18-2731592-2017/08/15-22:13:09:109245-douyu-volume-client-3-0-30,
error-xlator: douyu-volume-posix [Too many open files]
[2017-12-22 05:39:00.425369] W [MSGID: 113006] [posix.c:6368:posix_do_readdir]
0-douyu-volume-posix: pfd is NULL, fd=0x7fde7384f3e0 [Operation not permitted]
[2017-12-22 05:39:00.425428] E [MSGID: 115056]
[server-rpc-fops.c:627:server_readdir_cbk] 0-douyu-volume-server: 544076359:
READDIR -2 (bf4ac1aa-4b4c-48e7-9f57-a3b9e848ffce), client:
Dy-JXQ-4-18-2731592-2017/08/15-22:13:09:109245-douyu-volume-client-3-0-30,
error-xlator: douyu-volume-posix [Too many open files]
[2017-12-22 05:39:02.436663] W [MSGID: 113075]
[posix-helpers.c:1777:posix_fs_health_check] 0-douyu-volume-posix: open() on
/home/gluster/data_36/.glusterfs/health_check returned [Too many open files]
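
To see how often the brick hits the limit, the log can be scanned for entries
that end in `[Too many open files]`. This is a minimal Python sketch, assuming
every entry starts with a bracketed timestamp as in the excerpt above. The
names `split_entries` and `count_too_many_open_files` are hypothetical. Lines
that were wrapped, as in this mail, are joined back into one entry.

```python
import re

# An entry starts with a timestamp such as [2017-12-22 05:39:00.399010]
ENTRY_START = re.compile(
    r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]", re.MULTILINE
)


def split_entries(log_text):
    """Split log text into entries, joining wrapped continuation lines."""
    starts = [m.start() for m in ENTRY_START.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    # Collapse all whitespace (including line breaks) inside each entry
    return [" ".join(log_text[b:e].split()) for b, e in zip(starts, ends)]


def count_too_many_open_files(log_text):
    """Count entries that report 'Too many open files'."""
    return sum("Too many open files" in entry for entry in split_entries(log_text))
```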

But no matter how I change the system kernel parameters, the glusterd open
files limit is always the same.


