[Bugs] [Bug 1528571] New: glusterd Too many open files

Fri Dec 22 07:47:32 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1528571

            Bug ID: 1528571
           Summary: glusterd Too many open files
           Product: GlusterFS
           Version: 3.10
         Component: glusterd
          Assignee: bugs at gluster.org
          Reporter: stefan_bo at 163.com
                CC: bugs at gluster.org

Description of problem:
The max open files limit of glusterd cannot be changed.

Version-Release number of selected component (if applicable):
3.10.3 (CentOS7.3) production environment
3.12.3 (CentOS7.3) test environment

How reproducible:
After installing the glusterd service on CentOS, increase the kernel parameters
`fs.file-max` and `fs.nr_open`, and increase LimitNOFILE.
Then start the glusterd service and check /proc/pid/limits.

Steps to Reproduce:
1. yum install -y glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma
2. sysctl -w fs.file-max=4194304
3. sysctl -w fs.nr_open=2097152
4. make sure /usr/lib/systemd/system/glusterd.service sets LimitNOFILE=2097152
5. systemctl start glusterd
6. check the max open files of each gluster process with `cat /proc/pid/limits`
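
Before starting glusterd it may help to confirm that steps 2 and 3 actually took effect. The following is a minimal sketch, not part of the original report: the helper name `read_kernel_param` is hypothetical, and it assumes the usual Linux layout where `fs.file-max` is exposed as /proc/sys/fs/file-max.

```shell
#!/bin/sh
# Print the current value of a kernel parameter from a sysctl-style tree.
# The second argument defaults to /proc/sys; it can be overridden so the
# function can be exercised against a scratch directory.
read_kernel_param() {
    # Translate the dotted name into a path: fs.file-max -> fs/file-max
    path=$(printf '%s' "$1" | tr '.' '/')
    cat "${2:-/proc/sys}/$path"
}

# Show the two parameters changed in steps 2 and 3.
for name in fs.file-max fs.nr_open; do
    printf '%s = %s\n' "$name" "$(read_kernel_param "$name" 2>/dev/null)"
done
```

If the printed values differ from the ones passed to `sysctl -w`, the settings were not applied and the later steps are not testing what they appear to test.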

Actual results:
[root@gfs1 ~]# ps -ef|grep gluster
root       23602       1  0 15:13 ?        00:00:01 /usr/sbin/glusterd -p
/var/run/glusterd.pid --log-level DEBUG
root       23613       1  0 15:13 ?        00:00:01 /usr/sbin/glusterfs -s
localhost --volfile-id gluster/glustershd -p
/var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log
-S /var/run/gluster/187faae78e098adf3f277b8be93e2e7f.socket --xlator-option
*replicate*.node-uuid=9728e5d2-5135-47bb-ab7a-d186be9e804a
root       23622       1  0 15:13 ?        00:00:01 /usr/sbin/glusterfs -s
localhost --volfile-id gluster/quotad -p /var/run/gluster/quotad/quotad.pid -l
/var/log/glusterfs/quotad.log -S
/var/run/gluster/89152d51ca7da373002a84bcab99e3e2.socket --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off
root       23631       1  0 15:13 ?        00:00:00 /usr/sbin/glusterfsd -s
172.17.18.61 --volfile-id douyu-volume.172.17.18.61.opt-gluster-data -p
/var/run/gluster/vols/douyu-volume/172.17.18.61-opt-gluster-data.pid -S
/var/run/gluster/53e32eeabb2d21c972cfca0e3fd17e49.socket --brick-name
/opt/gluster/data -l /var/log/glusterfs/bricks/opt-gluster-data.log
--xlator-option *-posix.glusterd-uuid=9728e5d2-5135-47bb-ab7a-d186be9e804a
--brick-port 49152 --xlator-option douyu-volume-server.listen-port=49152

[root@gfs1 ~]# cat /proc/23602/limits |grep "open files"
Max open files            65536                65536                files
[root@gfs1 ~]# cat /proc/23613/limits |grep "open files"
Max open files            65536                65536                files
[root@gfs1 ~]# cat /proc/23631/limits |grep "open files"
Max open files            1048576              1048576              files
[root@gfs1 ~]# cat /proc/23622/limits |grep "open files"
Max open files            65536                65536                files

Expected results:
Max open files            2097152              2097152             files

Additional info:
No matter what fs.file-max and fs.nr_open are set to, the output is the same:
1. glusterd max open files is always 65536
2. glusterfs max open files is always 65536
3. glusterfsd max open files is always 1048576
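
The per-process checks above can be collected in one pass. This is a sketch under stated assumptions: the function names are hypothetical, `pgrep` (procps) is assumed to be installed, and the awk field positions assume the standard /proc/pid/limits layout shown in the actual results.

```shell
#!/bin/sh
# Print the soft and hard "Max open files" values from a file in the
# /proc/pid/limits format.
max_open_files() {
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Print pid, command name, soft limit and hard limit for every running
# process whose name starts with "gluster". Prints nothing if none is running.
report_gluster_limits() {
    for pid in $(pgrep '^gluster'); do
        printf '%s %s %s\n' "$pid" \
            "$(cat "/proc/$pid/comm" 2>/dev/null)" \
            "$(max_open_files "/proc/$pid/limits" 2>/dev/null)"
    done
}

report_gluster_limits
```

Running it before and after changing LimitNOFILE makes it easy to see whether any of the three daemons picked up the new value.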

My actual problem is that self-heal on one gluster brick opens a huge number of
file descriptors and does not release them.
/proc/pid/fd/ for the brick process holds almost 1048576 descriptors.
The brick data path contains roughly 3,000,000 or more files.
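
To watch the descriptor count grow during self-heal, the entries under /proc/pid/fd can be counted periodically. A minimal sketch; `count_open_fds` is a hypothetical helper name, and reading another user's /proc/pid/fd is assumed to be done as root.

```shell
#!/bin/sh
# Count the entries in a directory laid out like /proc/pid/fd.
# Each entry corresponds to one open file descriptor of that process.
count_open_fds() {
    ls -A "$1" 2>/dev/null | wc -l
}

# Example: descriptors currently held by this shell itself.
count_open_fds "/proc/$$/fd"
```

For the brick in this report the argument would be /proc/23631/fd; a count that keeps climbing toward the 1048576 limit without dropping would match the leak described above.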

Eventually the brick returns `Too many open files`, until the health check
process cannot open the health_check file; the last health check then fails and
the brick goes down. `/var/log/glusterfs/bricks/{data_path}.log` shows:
[2017-12-22 05:39:00.399010] E [MSGID: 115056]
[server-rpc-fops.c:627:server_readdir_cbk] 0-douyu-volume-server: 544076318:
READDIR -2 (bf4ac1aa-4b4c-48e7-9f57-a3b9e848ffce), client:
Dy-JXQ-4-18-2731592-2017/08/15-22:13:09:109245-douyu-volume-client-3-0-30,
error-xlator: douyu-volume-posix [Too many open files]
[2017-12-22 05:39:00.425369] W [MSGID: 113006] [posix.c:6368:posix_do_readdir]
0-douyu-volume-posix: pfd is NULL, fd=0x7fde7384f3e0 [Operation not permitted]
[2017-12-22 05:39:00.425428] E [MSGID: 115056]
[server-rpc-fops.c:627:server_readdir_cbk] 0-douyu-volume-server: 544076359:
READDIR -2 (bf4ac1aa-4b4c-48e7-9f57-a3b9e848ffce), client:
Dy-JXQ-4-18-2731592-2017/08/15-22:13:09:109245-douyu-volume-client-3-0-30,
error-xlator: douyu-volume-posix [Too many open files]
[2017-12-22 05:39:02.436663] W [MSGID: 113075]
[posix-helpers.c:1777:posix_fs_health_check] 0-douyu-volume-posix: open() on
/home/gluster/data_36/.glusterfs/health_check returned [Too many open files]

But no matter how I change the kernel parameters, the glusterd max open files
limit always stays the same.
