[Bugs] [Bug 1528571] glusterd Too many open files

Thu Dec 28 09:01:21 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1528571


--- Comment #4 from stefan_bo <stefan_bo at 163.com> ---
(In reply to Mohit Agrawal from comment #3)
> Hi,
> 
>  Currently, we don't provide any option to configure RLIMIT_NOFILE for any
> gluster daemon but to control the same I can provide some workaround to you.
>  1) Can you please confirm have you configured any limit for
> cluster.shd-max-threads, by default the value 
>     of this option is 1. To control the no. of file limits usage for shd u
> can reduce the value of this option.

cluster.shd-max-threads is at its default of 1.
No matter whether its value is high or low, the nofile limit stays the same: 65536.


>  2) You can kill the shd process and from command-line, you can start the
> process with the same argument as it(glustershd)is showing before killing
> the process.At the time of start new shd process, you can tune the value of
> RLIMIT_NOFILE through ulimit or systemd, I think it should work.
> 
> 
> Regards
> Mohit Agrawal


I did what you suggested:

[root@gfs1 ~]# ulimit -a|grep "open files"
open files                      (-n) 2097100

The glusterfs daemon picks up the ulimit I set:

[root@gfs1 ~]# cat /proc/$(ps -ef|grep "/usr/sbin/glusterfs "|grep -v grep |awk '{print $2}')/limits|grep "open files"
Max open files            2097100              2097100              files 

The glusterfsd daemon does not pick up the ulimit I set:
[root@gfs1 ~]# cat /proc/$(ps -ef|grep "/usr/sbin/glusterfsd "|grep -v grep |awk '{print $2}')/limits|grep "open files"
Max open files            1048576              1048576              files 

It is always 1048576 and cannot be raised any further.
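
The ps|grep|awk pipelines above produce a broken path when more than one
process matches. A minimal sketch of the same check that handles several
PIDs, assuming Linux procfs and pgrep (from procps) are available; the names
parse_nofile and nofile_by_name are hypothetical helpers, not part of gluster:

```shell
# Read /proc/<pid>/limits text on stdin and print the soft and hard
# "Max open files" values, separated by a space.
parse_nofile() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Print "PID soft hard" for every running process whose name matches
# the first argument exactly.
nofile_by_name() {
    local pid
    for pid in $(pgrep -x "$1"); do
        # The process may have exited between pgrep and this read.
        [ -r "/proc/$pid/limits" ] || continue
        printf '%s %s\n' "$pid" "$(parse_nofile < "/proc/$pid/limits")"
    done
}
```

Running nofile_by_name glusterfsd and nofile_by_name glusterfs would then
list the limits of each brick and client process on its own line.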


My problem is with the glusterfsd daemon: after a while, the brick log
/var/log/glusterfs/bricks/home-gluster-data_36.log reports "[Too many open
files]".
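
To see how close a brick process is to its limit before the error appears,
one option is to count the entries under /proc/<pid>/fd, which holds one
entry per open descriptor. A rough sketch, assuming Linux procfs; count_fds
is a hypothetical helper name:

```shell
# Count the entries directly inside a directory such as /proc/<pid>/fd.
# Prints 0 if the directory is missing or unreadable.
count_fds() {
    find "$1" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l | tr -d ' '
}
```

For a brick with PID $pid, count_fds "/proc/$pid/fd" gives the current number
of open descriptors, which can be compared with the "Max open files" value in
/proc/$pid/limits. Reading another user's fd directory normally requires root.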

The systemd status output shows:
Dec 24 15:36:55 gfs home-gluster-data_36[181536]: [2017-12-24 07:36:55.579134] M [MSGID: 113075] [posix-helpers.c:1837:posix_health_check_thread_proc] 0-volume-posix: health-check failed, going down
Dec 24 15:37:25 gfs home-gluster-data_36[181536]: [2017-12-24 07:37:25.579509] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-volume-posix: still alive! -> SIGTERM

gluster volume status shows:
[root@gfs1 ~]# gluster volume status
Status of volume: volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.11.0.33:/home/gluster/data_33      49152     0          Y       228792
Brick 10.11.0.34:/home/gluster/data_34      49152     0          Y       231688
Brick 10.11.0.35:/home/gluster/data_35      49152     0          Y       229844
Brick 10.11.0.36:/home/gluster/data_36      N/A       N/A        N       N/A  
Brick 10.11.0.37:/home/gluster/data_37      49152     0          Y       440437
Brick 10.11.0.38:/home/gluster/data_38      49152     0          Y       349833
Brick 10.11.0.39:/home/gluster/data_39      49152     0          Y       482584
Brick 10.11.0.40:/home/gluster/data_40      49152     0          Y       483945
....
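
With many bricks, the offline ones can be picked out of this output
mechanically. A small sketch, assuming each brick occupies a single line with
the columns shown above (long brick paths may wrap and would need extra
handling); offline_bricks is a hypothetical helper name:

```shell
# Read `gluster volume status` output on stdin and print the host:path
# of every brick whose Online column is "N".
offline_bricks() {
    awk '$1 == "Brick" && $5 == "N" { print $2 }'
}
```

Used as gluster volume status | offline_bricks, this would print
10.11.0.36:/home/gluster/data_36 for the rows listed above.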
