[Bugs] [Bug 1342902] New: glustershd process memory usage too high on CentOS 7.2

Mon Jun 6 04:57:21 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1342902

            Bug ID: 1342902
           Summary: glustershd process memory usage too high on CentOS 7.2
           Product: GlusterFS
           Version: 3.8.0
         Component: disperse
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: jianwei1216 at qq.com
                CC: bugs at gluster.org

Description of problem:
I am running GlusterFS release-3.8 on Linux kernel 3.10.0-327.el7.x86_64 on
three CentOS 7.2 hosts (node-1, node-2, node-3), each with 8 GB of RAM and two
network interface cards.
When eno1 on node-2 and node-3 is alternately brought down and up
('ifconfig eno1 down' followed by 'ifconfig eno1 up') for more than 10 hours,
the virtual memory (VIRT) usage of the glustershd process exceeds 20 GB on
every node.

Version-Release number of selected component (if applicable):
glusterfs release-3.8, with the following commit from master cherry-picked
onto release-3.8: 24dd33929bbbc9a72360793048f17bf4e6cec8a3
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
commit 24dd33929bbbc9a72360793048f17bf4e6cec8a3  --> (master)
Author: Kaleb S KEITHLEY <kkeithle at redhat.com>
Date:   Fri May 6 13:04:38 2016 -0400

    libglusterfs (timer): race conditions, illegal mem access, mem leak

    While investigating gfapi memory consumption with valgrind, valgrind
    reported several memory access issues.

    Also see the timer 'registry' being recreated (shortly) after being
    freed during teardown due to the way it's currently written.

    Passing ctx as data to gf_timer_proc() is prone to memory access
    issues if ctx is freed before gf_timer_proc() terminates. (And in
    fact this does happen, at least in valgrind.) gf_timer_proc() doesn't
    need ctx for anything, it only needs ctx->timer, so just pass that.

    Nothing ever calls gf_timer_registry_init(). Nothing outside of
    timer.c that is. Making it and gf_timer_proc() static.

    Change-Id: Ia28454dda0cf0de2fec94d76441d98c3927a906a
    BUG: 1333925
    Signed-off-by: Kaleb S KEITHLEY <kkeithle at redhat.com>
    Reviewed-on: http://review.gluster.org/14247
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Smoke: Gluster Build System <jenkins at build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Poornima G <pgurusid at redhat.com>
    Reviewed-by: Niels de Vos <ndevos at redhat.com>
    Reviewed-by: Jeff Darcy <jdarcy at redhat.com>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Linux Kernel 3.10.0-327.el7.x86_64
CentOS Linux release 7.2.1511 (Core)
16 disks (XFS) on each of the three nodes.

How reproducible:
Create a disperse test volume with 16 subvolumes of (2+1) bricks each:
test-disperse-0
test-disperse-1
...
test-disperse-15
The bricks of every disperse group are node-1:/disk, node-2:/disk, node-3:/disk.
The network cards of the nodes are configured as follows:
eno1: 10.10.21.111   10.10.21.112   10.10.21.113,   no gateway
eno2: 192.168.21.111 192.168.21.112 192.168.21.113, no gateway
The bricks are bound to eno1 (e.g. 10.10.21.111:/brick).
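
The layout described above could be produced along the following lines. This is
only a sketch, not the command actually used for this report: the brick paths
/disk1 .. /disk16 are assumptions (the report gives only "node-N:/disk"), and
the volume name "test" is taken from the mount command in the steps below. The
script only prints the command; it does not run it.

```shell
#!/bin/bash
# Sketch: print a "gluster volume create" command for a 16 x (2+1) disperse volume.
# Assumption (not from the report): brick paths are /disk1 .. /disk16 on each node.

NODES=(10.10.21.111 10.10.21.112 10.10.21.113)   # eno1 addresses from the report
DISKS=16

# Emit one brick per line, ordered so that each consecutive group of three
# bricks spans all three nodes; gluster forms disperse subvolumes from
# consecutive bricks in the order given.
brick_list() {
    local i node
    for ((i = 1; i <= DISKS; i++)); do
        for node in "${NODES[@]}"; do
            printf '%s:/disk%d\n' "$node" "$i"
        done
    done
}

# "disperse 3 redundancy 1" corresponds to (2+1): two data bricks, one redundancy.
create_command() {
    echo "gluster volume create test disperse 3 redundancy 1 $(brick_list | paste -sd ' ' -)"
}

create_command
```

With 48 bricks and a disperse count of 3, this yields 16 disperse subvolumes,
matching test-disperse-0 through test-disperse-15.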

Steps to Reproduce:
1. Create a disperse test volume (16 * (2+1)), then start it and mount it on
every node (mount.glusterfs 127.0.0.1:/test /mnt/test);
2. Run a script like the following on 10.10.21.111:
#!/bin/bash
end=$((SECONDS + 10 * 3600))    # stop after 10 hours
while [ "$SECONDS" -lt "$end" ]; do
    ssh 192.168.21.112 'ifconfig eno1 down'
    ssh 192.168.21.112 'ifconfig eno1 up'
    sleep 5
    ssh 192.168.21.113 'ifconfig eno1 down'
    ssh 192.168.21.113 'ifconfig eno1 up'
    sleep 5
done
3. Observe the memory usage of the glustershd process.
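
One possible way to record the memory usage over the 10-hour run is sketched
below. It is not part of the original test. It assumes that the self-heal
daemon can be found by matching "glustershd" in its command line, and it reads
VmSize (what top shows as VIRT) and VmRSS from /proc/<pid>/status. The function
names mem_kb and watch_shd are made up for this example.

```shell
#!/bin/bash
# Sketch: sample VIRT/RSS of the glustershd process from /proc.
# Assumption: the daemon's command line contains the string "glustershd".

# Print "<VmSize kB> <VmRSS kB>" parsed from a /proc/<pid>/status-style file.
mem_kb() {
    awk '/^VmSize:/ {v = $2} /^VmRSS:/ {r = $2} END {print v, r}' "$1"
}

# Log one sample per interval (default 60 s) until no matching process is found.
watch_shd() {
    local interval=${1:-60} pid
    while pid=$(pgrep -o -f glustershd); do
        echo "$(date '+%F %T') pid=$pid $(mem_kb "/proc/$pid/status")"
        sleep "$interval"
    done
}
```

For example, "watch_shd 300 >> shd-mem.log" on each node would append one line
every five minutes. Because pgrep -f matches full command lines, the script
file itself should not have "glustershd" in its name.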

Actual results:
With the test method above, the memory usage of the glustershd process is
consistently very high (about 20 GB), and swap space is used up.
This is abnormal; I suspect there is a memory leak in glustershd.

Expected results:
We would expect the memory usage to stay within a reasonable ceiling.

Additional info:
