[Bugs] [Bug 1636798] New: Memory consumption of Glusterfs-fuse client mount point process keeps on increasing if processes which have opened files get killed abruptly
bugzilla at redhat.com
Sun Oct 7 17:33:38 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1636798
Bug ID: 1636798
Summary: Memory consumption of Glusterfs-fuse client mount
point process keeps on increasing if processes
which have opened files get killed abruptly
Product: GlusterFS
Version: 3.12
Component: fuse
Severity: high
Assignee: bugs at gluster.org
Reporter: vikas.1.rana at nokia.com
CC: bugs at gluster.org
Created attachment 1491385
--> https://bugzilla.redhat.com/attachment.cgi?id=1491385&action=edit
glusterfs statedump before running the test script
Description of problem: Memory consumption of the Glusterfs-fuse client mount
process keeps increasing if any Linux process that has opened files for write
operations is killed abruptly before closing them.
Version-Release number of selected component (if applicable): glusterfs 3.13.2
How reproducible:
This behavior is seen every time.
Steps to Reproduce:
1. Run a test script which spawns a child process that opens 10-15 files on the
fuse mount in write mode.
2. The parent test script kills the child process abruptly with SIGKILL before
the files opened in write mode are closed.
3. Repeat this in a loop for about 30 minutes. The memory consumption of the
glusterfs fuse mount point process keeps increasing until the test script is
killed. (A minimal sketch of such a script is shown after this list.)
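Below is a minimal sketch of such a reproduction script, assuming the volume is
fuse-mounted at /mnt/pmdb as in this report; the file names, file count, and
timings are illustrative and not taken from the original test script.

#!/usr/bin/env python3
# Reproduction sketch: a child process opens several files for writing on the
# fuse mount and is SIGKILLed by the parent before it closes them, in a loop.
import os
import signal
import time

MOUNT = "/mnt/pmdb"        # fuse mount point of the pmdb volume (from this report)
NUM_FILES = 15             # files opened in write mode by each child
RUN_SECONDS = 30 * 60      # keep reproducing for about 30 minutes

def child_open_files():
    # Open the files in write mode and keep them open until the parent kills us.
    handles = [open(os.path.join(MOUNT, "leak_test_%d.tmp" % i), "w")
               for i in range(NUM_FILES)]
    for h in handles:
        h.write("some data\n")
        h.flush()
    time.sleep(3600)       # never returns normally; the parent sends SIGKILL

def main():
    deadline = time.time() + RUN_SECONDS
    while time.time() < deadline:
        pid = os.fork()
        if pid == 0:
            child_open_files()
            os._exit(0)
        time.sleep(1)                  # give the child time to open the files
        os.kill(pid, signal.SIGKILL)   # kill it before the files are closed
        os.waitpid(pid, 0)             # reap the child

if __name__ == "__main__":
    main()

While the loop runs, the growth of the mount process can be tracked with the
same ps/top commands shown further below.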
Actual results:
The memory consumption of the glusterfs process keeps increasing until the test
script is killed.
Even after stopping the test script, the increased memory consumption of the
glusterfs process does not come down over time.
Expected results:
The memory consumption of the glusterfs process should not keep increasing.
Also, after the test script is stopped, the memory consumption of the process
should come back down over time.
Additional info:
gluster volume info output:
# gluster volume info pmdb
Volume Name: pmdb
Type: Replicate
Volume ID: 1906452a-6062-4ec7-91b9-caa9ba86d4cb
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 169.254.0.6:/mnt/pmdb1/brick
Brick2: 169.254.0.7:/mnt/pmdb1/brick
Options Reconfigured:
performance.client-io-threads: off
cluster.consistent-metadata: on
network.ping-timeout: 10
cluster.heal-timeout: 300
cluster.granular-entry-heal: enable
transport.address-family: inet
nfs.disable: on
gluster volume status output:
# gluster volume status pmdb
Status of volume: pmdb
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 169.254.0.6:/mnt/pmdb1/brick 49155 0 Y 12375
Brick 169.254.0.7:/mnt/pmdb1/brick 49155 0 Y 16390
Self-heal Daemon on localhost N/A N/A Y 17062
Self-heal Daemon on 169.254.0.7 N/A N/A Y 19392
Task Status of Volume pmdb
------------------------------------------------------------------------------
There are no active volume tasks
gluster volume status detail output:
# gluster volume status pmdb detail
Status of volume: pmdb
------------------------------------------------------------------------------
Brick : Brick 169.254.0.6:/mnt/pmdb1/brick
TCP Port : 49155
RDMA Port : 0
Online : Y
Pid : 12375
File System : ext4
Device : /dev/sde1
Mount Options : rw,relatime,data=ordered
Inode Size : 256
Disk Space Free : 9.8GB
Total Disk Space : 9.8GB
Inode Count : 655360
Free Inodes : 655322
------------------------------------------------------------------------------
Brick : Brick 169.254.0.7:/mnt/pmdb1/brick
TCP Port : 49155
RDMA Port : 0
Online : Y
Pid : 16390
File System : ext4
Device : /dev/sde1
Mount Options : rw,relatime,data=ordered
Inode Size : 256
Disk Space Free : 9.8GB
Total Disk Space : 9.8GB
Inode Count : 655360
Free Inodes : 655322
Attached is the glusterfs statedump taken before running the test script.
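(For reference, a client statedump like the attached one can be triggered by
sending SIGUSR1 to the glusterfs mount process, e.g. kill -USR1 19204; by
default the dump is written under /var/run/gluster.)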
Memory consumption of the process before running the test script:
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 0.0 0.1 571100 12456 ? Ssl 11:46 0:00
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process before running the test script:
top - 12:11:22 up 29 min, 3 users, load average: 0.59, 0.56, 0.47
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 2.7/2.7 5
%Cpu1 : 2.0/2.0 4
GiB Mem : 8.5/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 621.7m 12.2m 0.0 0.2 0:00.24 S glusterfs
Memory consumption of the process immediately after starting the test script:
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 0.4 0.4 638776 34560 ? Ssl 11:46 0:12
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process immediately after starting the test script:
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 26.9/22.1 49
%Cpu1 : 23.9/23.9 48
GiB Mem : 9.7/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 623.8m 75.0m 55.3 0.9 0:36.38 S glusterfs
Memory consumption of the process after the test script had been running for 10
minutes:
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 9.1 6.6 1097528 547504 ? Ssl 11:46 4:53
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process after the test script had been running for
10 minutes:
top - 12:40:46 up 59 min, 6 users, load average: 1.83, 1.86, 1.30
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 20.3/16.8 37
%Cpu1 : 28.5/15.3 44
GiB Mem : 17.8/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 1071.8m 566.6m 52.7 7.1 5:10.84 S glusterfs
Memory consumption of the process after another 10 minutes (i.e. after the test
script had been running for 20 minutes):
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 15.0 12.7 1621816 1046200 ? Ssl 11:46 9:29
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process after another 10 minutes (i.e. after the
test script had been running for 20 minutes):
top - 12:50:34 up 1:08, 6 users, load average: 2.70, 2.22, 1.71
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 22.6/14.4 37
%Cpu1 : 24.6/18.1 43
GiB Mem : 26.0/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 1583.8m 1.025g 57.3 13.1 9:46.24 S glusterfs
Memory consumption of the process after another 10 minutes (i.e. after the test
script had been running for 30 minutes):
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 19.3 19.1 2080568 1564432 ? Ssl 11:46 14:14
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process after another 10 minutes (i.e. after the
test script had been running for 30 minutes):
top - 13:00:58 up 1:19, 6 users, load average: 1.57, 1.92, 1.83
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 8.7/16.8 26
%Cpu1 : 13.1/15.9 29
GiB Mem : 34.1/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 2031.8m 1.501g 0.0 19.2 14:26.96 S glusterfs
Memory consumption of the process after the test script has been stopped:
ps -aux --sort -rss | grep "glusterfs" | grep -w pmdb
root 19204 17.8 19.2 2080568 1574200 ? Ssl 11:46 14:27
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process after the test script has been stopped:
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 25.5/57.0 83
%Cpu1 : 21.6/36.5 58
GiB Mem : 35.9/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 2031.8m 1.501g 0.0 19.2 14:34.73 S glusterfs
Memory consumption of the process more than 24 hours after the test script was
stopped:
# ps -aux --sort -rss | grep 19204
root 19204 0.3 19.2 2146104 1576576 ? Ssl Oct04 16:28
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=169.254.0.6
--volfile-server=169.254.0.7 --volfile-id=/pmdb /mnt/pmdb
Top command output for the process more than 24 hours after the test script was
stopped:
top - 23:01:14 up 3 days, 11:19, 1 user, load average: 0.83, 0.55, 0.49
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu0 : 8.2/15.0 23
%Cpu1 : 9.5/12.9 22
GiB Mem : 41.0/7.801
GiB Swap: 0.0/0.000
PID USER PR NI VIRT RES %CPU %MEM TIME+ S COMMAND
19204 root 20 0 2095.8m 1.504g 0.0 19.3 16:28.64 S glusterfs
A glusterfs statedump is available for each interval mentioned above. Since
only one file can be attached to this bugzilla report, we will try to attach
the others in a separate comment.