[Gluster-users] Gluster eating up a lot of ram

Tue Jul 30 00:13:40 UTC 2019

Unfortunately, the statedump command crashes on both machines, even right after a reboot.

[root@ysmha01 ~]# gluster --print-statedumpdir
/var/run/gluster
[root@ysmha01 ~]# gluster v statedump export
Segmentation fault (core dumped)

[root@ysmha02 ~]# uptime
 20:12:20 up 6 min,  1 user,  load average: 0.72, 0.52, 0.24
[root@ysmha02 ~]# gluster --print-statedumpdir
/var/run/gluster
[root@ysmha02 ~]# gluster v statedump export
Segmentation fault (core dumped)
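
Note that the statedump page linked in [1] below also documents a route for client (fuse mount) processes that does not go through the gluster CLI: sending SIGUSR1 to the glusterfs process, after which the dump should land in the directory printed by --print-statedumpdir. Whether that avoids the crash seen here is untested. A minimal Python sketch, assuming the PID of the fuse client is already known; request_statedump and its send parameter are illustrative names, not part of Gluster:

```python
import os
import signal


def request_statedump(pid, send=os.kill):
    """Ask a glusterfs client process to write a statedump.

    Sends SIGUSR1, the signal the Gluster statedump docs describe for
    client (mount) processes. `send` is injectable (an assumption made
    for this sketch) so the call can be exercised without signalling a
    real process.
    """
    if pid <= 0:
        # kill() treats 0 and negative PIDs as process groups; refuse them.
        raise ValueError("pid must be a positive integer")
    send(pid, signal.SIGUSR1)
    return pid


# Example (run as root on the server holding the fuse mount; the PID is
# whatever top or ps reports for the large glusterfs process):
#   request_statedump(9634)
```

Both the fuse client and the self-heal daemon run as glusterfs, so the PID has to be picked by hand; in the top output quoted further down, the fuse client is the process holding 47.2g resident.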

I rebooted today after 40 days of uptime. Gluster was using just under 40GB
of the 64GB of RAM.
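
Because the growth takes weeks to show, one way to know when memory is in the tens of GBs is to log the resident set size of the fuse client periodically, for example from cron. A minimal Python sketch, assuming Linux /proc and a PID supplied by hand; the function names and the log format are made up for illustration:

```python
import time


def read_rss_kib(status_text):
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    49492787 kB".
            return int(line.split()[1])
    raise ValueError("VmRSS not found")


def sample_rss_kib(pid):
    """Read the current resident set size of a live process from /proc."""
    with open(f"/proc/{pid}/status") as fh:
        return read_rss_kib(fh.read())


def log_line(pid, rss_kib, now=None):
    """Format one log record; `now` is seconds since the epoch (default: current time)."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return f"{stamp} pid={pid} rss_gib={rss_kib / 1048576:.1f}"


# Example, appending one record per run:
#   with open("/var/log/glusterfs-rss.log", "a") as out:
#       out.write(log_line(9634, sample_rss_kib(9634)) + "\n")
```

The log path and PID in the example are placeholders; the resulting series shows how fast the process grows between reboots.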

What would you recommend as the next step?

Diego

On Mon, Mar 4, 2019 at 5:07 AM Poornima Gurusiddaiah <pgurusid at redhat.com>
wrote:

> Could you also provide a statedump of the gluster process consuming 44G
> of RAM [1]? Please make sure the statedump is taken when the memory
> consumption is very high, in the tens of GBs; otherwise we may not be able
> to identify the issue. Also, I see that the cache size is 10G. Is that
> something you arrived at after doing some tests? It's relatively higher
> than normal.
>
> [1]
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/#generate-a-statedump
>
> On Mon, Mar 4, 2019 at 12:23 AM Diego Remolina <dijuremo at gmail.com> wrote:
>
>> Hi,
>>
>> I will not be able to test gluster-6rc because this is a production
>> environment and it takes several days for memory to grow a lot.
>>
>> The Samba server hosts all types of files, small and large, from small
>> roaming-profile files to bigger files such as Adobe suite and Autodesk
>> Revit files (file sizes in the hundreds of megabytes).
>>
>> As I stated before, this same issue was present back with 3.8.x which I
>> was running before.
>>
>> The information you requested:
>>
>> [root@ysmha02 ~]# gluster v info export
>>
>> Volume Name: export
>> Type: Replicate
>> Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.0.1.7:/bricks/hdds/brick
>> Brick2: 10.0.1.6:/bricks/hdds/brick
>> Options Reconfigured:
>> performance.stat-prefetch: on
>> performance.cache-min-file-size: 0
>> network.inode-lru-limit: 65536
>> performance.cache-invalidation: on
>> features.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> features.cache-invalidation-timeout: 600
>> performance.cache-samba-metadata: on
>> transport.address-family: inet
>> server.allow-insecure: on
>> performance.cache-size: 10GB
>> cluster.server-quorum-type: server
>> nfs.disable: on
>> performance.io-thread-count: 64
>> performance.io-cache: on
>> cluster.lookup-optimize: on
>> cluster.readdir-optimize: on
>> server.event-threads: 5
>> client.event-threads: 5
>> performance.cache-max-file-size: 256MB
>> diagnostics.client-log-level: INFO
>> diagnostics.brick-log-level: INFO
>> cluster.server-quorum-ratio: 51%
>>
>>
>> On Fri, Mar 1, 2019 at 11:07 PM Poornima Gurusiddaiah <
>> pgurusid at redhat.com> wrote:
>>
>>> This high memory consumption is not normal. It looks like a memory
>>> leak. Is it possible to try it on a test setup with gluster-6rc? What
>>> kind of workload goes into the fuse mount? Large files or small files? We
>>> need the following information to debug further:
>>> - Gluster volume info output
>>> - Statedump of the Gluster fuse mount process consuming 44G of RAM.
>>>
>>> Regards,
>>> Poornima
>>>
>>>
>>> On Sat, Mar 2, 2019, 3:40 AM Diego Remolina <dijuremo at gmail.com> wrote:
>>>
>>>> I am using glusterfs with two servers as a file server, sharing files
>>>> via samba and ctdb. I cannot use the samba vfs gluster plugin due to a
>>>> bug in the current CentOS version of samba, so I am mounting via fuse
>>>> and exporting the volume to samba from the mount point.
>>>>
>>>> Upon initial boot, memory use on the server where samba is exporting
>>>> files climbs to ~10GB of RAM within a couple of hours of use. From then
>>>> on, there is a constant slow memory increase. In the past, with gluster
>>>> 3.8.x, we had to reboot the servers at around 30 days. With gluster 4.1.6
>>>> we are getting up to 48 days, but RAM use is at 48GB out of 64GB. Is this
>>>> normal?
>>>>
>>>> The specific versions are below:
>>>>
>>>> [root@ysmha01 home]# uptime
>>>> 16:59:39 up 48 days,  9:56,  1 user,  load average: 3.75, 3.17, 3.00
>>>> [root@ysmha01 home]# rpm -qa | grep gluster
>>>> centos-release-gluster41-1.0-3.el7.centos.noarch
>>>> glusterfs-server-4.1.6-1.el7.x86_64
>>>> glusterfs-api-4.1.6-1.el7.x86_64
>>>> centos-release-gluster-legacy-4.0-2.el7.centos.noarch
>>>> glusterfs-4.1.6-1.el7.x86_64
>>>> glusterfs-client-xlators-4.1.6-1.el7.x86_64
>>>> libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.8.x86_64
>>>> glusterfs-fuse-4.1.6-1.el7.x86_64
>>>> glusterfs-libs-4.1.6-1.el7.x86_64
>>>> glusterfs-rdma-4.1.6-1.el7.x86_64
>>>> glusterfs-cli-4.1.6-1.el7.x86_64
>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>> [root@ysmha01 home]# rpm -qa | grep samba
>>>> samba-common-tools-4.8.3-4.el7.x86_64
>>>> samba-client-libs-4.8.3-4.el7.x86_64
>>>> samba-libs-4.8.3-4.el7.x86_64
>>>> samba-4.8.3-4.el7.x86_64
>>>> samba-common-libs-4.8.3-4.el7.x86_64
>>>> samba-common-4.8.3-4.el7.noarch
>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>> [root@ysmha01 home]# cat /etc/redhat-release
>>>> CentOS Linux release 7.6.1810 (Core)
>>>>
>>>> RAM view using top
>>>> Tasks: 398 total,   1 running, 397 sleeping,   0 stopped,   0 zombie
>>>> %Cpu(s):  7.0 us,  9.3 sy,  1.7 ni, 71.6 id,  9.7 wa,  0.0 hi,  0.8 si,  0.0 st
>>>> KiB Mem : 65772000 total,  1851344 free, 60487404 used,  3433252 buff/cache
>>>> KiB Swap:        0 total,        0 free,        0 used.  3134316 avail Mem
>>>>
>>>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>>>>  9953 root      20   0 3727912 946496   3196 S 150.2  1.4  38626:27 glusterfsd
>>>>  9634 root      20   0   48.1g  47.2g   3184 S  96.3 75.3  29513:55 glusterfs
>>>> 14485 root      20   0 3404140  63780   2052 S  80.7  0.1   1590:13 glusterfs
>>>> [root@ysmha01 ~]# gluster v status export
>>>> Status of volume: export
>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
>>>> Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
>>>> Self-heal Daemon on localhost               N/A       N/A        Y       14485
>>>> Self-heal Daemon on 10.0.1.7                N/A       N/A        Y       21934
>>>> Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598
>>>>
>>>> Task Status of Volume export
>>>> ------------------------------------------------------------------------------
>>>> There are no active volume tasks
>>>>