[Gluster-users] Gluster eating up a lot of ram

Nithya Balachandran nbalacha at redhat.com
Tue Jul 30 12:43:57 UTC 2019


On Tue, 30 Jul 2019 at 16:37, Diego Remolina <dijuremo at gmail.com> wrote:

> This option is enabled. In which version was this patched? This is a
> file server, and disabling readdir-ahead would have a significant impact on
> performance.
>

This was fixed in 5.3 (https://bugzilla.redhat.com/show_bug.cgi?id=1659676).
This bug is only relevant if the gluster fuse client is the one that is
using up memory.

The first thing to do would be to determine which process is using up the
memory and to get a statedump.

Running ps against the PID (for example, ps -fp <pid>) should give you the
details of the gluster process.
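
For example, roughly along these lines (pid and volname are placeholders for
whatever your setup shows):

# identify the gluster process holding the memory
ps aux --sort=-rss | grep gluster | head -5

# if it is the fuse mount process (glusterfs client), trigger a statedump
# with SIGUSR1; the dump is written to the directory shown by
# 'gluster --print-statedumpdir'
kill -USR1 <pid-of-fuse-mount>

# if it is a brick process (glusterfsd), the CLI can be used instead
gluster volume statedump <volname>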

Regards,
Nithya



>
> [root at ysmha01 ~]# gluster v get export readdir-ahead
> Option                                  Value
> ------                                  -----
> performance.readdir-ahead               on
>
> The guide recommends enabling the setting:
>
>
> https://docs.gluster.org/en/latest/Administrator%20Guide/Accessing%20Gluster%20from%20Windows/
>
> Diego
>
>
>
> On Mon, Jul 29, 2019 at 11:52 PM Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
>>
>> Hi Diego,
>>
>> Please do the following:
>>
>> gluster v get <volname> readdir-ahead
>>
>> If this is enabled, please disable it and see if it helps. There was a
>> leak in the opendir codepath that was fixed in later releases.
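>>
>> For example, something along these lines (volume name as in your setup):
>>
>> gluster v set <volname> performance.readdir-ahead off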
>>
>> Regards,
>> Nithya
>>
>>
>> On Tue, 30 Jul 2019 at 09:04, Diego Remolina <dijuremo at gmail.com> wrote:
>>
>>> Will this kill the actual process or simply trigger the dump? Which
>>> process should I kill? The brick process on the system or the fuse mount?
>>>
>>> Diego
>>>
>>> On Mon, Jul 29, 2019, 23:27 Nithya Balachandran <nbalacha at redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, 30 Jul 2019 at 05:44, Diego Remolina <dijuremo at gmail.com>
>>>> wrote:
>>>>
>>>>> Unfortunately statedump crashes on both machines, even freshly
>>>>> rebooted.
>>>>>
>>>>
>>>> Do you see any statedump files in /var/run/gluster? This looks more
>>>> like the gluster CLI crashed.
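>>>>
>>>> For example, something like this should show any dumps that were written:
>>>>
>>>> ls -ltr /var/run/gluster/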
>>>>
>>>>>
>>>>> [root at ysmha01 ~]# gluster --print-statedumpdir
>>>>> /var/run/gluster
>>>>> [root at ysmha01 ~]# gluster v statedump export
>>>>> Segmentation fault (core dumped)
>>>>>
>>>>> [root at ysmha02 ~]# uptime
>>>>>  20:12:20 up 6 min,  1 user,  load average: 0.72, 0.52, 0.24
>>>>> [root at ysmha02 ~]# gluster --print-statedumpdir
>>>>> /var/run/gluster
>>>>> [root at ysmha02 ~]# gluster v statedump export
>>>>> Segmentation fault (core dumped)
>>>>>
>>>>> I rebooted today after 40 days of uptime. Gluster was eating up just
>>>>> shy of 40GB of RAM out of 64GB.
>>>>>
>>>>> What would you recommend to be the next step?
>>>>>
>>>>> Diego
>>>>>
>>>>> On Mon, Mar 4, 2019 at 5:07 AM Poornima Gurusiddaiah <
>>>>> pgurusid at redhat.com> wrote:
>>>>>
>>>>>> Could you also provide the statedump of the gluster process consuming
>>>>>> 44G of RAM [1]? Please make sure the statedump is taken when the memory
>>>>>> consumption is very high, in the tens of GBs, otherwise we may not be able
>>>>>> to identify the issue. Also, I see that the cache size is 10GB; is that
>>>>>> something you arrived at after doing some tests? It is higher than normal.
>>>>>>
>>>>>> [1]
>>>>>> https://docs.gluster.org/en/v3/Troubleshooting/statedump/#generate-a-statedump
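>>>>>>
>>>>>> Something like this (pid being the glusterfs fuse mount process) can be
>>>>>> used to confirm the resident memory just before taking the dump:
>>>>>>
>>>>>> ps -o pid,rss,vsz,comm -p <pid>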
>>>>>>
>>>>>> On Mon, Mar 4, 2019 at 12:23 AM Diego Remolina <dijuremo at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I will not be able to test gluster-6rc because this is a production
>>>>>>> environment and it takes several days for memory to grow a lot.
>>>>>>>
>>>>>>> The Samba server is hosting all types of files, small and large, from
>>>>>>> small roaming-profile-type files to bigger files like Adobe suite and Autodesk
>>>>>>> Revit files (file sizes in the hundreds of megabytes).
>>>>>>>
>>>>>>> As I stated before, this same issue was present back with 3.8.x,
>>>>>>> which I was running previously.
>>>>>>>
>>>>>>> The information you requested:
>>>>>>>
>>>>>>> [root at ysmha02 ~]# gluster v info export
>>>>>>>
>>>>>>> Volume Name: export
>>>>>>> Type: Replicate
>>>>>>> Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
>>>>>>> Status: Started
>>>>>>> Snapshot Count: 0
>>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>>> Transport-type: tcp
>>>>>>> Bricks:
>>>>>>> Brick1: 10.0.1.7:/bricks/hdds/brick
>>>>>>> Brick2: 10.0.1.6:/bricks/hdds/brick
>>>>>>> Options Reconfigured:
>>>>>>> performance.stat-prefetch: on
>>>>>>> performance.cache-min-file-size: 0
>>>>>>> network.inode-lru-limit: 65536
>>>>>>> performance.cache-invalidation: on
>>>>>>> features.cache-invalidation: on
>>>>>>> performance.md-cache-timeout: 600
>>>>>>> features.cache-invalidation-timeout: 600
>>>>>>> performance.cache-samba-metadata: on
>>>>>>> transport.address-family: inet
>>>>>>> server.allow-insecure: on
>>>>>>> performance.cache-size: 10GB
>>>>>>> cluster.server-quorum-type: server
>>>>>>> nfs.disable: on
>>>>>>> performance.io-thread-count: 64
>>>>>>> performance.io-cache: on
>>>>>>> cluster.lookup-optimize: on
>>>>>>> cluster.readdir-optimize: on
>>>>>>> server.event-threads: 5
>>>>>>> client.event-threads: 5
>>>>>>> performance.cache-max-file-size: 256MB
>>>>>>> diagnostics.client-log-level: INFO
>>>>>>> diagnostics.brick-log-level: INFO
>>>>>>> cluster.server-quorum-ratio: 51%
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 1, 2019 at 11:07 PM Poornima Gurusiddaiah <
>>>>>>> pgurusid at redhat.com> wrote:
>>>>>>>
>>>>>>>> This high memory consumption is not normal. It looks like a
>>>>>>>> memory leak. Is it possible to try it on a test setup with gluster-6rc? What
>>>>>>>> kind of workload goes into the fuse mount? Large files or small
>>>>>>>> files? We need the following information to debug further:
>>>>>>>> - Gluster volume info output
>>>>>>>> - Statedump of the Gluster fuse mount process consuming 44G ram.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Poornima
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Mar 2, 2019, 3:40 AM Diego Remolina <dijuremo at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I am using glusterfs with two servers as a file server, sharing
>>>>>>>>> files via samba and ctdb. I cannot use the samba vfs gluster plugin due to a
>>>>>>>>> bug in the current CentOS version of samba, so I am mounting via fuse and
>>>>>>>>> exporting the volume to samba from the mount point.
>>>>>>>>>
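>>>>>>>>> For reference, the setup is roughly along these lines (mount point and
>>>>>>>>> share name here are illustrative, not the exact production config):
>>>>>>>>>
>>>>>>>>> # fuse mount of the gluster volume on each samba/ctdb node
>>>>>>>>> mount -t glusterfs 10.0.1.6:/export /export
>>>>>>>>>
>>>>>>>>> # smb.conf share pointing at the fuse mount point instead of
>>>>>>>>> # using vfs objects = glusterfs
>>>>>>>>> [export]
>>>>>>>>>     path = /export
>>>>>>>>>     read only = no
>>>>>>>>>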
>>>>>>>>> Upon initial boot, the server where samba is exporting files
>>>>>>>>> climbs to ~10GB of RAM within a couple of hours of use. From then on, there is
>>>>>>>>> a constant slow memory increase. In the past, with gluster 3.8.x, we had to
>>>>>>>>> reboot the servers at around 30 days. With gluster 4.1.6 we are getting up
>>>>>>>>> to 48 days, but RAM use is at 48GB out of 64GB. Is this normal?
>>>>>>>>>
>>>>>>>>> The particular versions are below,
>>>>>>>>>
>>>>>>>>> [root at ysmha01 home]# uptime
>>>>>>>>> 16:59:39 up 48 days,  9:56,  1 user,  load average: 3.75, 3.17, 3.00
>>>>>>>>> [root at ysmha01 home]# rpm -qa | grep gluster
>>>>>>>>> centos-release-gluster41-1.0-3.el7.centos.noarch
>>>>>>>>> glusterfs-server-4.1.6-1.el7.x86_64
>>>>>>>>> glusterfs-api-4.1.6-1.el7.x86_64
>>>>>>>>> centos-release-gluster-legacy-4.0-2.el7.centos.noarch
>>>>>>>>> glusterfs-4.1.6-1.el7.x86_64
>>>>>>>>> glusterfs-client-xlators-4.1.6-1.el7.x86_64
>>>>>>>>> libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.8.x86_64
>>>>>>>>> glusterfs-fuse-4.1.6-1.el7.x86_64
>>>>>>>>> glusterfs-libs-4.1.6-1.el7.x86_64
>>>>>>>>> glusterfs-rdma-4.1.6-1.el7.x86_64
>>>>>>>>> glusterfs-cli-4.1.6-1.el7.x86_64
>>>>>>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>>>>>>> [root at ysmha01 home]# rpm -qa | grep samba
>>>>>>>>> samba-common-tools-4.8.3-4.el7.x86_64
>>>>>>>>> samba-client-libs-4.8.3-4.el7.x86_64
>>>>>>>>> samba-libs-4.8.3-4.el7.x86_64
>>>>>>>>> samba-4.8.3-4.el7.x86_64
>>>>>>>>> samba-common-libs-4.8.3-4.el7.x86_64
>>>>>>>>> samba-common-4.8.3-4.el7.noarch
>>>>>>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>>>>>>> [root at ysmha01 home]# cat /etc/redhat-release
>>>>>>>>> CentOS Linux release 7.6.1810 (Core)
>>>>>>>>>
>>>>>>>>> RAM view using top
>>>>>>>>> Tasks: 398 total,   1 running, 397 sleeping,   0 stopped,   0 zombie
>>>>>>>>> %Cpu(s):  7.0 us,  9.3 sy,  1.7 ni, 71.6 id,  9.7 wa,  0.0 hi,  0.8 si,  0.0 st
>>>>>>>>> KiB Mem : 65772000 total,  1851344 free, 60487404 used,  3433252 buff/cache
>>>>>>>>> KiB Swap:        0 total,        0 free,        0 used.  3134316 avail Mem
>>>>>>>>>
>>>>>>>>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>>>>>>>>>  9953 root      20   0 3727912 946496   3196 S 150.2  1.4  38626:27 glusterfsd
>>>>>>>>>  9634 root      20   0   48.1g  47.2g   3184 S  96.3 75.3  29513:55 glusterfs
>>>>>>>>> 14485 root      20   0 3404140  63780   2052 S  80.7  0.1   1590:13 glusterfs
>>>>>>>>>
>>>>>>>>> [root at ysmha01 ~]# gluster v status export
>>>>>>>>> Status of volume: export
>>>>>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
>>>>>>>>> Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
>>>>>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       14485
>>>>>>>>> Self-heal Daemon on 10.0.1.7                N/A       N/A        Y       21934
>>>>>>>>> Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598
>>>>>>>>>
>>>>>>>>> Task Status of Volume export
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> There are no active volume tasks
>>>>>>>>>