[Gluster-users] Memory leak in 3.6.*?
Yannick Perret
yannick.perret at liris.cnrs.fr
Fri Jul 22 20:16:38 UTC 2016
On 22/07/2016 at 21:12, Yannick Perret wrote:
> On 22/07/2016 at 17:47, Mykola Ulianytskyi wrote:
>> Hi
>>
>>> 3.7 clients are not compatible with 3.6 servers
>> Can you provide more info?
>>
>> I use some 3.7 clients with 3.6 servers and don't see issues.
> Well,
> with a 3.7.13 client compiled on the same machine, when I try the same
> mount I get:
> # mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
> Mount failed. Please check the log file for more details.
>
> Checking the log (/var/log/glusterfs/zog.log) I see:
> [2016-07-22 19:05:40.249143] I [MSGID: 100030]
> [glusterfsd.c:2338:main] 0-/usr/local/sbin/glusterfs: Started running
> /usr/local/sbin/glusterfs version 3.7.13 (args:
> /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
> --volfile-id=BACKUP-ADMIN-DATA /zog)
> [2016-07-22 19:05:40.258437] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 1
> [2016-07-22 19:05:40.259480] W [socket.c:701:__socket_rwv]
> 0-glusterfs: readv on <the-IP>:24007 failed (Aucune donnée disponible)
> [2016-07-22 19:05:40.259859] E [rpc-clnt.c:362:saved_frames_unwind]
> (-->
> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x175)[0x7fad7d039335]
> (-->
> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b3)[0x7fad7ce04e73] (-->
> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fad7ce04f6e]
> (-->
> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fad7ce065ee]
> (-->
> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fad7ce06de8]
> ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake)
> op(GETSPEC(2)) called at 2016-07-22 19:05:40.258858 (xid=0x1)
> [2016-07-22 19:05:40.259894] E
> [glusterfsd-mgmt.c:1690:mgmt_getspec_cbk] 0-mgmt: failed to fetch
> volume file (key:BACKUP-ADMIN-DATA)
> [2016-07-22 19:05:40.259939] W [glusterfsd.c:1251:cleanup_and_exit]
> (-->/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1de)
> [0x7fad7ce04e9e] -->/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x454)
> [0x40d564] -->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b)
> [0x407eab] ) 0-: received signum (0), shutting down
> [2016-07-22 19:05:40.259965] I [fuse-bridge.c:5720:fini] 0-fuse:
> Unmounting '/zog'.
> [2016-07-22 19:05:40.260913] W [glusterfsd.c:1251:cleanup_and_exit]
> (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4) [0x7fad7c0a30a4]
> -->/usr/local/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x408015]
> -->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b) [0x407eab] ) 0-:
> received signum (15), shutting down
>
Hmm… I just noticed that the logs are (partly) localized, which can make
them harder to understand for non-French speakers.
"Aucune donnée disponible" means: no data available.
BTW, if I could get 3.7 clients to work with my servers, and if the memory
leak does not exist in 3.7, that would be fine for me.
--
Y.
> I did not go further on that, as I just presumed that the 3.7 series was
> not compatible with 3.6 servers, but it may be something else. In any case
> it is the same client, the same server(s) and the same volume.
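>
> A quick way to double-check what actually runs on each side is to compare
> the client and server binaries and the server's operating version (a
> minimal sketch; the glusterd.info path is the default one and may differ
> on your installation):
>
> # on the client
> /usr/local/sbin/glusterfs --version
> # on each server
> glusterd --version
> grep operating-version /var/lib/glusterd/glusterd.info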
>
> The build has the following features enabled (built with "configure
> --disable-tiering" as I don't have the required dependencies installed):
> FUSE client : yes
> Infiniband verbs : no
> epoll IO multiplex : yes
> argp-standalone : no
> fusermount : yes
> readline : yes
> georeplication : yes
> Linux-AIO : no
> Enable Debug : no
> Block Device xlator : no
> glupy : yes
> Use syslog : yes
> XML output : yes
> QEMU Block formats : no
> Encryption xlator : yes
> Unit Tests : no
> POSIX ACLs : yes
> Data Classification : no
> firewalld-config : no
>
> Regards,
> --
> Y.
>
>
>> Thank you
>>
>> --
>> With best regards,
>> Mykola
>>
>>
>> On Fri, Jul 22, 2016 at 4:31 PM, Yannick Perret
>> <yannick.perret at liris.cnrs.fr> wrote:
>>> Note: I have a dev client machine, so I can perform tests or recompile the
>>> glusterfs client if that helps gather data about this.
>>>
>>> I did not test this problem against the 3.7.x series, as my 2 servers are
>>> in use and I can't upgrade them at this time, and 3.7 clients are not
>>> compatible with 3.6 servers (as far as I can see from my tests).
>>>
>>> --
>>> Y.
>>>
>>>
>>> On 22/07/2016 at 14:06, Yannick Perret wrote:
>>>
>>> Hello,
>>> some time ago I posted about a memory leak in the client process, but it
>>> was on a very old 32-bit machine (both kernel and OS) and I did not find
>>> evidence of a similar problem on our recent machines.
>>> But I have now performed more tests and I see the same problem.
>>>
>>> Clients are 64-bit Debian 8.2 machines. The glusterfs client on these
>>> machines is compiled from sources with the following features enabled:
>>> FUSE client : yes
>>> Infiniband verbs : no
>>> epoll IO multiplex : yes
>>> argp-standalone : no
>>> fusermount : yes
>>> readline : yes
>>> georeplication : yes
>>> Linux-AIO : no
>>> Enable Debug : no
>>> systemtap : no
>>> Block Device xlator : no
>>> glupy : no
>>> Use syslog : yes
>>> XML output : yes
>>> QEMU Block formats : no
>>> Encryption xlator : yes
>>> Erasure Code xlator : yes
>>>
>>> I tested both the 3.6.7 and 3.6.9 versions on the client (3.6.7 is the one
>>> installed on our machines, including the servers; 3.6.9 is for testing with
>>> the latest 3.6 version).
>>>
>>> Here are the operations on the client (also performed, with similar
>>> results, with the 3.6.7 version):
>>> # /usr/local/sbin/glusterfs --version
>>> glusterfs 3.6.9 built on Jul 22 2016 13:27:42
>>> (…)
>>> # mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
>>> # cd /usr/
>>> # cp -Rp * /zog/TEMP/
>>> Then I monitored the memory used by the glusterfs process while 'cp' was
>>> running (VSZ and RSS from 'ps', respectively; a sketch of such a
>>> monitoring loop is shown after the observations below):
>>> 284740 70232
>>> 284740 70232
>>> 284876 71704
>>> 285000 72684
>>> 285136 74008
>>> 285416 75940
>>> (…)
>>> 368684 151980
>>> 369324 153768
>>> 369836 155576
>>> 370092 156192
>>> 370092 156192
>>> Here both sizes are stable and correspond to the end of the 'cp' command.
>>> If I restart another 'cp' (even on the same directories) the size starts
>>> increasing again.
>>> If I perform an 'ls -lR' in the directory the size also increases:
>>> 370756 192488
>>> 389964 212148
>>> 390948 213232
>>> (here I ^C the 'ls')
>>>
>>> When doing nothing the size does not increase, but it never decreases
>>> (calling 'sync' does not change the situation).
>>> Sending a HUP signal to the glusterfs process also increases memory
>>> (390948 213324 → 456484 213320).
>>> Changing the volume configuration (changing the
>>> diagnostics.client-sys-log-level value) does not change anything.
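>>>
>>> Such monitoring can be done with a small loop along these lines (a minimal
>>> sketch: the pgrep pattern and the 5-second interval are illustrative, and
>>> it assumes a single glusterfs client process for this mount):
>>>
>>> # find the client process serving the BACKUP-ADMIN-DATA mount
>>> PID=$(pgrep -f 'glusterfs.*--volfile-id=BACKUP-ADMIN-DATA')
>>> # print VSZ and RSS (in KiB) every 5 seconds while the process is alive
>>> while kill -0 "$PID" 2>/dev/null; do
>>>     ps -o vsz=,rss= -p "$PID"
>>>     sleep 5
>>> done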
>>>
>>> Here is the current ps output:
>>> root 17041 4.9 5.2 456484 213320 ? Ssl 13:29 1:21
>>> /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
>>> --volfile-id=BACKUP-ADMIN-DATA /zog
>>>
>>> Of course, unmounting/remounting falls back to the "start" size:
>>> # umount /zog
>>> # mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
>>> → root 28741 0.3 0.7 273320 30484 ? Ssl 13:57 0:00
>>> /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
>>> --volfile-id=BACKUP-ADMIN-DATA /zog
>>>
>>>
>>> I didn't see this before because most of our volumes are mounted "on
>>> demand" for some storage activities, or are permanently mounted but with
>>> very little activity.
>>> But clearly this memory usage drift is a long-term problem. On the old
>>> 32-bit machine I had this problem ("solved" by using NFS mounts while
>>> waiting for that old machine to be replaced) and it led to glusterfs being
>>> killed by the OS when it ran out of free memory. It was faster than what I
>>> describe here, but it's just a question of time.
>>>
>>>
>>> Thanks for any help about that.
>>>
>>> Regards,
>>> --
>>> Y.
>>>
>>>
>>> The corresponding volume on the servers is (in case it helps):
>>> Volume Name: BACKUP-ADMIN-DATA
>>> Type: Replicate
>>> Volume ID: 306d57f3-fb30-4bcc-8687-08bf0a3d7878
>>> Status: Started
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: sto1.my.domain:/glusterfs/backup-admin/data
>>> Brick2: sto2.my.domain:/glusterfs/backup-admin/data
>>> Options Reconfigured:
>>> diagnostics.client-sys-log-level: WARNING
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users