[Gluster-users] Many logs (errors?) on client → memory problem

Yannick Perret yannick.perret at liris.cnrs.fr
Fri Jun 10 09:11:40 UTC 2016


I got no feedback on this, but I think I found the problem:
the glusterfs client's memory usage grows until no memory is available,
and then it crashes.

I performed the same operations on another machine without being able
to reproduce the problem.
The machine with the problem is an old machine (debian, 3.2.50 kernel, 
32bit), whereas the other machine is an up-to-date debian 64bit.

To give some stats: on the client, glusterfs started at less than
810220 KB of resident size and had reached 3055336 KB (3 GB!) when it
crashed again. The volume was mounted only on this machine, and used by
only one process (a 'cp -Rp').
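For anyone wanting to reproduce this, a hypothetical helper along these lines is how I sampled the resident size (RSS as reported by ps, in KB); the process name "glusterfs" and the 60-second interval are my choices, adjust as needed:

```shell
#!/bin/sh
# Sample the resident set size (RSS, in KB) of the glusterfs client
# once a minute so the growth can be plotted afterwards.
PROC=glusterfs
while :; do
    pid=$(pgrep -o -x "$PROC") || break    # oldest PID matching the name
    rss=$(ps -o rss= -p "$pid")
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$rss"
    sleep 60
done
```

Redirect the output to a file and the leak shows up as a monotonically growing second column.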

Running the same copy from a recent machine gives far more stable memory
usage (43364 KB of resident size, with only a few small increases).
Of course I'm using the same glusterfs version (compiled from sources on 
both machines).

As I can't upgrade this old machine due to compatibility constraints
with old software (at least until we replace that software), I will use
an NFS mount point from the gluster servers instead.
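For the record, this is roughly the mount I plan to use; it assumes the built-in gluster NFS server (gnfs) is still enabled on the servers, which is the default on 3.6.x, and that the client has the usual NFS utilities installed:

```shell
# Sketch: mount the volume over NFSv3 instead of the FUSE client.
# gnfs speaks NFSv3 over TCP; nolock sidesteps NLM issues on old kernels.
mount -t nfs -o vers=3,proto=tcp,nolock \
    sto1.mydomain:/HOME-LIRIS /futur-home
```

The trade-off is that the NFS client caches more aggressively than the FUSE client, but on this old 32-bit box that is acceptable.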

In any case, even on the recent machine I still get very verbose logs
for each directory creation:
[2016-06-10 08:35:12.965438] I 
[dht-selfheal.c:1065:dht_selfheal_layout_new_directory] 
0-HOME-LIRIS-dht: chunk size = 0xffffffff / 2064114 = 0x820
[2016-06-10 08:35:12.965473] I 
[dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 
0-HOME-LIRIS-dht: assigning range size 0xffe76e40 to HOME-LIRIS-replicate-0
[2016-06-10 08:35:12.966987] I [MSGID: 109036] 
[dht-common.c:6296:dht_log_new_layout_for_dir_selfheal] 
0-HOME-LIRIS-dht: Setting layout of /log_apache_error with [Subvol_name: 
HOME-LIRIS-replicate-0, Err: -1 , Start: 0 , Stop: 4294967295 ],
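For what it's worth, the numbers in those layout lines are internally consistent. A quick sketch of the arithmetic, treating 2064114 (taken verbatim from the log) as the layout weight DHT computed for the single subvolume, which is my reading, not something the log states:

```shell
# Reproduce the arithmetic from the DHT log lines above.
# The 32-bit hash space (0xffffffff) is split into chunks of
# hash-space / weight, and the subvolume gets chunk * weight back.
printf 'chunk = 0x%x\n' $(( 0xffffffff / 2064114 ))   # -> chunk = 0x820
printf 'range = 0x%x\n' $(( 0x820 * 2064114 ))        # -> range = 0xffe76e40
```

So both log lines describe the same single-subvolume layout; the messages are just noisy, not wrong.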

I switched the clients to the WARNING log level (gluster volume set
HOME-LIRIS diagnostics.client-sys-log-level WARNING), which is fine for
me. But maybe WARNING should be the default log level, at least for
clients? In production, getting 3 lines per created directory is
useless, and anyone who wants to analyze a problem will switch to INFO
or DEBUG anyway.
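For completeness, both client log-level knobs, assuming (as I understand it) that diagnostics.client-log-level controls the mount-point log file while diagnostics.client-sys-log-level controls what is forwarded to syslog:

```shell
# Quiet the FUSE client logs for this volume (defaults are INFO).
gluster volume set HOME-LIRIS diagnostics.client-log-level WARNING
gluster volume set HOME-LIRIS diagnostics.client-sys-log-level WARNING
# The reconfigured options show up under "Options Reconfigured":
gluster volume info HOME-LIRIS
```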

Regards,
--
Y.



On 08/06/2016 17:35, Yannick Perret wrote:
> Hello,
>
> I have a replica 2 volume managed on 2 identical servers, running 
> gluster version 3.6.7. Here is the volume info:
> Volume Name: HOME-LIRIS
> Type: Replicate
> Volume ID: 47b4b856-371b-4b8c-8baa-2b7c32d7bb23
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: sto1.mydomain:/glusterfs/home-liris/data
> Brick2: sto2.mydomain:/glusterfs/home-liris/data
>
> It is mounted on a (single) client with mount -t glusterfs 
> sto1.mydomain:/HOME-LIRIS /futur-home/
>
> I started to copy a directory (~550 GB, ~660 directories with many 
> files) into it. The copy was done with 'cp -Rp'.
>
> It seems to work fine but I get *many* log entries in the 
> corresponding mountpoint logs:
> [2016-06-07 14:01:27.587300] I 
> [dht-selfheal.c:1065:dht_selfheal_layout_new_directory] 
> 0-HOME-LIRIS-dht: chunk size = 0xffffffff / 2064114 = 0x820
> [2016-06-07 14:01:27.587338] I 
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 
> 0-HOME-LIRIS-dht: assigning range size 0xffe76e40 to 
> HOME-LIRIS-replicate-0
> [2016-06-07 14:01:27.588436] I [MSGID: 109036] 
> [dht-common.c:6296:dht_log_new_layout_for_dir_selfheal] 
> 0-HOME-LIRIS-dht: Setting layout of /olfamine with [Subvol_name: 
> HOME-LIRIS-replicate-0, Err: -1 , Start: 0 , Stop: 4294967295 ],
>
> This is repeated for many files (124088 exactly). Is this normal? If 
> so, I am using the default client settings and find it a little 
> verbose. If not, can someone tell me what the problem is?
>
> Moreover at the end of the log file I have:
> [2016-06-08 04:42:58.210617] A [MSGID: 0] [mem-pool.c:110:__gf_calloc] 
> : no memory available for size (14651) [call stack follows]
> [2016-06-08 04:42:58.219060] A [MSGID: 0] [mem-pool.c:134:__gf_malloc] 
> : no memory available for size (21026) [call stack follows]
> pending frames:
> frame : type(1) op(CREATE)
> frame : type(1) op(CREATE)
> frame : type(1) op(LOOKUP)
> frame : type(0) op(0)
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash:
> 2016-06-08 04:42:58
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.6.7
>
> Which clearly doesn't seem right.
> Not all of the data was copied (the copy's output contains a list of 
> "transport endpoint not connected" errors, or similar; the messages 
> were translated into my language).
>
> I re-mounted the volume, created a directory with 'mkdir TOTO', and 
> got a similar message:
> [2016-06-08 15:32:23.692936] I 
> [dht-selfheal.c:1065:dht_selfheal_layout_new_directory] 
> 0-HOME-LIRIS-dht: chunk size = 0xffffffff / 2064114 = 0x820
> [2016-06-08 15:32:23.692982] I 
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 
> 0-HOME-LIRIS-dht: assigning range size 0xffe76e40 to 
> HOME-LIRIS-replicate-0
> [2016-06-08 15:32:23.694144] I [MSGID: 109036] 
> [dht-common.c:6296:dht_log_new_layout_for_dir_selfheal] 
> 0-HOME-LIRIS-dht: Setting layout of /TOTO with [Subvol_name: 
> HOME-LIRIS-replicate-0, Err: -1 , Start: 0 , Stop: 4294967295 ],
> but I don't get such messages for files.
>
> If it helps: the volume is ~2 TB and its content is far below that, 
> and both bricks are ext4 (both the same size).
>
>
> Any help would be appreciated.
>
> Regards,
> -- 
> Y.
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
