[Gluster-users] Glusterfs 2.0 hangs on high load

Maris Ruskulis maris at chown.lv
Thu May 28 10:07:52 UTC 2009


I have the same issue with the same config when both nodes are x64. The 
only difference is that there are no bailout messages in the logs.
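
In case it helps anyone compare logs between the two setups, here is a rough Python sketch that pulls the call_bail entries out of a client log and shows how long each frame waited before it was bailed out. The line format is taken only from the quoted log below; the names parse_bailouts, BAIL_RE and SAMPLE are my own, and I am assuming other 2.0.x builds log the same layout, which I have not checked.

```python
import re
from datetime import datetime

# Matches one call_bail entry once its wrapped lines have been joined.
# The pattern is derived from the quoted log only (an assumption for other builds).
BAIL_RE = re.compile(
    r"\[(?P<logged>[\d-]+ [\d:]+)\] E \[client-protocol\.c:\d+:call_bail\] "
    r"(?P<volume>\S+): bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>[\d-]+ [\d:]+)\. frame-timeout = (?P<timeout>\d+)"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_bailouts(log_text):
    """Return (volume, op, seconds_waited, timeout) for every call_bail entry."""
    # Mail clients wrap long log lines and prefix quotes with '>'; undo both.
    lines = [re.sub(r"^[>\s]+", "", line).strip() for line in log_text.splitlines()]
    joined = " ".join(line for line in lines if line)
    results = []
    for m in BAIL_RE.finditer(joined):
        logged = datetime.strptime(m["logged"], TIME_FORMAT)
        sent = datetime.strptime(m["sent"], TIME_FORMAT)
        waited = int((logged - sent).total_seconds())
        results.append((m["volume"], m["op"], waited, int(m["timeout"])))
    return results


# Two entries copied from the quoted log, including the mail wrapping.
SAMPLE = """\
>> [2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01.
>> frame-timeout = 1800
>> [2009-05-27 19:46:18] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPEN(12) frame sent = 2009-05-27 19:16:09.
>> frame-timeout = 1800
"""

if __name__ == "__main__":
    for volume, op, waited, timeout in parse_bailouts(SAMPLE):
        print(f"{volume}: {op} waited {waited}s (frame-timeout {timeout}s)")
```

An empty result on the x64 nodes would match what I see here: the mount hangs, but no frame is ever reported as bailed out. In the quoted 32-bit log every entry is for the 'weeber' subvolume and is logged just over 1800 seconds after the frame was sent, that is, right at the frame-timeout.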

Jasper van Wanrooy - Chatventure wrote:
> Hi Maris,
>
> I'm sorry to hear that. I also had stability problems on 32-bit 
> platforms. Perhaps you could try it on a 64-bit platform. Is that an 
> option?
>
> Best Regards Jasper
>
>
> On 28 May 2009, at 09:36, Maris Ruskulis wrote:
>
>> Hello!
>> After upgrading to version 2.0 (now running 2.0.1), I'm experiencing 
>> stability problems with glusterfs.
>> I'm running a 2-node setup with client-side AFR, and glusterfsd is also 
>> running on the same servers. From time to time glusterfs just hangs; I 
>> can reproduce this by running the iozone benchmarking tool. I'm using 
>> patched FUSE, but the result is the same with unpatched.
>>
>> ================================================================================
>> Version      : glusterfs 2.0.1 built on May 27 2009 16:04:01
>> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
>> Starting Time: 2009-05-27 16:38:20
>> Command line : /usr/sbin/glusterfsd 
>> --volfile=/etc/glusterfs/glusterfs-server.vol 
>> --pid-file=/var/run/glusterfsd.pid --log-file=/var/log/glusterfsd.log
>> PID          : 31971
>> System name  : Linux
>> Nodename     : weeber.st-inst.lv
>> Kernel Release : 2.6.28-hardened-r7
>> Hardware Identifier: i686
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+
>> 1: # file: /etc/glusterfs/glusterfs-server.vol
>> 2: volume posix
>> 3:   type storage/posix
>> 4:   option directory /home/export
>> 5: end-volume
>> 6:
>> 7: volume locks
>> 8:   type features/locks
>> 9:   option mandatory-locks on
>> 10:   subvolumes posix
>> 11: end-volume
>> 12:
>> 13: volume brick
>> 14:   type performance/io-threads
>> 15:   option autoscaling on
>> 16:   subvolumes locks
>> 17: end-volume
>> 18:
>> 19: volume server
>> 20:   type protocol/server
>> 21:   option transport-type tcp
>> 22:   option auth.addr.brick.allow 127.0.0.1,192.168.1.*
>> 23:   subvolumes brick
>> 24: end-volume
>>
>> +------------------------------------------------------------------------------+
>> [2009-05-27 16:38:20] N [glusterfsd.c:1152:main] glusterfs: 
>> Successfully started
>> [2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume] 
>> server: accepted client from 192.168.1.233:1021
>> [2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume] 
>> server: accepted client from 192.168.1.233:1020
>> [2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume] 
>> server: accepted client from 192.168.1.252:1021
>> [2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume] 
>> server: accepted client from 192.168.1.252:1020
>>
>> ================================================================================
>> Version      : glusterfs 2.0.1 built on May 27 2009 16:04:01
>> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
>> Starting Time: 2009-05-27 16:38:46
>> Command line : /usr/sbin/glusterfs -N -f 
>> /etc/glusterfs/glusterfs-client.vol /mnt/gluster
>> PID          : 32161
>> System name  : Linux
>> Nodename     : weeber.st-inst.lv
>> Kernel Release : 2.6.28-hardened-r7
>> Hardware Identifier: i686
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+
>> 1: volume xeon
>> 2:   type protocol/client
>> 3:   option transport-type tcp
>> 4:   option remote-host 192.168.1.233
>> 5:   option remote-subvolume brick
>> 6: end-volume
>> 7:
>> 8: volume weeber
>> 9:   type protocol/client
>> 10:   option transport-type tcp
>> 11:   option remote-host 192.168.1.252
>> 12:   option remote-subvolume brick
>> 13: end-volume
>> 14:
>> 15: volume replicate
>> 16:  type cluster/replicate
>> 17:  subvolumes xeon weeber
>> 18: end-volume
>> 20: volume readahead
>> 21:   type performance/read-ahead
>> 22:   option page-size 128kB
>> 23:   option page-count 16
>> 24:   option force-atime-update off
>> 25:   subvolumes replicate
>> 26: end-volume
>> 27:
>> 28: volume writebehind
>> 29:   type performance/write-behind
>> 30:   option aggregate-size 1MB
>> 31:   option window-size 3MB
>> 32:   option flush-behind on
>> 33:   option enable-O_SYNC on
>> 34:   subvolumes readahead
>> 35: end-volume
>> 36:
>> 37: volume iothreads
>> 38:   type performance/io-threads
>> 39:   option autoscaling on
>> 40:   subvolumes writebehind
>> 41: end-volume
>> 42:
>> 43:
>> 44:
>> 45: #volume bricks
>> 46: #type cluster/distribute
>> 47:  #option lookup-unhashed yes
>> 48:  #option min-free-disk 20%
>> 49: # subvolumes weeber xeon
>> 50: #end-volume
>>
>> +------------------------------------------------------------------------------+
>> [2009-05-27 16:38:46] W [xlator.c:555:validate_xlator_volume_options] 
>> writebehind: option 'window-size' is deprecated, preferred is 
>> 'cache-size', continuing with correction
>> [2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid] 
>> writebehind: option 'aggregate-size' is not recognized
>> [2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid] 
>> readahead: option 'page-size' is not recognized
>> [2009-05-27 16:38:46] N [glusterfsd.c:1152:main] glusterfs: 
>> Successfully started
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
>> xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
>> [2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume 
>> 'xeon' came back up; going online.
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
>> xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
>> [2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume 
>> 'xeon' came back up; going online.
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
>> weeber: Connected to 192.168.1.252:6996, attached to remote volume 
>> 'brick'.
>> [2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01. 
>> frame-timeout = 1800
>> [2009-05-27 19:16:09] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:46:02. 
>> frame-timeout = 1800
>> [2009-05-27 19:46:18] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPEN(12) frame sent = 2009-05-27 19:16:09. 
>> frame-timeout = 1800
>> [2009-05-27 20:16:25] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 19:46:18. 
>> frame-timeout = 1800
>> [2009-05-27 20:46:34] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 20:16:25. 
>> frame-timeout = 1800
>> [2009-05-27 21:16:41] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPEN(12) frame sent = 2009-05-27 20:46:34. 
>> frame-timeout = 1800
>> [2009-05-27 21:47:00] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:16:53. 
>> frame-timeout = 1800
>> [2009-05-27 22:17:07] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:47:00. 
>> frame-timeout = 1800
>> [2009-05-27 22:47:15] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPENDIR(21) frame sent = 2009-05-27 22:17:07. 
>> frame-timeout = 1800
>> [2009-05-27 23:17:23] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 22:47:15. 
>> frame-timeout = 1800
>> [2009-05-27 23:47:31] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPEN(12) frame sent = 2009-05-27 23:17:23. 
>> frame-timeout = 1800
>> [2009-05-28 00:17:39] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 23:47:32. 
>> frame-timeout = 1800
>> [2009-05-28 00:47:47] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 00:17:39. 
>> frame-timeout = 1800
>> [2009-05-28 01:17:55] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPENDIR(21) frame sent = 2009-05-28 00:47:47. 
>> frame-timeout = 1800
>> [2009-05-28 01:48:03] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 01:17:55. 
>> frame-timeout = 1800
>> [2009-05-28 02:18:11] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPEN(12) frame sent = 2009-05-28 01:48:03. 
>> frame-timeout = 1800
>> [2009-05-28 02:48:29] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:18:24. 
>> frame-timeout = 1800
>> [2009-05-28 03:18:37] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:48:29. 
>> frame-timeout = 1800
>> [2009-05-28 03:48:45] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 03:18:37. 
>> frame-timeout = 1800
>> [2009-05-28 04:18:53] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame XATTROP(40) frame sent = 2009-05-28 03:48:45. 
>> frame-timeout = 1800
>> [2009-05-28 04:49:01] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 04:18:53. 
>> frame-timeout = 1800
>> [2009-05-28 05:19:09] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame OPENDIR(21) frame sent = 2009-05-28 04:49:01. 
>> frame-timeout = 1800
>> [2009-05-28 05:49:17] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:19:09. 
>> frame-timeout = 1800
>> [2009-05-28 06:19:25] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:49:17. 
>> frame-timeout = 1800
>> [2009-05-28 06:49:33] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame XATTROP(40) frame sent = 2009-05-28 06:19:25. 
>> frame-timeout = 1800
>> [2009-05-28 07:19:40] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 06:49:33. 
>> frame-timeout = 1800
>> [2009-05-28 07:49:48] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:19:40. 
>> frame-timeout = 1800
>> [2009-05-28 08:19:56] E [client-protocol.c:292:call_bail] weeber: 
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:49:48. 
>> frame-timeout = 1800
