[Gluster-users] Glusterfs 2.0 hangs on high load
Maris Ruskulis
maris at chown.lv
Thu May 28 10:07:52 UTC 2009
I have the same issue with the same config when both nodes are x64. The
difference is that there are no bailout messages in the logs.
Jasper van Wanrooy - Chatventure wrote:
> Hi Maris,
>
> I'm sorry to hear that. I was also having stability problems
> on 32-bit platforms. You could try it on a 64-bit platform. Is
> that an option?
>
> Best Regards Jasper
>
>
> On 28 May 2009, at 09:36, Maris Ruskulis wrote:
>
>> Hello!
>> After upgrading to version 2.0 (now running 2.0.1), I'm experiencing
>> stability problems with glusterfs.
>> I'm running a 2-node setup with client-side AFR, and glusterfsd is also
>> running on the same servers. From time to time glusterfs just hangs; I can
>> reproduce this by running the iozone benchmarking tool. I'm using patched
>> Fuse, but the result is the same with unpatched.
>>
>> ================================================================================
>> Version : glusterfs 2.0.1 built on May 27 2009 16:04:01
>> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
>> Starting Time: 2009-05-27 16:38:20
>> Command line : /usr/sbin/glusterfsd
>> --volfile=/etc/glusterfs/glusterfs-server.vol
>> --pid-file=/var/run/glusterfsd.pid --log-file=/var/log/glusterfsd.log
>> PID : 31971
>> System name : Linux
>> Nodename : weeber.st-inst.lv
>> Kernel Release : 2.6.28-hardened-r7
>> Hardware Identifier: i686
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+
>> 1: # file: /etc/glusterfs/glusterfs-server.vol
>> 2: volume posix
>> 3: type storage/posix
>> 4: option directory /home/export
>> 5: end-volume
>> 6:
>> 7: volume locks
>> 8: type features/locks
>> 9: option mandatory-locks on
>> 10: subvolumes posix
>> 11: end-volume
>> 12:
>> 13: volume brick
>> 14: type performance/io-threads
>> 15: option autoscaling on
>> 16: subvolumes locks
>> 17: end-volume
>> 18:
>> 19: volume server
>> 20: type protocol/server
>> 21: option transport-type tcp
>> 22: option auth.addr.brick.allow 127.0.0.1,192.168.1.*
>> 23: subvolumes brick
>> 24: end-volume
>>
>> +------------------------------------------------------------------------------+
>> [2009-05-27 16:38:20] N [glusterfsd.c:1152:main] glusterfs:
>> Successfully started
>> [2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume]
>> server: accepted client from 192.168.1.233:1021
>> [2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume]
>> server: accepted client from 192.168.1.233:1020
>> [2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume]
>> server: accepted client from 192.168.1.252:1021
>> [2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume]
>> server: accepted client from 192.168.1.252:1020
>>
>> ================================================================================
>> Version : glusterfs 2.0.1 built on May 27 2009 16:04:01
>> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
>> Starting Time: 2009-05-27 16:38:46
>> Command line : /usr/sbin/glusterfs -N -f
>> /etc/glusterfs/glusterfs-client.vol /mnt/gluster
>> PID : 32161
>> System name : Linux
>> Nodename : weeber.st-inst.lv
>> Kernel Release : 2.6.28-hardened-r7
>> Hardware Identifier: i686
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+
>> 1: volume xeon
>> 2: type protocol/client
>> 3: option transport-type tcp
>> 4: option remote-host 192.168.1.233
>> 5: option remote-subvolume brick
>> 6: end-volume
>> 7:
>> 8: volume weeber
>> 9: type protocol/client
>> 10: option transport-type tcp
>> 11: option remote-host 192.168.1.252
>> 12: option remote-subvolume brick
>> 13: end-volume
>> 14:
>> 15: volume replicate
>> 16: type cluster/replicate
>> 17: subvolumes xeon weeber
>> 18: end-volume
>> 20: volume readahead
>> 21: type performance/read-ahead
>> 22: option page-size 128kB
>> 23: option page-count 16
>> 24: option force-atime-update off
>> 25: subvolumes replicate
>> 26: end-volume
>> 27:
>> 28: volume writebehind
>> 29: type performance/write-behind
>> 30: option aggregate-size 1MB
>> 31: option window-size 3MB
>> 32: option flush-behind on
>> 33: option enable-O_SYNC on
>> 34: subvolumes readahead
>> 35: end-volume
>> 36:
>> 37: volume iothreads
>> 38: type performance/io-threads
>> 39: option autoscaling on
>> 40: subvolumes writebehind
>> 41: end-volume
>> 42:
>> 43:
>> 44:
>> 45: #volume bricks
>> 46: #type cluster/distribute
>> 47: #option lookup-unhashed yes
>> 48: #option min-free-disk 20%
>> 49: # subvolumes weeber xeon
>> 50: #end-volume
>>
>> +------------------------------------------------------------------------------+
>> [2009-05-27 16:38:46] W [xlator.c:555:validate_xlator_volume_options]
>> writebehind: option 'window-size' is deprecated, preferred is
>> 'cache-size', continuing with correction
>> [2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid]
>> writebehind: option 'aggregate-size' is not recognized
>> [2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid]
>> readahead: option 'page-size' is not recognized
>> [2009-05-27 16:38:46] N [glusterfsd.c:1152:main] glusterfs:
>> Successfully started
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk]
>> xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
>> [2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume
>> 'xeon' came back up; going online.
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk]
>> xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
>> [2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume
>> 'xeon' came back up; going online.
>> [2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk]
>> weeber: Connected to 192.168.1.252:6996, attached to remote volume
>> 'brick'.
>> [2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01.
>> frame-timeout = 1800
>> [2009-05-27 19:16:09] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:46:02.
>> frame-timeout = 1800
>> [2009-05-27 19:46:18] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPEN(12) frame sent = 2009-05-27 19:16:09.
>> frame-timeout = 1800
>> [2009-05-27 20:16:25] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 19:46:18.
>> frame-timeout = 1800
>> [2009-05-27 20:46:34] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 20:16:25.
>> frame-timeout = 1800
>> [2009-05-27 21:16:41] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPEN(12) frame sent = 2009-05-27 20:46:34.
>> frame-timeout = 1800
>> [2009-05-27 21:47:00] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:16:53.
>> frame-timeout = 1800
>> [2009-05-27 22:17:07] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:47:00.
>> frame-timeout = 1800
>> [2009-05-27 22:47:15] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPENDIR(21) frame sent = 2009-05-27 22:17:07.
>> frame-timeout = 1800
>> [2009-05-27 23:17:23] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 22:47:15.
>> frame-timeout = 1800
>> [2009-05-27 23:47:31] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPEN(12) frame sent = 2009-05-27 23:17:23.
>> frame-timeout = 1800
>> [2009-05-28 00:17:39] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 23:47:32.
>> frame-timeout = 1800
>> [2009-05-28 00:47:47] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 00:17:39.
>> frame-timeout = 1800
>> [2009-05-28 01:17:55] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPENDIR(21) frame sent = 2009-05-28 00:47:47.
>> frame-timeout = 1800
>> [2009-05-28 01:48:03] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 01:17:55.
>> frame-timeout = 1800
>> [2009-05-28 02:18:11] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPEN(12) frame sent = 2009-05-28 01:48:03.
>> frame-timeout = 1800
>> [2009-05-28 02:48:29] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:18:24.
>> frame-timeout = 1800
>> [2009-05-28 03:18:37] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:48:29.
>> frame-timeout = 1800
>> [2009-05-28 03:48:45] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 03:18:37.
>> frame-timeout = 1800
>> [2009-05-28 04:18:53] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame XATTROP(40) frame sent = 2009-05-28 03:48:45.
>> frame-timeout = 1800
>> [2009-05-28 04:49:01] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 04:18:53.
>> frame-timeout = 1800
>> [2009-05-28 05:19:09] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame OPENDIR(21) frame sent = 2009-05-28 04:49:01.
>> frame-timeout = 1800
>> [2009-05-28 05:49:17] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:19:09.
>> frame-timeout = 1800
>> [2009-05-28 06:19:25] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:49:17.
>> frame-timeout = 1800
>> [2009-05-28 06:49:33] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame XATTROP(40) frame sent = 2009-05-28 06:19:25.
>> frame-timeout = 1800
>> [2009-05-28 07:19:40] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 06:49:33.
>> frame-timeout = 1800
>> [2009-05-28 07:49:48] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:19:40.
>> frame-timeout = 1800
>> [2009-05-28 08:19:56] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:49:48.
>> frame-timeout = 1800
>>
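The call_bail entries in the client log above all follow one pattern: a frame is sent, nothing comes back, and it is bailed out shortly after the 1800-second frame-timeout expires. As a rough aid for reading such a log, here is a minimal Python sketch that extracts those entries and computes how long each frame waited. The function name `parse_bailouts` and the record fields are made up for illustration and are not part of glusterfs; the regular expression assumes the exact log layout shown in this message.

```python
import re
from datetime import datetime

# Matches one call_bail entry after wrapped lines have been re-joined.
# The layout is assumed from the log excerpt in this thread.
BAIL_RE = re.compile(
    r"\[(?P<logged>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] E "
    r"\[client-protocol\.c:\d+:call_bail\] (?P<volume>\S+): "
    r"bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\. "
    r"frame-timeout = (?P<timeout>\d+)"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_bailouts(text):
    """Return one record per call_bail entry found in text (hypothetical helper)."""
    # Drop mail quote markers (">> ") and re-join entries wrapped over several lines.
    lines = [re.sub(r"^(?:>\s?)+", "", line) for line in text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    records = []
    for m in BAIL_RE.finditer(flat):
        logged = datetime.strptime(m["logged"], TS_FORMAT)
        sent = datetime.strptime(m["sent"], TS_FORMAT)
        records.append({
            "volume": m["volume"],
            "op": m["op"],
            "waited": int((logged - sent).total_seconds()),  # seconds until bail-out
            "timeout": int(m["timeout"]),
        })
    return records


# First bailout entry from the log above, in its quoted and wrapped form.
sample = """\
>> [2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber:
>> bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01.
>> frame-timeout = 1800
"""

for rec in parse_bailouts(sample):
    print(rec["volume"], rec["op"], rec["waited"], rec["timeout"])
```

For the sample entry this reports a wait of 1801 seconds against a timeout of 1800, i.e. the call to 'weeber' never returned and was dropped as soon as the timeout expired. Running it over a whole log makes it easy to see whether only one subvolume is affected.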