[Gluster-users] Glusterfs 2.0 hangs on high load

Maris Ruskulis maris at chown.lv
Thu May 28 07:36:49 UTC 2009


Hello!
After upgrading to version 2.0 (now running 2.0.1), I'm experiencing
stability problems with glusterfs.
I'm running a 2-node setup with client-side AFR, and glusterfsd is also
running on the same servers. From time to time glusterfs just hangs; I
can reproduce this by running the iozone benchmarking tool. I'm using
patched FUSE, but the result is the same with unpatched FUSE.
The server and client logs from one node (weeber) follow.

================================================================================
Version      : glusterfs 2.0.1 built on May 27 2009 16:04:01
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-05-27 16:38:20
Command line : /usr/sbin/glusterfsd 
--volfile=/etc/glusterfs/glusterfs-server.vol 
--pid-file=/var/run/glusterfsd.pid --log-file=/var/log/glusterfsd.log
PID          : 31971
System name  : Linux
Nodename     : weeber.st-inst.lv
Kernel Release : 2.6.28-hardened-r7
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: # file: /etc/glusterfs/glusterfs-server.vol
  2: volume posix
  3:   type storage/posix
  4:   option directory /home/export
  5: end-volume
  6:
  7: volume locks
  8:   type features/locks
  9:   option mandatory-locks on
 10:   subvolumes posix
 11: end-volume
 12:
 13: volume brick
 14:   type performance/io-threads
 15:   option autoscaling on
 16:   subvolumes locks
 17: end-volume
 18:
 19: volume server
 20:   type protocol/server
 21:   option transport-type tcp
 22:   option auth.addr.brick.allow 127.0.0.1,192.168.1.*
 23:   subvolumes brick
 24: end-volume

+------------------------------------------------------------------------------+
[2009-05-27 16:38:20] N [glusterfsd.c:1152:main] glusterfs: Successfully 
started
[2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume] server: 
accepted client from 192.168.1.233:1021
[2009-05-27 16:38:33] N [server-protocol.c:7035:mop_setvolume] server: 
accepted client from 192.168.1.233:1020
[2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume] server: 
accepted client from 192.168.1.252:1021
[2009-05-27 16:38:46] N [server-protocol.c:7035:mop_setvolume] server: 
accepted client from 192.168.1.252:1020

================================================================================
Version      : glusterfs 2.0.1 built on May 27 2009 16:04:01
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-05-27 16:38:46
Command line : /usr/sbin/glusterfs -N -f 
/etc/glusterfs/glusterfs-client.vol /mnt/gluster
PID          : 32161
System name  : Linux
Nodename     : weeber.st-inst.lv
Kernel Release : 2.6.28-hardened-r7
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: volume xeon
  2:   type protocol/client
  3:   option transport-type tcp
  4:   option remote-host 192.168.1.233
  5:   option remote-subvolume brick
  6: end-volume
  7:
  8: volume weeber
  9:   type protocol/client
 10:   option transport-type tcp
 11:   option remote-host 192.168.1.252
 12:   option remote-subvolume brick
 13: end-volume
 14:
 15: volume replicate
 16:  type cluster/replicate
 17:  subvolumes xeon weeber
 18: end-volume
 19:
 20: volume readahead
 21:   type performance/read-ahead
 22:   option page-size 128kB
 23:   option page-count 16
 24:   option force-atime-update off
 25:   subvolumes replicate
 26: end-volume
 27:
 28: volume writebehind
 29:   type performance/write-behind
 30:   option aggregate-size 1MB
 31:   option window-size 3MB
 32:   option flush-behind on
 33:   option enable-O_SYNC on
 34:   subvolumes readahead
 35: end-volume
 36:
 37: volume iothreads
 38:   type performance/io-threads
 39:   option autoscaling on
 40:   subvolumes writebehind
 41: end-volume
 42:
 43:
 44:
 45: #volume bricks
 46: #type cluster/distribute
 47:  #option lookup-unhashed yes
 48:  #option min-free-disk 20%
 49: # subvolumes weeber xeon
 50: #end-volume

+------------------------------------------------------------------------------+
[2009-05-27 16:38:46] W [xlator.c:555:validate_xlator_volume_options] 
writebehind: option 'window-size' is deprecated, preferred is 
'cache-size', continuing with correction
[2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid] 
writebehind: option 'aggregate-size' is not recognized
[2009-05-27 16:38:46] W [glusterfsd.c:455:_log_if_option_is_invalid] 
readahead: option 'page-size' is not recognized
[2009-05-27 16:38:46] N [glusterfsd.c:1152:main] glusterfs: Successfully 
started
[2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
[2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume 'xeon' 
came back up; going online.
[2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
xeon: Connected to 192.168.1.233:6996, attached to remote volume 'brick'.
[2009-05-27 16:38:46] N [afr.c:2190:notify] replicate: Subvolume 'xeon' 
came back up; going online.
[2009-05-27 16:38:46] N [client-protocol.c:5557:client_setvolume_cbk] 
weeber: Connected to 192.168.1.252:6996, attached to remote volume 'brick'.
[2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01. 
frame-timeout = 1800
[2009-05-27 19:16:09] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:46:02. 
frame-timeout = 1800
[2009-05-27 19:46:18] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPEN(12) frame sent = 2009-05-27 19:16:09. 
frame-timeout = 1800
[2009-05-27 20:16:25] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 19:46:18. 
frame-timeout = 1800
[2009-05-27 20:46:34] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 20:16:25. 
frame-timeout = 1800
[2009-05-27 21:16:41] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPEN(12) frame sent = 2009-05-27 20:46:34. 
frame-timeout = 1800
[2009-05-27 21:47:00] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:16:53. 
frame-timeout = 1800
[2009-05-27 22:17:07] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 21:47:00. 
frame-timeout = 1800
[2009-05-27 22:47:15] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPENDIR(21) frame sent = 2009-05-27 22:17:07. 
frame-timeout = 1800
[2009-05-27 23:17:23] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 22:47:15. 
frame-timeout = 1800
[2009-05-27 23:47:31] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPEN(12) frame sent = 2009-05-27 23:17:23. 
frame-timeout = 1800
[2009-05-28 00:17:39] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-27 23:47:32. 
frame-timeout = 1800
[2009-05-28 00:47:47] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 00:17:39. 
frame-timeout = 1800
[2009-05-28 01:17:55] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPENDIR(21) frame sent = 2009-05-28 00:47:47. 
frame-timeout = 1800
[2009-05-28 01:48:03] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 01:17:55. 
frame-timeout = 1800
[2009-05-28 02:18:11] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPEN(12) frame sent = 2009-05-28 01:48:03. 
frame-timeout = 1800
[2009-05-28 02:48:29] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:18:24. 
frame-timeout = 1800
[2009-05-28 03:18:37] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 02:48:29. 
frame-timeout = 1800
[2009-05-28 03:48:45] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 03:18:37. 
frame-timeout = 1800
[2009-05-28 04:18:53] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame XATTROP(40) frame sent = 2009-05-28 03:48:45. 
frame-timeout = 1800
[2009-05-28 04:49:01] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 04:18:53. 
frame-timeout = 1800
[2009-05-28 05:19:09] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame OPENDIR(21) frame sent = 2009-05-28 04:49:01. 
frame-timeout = 1800
[2009-05-28 05:49:17] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:19:09. 
frame-timeout = 1800
[2009-05-28 06:19:25] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 05:49:17. 
frame-timeout = 1800
[2009-05-28 06:49:33] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame XATTROP(40) frame sent = 2009-05-28 06:19:25. 
frame-timeout = 1800
[2009-05-28 07:19:40] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 06:49:33. 
frame-timeout = 1800
[2009-05-28 07:49:48] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:19:40. 
frame-timeout = 1800
[2009-05-28 08:19:56] E [client-protocol.c:292:call_bail] weeber: 
bailing out frame LOOKUP(32) frame sent = 2009-05-28 07:49:48. 
frame-timeout = 1800
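
The call_bail entries above can be tallied with a short script. This is a
minimal sketch, assuming the log format shown above (including lines
wrapped by the mail client). The names BAIL_RE, summarize_bails and SAMPLE
are hypothetical helpers for illustration, not part of GlusterFS.

```python
import re
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
STAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"

# Matches one call_bail entry; \s+ tolerates the line wrapping seen above.
BAIL_RE = re.compile(
    r"\[(?P<logged>" + STAMP + r")\]\s+E\s+"
    r"\[client-protocol\.c:\d+:call_bail\]\s+(?P<volume>\S+):\s+"
    r"bailing\s+out\s+frame\s+(?P<op>\w+)\(\d+\)\s+"
    r"frame\s+sent\s+=\s+(?P<sent>" + STAMP + r")\.\s+"
    r"frame-timeout\s+=\s+(?P<timeout>\d+)"
)

# Two entries copied from the client log above.
SAMPLE = """\
[2009-05-27 18:46:02] E [client-protocol.c:292:call_bail] weeber:
bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:16:01.
frame-timeout = 1800
[2009-05-27 19:16:09] E [client-protocol.c:292:call_bail] weeber:
bailing out frame LOOKUP(32) frame sent = 2009-05-27 18:46:02.
frame-timeout = 1800
"""


def summarize_bails(log_text):
    """Return one dict per call_bail entry found in log_text."""
    entries = []
    for match in BAIL_RE.finditer(log_text):
        logged = datetime.strptime(match.group("logged"), TIME_FORMAT)
        sent = datetime.strptime(match.group("sent"), TIME_FORMAT)
        entries.append({
            "volume": match.group("volume"),
            "op": match.group("op"),
            # Seconds between sending the frame and giving up on it.
            "waited": int((logged - sent).total_seconds()),
            "timeout": int(match.group("timeout")),
        })
    return entries


if __name__ == "__main__":
    bails = summarize_bails(SAMPLE)
    # Count bailed operations per (subvolume, operation) pair.
    print(Counter((b["volume"], b["op"]) for b in bails))
    for b in bails:
        print(b["volume"], b["op"], b["waited"], "s waited,",
              "timeout", b["timeout"])
```

To run it against a real log, read the file contents into a string and pass
that to summarize_bails() in place of SAMPLE. Grouping by subvolume shows
whether every bailed frame belongs to the same subvolume, as it does in the
log above, where all of them are on 'weeber'.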
