[Gluster-devel] Could be the bug of Glusterfs? The file system is unstable and hang

Alpha Electronics myitouchs at gmail.com
Mon Jun 1 14:17:29 UTC 2009


Thanks for looking into this. We do use io-threads. Here is the server
config:
volume brick1-posix
  type storage/posix
  option directory /mnt/brick1
end-volume

volume brick2-posix
  type storage/posix
  option directory /mnt/brick2
end-volume

volume brick1-locks
  type features/locks
  subvolumes brick1-posix
end-volume

volume brick2-locks
  type features/locks
  subvolumes brick2-posix
end-volume

volume brick1
  type performance/io-threads
  option min-threads 16
  option autoscaling on
  subvolumes brick1-locks
end-volume

volume brick2
  type performance/io-threads
  option min-threads 16
  option autoscaling on
  subvolumes brick2-locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option auth.addr.brick2.allow *
  subvolumes brick1 brick2
end-volume
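Per the advice later in this thread to avoid autoscaling in io-threads for now, a fixed-pool variant of one brick might look like the sketch below. The thread-count option and the listen-port line (added only because the clients connect on the non-default port 7777) are assumptions on my part, not tested settings:

volume brick1
  type performance/io-threads
  option thread-count 16          # fixed pool instead of autoscaling (assumed option)
  subvolumes brick1-locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option listen-port 7777         # assumed, to match the clients' remote-port
  option auth.addr.brick1.allow *
  subvolumes brick1
end-volume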



On Sun, May 31, 2009 at 11:44 PM, Shehjar Tikoo <shehjart at gluster.com> wrote:

> Alpha Electronics wrote:
>
>> We are testing GlusterFS before recommending it to enterprise
>> clients. We found that the file system always hangs after running for about
>> two days. After killing the server-side process and restarting it, everything
>> goes back to normal.
>>
>>
> What is the server config?
> If you're not using io-threads on the server, I suggest you do,
> because it does basic load-balancing to avoid timeouts.
>
> Also, avoid using autoscaling in io-threads for now.
>
> -Shehjar
>
>
>> Here are the specs and the errors logged:
>> GlusterFS version:  v2.0.1
>>
>> Client volume:
>> volume brick_1
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-port 7777 # Non-default port
>>  option remote-host server1
>>  option remote-subvolume brick
>> end-volume
>>
>> volume brick_2
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-port 7777 # Non-default port
>>  option remote-host server2
>>  option remote-subvolume brick
>> end-volume
>>
>> volume bricks
>>  type cluster/distribute
>>  subvolumes brick_1 brick_2
>> end-volume
>>
>> Errors logged on the client side in /var/log/glusterfs.log:
>> [2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing
>> out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800
>> [2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk]
>> glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected)
>> Errors logged on the server:
>> [2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing
>> out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800
>> [2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk]
>> glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected)
>>
>> Error messages logged on the server side about an hour later, in
>> /var/log/messages:
>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>> lib/util_sock.c:write_data(564)
>> May 29 16:04:16 server2 winbindd[3649]:   write_data: write failure. Error
>> = Connection reset by peer
>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>> libsmb/clientgen.c:write_socket(158)
>> May 29 16:04:16 server2 winbindd[3649]:   write_socket: Error writing 104
>> bytes to socket 18: ERRNO = Connection reset by peer
>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>> libsmb/clientgen.c:cli_send_smb(188)
>> May 29 16:04:16 server2 winbindd[3649]:   Error writing 104 bytes to
>> client. -1 (Connection reset by peer)
>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>> libsmb/cliconnect.c:cli_session_setup_spnego(859)
>> May 29 16:04:16 server2 winbindd[3649]:   Kinit failed: Cannot contact any
>> KDC for requested realm
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at nongnu.org
>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>
>

