[Gluster-devel] Could this be a bug in GlusterFS? The file system is unstable and hangs
Rodrigo Azevedo
rodrigoams at gmail.com
Thu Jun 4 13:04:11 UTC 2009
I am trying a workaround on the clients:
volume pnc4
type protocol/client
option transport-type tcp
option remote-host teoria4
option frame-timeout 180000
option ping-timeout 1
option remote-subvolume dados
end-volume
....
volume replicate
type cluster/replicate
subvolumes teoria3 teoria4
end-volume
On the server side, I avoid autoscaling in io-threads.
This way the "bailing out frame" error disappeared and the system is stable.
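For reference, the server-side volume then looks roughly like this (a sketch
only: "dados" matches the remote-subvolume in my client config above,
/mnt/dados is a hypothetical export path, and I assume the fixed thread-count
option of io-threads in place of autoscaling):

volume dados-posix
type storage/posix
option directory /mnt/dados       # hypothetical export path
end-volume

volume dados-locks
type features/locks
subvolumes dados-posix
end-volume

volume dados
type performance/io-threads
option thread-count 16            # fixed count; "option autoscaling on" deliberately omitted
subvolumes dados-locks
end-volume

volume server
type protocol/server
option transport-type tcp
option auth.addr.dados.allow *
subvolumes dados
end-volume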
2009/6/3 Alpha Electronics <myitouchs at gmail.com>:
> We applied the patch mentioned in the thread and used a fixed thread count in the
> server config. Unfortunately, we got the same error:
>
> [2009-06-03 04:57:36] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse:
> 22347008: ERR => -1 (Resource temporarily unavailable)
> [2009-06-03 07:55:04] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse:
> 23431094: ERR => -1 (Resource temporarily unavailable)
> [2009-06-03 15:58:25] E [client-protocol.c:292:call_bail] brick1: bailing
> out frame LOOKUP(32) frame sent = 2009-06-03 15:28:23. frame-timeout = 1800
>
> John
>
>
> On Tue, Jun 2, 2009 at 12:25 AM, Shehjar Tikoo <shehjart at gluster.com> wrote:
>>
>> Hi
>>
>> >
>> > Also, avoid using autoscaling in io-threads for now.
>> >
>> > -Shehjar
>>
>> -Shehjar
>>
>> Alpha Electronics wrote:
>>>
>>> Thanks for looking into this. We do use io-threads. Here is the server
>>> config:
>>> volume brick1-posix
>>> type storage/posix
>>> option directory /mnt/brick1
>>> end-volume
>>>
>>> volume brick2-posix
>>> type storage/posix
>>> option directory /mnt/brick2
>>> end-volume
>>>
>>> volume brick1-locks
>>> type features/locks
>>> subvolumes brick1-posix
>>> end-volume
>>>
>>> volume brick2-locks
>>> type features/locks
>>> subvolumes brick2-posix
>>> end-volume
>>>
>>> volume brick1
>>> type performance/io-threads
>>> option min-threads 16
>>> option autoscaling on
>>> subvolumes brick1-locks
>>> end-volume
>>>
>>> volume brick2
>>> type performance/io-threads
>>> option min-threads 16
>>> option autoscaling on
>>> subvolumes brick2-locks
>>> end-volume
>>>
>>> volume server
>>> type protocol/server
>>> option transport-type tcp
>>> option auth.addr.brick1.allow *
>>> option auth.addr.brick2.allow *
>>> subvolumes brick1 brick2
>>> end-volume
>>>
>>>
>>>
>>> On Sun, May 31, 2009 at 11:44 PM, Shehjar Tikoo <shehjart at gluster.com> wrote:
>>>
>>> Alpha Electronics wrote:
>>>
>>> We are testing GlusterFS before recommending it to
>>> enterprise clients. We found that the file system always hangs
>>> after running for about 2 days. After killing the server-side
>>> process and restarting it, everything goes back to normal.
>>>
>>>
>>> What is the server config?
>>> If you're not using io-threads on the server, I suggest you do,
>>> because it does basic load-balancing to avoid timeouts.
>>>
>>> Also, avoid using autoscaling in io-threads for now.
>>>
>>> -Shehjar
>>>
>>>
>>> Here is the spec and error logged:
>>> GlusterFS version: v2.0.1
>>>
>>> Client volume:
>>> volume brick_1
>>> type protocol/client
>>> option transport-type tcp/client
>>> option remote-port 7777 # Non-default port
>>> option remote-host server1
>>> option remote-subvolume brick
>>> end-volume
>>>
>>> volume brick_2
>>> type protocol/client
>>> option transport-type tcp/client
>>> option remote-port 7777 # Non-default port
>>> option remote-host server2
>>> option remote-subvolume brick
>>> end-volume
>>>
>>> volume bricks
>>> type cluster/distribute
>>> subvolumes brick_1 brick_2
>>> end-volume
>>>
>>> Errors logged on the client side in /var/log/glusterfs.log:
>>> [2009-05-29 14:58:55] E [client-protocol.c:292:call_bail]
>>> brick_1: bailing out frame LK(28) frame sent = 2009-05-29
>>> 14:28:54. frame-timeout = 1800
>>> [2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk]
>>> glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not
>>> connected)
>>> Errors logged on the server:
>>> [2009-05-29 14:59:15] E [client-protocol.c:292:call_bail]
>>> brick_2: bailing out frame LK(28) frame sent = 2009-05-29
>>> 14:29:05. frame-timeout = 1800
>>> [2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk]
>>> glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not
>>> connected)
>>>
>>> There are error messages logged on the server side after 1 hour in
>>> /var/log/messages:
>>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>>> lib/util_sock.c:write_data(564)
>>> May 29 16:04:16 server2 winbindd[3649]: write_data: write
>>> failure. Error = Connection reset by peer
>>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>>> libsmb/clientgen.c:write_socket(158)
>>> May 29 16:04:16 server2 winbindd[3649]: write_socket: Error
>>> writing 104 bytes to socket 18: ERRNO = Connection reset by peer
>>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>>> libsmb/clientgen.c:cli_send_smb(188)
>>> May 29 16:04:16 server2 winbindd[3649]: Error writing 104
>>> bytes to client. -1 (Connection reset by peer)
>>> May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0]
>>> libsmb/cliconnect.c:cli_session_setup_spnego(859)
>>> May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot
>>> contact any KDC for requested realm
>>>
>>>
>>>
--
Rodrigo Azevedo Moreira da Silva
Departamento de Física
Universidade Federal de Pernambuco
http://www.df.ufpe.br