[Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang

jvanwanrooy at chatventure.nl jvanwanrooy at chatventure.nl
Sat May 30 19:18:16 UTC 2009


Hi, 

We just ran into a problem with the same result: a client crash. We use two storage bricks, with replication configured on the client side. When we stopped the first storage brick, the client crashed.
Please take a look at my log below. Does anyone know what causes this?

Best regards, Jasper


[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK) 
[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK) 
[2009-05-30 21:02:22] D [socket.c:1229:socket_submit] brick1: not connected (priv->connected = 255) 
[2009-05-30 21:02:22] N [client-protocol.c:6248:notify] brick1: disconnected 
[2009-05-30 21:02:22] E [socket.c:744:socket_connect_finish] brick1: connection to 172.23.120.210:6996 failed (Connection refused) 
pending frames: 
frame : type(1) op(READ) 

patchset: 5c1d9108c1529a1155963cb1911f8870a674ab5b 
signal received: 11 
configuration details:argp 1 
backtrace 1 
db.h 1 
dlfcn 1 
fdatasync 1 
libpthread 1 
llistxattr 1 
setfsid 1 
spinlock 1 
epoll.h 1 
xattr.h 1 
st_atim.tv_nsec 1 
package-string: glusterfs 2.0.1 
/lib64/libc.so.6[0x381e830280] 
/usr/lib64/glusterfs/2.0.1/xlator/performance/read-ahead.so(ra_readv+0x58)[0x2b8c0d2a1888] 
/usr/lib64/glusterfs/2.0.1/xlator/cluster/replicate.so(afr_readv+0x173)[0x2b8c0d4b8203] 
/usr/lib64/libfuse.so.2[0x2b8c0d8fdf39] 
/usr/lib64/glusterfs/2.0.1/xlator/mount/fuse.so[0x2b8c0d6deffd] 
/lib64/libpthread.so.0[0x381f806367] 
/lib64/libc.so.6(clone+0x6d)[0x381e8d2f7d] 
--------- 
[2009-05-30 21:02:23] N [client-protocol.c:6248:notify] brick1: disconnected 
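
For anyone comparing dumps like the one above, here is a minimal sketch in Python that pulls the signal number and the backtrace frames out of such a log. The names parse_crash_dump, FRAME_RE and SIGNAL_NAMES are illustrative assumptions and not part of GlusterFS, and the signal names assume Linux numbering (6 = SIGABRT, 11 = SIGSEGV). It only restates what the log already contains: signal 11, with ra_readv in read-ahead.so as the first frame that carries a symbol.

```python
import re

# Assumption: Linux signal numbering; only the two signals seen in this thread.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}

# Matches backtrace lines such as
#   /path/lib.so(symbol+0x58)[0x2b8c0d2a1888]
# The "(symbol+offset)" part is optional because some frames have no symbol.
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]*)"
    r"(?:\((?P<symbol>[^+)]+)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)
SIGNAL_RE = re.compile(r"^signal received: (\d+)$")


def parse_crash_dump(text):
    """Return (signal_number, frames) from a crash dump excerpt."""
    signal_number = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        m = SIGNAL_RE.match(line)
        if m:
            signal_number = int(m.group(1))
            continue
        m = FRAME_RE.match(line)
        if m:
            frames.append(m.groupdict())
    return signal_number, frames


# Excerpt copied from the log above.
SAMPLE = """\
signal received: 11
/lib64/libc.so.6[0x381e830280]
/usr/lib64/glusterfs/2.0.1/xlator/performance/read-ahead.so(ra_readv+0x58)[0x2b8c0d2a1888]
/usr/lib64/glusterfs/2.0.1/xlator/cluster/replicate.so(afr_readv+0x173)[0x2b8c0d4b8203]
"""

if __name__ == "__main__":
    signo, frames = parse_crash_dump(SAMPLE)
    print(signo, SIGNAL_NAMES.get(signo, "unknown"))
    for frame in frames:
        print(frame["module"], frame["symbol"], frame["offset"])
```

The same function should also accept the second dump quoted further down (signal 6, with frames in socket.so), since frames without a module or symbol are allowed by the pattern.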

Jasper van Wanrooy - Chatventure BV 
Technical Manager 
T: +31 (0) 6 47 248 722 
E: jvanwanrooy at chatventure.nl 
W: www.chatventure.nl 


----- Original Message ----- 
From: "Vahriç Muhtaryan" <vahric at doruk.net.tr> 
To: "Alpha Electronics" <myitouchs at gmail.com>, gluster-users at gluster.org 
Sent: Saturday, 30 May, 2009 10:43:12 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna 
Subject: Re: [Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang 

Hello, 



I installed the new version, as you did, and have been testing whether it behaves as it should. We have the same configuration, but I got a different error: I couldn't create a directory or file (it returned "Invalid argument"), and I saw that one of the servers logged the error below. Still testing ...



pending frames:
frame : type(1) op(WRITE)

patchset: 5c1d9108c1529a1155963cb1911f8870a674ab5b
signal received: 6
configuration details:argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.1
[0xfa9420]
/lib/libc.so.6(abort+0x101)[0x218691]
/lib/libc.so.6[0x24f24b]
/lib/libc.so.6[0x2570f1]
/lib/libc.so.6(cfree+0x90)[0x25abc0]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(__socket_reset+0x3e)[0xc8155e]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(socket_event_poll_err+0x3b)[0xc8303b]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(socket_event_handler+0x8b)[0xc833bb]
/usr/local/lib/libglusterfs.so.0[0x9820ca]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x21)[0x980fb1]
glusterfsd(main+0xdf3)[0x804b1a3]
/lib/libc.so.6(__libc_start_main+0xdc)[0x203e8c]
glusterfsd[0x8049911]
---------




From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Alpha Electronics 
Sent: Friday, May 29, 2009 10:32 PM 
To: gluster-users at gluster.org 
Subject: [Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang 




We are testing GlusterFS before recommending it to enterprise clients. We found that the file system always hangs after running for about 2 days. After killing the server-side process and restarting it, everything goes back to normal.

Here is the spec and error logged: 
GlusterFS version: v2.0.1 

Client volume: 
volume brick_1
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server1
  option remote-subvolume brick
end-volume

volume brick_2
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server2
  option remote-subvolume brick
end-volume

volume bricks
  type cluster/distribute
  subvolumes brick_1 brick_2
end-volume
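
When comparing setups in this thread, it can help to read the volume file mechanically. The following is a rough sketch in Python, not a GlusterFS tool: parse_volfile and undefined_subvolumes are hypothetical helper names, and the parser only understands the keywords that appear in the volume file above (volume, type, option, subvolumes, end-volume). It shows that the top-level volume here is cluster/distribute over brick_1 and brick_2, whereas the crash reported at the top of this thread involved a replicate setup.

```python
def parse_volfile(text):
    """Parse a volume file into {volume_name: {type, options, subvolumes}}."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        words = line.split()
        keyword = words[0]
        if keyword == "volume":
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif keyword == "end-volume":
            current = None
        elif current is None:
            raise ValueError(f"statement outside a volume block: {line!r}")
        elif keyword == "type":
            current["type"] = words[1]
        elif keyword == "option":
            current["options"][words[1]] = " ".join(words[2:])
        elif keyword == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def undefined_subvolumes(volumes):
    """Return (volume, subvolume) pairs whose subvolume is never defined."""
    return [
        (name, sub)
        for name, vol in volumes.items()
        for sub in vol["subvolumes"]
        if sub not in volumes
    ]


# The client volume file from the message above.
VOLFILE = """\
volume brick_1
type protocol/client
option transport-type tcp/client
option remote-port 7777 # Non-default port
option remote-host server1
option remote-subvolume brick
end-volume

volume brick_2
type protocol/client
option transport-type tcp/client
option remote-port 7777 # Non-default port
option remote-host server2
option remote-subvolume brick
end-volume

volume bricks
type cluster/distribute
subvolumes brick_1 brick_2
end-volume
"""

if __name__ == "__main__":
    vols = parse_volfile(VOLFILE)
    for name, vol in vols.items():
        print(name, vol["type"], vol["subvolumes"])
    print("undefined subvolumes:", undefined_subvolumes(vols))
```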

Errors logged on the client side in /var/log/glusterfs.log:
[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800 
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected) 
Errors logged on the server:
[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800 
[2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected) 
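
The call_bail lines above carry their own arithmetic: each records when the LK frame was sent and when it was abandoned, alongside the frame-timeout of 1800 seconds. As a small sketch (the function name parse_bail and the regular expression are assumptions written for this log format only), the waiting time can be computed and compared with the timeout in Python:

```python
import re
from datetime import datetime

# Pattern written against the two call_bail lines quoted above.
BAIL_RE = re.compile(
    r"^\[(?P<logged>[\d-]+ [\d:]+)\] E \[client-protocol\.c:\d+:call_bail\] "
    r"(?P<volume>\S+): bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>[\d-]+ [\d:]+)\. frame-timeout = (?P<timeout>\d+)"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_bail(line):
    """Return volume, operation, seconds waited and timeout, or None."""
    m = BAIL_RE.match(line.strip())
    if m is None:
        return None
    logged = datetime.strptime(m.group("logged"), TIME_FORMAT)
    sent = datetime.strptime(m.group("sent"), TIME_FORMAT)
    return {
        "volume": m.group("volume"),
        "op": m.group("op"),
        "waited": int((logged - sent).total_seconds()),
        "timeout": int(m.group("timeout")),
    }


# Lines copied from the log above.
LINES = [
    "[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: "
    "bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800",
    "[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: "
    "bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800",
]

if __name__ == "__main__":
    for line in LINES:
        info = parse_bail(line)
        print(info["volume"], info["op"], info["waited"], info["timeout"])
```

For these two lines the waits come out as 1801 and 1810 seconds, so both lock requests went unanswered for the full 30-minute timeout on brick_1 and brick_2 respectively before the client gave up on them.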

About an hour later, error messages were also logged on the server side in /var/log/messages:
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] lib/util_sock.c:write_data(564) 
May 29 16:04:16 server2 winbindd[3649]: write_data: write failure. Error = Connection reset by peer 
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:write_socket(158) 
May 29 16:04:16 server2 winbindd[3649]: write_socket: Error writing 104 bytes to socket 18: ERRNO = Connection reset by peer 
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:cli_send_smb(188) 
May 29 16:04:16 server2 winbindd[3649]: Error writing 104 bytes to client. -1 (Connection reset by peer) 
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/cliconnect.c:cli_session_setup_spnego(859) 
May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot contact any KDC for requested realm 

