[Gluster-users] fd cleanup and Bad file descriptor
samuel
samu60 at gmail.com
Tue Sep 13 14:17:02 UTC 2011
Hi folks,
We're evaluating GlusterFS (3.2.2, installed from source) as the storage
solution for OpenNebula (the open-source cloud platform, version 2.2, also
installed from source) and have hit a weird issue where some files get
corrupted. During the tests we disconnected the subnet where the GlusterFS
servers run (2 servers) from the subnets of the clients (4 nodes).
Apologies if this mail is too long, but I'm trying to include as much
information as possible. I may still have left out something important, so if
anyone can shed some light, I'll happily provide more details.
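For reference, the layout is roughly as follows (only a sketch: the volume
name "cloud" and the mount point /srv/cloud/one are taken from the logs
further down, while the hostname store1 and the brick path /export/brick are
written from memory and may not match exactly):

    # on one of the two storage servers: two-brick replicated volume
    # (store1 and /export/brick are assumptions; store2 appears in the logs)
    gluster volume create cloud replica 2 store1:/export/brick store2:/export/brick
    gluster volume start cloud

    # on each of the 4 OpenNebula nodes: FUSE mount used as the datastore
    mount -t glusterfs store1:/cloud /srv/cloud/one

The disk.0 files that show up in the logs below are the qemu disk images that
OpenNebula keeps under that mount.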
Several disconnection messages appeared in the logs:
client 1:
[2011-09-12 16:20:39.861342] W [socket.c:1494:__socket_proto_state_machine]
0-glusterfs: reading from socket failed. Error (Connection reset by peer),
peer (A.B.C31:24007)
[2011-09-12 16:22:48.101327] W [socket.c:204:__socket_rwv] 0-glusterfs:
readv failed (Connection timed out)
[2011-09-12 16:22:48.101405] W [socket.c:1494:__socket_proto_state_machine]
0-glusterfs: reading from socket failed. Error (Connection timed out), peer
(A.B.C31:24007)
[2011-09-12 16:23:09.938010] C
[client-handshake.c:121:rpc_client_ping_timer_expired] 0-cloud-client-1:
server A.B.C32:24010 has not responded in the last 42 seconds,
disconnecting.
[2011-09-12 16:23:09.938196] C
[client-handshake.c:121:rpc_client_ping_timer_expired] 0-cloud-client-0:
server A.B.C31:24010 has not responded in the last 42 seconds,
disconnecting.
server 1:
[2011-09-12 16:19:26.658858] W [socket.c:204:__socket_rwv]
0-socket.management: readv failed (Connection timed out)
[2011-09-12 16:19:26.688463] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Connection timed
out), peer (A.B.C21:1023)
[2011-09-12 16:19:26.835551] W [socket.c:204:__socket_rwv] 0-management:
readv failed (Connection timed out)
[2011-09-12 16:19:26.835586] W [socket.c:1494:__socket_proto_state_machine]
0-management: reading from socket failed. Error (Connection timed out), peer
(A.B.C32:24007)
[2011-09-12 16:19:27.255418] W [socket.c:204:__socket_rwv]
0-socket.management: readv failed (Connection timed out)
[2011-09-12 16:19:27.255454] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Connection timed
out), peer (A.B.C32:1023)
Then the two gluster servers see each other again:
server 1:
[2011-09-12 16:19:40.34860] I
[glusterd-handler.c:2797:glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: 91a43282-2a05-4e7d-b4eb-3fbd3cef0055
[2011-09-12 16:19:49.775930] I
[glusterd-handshake.c:317:glusterd_set_clnt_mgmt_program] 0-: Using Program
glusterd clnt mgmt, Num (1238433), Version (1)
[2011-09-12 16:19:49.776026] I
[glusterd-handler.c:3567:glusterd_xfer_friend_add_resp] 0-glusterd:
Responded to A.B.C31 (0), ret: 0
[2011-09-12 16:19:49.811642] I
[glusterd-sm.c:492:glusterd_ac_send_friend_update] 0-: Added uuid:
91a43282-2a05-4e7d-b4eb-3fbd3cef0055, host: store2
[2011-09-12 16:19:49.811878] I
[glusterd-rpc-ops.c:523:glusterd3_1_friend_add_cbk] 0-glusterd: Received ACC
from uuid: 91a43282-2a05-4e7d-b4eb-3fbd3cef0055, host: store2, port: 0
[2011-09-12 16:19:49.812122] I
[glusterd-handler.c:2944:glusterd_handle_friend_update] 0-glusterd: Received
friend update from uuid: 91a43282-2a05-4e7d-b4eb-3fbd3cef0055
[2011-09-12 16:19:49.812156] I
[glusterd-handler.c:2989:glusterd_handle_friend_update] 0-: Received uuid:
ede0b9eb-c93c-40f2-a065-0f04a541b784, hostname:A.B.C31
[2011-09-12 16:19:49.812171] I
[glusterd-handler.c:2992:glusterd_handle_friend_update] 0-: Received my uuid
as Friend
[2011-09-12 16:19:49.812234] I
[glusterd-rpc-ops.c:705:glusterd3_1_friend_update_cbk] 0-glusterd: Received
ACC from uuid:
server 2:
[2011-09-12 16:22:13.252695] I
[glusterd-handshake.c:317:glusterd_set_clnt_mgmt_program] 0-: Using Program
glusterd clnt mgmt, Num (1238433), Version (1)
[2011-09-12 16:22:13.264336] I
[glusterd-rpc-ops.c:523:glusterd3_1_friend_add_cbk] 0-glusterd: Received ACC
from uuid: ede0b9eb-c93c-40f2-a065-0f04a541b784, host: A.B.C31, port: 0
[2011-09-12 16:22:13.264635] I
[glusterd-handler.c:2797:glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: ede0b9eb-c93c-40f2-a065-0f04a541b784
[2011-09-12 16:22:13.264708] I
[glusterd-handler.c:3567:glusterd_xfer_friend_add_resp] 0-glusterd:
Responded to store2 (0), ret: 0
[2011-09-12 16:22:13.264855] I
[glusterd-sm.c:492:glusterd_ac_send_friend_update] 0-: Added uuid:
ede0b9eb-c93c-40f2-a065-0f04a541b784, host: A.B.C31
[2011-09-12 16:22:13.265151] I
[glusterd-handler.c:2944:glusterd_handle_friend_update] 0-glusterd: Received
friend update from uuid: ede0b9eb-c93c-40f2-a065-0f04a541b784
[2011-09-12 16:22:13.265184] I
[glusterd-handler.c:2989:glusterd_handle_friend_update] 0-: Received uuid:
91a43282-2a05-4e7d-b4eb-3fbd3cef0055, hostname:store2
[2011-09-12 16:22:13.265197] I
[glusterd-handler.c:2992:glusterd_handle_friend_update] 0-: Received my uuid
as Friend
Then, on client 4, the "problematic" issue starts to appear (File descriptor
in bad state):
client 4:
[2011-09-12 16:23:09.985706] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x6d) [0x7f20c5ba750d]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f20c5ba52dd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f20c5ba523e]))) 0-cloud-client-0: forced unwinding frame type(GlusterFS
3.1) op(LOOKUP(27)) called at 2011-09-12 16:22:22.945095
[2011-09-12 16:23:09.985770] W [fuse-bridge.c:184:fuse_entry_cbk]
0-glusterfs-fuse: 5657287: LOOKUP() /one/.pam_environment => -1 (Transport
endpoint is not connected)
[2011-09-12 16:23:09.985853] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x6d) [0x7f20c5ba750d]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f20c5ba52dd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f20c5ba523e]))) 0-cloud-client-0: forced unwinding frame type(GlusterFS
3.1) op(FINODELK(30)) called at 2011-09-12 16:22:24.690197
[2011-09-12 16:23:09.986420] W [rpc-clnt.c:1411:rpc_clnt_submit]
0-cloud-client-0: failed to submit rpc-request (XID: 0x7926489x Program:
GlusterFS 3.1, ProgVers: 310, Proc: 30) to rpc-transport (cloud-client-0)
[2011-09-12 16:23:09.986463] W [client3_1-fops.c:4868:client3_1_finodelk]
0-cloud-client-1: (2760): failed to get fd ctx. EBADFD
[2011-09-12 16:23:09.986480] W [client3_1-fops.c:4915:client3_1_finodelk]
0-cloud-client-1: failed to send the fop: File descriptor in bad state
and later on:
[2011-09-12 16:23:09.993817] W [rpc-clnt.c:1411:rpc_clnt_submit]
0-cloud-client-0: failed to submit rpc-request (XID: 0x7926670x Program:
GlusterFS 3.1, ProgVers: 310, Proc: 30) to rpc-transport (cloud-client-0)
[2011-09-12 16:23:09.993837] W [rpc-clnt.c:1411:rpc_clnt_submit]
0-cloud-client-0: failed to submit rpc-request (XID: 0x7926671x Program:
GlusterFS 3.1, ProgVers: 310, Proc: 30) to rpc-transport (cloud-client-0)
[2011-09-12 16:23:09.993970] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x6d) [0x7f20c5ba750d]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f20c5ba52dd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f20c5ba523e]))) 0-cloud-client-0: forced unwinding frame type(GlusterFS
Handshake) op(PING(3)) called at 2011-09-12 16:22:27.931756
[2011-09-12 16:23:09.993992] W [client-handshake.c:264:client_ping_cbk]
0-cloud-client-0: timer must have expired
[2011-09-12 16:23:09.994013] E [afr-common.c:2634:afr_notify]
0-cloud-replicate-0: All subvolumes are down. Going offline until atleast
one of them comes back up.
[2011-09-12 16:23:10.577233] W [fuse-bridge.c:1825:fuse_writev_cbk]
0-glusterfs-fuse: 5657529: WRITE => -1 (Transport endpoint is not connected)
[2011-09-12 16:23:11.577562] W [afr-open.c:624:afr_openfd_flush]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f16174
(/one/var/30/images/disk.0)
[2011-09-12 16:23:11.607857] W [dict.c:418:dict_unref]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f20c5ba7145]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/protocol/client.so(client3_1_fstat_cbk+0x31b)
[0x7f20c292ee3b]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x164)
[0x7f20c26d8284]))) 0-dict: dict is NULL
[2011-09-12 16:23:11.608416] W [afr-common.c:122:afr_set_split_brain]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_flush_cbk+0x79)
[0x7f20c26d7239]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_done+0x4e)
[0x7f20c26d696e]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x271)
[0x7f20c26dd171]))) 0-cloud-replicate-0: invalid argument: inode
[2011-09-12 16:23:11.608446] W [afr-open.c:326:afr_openfd_sh_unwind]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f16174
(/one/var/30/images/disk.0)
[2011-09-12 16:23:14.180613] W [fuse-bridge.c:1825:fuse_writev_cbk]
0-glusterfs-fuse: 5657804: WRITE => -1 (Transport endpoint is not connected)
[2011-09-12 16:23:14.184627] W [afr-open.c:624:afr_openfd_flush]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f160cc
(/one/var/25/images/disk.0)
[2011-09-12 16:23:14.186908] W [dict.c:418:dict_unref]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f20c5ba7145]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/protocol/client.so(client3_1_fstat_cbk+0x31b)
[0x7f20c292ee3b]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x164)
[0x7f20c26d8284]))) 0-dict: dict is NULL
[2011-09-12 16:23:14.187411] W [afr-common.c:122:afr_set_split_brain]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_flush_cbk+0x79)
[0x7f20c26d7239]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_done+0x4e)
[0x7f20c26d696e]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x271)
[0x7f20c26dd171]))) 0-cloud-replicate-0: invalid argument: inode
[2011-09-12 16:23:14.187447] W [afr-open.c:326:afr_openfd_sh_unwind]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f160cc
(/one/var/25/images/disk.0)
[2011-09-12 16:23:21.10661] W [afr-open.c:624:afr_openfd_flush]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f160cc
(/one/var/25/images/disk.0)
[2011-09-12 16:23:21.544266] W [afr-open.c:624:afr_openfd_flush]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f16174
(/one/var/30/images/disk.0)
[2011-09-12 16:53:31.252358] E [rpc-clnt.c:197:call_bail] 0-cloud-client-0:
bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x7927047x sent
= 2011-09-12 16:23:21.545077. timeout = 1800
[2011-09-12 16:53:31.252463] E [rpc-clnt.c:197:call_bail] 0-cloud-client-0:
bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x7927028x sent
= 2011-09-12 16:23:21.11527. timeout = 1800
[2011-09-12 16:53:31.255321] W [dict.c:418:dict_unref]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f20c5ba7145]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/protocol/client.so(client3_1_fstat_cbk+0x31b)
[0x7f20c292ee3b]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x164)
[0x7f20c26d8284]))) 0-dict: dict is NULL
[2011-09-12 16:53:31.255463] W [dict.c:418:dict_unref]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f20c5ba7145]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/protocol/client.so(client3_1_fstat_cbk+0x31b)
[0x7f20c292ee3b]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x164)
[0x7f20c26d8284]))) 0-dict: dict is NULL
[2011-09-12 16:53:31.255875] W [afr-common.c:122:afr_set_split_brain]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_flush_cbk+0x79)
[0x7f20c26d7239]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_done+0x4e)
[0x7f20c26d696e]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x271)
[0x7f20c26dd171]))) 0-cloud-replicate-0: invalid argument: inode
[2011-09-12 16:53:31.255909] W [afr-open.c:326:afr_openfd_sh_unwind]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f16174
(/one/var/30/images/disk.0)
[2011-09-12 16:53:31.256041] W [afr-common.c:122:afr_set_split_brain]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_flush_cbk+0x79)
[0x7f20c26d7239]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_sh_data_done+0x4e)
[0x7f20c26d696e]
(-->/usr/local/lib/glusterfs/3.2.2/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x271)
[0x7f20c26dd171]))) 0-cloud-replicate-0: invalid argument: inode
[2011-09-12 16:53:31.256085] W [afr-open.c:326:afr_openfd_sh_unwind]
0-cloud-replicate-0: fd not open on any subvolume 0x7f20c0f160cc
(/one/var/25/images/disk.0)
And on the server side:
[2011-09-12 16:21:47.835521] W [socket.c:1494:__socket_proto_state_machine]
0-tcp.cloud-server: reading from socket failed. Error (Connection timed
out), peer (A.B.C21:1021)
[2011-09-12 16:21:47.835563] I [server-helpers.c:485:do_fd_cleanup]
0-cloud-server: fd cleanup on /one/var/46/images/disk.0
[2011-09-12 16:22:36.486483] I [server-helpers.c:343:do_lock_table_cleanup]
0-cloud-server: finodelk released on /one/var/30/images/disk.0
[2011-09-12 16:22:36.486507] I [server-helpers.c:343:do_lock_table_cleanup]
0-cloud-server: finodelk released on /one/var/25/images/disk.0
[2011-09-12 16:22:36.497377] I [server-helpers.c:485:do_fd_cleanup]
0-cloud-server: fd cleanup on /one/var/25/images/disk.0
[2011-09-12 16:22:36.497413] I [server-helpers.c:485:do_fd_cleanup]
0-cloud-server: fd cleanup on /one/var/30/images/disk.0
What exactly are these "fd cleanup" messages?
And this morning there are still side effects:
[2011-09-13 13:18:59.205656] E [afr-self-heal-data.c:637:afr_sh_data_fix]
0-cloud-replicate-0: Unable to self-heal contents of
'/one/var/25/images/disk.0' (possible split-brain). Please delete the file
from all but the preferred subvolume.
[2011-09-13 13:18:59.206990] W [afr-open.c:168:afr_open]
0-cloud-replicate-0: failed to open as split brain seen, returning EIO
[2011-09-13 13:18:59.207019] W [fuse-bridge.c:585:fuse_fd_cbk]
0-glusterfs-fuse: 62384: OPEN() /one/var/25/images/disk.0 => -1
(Input/output error)
From the application layer, the following errors appear:
Sep 13 13:18:58 node4 libvirtd: 13:18:58.417: 1111: error :
qemudDomainSaveImageClose:5668 : cannot close file: Bad file descriptor
Sep 13 13:18:59 node4 libvirtd: 13:18:59.207: 1113: error :
virDomainDiskDefForeachPath:9010 : unable to open disk path
/srv/cloud/one/var//25/images/disk.0: Input/output error
Sep 13 13:18:59 node4 libvirtd: 13:18:59.213: 1113: error :
qemudDomainSaveImageClose:5668 : cannot close file: Bad file descriptor
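To recover the affected images, is the right procedure roughly the following
(just a sketch based on the "delete the file from all but the preferred
subvolume" message above; store2 as the server holding the bad copy and the
brick path /export/brick are assumptions)?

    # on the server holding the bad copy, remove the file directly from the
    # brick backend (hypothetical brick path, adjust to the real one)
    ssh store2 rm /export/brick/one/var/25/images/disk.0

    # then trigger self-heal from a client by looking the file up on the
    # FUSE mount
    stat /srv/cloud/one/var/25/images/disk.0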
Is there any configuration option to prevent this problem in the first place?
Are there any enhancements in Gluster 3.3 that would keep the "split brain"
issue from happening again?
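For what it's worth, the 42-second and 1800-second values in the logs look
like the defaults for network.ping-timeout and network.frame-timeout. Would
tuning them make the clients fail over any more cleanly, for example
(untested, example values only):

    gluster volume set cloud network.ping-timeout 20
    gluster volume set cloud network.frame-timeout 600

or is the split brain simply unavoidable once the two replicas diverge during
the partition?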
Thank you very much in advance for any link or information,
Samuel.