[Gluster-devel] bug report: gluster blocks when server is down

Raghavendra G raghavendra.hg at gmail.com
Tue Aug 26 03:23:27 UTC 2008


Hi Markus,

You're right. glusterfs creates the thread that reads from /dev/fuse only
after it has established a connection with a server. Since no connection is
ever established in your second case, the thread is never created, and any
operation on the mount point hangs forever. I am not sure whether this can be
treated as a bug.
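
To make the difference between your two cases concrete, here is a toy model
of the behaviour described above. It is only a sketch in Python, not the
actual glusterfs C code: the names MountSketch, on_connect, on_disconnect and
lookup are made up for illustration, a queue stands in for /dev/fuse, and the
timeout exists only so that the example terminates instead of hanging.

```python
import errno
import queue
import threading


class MountSketch:
    """Toy model of the client side of a mount; not glusterfs code."""

    def __init__(self):
        self.requests = queue.Queue()  # stands in for /dev/fuse
        self.connected = False
        self.reader = None

    def on_connect(self):
        # The reader thread is started on the first successful connect only.
        self.connected = True
        if self.reader is None:
            self.reader = threading.Thread(target=self._read_loop, daemon=True)
            self.reader.start()

    def on_disconnect(self):
        # The reader thread keeps running after the connection is lost.
        self.connected = False

    def _read_loop(self):
        while True:
            reply = self.requests.get()
            # With the server gone, answer with an error instead of blocking.
            reply.put(0 if self.connected else -errno.ENOTCONN)

    def lookup(self, timeout=0.5):
        reply = queue.Queue(maxsize=1)
        self.requests.put(reply)
        try:
            return reply.get(timeout=timeout)
        except queue.Empty:
            # Nobody reads the request queue; a real caller would hang here.
            return None


# Case 2 of the report: the servers were never reachable.
never_connected = MountSketch()
hung = never_connected.lookup(timeout=0.2)

# Case 1 of the report: connected once, then all servers went down.
was_connected = MountSketch()
was_connected.on_connect()
was_connected.on_disconnect()
failed = was_connected.lookup()
```

In this model, hung ends up as None because nothing ever reads the request,
while failed carries the ENOTCONN error, which corresponds to the "Transport
endpoint is not connected" message in your first case.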

regards,

On Tue, Aug 26, 2008 at 5:33 AM, Markus Fenske <iblue at gmx.net> wrote:

> Hello,
>
> It seems like I have found a bug.
>
> I run 1.3.11, my configuration is:
> ---
> volume server1
>       type protocol/client
>       option transport-type tcp/client
>       option remote-host 10.13.37.101
>       option remote-subvolume brick1
> end-volume
>
> volume server2
>       type protocol/client
>       option transport-type tcp/client
>       option remote-host 10.13.37.102
>       option remote-subvolume brick2
> end-volume
>
> volume mirror0
>       type cluster/afr
>       subvolumes server1 server2
> end-volume
> ---
>
> When I am running one or two servers and mount the volume at, e.g.,
> /cluster, and then all servers go down, I instantly get:
> ---
> user@test2:~$ ls /cluster
> ls: cannot access /cluster: Transport endpoint is not connected
> ---
>
> The glusterfs.log says:
> ---
> 2008-08-26 02:07:44 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:07:44 W [client-protocol.c:332:client_protocol_xfer]
> server1: not connected at the moment to submit frame type(1) op(34)
> 2008-08-26 02:07:44 E [client-protocol.c:4430:client_lookup_cbk]
> server1: no proper reply from server, returning ENOTCONN
> 2008-08-26 02:07:44 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:07:44 W [client-protocol.c:332:client_protocol_xfer]
> server2: not connected at the moment to submit frame type(1) op(34)
> 2008-08-26 02:07:44 E [client-protocol.c:4430:client_lookup_cbk]
> server2: no proper reply from server, returning ENOTCONN
> 2008-08-26 02:07:44 E [fuse-bridge.c:468:fuse_entry_cbk]
> glusterfs-fuse: 12: (34) / => -1 (107)
> 2008-08-26 02:07:44 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:07:44 W [client-protocol.c:332:client_protocol_xfer]
> server1: not connected at the moment to submit frame type(1) op(34)
> 2008-08-26 02:07:44 E [client-protocol.c:4430:client_lookup_cbk]
> server1: no proper reply from server, returning ENOTCONN
> 2008-08-26 02:07:44 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:07:44 W [client-protocol.c:332:client_protocol_xfer]
> server2: not connected at the moment to submit frame type(1) op(34)
> 2008-08-26 02:07:44 E [client-protocol.c:4430:client_lookup_cbk]
> server2: no proper reply from server, returning ENOTCONN
> 2008-08-26 02:07:44 E [fuse-bridge.c:468:fuse_entry_cbk]
> glusterfs-fuse: 12: (34) / => -1 (107)
> 2008-08-26 02:07:51 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> ---
>
> But when I start the client while the servers are already down, I can wait
> forever for my ls to complete:
> ---
> user@test2:~$ ls /cluster
>
> ---
>
> With the following glusterfs.log:
> ---
> 2008-08-26 02:08:11 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:11 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:12 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:12 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:14 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:14 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:17 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:17 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:22 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:22 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:30 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:30 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:43 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:08:43 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:09:04 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:09:04 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:09:38 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:09:38 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:10:33 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:10:33 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:12:02 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:12:02 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:14:26 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:14:27 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:18:20 E [tcp-client.c:190:tcp_connect] server2:
> non-blocking connect() returned: 111 (Connection refused)
> 2008-08-26 02:18:20 E [tcp-client.c:190:tcp_connect] server1:
> non-blocking connect() returned: 111 (Connection refused)
> ---
>
> Thanks,
> Markus Fenske



-- 
Raghavendra G

A centipede was happy quite, until a toad in fun,
Said, "Pray, which leg comes after which?",
This raised his doubts to such a pitch,
He fell flat into the ditch,
Not knowing how to run.
-Anonymous


