[Gluster-devel] problem when client goes down

Antonio González antonio.gonzalez at libera.net
Tue Apr 15 10:18:27 UTC 2008


Hello, 

 

This issue is similar to the one discussed in the Gluster-devel thread
“[Gluster-devel] Timeout settings and self-healing? (WAS: HA failover test
unsuccessful (inaccessible mountpoint))”.

 

I have already tried “option transport-timeout 20”. The result is the same:
the first “ls” from the other clients takes 30 seconds, and later “ls” calls
take a variable amount of time. Sometimes the file system stays blocked for
a long time, 3 or 4 minutes.

 

I repeated the test with a simpler setup and got the same results: 2 servers
(one exports a brick for storage and a brick for the namespace; the other
exports a brick for storage). The unify translator is defined on the client
side (without replication). The test is to upload a large file (3.5 GB) to
GlusterFS and then unplug the network cable of the client that is uploading
the file.
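
A client volume file for a setup like this one could look roughly as
follows. This is only a sketch in GlusterFS 1.3 volfile syntax: the host
names (server1.example, server2.example), the exported volume names (brick,
brick-ns) and the rr scheduler are assumptions for illustration, not values
taken from the real configuration.

```
# Sketch only: host names, remote-subvolume names and scheduler are assumed.

# Storage brick exported by the first server
volume brick1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1.example
  option remote-subvolume brick
  # timeout in seconds, set on every protocol/client volume
  option transport-timeout 20
end-volume

# Storage brick exported by the second server
volume brick2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2.example
  option remote-subvolume brick
  option transport-timeout 20
end-volume

# Namespace brick exported by the first server
volume brick-ns
  type protocol/client
  option transport-type tcp/client
  option remote-host server1.example
  option remote-subvolume brick-ns
  option transport-timeout 20
end-volume

# Unify over the two storage bricks, without replication
volume unify0
  type cluster/unify
  option namespace brick-ns
  option scheduler rr
  subvolumes brick1 brick2
end-volume
```

Note that the timeout option appears in each protocol/client volume,
including the one for the namespace.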

 

The log of the uploading client says:

    C [tcp.c:87:tcp_disconnect] brick1: connection disconnected

    E [unify.c:325:unify_lookup] unify: returning ESTALE for /(17314670376) [translator generation (0) inode generation (3)]

    E [fuse-bridge.c:459:fuse_entry_cbk] glusterfs-fuse: 111: (34) / => -1 (116)

    E [unify.c:325:unify_lookup] unify: returning ESTALE for /(17314670376) [translator generation (0) inode generation (3)]

 

The log of the other client (the one running “ls”) says:

    E [client-protocol.c:4520:client_getspec_cbk] trans: no proper reply from server, returning ENOTCONN

    E [client-protocol.c:4809:client_protocol_cleanup] trans: forced unwinding frame type(2) op(4) reply=@0x8050988

 

The server log says: 

    E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (10.1.0.45:1023)

    C [tcp.c:87:tcp_disconnect] server: connection]

 

The version of GlusterFS is 1.3.8pre5.

The version of fuse is 2.7.2glfs9

 

The next test will be with an older version, e.g. glusterfs 1.3.7. Do you
think this test is necessary?

 

Thanks for the reply.

Antonio González
R&D Analyst

Libera Networks
C/ Marie Curie, 12 (PTA)
29590 Málaga

antonio.gonzalez at libera.net

tel: +34902105282
fax: +34952020438

 



 

 

  _____  

From: anand.avati at gmail.com [mailto:anand.avati at gmail.com] On behalf of
Anand Avati
Sent: Tuesday, 15 April 2008 11:32
To: Antonio González
CC: gluster-devel at nongnu.org
Subject: Re: [Gluster-devel] problem when client goes down

 

Antonio,
please use 'option transport-timeout 20' in (all of) your protocol/client
volumes.

avati

2008/4/14, Antonio González <antonio.gonzalez at libera.net>:

Sorry, I forgot to mention that by "shut down the network" I mean unplugging
the cable of client 1; the network stays operational for the other clients
and the servers.

Thanks,

-----Original Message-----
From: gluster-devel-bounces+antonio.gonzalez=libera.net at nongnu.org
[mailto:gluster-devel-bounces+antonio.gonzalez=libera.net at nongnu.org] On
behalf of Antonio González
Sent: Monday, 14 April 2008 18:51
To: gluster-devel at nongnu.org
Subject: [Gluster-devel] problem when client goes down




Hello all, the scenario is:



*         3 machines as servers: pc1 and pc2 each export three volumes
(storage, replication, namespace); pc3 exports two bricks (storage/posix).

*         Pc1 replicates pc2, pc2 replicates pc3 and pc3 replicates pc1.

*         The namespace is replicated on pc1 and pc2.

*         AFR and unify translators on the client side.

*         2 machines as clients.
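
The client-side layout described in the list above could be sketched in
GlusterFS 1.3 volfile syntax roughly like this. All volume names are
hypothetical, the rr scheduler is an assumed choice, and the pairing of
replicas is one possible reading of "pc1 replicates pc2" (pc1 holds the copy
of pc2's storage brick); the real configuration may differ.

```
# Sketch only: volume names, scheduler and replica pairing are assumed.
# pcN-storage, pcN-repl and pcN-ns stand for protocol/client volumes that
# point at the bricks exported by pc1, pc2 and pc3. One is shown here; the
# others are assumed to follow the same pattern.
volume pc1-storage
  type protocol/client
  option transport-type tcp/client
  option remote-host pc1
  option remote-subvolume storage
end-volume

# Ring replication: each storage brick is mirrored on another server.
volume afr1
  type cluster/afr
  subvolumes pc1-storage pc3-repl
end-volume

volume afr2
  type cluster/afr
  subvolumes pc2-storage pc1-repl
end-volume

volume afr3
  type cluster/afr
  subvolumes pc3-storage pc2-repl
end-volume

# Namespace mirrored on pc1 and pc2
volume afr-ns
  type cluster/afr
  subvolumes pc1-ns pc2-ns
end-volume

# Unify on top of the three replicated pairs
volume unify0
  type cluster/unify
  option namespace afr-ns
  option scheduler rr
  subvolumes afr1 afr2 afr3
end-volume
```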





The test is to copy a file from GlusterFS to local disk and shut down the
network before the copy finishes. The client that launched the command waits
until the network is back. I think it is normal for that client to wait for
a response from the server (or for a timeout). The problem is that if I try
to "ls" the GlusterFS file system from another client, that client blocks as
well.



I don't know whether the configuration is incorrect or whether this is a
GlusterFS problem.



Thanks

Antonio González


_______________________________________________
Gluster-devel mailing list
Gluster-devel at nongnu.org
http://lists.nongnu.org/mailman/listinfo/gluster-devel







-- 
If I traveled to the end of the rainbow
As Dame Fortune did intend,
Murphy would be there to tell me
The pot's at the other end. 



