[Gluster-devel] odd connection issues under high write load

Anand Avati avati at zresearch.com
Fri Jun 22 01:14:16 UTC 2007


the log messages below seem to come from glusterfs--mainline--2.4.
assuming that you are trying to test glusterfs--mainline--2.5, can you
please verify the installation again? (run make uninstall from the old
tree, then rm -rf libglusterfs.so, ${prefix}/lib/glusterfs, etc.)
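
Before removing anything, it can help to list what an older install left behind. The following is a minimal, non-destructive sketch: the function name `find_stale_glusterfs` and the default prefix `/usr/local` are assumptions for illustration, and the only paths it checks are the two named above (libglusterfs.so and ${prefix}/lib/glusterfs).

```python
import glob
import os


def find_stale_glusterfs(prefix):
    """Return sorted paths under <prefix>/lib that belong to a glusterfs install.

    Only looks for the two items named in the message: the shared library
    (libglusterfs.so and any versioned variants) and the lib/glusterfs directory.
    Nothing is deleted; the caller decides what to remove.
    """
    lib_dir = os.path.join(prefix, "lib")
    hits = glob.glob(os.path.join(lib_dir, "libglusterfs.so*"))
    module_dir = os.path.join(lib_dir, "glusterfs")
    if os.path.isdir(module_dir):
        hits.append(module_dir)
    return sorted(hits)


# "/usr/local" is an assumed prefix; substitute the one passed to ./configure.
for path in find_stale_glusterfs("/usr/local"):
    print(path)
```

If anything is printed after `make uninstall` from the old tree, those paths are candidates for manual removal before reinstalling the newer tree.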

thanks
avati

2007/6/22, Daniel <daniel at datinggold.com>:
>
> 1.30-pre4
> afr across 2 servers
>
> servers are io-streams, no write back no read forward
> TCP on a Gigabit network
>
> We set up a stress-test script in PHP to exercise the client, running
> about 36 instances of it. Occasionally we get a "transport end point
> not connected" error, which kills all of the instances (intentionally:
> they halt on error, but it means the mount went stale). Without any
> intervention, gluster picks up again and seems to operate fine when we
> re-run the scripts.
>
> we're pushing roughly 300 writes a second in the test
>
> the only debug info in the log is the following:
>
> [Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
> client/protocol:bailing transport
> [Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
> client/protocol:bailing transport
> [Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()]
> libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=104)
> [Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()]
> transport/tcp:mortar1: connection to server disconnected
> [Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()]
> libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=104)
> [Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()]
> transport/tcp:mortar2: connection to server disconnected
> [Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
> protocol/client:transport_submit failed
> [Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
> protocol/client:transport_submit failed
> [Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
> client/protocol:bailing transport
> [Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()]
> transport/tcp:mortar2: connection to server disconnected
> [Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
> protocol/client:transport_submit failed
> [Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
> client/protocol:bailing transport
> [Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()]
> libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()]
> transport/tcp:mortar1: connection to server disconnected
> [Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
> protocol/client:transport_submit failed
>
> I'm going to set up the debug xlator tomorrow if no one has anything off
> the top of their head about what might be wrong
>
> we haven't tested heavy read load yet, just writes
> we have managed to cause it multiple times, but haven't pinned down a
> cause as the debug logging all spits out basically the same material
>
> the client also has fairly high CPU usage during the test, roughly 90%
> of the core it's on
>
>
>
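
The errno values in the quoted full_rw lines can be decoded with a short script. This is only a sketch: the helper names `tally_errnos` and `describe` are made up for illustration, the sample text is three lines copied from the log above, and the symbolic names depend on the platform the script runs on. On Linux (most architectures), 104 is ECONNRESET and 115 is EINPROGRESS, which would suggest the connection was reset by the peer rather than closed cleanly by the client.

```python
import errno
import os
import re
from collections import Counter

# Matches the "(errno=NNN)" suffix that full_rw() prints in the quoted log.
ERRNO_RE = re.compile(r"errno=(\d+)")


def tally_errnos(log_text):
    """Count how often each errno value appears in a block of log text."""
    return Counter(int(value) for value in ERRNO_RE.findall(log_text))


def describe(code):
    """Format an errno value with the local platform's name and message."""
    name = errno.errorcode.get(code, "unknown")
    return f"{code} {name}: {os.strerror(code)}"


# Three full_rw lines copied from the quoted log.
SAMPLE = """\
libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=104)
libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=104)
libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
"""

for code, count in sorted(tally_errnos(SAMPLE).items()):
    print(f"{count}x {describe(code)}")
```

Run it on the machine that produced the log (or one with the same OS), since errno numbering is not portable across platforms.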



-- 
Anand V. Avati


