[Gluster-devel] NFS reexport works, still stat-prefetch issues, -s problem
Brent A Nelson
brent at phys.ufl.edu
Fri May 11 02:01:18 UTC 2007
On Thu, 10 May 2007, Brent A Nelson wrote:
> [May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-1: connection to server disconnected
> [May 10 18:14:18] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
> [May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=9)
> [May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
> [May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
> [May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
> [May 10 18:14:19] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
> [May 10 18:14:19] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [May 10 18:14:19] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
> [May 10 18:14:19] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
>
> I've seen the "0 bytes r/w instead of 113" message plenty of times in the
> past (with older GlusterFS versions), although it was apparently harmless
> before. It looks like the code now treats this as a disconnection and
> tries to reconnect. For some reason, even when it does manage to
> reconnect, the operation still results in an I/O error. I wonder if this
> relates to a previous issue I mentioned with real disconnects (a node dies
> or glusterfsd is restarted), where the first access after a failure (at
> least for ls or df) results in an error but the next attempt succeeds. It
> seems like an issue with the reconnection logic (and some sort of glitch
> masquerading as a disconnect in the first place)... This is probably the
> real problem triggering the read-ahead crash (i.e., the read-ahead crash
> would not be triggered in my test case if it weren't for this issue).
>
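
For context, the errno values in the log are telling: on Linux, errno 115
is EINPROGRESS (usually seen when a non-blocking connect() has not
completed yet) and errno 9 is EBADF, neither of which necessarily means
the peer went away. A "transfer exactly N bytes" helper like full_rw
typically loops until the requested count has moved and treats a zero-byte
return (the peer really closed the socket) or any error as a failed
transfer. The sketch below shows that general pattern; the names and
structure are my guesses, not the actual libglusterfs code:

#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of a "read exactly `size` bytes" helper, loosely modeled on
   what the log suggests full_rw() does. This is a guess at the pattern,
   not the real GlusterFS source. */
static int
full_read (int fd, char *buf, size_t size)
{
        size_t done = 0;

        while (done < size) {
                ssize_t ret = read (fd, buf + done, size - done);

                if (ret == 0)
                        return -1;  /* EOF: the peer really closed the socket */
                if (ret < 0) {
                        if (errno == EINTR)
                                continue;   /* interrupted; just retry */
                        /* Everything else lands here, including transient
                           conditions such as EINPROGRESS (errno=115), even
                           though the socket may still be perfectly usable. */
                        return -1;
                }
                done += ret;
        }
        return 0;   /* all `size` bytes transferred */
}

If a transient errno falls into that generic failure branch, a healthy
connection would get torn down and reconnected for no good reason, which
would match the "glitch masquerading as a disconnect" behavior above.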
Well, it looks like I can reproduce this behavior (but, so far, not the
memory leak) on a much simpler setup, with no NFS required. I was copying
my test area (several 10GB files) to a really simple GlusterFS volume (one
share, no afr, no unify, glusterfsd on the same machine) when I hit the
disconnect issue, after a few files had copied successfully. This looked
like an issue with protocol/client and/or protocol/server, but I thought
it would be a good idea to narrow things down a bit...
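
For reference, a single-share setup of that sort needs only a couple of
small spec files. The sketch below is roughly what I mean; the brick name,
export directory, and address are illustrative, not the exact files from
my test:

server spec:

  volume brick
    type storage/posix
    option directory /export/share   # illustrative export path
  end-volume

  volume server
    type protocol/server
    option transport-type tcp/server
    option auth.ip.brick.allow *     # wide open; fine for a local test
    subvolumes brick
  end-volume

client spec:

  volume client
    type protocol/client
    option transport-type tcp/client
    option remote-host 127.0.0.1     # glusterfsd on the same machine
    option remote-subvolume brick
  end-volume

With glusterfsd pointed at the server spec and the client mounted with the
matching client spec, everything runs on one box, so network hardware and
NFS are both out of the picture.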
Thanks,
Brent