[Gluster-devel] NFS reexport works, still stat-prefetch issues, -s problem
Brent A Nelson
brent at phys.ufl.edu
Thu May 10 23:36:24 UTC 2007
It looks like the glusterfs crash in the slow NFS-client case may be
caused by read-ahead.
I was able to get this backtrace:
Program terminated with signal 11, Segmentation fault.
#0 0xb756246d in ra_frame_return ()
from /usr/lib/glusterfs/1.3.0-pre3/xlator/performance/read-ahead.so
(gdb) bt
#0 0xb756246d in ra_frame_return ()
from /usr/lib/glusterfs/1.3.0-pre3/xlator/performance/read-ahead.so
#1 0xb7562587 in ra_page_error ()
from /usr/lib/glusterfs/1.3.0-pre3/xlator/performance/read-ahead.so
#2 0xb7562cf0 in ?? ()
from /usr/lib/glusterfs/1.3.0-pre3/xlator/performance/read-ahead.so
#3 0x12b66f20 in ?? ()
#4 0xffffffff in ?? ()
#5 0x0000004d in ?? ()
#6 0x00000020 in ?? ()
#7 0x00000000 in ?? ()
Removing read-ahead from my config, I was able to do my 10GB file copy
without a crash. As a bonus, the copy was much faster without read-ahead
(3.7MBps vs. 2.2MBps), although I suspect that speed difference is why the
copy actually completed successfully.
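For reference, the change amounts to commenting the read-ahead volume out
of the top of the client spec, roughly like this (the volume names and the
aggregate-size option below are placeholders/assumptions, not my literal
spec):

  # ... protocol/client, cluster/afr, and cluster/unify volumes as before ...
  volume writebehind
    type performance/write-behind
    # assumption: aggregate-size is the "aggregation of 0" option
    option aggregate-size 0
    subvolumes unify0
  end-volume
  # read-ahead commented out of the top of the stack:
  # volume readahead
  #   type performance/read-ahead
  #   subvolumes writebehind
  # end-volume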
Even without read-ahead, I still get a very large glusterfs process, so it
appears that read-ahead is not the memory leak culprit.
If I also remove write-behind (making the copy horribly slow), my copy
still fails eventually, but glusterfs doesn't crash and the filesystem is
still available. Errors logged:
[May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-1: connection to server disconnected
[May 10 18:14:18] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=9)
[May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
[May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 10 18:14:19] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 10 18:14:19] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 10 18:14:19] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
[May 10 18:14:19] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
I've seen the "0 bytes r/w instead of 113" message plenty of times in the
past (with older GlusterFS versions), although it was apparently harmless
before. It looks like the code now considers this to be a disconnection
and tries to reconnect. For some reason, when it does manage to
reconnect, it nevertheless results in an I/O error. I wonder if this
relates to a previous issue I mentioned with real disconnects (node dies
or glusterfsd is restarted), where the first access after a failure (at
least for ls or df) results in an error, but the next attempt succeeds?
Seems like an issue with the reconnection logic (and some sort of glitch
masquerading as a disconnect in the first place)... This is probably the
real problem that is triggering the read-ahead crash (i.e., the read-ahead
crash would not be triggered in my test case if it weren't for this
issue).
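As an aside on those errno values: on Linux, errno 115 is EINPROGRESS
("Operation now in progress") and errno 9 is EBADF, and since read() and
write() only set errno when they return -1, the errno shown in the "0 bytes
r/w" line may just be a stale value (e.g. left over from a non-blocking
connect that hadn't finished). Here's a small stand-alone sketch, purely my
own illustration and not the actual full_rw() code, of how a read-until-full
loop can end up logging exactly that kind of misleading message when the
peer simply closes the connection:

/* Illustration only -- NOT the actual libglusterfs full_rw() code.
 * read() returning 0 means the peer closed the connection (EOF) and
 * does not touch errno, so logging errno at that point shows whatever
 * stale value was already there, e.g. EINPROGRESS (115) from a
 * non-blocking connect() that is still in progress. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until 'size' bytes, an error, or EOF. */
static ssize_t full_read(int fd, char *buf, size_t size)
{
    size_t done = 0;

    while (done < size) {
        ssize_t ret = read(fd, buf + done, size - done);

        if (ret == 0) {
            /* EOF: errno was not set here, so this prints a stale value. */
            fprintf(stderr, "%zu bytes r/w instead of %zu (errno=%d: %s)\n",
                    done, size, errno, strerror(errno));
            return -1;
        }
        if (ret == -1) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            return -1;      /* genuine error */
        }
        done += (size_t)ret;
    }
    return (ssize_t)done;
}

int main(void)
{
    int fds[2];
    char buf[113];

    if (pipe(fds) == -1)
        return 1;
    close(fds[1]);          /* "peer" goes away: reads now return 0 (EOF) */
    errno = EINPROGRESS;    /* simulate a stale errno from elsewhere      */
    full_read(fds[0], buf, sizeof(buf));   /* logs "0 bytes r/w instead of
                                              113 (errno=115: ...)"       */
    close(fds[0]);
    return 0;
}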
Finally, glusterfs still grows even in this case, so that would leave afr,
unify, protocol/client, or glusterfs itself as possible leakers.
Thanks,
Brent
On Thu, 10 May 2007, Brent A Nelson wrote:
> The -s issue was completely eliminated with the recent patch.
>
> GlusterFS is looking quite solid now, but I can still kill it with an NFS
> reexport to a slow client (100Mbps, while the servers and reexport node are
> 1000Mbps) and a 10GB file copy via NFS from the GlusterFS filesystem to the
> GlusterFS filesystem.
>
> The glusterfs process slowly consumes more and more memory (many 10s of MB to
> several hundred MB) and eventually dies sometime before the copy completes
> (well before it would run out of memory, however). The copy does work for
> quite a while before glusterfs suddenly dies. See attached -LDEBUG
> output from the glusterfs process.
>
> The glusterfs client is using client, afr, unify, read-ahead, and
> write-behind (with aggregation of 0). glusterfsd runs with server,
> storage/posix, and posix locks (although nothing in my test should invoke
> locking). The glusterfsd processes survive the test just fine and don't
> require a restart.
>
> Thanks,
>
> Brent
>
> On Tue, 8 May 2007, Anand Avati wrote:
>
>> does the log say "connection on <socket> still in progress - try
>> later" when run with -LDEBUG?
>>
>> avati
>>
>> 2007/5/8, Brent A Nelson <brent at phys.ufl.edu>:
>>> On Sun, 6 May 2007, Anand Avati wrote:
>>>
>>> >> 3) When doing glusterfs -s to a different machine to retrieve the spec
>>> >> file, it now fails. A glusterfs -s to the local machine succeeds. It
>>> >> looks like a small buglet was introduced in the -s support.
>>> >
>>> > this is fixed now, it was an unrelated change triggered by the new way
>>> > -s works.
>>> >
>>>
>>> Hmm, my -s issue still seems to be there; a client seems able to retrieve
>>> its spec file only from a local glusterfsd. Was the -s fix applied to the
>>> tla repository?
>>>
>>> root@jupiter02:~# glusterfs -s jupiter01 /backup
>>> glusterfs: could not open specfile
>>> root@jupiter02:~# glusterfs -s jupiter02 /backup
>>> root@jupiter02:~#
>>>
>>> The reverse on jupiter01 behaves the same way (can retrieve from itself,
>>> not from jupiter02).
>>>
>>> The big glitch that I thought might be related (client could only mount a
>>> GlusterFS if it was also a server of that GlusterFS) WAS fixed after a
>>> tla update and recompile following your email, however.
>>>
>>> Thanks,
>>>
>>> Brent
>>>
>>
>>
>> --
>> Anand V. Avati
>