[Gluster-devel] client crash with write-behind, other oddities in latest 1.4 tla
Brent A Nelson
brent at phys.ufl.edu
Tue Sep 16 16:15:25 UTC 2008
I've been looking for ways to speed up the new afr code
in 1.4 (which is far slower at writing, though presumably more correct). I
found that write-behind brings it back to more normal speeds, and the
new o-direct option of storage/posix seems to help quite a bit, too.
Unfortunately, I'm seeing some occasional client crashes when write-behind
is used on the client (with or without o-direct), with dump output such as
the following:
/lib/tls/i686/cmov/libc.so.6[0xb7dc4128]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(client_flush_cbk+0xad)[0xb7d3f11d]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(protocol_client_interpret+0x43d)[0xb7d401bd]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(protocol_client_pollin+0xd2)[0xb7d40352]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(notify+0x12f)[0xb7d419af]
/usr/lib/glusterfs/1.4.0pre6/transport/socket.so[0xb74c712b]
/usr/lib/libglusterfs.so.0[0xb7f26575]
/usr/lib/libglusterfs.so.0(event_dispatch+0x21)[0xb7f25191]
/usr/sbin/glusterfs(main+0xad3)[0x804a4e3]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7daf450]
/usr/sbin/glusterfs[0x80499a1]
Even more puzzling, the o-direct option seems to require write-behind:
disabling write-behind causes immediate I/O errors when attempting to
write with o-direct enabled (restarting the servers and client didn't help).
After I then disabled o-direct on the servers and remounted, I discovered
something still more puzzling: the performance was still pretty good, even
though write-behind was now disabled on the client! Perhaps the effects of
o-direct linger (even though all processes were restarted)?
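For anyone trying to reproduce this, the two settings discussed above would sit in the volfiles roughly as sketched below. This is an illustrative fragment only, not the configuration actually used in this report: the volume names, host address, and export path are placeholders, the afr layer between the client and write-behind is left out for brevity, and both the "on" value for o-direct and the bare "tcp" transport-type are assumptions about the 1.4 syntax.

```
# Server side (sketch): storage/posix with the o-direct option
volume brick
  type storage/posix
  option directory /export/brick     # placeholder path
  option o-direct on                 # assumed value; option named in this report
end-volume

# Client side (sketch): write-behind stacked on a protocol/client volume
volume remote1
  type protocol/client
  option transport-type tcp          # assumed 1.4 form
  option remote-host 192.168.0.1     # placeholder address
  option remote-subvolume brick      # must match the server volume name
end-volume

volume wb
  type performance/write-behind
  subvolumes remote1                 # in the real setup, the afr volume would go here
end-volume
```

Commenting out the "wb" volume on the client, or the o-direct line on the server, gives the other combinations described above.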
Thanks,
Brent