[Gluster-devel] client crash with write-behind, other oddities in latest 1.4 tla

Brent A Nelson brent at phys.ufl.edu
Tue Sep 16 16:15:25 UTC 2008


I've been looking for ways to speed up the new afr code in 1.4 (which is 
far slower at writing, though presumably more correct).  I found that 
write-behind brings it back to more normal speeds, and the new o-direct 
option of storage/posix seems to help quite a bit, too.
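
For context, here is a minimal sketch of where these two settings sit in 
the volfiles. The volume names (posix, afr0, wb), the export path, and the 
"on" value are illustrative assumptions, not the exact configuration in 
use here.

```
# server side: storage/posix with the new o-direct option
volume posix
  type storage/posix
  option directory /data/export   # placeholder path
  option o-direct on              # assumed boolean value
end-volume

# client side: write-behind loaded on top of the afr volume
# (afr0 is assumed to be defined earlier in the client volfile)
volume wb
  type performance/write-behind
  subvolumes afr0
end-volume
```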

Unfortunately, I'm seeing some occasional client crashes when write-behind 
is used on the client (with or without o-direct), with dump output such as 
the following:

/lib/tls/i686/cmov/libc.so.6[0xb7dc4128]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(client_flush_cbk+0xad)[0xb7d3f11d]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(protocol_client_interpret+0x43d)[0xb7d401bd]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(protocol_client_pollin+0xd2)[0xb7d40352]
/usr/lib/glusterfs/1.4.0pre6/xlator/protocol/client.so(notify+0x12f)[0xb7d419af]
/usr/lib/glusterfs/1.4.0pre6/transport/socket.so[0xb74c712b]
/usr/lib/libglusterfs.so.0[0xb7f26575]
/usr/lib/libglusterfs.so.0(event_dispatch+0x21)[0xb7f25191]
/usr/sbin/glusterfs(main+0xad3)[0x804a4e3]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7daf450]
/usr/sbin/glusterfs[0x80499a1]

Even more puzzling, the o-direct option seems to require write-behind; 
disabling write-behind causes immediate I/O errors when attempting to 
write with o-direct enabled (restarting servers and client didn't help).

After I then disabled o-direct on the servers and remounted, I discovered 
something still more puzzling: the performance was still pretty good, even 
though write-behind was now disabled on the client! Perhaps the effects of 
o-direct linger (even though all processes were restarted)?

Thanks,

Brent




