[Gluster-users] replicate weirdness

Jeff Evans jeffe at tricab.com
Fri Aug 7 01:54:36 UTC 2009

Hi gluster mongers,

I have run into a critical problem under 2.0.3, and I would like to
know if it has been reported (or fixed) already before making a
detailed bug report.


Low down:

Two machines, RHEL 5.3, fuse 2.7.4, each running a single brick server.

clients on the same machines, AFR with writebehind & local
read-subvolume enabled.

clients run with --disable-direct-io-mode.


Each client has several open fd's, including a xen image.

kill glusterfsd on one machine.

Open fd's are still being written to.

umount & mount the underlying FS.

Restart glusterfsd.


On the same machine, client log entries:
... forced unwinding frame type(1) ...
... disconnected ... connected.

Server log entries:

[server-protocol.c:3903:server_readv] invalid argument: state->fd
[fd.c:326:gf_fd_fdptr_get] fd: invalid argument
[server-protocol.c:4108:server_flush] invalid argument: state->fd
[server-protocol.c:3903:server_readv] invalid argument: state->fd
[posix.c:1712:posix_writev] export: writev failed on
fd=0x2aaaac0040c0: Bad file descriptor
[server-protocol.c:3956:server_writev] invalid argument: state->fd
[server-protocol.c:4062:server_fsync] invalid argument: state->fd
[fd.c:282:gf_fd_put] fd: invalid argument

Really, Really Weird:

The fd's seem to have been confused somehow, as data from the xen
images began to appear in other open files.

This occurred on the underlying FS on the broken server side only.
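
For anyone wondering how "confused" fd's could put one file's data into
another at all, here is a minimal sketch of the general hazard. It is not a
diagnosis of this bug and does not use any GlusterFS code. It only shows
that a writer holding on to a stale raw fd number can end up writing into
whichever file reuses that number after a close. The file names and the
function name are made up for the illustration.

```python
import os
import tempfile


def demo_stale_fd_reuse():
    """Show a stale fd number ending up attached to a different file."""
    workdir = tempfile.mkdtemp()
    # Hypothetical file names, chosen only for this illustration.
    image_path = os.path.join(workdir, "xen.img")
    other_path = os.path.join(workdir, "other.txt")

    # A writer opens the image and remembers the raw fd number.
    image_fd = os.open(image_path, os.O_WRONLY | os.O_CREAT, 0o600)
    stale_fd = image_fd

    # The fd is closed behind the writer's back (a stand-in for the
    # server process being killed and restarted).
    os.close(image_fd)

    # A different file is opened. POSIX hands out the lowest free fd
    # number, so in a single-threaded process the old number is reused.
    other_fd = os.open(other_path, os.O_WRONLY | os.O_CREAT, 0o600)

    # The writer keeps using the number it remembered. The data lands
    # in the other file, not in the image.
    os.write(stale_fd, b"XEN-IMAGE-DATA")
    os.close(other_fd)

    with open(other_path, "rb") as f:
        other_contents = f.read()
    image_size = os.path.getsize(image_path)
    return stale_fd == other_fd, other_contents, image_size


if __name__ == "__main__":
    print(demo_stale_fd_reuse())
```

Whether anything like this happens inside glusterfsd's fd table after a
restart is an open question; the sketch only illustrates why stale handles
are dangerous once their numbers (or pointers) get reused.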


To restore sanity:

umount both sides
kill both glusterfsd's
delete the corrupted files from the broken server's underlying FS.
restart servers then clients.

The deleted files auto-repair successfully upon access and normality
returns.


Thanks for reading this far!
Anyone experienced this or something similar?
Comments/feedback much appreciated, further info available on request.

