[Gluster-users] replicate weirdness
Jeff Evans
jeffe at tricab.com
Fri Aug 7 01:54:36 UTC 2009
Hi gluster mongers,
I have run into a critical problem under 2.0.3, and I would like to
know whether it has already been reported (or fixed) before I file a
detailed bug report.
----
Lowdown:
Two machines, RHEL 5.3, fuse 2.7.4, each running a single brick server.
Clients run on the same machines, using AFR with writebehind & local
read-subvolume enabled.
Clients run with --disable-direct-io-mode.
Activity:
Each client has several open fds, including a Xen image.
Kill glusterfsd on one machine.
Open fds are still being written to.
Umount & mount the underlying FS.
Restart glusterfsd.
Weirdness:
On the same machine, client log entries:
... forced unwinding frame type(1) ...
... disconnected ... connected.
Server log entries:
[server-protocol.c:3903:server_readv] invalid argument: state->fd
...
[fd.c:326:gf_fd_fdptr_get] fd: invalid argument
[server-protocol.c:4108:server_flush] invalid argument: state->fd
[server-protocol.c:3903:server_readv] invalid argument: state->fd
...
[posix.c:1712:posix_writev] export: writev failed on
fd=0x2aaaac0040c0: Bad file descriptor
...
[server-protocol.c:3956:server_writev] invalid argument: state->fd
[server-protocol.c:4062:server_fsync] invalid argument: state->fd
...
[fd.c:282:gf_fd_put] fd: invalid argument
...REPEATS AD NAUSEUM...
Really, Really Weird:
The fds seem to have been confused somehow, as data from the Xen
images began to appear in other open files.
This occurred on the underlying FS on the broken server side only.
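
For illustration only, here is one general way that writes can land in
the wrong file: a holder keeps using a descriptor number after it has
been closed and handed out again for a different file. This is a minimal
sketch of plain POSIX descriptor reuse, not a confirmed diagnosis of
what glusterfsd did here. The file names and the function name are
hypothetical.

```python
import os


def demonstrate_stale_fd_reuse(directory):
    """Write through a stale descriptor number; return (reused, data_a, data_b)."""
    path_a = os.path.join(directory, "image.img")  # hypothetical file names
    path_b = os.path.join(directory, "other.log")

    fd_a = os.open(path_a, os.O_CREAT | os.O_WRONLY, 0o600)
    stale = fd_a     # a holder that never learns the descriptor went away
    os.close(fd_a)   # stands in for the descriptor being lost on one side

    # POSIX hands out the lowest free descriptor number, so in a
    # single-threaded process this open normally gets the same number back.
    fd_b = os.open(path_b, os.O_CREAT | os.O_WRONLY, 0o600)
    reused = fd_b == stale
    if reused:
        # Intended for image.img, but the number now refers to other.log.
        os.write(stale, b"xen image block")
    os.close(fd_b)

    with open(path_a, "rb") as f:
        data_a = f.read()
    with open(path_b, "rb") as f:
        data_b = f.read()
    return reused, data_a, data_b
```

When the number is reused, the write succeeds without any error and the
data ends up in the second file, which matches the symptom but proves
nothing about the actual cause.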
----
To restore sanity:
Umount both sides.
Kill both glusterfsd's.
Delete the corrupted files from the broken server's underlying FS.
Restart the servers, then the clients.
The deleted files are auto-repaired successfully upon access and
normality returns.
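
A rough sketch of how the corrupted files could be located before
deleting them, assuming both bricks' export directories are readable as
local paths from one machine (for example via a temporary copy or
mount) and that everything is stopped so nothing is writing. The
function names and paths are hypothetical, and this is not a GlusterFS
tool.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_files(good_root, suspect_root):
    """List relative paths whose contents differ between the two trees."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(good_root):
        for name in filenames:
            good_path = os.path.join(dirpath, name)
            rel = os.path.relpath(good_path, good_root)
            suspect_path = os.path.join(suspect_root, rel)
            if not os.path.isfile(suspect_path):
                # A missing copy is not reported as a mismatch here.
                continue
            if file_digest(good_path) != file_digest(suspect_path):
                mismatched.append(rel)
    return sorted(mismatched)
```

Called as find_mismatched_files("/data/export", "/mnt/broken/export")
(both paths made up), it returns the files present on both sides whose
contents differ, which are the candidates for deletion on the broken
side.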
----
Thanks for reading this far!
Anyone experienced this or something similar?
Comments/feedback much appreciated, further info available on request.
Jeff.