[Gluster-users] replicate weirdness

Dan Bambach dan at lateral.net
Fri Aug 7 09:02:42 UTC 2009


I have also seen corruption under replicate in the 2.0.x series when
servers go down, though I have not had time to isolate a scenario that
reproduces it reliably.

The latest version we are running is 2.0.4, so the problem still exists there.
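
For anyone trying to work out which files on a brick were affected, a
rough sketch of a checksum comparison between two copies of the
backend directory might look like the following. This is not a Gluster
tool; the function names and the idea of comparing a known-good copy
against a suspect copy are assumptions for illustration only, and it
presumes both trees are reachable from one machine (for example, one
of them copied or mounted read-only) with the clients unmounted so
nothing is still being written.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def diff_trees(good_root, suspect_root):
    """Compare two directory trees (hypothetical helper).

    Returns three sorted lists of relative paths:
    files whose contents differ, files missing from the suspect tree,
    and files present only in the suspect tree.
    """
    good = tree_digests(good_root)
    suspect = tree_digests(suspect_root)
    common = good.keys() & suspect.keys()
    mismatched = sorted(p for p in common if good[p] != suspect[p])
    missing = sorted(good.keys() - suspect.keys())
    extra = sorted(suspect.keys() - good.keys())
    return mismatched, missing, extra
```

The mismatched list would be the candidates for deletion from the
broken side before letting replicate repair them on access.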


On 7 Aug 2009, at 02:54, Jeff Evans wrote:

> Hi gluster mongers,
>
> I have run into a critical problem under 2.0.3, and I would like to
> know whether it has already been reported (or fixed) before I make a
> detailed bug report.
>
> ----
>
> Low down:
>
> Two machines, RHEL 5.3, fuse 2.7.4, each running a single brick server.
>
> clients on the same machines, AFR with writebehind & local
> read-subvolume enabled.
>
> clients run with --disable-direct-io-mode.
>
> Activity:
>
> Each client has several open fds, including a Xen image.
>
> kill glusterfsd on one machine.
>
> Open fds are still being written to.
>
> umount & mount the underlying FS.
>
> Restart glusterfsd.
>
> Weirdness:
>
> On the same machine, client log entries:
> ... forced unwinding frame type(1) ...
> ... disconnected ... connected.
>
> Server log entries:
>
> [server-protocol.c:3903:server_readv] invalid argument: state->fd
> ...
> [fd.c:326:gf_fd_fdptr_get] fd: invalid argument
> [server-protocol.c:4108:server_flush] invalid argument: state->fd
> [server-protocol.c:3903:server_readv] invalid argument: state->fd
> ...
> [posix.c:1712:posix_writev] export: writev failed on
> fd=0x2aaaac0040c0: Bad file descriptor
> ...
> [server-protocol.c:3956:server_writev] invalid argument: state->fd
> [server-protocol.c:4062:server_fsync] invalid argument: state->fd
> ...
> [fd.c:282:gf_fd_put] fd: invalid argument
> ...REPEATS AD NAUSEUM...
>
> Really, Really Weird:
>
> The fds seem to have been confused somehow, as data from the Xen
> images began to appear in other open files.
>
> This occurred on the underlying FS on the broken server side only.
>
> ----
>
> To restore sanity:
>
> umount both sides
> kill both glusterfsd's
> delete the corrupted files from the broken server's underlying FS.
> restart servers then clients.
>
> The deleted files auto-repair successfully upon access and normality
> returns.
>
> ----
>
> Thanks for reading this far!
> Anyone experienced this or something similar?
> Comments/feedback much appreciated, further info available on request.
>
> Jeff.