[Gluster-users] 1.4.0RC6 AFR problems
Keith Freedman
freedman at FreeFormIT.com
Tue Dec 23 10:32:11 UTC 2008
So, I had a drive failure on one of my boxes, and it led to the discovery
of numerous issues today:
1) When a drive is failing and one of the AFR servers is dealing with
IO errors, the other server freaks out and sometimes crashes, but it
never seems to hit a network timeout.
2) When starting gluster on the server with the new, empty drive, it
gave me a bunch of errors about things being out of sync, telling me to
delete a file from all but the preferred server.
This struck me as odd, since the new drive was empty.
So I used the favorite-child option, but this isn't a preferred solution long term.
3) One of the directories had 20GB of data in it. I went to do an
ls of the directory and had to wait while it auto-healed all the
files. While this is helpful, it would be nice to have gotten back
the directory listing without having to wait for 20GB of data to get
sent over the network.
4) While the other server was down, the server that was still up kept
failing (signal 11?), and I had to constantly remount the filesystem. It
was giving me messages about the other node being down, which was fine,
but then it would just die after a while, consistently.
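
For reference, the workaround from item 2 looks roughly like the following
in the AFR volume definition. This is only a sketch, not my actual config:
the volume names (afr0, remote1, remote2) are placeholders, and it assumes
remote1 is the server holding the good copy of the data.

    volume afr0
      type cluster/afr
      # placeholder names for the two protocol/client volumes
      subvolumes remote1 remote2
      # when the copies disagree, treat remote1 as the authoritative one
      option favorite-child remote1
    end-volume

The catch is that this blindly trusts one server whenever the copies
disagree, which is why I don't want to leave it in place long term.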