[Gluster-devel] Gluster Recovery
John Rowe
rowe at excc.ex.ac.uk
Tue May 1 09:45:57 UTC 2007
On Fri, 2007-04-27 at 16:25 -0700, Anand Avati wrote:
> > * How do I cleanly shut down the bricks making sure that they remain
> > consistent?
>
> For 1.3 you have to kill the glusterfsd manually. You can get the pid
> from the pidfile (${datadir}/run/glusterfsd.pid)
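(For the archives, that step amounts to something like the sketch
below; the expanded ${datadir} path is my assumption, so adjust it
to your install:)

    import os, signal

    # Pidfile path quoted above; what ${datadir} expands to depends
    # on how glusterfs was configured (assumed /usr/local/var here).
    pidfile = "/usr/local/var/run/glusterfsd.pid"

    with open(pidfile) as f:
        pid = int(f.read().strip())

    os.kill(pid, signal.SIGTERM)  # ask the daemon to exit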
That's not a problem; my question is how do I shut down two mirrored
bricks whilst maintaining the consistency of the mirrors?
> > * Could race conditions ever lead to the different bricks having
> > different data if two clients tried to write to the same mirrored file?
> > Is this the reason for using the posix-locks translator over and above
> > the posix locks on the underlying bricks?
>
> You are right, two clients writing to the same region of a file are
> expected to use posix locks to lock out their region before editing
> in an AFR scenario.
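To make sure I follow, a minimal sketch of the locking pattern being
described, in Python (the mount path, file name and byte range are
invented for illustration):

    import fcntl, os

    # Take an exclusive POSIX record lock on the region before
    # writing, so a second client blocks rather than racing us.
    fd = os.open("/mnt/glusterfs/shared.dat", os.O_RDWR)  # hypothetical file
    fcntl.lockf(fd, fcntl.LOCK_EX, 4096)       # lock bytes 0-4095, blocking
    try:
        os.write(fd, b"new contents")          # only one writer in here
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, 4096)   # release the region
        os.close(fd)

(The idea, as I understand it, being that the lock rather than the
write is what keeps the two mirrors in step.)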
Mirroring still raises a 'layer' issue: for an unmirrored, functioning
disk the filesystem always knows what the bits on the disk are, even
though locking problems may leave the data invalid at the application
level. A mirrored filesystem raises the additional possibility that the
two mirrors may disagree about what the bits are. So, if applications
fail to use locking, is there a danger that the two mirrors could end
up with different bits on their disks? (This is similar to the question
above.)
> > * A mirror-consistency check command. Presumably this would be a fairly-
> > small addition to the rebuild code. A danger of all mirroring schemes is
> > that they can hide underlying problems until it's too late!
>
> The 'self-heal' feature is aimed at this: at runtime it keeps
> checking for inconsistencies and fixes them on the fly in a
> proactive fashion.
Great. I assume there will be a log of these actions?
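In the meantime I can approximate the consistency check by hand by
comparing checksums across the two brick export directories, along
these lines (the brick paths are invented, and the volume should be
quiet while it runs; it also won't notice files that exist only on
the second brick):

    import hashlib, os

    def digest(path):
        # Checksum a file in chunks so large files don't eat memory.
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    brick_a, brick_b = "/export/brick1", "/export/brick2"  # hypothetical
    for root, _, files in os.walk(brick_a):
        for name in files:
            pa = os.path.join(root, name)
            pb = os.path.join(brick_b, os.path.relpath(pa, brick_a))
            if not os.path.exists(pb):
                print("missing on brick2:", pb)
            elif digest(pa) != digest(pb):
                print("mirrors differ:", pa)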
Thanks again.
John