[Gluster-users] Some questions about theoretical gluster failures.

Wed Oct 26 15:31:53 UTC 2011

Thanks very much for your input.

I'm a bit surprised that new files would hash to the failed brick.
Isn't there a check to make sure that the assigned brick is responding,
with a fallback to a ready brick?  I can see that this would happen in
the first few seconds of a failure, but after a short timeout,
shouldn't this feed back to the hasher?
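
To make the question concrete, here is a toy sketch of the two behaviours
in Python.  It is not GlusterFS code: the CRC32-modulo placement, the
brick names, and the helpers pick_brick, create_strict and
create_with_fallback are all assumptions made up for illustration.  The
"strict" version fails the create when the hashed brick is down (the
behaviour described in the quoted reply below); the "fallback" version
is the kind of check I'm asking about.

```python
import zlib

def pick_brick(name, bricks):
    """Toy placement (an assumption, not Gluster's real hash):
    map a file name onto one of N bricks."""
    return bricks[zlib.crc32(name.encode()) % len(bricks)]

def create_strict(name, bricks, up):
    """Fail the create if the hashed brick is not responding."""
    brick = pick_brick(name, bricks)
    if brick not in up:
        raise OSError(f"{brick} is down; cannot create {name}")
    return brick

def create_with_fallback(name, bricks, up):
    """Hypothetical fallback: start at the hashed brick and walk
    forward until a responding brick is found."""
    start = zlib.crc32(name.encode()) % len(bricks)
    for i in range(len(bricks)):
        brick = bricks[(start + i) % len(bricks)]
        if brick in up:
            return brick
    raise OSError("no bricks available")

# Four bricks, one of them failed.
bricks = ["brick0", "brick1", "brick2", "brick3"]
up = set(bricks) - {"brick2"}

failed = 0
for i in range(1000):
    try:
        create_strict(f"file{i}", bricks, up)
    except OSError:
        failed += 1

# With a reasonably uniform hash, roughly 1/N of the creates fail.
print(f"strict: {failed} of 1000 creates failed")
```

Note that a fallback like this is only half the job: a file placed on a
substitute brick no longer sits where its hash says, so later lookups
would need some way to find it.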

I'll explicitly test this when I bring up the new version today.

Thanks again
hjm

On Wednesday 26 October 2011 06:34:33 Jeff Darcy wrote:
> > - what happens in a distributed system if a node goes down?  Does
> > the rest of the system keep working with the files on that
> > brick unavailable until it comes back or is the filesystem
> > corrupted?  In my testing, it seemed that the system indeed kept
> > working and added files to the remaining systems, but that files
> > that were hashed to the failed volume were unavailable (of
> > course).
> 
> Yes, this is what I would expect (and have always observed) when
> using just distribution without replication.  Not only are
> existing files on the failed brick unavailable, but IMX attempts
> to create new files which would hash to that brick (effectively a
> random 1/N) also fail.  That part, at least, is fixable.  With
> replication, the single-brick failure would effectively be
> invisible to the distribution layer so even this glitch wouldn't
> occur.

-- 
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697  Google Voice Multiplexer: (949) 478-4487 
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--
This signature has been OCCUPIED!