[Gluster-users] Some questions about theoretical gluster failures.
Harry Mangalam
harry.mangalam at uci.edu
Wed Oct 26 15:31:53 UTC 2011
Thanks very much for your input.
I'm a bit surprised that new files would hash to the failed brick.
Isn't there a check to make sure that the assigned brick is responding,
with a fallback to a ready brick if it isn't? I can see that this would
happen in the first few seconds of failure, but after a short timeout,
shouldn't this feed back to the hasher?
I'll explicitly test this when I bring up the new version today.
Thanks again
hjm
On Wednesday 26 October 2011 06:34:33 Jeff Darcy wrote:
> > - what happens in a distributed system if a node goes down? Does
> > the rest of the system keep working with the files on that
> > brick unavailable until it comes back or is the filesystem
> > corrupted? In my testing, it seemed that the system indeed kept
> > working and added files to the remaining systems, but that files
> > that were hashed to the failed volume were unavailable (of
> > course).
>
> Yes, this is what I would expect (and have always observed) when
> using just distribution without replication. Not only are
> existing files on the failed brick unavailable, but IMX attempts
> to create new files which would hash to that brick (effectively a
> random 1/N) also fail. That part, at least, is fixable. With
> replication, the single-brick failure would effectively be
> invisible to the distribution layer so even this glitch wouldn't
> occur.
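
To make the "random 1/N" point concrete, here is a toy sketch in Python.
It is not Gluster's actual distribution code: the MD5-modulo placement,
the brick names, and the functions brick_for, create and
create_with_fallback are all made up for illustration. It only shows why
a pure name hash sends about one create in N to a dead brick, and what
the kind of fallback I was asking about might look like.

```python
import hashlib

def brick_for(name, bricks):
    """Toy placement (assumption): hash the file name, pick one of N bricks."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]

def create(name, bricks, down):
    """No fallback: the create fails (None) if the hashed brick is down."""
    target = brick_for(name, bricks)
    return None if target in down else target

def create_with_fallback(name, bricks, down):
    """Hypothetical fallback: walk forward to the next responding brick."""
    start = bricks.index(brick_for(name, bricks))
    for offset in range(len(bricks)):
        candidate = bricks[(start + offset) % len(bricks)]
        if candidate not in down:
            return candidate
    return None  # every brick is down

bricks = ["brick1", "brick2", "brick3", "brick4"]  # made-up names
down = {"brick3"}                                  # simulate one failed brick
names = ["file%04d" % i for i in range(10000)]

failed = sum(create(n, bricks, down) is None for n in names)
placed = [create_with_fallback(n, bricks, down) for n in names]

# With 4 bricks and 1 down, roughly a quarter of the creates should fail.
print("fraction of creates failing without fallback:", failed / len(names))
print("creates failing with fallback:", placed.count(None))
```

One catch with the fallback in this sketch: a file placed on a
substitute brick is no longer where the hash says it should be, so a
later lookup by hash alone would miss it. A real fix would need some
record of where the file actually went.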
--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--
This signature has been OCCUPIED!