[Gluster-devel] self heal problem

Stephan von Krawczynski skraw at ithnet.com
Mon Mar 29 11:06:28 UTC 2010


Just in case you need a self-heal case under 2.0.9 that does not work, try
this:

Standard replication setup, 2 servers, 1 client.
Let's assume you have some data on the glusterfs and everything is fine.
Now take away server1 (the first server in your replication subvolumes
list).
Delete all glusterfs-exported data on it (don't forget to re-create the root
export directory), just as if you had re-formatted the disks during a new
installation. Before you bring the server back online, make sure your
client does not wreck your data: add a favorite-child option pointing to your
second server. You should be safe now. Of course you still have to self-heal
somehow.
Try to ls some file and you will find that it does not get self-healed. In
fact you cannot ls anything; the glusterfs on the client side seems empty.
Now change the replication subvolumes entry from "server1 server2" to
"server2 server1" and try again. Now your data gets self-healed and is
visible again.
Apparently the first server in the list decides which files exist, even
if favorite-child names another one.
That sounds like a true bug.
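
For reference, the client-side volfile for this scenario would look roughly
like the sketch below. The volume names, the host names and the
remote-subvolume name "brick" are placeholders assumed for illustration, not
copied from a real setup:

```
# Hypothetical client volfile (names and hosts are placeholders)
volume server1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com
  option remote-subvolume brick
end-volume

volume server2
  type protocol/client
  option transport-type tcp
  option remote-host server2.example.com
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  # server2 still holds the good copy, so name it as the preferred source
  option favorite-child server2
  # failing order; "subvolumes server2 server1" is the order that self-healed
  subvolumes server1 server2
end-volume
```

With the subvolumes line as shown, the mount appears empty; with the two
names swapped, the data is healed back onto server1.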

--
Regards,
Stephan




On Mon, 29 Mar 2010 03:39:06 -0600 (CST)
"Tejas N. Bhise" <tejas at gluster.com> wrote:

> He is writing up the algorithm in detail so that the source code is not the only way to understand it. We cannot expect all filesystem users to be C coding experts, so the effort is on to explain things in an easy-to-understand way. In fact, I would welcome C gurus in the community contributing to such code documentation projects for some of the more involved functionality.
> ----- Original Message -----
> From: Ed W <lists at wildgooses.com>
> To: Tejas N. Bhise <tejas at gluster.com>
> Cc: gluster-devel at nongnu.org
> Sent: Mon, 29 Mar 2010 02:56:18 -0600 (CST)
> Subject: Re: [Gluster-devel] self heal problem
> 
> On 27/03/2010 17:19, Tejas N. Bhise wrote:
> >
> >    
> >> It would seem that you could cause the situation that Stephan describes
> >> as follows:
> >>      
> > Just to clarify, were you able to create the failure with these steps?
> > If yes, then I will open a defect. If, on the other hand, you want to know
> > how this situation will be handled: Vikas is writing a document about
> > replication. He will include this case as well.
> >    
> 
> No, it's just a theoretical question - I have read Vikas's page so far 
> and it looks like it will eventually answer this kind of question, but 
> not yet!
> 
> Thanks
> 
> Ed W
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel