[Gluster-users] self heal errors on 3.1.1 clients
Burnash, James
jburnash at knight.com
Thu Jan 27 22:49:18 UTC 2011
Well, I must say THAT is good to hear. That being the case, I'm not touching anything that seems to be working.
Thanks Avati.
-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Anand Avati
Sent: Thursday, January 27, 2011 5:45 PM
To: David Lloyd
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] self heal errors on 3.1.1 clients
David,
The problem you are facing is something we are already investigating.
We haven't root-caused it yet, but from what we have seen this happens only on / and only for the metadata changelog. It shows up as annoying log messages, but it should not affect your functionality.
Avati
On Thu, Jan 27, 2011 at 2:03 PM, David Lloyd < david.lloyd at v-consultants.co.uk> wrote:
> Yes, it seemed really dangerous to me too. But with the lack of
> documentation, and lack of response from gluster (and the data is
> still on the old system too), I thought I'd give it a shot.
>
> Thanks for the explanation. The split-brain problem seems to come up
> fairly regularly, but I've not found any clear explanation of what to
> do in this situation. I'm starting to worry about what appears to be a
> rationing of information from gluster.com to the community at large.
>
> We're not in a position to purchase support, and I'm a sysadmin, not a
> developer. I hope to make a contribution in terms of testing and
> feedback and bug reports, but I'm seeing a lot of threads that seem to
> go nowhere, and it's getting a bit frustrating.
>
> David
>
>
>
> > This seems really dangerous to me. On a brick xxx, the
> > trusted.afr.yyy attribute consists of three unsigned 32-bit
> > counters, indicating how many uncommitted operations (data,
> > metadata, and namespace respectively) might exist at yyy. If xxx
> > shows uncommitted operations at yyy but not vice versa, then we know
> > that xxx is more up to date and it should be the source for
> > self-heal. If two bricks show uncommitted operations at each other,
> > then we're in the infamous "split brain" scenario. Some client was
> > unable to clear the counter at xxx while another was unable to clear
> > it at yyy, or both xxx and yyy went down after the operation was
> > complete but before they could clear the counters for each other.
> >
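The counter layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 12-byte value is assumed to hold the three counters in big-endian (network) byte order, and the names Pending, parse_afr and classify are made up for this example, not part of GlusterFS.

```python
import struct
from typing import NamedTuple


class Pending(NamedTuple):
    """Pending-operation counters that one brick holds about another."""
    data: int
    metadata: int
    namespace: int


def parse_afr(value: bytes) -> Pending:
    # Assumption: three unsigned 32-bit counters, big-endian, 12 bytes total.
    if len(value) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(value))
    return Pending(*struct.unpack(">III", value))


def classify(xxx_about_yyy: Pending, yyy_about_xxx: Pending, kind: str) -> str:
    """Apply the rule from the text to one counter kind
    ("data", "metadata" or "namespace")."""
    a = getattr(xxx_about_yyy, kind)  # xxx's count of operations pending at yyy
    b = getattr(yyy_about_xxx, kind)  # yyy's count of operations pending at xxx
    if a and b:
        return "split-brain"      # each brick accuses the other
    if a:
        return "heal from xxx"    # only xxx shows pending operations at yyy
    if b:
        return "heal from yyy"    # only yyy shows pending operations at xxx
    return "clean"
```

For example, a value whose hex form is 000000000000000100000000 parses to data=0, metadata=1, namespace=0. If the other brick's metadata counter is also non-zero, classify reports split-brain for "metadata" while "data" stays clean.
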
> > In this case, it looks like a metadata operation (permission change)
> > was in this state. If the permissions are in fact the same both
> > places then it doesn't matter which way self-heal happens, or whether
> > it happens at all. In fact, it seems to me that AFR should be able to
> > detect this particular condition and not flag it as an error. In any
> > case, I think you're probably fine in this case but in general it's a
> > very bad idea to clear these flags manually because it can cause
> > updates to be lost (if self-heal goes the wrong way) or files to
> > remain in an inconsistent state (if no self-heal occurs).
> >
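The "permissions are in fact the same" check mentioned above could be done by hand along these lines. This is a rough sketch, not what AFR does internally: it compares only the permission bits, owner and group of two paths, ignores ACLs and other extended attributes, and the function name and any paths passed to it are hypothetical.

```python
import os
import stat


def same_ownership_and_mode(path_a: str, path_b: str) -> bool:
    """Return True if both paths have identical permission bits, uid and gid."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (
        stat.S_IMODE(a.st_mode) == stat.S_IMODE(b.st_mode)
        and a.st_uid == b.st_uid
        and a.st_gid == b.st_gid
    )
```

It would be run on a host that can see both copies, passing the same directory as it appears on each brick's backend file system. A True result only says that these three fields agree; it says nothing about file data or directory entries.
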
> > The real thing I'd wonder about is why both servers are so
> > frequently becoming unavailable at the same instant (switch
> > problem?) and why permission changes on the root are apparently so
> > frequent that this often results in a split-brain.
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> >
>
>
>
> --
> David Lloyd
> V Consultants
> www.v-consultants.co.uk
> tel: +44 7983 816501
> skype: davidlloyd1243
>