[Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

Raghavendra Gowdappa rgowdapp at redhat.com
Wed Dec 14 05:10:27 UTC 2016



----- Original Message -----
> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> To: "Ashish Pandey" <aspandey at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Shyam Ranganathan" <srangana at redhat.com>, "Nithya Balachandran"
> <nbalacha at redhat.com>, "Xavier Hernandez" <xhernandez at datalab.es>, "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Sent: Tuesday, December 13, 2016 9:29:46 PM
> Subject: Re: 1402538 : Assertion failure during rebalance of symbolic links
> 
> On Tue, Dec 13, 2016 at 2:45 PM, Ashish Pandey <aspandey at redhat.com> wrote:
> 
> > Hi All,
> >
> > We have been seeing an issue where rebalancing symbolic links leads to an
> > assertion failure in an EC volume.
> >
> > The root cause is that, while migrating a symbolic link to another
> > subvolume, rebalance creates a link file (with attributes .........T).
> > This file is a regular file.
> > Now, during migration a setattr comes to this link and, because of a
> > possible race, posix_stat returns the stats of this "T" file.
> > In ec_manager_setattr, we receive callbacks and check the type of the entry.
> > If it is a regular file we try to get its size and, if it is not there, we
> > raise an assert.
> > So, basically we are checking the size of the link (which will not have
> > a size) because it has been returned as a regular file, and the assert
> > fires when this condition
> > becomes TRUE.
> >
> > Now, this looks like a problem with rebalance and is difficult to fix at
> > this point (as per the discussion).
> > We have an alternative to fix it in EC but that will be more like a hack
> > than an actual fix. We should not modify EC
> > to deal with an individual issue which is in another translator.

I am afraid DHT doesn't have a better way of handling this. While DHT maintains the abstraction (of a symbolic link) for the layers above it, the layers below it cannot be shielded from seeing details such as a linkto file. If the real concern is that the file changes its type within the span of a single fop, we can probably explore locking (or other synchronization mechanisms) to prevent migration from taking place while a fop is in progress. But I assume there will be performance penalties for that too.
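
To make the locking idea concrete, here is a rough toy model in Python. It is not Gluster code: Stat, Entry, check_like_ec and the per-entry lock are all hypothetical names made up for this sketch, and the check only mimics the condition described above (a regular file with no size trips the assert). It assumes migration and the fop take the same lock, so the fop can never observe the intermediate "T" file.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stat:
    ftype: str            # "symlink" or "regular"
    size: Optional[int]   # None models "no size available"


def check_like_ec(st: Stat) -> bool:
    """Return False where the assertion described above would fire."""
    if st.ftype == "regular":
        return st.size is not None
    return True


class Entry:
    """Toy stand-in for one name being migrated (hypothetical)."""

    def __init__(self) -> None:
        self.stat = Stat("symlink", None)
        self.lock = threading.Lock()  # hypothetical per-entry lock

    def begin_migration(self) -> None:
        # The linkto ("T") file looks like a regular file and has no size.
        self.stat = Stat("regular", None)

    def finish_migration(self) -> None:
        self.stat = Stat("symlink", None)

    def migrate(self) -> None:
        # Hold the lock for the whole type change.
        with self.lock:
            self.begin_migration()
            self.finish_migration()

    def setattr_fop(self) -> bool:
        # The fop takes the same lock, so it never sees the "T" file.
        with self.lock:
            return check_like_ec(self.stat)
```

Without the lock, a fop that lands between begin_migration() and finish_migration() sees a regular file with no size, which is the failing case. With the lock, every fop waits for migration to finish (and migration waits for in-flight fops), which is where the performance penalty would come from.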

> >
> > Now the question is how to proceed with this? Any suggestions?
> >
> 
> Raghavendra/Nithya,
>          Could one of you explain the difficulties in fixing this issue in
> DHT so that Xavi will also be caught up with why we should add this change
> in EC in the short term.
> 
> 
> >
> > Details on this bug can be found here -
> > https://bugzilla.redhat.com/show_bug.cgi?id=1402538
> >
> > ----
> > Ashish
> >
> >
> >
> >
> 
> 
> --
> Pranith
> 

