[Gluster-devel] upstream: Symbolic link not getting healed
Vijay Bellur
vbellur at redhat.com
Fri Dec 20 12:36:03 UTC 2013
On 12/19/2013 02:28 PM, Harshavardhana wrote:
> GFAPI observes ENOENT with glfs_stat() - so the fix is necessary.
I agree that the fix is necessary. We will address it for release-3.5
and master now. Getting this into release-3.4 at this point is
dicey, as we are planning to release 3.4.2 on Monday. Given that the
libgfapi problem already exists in 3.4.1 and is not a new regression in
3.4.2, we can target the complete fix for 3.4.3. At the moment, I am
inclined to revert that fix on release-3.4 to get 3.4.2 out.
-Vijay
>
>
> On Wed, Dec 18, 2013 at 9:55 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>
>
> ----- Original Message -----
> > From: "Vijay Bellur" <vbellur at redhat.com>
> > To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Venkatesh Somyajulu" <vsomyaju at redhat.com>
> > Cc: gluster-devel at nongnu.org
> > Sent: Thursday, December 19, 2013 9:59:01 AM
> > Subject: Re: [Gluster-devel] upstream: Symbolic link not getting healed
> >
> > On 12/19/2013 07:58 AM, Pranith Kumar Karampuri wrote:
> > > hi,
> > > I used the following test to figure out the bad commit.
> > > #!/bin/bash
> > >
> > > . $(dirname $0)/../include.rc
> > > . $(dirname $0)/../volume.rc
> > >
> > > > function trigger_mount_self_heal {
> > > >     find $M0 | xargs stat
> > > > }
> > >
> > > cleanup;
> > >
> > > TEST glusterd
> > > TEST pidof glusterd
> > > TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0,1}
> > > TEST $CLI volume set $V0 cluster.background-self-heal-count 0
> > > TEST $CLI volume start $V0
> > > > TEST glusterfs --volfile-id=/$V0 --volfile-server=$H0 $M0 --use-readdirp=no --attribute-timeout=0 --entry-timeout=0
> > > TEST touch $M0/a
> > > TEST kill_brick $V0 $H0 $B0/${V0}0
> > > TEST ln -s $M0/a $M0/s
> > > TEST ! stat $B0/${V0}0/s
> > > TEST stat $B0/${V0}1/s
> > > TEST $CLI volume start $V0 force
> > > EXPECT_WITHIN 20 "Y" glustershd_up_status
> > > EXPECT_WITHIN 20 "1" afr_child_up_status_in_shd $V0 0
> > > TEST $CLI volume heal $V0 full
> > > TEST trigger_mount_self_heal
> > > TEST stat $B0/${V0}0/s
> > > TEST stat $B0/${V0}1/s
> > > cleanup
> > >
> > > > According to git bisect run, the commit which introduced this problem is:
> > >
> > > 837422858c2e4ab447879a4141361fd382645406
> > > commit 837422858c2e4ab447879a4141361fd382645406
> > > > Author: Anand Avati <avati at redhat.com>
> > > Date: Thu Nov 21 06:48:17 2013 -0800
> > >
> > > core: fix errno for non-existent GFID
> > >
> > > > When clients refer to a GFID which does not exist, the errno to
> > > > be returned is ESTALE (and not ENOENT). Even though ENOENT might
> > > > look "proper" most of the time, as the application eventually
> > > > expects ENOENT even if a parent directory does not exist, not
> > > > returning ESTALE results in resolvers (FUSE and GFAPI) not
> > > > retrying resolution in uncached mode. This can result in spurious
> > > > ENOENTs during concurrent path modification operations.
> > >
> > > Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
> > > BUG: 1032894
> > > > Signed-off-by: Anand Avati <avati at redhat.com>
> > > > Reviewed-on: http://review.gluster.org/6322
> > > > Tested-by: Gluster Build System <jenkins at build.gluster.com>
> > >
> > > > Affected branches: master, 3.5, 3.4
> > >
> > > > I will be working with Venkatesh to get a fix for this on all
> > > > these branches. Good catch, Venkatesh! Thanks a lot for a simple
> > > > case to re-create the issue :-).
> >
> > > Thanks for the analysis, Pranith & Venkatesh! Let us make sure that
> > > we add this test case to our regression tests.
> >
> > >
> > > > Vijay,
> > > > Do you think we need this patch for 3.4 as well? Did we get
> > > > enough baking time? The change seems delicate, in the sense that
> > > > all the places that expect ENOENT need to be carefully examined.
> > > > Even if we miss one place, we have a potential bug.
> >
> >
> > > We would need to fix this in 3.4, failing which we will end up with
> > > a regression from 3.4.1. For 3.4.2, we have two options:
> >
> > 1. Revert the original commit
> >
> > 2. Fix this problem
>
> If we fix this problem, we will only be fixing this particular problem.
> We don't know if there are more similar issues. That is the reason I am
> a bit concerned about the nature of the change introduced by the
> original commit.
>
> Pranith
>
> >
> > I think we can reach a decision after you post a fix. We can base our
> > decision on the complexity/intrusiveness of the new patch.
> >
> > -Vijay
> >
> >
> >
> >
>
> --
> /Religious confuse piety with mere ritual, the virtuous confuse
> regulation with outcomes/