[Gluster-devel] upstream: Symbolic link not getting healed

Vijay Bellur vbellur at redhat.com
Thu Dec 19 04:29:01 UTC 2013


On 12/19/2013 07:58 AM, Pranith Kumar Karampuri wrote:
> hi,
>      I used the following test to figure out the bad commit.
> #!/bin/bash
>
> . $(dirname $0)/../include.rc
> . $(dirname $0)/../volume.rc
>
> function trigger_mount_self_heal {
>          find $M0 | xargs stat
> }
>
> cleanup;
>
> TEST glusterd
> TEST pidof glusterd
> TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0,1}
> TEST $CLI volume set $V0 cluster.background-self-heal-count 0
> TEST $CLI volume start $V0
> TEST glusterfs --volfile-id=/$V0 --volfile-server=$H0 $M0 --use-readdirp=no --attribute-timeout=0 --entry-timeout=0
> TEST touch $M0/a
> TEST kill_brick $V0 $H0 $B0/${V0}0
> TEST ln -s $M0/a $M0/s
> TEST ! stat $B0/${V0}0/s
> TEST stat $B0/${V0}1/s
> TEST $CLI volume start $V0 force
> EXPECT_WITHIN 20 "Y" glustershd_up_status
> EXPECT_WITHIN 20 "1" afr_child_up_status_in_shd $V0 0
> TEST $CLI volume heal $V0 full
> TEST trigger_mount_self_heal
> TEST stat $B0/${V0}0/s
> TEST stat $B0/${V0}1/s
> cleanup
>
> According to git bisect run, the commit which introduced this problem is:
>
> 837422858c2e4ab447879a4141361fd382645406
> commit 837422858c2e4ab447879a4141361fd382645406
> Author: Anand Avati <avati at redhat.com>
> Date:   Thu Nov 21 06:48:17 2013 -0800
>
>      core: fix errno for non-existent GFID
>
>      When clients refer to a GFID which does not exist, the errno to
>      be returned in ESTALE (and not ENOENT). Even though ENOENT might
>      look "proper" most of the time, as the application eventually expects
>      ENOENT even if a parent directory does not exist, not returning
>      ESTALE results in resolvers (FUSE and GFAPI) to not retry resolution
>      in uncached mode. This can result in spurious ENOENTs during
>      concurrent path modification operations.
>
>      Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
>      BUG: 1032894
>      Signed-off-by: Anand Avati <avati at redhat.com>
>      Reviewed-on: http://review.gluster.org/6322
>      Tested-by: Gluster Build System <jenkins at build.gluster.com>
>
> Affected branches: master, 3.5, 3.4
>
> Will be working with Venkatesh to get a fix for this on all these branches.
> Good catch, Venkatesh! Thanks a lot for a simple case to re-create the issue :-).

Thanks for the analysis, Pranith & Venkatesh! Let us make sure that we 
add this test case to our regression tests.

>
> Vijay,
>       Do you think we need this patch for 3.4 as well? Did we get enough baking time? The change seems delicate, in the sense that every place that expects ENOENT needs to be carefully examined. If we miss even one, we have a potential bug.


We would need to fix this in 3.4; failing that, we will end up with a 
regression from 3.4.1. For 3.4.2, we have two options:

1. Revert the original commit

2. Fix this problem

I think we can reach a decision after you post a fix. We can base our 
decision on the complexity/intrusiveness of the new patch.

-Vijay






