[Gluster-devel] brick multiplexing regression is broken
Ravishankar N
ravishankar at redhat.com
Fri Oct 6 05:44:23 UTC 2017
On 10/06/2017 11:08 AM, Mohit Agrawal wrote:
> Without the patch, the test case will fail; that is expected behavior.
When I said without patches, I meant it is failing on the current HEAD of
master, which already contains commit
9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8.
-Ravi
>
>
> Regards
> Mohit Agrawal
>
> On Fri, Oct 6, 2017 at 11:04 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>
> The test is failing on master without any patches:
>
> [root at tuxpad glusterfs]# prove tests/bugs/bug-1371806_1.t
> tests/bugs/bug-1371806_1.t .. 7/9 setfattr: ./tmp1: No such file or directory
> setfattr: ./tmp2: No such file or directory
> setfattr: ./tmp3: No such file or directory
> setfattr: ./tmp4: No such file or directory
> setfattr: ./tmp5: No such file or directory
> setfattr: ./tmp6: No such file or directory
> setfattr: ./tmp7: No such file or directory
> setfattr: ./tmp8: No such file or directory
> setfattr: ./tmp9: No such file or directory
> setfattr: ./tmp10: No such file or directory
> ./tmp1: user.foo: No such attribute
> tests/bugs/bug-1371806_1.t .. Failed 2/9 subtests
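>
> In case anyone wants to retry this locally, a minimal sketch (assuming a
> master checkout and the regression framework's default fuse mount
> /mnt/glusterfs/0; adjust to your setup):
>
>   # verbose run, listing the failing subtests
>   prove -vf tests/bugs/bug-1371806_1.t
>   # or repeat the failing step by hand on the fuse mount
>   cd /mnt/glusterfs/0 && setfattr -n user.foo -v "abc" ./tmp1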
>
> Mount log for one of the directories:
> [2017-10-06 05:32:10.059798] I [MSGID: 109005] [dht-selfheal.c:2458:dht_selfheal_directory] 0-patchy-dht: Directory selfheal failed: Unable to form layout for directory /tmp1
> [2017-10-06 05:32:10.060013] E [MSGID: 109011] [dht-common.c:5011:dht_dir_common_setxattr] 0-patchy-dht: Failed to get mds subvol for path /tmp1gfid is 00000000-0000-0000-0000-000000000000
> [2017-10-06 05:32:10.060041] W [fuse-bridge.c:1377:fuse_err_cbk] 0-glusterfs-fuse: 99: SETXATTR() /tmp1 => -1 (No such file or directory)
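>
> In case it helps debugging, the dht xattrs can be dumped straight off
> the backend bricks (sketch only; /d/backends/patchy* is the regression
> framework's default brick path, so adjust to your setup):
>
>   # dump every xattr, hex-encoded, for /tmp1 on each brick
>   getfattr -d -m . -e hex /d/backends/patchy*/tmp1
>
> The all-zero gfid in the log above suggests the setxattr is wound before
> lookup selfheal has established an mds subvol for the directory.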
>
> Requesting the patch authors to take a look at it.
> Thanks
> Ravi
>
>
> On 10/05/2017 06:04 PM, Atin Mukherjee wrote:
>> The following commit has broken the brick multiplexing regression
>> job: tests/bugs/bug-1371806_1.t has failed a couple of times. One
>> of the more recent failing runs:
>> https://build.gluster.org/job/regression-test-with-multiplex/406/console
>>
>>
>> commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8
>> Author: Mohit Agrawal <moagrawa at redhat.com>
>> Date: Fri May 12 21:12:47 2017 +0530
>>
>> cluster/dht : User xattrs are not healed after brick stop/start
>>
>> Problem: In a distributed volume, the custom extended attribute
>>          value set on a directory does not show the correct value
>>          after a brick is stopped/started or newly added. If any
>>          extended attribute (user|acl|quota) is set on a directory
>>          while a brick is down, or after a brick has been added,
>>          the value is not updated on that brick once it starts.
>>
>> Solution: First store the hashed subvol (or the subvol that
>>           already carries the internal xattr) in the inode ctx and
>>           treat it as the MDS subvol. When updating a custom xattr
>>           (user, quota, acl, selinux) on a directory, first check
>>           for the MDS in the inode ctx; if no MDS is present
>>           there, return EINVAL to the application. Otherwise set
>>           the xattr on the MDS subvol with an internal xattr value
>>           of -1, and then try to update the attribute on the other
>>           non-MDS subvols as well. If the MDS subvol is down,
>>           return "Transport endpoint is not connected". In
>>           dht_dir_lookup_cbk|dht_revalidate_cbk|
>>           dht_discover_complete, call dht_call_dir_xattr_heal to
>>           heal the custom extended attributes. In the gnfs server
>>           case, if the hashed subvol cannot be found from the loc,
>>           wind the call to all subvols to update the xattr.
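>>
>> (Sketch of how the two error paths above surface on a fuse mount;
>> the mount path is a placeholder.)
>>
>>   # no mds yet in the inode ctx -> EINVAL:
>>   $ setfattr -n user.foo -v "abc" /mnt/testvol/tmp1
>>   setfattr: /mnt/testvol/tmp1: Invalid argument
>>   # mds subvol down -> ENOTCONN:
>>   $ setfattr -n user.foo -v "abc" /mnt/testvol/tmp2
>>   setfattr: /mnt/testvol/tmp2: Transport endpoint is not connected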
>>
>> Fix: 1) Save the MDS subvol in the inode ctx.
>>      2) Check whether the MDS subvol is present in the inode ctx.
>>      3) If the MDS subvol is down, unwind with ENOTCONN; if it is
>>         up, set the new xattr "GF_DHT_XATTR_MDS" to -1 and wind
>>         the call on the other subvols.
>>      4) If the setxattr fop succeeds on a non-MDS subvol,
>>         increment the internal xattr value by 1.
>>      5) At directory lookup time, check the value of the new
>>         xattr GF_DHT_XATTR_MDS.
>>      6) If the value is not 0 in dht_lookup_dir_cbk (and the
>>         other cbk functions), call the heal function to heal the
>>         user xattrs.
>>      7) syncop_setxattr on the hashed_subvol resets the xattr
>>         value to 0 once the heal succeeds on all subvols.
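>>
>> (To make steps 3)-7) concrete, a sketch of how the marker can be
>> observed from the backend. The volume name and brick paths are
>> placeholders, and trusted.glusterfs.dht.mds is my reading of
>> GF_DHT_XATTR_MDS, so treat it as an assumption.)
>>
>>   # with one non-MDS brick down, update a user xattr from the mount
>>   setfattr -n user.foo -v "abc" /mnt/testvol/tmp1
>>   # the MDS brick now carries a non-zero heal marker for /tmp1
>>   getfattr -n trusted.glusterfs.dht.mds -e hex /bricks/b1/tmp1
>>   # once the brick is back and a lookup triggers the heal, the
>>   # marker is reset to 0 (step 7)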
>>
>> Test: To reproduce the issue, follow the steps below.
>>       1) Create a distributed volume and a mount point.
>>       2) Create some directories from the mount point:
>>          mkdir tmp{1..5}
>>       3) Kill any one brick of the volume.
>>       4) Set an extended attribute on the directories from the
>>          mount point:
>>          setfattr -n user.foo -v "abc" ./tmp{1..5}
>>          It returns "Transport endpoint is not connected" for
>>          the directories whose hashed subvol is down.
>>       5) Start the volume with the force option to restart the
>>          brick process.
>>       6) Execute getfattr on the mount point for the directories.
>>       7) Check the extended attribute on the brick:
>>          getfattr -n user.foo <volume-location>/tmp{1..5}
>>          It shows the correct value for the directories whose
>>          xattr fops executed successfully.
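>>
>> (The same steps as one runnable sketch; the volume name, server and
>> paths are placeholders, not taken from the patch.)
>>
>>   gluster volume create testvol server1:/bricks/b{1..3} force
>>   gluster volume start testvol
>>   mount -t glusterfs server1:/testvol /mnt/testvol
>>   mkdir /mnt/testvol/tmp{1..5}                         # steps 1-2
>>   kill -9 <pid-of-one-brick-process>                   # step 3
>>   setfattr -n user.foo -v "abc" /mnt/testvol/tmp{1..5} # step 4
>>   gluster volume start testvol force                   # step 5
>>   getfattr -n user.foo /mnt/testvol/tmp{1..5}          # step 6
>>   getfattr -n user.foo /bricks/b1/tmp{1..5}            # step 7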
>>
>> Note: The patch resolves the xattr healing problem only for fuse
>>       mounts, not for nfs mounts.
>>
>> BUG: 1371806
>> Signed-off-by: Mohit Agrawal <moagrawa at redhat.com>
>>
>> Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
>>