[Bugs] [Bug 1611116] New: 'custom extended attributes' set on a directory are not healed after bringing back the down sub-volumes

bugzilla at redhat.com bugzilla at redhat.com
Thu Aug 2 05:49:59 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1611116

            Bug ID: 1611116
           Summary: 'custom extended attributes' set on a directory are
                    not healed after bringing back the down sub-volumes
           Product: GlusterFS
           Version: 4.1
         Component: distribute
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: khiremat at redhat.com
                CC: bugs at gluster.org, moagrawa at redhat.com,
                    rhs-bugs at redhat.com, sankarshan at redhat.com,
                    storage-qa-internal at redhat.com, tdesala at redhat.com
        Depends On: 1582119, 1584098



+++ This bug was initially created as a clone of Bug #1584098 +++

+++ This bug was initially created as a clone of Bug #1582119 +++

Description of problem:
=======================
'custom extended attributes' set on a directory are not healed after bringing
back the down sub-volumes.

Client:
=======
getfattr -n user.foo c
# file: c
user.foo="bar1"
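
For reference, an attribute like the one above would have been set from the
FUSE mount along these lines (a sketch; the exact invocation is not in this
report):

setfattr -n user.foo -v bar1 c   # set the custom xattr from the mount
getfattr -n user.foo c           # verify it is visible to the client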

Backend bricks:
===============
[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/distrepx3-b0/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000007ffffffc9ffffffa

[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000009ffffffbbffffff9


Version-Release number of selected component (if applicable):
3.12.2-11.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
====================
1) Create a distributed-replicated volume and start it.
2) FUSE mount it on a client.
3) From the client, create a few directories of depth 3.
4) Bring down a few DHT sub-volumes using the gf_attach command. (I brought
down 2 DHT sub-volumes and 1 brick in another replica pair.)
5) Make metadata changes to the directories, e.g. uid, gid, permissions and
setxattr.
6) Bring back the down sub-volumes.
7) Check all the bricks for consistency; a scripted sketch of these steps
follows.
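
A scripted sketch of the steps above (hostnames, brick paths and the brick
PID are illustrative; the reporter used gf_attach to bring bricks down, but
killing the brick process listed by 'gluster v status' has the same effect):

# 1-2) create, start and FUSE-mount a distributed-replicated volume
gluster volume create distrepx3 replica 3 \
    h1:/bricks/b0 h2:/bricks/b0 h3:/bricks/b0 \
    h1:/bricks/b1 h2:/bricks/b1 h3:/bricks/b1
gluster volume start distrepx3
mount -t glusterfs h1:/distrepx3 /mnt/distrepx3

# 3) a few directories of depth 3
mkdir -p /mnt/distrepx3/a/b/c

# 4) bring down some bricks (PID taken from 'gluster v status')
kill -15 <brick-pid>

# 5) metadata changes while the bricks are down
chown user1:user1 /mnt/distrepx3/a
chmod 777 /mnt/distrepx3/a
setfattr -n user.foo -v bar1 /mnt/distrepx3/a

# 6) restart the down bricks
gluster volume start distrepx3 force

# 7) compare xattrs across all the backend bricks, on every host
getfattr -d -e hex -m . /bricks/b0/a /bricks/b1/a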

Actual results:
================
'custom extended attributes' are not healed after bringing back the down
sub-volumes.

Expected results:
=================
No inconsistencies.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-05-24
05:31:08 EDT ---

This bug is automatically being proposed for the release of Red Hat Gluster
Storage 3 under active development and open for bug fixes, by setting the
release flag 'rhgs-3.4.0' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from Prasad Desala on 2018-05-24 05:36:37 EDT ---

sosreports@ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/
Collected xattr from all the backend bricks @
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/

volname: distrepx3
protocol: FUSE

[root@dhcp42-143 /]# gluster v info distrepx3

Volume Name: distrepx3
Type: Distributed-Replicate
Volume ID: 4bd32cc9-0020-400f-9b48-af4bb50210d2
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 3 = 24
Transport-type: tcp
Bricks:
Brick1: 10.70.42.143:/bricks/brick0/distrepx3-b0
Brick2: 10.70.43.41:/bricks/brick0/distrepx3-b0
Brick3: 10.70.43.35:/bricks/brick0/distrepx3-b0
Brick4: 10.70.43.37:/bricks/brick0/distrepx3-b0
Brick5: 10.70.42.143:/bricks/brick1/distrepx3-b1
Brick6: 10.70.43.41:/bricks/brick1/distrepx3-b1
Brick7: 10.70.43.35:/bricks/brick1/distrepx3-b1
Brick8: 10.70.43.37:/bricks/brick1/distrepx3-b1
Brick9: 10.70.42.143:/bricks/brick2/distrepx3-b2
Brick10: 10.70.43.41:/bricks/brick2/distrepx3-b2
Brick11: 10.70.43.35:/bricks/brick2/distrepx3-b2
Brick12: 10.70.43.37:/bricks/brick2/distrepx3-b2
Brick13: 10.70.42.143:/bricks/brick3/distrepx3-b3
Brick14: 10.70.43.41:/bricks/brick3/distrepx3-b3
Brick15: 10.70.43.35:/bricks/brick3/distrepx3-b3
Brick16: 10.70.43.37:/bricks/brick3/distrepx3-b3
Brick17: 10.70.42.143:/bricks/brick4/distrepx3-b4
Brick18: 10.70.43.41:/bricks/brick4/distrepx3-b4
Brick19: 10.70.43.35:/bricks/brick4/distrepx3-b4
Brick20: 10.70.43.37:/bricks/brick4/distrepx3-b4
Brick21: 10.70.42.143:/bricks/brick5/distrepx3-b5
Brick22: 10.70.43.41:/bricks/brick5/distrepx3-b5
Brick23: 10.70.43.35:/bricks/brick5/distrepx3-b5
Brick24: 10.70.43.37:/bricks/brick5/distrepx3-b5
Options Reconfigured:
diagnostics.client-log-level: TRACE
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Brought down bricks:
====================
[root@dhcp42-143 ~]# gluster v status
Status of volume: distrepx3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3
-b0                                         N/A       N/A        N       N/A  
Brick 10.70.43.41:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.43.35:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.43.37:/bricks/brick0/distrepx3-
b0                                          N/A       N/A        N       N/A  
Brick 10.70.42.143:/bricks/brick1/distrepx3
-b1                                         N/A       N/A        N       N/A  
Brick 10.70.43.41:/bricks/brick1/distrepx3-
b1                                          N/A       N/A        N       N/A  
Brick 10.70.43.35:/bricks/brick1/distrepx3-
b1                                          N/A       N/A        N       N/A  
Brick 10.70.43.37:/bricks/brick1/distrepx3-
b1                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3
-b2                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-
b2                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-
b2                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick2/distrepx3-
b2                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3
-b3                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-
b3                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-
b3                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick3/distrepx3-
b3                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3
-b4                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-
b4                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-
b4                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick4/distrepx3-
b4                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3
-b5                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-
b5                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-
b5                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick5/distrepx3-
b5                                          49153     0          Y       31268
Self-heal Daemon on localhost               N/A       N/A        Y       14510
Self-heal Daemon on 10.70.43.41             N/A       N/A        Y       21427
Self-heal Daemon on 10.70.43.37             N/A       N/A        Y       6899 
Self-heal Daemon on 10.70.43.35             N/A       N/A        Y       9416 

Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks


[root@dhcp42-143 /]# gluster v status distrepx3
Status of volume: distrepx3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3
-b0                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick0/distrepx3-
b0                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick0/distrepx3-
b0                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick0/distrepx3-
b0                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick1/distrepx3
-b1                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick1/distrepx3-
b1                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick1/distrepx3-
b1                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick1/distrepx3-
b1                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3
-b2                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-
b2                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-
b2                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick2/distrepx3-
b2                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3
-b3                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-
b3                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-
b3                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick3/distrepx3-
b3                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3
-b4                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-
b4                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-
b4                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick4/distrepx3-
b4                                          49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3
-b5                                         49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-
b5                                          49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-
b5                                          49155     0          Y       1609 
Brick 10.70.43.37:/bricks/brick5/distrepx3-
b5                                          49153     0          Y       31268
Self-heal Daemon on localhost               N/A       N/A        Y       5898 
Self-heal Daemon on 10.70.43.41             N/A       N/A        Y       13498
Self-heal Daemon on 10.70.43.35             N/A       N/A        Y       809  
Self-heal Daemon on 10.70.43.37             N/A       N/A        Y       30996

Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks

Mount point output:
==================
[root@dhcp37-110 distrepx3_new]# ls -lRt
.:
total 24
drwxr-xr-x. 3 root  root  4096 May 24 11:29 f
drwxrwxrwx. 3 root  root  4096 May 24 11:29 e
drwxr-xr-x. 3 user1 user1 4096 May 24 11:28 d
drwxr-xr-x. 3 user1 root  4096 May 24 11:28 c
drwxr-xr-x. 3 root  user1 4096 May 24 11:28 b
drwxr-xr-x. 3 root  root  4096 May 24 11:28 a

./f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 i

./f/i:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 j

./f/i/j:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 k

./f/i/j/k:
total 0

./e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./e/f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 g

./e/f/g:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 h

./e/f/g/h:
total 0

./d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./d/e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f

./d/e/f:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 g

./d/e/f/g:
total 0

./c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./c/d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e

./c/d/e:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 f

./c/d/e/f:
total 0

./b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./b/c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d

./b/c/d:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 e

./b/c/d/e:
total 0

./a:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 b

./a/b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c

./a/b/c:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 d

./a/b/c/d:
total 0

--- Additional comment from Mohit Agrawal on 2018-05-29 04:56:08 EDT ---

Hi,

  I have analyzed the root cause of why the xattr was not healed. The
  internal MDS xattr was not updated on one of the AFR children because that
  child was down at the time the xattr was updated. Ideally, after the down
  sub-volumes are started again, AFR should heal it. But if AFR returns an
  MDS value of 0 to DHT from the wrong subvol, DHT will not take any action
  to heal the xattr. I was able to reproduce this and discussed it with
  Karthik, so I am changing the component from DHT to AFR.

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.37/37_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x6261723


http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.35/35_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0x00000000
user.foo=0x62617231
user.foo1=0x62617232


http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.42.143/143.bug_b2
# file: bricks/brick2/distrepx3-b2/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x62617232
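
The inconsistency is easy to spot by dumping just the MDS key on each brick
of the replica set (a sketch; the hosts and path are taken from the dumps
above):

for h in 10.70.43.37 10.70.43.35; do
    ssh root@$h getfattr -n trusted.glusterfs.dht.mds -e hex \
        /bricks/brick1/distrepx3-b1/c
done
# prints 0xfffffffd on .37 but 0x00000000 on .35 for the same directory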


Regards
Mohit Agrawal

--- Additional comment from Ravishankar N on 2018-05-29 04:59:59 EDT ---

Assigning to Karthik as he's taking a look at this.

--- Additional comment from Mohit Agrawal on 2018-05-30 05:33:29 EDT ---

Hi,

I have checked this further with the AFR team: the internal MDS xattr was
not healed after the sub-volumes were started because posix ignores it, so
I am assigning this to myself to resolve it.

Regards
Mohit Agrawal

--- Additional comment from Worker Ant on 2018-05-30 05:44:53 EDT ---

REVIEW: https://review.gluster.org/20102 (dht: Delete MDS internal xattr from
dict in dht_getxattr_cbk) posted (#1) for review on master by MOHIT AGRAWAL

--- Additional comment from Worker Ant on 2018-06-02 23:22:36 EDT ---

COMMIT: https://review.gluster.org/20102 committed in master by "Raghavendra G"
<rgowdapp at redhat.com> with a commit message: dht: Delete MDS internal xattr
from dict in dht_getxattr_cbk

Problem: When AFR fetches xattrs in order to heal them, it cannot
         fetch the MDS xattr, because posix_getxattr has a check
         that ignores any xattr whose name is MDS.

Solution: Filter out the same xattr with a check in dht_getxattr_cbk
          instead of having the check in posix_getxattr.

BUG: 1584098
Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1584098
Signed-off-by: Mohit Agrawal <moagrawa at redhat.com>
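
With this change the internal MDS key is filtered on the DHT side rather
than in posix, so AFR's heal path can still read it from the bricks while
clients never see it. A hedged check (mount path illustrative):

getfattr -n trusted.glusterfs.dht.mds /mnt/distrepx3/c
# expected to fail: DHT now strips the internal key in dht_getxattr_cbk
# before unwinding to the client

getfattr -n trusted.glusterfs.dht.mds -e hex /bricks/brick1/distrepx3-b1/c
# expected on the brick: the key is present and, once self-heal completes,
# identical on all replicas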


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1582119
[Bug 1582119] 'custom extended attributes' set on a directory are not
healed after bringing back the down sub-volumes
https://bugzilla.redhat.com/show_bug.cgi?id=1584098
[Bug 1584098] 'custom extended attributes' set on a directory are not
healed after bringing back the down sub-volumes
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

