[Bugs] [Bug 1611116] New: 'custom extended attributes' set on a directory are not healed after bringing back the down sub-volumes
bugzilla at redhat.com
Thu Aug 2 05:49:59 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1611116
Bug ID: 1611116
Summary: 'custom extended attributes' set on a directory are
not healed after bringing back the down sub-volumes
Product: GlusterFS
Version: 4.1
Component: distribute
Severity: high
Assignee: bugs at gluster.org
Reporter: khiremat at redhat.com
CC: bugs at gluster.org, moagrawa at redhat.com,
rhs-bugs at redhat.com, sankarshan at redhat.com,
storage-qa-internal at redhat.com, tdesala at redhat.com
Depends On: 1582119, 1584098
+++ This bug was initially created as a clone of Bug #1584098 +++
+++ This bug was initially created as a clone of Bug #1582119 +++
Description of problem:
=======================
'custom extended attributes' set on a directory are not healed after bringing
back the down sub-volumes.
Client:
=======
getfattr -n user.foo c
# file: c
user.foo="bar1"
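For reference, a custom xattr like the one above is set from the FUSE mount
with setfattr; a minimal sketch (directory name and value taken from the
output above):
setfattr -n user.foo -v bar1 c
getfattr -n user.foo c    # should print user.foo="bar1"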
Backend bricks:
===============
[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/distrepx3-b0/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000007ffffffc9ffffffa
[root@dhcpnode1 distrepx3-b0]# getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x00000000000000009ffffffbbffffff9
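Note that user.foo shows up on the client but is absent from both brick
copies above. Querying it directly on a brick backend is expected to fail,
e.g. (path from the dump above):
getfattr -n user.foo /bricks/brick0/distrepx3-b0/c
# expected: getfattr reports "user.foo: No such attribute"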
Version-Release number of selected component (if applicable):
3.12.2-11.el7rhgs.x86_64
How reproducible:
1/1
Steps to Reproduce:
====================
1) Create a distributed-replicated volume and start it.
2) FUSE-mount it on a client.
3) From the client, create a few directories of depth 3.
4) Bring down a few DHT sub-volumes using the gf_attach command (I brought
down 2 DHT sub-volumes and 1 brick in another replica pair); see the sketch
after this list.
5) Make metadata changes to the directories: uid, gid, permissions, setxattr.
6) Bring back the down sub-volumes.
7) Check all the bricks for consistency.
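A minimal shell sketch of these steps (volume name, mount point, brick and
socket paths are illustrative; gf_attach applies to brick-multiplexed setups):
gluster volume create distrepx3 replica 3 <host>:<brick> ...   # 8x3 layout
gluster volume start distrepx3
mount -t glusterfs host1:/distrepx3 /mnt/distrepx3             # step 2
mkdir -p /mnt/distrepx3/a/b/c                                  # step 3
# step 4: detach the bricks of the chosen sub-volumes
gf_attach -d /var/run/gluster/<brick>.socket /bricks/brick0/distrepx3-b0
# step 5: metadata changes from the mount
chown user1:user1 /mnt/distrepx3/a
chmod 755 /mnt/distrepx3/a
setfattr -n user.foo -v bar1 /mnt/distrepx3/a
# step 6: bring the down bricks back
gluster volume start distrepx3 force
# step 7: compare xattrs on every brick
getfattr -d -e hex -m . /bricks/brick0/distrepx3-b0/a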
Actual results:
================
'custom extended attributes' are not healed after bringing back the down
sub-volumes.
Expected results:
=================
No inconsistencies.
--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-05-24
05:31:08 EDT ---
This bug is automatically being proposed for the release of Red Hat Gluster
Storage 3 under active development and open for bug fixes, by setting the
release flag 'rhgs-3.4.0' to '?'.
If this bug should be proposed for a different release, please manually change
the proposed release flag.
--- Additional comment from Prasad Desala on 2018-05-24 05:36:37 EDT ---
sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/
Collected xattr from all the backend bricks @
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/
volname: distrepx3
protocol: FUSE
[root@dhcp42-143 /]# gluster v info distrepx3
Volume Name: distrepx3
Type: Distributed-Replicate
Volume ID: 4bd32cc9-0020-400f-9b48-af4bb50210d2
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 3 = 24
Transport-type: tcp
Bricks:
Brick1: 10.70.42.143:/bricks/brick0/distrepx3-b0
Brick2: 10.70.43.41:/bricks/brick0/distrepx3-b0
Brick3: 10.70.43.35:/bricks/brick0/distrepx3-b0
Brick4: 10.70.43.37:/bricks/brick0/distrepx3-b0
Brick5: 10.70.42.143:/bricks/brick1/distrepx3-b1
Brick6: 10.70.43.41:/bricks/brick1/distrepx3-b1
Brick7: 10.70.43.35:/bricks/brick1/distrepx3-b1
Brick8: 10.70.43.37:/bricks/brick1/distrepx3-b1
Brick9: 10.70.42.143:/bricks/brick2/distrepx3-b2
Brick10: 10.70.43.41:/bricks/brick2/distrepx3-b2
Brick11: 10.70.43.35:/bricks/brick2/distrepx3-b2
Brick12: 10.70.43.37:/bricks/brick2/distrepx3-b2
Brick13: 10.70.42.143:/bricks/brick3/distrepx3-b3
Brick14: 10.70.43.41:/bricks/brick3/distrepx3-b3
Brick15: 10.70.43.35:/bricks/brick3/distrepx3-b3
Brick16: 10.70.43.37:/bricks/brick3/distrepx3-b3
Brick17: 10.70.42.143:/bricks/brick4/distrepx3-b4
Brick18: 10.70.43.41:/bricks/brick4/distrepx3-b4
Brick19: 10.70.43.35:/bricks/brick4/distrepx3-b4
Brick20: 10.70.43.37:/bricks/brick4/distrepx3-b4
Brick21: 10.70.42.143:/bricks/brick5/distrepx3-b5
Brick22: 10.70.43.41:/bricks/brick5/distrepx3-b5
Brick23: 10.70.43.35:/bricks/brick5/distrepx3-b5
Brick24: 10.70.43.37:/bricks/brick5/distrepx3-b5
Options Reconfigured:
diagnostics.client-log-level: TRACE
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
Brought down bricks:
====================
[root@dhcp42-143 ~]# gluster v status
Status of volume: distrepx3
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3-b0  N/A       N/A        N       N/A
Brick 10.70.43.41:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.43.35:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.43.37:/bricks/brick0/distrepx3-b0   N/A       N/A        N       N/A
Brick 10.70.42.143:/bricks/brick1/distrepx3-b1  N/A       N/A        N       N/A
Brick 10.70.43.41:/bricks/brick1/distrepx3-b1   N/A       N/A        N       N/A
Brick 10.70.43.35:/bricks/brick1/distrepx3-b1   N/A       N/A        N       N/A
Brick 10.70.43.37:/bricks/brick1/distrepx3-b1   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3-b2  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-b2   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-b2   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick2/distrepx3-b2   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3-b3  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-b3   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-b3   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick3/distrepx3-b3   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3-b4  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-b4   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-b4   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick4/distrepx3-b4   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3-b5  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-b5   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-b5   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick5/distrepx3-b5   49153     0          Y       31268
Self-heal Daemon on localhost                   N/A       N/A        Y       14510
Self-heal Daemon on 10.70.43.41                 N/A       N/A        Y       21427
Self-heal Daemon on 10.70.43.37                 N/A       N/A        Y       6899
Self-heal Daemon on 10.70.43.35                 N/A       N/A        Y       9416
Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks
[root@dhcp42-143 /]# gluster v status distrepx3
Status of volume: distrepx3
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.143:/bricks/brick0/distrepx3-b0  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick0/distrepx3-b0   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick0/distrepx3-b0   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick0/distrepx3-b0   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick1/distrepx3-b1  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick1/distrepx3-b1   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick1/distrepx3-b1   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick1/distrepx3-b1   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick2/distrepx3-b2  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick2/distrepx3-b2   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick2/distrepx3-b2   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick2/distrepx3-b2   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick3/distrepx3-b3  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick3/distrepx3-b3   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick3/distrepx3-b3   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick3/distrepx3-b3   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick4/distrepx3-b4  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick4/distrepx3-b4   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick4/distrepx3-b4   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick4/distrepx3-b4   49153     0          Y       31268
Brick 10.70.42.143:/bricks/brick5/distrepx3-b5  49152     0          Y       14419
Brick 10.70.43.41:/bricks/brick5/distrepx3-b5   49154     0          Y       21299
Brick 10.70.43.35:/bricks/brick5/distrepx3-b5   49155     0          Y       1609
Brick 10.70.43.37:/bricks/brick5/distrepx3-b5   49153     0          Y       31268
Self-heal Daemon on localhost                   N/A       N/A        Y       5898
Self-heal Daemon on 10.70.43.41                 N/A       N/A        Y       13498
Self-heal Daemon on 10.70.43.35                 N/A       N/A        Y       809
Self-heal Daemon on 10.70.43.37                 N/A       N/A        Y       30996
Task Status of Volume distrepx3
------------------------------------------------------------------------------
There are no active volume tasks
Mount point output:
==================
[root@dhcp37-110 distrepx3_new]# ls -lRt
.:
total 24
drwxr-xr-x. 3 root root 4096 May 24 11:29 f
drwxrwxrwx. 3 root root 4096 May 24 11:29 e
drwxr-xr-x. 3 user1 user1 4096 May 24 11:28 d
drwxr-xr-x. 3 user1 root 4096 May 24 11:28 c
drwxr-xr-x. 3 root user1 4096 May 24 11:28 b
drwxr-xr-x. 3 root root 4096 May 24 11:28 a
./f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 i
./f/i:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 j
./f/i/j:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 k
./f/i/j/k:
total 0
./e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f
./e/f:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 g
./e/f/g:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 h
./e/f/g/h:
total 0
./d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e
./d/e:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:29 f
./d/e/f:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:29 g
./d/e/f/g:
total 0
./c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d
./c/d:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 e
./c/d/e:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 f
./c/d/e/f:
total 0
./b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c
./b/c:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 d
./b/c/d:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 e
./b/c/d/e:
total 0
./a:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 b
./a/b:
total 4
drwxr-xr-x. 3 root root 4096 May 24 11:28 c
./a/b/c:
total 4
drwxr-xr-x. 2 root root 4096 May 24 11:28 d
./a/b/c/d:
total 0
--- Additional comment from Mohit Agrawal on 2018-05-29 04:56:08 EDT ---
Hi,
I have analyzed the root cause of why the xattr was not healed. The internal
xattr (MDS) was not updated on one AFR child because that child was down at
the time the xattr was updated. Ideally, AFR should heal it once the down
sub-volumes are started. But if AFR returns the MDS value 0 to DHT from the
wrong sub-volume, DHT will not take any action to heal the xattr. I was able
to reproduce this, and after discussing it with Karthik I am changing the
component from DHT to AFR.
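The divergence is easy to see by comparing the DHT MDS marker for the same
directory across the replica bricks, e.g. (brick paths from this report):
# run on each replica host
getfattr -n trusted.glusterfs.dht.mds -e hex /bricks/brick1/distrepx3-b1/c
The dumps below show the mismatch: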
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.37/37_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x62617232
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.43.35/35_b1
# file: bricks/brick1/distrepx3-b1/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0x00000000
user.foo=0x62617231
user.foo1=0x62617232
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1582066/xattr/10.70.42.143/143.bug_b2
# file: bricks/brick2/distrepx3-b2/c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.distrepx3-client-6=0x000000000000000000000000
trusted.gfid=0x326aa2338e3846afa3006e873b313416
trusted.glusterfs.dht=0x0000000000000000bffffffadffffff8
trusted.glusterfs.dht.mds=0xfffffffd
user.foo=0x62617231
user.foo1=0x62617232
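The hex values decode directly to their ASCII payloads; for example,
0x62617231 is "bar1":
echo 62617231 | xxd -r -p; echo    # prints: bar1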
Regards
Mohit Agrawal
--- Additional comment from Ravishankar N on 2018-05-29 04:59:59 EDT ---
Assigning to Karthik as he's taking a look at this.
--- Additional comment from Mohit Agrawal on 2018-05-30 05:33:29 EDT ---
Hi,
I checked further with the AFR team: the MDS internal xattr was not healed
after the sub-volume was started because posix ignores it (posix_getxattr
filters out the MDS key), so AFR never gets it back to heal. I am assigning
this to myself to resolve.
Regards
Mohit Agrawal
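To illustrate the comment above: the MDS key is present on the backend
filesystem itself, e.g. (path illustrative):
getfattr -n trusted.glusterfs.dht.mds -e hex /bricks/brick1/distrepx3-b1/c
but a getxattr issued through the brick stack was filtered by posix_getxattr,
so the self-heal machinery could never read it in order to heal it.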
--- Additional comment from Worker Ant on 2018-05-30 05:44:53 EDT ---
REVIEW: https://review.gluster.org/20102 (dht: Delete MDS internal xattr from
dict in dht_getxattr_cbk) posted (#1) for review on master by MOHIT AGRAWAL
--- Additional comment from Worker Ant on 2018-06-02 23:22:36 EDT ---
COMMIT: https://review.gluster.org/20102 committed in master by "Raghavendra G"
<rgowdapp at redhat.com> with commit message: dht: Delete MDS internal xattr
from dict in dht_getxattr_cbk
Problem: When AFR fetches xattrs in order to heal them, it cannot fetch the
MDS xattr because posix_getxattr has a check that ignores any xattr whose
name is the MDS key.
Solution: Perform that check in dht_getxattr_cbk instead of in
posix_getxattr.
BUG: 1584098
Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1584098
Signed-off-by: Mohit Agrawal <moagrawa at redhat.com>
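With the fix, the filtering moves from posix_getxattr to dht_getxattr_cbk:
bricks can serve the MDS xattr internally (so heal works), while DHT still
strips it from replies to clients. A quick verification sketch (paths
illustrative; expected behaviour, not captured output):
# client view: the internal MDS key must not leak into listings
getfattr -d -e hex -m . /mnt/distrepx3/c     # no trusted.glusterfs.dht.mds
# brick view after heal: custom xattrs and MDS marker agree on all replicas
getfattr -d -e hex -m . /bricks/brick1/distrepx3-b1/c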
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1582119
[Bug 1582119] 'custom extended attributes' set on a directory are not
healed after bringing back the down sub-volumes
https://bugzilla.redhat.com/show_bug.cgi?id=1584098
[Bug 1584098] 'custom extended attributes' set on a directory are not
healed after bringing back the down sub-volumes