[Bugs] [Bug 1684385] [ovirt-gluster] Rolling gluster upgrade from 3.12.5 to 5.3 led to shard on-disk xattrs disappearing

Fri Mar 1 07:23:02 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1684385

--- Comment #2 from Krutika Dhananjay <kdhananj at redhat.com> ---
On further investigation, it was found that the shard xattrs were genuinely
missing on all 3 replicas -

[root at tendrl27 ~]# getfattr -d -m . -e hex /gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0x3ad3f0c6a4e64b17bd2997c32ecc54d7
trusted.gfid2path.5f2a4f417210b896=0x64373265323737612d353761642d343136322d613065332d6339346463316231366230322f696473

[root at localhost ~]# getfattr -d -m . -e hex /gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x0000000e0000000000000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0x3ad3f0c6a4e64b17bd2997c32ecc54d7
trusted.gfid2path.5f2a4f417210b896=0x64373265323737612d353761642d343136322d613065332d6339346463316231366230322f696473

[root at tendrl25 ~]# getfattr -d -m . -e hex /gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/engine/engine/36ea5b11-19fb-4755-b664-088f6e5c4df2/dom_md/ids
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000100000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.gfid=0x3ad3f0c6a4e64b17bd2997c32ecc54d7
trusted.gfid2path.5f2a4f417210b896=0x64373265323737612d353761642d343136322d613065332d6339346463316231366230322f696473
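
For anyone wanting to repeat this check on other files, here is a minimal Python sketch that parses `getfattr -d -e hex` output and reports which shard xattrs are absent. It assumes the usual AFR changelog layout of three big-endian 32-bit counters (data, metadata, entry) in each 12-byte trusted.afr.* value. The helper names (parse_getfattr, decode_afr_pending, missing_shard_xattrs) are made up for illustration and are not part of any gluster tooling.

```python
import struct

# The two shard xattrs this bug is about.
SHARD_KEYS = (
    "trusted.glusterfs.shard.block-size",
    "trusted.glusterfs.shard.file-size",
)


def parse_getfattr(text):
    """Parse `getfattr -d -e hex` output for one file into {name: bytes}."""
    xattrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the "# file:" header and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if value.startswith("0x"):
            xattrs[name] = bytes.fromhex(value[2:])
    return xattrs


def decode_afr_pending(raw):
    """Split a 12-byte trusted.afr.* value into (data, metadata, entry).

    Assumption: three big-endian unsigned 32-bit counters in that order.
    """
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return struct.unpack(">III", raw)


def missing_shard_xattrs(xattrs):
    """Return the shard xattr names that are not present on the file."""
    return [key for key in SHARD_KEYS if key not in xattrs]


# Values copied from the tendrl25 dump above.
SAMPLE = """\
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000100000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.gfid=0x3ad3f0c6a4e64b17bd2997c32ecc54d7
trusted.gfid2path.5f2a4f417210b896=0x64373265323737612d353761642d343136322d613065332d6339346463316231366230322f696473
"""

if __name__ == "__main__":
    parsed = parse_getfattr(SAMPLE)
    print("pending on client-0:", decode_afr_pending(parsed["trusted.afr.engine-client-0"]))
    print("missing shard xattrs:", missing_shard_xattrs(parsed))
```

Run against each brick's dump, this makes it easy to confirm that both shard keys are absent everywhere, and to see the pending data counters that the other two bricks hold against client-0.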

Also, from the logs, it appears the file underwent metadata self-heal moments
before these errors started to appear:
[2019-02-26 13:35:37.253896] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-engine-replicate-0:
performing metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7
[2019-02-26 13:35:37.254734] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.file-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.254749] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.block-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.255777] I [MSGID: 108026]
[afr-self-heal-common.c:1729:afr_log_selfheal] 0-engine-replicate-0: Completed
metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7. sources=[0]  sinks=2
[2019-02-26 13:35:37.258032] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-engine-replicate-0:
performing metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7
[2019-02-26 13:35:37.258792] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.file-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.258807] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.block-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.259633] I [MSGID: 108026]
[afr-self-heal-common.c:1729:afr_log_selfheal] 0-engine-replicate-0: Completed
metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7. sources=[0]  sinks=2 
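
To look for other files hit the same way, a rough Python sketch like the following could scan a client or self-heal daemon log and list the keys reported as "not sent on wire" between the start and the completion of each metadata heal. Pairing warnings with a gfid purely by log order is a heuristic, since concurrent heals may interleave, and the function name dropped_keys_by_gfid is hypothetical. The message patterns are taken from the excerpt above.

```python
import re

# One alternative per event of interest; patterns follow the log excerpt above.
EVENT = re.compile(
    r"performing metadata selfheal on (?P<start>[0-9a-f-]{36})"
    r"|Completed metadata selfheal on (?P<done>[0-9a-f-]{36})"
    r"|key '(?P<key>[^']+)' is not sent on wire"
)


def dropped_keys_by_gfid(log_text):
    """Map gfid -> set of keys that dict_to_xdr warned about during its heal."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = re.sub(r"\s+", " ", log_text)
    result = {}
    current = None
    for match in EVENT.finditer(flat):
        if match.group("start"):
            current = match.group("start")
            result.setdefault(current, set())
        elif match.group("done"):
            current = None
        elif current is not None:
            result[current].add(match.group("key"))
    return result


# Shortened copy of the first heal in the excerpt above.
SAMPLE_LOG = """\
[2019-02-26 13:35:37.253896] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-engine-replicate-0:
performing metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7
[2019-02-26 13:35:37.254734] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.file-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.254749] W [MSGID: 101016] [glusterfs3.h:752:dict_to_xdr]
0-dict: key 'trusted.glusterfs.shard.block-size' is not sent on wire [Invalid
argument]
[2019-02-26 13:35:37.255777] I [MSGID: 108026]
[afr-self-heal-common.c:1729:afr_log_selfheal] 0-engine-replicate-0: Completed
metadata selfheal on 3ad3f0c6-a4e6-4b17-bd29-97c32ecc54d7. sources=[0]  sinks=2
"""

if __name__ == "__main__":
    for gfid, keys in dropped_keys_by_gfid(SAMPLE_LOG).items():
        print(gfid, sorted(keys))
```

Any gfid that shows up with a non-empty key set is a candidate for the same xattr loss and worth checking on the bricks with getfattr.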

Metadata heal, as we know, does three things:
1. bulk getxattr from the source brick;
2. removexattr on the sink bricks;
3. bulk setxattr on the sink bricks.

But what's clear from these logs is that the dict_to_xdr() messages were logged
at the time of metadata heal, indicating that the shard xattrs were possibly
not "sent on wire" as part of step 3.
This turns out to be due to the dict_to_xdr() code newly introduced in 5.3,
which is absent in 3.12.5.

The bricks were upgraded to 5.3 in the order tendrl25 followed by tendrl26,
with tendrl27 still at 3.12.5 when this issue was hit:

Tendrl25:
[2019-02-26 12:47:53.595647] I [MSGID: 100030] [glusterfsd.c:2715:main]
0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 5.3 (args:
/usr/sbin/glusterfsd -s tendrl25.lab.eng.blr.redhat.com --volfile-id
engine.tendrl25.lab.eng.blr.redhat.com.gluster_bricks-engine-engine -p
/var/run/gluster/vols/engine/tendrl25.lab.eng.blr.redhat.com-gluster_bricks-engine-engine.pid
-S /var/run/gluster/aae83600c9a783dd.socket --brick-name
/gluster_bricks/engine/engine -l
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log --xlator-option
*-posix.glusterd-uuid=9373b871-cfce-41ba-a815-0b330f6975c8 --process-name brick
--brick-port 49153 --xlator-option engine-server.listen-port=49153)

Tendrl26:
[2019-02-26 13:35:05.718052] I [MSGID: 100030] [glusterfsd.c:2715:main]
0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 5.3 (args:
/usr/sbin/glusterfsd -s tendrl26.lab.eng.blr.redhat.com --volfile-id
engine.tendrl26.lab.eng.blr.redhat.com.gluster_bricks-engine-engine -p
/var/run/gluster/vols/engine/tendrl26.lab.eng.blr.redhat.com-gluster_bricks-engine-engine.pid
-S /var/run/gluster/8010384b5524b493.socket --brick-name
/gluster_bricks/engine/engine -l
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log --xlator-option
*-posix.glusterd-uuid=18fa886f-8d1a-427c-a5e6-9a4e9502ef7c --process-name brick
--brick-port 49153 --xlator-option engine-server.listen-port=49153)

Tendrl27:
[root at tendrl27 bricks]# rpm -qa | grep gluster
glusterfs-fuse-3.12.15-1.el7.x86_64
glusterfs-libs-3.12.15-1.el7.x86_64
glusterfs-3.12.15-1.el7.x86_64
glusterfs-server-3.12.15-1.el7.x86_64
glusterfs-client-xlators-3.12.15-1.el7.x86_64
glusterfs-api-3.12.15-1.el7.x86_64
glusterfs-events-3.12.15-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.4.x86_64
glusterfs-gnfs-3.12.15-1.el7.x86_64
glusterfs-geo-replication-3.12.15-1.el7.x86_64
glusterfs-cli-3.12.15-1.el7.x86_64
vdsm-gluster-4.20.46-1.el7.x86_64
python2-gluster-3.12.15-1.el7.x86_64
glusterfs-rdma-3.12.15-1.el7.x86_64

And as per the metadata heal logs, the source was brick0 (corresponding to
tendrl27) and the sink was brick2 (corresponding to tendrl25).
This means step 1 of metadata heal did a getxattr on tendrl27, which was still
at 3.12.5, and got back a dict in the older format, whose entries lack the
"value" type (introduced only in 5.3).
The same dict was then used for the setxattr in step 3. There, the
dict_to_xdr() conversion in protocol/client failed for
"trusted.glusterfs.shard.block-size" and "trusted.glusterfs.shard.file-size",
so these xattrs were silently left out of the setxattr request, while the
operation as a whole still succeeded. So afr thought the heal succeeded
although the xattrs that needed heal were never sent over the wire.
This led to one or more files ending up with their shard xattrs removed on
disk, which makes pretty much every subsequent operation on them fail.
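
The sequence can be illustrated with a small toy model in Python. This is only a sketch of the failure pattern described above, not the actual afr or protocol/client code: the Value tuple, to_wire() and metadata_heal() are invented names, a dtype of None stands in for a dict entry that came from the old brick without a value type, and "user.example" is a made-up xattr used as a control.

```python
from collections import namedtuple

# dtype=None models an entry whose value type is unknown to the serializer.
Value = namedtuple("Value", ["data", "dtype"])


def to_wire(xattrs, warnings):
    """Toy serializer: skips untyped entries, records a warning, never fails."""
    wire = {}
    for key, value in xattrs.items():
        if value.dtype is None:
            warnings.append("key '%s' is not sent on wire" % key)
            continue
        wire[key] = value.data
    return wire


def metadata_heal(source, sink, warnings):
    """Toy version of the three heal steps; returns True on 'success'."""
    fetched = dict(source)                    # step 1: bulk getxattr from source
    for key in list(sink):                    # step 2: removexattr on the sink
        del sink[key]
    sink.update(to_wire(fetched, warnings))   # step 3: bulk setxattr on the sink
    return True                               # skipped keys do not fail the heal


if __name__ == "__main__":
    source = {
        "trusted.glusterfs.shard.block-size": Value(b"\x00" * 8, None),
        "trusted.glusterfs.shard.file-size": Value(b"\x00" * 32, None),
        "user.example": Value(b"x", "str"),
    }
    sink = {
        "trusted.glusterfs.shard.block-size": b"\x00" * 8,
        "trusted.glusterfs.shard.file-size": b"\x00" * 32,
    }
    warnings = []
    print("heal ok:", metadata_heal(source, sink, warnings))
    print("sink xattrs after heal:", sorted(sink))
    print("warnings:", warnings)
```

The point of the model is the ordering: because step 2 removes the xattrs before step 3 re-adds them, any key the serializer skips in step 3 ends up deleted on the sink, while the heal still reports success.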
