[Bugs] [Bug 1356976] New: seeing dataheal pending bits bump up when source databrick is down and arbiter does metadata and entry heal

Fri Jul 15 12:16:27 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1356976

            Bug ID: 1356976
           Summary: seeing dataheal pending bits bump up when source
                    databrick is down and arbiter does metadata and entry
                    heal
           Product: GlusterFS
           Version: 3.7.9
         Component: arbiter
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: nchilaka at redhat.com
                CC: bugs at gluster.org

Description of problem:
==========================
I see that the data-heal pending bits in the trusted.afr xattr are getting
bumped up when the arbiter brick heals a destination data brick for metadata
and entry heal while the source data brick is down.

Version-Release number of selected component (if applicable):
glusterfs 3.9dev built on Jul 11 2016 10:04:54

How reproducible:
always

Steps to Reproduce:
================
1. Create a 1x(2+1) replicate arbiter volume.
2. Mount the volume over FUSE.
3. Create a directory, say dir1.
4. Bring down the first data brick (db1).
5. Create a file, say f1, under dir1 with some contents.
6. Note down the getfattr details from both db2 and ab1.
For example, below is the info:
===>from db2:
[root@dhcp43-153 ~]# getfattr -d -m . -e hex /bricks/brick1/arbit/db1_Down/datafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/arbit/db1_Down/datafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.arbit-client-0=0x000000020000000200000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005788cb300009088d
trusted.gfid=0x091d29ddf4e149da83531686e59818de

===>from ab1:
[root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick2/arbit/db1_Down/datafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick2/arbit/db1_Down/datafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.arbit-client-0=0x000000020000000200000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005788cb3000085f14
trusted.gfid=0x091d29ddf4e149da83531686e59818de

7. Now bring down the other data brick too, i.e. db2.
8. Bring up db1, which was down, while keeping db2 down.
9. Check heal info and trigger a manual heal.
10. Once the entry and metadata heal is over, check the xattr info on ab1 for
the file. The data pending count has been bumped up (from 2 to 3):

[root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick2/arbit/db1_Down/datafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick2/arbit/db1_Down/datafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.arbit-client-0=0x000000030000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005788cb3000085f14
trusted.gfid=0x091d29ddf4e149da83531686e59818de

This shouldn't be happening: only metadata and entry heal were performed, so
the data pending count should not increase.
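
For anyone decoding the dumps above: each trusted.afr.<volume>-client-N value
is 12 bytes, read here as three big-endian 32-bit pending counters in the
order data, metadata, entry. Below is a minimal Python sketch of that
decoding, applied to the two ab1 values from this report. The helper name
decode_afr_pending is hypothetical and not part of GlusterFS.

```python
import struct


def decode_afr_pending(hex_value):
    """Split a 12-byte trusted.afr.* value into its three pending counters.

    Assumes the layout is three big-endian unsigned 32-bit integers in the
    order: data, metadata, entry.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# trusted.afr.arbit-client-0 on ab1, copied from the getfattr output above.
before = decode_afr_pending("0x000000020000000200000000")  # before the heal
after = decode_afr_pending("0x000000030000000000000000")   # after the heal

print("before heal:", before)
print("after heal: ", after)
```

Read this way, the metadata counter drops from 2 to 0 as expected after the
heal, while the data counter goes from 2 to 3, which is the unexpected bump
described above.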
