[Gluster-infra] [Bug 1408660] Setup a CentOS 7 VM to test split-brain-favorite-child-policy.t failures

bugzilla at redhat.com
Mon Dec 26 12:32:24 UTC 2016


Ravishankar N <ravishankar at redhat.com> changed:

           What    |Removed                     |Added
                 CC|                            |ravishankar at redhat.com

--- Comment #2 from Ravishankar N <ravishankar at redhat.com> ---
Thanks for the setup, Nigel. What is happening is this:

When `TEST dd if=/dev/urandom of=$M0/file bs=1024 count=1024` is run with a
brick down on Fedora, CentOS 6, etc., there are only pending data heals because
writevs are the only FOPs hitting the file.

# getfattr -d -m . -e hex /d/backends/patchy*/file
getfattr: Removing leading '/' from absolute path names
# file: d/backends/patchy0/file

But when the same test is run on CentOS 7, there is also a removexattr FOP:
afr_removexattr (frame=0x7f61e00163cc, this=0x7f61e800b8a0, loc=0x7f61dc03a33c,
name=0x7f61dc02b760 "security.ima", xdata=0x7f61e001310c) at

Since a brick is down, dirty is set in the pre-op, and since security.ima
is not present on the brick, the cbk gets an op_ret of -1 and errno=ENODATA.
This is treated as a symmetric error, so the FOP is treated as a success and
dirty is not unset. Thus we have:

[root at centos-7-test glusterfs]# g /d/backends/patchy*/file
getfattr: Removing leading '/' from absolute path names
# file: d/backends/patchy0/file
trusted.afr.dirty=0x000000000000000100000000 <------- This is not cleared.
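
To read the value above: as far as I know, AFR changelog xattrs such as
trusted.afr.dirty are 12 bytes holding three 32-bit big-endian counters, in
the order data, metadata, entry. A minimal sketch for splitting the hex string
follows; decode_afr_xattr is a name made up for this illustration, not
something from the gluster tree.

```shell
# Split a 12-byte AFR changelog xattr (hex form, as printed by
# `getfattr -e hex`) into its three 32-bit big-endian counters.
# Assumed layout: data, metadata, entry (4 bytes each).
decode_afr_xattr() {
    hex=${1#0x}                       # drop the leading 0x, if any
    if [ "${#hex}" -ne 24 ]; then
        echo "expected 24 hex digits, got ${#hex}" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)
    metadata=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    # printf %d accepts 0x-prefixed hex constants
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$metadata" "0x$entry"
}

decode_afr_xattr 0x000000000000000100000000
```

For the value shown, this prints data=0 metadata=1 entry=0, i.e. only the
metadata counter is still set, which is what triggers the metadata self-heal.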

Now when both bricks are up, metadata self-heal happens and updates the ctimes
on the bricks as a part of undo-pending. Since this is done in a for loop over
the bricks, the 2nd brick is updated last and ends up with the latest ctime.

This breaks the assumption in the .t that the 1st brick has the latest ctime
(which would have been the case had it not been for the removexattr FOP).
The heal therefore runs in the opposite direction, and the md5sum comparison
check in the .t fails.

I will fix the .t to note the ctime of both bricks from the back end and then
pick the one with the latest ctime as the source. I'm retaining the machine
until I test and send the patch.
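
A rough sketch of the ctime comparison, not the actual patch:
latest_ctime_path is a hypothetical helper, and it assumes GNU stat (`-c %Z`
gives ctime in whole seconds since the epoch).

```shell
# Print whichever of the given paths has the most recent ctime.
# On a tie (same second), the path listed first wins.
latest_ctime_path() {
    best_path=
    best_ctime=-1
    for path in "$@"; do
        ctime=$(stat -c %Z "$path") || return 1
        if [ "$ctime" -gt "$best_ctime" ]; then
            best_ctime=$ctime
            best_path=$path
        fi
    done
    printf '%s\n' "$best_path"
}

# Possible use from the back end (paths as in the getfattr output above):
#   latest_ctime_path /d/backends/patchy*/file
```

Since %Z has one-second resolution, two bricks healed within the same second
would compare as equal here, so a real fix may need a finer-grained
comparison.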
