[Bugs] [Bug 1801624] New: Heal pending on brick post upgrading from RHV 4.2.8 or RHV 4.3.7 to RHV 4.3.8

bugzilla at redhat.com bugzilla at redhat.com
Tue Feb 11 11:44:53 UTC 2020


https://bugzilla.redhat.com/show_bug.cgi?id=1801624

            Bug ID: 1801624
           Summary: Heal pending on brick post upgrading from RHV 4.2.8 or
                    RHV 4.3.7 to RHV 4.3.8
           Product: GlusterFS
           Version: mainline
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: replicate
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: ravishankar at redhat.com
                CC: bugs at gluster.org, mwaykole at redhat.com,
                    ravishankar at redhat.com, rhs-bugs at redhat.com,
                    sasundar at redhat.com
        Depends On: 1792821
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1792821 +++

Description of problem:
The .prob* file is found on one brick but missing on the other 2 bricks.

Version-Release number of selected component (if applicable):

RHGS 3.5.1 (6.0-28)
RHVH-4.3.8


Steps to Reproduce:
1. Create a VM.
2. Run I/O in the background.
3. While the I/O is running, kill one engine brick.
4. Wait for 10 minutes.
5. Restart glusterd.
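
For reference, a minimal sketch of steps 3-5 in script form, assuming a
hypothetical engine brick path (the path below is a placeholder, not taken
from the reported setup):
------------------------
# Rough sketch of steps 3-5; the brick path is a placeholder.
import subprocess
import time

BRICK_PATH = "/gluster_bricks/engine/engine"   # assumed engine brick path

# Step 3: kill the glusterfsd process serving the engine brick while the
# background I/O keeps running.
subprocess.run(["pkill", "-f", "glusterfsd.*" + BRICK_PATH], check=False)

# Step 4: wait for 10 minutes.
time.sleep(600)

# Step 5: restart glusterd, which also respawns the killed brick process.
subprocess.run(["systemctl", "restart", "glusterd"], check=True)
------------------------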

Actual results:
The .prob file is missing on 2 bricks.

Expected results:
There should be no heal pending on the engine volume, and the .prob file should
be present on all the engine bricks.


--- Additional comment from Ravishankar N on 2020-01-24 12:40:44 UTC ---

On looking at the setup, we found that the entry was not getting healed because
the parent dir did not have any entry-pending xattrs. The test (thanks Sas for
the info) that writes to the prob file apparently unlinks the file before
continuing to write to it, so maybe the expected result is that the file be
_removed_ from all bricks, not that it is present on them:
------------------------
import mmap
import os
import stat
import time  # only needed if the commented-out sleeps are re-enabled

# 'path' is the prob file's path on the gluster mount.
f = os.open(path, os.O_WRONLY | os.O_DIRECT | os.O_DSYNC | os.O_CREAT |
            os.O_EXCL, stat.S_IRUSR | stat.S_IWUSR)
#time.sleep(20)
os.unlink(path)  # the file is unlinked right after creation; writes continue via the fd

#time.sleep(20)
m = mmap.mmap(-1, 1024)  # page-aligned buffer, as required by O_DIRECT
s = b' ' * 1024

m.write(s)
os.write(f, m)  # write the 1 KiB buffer through the still-open fd
os.close(f)
------------------------
So it looks like one of the bricks (engine-client-0) was killed at the time the
prob file was unlinked, so the unlink did not go through on it. But AFR should
have marked pending xattrs during the post-op on the good bricks (so that
self-heal later removes the prob file from this brick as well). I do not see
any network errors in the client log that could explain a post-op failure, so
I'm not sure what happened here. We need to see if this can be consistently
recreated. Leaving a need-info on Milind for the same. We need the exact times
the bricks were killed and restarted to correlate them with the logs.
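
A quick way to verify the presence or absence of those markers is to dump the
parent directory's trusted.afr.* xattrs directly on each brick; a minimal
sketch, where the hostnames, brick root and parent directory path are all
assumptions:
------------------------
# Sketch: dump the AFR changelog xattrs of the prob file's parent directory
# on every brick. Hostnames, brick root and parent directory are placeholders.
import subprocess

HOSTS = ["server1", "server2", "server3"]
BRICK_ROOT = "/gluster_bricks/engine/engine"
PARENT_DIR = "path/to/parent"   # directory that contained the prob file

for host in HOSTS:
    # trusted.afr.<vol>-client-N holds three 32-bit counters (data, metadata,
    # entry); a non-zero entry counter on the good bricks is what tells
    # self-heal to replay the unlink on the brick that missed it.
    subprocess.run(
        ["ssh", host, "getfattr", "-d", "-m", "trusted.afr", "-e", "hex",
         f"{BRICK_ROOT}/{PARENT_DIR}"],
        check=False,
    )
------------------------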

--- Additional comment from SATHEESARAN on 2020-02-10 07:09:33 UTC ---

I have also seen the same behavior when upgrading from RHV 4.2.8 to RHV 4.3.8
and also from RHV 4.3.7 to RHV 4.3.8.

During this upgrade, one of the bricks was killed, and the gluster software was
upgraded from RHGS 3.4.4 (gluster-3.12.2-47.5) to RHGS 3.5.1 (gluster-6.0-29).

After upgrading one of the nodes, the he.metadata and he.lockspace files were
shown as pending heal, and that continued forever. On checking their GFIDs,
they mismatched with the same files on the other 2 bricks, but self-heal was
not happening, as the changelog entry was missing in the parent directory.
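
For reference, that GFID comparison can be done by reading the trusted.gfid
xattr of the file directly on each brick; a rough sketch, where the hostnames
and the on-brick path are assumptions:
------------------------
# Sketch: compare the trusted.gfid xattr of the same file across the three
# bricks. Hostnames and the on-brick path are placeholders.
import subprocess

HOSTS = ["server1", "server2", "server3"]
PATH = "/gluster_bricks/engine/engine/path/to/he.metadata"  # assumed path

gfids = {}
for host in HOSTS:
    out = subprocess.run(
        ["ssh", host, "getfattr", "-n", "trusted.gfid", "-e", "hex", PATH],
        capture_output=True, text=True,
    ).stdout
    # getfattr prints a line of the form: trusted.gfid=0x<32 hex digits>
    values = [line for line in out.splitlines()
              if line.startswith("trusted.gfid=")]
    gfids[host] = values[0] if values else None

# A value that differs on one brick is the GFID mismatch seen here; entry
# self-heal can only resolve it via the parent directory's changelog.
if len(set(gfids.values())) > 1:
    print("GFID mismatch across bricks:", gfids)
------------------------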

--- Additional comment from Ravishankar N on 2020-02-10 07:30:02 UTC ---

So I am able to reproduce the issue fairly consistently. 
1. Create a 1x3 volume with RHHI options enabled.
2. Create and write to a file from the mount.
3. Bring one brick down, delete and re-create the file so that there is pending
(granular) entry heal.
4. With the brick still down, launch the index heal.
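
A rough command-level sketch of these four steps follows; the volume name,
server names, brick paths and mount point are all assumptions, and the virt
option group is only an approximation of the RHHI option set:
------------------------
# Local reproduction sketch of the four steps above; every server name,
# brick path, volume name and mount point below is a placeholder.
import subprocess

VOL = "testvol"
MOUNT = "/mnt/testvol"
BRICK0 = "/bricks/brick0"   # the brick that will be brought down

def sh(cmd):
    subprocess.run(cmd, shell=True, check=False)

# 1. Create a 1x3 replica volume, apply the virt group (a stand-in for the
#    RHHI option set) and enable granular entry heal.
sh(f"gluster volume create {VOL} replica 3 "
   f"server1:{BRICK0} server2:/bricks/brick1 server3:/bricks/brick2 force")
sh(f"gluster volume set {VOL} group virt")
sh(f"gluster volume start {VOL}")
sh(f"gluster volume heal {VOL} granular-entry-heal enable")

# 2. Create and write to a file from the fuse mount.
sh(f"mount -t glusterfs server1:/{VOL} {MOUNT}")
sh(f"dd if=/dev/zero of={MOUNT}/probfile bs=64k count=16 oflag=sync")

# 3. Bring one brick down, then delete and re-create the file so that a
#    (granular) entry-heal record is queued against the parent directory.
sh(f"pkill -f 'glusterfsd.*{BRICK0}'")
sh(f"rm -f {MOUNT}/probfile")
sh(f"dd if=/dev/zero of={MOUNT}/probfile bs=64k count=16 oflag=sync")

# 4. With the brick still down, trigger index heal.
sh(f"gluster volume heal {VOL}")
------------------------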
Even though there is nothing to be healed (since the sink brick is still down),
index heal seems to be doing a no-op and resetting the parent dir's AFR
changelog xattrs, which is why the entry never gets healed.
The same race is happening in the QE setup: even before the upgraded node comes
online, the shd performs the entry heal described above. We can see messages
like these in the shd log, where there is no 'source' and the good bricks are
the 'sinks':
[2020-02-10 05:57:55.847756] I [MSGID: 108026]
[afr-self-heal-common.c:1750:afr_log_selfheal] 0-testvol-replicate-0: Completed
entry selfheal on 77dd5a45-dbf5-4592-b31b-b440382302e9. sources= sinks=0 2

I need to check where the bug is in the code, if it is specific to granular
entry heal and how to fix it.

--- Additional comment from Ravishankar N on 2020-02-11 11:41:54 UTC ---

(In reply to Ravishankar N from comment #8)
> I need to check where the bug is in the code, if it is specific to granular
> entry heal and how to fix it.

So the GFID split-brain will happen only if granular entry heal is enabled, but
even otherwise, when only the two good bricks are up, spurious entry heals are
triggered continuously, leading to multiple unnecessary network operations. I'm
sending a fix upstream for review.
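
As a quick related check, whether granular entry heal is in effect on a given
volume can be confirmed with volume get; a minimal sketch (the volume name is
an assumption):
------------------------
# Sketch: check whether granular entry heal is enabled on a volume.
# The volume name is a placeholder.
import subprocess

VOL = "engine"
subprocess.run(
    ["gluster", "volume", "get", VOL, "cluster.granular-entry-heal"],
    check=False,
)
------------------------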


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1792821
[Bug 1792821] Heal pending on brick post upgrading from RHV 4.2.8 or RHV 4.3.7
to RHV 4.3.8
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

