[Bugs] [Bug 1641761] New: Spurious failures in bug-1637802-arbiter-stale-data-heal-lock.t
bugzilla at redhat.com
Mon Oct 22 16:08:02 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1641761
Bug ID: 1641761
Summary: Spurious failures in bug-1637802-arbiter-stale-data-heal-lock.t
Product: GlusterFS
Version: 4.1
Component: tests
Keywords: Triaged
Assignee: bugs at gluster.org
Reporter: ravishankar at redhat.com
CC: bugs at gluster.org
Depends On: 1641344
+++ This bug was initially created as a clone of Bug #1641344 +++
Problem:
https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing
this .t spuriously. On checking one of the failure logs, I see:
22:05:44 Launching heal operation to perform index self heal on volume
patchy has been unsuccessful:
22:05:44 Self-heal daemon is not running. Check self-heal daemon log file.
22:05:44 not ok 20 , LINENUM:38
In glusterd log:
[2018-10-18 22:05:44.298832] E [MSGID: 106301]
[glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation
'Volume Heal' failed on localhost : Self-heal daemon is not running. Check
self-heal daemon log file
But the tests that precede this one check, via a statedump, whether the shd
is connected to the bricks. Those checks succeeded, and the shd had even
started healing. From glustershd.log:
[2018-10-18 22:05:40.975268] I [MSGID: 108026]
[afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed
data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2
So the only reason I can see for the heal launch via CLI failing is a race:
the shd has been spawned, but glusterd has not yet updated its in-memory
state to mark the shd as up, so it rejects the CLI request.
Fix:
Check for shd up status before launching heal via CLI
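A minimal sketch of the idea behind the fix, assuming a generic polling
helper: retry a status check until it prints the expected value or a timeout
expires, and only then launch the heal. The function name wait_for_output is
hypothetical and is not part of the GlusterFS test framework.

```shell
#!/bin/bash
# Hypothetical helper (not from the GlusterFS tree): poll a check command
# until it prints the expected value or the timeout (in seconds) expires.
# Usage: wait_for_output TIMEOUT EXPECTED CMD [ARGS...]
wait_for_output() {
    local timeout=$1 expected=$2
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    while :; do
        # A failing or empty check simply counts as "not ready yet".
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        # Give up once the deadline has passed.
        [ "$(date +%s)" -ge "$deadline" ] && return 1
        sleep 1
    done
}
```

In the test itself this would gate the heal launch, for example
`wait_for_output 20 Y shd_status_check && gluster volume heal patchy`, where
shd_status_check is a placeholder for whatever command reports glusterd's
view of the shd. The test framework appears to offer an equivalent in
EXPECT_WITHIN combined with a glustershd_up_status helper, which is probably
what the actual patch uses; that is an assumption, not something stated in
this report.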
--- Additional comment from Worker Ant on 2018-10-21 08:17:59 EDT ---
REVIEW: https://review.gluster.org/21451 (tests: check for shd up status in
bug-1637802-arbiter-stale-data-heal-lock.t) posted (#1) for review on master by
Ravishankar N
--- Additional comment from Worker Ant on 2018-10-22 09:49:30 EDT ---
COMMIT: https://review.gluster.org/21451 committed in master by "Pranith Kumar
Karampuri" <pkarampu at redhat.com> with a commit message- tests: check for shd up
status in bug-1637802-arbiter-stale-data-heal-lock.t
Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4
fixes: bz#1641344
Signed-off-by: Ravishankar N <ravishankar at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1641344
[Bug 1641344] Spurious failures in bug-1637802-arbiter-stale-data-heal-lock.t