[Bugs] [Bug 1541928] New: A down brick is incorrectly considered to be online and makes the volume to be started without any brick available
bugzilla at redhat.com
Mon Feb 5 08:53:54 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1541928
Bug ID: 1541928
Summary: A down brick is incorrectly considered to be online
and makes the volume to be started without any brick
available
Product: GlusterFS
Version: 4.0
Component: replicate
Assignee: bugs at gluster.org
Reporter: jahernan at redhat.com
CC: bugs at gluster.org
Depends On: 1541038
+++ This bug was initially created as a clone of Bug #1541038 +++
Description of problem:
In a replica 2 volume, if one of the bricks is down and reports its state
before the online one does, AFR tries to find another online brick in
find_best_down_child(). The priv->child_up array is initialized with -1, but
this function only checks whether an entry is 0, so it concludes that the
other brick is alive and sends a CHILD_UP notification.
At this point the other xlators start sending requests, which fail with
ENOTCONN when they reach AFR. This can cause several unexpected errors.
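
The misread of the -1 initial value can be illustrated with a small
standalone sketch. This is not the AFR source and not the actual fix: the
enum names and both helper functions are hypothetical, and a plain int array
stands in for priv->child_up. It only shows how a "not 0" test treats a
brick that has not reported yet as alive, while an explicit "== 1" test
does not.

```c
#include <stddef.h>

/* Hypothetical names for the three values a child_up entry can hold:
 * -1 at initialization (no report yet), 0 when down, 1 when up. */
enum { CHILD_UNKNOWN = -1, CHILD_DOWN = 0, CHILD_UP = 1 };

/* Check in the style described in the report: any entry that is not 0
 * counts as alive, so an entry still at -1 is treated as online. */
static int
any_child_alive_loose(const int *child_up, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (child_up[i] != CHILD_DOWN)
            return 1;
    }
    return 0;
}

/* Stricter check (an assumption, for contrast only): a child counts as
 * alive only after it has explicitly reported that it is up. */
static int
any_child_alive_strict(const int *child_up, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (child_up[i] == CHILD_UP)
            return 1;
    }
    return 0;
}
```

With one brick reported down and the other still at its initial value, that
is, the array { CHILD_DOWN, CHILD_UNKNOWN }, the loose check returns 1 and
the strict check returns 0, which matches the premature CHILD_UP described
above.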
Version-Release number of selected component (if applicable): mainline
How reproducible:
It happens randomly, depending on the order in which bricks are started.
Steps to Reproduce:
Actual results:
Expected results:
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1541038
[Bug 1541038] A down brick is incorrectly considered to be online and makes
the volume to be started without any brick available
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.