[Bugs] [Bug 1541038] New: A down brick is incorrectly considered to be online and makes the volume to be started without any brick available
bugzilla at redhat.com
Thu Feb 1 15:04:38 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1541038
Bug ID: 1541038
Summary: A down brick is incorrectly considered to be online
and makes the volume to be started without any brick
available
Product: GlusterFS
Version: mainline
Component: replicate
Assignee: bugs at gluster.org
Reporter: jahernan at redhat.com
CC: bugs at gluster.org
Description of problem:
In a replica 2 volume, if one of the bricks is down and reports its state
before the online one does, AFR tries to find another online brick in
find_best_down_child(). Because the priv->child_up array is initialized to -1
and this function only checks whether an entry is 0, it treats the other
brick, which has not yet reported its state, as alive and sends a CHILD_UP
notification.
At this point the other xlators start sending requests, which fail with
ENOTCONN when they reach AFR. This can cause several unexpected errors.
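The failure mode can be illustrated with a minimal, self-contained sketch. It
is not the AFR source: the enum names and both helper functions are
hypothetical, and only the values -1 (initial) and 0 (down) are taken from the
description above. The strict variant is one possible way to avoid the
problem, not necessarily the fix that was applied.

```c
#include <stddef.h>

/* Hypothetical tri-state for a brick. The report only states that entries
 * start at -1 and that 0 means down; 1 for "up" is an assumption. */
enum { CHILD_UNKNOWN = -1, CHILD_DOWN = 0, CHILD_UP = 1 };

/* Models the reported behaviour: any entry that is not 0 counts as online,
 * so a brick that has not reported yet (-1) is mistaken for a live one. */
int any_child_up_loose(const signed char *child_up, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (child_up[i] != CHILD_DOWN)
            return 1;
    }
    return 0;
}

/* Stricter alternative: only an explicit "up" value counts as online. */
int any_child_up_strict(const signed char *child_up, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (child_up[i] == CHILD_UP)
            return 1;
    }
    return 0;
}
```

With the state { CHILD_DOWN, CHILD_UNKNOWN }, which corresponds to the down
brick reporting first, the loose check reports an online brick while the
strict check does not.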
Version-Release number of selected component (if applicable): mainline
How reproducible:
It happens randomly, depending on the order in which bricks are started.
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.