[Bugs] [Bug 1398566] self-heal info command hangs after triggering self-heal
bugzilla at redhat.com
bugzilla at redhat.com
Sat Nov 26 11:10:35 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1398566
--- Comment #2 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: http://review.gluster.org/15929 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com)
------
commit 2fe8ba52108e94268bc816ba79074a96c4538271
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date: Fri Nov 25 15:54:30 2016 +0530
cluster/afr: Fix deadlock due to compound fops
When an afr data transaction is eligible for using
eager-lock, this information is represented in
local->transaction.eager_lock_on. However, if non-blocking
inodelk attempt (which is a full lock) fails, AFR falls back
to blocking locks which are range locks. At this point,
local->transaction.eager_lock[] per brick is reset but
local->transaction.eager_lock_on is still true.
When AFR decides to compound post-op and unlock, it is after
confirming that the transaction did not use eager lock (well,
except for a small bug where local->transaction.locks_acquired[]
is not considered).
But within afr_post_op_unlock_do(), afr again incorrectly sets
the lock range to full-lock based on local->transaction.eager_lock_on
value. This is a bug and can lead to deadlock since the locks acquired
were range locks and a full unlock is being sent leading to unlock failure
and thereby every other lock request (be it from SHD or other clients or
glfsheal) getting blocked forever and the user perceives a hang.
FIX:
Unconditionally rely on the range locks in inodelk object for unlocking
when using compounded post-op + unlock.
Big thanks to Pranith for helping with the debugging.
Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67
BUG: 1398566
Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
Reviewed-on: http://review.gluster.org/15929
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
--
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=K1qClCgTrm&a=cc_unsubscribe
More information about the Bugs
mailing list