[Bugs] [Bug 1176756] New: glusterd: remote locking failure when multiple synctask transactions are run
bugzilla at redhat.com
Tue Dec 23 04:23:26 UTC 2014
https://bugzilla.redhat.com/show_bug.cgi?id=1176756
Bug ID: 1176756
Summary: glusterd: remote locking failure when multiple
synctask transactions are run
Product: GlusterFS
Version: 3.6.0
Component: glusterd
Keywords: Triaged
Assignee: bugs at gluster.org
Reporter: amukherj at redhat.com
CC: bugs at gluster.org, gluster-bugs at redhat.com
Depends On: 1173414
+++ This bug was initially created as a clone of Bug #1173414 +++
Description of problem:
When two volume set operations are run simultaneously in a loop on two
different volumes, some of the transactions fail with a remote lock failure.
Version-Release number of selected component (if applicable):
Mainline
How reproducible:
Always
Steps to Reproduce:
1. Set up a 2-node cluster.
2. Create two volumes, say vol1 and vol2, and start them.
3. Run the following script from any one of the nodes in the cluster:
for i in {1..10}
do
gluster v set vol1 diagnostics.client-log-level DEBUG &
gluster v set vol2 features.barrier on
done
Actual results:
Some of the transactions fail with the message "Locking failed in <Peer node>,
Please check log file for details".
Expected results:
Local locking might fail, but remote locking should never fail here.
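To make the failure rate easier to see, the reproduction loop above can be wrapped so that its output is captured and the remote lock failures are counted. This is only a sketch: the log path and the helper names run_pair and count_lock_failures are hypothetical, the added "wait" is not in the original loop (drop it to match the original exactly), and it assumes vol1 and vol2 already exist and are started. The string matched is the one quoted under "Actual results".

```shell
#!/usr/bin/env bash
# Sketch only: LOG and both function names are illustrative, not part of gluster.
# Assumes volumes vol1 and vol2 exist and are started, as in the steps above.

LOG=${LOG:-/tmp/volset-race.log}

# Run one concurrent pair of volume set operations, appending all output to $LOG.
run_pair() {
    gluster v set vol1 diagnostics.client-log-level DEBUG >>"$LOG" 2>&1 &
    gluster v set vol2 features.barrier on >>"$LOG" 2>&1
    wait   # addition: let the backgrounded vol1 command finish before the next round
}

# Print how many lines of the given file report a locking failure.
# grep -c exits non-zero when the count is 0, so mask that exit status.
count_lock_failures() {
    grep -c 'Locking failed' "$1" || true
}
```

After sourcing the file, running `for i in {1..10}; do run_pair; done` followed by `count_lock_failures "$LOG"` prints the number of captured "Locking failed" messages. Per the expected results, that count should be zero once the fix is in place.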
Additional info:
--- Additional comment from Anand Avati on 2014-12-12 00:50:13 EST ---
REVIEW: http://review.gluster.org/9269 (glusterd: Maintain per transaction
xaction_peers list in syncop) posted (#1) for review on master by Atin
Mukherjee (amukherj at redhat.com)
--- Additional comment from Anand Avati on 2014-12-16 07:05:30 EST ---
REVIEW: http://review.gluster.org/9269 (glusterd: Maintain per transaction
xaction_peers list in syncop & mgmt_v3) posted (#2) for review on master by
Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Anand Avati on 2014-12-17 01:52:55 EST ---
REVIEW: http://review.gluster.org/9269 (glusterd: Maintain per transaction
xaction_peers list in syncop & mgmt_v3) posted (#3) for review on master by
Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Anand Avati on 2014-12-22 02:00:50 EST ---
REVIEW: http://review.gluster.org/9269 (glusterd: Maintain per transaction
xaction_peers list in syncop & mgmt_v3) posted (#4) for review on master by
Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Anand Avati on 2014-12-22 03:39:26 EST ---
REVIEW: http://review.gluster.org/9269 (glusterd: Maintain per transaction
xaction_peers list in syncop & mgmt_v3) posted (#5) for review on master by
Atin Mukherjee (amukherj at redhat.com)
--- Additional comment from Anand Avati on 2014-12-22 23:14:19 EST ---
COMMIT: http://review.gluster.org/9269 committed in master by Kaushal M
(kaushal at redhat.com)
------
commit da9deb54df91dedc51ebe165f3a0be646455cb5b
Author: Atin Mukherjee <amukherj at redhat.com>
Date: Fri Dec 12 07:21:19 2014 +0530
glusterd: Maintain per transaction xaction_peers list in syncop & mgmt_v3
In the current implementation the xaction_peers list is maintained in a
global variable (glustrd_priv_t) for syncop/mgmt_v3. This means consistency
and atomicity of the peerinfo list across transactions are not guaranteed
when multiple syncop/mgmt_v3 transactions are going through.
We hit this problem in mgmt_v3-locks.t, which was failing spuriously. The
reason was that two volume set operations (on two different volumes) were
going through simultaneously, and both transactions were manipulating the
same xaction_peers structure, which led to a corrupted list. As a result, in
some cases the unlock request to a peer was never triggered and we ended up
with stale locks.
The solution is to maintain a per-transaction local xaction_peers list for
every syncop.
Please note that I have identified this problem in the op-sm area as well,
and a separate patch will be attempted to fix it.
Finally, thanks to Krishnan Parthasarathi and Kaushal M for their constant
help in getting to the root cause.
Change-Id: Ib1eaac9e5c8fc319f4e7f8d2ad965bc1357a7c63
BUG: 1173414
Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
Reviewed-on: http://review.gluster.org/9269
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Kaushal M <kaushal at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1173414
[Bug 1173414] glusterd: remote locking failure when multiple synctask
transactions are run