[Bugs] [Bug 1343038] IO ERROR when multiple graph switches
bugzilla at redhat.com
Wed Aug 10 10:41:01 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1343038
--- Comment #7 from Vijay Bellur <vbellur at redhat.com> ---
COMMIT: http://review.gluster.org/14722 committed in master by Rajesh Joseph
(rjoseph at redhat.com)
------
commit 30019e51ddefc266c939a61d26d324b7ebf3ad95
Author: Poornima G <pgurusid at redhat.com>
Date: Tue Jul 19 15:20:09 2016 +0530
gfapi: Fix IO error caused when there is consecutive graph switches
This is part 2 of the fix; part 1 can be found at:
http://review.gluster.org/#/c/14656/
Problem:
=======
Consider a race between __glfs_active_subvol() and graph_setup().
Let's say at time T1:
fs->active_subvol = A
fs->next_subvol = B
__glfs_active_subvol() //under lock fs->mutex
{
    ....
    new_subvol = fs->next_subvol //which is B
    .... //Start migration from A to B
    __glfs_first_lookup(){
        ....
        unlock fs->mutex //@TIME T2
        network fop
        lock fs->mutex
        ....
    }
    .... //migration continues on B
    fs->active_subvol = fs->next_subvol //which is C (explained below)
    ....
}
At time T2, let's say graph_setup() is called with C in another thread;
note that at T2, fs->mutex is unlocked.
graph_setup(C...)
{
    lock fs->mutex
    ....
    if (fs->next_subvol) // which is B
        destroy subvol (fs->next_subvol)
    ....
    fs->next_subvol = C
    ....
    unlock fs->mutex
}
Thus at the end of this,
fs->old_subvol = A;
fs->active_subvol = C;
fs->next_subvol = NULL;
which is wrong: B completed migration but was destroyed by
graph_setup(), and C was never migrated.
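The interleaving above can be replayed deterministically in a single
thread. The following is a minimal sketch, not gfapi code: struct
subvol_model, struct fs_model, model_graph_setup() and replay_race() are
hypothetical names introduced for illustration, and fs->mutex is
represented only by comments marking where it would be held or dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a graph/subvolume; not the gfapi type. */
struct subvol_model {
    const char *name;
    bool migrated;   /* set once migration to this graph has run */
    bool destroyed;  /* set when graph_setup() discards the graph */
};

/* Hypothetical stand-in holding the three pointers discussed above. */
struct fs_model {
    struct subvol_model *old_subvol;
    struct subvol_model *active_subvol;
    struct subvol_model *next_subvol;
};

/* Models graph_setup(): assumed to run entirely under fs->mutex. */
static void model_graph_setup(struct fs_model *fs, struct subvol_model *g)
{
    if (fs->next_subvol)
        fs->next_subvol->destroyed = true;
    fs->next_subvol = g;
}

/* Replays T1 -> T2 -> end of __glfs_active_subvol() without the fix. */
static void replay_race(struct fs_model *fs, struct subvol_model *c)
{
    /* T1: mutex held; B is picked but remains in next_subvol. */
    struct subvol_model *new_subvol = fs->next_subvol;
    assert(new_subvol != NULL);

    /* T2: mutex dropped for the network fop; graph_setup(C) runs. */
    model_graph_setup(fs, c);

    /* Mutex re-taken; migration continues on B. */
    new_subvol->migrated = true;

    /* The switch re-reads next_subvol, which is now C. */
    fs->old_subvol = fs->active_subvol;
    fs->active_subvol = fs->next_subvol;
    fs->next_subvol = NULL;
}
```

Starting from active_subvol = A and next_subvol = B, calling
replay_race() with C leaves C active without ever having been migrated,
while B is both migrated and destroyed.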
Solution:
=========
Any new graph can be in one of the 2 states:
- Picked for migration, migration in progress (fs->mip_subvol)
- Not picked so far for migration (fs->next_subvol)
graph_setup() updates fs->next_subvol only. __glfs_active_subvol()
atomically moves fs->next_subvol to fs->mip_subvol and sets
fs->next_subvol = NULL; once the migration is complete, it makes
fs->mip_subvol the fs->active_subvol.
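The two-state scheme can be sketched as a small single-threaded model.
This is an illustration under stated assumptions, not the code from the
patch: struct subvol_model, struct fs_model, model_graph_setup(),
model_pick() and model_finish() are hypothetical names, and each function
is assumed to run with fs->mutex held for its whole body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a graph/subvolume; not the gfapi type. */
struct subvol_model {
    const char *name;
    bool migrated;
    bool destroyed;
};

/* Hypothetical stand-in for the pointers named in the solution. */
struct fs_model {
    struct subvol_model *old_subvol;
    struct subvol_model *active_subvol;
    struct subvol_model *mip_subvol;   /* picked, migration in progress */
    struct subvol_model *next_subvol;  /* not picked so far */
};

/* Models graph_setup(): it touches next_subvol only. */
static void model_graph_setup(struct fs_model *fs, struct subvol_model *g)
{
    if (fs->next_subvol)
        fs->next_subvol->destroyed = true;
    fs->next_subvol = g;
}

/* Pick step: move next_subvol to mip_subvol in one locked step. */
static struct subvol_model *model_pick(struct fs_model *fs)
{
    assert(fs->mip_subvol == NULL);
    fs->mip_subvol = fs->next_subvol;
    fs->next_subvol = NULL;
    return fs->mip_subvol;
}

/* Finish step: the migrated graph becomes the active one. */
static void model_finish(struct fs_model *fs)
{
    assert(fs->mip_subvol != NULL);
    fs->mip_subvol->migrated = true;
    fs->old_subvol = fs->active_subvol;
    fs->active_subvol = fs->mip_subvol;
    fs->mip_subvol = NULL;
}
```

In this model, if model_graph_setup() is called with C between
model_pick() and model_finish(), B is no longer reachable through
next_subvol and so is not destroyed; B becomes active and C waits in
next_subvol for the next switch.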
Change-Id: Ib6ff0565105c5eedb912a43da4017cd413243612
BUG: 1343038
Signed-off-by: Poornima G <pgurusid at redhat.com>
Reviewed-on: http://review.gluster.org/14722
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Smoke: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur at redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
Reviewed-by: Niels de Vos <ndevos at redhat.com>