[Gluster-devel] question on glustershd
Emmanuel Dreyfus
manu at netbsd.org
Tue Dec 2 16:59:01 UTC 2014
Hi
I have been tracking down a bug reported by /tests/basic/afr/entry-self-heal.t
on NetBSD, and now I wonder how glustershd is supposed to work.
In xlators/cluster/afr/src/afr-self-heald.c, we create a healer for
each AFR subvolume. In afr_selfheal_tryinodelk(), each healer performs
the INODELK on each AFR subvolume, using AFR_ONALL().
The result is that the healers compete for the locks on the same inodes
in the subvolumes. They sometimes conflict, and if we have only two
subvolumes, we run into this condition:
        if (ret < AFR_SH_MIN_PARTICIPANTS) {
                /* Either less than two subvols available, or another
                   selfheal (from another server) is in progress. Skip
                   for now in any case there isn't anything to do.
                */
                ret = -ENOTCONN;
                goto unlock;
        }
Since there is no glustershd doing the work on another server, the entry
will remain unhealed. I believe this is exactly the same problem I am
trying to address in http://review.gluster.org/9074
What is wrong here? Should there really be healers for each subvolume,
or is it the AFR_ONALL() usage that is wrong? Or did I completely miss
the thing?
--
Emmanuel Dreyfus
manu at netbsd.org