[Gluster-devel] races in dict_foreach() causing crashes in tier-file-creat.t
Jeff Darcy
jdarcy at redhat.com
Fri Mar 11 12:55:15 UTC 2016
> Raghavendra G and I discussed about this problem and the right way to
> fix it is to take a copy(without dict_foreach) of the dictionary in
> dict_foreach inside a lock and then loop over the local dictionary. I am
> worried about the performance implication of this

I'm worried about the correctness implications. Any such copy will have
to do the equivalent of dict_foreach even if it doesn't call the function
of that name, and will be subject to the same races.

> Also included Xavi, who earlier said we need to change dict.c but it is
> a bigger change. May be the time has come? I would love to gather all
> your inputs and implement a better version of dict if we need one.

There are three solutions I can think of.

(1) Have tier use STACK_WIND_COOKIE to pass the address of a lock down
    both paths, so the two can synchronize between themselves.

(2) Have tier issue the lookup down the two paths *serially* instead
    of in parallel, so there's no contention on the dictionary. This is
    probably the most complicated *and* the worst for performance, but I
    include it for the sake of completeness.

(3) Enhance dict_t with a gf_lock_t that can be used to serialize
    access. We don't have to use the lock in every invocation of
    dict_foreach (though we should probably investigate that). For
    now, we can just use it in the code paths we know are contending.

I suspect Xavi was suggesting (3) and I concur that it's the best
long-term solution.
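
To make (3) a bit more concrete, here is a rough sketch of the idea. It
does not use the real dict_t or gf_lock_t. The sdict_t type, the
sdict_* functions and sum_cb are hypothetical names made up for
illustration, and a pthread mutex stands in for the lock. The point is
only that the writer and the iterator take the same per-dictionary lock,
so a walk never sees a half-linked pair.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for dict_t (not the GlusterFS struct). */
typedef struct pair {
    char *key;
    int value;
    struct pair *next;
} pair_t;

typedef struct {
    pair_t *head;
    pthread_mutex_t lock; /* plays the role of the proposed per-dict lock */
} sdict_t;

typedef int (*sdict_cb)(const char *key, int value, void *data);

void sdict_init(sdict_t *d)
{
    d->head = NULL;
    pthread_mutex_init(&d->lock, NULL);
}

/* Insert a pair; the list is only modified while the lock is held. */
int sdict_set(sdict_t *d, const char *key, int value)
{
    size_t len = strlen(key) + 1;
    pair_t *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;
    p->key = malloc(len);
    if (p->key == NULL) {
        free(p);
        return -1;
    }
    memcpy(p->key, key, len);
    p->value = value;

    pthread_mutex_lock(&d->lock);
    p->next = d->head;
    d->head = p;
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/*
 * Walk the dictionary with the lock held. Returns the number of pairs
 * visited, or -1 if the callback asked to stop. The callback must not
 * call back into the same dictionary: the mutex is not recursive.
 */
int sdict_foreach_locked(sdict_t *d, sdict_cb fn, void *data)
{
    int visited = 0;

    pthread_mutex_lock(&d->lock);
    for (pair_t *p = d->head; p != NULL; p = p->next) {
        if (fn(p->key, p->value, data) < 0) {
            visited = -1;
            break;
        }
        visited++;
    }
    pthread_mutex_unlock(&d->lock);
    return visited;
}

/* Free all pairs; the caller must ensure no other thread still uses d. */
void sdict_destroy(sdict_t *d)
{
    pair_t *p = d->head;
    while (p != NULL) {
        pair_t *next = p->next;
        free(p->key);
        free(p);
        p = next;
    }
    d->head = NULL;
    pthread_mutex_destroy(&d->lock);
}

/* Example callback: add every value into the int that data points to. */
int sum_cb(const char *key, int value, void *data)
{
    (void)key;
    *(int *)data += value;
    return 0;
}
```

Holding the lock across the callbacks is the simplest choice but also
the most restrictive one, which is part of why using it in every
dict_foreach call would need investigation first. Only the contending
code paths would call the locked variant for now.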