[Gluster-devel] races in dict_foreach() causing crashes in tier-file-creat.t

Jeff Darcy jdarcy at redhat.com
Fri Mar 11 12:55:15 UTC 2016


> Raghavendra G and I discussed this problem, and the right way to
> fix it is to take a copy (without dict_foreach) of the dictionary in
> dict_foreach inside a lock and then loop over the local dictionary. I am
> worried about the performance implications of this.

I'm worried about the correctness implications.  Any such copy will have
to do the equivalent of dict_foreach even if it doesn't call the function
of that name, and will be subject to the same races.

> Also included Xavi, who earlier said we need to change dict.c but it is
> a bigger change. Maybe the time has come? I would love to gather all
> your inputs and implement a better version of dict if we need one.

There are three solutions I can think of.

(1) Have tier use STACK_WIND_COOKIE to pass the address of a lock down
both paths, so the two can synchronize between themselves.

(2) Have tier issue the lookup down the two paths *serially* instead
of in parallel, so there's no contention on the dictionary.  This is
probably the most complicated *and* the worst for performance, but I
include it for the sake of completeness.

(3) Enhance dict_t with a gf_lock_t that can be used to serialize
access.  We don't have to use the lock in every invocation of
dict_foreach (though we should probably investigate that).  For
now, we can just use it in the code paths we know are contending.

I suspect Xavi was suggesting (3) and I concur that it's the best
long-term solution.
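
To make (3) concrete, here is a rough sketch of the shape it could take.
This is not the real dict.c: sketch_dict_t, sketch_dict_set and
sketch_dict_foreach_locked are hypothetical names, the dictionary is
reduced to a linked list, and a pthread_mutex_t stands in for gf_lock_t.
The point is only that the lock lives in the dictionary itself, and that
the contending code paths go through a locked iterator while everything
else stays as it is.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for dict_t. */
typedef struct sketch_pair {
    const char *key;            /* caller-owned strings, not copied */
    const char *value;
    struct sketch_pair *next;
} sketch_pair_t;

typedef struct {
    sketch_pair_t *head;
    pthread_mutex_t lock;       /* plays the role of the proposed gf_lock_t */
} sketch_dict_t;

typedef int (*sketch_foreach_fn)(const char *key, const char *value,
                                 void *data);

int sketch_dict_init(sketch_dict_t *d)
{
    d->head = NULL;
    return pthread_mutex_init(&d->lock, NULL);
}

/* Writers take the same lock, so they cannot race with a locked walk. */
int sketch_dict_set(sketch_dict_t *d, const char *key, const char *value)
{
    sketch_pair_t *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;
    p->key = key;
    p->value = value;

    pthread_mutex_lock(&d->lock);
    p->next = d->head;
    d->head = p;
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/* Locked iteration, meant only for the code paths known to contend.
 * Stops early and returns the callback's value if it is non-zero. */
int sketch_dict_foreach_locked(sketch_dict_t *d, sketch_foreach_fn fn,
                               void *data)
{
    int ret = 0;

    pthread_mutex_lock(&d->lock);
    for (sketch_pair_t *p = d->head; p != NULL; p = p->next) {
        ret = fn(p->key, p->value, data);
        if (ret != 0)
            break;
    }
    pthread_mutex_unlock(&d->lock);
    return ret;
}

void sketch_dict_destroy(sketch_dict_t *d)
{
    sketch_pair_t *p = d->head;
    while (p != NULL) {
        sketch_pair_t *next = p->next;
        free(p);
        p = next;
    }
    d->head = NULL;
    pthread_mutex_destroy(&d->lock);
}

/* Example callback: counts the pairs it is shown. */
int sketch_count_cb(const char *key, const char *value, void *data)
{
    (void)key;
    (void)value;
    ++*(int *)data;
    return 0;
}
```

One caveat with this shape: the mutex here is not recursive, so a
callback that calls back into sketch_dict_set or
sketch_dict_foreach_locked on the same dictionary would deadlock.  Any
real change to dict.c would have to decide how to handle that case.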

