[Gluster-devel] spurious failure with sparse-file-heal.t test

Pranith Kumar Karampuri pkarampu at redhat.com
Sun Jun 7 12:10:47 UTC 2015



On 06/05/2015 09:10 AM, Krishnan Parthasarathi wrote:
>
> ----- Original Message -----
>>> This seems to happen because of a race between STACK_RESET and the
>>> stack statedump. Still thinking about how to fix it without taking
>>> locks around writing to the file.
>> Why should we still keep the stack being reset as part of the pending pool
>> of frames? Even if we had to (I can't guess why), when we remove it we
>> should do the following to prevent gf_proc_dump_pending_frames from
>> crashing.
>>
>> ...
>>
>> call_frame_t *toreset = NULL;
>>
>> LOCK (&stack->pool->lock);
>> {
>>    /* detach the frame list while the pool lock is held */
>>    toreset = stack->frames;
>>    stack->frames = NULL;
>> }
>> UNLOCK (&stack->pool->lock);
>>
>> ...
>>
>> Now perform, on toreset, all the operations that were done on
>> stack->frames. Thoughts?
> Is there a reason you want to avoid locks here? STACK_DESTROY uses the
> call_pool lock to remove the stack from the list of pending frames.
It is always better to avoid holding spin-locks while doing a slow operation 
like a write. That is the only reasoning behind it.
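
For the record, a minimal, self-contained sketch of that pattern: take the
spin-lock only long enough to detach (or copy) what is needed, and do the
slow write after releasing it. The pool_t/frame_t types and the
pool_reset/pool_dump helpers below are simplified stand-ins of my own, not
Gluster's actual call_pool_t/call_frame_t or statedump code.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

enum { DUMP_MAX = 128 };

typedef struct frame {
        struct frame *next;
        char          wind_from[64];   /* stand-in for the statedump payload */
} frame_t;

typedef struct pool {
        pthread_spinlock_t lock;
        frame_t           *frames;     /* list of pending frames */
} pool_t;

/* Reset path: detach the whole list while holding the lock,
 * then do the slow per-frame work without the lock. */
static void
pool_reset (pool_t *pool)
{
        frame_t *toreset = NULL;

        pthread_spin_lock (&pool->lock);
        {
                toreset      = pool->frames;
                pool->frames = NULL;
        }
        pthread_spin_unlock (&pool->lock);

        while (toreset) {
                frame_t *next = toreset->next;
                free (toreset);
                toreset = next;
        }
}

/* Dump path: copy the fields to be dumped while holding the lock,
 * and issue the slow fprintf() calls only after releasing it. */
static void
pool_dump (pool_t *pool, FILE *fp)
{
        char   snap[DUMP_MAX][64];
        size_t count = 0;

        pthread_spin_lock (&pool->lock);
        {
                for (frame_t *f = pool->frames; f && count < DUMP_MAX;
                     f = f->next)
                        snprintf (snap[count++], sizeof (snap[0]), "%s",
                                  f->wind_from);
        }
        pthread_spin_unlock (&pool->lock);

        for (size_t i = 0; i < count; i++)
                fprintf (fp, "frame[%zu].wind_from=%s\n", i, snap[i]);
}

int
main (void)
{
        pool_t pool = { .frames = NULL };

        pthread_spin_init (&pool.lock, PTHREAD_PROCESS_PRIVATE);

        for (int i = 0; i < 3; i++) {
                frame_t *f = calloc (1, sizeof (*f));
                if (!f)
                        break;
                snprintf (f->wind_from, sizeof (f->wind_from), "fop-%d", i);
                f->next     = pool.frames;
                pool.frames = f;
        }

        pool_dump (&pool, stdout);      /* slow write happens lock-free */
        pool_reset (&pool);             /* list detached under the lock */

        pthread_spin_destroy (&pool.lock);
        return 0;
}

Compiles with cc -pthread. The same shape would apply to STACK_RESET
detaching stack->frames and to gf_proc_dump_pending_frames copying what it
wants to dump before writing to the statedump file.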

Pranith


