[Gluster-devel] spurious failure with sparse-file-heal.t test

Sun Jun 7 12:43:57 UTC 2015


On 06/07/2015 05:40 PM, Pranith Kumar Karampuri wrote:
>
>
> On 06/05/2015 09:10 AM, Krishnan Parthasarathi wrote:
>>
>> ----- Original Message -----
>>>> This seems to happen because of a race between STACK_RESET and the stack
>>>> statedump. Still thinking about how to fix it without taking locks around
>>>> writing to the file.
>>> Why should we still keep the stack being reset as part of the pending
>>> pool of frames? Even if we had to (can't guess why?), when we remove it
>>> we should do the following to prevent gf_proc_dump_pending_frames from
>>> crashing.
>>>
>>> ...
>>>
>>> call_frame_t *toreset = NULL;
>>>
>>> LOCK (&stack->pool->lock);
>>> {
>>>    toreset = stack->frames;
>>>    stack->frames = NULL;
>>> }
>>> UNLOCK (&stack->pool->lock);
>>>
>>> ...
>>>
>>> Now, perform all operations that are done on stack->frames on toreset
>>> instead. Thoughts?
>> Is there a reason you want to avoid locks here? STACK_DESTROY uses the
>> call_pool lock to remove the stack from the list of pending frames.
> It is always better to avoid holding spin-locks while doing a slow
> operation like write. That is the only reasoning behind it.
Seems like we are already inside pool->lock while doing the statedump, which
writes to files, so maybe I shouldn't think too much :-/. I will
take a look at your patch once.
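
For reference, here is a minimal, self-contained sketch of the
detach-under-lock idea quoted above. It is not the actual STACK_RESET
code: the demo_* types and functions are hypothetical, simplified
stand-ins for call_frame_t, call_stack_t and call_pool_t, and a
pthread mutex stands in for the LOCK/UNLOCK macros on pool->lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real GlusterFS structures have many more
 * fields and use LOCK/UNLOCK on pool->lock rather than a pthread mutex. */
typedef struct demo_frame {
    struct demo_frame *next;
} demo_frame_t;

typedef struct demo_pool {
    pthread_mutex_t lock;
} demo_pool_t;

typedef struct demo_stack {
    demo_pool_t  *pool;
    demo_frame_t *frames;
} demo_stack_t;

/* Add one frame to the head of the list under the pool lock.
 * Returns 0 on success, -1 if allocation fails. */
int demo_stack_push(demo_stack_t *stack)
{
    assert(stack != NULL && stack->pool != NULL);
    demo_frame_t *f = calloc(1, sizeof(*f));
    if (f == NULL)
        return -1;
    pthread_mutex_lock(&stack->pool->lock);
    f->next = stack->frames;
    stack->frames = f;
    pthread_mutex_unlock(&stack->pool->lock);
    return 0;
}

/* Dumper side: walks the frame list while holding the pool lock and
 * returns the number of frames it saw. */
size_t demo_dump_pending_frames(demo_stack_t *stack)
{
    assert(stack != NULL && stack->pool != NULL);
    size_t n = 0;
    pthread_mutex_lock(&stack->pool->lock);
    for (demo_frame_t *f = stack->frames; f != NULL; f = f->next)
        n++;
    pthread_mutex_unlock(&stack->pool->lock);
    return n;
}

/* Reset side: detach the whole list under the lock, then free it
 * outside the lock. A concurrent dumper sees either the full list or
 * NULL, never a half-freed list. Returns the number of frames freed. */
size_t demo_stack_reset(demo_stack_t *stack)
{
    assert(stack != NULL && stack->pool != NULL);
    demo_frame_t *toreset = NULL;
    size_t freed = 0;

    pthread_mutex_lock(&stack->pool->lock);
    toreset = stack->frames;
    stack->frames = NULL;
    pthread_mutex_unlock(&stack->pool->lock);

    while (toreset != NULL) {
        demo_frame_t *next = toreset->next;
        free(toreset);
        toreset = next;
        freed++;
    }
    return freed;
}
```

The point of the sketch is only that the lock is held for two pointer
assignments on the reset side, while the slow work (freeing here, or
anything else done on toreset) happens after the unlock.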

Pranith
>
> Pranith
>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>