[Gluster-devel] spurious failure with sparse-file-heal.t test

Pranith Kumar Karampuri pkarampu at redhat.com
Sun Jun 7 12:09:01 UTC 2015



On 06/05/2015 09:01 AM, Krishnan Parthasarathi wrote:
>> This seems to happen because of a race between STACK_RESET and stack
>> statedump. Still thinking how to fix it without taking locks around
>> writing to the file.
> Why should we still keep the stack being reset as part of the pending pool of
> frames? Even if we had to (I can't guess why), when we remove it we should do
> the following to prevent gf_proc_dump_pending_frames from crashing.
The C stack gives up the memory it uses when a function call returns, but
there was no such mechanism for gluster stacks before STACK_RESET. So for
long-running operations like big-file self-heal, big-directory read, etc.,
we keep RESETting the stack to prevent it from growing to a large size.

Pranith
>
> ...
>
> call_frame_t *toreset = NULL;
>
> LOCK (&stack->pool->lock);
> {
>         toreset = stack->frames;
>         stack->frames = NULL;
> }
> UNLOCK (&stack->pool->lock);
>
> ...
>
> Now, perform all operations that are done on stack->frames on toreset
> instead. Thoughts?
