[Gluster-devel] spurious failure with sparse-file-heal.t test

Krishnan Parthasarathi kparthas at redhat.com
Fri Jun 5 03:31:28 UTC 2015


> This seems to happen because of race between STACK_RESET and stack
> statedump. Still thinking how to fix it without taking locks around
> writing to file.

Why should we still keep the stack being reset as part of the pending pool of
frames? Even if we had to (I can't guess why), when we remove its frames we
should do the following to prevent gf_proc_dump_pending_frames from crashing.

...

call_frame_t *toreset = NULL;

LOCK (&stack->pool->lock);
{
  toreset = stack->frames;
  stack->frames = NULL;
}
UNLOCK (&stack->pool->lock);

...

Now, perform all operations that are done on stack->frames on toreset
instead. Thoughts?
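
To make the idea concrete, here is a minimal standalone sketch of the same
detach-under-lock pattern. It is not GlusterFS code: the demo_* types and
functions are hypothetical stand-ins, a pthread mutex stands in for
LOCK/UNLOCK, and it assumes the dumper walks the frame list while holding the
pool lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real call_frame_t / call_pool_t /
 * call_stack_t definitions. */
typedef struct demo_frame {
        struct demo_frame *next;
} demo_frame_t;

typedef struct demo_pool {
        pthread_mutex_t lock;
} demo_pool_t;

typedef struct demo_stack {
        demo_pool_t  *pool;
        demo_frame_t *frames;
} demo_stack_t;

/* Link a new frame at the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
int
demo_stack_push (demo_stack_t *stack)
{
        demo_frame_t *frame = calloc (1, sizeof (*frame));

        if (!frame)
                return -1;

        pthread_mutex_lock (&stack->pool->lock);
        frame->next = stack->frames;
        stack->frames = frame;
        pthread_mutex_unlock (&stack->pool->lock);
        return 0;
}

/* Stand-in for the dumper (assumption: it holds the pool lock while
 * walking). It sees either the full list or an empty one. */
size_t
demo_dump_count (demo_stack_t *stack)
{
        size_t        count = 0;
        demo_frame_t *frame = NULL;

        pthread_mutex_lock (&stack->pool->lock);
        for (frame = stack->frames; frame; frame = frame->next)
                count++;
        pthread_mutex_unlock (&stack->pool->lock);
        return count;
}

/* Reset: detach the list under the lock, then free it outside the lock.
 * Returns the number of frames freed. */
size_t
demo_stack_reset (demo_stack_t *stack)
{
        demo_frame_t *toreset = NULL;
        demo_frame_t *next = NULL;
        size_t        freed = 0;

        assert (stack && stack->pool);

        pthread_mutex_lock (&stack->pool->lock);
        toreset = stack->frames;
        stack->frames = NULL;
        pthread_mutex_unlock (&stack->pool->lock);

        /* Nobody else can reach these frames through the stack any more. */
        while (toreset) {
                next = toreset->next;
                free (toreset);
                toreset = next;
                freed++;
        }
        return freed;
}
```

The critical section is only two pointer assignments, so the dumper is never
blocked while the frames are being torn down.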

