[Gluster-devel] Spurious failures because of nfs and snapshots

Thu May 22 06:05:54 UTC 2014

On 05/21/2014 08:50 PM, Vijaikumar M wrote:
> KP, Atin and I did some debugging and found that there was a
> deadlock in glusterd.
>
> When creating a volume snapshot, the back-end operation 'taking an
> lvm_snapshot and starting the brick' is executed for each brick
> in parallel using the synctask framework.
>
> brick_start was releasing the big_lock with brick_connect and then
> taking the lock again.
> This caused a deadlock under a race condition in which the main thread
> was waiting for one of the synctask threads to finish while that
> synctask thread was waiting for the big_lock.
>
>
> We are working on fixing this issue.
>
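
The race described above can be sketched roughly as follows. This is a
Python model, not glusterd code: the function names, the two events, and
the timeout are all hypothetical, and the events exist only to force the
unlucky interleaving so that the example terminates instead of hanging.

```python
import threading


def brick_start(big_lock, released, main_holds, result, timeout):
    """Model of a synctask that drops the big lock and then retakes it."""
    # The task is entered with big_lock held on its behalf.
    big_lock.release()          # lock dropped for the connect step
    released.set()              # tell the main thread the lock is free
    main_holds.wait()           # force the race: main grabs the lock first
    # Try to retake the lock. The real code would block forever here;
    # the timeout (an assumption of this sketch) lets us observe that.
    got = big_lock.acquire(timeout=timeout)
    result["relocked"] = got
    if got:
        big_lock.release()


def simulate_deadlock(timeout=0.2):
    """Return True if the worker could not retake the lock (deadlock)."""
    big_lock = threading.Lock()
    released = threading.Event()
    main_holds = threading.Event()
    result = {}

    big_lock.acquire()          # lock is held when the task starts
    worker = threading.Thread(
        target=brick_start,
        args=(big_lock, released, main_holds, result, timeout),
    )
    worker.start()

    released.wait()             # worker has dropped the lock
    big_lock.acquire()          # main thread takes the big lock
    main_holds.set()
    worker.join()               # main waits for the task while holding the lock
    big_lock.release()
    return not result["relocked"]


if __name__ == "__main__":
    print("deadlock would occur:", simulate_deadlock())
```

Without the timeout on the second acquire, worker.join() would never
return, which matches the hang seen in the snapshot tests.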

If this fix is going to take more time, can we please log a bug to track
this problem and remove the affected test cases from the test unit?
That way other valid patches will not be blocked by failures of the
snapshot test unit.

We can introduce these tests again as part of the fix for the problem.

-Vijay