[Gluster-devel] ./tests/bugs/bug-1110262.t is a bad test susceptible to failures due to race-conditions

Raghavendra Gowdappa rgowdapp at redhat.com
Tue Mar 13 11:16:49 UTC 2018


All,

This test does:

1. mount a volume
2. kill a brick in the volume
3. mkdir (/somedir)

In my local tests and in [1], I see that the mkdir in step 3 fails because
there is no dht-layout on the root directory.

I think the reason is that by the time the first lookup on "/" reached dht,
a brick had already been killed in step 2. As a result, the layout for "/"
was not healed, and since this is a new volume, no layout is present on it.
Note that the first lookup on "/" done by fuse-bridge is not synchronized
with the exit of the parent process of the daemonized glusterfs mount. IOW,
when the glusterfs command returns, there is no guarantee that the lookup on
"/" is complete. So, if step 2 races ahead of fuse_first_lookup on "/", we
end up with an invalid dht-layout on "/", resulting in failures.
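
To illustrate the shape of the race, here is a toy simulation in bash. It is
not Gluster code and not necessarily what the patch does: fake_mount,
wait_for and the marker file are made-up names, with the marker standing in
for "first lookup on / has completed". The point is only that the mount
command returning says nothing about the background work being finished, so
the next step has to wait for it explicitly.

```shell
#!/bin/bash
# Toy simulation of the race; every name here is hypothetical.
workdir=$(mktemp -d)
marker="$workdir/root-lookup-done"

# Stands in for the daemonized mount: the parent returns immediately,
# while the "first lookup" completes later in the background.
fake_mount() {
    ( sleep 1; : > "$marker" ) &
}

# Poll a command until it succeeds or the timeout (in seconds) expires.
wait_for() {
    local timeout=$1; shift
    local deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        [ "$(date +%s)" -ge "$deadline" ] && return 1
        sleep 0.1
    done
}

fake_mount
# Without this wait, the next step (killing a brick, in the real test)
# could run before the background lookup has finished.
if wait_for 5 test -e "$marker"; then
    result="lookup-complete"
else
    result="timed-out"
fi
echo "$result"
```

In the real test, the equivalent of the wait would presumably be some
operation on the mount point that cannot succeed until the lookup on "/" has
completed, done before the brick is killed. That is an assumption on my part
about how such a fix could look.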

I've sent a patch [2] to fix this race condition.

[1] https://build.gluster.org/job/centos7-regression/298/console
[2] https://review.gluster.org/#/c/19707/

regards,
Raghavendra