[Bugs] [Bug 1638453] New: Gfid mismatch seen on shards when lookup and mknod are in progress at the same time
bugzilla at redhat.com
Thu Oct 11 15:15:46 UTC 2018
Bug ID: 1638453
Summary: Gfid mismatch seen on shards when lookup and mknod are
in progress at the same time
Assignee: bugs at gluster.org
Reporter: kdhananj at redhat.com
CC: bugs at gluster.org
Description of problem:
Occasionally, dd on a sharded file in tests/bugs/shard/bug-1251824.t fails with
Turns out this is caused by gfid-mismatch between the replicas.
On investigation, it was found that this is due to a race between posix mknod
and posix lookup.
posix mknod has 3 important stages, among other operations:
1. creation of the file itself
2. setting the gfid xattr on the file, and
3. creating the gfid link under .glusterfs.
Now assume the thread doing posix mknod has executed steps 1 and 2 and is on
its way to executing 3.
And a parallel lookup from another thread sees that loc->inode->gfid is NULL,
so it tries to perform gfid_heal and also attempts to create the gfid link.
Assume lookup wins the race and creates the gfid link. posix_gfid_set() through
mknod then fails with EEXIST.
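The interleaving described above can be replayed deterministically in a
small sketch. This is not GlusterFS code: the flat .glusterfs layout, the
in-memory dict standing in for the gfid xattr, and the helper names
(gfid_link_path, create_gfid_link, simulate_race) are assumptions made for
illustration only.

```python
import errno
import os
import tempfile
import uuid


def gfid_link_path(brick, gfid):
    # Hypothetical flat layout; the real .glusterfs tree is organised differently.
    return os.path.join(brick, ".glusterfs", gfid)


def create_gfid_link(brick, entry_path, gfid):
    """Hard-link the entry under .glusterfs; return 0 or an errno value."""
    try:
        os.link(entry_path, gfid_link_path(brick, gfid))
        return 0
    except OSError as exc:
        return exc.errno


def simulate_race():
    with tempfile.TemporaryDirectory() as brick:
        os.mkdir(os.path.join(brick, ".glusterfs"))
        entry = os.path.join(brick, "shard.1")
        gfid = str(uuid.uuid4())
        xattrs = {}  # stand-in for the on-disk gfid xattr (assumption)

        # mknod thread, stage 1: create the file itself.
        open(entry, "w").close()
        # mknod thread, stage 2: set the gfid xattr on the file.
        xattrs[entry] = gfid

        # lookup thread runs here: it heals the gfid and creates the link first.
        lookup_ret = create_gfid_link(brick, entry, xattrs[entry])

        # mknod thread, stage 3: the link already exists, so this fails.
        mknod_ret = create_gfid_link(brick, entry, gfid)
        return lookup_ret, mknod_ret


if __name__ == "__main__":
    lookup_ret, mknod_ret = simulate_race()
    print("lookup:", lookup_ret, "mknod:", errno.errorcode.get(mknod_ret, mknod_ret))
```

Here the link created by lookup and the one attempted by mknod refer to the
same file, so the EEXIST seen by mknod does not indicate a missing or wrong
gfid link.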
In the older code, mknod under such conditions was NOT being treated as a
failure.
But ever since the following commit was merged:
Parent: 788cda4c (glusterd: fix some coverity issues)
Author: karthik-us <ksubrahm at redhat.com>
AuthorDate: 2018-08-03 15:55:18 +0530
Commit: Amar Tumballi <amarts at redhat.com>
CommitDate: 2018-08-20 12:14:22 +0000
posix: Delete the entry if gfid link creation fails
If the gfid link file inside .glusterfs is not present for a file,
the operations which are dependent on the gfid will fail,
complaining the link file does not exists inside .glusterfs.
If the link file creation fails, fail the entry creation operation
and delete the original file.
Signed-off-by: karthik-us <ksubrahm at redhat.com>
... this behavior has changed: the mknod is now treated as a failure and the
entry is subsequently deleted.
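A minimal sketch of the difference between the two behaviors, assuming a
hypothetical helper finish_mknod() and a flag delete_on_link_failure that
selects between them (neither name exists in GlusterFS):

```python
import errno
import os
import tempfile


def finish_mknod(entry_path, link_errno, delete_on_link_failure):
    """Return True if mknod is reported as successful."""
    if link_errno == 0 or not delete_on_link_failure:
        # Older behavior: a gfid link failure does not fail the mknod.
        return True
    # Behavior after the commit: fail the operation and delete the entry.
    os.unlink(entry_path)
    return False


def demo(delete_on_link_failure):
    with tempfile.TemporaryDirectory() as brick:
        entry = os.path.join(brick, "shard.1")
        open(entry, "w").close()
        ok = finish_mknod(entry, errno.EEXIST, delete_on_link_failure)
        return ok, os.path.exists(entry)


if __name__ == "__main__":
    print("older behavior:", demo(False))
    print("after the commit:", demo(True))
```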
When, at some point in the future, shard sends another mknod on the shard, the
file is created, although this time with a new gfid (since the "gfid-req" that
is passed now is a new UUID). This leads to a gfid-mismatch across the replicas.
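How the retry ends in a mismatch can be sketched with two dicts standing in
for two replicas. The model is an assumption for illustration: mknod keeps
the existing gfid when the entry is already present, and only one replica
hit the race and lost its entry.

```python
import uuid


def mknod(replica, name, gfid_req):
    """Create name with gfid_req unless it already exists; return its gfid."""
    return replica.setdefault(name, gfid_req)


def replay():
    replica_a, replica_b = {}, {}
    name = "shard.1"

    # First mknod: both replicas receive the same gfid-req.
    first_req = str(uuid.uuid4())
    mknod(replica_a, name, first_req)
    mknod(replica_b, name, first_req)

    # Replica B hit the race: its mknod failed and the entry was deleted.
    del replica_b[name]

    # Later mknod from shard carries a new gfid-req.
    second_req = str(uuid.uuid4())
    gfid_a = mknod(replica_a, name, second_req)  # entry exists, old gfid kept
    gfid_b = mknod(replica_b, name, second_req)  # entry recreated, new gfid
    return first_req, second_req, gfid_a, gfid_b


if __name__ == "__main__":
    first_req, second_req, gfid_a, gfid_b = replay()
    print("replica A:", gfid_a)
    print("replica B:", gfid_b)
    print("mismatch:", gfid_a != gfid_b)
```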
Version-Release number of selected component (if applicable):
How reproducible:
Fairly consistently. Just run the test tests/bugs/shard/bug-1251824.t in a loop
on your laptop. I was able to hit it in less than 5 minutes.
Steps to Reproduce: