[Bugs] [Bug 1488120] Moving multiple temporary files to the same destination concurrently causes ESTALE error
bugzilla at redhat.com
Mon Mar 12 14:10:09 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1488120
--- Comment #7 from Raghavendra G <rgowdapp at redhat.com> ---
(In reply to Raghavendra G from comment #6)
> 4. linkfile creation during lookup has to be done under lock which
> synchronizes linkfile creation with any renames involving the file. This
> part of the solution is implemented. But, I am still thinking through
> whether this locking is actually required. IOW, I am not able to find the
> RCA which requires this solution. But, having this lock gets test working.
The reason we need locks here is that a half-done rename can result in
multiple gfids for the same path (dst). This is transient and gets corrected
once the rename completes, whether it succeeds or fails; the exception is a
client crashing in the middle of a rename. The gfid of the cached file at the
time of lookup (outside locks) can be different by the time the linkfile is
created. This results in a permanent condition where the linkto file has a
different gfid than the data file. So, before attempting linkto creation,
lookup has to:
* acquire entrylk on the parent, so that renames are blocked.
* check whether the conditions for linkto creation are still valid, e.g. the
data file has the same gfid as the inode in the glusterfs process, the linkto
file is absent, etc. If any of these checks fails, abandon linkto creation.
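
The lock-then-recheck sequence above can be illustrated with a toy model. This is only a sketch in Python, not GlusterFS code: the names Directory, create_linkto, rename and demo are hypothetical, a single in-process lock stands in for entrylk on the parent, and the toy rename only replaces the gfid of the data file at dst.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Directory:
    """Toy stand-in for a parent directory (hypothetical, not a GlusterFS type)."""
    entrylk: threading.Lock = field(default_factory=threading.Lock)
    data_gfid: Dict[str, str] = field(default_factory=dict)    # name -> gfid of data file
    linkto_gfid: Dict[str, str] = field(default_factory=dict)  # name -> gfid of linkto file


def rename(parent: Directory, src: str, dst: str) -> None:
    """Toy rename: under the parent's lock, dst takes over src's gfid."""
    with parent.entrylk:
        parent.data_gfid[dst] = parent.data_gfid.pop(src)


def create_linkto(parent: Directory, name: str, cached_gfid: str,
                  recheck: bool = True) -> bool:
    """Create a linkto entry using the gfid seen by an earlier, unlocked lookup.

    Returns True if the linkto was created, False if creation was abandoned.
    """
    with parent.entrylk:  # renames on this parent are blocked while held
        if recheck:
            # The data file must still carry the gfid that lookup saw.
            if parent.data_gfid.get(name) != cached_gfid:
                return False
            # The linkto file must still be absent.
            if name in parent.linkto_gfid:
                return False
        parent.linkto_gfid[name] = cached_gfid
        return True


def demo(recheck: bool) -> Tuple[bool, Optional[str], str]:
    """Replay one interleaving: lookup, then a rename, then linkto creation."""
    parent = Directory(data_gfid={"dst": "gfid-old", "tmp": "gfid-new"})
    cached = parent.data_gfid["dst"]  # lookup outside locks sees the old gfid
    rename(parent, "tmp", "dst")      # a rename completes before linkto creation
    created = create_linkto(parent, "dst", cached, recheck=recheck)
    return created, parent.linkto_gfid.get("dst"), parent.data_gfid["dst"]
```

In this model, calling demo(recheck=False) leaves a linkto entry whose gfid differs from that of the data file, while demo(recheck=True) abandons linkto creation because the gfid seen by lookup is no longer current.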
> With the above set of solutions, I am able to get the test working (with 4
> clients simultaneously executing the above script and on client continuously
> doing lookup on the contents of directory in which renames are being done)
> for couple of hours. But after that I end up with rename failing and two
> dst data files in the volume. I am in the process of debugging this.
Previously I was not verifying that the conditions for linkto creation are
still valid _after_ acquiring entrylk. This resulted in lookup of dst failing
with ESTALE and dst-cached getting set to NULL. Subsequent renames would then
result in more than one data file, each with a different gfid. Tests have been
running successfully for the past hour and I am optimistic that they'll
continue to run successfully.