[Gluster-devel] Non-blocking lock for renames

Raghavendra Gowdappa rgowdapp at redhat.com
Fri Feb 5 04:39:13 UTC 2016



----- Original Message -----
> From: "Vijay Bellur" <vbellur at redhat.com>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Cc: "Shyamsundar Ranganathan" <srangana at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, February 5, 2016 10:06:07 AM
> Subject: Re: Non-blocking lock for renames
> 
> On 02/04/2016 12:58 AM, Raghavendra Gowdappa wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Vijay Bellur" <vbellur at redhat.com>
> >> To: "Shyamsundar Ranganathan" <srangana at redhat.com>, "Raghavendra
> >> Gowdappa" <rgowdapp at redhat.com>
> >> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> >> Sent: Thursday, February 4, 2016 9:55:04 AM
> >> Subject: Non-blocking lock for renames
> >>
> >> DHT developers,
> >>
> >> With 3.6, we introduced a non-blocking lock prior to a rename operation
> >> in DHT, and we fail the rename if the lock acquisition is not successful.
> >> I ran into a user on IRC yesterday who is affected by this behavior change:
> >>
> >> "We're seeing a behavior in Gluster 3.7.x that we did not see in 3.4.x
> >> and we're not sure how to fix it. When multiple processes are attempting
> >> to rename a file to the same destination at once, we're now seeing
> >> "Device or resource busy" and "Stale file handle" errors. Here's the
> >> command to replicate it: cd /mnt/glustermount; while true; do
> >> FILE=$RANDOM; touch $FILE; mv $FILE file-fv; done. The above command
> >> would be run on two or three servers within the same gluster cluster. In
> >> the output, one would always be successful in the rename, while the 2
> >> other ones would fail with the above error."
> >>
> >> The use case for concurrent renames was described as:
> >>
> >> "we generate files and push them to the gluster cluster. Some are
> >> generated multiple times and end up being pushed to the cluster at the
> >> same time by different data generators; resulting in the 'rename
> >> collision'. We also use the cluster.extra-hash-regex to make sure the
> >> data is written in place. And this does the rename."
> >>
> >> Is a non-blocking lock essential? Can we not use a blocking lock instead
> >> of a non-blocking lock, or fall back to a blocking lock if the original
> >> non-blocking lock acquisition fails?
> >
> > This lock synchronizes:
> > 1. a rename from the application with file migration by the rebalance process [1].
> > 2. multiple renames from the application on the same file.
> >
> > I think the lock is still required for 1. However, since migration can
> > potentially take a long time, we chose a non-blocking lock to make sure
> > the application is not blocked for a long period.
> 
> Since rebalance involves reduced performance, and if performance/latency
> is the only reason why we have non-blocking locks, I would prefer that
> we block a rename during rebalance and preserve application continuity.
> 
> >
> > Case 2 is what is causing the issue mentioned in this thread. We did see
> > some files being removed with parallel renames on the same file. But, by
> > the time we had identified that it's a bug in 'mv' (mv issues an unlink on
> > src if src and dst happen to be hardlinks [2], but the hardlink check and
> > the unlink are not atomic, and DHT breaks rename into a series of links
> > and unlinks), we had already introduced synchronization between renames.
> > So, we have two options:
> >
> > 1. Use different domains for use cases 1 and 2 above. With different
> > domains, use case 2 above can be changed to use blocking locks. It might
> > not be advisable to use blocking locks for use case 1.
> > 2. Since we identified that the issue is with mv (I couldn't find another bug we
> > filed on mv, but [2] is close to it), we probably don't need locking in case 2
> > at all.
> >
> > Suggestions?
> 
> I would still preserve locking for 2, as the mv fixes are unlikely to
> hit all releases of all distributions.

Yes. I was also leaning towards preserving the locking, because the issues in mv have not been fixed.

> If we change the rename lock to
> be blocking, I feel that we would be covering both 1. and 2. while
> preserving application continuity.

Yes. We'll send a patch to make the locking in rename blocking.
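
As a rough illustration only (this is not DHT code), the difference between the two lock modes can be sketched in Python, with an in-process threading.Lock standing in for the rename lock. ToyNamespace, run_demo, and the file names are hypothetical, and raising EBUSY on a failed non-blocking attempt is an assumption chosen to mirror the "Device or resource busy" error reported by the user.

```python
import errno
import threading


class ToyNamespace:
    """Hypothetical stand-in for a directory guarded by one rename lock."""

    def __init__(self, files):
        self.files = dict(files)          # name -> contents
        self.lock = threading.Lock()      # stands in for the rename lock

    def rename(self, src, dst, blocking):
        # With blocking=False, acquire() returns False at once if the lock
        # is held; with blocking=True, it waits until the holder releases it.
        if not self.lock.acquire(blocking=blocking):
            # Assumption: report the failed attempt as EBUSY.
            raise OSError(errno.EBUSY, "rename lock held by another client")
        try:
            self.files[dst] = self.files.pop(src)
        finally:
            self.lock.release()


def run_demo():
    ns = ToyNamespace({"tmp-1": "one", "tmp-2": "two"})
    outcome = {}

    ns.lock.acquire()                     # pretend another client is mid-rename
    try:
        ns.rename("tmp-1", "file-fv", blocking=False)
    except OSError as err:
        outcome["non_blocking"] = errno.errorcode[err.errno]

    def blocking_rename():
        ns.rename("tmp-2", "file-fv", blocking=True)
        outcome["blocking"] = "renamed"

    worker = threading.Thread(target=blocking_rename)
    worker.start()                        # waits if the lock is still held
    ns.lock.release()                     # the other client finishes
    worker.join()
    return outcome, ns.files
```

In this sketch the non-blocking rename fails immediately while the lock is held, whereas the blocking rename simply completes once the holder releases the lock, which is the application-visible behavior the change is meant to restore.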

> 
> Regards,
> Vijay
> 
> 

