[Gluster-devel] Proposal to change locking in data-self-heal

Anand Avati anand.avati at gmail.com
Tue May 21 19:49:19 UTC 2013


On Tue, May 21, 2013 at 7:05 AM, Jeff Darcy <jdarcy at redhat.com> wrote:

> On 05/21/2013 09:10 AM, Pranith Kumar Karampuri wrote:
>
>> Scenario-1 won't happen because truncate's full-file lock gets a chance
>> to be acquired after any 128K range sync completes.
>>
>> Scenario-2 won't happen because extra self-heals launched on the same
>> file will be blocked in the self-heal domain, so the data path's locks
>> are not affected by this.
>>
>
> This is two changes for two scenarios:
>
> * Change granular-lock order to avoid scenario 1.
>
> * Add a new lock to avoid scenario 2.
>
> I'm totally fine with the second part, but the first part makes me a bit
> uneasy.  As I recall, the "chained" locking (lock second range before
> unlocking the first) was a deliberate choice to avoid windows where
> somebody could jump in with a truncate during self-heal.  It certainly
> wasn't the obvious thing to do, suggesting there was a specific reason.


Chained locking really was to avoid starting a second self-heal while one
is in progress. Because self-heal gives up the big lock only after
acquiring the small lock, regional operations can progress, but a second
self-heal waiting for the big lock is still held off. Truncate isn't
really an issue as long as the truncate transaction locks from the new
EOF till infinity (which it does); self-heal will not race in those
regions. Of course, with chained locking, full-file truncates can starve.
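
For concreteness, here is a minimal sketch of the chained pattern. The
helper names and the 128K window are illustrative stand-ins for the
inodelk and read/write calls AFR actually issues, not the real code:

    /* Minimal sketch of "chained" range locking during data self-heal.
     * lock_range()/unlock_range()/sync_chunk() are hypothetical
     * stand-ins; only the locking order matters here. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CHUNK (128 * 1024)   /* self-heal window size */

    static void lock_range (uint64_t off, uint64_t len)
    { printf ("lock   %" PRIu64 "+%" PRIu64 "\n", off, len); }
    static void unlock_range (uint64_t off, uint64_t len)
    { printf ("unlock %" PRIu64 "+%" PRIu64 "\n", off, len); }
    static void sync_chunk (uint64_t off, uint64_t len)
    { printf ("sync   %" PRIu64 "+%" PRIu64 "\n", off, len); }

    static void
    self_heal_chained (uint64_t file_size)
    {
            uint64_t off = 0;

            lock_range (off, CHUNK);
            while (off < file_size) {
                    sync_chunk (off, CHUNK);
                    if (off + CHUNK < file_size)
                            lock_range (off + CHUNK, CHUNK);  /* take the next window...      */
                    unlock_range (off, CHUNK);                /* ...before releasing this one */
                    off += CHUNK;
            }
    }

    int
    main (void)
    {
            self_heal_chained (4 * CHUNK);   /* heal a 512K file */
            return 0;
    }

Because the next window is locked before the current one is released,
there is never an instant with no lock held, so a pending full-file
request (a second self-heal's big lock, or a truncate to offset 0) can
wait indefinitely; hence the starvation.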

It's not obvious what use case Pranith is facing where self-heal and
ftruncate() are competing. Is it just an artificial/hand-crafted test case,
or a real-life access pattern?


> Until we recall what that reason was I don't think we can determine
> whether this is a bug or a feature.  If it's a feature, or if we don't
> know, this change is likely to break something instead of fixing it.


The sole reason was to prevent the start of a second self-heal. Having
self-heals serialize in a separate domain solves that problem (except if
we are looking at runtime compatibility across multiple versions in this
scenario... aargh!)
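
A rough sketch of what the proposed serialization could look like. The
domain strings and helpers are hypothetical; the only assumption borrowed
from inodelk is that a length of 0 means "till infinity":

    /* Sketch of the proposed scheme: serialize self-heals on a
     * full-file lock in a dedicated lock domain, so the data domain
     * only needs plain (non-chained) granular locks.  Domain strings
     * and helpers are hypothetical stand-ins. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CHUNK       (128 * 1024)
    #define SH_DOMAIN   "afr.self-heal"   /* hypothetical names */
    #define DATA_DOMAIN "afr.data"

    static void lock_range (const char *dom, uint64_t off, uint64_t len)
    { printf ("lock   [%s] %" PRIu64 "+%" PRIu64 "\n", dom, off, len); }
    static void unlock_range (const char *dom, uint64_t off, uint64_t len)
    { printf ("unlock [%s] %" PRIu64 "+%" PRIu64 "\n", dom, off, len); }
    static void sync_chunk (uint64_t off, uint64_t len)
    { printf ("sync   %" PRIu64 "+%" PRIu64 "\n", off, len); }

    static void
    self_heal_serialized (uint64_t file_size)
    {
            uint64_t off;

            /* Full-file lock (len 0 = till infinity).  Regular data
             * transactions lock in DATA_DOMAIN, so this blocks only
             * other self-heals. */
            lock_range (SH_DOMAIN, 0, 0);

            /* No chaining needed: each window is dropped before the
             * next is taken, so a waiting truncate's full-file lock
             * can be granted in between. */
            for (off = 0; off < file_size; off += CHUNK) {
                    lock_range (DATA_DOMAIN, off, CHUNK);
                    sync_chunk (off, CHUNK);
                    unlock_range (DATA_DOMAIN, off, CHUNK);
            }

            unlock_range (SH_DOMAIN, 0, 0);
    }

    int
    main (void)
    {
            self_heal_serialized (4 * CHUNK);
            return 0;
    }

The runtime-compatibility caveat presumably follows from this: an older
client that knows nothing of the new domain would still chain its locks
in the data domain, so self-heals from mixed versions would no longer
exclude each other.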

Avati