[Gluster-devel] Serialization of fops acting on same dentry on server

Mohammed Rafi K C rkavunga at redhat.com
Tue Aug 23 14:46:54 UTC 2016


Hi,

We have pushed a patch for fop serialization on the server side [1]. If
you have some time, please take a look at the patch. Your reviews are
most welcome :)


If I can address all the comments by the end of the week, we are
planning to get this merged before the coming Friday.


Note: In the meantime, I will be working on getting performance numbers
to see how much of a performance drop it can cause.


[1] : http://review.gluster.org/13451

Regards

Rafi KC


On 08/19/2015 02:55 PM, Pranith Kumar Karampuri wrote:
> + Ravi, Anuradha
>
> On 08/17/2015 10:39 AM, Raghavendra Gowdappa wrote:
>> All,
>>
>> Pranith and I were discussing the implementation of compound
>> operations like "create + lock", "mkdir + lock", "open + lock" etc.
>> These operations are useful in situations like:
>>
>> 1. To avoid locking all subvols during directory creation as part of
>> self heal in dht. Currently we follow the approach of locking _all_
>> subvols in both rmdir and lookup-heal [1].
>> 2. To lock a file in advance so that there is less of a performance
>> hit during transactions in afr.
>>
>> While thinking about implementing such compound operations, it
>> occurred to me that one of the problems would be how to handle a
>> racing mkdir/create and a (named lookup, simply referred to as lookup
>> from now on, followed by lock). This is because,
>> 1. creation of directory/file on backend
>> 2. linking of the inode with the gfid corresponding to that
>> file/directory
>>
>> are not atomic. It is not guaranteed that the inode passed down during
>> the mkdir/create call is the one that survives in the inode table.
>> Since posix-locks xlator maintains all the lock-state in inode, it
>> would be a problem if a different inode is linked in inode table than
>> the one passed during mkdir/create. One way to solve this problem is
>> to serialize fops (like mkdir/create, lookup, rename, rmdir, unlink)
>> that are happening on a particular dentry. This serialization would
>> also solve other bugs like:
>>
>> 1. issues solved by [2][3] and possibly many such issues.
>> 2. Stale dentries left behind in the bricks' inode tables because of a
>> racing lookup and dentry modification ops (like rmdir, unlink, rename etc).
>>
>> My initial idea is to track the fops in progress on a dentry in the
>> parent inode (maybe in the resolver code in protocol/server). Based on
>> this we can serialize the operations. Since we need to serialize
>> _only_ operations on a dentry (we don't serialize nameless lookups),
>> we are guaranteed to always have a parent inode. Any
>> comments/discussion on this would be appreciated.
>>
>> [1] http://review.gluster.org/11725
>> [2] http://review.gluster.org/9913
>> [3] http://review.gluster.org/5240
>>
>> regards,
>> Raghavendra.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
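
To make the race described in the quoted mail concrete, here is a toy
model in Python. It is only a sketch and not Gluster code: the Inode and
InodeTable classes, the gfid string and the lock owner name are all
made up for illustration. The one assumption taken from the mail is that
linking an inode for a gfid that is already in the table leaves the
earlier inode as the survivor.

```python
class Inode:
    """Hypothetical stand-in for a server-side inode."""

    def __init__(self, gfid):
        self.gfid = gfid
        # Stand-in for the lock state that posix-locks keeps in the inode.
        self.locks = []


class InodeTable:
    """Hypothetical inode table: at most one inode survives per gfid."""

    def __init__(self):
        self._by_gfid = {}

    def link(self, inode):
        # The first inode linked for a gfid wins; later callers get that one back.
        return self._by_gfid.setdefault(inode.gfid, inode)


table = InodeTable()
gfid = "gfid-1"  # made-up identifier

create_inode = Inode(gfid)  # inode passed down with mkdir/create
lookup_inode = Inode(gfid)  # inode allocated by a racing named lookup

# Step 1 (creation on the backend) has happened, step 2 (linking) has not.
# The racing lookup sees the new entry and links its own inode first.
table.link(lookup_inode)

# A compound "create + lock" stores its lock on the inode it was handed ...
create_inode.locks.append("lock-owner-A")

# ... but that inode is not the one that survives in the table.
survivor = table.link(create_inode)
lock_lost = survivor is not create_inode and not survivor.locks
```

Here survivor is lookup_inode, which carries no lock state, so the lock
taken by the compound operation is effectively lost.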



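
The serialization idea from the quoted mail (track the fops in progress
on a dentry, keyed under the parent inode, and make later fops on the
same dentry wait) could look roughly like the sketch below. This is an
assumption about the shape of the idea, not the code in the patch under
review: DentrySerializer, submit() and complete() are hypothetical
names, and fops are modelled as plain Python callables.

```python
from collections import deque


class DentrySerializer:
    """Hypothetical helper: one running fop per (parent gfid, basename)."""

    def __init__(self):
        # (parent_gfid, name) -> fops waiting behind the one currently running.
        # A key is present exactly while some fop on that dentry is in progress.
        self._waiting = {}

    def submit(self, parent_gfid, name, fop):
        """Run fop now if the dentry is idle, otherwise queue it. Returns True if run."""
        key = (parent_gfid, name)
        if key in self._waiting:
            self._waiting[key].append(fop)
            return False
        self._waiting[key] = deque()
        fop()
        return True

    def complete(self, parent_gfid, name):
        """Called when the running fop on the dentry finishes; starts the next one."""
        key = (parent_gfid, name)
        queue = self._waiting[key]
        if queue:
            queue.popleft()()  # the dentry stays busy with the next fop
        else:
            del self._waiting[key]  # the dentry is idle again


log = []
serializer = DentrySerializer()
serializer.submit("parent-gfid", "dir1", lambda: log.append("mkdir dir1"))
serializer.submit("parent-gfid", "dir1", lambda: log.append("lookup dir1"))
serializer.submit("parent-gfid", "dir2", lambda: log.append("mkdir dir2"))
started_before_completion = list(log)

serializer.complete("parent-gfid", "dir1")  # mkdir on dir1 is done
```

The lookup on dir1 only starts once the mkdir on dir1 completes, while
the mkdir on dir2 is not held up, since only fops on the same dentry are
serialized. Nameless lookups have no (parent, basename) pair and would
bypass this path entirely.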