[Gluster-devel] Reduce number of inodelk/entrylk calls on ec xlator

haiwei.xie-soulinfo haiwei.xie at soulinfo.com
Tue Jul 1 13:37:57 UTC 2014


Hi Xavi,

   Writev takes an inodelk on the whole file, so write speed is poor. But if
it only locked (offset, len), the IDA_KEY_SIZE xattr could become inconsistent
across bricks because of unordered writev operations.

   So how about using just IDA_KEY_VERSION and each brick's ia_size to detect
data corruption? Drop IDA_KEY_SIZE, have lookup lock the whole file, and have
readv lock only (offset, len).
 
   I think this could give good performance while keeping data consistent.

   Thanks.

-terrs

> Hi,
> 
> The current implementation of the ec xlator uses inodelk/entrylk before each
> operation to guarantee exclusive access to the inode. This blocks any other
> request to the same inode/entry until the previous operation has completed
> and unlocked it.
> 
> This adds a lot of latency to each operation, even when there are no
> conflicts with other clients. To improve this I'm thinking of implementing
> something similar to eager-locking and piggy-backing.
> 
> The following is a schematic description of the idea (a C sketch follows
> the list):
> 
> * Each operation will build a list of things to be locked (this could be 1
>   inode or up to 2 entries).
> * For each lock in the list:
>    * If the lock is already acquired by another operation, it will add itself
>      to a list of waiting operations associated with the operation that
>      currently holds the lock.
>    * If the lock is not acquired, it will initiate the normal inodelk/entrylk
>      calls.
>    * The locks will be acquired in a specific global order to guarantee
>      that deadlocks cannot occur.
> * When the operation that is currently holding the lock terminates, it will
>   test if there are waiting operations on it before unlocking. If so, it will
>   resume execution of the next operation without unlocking.
> * In the same way, xattr updating after operation will be delayed if another
>   request was waiting to modify the same inode.
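> 
> A rough sketch in C of this queueing scheme (all names are hypothetical,
> not the actual ec code):
> 
>     #include <pthread.h>
>     #include <stdbool.h>
>     #include <stddef.h>
> 
>     struct ec_op;
> 
>     struct ec_lock {
>         pthread_mutex_t mutex;
>         bool            acquired;  /* inodelk/entrylk already held? */
>         struct ec_op   *owner;     /* operation currently holding it */
>         struct ec_op   *waiters;   /* list of waiting operations */
>     };
> 
>     struct ec_op {
>         struct ec_op *next;
>         void        (*resume)(struct ec_op *op);
>     };
> 
>     /* Returns true if the caller must issue the real inodelk/entrylk;
>      * otherwise the op is queued and will be resumed on hand-over. */
>     static bool ec_lock_try(struct ec_lock *lock, struct ec_op *op)
>     {
>         bool need_lock = false;
> 
>         pthread_mutex_lock(&lock->mutex);
>         if (!lock->acquired) {
>             lock->acquired = true;
>             lock->owner = op;
>             need_lock = true;
>         } else {
>             op->next = lock->waiters;  /* piggy-back on the held lock */
>             lock->waiters = op;
>         }
>         pthread_mutex_unlock(&lock->mutex);
> 
>         return need_lock;
>     }
> 
>     /* Called when the owner finishes. Returns true if the caller must
>      * send the real unlock (nobody was waiting). */
>     static bool ec_lock_done(struct ec_lock *lock)
>     {
>         struct ec_op *next;
> 
>         pthread_mutex_lock(&lock->mutex);
>         next = lock->waiters;
>         if (next != NULL) {
>             lock->waiters = next->next;  /* hand over without unlocking */
>             lock->owner = next;
>         } else {
>             lock->acquired = false;      /* really unlock now */
>             lock->owner = NULL;
>         }
>         pthread_mutex_unlock(&lock->mutex);
> 
>         if (next != NULL)
>             next->resume(next);
> 
>         return next == NULL;
>     }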
> 
> The case with 2 locks must be analyzed more deeply to guarantee that
> intermediate states combined with other operations don't generate deadlocks.
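> 
> For the ordering, a minimal sketch assuming each lock target can be
> identified by its gfid (type and helper names are hypothetical):
> 
>     #include <string.h>
> 
>     #define GFID_SIZE 16
> 
>     struct ec_lock_target {
>         unsigned char gfid[GFID_SIZE];
>     };
> 
>     /* Sort the two targets so every operation locks them in the same
>      * global order, removing the classic A->B vs. B->A deadlock. */
>     static void ec_order_locks(struct ec_lock_target **first,
>                                struct ec_lock_target **second)
>     {
>         if (memcmp((*second)->gfid, (*first)->gfid, GFID_SIZE) < 0) {
>             struct ec_lock_target *tmp = *first;
>             *first = *second;
>             *second = tmp;
>         }
>     }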
> 
> To avoid stalling other clients, I'm thinking of using
> GLUSTERFS_OPEN_FD_COUNT to see whether the same file is open by other
> clients. In that case, the operation will unlock the inode even if there
> are other operations waiting. Once the unlock has finished, the waiting
> operation will restart the inodelk/entrylk procedure.
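> 
> The decision could look like this (a sketch; it assumes the fd count has
> already been extracted from the GLUSTERFS_OPEN_FD_COUNT xdata reply, and
> the function name is hypothetical):
> 
>     #include <stdbool.h>
> 
>     /* If other clients hold an open fd, release the lock even when
>      * local operations are waiting, so those clients are not stalled;
>      * the waiters then restart the inodelk/entrylk procedure. */
>     static bool ec_should_keep_lock(unsigned int open_fd_count,
>                                     bool have_waiters)
>     {
>         if (open_fd_count > 1)
>             return false;
>         return have_waiters;  /* keep it only to hand it over locally */
>     }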
> 
> Do you think this is a good approach?
> 
> Any thoughts/ideas/feedback will be welcome.
> 
> Xavi
