[Gluster-devel] Reduce number of inodelk/entrylk calls on ec xlator
haiwei.xie-soulinfo
haiwei.xie at soulinfo.com
Tue Jul 1 13:37:57 UTC 2014
hi Xavi,

Writev currently takes an inodelk on the whole file, so write speed is bad.
If it took a range lock instead, inodelk(offset, len), the IDA_KEY_SIZE
xattr could become inconsistent across bricks when writevs complete out of
order.

So how about using just IDA_KEY_VERSION together with each brick's ia_size
to detect data corruption? Drop IDA_KEY_SIZE, have lookup lock the whole
file, and have readv lock only (offset, len). I guess this can give good
performance while keeping the data consistent.
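
Roughly what I mean, sketched in C (the struct and function names are just
illustrative here, not the real ec xlator types):

    #include <stdint.h>
    #include <stdio.h>

    /* Per-brick state gathered at lookup time (illustrative). */
    struct brick_state {
        uint64_t version;   /* IDA_KEY_VERSION xattr value */
        uint64_t ia_size;   /* file size reported by the brick */
    };

    /* Returns 1 if every brick holding the highest version agrees on
     * ia_size, 0 if the sizes diverge (possible corruption). */
    static int check_consistency(const struct brick_state *b, int n)
    {
        uint64_t max_ver = 0, size = 0;
        int i, first = 1;

        for (i = 0; i < n; i++)
            if (b[i].version > max_ver)
                max_ver = b[i].version;

        for (i = 0; i < n; i++) {
            if (b[i].version != max_ver)
                continue;            /* stale brick: self-heal case */
            if (first) {
                size = b[i].ia_size;
                first = 0;
            } else if (b[i].ia_size != size) {
                return 0;            /* same version, size mismatch */
            }
        }
        return 1;
    }

    int main(void)
    {
        struct brick_state b[] = { {7, 4096}, {7, 4096}, {6, 2048} };
        printf("consistent: %d\n", check_consistency(b, 3));
        return 0;
    }
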
Thanks.
-terrs
> Hi,
>
> The current implementation of the ec xlator takes an inodelk/entrylk before
> each operation to guarantee exclusive access to the inode. This blocks any
> other request to the same inode/entry until the previous operation has
> completed and unlocked it.
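>
> Roughly, today each fop behaves like the following model, where a mutex
> stands in for the network-level inodelk (illustrative, not the actual code):
>
>     #include <pthread.h>
>     #include <stdio.h>
>
>     static pthread_mutex_t inode_lock = PTHREAD_MUTEX_INITIALIZER;
>
>     /* Every fop pays a full lock + unlock round trip, even when no
>      * other client is competing for the inode. */
>     static void do_fop(const char *name)
>     {
>         pthread_mutex_lock(&inode_lock);    /* inodelk round trip */
>         printf("executing %s\n", name);     /* the operation itself */
>         pthread_mutex_unlock(&inode_lock);  /* unlock round trip */
>     }
>
>     int main(void)
>     {
>         do_fop("writev");
>         do_fop("writev");   /* the second write repeats the full cycle */
>         return 0;
>     }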
>
> This adds a lot of latency to each operation, even when there are no
> conflicts with other clients. To improve this I was thinking of implementing
> something similar to eager-locking and piggy-backing.
>
> The following is a schematic description of the idea:
>
> * Each operation will build a list of things to be locked (this could be 1
>   inode or up to 2 entries).
> * For each lock in the list:
>   * If the lock is already acquired by another operation, the new operation
>     will add itself to a list of waiting operations associated with the
>     operation that currently holds the lock.
>   * If the lock is not acquired, it will initiate the normal inodelk/entrylk
>     calls.
>   * The locks will be acquired in a special order to guarantee that there
>     can be no deadlocks.
> * When the operation that currently holds the lock terminates, it will
>   check whether there are operations waiting on it before unlocking. If so,
>   it will resume execution of the next operation without unlocking (see the
>   sketch after this list).
> * In the same way, the xattr update that follows an operation will be
>   delayed if another request is waiting to modify the same inode.
>
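> In pseudo-C, the handoff at the core of this would look something like the
> following (illustrative structures, not the final ec code):
>
>     #include <stddef.h>
>
>     struct ec_op {
>         struct ec_op *next_waiting;   /* next op queued on the same lock */
>     };
>
>     struct ec_lock {
>         struct ec_op *owner;          /* op currently holding the inodelk */
>         struct ec_op *waiters;        /* ops queued behind the owner */
>     };
>
>     /* Called when 'op' completes. Returns the operation to resume, or
>      * NULL if a real unlock must be sent to the bricks. */
>     static struct ec_op *lock_release(struct ec_lock *lk, struct ec_op *op)
>     {
>         struct ec_op *next = lk->waiters;
>
>         (void)op;                     /* owner bookkeeping elided */
>         if (next == NULL) {
>             lk->owner = NULL;         /* no waiters: send the unlock */
>             return NULL;
>         }
>         lk->waiters = next->next_waiting;
>         lk->owner = next;             /* hand over: no unlock round trip */
>         return next;
>     }
>
>     int main(void)
>     {
>         struct ec_op a = { NULL }, b = { NULL };
>         struct ec_lock lk = { &a, &b };
>         return lock_release(&lk, &a) == &b ? 0 : 1;  /* b resumes */
>     }
>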
> The case with 2 locks must be analyzed more deeply to guarantee that
> intermediate states combined with other operations cannot generate
> deadlocks.
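>
> The usual way to make the 2-lock case safe is to always acquire the locks
> in a canonical order, for example by parent gfid and then basename
> (illustrative sketch, with a stub in place of the real entrylk call):
>
>     #include <string.h>
>
>     struct entry_lock {
>         unsigned char pargfid[16];   /* parent directory gfid */
>         const char   *basename;
>     };
>
>     static void acquire_entrylk(const struct entry_lock *l)
>     {
>         (void)l;   /* the network entrylk would be initiated here */
>     }
>
>     /* Canonical order: by gfid first, then by name. */
>     static int entry_lock_cmp(const struct entry_lock *a,
>                               const struct entry_lock *b)
>     {
>         int r = memcmp(a->pargfid, b->pargfid, 16);
>         return r != 0 ? r : strcmp(a->basename, b->basename);
>     }
>
>     static void lock_two_entries(const struct entry_lock *a,
>                                  const struct entry_lock *b)
>     {
>         if (entry_lock_cmp(a, b) > 0) {   /* swap so 'a' comes first */
>             const struct entry_lock *t = a; a = b; b = t;
>         }
>         acquire_entrylk(a);   /* every client locks in the same order, */
>         acquire_entrylk(b);   /* so no lock-order cycles can form      */
>     }
>
>     int main(void)
>     {
>         struct entry_lock a = { {0}, "newname" };
>         struct entry_lock b = { {0}, "oldname" };
>         lock_two_entries(&a, &b);   /* e.g. the two ends of a rename */
>         return 0;
>     }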
>
> To avoid stalling other clients I'm thinking of using GLUSTERFS_OPEN_FD_COUNT
> to see if the same file is open by other clients. In that case, the operation
> will unlock the inode even if there are other operations waiting. Once the
> unlock is finished, the waiting operation will restart the inodelk/entrylk
> procedure.
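>
> The release decision would then reduce to something like this (the
> GLUSTERFS_OPEN_FD_COUNT xattr key is real; the threshold and helper
> are illustrative assumptions):
>
>     #include <stdbool.h>
>     #include <stdio.h>
>
>     /* open_fd_count would come back in the fop's xdata when
>      * GLUSTERFS_OPEN_FD_COUNT is requested; if it exceeds our own
>      * open count (assumed 1 here), another client has the file open. */
>     static bool must_unlock(int open_fd_count, bool have_waiters)
>     {
>         if (open_fd_count > 1)
>             return true;        /* other clients: always release    */
>         return !have_waiters;   /* sole opener: keep lock for queue */
>     }
>
>     int main(void)
>     {
>         printf("%d\n", must_unlock(2, true));   /* 1: unlock anyway */
>         printf("%d\n", must_unlock(1, true));   /* 0: hand off lock */
>         return 0;
>     }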
>
> Do you think this is a good approach?
>
> Any thoughts/ideas/feedback will be welcome.
>
> Xavi
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel