[Gluster-devel] Reduce number of inodelk/entrylk calls on ec xlator

Xavier Hernandez xhernandez at datalab.es
Tue Jul 1 11:22:18 UTC 2014


Hi,

the current implementation of the ec xlator takes an inodelk/entrylk before 
each operation to guarantee exclusive access to the inode. This blocks any 
other request on the same inode/entry until the previous operation has 
completed and released the lock.

This adds a lot of latency to each operation, even when there are no conflicts 
with other clients. To improve this I'm thinking of implementing something 
similar to eager-locking and piggy-backing.

The following is a schematic description of the idea:

* Each operation will build a list of things to be locked (this could be one
  inode, or up to two entries).
* For each lock in the list:
   * If the lock is already held by another operation, the new operation will
     add itself to a list of waiting operations associated with the operation
     that currently holds the lock.
   * If the lock is not held, it will initiate the normal inodelk/entrylk
     calls.
   * Locks will be acquired in a fixed global order to guarantee that
     deadlocks cannot occur.
* When the operation currently holding the lock terminates, it will check
  whether there are operations waiting on it before unlocking. If so, it will
  resume execution of the next operation without unlocking.
* In the same way, the xattr update after an operation will be delayed if
  another request is waiting to modify the same inode.
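The waiting-list mechanism described above could be sketched roughly as
follows. This is a minimal illustration in Python rather than actual xlator
code; the names (LockTable, acquire, release) and the single-key lock model
are my own simplification of the idea:

```python
from collections import deque

class LockTable:
    """Maps a lock key (e.g. an inode gfid) to the operation currently
    holding it, plus a FIFO of operations waiting for it."""
    def __init__(self):
        self.holders = {}   # key -> operation holding the lock
        self.waiting = {}   # key -> deque of waiting operations

    def acquire(self, key, op):
        """Return True if op got the lock immediately; otherwise queue it
        behind the current holder and return False."""
        if key not in self.holders:
            self.holders[key] = op       # here the real inodelk/entrylk
            return True                  # calls would be issued
        self.waiting.setdefault(key, deque()).append(op)
        return False

    def release(self, key):
        """Called when the holder finishes. If an operation is waiting, hand
        the lock over without unlocking on the bricks and return it so it
        can be resumed; otherwise really unlock and return None."""
        queue = self.waiting.get(key)
        if queue:
            nxt = queue.popleft()
            self.holders[key] = nxt      # lock is passed on, no unlock sent
            return nxt
        del self.holders[key]            # here the real unlock would be sent
        return None
```

In this sketch the second operation never issues its own inodelk/entrylk: it
inherits the lock from the first one, which is where the latency saving comes
from.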

The case with two locks must be analyzed more deeply to guarantee that 
intermediate states, combined with other operations, don't generate deadlocks.
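One standard way to get the deadlock-free ordering mentioned above is to sort
each operation's lock list by a global key before acquiring anything, so that
two operations needing the same pair of locks always take them in the same
order. A small illustrative sketch; the (parent gfid, entry name) key format
is an assumption of mine, not the real ec representation:

```python
def ordered_locks(locks):
    """Return the locks an operation needs, sorted by a global key
    (here assumed to be (parent gfid, entry name) tuples). Since every
    operation acquires in this same total order, no A-then-B versus
    B-then-A cycle can form, so deadlocks cannot occur."""
    return sorted(locks)
```

For example, a rename needing locks on two parent directories would sort them
the same way regardless of which is source and which is destination.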

To avoid stalling other clients, I'm thinking of using GLUSTERFS_OPEN_FD_COUNT 
to see whether the same file is open by other clients. In that case, the 
operation will unlock the inode even if other operations are waiting. Once the 
unlock is finished, the waiting operations will restart the inodelk/entrylk 
procedure.
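The unlock-versus-handover decision could look roughly like this. The
function name and boolean inputs are purely illustrative; in the real xlator
the first input would be derived from the GLUSTERFS_OPEN_FD_COUNT xattr
returned with the fop:

```python
def next_action(other_clients_have_fd, has_waiting_ops):
    """Return 'handover' to resume a waiting local operation without
    unlocking, or 'unlock' to release the lock on the bricks.
    If other clients have the file open, always unlock so they are not
    starved; local waiters then restart inodelk/entrylk themselves."""
    if other_clients_have_fd:
        return "unlock"
    if has_waiting_ops:
        return "handover"
    return "unlock"
```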

Do you think this is a good approach?

Any thoughts/ideas/feedback will be welcome.

Xavi
