[Gluster-devel] Storage/posix: syscalls done holding inode->lock

Raghavendra Gowdappa rgowdapp at redhat.com
Mon Feb 6 09:30:54 UTC 2017

Hi all,

Storage/posix makes syscalls on the backend filesystem while holding inode->lock. This is bad because inode->lock is global to the inode, so it can cause unnecessary contention with unrelated translators doing unrelated operations (like inode_ctx_get). I've discussed one such issue in [2]. A quick git grep for "inode->lock" in storage/posix gave the following results:

* posix_writev:
  GLUSTERFS_WRITE_IS_APPEND - looks like it is used by arbiter/afr.
  GLUSTERFS_WRITE_UPDATE_ATOMIC - looks like it is used by shard.
* posix_fd_ctx_get - resolves the gfid handle (which can involve multiple readlinks and lstats) under inode->lock, though that cost is paid only once, when the fd-ctx is freshly created.
* code that maintains pgfid xattrs - executed in various dentry modification fops like mkdir, create, mknod, unlink, rename, link etc.
* code that uses GET_LINK_COUNT - looks like it is used by shard and EC. Note that this code is executed during rename/unlink.
* posix_create_link_if_gfid_exists - looks like it is used by afr entry selfheal.
* posix_(f)xattrop - used by various xlators like afr and marker during different fops.

The question here is: can we do this synchronization with a lock visible only to storage/posix, so that contention is localized (like [1])?
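
A minimal sketch of the idea, to make the question concrete. It is not the code from [1]. Every name below (demo_inode, demo_posix_ctx, demo_posix_ctx_get, demo_posix_write) is hypothetical and only stands in for the real inode_t and posix internals. The shared inode lock is held just long enough to look up or create a posix-private context, and the syscalls (here an fstat plus pwrite, loosely modelled on an "is this write an append?" check) run under the private lock only:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared inode; NOT the real inode_t. */
struct demo_inode {
    pthread_mutex_t lock;  /* global to the inode, taken by every xlator */
    void *posix_ctx;       /* slot assumed to be owned by storage/posix */
};

/* Hypothetical per-inode context visible only to storage/posix. */
struct demo_posix_ctx {
    pthread_mutex_t lock;  /* only posix code ever takes this lock */
};

/* Look up (or lazily create) the private context. The shared lock is
 * held only for the pointer lookup/store; no syscall is made under it. */
static struct demo_posix_ctx *
demo_posix_ctx_get(struct demo_inode *inode)
{
    struct demo_posix_ctx *ctx;

    assert(inode != NULL);

    pthread_mutex_lock(&inode->lock);
    ctx = inode->posix_ctx;
    if (ctx == NULL) {
        ctx = calloc(1, sizeof(*ctx));
        if (ctx != NULL) {
            pthread_mutex_init(&ctx->lock, NULL);
            inode->posix_ctx = ctx;
        }
    }
    pthread_mutex_unlock(&inode->lock);

    return ctx;
}

/* Write that must be atomic with a size check. The fstat and pwrite
 * happen under the private lock, so other translators using
 * inode->lock are not blocked while the backend syscalls run. */
static ssize_t
demo_posix_write(struct demo_inode *inode, int fd, const void *buf,
                 size_t len, off_t off, int *is_append)
{
    struct demo_posix_ctx *ctx = demo_posix_ctx_get(inode);
    struct stat pre;
    ssize_t ret = -1;

    if (ctx == NULL)
        return -1;

    pthread_mutex_lock(&ctx->lock);
    if (fstat(fd, &pre) == 0) {
        if (is_append != NULL)
            *is_append = (pre.st_size == off);
        ret = pwrite(fd, buf, len, off);
    }
    pthread_mutex_unlock(&ctx->lock);

    return ret;
}
```

In this sketch inode->lock is always released before the private lock is taken, so the two locks are never nested and no ordering problem arises between them. Whether the real fops listed above can be restructured this way depends on what other state they touch under inode->lock.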

[1] https://review.gluster.org/16462
[2] http://lists.gluster.org/pipermail/gluster-devel/2017-January/051936.html


