[Gluster-devel] Storage/posix: syscalls done holding inode->lock

Vijay Bellur vbellur at redhat.com
Mon Feb 6 23:20:22 UTC 2017


On Mon, Feb 6, 2017 at 4:30 AM, Raghavendra Gowdappa <rgowdapp at redhat.com>
wrote:

> Hi all,
>
> Storage/posix does syscalls on the backend filesystem while holding
> inode->lock. This is problematic because inode->lock is global to the
> inode, so a blocking syscall can cause unnecessary contention with
> unrelated translators doing unrelated operations (like inode_ctx_get
> etc.). I've discussed one such issue in [2]. A quick git grep for
> "inode->lock" in storage/posix gave the following results:
>
> * posix_writev -
>   GLUSTERFS_WRITE_IS_APPEND - looks like it is used by arbiter/afr.
>   GLUSTERFS_WRITE_UPDATE_ATOMIC - looks like it is used by shard.
> * posix_fd_ctx_get - resolves the gfid handle (which can involve multiple
> readlinks and lstats) under inode->lock, though the cost is paid only
> once, when the fd-ctx is freshly created.
> * code that maintains pgfid xattrs - executed in various dentry
> modification fops like mkdir, create, mknod, unlink, rename, link etc.
> * code that uses GET_LINK_COUNT - looks like it is used by shard and EC.
> Note that this code is executed during rename/unlink.
> * posix_create_link_if_gfid_exists - looks like it is used by afr entry
> self-heal.
> * posix_(f)xattrop - used by various xlators (afr, marker) during
> different fops.
>
> The question here is: can we have synchronization using a lock visible
> only to storage/posix, so that contention is localized (like [1])?
>
>
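
To make the pattern under discussion concrete: the critical sections above
have roughly the following shape (a simplified sketch, not the actual posix
code -- the function name is made up; LOCK/UNLOCK and sys_* are the usual
libglusterfs wrappers):

/* Today: backend syscalls are issued while holding inode->lock, the
 * lock shared by every xlator in the stack that touches this inode. */
static int
posix_xattrop_under_inode_lock (inode_t *inode, const char *real_path,
                                const char *key, void *value, size_t size)
{
        int ret = -1;

        LOCK (&inode->lock);
        {
                /* blocking disk I/O under the stack-wide lock */
                ret = sys_lgetxattr (real_path, key, value, size);
                if (ret >= 0) {
                        /* apply the xattrop operation on value here */
                        ret = sys_lsetxattr (real_path, key, value,
                                             size, 0);
                }
        }
        UNLOCK (&inode->lock);

        return ret;
}

Any thread that merely wants to do an inode_ctx_get() on the same inode ends
up waiting behind that disk I/O.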
I think the answer depends on the degree of isolation required across
threads operating on the same inode. If the operations done while holding
inode->lock do not cause any side effects elsewhere in the xlator stack,
we should be able to replace inode->lock with a more local lock. At first
glance it looks like a smaller lock would suffice for most of the cases
listed above; a more careful analysis is needed to determine whether there
are scenarios where the stack-wide inode->lock is actually required.
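
For the xattrop/pgfid style critical sections, one possibility would be
something along these lines -- only a rough sketch, and the names
(posix_inode_ctx_t, posix_xattrop_under_local_lock) are made up, assuming
posix keeps a per-inode context of its own:

/* A lock private to storage/posix, kept in posix's own per-inode
 * context. Only posix's critical sections serialize on it; other
 * xlators taking inode->lock (for inode_ctx_get etc.) never touch it. */
typedef struct {
        gf_lock_t lock;           /* never visible outside storage/posix */
        /* other posix-private per-inode state could live here too */
} posix_inode_ctx_t;

static int
posix_xattrop_under_local_lock (posix_inode_ctx_t *ctx,
                                const char *real_path, const char *key,
                                void *value, size_t size)
{
        int ret = -1;

        LOCK (&ctx->lock);        /* instead of LOCK (&inode->lock) */
        {
                /* same get/modify/set sequence on the backend as today */
                ret = sys_lgetxattr (real_path, key, value, size);
                if (ret >= 0)
                        ret = sys_lsetxattr (real_path, key, value,
                                             size, 0);
        }
        UNLOCK (&ctx->lock);

        return ret;
}

The context (and its lock, via LOCK_INIT) would be created lazily and stored
against the posix xlator with inode_ctx_set(), so inode->lock would be held
only for that brief lookup/creation rather than across syscalls.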

Regards,
Vijay