[Gluster-devel] [RFC] inode table locking contention reduction experiment

Amar Tumballi amarts at gmail.com
Mon Nov 4 04:39:35 UTC 2019


Thanks for this, github works for review right now :-)

I am occupied till Wednesday, but will review the changes later this week.
At a glance, they look good to me.

A few tests which can be run for validation are:

tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t

tests/features/fuse-lru-limit.t

tests/bugs/shard/shard-inode-refcount-test.t


Ideally, run the full regression with `./run-tests.sh -c`.


Regards,

Amar


On Mon, Nov 4, 2019 at 9:21 AM Changwei Ge <chge at linux.alibaba.com> wrote:

> Hi Amar,
>
> > On 2019/10/31 6:30 PM, Amar Tumballi wrote:
> >
> >
> > On Wed, Oct 30, 2019 at 4:32 PM Xavi Hernandez <jahernan at redhat.com> wrote:
> >
> >     Hi Changwei,
> >
> >     On Tue, Oct 29, 2019 at 7:56 AM Changwei Ge <chge at linux.alibaba.com> wrote:
> >
> >         Hi,
> >
> >         I have recently been working on reducing inode_[un]ref() locking
> >         contention by getting rid of the inode table lock and using only
> >         the per-inode lock to protect the inode REF count. I have already
> >         discussed this over a couple of rounds with several GlusterFS
> >         developers via email and Gerrit, and I now have a basic
> >         understanding of the major logic involved.
> >
> >         Currently, an inode's REF can drop to ZERO and the inode can later
> >         be reused by raising it back to ONE. This is, IMO, why we have to
> >         do so much work on the inode table during REF/UNREF, and it makes
> >         it hard for inode_[un]ref() and inode table / dentry (alias)
> >         searches to run concurrently.
> >
> >         So my question is: in which cases, and how, can we find an inode
> >         whose REF is ZERO?
> >
> >         As GlusterFS hands the inode's memory address to the kernel/FUSE
> >         as the nodeid, can we conclude that only fuse_ino_to_inode() can
> >         bring back a REF=0 inode?
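> >
> >         To make my question concrete, this is roughly how I understand
> >         that path (a paraphrased sketch, not the exact fuse xlator code;
> >         the function name here is illustrative, and the types come from
> >         libglusterfs inode.h):
> >
> >         /* Sketch: the FUSE nodeid handed to the kernel is the inode_t
> >          * address itself, so resolving it just casts the number back and
> >          * takes a new ref -- even if REF had already dropped to 0. */
> >         static inode_t *
> >         ino_to_inode_sketch(uint64_t ino, inode_table_t *itable)
> >         {
> >             inode_t *inode = NULL;
> >
> >             if (ino == 1)
> >                 inode = itable->root;              /* root nodeid is special */
> >             else
> >                 inode = (inode_t *)(uintptr_t)ino; /* nodeid == inode address */
> >
> >             if (inode)
> >                 inode_ref(inode); /* a REF=0 inode comes back to life here */
> >
> >             return inode;
> >         }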
> >
> >
> > Xavi's answer below provides some insights. At the same time, assuming
> > (for now) that only fuse_ino_to_inode() can bring an inode back from the
> > ref=0 state is a good start.
> >
> >
> >     Yes, when an inode reaches refs = 0, it means that gluster code is
> >     not using it anywhere, so it cannot be referenced again unless the
> >     kernel sends new requests on the same inode. Once refs=0 and
> >     nlookup=0, the inode can be destroyed.
> >
> >     Inode code is quite complex right now and I haven't had time to
> >     investigate this further, but I think we could simplify inode
> >     management significantly (especially unref) if we add a reference
> >     when nlookup becomes > 0 and remove that reference when nlookup
> >     drops back to 0. Maybe with this approach we could avoid the inode
> >     table lock in many cases. However, we need to make sure we correctly
> >     handle the invalidation logic to keep the inode table size under
> >     control.
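> >
> >     Conceptually it would be something like the following (only a sketch
> >     to illustrate the idea, not real code; the function names are made
> >     up, the fields are as in libglusterfs inode.h, and races with
> >     invalidation/forget are ignored here):
> >
> >     /* Sketch: hold one ref on behalf of the kernel while nlookup > 0,
> >      * so refs can only reach 0 after the kernel has forgotten the
> >      * inode. */
> >     static void
> >     nlookup_add_sketch(inode_t *inode, uint64_t count)
> >     {
> >         gf_boolean_t first = _gf_false;
> >
> >         LOCK(&inode->lock);
> >         {
> >             first = (inode->nlookup == 0);
> >             inode->nlookup += count;
> >         }
> >         UNLOCK(&inode->lock);
> >
> >         if (first)
> >             inode_ref(inode);   /* first lookup pins the inode */
> >     }
> >
> >     static void
> >     nlookup_sub_sketch(inode_t *inode, uint64_t count)
> >     {
> >         gf_boolean_t last = _gf_false;
> >
> >         LOCK(&inode->lock);
> >         {
> >             inode->nlookup -= count;
> >             last = (inode->nlookup == 0);
> >         }
> >         UNLOCK(&inode->lock);
> >
> >         if (last)
> >             inode_unref(inode); /* last forget drops that ref */
> >     }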
> >
> >
> > My suggestion is: don't wait for a complete solution before posting the
> > patch. Let us get a chance to look at work-in-progress patches, so we
> > can have discussions on the code itself. That would help us reach better
> > solutions sooner.
>
> Agree.
>
> I have almost finished implementing my draft design for this experiment.
> The immature code has been pushed to my personal GlusterFS repo [1].
>
> Right now it is a single large patch; I will split it into smaller
> patches when I push it to Gerrit, for review convenience. If you would
> prefer that I push it to Gerrit for early review and discussion, I can
> do that :-). But I am still doing some debugging.
>
> My work includes:
>
> 1. Move the logic unrelated to inode refing/unrefing out of
> `__inode_[un]ref()`, which also reduces their arguments.
> 2. Add a dedicated 'ref_lock' to the inode to keep ref/unref atomic (a
> rough sketch of what I mean follows after this list).
> 3. As `inode_table::active_size` is only used for debugging purposes,
> convert it into an atomic variable.
> 4. Factor out inode pruning.
> 5. To let inode searches run concurrently, first take a RDLOCK on the
> table and convert it to a WRLOCK only when necessary.
> 6. The inode table lock is no longer needed for inode ref/unref unless
> the inode has to be moved between the table lists.
>
> etc...
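>
> To give an idea of item 2, this is roughly what the ref path looks like
> in my draft (simplified; `ref_lock` is the new per-inode lock from the
> patch, while `inode_table_make_active()` is just an illustrative name
> for the factored-out list-move helper):
>
> /* Simplified sketch: REF/UNREF only take the per-inode ref_lock; the
>  * table lock is taken afterwards, and only when the inode actually has
>  * to move between the table lists. */
> inode_t *
> inode_ref(inode_t *inode)
> {
>     gf_boolean_t was_zero = _gf_false;
>
>     if (!inode)
>         return NULL;
>
>     LOCK(&inode->ref_lock);            /* per-inode lock, no table lock */
>     {
>         was_zero = (inode->ref == 0);
>         inode->ref++;
>     }
>     UNLOCK(&inode->ref_lock);
>
>     if (was_zero)
>         /* slow path only: move the inode from the lru to the active list */
>         inode_table_make_active(inode->table, inode);
>
>     return inode;
> }
>
> The draft also has to handle the obvious races around that list move
> (e.g. a concurrent unref hitting zero again), which are omitted above.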
>
> Any comments, ideas or suggestions are very welcome.
>
> Thanks,
> Changwei
>
> [1]:
>
> https://github.com/changweige/glusterfs/commit/d7226d2458281212af19ec8c2ca3d8c8caae1330
>
> >
> >     Regards,
> >
> >     Xavi
> >
> >
> >
> >         Thanks,
> >         Changwei
> >
>

