[Gluster-devel] Fuse mounts and inodes

Raghavendra G raghavendra at gluster.com
Wed Sep 6 04:57:03 UTC 2017


Another parallel effort could be to configure the number of inodes/dentries
cached by the kernel VFS using the /proc/sys/vm interface.

==============================================================

vfs_cache_pressure
------------------

This percentage value controls the tendency of the kernel to reclaim
the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative
performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.

We also have an article for sysadmins which has a relevant section:

<quote>

With GlusterFS, many users with a lot of storage and many small files
easily end up using a lot of RAM on the server side due to
'inode/dentry' caching, leading to decreased performance when the kernel
keeps crawling through data-structures on a 40GB RAM system. Changing
this value higher than 100 has helped many users to achieve fair caching
and more responsiveness from the kernel.

</quote>

The complete article can be found at:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/
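
For illustration only (this is not from the article), here is a minimal C
sketch of reading or setting the tunable by writing the procfs file directly,
i.e. the same effect as "sysctl vm.vfs_cache_pressure" /
"sysctl -w vm.vfs_cache_pressure=<value>". It assumes procfs is mounted at
/proc and that the process has enough privilege to write there:

#include <stdio.h>
#include <stdlib.h>

int
main (int argc, char *argv[])
{
        const char *path = "/proc/sys/vm/vfs_cache_pressure";
        FILE       *fp   = NULL;
        int         val  = 0;

        if (argc < 2) {
                /* no argument: print the current value */
                fp = fopen (path, "r");
                if (!fp) {
                        perror (path);
                        return 1;
                }
                if (fscanf (fp, "%d", &val) == 1)
                        printf ("vm.vfs_cache_pressure = %d\n", val);
        } else {
                /* argument given: write the new value (needs root) */
                fp = fopen (path, "w");
                if (!fp) {
                        perror (path);
                        return 1;
                }
                fprintf (fp, "%d\n", atoi (argv[1]));
        }

        fclose (fp);
        return 0;
}

Note that a value written this way does not survive a reboot; for a persistent
setting it would normally go into /etc/sysctl.conf (or a file under
/etc/sysctl.d).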

regards,


On Tue, Sep 5, 2017 at 5:20 PM, Raghavendra Gowdappa <rgowdapp at redhat.com>
wrote:

> +gluster-devel
>
> Ashish just spoke to me about the need for GC of inodes due to some state
> in the inode that is being proposed in EC. Hence adding more people to the
> conversation.
>
> > > On 4 September 2017 at 12:34, Csaba Henk <chenk at redhat.com> wrote:
> > >
> > > > I don't know, it depends on how sophisticated a GC we need/want/can
> > > > get by with. I guess the complexity will be inherent, i.e. that of the
> > > > algorithm chosen and how we address concurrency & performance impacts,
> > > > but once that's got right the other aspects of the implementation
> > > > won't be hard.
> > > >
> > > > E.g. would it be good just to maintain a simple LRU list?
> > > >
> >
> > Yes. I was also thinking of leveraging the lru list. We can invalidate
> > the first "n" inodes from the lru list of the fuse inode table.
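
A rough sketch of that idea against the inode table structures in
libglusterfs is below. The field and function names used here
(itable->lock, itable->lru, inode->list, inode_invalidate()) are written
from memory and only approximate, and the locking is purely indicative, so
treat it as pseudocode rather than a patch:

#include <pthread.h>
#include <stdint.h>

#include "inode.h"
#include "list.h"

/* Invalidate up to "n" inodes from the head of the table's lru list.
 * inode_invalidate() is expected to reach the fuse bridge's invalidation
 * path, which asks the kernel to drop its references; the resulting
 * FORGETs then let the normal ref counting purge the inodes. */
static void
fuse_itable_prune (inode_table_t *itable, uint32_t n)
{
        inode_t  *inode = NULL;
        inode_t  *tmp   = NULL;
        uint32_t  count = 0;

        pthread_mutex_lock (&itable->lock);
        {
                list_for_each_entry_safe (inode, tmp, &itable->lru, list) {
                        if (count++ >= n)
                                break;
                        /* NOTE: calling inode_invalidate() while holding the
                         * table lock is likely wrong in real code; this is
                         * exactly the concurrency question raised above. */
                        inode_invalidate (inode);
                }
        }
        pthread_mutex_unlock (&itable->lock);
}

Whether something like this runs on a timer, on an lru size threshold, or as
part of a more elaborate GC policy is the open question in this thread.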
> >
> > >
> > > That might work for starters.
> > >
> > > >
> > > > Csaba
> > > >
> > > > On Mon, Sep 4, 2017 at 8:48 AM, Nithya Balachandran
> > > > <nbalacha at redhat.com> wrote:
> > > >
> > > >>
> > > >>
> > > >> On 4 September 2017 at 12:14, Csaba Henk <chenk at redhat.com> wrote:
> > > >>
> > > >>> Basically, this is how I see the fuse invalidate calls: as rescuers
> > > >>> of sanity.
> > > >>>
> > > >>> Normally, when you have a lot of a certain kind of stuff that tends
> > > >>> to accumulate, the immediate thought is: let's set up some garbage
> > > >>> collection mechanism that will take care of keeping the accumulation
> > > >>> at bay. But that doesn't work with inodes in a naive way, as they are
> > > >>> referenced from the kernel, so we have to keep them around until the
> > > >>> kernel tells us it's giving up its reference. However, with the fuse
> > > >>> invalidate calls we can take the initiative and instruct the kernel:
> > > >>> "hey, kernel, give up your references to this thing!"
> > > >>>
> > > >>> So we are actually free to implement any kind of inode GC in
> > > >>> glusterfs, just have to take care to add the proper callback to
> > > >>> fuse_invalidate_* and we are good to go.
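
For reference, this is roughly what the reverse invalidation looks like
through libfuse's low-level API. Gluster's fuse bridge speaks the kernel
protocol directly rather than going through libfuse, but the upcalls
(FUSE_NOTIFY_INVAL_ENTRY and FUSE_NOTIFY_INVAL_INODE) are the same. The
libfuse 3 signatures are used here; libfuse 2 takes a fuse_chan instead of a
session, so treat the exact prototypes as an assumption:

#define FUSE_USE_VERSION 34

#include <string.h>
#include <fuse_lowlevel.h>

/* Ask the kernel to drop a cached dentry and the attributes/pages of the
 * inode behind it. Once the kernel gives up its reference it sends FORGET,
 * which is what finally lets user space release the inode. */
static int
drop_kernel_refs (struct fuse_session *se, fuse_ino_t parent,
                  const char *name, fuse_ino_t ino)
{
        int ret;

        /* invalidate the parent->name dentry */
        ret = fuse_lowlevel_notify_inval_entry (se, parent, name,
                                                strlen (name));
        if (ret < 0)
                return ret;

        /* invalidate attributes and cached data; off 0, len 0 means all */
        return fuse_lowlevel_notify_inval_inode (se, ino, 0, 0);
}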
> > > >>>
> > > >>>
> > > >> That sounds good and something we need to do in the near future. Is
> > > >> this something that is easy to implement?
> > > >>
> > > >>
> > > >>> Csaba
> > > >>>
> > > >>> On Mon, Sep 4, 2017 at 7:00 AM, Nithya Balachandran
> > > >>> <nbalacha at redhat.com> wrote:
> > > >>>
> > > >>>>
> > > >>>>
> > > >>>> On 4 September 2017 at 10:25, Raghavendra Gowdappa
> > > >>>> <rgowdapp at redhat.com> wrote:
> > > >>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> ----- Original Message -----
> > > >>>>> > From: "Nithya Balachandran" <nbalacha at redhat.com>
> > > >>>>> > Sent: Monday, September 4, 2017 10:19:37 AM
> > > >>>>> > Subject: Fuse mounts and inodes
> > > >>>>> >
> > > >>>>> > Hi,
> > > >>>>> >
> > > >>>>> > One of the reasons for the memory consumption in gluster fuse
> > > >>>>> > mounts is the number of inodes in the table which are never
> > > >>>>> > kicked out.
> > > >>>>> >
> > > >>>>> > Is there any way to default to an entry-timeout and
> > > >>>>> > attribute-timeout value while mounting Gluster using Fuse? Say
> > > >>>>> > 60s each so those entries will be purged periodically?
> > > >>>>>
> > > >>>>> Once the entry times out, inodes won't be purged. The kernel sends
> > > >>>>> a lookup to revalidate the mapping of path to inode. AFAIK, reverse
> > > >>>>> invalidation (see inode_invalidate) is the only way to make the
> > > >>>>> kernel forget inodes/attributes.
> > > >>>>>
> > > >>>> Is that something that can be done from the Fuse mount? Or is this
> > > >>>> something that needs to be added to Fuse?
> > > >>>>
> > > >>>>> >
> > > >>>>> > Regards,
> > > >>>>> > Nithya
> > > >>>>> >
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> >
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Raghavendra G