[Gluster-devel] On making performance.parallel-readdir as a default option
Soumya Koduri
skoduri at redhat.com
Mon Sep 24 08:17:51 UTC 2018
Please find my comments inline.
On 9/22/18 8:56 AM, Raghavendra Gowdappa wrote:
>
>
> On Fri, Sep 21, 2018 at 11:25 PM Raghavendra Gowdappa
> <rgowdapp at redhat.com> wrote:
>
> Hi all,
>
> We have a feature, performance.parallel-readdir [1], that is known to
> improve the performance of readdir operations [2][3][4]. The option is
> especially useful when the distribute scale is relatively large (>10),
> and it is known to improve readdir performance even at a small
> scale such as a distribute count of 1 [4].
>
> However, this option is not enabled by default. I propose
> making it a default feature.
>
> But there are some important things to be addressed in
> readdir-ahead (which is a core part of parallel-readdir) before we
> can do so:
>
> To summarize the issues with readdir-ahead:
> * There seems to be one prominent problem of missing dentries with
> parallel-readdir. One such problem was discussed on tech-list just
> yesterday, and I've heard about this repeatedly before. I am not sure
> whether this is caused by the missing unlink/rmdir/create etc. fops
> (see below) in readdir-ahead. At the moment there is no RCA.
IMHO, this must be fixed before enabling this option by default.
> * Fixes to maintain stat-consistency in pre-fetched dentries have
> not made it downstream yet (though they are merged upstream [5]).
> * readdir-ahead doesn't implement directory modification fops like
> rmdir/create/symlink/link/unlink/rename. This means the cache won't be
> updated with newer content, even on a single mount, until it is consumed
> by the application or purged.
As you had explained, since this affects cache-consistency, this needs
to be addressed as well.
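
To make the cache-consistency concern concrete, here is a toy sketch in Python. It is not GlusterFS code: the classes Directory and PrefetchCache and the track_modifications flag are made up for illustration. It only models the behaviour described above, where a pre-fetched listing stays stale until it is consumed unless modification fops invalidate it.

```python
class Directory:
    """Stand-in for a backend directory (hypothetical, not a GlusterFS type)."""

    def __init__(self, names):
        self.names = set(names)


class PrefetchCache:
    """Toy model of a readdir cache that pre-fetches directory entries."""

    def __init__(self, backend, track_modifications):
        self.backend = backend
        # When False, create/unlink bypass the cache, which mirrors the
        # missing modification fops described in the thread.
        self.track_modifications = track_modifications
        self.prefetched = None

    def prefetch(self):
        self.prefetched = sorted(self.backend.names)

    def create(self, name):
        self.backend.names.add(name)
        if self.track_modifications:
            self.prefetched = None  # invalidate the stale listing

    def unlink(self, name):
        self.backend.names.discard(name)
        if self.track_modifications:
            self.prefetched = None  # invalidate the stale listing

    def readdir(self):
        if self.prefetched is None:
            self.prefetch()
        # The pre-fetched listing is consumed by this read.
        entries, self.prefetched = self.prefetched, None
        return entries


def demo(track_modifications):
    cache = PrefetchCache(Directory(["a", "b"]), track_modifications)
    cache.prefetch()      # listing is fetched ahead of the application
    cache.unlink("a")     # directory changes on the same mount
    cache.create("c")
    return cache.readdir()
```

With track_modifications=False the first readdir still returns the old listing even though both changes came from the same mount. With it set to True the listing is refetched. Invalidating the whole listing is just the simplest choice for a sketch; a real fix could update entries in place instead.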
> * dht linkto-files should store the relative positions of subvolumes
> instead of absolute subvolume names, so that changes to the immediate
> child won't render them stale.
FWIU from your explanation, this may affect performance for a brief
moment when the option is turned on, but it doesn't produce
incorrect results. Since these options are usually configured
when the volume is first set up and are not toggled
often, this may not be a blocker.
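
A small Python sketch of the name-versus-position idea, purely as an illustration. The subvolume names below are invented, and the sketch assumes that the order of dht's children stays the same when the immediate child changes. How dht recovers from a stale value is not modeled here.

```python
def resolve_by_name(children, stored_name):
    """Return the child a stored name points at, or None if it is stale."""
    return stored_name if stored_name in children else None


def resolve_by_position(children, stored_index):
    """Return the child at a stored relative position, or None if out of range."""
    if 0 <= stored_index < len(children):
        return children[stored_index]
    return None


# Hypothetical child names of dht before and after another translator
# is loaded in between (names are assumptions for this example).
before = ["vol-client-0", "vol-client-1"]
after = ["vol-readdir-ahead-0", "vol-readdir-ahead-1"]

stored_name = before[1]   # what an absolute-name linkto value would hold
stored_index = 1          # what a relative-position value would hold

stale = resolve_by_name(after, stored_name)
still_valid = resolve_by_position(after, stored_index)
```

Here the stored name no longer matches any child after the change, while the stored position still resolves to the second child.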
> * Features that parallel-readdir depends on should be enabled
> automatically when parallel-readdir is enabled, even if they were
> off earlier [6].
Since readdir-ahead is one such option that has not been turned on (by
default) until now, and most of the above-mentioned issues are with
readdir-ahead, would it help to enable only readdir-ahead for
a few releases, get enough testing done, and then consider parallel-readdir?
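
For the dependency point, a rough Python sketch of what "enable prerequisites automatically" could look like. The DEPENDS_ON table and the options_to_enable helper are hypothetical and not taken from glusterd. Only the fact that parallel-readdir relies on readdir-ahead comes from this thread. The sketch assumes the dependency table has no cycles.

```python
# Hypothetical dependency table; only this one edge is taken from the thread.
DEPENDS_ON = {
    "performance.parallel-readdir": ["performance.readdir-ahead"],
}


def options_to_enable(option, enabled, depends_on=DEPENDS_ON):
    """Return the options to turn on, dependencies first, skipping ones already on."""
    order = []

    def visit(opt):
        # Enable prerequisites before the option that needs them.
        for dep in depends_on.get(opt, []):
            visit(dep)
        if opt not in enabled and opt not in order:
            order.append(opt)

    visit(option)
    return order
```

Calling options_to_enable("performance.parallel-readdir", set()) lists readdir-ahead first and parallel-readdir second, so a volume with both options off ends up with both on.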
Thanks,
Soumya
>
> I've listed the important known issues above. But we can discuss which
> of them are blockers for making this feature a default.
>
> Thoughts?
>
> [1] http://review.gluster.org/#/c/16090/
> [2]
> https://events.static.linuxfound.org/sites/events/files/slides/Gluster_DirPerf_Vault2017_0.pdf
> (sections on small directory)
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1628807#c35
> <https://bugzilla.redhat.com/show_bug.cgi?id=1628807#c34>
> [4] https://www.spinics.net/lists/gluster-users/msg34956.html
> [5] http://review.gluster.org/#/c/glusterfs/+/20639/
> [6] https://bugzilla.redhat.com/show_bug.cgi?id=1631406
>
> regards,
> Raghavendra
>
>