[Bugs] [Bug 1662830] [RFE] Enable parallel-readdir by default for all gluster volumes

Wed Jan 2 06:42:18 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1662830

--- Comment #2 from Raghavendra G <rgowdapp at redhat.com> ---
Also see:
1. https://lists.gluster.org/pipermail/gluster-devel/2018-September/055419.html
2. https://lists.gnu.org/archive/html/gluster-devel/2013-09/msg00034.html

From a mail to gluster-devel titled "serialized readdir(p) across subvols and
effect on performance":

<snip>
All,

As many of us are aware, readdir(p) calls are serialized across DHT subvols.
An intuitive first reaction to this algorithm is that readdir(p) is going to
be slow.

However, this is only partly true. Reading the contents of a directory is
normally split into multiple readdir(p) calls. Most of the time (when a
directory is large enough that the dentries and inode data on each subvol
exceed a typical readdir(p) buffer size: 128KB when readdir-ahead is enabled,
4KB on fuse when readdir-ahead is disabled) a single readdir(p) request is
served from a single subvolume (or two subvolumes in the worst case), and hence
a single readdir(p) is not serialized across all subvolumes.
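
To make the point above concrete, here is a minimal Python sketch of a
serialized directory walk. It is a toy model, not GlusterFS code: the function
name, the (subvol, offset) position tuple, and counting the buffer in dentries
rather than bytes are all assumptions made for illustration.

```python
def serialized_readdir(subvols, start, buf_entries):
    """Model one readdir(p) call that walks subvolumes strictly in order.

    subvols     -- list of lists of dentry names, one list per DHT subvolume
                   (hypothetical in-memory stand-in for the bricks)
    start       -- (subvol_index, offset) where the previous call stopped
    buf_entries -- how many dentries fit into one readdir(p) buffer
    Returns (entries, next_position, subvols_visited).
    """
    idx, off = start
    entries = []
    visited = 0
    # Keep moving to the next subvolume only while the buffer has room left.
    while idx < len(subvols) and len(entries) < buf_entries:
        visited += 1
        room = buf_entries - len(entries)
        chunk = subvols[idx][off:off + room]
        entries.extend(chunk)
        off += len(chunk)
        if off >= len(subvols[idx]):
            # This subvolume is exhausted; continue with the next one.
            idx, off = idx + 1, 0
    return entries, (idx, off), visited


# Assumed layout: 4 subvolumes with 1000 dentries each, 300 dentries per buffer.
subvols = [[f"s{i}-f{j}" for j in range(1000)] for i in range(4)]
pos, visits = (0, 0), []
while pos[0] < len(subvols):
    _, pos, visited = serialized_readdir(subvols, pos, 300)
    visits.append(visited)
print(max(visits))  # no single call touches more than 2 subvolumes
```

In this model a call touches a second subvolume only when the buffer straddles
the boundary between two subvolumes, which is the "two subvolumes in the worst
case" situation.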

Having said that, there are definitely cases where a single readdir(p) request
is serialized across many subvolumes. The best example of this is a readdir(p)
request on an empty directory. Other relevant examples are directories that
don't have enough dentries on each DHT subvolume to fill a single readdir(p)
buffer. This is where performance.parallel-readdir helps. Note that this is
also why making the cache-size of each readdir-ahead (loaded as a parent of
each DHT subvolume) much bigger than a single readdir(p) buffer size won't
really improve performance in proportion to cache-size when
performance.parallel-readdir is enabled.
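
The small-directory case can be sketched the same way. The Python below is a
rough model under stated assumptions, not a measurement: both function names
are hypothetical, the buffer is counted in dentries, every subvolume is assumed
to cost one identical round trip, and parallel prefetching is idealised as all
subvolumes answering concurrently.

```python
def subvols_consulted(dentries_per_subvol, buf_entries):
    """Count the subvolumes one readdir(p) from the start of a directory
    must consult, in order, before its buffer is full (or the directory ends).
    """
    filled = 0
    visited = 0
    for count in dentries_per_subvol:
        visited += 1
        filled += count
        if filled >= buf_entries:
            break
    return visited


def estimated_latency_ms(visited, rtt_ms, parallel):
    """Serialized: one round trip per subvolume, back to back.
    Parallel (idealised): the round trips overlap, so roughly one in total.
    """
    return rtt_ms if parallel else visited * rtt_ms


# Assumed 8-subvolume volume, 300 dentries per buffer, 2 ms per round trip.
empty = subvols_consulted([0] * 8, 300)     # empty directory
small = subvols_consulted([10] * 8, 300)    # 10 dentries on each subvolume
large = subvols_consulted([1000] * 8, 300)  # first subvolume fills the buffer
print(empty, small, large)
print(estimated_latency_ms(empty, 2, parallel=False),
      estimated_latency_ms(empty, 2, parallel=True))
```

Under these assumptions the empty and small directories both force a walk over
all eight subvolumes, while the large directory is answered by the first one,
so only the first two cases stand to gain from fetching in parallel.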

This is not a new observation [1] (I stumbled upon [1] after independently
realizing the above while working on performance.parallel-readdir). Still, I
felt this is a common misconception (I recently ran into a similar argument
while trying to explain the DHT architecture to someone new to Glusterfs), and
hence thought of writing a mail to clarify it.

[1] https://lists.gnu.org/archive/html/gluster-devel/2013-09/msg00034.html

regards,
Raghavendra

</snip>

-- 
You are receiving this mail because:
You are on the CC list for the bug.