[Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"

Raghavendra Gowdappa rgowdapp at redhat.com
Tue Nov 8 05:07:56 UTC 2016

Hi all,

We have an option called "cluster.readdir-optimize" which alters the behavior of readdirp in DHT. Its value affects how storage/posix treats dentries corresponding to directories (but not those corresponding to files).

When this value is on, 
* DHT asks only one subvol/brick to return dentries corresponding to directories.
* Other subvols/bricks filter dentries corresponding to directories and send only dentries corresponding to files.

When this value is off (this is the default value),
* All subvols return all dentries stored on them. IOW, bricks don't filter any dentries.
* Since a directory has one dentry representing it on each subvol, DHT (loaded on the client) picks up the dentry only from the hashed subvol and discards the rest.

Note that irrespective of the value of this option, _all_ subvols return the dentries corresponding to the files stored on them.
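
To make the two modes concrete, here is a rough sketch in Python. It is only a toy model of the behavior described above, not the DHT or storage/posix code. The function names, the tuple layout and the choice of subvol 0 as the one asked for directory dentries are all assumptions made for illustration.

```python
def brick_readdirp(entries, filter_dirs):
    """Return the dentries a brick sends; drop directory dentries if asked to filter."""
    return [(name, kind) for name, kind in entries
            if not (filter_dirs and kind == "dir")]


def dht_readdirp(bricks, hashed_subvol, readdir_optimize):
    """Merge per-brick results; return (listing, dentries sent over the wire)."""
    listing, sent = [], 0
    for idx, entries in enumerate(bricks):
        if readdir_optimize:
            # Assumption: subvol 0 is the one asked for directory dentries.
            returned = brick_readdirp(entries, filter_dirs=(idx != 0))
            listing.extend(returned)
        else:
            returned = brick_readdirp(entries, filter_dirs=False)
            # Client-side filtering: keep a directory only from its hashed subvol.
            listing.extend((name, kind) for name, kind in returned
                           if kind == "file" or hashed_subvol[name] == idx)
        sent += len(returned)
    return sorted(listing), sent


# Hypothetical 3-subvol volume: every directory exists on every brick,
# every file exists on exactly one brick.
bricks = [
    [("d1", "dir"), ("d2", "dir"), ("f1", "file")],
    [("d1", "dir"), ("d2", "dir"), ("f2", "file")],
    [("d1", "dir"), ("d2", "dir"), ("f3", "file")],
]
hashed_subvol = {"d1": 1, "d2": 2}

listing_off, sent_off = dht_readdirp(bricks, hashed_subvol, readdir_optimize=False)
listing_on, sent_on = dht_readdirp(bricks, hashed_subvol, readdir_optimize=True)
print(listing_off == listing_on, sent_off, sent_on)
```

In this toy data set both modes produce the same listing, but 9 dentries cross the wire with the option off and 5 with it on.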

This option was introduced to boost the performance of readdir. When it is set to on, filtering of dentries happens on the bricks, which reduces:
1. network traffic (the redundant directory dentries are filtered out before they are sent)
2. the number of readdir calls between client and server for the same number of dentries returned to the application (if filtering happens on the client, each response yields fewer usable dentries and hence more readdir calls are needed. IOW, the result buffer is not filled to maximum capacity).
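
A back-of-the-envelope sketch of point 2, in Python. It assumes a response buffer that holds a fixed number of dentries (the real buffer is sized in bytes, so this is a simplification) and looks at a single brick whose directory dentries are all redundant for the client. The function name and the numbers are made up for illustration.

```python
import math


def readdir_calls(num_files, num_dirs, buffer_capacity, brick_filters_dirs):
    """Calls needed to drain one brick when each response holds buffer_capacity dentries."""
    # With brick-side filtering, directory dentries never occupy buffer space.
    sent = num_files if brick_filters_dirs else num_files + num_dirs
    return math.ceil(sent / buffer_capacity)


# Hypothetical brick holding 1000 files and 1000 directory dentries.
calls_off = readdir_calls(1000, 1000, 100, brick_filters_dirs=False)
calls_on = readdir_calls(1000, 1000, 100, brick_filters_dirs=True)
print(calls_off, calls_on)
```

Under these assumptions the brick needs 20 calls with the option off and 10 with it on. The actual gain depends on the ratio of directories to files, which is why the data set details asked for below matter.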

We want to hear from you whether you've used this option and, if yes:
1. Did it really boost readdir performance?
2. Do you have any performance data that shows the percentage of improvement (or deterioration)?
3. What data set did you have (number of files, number of directories and organisation of directories)?

If we find out that this option is really helping you, we can spend our energies on fixing the issues that arise when it is set to on. One common issue is that, with this option on, some directories might not show up in the directory listing [1]. The reason for this is that:
1. If a directory can be created on its hashed subvol, mkdir (the result returned to the application) is successful, irrespective of the result of mkdir on the rest of the subvols.
2. So, whichever subvol we pick to give us the dentries corresponding to directories may not contain all the directories, and we might miss those directories in the listing.
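
The failure mode can be reproduced in a small Python model. Again, this is only a sketch under assumptions: bricks are modelled as sets of directory names, `mkdir` here is a hypothetical helper rather than the DHT implementation, and subvol 0 is assumed to be the one asked for directory dentries when the option is on.

```python
def mkdir(bricks, name, hashed_idx, failing):
    """Create `name` on every brick not in `failing`; report success to the application."""
    for idx, brick in enumerate(bricks):
        if idx not in failing:
            brick.add(name)
    # Success depends only on the hashed subvol.
    return hashed_idx not in failing


bricks = [set(), set(), set()]
hashed_subvol = {"d1": 1}

# mkdir fails on subvol 0 but succeeds on the hashed subvol (1).
ok = mkdir(bricks, "d1", hashed_idx=hashed_subvol["d1"], failing={0})

# Option off: the client keeps each directory from its hashed subvol.
dirs_off = sorted(name for idx, brick in enumerate(bricks)
                  for name in brick if hashed_subvol[name] == idx)

# Option on: directories come only from the single subvol asked (assumed: 0).
dirs_on = sorted(bricks[0])

print(ok, dirs_off, dirs_on)
```

Here mkdir reports success, the listing with the option off contains d1, and the listing with the option on is empty.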

Your feedback is important to us and will help us prioritize and improve things.

[1] https://www.gluster.org/pipermail/gluster-users/2016-October/028703.html


