[Gluster-devel] [Gluster-users] Feedback on DHT option "cluster.readdir-optimize"

Raghavendra G raghavendra at gluster.com
Thu Nov 10 07:21:53 UTC 2016


Kyle,

Thanks for your response :). This really helps. Going from 13m to 23s
is a huge improvement.

regards,
Raghavendra

On Tue, Nov 8, 2016 at 8:21 PM, Kyle Johnson <kjohnson at gnulnx.net> wrote:

> Hey there,
>
> We have a number of processes which daily walk our entire directory tree
> and perform operations on the found files.
>
> Pre-gluster, these processes were able to complete within 24 hours of
> starting.  After outgrowing that single server and moving to a gluster
> setup (two bricks, two servers, distribute, 10gig uplink), the processes
> became unusable.
>
> After turning this option on, we were back to normal run times, with the
> process completing within 24 hours.
>
> Our data is heavily nested in a large number of subfolders under /media/ftp.
>
> A subset of our data:
>
> 15T of files in 48163 directories under /media/ftp/dig_dis.
>
> Without readdir-optimize:
>
> [root at colossus dig_dis]# time ls|wc -l
> 48163
>
> real    13m1.582s
> user    0m0.294s
> sys     0m0.205s
>
>
> With readdir-optimize:
>
> [root at colossus dig_dis]# time ls | wc -l
> 48163
>
> real    0m23.785s
> user    0m0.296s
> sys     0m0.108s
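
For reference, the speedup implied by the two "real" timings above can be worked out as below. This is only arithmetic on the numbers quoted in this mail; the helper name parse_real is made up for illustration.

```python
import re


def parse_real(value):
    """Convert a `time` style value such as '13m1.582s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


before = parse_real("13m1.582s")   # without readdir-optimize
after = parse_real("0m23.785s")    # with readdir-optimize
speedup = before / after

print(f"{before:.3f}s -> {after:.3f}s, about {speedup:.1f}x faster")
```

That is roughly a 33x reduction in wall-clock time for this one directory listing.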
>
>
> Long story short - this option is super important to me as it resolved an
> issue that would have otherwise made me move my data off of gluster.
>
>
> Thank you for all of your work,
>
> Kyle
>
>
>
>
>
> On 11/07/2016 10:07 PM, Raghavendra Gowdappa wrote:
>
>> Hi all,
>>
>> We have an option called "cluster.readdir-optimize" which alters the
>> behavior of readdirp in DHT. This value affects how storage/posix treats
>> dentries corresponding to directories (not files).
>>
>> When this value is on,
>> * DHT asks only one subvol/brick to return dentries corresponding to
>> directories.
>> * Other subvols/bricks filter dentries corresponding to directories and
>> send only dentries corresponding to files.
>>
>> When this value is off (this is the default value),
>> * All subvols return all dentries stored on them. IOW, bricks don't
>> filter any dentries.
>> * Since a directory has one dentry representing it on each subvol, dht
>> (loaded on client) picks up dentry only from hashed subvol.
>>
>> Note that irrespective of value of this option, _all_ subvols return
>> dentries corresponding to files which are stored on them.
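
The on/off behaviour described above can be illustrated with a small toy model. This is a sketch only, not Gluster code: the names Dentry, brick_readdirp and dht_listing are hypothetical, and the choice of subvol 0 as the one that returns directories (dir_source) is an assumption, since the description only says that one subvol is asked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dentry:
    name: str
    is_dir: bool


def brick_readdirp(dentries, filter_dirs):
    """Dentries a brick sends back; optionally drop directory dentries."""
    if filter_dirs:
        return [d for d in dentries if not d.is_dir]
    return list(dentries)


def dht_listing(bricks, hashed, optimize, dir_source=0):
    """Merge per-brick results into the listing the application sees.

    bricks: one list of Dentry per subvol
    hashed: directory name -> index of its hashed subvol
    Returns (sorted names, total dentries sent over the wire).
    """
    names = []
    sent = 0
    for idx, dentries in enumerate(bricks):
        # Option on: every brick except dir_source filters directories.
        filter_dirs = optimize and idx != dir_source
        result = brick_readdirp(dentries, filter_dirs)
        sent += len(result)
        for d in result:
            # Option off: the client keeps a directory only from its hashed subvol.
            if d.is_dir and not optimize and hashed[d.name] != idx:
                continue
            names.append(d.name)
    return sorted(names), sent


# Two subvols: every directory exists on both, each file on exactly one.
bricks = [
    [Dentry("a", True), Dentry("b", True), Dentry("f1", False)],
    [Dentry("a", True), Dentry("b", True), Dentry("f2", False)],
]
hashed = {"a": 0, "b": 1}

listing_off, sent_off = dht_listing(bricks, hashed, optimize=False)
listing_on, sent_on = dht_listing(bricks, hashed, optimize=True)
print(listing_off, sent_off)
print(listing_on, sent_on)
```

In this model both settings give the application the same listing, but with the option on fewer dentries cross the network, because the duplicate directory dentries are dropped on the bricks.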
>>
>> This option was introduced to boost readdir performance: when it is set
>> to on, dentries are filtered on the bricks, which reduces:
>> 1. network traffic (the redundant dentry information is filtered out)
>> 2. the number of readdir calls between client and server for the same
>> number of dentries returned to the application (if filtering happens on
>> the client, fewer dentries survive in each result and hence more readdir
>> calls are needed; IOW, the result buffer is not filled to maximum
>> capacity).
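
The second point can be made concrete with a rough count of round trips. This is a simplified sketch under stated assumptions: each readdir call carries at most buf_entries dentries, every directory has a dentry on every subvol, subvol 0 is the one that returns directories when the option is on, and the numbers are invented for illustration. The function name readdir_calls is hypothetical.

```python
import math


def readdir_calls(files_per_brick, num_dirs, buf_entries, optimize):
    """Estimate client<->brick readdir calls needed to list one directory."""
    calls = 0
    for idx, nfiles in enumerate(files_per_brick):
        # Option off: every brick sends its directory dentries.
        # Option on: only brick 0 (assumed) sends them.
        sends_dirs = (not optimize) or idx == 0
        entries = nfiles + (num_dirs if sends_dirs else 0)
        # At least one call is needed even if the brick has nothing to send.
        calls += max(1, math.ceil(entries / buf_entries))
    return calls


calls_off = readdir_calls([50, 50], num_dirs=1000, buf_entries=100, optimize=False)
calls_on = readdir_calls([50, 50], num_dirs=1000, buf_entries=100, optimize=True)
print(calls_off, calls_on)
```

With these made-up numbers the count drops from 22 calls to 12, and the saving grows with the number of subvols, since each additional subvol otherwise resends every directory dentry.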
>>
>> We want to hear from you whether you've used this option and, if yes,
>> 1. Did it really boost readdir performance?
>> 2. Do you have any performance data showing the percentage of
>> improvement (or deterioration)?
>> 3. What data set did you have (number of files, directories and
>> organisation of directories)?
>>
>> If we find out that this option is really helping you, we can spend our
>> energies on fixing issues that will arise when this option is set to on.
>> One common issue with turning this option on is that when this option is
>> set, some directories might not show up in directory listing [1]. The
>> reason for this is that:
>> 1. If a directory can be created on a hashed subvol, mkdir (result to
>> application) will be successful, irrespective of result of mkdir on rest of
>> the subvols.
>> 2. So, any subvol we pick to give us dentries corresponding to directory
>> need not contain all the directories and we might miss out those
>> directories in listing.
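
The missing-directory case described in the two points above can be reproduced in a toy model. Again this is only a sketch, not how DHT is implemented: bricks are modelled as sets of directory names, and mkdir, list_dirs_default and list_dirs_optimized are hypothetical helpers.

```python
def mkdir(bricks, name, hashed_idx, failing=()):
    """Create `name` on every brick not in `failing`.

    Success is reported to the application as long as the hashed subvol
    succeeds, irrespective of the result on the other subvols.
    """
    if hashed_idx in failing:
        return False
    for idx, brick in enumerate(bricks):
        if idx not in failing:
            brick.add(name)
    return True


def list_dirs_default(bricks, hashed):
    """Option off: a directory is taken from its hashed subvol."""
    return sorted(n for n, idx in hashed.items() if n in bricks[idx])


def list_dirs_optimized(bricks, dir_source):
    """Option on: only one subvol is asked for directory dentries."""
    return sorted(bricks[dir_source])


bricks = [set(), set()]
hashed = {"a": 1}

# mkdir fails on subvol 0 but succeeds on the hashed subvol 1.
ok = mkdir(bricks, "a", hashed_idx=1, failing={0})

print(ok)
print(list_dirs_default(bricks, hashed))
print(list_dirs_optimized(bricks, dir_source=0))
```

Here mkdir is reported as successful and the default listing shows the directory, but a listing that takes directory dentries only from subvol 0 misses it.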
>>
>> Your feedback is important for us and will help us to prioritize and
>> improve things.
>>
>> [1] https://www.gluster.org/pipermail/gluster-users/2016-October/028703.html
>>
>> regards,
>> Raghavendra
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Raghavendra G