[Gluster-devel] [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts
Hu Bert
revirii at googlemail.com
Wed Feb 13 07:43:03 UTC 2019
fyi: we have 3 servers, each with 2 SW RAID10 arrays used as bricks in a
replica 3 setup (so 2 volumes); the default values set by the OS (Debian
Stretch) are:
/dev/md3
Array Size : 29298911232 (27941.62 GiB 30002.09 GB)
/sys/block/md3/queue/read_ahead_kb : 3027
/dev/md4
Array Size : 19532607488 (18627.75 GiB 20001.39 GB)
/sys/block/md4/queue/read_ahead_kb : 2048
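These per-device values can be read and changed through sysfs; a quick sketch (the md device names follow the listing above, writes need root, and changes do not survive a reboot):

```shell
# Read the current read-ahead window, in KB, for an md device
cat /sys/block/md3/queue/read_ahead_kb

# Raise it to 4 MB for an experiment; this is not persistent
echo 4096 > /sys/block/md3/queue/read_ahead_kb

# blockdev exposes the same knob in 512-byte sectors,
# so 4096 KB corresponds to 8192 sectors
blockdev --getra /dev/md3
```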
maybe that helps somehow :)
Hubert
On Wed, Feb 13, 2019 at 06:46, Manoj Pillai <mpillai at redhat.com> wrote:
>
>
>
> On Wed, Feb 13, 2019 at 10:51 AM Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>>
>>
>>
>> On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>>>
>>> All,
>>>
>>> We've found that the perf xlators io-cache and read-ahead add no measurable performance improvement. At best, read-ahead is redundant given kernel read-ahead,
>>
>>
>> One thing we are still figuring out is whether kernel read-ahead is tunable. From what we've explored, it _looks_ like (though this may not be entirely correct) ra is capped at 128KB. If that's the case, I am interested in a few things:
>> * Are there any real-world applications/use cases that would benefit from larger read-ahead (Manoj says block devices can do ra of 4MB)?
>
>
> kernel read-ahead is adaptive but influenced by the read-ahead setting on the block device (/sys/block/<dev>/queue/read_ahead_kb), which can be tuned. For RHEL specifically, the default is 128KB (last I checked), but the default RHEL tuned profile, throughput-performance, bumps that up to 4MB. It should be fairly easy to rig up a test where 4MB read-ahead on the block device gives better performance than 128KB read-ahead.
>
> -- Manoj
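Such a comparison could be rigged up along these lines (a sketch only: it assumes fio is installed, root privileges, and that `sdX` stands in for a real scratch block device):

```shell
#!/bin/sh
# Sketch: sequential buffered reads at two read-ahead settings.
# DEV=sdX is a placeholder; requires root and fio.
DEV=sdX
for ra_kb in 128 4096; do
    echo "$ra_kb" > "/sys/block/$DEV/queue/read_ahead_kb"
    sync; echo 3 > /proc/sys/vm/drop_caches   # start each run cold
    fio --name="seqread-ra${ra_kb}" --filename="/dev/$DEV" \
        --rw=read --bs=1M --size=1G --ioengine=psync --direct=0
done
```

With buffered sequential reads, the 4MB run should show noticeably higher throughput if the device benefits from deeper prefetch.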
>
>> * Is the limit on the kernel ra tunable a hard one? IOW, what does it take to make it do higher ra? If it's difficult, can glusterfs read-ahead provide the expected performance improvement for those applications that would benefit from aggressive ra (as glusterfs can support larger ra sizes)?
>>
>> I am still inclined to prefer kernel ra, as I think it's more intelligent and can identify more sequential patterns than Glusterfs read-ahead [1][2].
>> [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf
>> [2] https://lwn.net/Articles/155510/
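On the question of whether kernel ra is tunable for a fuse mount: each mount has its own backing_dev_info (bdi), whose read-ahead window appears adjustable via sysfs. A sketch (the mount path is an example, and whether fuse honours values above 128KB may depend on the kernel version):

```shell
# A fuse mount's bdi is identified by the mount's device number
MNT=/mnt/glusterfs
BDI=$(mountpoint -d "$MNT")        # prints e.g. 0:42
cat "/sys/class/bdi/$BDI/read_ahead_kb"
echo 4096 > "/sys/class/bdi/$BDI/read_ahead_kb"   # experiment only
```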
>>
>>> and at worst io-cache degrades performance for workloads that don't involve re-reads. Given that the VFS already provides both of these functionalities, I am proposing that these two translators be turned off by default for native fuse mounts.
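For anyone who wants to try this ahead of any default change, both xlators can already be switched off per volume with the standard CLI (`<volname>` is a placeholder):

```shell
# Turn off the client-side read-ahead and io-cache xlators for one volume
gluster volume set <volname> performance.read-ahead off
gluster volume set <volname> performance.io-cache off

# Verify the current values
gluster volume get <volname> performance.read-ahead
gluster volume get <volname> performance.io-cache
```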
>>>
>>> For non-fuse access like gfapi (NFS-Ganesha/Samba) we can keep these xlators on via custom profiles. Comments?
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029
>>>
>>> regards,
>>> Raghavendra
>