[Gluster-devel] Changing the relative order of read-ahead and open-behind

Raghavendra G raghavendra at gluster.com
Tue Jul 25 09:08:39 UTC 2017


On Tue, Jul 25, 2017 at 10:39 AM, Amar Tumballi <atumball at redhat.com> wrote:

>
>
> On Tue, Jul 25, 2017 at 9:33 AM, Raghavendra Gowdappa <rgowdapp at redhat.com>
> wrote:
>
>>
>>
>> ----- Original Message -----
>> > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>> > To: "Raghavendra G" <raghavendra at gluster.com>
>> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>> > Sent: Tuesday, July 25, 2017 7:51:07 AM
>> > Subject: Re: [Gluster-devel] Changing the relative order of read-ahead and open-behind
>> >
>> >
>> >
>> > On Mon, Jul 24, 2017 at 5:11 PM, Raghavendra G <raghavendra at gluster.com> wrote:
>> >
>> >
>> >
>> >
>> >
>> > On Fri, Jul 21, 2017 at 6:39 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>> >
>> >
>> >
>> >
>> > On Fri, Jul 21, 2017 at 3:26 AM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>> >
>> >
>> > Hi all,
>> >
>> > We have a bug [1] due to which read-ahead is completely disabled when the
>> > workload is read-only. One easy fix is to make read-ahead an ancestor of
>> > open-behind in the xlator graph (currently it is a descendant). Rafi has
>> > sent out a patch to do the same. As noted in one of the comments, one
>> > downside of this solution is that small files (which are eligible to be
>> > cached by quick-read) are cached twice, once each in read-ahead and
>> > quick-read, wasting precious memory. However, there are no other simpler
>> > solutions for this issue. If you have concerns about the approach followed
>> > by [2] or have other suggestions, please voice them. Otherwise, I am
>> > planning to merge [2] for lack of better alternatives.
>> >
>> >
>> > Since the maximum size of files cached by quick-read is 64KB, can we have
>> > read-ahead kick in for offsets greater than 64KB?
>> >
>> > I got your point. We can enable read-ahead only for files whose size is
>> > greater than the size eligible for caching by quick-read. IOW, read-ahead
>> > gets disabled if the file size is less than 64KB. Thanks for the suggestion.
>> >
>> > I added a comment on the patch to move the xlators in the reverse of the
>> > way the patch currently does. I think Milind implemented it. Will that
>> > lead to any problem?
>>
>> From gerrit:
>>
>> <comment>
>>
>> It fixes the issue too, and it is a better solution than the current one
>> as it doesn't run into the duplicate cache problem. The reason open-behind was
>> loaded as an ancestor of quick-read was that it seemed unnecessary for
>> quick-read to even witness an open. However,
>>
>>    * Looking into the code, qr_open does set a priority for the inode,
>> which is used when purging the cache after the cache limit is exceeded.
>> So, it helps quick-read to witness an open.
>>    * The real benefit of open-behind is avoiding fops over the network. So,
>> as long as open-behind is loaded in the client stack, we reap its benefits.
>>    * Also note that if the option "read-after-open" is set in open-behind, an
>> open is done over the network anyway, irrespective of whether quick-read has
>> cached the file, which to me looks unnecessary. By moving open-behind to be a
>> descendant of quick-read, open-behind won't even witness a read when the
>> file is cached by quick-read. But if the read-after-open option was implemented
>> in open-behind with the goal of fixing POSIX non-compliance in the case where
>> a file with an open fd is unlinked, we might regress. But again, even this
>> approach doesn't fix the compliance problem completely. One has to turn
>> open-behind off to be completely POSIX compliant in this scenario.
>>
>> Given the reasons above, simply moving open-behind to be a descendant
>> of read-ahead helps.
>>
>> </comment>
>>
>>
> Analysis looks good. But I would like us (all developers) to back up
> theories like this with some data.
>

> How about you plan a test case which can demonstrate the difference?
>

What is the scenario you want to measure here?
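
To make the orderings under discussion concrete, here is a toy Python model
(not GlusterFS code). It assumes that read-ahead is only active when it
witnesses the open, that open-behind absorbs the open instead of passing it
down, and that a read served from the quick-read cache never reaches the
xlators below it. The function names and the three stack lists are
hypothetical; only the xlator names and the 64KB limit come from this thread.

```python
QUICK_READ_MAX = 64 * 1024  # quick-read size limit mentioned in this thread


def witnesses_open(stack, xlator):
    """True if `xlator` sits above open-behind (stack is listed top to bottom).

    Assumption: open-behind absorbs the open, so only its ancestors see it.
    """
    if "open-behind" not in stack:
        return True
    return stack.index(xlator) < stack.index("open-behind")


def caching_xlators(stack, file_size):
    """Return the xlators that end up caching data for one file in this model."""
    cachers = []
    for name in stack:  # walk the client stack from top to bottom
        if name == "quick-read" and file_size <= QUICK_READ_MAX:
            cachers.append(name)
            break  # served from quick-read; the read goes no further down
        if name == "read-ahead" and witnesses_open(stack, name):
            cachers.append(name)
    return cachers


# Hypothetical, simplified stacks (top to bottom) for the three cases.
ORDERINGS = {
    "current": ["open-behind", "quick-read", "read-ahead"],
    "read-ahead moved up": ["read-ahead", "open-behind", "quick-read"],
    "open-behind moved down": ["quick-read", "read-ahead", "open-behind"],
}

if __name__ == "__main__":
    for label, stack in ORDERINGS.items():
        small = caching_xlators(stack, 4 * 1024)
        large = caching_xlators(stack, 1024 * 1024)
        print(f"{label}: small file -> {small}, large file -> {large}")
```

In this model the current order never activates read-ahead, moving read-ahead
up caches small files in both read-ahead and quick-read, and moving
open-behind down caches each file in exactly one place. A real measurement
would have to confirm whether the actual xlators behave this way.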


> I will help you set up metrics measurement with graphs [1] on the experimental
> branch [2] to actually measure and graphically represent the hypothesis.
>
> We can set this as an example for anyone in the future to try permutations
> and combinations of different xlator orders. Who knows, we may find that
> different orders suit different workloads.
>
> Regards,
> Amar
>
> [1] - https://github.com/amarts/glustermetrics
> [2] - https://github.com/gluster/glusterfs/tree/experimental
>
>> > Thanks,
>> > Vijay
>> >
>> > _______________________________________________
>> > Gluster-devel mailing list
>> > Gluster-devel at gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-devel
>> >
>> >
>> >
>> > --
>> > Raghavendra G
>> >
>> > --
>> > Pranith
>> >
>
>
>
> --
> Amar Tumballi (amarts)
>
>



-- 
Raghavendra G