[Gluster-devel] Changing the relative order of read-ahead and open-behind

Amar Tumballi atumball at redhat.com
Tue Jul 25 05:09:55 UTC 2017


On Tue, Jul 25, 2017 at 9:33 AM, Raghavendra Gowdappa <rgowdapp at redhat.com>
wrote:

>
>
> ----- Original Message -----
> > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > To: "Raghavendra G" <raghavendra at gluster.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Tuesday, July 25, 2017 7:51:07 AM
> > Subject: Re: [Gluster-devel] Changing the relative order of read-ahead
> and    open-behind
> >
> >
> >
> > On Mon, Jul 24, 2017 at 5:11 PM, Raghavendra G < raghavendra at gluster.com
> >
> > wrote:
> >
> > On Fri, Jul 21, 2017 at 6:39 PM, Vijay Bellur < vbellur at redhat.com >
> wrote:
> >
> > On Fri, Jul 21, 2017 at 3:26 AM, Raghavendra Gowdappa <
> rgowdapp at redhat.com >
> > wrote:
> >
> >
> > Hi all,
> >
> > We have a bug [1] due to which read-ahead is completely disabled when the
> > workload is read-only. One easy fix was to make read-ahead an
> > ancestor of open-behind in the xlator graph (currently it is a descendant).
> > A patch has been sent out by Rafi to do the same. As noted in one of the
> > comments, one flip side of this solution is that small files (which are
> > eligible to be cached by quick-read) are cached twice - once each in
> > read-ahead and quick-read - wasting precious memory. However, there are
> > no simpler solutions for this issue. If you have concerns about the
> > approach followed by [2] or have other suggestions, please voice them.
> > Otherwise, I am planning to merge [2] for lack of a better alternative.
> >
> >
> > Since the maximum size of files cached by quick-read is 64KB, can we have
> > read-ahead kick in for offsets greater than 64KB?
> >
> > I got your point. We can enable read-ahead only for files whose size is
> > greater than the size eligible for caching by quick-read. IOW, read-ahead
> > gets disabled if the file size is less than 64KB. Thanks for the suggestion.
> >
> > I added a comment on the patch suggesting that the xlators be moved in the
> > reverse order of what the patch currently does. I think Milind implemented
> > it. Will that lead to any problem?
>
> From gerrit:
>
> <comment>
>
> It fixes the issue too, and it is a better solution than the current one as
> it doesn't run into the duplicate-cache problem. The reason open-behind was
> loaded as an ancestor of quick-read was that it seemed unnecessary for
> quick-read to even witness an open. However:
>
>    * Looking into the code, qr_open does set a priority on the inode, which
> is used when purging the cache after the cache limit is exceeded. So it
> does help quick-read to witness an open.
>    * The real benefit of open-behind is avoiding fops over the network. So,
> as long as open-behind is loaded in the client stack, we reap its benefits.
>    * Also note that if the option "read-after-open" is set in open-behind,
> an open is done over the network anyway, irrespective of whether quick-read
> has cached the file, which looks unnecessary to me. By moving open-behind
> to be a descendant of quick-read, open-behind won't even witness a read
> when the file is cached by quick-read. But if the read-after-open option
> was implemented in open-behind with the goal of fixing non-POSIX-compliant
> behaviour when an open fd on an unlinked file is used, we might regress.
> Then again, even this approach doesn't fix the compliance problem
> completely; one has to turn open-behind off to be fully POSIX compliant in
> this scenario.
>
> Given the reasons above, simply moving open-behind to be a descendant of
> read-ahead helps.
>
> </comment>
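For concreteness, the ordering argued for in the comment (quick-read above read-ahead above open-behind) would look roughly like the client volfile fragment below. This is a sketch: the volume names are illustrative, and unrelated xlators are omitted.

```
volume test-open-behind
    type performance/open-behind
    subvolumes test-client-0
end-volume

volume test-read-ahead
    type performance/read-ahead
    subvolumes test-open-behind
end-volume

volume test-quick-read
    type performance/quick-read
    subvolumes test-read-ahead
end-volume
```

With this ordering, a read on a file already cached by quick-read is answered before it ever reaches open-behind, while reads on larger files still benefit from read-ahead.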
>
>
Analysis looks good. But I would like us (all developers) to back up
theories like this with some data.

How about planning a test case that can demonstrate the difference? I will
help you set up metrics measurement with graphs [1] on the experimental
branch [2] to actually measure and graphically represent the hypothesis.

We can set this as an example for anyone who wants to try permutations and
combinations of different xlator orders in the future. Who knows, we may
find that different orders suit different workloads.

Regards,
Amar

[1] - https://github.com/amarts/glustermetrics
[2] - https://github.com/gluster/glusterfs/tree/experimental

> > Thanks,
> > Vijay
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> >
> >
> > --
> > Raghavendra G
> >
> >
> >
> >
> > --
> > Pranith
> >



-- 
Amar Tumballi (amarts)