[Gluster-devel] What's the correct way to enable direct-IO?

Raghavendra Gowdappa rgowdapp at redhat.com
Thu Feb 25 09:06:01 UTC 2016


> > >
> > > This is great info - with a lot of options to take in :)
> > >
> > > To summarise, to enable direct-io and bypass the kernel filesystem cache
> > > for a volume:
> > > 1. Mount the volume with the direct-io-mode=enable option
> > > 2. run vol set <vol> performance.strict-o-direct on
> >
> > Would individual files be opened with O_DIRECT? If yes, this is
> > sufficient. But if we want to do that for all files, the only way to do
> > this is to disable write-behind.
> >
> 
> File opens will not be with O_DIRECT, BUT if I understand your earlier
> comments correctly, the update to add o-direct to storage/posix will
> force O_DIRECT on open/create calls - which would mean vol set <bla>
> performance.write-behind off would not be required.

The order of components through which a write request travels is:

application -> kernel -> glusterfs-client-process (write-behind is loaded here) -> glusterfs-brick-process (storage-posix is loaded here).

So, setting o-direct in storage/posix (which is _after_ a write request has left write-behind) has no impact on the caching behavior of write-behind.
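
As a rough, hypothetical illustration (volume name, subvolume names and paths are made up, and the generated volfile layout varies by version), write-behind appears only in the client-side volume file while storage/posix appears only in the brick-side one:

    # client-side graph, runs in the glusterfs client process
    volume testvol-write-behind
        type performance/write-behind
        subvolumes testvol-dht
    end-volume

    # brick-side graph, runs in the glusterfsd brick process
    volume testvol-posix
        type storage/posix
        option directory /bricks/brick1/testvol
    end-volume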

> 
> 
> >
> > > 3. update the vol files with 'o-direct' option in storage/posix (at least
> > > for now)
> >
> > 4. Quick read should be made aware of O_DIRECT (it isn't as of now).
> > Again this is on the read-path. If you don't want O_DIRECT semantics for
> > the read path (is there one?), this is fine.
> >
> > performance.quick-read is off for the virt use case.
> 
> I'll try and test this out tomorrow.
> 
> Thanks!
> 
> 
> > >
> > > Is that right?
> > >
> > >
> > >
> > > On Thu, Feb 25, 2016 at 5:56 PM, Raghavendra Gowdappa <rgowdapp at redhat.com>
> > > wrote:
> > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
> > > > > To: "Gluster Devel" <gluster-devel at gluster.org>, "Raghavendra Gowdappa"
> > > > > <rgowdapp at redhat.com>
> > > > > Cc: "Paul Cuzner" <pcuzner at redhat.com>
> > > > > Sent: Thursday, February 25, 2016 7:28:30 AM
> > > > > Subject: What's the correct way to enable direct-IO?
> > > > >
> > > > > Hi,
> > > > >
> > > > > git-grep tells me there are multiple options in our code base for
> > > > > enabling direct-IO on a gluster volume, at several layers in the
> > > > > translator stack:
> > > > > i) use the mount option 'direct-io-mode=enable'
> > > >
> > > > This option is between the kernel and glusterfs. Specifically, it asks the
> > > > fuse kernel module to bypass the page-cache. Note that when this option is
> > > > set, direct-io is enabled for _all_ fds irrespective of whether applications
> > > > have used O_DIRECT in their open/create calls or not.
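> > > >
> > > > For example, a client mount with this option would look something like this
> > > > (hostname, volume name and mount point are placeholders):
> > > >
> > > >     mount -t glusterfs -o direct-io-mode=enable server1:/testvol /mnt/testvol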
> > > >
> > > > > ii) enable 'network.remote-dio' which is a protocol/client option using
> > > > > volume set command
> > > >
> > > > This is an option introduced by [1] to _filter_ O_DIRECT flags in
> > > > open/create calls before sending those requests to the server. The option
> > > > name is misleading here. However, please note that this is the key (alias?)
> > > > used by glusterd. The exact option name used by protocol/client is
> > > > "filter_O_DIRECT" and it's fine. Probably we should file a bug on glusterd
> > > > to change the name?
> > > >
> > > > Coming to your use case, we don't want to filter O_DIRECT from reaching
> > > > the brick. Hence, we need to set this option to _off_ (by default it's
> > > > disabled).
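> > > >
> > > > To state that explicitly anyway, something like the following should do
> > > > (volume name is a placeholder):
> > > >
> > > >     gluster volume set testvol network.remote-dio off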
> > > >
> > > > I am still not sure what the relevance of this option is to the bug it was
> > > > introduced for. If we need direct-io, we have to pass it to the brick too,
> > > > so that the backend fs on the brick is configured appropriately.
> > > >
> > > > [1] http://review.gluster.org/4206
> > > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=845213
> > > >
> > > > > iii) enable performance.strict-o-direct which is a
> > > > > performance/write-behind option using volume-set command
> > > >
> > > > Yes, write-behind honours O_DIRECT only if this option is set. So, we need
> > > > to enable this for your use-case. Also, note that applications still need
> > > > to use O_DIRECT in open/create calls.
> > > >
> > > > To summarize, the following are the ways to bypass the write-behind cache:
> > > > 1. disable write-behind :).
> > > > 2. applications use O_SYNC/O_DSYNC in open calls
> > > > 3. enable performance.strict-o-direct _and_ applications should use
> > > > O_DIRECT in open/create calls.
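> > > >
> > > > For option 3, a rough sketch (volume name, mount point and file name are
> > > > placeholders; dd's oflag=direct makes it open the target with O_DIRECT):
> > > >
> > > >     gluster volume set testvol performance.strict-o-direct on
> > > >     # a sample application-level O_DIRECT write, done via dd
> > > >     dd if=/dev/zero of=/mnt/testvol/file.img bs=1M count=16 oflag=direct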
> > > >
> > > > > iv) use the 'o-direct' option in storage/posix; attempting volume-set on it
> > > > > reports that the option doesn't exist.
> > > >
> > > > The option exists in storage/posix. But there is no way to set it through
> > > > the cli (probably you can send a patch to do that if necessary). With this
> > > > option, O_DIRECT is passed with _every_ open/create call on the brick.
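> > > >
> > > > A hand-edited brick volfile stanza would look roughly like this (a sketch
> > > > only - names and paths are placeholders, and the generated volfile layout
> > > > varies by version):
> > > >
> > > >     volume testvol-posix
> > > >         type storage/posix
> > > >         option directory /bricks/brick1/testvol
> > > >         # pass O_DIRECT on every open/create done by the brick
> > > >         option o-direct on
> > > >     end-volume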
> > > >
> > > > >
> > > > > So then the question is - what is a surefire way to get direct-io-like
> > > > > behavior on gluster volume(s)?
> > > >
> > > > There is no one global option. You need to configure various translators
> > > > in the stack. Probably [2] was asking for such a feature. Also, as you
> > > > might've noticed above, the behavior/interpretation of these options is not
> > > > the same across all translators (some are global and some are local only
> > > > to an fd, etc.).
> > > >
> > > > Also note that apart from the options you listed above,
> > > > 1. Quick-read is not aware of O_DIRECT. We need to make it disable caching
> > > > if the open happens with O_DIRECT (see the example after this list).
> > > > 2. Handling of Quota Marker xattrs is not synchronous (though not exactly
> > > > an O_DIRECT requirement) as marking is done after sending the reply to
> > > > calls like writev.
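> > > >
> > > > For point 1, the practical workaround for now is what the virt use case
> > > > already does - keep quick-read disabled (volume name is a placeholder):
> > > >
> > > >     gluster volume set testvol performance.quick-read off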
> > > >
> > > > On a related note, I found article [3] to be informative.
> > > >
> > > > [1] http://review.gluster.org/4206
> > > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=845213
> > > > [3] https://lwn.net/Articles/457667/
> > > >
> > > > regards,
> > > > Raghavendra.
> > > >
> > >
> >
> 

