[Gluster-devel] What's the correct way to enable direct-IO?

Paul Cuzner pcuzner at redhat.com
Thu Feb 25 05:21:17 UTC 2016

This is great info - with a lot of options to take in :)

To summarise, to enable direct-io and bypass the kernel filesystem cache
for a volume
1. Mount the volume on the client with the direct-io-mode=enable option
2. Run gluster volume set <vol> performance.strict-o-direct on
3. Update the vol files with the 'o-direct' option in storage/posix (at
least for now)
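
Steps 1 and 2 could be sketched roughly as below. This assumes step 1 means the
glusterfs client (FUSE) mount, since the quoted reply describes that option as
sitting between the kernel and glusterfs. The names "myvol", "server1" and
"/mnt/myvol" are placeholders of mine, and the helper only builds the command
lines without running anything. Step 3 is left out because, per the reply,
there is no cli for it.

```python
import shlex


def direct_io_commands(volume, server, mountpoint):
    """Return the commands for steps 1 and 2 as argv lists (nothing is executed)."""
    return [
        # Step 1: client mount asking the fuse kernel module to bypass the page cache.
        ["mount", "-t", "glusterfs", "-o", "direct-io-mode=enable",
         f"{server}:/{volume}", mountpoint],
        # Step 2: make write-behind honour O_DIRECT.
        ["gluster", "volume", "set", volume, "performance.strict-o-direct", "on"],
    ]


# Placeholder names, for illustration only.
for argv in direct_io_commands("myvol", "server1", "/mnt/myvol"):
    print(shlex.join(argv))
```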

Is that right?

On Thu, Feb 25, 2016 at 5:56 PM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:

> ----- Original Message -----
> > From: "Krutika Dhananjay" <kdhananj at redhat.com>
> > To: "Gluster Devel" <gluster-devel at gluster.org>, "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> > Cc: "Paul Cuzner" <pcuzner at redhat.com>
> > Sent: Thursday, February 25, 2016 7:28:30 AM
> > Subject: What's the correct way to enable direct-IO?
> >
> > Hi,
> >
> > git-grep tells me there are multiple options in our code base for enabling
> > direct-IO on a gluster volume, at several layers in the translator stack:
> > i) use the mount option 'direct-io-mode=enable'
> This option sits between the kernel and glusterfs. Specifically, it asks the
> fuse kernel module to bypass the page-cache. Note that when this option is
> set, direct-io is enabled for _all_ fds, irrespective of whether applications
> have used O_DIRECT in their open/create calls or not.
> > ii) enable 'network.remote-dio' which is a protocol/client option using
> > volume set command
> This option was introduced by [1] to _filter_ O_DIRECT flags out of
> open/create calls before those requests are sent to the server. The option
> name is misleading here. However, please note that this is the key (alias?)
> used by glusterd. The exact option name used by protocol/client is
> "filter_O_DIRECT", and it's fine. Probably we should file a bug on glusterd
> to change the name?
> Coming to your use case, we don't want to filter O_DIRECT from reaching
> the brick. Hence, we need to set this option to _off_ (it's disabled by
> default).
> I am still not sure how this option is relevant to the bug for which it
> was introduced. If we need direct-io, we have to pass it to the brick too,
> so that the backend fs on the brick is configured appropriately.
> > iii) enable performance.strict-o-direct which is a performance/write-behind
> > option using volume-set command
> Yes, write-behind honours O_DIRECT only if this option is set. So, we need
> to enable this for your use-case. Also, note that applications still need
> to use O_DIRECT in open/create calls.
> To summarize, the following are the ways to bypass the write-behind cache:
> 1. disable write-behind :).
> 2. applications use O_SYNC/O_DSYNC in open calls
> 3. enable performance.strict-o-direct _and_ applications should use
> O_DIRECT in open/create calls.
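
On the application side, ways 2 and 3 could look roughly like the sketch
below. It tries O_DIRECT first and falls back to O_DSYNC if the filesystem
refuses O_DIRECT with EINVAL. The helper name, the 4096-byte block size and
the zero-padding of the payload are assumptions of mine for illustration;
the real alignment requirement depends on the backend filesystem.

```python
import errno
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment for O_DIRECT buffers and lengths


def write_uncached(path, payload):
    """Write payload using O_DIRECT, or O_DSYNC if O_DIRECT is refused.

    The file is zero-padded to a multiple of BLOCK. Returns the flag name used.
    """
    # An anonymous mmap is page-aligned, which satisfies O_DIRECT's buffer alignment.
    padded_len = max(BLOCK, -(-len(payload) // BLOCK) * BLOCK)
    buf = mmap.mmap(-1, padded_len)
    buf.write(payload)

    base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    attempts = [("O_DSYNC", os.O_DSYNC)]
    if hasattr(os, "O_DIRECT"):  # O_DIRECT is not available on every platform
        attempts.insert(0, ("O_DIRECT", os.O_DIRECT))
    try:
        for name, flag in attempts:
            try:
                fd = os.open(path, base | flag, 0o644)
            except OSError as exc:
                if exc.errno == errno.EINVAL:
                    continue  # this flag is not supported here, try the next one
                raise
            try:
                written = os.write(fd, buf)
            except OSError as exc:
                if exc.errno == errno.EINVAL:
                    continue  # open succeeded but the write was rejected
                raise
            finally:
                os.close(fd)
            if written != padded_len:
                raise OSError(errno.EIO, "short write")
            return name
        raise OSError(errno.EINVAL, "neither O_DIRECT nor O_DSYNC was accepted")
    finally:
        buf.close()


# Demo against a temporary directory; on a gluster mount, path would be under the mount point.
with tempfile.TemporaryDirectory() as tmpdir:
    print(write_uncached(os.path.join(tmpdir, "demo.bin"), b"hello"))
```

With performance.strict-o-direct on, only the O_DIRECT branch relies on that
option; the O_DSYNC fallback corresponds to way 2 in the list.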
> > iv) use 'o-direct' option in storage/posix, volume-set on which reports that
> > the option doesn't exist.
> The option exists in storage/posix, but there is no way to set it through
> the cli (probably you can send a patch to do that if necessary). With this
> option, O_DIRECT is passed with _every_ open/create call on the brick.
> >
> > So then the question is - what is a surefire way to get direct-io-like
> > behavior on gluster volume(s)?
> There is no single global option. You need to configure various translators
> in the stack. Probably [2] was asking for such a feature. Also, as you
> might've noticed above, the behavior/interpretation of these options is not
> the same across all translators (some are global, some are local to an fd,
> etc.).
> Also note that, apart from the options you listed above:
> 1. Quick-read is not aware of O_DIRECT. We need to make it disable
> caching if the open happens with O_DIRECT.
> 2. Handling of Quota Marker xattrs is not synchronous (though not exactly
> an O_DIRECT requirement), as marking is done after the reply to calls
> like writev has been sent.
> On a related note, I found article [3] to be informative.
> [1] http://review.gluster.org/4206
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=845213
> [3] https://lwn.net/Articles/457667/
> regards,
> Raghavendra.
