<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    Hi,<br>
    <br>
    Today we experimented with some of the FUSE options that we found in
    the list.<br>
    <br>
    Changing these options had no effect:<br>
    <pre>gluster volume set test-volume performance.cache-max-file-size 2MB
gluster volume set test-volume performance.cache-refresh-timeout 4
gluster volume set test-volume performance.cache-size 256MB
gluster volume set test-volume performance.write-behind-window-size 4MB
gluster volume set test-volume performance.write-behind-window-size 8MB
</pre>
    Changing the following option from its default value reduced the
    speed:<br>
    <pre>gluster volume set test-volume performance.write-behind off  # on by default
</pre>
    Changing the following options initially appeared to give a 10%
    increase in speed, but this vanished in subsequent tests (we think
    the apparent increase may have been due to a lighter workload on the
    computer from other users):<br>
    <pre>gluster volume set test-volume performance.stat-prefetch on
gluster volume set test-volume client.event-threads 4
gluster volume set test-volume server.event-threads 4
</pre>
    <br>
    Can anything be gleaned from these observations?  Are there other
    things we can try?<br>
    <br>
    Thanks<br>
    <br>
    Pat<br>
    <br>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 06/20/2017 12:06 PM, Pat Haley
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:f5f68782-eec2-7e9e-dea6-8855a33b7f2f@mit.edu">
      <br>
      Hi Ben,<br>
      <br>
      Sorry this took so long, but we had a real-time forecasting
      exercise last week and I could only get to this now.<br>
      <br>
      Backend Hardware/OS:<br>
      <ul>
        <li>Much of the information on our back end system is included
          at the top of  <a class="moz-txt-link-freetext"
href="http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html"
            moz-do-not-send="true">http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html</a></li>
        <li>The specific model of the hard disks is Seagate ENTERPRISE
          CAPACITY V.4 6TB (ST6000NM0024). The rated speed is 6Gb/s.</li>
        <li>Note: there is one physical server that hosts both the NFS
          and the GlusterFS areas<br>
        </li>
      </ul>
      Latest tests<br>
      <br>
      I have had time to run one of the dd tests you requested against
      the underlying XFS file system.  The median rate was 170
      MB/s.  The dd results and iostat record are in<br>
      <br>
      <a class="moz-txt-link-freetext"
        href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/"
        moz-do-not-send="true">http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/</a><br>
      <br>
      I'll add tests for the other brick and to the NFS area later.<br>
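      <br>
      For reference, a median like the one quoted above can be computed
      from the per-run dd rates roughly as follows. This is a minimal
      sketch, not the script used for these tests: the helper name
      <tt>median_rate</tt> is hypothetical, and it assumes every rate is
      given in the same unit (MB/s).<br>
      <pre># Print the median of the numeric rates given as arguments.
# Hypothetical helper for illustration; not part of the original tests.
median_rate() {
    # Nothing to do without at least one value.
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR % 2) {
                # Odd count: the middle value is the median.
                print v[(NR + 1) / 2]
            } else {
                # Even count: average the two middle values.
                mid = (v[NR / 2] + v[NR / 2 + 1]) / 2
                print mid
            }
        }'
}
</pre>
      Passing the MB/s value from each dd run as arguments prints their
      median.<br>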
      <br>
      Thanks<br>
      <br>
      Pat<br>
      <br>
      <br>
      <div class="moz-cite-prefix">On 06/12/2017 06:06 PM, Ben Turner
        wrote:<br>
      </div>
      <blockquote type="cite"
        cite="mid:1222734242.17969442.1497305183579.JavaMail.zimbra@redhat.com">
        <pre wrap="">Ok you are correct, you have a pure distributed volume, i.e. no replication overhead.  So normally for pure dist I use:

throughput = slower of (disks, NIC) * 0.6 to 0.7

In your case we have:

1200 * .6 = 720

So you are seeing a little less throughput than I would expect in your configuration.  What I like to do here is:

-First tell me more about your back end storage, will it sustain 1200 MB / sec?  What kind of HW?  How many disks?  What type and specs are the disks?  What kind of RAID are you using?

-Second can you refresh me on your workload?  Are you doing reads / writes or both?  If both what mix?  Since we are using DD I assume you are working with large file sequential I/O, is this correct?

-Run some DD tests on the back end XFS FS.  I normally have /xfs-mount/gluster-brick, if you have something similar just mkdir on the XFS -&gt; /xfs-mount/my-test-dir.  Inside the test dir run:

If you are focusing on a write workload run:

# dd if=/dev/zero of=/xfs-mount/file bs=1024k count=10000 conv=fdatasync  

If you are focusing on a read workload run:

# echo 3 &gt; /proc/sys/vm/drop_caches
# dd if=/xfs-mount/file of=/dev/null bs=1024k count=10000

** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **

Run this in a loop similar to how you did in:

<a class="moz-txt-link-freetext" href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt" moz-do-not-send="true">http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt</a>

Run this on both servers one at a time and if you are running on a SAN then run again on both at the same time.  While this is running gather iostat for me:

# iostat -c -m -x 1 &gt; iostat-$(hostname).txt

Let's see how the back end performs on both servers while capturing iostat, then see how the same workload / data looks on gluster.

-Last thing, when you run your kernel NFS tests are you using the same filesystem / storage you are using for the gluster bricks?  I want to be sure we have an apples to apples comparison here.

-b



----- Original Message -----
</pre>
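        Ben's rule of thumb above for a pure distribute volume can be
        written as a small calculation. This is a minimal sketch under
        stated assumptions: the helper name
        <tt>estimate_throughput</tt> and its argument order (disk rate
        in MB/s, NIC rate in MB/s, overhead factor) are hypothetical and
        do not come from the thread.<br>
        <pre># Rule-of-thumb estimate for a pure distribute volume:
# the slower of the disk and NIC rates, times an overhead factor.
# Hypothetical helper; arguments are disk MB/s, NIC MB/s, factor.
estimate_throughput() {
    # The numerically smaller of the two rates is the bottleneck.
    slowest=$(printf '%s\n' "$1" "$2" | sort -n | head -n 1)
    awk -v s="$slowest" -v f="$3" 'BEGIN { print s * f }'
}
</pre>
        With the 1.4 GB/s brick rate reported earlier in the thread and
        the ~1200 MB/s figure for 10G networking,
        <tt>estimate_throughput 1400 1200 0.6</tt> prints 720, which
        matches the estimate above.<br>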
        <blockquote type="cite">
          <pre wrap="">From: "Pat Haley" <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;phaley@mit.edu&gt;</a>
To: "Ben Turner" <a class="moz-txt-link-rfc2396E" href="mailto:bturner@redhat.com" moz-do-not-send="true">&lt;bturner@redhat.com&gt;</a>
Sent: Monday, June 12, 2017 5:18:07 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

Here is the output:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off


On 06/12/2017 05:01 PM, Ben Turner wrote:
</pre>
          <blockquote type="cite">
            <pre wrap="">What is the output of gluster v info?  That will tell us more about your
config.

-b

----- Original Message -----
</pre>
            <blockquote type="cite">
              <pre wrap="">From: "Pat Haley" <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;phaley@mit.edu&gt;</a>
To: "Ben Turner" <a class="moz-txt-link-rfc2396E" href="mailto:bturner@redhat.com" moz-do-not-send="true">&lt;bturner@redhat.com&gt;</a>
Sent: Monday, June 12, 2017 4:54:00 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

I guess I'm confused about what you mean by replication.  If I look at
the underlying bricks I only ever have a single copy of any file.  It
either resides on one brick or the other  (directories exist on both
bricks but not files).  We are not using gluster for redundancy (or at
least that wasn't our intent).   Is that what you meant by replication
or is it something else?

Thanks

Pat

On 06/12/2017 04:28 PM, Ben Turner wrote:
</pre>
              <blockquote type="cite">
                <pre wrap="">----- Original Message -----
</pre>
                <blockquote type="cite">
                  <pre wrap="">From: "Pat Haley" <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;phaley@mit.edu&gt;</a>
To: "Ben Turner" <a class="moz-txt-link-rfc2396E" href="mailto:bturner@redhat.com" moz-do-not-send="true">&lt;bturner@redhat.com&gt;</a>, "Pranith Kumar Karampuri"
<a class="moz-txt-link-rfc2396E" href="mailto:pkarampu@redhat.com" moz-do-not-send="true">&lt;pkarampu@redhat.com&gt;</a>
Cc: "Ravishankar N" <a class="moz-txt-link-rfc2396E" href="mailto:ravishankar@redhat.com" moz-do-not-send="true">&lt;ravishankar@redhat.com&gt;</a>, <a class="moz-txt-link-abbreviated" href="mailto:gluster-users@gluster.org" moz-do-not-send="true">gluster-users@gluster.org</a>,
"Steve Postma" <a class="moz-txt-link-rfc2396E" href="mailto:SPostma@ztechnet.com" moz-do-not-send="true">&lt;SPostma@ztechnet.com&gt;</a>
Sent: Monday, June 12, 2017 2:35:41 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Guys,

I was wondering what our next steps should be to solve the slow write
times.

Recently I was debugging a large code and writing a lot of output at
every time step.  When I tried writing to our gluster disks, it was
taking over a day to do a single time step whereas if I had the same
program (same hardware, network) write to our nfs disk the time per
time-step was about 45 minutes. What we are shooting for here would be
to have similar times to either gluster or nfs.
</pre>
                </blockquote>
                <pre wrap="">I can see in your test:

<a class="moz-txt-link-freetext" href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt" moz-do-not-send="true">http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt</a>

You averaged ~600 MB / sec (expected for replica 2 with 10G: {~1200 MB /
sec} / #replicas{2} = 600).  Gluster does client side replication so with
replica 2 you will only ever see 1/2 the speed of your slowest part of the
stack (NW, disk, RAM, CPU).  This is usually NW or disk and 600 is normally
a best case.  Now in your output I do see the instances where you went
down to 200 MB / sec.  I can only explain this in three ways:

1.  You are not using conv=fdatasync and writes are actually going to page
cache and then being flushed to disk.  During the fsync the memory is not
yet available and the disks are busy flushing dirty pages.
2.  Your storage RAID group is shared across multiple LUNs (like in a SAN)
and when write times are slow the RAID group is busy servicing other
LUNs.
3.  Gluster bug / config issue / some other unknown unknown.

So I see 2 issues here:

1.  NFS does in 45 minutes what gluster can do in 24 hours.
2.  Sometimes your throughput drops dramatically.

WRT #1 - have a look at my estimates above.  My formula for guesstimating
gluster perf is: throughput = NIC throughput or storage (whichever is
slower) / # replicas * overhead (figure .7 or .8).  Also the larger the
record size the better for glusterfs mounts, I normally like to be at
LEAST 64k up to 1024k:

# dd if=/dev/zero of=/gluster-mount/file bs=1024k count=10000 conv=fdatasync

WRT #2 - Again, I question your testing and your storage config.  Try using
conv=fdatasync for your DDs, use a larger record size, and make sure that
your back end storage is not causing your slowdowns.  Also remember that
with replica 2 you will take ~50% hit on writes because the client uses
50% of its bandwidth to write to one replica and 50% to the other.

-b



</pre>
                <blockquote type="cite">
                  <pre wrap="">Thanks

Pat


On 06/02/2017 01:07 AM, Ben Turner wrote:
</pre>
                  <blockquote type="cite">
                    <pre wrap="">Are you sure using conv=sync is what you want?  I normally use
conv=fdatasync, I'll look up the difference between the two and see if it
affects your test.


-b

----- Original Message -----
</pre>
                    <blockquote type="cite">
                      <pre wrap="">From: "Pat Haley" <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;phaley@mit.edu&gt;</a>
To: "Pranith Kumar Karampuri" <a class="moz-txt-link-rfc2396E" href="mailto:pkarampu@redhat.com" moz-do-not-send="true">&lt;pkarampu@redhat.com&gt;</a>
Cc: "Ravishankar N" <a class="moz-txt-link-rfc2396E" href="mailto:ravishankar@redhat.com" moz-do-not-send="true">&lt;ravishankar@redhat.com&gt;</a>,
<a class="moz-txt-link-abbreviated" href="mailto:gluster-users@gluster.org" moz-do-not-send="true">gluster-users@gluster.org</a>,
"Steve Postma" <a class="moz-txt-link-rfc2396E" href="mailto:SPostma@ztechnet.com" moz-do-not-send="true">&lt;SPostma@ztechnet.com&gt;</a>, "Ben
Turner" <a class="moz-txt-link-rfc2396E" href="mailto:bturner@redhat.com" moz-do-not-send="true">&lt;bturner@redhat.com&gt;</a>
Sent: Tuesday, May 30, 2017 9:40:34 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Pranith,

The "dd" command was:

        dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync

There were 2 instances where dd reported 22 seconds. The output from the
dd tests is in

<a class="moz-txt-link-freetext" href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt" moz-do-not-send="true">http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt</a>

Pat

On 05/30/2017 09:27 PM, Pranith Kumar Karampuri wrote:
</pre>
                      <blockquote type="cite">
                        <pre wrap="">Pat,
          What is the command you used? As per the following output, it
seems like at least one write operation took 16 seconds, which is
really bad.
         96.39    1165.10 us      89.00 us    *16487014.00 us*    393212    WRITE


On Tue, May 30, 2017 at 10:36 PM, Pat Haley &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
<a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


       Hi Pranith,

       I ran the same 'dd' test both in the gluster test volume and in
       the .glusterfs directory of each brick.  The median results (12 dd
       trials in each test) are similar to before

         * gluster test volume: 586.5 MB/s
         * bricks (in .glusterfs): 1.4 GB/s

       The profile for the gluster test-volume is in

       <a class="moz-txt-link-freetext" href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt" moz-do-not-send="true">http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt</a>
       <a class="moz-txt-link-rfc2396E" href="http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt" moz-do-not-send="true">&lt;http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt&gt;</a>

       Thanks

       Pat




       On 05/30/2017 12:10 PM, Pranith Kumar Karampuri wrote:
</pre>
                        <blockquote type="cite">
                          <pre wrap="">       Let's start with the same 'dd' test we were testing with to see
       what the numbers are. Please provide profile numbers for the
       same. From there on we will start tuning the volume to see what
       we can do.

       On Tue, May 30, 2017 at 9:16 PM, Pat Haley &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
       <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


           Hi Pranith,

           Thanks for the tip.  We now have the gluster volume mounted
           under /home.  What tests do you recommend we run?

           Thanks

           Pat



           On 05/17/2017 05:01 AM, Pranith Kumar Karampuri wrote:
</pre>
                          <blockquote type="cite">
                            <pre wrap="">           On Tue, May 16, 2017 at 9:20 PM, Pat Haley
           &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
           <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


               Hi Pranith,

               Sorry for the delay.  I never received your reply
               (but I did receive Ben Turner's follow-up to your
               reply).  So we tried to create a gluster volume under
               /home using different variations of

               gluster volume create test-volume
               mseas-data2:/home/gbrick_test_1
               mseas-data2:/home/gbrick_test_2 transport tcp

               However we keep getting errors of the form

               Wrong brick type: transport, use
               &lt;HOSTNAME&gt;:&lt;export-dir-abs-path&gt;

               Any thoughts on what we're doing wrong?


           You should give transport tcp at the beginning, I think.
           Anyway, transport tcp is the default, so there is no need to
           specify it; just remove those two words from the CLI.


               Also do you have a list of the tests we should be running
               once we get this volume created?  Given the time-zone
               difference it might help if we can run a small battery
               of tests and post the results rather than test-post-new
               test-post... .


           This is the first time I am doing performance analysis for
           users as far as I remember. In our team there are separate
           engineers who do these tests. Ben, who replied earlier, is one
           such engineer.

           Ben,
               Have any suggestions?


               Thanks

               Pat



               On 05/11/2017 12:06 PM, Pranith Kumar Karampuri
               wrote:
</pre>
                            <blockquote type="cite">
                              <pre wrap="">               On Thu, May 11, 2017 at 9:32 PM, Pat Haley
               &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


                   Hi Pranith,

                   The /home partition is mounted as ext4
                   /home ext4 defaults,usrquota,grpquota   1 2

                   The brick partitions are mounted as xfs
                   /mnt/brick1 xfs defaults 0 0
                   /mnt/brick2 xfs defaults 0 0

                   Will this cause a problem with creating a volume
                   under /home?


               I don't think the bottleneck is disk. Can you do the
               same tests you did on your new volume to confirm?


                   Pat



                   On 05/11/2017 11:32 AM, Pranith Kumar Karampuri
                   wrote:
</pre>
                              <blockquote type="cite">
                                <pre wrap="">                   On Thu, May 11, 2017 at 8:57 PM, Pat Haley
                   &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt;
                   wrote:


                       Hi Pranith,

                       Unfortunately, we don't have similar hardware
                       for a small scale test.  All we have is our
                       production hardware.


                   You said something about the /home partition, which has
                   fewer disks. We can create a plain distribute
                   volume inside one of those directories. After we
                   are done, we can remove the setup. What do you say?


                       Pat




                       On 05/11/2017 07:05 AM, Pranith Kumar
                       Karampuri wrote:
</pre>
                                <blockquote type="cite">
                                  <pre wrap="">                       On Thu, May 11, 2017 at 2:48 AM, Pat
                       Haley
                       &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt;
                       wrote:


                           Hi Pranith,

                           Since we are mounting the partitions as
                           the bricks, I tried the dd test writing to
                           &lt;brick-path&gt;/.glusterfs/&lt;file-to-be-removed-after-test&gt;.
                           The results without oflag=sync were 1.6
                           GB/s (faster than gluster but not as fast
                           as I was expecting given the 1.2 GB/s to
                           the no-gluster area w/ fewer disks).


                       Okay, then 1.6 GB/s is what we need to target,
                       considering your volume is just
                       distribute. Is there any way you can do tests
                       on similar hardware but at a small scale,
                       just so we can run the workload to learn more
                       about the bottlenecks in the system? We can
                       probably try to get the speed to 1.2 GB/s on
                       the /home partition you were telling me about
                       yesterday. Let me know if that is something
                       you are okay to do.


                           Pat



                           On 05/10/2017 01:27 PM, Pranith Kumar
                           Karampuri wrote:
</pre>
                                  <blockquote type="cite">
                                    <pre wrap="">                           On Wed, May 10, 2017 at 10:15 PM,
                           Pat
                           Haley &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
                           <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


                               Hi Pranith,

                               Not entirely sure (this isn't my
                               area of expertise). I'll run your
                               answer by some other people who are
                               more familiar with this.

                               I am also uncertain about how to
                               interpret the results when we also
                               add the dd tests writing to the
                               /home area (no gluster, still on the
                               same machine)

                                 * dd test without oflag=sync
                                   (rough average of multiple tests)
                                     o gluster w/ fuse mount: 570 MB/s
                                     o gluster w/ nfs mount: 390 MB/s
                                     o nfs (no gluster): 1.2 GB/s
                                 * dd test with oflag=sync
                                   (rough average of multiple tests)
                                     o gluster w/ fuse mount: 5 MB/s
                                     o gluster w/ nfs mount: 200 MB/s
                                     o nfs (no gluster): 20 MB/s

                               Given that the non-gluster area is a
                               RAID-6 of 4 disks while each brick
                               of the gluster area is a RAID-6 of
                               32 disks, I would naively expect the
                               writes to the gluster area to be
                               roughly 8x faster than to the
                               non-gluster.


                           I think a better test is to try and
                           write to a file using nfs without any
                           gluster to a location that is not inside
                           the brick but some other location that is
                           on the same disk(s). If you are mounting the
                           partition as the brick, then we can
                           write to a file inside the .glusterfs
                           directory, something like
                           &lt;brick-path&gt;/.glusterfs/&lt;file-to-be-removed-after-test&gt;.



                               I still think we have a speed issue,
                               I can't tell if fuse vs nfs is part
                               of the problem.


                           I got interested in the post because I
                           read that fuse speed is lower than nfs
                           speed, which is counter-intuitive to my
                           understanding, so I wanted clarification.
                           Now that I have my clarification, with
                           fuse outperforming nfs without sync, we
                           can resume testing as described above
                           and try to find what it is. Based on
                           your email-id I am guessing you are from
                           Boston and I am from Bangalore, so if you
                           are okay with doing this debugging over
                           multiple days because of timezones, I
                           will be happy to help. Please be a bit
                           patient with me, I am under a release
                           crunch but I am very curious about the
                           problem you posted.

                               Was there anything useful in the
                               profiles?


                           Unfortunately the profiles didn't help me
                           much. I think we are collecting the
                           profiles from an active volume, so it
                           has a lot of information that does not
                           pertain to dd, and it is difficult to
                           find the contributions of dd. So I went
                           through your post again and found
                           something I didn't pay much attention to
                           earlier, i.e. oflag=sync, so I did my own
                           tests on my setup with FUSE and sent that
                           reply.


                               Pat



                               On 05/10/2017 12:15 PM, Pranith
                               Kumar Karampuri wrote:
</pre>
                                    <blockquote type="cite">
                                      <pre wrap="">                               Okay good. At least this validates
                               my doubts. Handling O_SYNC in
                               gluster NFS and fuse is a bit different.
                               When an application opens a file with
                               O_SYNC on a fuse mount, each
                               write syscall has to be written to
                               disk as part of the syscall, whereas
                               in the case of NFS there is no
                               concept of open. NFS performs the write
                               through a handle saying it needs to
                               be a synchronous write, so the write()
                               syscall is performed first and then it
                               performs fsync(). So a write on an
                               fd with O_SYNC becomes write+fsync.
                               My guess is that when multiple
                               threads do this write+fsync()
                               operation on the same file,
                               multiple writes are batched
                               together to be written to disk, so
                               the throughput on the disk
                               increases.

                               Does it answer your doubts?

                               On Wed, May 10, 2017 at 9:35
                               PM,
                               Pat Haley &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
                               <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt; wrote:


                                   Without the oflag=sync and
                                   only
                                   a single test of each, the
                                   FUSE
                                   is going faster than NFS:

                                   FUSE:
                                   mseas-data2(dri_nascar)% dd
                                   if=/dev/zero count=4096
                                   bs=1048576 of=zeros.txt
                                   conv=sync
                                   4096+0 records in
                                   4096+0 records out
                                   4294967296 bytes (4.3 GB)
                                   copied, 7.46961 s, 575 MB/s


                                   NFS:
                                   mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync
                                   4096+0 records in
                                   4096+0 records out
                                   4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
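
A note on the dd flags used in the two runs above: in GNU dd, conv=sync
only pads short input blocks with NULs; it does not flush data to disk,
so these figures may partly reflect the page cache rather than the
bricks. The sketch below, which assumes GNU coreutils dd, runs the same
write three ways so the variants can be compared on each mount. The
function name bench_dd, the file names, and the 8 MiB size are
illustrative and not from the thread.

```shell
#!/bin/sh
# Sketch: compare dd durability flags on a scratch directory.
# Assumes GNU coreutils dd; bench_dd and the file names are hypothetical.
bench_dd() {
    bench_dir=$1

    # 1) Buffered write: data may still sit in the page cache
    #    when dd reports its transfer rate.
    dd if=/dev/zero of="$bench_dir/buffered.bin" bs=1048576 count=8

    # 2) conv=fsync: dd calls fsync() once before exiting,
    #    so the reported rate includes the final flush.
    dd if=/dev/zero of="$bench_dir/fsync.bin" bs=1048576 count=8 conv=fsync

    # 3) oflag=sync: the output file is opened with O_SYNC,
    #    so every write waits for the data to reach storage.
    dd if=/dev/zero of="$bench_dir/osync.bin" bs=1048576 count=8 oflag=sync
}
```

For example, run bench_dd on a directory in the FUSE mount and then on
a directory in the NFS mount, and compare the rates dd reports for each
of the three variants.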



                                   On 05/10/2017 11:53 AM,
                                   Pranith
                                   Kumar Karampuri wrote:
</pre>
                                      <blockquote type="cite">
                                        <pre wrap="">                                   Could you let me know the
                                   speed without oflag=sync on
                                   both the mounts? No need to
                                   collect profiles.

                                   On Wed, May 10, 2017 at
                                   9:17
                                   PM, Pat Haley
                                   &lt;<a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
                                   <a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu" moz-do-not-send="true">&lt;mailto:phaley@mit.edu&gt;</a>&gt;
                                   wrote:


                                       Here is what I see now:

                                       [root@mseas-data2 ~]# gluster volume info

                                       Volume Name: data-volume
                                       Type: Distribute
                                       Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
                                       Status: Started
                                       Number of Bricks: 2
                                       Transport-type: tcp
                                       Bricks:
                                       Brick1: mseas-data2:/mnt/brick1
                                       Brick2: mseas-data2:/mnt/brick2
                                       Options Reconfigured:
                                       diagnostics.count-fop-hits: on
                                       diagnostics.latency-measurement: on
                                       nfs.exports-auth-enable: on
                                       diagnostics.brick-sys-log-level: WARNING
                                       performance.readdir-ahead: on
                                       nfs.disable: on
                                       nfs.export-volumes: off



                                       On 05/10/2017 11:44
                                       AM,
                                       Pranith Kumar
                                       Karampuri
                                       wrote:
</pre>
                                        <blockquote type="cite">
                                          <pre wrap="">                                       Is this the volume info you have?

                                       &gt; [root@mseas-data2 ~]# gluster volume info
                                       &gt;
                                       &gt; Volume Name: data-volume
                                       &gt; Type: Distribute
                                       &gt; Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
                                       &gt; Status: Started
                                       &gt; Number of Bricks: 2
                                       &gt; Transport-type: tcp
                                       &gt; Bricks:
                                       &gt; Brick1: mseas-data2:/mnt/brick1
                                       &gt; Brick2: mseas-data2:/mnt/brick2
                                       &gt; Options Reconfigured:
                                       &gt; performance.readdir-ahead: on
                                       &gt; nfs.disable: on
                                       &gt; nfs.export-volumes: off

                                       I copied this from an old
                                       thread from 2016. This is a
                                       distribute volume. Did you
                                       change any of the options
                                       in between?
</pre>
                                        </blockquote>
                                        <pre wrap="">                                       --

                                       -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                                       Pat Haley                          Email:  <a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
                                       Center for Ocean Engineering       Phone:  (617) 253-6824
                                       Dept. of Mechanical Engineering    Fax:    (617) 253-8125
                                       MIT, Room 5-213                    <a class="moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" moz-do-not-send="true">http://web.mit.edu/phaley/www/</a>
                                       77 Massachusetts Avenue
                                       Cambridge, MA  02139-4301

                                   --
                                   Pranith
</pre>
                                      </blockquote>
                                      <pre wrap="">                               --
                               Pranith
</pre>
                                    </blockquote>
                                    <pre wrap="">                           --
                           Pranith
</pre>
                                  </blockquote>
                                  <pre wrap="">                       --
                       Pranith
</pre>
                                </blockquote>
                                <pre wrap="">                   --
                   Pranith
</pre>
                              </blockquote>
                              <pre wrap="">               --
               Pranith
</pre>
                            </blockquote>
                            <pre wrap="">           --
           Pranith
</pre>
                          </blockquote>
                          <pre wrap="">       --
       Pranith
</pre>
                        </blockquote>
                        <pre wrap="">--
Pranith
</pre>
                      </blockquote>
                    </blockquote>
                  </blockquote>
                </blockquote>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
      </blockquote>
      <br>
      <pre class="moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  <a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" moz-do-not-send="true">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" moz-do-not-send="true">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://lists.gluster.org/mailman/listinfo/gluster-users">http://lists.gluster.org/mailman/listinfo/gluster-users</a></pre>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  <a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre>
  </body>
</html>