<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 11, 2017 at 2:48 AM, Pat Haley <span dir="ltr">&lt;<a href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <br>
    Hi Pranith,<br>
    <br>
    Since we are mounting the partitions as the bricks, I tried the dd
    test writing to
    &lt;brick-path&gt;/.glusterfs/&lt;file-to-be-removed-after-test&gt;.
    The results without oflag=sync were 1.6 GB/s (faster than gluster,
    but not as fast as I was expecting given the 1.2 GB/s to the
    no-gluster area w/ fewer disks).<span class="HOEnZb"><font color="#888888"><br></font></span></div></blockquote><div><br></div><div>Okay, then 1.6 GB/s is what we need to target, considering your volume is just distribute. Is there any way you can run tests on similar hardware at a smaller scale, just so we can exercise the workload and learn more about the bottlenecks in the system? We can probably try to get the speed up to the 1.2 GB/s you see on the /home partition you were telling me about yesterday. Let me know if that is something you are okay doing.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class="HOEnZb"><font color="#888888">
    <br>
    Pat</font></span><div><div class="h5"><br>
    <br>
    <br>
    <div class="m_-7935477196994992055moz-cite-prefix">On 05/10/2017 01:27 PM, Pranith Kumar
      Karampuri wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Wed, May 10, 2017 at 10:15 PM, Pat
            Haley <span dir="ltr">&lt;<a href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> <br>
                Hi Pranith,<br>
                <br>
                Not entirely sure (this isn&#39;t my area of expertise). 
                I&#39;ll run your answer by some other people who are more
                familiar with this.<br>
                <br>
                I am also uncertain how to interpret the results when we
                add the dd tests writing to the /home area (no gluster,
                still on the same machine):<br>
                <ul>
                  <li>dd test without oflag=sync (rough average of
                    multiple tests)<br>
                  </li>
                  <ul>
                    <li>gluster w/ fuse mount: 570 MB/s</li>
                    <li>gluster w/ nfs mount: 390 MB/s</li>
                    <li>nfs (no gluster): 1.2 GB/s</li>
                  </ul>
                  <li>dd test with oflag=sync (rough average of multiple
                    tests)</li>
                  <ul>
                    <li>gluster w/ fuse mount: 5 MB/s</li>
                    <li>gluster w/ nfs mount: 200 MB/s</li>
                    <li>nfs (no gluster): 20 MB/s<br>
                    </li>
                  </ul>
                </ul>
                Given that the non-gluster area is a RAID-6 of 4 disks
                while each brick of the gluster area is a RAID-6 of 32
                disks, I would naively expect the writes to the gluster
                area to be roughly 8x faster than to the non-gluster
                area.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>I think a better test is to write to a file over plain
              nfs (no gluster) at a location that is not inside the
              brick but is on the same disk(s). If you are mounting the
              partition as the brick, then we can write to a file inside
              the .glusterfs directory, something like
              &lt;brick-path&gt;/.glusterfs/&lt;file-to-be-removed-after-test&gt;.
              <br>
              <br>
            </div>
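<div>The suggested test can be sketched as below. The brick path is a placeholder (it defaults to a temp directory so the sketch is runnable anywhere; substitute the real brick mount), and count=64 keeps the run short where the thread's actual tests used 4 GiB:</div>

```shell
# BRICK is a placeholder: point it at the real brick mount, e.g.
# BRICK=/mnt/brick1 (it defaults to a temp directory here so the
# sketch is runnable anywhere).
BRICK="${BRICK:-$(mktemp -d)}"
mkdir -p "$BRICK/.glusterfs"
TESTFILE="$BRICK/.glusterfs/dd-test-delete-me"

# Buffered write: raw throughput of the disks behind the brick.
dd if=/dev/zero of="$TESTFILE" bs=1048576 count=64

# Synchronous write: each block must hit disk before the next is issued.
dd if=/dev/zero of="$TESTFILE" bs=1048576 count=64 oflag=sync

rm -f "$TESTFILE"
```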
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> <br>
                I still think we have a speed issue; I can&#39;t tell
                whether fuse vs nfs is part of the problem.</div>
            </blockquote>
            <div><br>
            </div>
            I got interested in the post because I read that fuse was
            slower than nfs, which is counter-intuitive to my
            understanding, so I wanted clarification. Now that I have it
            (fuse outperformed nfs without sync), we can resume testing
            as described above and try to find the cause. Based on your
            email address I am guessing you are in Boston; I am in
            Bangalore, so if you are okay with this debugging stretching
            over multiple days because of the timezones, I will be happy
            to help. Please be a bit patient with me; I am under a
            release crunch, but I am very curious about the problem you
            posted.<br>
            <br>
          </div>
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">  Was there anything
                useful in the profiles?<span class="m_-7935477196994992055HOEnZb"><font color="#888888"><br>
                  </font></span></div>
            </blockquote>
            <div><br>
            </div>
            <div>Unfortunately the profiles didn&#39;t help me much. I
              think we are collecting them from an active volume, so
              they contain a lot of information that does not pertain to
              dd, which makes it difficult to isolate dd&#39;s
              contribution. So I went through your post again and
              noticed something I hadn&#39;t paid much attention to
              earlier, i.e. oflag=sync, did my own tests on my setup
              with FUSE, and sent that reply.<br>
               <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span class="m_-7935477196994992055HOEnZb"><font color="#888888"> <br>
                    Pat</font></span>
                <div>
                  <div class="m_-7935477196994992055h5"><br>
                    <br>
                    <br>
                    <div class="m_-7935477196994992055m_-1330903101597946930moz-cite-prefix">On
                      05/10/2017 12:15 PM, Pranith Kumar Karampuri
                      wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div>
                          <div>Okay, good. At least this validates my
                            suspicion: handling O_SYNC in gluster NFS
                            and fuse is a bit different.<br>
                          </div>
                          When an application opens a file with O_SYNC
                          on a fuse mount, each write syscall has to be
                          written to disk as part of the syscall. In
                          NFS, by contrast, there is no concept of open:
                          NFS performs the write through a handle with a
                          flag saying it needs to be a synchronous
                          write, so the write() is performed first and
                          then an fsync(). A write on an fd opened with
                          O_SYNC thus becomes write+fsync. My guess is
                          that when multiple threads do this
                          write+fsync() operation on the same file,
                          multiple writes get batched together before
                          being written to disk, so the throughput on
                          the disk increases.<br>
                          <br>
                        </div>
                        Does it answer your doubts?<br>
                      </div>
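<div>The two flush patterns can be approximated with dd itself: oflag=sync forces each block to stable storage before the next write (the O_SYNC-on-fuse behavior), while conv=fsync does buffered writes with a single fsync at the end, letting the kernel merge them first. This is only an approximation of the per-write write()+fsync() sequence described above, but it shows why batching flushes raises disk throughput. A minimal sketch (small count so it runs quickly; /tmp is a stand-in path):</div>

```shell
OUT=/tmp/sync-compare.bin

# O_SYNC-style: every 1 MiB block must reach stable storage before dd
# issues the next write.
dd if=/dev/zero of="$OUT" bs=1048576 count=64 oflag=sync

# Buffered writes with one fsync at the end: the kernel is free to
# merge and reorder the writes before flushing.
dd if=/dev/zero of="$OUT" bs=1048576 count=64 conv=fsync

SIZE=$(wc -c < "$OUT")
rm -f "$OUT"
```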
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Wed, May 10, 2017 at
                          9:35 PM, Pat Haley <span dir="ltr">&lt;<a href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>&gt;</span>
                          wrote:<br>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div bgcolor="#FFFFFF" text="#000000"> <br>
                              Without oflag=sync, and with only a single
                              test of each, FUSE is faster than NFS:<br>
                              <br>
                              FUSE:<br>
                              <tt>mseas-data2(dri_nascar)% dd
                                if=/dev/zero count=4096 bs=1048576
                                of=zeros.txt conv=sync</tt><tt><br>
                              </tt><tt>4096+0 records in</tt><tt><br>
                              </tt><tt>4096+0 records out</tt><tt><br>
                              </tt><tt>4294967296 bytes (4.3 GB) copied,
                                7.46961 s, 575 MB/s</tt><tt><br>
                              </tt><tt><br>
                                <br>
                              </tt>NFS<br>
                              <tt>mseas-data2(HYCOM)% dd if=/dev/zero
                                count=4096 bs=1048576 of=zeros.txt
                                conv=sync</tt><tt><br>
                              </tt><tt>4096+0 records in</tt><tt><br>
                              </tt><tt>4096+0 records out</tt><tt><br>
                              </tt><tt>4294967296 bytes (4.3 GB) copied,
                                11.4264 s, 376 MB/s</tt>
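<div>One caveat about the commands above: conv=sync only pads short input blocks with NULs; it does not make the writes synchronous. The flag that forces each write to disk is oflag=sync, as used elsewhere in this thread. A minimal sketch (small count, hypothetical /tmp path):</div>

```shell
# conv=sync pads short *input* blocks; oflag=sync is what makes each
# write synchronous (data flushed to disk before dd continues).
dd if=/dev/zero of=/tmp/zeros-test.bin bs=1048576 count=16 oflag=sync
SIZE=$(wc -c < /tmp/zeros-test.bin)
rm -f /tmp/zeros-test.bin
```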
                              <div>
                                <div class="m_-7935477196994992055m_-1330903101597946930h5"><tt><br>
                                  </tt><tt><br>
                                  </tt><tt><br>
                                  </tt>
                                  <div class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365moz-cite-prefix">On
                                    05/10/2017 11:53 AM, Pranith Kumar
                                    Karampuri wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">Could you let me know
                                      the speed without oflag=sync on
                                      both the mounts? No need to
                                      collect profiles.<br>
                                    </div>
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote">On Wed,
                                        May 10, 2017 at 9:17 PM, Pat
                                        Haley <span dir="ltr">&lt;<a href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>&gt;</span>
                                        wrote:<br>
                                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                          <div bgcolor="#FFFFFF" text="#000000"> <br>
                                            Here is what I see now:<br>
                                            <br>
                                            <tt>[root@mseas-data2 ~]#
                                              gluster volume info</tt><span><tt><br>
                                              </tt><tt> </tt><tt><br>
                                              </tt><tt>Volume Name:
                                                data-volume</tt><tt><br>
                                              </tt><tt>Type: Distribute</tt><tt><br>
                                               </tt><tt>Volume ID:
                                                 c162161e-2a2d-4dac-b015-f31fd89ceb18</tt><tt><br>
                                              </tt><tt>Status: Started</tt><tt><br>
                                              </tt><tt>Number of Bricks:
                                                2</tt><tt><br>
                                              </tt><tt>Transport-type:
                                                tcp</tt><tt><br>
                                              </tt><tt>Bricks:</tt><tt><br>
                                              </tt><tt>Brick1:
                                                mseas-data2:/mnt/brick1</tt><tt><br>
                                              </tt><tt>Brick2:
                                                mseas-data2:/mnt/brick2</tt><tt><br>
                                              </tt><tt>Options
                                                Reconfigured:</tt><tt><br>
                                             </tt></span><tt>diagnostics.count-fop-hits:
                                               on</tt><tt><br>
                                             </tt><tt>diagnostics.latency-measurement:
                                               on</tt><tt><br>
                                             </tt><tt>nfs.exports-auth-enable:
                                               on</tt><tt><br>
                                             </tt><tt>diagnostics.brick-sys-log-level:
                                               WARNING</tt><span><tt><br>
                                              </tt><tt>performance.readdir-ahead:
                                                on</tt><tt><br>
                                              </tt><tt>nfs.disable: on</tt><tt><br>
                                              </tt><tt>nfs.export-volumes:
                                                off</tt><tt><br>
                                              </tt><br>
                                              <br>
                                              <br>
                                            </span>
                                            <div>
                                              <div class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365h5">
                                                <div class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365m_5079054141158038028moz-cite-prefix">On
                                                  05/10/2017 11:44 AM,
                                                  Pranith Kumar
                                                  Karampuri wrote:<br>
                                                </div>
                                                <blockquote type="cite">
                                                  <div dir="ltr">
                                                    <div>Is this the
                                                      volume info you
                                                      have?<br>
                                                      <br>
                                                      <pre>&gt;<i>     [<a href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">root at mseas-data2</a> ~]# gluster volume info
</i>&gt;<i>
</i>&gt;<i>     Volume Name: data-volume
</i>&gt;<i>     Type: Distribute
</i>&gt;<i>     Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
</i>&gt;<i>     Status: Started
</i>&gt;<i>     Number of Bricks: 2
</i>&gt;<i>     Transport-type: tcp
</i>&gt;<i>     Bricks:
</i>&gt;<i>     Brick1: mseas-data2:/mnt/brick1
</i>&gt;<i>     Brick2: mseas-data2:/mnt/brick2
</i>&gt;<i>     Options Reconfigured:
</i>&gt;<i>     performance.readdir-ahead: on
</i>&gt;<i>     nfs.disable: on
</i>&gt;<i>     nfs.export-volumes: off

</i></pre>
        </div>
        I copied this from an old thread from 2016. This is a distribute
        volume. Did you change any of the options in between?

      </div>
    </blockquote>
    

    </div></div><span><pre class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365m_5079054141158038028moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-
Pat Haley                          Email:  <a class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365m_5079054141158038028moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365m_5079054141158038028moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" target="_blank">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre>
  </span></div>

</blockquote></div>


-- 
<div class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith
</div></div>
</div>



</blockquote>
<pre class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-
Pat Haley                          Email:  <a class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="m_-7935477196994992055m_-1330903101597946930m_7039630128565981365moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" target="_blank">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre></div></div></div></blockquote></div>


-- 
<div class="m_-7935477196994992055m_-1330903101597946930gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith
</div></div>
</div>



</blockquote>
<pre class="m_-7935477196994992055m_-1330903101597946930moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-
Pat Haley                          Email:  <a class="m_-7935477196994992055m_-1330903101597946930moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="m_-7935477196994992055m_-1330903101597946930moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" target="_blank">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre></div></div></div></blockquote></div>


-- 
<div class="m_-7935477196994992055gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith
</div></div>
</div></div>



</blockquote>
<pre class="m_-7935477196994992055moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=<wbr>-=-=-
Pat Haley                          Email:  <a class="m_-7935477196994992055moz-txt-link-abbreviated" href="mailto:phaley@mit.edu" target="_blank">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="m_-7935477196994992055moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/" target="_blank">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre></div></div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div>
</div></div>