<div dir='auto'><div><br><div class="gmail_extra"><br><div class="gmail_quote">On 15 Aug 2018 07:43, Pui Edylie &lt;email@edylie.net&gt; wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>

    <p>Hi Karli,<br>

      <br>

      I think Alex is right with regard to the NFS version and state.<br></p></div></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Yeah, I'm setting up the tests now; I'll report back once they're done!</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p>

      <br>

      I am only using NFSv3, and failover is working as expected.<br>

      <br>

      In my use case, I have 3 nodes running ESXi 6.7, and I set up one

      Gluster VM on each ESXi host using its local datastore.<br>

      <br>

      Once I have formed the replica 3 volume, I use the CTDB VIP to present

      NFSv3 back to vCenter and use it as shared storage.<br>

      <br>

      Everything works great, except that performance is not very good ...

      I am still looking for ways to improve it.<br></p></div></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">The obvious way would be to use oVirt instead of VMware ;)</div><div dir="auto"><br></div><div dir="auto">/K</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p>

      <br>

      Cheers,<br>

      Edy<br>

    </p>

    <br>

    <div>On 8/15/2018 12:25 AM, Alex Chekholko

      wrote:<br>

    </div>

    <blockquote>

      <div dir="ltr">Hi Karli,

        <div><br>

        </div>

        <div>I'm not 100% sure this is related, but when I set up my ZFS

          NFS HA per&nbsp;<a href="https://github.com/ewwhite/zfs-ha/wiki">https://github.com/ewwhite/zfs-ha/wiki</a>

          I was not able to get failover to work with NFSv4,

          only with NFSv3.</div>

        <div><br>

        </div>

        <div>From the client's point of view, it really looked like with

          NFSv4 there is an open file handle that just goes stale

          and hangs, or something like that, whereas with NFSv3 the

          client retries, recovers, and continues.&nbsp; I did not

          investigate further; I just use v3.&nbsp; I think it has something

          to do with NFSv4 being "stateful" and NFSv3 being "stateless".</div>

        <div><br>

        </div>

        <div>Can you re-run your test but using NFSv3 on the client

          mount?&nbsp; Or do you need to use v4.x?</div>
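
        <div><br>

        </div>

        <div>A minimal sketch of what that re-run could look like on the client, assuming the same export and mount point as in the test quoted below. The helper name nfs_mount_cmd is made up for illustration, and it only prints the command rather than running it, since the actual mount needs root and a reachable server:</div>

<pre>
```shell
#!/bin/sh
# Hypothetical helper: print the NFS mount command for a given protocol
# version, so the v3 and v4.1 variants differ only in the vers= option.
nfs_mount_cmd() {
    # $1 = NFS version (e.g. 3 or 4.1), $2 = server:/export, $3 = mount point
    printf 'mount -t nfs -o vers=%s %s %s\n' "$1" "$2" "$3"
}

# Print the NFSv3 variant for the export used in this thread.
nfs_mount_cmd 3 hv03v.localdomain:/data /mnt/
```
</pre>

        <div>Running the printed command as root in place of the vers=4.1 mount should be the only change needed to repeat the test over NFSv3.</div>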

        <div><br>

        </div>

        <div>Regards,</div>

        <div>Alex</div>

      </div>

      <br>

      <div class="elided-text">

        <div dir="ltr">On Tue, Aug 14, 2018 at 6:11 AM Karli Sjöberg

          &lt;<a href="mailto:karli@inparadise.se">karli@inparadise.se</a>&gt; wrote:<br>

        </div>

        <blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri,

          2018-08-10 at 09:39 -0400, Kaleb S. KEITHLEY wrote:<br>

          &gt; On 08/10/2018 09:23 AM, Karli Sjöberg wrote:<br>

          &gt; &gt; On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:<br>

          &gt; &gt; &gt; Hi Karli,<br>

          &gt; &gt; &gt; <br>

          &gt; &gt; &gt; Storhaug works with glusterfs 4.1.2 and latest

          nfs-ganesha.<br>

          &gt; &gt; &gt; <br>

          &gt; &gt; &gt; I just installed them last weekend ... they are

          working very well<br>

          &gt; &gt; &gt; :)<br>

          &gt; &gt; <br>

          &gt; &gt; Okay, awesome!<br>

          &gt; &gt; <br>

          &gt; &gt; Is there any documentation on how to do that?<br>

          &gt; &gt; <br>

          &gt; <br>

          &gt; <a href="https://github.com/gluster/storhaug/wiki">https://github.com/gluster/storhaug/wiki</a><br>

          &gt; <br>

          <br>

          Thanks Kaleb and Edy!<br>

          <br>

          I have now redone the cluster using the latest and greatest

          following<br>

          the above guide and repeated the same test I was doing before

          (the<br>

          rsync while loop) with success. I let (forgot) it run for

          about a day<br>

          and it was still chugging along nicely when I aborted it, so

          success<br>

          there!<br>

          <br>

          On to the next test: the catastrophic failure test, where one

          of the<br>

          servers dies. This one I'm having a more difficult time with.<br>

          <br>

          1) I start by mounting the share over NFS 4.1 and then

          proceed to<br>

          write an 8 GiB file of random data with 'dd'. When I

          "hard-cut"<br>

          the power to the server I'm writing to, the transfer just

          stops<br>

          indefinitely, until the server comes back again. Is that

          supposed to<br>

          happen? Like this:<br>

          <br>

          # dd if=/dev/urandom of=/var/tmp/test.bin bs=1M count=8192<br>

          # mount -o vers=4.1 hv03v.localdomain:/data /mnt/<br>

          # dd if=/var/tmp/test.bin of=/mnt/test.bin bs=1M

          status=progress<br>

          2434793472 bytes (2,4 GB, 2,3 GiB) copied, 42 s, 57,9 MB/s<br>

          <br>

          (here I cut the power and let it be for almost two hours

          before turning<br>

          it on again)<br>

          <br>

          dd: error writing '/mnt/test.bin': Remote I/O error<br>

          2325+0 records in<br>

          2324+0 records out<br>

          2436890624 bytes (2,4 GB, 2,3 GiB) copied, 6944,84 s, 351 kB/s<br>

          # umount /mnt<br>

          <br>

          Here the unmount command hung and I had to hard reset the

          client.<br>
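
          <br>

          One way to keep a stalled transfer like this from blocking the shell could be to run it under a deadline with 'timeout' from GNU coreutils. This is only a sketch: the helper name run_with_deadline and the 10 second grace period are assumptions, and a process blocked on a hard NFS mount may only react to KILL, if at all:<br>

<pre>
```shell
#!/bin/sh
# Hypothetical helper: run a command under a deadline so a stalled
# transfer is reported instead of blocking the shell indefinitely.
run_with_deadline() {
    # $1 = deadline in seconds; the remaining arguments are the command
    secs=$1
    shift
    rc=0
    # Send TERM at the deadline, then KILL 10 seconds later if still alive.
    timeout -k 10 "$secs" "$@" || rc=$?
    case "$rc" in
        124) echo "timed out after ${secs}s" ;;
        137) echo "killed after ignoring TERM" ;;
    esac
    return "$rc"
}
```
</pre>

          For the write above, this could be called as "run_with_deadline 600 dd if=/var/tmp/test.bin of=/mnt/test.bin bs=1M status=progress", where 600 is an arbitrary value.<br>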

          <br>

          2) Another question I have is why some files "change" as you

          copy them<br>

          out to the Gluster storage? Is that the way it should be? This

          time, I<br>

          deleted everything in the destination directory to start over:<br>

          <br>

          # mount -o vers=4.1 hv03v.localdomain:/data /mnt/<br>

          # rm -f /mnt/test.bin<br>

          # dd if=/var/tmp/test.bin of=/mnt/test.bin bs=1M

          status=progress<br>

          8557428736 bytes (8,6 GB, 8,0 GiB) copied, 122 s, 70,1 MB/s<br>

          8192+0 records in<br>

          8192+0 records out<br>

          8589934592 bytes (8,6 GB, 8,0 GiB) copied, 123,039 s, 69,8

          MB/s<br>

          # md5sum /var/tmp/test.bin <br>

          073867b68fa8eaa382ffe05adb90b583&nbsp; /var/tmp/test.bin<br>

          # md5sum /mnt/test.bin <br>

          634187d367f856f3f5fb31846f796397&nbsp; /mnt/test.bin<br>

          # umount /mnt<br>
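
          <br>

          The manual comparison above can be wrapped in a small check, sketched here under the assumption that 'md5sum' is available on the client. The function name verify_copy is hypothetical, and it reports only whether the two digests agree, not why they differ:<br>

<pre>
```shell
#!/bin/sh
# Hypothetical helper: compare a source file with its copy by MD5 digest.
verify_copy() {
    # $1 = source file, $2 = copy to check
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "unreadable"
        return 2
    fi
    # md5sum prints "digest  filename"; keep only the digest.
    src_sum=$(md5sum "$1" | cut -d' ' -f1)
    dst_sum=$(md5sum "$2" | cut -d' ' -f1)
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```
</pre>

          With the files from this test it would be called as "verify_copy /var/tmp/test.bin /mnt/test.bin" before unmounting.<br>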

          <br>

          Thanks in advance!<br>

          <br>

          /K<br>

          _______________________________________________<br>

          Gluster-users mailing list<br>

          <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

          <a href="https://lists.gluster.org/mailman/listinfo/gluster-users">https://lists.gluster.org/mailman/listinfo/gluster-users</a></blockquote>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div><br></div></div></div>