<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">


    <div class="moz-cite-prefix">On 11/24/2018 01:03 PM, Anh Vo wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAPUsBJ0WTWWpBY5kJ0HyFfNeov7_HjWVAPwKueSiwrHEGvO+=w@mail.gmail.com">

      <div dir="ltr">Looking at the source (afr-common.c), even when
        hashed mode is used and the hashed brick doesn't have a
        good copy, it will try the next brick. Am I correct?</div>

    </blockquote>

    That is correct. No matter which brick the policy chooses, if that
    brick is not readable for a given file (i.e. a heal is pending on it
    from the other good bricks), we just iterate from brick-0 and pick
    the first one that is good (i.e. readable).<br>
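    <br>
    As a rough illustration only (this is not the AFR code; the function
    name pick_read_brick and the readable list are made up for the
    example), the fallback described above can be sketched in Python
    like this:<br>
<pre>
```python
from typing import Optional, Sequence


def pick_read_brick(policy_choice: int, readable: Sequence[bool]) -> Optional[int]:
    """Return the index of the brick to read from, or None if none is readable.

    policy_choice: brick index proposed by the read policy (hypothetical input).
    readable: readable[i] is True when brick i holds a good copy of the file.
    """
    # Use the policy's choice when that brick is readable for this file.
    if policy_choice in range(len(readable)) and readable[policy_choice]:
        return policy_choice
    # Otherwise scan from brick 0 and take the first readable brick.
    for index, good in enumerate(readable):
        if good:
            return index
    # No brick holds a good copy.
    return None
```
</pre>
    For example, with three bricks where brick 2 has a heal pending,
    pick_read_brick(2, [True, True, False]) falls back to brick 0.<br>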

    -Ravi<br>

    <blockquote type="cite"

cite="mid:CAPUsBJ0WTWWpBY5kJ0HyFfNeov7_HjWVAPwKueSiwrHEGvO+=w@mail.gmail.com">

      <div dir="ltr"> I'm curious because your first reply seemed to

        place some significance on the part about pending self-heal. Is

        there anything about pending self-heal that would have made

        hashed mode worse, or is it about as bad as any brick selection

        policy?

        <div><br>

        </div>

        <div>Thanks</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr">On Thu, Nov 22, 2018 at 7:59 PM Ravishankar N

          &lt;<a href="mailto:ravishankar@redhat.com"

            moz-do-not-send="true">ravishankar@redhat.com</a>&gt; wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div bgcolor="#FFFFFF">


            <div class="gmail-m_7865974233100434149moz-cite-prefix">On

              11/22/2018 07:07 PM, Anh Vo wrote:<br>

            </div>

            <blockquote type="cite">

              <div dir="ltr">Thanks Ravi, I will try that option.

                <div>One question:</div>

                <div>Let's say there are self-heals pending. How would
                  the default of "0" have worked? I understand 0 means
                  "first responder". What if the first responder doesn't have
                  a good copy (and it failed in such a way that the dirty
                  attribute wasn't set on its copy, but there are index
                  heals pending from the other two sources)?</div>

              </div>

            </blockquote>

            <br>

            0 = first readable child of AFR, starting from the 1st child. So
            if the 1st brick doesn't have a good copy, it will try the 2nd
            brick, and so on.<br>
            The default value seems to be '1', not '0'. You can look at
            afr_read_subvol_select_by_policy() in the source code to
            understand the order of preference in the selection.<br>

            <br>

            Regards,<br>

            Ravi<br>

            <blockquote type="cite"><br>

              <div class="gmail_quote">

                <div dir="ltr">On Wed, Nov 21, 2018 at 9:57 PM

                  Ravishankar N &lt;<a

                    href="mailto:ravishankar@redhat.com" target="_blank"

                    moz-do-not-send="true">ravishankar@redhat.com</a>&gt;

                  wrote:<br>

                </div>

                <blockquote class="gmail_quote" style="margin:0px 0px

                  0px 0.8ex;border-left:1px solid

                  rgb(204,204,204);padding-left:1ex">

                  <div bgcolor="#FFFFFF"> Hi,<br>

                    If there are multiple clients, you can change the
                    'cluster.read-hash-mode' volume option's value to 2.
                    Then reads from different clients should be served
                    from different bricks. The meaning of the various
                    values for 'cluster.read-hash-mode' can be found in
                    `gluster volume set help`. gluster-4.1 has also
                    added a new value[1] to this option. Of course, the
                    assumption is that all bricks host good copies (i.e.
                    there are no self-heals pending).<br>
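                    <br>
                    To illustrate why a per-client hash can spread reads,
                    here is a small Python sketch. It assumes that the
                    help text describes 0 as the first readable child, 1
                    as a hash of the file's GFID, and 2 as a hash of the
                    GFID together with the client PID; please check
                    `gluster volume set help` on your version. The name
                    hashed_brick is hypothetical, zlib.crc32 merely
                    stands in for whatever hash AFR really uses, and the
                    sketch ignores pending heals:<br>
<pre>
```python
import uuid
import zlib


def hashed_brick(gfid: uuid.UUID, client_pid: int, mode: int, replica_count: int) -> int:
    """Return the index of the brick a read would prefer under the given mode.

    Assumed meanings (not taken from the AFR source):
      0: first child, 1: hash of GFID, 2: hash of GFID plus client PID.
    """
    if mode == 0:
        return 0
    if mode == 1:
        key = gfid.bytes
    elif mode == 2:
        key = gfid.bytes + client_pid.to_bytes(8, "little")
    else:
        raise ValueError(f"mode {mode} is not covered by this sketch")
    # zlib.crc32 is a stand-in hash, not the one AFR uses.
    return zlib.crc32(key) % replica_count


if __name__ == "__main__":
    gfid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # made-up GFID
    for mode in (1, 2):
        picks = [hashed_brick(gfid, pid, mode, 3) for pid in (4001, 4002, 4003, 4004)]
        print(mode, picks)
```
</pre>
                    In this sketch, mode 1 maps every client PID to the
                    same brick for a given file, while mode 2 makes the
                    PID part of the key, so different clients can land
                    on different bricks.<br>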

                    <br>

                    Hope this helps,<br>

                    Ravi<br>

                    <br>

                    [1]  <a

class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-freetext"

href="https://review.gluster.org/#/c/glusterfs/+/19698/" target="_blank"

                      moz-do-not-send="true">https://review.gluster.org/#/c/glusterfs/+/19698/</a><br>

                    <br>

                    <div

                      class="gmail-m_7865974233100434149m_-705483741577722289moz-cite-prefix">On

                      11/22/2018 10:20 AM, Anh Vo wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div dir="ltr">Hi,

                          <div>Our setup: We have a distributed-replicated
                            setup with replica 3. The total
                            number of servers varies between clusters:
                            in some cases we have a total of 36 (12 x 3)
                            servers, in others we have 12 servers
                            (4 x 3). We're using gluster 3.12.15.</div>

                          <div><br>

                          </div>

                          <div>In all instances, what I am noticing is
                            that only one member of the replica set is
                            serving reads for a particular file, even
                            when all the members of the replica set are
                            online. We have many large input files (for
                            example, a 150GB zip file), and when there are
                            50 clients reading from one single server,
                            the performance degrades by several orders of
                            magnitude for reads of that file only.
                            Shouldn't all members of the replica set
                            participate in serving the read requests?</div>

                          <div><br>

                          </div>

                          <div>Our options</div>

                          <div><br>

                          </div>

                          <div>cluster.shd-max-threads: 1</div>

                          <div>cluster.heal-timeout: 900</div>

                          <div>network.inode-lru-limit: 50000</div>

                          <div>performance.md-cache-timeout: 600</div>

                          <div>performance.cache-invalidation: on</div>

                          <div>performance.stat-prefetch: on</div>

                          <div>features.cache-invalidation-timeout: 600</div>

                          <div>features.cache-invalidation: on</div>

                          <div>cluster.metadata-self-heal: off</div>

                          <div>cluster.entry-self-heal: off</div>

                          <div>cluster.data-self-heal: off</div>

                          <div>features.inode-quota: off</div>

                          <div>features.quota: off</div>

                          <div>transport.listen-backlog: 100</div>

                          <div>transport.address-family: inet</div>

                          <div>performance.readdir-ahead: on</div>

                          <div>nfs.disable: on</div>

                          <div>performance.strict-o-direct: on</div>

                          <div>network.remote-dio: off</div>

                          <div>server.allow-insecure: on</div>

                          <div>performance.write-behind: off</div>

                          <div>cluster.nufa: disable</div>

                          <div>diagnostics.latency-measurement: on</div>

                          <div>diagnostics.count-fop-hits: on</div>

                          <div>cluster.ensure-durability: off</div>

                          <div>cluster.self-heal-window-size: 32</div>

                          <div>cluster.favorite-child-policy: mtime</div>

                          <div>performance.io-thread-count: 32</div>

                          <div>cluster.eager-lock: off</div>

                          <div>server.outstanding-rpc-limit: 128</div>

                          <div>cluster.rebal-throttle: aggressive</div>

                          <div>server.event-threads: 3</div>

                          <div>client.event-threads: 3</div>

                          <div>performance.cache-size: 6GB</div>

                          <div>cluster.readdir-optimize: on</div>

                          <div>storage.build-pgfid: on</div>


                        </div>

                      </div>

                      <br>


                      <pre>_______________________________________________

Gluster-users mailing list

<a class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org" target="_blank" moz-do-not-send="true">Gluster-users@gluster.org</a>

<a class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-freetext" href="https://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank" moz-do-not-send="true">https://lists.gluster.org/mailman/listinfo/gluster-users</a></pre>

                    </blockquote>

                    <br>

                  </div>

                </blockquote>

              </div>

            </blockquote>

            <br>

          </div>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </body>

</html>