<div dir="ltr">Looking at the source (afr-common.c), it seems that even in hashed mode, if the hashed brick doesn't have a good copy, it will try the next brick. Am I correct? I'm curious because your first reply seemed to place some significance on pending self-heals. Is there anything about pending self-heals that would have made hashed mode worse, or is it about as bad as any other brick-selection policy?<div><br></div><div>Thanks</div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Nov 22, 2018 at 7:59 PM Ravishankar N &lt;<a href="mailto:ravishankar@redhat.com">ravishankar@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p><br>
</p>
<br>
<div class="gmail-m_7865974233100434149moz-cite-prefix">On 11/22/2018 07:07 PM, Anh Vo wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Thanks Ravi, I will try that option.
<div>One question:</div>
<div>Let's say there are self-heals pending; how would the
default of "0" have worked? I understand 0 means "first
responder." What if the first responder doesn't have a good copy?
(Say it failed in such a way that the dirty attribute wasn't
set on its copy, but there are index heals pending from the
other two sources.)</div>
</div>
</blockquote>
<br>
0 = first readable child of AFR, starting from the 1st child. So if the
1st brick doesn't have a good copy, it will try the 2nd brick, and so
on. <br>
(The default value seems to be '1', not '0'.) You can look at
afr_read_subvol_select_by_policy() in the source code to understand
the order of preference in the selection.<br>
<br>
Regards,<br>
Ravi<br>
<blockquote type="cite"><br>
<div class="gmail_quote">
<div dir="ltr">On Wed, Nov 21, 2018 at 9:57 PM Ravishankar N
<<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> Hi,<br>
If there are multiple clients, you can change the
'cluster.read-hash-mode' volume option's value to 2. Then
reads should be served from different bricks for
different clients. The meanings of the various values for
'cluster.read-hash-mode' can be found in `gluster volume set
help`. gluster-4.1 has also added a new value [1] for this
option. Of course, the assumption is that all bricks host
good copies (i.e. there are no pending self-heals).<br>
<br>
Hope this helps,<br>
Ravi<br>
<br>
[1] <a class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-freetext" href="https://review.gluster.org/#/c/glusterfs/+/19698/" target="_blank">https://review.gluster.org/#/c/glusterfs/+/19698/</a><br>
<br>
<div class="gmail-m_7865974233100434149m_-705483741577722289moz-cite-prefix">On
11/22/2018 10:20 AM, Anh Vo wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">Hi,
<div>Our setup: We have a distributed-replicated setup
with replica 3. The total number of servers varies
between clusters: in some cases we have a total of
36 (12 x 3) servers, in others we have 12
(4 x 3). We're using gluster 3.12.15.</div>
<div><br>
</div>
<div>In all instances, what I am noticing is that only
one member of the replica set serves reads for a
particular file, even when all members of the
replica set are online. We have many large input
files (for example, a 150GB zip file), and when
50 clients read from one single server, the
performance degrades by several orders of magnitude for
reading that file. Shouldn't all members of the
replica set participate in serving the read requests?</div>
<div><br>
</div>
<div>Our options</div>
<div><br>
</div>
<div>cluster.shd-max-threads: 1</div>
<div>cluster.heal-timeout: 900</div>
<div>network.inode-lru-limit: 50000</div>
<div>performance.md-cache-timeout: 600</div>
<div>performance.cache-invalidation: on</div>
<div>performance.stat-prefetch: on</div>
<div>features.cache-invalidation-timeout: 600</div>
<div>features.cache-invalidation: on</div>
<div>cluster.metadata-self-heal: off</div>
<div>cluster.entry-self-heal: off</div>
<div>cluster.data-self-heal: off</div>
<div>features.inode-quota: off</div>
<div>features.quota: off</div>
<div>transport.listen-backlog: 100</div>
<div>transport.address-family: inet</div>
<div>performance.readdir-ahead: on</div>
<div>nfs.disable: on</div>
<div>performance.strict-o-direct: on</div>
<div>network.remote-dio: off</div>
<div>server.allow-insecure: on</div>
<div>performance.write-behind: off</div>
<div>cluster.nufa: disable</div>
<div>diagnostics.latency-measurement: on</div>
<div>diagnostics.count-fop-hits: on</div>
<div>cluster.ensure-durability: off</div>
<div>cluster.self-heal-window-size: 32</div>
<div>cluster.favorite-child-policy: mtime</div>
<div>performance.io-thread-count: 32</div>
<div>cluster.eager-lock: off</div>
<div>server.outstanding-rpc-limit: 128</div>
<div>cluster.rebal-throttle: aggressive</div>
<div>server.event-threads: 3</div>
<div>client.event-threads: 3</div>
<div>performance.cache-size: 6GB</div>
<div>cluster.readdir-optimize: on</div>
<div>storage.build-pgfid: on</div>
<div><br>
</div>
</div>
</div>
<br>
<fieldset class="gmail-m_7865974233100434149m_-705483741577722289mimeAttachmentHeader"></fieldset>
<br>
<pre>_______________________________________________
Gluster-users mailing list
<a class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a class="gmail-m_7865974233100434149m_-705483741577722289moz-txt-link-freetext" href="https://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote></div>