<div dir="ltr">Looking at the source (afr-common.c) even in the case of using hashed mode and the hashed brick doesn&#39;t have a good copy it will try the next brick am I correct? I&#39;m curious because your first reply seemed to place some significance on the part about pending self-heal. Is there anything about pending self-heal that would have made hashed mode worse, or is it about as bad as any brick selection policy?<div><br></div><div>Thanks</div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Nov 22, 2018 at 7:59 PM Ravishankar N &lt;<a href="mailto:ravishankar@redhat.com">ravishankar@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    <div class="gmail-m_7865974233100434149moz-cite-prefix">On 11/22/2018 07:07 PM, Anh Vo wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">Thanks Ravi, I will try that option.
        <div>One question:</div>
        <div>Let&#39;s say there are self heal pending, how would the
          default of &quot;0&quot; have worked? I understand 0 means &quot;first
          responder&quot; What if first responder doesn&#39;t have good copy?
          (and it failed in such a way that the dirty attribute wasn&#39;t
          set on its copy - but there are index heal pending from the
          other two sources)</div>
      </div>
    </blockquote>
0 = first readable child of AFR, starting from the 1st child. So if the 1st brick doesn't have the good copy, it will try the 2nd brick, and so on.
The default value seems to be '1', not '0'. You can look at afr_read_subvol_select_by_policy() in the source code to understand the order of preference.
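In case it helps, the fallback described above boils down to "start at the preferred child and take the first one that is readable". The sketch below is only an illustration of that idea, not the actual afr_read_subvol_select_by_policy() code; the readable[] bitmap, the function name and the example values are made up.

#include <stdio.h>

/* Illustrative sketch only, not GlusterFS code: return the first child
 * holding a good (readable) copy, starting from a preferred index and
 * wrapping around; -1 if no child has a good copy. */
static int
pick_first_readable_child(const unsigned char *readable, int child_count,
                          int preferred)
{
    for (int i = 0; i < child_count; i++) {
        int child = (preferred + i) % child_count;
        if (readable[child])
            return child;
    }
    return -1;
}

int
main(void)
{
    /* replica 3: child 0 has a heal pending, children 1 and 2 are good */
    unsigned char readable[3] = {0, 1, 1};

    /* preferring child 0 still ends up reading from child 1 */
    printf("read served from child %d\n",
           pick_first_readable_child(readable, 3, 0));
    return 0;
}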
Regards,
Ravi
    <blockquote type="cite"><br>
      <div class="gmail_quote">
        <div dir="ltr">On Wed, Nov 21, 2018 at 9:57 PM Ravishankar N
          &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt; wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF"> Hi,<br>
            If there are multiple clients , you can change the
            &#39;cluster.read-hash-mode&#39; volume option&#39;s value to 2. Then
            different reads should be served from different bricks for
            different clients. The meaning of various values for
            &#39;cluster.read-hash-mode&#39; can be got from `gluster volume set
            help`. gluster-4.1 also has added a new value[1] to this
            option. Of course, the assumption is that all bricks host
            good copies (i.e. there are no self-heals pending).<br>
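Conceptually, the hashed modes derive the read child from a hash that mixes in something per-file and, for value 2, something per-client, so different clients spread across the bricks that hold a good copy. The sketch below is purely illustrative and not the real GlusterFS implementation; the function name, the use of the client PID and the readable[] bitmap are all invented for the example.

#include <stdio.h>
#include <unistd.h>

/* Conceptual sketch of hashed read-child selection, not GlusterFS code:
 * derive a deterministic starting index from the file identity and a
 * client-specific value, then take the first child from that point that
 * holds a good copy. Each client sticks to one brick for a given file,
 * but different clients land on different bricks. */
static int
pick_read_child_by_hash(unsigned long file_id, unsigned long client_id,
                        const unsigned char *readable, int child_count)
{
    int start = (int)((file_id ^ client_id) % (unsigned long)child_count);

    for (int i = 0; i < child_count; i++) {
        int child = (start + i) % child_count;
        if (readable[child])
            return child;
    }
    return -1;   /* no good copy anywhere */
}

int
main(void)
{
    unsigned char readable[3] = {1, 1, 1};   /* all bricks hold good copies */
    unsigned long file_id = 0x5eed;          /* stand-in for the file's GFID */

    printf("this client reads from child %d\n",
           pick_read_child_by_hash(file_id, (unsigned long)getpid(),
                                   readable, 3));
    return 0;
}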
Hope this helps,
Ravi

[1] https://review.gluster.org/#/c/glusterfs/+/19698/
            <div class="gmail-m_7865974233100434149m_-705483741577722289moz-cite-prefix">On
              11/22/2018 10:20 AM, Anh Vo wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div dir="ltr">Hi,
                  <div>Our setup: We have a distributed replicated setup
                    of 3 replica. The total number of servers varies
                    between clusters, in some cases we have a total of
                    36 (12 x 3) servers, in some of them we have 12
                    servers (4 x 3). We&#39;re using gluster 3.12.15</div>
                  <div><br>
                  </div>
In all instances, what I am noticing is that only one member of the replica set is serving reads for a particular file, even when all members of the replica set are online. We have many large input files (for example, a 150GB zip file), and when there are 50 clients reading from one single server, the read performance for that file degrades by several orders of magnitude. Shouldn't all members of the replica set participate in serving the read requests?
Our options:
cluster.shd-max-threads: 1
cluster.heal-timeout: 900
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.metadata-self-heal: off
cluster.entry-self-heal: off
cluster.data-self-heal: off
features.inode-quota: off
features.quota: off
transport.listen-backlog: 100
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
performance.strict-o-direct: on
network.remote-dio: off
server.allow-insecure: on
performance.write-behind: off
cluster.nufa: disable
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
cluster.ensure-durability: off
cluster.self-heal-window-size: 32
cluster.favorite-child-policy: mtime
performance.io-thread-count: 32
cluster.eager-lock: off
server.outstanding-rpc-limit: 128
cluster.rebal-throttle: aggressive
server.event-threads: 3
client.event-threads: 3
performance.cache-size: 6GB
cluster.readdir-optimize: on
storage.build-pgfid: on
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users