[Gluster-users] Gluster distributed replicated setup does not serve read from all bricks belonging to the same replica

Anh Vo vtqanh at gmail.com
Sat Nov 24 07:33:47 UTC 2018


Looking at the source (afr-common.c), am I correct that even in hashed
mode, if the hashed brick doesn't have a good copy, it will try the next
brick? I'm curious because your first reply seemed to place some
significance on the part about pending self-heals. Is there anything about
pending self-heals that would have made hashed mode worse, or is it about
as bad as any other brick selection policy?
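
For reference, here is a minimal sketch of the fallback behaviour as I
read it (my own simplification for illustration, not the actual
afr-common.c code; the function and variable names are hypothetical):

/* Hypothetical simplification: try the hash-selected brick first;
 * if it has no good copy, fall back to scanning the children from
 * the beginning for the first readable one. */
static int
read_subvol_by_hash_with_fallback (int hashed_child,
                                   const unsigned char *readable,
                                   int child_count)
{
        int i = 0;

        if (hashed_child >= 0 && readable[hashed_child])
                return hashed_child;    /* hashed brick has a good copy */

        for (i = 0; i < child_count; i++) {
                if (readable[i])
                        return i;       /* first good copy wins */
        }
        return -1;                      /* no readable copy anywhere */
}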

Thanks

On Thu, Nov 22, 2018 at 7:59 PM Ravishankar N <ravishankar at redhat.com>
wrote:

>
>
> On 11/22/2018 07:07 PM, Anh Vo wrote:
>
> Thanks Ravi, I will try that option.
> One question:
> Let's say there are self-heals pending; how would the default of "0"
> have worked? I understand 0 means "first responder". What if the first
> responder doesn't have a good copy? (Say it failed in such a way that
> the dirty attribute wasn't set on its copy, but there are index heals
> pending from the other two sources.)
>
>
> 0 = first readable child of AFR, starting from the 1st child. So if the
> 1st brick doesn't have the good copy, it will try the 2nd brick and so
> on.
> The default value seems to be '1', not '0'. You can look at
> afr_read_subvol_select_by_policy() in the source code to understand the
> selection preference.
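>
> As a rough illustration, here is my own minimal sketch of what that
> "first readable" behaviour amounts to (a simplification, not the real
> function; the names and types here are hypothetical):
>
> /* Sketch of the "first readable child" policy: scan the replica's
>  * children in order and return the first one with a good copy. */
> static int
> read_subvol_first_readable (const unsigned char *readable,
>                             int child_count)
> {
>         int i = 0;
>
>         for (i = 0; i < child_count; i++) {
>                 if (readable[i])
>                         return i;
>         }
>         return -1;  /* no good copy on any child */
> }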
>
> Regards,
> Ravi
>
>
> On Wed, Nov 21, 2018 at 9:57 PM Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> Hi,
>> If there are multiple clients, you can change the
>> 'cluster.read-hash-mode' volume option's value to 2. Then reads from
>> different clients should be served by different bricks. The meanings of
>> the various values of 'cluster.read-hash-mode' can be obtained from
>> `gluster volume set help`. gluster-4.1 has also added a new value [1]
>> for this option. Of course, the assumption is that all bricks host good
>> copies (i.e. there are no self-heals pending).
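>>
>> For example (an illustrative invocation; substitute your volume's name
>> for <VOLNAME>):
>>
>>   # gluster volume set <VOLNAME> cluster.read-hash-mode 2
>>   # gluster volume get <VOLNAME> cluster.read-hash-mode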
>>
>> Hope this helps,
>> Ravi
>>
>> [1]  https://review.gluster.org/#/c/glusterfs/+/19698/
>>
>> On 11/22/2018 10:20 AM, Anh Vo wrote:
>>
>> Hi,
>> Our setup: we have a distributed replicated setup with replica 3. The
>> total number of servers varies between clusters: in some we have a
>> total of 36 servers (12 x 3), in others we have 12 servers (4 x 3).
>> We're using gluster 3.12.15.
>>
>> In all instances, what I am noticing is that only one member of the
>> replica set serves reads for a particular file, even when all members
>> of the replica set are online. We have many large input files (for
>> example, a 150GB zip file), and when 50 clients read from one single
>> server, read performance for that file alone degrades by several orders
>> of magnitude. Shouldn't all members of the replica set participate in
>> serving the read requests?
>>
>> Our options
>>
>> cluster.shd-max-threads: 1
>> cluster.heal-timeout: 900
>> network.inode-lru-limit: 50000
>> performance.md-cache-timeout: 600
>> performance.cache-invalidation: on
>> performance.stat-prefetch: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> cluster.metadata-self-heal: off
>> cluster.entry-self-heal: off
>> cluster.data-self-heal: off
>> features.inode-quota: off
>> features.quota: off
>> transport.listen-backlog: 100
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> performance.strict-o-direct: on
>> network.remote-dio: off
>> server.allow-insecure: on
>> performance.write-behind: off
>> cluster.nufa: disable
>> diagnostics.latency-measurement: on
>> diagnostics.count-fop-hits: on
>> cluster.ensure-durability: off
>> cluster.self-heal-window-size: 32
>> cluster.favorite-child-policy: mtime
>> performance.io-thread-count: 32
>> cluster.eager-lock: off
>> server.outstanding-rpc-limit: 128
>> cluster.rebal-throttle: aggressive
>> server.event-threads: 3
>> client.event-threads: 3
>> performance.cache-size: 6GB
>> cluster.readdir-optimize: on
>> storage.build-pgfid: on
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>

