[Gluster-users] halo not work as desired!!!

atris adam atris.adam at gmail.com
Tue Feb 6 11:37:24 UTC 2018


I have investigated further and mounted the volume in another region
(region C). The ping times from region C are as follows:

ping 10.0.0.1 & 10.0.0.3: below time=12 ms
ping 10.0.0.5 & 10.0.0.6: more than time=32 ms

I expect the bricks with the lower ping time to be selected at write time,
but the brick selection is still not as desired: the bricks with the higher
ping time are selected. I changed cluster.halo-max-latency to 20, but this
had no effect.
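
To make the expectation concrete, here is a minimal sketch of the selection
rule I am assuming. It is not taken from the GlusterFS source: the function
name select_bricks, the rounded ping values, and the rule itself (mark the
worst brick down while it exceeds the threshold and more than
halo_min_replicas bricks remain up, as the debug message in my previous mail
suggests) are all my assumptions.

```python
def select_bricks(latencies_ms, max_latency_ms, min_replicas):
    """Return the bricks that stay up under the ASSUMED halo rule.

    Assumption (not verified against the GlusterFS code): while more than
    min_replicas bricks are up, the brick with the worst latency is marked
    down if its latency exceeds max_latency_ms.
    """
    up = dict(latencies_ms)
    while len(up) > min_replicas:
        worst = max(up, key=up.get)       # brick with the highest latency
        if up[worst] <= max_latency_ms:   # worst brick is within threshold
            break
        del up[worst]                     # mark the worst brick down
    return sorted(up)


# Approximate ping times seen from region C (values rounded for illustration).
region_c = {
    "10.0.0.1": 12.0,
    "10.0.0.3": 12.0,
    "10.0.0.5": 32.0,
    "10.0.0.6": 32.0,
}

# cluster.halo-max-latency = 20, halo_min_replicas = 2
expected = select_bricks(region_c, max_latency_ms=20, min_replicas=2)
print(expected)
```

Under this assumed rule I would expect writes from region C to go to
10.0.0.1 and 10.0.0.3, which is the opposite of what I observe.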

One more thing: the conclusion in my previous email was not right. I
thought that changing the range to [0 - 99999] would make everything OK,
but my experience today shows that I was wrong.


Any help will be appreciated ;)

On Mon, Feb 5, 2018 at 4:04 PM, atris adam <atris.adam at gmail.com> wrote:

> I have mounted the halo glusterfs volume in debug mode, and the output is
> as follows:
> .
> .
> .
> [2018-02-05 11:42:48.282473] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk]
> 0-test-halo-client-1: Ping latency is 0ms
> [2018-02-05 11:42:48.282502] D [MSGID: 0] [afr-common.c:5025:afr_get_halo_latency]
> 0-test-halo-replicate-0: Using halo latency 10
> [2018-02-05 11:42:48.282525] D [MSGID: 0] [afr-common.c:4820:__afr_handle_ping_event]
> 0-test-halo-client-1: Client ping @ 140032933708544 ms
> .
> .
> .
> [2018-02-05 11:42:48.393776] D [MSGID: 0] [afr-common.c:4803:find_worst_up_child]
> 0-test-halo-replicate-0: Found worst up child (1) @ 140032933708544 ms
> latency
> [2018-02-05 11:42:48.393803] D [MSGID: 0] [afr-common.c:4903:__afr_handle_child_up_event]
> 0-test-halo-replicate-0: Marking child 1 down, doesn't meet halo threshold
> (10), and > halo_min_replicas (2)
> .
> .
> .
>
> I think this debug output means:
> Although the ping time for test-halo-client-1 (brick2) is 0.5 ms, halo
> treats it as not being under the halo threshold (10 ms), and so it makes
> the wrong decision when selecting bricks.
> I cannot set the halo threshold to 0 because:
>
> #gluster vol set test-halo cluster.halo-max-latency 0
> volume set: failed: '0' in 'option halo-max-latency 0' is out of range [1
> - 99999]
>
> So I think the range [1 - 99999] should be changed to [0 - 99999] so that
> I can get the desired brick selection from the halo feature. Am I right?
> If not, why does halo decide to mark down the best brick, which has a ping
> time below 0.5 ms?
>
> On Sun, Feb 4, 2018 at 2:27 PM, atris adam <atris.adam at gmail.com> wrote:
>
>> I have 2 data centers in two different regions, and each DC has 3
>> servers. I have created a glusterfs volume with 4 replicas. This is the
>> glusterfs volume info output:
>>
>>
>> Volume Name: test-halo
>> Type: Replicate
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 4 = 4
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.0.0.1:/mnt/test1
>> Brick2: 10.0.0.3:/mnt/test2
>> Brick3: 10.0.0.5:/mnt/test3
>> Brick4: 10.0.0.6:/mnt/test4
>> Options Reconfigured:
>> cluster.halo-shd-max-latency: 5
>> cluster.halo-max-latency: 10
>> cluster.quorum-count: 2
>> cluster.quorum-type: fixed
>> cluster.halo-enabled: yes
>> transport.address-family: inet
>> nfs.disable: on
>>
>> The bricks with IPs 10.0.0.1 & 10.0.0.3 are in region A, and the bricks
>> with IPs 10.0.0.5 & 10.0.0.6 are in region B.
>>
>>
>> When I mount the volume in region A, I expect the data to be stored first
>> on brick1 & brick2, and then to be copied asynchronously to region B, on
>> brick3 & brick4.
>>
>> Am I right? Is this what halo claims?
>>
>> If yes, unfortunately, this does not happen for me. No matter whether I
>> mount the volume in region A or in region B, all the data is copied to
>> brick3 & brick4 and no data is copied to brick1 & brick2.
>>
>> The ping times to the brick IPs from region A are as follows:
>> ping 10.0.0.1 & 10.0.0.3: below time=0.500 ms
>> ping 10.0.0.5 & 10.0.0.6: more than time=20 ms
>>
>> What is the logic by which halo selects the bricks to write to? If it is
>> the access time, then when I mount the volume in region A, the ping time
>> to brick1 & brick2 is below 0.5 ms, but halo selects brick3 &
>> brick4!!!!
>>
>> glusterfs version is:
>> glusterfs 3.12.4
>>
>> I really need to work with the halo feature, but I have not been able to
>> get this case working. Can anyone help me soon??
>>
>>
>> Thanks a lot
>>
>
>