<div dir="ltr"><div>Ravi/Karthick,<br><br></div>If one of the self-heal daemon processes is down, will the statistics heal-count command work?<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 4, 2017 at 7:24 PM, lejeczek <span dir="ltr">&lt;<a href="mailto:peljasz@yahoo.co.uk" target="_blank">peljasz@yahoo.co.uk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">1) One peer, out of four, got separated from the network and from the rest of the cluster.<br>

2) That peer, while still unavailable, was detached with the &quot;gluster peer detach&quot; command, which succeeded, so the cluster now comprises three peers.<br>

3) The self-heal daemon (for some reason) does not start, even after an attempt to restart glusterd, on the peer which probed that fourth peer.<br>

4) The fourth, unavailable peer is still up &amp; running but is inaccessible to the other peers because its network is disconnected (segmented). That peer&#39;s gluster status shows it as still in the cluster.<br>

5) So neither gluster nor any other process on the fourth peer failed or crashed; only its network got disconnected.<br>

6) Peer status shows OK &amp; connected for the current three peers.<br>

<br>

This is the third time this has happened to me, in the very same way: each time, once the network-disconnected peer was brought back online, statistics &amp; details worked again.<br>

<br>

Can you not reproduce it?<br>

<br>

$ gluster vol info QEMU-VMs<br>

<br>

Volume Name: QEMU-VMs<br>

Type: Replicate<br>

Volume ID: 8709782a-daa5-4434-a816-c4e0aef8fef2<br>

Status: Started<br>

Snapshot Count: 0<br>

Number of Bricks: 1 x 3 = 3<br>

Transport-type: tcp<br>

Bricks:<br>

Brick1: 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs<br>

Brick2: 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs<br>

Brick3: 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs<br>

Options Reconfigured:<br>

transport.address-family: inet<br>

nfs.disable: on<br>

storage.owner-gid: 107<br>

storage.owner-uid: 107<br>

performance.readdir-ahead: on<br>

geo-replication.indexing: on<br>

geo-replication.ignore-pid-check: on<br>

changelog.changelog: on<br>

<br>

$ gluster vol status QEMU-VMs<br>

Status of volume: QEMU-VMs<br>

Gluster process                             TCP Port  RDMA Port Online  Pid<br>

------------------------------------------------------------------------------<br>

Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUS<br>

TERs/0GLUSTER-QEMU-VMs                      49156     0 Y       9302<br>

Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUS<br>

TERs/0GLUSTER-QEMU-VMs                      49156     0 Y       7610<br>

Brick 10.5.6.100:/__.aLocalStorages/0/0-GLU<br>

STERs/0GLUSTER-QEMU-VMs                     49156     0 Y       11013<br>

Self-heal Daemon on localhost               N/A       N/A Y       3069276<br>

Self-heal Daemon on 10.5.6.32               N/A       N/A Y       3315870<br>

Self-heal Daemon on 10.5.6.49               N/A       N/A N       N/A  &lt;--- HERE<br>

Self-heal Daemon on 10.5.6.17               N/A       N/A Y       5163<br>

<br>

Task Status of Volume QEMU-VMs<br>

------------------------------------------------------------------------------<br>

There are no active volume tasks<br>

<br>

$ gluster vol heal QEMU-VMs statistics heal-count<br>

Gathering count of entries to be healed on volume QEMU-VMs has been unsuccessful on bricks that are down. Please check if all brick processes are running.<span class=""><br>

<br>

<br>

<br>

On 04/09/17 11:47, Atin Mukherjee wrote:<br>

</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">

Please provide the output of gluster volume info, gluster volume status and gluster peer status.<br>

<br></span><div><div class="h5">

On Mon, Sep 4, 2017 at 4:07 PM, lejeczek &lt;<a href="mailto:peljasz@yahoo.co.uk" target="_blank">peljasz@yahoo.co.uk</a>&gt; wrote:<br>

<br>

    hi all<br>

<br>

    this:<br>

    $ gluster vol heal $_vol info<br>

    outputs ok and exit code is 0<br>

    But if I want to see statistics:<br>

    $ gluster vol heal $_vol statistics<br>

    Gathering crawl statistics on volume GROUP-WORK has<br>

    been unsuccessful on bricks that are down. Please<br>

    check if all brick processes are running.<br>

<br>

    I suspect gluster is unable to cope with a situation<br>

    where one peer (which does not even host a brick for a<br>

    single vol on the cluster!) is inaccessible to the rest<br>

    of the cluster.<br>

    I have not played with any other variations of this<br>

    case, e.g. more than one peer going down, etc.<br>

    But I hope someone could try to replicate this simple<br>

    test case.<br>

<br>

    Cluster and vols, when something like this happens,<br>

    seem accessible and as such &quot;all&quot; works, except when<br>

    you want more details.<br>

    This also fails:<br>

    $ gluster vol status $_vol detail<br>

    Error : Request timed out<br>

<br>

    My gluster (3.10.5-1.el7.x86_64) exhibits these<br>

    symptoms every time at least one peer goes out of<br>

    reach of the rest.<br>

<br>

    maybe @devel can comment?<br>

<br>

    many thanks, L.<br>

    ______________________________<wbr>_________________<br>

    Gluster-users mailing list<br>

    <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br></div></div>


    <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailman/listinfo/gluster-users</a><br>

    &lt;<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mail<wbr>man/listinfo/gluster-users</a>&gt;<br>

<br>

<br>

</blockquote>

<br>

</blockquote></div><br></div>
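<div dir="ltr">A minimal sketch for spotting the situation shown in the quoted status output, where one Self-heal Daemon row reports Online = N. It assumes the row layout printed by the gluster version in this thread (Self-heal Daemon on HOST, then TCP Port, RDMA Port, Online, Pid). The function name offline_shd_hosts is hypothetical and not part of gluster, and the sample rows are copied from the quoted output with the &quot;HERE&quot; annotation removed.<br></div>

<pre>
```shell
#!/bin/sh
# Hypothetical helper (not a gluster command): read `gluster vol status`
# text on stdin and print every host whose Self-heal Daemon is offline.
offline_shd_hosts() {
    # Assumed row layout: Self-heal Daemon on HOST  N/A  N/A  Y|N  PID
    # so HOST is field 4 and the Online flag is the second-to-last field.
    awk '$1 == "Self-heal" {
        if ($(NF-1) == "N") print $4
    }'
}

# Sample rows copied from the status output quoted in this thread.
sample='Self-heal Daemon on localhost               N/A       N/A Y       3069276
Self-heal Daemon on 10.5.6.32               N/A       N/A Y       3315870
Self-heal Daemon on 10.5.6.49               N/A       N/A N       N/A
Self-heal Daemon on 10.5.6.17               N/A       N/A Y       5163'

# Prints the one host whose daemon is down: 10.5.6.49
printf '%s\n' "$sample" | offline_shd_hosts
```
</pre>

<div dir="ltr">On a live node the same function could be fed from the output of &quot;gluster vol status QEMU-VMs&quot;. Whether heal-count fails because of the offline self-heal daemon is still the open question here; the sketch only identifies which host to look at.<br></div>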