[Gluster-users] Healing queue rarely empty

Nicolas Ecarnot nicolas at ecarnot.net
Tue Dec 29 15:12:54 UTC 2015


On 17/12/2015 10:51, Nicolas Ecarnot wrote:
> On 17/12/2015 10:10, Nicolas Ecarnot wrote:
>> Hello,
>>
>> Our setup: three CentOS 7.2 nodes, with Gluster 3.7.6 in replica 3,
>> used as storage+compute for an oVirt 3.5.6 DC.
>>
>> Two days ago, we added Nagios/Centreon monitoring that checks the state
>> of the heal queue every 5 minutes
>> (something like "gluster volume heal some_vol info" with the appropriate
>> grep).
>>
>> I expected the "Number of entries" of every node to appear in the graph
>> as a flat zero line most of the time, except for the rare case of a
>> node reboot, after which healing starts and takes some minutes
>> (sometimes hours) but completes fine.
>>
>> Instead, we see 2 or 3 files showing up in the heal queue about
>> 4 times an hour. All day long.
>>
>> Our DC is a small one with few VMs, so no more than 8 big files are
>> stored in GlusterFS.
>> I'm very surprised to see that these files constantly need healing, as I
>> thought I had understood that reads/writes were synchronous at all
>> times, and that replica 3 meant every file was fully synced and
>> committed at all times.
>>
>> I've also read about the 10-minute cron-like job of the self-heal
>> daemon, which we run with its default settings, but that is a
>> secondary point.
>>
>> The first point leads to:
>> - Why do we see such frequent desynchronizations between nodes?
>> - Which logs can I read to confirm this?
>> - What should I check?
>>
>
> Replying to myself, as I found:
> https://www.mail-archive.com/gluster-users%40gluster.org/msg20611.html
>
> Does it make sense to be surprised to see:
>
> gluster volume get data cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      30600
>
> in a 3.7.6 gluster cluster?


Ok, cluster.op-version bumped up, but no improvement.

Opening https://bugzilla.redhat.com/show_bug.cgi?id=1294675
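
For reference, the monitoring check described in the first message
("gluster volume heal some_vol info" plus a grep on "Number of entries")
could be sketched roughly as below. This is only an illustration, not the
actual Nagios/Centreon plugin: the function names are hypothetical, and
the "Brick ..." line layout is an assumption about the command output
that may differ between Gluster versions.

```python
import re
import subprocess

# Matches the per-brick summary line, e.g. "Number of entries: 3".
ENTRY_RE = re.compile(r"^Number of entries:\s*(\S+)")


def parse_heal_info(text):
    """Map each brick to its pending heal count.

    Assumes each brick section starts with a line beginning with "Brick "
    and ends with a "Number of entries: N" line. A non-numeric value
    (assumed to mean the brick could not be queried) is stored as None.
    """
    counts = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            counts.setdefault(brick, 0)
            continue
        match = ENTRY_RE.match(line)
        if match and brick is not None:
            value = match.group(1)
            counts[brick] = int(value) if value.isdigit() else None
    return counts


def heal_counts(volume):
    """Run the heal-info command for one volume and parse its output."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return parse_heal_info(result.stdout)
```

Calling heal_counts("some_vol") every 5 minutes and graphing the value
per brick gives the curve discussed above. Logging the file names listed
between the "Brick" and "Number of entries" lines as well would show
whether the same VM images keep reappearing.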

-- 
Nicolas ECARNOT

