[Gluster-users] Healing queue rarely empty
Nicolas Ecarnot
nicolas at ecarnot.net
Thu Dec 17 09:10:30 UTC 2015
Hello,
Our setup: 3 CentOS 7.2 nodes, with Gluster 3.7.6 in replica-3, used as
storage+compute for an oVirt 3.5.6 DC.
Two days ago, we added some Nagios/Centreon monitoring that checks the
state of the heal queue every 5 minutes
(something like "gluster volume heal some_vol info" with the adequate grep).
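
A minimal sketch of such a check, in Python rather than grep. It assumes
the heal info output prints one "Brick ..." line per brick followed by a
"Number of entries: N" line; the function names, host names and sample
text below are hypothetical, not our actual plugin.

```python
import re
import subprocess

# Hypothetical sample of "gluster volume heal some_vol info" output;
# host names and paths are made up for illustration.
SAMPLE = """\
Brick node1:/gluster/some_vol/brick
/images/vm1.img
/images/vm2.img
Number of entries: 2

Brick node2:/gluster/some_vol/brick
Number of entries: 0

Brick node3:/gluster/some_vol/brick
Number of entries: 0
"""

def parse_heal_info(text):
    """Return {brick: pending entries} parsed from heal info output."""
    counts = {}
    brick = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            continue
        # A non-numeric count (e.g. for an unreachable brick) is skipped,
        # so that brick is simply absent from the result.
        match = re.match(r"Number of entries:\s*(\d+)$", line)
        if match and brick is not None:
            counts[brick] = int(match.group(1))
    return counts

def heal_counts(volume):
    """Run the gluster CLI for one volume and parse its output."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return parse_heal_info(result.stdout)

if __name__ == "__main__":
    for brick, pending in parse_heal_info(SAMPLE).items():
        print(brick, pending)
```

On a real node, heal_counts("some_vol") would replace the sample text;
the per-brick values are what ends up in the graph.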
I expected the "Number of entries" of every node to appear in the graph
as a flat zero line most of the time, except in the rare case of a
node reboot, after which healing is launched and takes some minutes
(sometimes hours) but completes fine.
Instead, we see that the heal queue contains 2 or 3 files to heal,
about 4 times an hour, all day long.
Our DC is a small one with few VMs, so no more than 8 big
files are stored in GlusterFS.
I'm very surprised to see that these files constantly need healing, as I
thought I had understood that reads/writes were always synchronous,
and that replica-3 meant that every file was fully synced and committed
at all times.
I've also read about the self-heal daemon's 10-minute cron-like job,
which we use with its default setting, but that is a second point.
The first point leads to these questions:
- Why do we see such frequent desynchronization between nodes?
- Which logs can I read to confirm it?
- What should I check?
--
Nicolas ECARNOT