[Bugs] [Bug 1745026] New: endless heal gluster volume; incrementing number of files to heal when all peers in volume are up
bugzilla at redhat.com
Fri Aug 23 13:52:36 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1745026
Bug ID: 1745026
Summary: endless heal gluster volume; incrementing number of
files to heal when all peers in volume are up
Product: GlusterFS
Version: 4.1
Hardware: x86_64
OS: Linux
Status: NEW
Component: fuse
Severity: high
Assignee: bugs at gluster.org
Reporter: tvanberlo at vangenechten.com
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
Description of problem:
The number of files that need healing keeps increasing while gluster is healing
(the number of entries goes up after a heal has already started).
Version-Release number of selected component (if applicable):
glusterfs.x86_64 6.4-1.el7 installed
glusterfs-api.x86_64 6.4-1.el7 installed
glusterfs-cli.x86_64 6.4-1.el7 installed
glusterfs-client-xlators.x86_64 6.4-1.el7 installed
glusterfs-events.x86_64 6.4-1.el7 installed
glusterfs-fuse.x86_64 6.4-1.el7 installed
glusterfs-geo-replication.x86_64 6.4-1.el7 installed
glusterfs-libs.x86_64 6.4-1.el7 installed
glusterfs-rdma.x86_64 6.4-1.el7 installed
glusterfs-server.x86_64 6.4-1.el7 installed
libvirt-daemon-driver-storage-gluster.x86_64 4.5.0-10.el7_6.12 installed
python2-gluster.x86_64 6.4-1.el7 installed
vdsm-gluster.x86_64 4.30.24-1.el7 installed
How reproducible:
In our gluster cluster (replica 3 with 1 arbiter) it is sufficient to reboot a
node of the cluster. When the node is back online and the heal has started,
more files are added to the 'files that need healing' list
('gluster volume heal ${volumeName} info | grep entries').
Steps to Reproduce:
1. Reboot a node in the gluster cluster.
2. Check with 'gluster peer status' on all nodes that every node is connected
(if not, stop firewalld, wait until every node is connected, then start
firewalld again).
3. Wait 10 minutes or trigger the heal manually: 'gluster volume heal
${volumeName}'.
Actual results:
The list of files that need healing keeps growing ('gluster volume heal
${volumeName} info | grep entries').
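For reference, a minimal loop we use to watch the entry counts while the heal
is running (the volume name 'data' and the 30 second interval are just
examples):
```
# print the pending heal entry count per brick every 30 seconds;
# the totals should shrink, but in our case they keep growing
volumeName='data'
while true; do
    date
    gluster volume heal ${volumeName} info | grep 'Number of entries'
    sleep 30
done
```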
Expected results:
The list of files should decrease continuously, because the gluster FUSE client
should write to all members of the gluster cluster.
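To check whether the FUSE client is in fact still connected to every brick,
something like the following can be used (a sketch only; the exact log message
wording and the log file name depend on the gluster version and the mount
point):
```
# list the clients currently connected to each brick of the volume
gluster volume status data clients

# on the hypervisor, look for disconnect messages in the fuse mount log
# (the log file name is derived from the mount point path)
grep -i 'disconnected from' /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log | tail
```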
Additional info:
Fix:
To fix this situation we execute the following steps on our oVirt cluster.
The gluster volume has to be remounted; depending on how the storage domain is
used in oVirt this is done differently.
* data volume (the volume all the VMs are running on): one by one, put every
host/hypervisor into maintenance mode and activate it again (this unmounts and
remounts the data volume on that host).
* engine volume (the volume the hosted engine is running on):
  - on every host not running the engine, find the systemd scope the engine
mount is running under, and restart it
  - migrate the engine to another host and execute the same steps on the
hypervisor the engine migrated from
  - commands to use for finding the correct scope and restarting it:
```
[root@compute0103 ~]# volname='engine'
[root@compute0103 ~]# systemctl list-units | grep rhev | grep scope | grep ${volname}
run-17819.scope    loaded active running   /usr/bin/mount -t glusterfs -o
backup-volfile-servers=compute0103.priv.domain.com:compute0104.priv.domain.com
compute0102.priv.domain.com:/engine
/rhev/data-center/mnt/glusterSD/compute0102.priv.domain.com:_engine
[root@compute0103 ~]# systemctl restart run-17819.scope
```
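After restarting the scope we verify the remount and check that the heal
counters are going down again, roughly like this (hostnames and the 'engine'
volume name are the ones from the example above):
```
# check that the engine volume is mounted again on this host
mount | grep ':_engine'

# the pending heal entry counts should now decrease steadily
gluster volume heal engine info | grep 'Number of entries'
```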
```
[root@compute0103 ~]# gluster volume info data
Volume Name: data
Type: Replicate
Volume ID: 404ec6b1-731c-4e65-a07f-4ca646054eb4
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: compute0102.priv.vangenechten.com:/gluster_bricks/data/data
Brick2: compute0103.priv.vangenechten.com:/gluster_bricks/data/data
Brick3: compute0104.priv.vangenechten.com:/gluster_bricks/data/data (arbiter)
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
features.read-only: off
features.barrier: disable
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
cluster.choose-local: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
transport.address-family: inet
performance.client-io-threads: on
nfs.disable: on
disperse.shd-wait-qlength: 1024
storage.build-pgfid: on
```