[Gluster-users] GlusterFS 9.3 - Replicate Volume (2 Bricks / 1 Arbiter) - Self-healing does not always work

Thorsten Walk darkiop at gmail.com
Fri Oct 29 06:57:47 UTC 2021

Hello GlusterFS Community,

I am using GlusterFS version 9.3 on two Intel NUCs plus a Raspberry Pi as
arbiter for a replicate volume. The whole setup serves as distributed
storage for a Proxmox cluster.

I am staying on version 9.3 because I could not find a more recent ARM
package for the Pi (running Debian 11).

The partitions backing the volume:

nvme0n1                      259:0    0 465.8G  0 disk
└─vg_glusterfs-lv_glusterfs  253:18   0 465.8G  0 lvm  /data/glusterfs

nvme0n1                      259:0    0 465.8G  0 disk
└─vg_glusterfs-lv_glusterfs  253:14   0 465.8G  0 lvm  /data/glusterfs

sda           8:0    1 29.8G  0 disk
└─sda1        8:1    1 29.8G  0 part /data/glusterfs

The volume was created with:

mkfs.xfs -f -i size=512 -n size=8192 -d su=128K,sw=10 -L GlusterFS

gluster volume create glusterfs-1-volume transport tcp replica 3 arbiter 1
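(The full create command of course included the brick list; a sketch with
hypothetical hostnames and brick paths -- nuc1, nuc2 and rpi are
placeholders, not my real node names -- would look roughly like this:)

```shell
# Hypothetical hostnames/brick paths -- substitute your own nodes.
# The third brick becomes the arbiter: it stores metadata only, no file data.
gluster volume create glusterfs-1-volume transport tcp replica 3 arbiter 1 \
    nuc1:/data/glusterfs/brick1 \
    nuc2:/data/glusterfs/brick1 \
    rpi:/data/glusterfs/brick1

gluster volume start glusterfs-1-volume
```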

After a certain time, the volume always ends up in a state where there are
files in the GFS that cannot be healed (see the example below).

Currently I have the GlusterFS volume in test mode, with only 1-2 VMs
running on it. So far there are no negative effects. Replication and
self-heal basically work; only now and then an entry remains that cannot be
healed.
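(When an entry gets stuck like this, the check I run is to dump the AFR
pending-changelog xattrs directly on each data brick; the file path below is
a placeholder for whatever `heal info` reports:)

```shell
# Placeholder path -- substitute the entry reported by 'gluster volume heal ... info'.
# Run on each brick's local filesystem, not through the FUSE mount.
getfattr -d -m . -e hex /data/glusterfs/brick1/path/to/affected-file

# Non-zero trusted.afr.<volume>-client-N values mean this brick holds
# pending operations that were never replicated to brick N.
```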

Does anyone have an idea how to prevent or heal this? I have already
completely rebuilt the volume, including partitions and glusterd, to rule
out old remnants.

If you need more information, please contact me.

Thanks a lot!


And here is some more info about the volume and the healing attempts:

>$ gstatus -ab
         Status: Healthy                 GlusterFS: 9.3
         Nodes: 3/3                      Volumes: 1/1

                Replicate          Started (UP) - 3/3 Bricks Up - (Arbiter Volume)
                                   Capacity: (1.82% used) 8.00 GiB/466.00 GiB (used/total)
                                   (File(s) to heal)
                                      Distribute Group 1:

>$ gluster volume info
Volume Name: glusterfs-1-volume
Type: Replicate
Volume ID: f70d9b2c-b30d-4a36-b8ff-249c09c8b45d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Brick3: (arbiter)
Options Reconfigured:
cluster.lookup-optimize: off
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
network.ping-timeout: 20
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on

>$ gluster volume heal glusterfs-1-volume
Launching heal operation to perform index self heal on volume
glusterfs-1-volume has been successful
Use heal info commands to check status.

>$ gluster volume heal glusterfs-1-volume info
Status: Connected
Number of entries: 1

Status: Connected
Number of entries: 0

Status: Connected
Number of entries: 0

>$ gluster volume heal glusterfs-1-volume info split-brain
Status: Connected
Number of entries in split-brain: 0

Status: Connected
Number of entries in split-brain: 0

Status: Connected
Number of entries in split-brain: 0
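(For completeness: besides the index heal launched above, I also know of the
full heal, which crawls the entire volume rather than only the heal index,
and the summary view of `heal info`; neither has cleared the stuck entry
here:)

```shell
# Crawl all files on the volume instead of just the indexed entries;
# heavier, but can pick up entries the index heal leaves behind.
gluster volume heal glusterfs-1-volume full

# Per-brick summary of pending / split-brain / possibly-healing entries.
gluster volume heal glusterfs-1-volume info summary
```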