[Gluster-users] Dispersed volumes won't heal on ARM

Fox foxxz.net at gmail.com
Sun Mar 1 04:08:31 UTC 2020


I am using a dozen ODROID-HC2 ARM systems, each with a single hard drive as
its brick. They run Ubuntu 18 and GlusterFS 7.2 installed from the Gluster PPA.

I can create a dispersed volume and use it. If one of the cluster members
drops out, say gluster12 reboots, it shows as connected in the peer list
once it is back online, but running
gluster volume heal <volname> info summary

reports its brick as
Brick gluster12:/exports/sda/brick1/disp1
Status: Transport endpoint is not connected
Total Number of entries: -
Number of entries in heal pending: -
Number of entries in split-brain: -
Number of entries possibly healing: -

Trying to force a full heal doesn't fix it. The cluster member otherwise
works and heals other, non-disperse volumes even while it shows up as
disconnected for the dispersed volume.

I have attached a terminal log of the volume creation and diagnostic
output. Could this be an ARM-specific problem?

I tested a similar setup on x86 virtual machines. They were able to heal a
dispersed volume without a problem. One thing I see in the ARM logs that I
don't see in the x86 logs is many warnings like these:
[2020-03-01 03:54:45.856769] W [MSGID: 122035] [ec-common.c:668:ec_child_select] 0-disp1-disperse-0: Executing operation with some subvolumes unavailable. (800). FOP : 'LOOKUP' failed on '(null)' with gfid 0d3c4cf3-e09c-4b9a-87d3-cdfc4f49b692
[2020-03-01 03:54:45.910203] W [MSGID: 122035] [ec-common.c:668:ec_child_select] 0-disp1-disperse-0: Executing operation with some subvolumes unavailable. (800). FOP : 'LOOKUP' failed on '(null)' with gfid 0d806805-81e4-47ee-a331-1808b34949bf
[2020-03-01 03:54:45.932734] I [rpc-clnt.c:1963:rpc_clnt_reconfig] 0-disp1-client-11: changing port to 49152 (from 0)
[2020-03-01 03:54:45.956803] W [MSGID: 122035] [ec-common.c:668:ec_child_select] 0-disp1-disperse-0: Executing operation with some subvolumes unavailable. (800). FOP : 'LOOKUP' failed on '(null)' with gfid d5768bad-7409-40f4-af98-4aef391d7ae4
[2020-03-01 03:54:46.000102] W [MSGID: 122035] [ec-common.c:668:ec_child_select] 0-disp1-disperse-0: Executing operation with some subvolumes unavailable. (800). FOP : 'LOOKUP' failed on '(null)' with gfid 216f5583-e1b4-49cf-bef9-8cd34617beaf
[2020-03-01 03:54:46.044184] W [MSGID: 122035] [ec-common.c:668:ec_child_select] 0-disp1-disperse-0: Executing operation with some subvolumes unavailable. (800). FOP : 'LOOKUP' failed on '(null)' with gfid 1b610b49-2d69-4ee6-a440-5d3edd6693d1
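
The (800) in these warnings looks like a bitmask of the unavailable
subvolumes. Assuming it is printed in hexadecimal and that bit N stands for
subvolume disp1-client-N (an assumption, not something confirmed here),
0x800 has only bit 11 set, which would point at disp1-client-11, the twelfth
brick (gluster12). A minimal sketch of that decoding, where decode_mask is a
hypothetical helper name:

```shell
# Decode a hex bitmask into the zero-based indices of its set bits.
# ASSUMPTION: the value in parentheses in the ec_child_select warning is a
# hexadecimal mask where bit N corresponds to subvolume <volname>-client-N.
# The function body runs in a subshell so its variables stay local.
decode_mask() (
    mask=$(printf '%d' "0x$1")   # $1 is the hex value without 0x, e.g. 800
    idx=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="${out:+$out }$idx"   # append index, space separated
        fi
        mask=$((mask >> 1))
        idx=$((idx + 1))
    done
    echo "$out"
)

decode_mask 800   # prints 11
```

Under that reading, every warning above refers to the same single
subvolume, index 11.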
-------------- next part --------------
root@gluster01:~# gluster peer status
Number of Peers: 11

Hostname: gluster02
Uuid: bed38dda-279e-4ee2-9c35-4bf2976b93bf
State: Peer in Cluster (Connected)

Hostname: gluster03
Uuid: 662bf82c-3097-4259-9674-ec4081f3fc08
State: Peer in Cluster (Connected)

Hostname: gluster04
Uuid: 4b6e1594-75b5-43a7-88d1-44e17077c805
State: Peer in Cluster (Connected)

Hostname: gluster05
Uuid: 601882c1-5c05-4b1f-839c-f497ad1b1e70
State: Peer in Cluster (Connected)

Hostname: gluster06
Uuid: 5c37e57c-c0e6-412c-ac21-a42eaf6d0426
State: Peer in Cluster (Connected)

Hostname: gluster07
Uuid: f85ba854-0136-4e0e-ba59-d28dff76d58c
State: Peer in Cluster (Connected)

Hostname: gluster08
Uuid: b8d2908d-b747-4b34-87c5-360011923b1f
State: Peer in Cluster (Connected)

Hostname: gluster09
Uuid: f4f3b416-ca8a-4d3f-a309-51f639f32665
State: Peer in Cluster (Connected)

Hostname: gluster10
Uuid: d3dc64f6-1a41-44af-90a9-64bf792b8b80
State: Peer in Cluster (Connected)

Hostname: gluster11
Uuid: b80cfaee-0343-4b0d-b068-415993149969
State: Peer in Cluster (Connected)

Hostname: gluster12
Uuid: c5934246-48ab-419e-9aff-e20d9af27b18
State: Peer in Cluster (Connected)


root@gluster01:~# gluster volume create disp1 disperse 12 gluster01:/exports/sda/brick1/disp1 gluster02:/exports/sda/brick1/disp1 gluster03:/exports/sda/brick1/disp1 gluster04:/exports/sda/brick1/disp1 gluster05:/exports/sda/brick1/disp1 gluster06:/exports/sda/brick1/disp1 gluster07:/exports/sda/brick1/disp1 gluster08:/exports/sda/brick1/disp1 gluster09:/exports/sda/brick1/disp1 gluster10:/exports/sda/brick1/disp1 gluster11:/exports/sda/brick1/disp1 gluster12:/exports/sda/brick1/disp1
The optimal redundancy for this configuration is 4. Do you want to create the volume with this value ? (y/n) y
volume create: disp1: success: please start the volume to access data

root@gluster01:~# gluster volume start disp1
volume start: disp1: success

root@gluster01:~# gluster volume heal disp1 enable
Enable heal on volume disp1 has been successful 

root@gluster01:~# gluster volume info disp1
 
Volume Name: disp1
Type: Disperse
Volume ID: 9c4070e5-e0b8-46ca-a783-96bd240247d1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Bricks:
Brick1: gluster01:/exports/sda/brick1/disp1
Brick2: gluster02:/exports/sda/brick1/disp1
Brick3: gluster03:/exports/sda/brick1/disp1
Brick4: gluster04:/exports/sda/brick1/disp1
Brick5: gluster05:/exports/sda/brick1/disp1
Brick6: gluster06:/exports/sda/brick1/disp1
Brick7: gluster07:/exports/sda/brick1/disp1
Brick8: gluster08:/exports/sda/brick1/disp1
Brick9: gluster09:/exports/sda/brick1/disp1
Brick10: gluster10:/exports/sda/brick1/disp1
Brick11: gluster11:/exports/sda/brick1/disp1
Brick12: gluster12:/exports/sda/brick1/disp1
Options Reconfigured:
cluster.disperse-self-heal-daemon: enable
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
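
For reference, the 1 x (8 + 4) = 12 layout reported above works out as
follows: 12 bricks with redundancy 4 leave 8 data bricks, so the volume
should stay usable with up to 4 bricks missing, and roughly two thirds of
the raw capacity is usable. A small sketch of the arithmetic, where
disperse_layout is a hypothetical helper and not a gluster command (the
bricks > 2 * redundancy check reflects the documented constraint for
dispersed volumes):

```shell
# Work out the data/redundancy split of a single disperse set.
# disperse_layout is a hypothetical helper for illustration only.
# The function body runs in a subshell, so "exit" only leaves the function.
disperse_layout() (
    bricks=$1
    redundancy=$2
    # Dispersed volumes need redundancy > 0 and bricks > 2 * redundancy.
    if [ "$redundancy" -le 0 ] || [ "$bricks" -le $((2 * redundancy)) ]; then
        echo "invalid: need redundancy > 0 and bricks > 2 * redundancy" >&2
        exit 1
    fi
    data=$((bricks - redundancy))
    # Usable share of raw capacity, rounded down to a whole percent.
    echo "data=$data redundancy=$redundancy usable_percent=$((100 * data / bricks))"
)

disperse_layout 12 4   # prints data=8 redundancy=4 usable_percent=66
```

With only one brick (gluster12) down, the volume is well inside the 4-brick
tolerance, which would be consistent with the client being able to keep
writing during the reboot.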


(CLIENT MOUNTS VOLUME AND BEGINS WRITING FILES)
(GLUSTER12 IS REBOOTED DURING THE WRITES)

root@gluster01:~# gluster volume info disp1
 
Volume Name: disp1
Type: Disperse
Volume ID: 9c4070e5-e0b8-46ca-a783-96bd240247d1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Bricks:
Brick1: gluster01:/exports/sda/brick1/disp1
Brick2: gluster02:/exports/sda/brick1/disp1
Brick3: gluster03:/exports/sda/brick1/disp1
Brick4: gluster04:/exports/sda/brick1/disp1
Brick5: gluster05:/exports/sda/brick1/disp1
Brick6: gluster06:/exports/sda/brick1/disp1
Brick7: gluster07:/exports/sda/brick1/disp1
Brick8: gluster08:/exports/sda/brick1/disp1
Brick9: gluster09:/exports/sda/brick1/disp1
Brick10: gluster10:/exports/sda/brick1/disp1
Brick11: gluster11:/exports/sda/brick1/disp1
Brick12: gluster12:/exports/sda/brick1/disp1
Options Reconfigured:
cluster.disperse-self-heal-daemon: enable
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on


root@gluster01:~# gluster peer status
Number of Peers: 11

Hostname: gluster02
Uuid: bed38dda-279e-4ee2-9c35-4bf2976b93bf
State: Peer in Cluster (Connected)

Hostname: gluster03
Uuid: 662bf82c-3097-4259-9674-ec4081f3fc08
State: Peer in Cluster (Connected)

Hostname: gluster04
Uuid: 4b6e1594-75b5-43a7-88d1-44e17077c805
State: Peer in Cluster (Connected)

Hostname: gluster05
Uuid: 601882c1-5c05-4b1f-839c-f497ad1b1e70
State: Peer in Cluster (Connected)

Hostname: gluster06
Uuid: 5c37e57c-c0e6-412c-ac21-a42eaf6d0426
State: Peer in Cluster (Connected)

Hostname: gluster07
Uuid: f85ba854-0136-4e0e-ba59-d28dff76d58c
State: Peer in Cluster (Connected)

Hostname: gluster08
Uuid: b8d2908d-b747-4b34-87c5-360011923b1f
State: Peer in Cluster (Connected)

Hostname: gluster09
Uuid: f4f3b416-ca8a-4d3f-a309-51f639f32665
State: Peer in Cluster (Connected)

Hostname: gluster10
Uuid: d3dc64f6-1a41-44af-90a9-64bf792b8b80
State: Peer in Cluster (Connected)

Hostname: gluster11
Uuid: b80cfaee-0343-4b0d-b068-415993149969
State: Peer in Cluster (Connected)

Hostname: gluster12
Uuid: c5934246-48ab-419e-9aff-e20d9af27b18
State: Peer in Cluster (Connected)

root@gluster01:~# gluster volume heal disp1 info summary
Brick gluster01:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster02:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster03:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster04:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster05:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster06:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster07:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster08:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster09:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster10:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster11:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 306
Number of entries in heal pending: 306
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster12:/exports/sda/brick1/disp1
Status: Transport endpoint is not connected
Total Number of entries: -
Number of entries in heal pending: -
Number of entries in split-brain: -
Number of entries possibly healing: -
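
With twelve bricks the summary is long, so it can help to print only the
bricks whose Status line is anything other than Connected. A rough sketch
that reads the summary text on standard input; disconnected_bricks is a
hypothetical helper name, and the parsing assumes the "Brick ..." /
"Status: ..." line layout shown above:

```shell
# Print the bricks from "heal info summary" output that are not Connected.
# disconnected_bricks is a hypothetical helper; it only parses text and
# does not call gluster itself.
disconnected_bricks() {
    awk '
        $1 == "Brick"                         { brick = $2 }   # remember current brick
        $1 == "Status:" && $2 != "Connected"  { print brick }  # report any other status
    '
}

# Intended use (not run here):
#   gluster volume heal disp1 info summary | disconnected_bricks
```

Fed the output above, this would print only
gluster12:/exports/sda/brick1/disp1.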

root@gluster01:~# gluster volume heal disp1 full
Launching heal operation to perform full self heal on volume disp1 has been successful 
Use heal info commands to check status.

root@gluster01:~# gluster volume heal disp1 info summary
Brick gluster01:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster02:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster03:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster04:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster05:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster06:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster07:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster08:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster09:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster10:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster11:/exports/sda/brick1/disp1
Status: Connected
Total Number of entries: 293
Number of entries in heal pending: 293
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick gluster12:/exports/sda/brick1/disp1
Status: Transport endpoint is not connected
Total Number of entries: -
Number of entries in heal pending: -
Number of entries in split-brain: -
Number of entries possibly healing: -


root@gluster01:~# gluster volume status
Status of volume: disp1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster01:/exports/sda/brick1/disp1   49152     0          Y       3931 
Brick gluster02:/exports/sda/brick1/disp1   49152     0          Y       2755 
Brick gluster03:/exports/sda/brick1/disp1   49152     0          Y       2787 
Brick gluster04:/exports/sda/brick1/disp1   49152     0          Y       2780 
Brick gluster05:/exports/sda/brick1/disp1   49152     0          Y       2764 
Brick gluster06:/exports/sda/brick1/disp1   49152     0          Y       2760 
Brick gluster07:/exports/sda/brick1/disp1   49152     0          Y       2740 
Brick gluster08:/exports/sda/brick1/disp1   49152     0          Y       2729 
Brick gluster09:/exports/sda/brick1/disp1   49152     0          Y       2772 
Brick gluster10:/exports/sda/brick1/disp1   49152     0          Y       2791 
Brick gluster11:/exports/sda/brick1/disp1   49152     0          Y       2026 
Brick gluster12:/exports/sda/brick1/disp1   N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       3952 
Self-heal Daemon on gluster03               N/A       N/A        Y       2808 
Self-heal Daemon on gluster02               N/A       N/A        Y       2776 
Self-heal Daemon on gluster06               N/A       N/A        Y       2781 
Self-heal Daemon on gluster07               N/A       N/A        Y       2761 
Self-heal Daemon on gluster05               N/A       N/A        Y       2785 
Self-heal Daemon on gluster08               N/A       N/A        Y       2750 
Self-heal Daemon on gluster04               N/A       N/A        Y       2801 
Self-heal Daemon on gluster09               N/A       N/A        Y       2793 
Self-heal Daemon on gluster11               N/A       N/A        Y       2047 
Self-heal Daemon on gluster10               N/A       N/A        Y       2812 
Self-heal Daemon on gluster12               N/A       N/A        Y       542  
 
Task Status of Volume disp1
------------------------------------------------------------------------------
There are no active volume tasks
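
The status table above shows the brick on gluster12 with Online = N while
the self-heal daemon on that host is running. To pull out just the offline
bricks from such a table, a rough sketch follows. It assumes the six-column
brick rows shown above (a long brick path that gluster wraps onto a second
line would not be matched), and offline_bricks is a hypothetical helper
name:

```shell
# Print the bricks from "gluster volume status" output whose Online column is N.
# offline_bricks is a hypothetical helper; it only parses text on stdin.
# Expected brick row: Brick <host:path> <tcp-port> <rdma-port> <online> <pid>
offline_bricks() {
    awk '$1 == "Brick" && NF == 6 && $5 == "N" { print $2 }'
}

# Intended use (not run here):
#   gluster volume status disp1 | offline_bricks
```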

