[Bugs] [Bug 1811373] New: glusterd crashes healing disperse volumes on arm
bugzilla at redhat.com
Sun Mar 8 04:09:47 UTC 2020
https://bugzilla.redhat.com/show_bug.cgi?id=1811373
Bug ID: 1811373
Summary: glusterd crashes healing disperse volumes on arm
Product: GlusterFS
Version: 7
Hardware: armv7l
OS: Linux
Status: NEW
Component: glusterd
Assignee: bugs at gluster.org
Reporter: foxxz.net at gmail.com
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
Created attachment 1668387
--> https://bugzilla.redhat.com/attachment.cgi?id=1668387&action=edit
Excerpts from several gluster logs
Description of problem:
The gluster brick process on an ARM node that needs healing will crash (almost
always) within seconds after it starts and connects to the other cluster
members. Tested under Ubuntu 18 with gluster v7 and v4 on ODROID HC2 units,
and under Raspbian with gluster v5 on a Raspberry Pi 3.
Version-Release number of selected component (if applicable):
gluster 7.2, but the problem has also been reproduced on v4 and v5.
How reproducible:
Reliably reproducible
Steps to Reproduce:
1. Create a disperse volume on a cluster with 3 or more members/bricks and
enable healing.
2. Have a client mount the volume and begin writing files to it.
3. Reboot a cluster member during client operations.
4. The cluster member rejoins the cluster and attempts to heal.
5. glusterd on that member typically crashes seconds to minutes after startup;
in rare cases it takes longer. (A command sketch of these steps follows below.)
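A rough command sketch of the reproduction, assuming host names
gluster1..gluster12 and the brick path shown in the volume info below; the
mount point and the dd write loop are stand-ins for the actual client
workload, not the exact commands used:
# On one node: create, start, and enable healing on an 8+4 disperse volume
gluster volume create bigdisp disperse-data 8 redundancy 4 \
    $(for i in $(seq 1 12); do echo gluster$i:/exports/sda/brick1/bigdisp; done)
gluster volume start bigdisp
gluster volume heal bigdisp enable
# On a client: mount the volume and keep writing files to it
mount -t glusterfs gluster1:/bigdisp /mnt/bigdisp
while true; do dd if=/dev/urandom of=/mnt/bigdisp/f.$RANDOM bs=1M count=100; done
# While the client is writing: reboot one cluster member, e.g. gluster3
ssh gluster3 reboot
# After it comes back up, the affected brick shows online briefly, then offline
gluster volume status bigdisp
gluster volume heal bigdisp info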
Actual results:
gluster volume status
shows the affected brick online briefly and then offline after it crashes. The
self-heal daemon shows as online. The brick is never able to heal and rejoin
the cluster.
Expected results:
The brick should come online and sync up.
Additional info:
The same test has been run on x86 hardware, which does not exhibit the crash.
I am willing to make this testbed available to developers to help debug this
issue. It is a 12-node system composed of ODROID HC2 units with a 4 TB drive
attached to each unit.
Volume Name: bigdisp
Type: Disperse
Volume ID: 56fa5de3-36d5-45ec-9789-88d8aae02275
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Bricks:
Brick1: gluster1:/exports/sda/brick1/bigdisp
Brick2: gluster2:/exports/sda/brick1/bigdisp
Brick3: gluster3:/exports/sda/brick1/bigdisp
Brick4: gluster4:/exports/sda/brick1/bigdisp
Brick5: gluster5:/exports/sda/brick1/bigdisp
Brick6: gluster6:/exports/sda/brick1/bigdisp
Brick7: gluster7:/exports/sda/brick1/bigdisp
Brick8: gluster8:/exports/sda/brick1/bigdisp
Brick9: gluster9:/exports/sda/brick1/bigdisp
Brick10: gluster10:/exports/sda/brick1/bigdisp
Brick11: gluster11:/exports/sda/brick1/bigdisp
Brick12: gluster12:/exports/sda/brick1/bigdisp
Options Reconfigured:
disperse.shd-max-threads: 4
client.event-threads: 8
cluster.disperse-self-heal-daemon: enable
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
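For reference, the non-default options above would presumably have been
applied with commands along these lines (an assumption reconstructed from the
option list, not a record of the exact commands run):
gluster volume set bigdisp disperse.shd-max-threads 4
gluster volume set bigdisp client.event-threads 8
gluster volume set bigdisp cluster.disperse-self-heal-daemon enable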
Status of volume: bigdisp
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster1:/exports/sda/brick1/bigdisp   49152     0          Y       4632
Brick gluster2:/exports/sda/brick1/bigdisp   49152     0          Y       3115
Brick gluster3:/exports/sda/brick1/bigdisp   N/A       N/A        N       N/A
Brick gluster4:/exports/sda/brick1/bigdisp   49152     0          Y       2728
Brick gluster5:/exports/sda/brick1/bigdisp   49152     0          Y       3072
Brick gluster6:/exports/sda/brick1/bigdisp   49152     0          Y       2549
Brick gluster7:/exports/sda/brick1/bigdisp   49152     0          Y       16848
Brick gluster8:/exports/sda/brick1/bigdisp   49152     0          Y       16740
Brick gluster9:/exports/sda/brick1/bigdisp   49152     0          Y       2619
Brick gluster10:/exports/sda/brick1/bigdisp  49152     0          Y       2677
Brick gluster11:/exports/sda/brick1/bigdisp  49152     0          Y       3023
Brick gluster12:/exports/sda/brick1/bigdisp  49153     0          Y       2440
Self-heal Daemon on localhost                N/A       N/A        Y       4653
Self-heal Daemon on gluster3                 N/A       N/A        Y       7620
Self-heal Daemon on gluster10                N/A       N/A        Y       2698
Self-heal Daemon on gluster7                 N/A       N/A        Y       16869
Self-heal Daemon on gluster8                 N/A       N/A        Y       16761
Self-heal Daemon on gluster12                N/A       N/A        Y       2461
Self-heal Daemon on gluster9                 N/A       N/A        Y       2640
Self-heal Daemon on gluster2                 N/A       N/A        Y       3136
Self-heal Daemon on gluster5                 N/A       N/A        Y       3093
Self-heal Daemon on gluster4                 N/A       N/A        Y       2749
Self-heal Daemon on gluster6                 N/A       N/A        Y       2570
Self-heal Daemon on gluster11                N/A       N/A        Y       3044
Task Status of Volume bigdisp
------------------------------------------------------------------------------
There are no active volume tasks