[Bugs] [Bug 1784402] New: storage.reserve ignored by self-heal so that bricks are 100% full
bugzilla at redhat.com
Tue Dec 17 11:16:35 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1784402
Bug ID: 1784402
Summary: storage.reserve ignored by self-heal so that bricks
are 100% full
Product: GlusterFS
Version: 5
Status: NEW
Component: posix
Assignee: bugs at gluster.org
Reporter: david.spisla at iternity.com
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
Created attachment 1645849
--> https://bugzilla.redhat.com/attachment.cgi?id=1645849&action=edit
Gluster vol info and status, df -hT, heal info, logs of glfsheal and all related
bricks
Description of problem:
Setup: 3-node VMware cluster (2 storage nodes and 1 arbiter node),
Distribute-Replica 2 volume with 1 arbiter brick per replica tuple (see the
attached file for the detailed configuration).
Version-Release number of selected component (if applicable):
GlusterFS v5.10
How reproducible:
Steps to Reproduce:
1. Mount volume from a dedicated client machine
2. Disable network of node 2
3. Write to node 1 in the volume until it is full. The storage.reserve limit of
the local bricks should take effect, leaving roughly 1% of each brick
free.
4. Disable network of node 1
5. Enable network of node 2
6. Write to node 2 in the same volume, but write the data into another
subfolder or use completely different data; otherwise one would get a
split-brain error, which is not the issue here. Again, write data until the
bricks reach the storage.reserve limit.
7. Now the volume is filled up with twice the amount of data
8. Enable network of node 1
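The steps above can be sketched as a shell session. This is only an illustration of our procedure, not an exact transcript: the volume name (glustervol), hostnames (fs-node1), mount point, interface name (eth0), and brick path (/gluster/brick1) are placeholders for our setup.

```shell
# Step 1, on the client: mount the volume
mount -t glusterfs fs-node1:/glustervol /mnt/glustervol

# Step 2, on node 2: take it off the network (placeholder interface)
ip link set eth0 down

# Step 3, on the client: write until node 1's bricks hit storage.reserve;
# with the default reserve of 1%, roughly 1% of each brick should stay free
dd if=/dev/urandom of=/mnt/glustervol/dir1/data bs=1M   # runs until ENOSPC
df -h /gluster/brick1                                   # on node 1: ~1% free

# Steps 4-6: swap network state (node 1 down, node 2 up), then write
# different data into another subfolder until the reserve is hit again
dd if=/dev/urandom of=/mnt/glustervol/dir2/data bs=1M   # runs until ENOSPC

# Step 8: re-enable node 1 and watch the bricks; self-heal now copies both
# data sets onto each replica and fills the bricks to 100%
df -h /gluster/brick1
```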
Actual results:
storage.reserve was ignored and all bricks were 100% full within a few seconds.
All brick processes died; the volume is not mountable and a heal cannot be
triggered.
Expected results:
The self-heal process should be blocked by storage.reserve, the brick processes
should keep running, and the volume should remain accessible.
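What we would have expected to verify after re-enabling node 1 (volume name and brick path are placeholders as above):

```shell
# Brick processes should still be online even though heal is held back
gluster volume status glustervol

# Heal should be pending, not crashing the bricks
gluster volume heal glustervol info

# Each brick should keep at least storage.reserve percent free
gluster volume get glustervol storage.reserve
df -h /gluster/brick1
```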
Additional info:
See attached file
The above scenario was not only reproduced on a VM cluster; we could also
observe it on a real hardware cluster.
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.