[Gluster-users] add/replace brick corrupting data
Lindsay Mathieson
lindsay.mathieson at gmail.com
Sat May 14 14:45:28 UTC 2016
I am testing replacing a brick in a replica 3 test volume running Gluster
3.7.11. The volume hosts two VMs. Three nodes: vna, vnb and vng.
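For reference, the volume was set up roughly along these lines (the vna/vnb
brick paths are assumed to mirror vng's; other volume options, including
sharding, omitted):

gluster v create test1 replica 3 \
    vna.proxmox.softlog:/tank/vmdata/test1 \
    vnb.proxmox.softlog:/tank/vmdata/test1 \
    vng.proxmox.softlog:/tank/vmdata/test1
gluster v start test1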
*First off I tried removing/adding a brick.*
gluster v remove-brick test1 replica 2 vng.proxmox.softlog:/tank/vmdata/test1 force
That worked fine; the VMs (on another node) kept running without a hiccup.
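After the remove-brick the volume should report as a plain 2-way replica,
something like:

gluster v info test1 | grep -E 'Type|Number of Bricks'
# Type: Replicate
# Number of Bricks: 1 x 2 = 2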
I deleted /tank/vmdata/test1, then ran:

gluster v add-brick test1 replica 3 vng.proxmox.softlog:/tank/vmdata/test1 force
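(add-brick normally refuses a path that still carries the
trusted.glusterfs.volume-id xattr from its previous life, hence deleting and
recreating the directory first. A quick way to check what is on the brick
root:

getfattr -d -m . -e hex /tank/vmdata/test1   # should show no trusted.glusterfs.volume-id
)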
That succeeded, and heal statistics immediately showed 3000+ shards being
healed on vna and vnb.

Unfortunately it also showed hundreds of shards being healed on vng, which
should not be happening as vng had no data on it - a reverse heal, basically.

Eventually all the heals completed, but the VMs were hopelessly corrupted.
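For the record, heal progress can be watched per brick with the usual
commands, roughly:

gluster v heal test1 statistics heal-count   # pending entries per brick
gluster v heal test1 info                    # lists the shards/gfids still to be healed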
*Then I retried the above, but with all VMs shut down*
i.e. no writes or reads happening on the volume.
This worked - all the shards on vna & vnb healed, nothing in reverse. Once
the heal completed, the data (VMs) was fine.

Unfortunately this isn't practical in production - I can't bring all the
VMs down for the 1-2 days it would take to heal.
*Replacing the brick*

I tried this: killed the glusterfsd process on vng, then ran:

gluster v replace-brick test1 vng.proxmox.softlog:/tank/vmdata/test1 vng.proxmox.softlog:/tank/vmdata/test1.1 commit force
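(The pid of the vng brick process can be taken from volume status before
killing it, e.g.:

gluster v status test1            # shows each brick's port and PID
kill <pid-of-vng-brick-glusterfsd>
)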
vna & vnb shards started healing, but vng showed 5 reverse heals happening.
Eventually it got down to 4-5 shards needing healing on each brick and
stopped. They didn't go away till I removed the test1.1 brick.
*Currently the replace-brick process seems to be unusable except when the
volume is not being used.*
--
Lindsay Mathieson