[Bugs] [Bug 1322145] Glusterd fails to restart after replacing a failed GlusterFS node and a volume has a snapshot
bugzilla at redhat.com
Tue Jun 21 17:50:45 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1322145
Ben Werthmann <ben at apcera.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |ASSIGNED
         Resolution|NOTABUG                     |---
           Keywords|                            |Reopened
--- Comment #8 from Ben Werthmann <ben at apcera.com> ---
Yes, the replacement peer (1 of 3) and its new brick have a new IP. In our
case, DNS is not available. When the failed brick is removed via 'gluster
volume replace-brick $vol $failed_peer $new_peer_ip:$new_brick commit force',
why are there lingering references to the old peer/brick? Is there a reason
that 'replace-brick' does not fix all references to the old peer/brick? If
'replace-brick' has been issued, is it safe to drop the snapshot references to
the old peer/brick?
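For what it's worth, here is a quick way to spot those lingering references on
a surviving peer (a sketch; the old peer's IP below is a placeholder, and
these are the usual glusterd state paths):

    # Snapshot metadata lives under /var/lib/glusterd/snaps/, volume
    # metadata under /var/lib/glusterd/vols/; grep for the failed
    # peer's address (10.0.0.1 is a placeholder).
    grep -r '10.0.0.1' /var/lib/glusterd/snaps/ /var/lib/glusterd/vols/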
I suspect the suggested fix of using the hostname will fail if any snapshots
exist, because in the new-peer/new-brick case the replacement peer will not
have the LVM snapshots needed to resolve the snapshot references.
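A quick way to see that mismatch (a sketch, assuming the bricks sit on thinly
provisioned LVs, as gluster snapshots require):

    # On a surviving peer, snapshot LVs are listed with their origin:
    lvs -o lv_name,origin,pool_lv

    # On the replacement peer the same command shows only the new
    # brick LV, so snapshots referencing a brick on this peer cannot
    # be activated from here.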
Put another way: why is the following set of operations (sketched as commands
after the list) not valid?
- deploy gluster with three servers, one brick each, one volume replicated
across all 3
- create a snapshot
- lose one server
- add a replacement peer and new brick with a new IP address
- replace-brick the missing brick onto the new server (wait for replication to
finish)
- force remove the old server
- verify everything is working as expected
- restart _any_ server in the cluster, without failure
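For concreteness, the sequence above as commands (a sketch of the repro, not a
tested script; hostnames, volume name, and brick paths are placeholders;
server3 is the lost node, server4 the replacement):

    gluster volume create vol0 replica 3 \
        server1:/bricks/b0 server2:/bricks/b0 server3:/bricks/b0
    gluster volume start vol0
    gluster snapshot create snap1 vol0     # bricks must be on thin LVs

    # server3 is lost; bring in server4 (new IP, new brick)
    gluster peer probe server4
    gluster volume replace-brick vol0 server3:/bricks/b0 \
        server4:/bricks/b0 commit force
    gluster volume heal vol0 info          # wait for heal to finish
    gluster peer detach server3 force
    gluster volume status vol0             # everything looks healthy

    systemctl restart glusterd             # expected to succeed; per this
                                           # bug, glusterd fails to start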