[Gluster-users] Replica bricks fungible?
Zenon Panoussis
oracle at provocation.net
Fri Apr 23 13:55:14 UTC 2021
>> Are all replica (non-arbiter) bricks identical to each
>> other? If not, what do they differ in?
> No. At least meta-metadata is different, IIUC.
Hmm, but at first sight this shouldn't be a problem as
long as (a) the "before" and the "after" configurations
contain the exact same bricks, only in a different
topology, and (b) the metadata do not include the local
server's hostname.
I did a rough check for this on one brick with
grep -r thisnodehostname /rawgfs/gv0/.glusterfs/
and found nothing. Thus, it seems that the brick itself
has no idea which server it lives on. The server, however,
does know which brick is local to it.
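As for the metadata that does differ, the per-file AFR
changelog lives in extended attributes on the bricks, so
(if I read the spec below correctly) something like

getfattr -d -m . -e hex /rawgfs/gv0/some/file

run directly on a brick should dump the trusted.gfid and
the trusted.afr.gv0-client-N changelog xattrs, which are
the parts one would expect to differ between replicas.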
https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.6/Persistent%20AFR%20Changelog%20xattributes.md
pointed me to /var/lib/glusterd/vols/volname/bricks/,
where I find one file for each brick, named
"hostname:-gfs-volname" with contents like
uuid=7da10366-90e1-4743-8b32-3f05eafca3cb
hostname=serverhostname
path=/gfs/gv0
real_path=/gfs/gv0
listen-port=49152
rdma.listen-port=0
decommissioned=0
brick-id=gv0-client-1
mount_dir=/gv0
snap-status=0
brick-fsid=4818249190141190833
The uuid is referenced in /var/lib/glusterd/glusterd.info
and it is different on each server. The brick-fsid is set
for the local brick and is 0 for remote bricks.
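A quick way to cross-check which uuid belongs to which
server:

cat /var/lib/glusterd/glusterd.info   # this server's own UUID
gluster peer status                   # UUIDs of all the other peers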
Thus, I imagine it's possible to move a brick with some
slight acrobatics like this:
before
node1:brick1
node2:brick2
node3:brick3
1. Add node4:brick4 to the volume with an empty brick.
2. Stop glusterd on all nodes.
3. Move brick2 to node4.
4. Move brick4 to node2.
5. Edit hostname:-gfs-volname on node2 and node4 to swap
their brick-fsid= entries, leaving their respective
uuid= alone.
6. Start glusterd on all nodes.
7. Remove node2:brick2 from the volume.
after
node1:brick1
node3:brick3
node4:brick2
I'm not quite sure about brick-id= though, whether it
should also be swapped between node2 and node4 or not.
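In CLI terms I imagine the gluster-facing steps would look
roughly like this (volume name and paths are the ones from
my example above; the exact add-brick/remove-brick syntax
may need tweaking for your version, and rsync is just one
way of doing the physical move):

# step 1: grow the replica set with the new, empty brick
gluster volume add-brick gv0 replica 4 node4:/gfs/gv0
# step 2: on every node
systemctl stop glusterd
# step 3: move brick2's contents, preserving hardlinks and xattrs (run on node2)
rsync -aHAX /gfs/gv0/ node4:/gfs/gv0/
# step 4: brick4 was empty, so this amounts to emptying node2's brick directory
# step 5: swap the brick-fsid= lines by hand in
#         /var/lib/glusterd/vols/gv0/bricks/ on node2 and node4
# step 6: on every node
systemctl start glusterd
# step 7: drop the now-superfluous node2 brick
gluster volume remove-brick gv0 replica 3 node2:/gfs/gv0 force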
Gluster 10 is rumoured to break backwards compatibility
and upgradability. Reinstalling servers from scratch is
no big deal, but moving huge amounts of data is, especially
if it is live data that requires downtime in order to be
moved atomically. A pathway to moving bricks between
servers could be the easy solution to this problem.