[Gluster-users] Rebuilding a failed cluster

Sun Aug 13 02:25:31 UTC 2023

You must have been running a really old version of glusterfs; 2-node
systems haven't been supported for a few major releases now. If you
want n-1 reliability, you need at least a 4-node system.
On the bright side, you get to set up your new gluster system with
appropriate storage. Gluster doesn't do anything fancy with the data,
it's all metadata magic, so trying to get a new, modern glusterfs
system to adopt your old bricks isn't worth the effort.
This is one of my bricks that I keep audio files on:
# z /gfss/brkaudio/
drwxr-xr-x. 8 root root 93 Dec 31 1969 /gfss/brkaudio/audio

# z /gfss/brkaudio/audio/
drwxr-xr-x. 5 root root 39 Dec 27 2019 /gfss/brkaudio/audio/music
drwxr-xr-x. 4 root root 28 Dec 27 2019 /gfss/brkaudio/audio/speech
drwxr-xr-x. 2 root root 6 Oct 15 2016 /gfss/brkaudio/audio/words
drwxrwxrwt. 2 root root 6 Jul 28 2020 /gfss/brkaudio/audio/work

This is where the metadata magic is; it's all based on inode number:
# ls -ald /gfss/brkaudio/audio/.*
drwxr-xr-x. 8 root root 93 Dec 31 1969 /gfss/brkaudio/audio/.
drwxr-xr-x. 3 root root 19 Dec 20 2019 /gfss/brkaudio/audio/..
drw-------. 263 root root 8192 Jul 18 2021 /gfss/brkaudio/audio/.glusterfs
drwxr-xr-x. 3 root root 25 Dec 27 2019 /gfss/brkaudio/audio/.trashcan
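
To make the .glusterfs part a bit more concrete, here is a minimal
Python sketch of how an entry under that directory is normally located.
It assumes the usual brick layout, where each file carries its GFID in
the trusted.gfid extended attribute and, for regular files,
.glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid> is a hard link
to the same inode. The function names are made up for illustration, and
read_gfid() only works on Linux, as root, on a real brick.

```python
import os
import uuid


def gfid_backend_path(brick_root, gfid):
    """Return where a GFID's entry is expected under <brick>/.glusterfs.

    Assumes the usual layout: .glusterfs/<aa>/<bb>/<full-gfid>, where
    aa and bb are the first two pairs of hex digits of the GFID.
    """
    canonical = str(uuid.UUID(gfid))  # validates and lowercases the GFID
    return os.path.join(
        brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical
    )


def read_gfid(path_on_brick):
    """Read the GFID of a file on a brick (Linux only, needs root).

    Assumes the GFID is stored as 16 raw bytes in the trusted.gfid
    extended attribute.
    """
    raw = os.getxattr(path_on_brick, "trusted.gfid")
    return str(uuid.UUID(bytes=raw))
```

For example, gfid_backend_path("/gfss/brkaudio/audio",
"076a511d-3721-4231-ba3b-5c4cbdbd7f5d") gives
/gfss/brkaudio/audio/.glusterfs/07/6a/076a511d-3721-4231-ba3b-5c4cbdbd7f5d.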

The way to recreate it is to flip a coin, pick your best bricks, copy
the data to the new gluster volumes, and let it replicate. Then write a
script with find to compare the second bricks' data against the current
new gluster data and figure out the problems.
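
A rough sketch of that compare step, written in Python with os.walk
standing in for find. It is not the script I used, just an
illustration. The function name and the two paths in the usage note are
assumptions; it skips the brick's own .glusterfs and .trashcan
directories and only looks at regular files.

```python
import filecmp
import os

# Gluster's own bookkeeping directories at the top of a brick.
SKIP_AT_BRICK_ROOT = {".glusterfs", ".trashcan"}


def compare_brick_to_volume(brick_root, volume_root):
    """Compare regular files on an old brick with a mounted new volume.

    Returns (missing, different): paths relative to brick_root that are
    absent on the volume, or present with different contents.
    """
    missing, different = [], []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if dirpath == brick_root:
            # Prune the bookkeeping directories so os.walk never enters them.
            dirnames[:] = [d for d in dirnames if d not in SKIP_AT_BRICK_ROOT]
        dirnames.sort()
        for name in sorted(filenames):
            brick_file = os.path.join(dirpath, name)
            if not os.path.isfile(brick_file):
                continue  # skip broken symlinks, sockets and the like
            rel = os.path.relpath(brick_file, brick_root)
            volume_file = os.path.join(volume_root, rel)
            if not os.path.isfile(volume_file):
                missing.append(rel)
            elif not filecmp.cmp(brick_file, volume_file, shallow=False):
                # shallow=False forces a byte-by-byte comparison.
                different.append(rel)
    return missing, different
```

Called as compare_brick_to_volume("/gfss/brkaudio/audio", "/mnt/audio"),
where /mnt/audio is a hypothetical mount point of the new volume, it
returns the two lists to go through by hand.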

Been there and done that.

On Sat, 2023-08-12 at 00:46 -0400, Richard Betel wrote:
> I had a small cluster with a disperse 3 volume. 2 nodes had hardware
> failures and no longer boot, and I don't have replacement hardware
> for them (it's an old board called a PC-duino). However, I do have
> their intact root filesystems and the disks the bricks are on. 
> 
> So I need to rebuild the cluster on all new host hardware. Does
> anyone have any suggestions on how to go about doing this? I've built
> 3 vms to be a new test cluster, but if I copy over a file from the 3
> nodes and try to read it, I can't and get errors in
> /var/log/glusterfs/foo.log:
> [2023-08-12 03:50:47.638134 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-gv-client-0: remote operation failed. [{path=/helmetpart.scad}, {gfid=00000000-0000-0000-0000-000000000000}, {errno=61}, {error=No data available}]
> [2023-08-12 03:50:49.834859 +0000] E [MSGID: 122066] [ec-common.c:1301:ec_prepare_update_cbk] 0-gv-disperse-0: Unable to get config xattr. FOP : 'FXATTROP' failed on gfid 076a511d-3721-4231-ba3b-5c4cbdbd7f5d. Parent FOP: READ [No data available]
> [2023-08-12 03:50:49.834930 +0000] W [fuse-bridge.c:2994:fuse_readv_cbk] 0-glusterfs-fuse: 39: READ => -1 gfid=076a511d-3721-4231-ba3b-5c4cbdbd7f5d fd=0x7fbc9c001a98 (No data available)
> 
> So obviously, I need to copy over more stuff from the original
> cluster. If I force the 3 nodes and the volume to have the same
> uuids, will that be enough?
