[Gluster-users] GlusterFS 9.3 - Replicate Volume (2 Bricks / 1 Arbiter) - Self-healing does not always work
hunter86_bg at yahoo.com
Fri Nov 5 19:45:51 UTC 2021
You can mount the volume with the aux-gfid-mount option:
# mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
and then obtain the path for a GFID:
getfattr -n trusted.glusterfs.pathinfo -e text /mnt/testvol/.gfid/<GFID>
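For looking at the same GFID directly on a brick, the outputs quoted further down suggest the layout <brick>/.glusterfs/<first two hex characters>/<next two>/<gfid>. A minimal sketch of that mapping follows. The function name gfid_backend_path is hypothetical and not part of GlusterFS; it only builds the path string and does not touch the volume.

```shell
#!/bin/sh
# Hypothetical helper (not a GlusterFS command): print the brick-side
# path of a GFID, assuming the <brick>/.glusterfs/aa/bb/<gfid> layout
# seen in the outputs quoted in this thread.
gfid_backend_path() {
    gbp_brick="${1%/}"    # drop one trailing slash, if present
    gbp_gfid="$2"
    gbp_a=$(printf '%s' "$gbp_gfid" | cut -c1-2)   # first two hex characters
    gbp_b=$(printf '%s' "$gbp_gfid" | cut -c3-4)   # next two hex characters
    printf '%s/.glusterfs/%s/%s/%s\n' "$gbp_brick" "$gbp_a" "$gbp_b" "$gbp_gfid"
}

# Example with the brick path and GFID from this thread:
gfid_backend_path /data/glusterfs ade6f31c-b80b-457e-a054-6ca1548d9cd3
```

The resulting path can then be passed to stat or cat on the brick node, as in the quoted messages.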
Best Regards,
Strahil Nikolov
On Fri, Nov 5, 2021 at 19:29, Thorsten Walk <darkiop at gmail.com> wrote:
Hi Guys,
I pushed some VMs to the GlusterFS storage this week and ran them there. For a maintenance task, I moved these VMs to Proxmox-Node-2 and took Node-1 offline for a short time. After moving them back to Node-1, some leftover files remained that are still listed for healing (see attachment). In the logs I can't find anything about the gfids :)
┬[15:36:51] [ssh:root at pve02(192.168.1.51): /home/darkiop (755)]
Status: Healthy GlusterFS: 9.3
Nodes: 3/3 Volumes: 1/1
Replicate Started (UP) - 3/3 Bricks Up - (Arbiter Volume)
Capacity: (17.89% used) 83.00 GiB/466.00 GiB (used/total)
192.168.1.51:/data/glusterfs (4 File(s) to heal).
Distribute Group 1:
Number of entries: 0
Number of entries: 4
Number of entries: 0
┬[15:37:03] [ssh:root at pve02(192.168.1.51): /home/darkiop (755)]
╰─># cat /data/glusterfs/.glusterfs/ad/e6/ade6f31c-b80b-457e-a054-6ca1548d9cd3
┬[15:37:13] [ssh:root at pve02(192.168.1.51): /home/darkiop (755)]
╰─># grep -ir 'ade6f31c-b80b-457e-a054-6ca1548d9cd3' /var/log/glusterfs/*.log
On Mon, Nov 1, 2021 at 07:51, Thorsten Walk <darkiop at gmail.com> wrote:
After deleting the file, output of heal info is clear.
>Not sure why you ended up in this situation (maybe unlink partially failed on this brick?)
Neither am I. This was a completely fresh setup with 1-2 VMs and 1-2 Proxmox LXC templates. I let it run for a few days, and at some point it was in the state mentioned. I will continue to monitor it and start filling the bricks with data.
Thanks for your help!
On Mon, Nov 1, 2021 at 02:54, Ravishankar N <ravishankar.n at pavilion.io> wrote:
On Mon, Nov 1, 2021 at 12:02 AM Thorsten Walk <darkiop at gmail.com> wrote:
Hi Ravi, the file exists only on pve01, and only once there:
┬[19:22:10] [ssh:root at pve01(192.168.1.50): ~ (700)]
╰─># stat /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: fd12h/64786d Inode: 528 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-10-30 14:34:50.385893588 +0200
Modify: 2021-10-27 00:26:43.988756557 +0200
Change: 2021-10-27 00:26:43.988756557 +0200
┬[19:24:41] [ssh:root at pve01(192.168.1.50): ~ (700)]
╰─># ls -l /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
.rw-r--r-- root root 6B 4 days ago /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
┬[19:24:54] [ssh:root at pve01(192.168.1.50): ~ (700)]
╰─># cat /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
Hi Thorsten, you can delete the file. From the file size and contents, it looks like it belongs to ovirt sanlock. Not sure why you ended up in this situation (maybe unlink partially failed on this brick?). You can check the mount, brick and self-heal daemon logs for this gfid to see if you find related error/warning messages.
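The stat output quoted above shows "Links: 1". For a regular file on a brick, the entry under .glusterfs is normally a hard link to the file at its named path, so a link count of 1 suggests that no named file is attached to the GFID any more. A minimal sketch of that check, assuming GNU stat (as on the Proxmox nodes here); the function name gfid_link_status is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not a GlusterFS command): classify a brick-side
# GFID file by its hard-link count. Assumes GNU coreutils stat.
gfid_link_status() {
    gls_path="$1"
    if [ ! -f "$gls_path" ]; then
        # Missing, or not a regular file (directories are handled differently).
        echo "not-regular"
        return 1
    fi
    gls_links=$(stat -c %h "$gls_path")   # %h prints the number of hard links
    if [ "$gls_links" -gt 1 ]; then
        echo "linked"    # some other name still points to this inode
    else
        echo "orphan"    # only the .glusterfs entry itself is left
    fi
}
```

Called as gfid_link_status /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768 on pve01, this should print "orphan" for the file shown above, given its link count of 1. It is only a heuristic; the brick and self-heal daemon logs remain the place to look for the cause.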