<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word;line-break:after-white-space"><div>Erik,</div><div><br></div><div>the original problem sounds to me rather like a qemu problem 😕</div><div>When using GFAPI with libvirtd have an eye on selinux/apparmor, this is sometimes really troublesome!<br><br>Your config looks exactly like mine, with the exception, that I don't use scsi.<br>I host the plain images (raw & qcow) on the gluster volume like<br><target dev="vda" bus="virtio"/></div><div><br></div><div>A.</div><div><br></div><div>Am Montag, dem 14.10.2024 um 15:57 +0000 schrieb Jacobson, Erik:</div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div class="WordSection1"><p class="MsoNormal">First a heartfelt thanks for writing back.<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">In a solution (not having this issue) we do use nfs-ganesha to host filesystem squashfs root FS objects to compute nodes. It is working great. We also have fuse-through-LIO.<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">The solution here is 3 servers making up with cluster admin node.<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">The XFS issue is only observed when we try to replace an existing one with another XFS on top, and only with RAW, and only inside the VM. So it isn’t like data is being corrupted. However, it’s hard to replace a filesystem with another like you would do if you re-install one of what may be several operating systems on that disk image.<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">I am interested in your GFAPI information. I rebuilt RHEL9.4 qemu and changed the spec file to produce the needed gluster block package, and referred to the image file via the gluster protocol. My system got horrible scsi errors and sometimes didn’t even boot from a live environment. I repeated the same failure with sles15. I did this with a direct setup (not volumes/pools/etc).<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">I could experiment with Ubuntu if needed so that was a good data point.<o:p></o:p></p><p class="MsoNormal"><o:p> </o:p></p><p class="MsoNormal">I am interested in your setup to see what I may have missed. 
A.

On Monday, 2024-10-14 at 15:57 +0000, Jacobson, Erik wrote:

First, a heartfelt thanks for writing back.

In a solution (one not having this issue) we do use nfs-ganesha to serve squashfs root filesystem objects to compute nodes. It is working great. We also have fuse-through-LIO.

The solution here is three servers making up the cluster admin node.

The XFS issue is only observed when we try to replace an existing XFS filesystem with another one on top, only with raw images, and only inside the VM. So it isn't that data is being corrupted. However, it makes it hard to replace one filesystem with another, as you would when re-installing one of what may be several operating systems on that disk image.

I am interested in your GFAPI information. I rebuilt the RHEL 9.4 qemu, changing the spec file to produce the needed gluster block package, and referred to the image file via the gluster protocol. My system got horrible scsi errors and sometimes didn't even boot from a live environment. I reproduced the same failure with SLES 15. I did this with a direct setup (not volumes/pools/etc.).

I could experiment with Ubuntu if needed, so that was a good data point.

I am interested in your setup to see what I may have missed. If I simply made a mistake configuring GFAPI, that would be welcome news.

    <devices>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <disk type='network' device='disk'>
        <driver name='qemu' type='raw' cache='none'/>
        <source protocol='gluster' name='adminvm/images/adminvm.img' index='2'>
          <host name='localhost' port='24007'/>
        </source>
        <backingStore/>
        <target dev='sdh' bus='scsi'/>
        <alias name='scsi1-0-0-0'/>
        <address type='drive' controller='1' bus='0' target='0' unit='0'/>
      </disk>
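(As a sanity check on a rebuild like that, something along these lines should confirm the gluster block driver is actually present; the package name and the gluster:// URI simply reuse the names from the config above and may differ on other builds:)

    rpm -q qemu-kvm-block-gluster
    qemu-img info gluster://localhost/adminvm/images/adminvm.img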
From: Gluster-users <gluster-users-bounces@gluster.org> on behalf of Andreas Schwibbe <a.schwibbe@gmx.net>
Date: Monday, October 14, 2024 at 4:34 AM
To: gluster-users@gluster.org <gluster-users@gluster.org>
Subject: Re: [Gluster-users] XFS corruption reported by QEMU virtual machine with image hosted on gluster

Hey Erik,

I am running a similar setup with no issues, with Ubuntu host systems on HPE DL380 Gen 10 servers.
I used to run libvirt/qemu via nfs-ganesha on top of gluster flawlessly.
Recently I upgraded to the native GFAPI implementation, which is poorly documented, with snippets scattered all over the internet.

Although I cannot provide a direct solution for your issue, I suggest trying either nfs-ganesha as a replacement for the fuse mount, or GFAPI.
Happy to share libvirt/GFAPI config hints to make it happen.

Best
A.

On Sunday, 2024-10-13 at 21:59 +0000, Jacobson, Erik wrote:

Hello all! We are experiencing a strange problem with QEMU virtual machines where the virtual machine image is hosted on a gluster volume. Access is via fuse. (Our GFAPI attempt failed; it doesn't seem to work properly with current QEMU/distro/gluster versions.) We have the volume tuned for 'virt'.

We use qemu-img to create a raw image; sparse and falloc preallocation give equal results. We then start a virtual machine (libvirt, qemu-kvm), with libvirt/qemu pointing at the image file we created on the fuse mount.

When we create partitions and filesystems, as you might do when installing an operating system, all is well at first. This includes a root XFS filesystem.

When we try to re-make the XFS filesystem over the old one, it will not mount and XFS reports corruption.
If you dig in with the XFS repair tools, you find a UUID mismatch between the superblock and the log. The log always retains the UUID of the original filesystem (the one we tried to replace). Running xfs_repair doesn't truly repair; it just reports more corruption. Forcing the log to be re-made with xfs_db doesn't help either.
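(For reference, the kind of commands involved in seeing that, run against the affected device inside the VM; a sketch only:)

    xfs_repair -n /dev/sda1                        # no-modify check: reports the superblock/log UUID mismatch
    xfs_db -c "sb 0" -c "print uuid" /dev/sda1     # show the superblock UUID
    xfs_repair -L /dev/sda1                        # zero the log so it is re-created; per the above, re-making the log did not help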
We can duplicate this even with a QEMU raw image of only 50 megabytes. As far as we can tell, XFS is the only filesystem showing this behavior, or at least the only one reporting a problem.

If we take QEMU out of the picture (create partitions directly on the raw image file, use kpartx to create devices for the partitions, and run a similar test), the gluster-hosted image behaves as you would expect and XFS reports no problem. We can't duplicate the problem outside of QEMU.
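(Roughly, that host-side control test looks like the following; a sketch only, test.img is a placeholder name, and the loop device name under /dev/mapper will vary:)

    qemu-img create -f raw /adminvm/images/test.img 50M
    sgdisk --set-alignment=4096 --clear /adminvm/images/test.img
    sgdisk --set-alignment=4096 --new=1:0:0 /adminvm/images/test.img
    kpartx -av /adminvm/images/test.img            # maps the partition, e.g. /dev/mapper/loop0p1
    mkfs.xfs -L fs1 /dev/mapper/loop0p1
    mkfs.xfs -f -L fs1 /dev/mapper/loop0p1         # re-make the filesystem over the old one
    mkdir -p /a && mount /dev/mapper/loop0p1 /a && umount /a   # mounts cleanly in this case
    kpartx -dv /adminvm/images/test.img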
We have observed the issue in Rocky 9.4 and SLES 15 SP5 environments (including their matching QEMU versions). We have not tested more distros yet.

We observed the problem originally with Gluster 9.3. We reproduced it with Gluster 9.6 and 10.5.

If we switch from QEMU raw to qcow2 images, the problem disappears.

The problem is not reproduced when we take gluster out of the equation (pointing QEMU at a local disk image instead of a gluster-hosted one works fine).

The problem can be reproduced this way:

* Assume /adminvm/images is on a gluster sharded volume
* rm /adminvm/images/adminvm.img
* qemu-img create -f raw /adminvm/images/adminvm.img 50M

Now start the virtual machine that refers to the adminvm.img file above:

* Boot a rescue environment, live image, or similar
* sgdisk --zap-all /dev/sda
* sgdisk --set-alignment=4096 --clear /dev/sda
* sgdisk --set-alignment=4096 --new=1:0:0 /dev/sda
* mkfs.xfs -L fs1 /dev/sda1
* mkdir -p /a
* mount /dev/sda1 /a
* umount /a
* # Make the same filesystem again:
* mkfs.xfs -f -L fs1 /dev/sda1
* mount /dev/sda1 /a
* This will fail with kernel backtraces and corruption reported
* xfs_repair will report the log-vs-superblock UUID mismatch I mentioned

Here are the volume settings:

# gluster volume info adminvm

Volume Name: adminvm
Type: Replicate
Volume ID: de655913-aad9-4e17-bac4-ff0ad9c28223
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 172.23.254.181:/data/brick_adminvm_slot2
Brick2: 172.23.254.182:/data/brick_adminvm_slot2
Brick3: 172.23.254.183:/data/brick_adminvm_slot2
Options Reconfigured:
storage.owner-gid: 107
storage.owner-uid: 107
performance.io-thread-count: 32
network.frame-timeout: 10800
cluster.lookup-optimize: off
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
network.ping-timeout: 20
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.granular-entry-heal: enable
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
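(The volume is tuned for 'virt' as mentioned in the first paragraph; for anyone comparing, the group profile can be applied and inspected like this, assuming the stock group-file location shipped by most gluster packages:)

    gluster volume set adminvm group virt
    cat /var/lib/glusterd/groups/virt      # the option set the profile applies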
Any help or ideas would be appreciated. Let us know if we have an incorrect setting or have made an error.

Thank you all!

Erik

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users