[Gluster-users] qemu raw image file - qemu and grub2 can't find boot content from VM

Wed Jan 27 19:02:27 UTC 2021

> Also, I would like to point out that I have VMs with large disks (1TB and
> 2TB) and have no issues. I would definitely upgrade the Gluster version to,
> let's say, at least 7.9.

Great! Thank you! We can update, but it's very sensitive due to the
workload. I can't officially update our gluster until we have a cluster
with a couple thousand nodes to test with. However, for this problem,
updating is on my list for the test machine. I'm hoping I can reproduce the
failure first. So far, no luck making it happen again. Once I hit it, I will
try to collect more data and then, at the end, update gluster.

What do you think about the suggestion to increase the shard size? Are
you using the default size on your 1TB and 2TB images?
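
For reference while comparing, here is a rough sketch of the arithmetic and of
the command involved. It assumes the stock default features.shard-block-size
of 64MB and reads the 5T passed to qemu-img as 5 TiB. The helper names (run,
shard_count, set_shard_size) and the DRY_RUN switch are made up for
illustration, and as far as I know a new block size only applies to files
created after the change.

```shell
#!/bin/sh
# Hypothetical helpers; only the gluster option name comes from the Gluster CLI.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Number of shard-sized blocks for an image of $1 MiB at $2 MiB per shard
# (rounded up).
shard_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Set the shard block size on a volume, e.g. set_shard_size adminvm 512MB.
set_shard_size() {
    run gluster volume set "$1" features.shard-block-size "$2"
}
```

With a 5 TiB image (5242880 MiB), `shard_count 5242880 64` gives 81920 blocks
and `shard_count 5242880 512` gives 10240. Running
`DRY_RUN=1 set_shard_size adminvm 512MB` prints the command rather than
executing it.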

> Amar also asked a question regarding enabling sharding in the volume after
> creating the VM disks, which would certainly mess up the volume if that is
> what happened.

Oh, I missed this question. I basically scripted the setup since I was
doing it so often. The volume options are set before the image is created,
as the script shows. I have a similar script that tears it all down to start
over.

set -x
pdsh -g gluster mkdir /data/brick_adminvm/
gluster volume create adminvm replica 3 transport tcp 172.23.255.151:/data/brick_adminvm 172.23.255.152:/data/brick_adminvm 172.23.255.153:/data/brick_adminvm
gluster volume set adminvm group virt
gluster volume set adminvm granular-entry-heal enable
gluster volume set adminvm storage.owner-uid 439
gluster volume set adminvm storage.owner-gid 443
gluster volume start adminvm

pdsh -g gluster mount /adminvm

echo -n "press enter to continue for restore tarball"
read -r _

pushd /adminvm
tar xvf /root/backup.tar
popd

echo -n "press enter to continue for qemu-img"
read -r _

pushd /adminvm
qemu-img create -f raw -o preallocation=falloc /adminvm/images/adminvm.img 5T
popd
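
Since the question was about ordering, one way to guard against creating the
image before sharding is on would be a check right before the qemu-img step.
This is only a sketch: the function names are made up, and it assumes that
`gluster volume get` prints an option/value table with the option name in the
first column and the value in the second.

```shell
#!/bin/sh
# Hypothetical guard, not part of the script above.

# Read `gluster volume get` output on stdin and print the value of option $1.
option_value() {
    awk -v opt="$1" '$1 == opt { print $2 }'
}

# Succeed only if features.shard reports "on" for volume $1.
shard_is_on() {
    [ "$(gluster volume get "$1" features.shard | option_value features.shard)" = "on" ]
}
```

Calling `shard_is_on adminvm || exit 1` just before the qemu-img line would
stop the script if sharding was not reported as on.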

Thanks again for the kind responses,

Erik

> 
> On Wed, Jan 27, 2021 at 5:28 PM Erik Jacobson <erik.jacobson at hpe.com> wrote:
> 
>     > > Shortly after the sharded volume is made, there are some fuse mount
>     > > messages. I'm not 100% sure if this was just before or during the
>     > > big qemu-img command to make the 5T image
>     > > (qemu-img create -f raw -o preallocation=falloc
>     > > /adminvm/images/adminvm.img 5T)
>     > Any reason to have a single disk with this size?
> 
>     > Usually in any
>     > virtualization I have used, it is always recommended to keep it lower.
>     > Have you thought about multiple disks with smaller size?
> 
>     Yes, because the actual virtual machine is an admin node/head node cluster
>     manager for a supercomputer that hosts big OS images and drives
>     multi-thousand-node-clusters (boot, monitoring, image creation,
>     distribution, sometimes NFS roots, etc.). So this VM is a biggie.
> 
>     We could make multiple smaller images but it would be very painful since
>     it differs from the normal non-VM setup.
> 
>     So unlike many solutions where you have lots of small VMs with their
>     small images, this solution is one giant VM with one giant image.
>     We're essentially using gluster in this use case (as opposed to others I
>     have posted about in the past) for head node failover (combined with
>     pacemaker).
> 
>     > Also worth
>     > noting is that RHHI is supported only when the shard size is 512MB, so
>     > it's worth trying a bigger shard size.
> 
>     I have put a larger shard size and a newer gluster version on the list
>     to try. Thank you! Hoping to get it failing again to try these things!
> 
> 
> 
> --
> Respectfully
> Mahdi