[Gluster-users] KVM guest I/O errors with xfs backed gluster volumes

Samuli Heinonen samppah at neutraali.net
Wed Nov 6 12:40:49 UTC 2013

On 6.11.2013 14:33, Jacob Yundt wrote:
> On Tue, Nov 5, 2013 at 10:56 PM, Bharata B Rao <bharata.rao at gmail.com> wrote:
>> My below mail didn't make it to the list, hence resending...
>> On Tue, Nov 5, 2013 at 8:04 PM, Bharata B Rao <bharata at linux.vnet.ibm.com> wrote:
>>> On Wed, Oct 30, 2013 at 11:26:48PM +0530, Bharata B Rao wrote:
>>>> On Tue, Oct 29, 2013 at 1:21 PM, Anand Avati <avati at gluster.org> wrote:
>>>>> Looks like what is happening is that qemu performs ioctls() on the backend
>>>>> to query logical_block_size (for direct IO alignment). That works on XFS,
>>>>> but fails on FUSE (hence qemu ends up performing IO with default 512
>>>>> alignment rather than 4k).
>>>>> Looks like this might be something we can enhance in the gluster driver in qemu.
>>>>> Note that glusterfs does not have an ioctl() FOP, but we could probably
>>>>> wire up a virtual xattr call for this purpose.
>>>>> Copying Bharata to check if he has other solutions in mind.
>>>> I see alignment issues and subsequent QEMU failure (pread() failing with
>>>> EINVAL) when I use a file from XFS mount point (with sectsz=4k) as a virtio
>>>> disk with cache=none QEMU option. However this failure isn't seen when I
>>>> have sectsz=512. And all this is w/o gluster. So there seems to be some
>>>> alignment issues even w/o gluster, I will debug more and get back.
>>> I gather that QEMU block layer and SeaBIOS don't yet support 4k sectors.
>>> So this is not a QEMU-GlusterFS specific issue.
>>> You could either avoid the cache=none option, which results in O_DIRECT,
>>> or use something like the below, which explicitly sets the sector size
>>> and min io size for the guest.
>>> -drive file=/mnt/xfs.img,if=none,cache=none,format=raw,id=mydisk -device virtio-blk,drive=mydisk,logical_block_size=4096,physical_block_size=4096,min_io_size=4096
>>> Ref: https://bugzilla.redhat.com/show_bug.cgi?id=997839
>>> Regards,
>>> Bharata.
> Bharata-
> Thanks for the update on this.  I'm going to give these qemu args a
> try and see what happens.
> On a side-note, I can't believe more users aren't running into this
> issue.  I assumed (perhaps incorrectly) that most modern drives were
> using 4K sectors.
> -Jacob

Jacob, are you using xfs on top of an HDD, or are you using some kind of RAID?

We have disks with 4K sectors and we are using those in a RAID-6 setup
with an LSI MegaRAID controller. We haven't run into these issues and I
wasn't able to reproduce them. I only ran very quick tests, though, so I
may have missed something.
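
One way to compare setups is to look at the sector sizes the kernel
reports for the block device under the XFS brick. Below is a minimal
Python sketch, assuming a Linux host where sysfs exposes
/sys/block/<device>/queue/logical_block_size and physical_block_size.
The function names and the device name "sda" are only illustrative, and
a RAID controller may report sizes that differ from those of the member
disks. The alignment helper checks only offset and length; O_DIRECT
typically also requires the memory buffer to be aligned, which is not
covered here.

```python
from pathlib import Path


def block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) block size in bytes as reported by sysfs.

    `device` is a kernel block device name such as "sda" (illustrative).
    `sysfs_root` is a parameter only so the function can be pointed at a
    different directory when testing.
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def is_aligned(offset, length, logical_block_size):
    """True if both offset and length are multiples of the logical block size."""
    return offset % logical_block_size == 0 and length % logical_block_size == 0


if __name__ == "__main__":
    # "sda" is an assumption; substitute the device backing the brick.
    logical, physical = block_sizes("sda")
    print("logical:", logical, "physical:", physical)
    # A 512-byte request at offset 512, as a guest using 512-byte sectors might issue.
    print("512-byte request aligned:", is_aligned(512, 512, logical))
```

If the logical size comes back as 4096, a 512-byte request at offset 512
is not aligned, which would be consistent with the pread() EINVAL
described earlier in the thread. If it comes back as 512, that might
explain why the problem does not show up on a given setup.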


