[Gluster-users] XFS and MD RAID

Wed Aug 29 07:48:39 UTC 2012

Does anyone have any experience running gluster with XFS and MD RAID as the
backend, and/or LSI HBAs, especially bad experience?

In a test setup (Ubuntu 12.04, gluster 3.3.0, 24 x SATA HD on LSI Megaraid
controllers, MD RAID) I can cause XFS corruption just by throwing some
bonnie++ load at the array - locally without gluster.  This happens within
hours.  The same test, run for a week on ext4, causes no corruption.

I've just been bitten by this in production too, on a gluster brick I hadn't
converted to ext4.  I have the details and can post them separately if you
wish, but the main symptoms were XFS timeout errors and stack traces in dmesg,
and XFS corruption (requiring a reboot, with xfs_repair showing lots of errors
and almost certainly some data loss).

However, this leaves me with some unpalatable conclusions and I'm not sure
where to go from here.

(1) XFS is a shonky filesystem, at least in the version supplied in Ubuntu
kernels.  This seems unlikely given its pedigree and the fact that it is
heavily endorsed by Red Hat for their storage appliance.

(2) Heavy write load in XFS is tickling a bug lower down in the stack
(either MD RAID or LSI mpt2sas driver/firmware), but heavy write load in
ext4 doesn't.  This would have to be a gross error such as blocks queued for
write being thrown away without being sent to the drive.

I guess this is plausible - perhaps the usage pattern of write barriers is
different, for example.  However, I don't want to point the finger there
without direct evidence either.  There are no block I/O error events logged
in dmesg.

The only way I can think of to pin this down is to find the smallest MD RAID
array I can reproduce the problem with, then try to build a new system with
a different controller card (as MD RAID + JBOD, and/or as a hardware RAID
array).
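
One way to organise the search for the smallest reproducing array is to
bisect on the number of disks.  The sketch below is only an illustration, not
something I have run against real hardware.  `reproduces` is a hypothetical
callback standing for a manual test (build an n-disk MD array, create XFS on
it, run the bonnie++ load, check for corruption), and the function name and
bounds are made up too.  It assumes that if the problem shows up with n disks
it also shows up with more than n, which may well not hold.

```python
from typing import Callable


def smallest_failing_size(lo: int, hi: int,
                          reproduces: Callable[[int], bool]) -> int:
    """Return the smallest n in [lo, hi] for which reproduces(n) is True.

    Assumption (not verified): reproduces is monotone, i.e. once an
    array of n disks shows the corruption, every larger array does too.
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    # The largest array must fail, otherwise there is nothing to bisect.
    if not reproduces(hi):
        raise ValueError("problem does not reproduce at the upper bound")
    while lo < hi:
        mid = (lo + hi) // 2
        if reproduces(mid):
            hi = mid        # still fails: try a smaller array
        else:
            lo = mid + 1    # looks clean: the threshold is above mid
    return lo


if __name__ == "__main__":
    # Stand-in for the real test: pretend corruption needs at least 6 disks.
    fake_test = lambda n: n >= 6
    print(smallest_failing_size(2, 24, fake_test))
```

With 24 disks this needs about five test runs rather than one per array
size.  Since a failure can take hours to appear, a "clean" result is weak
evidence, so each clean run should be left going for a good while before
trusting it.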

However, while I work on that, I would be grateful for any other experience
people have in this area.

Many thanks,

Brian.