[Gluster-users] RAID 0 with Cache v/s NO-RAID

Brian Candler B.Candler at pobox.com
Fri Sep 28 07:42:36 UTC 2012


On Fri, Sep 28, 2012 at 08:58:55AM +0530, Indivar Nair wrote:
>    We were trying to cater to both large file (100MB - 2GB) read speed and
>    small file (10-50MB) read+write speed.
>    With Gluster, we were thinking of setting the individual stripe size to
>    50MB so that each volume could hold a complete small file. While larger
>    files could be striped across in 50MB chunks.

Sure, although you can do the same type of striping with RAID too. The RAID
might be slightly less efficient, in that a small file might straddle two
chunks whereas a small file in gluster will always hit one brick.  And
parity RAID is unlikely to work well with such a large chunk size: it's
very unlikely you will ever write a whole stripe at once, which is what you
need to do to avoid reading back the existing parity blocks.
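
As a rough back-of-the-envelope illustration (a Python sketch; the 12-disk
10+2 layout and chunk size below are assumptions, not your hardware):

    # stripe_math.py -- illustrative only; the 10+2 layout and chunk size are assumptions
    CHUNK_MB = 50                            # per-disk chunk ("stripe unit")
    DATA_DISKS = 10                          # 10 data + 2 parity disks in RAID6
    FULL_STRIPE_MB = CHUNK_MB * DATA_DISKS   # 500MB must land at once to skip
                                             # the parity read-modify-write

    for file_mb in (10, 50, 100, 500, 2048):
        full, partial = divmod(file_mb, FULL_STRIPE_MB)
        note = "plus a partial stripe (read-modify-write)" if partial else "no partial stripe"
        print(f"{file_mb:>5} MB file: {full} full-stripe write(s), {note}")

With 50MB chunks even a 2GB file only fills four full stripes, and every
file in your 10-50MB range is a pure partial-stripe write.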

>    The RAID Controllers that come with branded hardware does not allow
>    individual disk access (no passthrough mode)

It won't let you make one-disk RAID0 sets? However, I'm not convinced that
this will make much difference in your case anyway.  You're looking for high
throughput with large files, which is limited simply by drive throughput,
and a write cache will not help with that.

Write caches are useful for operations which write small bits of data and
must have confirmation that the data has been written to disk before
continuing (e.g. inode updates), where writing things out of sequence could
result in filesystem corruption.

Of course, the best answer is simply to measure it with a test workload
representative of your expected usage.
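
A very crude way to do that (a Python sketch; the paths and sizes are
assumptions, and a real test should use your actual file mix):

    # bench_sketch.py -- crude comparison of the two workload types; paths/sizes are assumptions
    import os, time

    def stream_write(path, total_mb=1024, block_mb=1):
        """Large sequential write, synced once at the end (drive-throughput bound)."""
        block = b"\0" * (block_mb * 1024 * 1024)
        t0 = time.time()
        with open(path, "wb") as f:
            for _ in range(total_mb // block_mb):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        return total_mb / (time.time() - t0)      # MB/s

    def small_sync_writes(path, count=1000, size=4096):
        """Many tiny writes, each forced to stable storage (where a write cache helps)."""
        data = b"\0" * size
        t0 = time.time()
        with open(path, "wb") as f:
            for _ in range(count):
                f.write(data)
                os.fsync(f.fileno())
        return count / (time.time() - t0)         # synced writes per second

    print("streaming :", round(stream_write("/mnt/brick/test.big")), "MB/s")
    print("small sync:", round(small_sync_writes("/mnt/brick/test.small")), "ops/s")

If the streaming numbers barely change with the controller cache enabled or
disabled, the cache isn't buying you anything for the large-file case.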

>    One more thought, is it possible to have a mix of RAID6 volumes, and
>    individual disks and force Gluster to write large files (*.ma) to RAID6
>    volumes and small files (*.iff) to individual disks. That would solve
>    our problem completely.

Technically yes, in that you could manually hack about with the translator
stack and put in the 'map' translator:
http://gluster.org/pipermail/gluster-users/2010-March/004292.html

However, this is completely unsupported by Red Hat (both the manual hacking
about with the stack and the map translator itself).

If it were me, I'd just put the *.ma files on one volume and the *.iff ones
on a different one, at the application layer.  But I'd only do this if I
really needed RAID10 for some data and RAID6 for the rest.
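
For example (a Python sketch; the mount points below are invented, you'd
pick paths matching your own volumes):

    # route_by_ext.py -- hypothetical application-layer routing; mount points are made up
    import os, shutil

    VOLUME_FOR_EXT = {
        ".ma":  "/mnt/vol_raid6",   # large scene files -> RAID6-backed volume
        ".iff": "/mnt/vol_disks",   # small image files -> per-disk bricks
    }

    def destination(path, default="/mnt/vol_raid6"):
        """Pick a mount point based on the file extension."""
        ext = os.path.splitext(path)[1].lower()
        return VOLUME_FOR_EXT.get(ext, default)

    def store(src):
        """Copy a file onto whichever volume its extension maps to."""
        dst = os.path.join(destination(src), os.path.basename(src))
        shutil.copy2(src, dst)
        return dst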

Regards,

Brian.


