[Gluster-users] RAID 0 with Cache v/s NO-RAID

Indivar Nair indivar.nair at techterra.in
Wed Oct 3 15:07:55 UTC 2012


Sorry I couldn't reply earlier; I was indisposed for the last few days.

Thanks for the input, Brian, especially on the 'map' translator.
It led me to another one called the 'switch' scheduler, which seems to do
exactly what I want, i.e. distribute files onto selected bricks based on
file extension.

I'm trying to find out more about it; please point me to more information
if you have any.
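
From what I have found so far, it looks as though the switch logic lives in
a 'cluster/switch' translator that can be swapped in for the distribute
(dht) translator by hand-editing the client volfile. Roughly like the
sketch below -- I have not yet verified the translator type string, the
option name or the pattern syntax, so please treat every name in it as a
guess:

    volume dist-switch
      type cluster/switch
      # unverified option name and pattern syntax:
      # send *.iff files to the plain-disk bricks, let everything
      # else fall through to the remaining subvolumes
      option pattern.switch.case "*.iff:jbod-brick-1,jbod-brick-2"
      subvolumes jbod-brick-1 jbod-brick-2 raid6-brick-1 raid6-brick-2
    end-volume

If anyone has a working example of this, I would love to see it.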

Regards,


Indivar Nair



On Fri, Sep 28, 2012 at 1:12 PM, Brian Candler <B.Candler at pobox.com> wrote:

> On Fri, Sep 28, 2012 at 08:58:55AM +0530, Indivar Nair wrote:
> >    We were trying to cater to both large file (100MB - 2GB) read speed
> >    and small file (10-50MB) read+write speed.
> >    With Gluster, we were thinking of setting the individual stripe size
> >    to 50MB so that each volume could hold a complete small file, while
> >    larger files could be striped across in 50MB chunks.
>
> Sure, although you can do the same type of striping with RAID too. The RAID
> might be slightly less efficient, in that a small file might straddle two
> chunks whereas a small file in gluster will always hit one brick.  And
> parity RAID is unlikely to work well with such a large chunk size: you will
> almost never write a whole stripe at once, so you cannot avoid reading back
> existing parity blocks.
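>
> For what it's worth, testing the gluster side of that is cheap: the 50MB
> chunk is just the stripe block size on the volume. Something like the
> following should do it -- hostnames and brick paths are placeholders, and
> I'm quoting the option name and its default from memory, so check the 3.3
> docs before relying on it:
>
>     # two-way stripe across two bricks (placeholder hosts/paths)
>     gluster volume create stripevol stripe 2 transport tcp \
>         server1:/export/brick1 server2:/export/brick1
>     gluster volume start stripevol
>     # raise the stripe block size from the (small) default to 50MB
>     gluster volume set stripevol cluster.stripe-block-size 50MB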
>
> >    The RAID controllers that come with branded hardware do not allow
> >    individual disk access (no passthrough mode).
>
> It won't let you make one-disk RAID0 sets? However, I'm not convinced that
> this will make much difference in your case anyway.  You're looking for
> high throughput with large files, which is limited simply by drive
> throughput, and a write cache will not help with that.
>
> Write caches are useful for operations which write small bits of data and
> must have confirmation that the data has been written to disk before
> continuing (e.g. inode updates), since doing those writes out of sequence
> could result in filesystem corruption.
>
> Of course, the best answer is simply to measure it with a test workload
> representative of your expected usage.
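>
> Even something as crude as dd against the mounted volume will tell you a
> lot about the large-file side (paths below are just placeholders):
>
>     # sequential write of a 2GB file, flushed to disk at the end
>     dd if=/dev/zero of=/mnt/glustervol/testfile bs=1M count=2048 conv=fsync
>     # drop the client page cache, then time a sequential read back
>     echo 3 > /proc/sys/vm/drop_caches
>     dd if=/mnt/glustervol/testfile of=/dev/null bs=1M
>
> For the small-file read+write side, a loop copying a few thousand of your
> real 10-50MB *.iff files will be far more representative than dd.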
>
> >    One more thought: is it possible to have a mix of RAID6 volumes and
> >    individual disks, and force Gluster to write large files (*.ma) to
> >    RAID6 volumes and small files (*.iff) to individual disks? That would
> >    solve our problem completely.
>
> Technically yes, in that you could manually hack about with the translator
> stack and put in the 'map' translator:
> http://gluster.org/pipermail/gluster-users/2010-March/004292.html
>
> However, this is completely unsupported by Red Hat (both manually hacking
> about with the stack, and the map translator itself).
>
> If it were me, I'd just put the *.ma files on one volume and the *.iff ones
> on a different one, at the application layer.  But I'd only do this if I
> really needed RAID10 for some data and RAID6 for the rest.
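>
> By "at the application layer" I just mean two ordinary volumes and two
> mount points, roughly as below (volume names, brick paths and mount points
> are only placeholders):
>
>     gluster volume create ma-vol \
>         server1:/export/raid/brick server2:/export/raid/brick
>     gluster volume create iff-vol \
>         server1:/export/jbod/brick server2:/export/jbod/brick
>     gluster volume start ma-vol
>     gluster volume start iff-vol
>
>     # on the clients
>     mount -t glusterfs server1:/ma-vol  /mnt/scenes
>     mount -t glusterfs server1:/iff-vol /mnt/textures
>
> and then point the application's project paths at the right mount for each
> file type.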
>
> Regards,
>
> Brian.
>