[Gluster-users] my own unify - raid5 setup, is it possible?

Mon Feb 15 11:54:03 UTC 2010

Hi Casper,
Please find my comments inline.

On Mon, Feb 15, 2010 at 3:38 PM, Casper Langemeijer <casper at bcx.nl> wrote:

> Hi List!
>
> Raghavendra G, thanks for your reply.
>
> On Mon, 2010-02-15 at 11:11 +0400, Raghavendra G wrote:
> >         Could I duplicate files to multiple data bricks in the cluster
> >         to
> >         provide a raid5-like setup? I very much want to be able to
> >         shutdown a single machine in the cluster and still have a
> >         fully functional filesystem. I'm very happy to write the
> >         application that does the copying over the data bricks myself.
>
> > We recommend using distribute translator instead of unify. But with
> > distribute (even with unify) data is not striped. Both translators
> > (unify and distribute) are used to aggregate multiple storage nodes
> > into a single filesystem. If you want to increase read performance
> > using stripe, you can use stripe translator.
>
> If I get this correctly: Using both stripe and distribute, I can create
> a redundant distributed filesystem. Basically a networked RAID10 (or
> 1+0, or 0+1; those are details).
>
> I'm looking for a networked RAID5-like system though.
>

You can create a RAID5-like system using stripe and replicate. Replicate
provides the redundancy that RAID5 gets from distributed parity, although it
does so by keeping full copies rather than parity blocks. If you just want a
RAID5-like setup, you don't need distribute or unify. Please note, however,
that we do not officially support this combination (yet). We are willing to
work on any problems you face.
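
To make the difference concrete, here is a minimal Python sketch. It is an
illustration only, not GlusterFS code; the function name xor_blocks and the
sample blocks are made up for this example. It shows how RAID5 parity rebuilds
a lost block by XOR, whereas replicate simply keeps a second full copy. The
brick counts at the end assume two-way replication.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# RAID5-style layout: two data blocks plus one parity block, on three bricks.
data = [b"gluster!", b"stripes."]
parity = xor_blocks(data)

# Simulate losing the brick that held data[0]: the surviving data block
# XORed with the parity block gives the lost block back.
rebuilt = xor_blocks([data[1], parity])
assert rebuilt == data[0]

# Brick count needed to survive one failure while holding two blocks:
raid5_bricks = len(data) + 1      # data blocks + one parity block
replicate_bricks = 2 * len(data)  # stripe over two-way replicated pairs (assumed)
print(raid5_bricks, replicate_bricks)
```

So stripe over replicate tolerates a single brick failure like RAID5 does, but
it pays for that with a full second copy instead of one parity block.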

> >         Another advantage could be that I can decide on a per-file
> >         basis how many copies of a file exist in the filesystem. (Two
> >         would be a minimum for me) The real-world scenario: This would
> >         be the data filesystem for a webserver cluster setup. You can
>         imagine images used on a homepage are requested more frequently
> >         than others.
> >
> > replicate (formerly known as afr) does not support maintaining a
> > different number of replicas for different files.
>
> I know I'm doing something that unify was not intended for. I did some
> simple tests. My two data bricks are unified by two clients, with the
> subvolumes specified in a different order (client1 has 'subvolumes data1
> data2', client2 has 'subvolumes data2 data1').
>
> Reading works. I confirmed that unify reads from the first data brick
> available. It remembers what brick a file is on. Once a file is found to
> be on data1, it won't change to data2.
>
> Removing works. Files are not only removed from the namespace brick, but
> also from every client. No stale data is left behind.
>
> Renaming works. Similar to remove, files are renamed on all bricks.
>
> Modifying doesn't work. My simple tests showed one copy to be modified,
> while the others got truncated. I'll investigate later on. If I can get my
> application to not modify data in place, but instead do a 'create tmp,
> remove old, rename tmp to old' cycle, I might be there.
>
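
The create/remove/rename cycle described above could look roughly like the
following Python sketch. It assumes a POSIX-style mount; the helper name
replace_file and the ".tmp-" prefix are hypothetical. Whether unify applies
each step to every copy in this unsupported multi-copy layout is untested
here, and copying the new file to the other bricks would still be the
application's job.

```python
import os
import tempfile


def replace_file(path, data):
    """Replace the content of `path` without modifying the old file in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1. Create the temporary file next to the target, so the final rename
    #    stays inside one directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new content to storage first
        # 2. Remove the old file, if there is one.
        if os.path.exists(path):
            os.remove(path)
        # 3. Rename the temporary file to the old name.
        os.rename(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "image.bin")
    replace_file(target, b"first version")
    replace_file(target, b"second version")
    with open(target, "rb") as f:
        print(f.read())
```

On a local POSIX filesystem the rename alone would already replace the old
file; the explicit remove is kept only to mirror the cycle as described.
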
> It seems that although it's not meant to work this way, I found my
> networked RAID5-like system, as long as I'm willing to create copies of
> files to other bricks myself. I very much understand that I won't get
> any guarantees.
>
> >         What problems can I expect with this setup?
> >         Have others tried a similar setup?
> >         Am I missing a GlusterFS feature that would implement what I
> >         want, in a much easier way?
>
> I think I've got the answer to the last question. GlusterFS provides a
> RAID10-like setup, but nothing like a RAID5-like one.
>
> I would still like to know if I'm missing stuff here. I haven't thought
> of any performance issues for example.
>
> Also: I'm just starting to use unify, and I'm already using an
> obsolete/legacy translator. For me switching to cluster/distribute is
> not an option. Does that mean I'll be locked in to GlusterFS 2.0.9?
>

Yes.

>
> Greetings, Casper
>

regards,
-- 
Raghavendra G