[Gluster-devel] New unify scheduler design proposition
robert.newson at gmail.com
Wed Mar 19 13:31:37 UTC 2008
You might want to look at Ceph (http://ceph.newdream.net/) which
partitions data by hashing too.
Specifically the paper on "CRUSH - Controlled, Scalable, Decentralized
Placement of Replicated Data"
Some of those ideas added to glusterfs could be very cool.
On Wed, Mar 19, 2008 at 3:17 AM, Daniel van Ham Colchete
<daniel.colchete at gmail.com> wrote:
> Hello y'all!
> I've been away from the community lately as I had to focus on some other
> stuff here at my work, but I'm still really anxious for GlusterFS
> 1.3.8 to be released so I can resume my tests and production
> environments again :).
> As I was just going to bed and thinking about millions of small files
> as a solution to a problem I'm trying to solve here, I had this idea:
> The Partitioner Scheduler
> One liner: it's a scheduler that chooses which file goes to which
> server based on a 1 bit hash of its name (or path inside gluster
> mount + name).
> What do you win? You know where to look for the file.
> Picture this: 10 8HD 3TB RAID6 servers unified (no AFR). Question:
> where is file x? Unify would send a request to everyone asking: do you
> have it? So you probably have 80 hard drive heads searching for the
> directory index sector. That's really bad when you're dealing with
> small files; it's like everything stopping for 50ms because of a file
> lookup. Instead, request it from the server where the file should be.
> If it isn't there, ask everybody else.
> Implementation: get a really fast, well-distributed 8-bit hash;
> basically you're splitting your files across 256 possible computers.
> You only have 2? One is 0-127, two is 128-255.
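To make the range split above concrete, here is a minimal sketch in Python. It is only an illustration: the function names are made up, and using the low byte of CRC32 as the "fast 8-bit hash" is my assumption, not something the proposal specifies.

```python
import zlib

HASH_SPACE = 256  # an 8-bit hash yields values 0..255


def hash8(path: str) -> int:
    """8-bit hash of a path. Assumption: low byte of CRC32 stands in
    for the 'really fast, well-distributed' hash in the proposal."""
    return zlib.crc32(path.encode("utf-8")) & 0xFF


def server_for_hash(h: int, num_servers: int) -> int:
    """Map a hash value onto equal contiguous ranges, one per server."""
    if not 1 <= num_servers <= HASH_SPACE:
        raise ValueError("need between 1 and 256 servers")
    if not 0 <= h < HASH_SPACE:
        raise ValueError("hash value out of range")
    return h * num_servers // HASH_SPACE


def server_for(path: str, num_servers: int) -> int:
    """Index of the server where the scheduler would place this path."""
    return server_for_hash(hash8(path), num_servers)
```

With two servers, hashes 0-127 map to index 0 and 128-255 map to index 1, matching the split described above.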
> Question: when a file isn't at the expected server, should we move it
> there? I don't know, I can always imagine a completely crazy Unify+AFR
> situation where someone could screw things up if he really puts his
> mind to it. But if we don't move it, upgrading the cluster would bring
> the old problem back, at least at the beginning.
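The lookup-then-fallback behaviour, with the "should we move it?" question as a switch, might look roughly like this. It is a sketch under assumptions: servers are modelled as plain in-memory dicts, the names `expected_index`, `lookup` and `migrate` are hypothetical, and the fallback asks servers one by one, whereas real unify would presumably ask them in parallel.

```python
import zlib
from typing import Dict, List, Optional, Tuple

Server = Dict[str, bytes]  # hypothetical stand-in for one storage server


def expected_index(path: str, num_servers: int) -> int:
    """Server the 8-bit hash points at (low byte of CRC32, an assumed hash)."""
    return (zlib.crc32(path.encode("utf-8")) & 0xFF) * num_servers // 256


def lookup(servers: List[Server], path: str,
           migrate: bool = False) -> Tuple[Optional[int], int]:
    """Return (index of the server holding path or None, servers asked)."""
    home = expected_index(path, len(servers))
    asked = 1
    if path in servers[home]:
        return home, asked  # common case: a single request
    # Miss on the expected server: fall back to asking everybody else.
    for i, server in enumerate(servers):
        if i == home:
            continue
        asked += 1
        if path in server:
            if migrate:
                # Move the file to where the hash says it belongs.
                servers[home][path] = server.pop(path)
                return home, asked
            return i, asked
    return None, asked
```

Without `migrate`, every file written before the servers were added or re-ranged keeps paying the full broadcast cost on each lookup; with it, the cost is paid once per file.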
> Problem: well, I'm assuming the servers are pretty much alike.
> Solution: get a bigger hash and add weights to the hash distribution.
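One way the "bigger hash plus weights" idea could be sketched: give each server a slice of the hash space proportional to its weight. Everything here is an assumption for illustration: the 16-bit hash width, the CRC32-based hash, and the function names are not from the proposal.

```python
import bisect
import zlib
from itertools import accumulate
from typing import List, Sequence

HASH_BITS = 16              # assumed "bigger hash" width
HASH_SPACE = 1 << HASH_BITS


def boundaries(weights: Sequence[int]) -> List[int]:
    """Exclusive upper hash bound per server, proportional to its weight."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [c * HASH_SPACE // total for c in accumulate(weights)]


def weighted_server_for_hash(h: int, bounds: Sequence[int]) -> int:
    """Index of the first server whose upper bound is greater than h."""
    return bisect.bisect_right(bounds, h)


def weighted_server_for(path: str, weights: Sequence[int]) -> int:
    """Pick a server for path; weights could be, say, capacity in TB."""
    h = zlib.crc32(path.encode("utf-8")) & (HASH_SPACE - 1)
    return weighted_server_for_hash(h, boundaries(weights))
```

With weights `[1, 3]`, the first server owns hashes 0-16383 and the second owns 16384-65535, so the second should receive about three quarters of the files.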
> Problem 2: although the name explains how it works I think there was
> another thing using the same name in the storage area, but can't
> remember what ;-)... Two different things, same name, not good...
> Solution: The Colchete's Scheduler? Just kidding... hahaha
> The idea is not really original, if you look at what Google's Bigtable
> does to be scalable. PostgreSQL and Oracle also achieve a lot by knowing
> where to look for some information. You can still have Partitioned
> Unify over a lot of 3-AFR or 2-AFR to increase reliability.
> Well, if you have fewer than 6 servers you wouldn't really care about
> this, I think. If you have a small number of big files it wouldn't be
> very useful either, but that's the easy case everywhere.
> Best regards,
> Daniel Colchete