[Gluster-devel] ZkFarmer

Tue May 8 04:33:50 UTC 2012

On Mon, May 7, 2012 at 7:43 AM, Jeff Darcy <jdarcy at redhat.com> wrote:
> I've long felt that our ways of dealing with cluster membership and staging of
> config changes is not quite as robust and scalable as we might want.
> Accordingly, I spent a bit of time a couple of weeks ago looking into the
> possibility of using ZooKeeper to do some of this stuff.  Yeah, it brings in a
> heavy Java dependency, but when I looked at some lighter-weight alternatives
> they all seemed to be lacking in more important ways.  Basically the idea was
> to do this:
>
> * Set up the first N (e.g. N=3) nodes in our cluster as ZooKeeper servers, or
> point everyone at an existing ZooKeeper cluster.
>
> * Use ZK ephemeral nodes as a way to track cluster membership ("peer probe"
> merely updates ZK, and "peer status" merely reads from it).
>
> * Store config information in ZK *once* instead of regenerating volfiles etc.
> on every node (and dealing with the ugly cases where a node was down when the
> config change happened).
>
> * Set watches on ZK nodes to be notified when config changes happen, and
> respond appropriately.
>
> I eventually ran out of time and moved on to other things, but this or
> something like it (e.g. using Riak Core) still seems like a better approach
> than what we have.  In that context, it looks like ZkFarmer[1] might be a big
> help.  AFAICT someone else was trying to solve almost exactly the same kind of
> server/config problem that we have, and wrapped their solution into a library.
>  Is this a direction other devs might be interested in pursuing some day,
> if/when time allows?
>
>
> [1] https://github.com/rs/zkfarmer

Real issue is here is: GlusterFS is a fully distributed system. It is
OK for config files to be in one place (centralized). It is easier to
manage and backup. Avati still claims that making distributed copies
are not a problem (volume operations are fast, versioned and
checksumed). Also the code base for replicating 3 way or all-node is
same. We all need to come to agreement on the demerits of replicating
the volume spec on every node.

If we are convinced to keep the config info in one place, ZK is
certainly one a good idea. I personally hate Java dependency. I still
struggle with Java dependencies for browser and clojure. I can digest
that if we are going to adopt Java over Python for future external
modules. Alternatively we can also look at creating a replicated meta
system volume. What ever we adopt, we should keep dependencies and
installation steps to the bare minimum and simple.

-- 
Anand Babu Periasamy
Blog [ http://www.unlocksmith.org ]
Twitter [ http://twitter.com/abperiasamy ]

Imagination is more important than knowledge --Albert Einstein