[Gluster-devel] Glusterd: A New Hope

Anand Babu Periasamy abperiasamy at gmail.com
Sat Mar 23 23:58:17 UTC 2013

On Fri, Mar 22, 2013 at 3:08 PM, Stephan von Krawczynski
<skraw at ithnet.com> wrote:
> On Fri, 22 Mar 2013 14:27:45 -0400
> Jeff Darcy <jdarcy at redhat.com> wrote:
>> On 03/22/2013 02:20 PM, Anand Avati wrote:
>> > The point is that it was never a question of performance - it was to
>> > just get basic functionality "working".
>> I stand corrected.  Here's the amended statement.
>> "The need for some change here is keenly felt right now as we struggle
>> to fix all of the race conditions that have resulted from the hasty
>> addition of synctasks to make up for poor event handling elsewhere in
>> that 44K lines of C."
> I have never heard a longer version of "we give up, because the code is BS".
> Sorry, but for me it feels like you should just throw away the whole 3.X
> series and re-implement the self-heal design - which is ok - based on top
> of 2.X and _use config files_.
> Your whole pseudo-automatism around glusterfsd is just bloatware.
> And to me it looks you just failed to survive your self-created complexity,
> exactly what I told you months ago.
> "We can't do it, so lets pile the sh*t before someone else's home..."
> WTF...
> Why is it you cannot accept that it should be a _filesystem_, and nothing else?
> It would have been a lot better to care about stability, keep it simple and
> feel fine. Concentrate on the strength (client based replication setups) and
> forget the rest.
> Sorry, someone has to tell you... beat me.
> --
> Regards,
> Stephan

I agree with your point. Users like GlusterFS because it is simple and
2.x was simpler and more powerful.

Glusterd was introduced for a reason:
* New users struggled with the volume spec files. Writing one meant
building your own file system, which in turn needed a deeper
understanding of file system concepts, of which translators to pick,
and of how to combine them into a graph.
* It was up to the admin to synchronize volume spec changes across the
servers. Failure to do so could result in unpredictable behavior or even
data corruption.
* Online volume management - the ability to add and remove nodes, turn
features on and off, and configure volume options, all without
restarting or remounting the file system.
* We simply could not test all possible combinations of translators.
If we have not tested a particular combination, we simply should not
allow such a volume specification.
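A minimal sketch of the last point, assuming a whitelist of tested
translator stacks. The whitelist contents, the xlN naming scheme and the
function name are hypothetical; a real volume spec also needs more
options per translator than shown here.

```python
# Hypothetical whitelist of translator stacks that have been tested together.
# The type names follow GlusterFS naming, but the list itself is an
# assumption made for illustration.
TESTED_STACKS = {
    ("storage/posix", "features/locks", "protocol/server"),
    ("storage/posix", "features/locks", "performance/io-threads",
     "protocol/server"),
}


def render_volfile(stack, export_dir):
    """Render a simplified volume spec for a tested translator stack.

    Raises ValueError if the stack is not in TESTED_STACKS.
    """
    stack = tuple(stack)
    if stack not in TESTED_STACKS:
        raise ValueError(
            "untested translator combination: " + " -> ".join(stack)
        )

    lines = []
    previous = None
    for index, xlator_type in enumerate(stack):
        name = "xl%d" % index  # hypothetical naming scheme
        lines.append("volume %s" % name)
        lines.append("  type %s" % xlator_type)
        if previous is None:
            # Bottom of the graph: the storage translator needs a directory.
            lines.append("  option directory %s" % export_dir)
        else:
            # Every other translator is stacked on the one below it.
            lines.append("  subvolumes %s" % previous)
        lines.append("end-volume")
        lines.append("")
        previous = name
    return "\n".join(lines)
```

Calling render_volfile(["storage/posix", "protocol/server"], "/data/export")
raises ValueError because that stack is not in the whitelist, which is
the behaviour the bullet argues for: refuse what has not been tested.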

glusterd + gluster cli covered all these and exposed only a few simple
commands. It was easy for newcomers. The limitations of glusterd are
largely due to an incomplete implementation, and it is high time we
fix it or rewrite it.

Jeff's point is: Distributed coordination is a hard problem and takes
time to mature. There are external mature free software projects. Why
should we re-invent the wheel?

We should keep glusterd, but fix the problem without adding
complexity. External projects like ZooKeeper should be our last
option, because managing a ZooKeeper installation will require more
specialized skills than glusterfs itself. I will be OK with it only if
we can completely hide the transition. Users should not be required to
install and manage external services. The same gluster commands should
do the magic.

My suggestions to fix glusterd:
(1) Transparently create an internal meta-volume for storing volume
spec files. Gluster already has the necessary building blocks, similar
to ZooKeeper. The core of distributed coordination is synchronous
replication, which we already have. Also, synchronizing volume spec
files to all nodes does not scale. A meta-volume solves this by not
spanning the entire set of nodes, yet it is still distributed and
replicated.

(2) We should evaluate zeromq for internal coordination.

(3) glusterd should manage glusterfsd as a pool of file system servers
serving many volumes. Currently we start one glusterfsd per volume per
server. If we create a large number of smaller volumes (cloud
deployments), we fall apart. It is possible to handcraft vol spec
files and achieve this behaviour, but glusterd limits it.
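A rough sketch of the difference in process count, assuming a simple
round-robin assignment of volumes to a fixed pool. The function names
and the round-robin policy are hypothetical, not how glusterd works
today.

```python
def processes_one_per_volume(num_volumes):
    """Current model as described above: one glusterfsd per volume
    on each server."""
    return num_volumes


def assign_volumes_to_pool(volumes, pool_size):
    """Spread volumes round-robin over a fixed pool of server processes.

    Returns a list of lists; entry i holds the volumes served by
    process i. The round-robin policy is an assumption for illustration.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    pool = [[] for _ in range(pool_size)]
    for index, volume in enumerate(volumes):
        pool[index % pool_size].append(volume)
    return pool


if __name__ == "__main__":
    volumes = ["vol%d" % i for i in range(1000)]
    pool = assign_volumes_to_pool(volumes, pool_size=4)
    print("one per volume:", processes_one_per_volume(len(volumes)))
    print("pooled:", len(pool), "processes")
```

With 1000 small volumes, the per-volume model needs 1000 server
processes on each server, while a pool of 4 serves 250 volumes per
process.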

(4) The auto-generated volume spec is too limiting. There are other
useful translators, but they are not included by default, not even in
disabled form. The volume spec file does not list all possible volume
options (commented out) with their default values and valid input
ranges. The CLI for volume options does not list them all. There is no
good documentation about these options. Either remove those unexposed
options entirely (better) or document them. Allow user-supplied
customization (templates) of the default auto-generated volume
specification (with a clear warning).
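A minimal sketch of what listing every option could look like, assuming
an option table with defaults and ranges. The option names, defaults
and ranges below are made up for illustration and are not real
GlusterFS options.

```python
# Hypothetical option table; names, defaults and ranges are assumptions.
OPTIONS = {
    "example.cache-size-mb": {"default": 32, "min": 4, "max": 1024},
    "example.io-threads": {"default": 16, "min": 1, "max": 64},
}


def validate_option(name, value):
    """Return value if name is a known option and value is in range."""
    if name not in OPTIONS:
        raise ValueError("unknown option: %s" % name)
    spec = OPTIONS[name]
    if not spec["min"] <= value <= spec["max"]:
        raise ValueError(
            "%s must be between %d and %d"
            % (name, spec["min"], spec["max"])
        )
    return value


def render_options(overrides=None):
    """List every option. Options the user did not set appear commented
    out, with their default value and allowed range."""
    overrides = overrides or {}
    for name, value in overrides.items():
        validate_option(name, value)
    lines = []
    for name in sorted(OPTIONS):
        spec = OPTIONS[name]
        if name in overrides:
            lines.append("  option %s %d" % (name, overrides[name]))
        else:
            lines.append(
                "  # option %s %d   (range %d-%d)"
                % (name, spec["default"], spec["min"], spec["max"])
            )
    return "\n".join(lines)
```

A user-supplied template would then be a set of overrides checked by
validate_option before being merged into the generated spec, so an
unknown or out-of-range option is rejected instead of silently
accepted.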

Let us get it right this time!


Imagination is more important than knowledge --Albert Einstein
