[Gluster-devel] GlusterD 2.0 status updates

Richard Wareing rwareing at fb.com
Tue Sep 8 20:43:22 UTC 2015


WRT Thrift: SSL and non-blocking I/O are supported. I'll find out where the C implementation stands; it's possible patches haven't made it upstream for some reason.  It's also good to know there is no plan to move away from SUN-RPC for brick<->client comms.

On concurrency & Python, I think this is where I'd make the argument that if the multi-threaded capabilities of the high-level language are a problem, or even a concern, those features should be done in C, as they are clearly features where performance is critical and as such don't meet the bar for using the high-level language, which is meant for cases where performance isn't an issue.  We should also consider that features or code-paths which are slow (but deemed acceptably so) due to the higher-order language choice at 20-50 bricks may not be acceptable at 1,000 or 10,000 bricks; so these things should be carefully considered (which so far they seem to be, which is good).

To be clear, I think Go is a great choice based on the reasons you've cited, so long as there's a plan in place to re-write the Python components and we're comfortable with the limited pool of developers who are likely to contribute (which is admittedly already a problem given GlusterFS core is in C; though I suspect there are way more C coders out there than Go coders).  If, however, this isn't realistically possible then I think it needs another look IMHO.

@Atin, on being language agnostic/pluggable, I think it makes sense from an _external_ (feature/plugin) POV, but within the core code-base things should be cohesive and done in a consistent manner (from language selection, RPC frameworks, testing, doc style etc).  Consider the nightmare of building an automated build/test infra for a project which uses 3 or 4 languages.  Library management?  Tracking down/fixing bugs in the core lang libs?

Also, was there any reason why C++ (11/14) wasn't considered as well?  It's kind of a nice middle ground between Python/Go and C, as it has lots of the higher-level features, rich libraries, mature tool chains, a massive developer pool, and support for both Thrift (_very_ mature) and Protocol Buffers.


From: Kaushal M [kshlmster at gmail.com]
Sent: Monday, September 07, 2015 5:20 AM
To: Richard Wareing
Cc: Atin Mukherjee; Gluster Devel
Subject: Re: [Gluster-devel] GlusterD 2.0 status updates

Hi Richard,
Thanks a lot for your feedback. I've replied inline.

On Sat, Sep 5, 2015 at 5:46 AM, Richard Wareing <rwareing at fb.com> wrote:
> Hey Atin (and the wider community),
> This looks interesting, though I have a couple questions:
> 1. Language choice - Why the divergence from Python (which I'm no fan of) which is already heavily used in GlusterFS?  It seems a bit strange to me to introduce yet another language into the GlusterFS code base.  Doing this will make things less cohesive, harder to test, make it more difficult for contributors to understand the code base and improve their coding skills to be effective contributors.  I'm a bit concerned we are setting a precedent that development will switch to the new flavor of the day.  If a decision has been made to shift away from Python for the portions of GlusterFS where performance isn't a concern, will the portions currently written in Python be re-written as well?  I also question the wisdom of a language with a shallow developer pool, and a less open development process (somewhat of an ironic choice IMHO).

One of our aims for GlusterD-2.0 was to switch to a higher level
language. While C is good for solving lower level performance critical
problems, it isn't best suited for the kind of management tasks we
want GlusterD to focus on. The choice of Go over Python as the higher
level language, was mainly driven by the following
- Go is easier to get the hang of for a developer coming from a C
background. IMO for a developer new to both Go and Python, it's easier
to start producing working code in Go. The structure and syntax of the
language and the related tools make it easier.
- Go has support built into the language (think goroutines, channels)
for easily implementing the concurrency patterns that we have in the
current GlusterD codebase. This makes it easier for us to think about
newer designs based on our understanding of the existing ones.
- We have concerns about the concurrency and threading capabilities of
Python. We have faced a lot of problems with doing concurrency and
threading in GlusterD (though this is mostly down to bad design).
Python has known issues with threading, which doesn't give us
confidence as Python novices.
- Go has a pretty good standard library (possibly the best standard
library), which provides us with almost everything required. This
reduces the number of dependencies that we pull in.

That's not to say Go doesn't have its drawbacks. The major drawback
that I see currently with Go is that it doesn't have a way to do
plugins (dynamically load and execute binaries). But there have been
recent developments in this area. Go 1.5 landed support for building
Go packages as dynamic libraries. There is a design ready to support
runtime loadability (plugins), which is targeted for inclusion
in Go 1.6. As Go follows a 6 month release cycle and 1.6 is scheduled
for a Feb 2016 release, we need not wait too long for this to land. In
the meantime, we plan to build up the rest of the infrastructure required.

> 2. RPC framework - What's the reasoning behind using Protocol Buffers vs Thrift?  I'm admittedly biased here (since it's heavily used here at FB), however Thrift supports far more languages, has a larger user-base, features better data structure support, has exceptions and has a more open development process (it's an Apache project).  It's mentioned folks are "uncomfortable" with GLib, exactly why?  Has anyone done any latency benchmarks on the serialization/de-serialization to ensure we don't shoot ourselves in the foot by moving away from XDR for brick<->client/gNFSd communication?  The low latency communication between bricks & clients is to me a _critical_ component to GlusterFS's success; adding weight to the protocol or (worse) making it easier to add weight to me is unwise.

I'd like to clarify that we are not moving away from sunrpc/xdr for
brick - client communication. This will remain as is. The new RPC
framework is only for communication with GlusterD. We started looking
for an alternative rpc mechanism because Go doesn't have an existing
sunrpc implementation. Go does have an XDR implementation, but there
isn't a stable code generator to generate Go code from .x files.

We are looking for certain features in the RPC framework:
- Language support for C and Go
- Code generation - we didn't want to write message data structures
for each language ourselves
- Support for SSL transports - though we wouldn't be using this
initially, we would require it later on.
- Support for non-blocking I/O - to better integrate with current
glusterfsd code, which mainly uses non-blocking I/O for networking
At a high level thrift ticks all the boxes. But we initially were
wary of using thrift because of its usage of Glib in the C
implementation. Glib brings in its own programming patterns and idioms
with which we were a little uncomfortable. But we did revisit Thrift
later (after my original mail), once we felt more comfortable. We
found that the C implementation of thrift is lacking in many features,
the major one among them is that it doesn't have SSL transports. Also,
the C implementation just has a simple blocking server. It doesn't
have either the multi-threaded server or the non-blocking server which
are present in the implementations of other languages.

As we couldn't find any suitable existing RPC frameworks, we went
forward with implementing a simple, easy to use framework ourselves.
We chose protobuf over thrift, as the protobuf-c implementation brings
in fewer dependencies and works just as well as thrift for
serialization. For the transport, we've tested using libevent for C,
which provides non-blocking I/O and support for encryption. We are in
the process of completing a usable library with this. We've also
started investigating if it's worth using nanomsg for our transports.
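
On the Go side, the status update quoted further down describes pbcodec
as providing rpc.ClientCodec and rpc.ServerCodec implementations for the
standard library's rpc package. The sketch below shows how a codec plugs
into net/rpc. Since pbcodec's constructors aren't shown in this thread,
it uses the standard library's JSON-RPC codec as a stand-in; the Volume
service and its message types are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"net/rpc/jsonrpc"
)

// VolumeInfoArgs and VolumeInfoReply are hypothetical message types.
type VolumeInfoArgs struct{ Name string }

type VolumeInfoReply struct {
	Name   string
	Bricks int
}

// Volume is a hypothetical service exposed over net/rpc.
type Volume struct{}

// Info fills in a canned reply; a real service would look the volume up.
func (v *Volume) Info(args *VolumeInfoArgs, reply *VolumeInfoReply) error {
	reply.Name = args.Name
	reply.Bricks = 4
	return nil
}

// callInfo wires a server and a client together over an in-memory pipe.
// Only the two codec constructors would change when swapping in another
// rpc.ServerCodec / rpc.ClientCodec implementation.
func callInfo(name string) (VolumeInfoReply, error) {
	srv := rpc.NewServer()
	if err := srv.Register(new(Volume)); err != nil {
		return VolumeInfoReply{}, err
	}

	clientConn, serverConn := net.Pipe()
	go srv.ServeCodec(jsonrpc.NewServerCodec(serverConn))

	client := rpc.NewClientWithCodec(jsonrpc.NewClientCodec(clientConn))
	defer client.Close()

	var reply VolumeInfoReply
	err := client.Call("Volume.Info", &VolumeInfoArgs{Name: name}, &reply)
	return reply, err
}

func main() {
	reply, err := callInfo("vol01")
	if err != nil {
		fmt.Println("call failed:", err)
		return
	}
	fmt.Printf("%s has %d bricks\n", reply.Name, reply.Bricks)
}
```

The service code is independent of the wire format, so the serialization
can be changed without touching the handlers.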

But, we are still ready to use thrift, if we can get help implementing
the missing features.

> So far things are moving towards 3-4 languages (Python, C, Go, sprinkle of BASH) and 2 RPC frameworks.  No language or RPC mechanism is perfect, but the proficiency of the coder at the keyboard is _far_ more important.  IMHO we should focus on 1 low level high-performance language (C) and 1 higher level language for other components where high performance isn't required (geo-rep, glusterd etc), as it will encourage higher proficiency in the chosen languages and less fractured knowledge amongst developers.

I agree with this. But I believe, with what I've said above, that we
have valid reasons to justify our decisions.

But, even then I'd like to apologize for not having had these
discussions and decisions made transparently. We did say that we'd be
doing GlusterD-2.0 in the open during the Gluster summit, but we
haven't been following our own words. This has been a mistake on our
part, and we will correct it.

> My 2 cents.
> Richard


> ________________________________________
> From: gluster-devel-bounces at gluster.org [gluster-devel-bounces at gluster.org] on behalf of Atin Mukherjee [amukherj at redhat.com]
> Sent: Monday, August 31, 2015 10:04 PM
> To: Gluster Devel
> Subject: [Gluster-devel] GlusterD 2.0 status updates
> Here is a quick summary of what we accomplished over the last month:
> 1. The skeleton of GlusterD 2.0 codebase is now available @ [1] and the
> same is integrated with gerrithub.
> 2. REST endpoints for basic commands like volume
> create/start/stop/delete/info/list have been implemented. They need a
> little more polishing to strictly follow the heketi APIs.
> 3. Team has worked on coming up with a cross language light weight RPC
> framework using pbrpc and the same can be found at [2]. The same also
> has pbcodec package which provides a protobuf based rpc.ClientCodec and
> rpc.ServerCodec that can be used with rpc package in Go's standard library
> 4. We also worked on the first cut of volfile generation and it's
> integrated in the repository.
> The plan for next month is as follows:
> 1. Focus on the documentation along with publishing the design document
> 2. Unit tests
> 3. Come up with the initial design & a basic prototype for transaction
> framework.
> [1] https://github.com/kshlm/glusterd2
> [2] https://github.com/kshlm/pbrpc
> Thanks,
> Atin
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
