[Gluster-devel] jdarcy status (October 2014)

Jeff Darcy jdarcy at redhat.com
Wed Oct 8 20:20:34 UTC 2014


In the last few days, I've run into a couple of misunderstandings about
the status of some projects I've worked on.  At first I set out to
correct those misunderstandings, but then I realized it wouldn't be a
bad idea to keep posting status to this list periodically.  Can't hurt,
anyway.  This is my first attempt.  I encourage other senior developers
to do likewise for their projects as well.  I know many of them already
do status reports for their downstream projects within Red Hat, and most
of that information not only can but *should* be shared upstream as
well.  It's just copy and paste.  Besides that, many people seem to have
side projects in addition to their official responsibilities, and it
would be good to hear about those.  So, without further ado...

= Contents =

 * NSR

 * SSL and multi-threading

 * GlusterFS 4.0

= NSR =

As those who watch the review queue know, I've been trying to revive the
NSR code from its dormant state and get it to work with the current
master branch.  This isn't just a git merge.  Given how much the code
has changed since anyone last worked on NSR, a lot of manual work has
been involved.  First, I split out some of the infrastructure that NSR
depends on, but which is not itself NSR-specific.

 * http://review.gluster.org/#/c/8769/   xdata support in GFAPI

 * http://review.gluster.org/#/c/8812/   GF_FOP_IPC

 * http://review.gluster.org/#/c/8887/   etcd support

Then there's NSR itself.

 * http://review.gluster.org/#/c/8913/   code (I/O path and glusterd)

 * http://review.gluster.org/#/c/8915/   design

Basically I have the I/O path building and running simple tests.  Ditto
for the glusterd parts (e.g. generating volfiles).  I want to get
reconciliation working next, which will require revamping (or replacing)
the NSR-specific version of changelog that we were using before.  I'm
also considering a pretty fundamental change from metadata logging to
full data logging, so that the actual writes can be done outside of the
I/O path at the cost of some overall write amplification.  I'd be very
interested in other people's thoughts on that.
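
To make the trade-off above concrete, here is a back-of-the-envelope
model in Python.  It is not NSR code.  The 64-byte log record overhead,
the function names, and the assumption that every payload is written to
its final location exactly once are all made up for illustration.

```python
# Rough model only: compares bytes written under metadata logging and
# full data logging.  All sizes and names here are assumptions.

def bytes_written(payload_sizes, log_overhead=64, full_data_logging=False):
    """Return (bytes written in the I/O path, bytes written later)."""
    in_path = 0
    deferred = 0
    for size in payload_sizes:
        if full_data_logging:
            # The log record carries the payload itself (a sequential
            # append); the write to the final location is deferred.
            in_path += log_overhead + size
            deferred += size
        else:
            # The log record only describes the write; the payload goes
            # to its final location before the request completes.
            in_path += log_overhead + size
    return in_path, deferred


def write_amplification(payload_sizes, **kwargs):
    """Total bytes written divided by bytes the client asked to write."""
    in_path, deferred = bytes_written(payload_sizes, **kwargs)
    return (in_path + deferred) / sum(payload_sizes)
```

For four 4096-byte writes this model gives an amplification of about
1.02 with metadata logging and about 2.02 with full data logging.  In
exchange, the in-place writes move out of the I/O path.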

= SSL and multi-threading =

SSL support has been part of the code for two years.  There are a few
users, but there could be many more.  Why aren't there?  Partly it's
because the documentation is not easy to find, and copious documentation
is necessary to do anything SSL-related.  The documentation does exist;
I've sent the information to various people and to this list before, but
it really should live in the wiki and/or the source tree.  The other
problem is that some people just don't seem to get why users might want
SSL.  Go figure.  Unfortunately, some of those people are the ones who
control whether resources are assigned to test it, and therefore whether
it gets mentioned as a feature, so it has effectively remained an
upstream-only feature all that time.  Maybe, now that OpenStack Manila
is using it for multi-tenancy, the higher profile will spark some change
in those attitudes.

Multi-threading is even more controversial.  It has also been in the
tree for two years (it was developed to address the problem of SSL code
slowing down our entire transport stack).  This feature, controlled by
the "own-thread" transport option, uses a thread per connection - not my
favorite concurrency model, but kind of necessary to deal with the
OpenSSL API.  More recently, a *completely separate* approach to
multi-threading - "multi-threaded epoll" - has been getting some
attention.  Here's what I see as the pros and cons of this new approach.

 * PRO: greater parallelism of requests on a single connection.  I think
   the actual performance benefits vs. own-thread are unproven and
   likely to be small, but they're real.

 * CON: with greater concurrency comes greater potential to uncover race
   conditions in other modules used to being single-threaded.  We've
   already seen this somewhat with own-thread, and we'd see it more with
   multi-epoll.

 * CON: multi-epoll does not work with SSL.  It *can't* work with
   OpenSSL at all, short of adopting a hybrid model where SSL
   connections use own-thread while others use multi-epoll, which is a
   bit of a testing nightmare.
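
For readers unfamiliar with the thread-per-connection model that
own-thread uses, here is a minimal sketch in Python (not GlusterFS's C,
and not its socket transport code).  The echo handler, the function
names, and the use of socketpair() in place of real network connections
are all assumptions for illustration.  Each connection gets a thread
that can block freely, but requests arriving on one connection are
handled one at a time by that thread, which is the limitation the PRO
point refers to.

```python
import socket
import threading


def serve_connection(conn):
    """Handle one connection for its whole lifetime on one thread."""
    with conn:
        while True:
            data = conn.recv(4096)  # blocks only this connection's thread
            if not data:            # peer closed the connection
                break
            conn.sendall(data.upper())


def start_own_thread(conn):
    """Dedicate a new thread to the given connection."""
    thread = threading.Thread(target=serve_connection, args=(conn,),
                              daemon=True)
    thread.start()
    return thread


def demo(n_connections=3):
    """Open several connections, send one request on each, collect replies."""
    clients, threads = [], []
    for _ in range(n_connections):
        server_end, client_end = socket.socketpair()
        threads.append(start_own_thread(server_end))
        clients.append(client_end)
    replies = []
    for i, client in enumerate(clients):
        client.sendall(f"ping {i}".encode())
        replies.append(client.recv(4096).decode())
        client.close()
    for thread in threads:
        thread.join(timeout=5)
    return replies
```

A multi-threaded epoll design would instead have a pool of threads
waiting on one shared set of connections, so that two requests from the
same connection could be picked up by different threads.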

Obviously I'm not a fan of multi-epoll.  The first point suggests little
or no benefit.  The second suggests greater risk.  The third is almost
fatal all by itself, and BTW it was known all along.  Don't we have
better things to do?

= GlusterFS 4.0 =

A while ago, several people asked me to spearhead a long term effort to
develop technologies for GlusterFS 4.0 - separate from shorter term
efforts in either 3.x or anything downstream.  I've tried to collect
related ideas into a coherent form here:

   http://goo.gl/QyjfxM

Some of the highlights:

 * glusterd scalability

 * DHT scalability

 * coherent caching

 * erasure coding

 * data classification and policy-driven storage allocation

 * multi-network support

 * NSR

 * management plugins

Some of these, such as erasure coding and at least some parts of data
classification, are likely to arrive sooner.  Most of them are being led by
other people; I'm just collecting the ideas, not originating them.  This
is still *very much* a work in progress.  If anybody has other ideas on
what would make 4.0 the Next Great Distributed File System, now's a good
time to jump in.

