[Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

Sun Dec 30 15:13:52 UTC 2012

On 12/27/12 3:36 PM, Stephan von Krawczynski wrote:
> And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
> only if you accept:
> 
> 1) Throw away all non-linux code. Because this war is over since long.

Sorry, but we do have non-Linux users already and won't abandon them.  We
wouldn't save all that much time even if we did, so it just doesn't make sense.

> 2) Make a kernel based client/server implementation. Because it is the only
> way to acceptable performance.

That's an easy thing to state, but a bit harder to prove.  Even Ceph, which
makes a big deal of having a kernel client, has a user-space server.  HDFS is
way out in Java-land, as are many non-filesystem (e.g. object/NoSQL) data
stores, and people seem OK with that.  PLFS is even using FUSE, and the people
who run some of the biggest systems on the planet have reported significant
improvements over fully-in-kernel Lustre for demanding real-world workloads.

Thus, I don't think the case for putting things in the kernel is fully made.
We'd be giving up too much in terms of flexibility and development velocity, and
for what?  Why do you think Ceph is taking so long to mature?  Are you
volunteering to implement complex new features such as multi-tenancy or
deduplication in the kernel?  I'm not, and I've been a kernel developer for
over twenty years.  A single task-specific translator can often provide greater
gains than putting everything in the kernel, for far less effort.  It's hard
enough to get people to think that way when the code's out in user space (even
in Python); in the kernel it simply wouldn't happen.  That would put us in a
me-too race with all the other distributed filesystems, instead of using
modularity and open source to let people create the filesystems that they each
need.  That's our advantage, and we intend to keep it.

Would a full in-kernel implementation help with latency, for certain workloads
that aren't already using the qemu interface (which reduces it still further)?
Yes.  Would it help with bandwidth/scalability across many clients and
servers?  Not really.  Would it require extreme sacrifices in just about every
other area to address one need that's already well served elsewhere?
Absolutely.  It's fine that you want something else, but GlusterFS is not going
to be that.  Sorry.  If you want some help evaluating alternatives, e.g. with
tips for how to evaluate their performance or correctness, please let me know
(off list) and I'll do what I can.

> 3) Implement true undelete feature. Make delete a move to a deleted-files area.

Some people want that, some people do not.  Some are even precluded from using
it e.g. for compliance reasons.  It's hardly a must-have feature.  In any case,
it already exists - called "landfill" I believe, though I'm not sure of its
support status or configurability via the command line.  If it didn't exist, it
would still be easy to create - which wouldn't be the case at all if we
followed your advice to put this in the kernel.  If it's a priority for you and
existing facilities do not suffice, then I suggest adding a feature page on the
wiki and/or an enhancement-request bug report, so that we can incorporate that
feedback into our planning process.  Thank you for your help making GlusterFS
better.
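
To illustrate how small the core of "delete as a move" is when it lives in
user space, here is a minimal sketch in plain Python.  It is not the GlusterFS
translator API and not how any existing GlusterFS feature is implemented; the
function names (soft_delete, undelete) and the trash directory layout are
hypothetical, chosen only for this example.

```python
import os
import shutil
import time


def soft_delete(path, trash_dir):
    """Move `path` into `trash_dir` instead of unlinking it.

    Hypothetical helper for illustration only; returns the new location.
    """
    os.makedirs(trash_dir, exist_ok=True)
    # Prefix a nanosecond timestamp so repeated deletes of the same
    # file name are unlikely to collide inside the trash directory.
    name = "%d-%s" % (time.time_ns(), os.path.basename(path))
    dest = os.path.join(trash_dir, name)
    shutil.move(path, dest)
    return dest


def undelete(trashed_path, original_path):
    """Restore a previously soft-deleted file to `original_path`."""
    shutil.move(trashed_path, original_path)
    return original_path
```

A real implementation would also need a retention/purge policy and a way to
turn the behavior off entirely, e.g. for the compliance cases mentioned above.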