[Gluster-users] Meta-discussion

Wed Jan 2 21:31:25 UTC 2013

On Wed, 2013-01-02 at 08:38 -0500, Whit Blauvelt wrote:
> There's a strong trend against documentation of software, and not just in
> open source. I'm old enough to remember when anything modestly complex came
> with hundreds of pages of manuals, often over several volumes.
I agree. But you also have to agree that there are "new" advancements too:
irc, mailing lists, etc... As one of my personal preferences, I
use/write puppet code. This is useful to me as "de-facto" documentation
on how to set something up. If there's ever software that I really don't
understand, but it has a puppet module (even if it's a poorly written
one) reading it can often give me clues as to how the underlying
software works.

After a hard time learning how gluster works, I made a puppet module [1]
for exactly this reason. It's definitely a more complicated module that
does more than some sysadmins want it to, but it *does* show how to get
a working gluster setup if you look through it or run it. In turn, I
hope that the real gluster experts out there check it out and add
optimisations to it. What better way to get users trying out gluster
than a turnkey deployment solution?

This wasn't meant as a plug, it is Free Software after all, but you're
free to use and share it as a means to help new users figure out gluster
in the face of missing docs. I think this is a good way to learn.

Hope this was a useful comment,
James
[1] https://github.com/purpleidea/puppet-gluster

>  Now, I can
> understand why commercial software with constrained GUIs wants to pretend
> that what's underneath is as simple as the GUI suggests, so as not to scare
> away customers. And I can understand why some open source projects might
> want to withhold knowledge to motivate consulting contracts, as cynical as
> that may be.
> 
> But something on the scale of Gluster should have someone hired full time to
> do nothing but continuously write and update documentation. If you need a
> business model for that, print the results in a set of thick books, and sell
> it for $250 or so. Print JIT so you can track point releases. What Brian
> asks for should be the core of it. Even when stuff breaks for people who
> have paid for their RedHat Solution Architect, it will give that architect a
> place to look up the fix quickly, rather than having to go bother the
> development team, who are more profitably deployed in development.
> 
> Best,
> Whit
> 
> 
> On Wed, Jan 02, 2013 at 01:19:17PM +0100, Fred van Zwieten wrote:
> > +1 for 2b.
> > 
> > I am in the planning stages for an RHS 2.0 deployment and I too have suggested
> > a "cookbook" style guide for step-by-step procedures to my RedHat Solution
> > Architect.
> > 
> > What can I do to have this upped in the prio-list?
> > 
> > Cheers,
> > Fred
> > 
> > 
> > On Wed, Jan 2, 2013 at 12:49 PM, Brian Candler <B.Candler at pobox.com> wrote:
> > 
> >     On Thu, Dec 27, 2012 at 06:53:46PM -0500, John Mark Walker wrote:
> >     > I invite all sorts of disagreeable comments, and I'm all for public
> >     > discussion of things - as can be seen in this list's archives.  But, for
> >     > better or worse, we've chosen the approach that we have.  Anyone who would
> >     > like to challenge that approach is welcome to take up that discussion with
> >     > our developers on gluster-devel.  This list is for those who need help
> >     > using glusterfs.
> >     >
> >     > I am sorry that you haven't been able to deploy glusterfs in production.
> >     > Discussing how and why glusterfs works - or doesn't work - for particular
> >     > use cases is welcome on this list.  Starting off a discussion about how
> >     > the entire approach is unworkable is kind of counter-productive and not
> >     > exactly helpful to those of us who just want to use the thing.
> > 
> >     For me, the biggest problems with glusterfs are not in its design, feature
> >     set or performance; they are around what happens when something goes wrong.
> >     As I perceive them, the issues are:
> > 
> >     1. An almost total lack of error reporting, beyond incomprehensible entries
> >     in log files on a completely different machine, made very difficult to find
> >     because they are mixed in with all sorts of other incomprehensible log
> >     entries.
> > 
> >     2. Incomplete documentation. This breaks down further as:
> > 
> >     2a. A total lack of architecture and implementation documentation - such as
> >     what the translators are and how they work internally, what a GFID is, what
> >     xattrs are stored where and what they mean, and all the on-disk states you
> >     can expect to see during replication and healing.  Without this level of
> >     documentation, it's impossible to interpret the log messages from (1) short
> >     of reverse-engineering the source code (which is also very minimalist when
> >     it comes to comments); and hence it's impossible to reason about what has
> >     happened when the system is misbehaving, and what would be the correct and
> >     safe intervention to make.
> > 
> >     glusterfs 2.x actually had fairly comprehensive internals documentation, but
> >     this has all been stripped out in 3.x to turn it into a "black box".
> >     Conversely, development on 3.x has diverged enough from 2.x to make the 2.x
> >     documentation unusable.
> > 
> >     2b. An almost total lack of procedural documentation, such as "to replace a
> >     failed server with another one, follow these steps" (which in that case
> >     involves manually copying peer UUIDs from one server to another), or "if
> >     volume rebalance gets stuck, do this".  When you come across any of these
> >     issues you end up asking the list, and to be fair the list generally
> >     responds promptly and helpfully - but that approach doesn't scale, and
> >     doesn't necessarily help if you have a storage problem at 3am.
> > 
> >     For these reasons, I am holding back from deploying any of the more
> >     interesting features of glusterfs, such as replicated volumes and
> >     distributed volumes which might grow and need rebalancing.  And without
> >     those, I may as well go back to standard NFS and rsync.
> > 
> >     And yes, I have raised a number of bug reports for specific issues, but
> >     reporting a bug whenever you come across a problem in testing or production
> >     is not the right answer.  It seems to me that all these edge and error cases
> >     and recovery procedures should already have been developed and tested *as a
> >     matter of course*, for a service as critical as storage.
> > 
> >     I'm not saying there is no error handling in glusterfs, because that's
> >     clearly not true.  What I'm saying is that any complex system is bound to
> >     have states where processes cannot proceed without external assistance, and
> >     these cases all need to be tested, and you need to have good error reporting
> >     and good documentation.
> > 
> >     I know I'm not the only person to have been affected, because there is a
> >     steady stream of people on this list who are asking for help with how to
> >     cope with replication and rebalancing failures.
> > 
> >     Please don't consider the above as non-constructive. I count myself amongst
> >     "those of us who just want to use the thing".  But right now, I cannot
> >     wholeheartedly recommend it to my colleagues, because I cannot confidently
> >     say that I or they would be able to handle the failure scenarios I have
> >     already experienced, or other ones which may occur in the future.
> > 
> >     Regards,
> > 
> >     Brian.
> >     _______________________________________________
> >     Gluster-users mailing list
> >     Gluster-users at gluster.org
> >     http://supercolony.gluster.org/mailman/listinfo/gluster-users
