[Gluster-devel] Plans for Gluster 3.8

Csaba Henk chenk at redhat.com
Wed Jul 22 10:31:06 UTC 2015


Hi All,

let me discuss what Manila needs from GlusterFS for Liberty.

(Please roll down to Summary if you are not interested
in the context.)

Let's recap from where we left off with the GlusterFS feature
discussion: "The Manila RFEs and why so",
http://www.gluster.org/pipermail/gluster-devel/2015-June/045483.html

There we outlined three groups of features:

- Directory level operations
- Smart volume management
- Query features

We made the declaration there that the first two are alternatives
to each other and that addressing one of them is absolutely necessary.
The Gluster community took the challenge and settled on
"Smart volume management" as the favoured feature set, addressing
it with the emergence of Heketi (the integration of which continues,
as Vijay also points out). In that way the first two groups are
properly settled: the first one abandoned and the second
one completed.

That leaves us with the third group as a residue:

- Query features:
    
    Bug 1226225 – [RFE] volume size query support
       https://bugzilla.redhat.com/show_bug.cgi?id=1226225
    
    Bug 1226776 – [RFE] volume capability query
       https://bugzilla.redhat.com/show_bug.cgi?id=1226776

Of these, we came to realize that "size query" would not
be used with Heketi, so we can retire that too. However,
"volume capability query" is still sought after, even if
not a top priority. So that's the one we carry over to
3.8 planning.

But let's then see the hot cases:

-   Bug 1245380 - [RFE] Render all mounts of a volume defunct upon access revocation
        https://bugzilla.redhat.com/show_bug.cgi?id=1245380
 
    Context:
 
    In http://thread.gmane.org/gmane.comp.cloud.openstack.devel/58419/focus=58647
    Ben Swartzlander, the Manila project lead, declares:
    
    "Access deny should always result in immediate loss of access to the 
    share. It's not okay for a client to continue reading/writing data to a 
    share after access has been denied."
    
    As of now, we do comply with this, given we restart the volume upon
    making any change to auth.ssl-allow (and the restart will render the mounts
    defunct). However, one of the important planned scalability improvements is
    that we drop the restart step, which was expected to be made possible by
    getting a fix for
    
        Bug 1228127 - Volume needs restart after editing auth.ssl-allow list
                      for volume options which otherwise has to be automatic
            https://bugzilla.redhat.com/show_bug.cgi?id=1228127
    
    But we still can't do it, because then existing mounts would stay
    functional after access has been revoked. Implementing the RFE in
    question would remove this obstacle.
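For concreteness, the restart-based flow we comply with today can be
sketched as below. This is an illustrative sketch, not Manila's actual
driver code; the volume name, CN value and the injectable `run` callable
are assumptions made for the example:

```python
import subprocess

def revoke_access(volume, allowed_cns, run=subprocess.check_call):
    """Sketch of today's compliant-but-heavyweight revocation flow:
    shrink auth.ssl-allow, then restart the volume so that already
    established mounts become defunct.  Dropping the stop/start pair
    (the planned scalability win) is only safe once bug 1245380 is
    implemented, as otherwise existing mounts keep working."""
    run(["gluster", "volume", "set", volume,
         "auth.ssl-allow", ",".join(allowed_cns)])
    run(["gluster", "volume", "stop", volume])
    run(["gluster", "volume", "start", volume])
```

In a dry run the `run` callable can be swapped for a recorder instead of
actually shelling out to the gluster CLI.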

We also have to reconcile with an old friend:

-    Bug 829042 - [FEAT] selective read-only mode
         https://bugzilla.redhat.com/show_bug.cgi?id=829042

     This became an _absolute must have_ as read-only support
     is one of the Manila project's requirements for drivers in Liberty
     (cf. https://etherpad.openstack.org/p/manila-minimum-driver-requirements)
     and in order to work with a read-only volume we need a special way
     to access it in full (read-write) mode as well (for purposes of
     management).

     There is prior art in this regard by your humble presenter:

         http://review.gluster.org/#/q/topic:bug-829042

     The plan there was to grant r/w access to the volume for certain
     client pids. To that end I came up with

     -  a special syntax for specifying ranges of integers (planned 
        to be used to specify the r/w granted client pid set),
        cf. http://review.gluster.org/3525

     -  the patch for read-only xlator that allows r/w for the
        distinguished client pids (specified via the above syntax),
        cf. http://review.gluster.org/3526

     However, my syntax became controversial and was considered
     overkill, at which point the whole effort was abandoned.
     Yet there is no argument against a cut-back version
     of the second change, whereby the option to grant r/w access
     would take just a single integer value (or maybe an itemized,
     finite list of integers -- but not an infinite set of integers,
     like "all the negatives"). Thus the second change can be amended
     and resubmitted without relying on the concept of the first patch.
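     A cut-back option of that shape is easy to pin down. The sketch
     below (plain Python rather than xlator C; option value syntax and
     helper names are mine, for illustration only) shows the intended
     semantics: the option value is a finite, comma-separated list of
     client pids, and only those exact pids get r/w on an otherwise
     read-only volume:

```python
def parse_rw_pids(option_value):
    """Parse the proposed cut-back option value: an itemized, finite,
    comma-separated list of integers.  Open-ended sets such as
    "all the negatives" are deliberately not expressible."""
    return {int(tok) for tok in option_value.split(",") if tok.strip()}

def allows_write(client_pid, option_value):
    """Would this client pid be granted r/w on a read-only volume?"""
    return client_pid in parse_rw_pids(option_value)
```

     A pid not listed -- even a negative one -- gets read-only access,
     which is exactly the restriction the infinite-set syntax made
     impossible to enforce.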

     There is still a concern though: this mechanism to provide
     rw/ro access is *advisory*. There is no way to prevent a tenant
     from using the particular negative client pid that gives her
     rw access. Thus we need an additional auth mechanism extension
     that makes it possible to limit the usage of negative client pids.
     OR we need another approach (different from mine, that is, in terms
     of client pids) to achieve selective read-only behavior.


* * *

Summary and announcement of results (order means priority):

 
1.    Bug 829042 - [FEAT] selective read-only mode
         https://bugzilla.redhat.com/show_bug.cgi?id=829042

      absolutely necessary for not getting tarred & feathered in Tokyo ;)
      either resurrect http://review.gluster.org/3526
      and _work out its integration with an auth mechanism for special
      mounts_, or come up with a completely different concept

2.    Bug 1245380 - [RFE] Render all mounts of a volume defunct upon access revocation
         https://bugzilla.redhat.com/show_bug.cgi?id=1245380

      necessary to let us enable a watershed scalability
      enhancement

3.    Bug 1226776 – [RFE] volume capability query
         https://bugzilla.redhat.com/show_bug.cgi?id=1226776

      eventually we'll be choking on spaghetti if we don't get
      this feature. The ugly version checks we need to do against
      GlusterFS, as in

      https://review.openstack.org/gitweb?p=openstack/manila.git;a=commitdiff;h=29456c#patch3

      will proliferate and eat the guts out of the code's living
      body if this is not addressed.
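      To make the spaghetti concrete: absent a capability query,
      feature presence has to be inferred from the server's version,
      roughly like the hypothetical gate below (the function name and
      version threshold are illustrative, not Manila's actual code):

```python
def supports_snapshot_clone(glusterfs_version):
    """Hypothetical version gate of the kind Manila has to carry today:
    whether a feature exists is guessed from the server's version tuple
    instead of being asked from the cluster.  Each new feature adds
    another such check; a capability query would replace them all."""
    return tuple(glusterfs_version) >= (3, 7)
```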


Csaba

----- Original Message -----
> From: "Vijay Bellur" <vbellur at redhat.com>
> To: "gluster-users Discussion List" <Gluster-users at gluster.org>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Thursday, July 9, 2015 8:57:35 AM
> Subject: [Gluster-devel] Plans for Gluster 3.8
> 
> Hey All,
> 
> Now that 3.7 is out, here are some thoughts on how we can shape up 3.8. I am
> thinking of releasing Gluster 3.8 towards the end of this year. Here is a
> tentative list of things that we are contemplating to do in 3.8:
> 
> 1. Improvements for "Storage as a Service"
> 
> "Storage as a Service" broadly refers to the model where storage can be
> provisioned or decommissioned on demand, storage caters to single or multi
> tenant workloads and completely automated provisioning of storage is
> possible. Storage as a Service is what public/private clouds use as a
> building block today. By selecting enhancements and improvements that fit
> into this paradigm, we can make Gluster easier to adopt in modern
> datacenters. Following are sample use cases/workloads that can benefit from
> gluster improvements:
> 
> - Manila: File Share as a service project in OpenStack
> - Shared Storage for Containers
> - Any deployment where shares are created as a service
> 
> 
> Enhancements that can be accomplished in this release include:
> 
> a. Intelligent Volume provisioning through Heketi [1]
> b. Kerberized support for GlusterFS protocol
> c. Better network management support [2]
> 
> 
> 2. Regression test & Quality improvements
> 
> We have zeroed in on distaf[3] as the framework of choice where we will be
> adding support for multi-node regression tests. This will augment the single
> node pre-commit regression tests that we already run today with Jenkins. I
> expect passing tests in distaf to be a gating factor for GA of all releases
> from 3.8. Here is what we would like to do in this release cycle:
> 
> a. all gluster components to have tests populated in distaf
> b. CI using Jenkins for running tests in distaf on nightlies/release
> candidates
> 
> 
> 3. Storage for Containers
> 
> There seems to be significant attention on storage for containers recently.
> We can cater to this interest by picking specific improvements for container
> storage like:
> 
> a. shared storage for applications in containers (already possible with nfs
> today). Explore how we can do this with native client etc.
> b. shared storage for docker/container repositories
> c. hyperconvergence of containers & storage
> 
> 4. Hyperconvergence with oVirt
> 
> There is an ongoing effort to have hyperconvergence of gluster with oVirt for
> storing virtual machine images in a single cluster [4]. Improvements like
> the following can help in making Gluster a better fit for hyperconvergence:
> 
> a. Throttling for maintenance operations in gluster (self-healing/rebalance
> etc.)
> b. Ensuring data locality for virtual machine images
> c. Integration of sharding for hyperconvergence (expect to reach here sooner
> than 3.8)
> 
> 5. Performance improvements
> 
> a. Continue the ongoing small file performance improvements [5]
> b. multi-threaded self-heal daemon for improving performance of self-healing
> 
> 6. Other improvements like full fledged IPv6 support, delegations/lease-lock
> improvements, more policies for tiering, support for systematic erasure
> codes, support for native object service etc. are also planned.
> 
> There are other improvements which are being planned and have not found a
> mention here. If you are aware of such improvements, please reply to this
> thread. I will be collating this information and publishing a release
> planning page for 3.8 on gluster.org.
> 
> If you have come all the way here, we would be interested in knowing the
> following:
> 
> (i) What are your thoughts on the plan?
> (ii) What other improvements would you be interested in seeing?
> 
> Thoughts and feedback would be very welcome!
> 
> Thanks,
> Vijay
> 
> 
> [1] https://github.com/heketi/heketi/
> 
> [2]
> http://www.gluster.org/community/documentation/index.php/Features/SplitNetwork
> 
> [3] https://github.com/gluster/distaf
> 
> [4] http://www.ovirt.org/Features/GlusterFS-Hyperconvergence
> 
> [5]
> http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 


