[Gluster-devel] GlusterFS 4.0 plan in conjunction with Architectural Guiding Principles discussion

Mark Hayford mhayford at redhat.com
Thu Dec 12 18:38:59 UTC 2013


Per our conf call yesterday, I would like to start a discussion on what the future architectural guiding principles of Gluster should be. The motivation for laying this foundation is to provide a framework that tactical, short-term features and functions can map to, and a vision for what we want Gluster to be when it grows up. A clearly articulated vision focuses design decisions and gives discussions a shared context. While researching such a framework, I found some earlier discussions on this topic. They are paraphrased as follows:

1. RHS BU 10-year objective =
Do to storage what RH did to operating systems with RHEL
2. RHEL recipe =
Disrupt and transform the market by driving volume economics through open source community innovation
3. RHS BU recipe =
Drive disruptive storage technologies. The software-defined data center (SDDC) drives the need for open software-defined storage (OSDS) that runs on converged commodity servers (storage, compute, network) with abstracted management outside the storage system (like OpenStack). Into specific storage market segments where the disruptive tech has competitive

Although this provides a framework for discussion, it does not provide a path or a set of principles for achieving the goals. The principles for discussion might include the following topics, in no particular order:

Architectural Guiding Principles 

1. Reliability - Storage must be perceived as reliable. A bit that is stored must be retrievable.
Sub topics might include: 
How should we gracefully recover from errors? 
How should we do system and cluster cleanup and maintenance? 
What functions should have preventive maintenance features? 
How do we provide reporting, monitoring and analysis? 

2. Availability - Storage, and ultimately workloads/applications, must be perceived as highly available.
Sub topics might include: 
What is our storage HA strategy within and across data centers?
What is our storage failover and failback strategy, and how does it interact with applications and virtualization?
What are our field-measured cluster uptime goals?
How do we provide and support application end to end availability? 
How do we provide reporting, monitoring and analysis? 

3. Deterministic - Storage should give individual application workloads response times that are as deterministic as possible. Users who access those workloads must have a positive experience.
Sub topics might include: 
How should we architect storage data structures within the cluster to provide self-tuned application specific response times? 
How should we architect cluster wide multi-tenant workloads to coexist while providing application specific response times? 
How should we provide application workloads automatic response expansion or contraction? 
How do we provide reporting, monitoring and analysis? 

4. Scale out/up - Storage systems must scale in I/O, MB/s and capacity as workload requirements expand or contract 
Sub topics might include: 
How do we insert new physical cluster resources into the cluster for consumption and provide for cluster/application awareness of the new resource? 
How do we seamlessly plan and provide for the self-tuning of capacity expansion by application or workload type? 
How do we seamlessly plan and provide for the self-tuning of cluster-wide applications with heuristic techniques?
How do we provide for application migration and cut over? 
How do we provide for cluster upgrades without application downtime or reduction of services? 
How do we provide reporting, monitoring and analysis? 

5. Simplicity - Storage should be simple to understand, set up, configure, operate, manage, consume and report upon. The structural storage architecture and cluster-wide interactions should be simple to articulate.
Sub topics might include: 
How do we document, keep current and provide training for Gluster users, developers and interested parties? 
How can we better communicate the interdependencies of code modules while providing for agile structural development environments?
How do we provide simple operational models at scale? 
How do we keep complexity minimized? 
How do we provide reporting, monitoring and analysis? 

6. Automation - Storage processes and procedures should be automated to reduce operational cost, overhead and errors, to scale up and out, and to provide workload and application fault tolerance, backups, etc.
Sub topics might include: 
What does automation at scale mean as PB+ scale is reached?
How do we provide reporting, monitoring, analysis and control? 

7. Security - Storage and specifically the data stored must be perceived as secure. 
Sub topics might include: 
What security standards should we adopt and adhere to? 
How should security containers and data structures operate? 
What RBAC structures do we need to develop? 

I believe we should formulate goals and responses to each of these "architectural guiding principles" and add more if needed. 

Mark Hayford 

----- Original Message -----

From: "Anand Avati" <avati at gluster.org> 
To: "Gluster Devel" <gluster-devel at nongnu.org> 
Sent: Wednesday, December 11, 2013 8:52:36 PM 
Subject: [Gluster-devel] GlusterFS 4.0 plan 

Hello all, 

Here is a working draft of the plan for 4.0. It has pretty significant changes from the current model. Sending it out for early review/feedback. Further revisions will follow over time. 



Gluster-devel mailing list 
Gluster-devel at nongnu.org 