[Gluster-devel] Glusterfs Snapshot feature

Paul Cuzner pcuzner at redhat.com
Mon Dec 23 21:46:26 UTC 2013

Thanks Rajesh - happy to help with the docs/diagrams for the revisions to the wiki

Personally, I'm still dubious about the replica 3 support though. People I've dealt with already 'groan' about 2-way mirroring and the associated cost, so 3-way in the current architecture is going to be a pretty expensive exercise. In some regards, I would have thought that with bricks/nodes down the admins should be focused on restoring service to 100% - if a snapshot aborts due to a cluster 'health' issue, that just makes sense to me. I like simple :)



----- Original Message -----
> From: "Rajesh Joseph" <rjoseph at redhat.com>
> To: "Paul Cuzner" <pcuzner at redhat.com>
> Cc: "gluster-devel" <gluster-devel at nongnu.org>
> Sent: Monday, 23 December, 2013 4:48:24 PM
> Subject: Re: [Gluster-devel] Glusterfs Snapshot feature
> Thanks Paul for going through the Snapshot design wiki and providing your
> valuable inputs. The wiki needs a lot of updates. I will update it soon and
> circulate it here again.
> The CLI status and list commands will report the space consumed by each
> snapshot. The snapshot scheduler is planned for phase 2. I will update the
> document with details of phase I and phase II.
> As of now the auto-delete feature is based on a maximum allowed snapshot count.
> Admins can set a limit on the number of snapshots allowed per volume. More
> details will be provided in the wiki.
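The max-count policy described above could be sketched roughly as follows. This is a hypothetical illustration only - `take_snapshot` and the deque of names are invented for the example and are not part of glusterd:

```python
from collections import deque

def take_snapshot(snaps, name, max_count):
    """Create a snapshot, auto-deleting the oldest one(s) first when
    the per-volume limit has been reached. 'snaps' is kept in creation
    order, oldest at the left. Purely illustrative sketch."""
    deleted = []
    while len(snaps) >= max_count:
        deleted.append(snaps.popleft())  # oldest snap is dropped first
    snaps.append(name)
    return deleted

snaps = deque(["snap-1", "snap-2", "snap-3"])
evicted = take_snapshot(snaps, "snap-4", max_count=3)
```

With a limit of 3, creating `snap-4` evicts `snap-1` and leaves the three newest snapshots in place.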
> We still have to iron out details of replica-3 and quorum support, but the
> intention of the feature is to allow admins to take snapshots even when
> some bricks are down, without compromising the integrity of the volume.
> When the snapshot count reaches the maximum limit, an auto-delete is
> triggered. Before deletion the snapshot is stopped, and then it is deleted.
> User-serviceable snapshots are planned for phase II, and yes, the .snap
> directory will be listed in order of snapshot time-stamp.
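The time-stamp ordering described here can be illustrated with a small sketch. The snapshot names and epoch times below are made up; the real .snap listing is produced server-side:

```python
# Hypothetical snapshots: (user-chosen name, creation time as epoch seconds).
snapshots = [
    ("pre-upgrade", 1387700000),
    ("aaa-manual",  1387790000),  # newest, despite sorting first by name
    ("daily-run",   1387745000),
]

# Listing .snap by creation time rather than by name keeps the most
# recent recovery point in a predictable place regardless of naming.
by_time = [name for name, ts in sorted(snapshots, key=lambda s: s[1])]

# What a naive alphabetical listing would give instead:
by_name = sorted(name for name, _ in snapshots)
```

Sorting by time puts `aaa-manual` last (it is the newest), whereas a name sort would put it first - exactly the confusion Paul raises below.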
> A snapshot can be restored only when the volume is stopped; it is a completely
> offline activity. More details will be provided in the wiki.
> Thanks & Regards,
> Rajesh
> ----- Original Message -----
> From: "Paul Cuzner" <pcuzner at redhat.com>
> To: "Rajesh Joseph" <rjoseph at redhat.com>
> Cc: "gluster-devel" <gluster-devel at nongnu.org>
> Sent: Monday, December 23, 2013 3:50:45 AM
> Subject: Re: [Gluster-devel] Glusterfs Snapshot feature
> Hi Rajesh,
> I've just read through the snapshot spec on the wiki page in the forge, and
> have been looking at it through a storage admin's eyes.
> There are a couple of items that are perhaps already addressed but not listed
> on the wiki; just in case, here are my thoughts.
> The CLI definition doesn't define a snap usage command - i.e. a way for the
> admin to understand which snap is consuming the most space. With snaps come
> misuse and capacity issues, so our implementation should provide the admin
> with the information to make the right 'call'.
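A snap usage view of the kind being asked for boils down to a per-snapshot space report. A toy sketch, with invented snapshot names and numbers (this is not a real gluster command or data source):

```python
# Invented per-snapshot usage figures (bytes of delta blocks); a real
# "snap usage" command would report something like this per volume.
usage = {
    "snap-2013-12-20": 12 * 2**30,  # 12 GiB
    "snap-2013-12-21": 48 * 2**30,  # 48 GiB - the space hog
    "snap-2013-12-22":  5 * 2**30,  #  5 GiB
}

# The admin's 'call': which snap frees the most space if deleted?
biggest = max(usage, key=usage.get)
```

Surfacing exactly this - usage per snap, sorted - is what lets the admin make an informed delete decision rather than guessing.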
> I think I've mentioned this before, but how are snaps orchestrated across the
> cluster? I see a CLI to start one - but what's needed is a cron-like ability
> associated per volume. I think in a previous post I called this a "snap
> schedule" command.
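A cron-like per-volume schedule could look something like this sketch. The schedule table, volume names, and `is_due` helper are purely illustrative - no such command or structure exists in the spec yet:

```python
from datetime import datetime

# Illustrative per-volume schedule table; the proposed "snap schedule"
# command and its syntax are not defined anywhere yet.
schedules = {
    "vol_home": {"minute": 0, "hour": 2},      # daily at 02:00
    "vol_data": {"minute": 30, "hour": None},  # hourly, at :30
}

def is_due(sched, now):
    """True when a cron-like (minute, hour) spec matches 'now';
    None behaves like '*' in a crontab field."""
    return ((sched["minute"] is None or sched["minute"] == now.minute) and
            (sched["hour"] is None or sched["hour"] == now.hour))

now = datetime(2013, 12, 23, 2, 0)
due = [vol for vol, s in schedules.items() if is_due(s, now)]
```

At 02:00, only `vol_home` is due; `vol_data` would fire at half past each hour.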
> A common thing I've seen on other platforms is unexpected space usage, due to
> changes in data access patterns generating more deltas - I've seen this a
> lot in virtual environments, when service packs/maintenance get rolled out
> for example. In these situations, capacity can soon run out, so an
> auto-delete feature that drops snaps to ensure the real volume stays online
> would seem like a sensible approach.
> The comments around quorum and replica 3 create an exception to a basic rule
> - fail a snap request if the cluster is not in a healthy state. I would
> argue against making exceptions, and for keeping things simple - if a
> node/brick is unavailable, or there is a cluster reconfig in progress, raise
> an alert and fail the snap request. I think this is a more straightforward
> approach for admins to get their heads around than thinking about specific
> volume types and potential cleanup activity.
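The 'simple' rule argued for here - abort the snap unless every brick is up, with no per-volume-type exceptions - is trivial to express. A sketch, where `can_snapshot` and the brick map are made up for illustration and are not a gluster API:

```python
def can_snapshot(bricks):
    """The simple gate: raise (i.e. fail the snap request and alert)
    unless every brick in the volume is up. 'bricks' maps a brick
    path to its up/down state; an invented structure."""
    down = [b for b, up in bricks.items() if not up]
    if down:
        raise RuntimeError("snapshot aborted, bricks down: " + ", ".join(down))
    return True
```

The appeal is exactly its predictability: one rule for all volume types, and no post-snapshot cleanup to reason about when a brick was missing.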
> It's not clear from the wiki how snaps are deleted - for example, when
> snap-max-limit is reached, does creating a new snapshot automatically
> trigger the deletion of the oldest snap? If so, presumably the delete will
> only be actioned once the barrier is in place across all affected bricks.
> The snap create allows a user-defined name, which is a great idea.
> However, in what order would the snaps be resolved when the user opens the
> .snap directory? Will the snapshots' create times determine the order the
> user sees regardless of name, or is there a potential for a name for a more
> recent time to appear lower in the list, causing a recovery point to be
> missed by the end user?
> snap restore presumably requires the volume to be offline and the bricks
> unmounted? There's no detail in the scoping document about how the restore
> capability is intended to be implemented. Restore will be a drastic action,
> so understanding the implications for the various protocols - SMB, NFS,
> native, Swift and gfapi - would be key. Perhaps the simple answer is that
> the volume must be in a stopped state first?
> There is some mention of a phased implementation for snapshots - phase-1 is
> mentioned, for example, towards the end of the doc. Perhaps it would be
> beneficial to define the phases at the start of the article and list the
> features likely to be in each phase. This may help focus feedback
> specifically on the initial implementation, for example.
> Like I said - I'm looking at this through the eyes of an "old admin" ;)
> Cheers,
> Paul C
> ----- Original Message -----
> > From: "Rajesh Joseph" <rjoseph at redhat.com>
> > To: "gluster-devel" <gluster-devel at nongnu.org>
> > Sent: Friday, 6 December, 2013 1:05:29 AM
> > Subject: [Gluster-devel] Glusterfs Snapshot feature
> > 
> > Hi all,
> > 
> > We are implementing snapshot support for glusterfs volumes in release-3.6.
> > The design document can be found at
> > https://forge.gluster.org/snapshot/pages/Home.
> > The document needs some updates and I will be doing that in the coming weeks.
> > 
> > All the work done so far can be seen at review.gluster.org. The project
> > name for the snapshot
> > work is "glusterfs-snapshot":
> > http://review.gluster.org/#/q/status:open+project:glusterfs-snapshot,n,z
> > 
> > All suggestions and comments are welcome.
> > 
> > Thanks & Regards,
> > Rajesh
> > 
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at nongnu.org
> > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > 
