[Gluster-devel] Snapshot design for glusterfs volumes
Brian Foster
bfoster at redhat.com
Tue Aug 6 21:41:30 UTC 2013
On 08/06/2013 12:16 AM, Shishir Gowda wrote:
> Hi Brian,
>
> - A barrier is similar to a throttling mechanism. All it does is queue up the callbacks at the server xlator.
> Once barrier'ing is done, it just starts unwinding, so that clients can now get the responses.
> The idea is that if an application does not get an acknowledgement back for its fops, it will block for some time,
> hence effectively throttling itself.
>
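For concreteness, here is a minimal, self-contained C sketch of the queue-and-release
behaviour described above. The names (pending_cbk, barrier_cbk, barrier_release) are
invented for illustration; this is not the actual server xlator code, just the pattern
of parking completions while the barrier is up and replaying them in order once it is
released:

/* Illustrative sketch only -- not GlusterFS code.
 * Completions (unwinds) are queued while the barrier is enabled and
 * replayed in arrival order once it is released. */
#include <stdio.h>
#include <stdlib.h>

/* A queued completion: roughly, a call whose unwind has been deferred. */
struct pending_cbk {
        int                 op_ret;          /* result to hand back      */
        void              (*unwind)(int);    /* deferred completion      */
        struct pending_cbk *next;
};

struct barrier {
        int                 enabled;         /* barrier on/off           */
        struct pending_cbk *head, *tail;     /* FIFO of held unwinds     */
};

/* Called where the server would normally unwind to the client: if the
 * barrier is enabled, park the completion; otherwise deliver it now. */
static void barrier_cbk(struct barrier *b, int op_ret, void (*unwind)(int))
{
        if (!b->enabled) {
                unwind(op_ret);
                return;
        }
        struct pending_cbk *p = malloc(sizeof(*p));
        p->op_ret = op_ret;
        p->unwind = unwind;
        p->next   = NULL;
        if (b->tail)
                b->tail->next = p;
        else
                b->head = p;
        b->tail = p;
}

/* Releasing the barrier drains the queue: clients that were blocked
 * waiting for acknowledgements start making progress again. */
static void barrier_release(struct barrier *b)
{
        b->enabled = 0;
        while (b->head) {
                struct pending_cbk *p = b->head;
                b->head = p->next;
                p->unwind(p->op_ret);
                free(p);
        }
        b->tail = NULL;
}

static void client_ack(int op_ret)
{
        printf("client sees op_ret=%d\n", op_ret);
}

int main(void)
{
        struct barrier b = { .enabled = 1 };

        barrier_cbk(&b, 0, client_ack);   /* held: snapshot in progress */
        barrier_cbk(&b, 0, client_ack);   /* held                       */
        barrier_release(&b);              /* snap done: unwinds flow    */
        return 0;
}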
Ok, but why the need to stop unwinds? Isn't it just as effective to pend
the next wind from said process?
Maybe I missed this in the doc, but is this actually a mechanism _in_ the
server translator, or an independent translator? I ask because it sounds
like the latter could potentially provide a general throttle mechanism
that could be more broadly useful (over time, of course ;) ).
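To make the wind/unwind alternative asked about above concrete, here is the counterpart
sketch: instead of parking completed operations' callbacks, the barrier parks the next
incoming request (the wind) until it is released. Again, the names are invented for
illustration and this is not GlusterFS code; it is just the request-side variant of the
same FIFO pattern:

/* Illustrative sketch only -- not GlusterFS code. */
#include <stdio.h>
#include <stdlib.h>

struct pending_op {
        void              (*dispatch)(void); /* deferred request        */
        struct pending_op  *next;
};

struct wind_barrier {
        int                enabled;
        struct pending_op *head, *tail;      /* FIFO of held requests   */
};

/* Called where a request would normally be passed down: park it if the
 * barrier is up, dispatch it otherwise. */
static void barrier_wind(struct wind_barrier *b, void (*dispatch)(void))
{
        if (!b->enabled) {
                dispatch();
                return;
        }
        struct pending_op *p = malloc(sizeof(*p));
        p->dispatch = dispatch;
        p->next = NULL;
        if (b->tail)
                b->tail->next = p;
        else
                b->head = p;
        b->tail = p;
}

/* Lifting the barrier dispatches the held requests in arrival order. */
static void wind_barrier_release(struct wind_barrier *b)
{
        b->enabled = 0;
        while (b->head) {
                struct pending_op *p = b->head;
                b->head = p->next;
                p->dispatch();
                free(p);
        }
        b->tail = NULL;
}

static void do_write(void) { printf("write hits the brick\n"); }

int main(void)
{
        struct wind_barrier b = { .enabled = 1 };

        barrier_wind(&b, do_write);       /* held until the snap is taken */
        wind_barrier_release(&b);
        return 0;
}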
> - Snapshot here guarantees a snap of whatever has been committed onto the disk.
> So, in effect, every internal operation (afr/dht...) should/will have to be able to heal itself once
> the volume restore takes place.
>
Given the explanation above that the barrier translator is a
"throttle," I suspect its primary purpose is performance as opposed
to purely functional correctness (i.e., making sure the snap
operation occurs in a timely fashion)? My inclination when reading
the document was to consider the barrier mechanism as effectively the
quiesce portion of the typical snapshot process.
From a black box perspective, it seems a little strange to me that a
built-in snapshot mechanism wouldn't be coherent with internal
operations (though from a complexity standpoint I can understand why
that's put off). Has there been any consideration of how to solve that
problem?
That aside, and assuming the current model: 1) is there any assessment
of how likely that kind of situation is when a user follows the
expected process? and 2) has its effect on the snapshot mechanism
been measured?
It's been a while since I've played with LVM snapshots, and I suppose the
latest technology does away with the old per-snap exception store. Even
so, it seems like self-heals running across sets of large,
inconsistent VM image files (and potentially copying them from one side
to the other) could eat a ton of resources (space and/or CPU), no?
Brian
> With regards,
> Shishir
>
> ----- Original Message -----
> From: "Brian Foster" <bfoster at redhat.com>
> To: "Shishir Gowda" <sgowda at redhat.com>
> Cc: gluster-devel at nongnu.org
> Sent: Monday, August 5, 2013 6:11:47 PM
> Subject: Re: [Gluster-devel] Snapshot design for glusterfs volumes
>
> On 08/02/2013 02:26 AM, Shishir Gowda wrote:
>> Hi All,
>>
>> We propose to implement snapshot support for glusterfs volumes in release-3.6.
>>
>> Attaching the design document in the mail thread.
>>
>> Please feel free to comment/critique.
>>
>
> Hi Shishir,
>
> Thanks for posting this. A couple questions:
>
> - The stage-1 prepare section suggests that operations are blocked
> (barrier) in the callback, but later on in the doc it indicates incoming
> operations would be held up. Does barrier block winds and unwinds, or
> just winds? Could you elaborate on the logic there?
>
> - This is kind of called out in the open issues section with regard to
> write-behind, but don't we require some kind of operational coherency
> for cluster translator operations as well? Is it expected that a
> snapshot across a cluster of bricks might not be coherent with respect
> to active afr transactions (and thus potentially require a heal in the
> snap), for example?
>
> Brian
>
>> With regards,
>> Shishir
>>
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at nongnu.org
>> https://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>
>