[Gluster-devel] Change in design for changelog crash consistency wrt snapshot.

Sun Aug 24 12:50:57 UTC 2014

[Adding the right alias for gluster-devel]

On 08/24/2014 05:57 PM, Ajeet Jha wrote:
> Hi all,
>
> The current design barriers rename and unlink (in the changelog xlator) and allows normal creates and data operations. Geo-replication further relies on xsync (i.e. the marker xlator's xtime updates) for syncing. This design took care of one of the most genuine use cases: not blocking normal archival copy techniques (i.e. "copy -a ..."). But in a snap volume the marker translator is unable to propagate xtime changes up to the root, which causes xsync to miss syncing a few files.
>
> The change in design plans to block/barrier all fops other than data operations (writev and truncate) and to store changelogs in the call path in O_SYNC mode, and in the call-back path as well. This call-path journalling is triggered by the barrier enable notification and later stopped by the barrier disable notification. Further, this call-path changelog gets replaced among the normal changelogs (recorded in the call-back path), and is later reused in the history API call by geo-replication.

A few questions:

- If changelog updates its journal in call path of fops, why is 
barriering of such fops needed?

- What kind of updates happen to the journal in the call and call-back 
paths?

Vijay

>
> After a brief talk with Venky on the design change, I assume there are a few implications.
> 1. Breakage of initial design use cases, like non-blocking archival copy during snapshot.
> 2. The call-path changelog is written with O_SYNC, which may cause a huge delay in fop completion.
> 3. Large memory usage during barrier of all fops.
> 
> I request suggestions on the change in design.
> 
>
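
For readers following the thread, the proposed behaviour can be sketched roughly as below. This is only one reading of the proposal, not the changelog xlator's code: the class name, method names, return strings and journal line format are all hypothetical, and it assumes that while the barrier is on, data fops (writev, truncate) proceed and are journalled in the call path with O_SYNC, while every other fop is held until barrier disable.

```python
import os
from collections import deque

# Data operations that the proposal does not barrier.
DATA_FOPS = {"writev", "truncate"}

# O_SYNC is not defined on every platform; fall back to 0 there (assumption).
O_SYNC = getattr(os, "O_SYNC", 0)


class BarrierChangelogSketch:
    """Hypothetical model of call-path journalling during a barrier."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.barrier_on = False
        self.queued = deque()   # non-data fops held while the barrier is on
        self.fd = None

    def barrier_enable(self):
        # Barrier enable starts call-path journalling in O_SYNC mode.
        self.fd = os.open(
            self.journal_path,
            os.O_WRONLY | os.O_CREAT | os.O_APPEND | O_SYNC,
            0o600,
        )
        self.barrier_on = True

    def barrier_disable(self):
        # Barrier disable stops call-path journalling and releases held fops.
        self.barrier_on = False
        os.close(self.fd)
        self.fd = None
        released = list(self.queued)
        self.queued.clear()
        return released

    def submit(self, fop, path):
        if not self.barrier_on:
            return "passed"      # normal call-back path recording applies
        if fop in DATA_FOPS:
            # Each write returns only after the record reaches stable
            # storage, which is where the fop latency concern comes from.
            os.write(self.fd, f"{fop} {path}\n".encode())
            return "journalled"
        self.queued.append((fop, path))  # held in memory until disable
        return "queued"
```

The sketch makes two of the listed implications visible: every journalled fop pays for a synchronous write, and every barriered fop sits in memory until the barrier is lifted.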