[Gluster-devel] snapshot restore and USS
Raghavendra Bhat
rabhat at redhat.com
Thu Nov 27 09:29:43 UTC 2014
Hi,
With USS (User Serviceable Snapshots) used to access snapshots, we depend
on the last (i.e. the latest) snapshot of the volume to resolve gfids that
the snap daemon has not seen before.
Ex:
Say there is a directory called "dir" within the root of the volume and
USS is enabled. When .snaps is accessed from "dir" (i.e. /dir/.snaps),
a lookup is first sent on /dir, which the snapview-client xlator passes
down the normal graph to the posix xlator of the brick. Next, the lookup
comes on /dir/.snaps. The snapview-client xlator redirects this call to
the snap daemon (since .snaps is a virtual directory used to access the
snapshots). The lookup reaches the snap daemon with the parent gfid set
to the gfid of "/dir" and the basename set to ".snaps". The snap daemon
first tries to resolve the parent gfid by finding the inode for that
gfid. But since that gfid has not been looked up before in the snap
daemon, it cannot find the inode. To resolve it, the snap daemon depends
on the latest snapshot, i.e. it tries to look up the gfid of /dir in the
latest snapshot, and if that lookup succeeds, the lookup on /dir/.snaps
also succeeds.
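The resolution steps above can be sketched as a small Python model. This is only an illustration under stated assumptions: the function names, the dict-based inode table, and the (name, {gfid: path}) snapshot layout are all hypothetical and are not the snapview-server implementation.

```python
# Toy model of how snapd might resolve a lookup on <dir>/.snaps.
# Everything here (names, data layout) is hypothetical, for illustration only.

def resolve_parent(inode_table, snapshots, parent_gfid):
    """Return the path for parent_gfid, or None if it cannot be resolved.

    inode_table: dict gfid -> path, gfids snapd has already looked up.
    snapshots:   list of (name, {gfid: path}) in creation order.
    """
    if parent_gfid in inode_table:
        return inode_table[parent_gfid]
    if not snapshots:
        return None
    # The gfid was never looked up in snapd, so fall back to the latest snapshot.
    _latest_name, latest_contents = snapshots[-1]
    path = latest_contents.get(parent_gfid)
    if path is not None:
        inode_table[parent_gfid] = path  # remember it for later lookups
    return path


def lookup_snaps(inode_table, snapshots, parent_gfid, basename=".snaps"):
    """Lookup of <parent>/.snaps succeeds only if the parent gfid resolves."""
    parent = resolve_parent(inode_table, snapshots, parent_gfid)
    if parent is None:
        return None
    return parent.rstrip("/") + "/" + basename
```

With a single snapshot that contains /dir, the lookup succeeds; a gfid unknown to the latest snapshot makes the lookup on .snaps fail as well.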
But there can be some confusion in the case of snapshot restore. Say
there are 5 snapshots (snap1, snap2, snap3, snap4, snap5) for a volume
vol, and the volume is restored to snap3. If a directory called "/a"
existed when snap3 was taken and was later removed, then after the
snapshot restore, accessing .snaps from that directory (in fact, from any
directory that was present when snap3 was taken) might cause problems.
The original volume is now nothing but snap3, and when the snap daemon
gets the lookup on "/a/.snaps", it tries to find the gfid of "/a" in the
latest snapshot (which is snap5). If "/a" was removed after snap3 was
taken, the lookup of "/a" in snap5 fails, and thus the lookup of
"/a/.snaps" also fails.
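The failure can be reproduced in a few lines of Python. This is a sketch under assumptions: the gfid string and the helper name are made up, "/a" is assumed to have existed in snap1 through snap3, and snap3 is left out of the list on the assumption (suggested by the two-snapshot case discussed later in this mail) that the restored snapshot is no longer sent to snapd.

```python
# Hypothetical model: each snapshot is (name, set of gfids present in it).
# "/a" (gfid-a) is assumed to exist up to snap3 and to be removed afterwards.
snapshots_after_restore = [
    ("snap1", {"gfid-a"}),
    ("snap2", {"gfid-a"}),
    # snap3 was restored; assumed to be absent from the list sent to snapd
    ("snap4", set()),
    ("snap5", set()),
]


def resolve_in_latest(snapshots, gfid):
    """Return the name of the latest snapshot if it contains gfid, else None."""
    name, gfids = snapshots[-1]
    return name if gfid in gfids else None


# The restored volume contains /a again, but the latest snapshot (snap5)
# does not, so the parent gfid of /a/.snaps cannot be resolved.
result = resolve_in_latest(snapshots_after_restore, "gfid-a")
```

Here `result` is None, which corresponds to the failed lookup of "/a/.snaps" described above.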
Possible Solution:
One possible solution is this: whenever glusterd sends the list of
snapshots to the snap daemon after a snapshot restore, it sends the list
in such a way that the snapshot taken just before the restored snapshot
is treated as the latest snapshot (in the example above, since snap3 is
restored, glusterd should send snap2 as the latest snapshot to the snap
daemon).
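A minimal sketch of that selection rule, assuming glusterd knows the snapshots in creation order together with the name of the restored one. The function name and the list-of-names representation are hypothetical; this is not glusterd code.

```python
def snapshot_to_send_as_latest(ordered_names, restored):
    """Pick the snapshot taken just before the restored one.

    ordered_names: snapshot names in creation order, including `restored`.
    Returns None when the restored snapshot is the oldest one.
    """
    idx = ordered_names.index(restored)
    if idx == 0:
        return None  # nothing was taken before the restored snapshot
    return ordered_names[idx - 1]
```

For the five-snapshot example, `snapshot_to_send_as_latest(["snap1", "snap2", "snap3", "snap4", "snap5"], "snap3")` gives "snap2".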
But this solution also has a problem. If there are only 2 snapshots
(snap1, snap2) and the volume is restored to the first snapshot (snap1),
there is no previous snapshot to look at. glusterd will send only one
name in the list, snap2, and that snapshot was taken later than the
state the volume is now in.
A patch has been submitted for review to handle this
(http://review.gluster.org/#/c/9094/).
Because of the problems described above, in the patch snapd consults the
snapshots adjacent to the restored snapshot to resolve the gfids. In the
5-snapshot example, it looks at snap2 and snap4 (i.e. it looks into snap2
first, and if that fails it looks into snap4). If there is no previous
snapshot, it looks at the next snapshot (the 2-snapshot example). If
there is no next snapshot, it looks at the previous snapshot.
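The adjacent-snapshot order described above can be sketched as follows. This is an illustration only and not the code from the patch; the function names and the dict of gfid sets are hypothetical.

```python
def adjacent_candidates(ordered_names, restored):
    """Snapshots to consult, in order: the previous one first, then the next.

    ordered_names: snapshot names in creation order, including `restored`.
    """
    idx = ordered_names.index(restored)
    candidates = []
    if idx > 0:
        candidates.append(ordered_names[idx - 1])   # previous snapshot
    if idx + 1 < len(ordered_names):
        candidates.append(ordered_names[idx + 1])   # next snapshot
    return candidates


def resolve_gfid(contents, ordered_names, restored, gfid):
    """Return the first adjacent snapshot that knows gfid, else None.

    contents: dict snapshot name -> set of gfids present in that snapshot.
    """
    for name in adjacent_candidates(ordered_names, restored):
        if gfid in contents.get(name, set()):
            return name
    return None
```

In the 5-snapshot example the candidates for a restore to snap3 are snap2 and then snap4; with only snap1 and snap2 and a restore to snap1, the single candidate is snap2.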
Please provide feedback on how this issue can be handled.
Regards,
Raghavendra Bhat