[Gluster-users] User-serviceable snapshots design

Anand Subramanian ansubram at redhat.com
Thu May 8 14:22:37 UTC 2014


On 05/08/2014 05:18 PM, Jeff Darcy wrote:
>>> * Since a snap volume will refer to multiple bricks, we'll need
>>>     more brick daemons as well.  How are *those* managed?
>> This infrastructure is handled by the "core" snapshot
>> functionality/feature. When a snap is created, it is treated not only
>> as an lvm2 thin LV but as a glusterfs volume as well. The snap volume
>> is activated, mounted, and made available for regular use through the
>> native fuse-protocol client. Management of these volumes is not part
>> of the USS feature; it is handled as part of the core snapshot
>> implementation.
> If we're auto-starting snapshot volumes, are we auto-stopping them as
> well?  According to what policy?
These are not auto-stopped at all. A deactivate command has been
introduced as part of the core snapshot feature, which can be used to
deactivate such a snap volume. Refer to the snapshot feature design for
details.
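
For example, something along the lines of the following, using the core
snapshot CLI (exact syntax is described with the snapshot feature docs):

    # gluster snapshot deactivate <snapname>
    # gluster snapshot activate <snapname>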

>
>> USS (mainly the snapview-server xlator) talks to the snapshot volumes
>> (and hence the bricks) through a glfs_t *, passing a glfs_object
>> pointer.
> So snapview-server is using GFAPI from within a translator?  This caused
> a *lot* of problems in NSR reconciliation, especially because of how
> GFAPI constantly messes around with the "THIS" pointer.  Does the USS
> work include fixing these issues?
Well, not only that: here we have multiple gfapi call-graphs (one per
snap volume) all hanging from the same snapd/snapview-server xlator
address space. Ok, don't panic :)

We haven't hit any issues in our basic testing so far. Let us test some
more to see if we hit the problems you mention. If we hit them, we fix
them. Or something like that. ;)
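
To make the "multiple call-graphs in one process" point concrete, here is
a rough standalone sketch (not the actual snapview-server code; the snap
volume names and the management host below are made up) of what bringing
up one glfs_t instance per snap volume through gfapi looks like.
snapview-server additionally does its lookups through the handle-based
glfs_h_* calls, passing glfs_object pointers, which are omitted here:

/* sketch: one gfapi instance (call graph) per snapshot volume,
 * all living inside a single process, the way snapd hosts them */
#include <stdio.h>
#include <glusterfs/api/glfs.h>

static glfs_t *
init_snap_graph (const char *snapvol, const char *mgmt_host)
{
        glfs_t *fs = glfs_new (snapvol);
        if (!fs)
                return NULL;

        /* fetch the snap volume's volfile from glusterd and build the
         * client graph for it inside this process */
        if (glfs_set_volfile_server (fs, "tcp", mgmt_host, 24007) ||
            glfs_init (fs)) {
                glfs_fini (fs);
                return NULL;
        }
        return fs;
}

int
main (void)
{
        /* hypothetical snap volume names; in snapd these come from glusterd */
        const char *snaps[]   = { "snap1", "snap2" };
        glfs_t     *graphs[2] = { NULL, NULL };
        int         i;

        for (i = 0; i < 2; i++) {
                graphs[i] = init_snap_graph (snaps[i], "localhost");
                printf ("%s: %s\n", snaps[i],
                        graphs[i] ? "initialized" : "failed to initialize");
        }

        for (i = 0; i < 2; i++)
                if (graphs[i])
                        glfs_fini (graphs[i]);
        return 0;
}

(Build with something like: gcc -o snapgraphs snapgraphs.c -lgfapi)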

> If snapview-server runs on all servers, how does a particular client
> decide which one to use?  Do we need to do something to avoid hot spots?
The idea as of today is for the client to connect to the snapd running
on the host it already connects to, i.e. where the management glusterd
is running. We can think of other mechanisms, like distributing the
connections across different snapds, but that is not implemented for
the first drop. And it is a concern only if we hit a performance
bottleneck in the number of requests a given snapd has to handle.
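
Purely as an illustration of the "distribute the connections" idea (this
is hypothetical; nothing like it exists in the current code, and the host
names are made up), the client-side policy could be as simple as hashing
some client identity over the list of snapd hosts:

/* hypothetical sketch: spread clients across snapds instead of having
 * every client talk to the snapd on its mgmt/volfile host */
#include <stdio.h>

static const char *snapd_hosts[] = { "server1", "server2", "server3" };
#define NUM_SNAPDS (sizeof (snapd_hosts) / sizeof (snapd_hosts[0]))

static const char *
pick_snapd (const char *client_id)
{
        unsigned int hash = 0;
        const char  *p;

        for (p = client_id; *p; p++)
                hash = hash * 31 + (unsigned char)*p;

        return snapd_hosts[hash % NUM_SNAPDS];
}

int
main (void)
{
        printf ("client-A -> %s\n", pick_snapd ("client-A"));
        printf ("client-B -> %s\n", pick_snapd ("client-B"));
        return 0;
}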

> Overall, it seems like having clients connect *directly* to the snapshot
> volumes once they've been started might have avoided some complexity or
> problems.  Was this considered?
Can you explain this in more detail? Are you saying that the virtual
namespace overlay used by the current design can be reused, with extra
information returned to clients, or is this a new approach in which the
clients are made much more intelligent than they are today?

>
>>> * How does snapview-server manage user credentials for connecting
>>>     to snap bricks?  What if multiple users try to use the same
>>>     snapshot at the same time?  How does any of this interact with
>>>     on-wire or on-disk encryption?
>> No interaction with on-disk or on-wire encryption. Multiple users can
>> always access the same snapshot (volume) at the same time. Why do you
>> see any restrictions there?
> If we're using either on-disk or on-network encryption, client keys and
> certificates must remain on the clients.  They must not be on servers.
> If the volumes are being proxied through snapview-server, it needs
> those credentials, but letting it have them defeats both security
> mechanisms.
>
> Also, do we need to handle the case where the credentials have changed
> since the snapshot was taken?  This is probably a more general problem
> with snapshots themselves, but still needs to be considered.
Agreed. Very nice point you brought up. We will need to think a bit
more about this, Jeff.

Cheers,
Anand


