[Gluster-devel] Fuse Subdirectory mounts, access-control

Wed Mar 9 13:19:23 UTC 2016

> When this command is executed, volfile is requested with volfile-id
> '/vol/subdir1'
> Glusterd on seeing this volfile-id will generate the client xlator with
> remote-subvolume appending '/subdir1'

I don't think GlusterD needs to be involved here at all.  See below.

> When graph initialization on fuse mount happens, client xlator sends
> setvolume with the remote-subvolume which has extra '/subdir1' at the
> end.

Alternatively, we could send the subdirectory separately, in xdata or
a new field.  Since the subdirectory probably has a 1:1 relationship
with a tenant ID, we could send that instead.  This avoids any
ambiguity inherent in parsing a single string into two fields, and
might map more naturally into the UI for specifying per-tenant access
controls.
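
To illustrate the ambiguity, here is a rough sketch in Python (not
GlusterFS code).  The function names and the "subdir" key are made up
for this example; only "remote-subvolume" comes from the proposal above.

```python
def split_combined(remote_subvolume):
    """Guess (brick, subdir) from one combined string.

    Splitting at the last '/' is only a guess: the string itself does
    not say where the brick path ends and the subdirectory begins.
    """
    brick, _, subdir = remote_subvolume.rpartition("/")
    return brick, "/" + subdir


def build_setvolume_request(brick, subdir):
    """Hypothetical alternative: carry the subdirectory in its own field."""
    return {"remote-subvolume": brick, "subdir": subdir}


# Brick "/bricks/b1" with subdirectory "/a/b", sent as a single string:
print(split_combined("/bricks/b1/a/b"))
# Same information sent as two fields, so there is nothing to parse:
print(build_setvolume_request("/bricks/b1", "/a/b"))
```

With a nested subdirectory the guess in split_combined() puts the
boundary in the wrong place, while the two-field form has no boundary
to guess.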

> Server xlator will do the access-control checks based on if this ip
> has access for the subdir1 based on the configuration. If setvolume is
> successful, server xlator sends gfid of the '/subdir1' in the response
> for setvolume. Client xlator sends this in CHILD_UP notification. Fuse
> mount sets this gfid as root_gfid and does a resolution by sending
> lookup fop.

That is not sufficient.  Full multi-tenant separation requires that
the mapping of GFID 1 (the root GFID) be done on the server, not the
client, and that
each tenant's inode table remain separate as well.  Otherwise it's too
easy for one tenant to present a GFID belonging to a different tenant
and have it accepted.  There are also unresolved issues with respect to
information that's stored on the actual brick root, such as DHT's
volume commit hash and some 'du' information.  (Thanks to Shyam for
pointing these out.)  Also, applying a per-tenant suffix too early
might confuse things like quota, or make their use less intuitive.
Instead, we could apply the suffix all the way down at storage/posix,
but we'd still need a way to get at the actual brick root for some of
those special cases.
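
A minimal sketch of what server-side mapping with separate per-tenant
inode tables could look like, in Python rather than real xlator code.
It assumes that "GFID 1" above means the well-known root GFID; the
class and method names are hypothetical.

```python
import uuid

# Root GFID as seen by every client: 00000000-0000-0000-0000-000000000001
ROOT_GFID = uuid.UUID(int=1)


class TenantInodeTable:
    """Hypothetical per-tenant table kept on the server.

    Only GFIDs the server has handed out to this tenant are accepted.
    """

    def __init__(self, subdir_gfid):
        self.subdir_gfid = subdir_gfid
        self.known = {subdir_gfid}

    def remember(self, gfid):
        # Called when a lookup inside this tenant's subtree succeeds.
        self.known.add(gfid)

    def resolve(self, client_gfid):
        # The server, not the client, maps the root GFID to the subdir.
        if client_gfid == ROOT_GFID:
            return self.subdir_gfid
        if client_gfid not in self.known:
            raise PermissionError("GFID not visible to this tenant")
        return client_gfid
```

Because each tenant gets its own table, a GFID learned by one tenant is
simply unknown when another tenant presents it.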

> Some of the things we are not clear about:
> 1) Should acls be set based on paths/gfids of the directories?
> 2) If answer to 1) is based on paths, what should happen if the
> directories are renamed?

(3) What is the UX for granting/denying access to a particular
subdirectory for a particular tenant (and possibly over a particular
network)?  How is this same information stored internally?

(4) How do we avoid tenants in separate subdirectories interfering
with each other in other ways - e.g. DHT rebalance leading to I/O
starvation, snapshots introducing quiescence bubbles, or pushing each
other's data out of a hot tier?  It's not strictly necessary to
solve all of these problems just to do subdirectory mounts, but
we need to think about them before we choose an approach that will
require rework or abandonment when we get to that next stage.
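
For questions (1) through (3), one possible internal representation,
purely as an illustration and not a proposal for the actual format:
rules keyed by the directory's GFID, so that a rename does not change
which rule applies, each naming a tenant and a network.  All names
here are hypothetical.

```python
import ipaddress


class SubdirAcl:
    """Hypothetical store of access rules keyed by directory GFID."""

    def __init__(self):
        self._rules = {}  # gfid -> list of (tenant, network)

    def grant(self, gfid, tenant, cidr):
        network = ipaddress.ip_network(cidr)
        self._rules.setdefault(gfid, []).append((tenant, network))

    def allowed(self, gfid, tenant, client_ip):
        addr = ipaddress.ip_address(client_ip)
        return any(t == tenant and addr in net
                   for t, net in self._rules.get(gfid, []))
```

A path-keyed version would look the same except for the key, and would
need some answer for what happens on rename.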