[Gluster-devel] REVERT: Change in glusterfs[master]: fuse: auxiliary gfid mount support

Anand Avati avati at redhat.com
Wed Jul 31 12:42:30 UTC 2013


On 7/19/13 1:14 AM, Vijay Bellur (Code Review) wrote:
> Vijay Bellur has submitted this change and it was merged.
>
> Change subject: fuse: auxiliary gfid mount support
> ......................................................................
>
>
> fuse: auxiliary gfid mount support
>
> * files can be accessed directly through their gfid and not just
>    through their paths. For eg., if the gfid of a file is
>    f3142503-c75e-45b1-b92a-463cf4c01f99, that file can be accessed
>    using <gluster-mount>/.gfid/f3142503-c75e-45b1-b92a-463cf4c01f99
>
>    .gfid is a virtual directory used to seperate out the namespace
>    for accessing files through gfid. This way, we do not conflict with
>    filenames which can be qualified as uuids.
>
> * A new file/directory/symlink can be created with a pre-specified
>    gfid. A setxattr done on parent directory with fuse_auxgfid_newfile_args_t
>    initialized with appropriate fields as value to key "glusterfs.gfid.newfile"
>    results in the entry <parent>/bname whose gfid is set to args.gfid. The
>    contents of the structure should be in network byte order.
>
>    struct auxfuse_symlink_in {
>          char     linkpath[]; /* linkpath is a null terminated string */
>    } __attribute__ ((__packed__));
>
>    struct auxfuse_mknod_in {
>          unsigned int   mode;
>          unsigned int   rdev;
>          unsigned int   umask;
>    } __attribute__ ((__packed__));
>
>    struct auxfuse_mkdir_in {
>          unsigned int   mode;
>          unsigned int   umask;
>    } __attribute__ ((__packed__));
>
>    typedef struct {
>          unsigned int  uid;
>          unsigned int  gid;
>          char          gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid string
>                                                        * in canonical form.
>                                                        */
>          unsigned int  st_mode;
>          char          bname[];     /* bname is a null terminated string */
>
>          union {
>                  struct auxfuse_mkdir_in   mkdir;
>                  struct auxfuse_mknod_in   mknod;
>                  struct auxfuse_symlink_in symlink;
>          } __attribute__ ((__packed__)) args;
>    } __attribute__ ((__packed__)) fuse_auxgfid_newfile_args_t;
>
>    An initial consumer of this feature would be geo-replication to
>    create files on slave mount with same gfids as that on master.
>    It will also help gsyncd to access files directly through their
>    gfids. gsyncd in its newer version will be consuming a changelog
>    (of master) containing operations on gfids and sync corresponding
>    files to slave.
>
> * Also, bring in support to heal gfids with a specific value.
>    fuse-bridge sends across a gfid during a lookup, which storage
>    translators assign to an inode (file/directory etc) if there is
>    no gfid associated it. This patch brings in support
>    to specify that gfid value from an application, instead of relying
>    on random gfid generated by fuse-bridge.
>
>    gfids can be healed through setxattr interface. setxattr should be
>    done on parent directory. The key used is "glusterfs.gfid.heal"
>    and the value should be the following structure whose contents
>    should be in network byte order.
>
>    typedef struct {
>          char      gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid
>                                                        * string in canonical form
>                                                        */
>          char      bname[]; /* a null terminated basename */
>    } __attribute__((__packed__)) fuse_auxgfid_heal_args_t;
>
>    This feature can be used for upgrading older geo-rep setups where gfids
>    of files are different on master and slave to newer setups where they
>    should be same. One can delete gfids on slave using setxattr -x and
>    .glusterfs and issue stat on all the files with gfids from master.
>
> Thanks to "Amar Tumballi" <amarts at redhat.com> and "Csaba Henk"
> <csaba at redhat.com> for their inputs.
>
> Signed-off-by: Raghavendra G <rgowdapp at redhat.com>
> Change-Id: Ie8ddc0fb3732732315c7ec49eab850c16d905e4e
> BUG: 952029
> Reviewed-on: http://review.gluster.com/#/c/4702
> Reviewed-by: Amar Tumballi <amarts at redhat.com>
> Tested-by: Amar Tumballi <amarts at redhat.com>
> Reviewed-on: http://review.gluster.org/4702
> Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
> Tested-by: Gluster Build System <jenkins at build.gluster.com>
> Reviewed-by: Vijay Bellur <vbellur at redhat.com>
> ---
> M glusterfsd/src/glusterfsd.c
> M glusterfsd/src/glusterfsd.h
> M libglusterfs/src/glusterfs.h
> M libglusterfs/src/inode.c
> M libglusterfs/src/inode.h
> M xlators/cluster/dht/src/dht-common.c
> M xlators/mount/fuse/src/Makefile.am
> M xlators/mount/fuse/src/fuse-bridge.c
> M xlators/mount/fuse/src/fuse-bridge.h
> M xlators/mount/fuse/src/fuse-helpers.c
> A xlators/mount/fuse/src/glfs-fuse-bridge.h
> M xlators/mount/fuse/utils/mount.glusterfs.in
> M xlators/storage/posix/src/posix.c
> 13 files changed, 1,317 insertions(+), 136 deletions(-)
>
> Approvals:
>    Xavier Hernandez: Looks good to me, but someone else must approve
>    Amar Tumballi: Looks good to me, approved
>    Vijay Bellur: Looks good to me, approved
>    Gluster Build System: Verified
>
>
>

Raghavendra,

This patch needs significant rework in fuse-bridge.c. Particular 
concerns are (all of which can be fixed):

- fuse_nodeid_t - this a very confusing structure. It is building 
relations between inodes which messes up the ref model. Are inode 
pointers stored there with refs? or without refs? If inode pointers are 
stored with refs, then we are in a situation where we potentially never 
forget the inode (who initiates the extra unref and how?). If inode 
pointers are stored without ref, then there is no guarantee that the 
pointer deref is not going to land you on an object which was destroyed 
just a moment ago. Not only is the ref model very murky, it is extremely 
confusing to visualize the working during concurrent access and forgets.

- Per inode allocation of a new structure. Do we really need a new inode 
context? While we try to minimize per-inode allocations as much as 
possible, this is adding another structure for a reason which can be 
addressed in a different way (see below)

- Extra level of pointer indirection for fuse handle to inode conversion 
in every operation. Not only are we allocating per inode, the fuse 
handle is now a double deref through a _member_ of the ctx structure. 
Not only does this worsen memory locality, it is adding complexity for 
no good reason.

- Multiple types of inode contexts depending on situation. This is a 
problem which is arising with http://review.gluster.org/5267. We are now 
forced into a situation where the inode ctx is of different _types_ 
depending on whether the inode is for the gfid view or the path view. 
Having multipe _types_ of inode ctxes is a nightmare for maintenance, 
and we have already seen a crash where the code confuses it to be of one 
type instead of another in a place.

Instead, I propose this:

- Revert all of the changes done to fuse-bridge.c, bring it back to the 
state how it was before the patch (carefully considering other changes 
which have happened beyond this patch)
- Introduce a new translator to overlay a virtual gfid view
- Normal inodes can continue to return original gfids as-is.
- Virtual inodes can create a random on-the fly gfid (which need not be 
persisted), and identified by a numerical flag in inode ctx without 
allocating a per-inode object.
- Upon access of the virtual inodes (which can be identified with an 
integer flag in the inode context without a new structure), redirect the 
operation to the inode which has the gfid whose canonical form is the 
dentry name of the virtual inode.

Let fuse-bridge be a pure fuse <--> xlator converter. Adding a new 
(gfid) view is clearly a separate concern, best implemented as a 
separate translator.

Thanks,
Avati





More information about the Gluster-devel mailing list