[Gluster-devel] Implementing Flat Hierarchy for trashed files
ppai at redhat.com
Tue Aug 18 06:29:09 UTC 2015
----- Original Message -----
> From: "Anoop C S" <anoopcs at redhat.com>
> To: gluster-devel at gluster.org
> Sent: Monday, August 17, 2015 6:20:50 PM
> Subject: [Gluster-devel] Implementing Flat Hierarchy for trashed files
> Hi all,
> As we move forward, in order to fix the limitations with current trash
> translator we are planning to replace the existing criteria for trashed
> files inside trash directory with a general flat hierarchy as described
> in the following sections. Please have your thoughts on following
> design considerations.
> Current implementation
> * Trash translator resides on glusterfs server stack just above posix.
> * Trash directory (.trashcan) is created during volume start and is
> visible under root of the volume.
> * Each trashed file is moved (renamed) to trash directory with an
> appended time stamp in the file name.
Do these files get moved during re-balance because of the name change, or do you choose the file name according to the DHT regex magic to avoid that?
> * Exact directory hierarchy (w.r.t the root of volume) is maintained
> inside trash directory whenever a file is deleted/truncated from a
> Outstanding issues
> * Since renaming occurs at the server side, client-side is unaware of
> trash doing rename or create operations.
> * As a result files/directories may not be visible from mount point.
> * Files/Directories created from trash translator will not have
> gfid associated with it until lookup is performed.
> Proposed Flat hierarchy
> * Instead of creating the whole directory under trash, we will rename
> the file and place it directly under trash directory (of course with
> appended time stamp).
The .trashcan directory might not scale with millions of such files placed under one directory. We faced the same problem in the gluster-swift project with the object expiration feature and decided to distribute our files across multiple directories in a deterministic way. Personally, I'd also prefer storing an absolute timestamp, for example as returned by the `date +%s` command.
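To illustrate the deterministic distribution idea, here is a minimal sketch. It is not the gluster-swift layout or anything the trash translator does today; the bucket count, the choice of SHA-1, and the `trash_path` helper are all assumptions made for the example.

```python
import hashlib
import os

TRASH_ROOT = ".trashcan"   # directory name used in this thread
NUM_BUCKETS = 256          # hypothetical bucket count, not a gluster setting


def trash_path(original_path, deleted_at):
    """Return a bucketed location under .trashcan for a deleted file.

    original_path: path relative to the volume root (assumed input)
    deleted_at:    absolute deletion time in seconds since the epoch,
                   i.e. the value `date +%s` would print
    """
    # Hash the original path so the bucket is deterministic and can be
    # recomputed later without any lookup table.
    digest = hashlib.sha1(original_path.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % NUM_BUCKETS
    # Append the absolute timestamp to the base name.
    name = "%s_%d" % (os.path.basename(original_path), int(deleted_at))
    return os.path.join(TRASH_ROOT, "%02x" % bucket, name)
```

Note that two files with the same base name, deleted in the same second, can still collide if their paths hash to the same bucket, so a real design would need an extra disambiguator.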
> * Directory hierarchy can be stored via either of the following two
> (a) File name will contain the whole path with time stamp
If this approach is taken, you might have trouble choosing a "magic letter" to represent slashes.
> (b) Store whole hierarchy as an xattr
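On the "magic letter" concern for option (a): one common way around it is an escape scheme rather than a single reserved character, so that any name can be encoded and decoded without ambiguity. The sketch below is only an illustration; the `%` escape character and the `encode_path`/`decode_path` helpers are assumptions, not part of the proposal.

```python
ESCAPE = "%"  # hypothetical escape character chosen for this example


def encode_path(path):
    """Flatten a volume-relative path into a single file name component."""
    # Escape the escape character first, so a literal "%2F" in the
    # original name cannot be confused with an encoded slash.
    return path.replace(ESCAPE, ESCAPE + "25").replace("/", ESCAPE + "2F")


def decode_path(name):
    """Invert encode_path and recover the original path."""
    # Undo the two replacements in the reverse order.
    return name.replace(ESCAPE + "2F", "/").replace(ESCAPE + "25", ESCAPE)
```

The encoded name grows with the depth of the path, so deep hierarchies could run into the file name length limit of the backend filesystem, which is one argument for option (b).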
> Other enhancements
> * Create the trash directory only when trash xlator is enabled.
This is a needed enhancement. Upgrading to 3.7.* from older glusterfs versions caused undesired results in the gluster-swift integration because .trashcan was visible by default on all glusterfs volumes.
> * Operations such as unlink, rename etc
> will be prevented on trash
> directory only when trash xlator is
> * A new trash helper translator on client side(loaded only when
> is enabled) to resolve split brain issues with truncation of
> * Restore files from trash with the help of an explicit setfattr
You have to be very careful with the races involved in re-creating the path while clients are accessing the volume, and also with overwriting if the path already exists.
It's way easier (from an implementer's perspective) if this is a manual process.
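To make the overwrite concern concrete, here is a rough sketch of a restore that refuses to replace an existing file. It runs against a plain local POSIX filesystem, not inside a translator, and `restore_no_overwrite` is a hypothetical helper. It relies on a hard link failing when the target already exists, whereas a plain rename would silently replace the target.

```python
import os


def restore_no_overwrite(trashed, target):
    """Restore `trashed` to `target`, refusing to replace an existing file.

    Raises FileExistsError if `target` already exists; in that case the
    trashed file is left where it was.
    """
    parent = os.path.dirname(target)
    if parent:
        # Re-create the original directory path if it is gone.
        # exist_ok=True tolerates another client creating it concurrently.
        os.makedirs(parent, exist_ok=True)
    # Creating a hard link fails with EEXIST when the target exists, so
    # the existence check and the creation happen in a single step.
    os.link(trashed, target)
    # Only after the link succeeded, drop the name inside the trash.
    os.unlink(trashed)
```

This still leaves open questions the sketch does not address, such as a parent directory being removed again between the two steps, or what the restored file's gfid should be.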
> Thanks & Regards,
> -Anoop C S
> -Jiffin Tony Thottan