[Gluster-devel] Implementing Flat Hierarchy for trashed files

Jiffin Tony Thottan jthottan at redhat.com
Tue Sep 22 06:32:02 UTC 2015



On 19/08/15 15:57, Niels de Vos wrote:
> On Tue, Aug 18, 2015 at 04:51:58PM +0530, Jiffin Tony Thottan wrote:
>> Comments inline.
>>
>> On 18/08/15 09:54, Niels de Vos wrote:
>>> On Mon, Aug 17, 2015 at 06:20:50PM +0530, Anoop C S wrote:
>>>> Hi all,
>>>>
>>>> As we move forward, in order to fix the limitations with current trash
>>>> translator we are planning to replace the existing criteria for trashed
>>>> files inside trash directory with a general flat hierarchy as described
>>>> in the following sections. Please share your thoughts on the following
>>>> design considerations.
>>>>
>>>> Current implementation
>>>> ======================
>>>> * Trash translator resides on glusterfs server stack just above posix.
>>>> * Trash directory (.trashcan) is created during volume start and is
>>>>    visible under root of the volume.
>>>> * Each trashed file is moved (renamed) to trash directory with an
>>>>    appended time stamp in the file name.
>>>> * Exact directory hierarchy (w.r.t the root of volume) is maintained
>>>>    inside trash directory whenever a file is deleted/truncated from a
>>>>    directory
>>>>
>>>> Outstanding issues
>>>> ==================
>>>> * Since renaming occurs at the server side, client-side is unaware of
>>>>    trash doing rename or create operations.
>>>> * As a result files/directories may not be visible from mount point.
>>> This might be something upcall could help with. If the trash xlator is
>>> placed above upcall, any clients interested in the .trashcan directory
>>> (or subdirs) could get an in/revalidation request.
>>>
>>>> * Files/Directories created by the trash translator will not have
>>>>    a gfid associated with them until a lookup is performed.
>>> When a client receives an invalidation of the parent directory (from
>>> upcall), a LOOKUP will follow on the next request.
>> If I understand it correctly, the solution becomes more complex if we
>> integrate the trash translator and upcall together.
>> 1.) An upcall notification can be sent to a client only if it has accessed
>> .trashcan
> Correct, and those are the only clients we care about. Clients that do
> not have the .trashcan directory entries cached do not need to
> invalidate that cache.
>
>> 2.) There should be a translator at the client side to initiate a lookup
>> after receiving the upcall notification
> No, that is not needed. A LOOKUP will happen when the directory/inode
> needs a revalidate. After an invalidate from upcall, the next revalidate
> will cause a LOOKUP.
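
To make sure I follow the invalidate/revalidate flow described here, this is a
toy model in Python. It is only a sketch: `ToyClientCache`, its methods and the
`server` dict are made-up names for illustration and do not correspond to any
real GlusterFS or upcall API.

```python
class ToyClientCache:
    """Toy client-side entry cache; not a real GlusterFS structure."""

    def __init__(self, server):
        self.server = server   # dict standing in for the brick's directory state
        self.entries = {}      # cached directory listings on the client
        self.lookups = 0       # number of LOOKUP-like round trips performed

    def invalidate(self, path):
        # What an upcall notification would trigger: drop the cached entry.
        self.entries.pop(path, None)

    def access(self, path):
        # A revalidate: only goes to the server when nothing valid is cached.
        if path not in self.entries:
            self.lookups += 1
            self.entries[path] = self.server[path]
        return self.entries[path]


server = {".trashcan": ()}
client = ToyClientCache(server)

client.access(".trashcan")                    # first access needs a lookup
server[".trashcan"] = ("foo_2015-09-22",)     # server-side rename into trash
stale = client.access(".trashcan")            # still served from the cache
client.invalidate(".trashcan")                # upcall reaches the client
fresh = client.access(".trashcan")            # next access looks up again
print(stale, fresh, client.lookups)
```

In this model no extra client-side translator is needed to start the lookup;
the next ordinary access after the invalidation does it.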
>
>> 3.) Performance hit. Say file `foo` is present in a/b/c/. We need to create
>> the path a/b/c/ inside the trash directory.
>> So ideally the trash xlator will first create directory `a`, then send an
>> upcall notification to all of the clients, and then the clients will
>> initiate a lookup on `a` and perform gfid healing on that directory.
>> After that it will create `b` and repeat the same procedure.
> Yes, directories are more tricky. But I think this is still a better
> approach than renaming a file to include the full path.
>
>>>> Proposed Flat hierarchy
>>>> =======================
>>> I'm missing a bit of info here, what limitations need to be addressed?
>> All of the outstanding issues mentioned above can be addressed by the flat
>> hierarchy.
> But you also introduce new issues. A huge directory to browse would be
> a major concern. Users like doing directory listings and that is one of
> the worst workloads we have on Gluster :-/
>
>>>> * Instead of creating the whole directory under trash, we will rename
>>>>    the file and place it directly under trash directory (of course with
>>>>    appended time stamp).
>>>> * Directory hierarchy can be stored via either of the following two
>>>>    approaches:
>>>> 	(a) File name will contain the whole path with time stamp
>>>> 	    appended
>>>> 	(b) Store whole hierarchy as an xattr
>>> If this is needed, definitely go with (b). Filenames have a limit, and
>>> the full path (directories + filename + timestamp) could surely hit
>>> that.
>> Thanks for the suggestion.
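
To illustrate the limit mentioned above, here is a rough Python sketch of
approach (a). The 255-byte limit is the usual per-name limit on common Linux
filesystems, which I am assuming here. The `flat_name` helper, the `_`
separator and the timestamp format are invented for this example and are not
what the trash xlator does today.

```python
NAME_MAX = 255  # assumed per-name limit in bytes; typical on Linux filesystems


def flat_name(path, timestamp):
    """Hypothetical encoding for option (a): whole path plus timestamp."""
    return path.strip("/").replace("/", "_") + "_" + timestamp


def fits(name, limit=NAME_MAX):
    """True if the encoded name fits in a single directory entry."""
    return len(name.encode("utf-8")) <= limit


stamp = "2015-09-22_063202"
short = flat_name("a/b/c/foo", stamp)
deep = flat_name("/".join(["project-directory"] * 20) + "/foo", stamp)

print(short, fits(short))
print(len(deep), fits(deep))
```

A shallow path fits easily, but a path only 20 directories deep already
exceeds the limit, whereas an xattr value as in (b) is not bound by the
file name limit.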
>>
>>>> Other enhancements
>>>> ==================
>>> Have these been filed as bugs/RFEs? If not, please do so and include a
>>> good description of the work that is needed. Maybe others in the Gluster
>>> community are interested in providing patches, and details on what to do
>>> are very helpful.
>> Sure. We will file separate RFEs as soon as possible and send them in
>> a separate mail.
> Thanks!

The RFEs opened for the trash xlator are listed below:

1.) https://bugzilla.redhat.com/show_bug.cgi?id=1264847 - flat
    hierarchy for files inside trash directory
2.) https://bugzilla.redhat.com/show_bug.cgi?id=1264849 - Create trash
    directory only when it is enabled
3.) https://bugzilla.redhat.com/show_bug.cgi?id=1264853 - trash helper
    translator at client-side
4.) https://bugzilla.redhat.com/show_bug.cgi?id=1264857 - Restore
    operation for files under trash directory
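
For anyone who wants a feel for how the flat hierarchy and the restore
operation could fit together, below is a minimal Python sketch. It is not the
xlator implementation: the original path is kept in an in-memory dict that
stands in for a per-file xattr, the key name `trusted.example.trash.orig-path`
is hypothetical, and restore is a plain function call here rather than the
setfattr-based interface proposed in the thread.

```python
import os
import tempfile
import time

# Stand-in for a per-file xattr; the key name is hypothetical.
ORIG_PATH_KEY = "trusted.example.trash.orig-path"
xattrs = {}  # maps trashed file path -> {xattr name: value}


def trash(volume_root, rel_path):
    """Move rel_path directly under .trashcan and record its origin."""
    trash_dir = os.path.join(volume_root, ".trashcan")
    os.makedirs(trash_dir, exist_ok=True)
    stamp = time.strftime("%Y-%m-%d_%H%M%S", time.gmtime())
    target = os.path.join(trash_dir, os.path.basename(rel_path) + "_" + stamp)
    os.rename(os.path.join(volume_root, rel_path), target)
    xattrs[target] = {ORIG_PATH_KEY: rel_path}
    return target


def restore(volume_root, trashed):
    """Move a trashed file back to the path recorded when it was deleted."""
    rel_path = xattrs.pop(trashed)[ORIG_PATH_KEY]
    dest = os.path.join(volume_root, rel_path)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    os.rename(trashed, dest)
    return dest


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a/b/c"))
with open(os.path.join(root, "a/b/c/foo"), "w") as f:
    f.write("data")

trashed = trash(root, "a/b/c/foo")
print(os.listdir(os.path.join(root, ".trashcan")))  # one flat entry, no a/b/c
print(restore(root, trashed))
```

The sketch ignores name collisions (two files with the same base name deleted
within the same second), which a real design would have to handle.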

Contributions are most welcome.

With Regards,
Jiffin


>>> Thanks,
>>> Niels
>>>
>>>> * Create the trash directory only when trash xlator is enabled.
>>>> * Operations such as unlink, rename etc will be prevented on trash
>>>>    directory only when trash xlator is enabled.
>>>> * A new trash helper translator on client side (loaded only when trash
>>>>    is enabled) to resolve split brain issues with truncation of files.
>>>> * Restore files from trash with the help of an explicit setfattr call.
>>>>
>>>> Thanks & Regards,
>>>> -Anoop C S
>>>> -Jiffin Tony Thottan
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>
>> --
>> Jiffin


