[Gluster-devel] Tracking File Creations, Modifications, and Deletions

Drew Morris drew at drewmorris.com
Mon Jul 20 23:28:48 UTC 2009


Hi All...

We are developing a custom translator that logs file modifications
(creation, update and deletion) into a database. This log is later
used by several processes to perform asynchronous operations, such
as indexing recently modified files and updating an off-site backup
for disaster recovery.

Once it is complete, we plan to share the translator with the
community if anyone is interested in it.

Our Current Approach:

By reviewing the Gluster and FUSE source code and documentation, we
concluded that the following FOPs should be monitored for this purpose: 
open, create, mknod, truncate, ftruncate, writev, flush, release, unlink and
rename.
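
As a rough illustration of how that FOP list might be organized, the
sketch below groups the ten FOPs by the kind of change they imply. The
FOP names are the ones listed above; the category names, the table and
the `classify` helper are hypothetical and only model the idea. This is
not GlusterFS translator code.

```python
# FOPs that imply a change on their own. The grouping is an assumption
# made for illustration; only the FOP names come from the list above.
FOP_CHANGE_KIND = {
    "create": "creation",
    "mknod": "creation",
    "truncate": "update",
    "ftruncate": "update",
    "writev": "update",
    "unlink": "deletion",
    "rename": "rename",
}

# FOPs that mark the life cycle of a file descriptor rather than a
# change. (An open with O_TRUNC would also modify the file; that case
# is ignored here to keep the sketch small.)
LIFECYCLE_FOPS = {"open", "flush", "release"}


def classify(fop):
    """Return the change kind for a FOP name, or None if unmonitored."""
    if fop in FOP_CHANGE_KIND:
        return FOP_CHANGE_KIND[fop]
    if fop in LIFECYCLE_FOPS:
        return "lifecycle"
    return None
```

Anything outside the list, such as a read, falls through to `None` and
would simply be passed along without logging.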

We would like to insert one record per file modification, so we need
a mechanism to aggregate multiple operations on one file descriptor
(such as open, writev and flush) into a single update.

For the sake of performance, and to prevent dirty reads, we would like
to insert the database row in the callback of the very last action
performed on the file. In other words, during a write we only set a
"modified" flag in the file descriptor context, and we perform the
insert in the very last action.
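
A minimal sketch of that idea, written as a toy Python model rather
than real translator code: the class name, the method names and the
in-memory `records` list (standing in for the database table) are all
hypothetical. It assumes that release really is delivered as the last
action on a descriptor, which is exactly the assumption that does not
hold reliably in practice.

```python
class ChangeAggregator:
    """Toy model of per-fd aggregation; not GlusterFS translator code."""

    def __init__(self):
        self.fd_ctx = {}    # fd -> per-descriptor context
        self.records = []   # stands in for the database table

    def on_open(self, fd, path, created=False):
        # A newly created file counts as modified from the start.
        self.fd_ctx[fd] = {"path": path, "modified": created}

    def on_writev(self, fd):
        # Writes only set the flag; nothing is inserted yet.
        self.fd_ctx[fd]["modified"] = True

    def on_flush(self, fd):
        # Flush may arrive several times, so it is ignored here.
        pass

    def on_release(self, fd):
        # Assumed to be the last action: insert one record if modified.
        ctx = self.fd_ctx.pop(fd, None)
        if ctx is not None and ctx["modified"]:
            self.records.append(ctx["path"])


agg = ChangeAggregator()

# Three writes on one descriptor collapse into a single record.
agg.on_open(3, "/data/report.txt")
for _ in range(3):
    agg.on_writev(3)
agg.on_flush(3)
agg.on_release(3)

# A descriptor that was opened but never written produces no record.
agg.on_open(4, "/data/readonly.txt")
agg.on_release(4)

print(agg.records)  # ['/data/report.txt']
```

If release never arrives, the entry stays in `fd_ctx` and no record is
ever inserted for that descriptor.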

The major issue is that, as most of the docs and the FAQ indicate,
there is no reliable mechanism to decide which FOP is the last one.
We monitored file system interaction via the trace module and noticed
that flush is called several times and that, in many cases, release
is never invoked.

This forced us to log on the very first flush, which is problematic
for a number of reasons. We can never be sure the operation has
finished before we trigger any of our asynchronous operations, and we
slow down the initial write because we wait for the log action to
complete.
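
The weakness of logging on the first flush can be seen in a small toy
model. As before, this is a hypothetical Python sketch and not
translator code; the class, its methods and the `late_writes` list
exist only to make the problem visible.

```python
class FirstFlushLogger:
    """Toy model of the log-on-first-flush workaround."""

    def __init__(self):
        self.fd_ctx = {}
        self.records = []      # stands in for the database table
        self.late_writes = []  # writes seen after the record was made

    def on_open(self, fd, path):
        self.fd_ctx[fd] = {"path": path, "modified": False,
                           "logged": False}

    def on_writev(self, fd):
        ctx = self.fd_ctx[fd]
        ctx["modified"] = True
        if ctx["logged"]:
            # The record already exists, so this write is not covered.
            self.late_writes.append(ctx["path"])

    def on_flush(self, fd):
        ctx = self.fd_ctx[fd]
        # Only the first flush after a modification inserts a record.
        if ctx["modified"] and not ctx["logged"]:
            self.records.append(ctx["path"])
            ctx["logged"] = True


logger = FirstFlushLogger()
logger.on_open(5, "/data/big.bin")
logger.on_writev(5)
logger.on_flush(5)   # record inserted here
logger.on_writev(5)  # the file is still changing
logger.on_flush(5)   # ignored: already logged
```

After this sequence `records` holds a single entry for the file, while
`late_writes` shows that a write arrived after that entry was made. A
consumer reading the log at that point would act on a file that is
still being modified.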


Question:
Does anyone have a better solution for this issue? Perhaps there
should be a mechanism that notifies us when a file is closed;
otherwise an open file descriptor will remain forever.

We would love to find another reliable method that allows us to track
these operations at a higher level, and we would greatly appreciate
any new approach that can overcome these deficiencies.

Thanks in Advance

- Drew Morris