[Gluster-devel] tools/glusterfind - update

Aravinda avishwan at redhat.com
Wed Apr 22 09:17:50 UTC 2015


Hi,

We are making some modifications to glusterfind compared to the first
version (documented here [1]).
glusterfind is now capable of detecting different types of changes:
NEW, MODIFY, RENAME and DELETE.

Instead of producing multiple output files, the output format is now
standardized on a single output file.

Full find with --full option:
-----------------------------
One entry per line. Each entry is encoded as per RFC3986 [2] and can be
unescaped in Python using urllib.unquote_plus. For example, if the output
file is required in "<NUM_LETTERS> <PATH>" format:

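#!/usr/bin/python
import sys
import urllib

# Path to the glusterfind output file, given as the first argument
filename = sys.argv[1]

with open(filename) as f:
    for line in f:
        # Drop the trailing newline so it is neither counted nor printed twice
        path = urllib.unquote_plus(line.rstrip("\n"))
        print "%s %s" % (len(path), path)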

You can run this script and redirect its output to a file.
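
On Python 3, unquote_plus lives in urllib.parse rather than in urllib. A
minimal sketch of the same conversion for Python 3 is given below. The
function name format_full_output is hypothetical and used only for
illustration; it is not part of glusterfind.

```python
from urllib.parse import unquote_plus


def format_full_output(lines):
    """Yield "<NUM_LETTERS> <PATH>" for each encoded entry.

    Hypothetical helper: `lines` is any iterable of encoded lines,
    for example an open glusterfind output file.
    """
    for line in lines:
        # Strip the line terminator before decoding so it is not counted.
        path = unquote_plus(line.rstrip("\n"))
        yield "%d %s" % (len(path), path)
```

It can be used as "for entry in format_full_output(open(filename)):
print(entry)", with the result redirected to a file as before.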

Incremental:
------------
Output format is <TYPE> <PATH1> <PATH2>. PATH2 is applicable only if the
type is RENAME. All pathnames are encoded as explained in the previous
section. Possible type values are: NEW, MODIFY, DELETE and RENAME.
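
The incremental format can be parsed along the following lines. This is
only a sketch in Python 3, under the assumption that fields are separated
by whitespace and that encoded pathnames never contain raw whitespace
(unquote_plus turns "+" back into a space). The names Change and
parse_change are hypothetical, not part of glusterfind.

```python
from collections import namedtuple
from urllib.parse import unquote_plus

# Hypothetical record for one line of incremental output.
Change = namedtuple("Change", ["type", "path1", "path2"])

# Type values listed in this mail.
VALID_TYPES = {"NEW", "MODIFY", "DELETE", "RENAME"}


def parse_change(line):
    """Parse one "<TYPE> <PATH1> <PATH2>" line into a Change."""
    # Assumption: whitespace separates fields; paths are still encoded here.
    fields = line.split()
    if len(fields) < 2 or fields[0] not in VALID_TYPES:
        raise ValueError("unrecognized entry: %r" % line)

    change_type = fields[0]
    path1 = unquote_plus(fields[1])
    path2 = None
    if change_type == "RENAME":
        # PATH2 is only meaningful for RENAME entries.
        if len(fields) < 3:
            raise ValueError("RENAME entry without PATH2: %r" % line)
        path2 = unquote_plus(fields[2])
    return Change(change_type, path1, path2)
```

For example, parse_change("RENAME old+name.txt dir%2Fnew.txt") gives a
Change whose path1 is "old name.txt" and whose path2 is "dir/new.txt".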

Other new options that will be introduced:

Create command:
---------------
--reset-session-time Force a reset of the session time. The next
incremental run will start from this time.

Pre command:
------------
1. -N, --only-namespace-changes Do not list files/dirs that are modified
    (data and metadata); detect only New, Rename, Link, Unlink etc.
2. Partial find is enabled by default: if getting the changes from one
    node fails, the command still succeeds. This behavior can be prevented
    by passing --disable-partial
3. The --change-detector option will be removed, since we need to use
    changelog to detect the different types of fops. This is not possible
    with the other crawl.
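
As an illustration of how these options combine, the sketch below builds
an argument list for the pre command in Python. The positional layout
(session, volume, output file) is an assumption not stated in this mail,
and build_pre_command is a hypothetical helper; only the two flags are
taken from the list above.

```python
def build_pre_command(session, volume, outfile,
                      only_namespace_changes=False,
                      disable_partial=False):
    """Return an argv list for "glusterfind pre" (hypothetical helper)."""
    # Assumption: positional arguments are <SESSION> <VOLUME> <OUTFILE>.
    cmd = ["glusterfind", "pre", session, volume, outfile]
    if only_namespace_changes:
        # Skip data/metadata modifications, keep namespace changes only.
        cmd.append("--only-namespace-changes")
    if disable_partial:
        # Fail the whole command if any node cannot return its changes.
        cmd.append("--disable-partial")
    return cmd
```

The returned list can be passed to subprocess.check_call on a node where
glusterfind is installed.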

Let us know if you have any suggestions.


[1] http://review.gluster.org/#/c/9800/
[2] https://www.ietf.org/rfc/rfc3986.txt

--
regards
Aravinda



