[Gluster-users] Would there be a use for cluster-specific filesystem tools?
Joe Julian
joe at julianfamily.org
Wed Apr 16 14:31:27 UTC 2014
Excellent! I've been toying with the same concept in the back of my mind for a long while now. I'm sure there is an unrealized desire for such tools.
When you're ready, please put such a toolset on forge.gluster.org.
On April 16, 2014 6:50:48 AM PDT, Michael Peek <peek at nimbios.org> wrote:
>Hi guys,
>
>(I'm new to this, so pardon me if my shenanigans turn out to be a
>waste of your time.)
>
>I have been experimenting with Gluster by copying and deleting large
>numbers of files of all sizes. What I found was that when deleting a
>large number of small files, the deletion process seems to take a good
>chunk of my time -- in some cases it seemed to take a significant
>percentage of the time that it took to copy the files to the cluster to
>begin with. I'm guessing that the reason is a combination of find and
>rm -fr processing files serially and having to wait for packets to
>travel back and forth over the network. But with a clustered
>filesystem, serially processing files and waiting on network
>round-trips is a bottleneck you don't have to accept.
>
>So I decided to try an experiment. Instead of using /bin/rm to delete
>files serially, I wrote my own quick-and-dirty recursive rm (and
>recursive ls) that uses pthreads (listed as "cluster-rm" and
>"cluster-ls" in the table below):
>
>Methods:
>
>1) This was done on a Linux system. I suspect that Linux (or any
>modern OS) caches filesystem information. For example, after setting
>up a directory, rm -fr on that directory completes faster if I first
>run find on the same directory. So to avoid this caching effect, each
>command was run on its own test directory (i.e. find was never run on
>the same directory as rm -fr or cluster-rm), which made the run times
>much more consistent.
>
>2) For each test run, each test directory contained exactly the same
>data for each of the four commands tested (find, cluster-ls, rm,
>cluster-rm).
>
>3) All commands were run on a client machine and not one of the cluster
>nodes.
>
>Results:
>
>Data Size  Command      Time  Test #1      Test #2      Test #3      Test #4
>---------  -----------  ----  -----------  -----------  -----------  -----------
>49GB       find -print  real  6m45.066s    6m18.524s    5m45.301s    5m58.577s
>                        user  0m0.172s     0m0.140s     0m0.156s     0m0.132s
>                        sys   0m0.748s     0m0.508s     0m0.484s     0m0.480s
>49GB       cluster-ls   real  2m32.770s    2m21.376s    2m40.511s    2m36.202s
>                        user  0m0.208s     0m0.164s     0m0.184s     0m0.172s
>                        sys   0m1.876s     0m1.568s     0m1.488s     0m1.412s
>49GB       rm -fr       real  16m36.264s   16m16.795s   15m54.503s   16m10.037s
>                        user  0m0.232s     0m0.248s     0m0.204s     0m0.168s
>                        sys   0m1.724s     0m1.528s     0m1.396s     0m1.448s
>49GB       cluster-rm   real  1m50.717s    1m44.803s    2m6.250s     2m6.367s
>                        user  0m0.236s     0m0.192s     0m0.224s     0m0.224s
>                        sys   0m1.820s     0m2.100s     0m2.200s     0m2.316s
>
>97GB       find -print  real  11m39.990s   11m21.018s   11m33.257s   11m4.867s
>                        user  0m0.380s     0m0.380s     0m0.288s     0m0.332s
>                        sys   0m1.428s     0m1.224s     0m0.924s     0m1.244s
>97GB       cluster-ls   real  4m46.829s    5m15.538s    4m52.075s    4m43.134s
>                        user  0m0.504s     0m0.408s     0m0.364s     0m0.452s
>                        sys   0m3.228s     0m3.736s     0m3.004s     0m3.140s
>97GB       rm -fr       real  29m34.138s   28m11.000s   28m37.154s   28m41.724s
>                        user  0m0.520s     0m0.556s     0m0.412s     0m0.380s
>                        sys   0m3.908s     0m3.480s     0m2.756s     0m4.184s
>97GB       cluster-rm   real  3m30.750s    4m20.195s    4m45.206s    4m26.894s
>                        user  0m0.524s     0m0.456s     0m0.444s     0m0.436s
>                        sys   0m4.932s     0m5.316s     0m4.584s     0m4.732s
>
>145GB      find -print  real  16m26.498s   16m53.047s   15m10.704s   15m53.943s
>                        user  0m0.520s     0m0.596s     0m0.364s     0m0.456s
>                        sys   0m2.244s     0m1.740s     0m1.748s     0m1.764s
>145GB      cluster-ls   real  6m52.006s    7m7.361s     7m4.109s     6m37.229s
>                        user  0m0.644s     0m0.804s     0m0.652s     0m0.656s
>                        sys   0m5.664s     0m5.432s     0m4.800s     0m4.652s
>145GB      rm -fr       real  40m10.396s   42m17.851s   39m6.493s    39m52.047s
>                        user  0m0.624s     0m0.844s     0m0.484s     0m0.496s
>                        sys   0m5.492s     0m4.872s     0m4.868s     0m4.980s
>145GB      cluster-rm   real  6m49.769s    8m34.644s    6m3.563s     6m31.808s
>                        user  0m0.708s     0m0.852s     0m0.636s     0m0.664s
>                        sys   0m6.440s     0m8.345s     0m5.844s     0m5.996s
>
>1.1TB      find -print  real  62m4.043s    61m11.584s   65m37.389s   63m51.822s
>                        user  0m1.300s     0m1.204s     0m1.708s     0m3.096s
>                        sys   0m5.448s     0m5.172s     0m4.276s     0m9.869s
>1.1TB      cluster-ls   real  73m12.463s   68m37.846s   72m56.417s   69m3.575s
>                        user  0m2.472s     0m2.080s     0m2.516s     0m4.316s
>                        sys   0m19.289s    0m18.625s    0m18.601s    0m35.986s
>1.1TB      rm -fr       real  188m1.925s   190m21.850s  200m25.712s  196m12.686s
>                        user  0m2.240s     0m2.372s     0m5.840s     0m4.916s
>                        sys   0m21.705s    0m18.885s    0m46.363s    0m41.519s
>1.1TB      cluster-rm   real  85m46.463s   90m29.055s   88m16.063s   77m42.096s
>                        user  0m2.512s     0m2.600s     0m4.456s     0m2.464s
>                        sys   0m30.478s    0m30.382s    0m51.667s    0m31.638s
>
>Conclusions:
>
>1) Once I had a threaded version of rm, a threaded version of ls was
>easy to make, so I included it in the tests (listed above as
>cluster-ls). Performance looked spiffy up until the 1.1TB range,
>where cluster-ls started taking more time than find. Right now I
>can't explain why. A 1.1TB test takes a long time to set up and
>process (about a day for each set of four commands), so it could be
>that regular nightly backups were interfering with performance. If
>that's the case, it calls into question the usefulness of my threaded
>approach. Also, the output from cluster-ls is naturally out of order,
>so it would most likely be piped through something like grep or sed,
>and I haven't yet timed 'cluster-ls | some-other-command' against
>plain old find by itself.
>
>2) Results from cluster-rm look pretty good to me across the board.
>Again, performance falls off in the 1.1TB tests for reasons that
>aren't clear to me yet, but the run time is still less than half that
>of rm -fr. Run times fluctuate more than in the smaller tests, which
>I suppose is to be expected. Still, since performance does drop, it
>makes me wonder how well this approach scales to larger sets of data.
>
>3) My threaded cluster-rm/ls commands are not clever. While
>traversing directories, each subdirectory found spawns a new thread
>to process it, until some hard-coded limit is reached (100 threads
>for the results above). Once the limit is reached, directories are
>processed with plain old recursion until a thread exits, freeing up a
>slot to process another subdirectory -- roughly the shape of the
>sketch below.
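>
>In case it helps to picture it, here is a stripped-down sketch of
>that traversal logic. This is not my actual cluster-rm source, just
>the general shape of it, with most error handling and all of the
>command-line options left out:
>
>/* Sketch only: remove a tree, handing each subdirectory to a new
> * thread until a hard-coded cap is hit, then falling back to
> * ordinary recursion.  Most error handling is omitted. */
>#include <dirent.h>
>#include <limits.h>
>#include <pthread.h>
>#include <stdio.h>
>#include <stdlib.h>
>#include <string.h>
>#include <sys/stat.h>
>#include <unistd.h>
>
>#define MAX_THREADS 100            /* the cap used for the tests above */
>
>static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
>static int active = 0;             /* threads currently running */
>
>static void remove_tree(const char *path);
>
>static void *worker(void *arg)     /* one thread per subdirectory */
>{
>    char *dir = arg;
>    remove_tree(dir);
>    rmdir(dir);                    /* empty once its contents are gone */
>    free(dir);
>    pthread_mutex_lock(&lock);
>    active--;                      /* free a slot for another subdir */
>    pthread_mutex_unlock(&lock);
>    return NULL;
>}
>
>static void remove_tree(const char *path)
>{
>    pthread_t kids[MAX_THREADS];
>    int nkids = 0;
>    DIR *d = opendir(path);
>    struct dirent *e;
>
>    if (!d)
>        return;
>    while ((e = readdir(d)) != NULL) {
>        char child[PATH_MAX];
>        struct stat st;
>        int spawn;
>
>        if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
>            continue;
>        snprintf(child, sizeof child, "%s/%s", path, e->d_name);
>        if (lstat(child, &st) != 0)
>            continue;
>        if (!S_ISDIR(st.st_mode)) {
>            unlink(child);         /* plain file, symlink, etc. */
>            continue;
>        }
>        pthread_mutex_lock(&lock);
>        spawn = (active < MAX_THREADS && nkids < MAX_THREADS);
>        if (spawn)
>            active++;
>        pthread_mutex_unlock(&lock);
>        if (spawn) {               /* hand the subdirectory to a thread */
>            pthread_create(&kids[nkids++], NULL, worker, strdup(child));
>        } else {                   /* cap reached: plain old recursion */
>            remove_tree(child);
>            rmdir(child);
>        }
>    }
>    closedir(d);
>    while (nkids > 0)              /* wait for our subdirectory threads */
>        pthread_join(kids[--nkids], NULL);
>}
>
>int main(int argc, char **argv)
>{
>    if (argc != 2) {
>        fprintf(stderr, "usage: %s <directory>\n", argv[0]);
>        return 1;
>    }
>    remove_tree(argv[1]);
>    return rmdir(argv[1]) != 0;
>}
>
>(This compiles with gcc -O2 -pthread; the real tool obviously does
>more than this.)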
>
>Further Research:
>
>A) I would like to test further with larger data sets.
>
>B) I would like to implement a smarter algorithm for determining how
>many threads to use to maximize performance. Rather than a hard-coded
>maximum, a better approach might be to measure the number of inodes
>processed per second and keep adding threads until that rate stops
>improving, i.e. until a local maximum is reached (see the toy sketch
>after this list).
>
>C) How do these numbers change if the commands are run on one of the
>cluster nodes instead of a client?
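>
>To make (B) a little more concrete, here is a toy sketch of the kind
>of feedback loop I have in mind. The "work" is simulated with a
>sleep, so this is only the shape of the heuristic, not something that
>actually talks to the volume:
>
>/* Toy sketch: grow the worker pool while the measured rate of
> * inodes processed per second keeps improving, and stop growing
> * once the rate levels off (a local maximum). */
>#include <pthread.h>
>#include <stdatomic.h>
>#include <stdio.h>
>#include <unistd.h>
>
>#define MAX_WORKERS 64
>
>static atomic_long inodes_done;      /* workers bump this per "inode" */
>static atomic_int  stop_flag;
>
>static void *fake_worker(void *arg)  /* stand-in for a traversal thread */
>{
>    (void)arg;
>    while (!atomic_load(&stop_flag)) {
>        usleep(1000);                /* pretend to unlink one file */
>        atomic_fetch_add(&inodes_done, 1);
>    }
>    return NULL;
>}
>
>int main(void)
>{
>    pthread_t workers[MAX_WORKERS];
>    int nworkers = 0;
>    long prev_rate = 0;
>
>    for (int round = 0; round < 10; round++) {
>        /* add a few more workers each round */
>        while (nworkers < 4 * (round + 1) && nworkers < MAX_WORKERS)
>            pthread_create(&workers[nworkers++], NULL, fake_worker, NULL);
>
>        long before = atomic_load(&inodes_done);
>        sleep(1);                    /* sampling interval */
>        long rate = atomic_load(&inodes_done) - before;
>        printf("workers=%d  rate=%ld inodes/s\n", nworkers, rate);
>
>        if (rate <= prev_rate)       /* stopped improving: done growing */
>            break;
>        prev_rate = rate;
>    }
>
>    atomic_store(&stop_flag, 1);     /* wind everything down */
>    while (nworkers > 0)
>        pthread_join(workers[--nworkers], NULL);
>    return 0;
>}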
>
>I have some ideas of smarter things to try, but I am at best an
>inexperienced (if enthusiastic) dabbler in the programming arts. A
>professional would likely do a much better job.
>
>But if this data looks at all interesting or useful, then maybe there
>would be a call for a handful of cluster-specific filesystem tools?
>
>Michael Peek
>
>
>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.