[Gluster-users] directory traversal speed

Lana Deere lana.deere at gmail.com
Fri Nov 5 19:03:59 UTC 2010

I have a program which traverses directories and builds a list of
files in those directories.  The hierarchy being traversed consists
of approximately 750,000 files scattered among approximately 4100
directories.  I ran it under three configurations:
  1. a regular Linux system on a 15k RPM SAS drive.
  2. directly on the RAID under my gluster installation.
  3. GlusterFS 3.1.0 / RDMA / Distributed / native fuse clients.

I did the experiment twice in a row on each configuration.  Results:
  1. 90 seconds then 7 seconds.
  2. 74 seconds then 4 seconds.
  3. 4678 seconds then 4648 seconds.
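
For scale, the timings above can be turned into a rough cost per directory entry. This is only a back-of-the-envelope sketch: the entry count uses the approximate figures given earlier (750,000 files plus 4100 directories), and the names ENTRIES, RUNS and per_entry_ms are made up for illustration.

```python
# Approximate number of entries visited (files + directories), from the
# figures quoted in this message. Both counts are approximate.
ENTRIES = 750_000 + 4_100

# (first run, second run) wall-clock seconds for each configuration.
RUNS = {
    "local SAS drive": (90, 7),
    "RAID under gluster": (74, 4),
    "GlusterFS native client": (4678, 4648),
}


def per_entry_ms(seconds, entries=ENTRIES):
    """Average milliseconds spent per directory entry."""
    return seconds * 1000.0 / entries


if __name__ == "__main__":
    for name, (first, second) in RUNS.items():
        print(f"{name}: {per_entry_ms(first):.3f} ms/entry first run, "
              f"{per_entry_ms(second):.3f} ms/entry second run")
```

On these numbers the GlusterFS runs come to roughly 6 ms per entry on both passes, while the two local configurations are near 0.1 ms on the first pass and lower still on the second.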

Any suggestions about why my gluster installation is so much slower
than the regular file systems and how I can speed things up?

Here is pseudo-code for the traversal:
   push the root onto a stack
   while stack not empty
     curdir = pop the stack
     foreach diritem in curdir (ignoring . and ..),
       stat diritem
       if diritem is a directory,
         push diritem onto the stack
       put diritem and its size into output
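
The same loop can be written as a short runnable Python sketch. This is not the actual program, only an approximation of the pseudo-code, under two assumptions: every entry (files and directories alike) goes into the output, and symbolic links are not followed. The function name traverse is made up for illustration.

```python
import os


def traverse(root):
    """Return a list of (path, size) for every entry below root."""
    output = []
    stack = [root]                      # push the root onto a stack
    while stack:                        # while stack not empty
        curdir = stack.pop()
        # os.scandir never yields "." or "..", so no explicit filter is needed.
        with os.scandir(curdir) as entries:
            for entry in entries:
                # Stat the entry itself; follow_symlinks=False means lstat.
                info = entry.stat(follow_symlinks=False)
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                output.append((entry.path, info.st_size))
    return output


if __name__ == "__main__":
    import sys
    import time

    start = time.time()
    result = traverse(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"{len(result)} entries in {time.time() - start:.1f} seconds")
```

The point of the sketch is that the loop issues one stat per directory entry, which is where a network file system is most likely to differ from a local disk.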


.. Lana (lana.deere at gmail.com)
