[Gluster-users] Performance for operations like find

Carl Boberg
Fri Mar 2 13:17:19 UTC 2012

There are about 4000 files in the dir.

I ran it again after clearing the caches, and now it took over 3 minutes on
the Gluster mount and about 4 seconds on the NFS share (this is not Gluster
NFS but an old, classic NFS share).
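
For reference, a cold-cache run of this kind can be sketched roughly as
below. This is only an illustration: cold_find is a made-up helper name, the
directory is passed as an argument, and it only drops caches on the host
where it runs (Brian's advice was to do that on both client and server).

```shell
#!/bin/bash
# Sketch only: cold_find is a hypothetical helper, not part of Gluster.
cold_find() {
    local dir=$1
    # Flush dirty pages, then drop page cache, dentries and inodes.
    # Writing to drop_caches needs root; without it the run is not cold.
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    fi
    # -mtime -2 forces a stat() of every entry; print how many files match.
    find "$dir" -type f -mtime -2 | wc -l
}

# Example (the path is a placeholder for the real mount point):
#   time cold_find /mnt/nfs/datadir
```

Running it once against each mount, with the caches dropped on the servers
as well, should give comparable numbers.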

All clients are CentOS 5.6 and the 2 servers are CentOS 6.2.
Running Gluster 3.2.5 (RPM install) with the replicate setup from the docs.

If the self-heal operation is the cause of the slowdown, is there a way to
avoid triggering it? Or better yet, are there any custom options to add to
the config to make this kind of find command go a bit quicker?

Our application reads and writes files to the volume, but we also have a
section for admins in the application that uses find and grep to locate
specific files by date or content. This tool is vital for problem
solving, and if such operations take this much longer it's just
not usable...


Carl Boberg

Memnon Networks AB
Tegnérgatan 34, SE-113 59 Stockholm

Mobile: +46(0)70 467 27 12

On Fri, Mar 2, 2012 at 11:58, Brian Candler <B.Candler at pobox.com> wrote:

> On Fri, Mar 02, 2012 at 11:43:27AM +0100, Carl Boberg wrote:
> >    time find /mnt/nfs/<datadir> -type f -mtime -2
> >
> >    real 2m0.067s <--
> >    user 0m0.030s
> >    sys 0m0.252s
> The -mtime -2 is forcing gluster to do a stat() on every file, and this
> makes gluster do a self-heal operation where it needs to access the file on
> both volumes:
> http://www.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate
> http://www.youtube.com/watch?v=AsgtE7Ph2_k
> Having said that, 2 minutes seems pretty slow. How many files are there in
> total, i.e. without the -mtime filter?
> Is it possible the NFS test had the inode data in cache, so was an unfair
> comparison?  I suggest you do
>    echo 3 >/proc/sys/vm/drop_caches
> (as root) on both client and server before each test.
> Regards,
> Brian.
