[Gluster-users] Slow performance with "ls"

Tue Jul 5 10:11:58 UTC 2016

Hi everyone,

I have an issue with slow performance whilst running an "ls" command on a
Gluster filesystem (client). When running "time ls" it shows that it takes
between 20 and 60 seconds, and then returns approximately 1700 directories
(this mount point is to be used for /home).

I can speed this up in a number of ways, each of which returns in about 0.1
seconds (much more acceptable, and on par with an NFS mount):

  1. Running "time strace ls"
  2. Running "time ls | wc -l"
  3. Turning off the second Gluster instance (we run a replicated setup),
     leaving only the primary server on. "time ls" itself is then very quick.

I have fired up Wireshark and tried to make sense of what is happening
when I do an "ls". It seems that LOOKUP packets are sent between the 2
Gluster servers when performing it. Presumably this is to make sure that
the Gluster file system is in sync, rather than giving me an out-of-date
result. To us, it is more important that the result is delivered in an
acceptable time than that it is 100% correct (i.e. there shouldn't be a
problem with a file or two being out of sync for a few minutes whilst the
volume heals). I am not sure why this check does not seem to run when I
try 1. and 2. above.
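
To narrow down whether the delay comes from reading the directory itself
or from per-entry metadata lookups, a small test along the following lines
might help. This is only a sketch: it assumes Python 3 is available on the
client, the function names are made up for illustration, and I have not
confirmed that per-entry lookups are what makes the plain "ls" slow.

```python
import os
import time


def time_listing(path):
    """Return (entry_count, seconds) for reading directory names only."""
    start = time.perf_counter()
    names = os.listdir(path)  # directory read only, no per-entry stat
    return len(names), time.perf_counter() - start


def time_listing_with_stat(path):
    """Return (entry_count, seconds) for reading names and lstat-ing each."""
    start = time.perf_counter()
    count = 0
    for name in os.listdir(path):
        try:
            # One metadata request per directory entry.
            os.lstat(os.path.join(path, name))
        except OSError:
            # Entry vanished or is unreadable; still count it as listed.
            pass
        count += 1
    return count, time.perf_counter() - start
```

Calling both functions on the Gluster mount point (for example
time_listing("/home") and time_listing_with_stat("/home")) and comparing
the two timings would show whether the extra seconds appear only when each
entry is stat-ed.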

Ideally, I would like to know whether there is a configuration option
available to tweak this behaviour. I have had a look through the manual at
the list of available configuration options, but cannot find anything
related. I have tried Googling, but unfortunately I cannot find any
information there either. I've been on the IRC channel a couple of times,
and a few other people have experienced similar issues, but no answer has
been found as yet. If there is any way of speeding this up, I would love
to know!

A bit of background:

* We currently have a test setup with 2 replicated Gluster hosts, split
between 2 data centres (1 Gluster host per data centre), with a 400Mbps
leased line. The line is dedicated to the organisation, but it is shared
with other services we run.
* We are running Gluster v3.7.8 on the clients and servers.
* The clients are running various versions of RedHat; 5.10, 6.4 and 7.1
are the main ones. All experience this issue.
* This issue occurs on clients within the same data centre as their primary
Gluster server, and also on a client on the Gluster server itself.
* I have fiddled with various configuration options, but as none of them
made any difference, I believe I have reverted each one back to its
default value.
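
To double-check that nothing is still set to a non-default value, the
"Options Reconfigured:" section of the "gluster volume info" output could
be pulled out with something like the sketch below. The sample text, the
volume name and the brick paths are invented purely for illustration and
are not taken from our setup; the function name is hypothetical too.

```python
# Hypothetical sample of "gluster volume info" output (not from our volume).
SAMPLE_VOLUME_INFO = """\
Volume Name: examplevol
Type: Replicate
Status: Started
Bricks:
Brick1: server1:/bricks/examplevol
Brick2: server2:/bricks/examplevol
Options Reconfigured:
performance.readdir-ahead: on
performance.stat-prefetch: off
"""


def reconfigured_options(volume_info_text):
    """Return the options listed under 'Options Reconfigured:' as a dict."""
    options = {}
    in_section = False
    for line in volume_info_text.splitlines():
        line = line.strip()
        if line == "Options Reconfigured:":
            in_section = True
            continue
        if in_section:
            if not line:
                break  # a blank line ends this volume's block
            key, sep, value = line.partition(":")
            if sep:
                options[key.strip()] = value.strip()
    return options


print(reconfigured_options(SAMPLE_VOLUME_INFO))
```

Anything that shows up in the returned dictionary has been set explicitly
on the volume, so it is a candidate for resetting before testing again.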

Any help would be greatly appreciated.

Thanks,

Craig.