[Gluster-users] Gluster Running Incredibly Slow

Tue Oct 25 16:16:27 UTC 2016

This is my first attempt at running Gluster, and so far it's not going
well. I've got a cluster of 150 machines (this is in a university
environment) that were previously all mounted to an NFS share on the
cluster's head node. To make the cluster more expandable, and theoretically
increase file I/O speeds, I decided to switch over to a distributed file
system. I configured it with three storage nodes, one brick per node,
running a Gluster volume in dispersed mode. At first it seemed to be running
fine, but when I tested it with simultaneous reads/writes it got really slow.
If I run 'kash sleep 15', all 150 nodes will sleep for 15 seconds. If I
create a file called runSleep that does the same thing and then try to
execute that file on all 150 nodes simultaneously, it will take 3-4 minutes
to complete!
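
For anyone who wants to reproduce the timing from a single client, here is a
minimal sketch of the kind of measurement described above. It is not the
'kash' tool I used; the function name time_parallel_runs is made up for
illustration, and it only starts several copies of a command at once on one
machine and reports how long it takes for all of them to finish.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def time_parallel_runs(cmd, copies):
    """Start `copies` instances of `cmd` at once; return seconds until all exit."""

    def run_once(_):
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=copies) as pool:
        # list() waits for every run to finish and re-raises any failure.
        list(pool.map(run_once, range(copies)))
    return time.perf_counter() - start
```

Calling time_parallel_runs(["/mnt/gluster/runSleep"], 8) (the mount path is
hypothetical) and comparing the result against the same script on local disk
should show whether the extra time comes from the volume rather than from the
script itself.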

Here are a few things I've done to try to narrow the problem down:

   1. I unmounted the Gluster volume and re-mounted it as NFS instead of
   using fuse. Same results as before.
   2. I deleted the Gluster volume, created an NFS share on one of the
   storage nodes, then mounted that share on all of the compute nodes. This
   ran just fine with no noticeable delay at all.
   3. I created a new Gluster volume that's distributed, but only uses one
   brick. This ran just as slowly as my original case.

Running 'gluster volume profile' on the storage node from test 3, I noticed
that the latency seems really high for the last three operations listed:
OPEN, LOOKUP, and FSYNC. FSYNC shows an average latency of 708230.64 us, or
roughly 0.7 seconds.
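
As a cross-check on the FSYNC number, here is a rough sketch of how fsync
latency could be measured directly from a client. The function name
mean_fsync_latency_us and the probe file name are assumptions for
illustration, not anything Gluster provides, and a client-side figure will
include network and FUSE overhead that the brick-side profile does not.

```python
import os
import time


def mean_fsync_latency_us(directory, rounds=20, size=4096):
    """Write `size` bytes and fsync, `rounds` times; return mean fsync time in us."""
    path = os.path.join(directory, "fsync_probe.tmp")  # hypothetical probe file
    payload = b"\0" * size
    total = 0.0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # only the fsync call itself is timed
            total += time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return total / rounds * 1e6


# The average reported by the profile, converted from microseconds to seconds.
reported_us = 708230.64
reported_s = reported_us / 1e6  # about 0.71 s
```

Running mean_fsync_latency_us on the Gluster mount point and again on a local
directory gives two numbers in the same unit as the profile output, which
makes the comparison straightforward.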

The nodes are all running Ubuntu 16.04.

Any suggestions?

Thanks,
Kevin