[Gluster-users] Disk utilisation

Tue May 15 20:12:10 UTC 2012

Hi,

we are using Gluster to make http file downloads available. We currently
have 2 gluster servers serving a replicated volume. Each gluster server has
22 disks in a hardware raid, the underlying file system is XFS. The average
file size is around 3-4MB. There are stored around 16TB of data on the
volume.

Once we start sending live http traffic towards the infrastructure we see a
horrible performance. For instance if the outgoing bandwidth on each of the
gluster servers is at ~130mbit/s our hardware raid has a busy rate of ~30%.
Once we increase the traffic towards 250mbit/s the busy rate doubles to
60%. With this the iowait values also increase.

We started to play with the read buffers on the http servers. There is no
difference between loading the whole file into memory at once and loading
the file in 64k chunks. This makes me believe that the gluster server loads
the file with its own buffers and the clients buffer has no influence. We
have also enabled profiling on the gluster volume: There are roughly 18
read() calls for each open() call which should be an indication for too
small buffers.

We have also made the mistake to store all files in a single directory but
XFS advertises that it can handle millions of files in a single directory
so it shouldn't be a problem or should it?

I would appreciate any ideas for the cause of this performance issue.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20120515/3bcef2dd/attachment.html>