[Gluster-users] gluster becomes too slow, need frequent stop-start or reboot

Thu Jun 14 17:18:36 UTC 2018

Our gluster keeps getting into a state where it becomes painfully slow and
many of our applications time out on read/write calls. When this happens, a
simple ls of the top-level directory from the mount takes somewhere between
8 and 25s (normally it is very fast, at most 1-2s). The top-level directory
only has about 10 folders.
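
For anyone who wants to compare numbers, here is a rough sketch of how the
listing time could be sampled from a client. It only uses the Python
standard library. Note that os.listdir does a plain directory read, so it
may come out faster than an ls that also stats every entry. The function
name and the defaults are just illustrative.

```python
import os
import time


def time_listing(path, samples=5, pause=1.0):
    """Return the wall-clock seconds taken by each os.listdir(path) call."""
    timings = []
    for i in range(samples):
        start = time.perf_counter()
        os.listdir(path)  # directory read only, no per-entry stat
        timings.append(time.perf_counter() - start)
        if i < samples - 1:
            time.sleep(pause)  # space the samples out a little
    return timings
```

Calling time_listing("/mnt/gluster") (the path is a placeholder for the
real mount point) during a slow period and again after a restart would
give comparable figures.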

The two methods we have found to mitigate this problem are 1) restarting
all GFS servers or 2) stopping and starting the volume. With 2) it takes
somewhere between half an hour and an hour for gluster to get back to its
usual performance.

So far the logs don't show anything unusual, but perhaps I don't know what I
should be looking for in them. Even when gluster is fully functional we see
lots of log messages, so it is hard to tell which errors are harmless and
which are not.
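
One way to cut through the noise might be to tally warnings and errors by
the code location that emitted them, and compare a slow period against a
healthy one. The sketch below assumes log lines shaped like
"[timestamp] LEVEL [MSGID: n] [file.c:line:function] message", with the
MSGID part optional. That shape is an assumption on my side and the regex
may need adjusting for other log files.

```python
import re
from collections import Counter

# Assumed line shape (not verified against every gluster log file):
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*\d+\]\s+)?"
    r"\[(?P<src>[^\]]+)\]"
)


def tally(lines, levels=("W", "E", "C")):
    """Count matching log lines per (level, source location)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") in levels:
            counts[(m.group("level"), m.group("src"))] += 1
    return counts
```

Running tally over the lines of a log taken while things are slow, and
again over one taken while things are fine, then diffing the two counters,
should show which messages only appear (or spike) during the slowdown.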

This issue does not seem to happen with our 3-replica glusters, only with
2-replica-1-arbiter and 2-replica. However, our 3-replica glusters are only
30% full while the 2-replica ones are about 80% full.

We're running 3.12.9 on the servers. The clients are 3.8.15, but we see
the same slowness on 3.12.9 clients as well.

Configuration: 12 GFS servers, one brick per server, replica 2, 80T per
brick. We used to have arbiters but thought the arbiters were causing the
slowdown, so we took them out. Apparently that was not the case.