[Gluster-users] Gluster operations speed limit

Fri Aug 4 12:02:44 UTC 2017

On Tue, Aug 1, 2017, at 06:16 AM, Alexey Zakurin wrote:
> I have a large distributed-replicated GlusterFS volume that contains a
> few hundred VM images. The servers are connected by a 20 Gb/s link.
> When I start operations such as healing or removing a brick, storage
> performance drops sharply for a few days and the server load looks like
> this:
> 
> 13:06:32 up 13 days, 20:02,  3 users,  load average: 43.62, 31.75, 23.53
> 
> Is it possible to limit these operations? The VMs on my cluster go
> offline when I start a heal, a rebalance, or a brick removal.

In addition to the cgroups workaround that Mohit mentions, there are two
longer-term efforts in progress (that I'm aware of) to address this and
similar issues.

(1) Some folks at Red Hat are working on limiting the number of files
that the self-heal daemon (SHD) will heal at one time
(https://github.com/gluster/glusterfs/issues/255).

(2) At Facebook, we're working on a more general solution to apportion
I/O among any users of a system, where "users" might be real users or
internal pseudo-users such as self heal or rebalance
(https://github.com/gluster/glusterfs/issues/266).
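
To make the two ideas above concrete, here is a rough Python sketch. It
is not GlusterFS code and does not reflect how either issue is actually
being implemented. Every name in it (heal_all, apportion,
MAX_CONCURRENT_HEALS, the weights) is hypothetical and chosen only for
illustration. The first function caps how many heals run at once; the
second splits an I/O budget among "users" in proportion to a weight.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limit for illustration; not a real Gluster option.
MAX_CONCURRENT_HEALS = 2


def heal_all(files, heal_one, limit=MAX_CONCURRENT_HEALS):
    """Run heal_one(path) for every file, at most `limit` at a time.

    Returns the highest number of heals observed running concurrently.
    """
    gate = threading.BoundedSemaphore(limit)
    lock = threading.Lock()
    active = 0
    peak = 0

    def worker(path):
        nonlocal active, peak
        with gate:  # blocks while `limit` heals are already running
            with lock:
                active += 1
                peak = max(peak, active)
            try:
                heal_one(path)
            finally:
                with lock:
                    active -= 1

    # More worker threads than the limit, so the semaphore is what throttles.
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(worker, files))
    return peak


def apportion(total_iops, weights):
    """Split total_iops among users in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    return {user: total_iops * w / total_weight for user, w in weights.items()}
```

For example, apportion(1000, {"clients": 8, "self-heal": 1,
"rebalance": 1}) would leave client I/O 80% of the budget while self
heal and rebalance get 10% each. A real scheduler would also have to
enforce those shares over time, which this sketch does not attempt.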

Either or both of these might land in 4.0; we're still planning that
release, so no definite answer yet.