[Gluster-users] Replication logic
Zenon Panoussis
oracle at provocation.net
Sun Jan 3 03:48:35 UTC 2021
>> Just take the slow brick offline during the initial sync
>> and then bring it online. The heal will go in background,
>> while the volume stays operational.
> Yes, but the heal will then take three weeks.
I meant this as an obvious exaggeration, but it seems it was
not.
I removed the arbiter and created a new full brick. This
resulted in two bricks in sync with each other, holding just
over 100,000 small (average 10 KiB) files, and one empty
brick. The healing process started populating the empty
brick at a really slow rate of something like two to five
files per minute.
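
As a back-of-the-envelope check, here is what that rate implies
for the full heal. This is only a sketch: it assumes the rate
stays constant, rounds the file count down to 100,000, and the
function name heal_days is made up for illustration.

```python
# Rough heal-time estimate from the figures quoted above.
# Assumption: the heal rate stays constant for the whole run.
TOTAL_FILES = 100_000        # "just over 100,000" files to heal
MINUTES_PER_DAY = 60 * 24


def heal_days(total_files: int, files_per_minute: float) -> float:
    """Days needed to heal total_files at a constant rate."""
    return total_files / files_per_minute / MINUTES_PER_DAY


slow = heal_days(TOTAL_FILES, 2)   # low end of the observed rate
fast = heal_days(TOTAL_FILES, 5)   # high end of the observed rate
print(f"{fast:.1f} to {slow:.1f} days")
```

At two to five files per minute that comes to roughly two to
five weeks, so "three weeks" sits right inside the range.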
I would have expected one of three things to become saturated
on at least one of the participating machines: the network,
disk I/O, or CPU. But far from it, nothing is even close to
saturated. On all three machines the CPUs (top) are running
almost idle, disk I/O (iotop) is negligible and network traffic
is on the order of 100 Kbps. It looks like 'nice -n 700 glusterd',
on a nice scale from 1 to 19.
Any ideas where I should look for the bottleneck? I can't find
anything even remotely relevant in any of the logs.
Z