[Gluster-users] Gluster 3.12.12: performance during heal and in general

Hu Bert revirii at googlemail.com
Thu Aug 16 09:57:28 UTC 2018


Hi,

well, as the situation isn't getting better and we're quite helpless and
mostly in the dark, we're thinking about hiring some professional
support. Any hints? :-)


2018-08-15 11:07 GMT+02:00 Hu Bert <revirii at googlemail.com>:
> Hello again :-)
>
> The self heal must have finished, as there are no more log entries in
> the glustershd.log files. According to munin, disk latency (average
> io wait) has gone down to 100 ms and disk utilization has gone down
> to ~60%, both on all servers and all hard disks.
>
> But now system load on 2 servers (which were in the good state)
> fluctuates between 60 and 100; the server with the formerly failed
> disk has a load of 20-30. I've uploaded some munin graphs of the CPU
> usage:
>
> https://abload.de/img/gluster11_cpu31d3a.png
> https://abload.de/img/gluster12_cpu8sem7.png
> https://abload.de/img/gluster13_cpud7eni.png
>
> This can't be normal: 2 of the servers are under heavy load and one is
> not. Does anyone have an explanation for this strange behaviour?
>
>
> Thx :-)
>
> 2018-08-14 9:37 GMT+02:00 Hu Bert <revirii at googlemail.com>:
>> Hi there,
>>
>> well, it seems the heal has finally finished. Couldn't see/find any
>> related log message; is there such a message in a specific log file?
>>
>> But I see the same behaviour as when the last heal finished: all CPU
>> cores are consumed by brick processes; not only by the formerly failed
>> bricksdd1, but by all 4 brick processes (and their threads). Load goes
>> up to > 100 on the 2 servers with the not-failed brick, and
>> glustershd.log gets filled with a lot of entries. Load on the server
>> with the then-failed brick is not that high, but still ~60.
>>
>> Is this behaviour normal? Is there some post-heal step after a heal has finished?
>>
>> thx in advance :-)

