[Gluster-users] Fwd: disperse heal speed up

Serkan Çoban cobanserkan at gmail.com
Fri Apr 15 10:28:56 UTC 2016


The 100TB is newly created files, written while the brick was down. I
rethought the situation and realized that I reformatted all the bricks in
case 1, so the write speed limit is 26*100MB/disk.
In case 2 I reformatted just one brick, so the write speed is limited to
100MB/disk. I will repeat the tests using one brick in both cases,
once with a reformat and once with just killing the brick process.
Thanks for the reply.
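
The two write ceilings can be checked with a few lines of Python. This is only a rough sketch: the 26 disks per node and the 100MB/s per-disk rate are the figures quoted in this thread, the function name is made up for illustration, and the model assumes each brick being healed is limited purely by the write rate of its own disk.

```python
# Back-of-the-envelope model of the per-node heal write ceiling.
# Assumption: one brick per disk, each healing brick capped by its disk.
DISKS_PER_NODE = 26      # from the thread: 26x8TB disks per node
DISK_WRITE_MBPS = 100    # from the thread: roughly 100MB/s per disk


def heal_write_ceiling(bricks_healing, per_disk_mbps=DISK_WRITE_MBPS):
    """Upper bound on heal write throughput (MB/s) for one node (hypothetical helper)."""
    return bricks_healing * per_disk_mbps


# Case 1: every brick on the node was reformatted, so all disks absorb heal writes.
case1_ceiling = heal_write_ceiling(DISKS_PER_NODE)
# Case 2: only one brick was reformatted, so a single disk absorbs heal writes.
case2_ceiling = heal_write_ceiling(1)

print(f"case 1 ceiling: {case1_ceiling} MB/s per node")
print(f"case 2 ceiling: {case2_ceiling} MB/s per node")
print(f"ratio: {case1_ceiling // case2_ceiling}x")
```

Under these assumptions the ceilings differ by a factor of 26, which is why the two tests are not directly comparable until both are repeated with a single brick.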

On Fri, Apr 15, 2016 at 9:27 AM, Xavier Hernandez <xhernandez at datalab.es> wrote:
> Hi Serkan,
>
> Sorry for the delay, I've been a bit busy lately.
>
> On 13/04/16 13:59, Serkan Çoban wrote:
>>
>> Hi Xavier,
>>
>> Can you help me with the issue below? How can I increase the disperse
>> heal speed?
>
>
> It seems weird. Are there any related messages in the logs?
>
> In this particular test, is the 100TB made up of modified files, or of files
> newly created while the brick was down?
>
> How many files have been modified?
>
>> Also, I would be grateful for any detailed documentation about disperse
>> heal: why does heal happen on a disperse volume, and how is it triggered?
>> Which nodes participate in the heal process? Is there any client
>> interaction?
>
>
> The heal process is basically the same as the one used for replicate. There
> are two ways to trigger a self-heal:
>
> * when an inconsistency is detected, the client initiates a background
> self-heal of the inode
>
> * the self-heal daemon scans the lists of modified files created by the
> index xlator when a modification is made while some node is down. All these
> files are self-healed.
>
> Xavi
>
>
>>
>> Serkan
>>
>>
>> ---------- Forwarded message ----------
>> From: Serkan Çoban <cobanserkan at gmail.com>
>> Date: Fri, Apr 8, 2016 at 5:46 PM
>> Subject: disperse heal speed up
>> To: Gluster Users <gluster-users at gluster.org>
>>
>>
>> Hi,
>>
>> I am testing the heal speed of a disperse volume and what I see is 5-10MB/s
>> per node.
>> I increased disperse.background-heals to 32 and
>> disperse.heal-wait-qlength to 256, but there is still no difference.
>> One thing I noticed is that when I kill a brick process, reformat it
>> and restart it, the heal speed is nearly 20x higher (200MB/s/node).
>>
>> But when I kill the brick, then write 100TB of data, and start the brick
>> afterwards, the heal is slow (5-10MB/s/node).
>>
>> What is the difference between the two scenarios? Why is one heal slow and
>> the other fast? How can I increase disperse heal speed? Should I
>> increase the thread count to 128 or 256? I am on a 78x(16+4) disperse volume
>> and my servers are pretty strong (2x14 cores with 512GB RAM; each node
>> has 26x8TB disks).
>>
>> Gluster version is 3.7.10.
>>
>> Thanks,
>> Serkan
>>
>
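
The two self-heal triggers described in the quoted reply can be pictured with a small toy model. This is not GlusterFS code: the class, its methods and the "pending" set are hypothetical names chosen for illustration, and the set merely stands in for the lists that the index xlator keeps while a brick is down.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVolume:
    """Toy model of the two heal triggers; all names here are hypothetical."""
    brick_up: bool = True
    pending: set = field(default_factory=set)    # stand-in for the index xlator lists
    healed: list = field(default_factory=list)   # (path, who healed it), in order

    def write(self, path):
        # A modification made while the brick is down is recorded for later healing.
        if not self.brick_up:
            self.pending.add(path)

    def client_access(self, path):
        # Trigger 1: a client detects the inconsistency and heals that one inode.
        if self.brick_up and path in self.pending:
            self._heal(path, "client")

    def daemon_crawl(self):
        # Trigger 2: the self-heal daemon walks the recorded list and heals each entry.
        if self.brick_up:
            for path in sorted(self.pending):    # sorted() copies, so discarding is safe
                self._heal(path, "daemon")

    def _heal(self, path, who):
        self.pending.discard(path)
        self.healed.append((path, who))


vol = ToyVolume()
vol.brick_up = False                 # brick process killed
for name in ("a", "b", "c"):
    vol.write(name)                  # files created while the brick is down
vol.brick_up = True                  # brick restarted
vol.client_access("b")               # a client touches one file first
vol.daemon_crawl()                   # the daemon picks up the rest
print(vol.healed)
```

In this sketch the file touched by a client is healed first, and the daemon crawl then clears whatever is still recorded as pending.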

