[Gluster-users] Heal operation detail of EC volumes

Tue May 30 07:25:46 UTC 2017

When we say client side heal or server side heal, we are basically talking about the side which "triggers" the heal of a file.

1 - server side heal - shd scans indices and triggers heal 

2 - client side heal - a fop finds that a file needs heal and triggers heal for that file.

Now, what happens when heal gets triggered?
In both cases the following functions take part -

ec_heal => ec_heal_throttle => ec_launch_heal

Now ec_launch_heal just creates heal tasks (with ec_synctask_heal_wrap, which calls ec_heal_do) and puts them into a queue.
This happens on the server, and the "syncenv" infrastructure, which is nothing but a set of workers, picks up these tasks and executes them. That is when the actual read/write for
heal happens.
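
A minimal Python sketch of the queue-and-workers pattern described above. It is not GlusterFS code: HealPool, launch_heal and _heal_do are hypothetical names that only mirror the roles of ec_launch_heal, the syncenv workers and ec_heal_do, and the real heal read/write is replaced by a placeholder that records the file name.

```python
import queue
import threading


class HealPool:
    """Toy stand-in for a heal task queue served by a fixed set of workers."""

    def __init__(self, workers=2):
        self.tasks = queue.Queue()      # pending heal tasks
        self.healed = []                # what the workers processed
        self.lock = threading.Lock()    # protects self.healed
        self.threads = [
            threading.Thread(target=self._worker) for _ in range(workers)
        ]
        for t in self.threads:
            t.start()

    def launch_heal(self, path, triggered_by):
        # Same role as ec_launch_heal in the text: only queue a task,
        # no heal I/O happens in the caller.
        self.tasks.put((path, triggered_by))

    def _heal_do(self, path, triggered_by):
        # Placeholder for the actual heal work (read, rebuild, write).
        with self.lock:
            self.healed.append((path, triggered_by))

    def _worker(self):
        while True:
            item = self.tasks.get()
            try:
                if item is None:        # sentinel: stop this worker
                    return
                self._heal_do(*item)
            finally:
                self.tasks.task_done()

    def shutdown(self):
        self.tasks.join()               # wait until queued heals are done
        for _ in self.threads:
            self.tasks.put(None)
        for t in self.threads:
            t.join()


def demo():
    pool = HealPool(workers=2)
    pool.launch_heal("file-a", "shd")       # server side trigger
    pool.launch_heal("file-b", "client")    # client side trigger
    pool.shutdown()
    return sorted(pool.healed)
```

Whichever side triggers the heal, the task goes through the same queue and is executed by the same kind of worker, so demo() returns both entries regardless of which worker picked them up.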

----- Original Message -----

From: "Serkan Çoban" <cobanserkan at gmail.com> 
To: "Ashish Pandey" <aspandey at redhat.com> 
Cc: "Gluster Users" <gluster-users at gluster.org> 
Sent: Monday, May 29, 2017 6:44:50 PM 
Subject: Re: [Gluster-users] Heal operation detail of EC volumes 

>>Healing could be triggered by client side (access of file) or server side (shd). 
>>However, in both the cases actual heal starts from "ec_heal_do" function. 
If I do a recursive getfattr operation from clients, then all heal
operations are done on the clients, right? The client reads the chunks,
calculates and writes the missing chunk.
And if I don't access files from a client, then the SHD daemons will start
the heal and read, calculate and write the missing chunks, right?

In the first case the EC calculations take place in the client fuse process; in
the second case the EC calculations will be made in the SHD process, right?
Does the brick process have any role in EC calculations?

On Mon, May 29, 2017 at 3:32 PM, Ashish Pandey <aspandey at redhat.com> wrote: 
> 
> 
> ________________________________ 
> From: "Serkan Çoban" <cobanserkan at gmail.com> 
> To: "Gluster Users" <gluster-users at gluster.org> 
> Sent: Monday, May 29, 2017 5:13:06 PM 
> Subject: [Gluster-users] Heal operation detail of EC volumes 
> 
> Hi, 
> 
> When a brick fails in EC, what is the healing read/write data path?
> Which processes do the operations?
> 
> Healing could be triggered by client side (access of file) or server side 
> (shd). 
> However, in both the cases actual heal starts from "ec_heal_do" function. 
> 
> 
> Assume a 2GB file is being healed in a 16+4 EC configuration. I was
> thinking that the SHD daemon on the failed brick's host will read 2GB from
> the network, reconstruct its 100MB chunk and write it on to the brick. Is
> this right?
> 
> You are correct about read/write.
> The only point is that the SHD daemon on one of the good bricks will pick up
> the index entry and heal it.
> The SHD daemon scans the .glusterfs/indices directory and heals the entries. If
> the brick went down while IO was going on, the index will be present on the
> killed brick also.
> However, if a brick was down and then you started writing to a file, then in
> this case the index entry would not be present on the killed brick.
> So even after the brick comes UP, the shd on that brick will not be able to
> find this index. However, the other bricks would have entries, and the shd on
> those bricks will heal it.
> 
> Note: I am assuming each brick is on a different node.
> 
> Ashish 
> 
> _______________________________________________ 
> Gluster-users mailing list 
> Gluster-users at gluster.org 
> http://lists.gluster.org/mailman/listinfo/gluster-users 
> 