[Bugs] [Bug 1593224] [Disperse] : Client side heal is not removing dirty flag for some of the files.

bugzilla at redhat.com bugzilla at redhat.com
Tue Dec 11 06:51:11 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1593224



--- Comment #3 from Ashish Pandey <aspandey at redhat.com> ---


While debugging the failure of this patch and thinking about how to incorporate
the comments given by Pranith and Xavi,
I found that there are some design constraints on implementing the idea of not
enqueuing an entry if it is already being healed.

Consider a 2+1 configuration and the following scenario:

1 - Create a volume and disable the self-heal daemon.
2 - Create a file and write some data while all the bricks are UP.
3 - Kill one brick and write some more data to the same file.
4 - Bring the brick back UP.

5 - Now, to trigger heal, we do "chmod 0666 file". This does a stat on the
file, which finds that the brick is not healthy and
triggers the heal.
6 - A synctask for the heal is then created and started. It calls
ec_heal_do, which in turn calls ec_heal_metadata and ec_heal_data.

7 - A setattr fop is also issued on the file to set the permissions.

Now, one possible sequence of steps is:

a > stat - sees the unhealthy file and triggers heal.

b > ec_heal_metadata - takes the lock, heals the metadata and the metadata part
of trusted.ec.version, and releases the lock on the file. [At this point setattr
is waiting for the lock.]

c > setattr - takes the lock and finds that the brick is still unhealthy, as
the data version is not yet healed and is mismatching. It marks the dirty flag
for the metadata version and unlocks the file.

d > ec_heal_data - takes the lock and heals the data.
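
The interleaving a-d can be replayed with a small toy model. This is only a
sketch, not the EC translator code: the Brick class, the helper names and the
starting version numbers are all made up for illustration, and the rule that a
brick is "healthy" only when both halves of its version match the newest one is
an assumption taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class Brick:
    # Hypothetical stand-ins for the two halves of trusted.ec.version
    # and the metadata half of the dirty flag.
    data_version: int
    meta_version: int
    dirty_meta: bool = False


def good_bricks(bricks):
    # Assumption: a brick is healthy only if both version halves
    # match the newest (data, metadata) pair seen on any brick.
    newest = max((b.data_version, b.meta_version) for b in bricks)
    return [b for b in bricks if (b.data_version, b.meta_version) == newest]


def heal_metadata(bricks):
    # Step b: bring the metadata version of every brick in sync.
    newest = max(b.meta_version for b in bricks)
    for b in bricks:
        b.meta_version = newest
        b.dirty_meta = False


def setattr_fop(bricks):
    # Step c: mark dirty, apply the fop on the healthy bricks only.
    good = good_bricks(bricks)
    for b in good:
        b.dirty_meta = True
        b.meta_version += 1
    # The dirty mark is cleared only if every brick took part.
    if len(good) == len(bricks):
        for b in good:
            b.dirty_meta = False


def heal_data(bricks):
    # Step d: bring the data version of every brick in sync.
    newest = max(b.data_version for b in bricks)
    for b in bricks:
        b.data_version = newest


def run_interleaving():
    # 2+1 volume; the third brick missed one data write (steps 1-4).
    bricks = [Brick(2, 1), Brick(2, 1), Brick(1, 1)]
    heal_metadata(bricks)  # b
    setattr_fop(bricks)    # c, runs between the two heal phases
    heal_data(bricks)      # d
    return bricks


if __name__ == "__main__":
    for brick in run_interleaving():
        print(brick)
```

In this model the data versions agree at the end, but the two bricks that served
the setattr keep the dirty mark and a newer metadata version than the third one.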

Now, if we allow only one fop to trigger heal, then after step d the file will
still carry the dirty flag and mismatched metadata versions.
If we instead keep the heal requests from every fop in a queue and, after every
heal, check whether heal is still needed, then we will end up triggering heal
for every fop, which defeats the purpose of the patch.
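
The trade-off between the two policies can be put as a simple counting sketch.
It is only an illustration under assumptions: process_fops and the dedupe
switch are hypothetical names, not anything in the patch, and a "processed"
request here just stands for one heal (or heal check) being run for the file.

```python
from collections import deque


def process_fops(fop_names, dedupe):
    """Return the fops whose heal request actually gets processed.

    dedupe=True  models "do not enqueue if the file is already healing".
    dedupe=False models "queue a request for every fop and re-check each".
    """
    queue = deque()
    healing = False  # set once a heal for this file has been queued
    for fop in fop_names:
        if dedupe and healing:
            continue  # request dropped, nothing re-checks the file later
        queue.append(fop)
        healing = True

    processed = []
    while queue:
        processed.append(queue.popleft())
    return processed


if __name__ == "__main__":
    fops = ["stat", "setattr", "write"]
    print(process_fops(fops, dedupe=True))
    print(process_fops(fops, dedupe=False))
```

With dedupe only the first fop leads to a heal, so whatever a later fop leaves
behind is never re-checked; without it the number of heal checks grows with the
number of fops.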

Xavi, Pranith,
Please provide your comments. Am I correct in my understanding?


