[Gluster-Maintainers] inclusion of multi-threaded self-heal feature in 3.7.12

Niels de Vos ndevos at redhat.com
Tue Apr 12 09:32:54 UTC 2016


On Tue, Apr 12, 2016 at 02:20:27PM +0530, Pranith Kumar Karampuri wrote:
> 
> 
> On 04/12/2016 12:05 PM, Niels de Vos wrote:
> >On Tue, Apr 12, 2016 at 11:50:23AM +0530, Pranith Kumar Karampuri wrote:
> >>
> >>On 04/11/2016 09:58 PM, Vijay Bellur wrote:
> >>>On 04/11/2016 08:02 AM, Niels de Vos wrote:
> >>>>On Mon, Apr 11, 2016 at 05:00:22PM +0530, Pranith Kumar Karampuri wrote:
> >>>>>hi,
> >>>>>         I am thinking of getting multi-threaded self-heal patch in 3.7.12
> >>>>>release as it is not a big code change, could you let me know if anyone has
> >>>>>any issues with this?
> >>>>Is this safe to use when there is no rpc throttling?
> >>Default number of parallel heals is still '1'. It will be 1 until we have
> >>some form of throttling of traffic, which we are targeting for 3.8.
> >>
> >>>A few more questions:
> >>>
> >>>1. What is the impact on stability due to this patch?
> >>So far we haven't found any problems with this patch. It was tested for
> >>around a month before it was sent upstream for merge.
> >>>2. What scenarios would we want this feature to be turned on? Basically an
> >>>admin guide update about these scenarios would be useful.
> >>I can send a document about this.
> >>>3. What kind of tests have been done with this feature?
> >>Most of the tests include perf comparison of VM images/sharded images which
> >>showed promising results.
> >>>4. Are there tests that we want to cover but have not been able to
> >>>complete for this feature?
> >>Hmm... not really.
> >I'm not completely confident that this should be backported to 3.7. We
> >should focus on backporting bug fixes and leave the features for next
> >releases. We have seen a few regressions in the 3.7 stable branch because
> >of aggressive backporting of features. Potential undiscovered issues are
> >something we should expect for each new feature, and backporting them
> >always comes with a risk. I really want to keep the promise to users
> >that 3.7 is a stable version.
> 
> Give me a list of things that need to be done for building confidence for
> inclusion of this patch and I will get it done.
> At the moment, as I said, it has already undergone more than a month of testing
> and I feel it is stable. As a maintainer, I also really want to keep the
> promise to the users that the 3.7 branch is stable and with good performance.

It would be a good start if you could point to the testing that was done, and
the results. I hope that there are DiSTAF test cases that we can include
too. Is there an easy way to run all the healing test cases
with the current self-heal and with multi-threaded self-heal with
different numbers of threads?

> >You have to come up with really good arguments to convince anyone to
> >include a feature in stable branches. Disabling it by default (or in this
> >case, one parallel heal) is not much of an argument for backports.
> 
> "Really good arguments" is subjective, Niels. Most of the users have been
> asking for both this feature and throttling of self-heal traffic for years.
> I feel it is stable enough to get into a branch. We are also keeping the
> defaults such that users who don't care about this won't have any surprises.
> I am happy to do anything and everything that the maintainers for 3.7.12 ask
> for this patch to be included in 3.7.12, but I want this to get in for
> 3.7.12.

My main concern is that users will enable it without understanding the
potential side effects that it can cause. Updates to the documentation,
blog posts and clear guidance in the release notes are a must. In fact,
you need these anyway for new features :-)

Thanks,
Niels