[Gluster-infra] New spurious regression

Avra Sengupta asengupt at redhat.com
Thu Nov 5 10:29:25 UTC 2015


On 11/05/2015 03:57 PM, Avra Sengupta wrote:
> On 11/05/2015 03:56 PM, Vijay Bellur wrote:
>> On Thursday 05 November 2015 12:19 PM, Avra Sengupta wrote:
>>> Hi,
>>>
>>> We investigated the logs from the regression runs that hit this
>>> failure. The findings are as follows:
>>> 1. snapshot clone failure is indeed the reason for the failure.
>>> 2. snapshot clone has failed in pre-validation with the error that the
>>> brick of snap3 is not up and running.
>>> 3. snap3 was created, and subsequently started (because of
>>> activate-on-create being enabled), long before we tried to create a
>>> clone out of it.
>>> 4. The logs for snap3's brick show no failures, so we have no reason
>>> to believe that it did not start properly in the course of the
>>> testcase.
>>> 5. That leaves us with an assumption (and it is only an assumption,
>>> since we have no logs backing it): there was some delay either in
>>> starting the brick process for snap3 or in glusterd registering that
>>> it had started, and the clone command was executed, and failed, before
>>> that had happened. That would make it a race.
>>>
>>> Some other things to consider about the particular testcase:
>>> 1. It passed (and still passes consistently) on our local systems, so
>>> the failure is not reproducible locally.
>>> 2. The patch was merged after both the linux and netbsd regressions
>>> passed (in one go).
>>> 3. The release-3.7 backport of the same patch has also passed both
>>> the linux and netbsd regressions as of now.
>>>
>>> The reason for mentioning the above three points is that this testcase
>>> has passed locally as well as on the regression setups (not just at
>>> the time of merge, but even now), which brings me back to the
>>> assumption in point #5. To get more clarity on that assumption, we
>>> need access to one of the regression setups, so that we can try to
>>> reproduce the failure in that environment and get some proof of what
>>> is really happening.
>>>
>>> Vijay,
>>>
>>> Could you please provide us with a jenkins linux slave to perform the
>>> above-mentioned validation?
>>>
>>
>> Please send out a request on gluster-infra if you have not already
>> done so, and Michael Scherer should be able to help.
>>
>> Thanks!
>> Vijay
>>
> + Adding gluster-infra and Michael
>
> Could you please provide us with a jenkins linux slave to perform the
> above-mentioned validation?
Thunderbird seems to be dropping the To: addresses from the mail. Added
gluster-infra.
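
If the race assumed in point #5 is real, the usual way to close it is to
poll until the brick is reported up, with a timeout, before issuing the
clone. Below is a minimal sketch in Python, only to illustrate the idea.
The names wait_until and make_delayed_status are hypothetical, the status
check is simulated and does not call gluster at all, and none of this is
how the testcase is currently written.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds pass.

    Returns True if the condition was met in time, False otherwise.
    The predicate is always evaluated at least once.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def make_delayed_status(polls_until_up):
    """Simulated status check (assumption, not a gluster call).

    The returned function reports "up" only after it has been polled
    more than `polls_until_up` times, mimicking a brick that takes a
    while to start or to be registered.
    """
    state = {"polls": 0}

    def brick_is_up():
        state["polls"] += 1
        return state["polls"] > polls_until_up

    return brick_is_up


if __name__ == "__main__":
    brick_is_up = make_delayed_status(polls_until_up=3)
    if wait_until(brick_is_up, timeout=5.0, interval=0.1):
        print("brick is up, safe to run the clone")
    else:
        print("timed out waiting for the brick")
```

In a real test the predicate would be replaced by whatever check reports
the snapshot brick's status. With a wait like this in front of the clone,
a slow brick start would show up as a timeout instead of a pre-validation
failure.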

