[Gluster-infra] [Gluster-devel] bug-1432542-mpx-restart-crash.t failing

Amar Tumballi atumball at redhat.com
Tue Jul 10 03:59:29 UTC 2018


On Mon, Jul 9, 2018 at 8:10 PM, Nithya Balachandran <nbalacha at redhat.com>
wrote:

> We discussed reducing the number of volumes in the maintainers'
> meeting. Should we still go ahead and do that?
>
>
>
It would still be a good exercise, IMO: reduce the test from the current
120 volumes to 50-60.
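
As a rough sanity check on those numbers, here is a minimal sketch in
Python. It assumes the memory used by the multiplexed brick process scales
linearly with the number of volumes, which nobody in this thread has
measured, and it treats the reported "more than 1GB" at 120 as a lower
bound. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope estimate only. Linear scaling with the volume count
# is an assumption, not something measured in this thread.
OBSERVED_VOLUMES = 120   # current size of the test (from the thread)
OBSERVED_RSS_MB = 1024   # "more than 1GB" was reported, so a lower bound
BUILDER_RAM_MB = 2048    # Jenkins builder RAM mentioned in the thread


def estimated_rss_mb(volumes: int) -> float:
    """Scale the observed footprint linearly to a different volume count."""
    return OBSERVED_RSS_MB * volumes / OBSERVED_VOLUMES


if __name__ == "__main__":
    for volumes in (50, 60, 120):
        rss = estimated_rss_mb(volumes)
        share = rss / BUILDER_RAM_MB
        print(f"{volumes:3d} volumes -> ~{rss:.0f} MB ({share:.0%} of builder RAM)")
```

If the linear assumption roughly holds, 50-60 volumes would leave headroom
even on a 1GB machine, but the real footprint should be measured before
relying on this.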


> On 9 July 2018 at 15:45, Xavi Hernandez <jahernan at redhat.com> wrote:
>
>> On Mon, Jul 9, 2018 at 11:14 AM Karthik Subrahmanya <ksubrahm at redhat.com>
>> wrote:
>>
>>> Hi Deepshikha,
>>>
>>> Are you looking into this failure? I can still see this happening for
>>> all the regression runs.
>>>
>>
>> I've executed the failing script on my laptop and all tests finish
>> relatively fast. What seems to take time is the final cleanup. I can see
>> 'semanage' taking some CPU during destruction of volumes. The test required
>> 350 seconds to finish successfully.
>>
>> Not sure what caused the cleanup time to increase, but I've created a bug
>> [1] to track this and a patch [2] to give more time to this test. This
>> should allow all blocked regressions to complete successfully.
>>
>> Xavi
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1599250
>> [2] https://review.gluster.org/20482
>>
>>
>>> Thanks & Regards,
>>> Karthik
>>>
>>> On Sun, Jul 8, 2018 at 7:18 AM Atin Mukherjee <amukherj at redhat.com>
>>> wrote:
>>>
>>>> https://build.gluster.org/job/regression-test-with-multiplex/794/display/redirect
>>>> has the same test failing. Is the reason for the failure different,
>>>> given that this run is on Jenkins?
>>>>
>>>> On Sat, 7 Jul 2018 at 19:12, Deepshikha Khandelwal <dkhandel at redhat.com>
>>>> wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> The issue [1] has been resolved. Softserve instances will now have
>>>>> 2GB of RAM, the same as the Jenkins builders.
>>>>>
>>>>> [1] https://github.com/gluster/softserve/issues/40
>>>>>
>>>>> Thanks,
>>>>> Deepshikha Khandelwal
>>>>>
>>>>> On Fri, Jul 6, 2018 at 6:14 PM, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
>>>>> >
>>>>> >
>>>>> > On Fri 6 Jul, 2018, 5:18 PM Deepshikha Khandelwal, <dkhandel at redhat.com>
>>>>> > wrote:
>>>>> >>
>>>>> >> Hi Poornima/Karthik,
>>>>> >>
>>>>> >> We've looked into the memory error that showed up on this softserve
>>>>> >> instance. These machine instances have 1GB of RAM, unlike the
>>>>> >> Jenkins builders, which have 2GB.
>>>>> >>
>>>>> >> We've created an issue [1] and will resolve it soon.
>>>>> >
>>>>> > Great. Thanks for the update.
>>>>> >>
>>>>> >>
>>>>> >> Sorry for the inconvenience.
>>>>> >>
>>>>> >> [1] https://github.com/gluster/softserve/issues/40
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Deepshikha Khandelwal
>>>>> >>
>>>>> >> On Fri, Jul 6, 2018 at 3:44 PM, Karthik Subrahmanya <ksubrahm at redhat.com>
>>>>> >> wrote:
>>>>> >> > Thanks Poornima for the analysis.
>>>>> >> > Can someone work on fixing this please?
>>>>> >> >
>>>>> >> > ~Karthik
>>>>> >> >
>>>>> >> > On Fri, Jul 6, 2018 at 3:17 PM Poornima Gurusiddaiah
>>>>> >> > <pgurusid at redhat.com>
>>>>> >> > wrote:
>>>>> >> >>
>>>>> >> >> The same test case is failing for my patch as well [1]. I
>>>>> >> >> requested a regression system and tried to reproduce it.
>>>>> >> >> From my analysis, the brick process (multiplexed) is consuming a
>>>>> >> >> lot of memory and is being OOM-killed. The regression machine has
>>>>> >> >> 1GB of RAM and the process is consuming more than 1GB. 1GB for 120
>>>>> >> >> bricks is acceptable, considering there are 1000 threads in that
>>>>> >> >> brick process.
>>>>> >> >> Ways to fix:
>>>>> >> >> - Increase the regression system's RAM size, OR
>>>>> >> >> - Decrease the number of volumes in the test case.
>>>>> >> >>
>>>>> >> >> What is strange is that the test sometimes passes for some
>>>>> >> >> patches. There may be a bug in memory consumption.
>>>>> >> >>
>>>>> >> >> Regards,
>>>>> >> >> Poornima
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> On Fri, Jul 6, 2018 at 2:11 PM, Karthik Subrahmanya
>>>>> >> >> <ksubrahm at redhat.com>
>>>>> >> >> wrote:
>>>>> >> >>>
>>>>> >> >>> Hi,
>>>>> >> >>>
>>>>> >> >>> $subject is failing on the CentOS regression for most of the
>>>>> >> >>> patches with a timeout error.
>>>>> >> >>>
>>>>> >> >>> 07:32:34 ================================================================================
>>>>> >> >>> 07:32:34 [07:33:05] Running tests in file ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>>>>> >> >>> 07:32:34 Timeout set is 300, default 200
>>>>> >> >>> 07:37:34 ./tests/bugs/core/bug-1432542-mpx-restart-crash.t timed out after 300 seconds
>>>>> >> >>> 07:37:34 ./tests/bugs/core/bug-1432542-mpx-restart-crash.t: bad status 124
>>>>> >> >>> 07:37:34
>>>>> >> >>> 07:37:34        *********************************
>>>>> >> >>> 07:37:34        *       REGRESSION FAILED       *
>>>>> >> >>> 07:37:34        * Retrying failed tests in case *
>>>>> >> >>> 07:37:34        * we got some spurious failures *
>>>>> >> >>> 07:37:34        *********************************
>>>>> >> >>> 07:37:34
>>>>> >> >>> 07:42:34 ./tests/bugs/core/bug-1432542-mpx-restart-crash.t timed out after 300 seconds
>>>>> >> >>> 07:42:34 End of test ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>>>>> >> >>> 07:42:34 ================================================================================
>>>>> >> >>>
>>>>> >> >>> Can anyone take a look?
>>>>> >> >>>
>>>>> >> >>> Thanks,
>>>>> >> >>> Karthik
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> _______________________________________________
>>>>> >> >>> Gluster-devel mailing list
>>>>> >> >>> Gluster-devel at gluster.org
>>>>> >> >>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>> >> >>
>>>>> >> >>
>>>>> >> >
>>>>> >> > _______________________________________________
>>>>> >> > Gluster-infra mailing list
>>>>> >> > Gluster-infra at gluster.org
>>>>> >> > https://lists.gluster.org/mailman/listinfo/gluster-infra
>>>>>
>>>> --
>>>> - Atin (atinm)
>>>>
>>
>>
>>
>
>
>



-- 
Amar Tumballi (amarts)