[Gluster-infra] [Gluster-devel] 8/10 AWS jenkins builders disconnected

Sankarshan Mukhopadhyay sankarshan.mukhopadhyay at gmail.com
Wed Mar 6 16:01:47 UTC 2019


On Wed, Mar 6, 2019 at 8:47 PM Michael Scherer <mscherer at redhat.com> wrote:
>
> On Wednesday, 6 March 2019 at 17:53 +0530, Sankarshan Mukhopadhyay
> wrote:
> > On Wed, Mar 6, 2019 at 5:38 PM Deepshikha Khandelwal
> > <dkhandel at redhat.com> wrote:
> > >
> > > Hello,
> > >
> > > Today, while debugging the failed centos7-regression builds, I saw
> > > that most of the builders did not pass the instance status check on
> > > AWS and were unreachable.
> > >
> > > Misc investigated this and found the patch[1], which seems to
> > > break the builders one after another. They all ran the
> > > regression test for this specific change before going offline.
> > > We suspect that this change results in an infinite loop of processes,
> > > as we did not see any trace of an error in the system logs.
> > >
> > > We rebooted all of those builders and they all seem to be running
> > > fine now.
> > >
> >
> > The question though is - what to do about the patch, if the patch
> > itself is the root cause? Is this assigned to anyone to look into?
>
> We also pondered whether we should protect the builders from that kind
> of issue. But since:
> - we are not sure that the hypothesis is right
> - any protection based on "limit the number of processes" would surely
> sooner or later block legitimate tests, and requires adjustment (and
> likely investigation)
>
> we chose not to follow that road for now.
>

This is a good topic though. Is there any logical way to fence off the
builders from noisy neighbors?
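
For illustration only, the "limit the number of processes" idea quoted
above could be sketched roughly as below. This is a minimal sketch, not
what the builders actually run: the helper name run_with_process_cap and
the cap value are hypothetical, and it assumes Linux, where RLIMIT_NPROC
is counted per user rather than per job. The cgroup pids controller
would be another option, not shown here.

```python
import resource
import subprocess
import sys


def run_with_process_cap(cmd, max_procs):
    """Run cmd with RLIMIT_NPROC lowered to at most max_procs (hypothetical helper)."""

    def apply_limit():
        # Runs in the child after fork() and before exec().
        _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        # Never try to raise the limit above the existing hard limit.
        cap = max_procs if hard == resource.RLIM_INFINITY else min(max_procs, hard)
        # Lower both soft and hard limits so the job cannot raise them again.
        resource.setrlimit(resource.RLIMIT_NPROC, (cap, cap))

    return subprocess.run(cmd, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # The child only reports the limit it inherited.
    child = [sys.executable, "-c",
             "import resource; print(resource.getrlimit(resource.RLIMIT_NPROC)[0])"]
    print(run_with_process_cap(child, 4096).stdout.strip())
```

As already noted in the quoted reply, any fixed cap like this would
likely need tuning so that legitimate tests are not blocked.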

