[Gluster-infra] Downtime for Jenkins

Niels de Vos ndevos at redhat.com
Sun May 17 17:55:02 UTC 2015


On Sun, May 17, 2015 at 06:06:19PM +0530, Vijay Bellur wrote:
> On 05/17/2015 02:32 PM, Vijay Bellur wrote:
> >[Adding gluster-devel]
> >
> >On 05/16/2015 11:31 PM, Niels de Vos wrote:
> >>On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
> >>>It seems that many failures of the regression tests (at least for
> >>>NetBSD) are caused by failing to reconnect to the slave. Jenkins tries
> >>>to keep a control connection open to the slaves, and reconnects when the
> >>>connection terminates.
> >>>
> >>>I do not know why the connection is disrupted, but I can see that
> >>>Jenkins is not able to resolve the hostname of the slave. For example,
> >>>from (well, you have to find the older logs, Jenkins seems to have
> >>>automatically reconnected)
> >>>http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :
> >>>
> >>>     java.io.IOException: There was a problem while connecting to
> >>>nbslave71.cloud.gluster.org:22
> >>>     ...
> >>>     Caused by: java.net.UnknownHostException:
> >>>nbslave71.cloud.gluster.org: Name or service not known
> >>>
> >>>
> >>>The error in the console log of the regression test is less helpful; it
> >>>only reports the disconnection:
> >>>
> >>>
> >>>http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console
> >>>
> >>
> >>In fact, this looks very much related to these reports:
> >>
> >>- https://issues.jenkins-ci.org/browse/JENKINS-19619 (duplicate of JENKINS-18879)
> >>- https://issues.jenkins-ci.org/browse/JENKINS-18879
> >>
> >>This problem should be fixed in Jenkins 1.524 and newer. Time to upgrade
> >>Jenkins too?
> >
> >Yes, I have started an upgrade. Please expect a downtime for Jenkins
> >during the upgrade.
> >
> >I will update once the activity is complete.
> >
> 
> Upgrade to Jenkins v1.613 is now complete and Jenkins seems to be largely
> doing fine. Several Jenkins plugins have also been updated to their
> latest versions. During the upgrade, I noticed that we were using the
> deprecated 'gerrit approve' interface to report the status of a smoke
> run. I have changed that to use 'gerrit review', and this seems to have
> addressed the problem of smoke tests not reporting status back to gerrit.
> 
> There were a few instances where Jenkins could not launch slaves through
> ssh, but it succeeded later on automatic retries. We will need to watch
> this behavior to see whether the problem persists and gets in the way of
> normal functioning.
> 
> Manu - can you please verify and report back if the NetBSD slaves work
> better with the upgraded Jenkins master?
> 
> All - please drop a note on gluster-infra if you happen to notice problems
> with Jenkins.

Thanks, Vijay!
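
In case it helps while watching the ssh launch retries: below is a minimal
sketch, in Python, of a check that could be run on the Jenkins master to see
whether a slave hostname resolves, retrying a few times before giving up. It
is not what Jenkins does internally. The function name, the retry count and
the delay are my own assumptions for illustration; only
socket.getaddrinfo() is a real library call.

```python
import socket
import time


def resolve_with_retry(hostname, attempts=3, delay=2.0,
                       resolver=socket.getaddrinfo):
    """Return the sorted addresses for hostname, or [] if every attempt fails.

    Hypothetical helper: attempts/delay are illustrative defaults, and
    'resolver' is only a parameter so the lookup can be swapped out in tests.
    """
    for attempt in range(1, attempts + 1):
        try:
            # Port 22 because the master reaches the slaves over ssh.
            infos = resolver(hostname, 22, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string for both IPv4 and IPv6.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as err:
            # The Python counterpart of java.net.UnknownHostException.
            print(f"{hostname}: attempt {attempt}/{attempts} failed: {err}")
            if attempt < attempts:
                time.sleep(delay)
    return []
```

Calling resolve_with_retry("nbslave71.cloud.gluster.org") from the master and
getting an empty list back would point at name resolution rather than at ssh
itself.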

