[Gluster-infra] [Gluster-devel] Downtime for Jenkins

Justin Clift justin at gluster.org
Mon May 18 10:05:09 UTC 2015

On 17 May 2015, at 13:36, Vijay Bellur <vbellur at redhat.com> wrote:
> On 05/17/2015 02:32 PM, Vijay Bellur wrote:
>> [Adding gluster-devel]
>> On 05/16/2015 11:31 PM, Niels de Vos wrote:
>>> On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
>>>> It seems that many failures of the regression tests (at least for
>>>> NetBSD) are caused by failing to reconnect to the slave. Jenkins tries
>>>> to keep a control connection open to the slaves, and reconnects when the
>>>> connection terminates.
>>>> I do not know why the connection is disrupted, but I can see that
>>>> Jenkins is not able to resolve the hostname of the slave. For example,
>>>> from (well, you have to find the older logs, Jenkins seems to have
>>>> automatically reconnected)
>>>> http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :
>>>>     java.io.IOException: There was a problem while connecting to
>>>> nbslave71.cloud.gluster.org:22
>>>>     ...
>>>>     Caused by: java.net.UnknownHostException:
>>>> nbslave71.cloud.gluster.org: Name or service not known
>>>> The error in the console log of the regression test is less helpful; it
>>>> only states the disconnection failure:
>>>> http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console
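The UnknownHostException above points at name resolution rather than at ssh itself. A minimal sketch, assuming Python 3 is available on the Jenkins master, of checking whether a slave hostname resolves and whether retries help. The function name resolve_with_retry and its parameters are hypothetical, made up for illustration, and are not part of Jenkins:

```python
import socket
import time


def resolve_with_retry(hostname, port=22, attempts=3, delay=1.0,
                       resolver=socket.getaddrinfo):
    """Resolve hostname, retrying on lookup failure.

    Hypothetical helper, not part of Jenkins. Returns a sorted list of
    the distinct addresses found, or re-raises the last socket.gaierror
    once all attempts have failed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string for both IPv4 and IPv6.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # The Python counterpart of "Name or service not known".
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

Calling resolve_with_retry("nbslave71.cloud.gluster.org") on the master would either return the addresses or raise socket.gaierror. If the lookup only succeeds after a retry, that would suggest flaky DNS rather than a problem on the slave.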
>>> In fact, this looks very much related to these reports:
>>> - https://issues.jenkins-ci.org/browse/JENKINS-19619 (duplicate of JENKINS-18879)
>>> - https://issues.jenkins-ci.org/browse/JENKINS-18879
>>> This problem should be fixed in Jenkins 1.524 and newer. Time to upgrade
>>> Jenkins too?
>> Yes, I have started an upgrade. Please expect a downtime for Jenkins
>> during the upgrade.
>> I will update once the activity is complete.
> The upgrade to Jenkins v1.613 is now complete and Jenkins seems to be
> largely doing fine. Several Jenkins plugins have also been updated to
> their latest versions. During the course of the upgrade, I noticed that
> we were using the deprecated 'gerrit approve' interface to report the
> status of a smoke run. I have changed that to use 'gerrit review', and
> this seems to have addressed the problem of smoke tests not reporting
> status back to gerrit.
> There were a few instances of Jenkins not being able to launch slaves
> through ssh, but it later succeeded on automatic retries. We will need to
> watch this behavior to see whether the problem persists and gets in the
> way of normal functioning.
> Manu - can you please verify and report back if the NetBSD slaves work better with the upgraded Jenkins master?
> All - please drop a note on gluster-infra if you happen to notice problems with Jenkins.

Good stuff. :)

+ Justin

GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
