[Gluster-infra] Do we have a monitoring system on our builders?

Michael Scherer mscherer at redhat.com
Mon Apr 29 08:17:37 UTC 2019


On Saturday 27 April 2019 at 22:18 +0300, Yaniv Kaul wrote:
> I'd like to see what is our status.
> Just had CI failures[1] because builder26.int.rht.gluster.org is not
> available, apparently.

We have Nagios too. The web interface is password-protected, so I can't
share access to it (I would need to create a guest account, and so far
no one has expressed interest in that).


This failure is weird, because the builder itself is up and running,
but it seems the Jenkins agent crashed. That process is not monitored
by Nagios, as I was under the impression that Jenkins was smart enough
to start it on demand (seems I was wrong), and/or to notice that it had
crashed and take the server out of rotation (seems I was wrong on that
one too).


I suspect this is related to the OpenJDK upgrade done on the 20th on
builder26. Since Jenkins does not support that OpenJDK version on the
main server, I guess it may also be unstable on the agent side :/

I disconnected and reconnected the builder, which should fix this
particular occurrence, but we definitely need to dig a bit more to see
what happened and how to prevent it from happening again.

Adding supervision of the agent should be quick (*cough* famous last
words *cough*), so let's do that as a first step.
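
As a rough idea of what that supervision could look like, here is a
minimal sketch of a Nagios-style check written in Python. It is not
what is deployed on the builders. It assumes the agent is a java
process whose command line mentions one of a few common jar names
(agent.jar, slave.jar, remoting.jar); those names, and the function
names, are assumptions for illustration and would need to match the
real launch command. It reads /proc, so it is Linux only, and it
follows the usual plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Nagios-style check: is a Jenkins agent process running on this host?"""
import os
import sys

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: the agent command line mentions one of these jar names.
# Adjust to whatever the builders actually run.
AGENT_MARKERS = ("agent.jar", "slave.jar", "remoting.jar")


def list_cmdlines(proc_root="/proc"):
    """Return the command line of every readable process (Linux only)."""
    cmdlines = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited meanwhile, or is not readable
        # Arguments are NUL-separated in /proc/<pid>/cmdline.
        text = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        if text:
            cmdlines.append(text)
    return cmdlines


def check_agent(cmdlines, markers=AGENT_MARKERS):
    """Map a list of command lines to an (exit_code, message) pair."""
    hits = [c for c in cmdlines
            if "java" in c and any(m in c for m in markers)]
    if not hits:
        return CRITICAL, "CRITICAL - no Jenkins agent process found"
    return OK, "OK - %d Jenkins agent process(es) running" % len(hits)


def main():
    """Run the check against the local process table and print the result."""
    try:
        code, message = check_agent(list_cmdlines())
    except Exception as exc:  # e.g. /proc missing on a non-Linux host
        code, message = UNKNOWN, "UNKNOWN - %s" % exc
    print(message)
    return code
```

To use it as a plugin, the script would end with sys.exit(main()) so
that Nagios (or NRPE) sees the exit code. Keeping check_agent()
separate from the /proc scan makes the decision logic easy to test
without a running agent.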

-- 
Michael Scherer
Sysadmin, Community Infrastructure


