[Cinder.glusterfs.ci] [Third-party-announce] Gerrit account xio-ise-iscsi-ci is disabled

Anita Kuno anteaya at anteaya.info
Thu Dec 17 22:18:25 UTC 2015


On 12/17/2015 05:04 PM, Hedlind, Richard wrote:
> My triage notes so far:
> 
> Jenkins service crashed on the local CI master due to an out of memory issue (seemingly caused by zuul as it had eaten up a lot of system memory at that point).
> The zuul scheduler (version 2.1.1.dev15)  could no longer communicate with Jenkins to submit the job for change 252250,13 and hit the following exception:
> 
> 2015-12-16 22:07:44,274 INFO zuul.Gerrit: Updating information for 252250,13
> 2015-12-16 22:07:45,386 ERROR zuul.Scheduler: Exception in run handler:
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 831, in run
>     while pipeline.manager.processQueue():
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1441, in processQueue
>     item, nnfi, ready_ahead)
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1413, in _processOneItem
>     self.reportItem(item)
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1498, in reportItem
>     item.reported = not self._reportItem(item)
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1552, in _reportItem
>     self.updateBuildDescriptions(item.current_build_set)
>   File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1460, in updateBuildDescriptions
>     self.sched.launcher.setBuildDescription(build, desc)
>   File "/usr/local/lib/python2.7/dist-packages/zuul/launcher/gearman.py", line 518, in setBuildDescription
>     timeout=300)
>   File "/usr/local/lib/python2.7/dist-packages/gear/__init__.py", line 1450, in submitJob
>     raise GearmanError("Unable to submit job to any connected servers")
> GearmanError: Unable to submit job to any connected servers
> 2015-12-16 22:07:45,387 INFO zuul.IndependentPipelineManager: Reporting change <Change 0x7f3f6ce87f50 252250,13>, actions: [<ActionReporter <zuul.reporter.gerrit.Reporter object at 0x7f40bf4bdb50>, {'verified': 0}>]
> 
> This caused zuul to end up in an infinite loop: trying to post the job, hitting the exception, posting a comment on the change to gerrit, and trying again.
> 
> Remediation steps identified so far: 
> 
> 1) Updated zuul to latest version 2.1.1.dev109 (completed)
> 2) Look at zuul source to see if an infinite loop can be identified after hitting above exception.
> 3) Add protection/alerting mechanism to handle Jenkins crash.
> 
> I would also like to know what the steps are to have the CI account enabled again.
> 
> Richard

Thanks Richard, I appreciate the speed with which you appeared in
channel to address this as well as the detail in your triage notes.
Thank you.

Mostly I am looking for indications in thought and deed that an operator
can be trusted with the responsibility a system represents. You are
indicating to me that you take your work seriously, you are aware of the
effect an errant system has on the rest of the developer community and
that you will work diligently to ensure your system behaves in a fashion
that respects the overall developer community. That is what I needed to
see, thank you.

Our zuul queues seem to be moving along, so hopefully the rest of the
backlog will be cleared over the weekend. Please work on some sort of
script that stops your system from commenting on infra's gerrit if it
detects some form of commenting loop, as we discussed in channel. Let's
talk again tomorrow in channel and we will keep working on this and
assessing progress. You have done quite a bit today; it is nice to see,
thank you.
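
As a starting point for that script, here is a minimal sketch of a
comment-loop guard. It is an illustration only, not part of zuul: the
class name CommentLoopGuard, its parameters, and the thresholds are all
hypothetical, and it assumes you can call a check from wherever your
system posts comments to gerrit. It counts recent comments per change
and, once a change gets too many within a time window, refuses all
further comments until an operator resets it.

```python
import time
from collections import defaultdict, deque


class CommentLoopGuard:
    """Hypothetical guard: stop commenting when one change is hit too often."""

    def __init__(self, max_comments=3, window_seconds=3600.0,
                 clock=time.monotonic):
        self.max_comments = max_comments      # allowed comments per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.clock = clock                    # injectable for testing
        self.tripped = False                  # True once a loop is suspected
        self._history = defaultdict(deque)    # change id -> comment times

    def allow(self, change_id):
        """Return True if a comment on change_id may be posted now."""
        if self.tripped:
            return False
        now = self.clock()
        stamps = self._history[change_id]
        # Forget comments that have aged out of the window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_comments:
            # Looks like a loop: block every change until a manual reset.
            self.tripped = True
            return False
        stamps.append(now)
        return True

    def reset(self):
        """Operator action after the cause of the loop has been fixed."""
        self.tripped = False
        self._history.clear()


if __name__ == "__main__":
    guard = CommentLoopGuard(max_comments=3, window_seconds=3600)
    print([guard.allow("252250,13") for _ in range(5)])
    # [True, True, True, False, False]
```

The guard stays tripped on purpose rather than recovering on its own, so
that a person has to look at the system before it comments again. The
point where it trips would also be a natural place to send an alert.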

Glad to be able to work with you on this.

Thanks Richard,
Anita,
(anteaya)

> 
> -----Original Message-----
> From: Anita Kuno [mailto:anteaya at anteaya.info] 
> Sent: Thursday, December 17, 2015 10:32 AM
> To: Announcements for Third Party CI Operators. <third-party-announce at lists.openstack.org>
> Subject: [Third-party-announce] Gerrit account xio-ise-iscsi-ci is disabled
> 
> https://wiki.openstack.org/wiki/ThirdPartySystems/X-IO_technologies_CI
> 
> This account is disabled and the connection to Gerrit is closed.
> 
> This account was autogenerating comments to Gerrit patch 252250 to the tune of 4MB of content:
> http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2015-12-17.log.html#t2015-12-17T17:07:41
> 
> This account will remain disabled until the Zuul backlog created by this occurrence has been cleared: http://status.openstack.org/zuul/ and until I hear from the operators of this system telling me that they are willing to take responsibility for their actions and they will do so in the future.
> 
> Thank you,
> Anita.
> 
> _______________________________________________
> Third-party-announce mailing list
> Third-party-announce at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/third-party-announce
> Please attend the third party meetings: http://eavesdrop.openstack.org/#Third_Party_Meeting
> 

