[Gluster-infra] [shadow-it] Postmortem for gluster jenkins disk full outage on the 15th of August
mscherer at redhat.com
Wed Aug 15 09:32:09 UTC 2018
On Wednesday 15 August 2018 at 11:10 +0200, Michael Scherer wrote:
> Hi folks,
> So the Gluster jenkins disk was full today (because outages do not respect
> public holidays in India (Independence Day) and France (Assumption)),
> here is the post mortem for your reading pleasure.
> Date: 15/08/2018
> Service affected:
> Jenkins for Gluster (jenkins-el7.rht.gluster.org)
> No jenkins job could be triggered.
> Root cause:
> The disk filled up, mainly because we got new jobs and more patches, so
> regular growth.
> Increased the disk by 30G, and investigating whether cleanup could be
> improved. This required a reboot.
> Action items:
> - (misc) see what can be done for myrmicinae (the hypervisor where
> jenkins is running) since there is no more space.
So I looked at myrmicinae, and:
- we have only 23G free for VMs
- there is a 300G partition for the old VM of jenkins/gerrit that we
migrated last November. I kept it to be able to recover if needed, but
I guess that's no longer needed.
I will sync with Nigel to make extra sure that we can remove this
partition.
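
As a rough sketch of how the free-space situation above could be flagged
before a disk fills up: the check below compares the free space of a
filesystem against a minimum. The 30 GiB threshold and the "/" path are
assumptions for illustration only, not what runs on myrmicinae or on the
jenkins VM.

```python
import shutil

GIB = 2**30


def free_gib(path):
    """Return the free space, in GiB, of the filesystem holding `path`."""
    return shutil.disk_usage(path).free / GIB


def is_low(free_space_gib, min_free_gib=30):
    """Return True when free space is under the (hypothetical) minimum."""
    return free_space_gib < min_free_gib


if __name__ == "__main__":
    # "/" is a placeholder; point this at the volume that holds the VM
    # images or the jenkins workspace.
    free = free_gib("/")
    status = "LOW" if is_low(free) else "ok"
    print(f"{free:.1f} GiB free: {status}")
```

With the 23G reported above and a 30 GiB minimum, is_low(23) would report
the hypervisor as low on space.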
Sysadmin, Community Infrastructure and Platform, OSAS