[Gluster-infra] Move of munin inside the DC

Michael Scherer mscherer at redhat.com
Wed Sep 19 17:59:00 UTC 2018


On Saturday 15 September 2018 at 00:23 +0200, Michael Scherer wrote:
> Hi,
> 
> So today I moved (without too many problems) our external munin to the
> internal LAN, which helps us get rid of one more VM on Rackspace.
> 
> This brings:
> - monitoring of the internal servers
> - more protection (since it is no longer accessible from outside, except
> through the proxy)
> - one less server on Rackspace
> - the potential to link it with nagios for monitoring
> 
> 
> The URL is unchanged (but is now served by the proxy):
> 
> https://munin.gluster.org/
> 
> The astute reader will see holes in some graphs, and some servers
> without historical data.
> 
> The lack of historical data is because munin was in the cloud, so it
> could not reach the internal LAN. So there are new servers to graph,
> woohoo \o/
> 
> 
> The 4h gap was due to another project's website build breaking
> (docker broke, causing everything to land on 1 node, which caused an OOM
> on the openshift cluster, so no new builds), so I had to look at that.
> 
> Then the new munin package had some Perl modules missing
> ( https://bugzilla.redhat.com/show_bug.cgi?id=1628390 ), so things were
> not working, but in a subtle way (aka, I had to use the debugger).
> 
> And there were some errors in the existing munin role regarding firewall
> reloading, compatibility with Debian, FreeBSD, etc., that were found
> after deployment.
> So there are still some servers to integrate and some others to remove,
> but so far it went well.

So, I continued the work today (after 2 days off), and it turns out
that my assumption of "public ip" == "hostname of the system" that I
used for munin before is slightly broken in various subtle ways around:
- firewalls
- proxy servers

It also turns out that the munin packages were more broken than I
thought, that firewalls (on FreeBSD) were more restrictive than I
remembered, that munin detected old stuff creating empty graphs (rpc
on pleiometrosis, for example) that I had to clean up, that we had
servers using old names, breaking the said assumption, and all kinds of
fun stuff.
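
For what it's worth, the "public ip" == "hostname of the system"
assumption can be checked per host with something like the sketch below.
This is not what the munin role does, just an illustration: check_host,
the resolver argument and the example names and addresses are all made
up for this mail.

```python
import socket
from typing import Callable, List


def resolve(name: str) -> List[str]:
    """Return the addresses DNS gives for name, or an empty list on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})


def check_host(reported_name: str, reachable_ip: str,
               resolver: Callable[[str], List[str]] = resolve) -> str:
    """Compare the name a system reports with the IP the monitoring server uses.

    reported_name: hostname as configured on the system (hypothetical input)
    reachable_ip:  address the monitoring server actually connects to
    """
    addresses = resolver(reported_name)
    if not addresses:
        # e.g. a server still using an old name that is no longer in DNS
        return "not in DNS"
    if reachable_ip not in addresses:
        # e.g. a host behind a proxy or NAT, where DNS gives another address
        return "DNS points elsewhere"
    return "ok"
```

Passing a fake resolver (a function returning a fixed list of addresses)
lets one try this without touching real DNS.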

So currently, everything seems to be working except:
- some builders that are broken ATM (builder101, builder104)
- the 2 proxies (because they are using a weird firewall setup)
- the gerrit prod server (because the name is not the one in DNS)

Nigel, can the hostname of gerrit be changed without side effects on
the gerrit side? I would rather avoid coding a special case if that can
be avoided.

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS
