[Gluster-infra] Move of munin inside the DC
Michael Scherer
mscherer at redhat.com
Wed Sep 19 17:59:00 UTC 2018
On Saturday 15 September 2018 at 00:23 +0200, Michael Scherer wrote:
> Hi,
>
> so today, I moved (without too much trouble) our external munin to the
> internal lan, which helps us get rid of one more VM on rackspace.
>
> This brings:
> - monitoring of the internal servers
> - more protection (since it is no longer accessible from outside,
>   except through the proxy)
> - one fewer server on rackspace
> - the possibility of linking it with nagios for monitoring
>
>
> The url is still (but now served by the proxy):
>
> https://munin.gluster.org/
>
> The astute reader will see holes in some graphs, and some servers
> without historical data.
>
> The lack of historical data is because munin was in the cloud, and so
> not able to reach the internal lan. So there are new servers to graph,
> woohoo \o/
>
>
> The 4h gap was due to another project's website build breaking
> (docker breaking, causing everything to go to 1 node, so oom on the
> openshift cluster, so no new build), so I had to look at that.
>
> Then the new munin package had some perl module missing
> ( https://bugzilla.redhat.com/show_bug.cgi?id=1628390 ), so things
> were not working, but in a subtle way (aka, I had to use the debugger).
>
> And there were some errors in the existing munin role regarding
> firewall reloading, compat with debian, freebsd, etc, that were found
> after deployment.
>
> So there are still some servers to integrate, some others to remove,
> but so far, it went well.

So, I continued the work today (after 2 days off), and it turns out
that my assumption of "public ip" == "hostname of the system" that I
used for munin before is slightly broken in various subtle ways around:
- firewalls
- proxy servers

It also turns out that the munin packages were more broken than I
thought, that firewalls (on freebsd) were more restrictive than I
remembered, that munin detected old stuff creating empty graphs (rpc
on pleiometrosis, for example) that I had to clean, that we had
servers using old names breaking said assumption, and all kinds of
fun stuff.
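
To illustrate the kind of check that would have caught the naming
problems earlier, here is a minimal sketch (not what the munin role
actually does). It assumes a hypothetical inventory mapping each
system's reported hostname to the IP munin is expected to contact, and
flags names that do not resolve to that IP. The function names and the
example addresses are made up for illustration; the resolver is passed
in as a parameter so it can be replaced in tests.

```python
import socket


def hostname_matches_dns(reported_name, expected_ip, resolve=socket.gethostbyname):
    """Return True if reported_name resolves to expected_ip.

    A name that does not resolve at all counts as a mismatch.
    """
    try:
        return resolve(reported_name) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def find_mismatches(hosts, resolve=socket.gethostbyname):
    """Return the sorted hostnames whose DNS record differs from the expected IP.

    hosts: dict mapping reported hostname -> IP munin should contact
    (a hypothetical inventory format, assumed for this sketch).
    """
    return sorted(
        name
        for name, ip in hosts.items()
        if not hostname_matches_dns(name, ip, resolve)
    )
```

Calling find_mismatches() with the default resolver does real DNS
lookups, so the result depends on where it is run (inside or outside
the internal lan), which is exactly the subtlety described above.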

So currently, everything seems to be working except:
- some builders that are broken ATM (builder101, builder104)
- the 2 proxies (because they are using a weird firewall setup)
- the gerrit prod server (because its name is not the one in DNS)

Nigel, can the hostname of gerrit be changed without side effects on
the gerrit side? I would rather not code a special case if that can
be avoided.
--
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS