[Gluster-infra] [Gluster-devel] Reboot for Meltdown and stuff
amye at redhat.com
Sat Jan 6 22:50:13 UTC 2018
Thanks for your quick work on this, get some rest!
We can look at supercolony next week when you're back in action.
On Sat, Jan 6, 2018 at 11:48 AM, Michael Scherer <mscherer at redhat.com> wrote:
> On Saturday, January 6, 2018 at 11:44 +0100, Michael Scherer wrote:
> > On Friday, January 5, 2018 at 14:24 +0100, Michael Scherer wrote:
> > > Hi,
> > >
> > > unless you are living in a place without any internet (an igloo in
> > > Antarctica, the middle of the Gobi desert, a bunker in Switzerland,
> > > or simply the Paris underground train), you may have seen the news
> > > that this week is once again a security nightmare (also called
> > > "just a normal Wednesday" among practitioners), and that we have
> > > important kernel patches to push, which do require a reboot.
> > >
> > > See https://spectreattack.com/
> > >
> > > While I suspect our infra will not be targeted, and there are more
> > > attractive venues of attack in local computers and browsers, which
> > > are the ones regularly running random proprietary code in the form
> > > of JS, we still have to upgrade everything to be sure.
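One quick way to verify, after a reboot, that a kernel actually carries the mitigations is the sysfs interface that was added alongside the fixes; this is a generic sketch (it assumes a kernel new enough, 4.15+ or a distro backport, to expose the directory), not something from the original mail:

```shell
# Mitigation status for Meltdown/Spectre, one file per vulnerability.
# Kernels without the fixes simply lack the directory.
dir=/sys/devices/system/cpu/vulnerabilities
if [ -d "$dir" ]; then
    # Each file contains e.g. "Mitigation: PTI" or "Vulnerable"
    grep . "$dir"/* 2>/dev/null
else
    echo "no $dir: kernel predates mitigation reporting"
fi
```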
> > >
> > > Therefore, I am going to have to reboot all the infra (yes, all 83
> > > servers) tomorrow, minus the few servers I already rebooted
> > > (because they are in HA, or not customer facing).
> > >
> > > I will block Jenkins and wait for the running jobs to finish
> > > before rebooting the various servers. I will send an email
> > > tomorrow once the reboots start (i.e., when/if I wake up), and
> > > another one once things are good (or if stuff broke in a horrible
> > > fashion, as happened today).
> > >
> > > If there are any precautions to take, people have around 24h to
> > > voice their concerns.
> > The reboot is starting. I already did various backend servers; the
> > document I used for tracking the work is at
> > https://bimestriel.framapad.org/p/gluster_infra_reboot
> So almost all Linux servers got rebooted, most without issues, but
> during the day I started to show the first symptoms of a cold
> (headaches, shivering, etc.), so I had to ping Nigel to finish the
> last server (which wasn't without issues either).
> For people who do not want the gruesome details of the reboots, you
> can stop here.
> We did get some trouble with:
> - a few servers at Rackspace (mostly infra) where cloud-init reset
> the network configuration to DHCP, and DHCP was not working. I am
> finally changing that, and was in the middle of fixing it for good
> before going back to bed.
> - gerrit didn't start automatically at boot. I know we had a fix for
> that, but I am not sure why it didn't work, or whether we hadn't
> deployed it yet.
> - supercolony seems to be unable to boot the latest kernel. It went
> so badly that the emergency console wasn't working. An erroneous
> message said "disabled for your account", so I opened a Rackspace
> ticket and waited. This occurred as I started to feel unwell, so I
> didn't really search any further, or I would have:
> - seen that the console was working for other servers (hence the
> erroneous message)
> - tried harder to boot another kernel
> - searched a bit more on an internal list that said "there is some
> issue somewhere around RHEL 6". I didn't investigate more, but that's
> also what it turned out to be.
> In the end, Nigel took over the problem solving and pinged Rackspace
> harder; their support suggested booting another kernel, which he did
> (but better than I did).
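On the cloud-init side, the usual way to stop it from rewriting the network configuration on every boot is a small drop-in config. The filename below is only an assumed example (any *.cfg file under /etc/cloud/cloud.cfg.d/ is read); this is a sketch of the documented knob, not necessarily what was deployed on our servers:

```yaml
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
# (hypothetical filename) -- tell cloud-init to leave the
# existing network configuration alone instead of regenerating it
network:
  config: disabled
```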
> And thus supercolony is back, but not upgraded.
> The last one still puzzles me, because the current configuration is
> "default=2", so that should start the 3rd kernel in the list.
> The GRUB doc says "The first entry (here, counting starts with number
> zero, not one!) will be the default choice"; it was "0" when I first
> tried to boot another kernel (I switched it to 1).
> So since we have:
> [root at supercolony ~]# grep title /boot/grub/menu.lst
> title Red Hat Enterprise Linux Server (2.6.32-696.18.7.el6.x86_64)
> title Red Hat Enterprise Linux Server (2.6.32-696.16.1.el6.x86_64)
> title Red Hat Enterprise Linux Server (2.6.32-642.15.1.el6.x86_64)
> default=1 should have used 2.6.32-696.16.1, but it didn't boot.
> Nigel changed it to "default=2", which should have used
> 2.6.32-642.15.1, but plot twist...
> # uname -a
> Linux supercolony.gluster.org 2.6.32-696.16.1.el6.x86_64 #1 SMP Sun Oct
> 8 09:45:56 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
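For what it's worth, grub-legacy's zero-based counting can be replayed against the entries above on a scratch copy of menu.lst; this is just a sketch of the documented behaviour, not a diagnosis of what supercolony actually did:

```shell
# Recreate the menu.lst entries from the mail in a scratch file
cat > /tmp/menu.lst <<'EOF'
default=1
title Red Hat Enterprise Linux Server (2.6.32-696.18.7.el6.x86_64)
title Red Hat Enterprise Linux Server (2.6.32-696.16.1.el6.x86_64)
title Red Hat Enterprise Linux Server (2.6.32-642.15.1.el6.x86_64)
EOF

# grub-legacy counts entries from zero, so default=N selects the
# (N+1)-th "title" line in file order
n=$(sed -n 's/^default=//p' /tmp/menu.lst)
sed -n 's/^title //p' /tmp/menu.lst | sed -n "$((n + 1))p"
# with default=1, this prints the second entry (the 696.16.1 kernel)
```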
> So there is something fishy with grub, but as I write this from my
> bed, maybe the problem is on my side. I am sure it will become
> clearer once I hit "send".
> So, to recap: we have one or two servers left to upgrade (cf. the
> pad), and the *BSD machines are not patched yet (I quickly checked
> their lists, but I do not expect patches soon); since the more urgent
> issues were on the hypervisor side, we are OK on that front.
> The grub setup on supercolony needs to be investigated, and
> supercolony should be upgraded as well.
> I also need to take some rest.
> Many thanks to Nigel for taking over when my body failed me.
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> Gluster-infra mailing list
> Gluster-infra at gluster.org
Amye Scavarda | amye at redhat.com | Gluster Community Lead