<div dir="ltr">Update: All the nodes that had problems with geo-rep are now fixed. We're waiting on the patch to be merged before we switch over to CentOS 7. If things go well, we'll replace nodes one by one as soon as we have one green on CentOS 7.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 22, 2018 at 12:21 PM, Nigel Babu <span dir="ltr">&lt;<a href="mailto:nigelb@redhat.com" target="_blank">nigelb@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hello folks,</div><div><br></div><div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><span>As you may have noticed, we've had a lot of centos6-regression failures lately. The geo-replication failures are new, and they particularly concern me. These failures have nothing to do with the tests themselves. The tests are exposing a problem in our infrastructure that we've carried around for a long time. Our machines were not built clean through automation. We set up automation on machines that already existed. At some point, we loaned machines for debugging. </span></span><span class="m_5000121705177514626gmail-veryhardreadability"><span><span>During this time, developers have </span></span></span><span class="m_5000121705177514626gmail-adverb"><span><span>inadvertently</span></span></span><span class="m_5000121705177514626gmail-veryhardreadability"><span><span> run 'make install' on these machines, installing onto system paths rather than into /build/install</span></span></span><span><span>. This is what is causing the geo-replication tests to fail. 
I've tried cleaning the machines up several times with little to no success.</span></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><br></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><span>Last week, we decided to take an aggressive path to fix this problem. We planned to replace all our problematic nodes with new CentOS 7 nodes. This exposed more problems. Our automation expected a specific type of machine from Rackspace, which is no longer offered, so it fails on some steps. </span></span><span class="m_5000121705177514626gmail-hardreadability"><span><span>I've spent this weekend tweaking our automation so that it works on the new Rackspace machines and I'm down to </span></span></span><span class="m_5000121705177514626gmail-qualifier"><span><span>just</span></span></span><span class="m_5000121705177514626gmail-hardreadability"><span><span> one test failure[1]</span></span></span><span><span>. I have a patch up to fix this failure[2]. As soon as that patch </span></span><span class="m_5000121705177514626gmail-passivevoice"><span><span>is merged</span></span></span><span><span>, we can push forward with CentOS 7 nodes. In 4.0, we're dropping support for CentOS 6, so it makes sense to make this move sooner rather than later.</span></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><br></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><span>We'll no longer lend out machines from production. 
Instead, we'll create new nodes that are snapshots of an existing production node. Each such machine will </span></span><span class="m_5000121705177514626gmail-passivevoice"><span><span>be destroyed</span></span></span><span><span> after use. This helps prevent this particular problem in the future. It also means that our machine capacity stays at 100% at all times, with minimal wastage.</span></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><br></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><span>[1]: <a href="https://build.gluster.org/job/cage-test/184/consoleText" target="_blank">https://build.gluster.org/job/<wbr>cage-test/184/consoleText</a></span></span></div></div><div class="m_5000121705177514626gmail-"><div class="m_5000121705177514626gmail-public-DraftStyleDefault-block m_5000121705177514626gmail-public-DraftStyleDefault-ltr"><span><span>[2]: <a href="https://review.gluster.org/#/c/19262/" target="_blank">https://review.gluster.org/#/<wbr>c/19262/</a></span></span></div></div></div><span class="HOEnZb"><font color="#888888"><div><div><br>-- <br><div class="m_5000121705177514626gmail-m_-1559655015143039186gmail_signature"><div dir="ltr">nigelb<br></div></div>
</div></div></font></span></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">nigelb<br></div></div>
</div>