[Gluster-devel] [Gluster-infra] Infra-related Regression Failures and What We're Doing

Kotresh Hiremath Ravishankar khiremat at redhat.com
Mon Jan 22 07:04:25 UTC 2018


On Mon, Jan 22, 2018 at 12:21 PM, Nigel Babu <nigelb at redhat.com> wrote:

> Hello folks,
>
> As you may have noticed, we've had a lot of centos6-regression failures
> lately. The geo-replication failures are the new ones which particularly
> concern me. These failures have nothing to do with the test. The tests are
> exposing a problem in our infrastructure that we've carried around for a
> long time. Our machines are not clean machines that we automated. We set up
> automation on machines that were already created. At some point, we loaned
> machines for debugging. During this time, developers inadvertently ran
> 'make install' on them, installing onto system paths rather than into
> /build/install. This is what is causing the geo-replication tests to
> fail. I've tried cleaning the machines up several times with little to no
> success.
>
> Last week, we decided to take an aggressive path to fix this problem. We
> planned to replace all our problematic nodes with new CentOS 7 nodes. This
> exposed more problems. We expected a specific type of machine from
> Rackspace. These are no longer offered. Thus, our automation fails on some
> steps. I've spent this weekend tweaking our automation so that it works
> on the new Rackspace machines and I'm down to just one test failure[1]. I
> have a patch up to fix this failure[2]. As soon as that patch is merged,
> we can push forward with CentOS 7 nodes. In 4.0, we're dropping support for
> CentOS 6, so it makes sense to do this sooner rather than later.
>
> We will no longer lend out machines from production. Instead, we'll create
> new nodes that are snapshots of an existing production node. Each such
> machine will be destroyed after use. This helps prevent this particular
> problem in the future. This also means that our machine capacity at all
> times is at 100 with very minimal wastage.
>

    +2 for this

>
> [1]: https://build.gluster.org/job/cage-test/184/consoleText
> [2]: https://review.gluster.org/#/c/19262/
>
> --
> nigelb



-- 
Thanks and Regards,
Kotresh H R