[Gluster-infra] Regression fails due to infra issue

Michael Scherer mscherer at redhat.com
Tue Jun 7 08:29:34 UTC 2016


On Tuesday, 07 June 2016 at 10:00 +0200, Michael Scherer wrote:
> On Tuesday, 07 June 2016 at 09:54 +0200, Michael Scherer wrote:
> > On Monday, 06 June 2016 at 21:18 +0200, Niels de Vos wrote:
> > > On Mon, Jun 06, 2016 at 09:59:02PM +0530, Nigel Babu wrote:
> > > > On Mon, Jun 6, 2016 at 12:56 PM, Poornima Gurusiddaiah <pgurusid at redhat.com>
> > > > wrote:
> > > > 
> > > > > Hi,
> > > > >
> > > > > There are multiple issues that we saw with regressions lately:
> > > > >
> > > > > > 1. On certain slaves the regression fails during the build; I see this on
> > > > > > slave26.cloud.gluster.org, slave25.cloud.gluster.org and maybe others
> > > > > > as well.
> > > > >     Eg:
> > > > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/21422/console
> > > > >
> > > > 
> > > > Are you sure this isn't a code breakage?
> > > 
> > > No, it really does not look like that.
> > > 
> > > Here is another one; it seems the testcase got killed for some reason:
> > > 
> > >   https://build.gluster.org/job/rackspace-regression-2GB-triggered/21459/console
> > > 
> > > It was running on slave25.cloud.gluster.org too... Is it possible that
> > > there is some watchdog or other configuration checking for resources and
> > > killing testcases on occasion? The number of slaves where this happens
> > > seems limited, were these more recently installed/configured?
> > 
> > So dmesg reports yum crashing with an invalid opcode:
> > 
> > yum[2711] trap invalid opcode ip:7f2efac38d60 sp:7ffd77322658 error:0 in
> > libfreeblpriv3.so[7f2efabe6000+72000]
> > 
> > and
> > https://access.redhat.com/solutions/2313911
> > 
> > That's exactly the problem.
> > [root@slave25 ~]# /usr/bin/curl https://google.com
> > Illegal instruction
> > 
> > I propose to remove the builder from rotation while we investigate.
> 
> Or we can:
> 
> export NSS_DISABLE_HW_AES=1
> 
> to work around it; cf. the bug linked from the article.
> 
> Not sure of the best way to deploy that.

So we are testing the fix on slave25, and if that fixes the error,
I will deploy it to all the Gluster builders, and investigate the
non-builder servers. This only affects RHEL 6/CentOS 6 on Rackspace.
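
For anyone who wants to check a builder by hand, here is a rough sketch of
the test, assuming the failure always shows up as a process killed by
SIGILL (exit status 132 in the shell, i.e. 128 + signal 4 on Linux). The
function names run_status and check_workaround are made up for this
example; only NSS_DISABLE_HW_AES and the curl command come from the thread.

```shell
#!/bin/sh
# Sketch only: detect a command dying with SIGILL and check whether
# NSS_DISABLE_HW_AES=1 avoids it. Function names are hypothetical.

# A process killed by SIGILL (signal 4) is reported by the shell as 128 + 4.
SIGILL_STATUS=132

# Run the given command, discard its output, and print its exit status.
run_status() {
    "$@" >/dev/null 2>&1
    echo $?
}

# Print "not-affected", "workaround-ok" or "affected" for the given command.
check_workaround() {
    plain=$(run_status "$@")
    if [ "$plain" -ne "$SIGILL_STATUS" ]; then
        # The command did not die with SIGILL (it may still have failed
        # for an unrelated reason, e.g. no network).
        echo "not-affected"
        return 0
    fi
    # Retry with the workaround variable set for this one invocation only.
    fixed=$(run_status env NSS_DISABLE_HW_AES=1 "$@")
    if [ "$fixed" -ne "$SIGILL_STATUS" ]; then
        echo "workaround-ok"
    else
        echo "affected"
    fi
}
```

On a builder this would be called as
"check_workaround /usr/bin/curl https://google.com". Note that
"not-affected" only means the command was not killed by SIGILL, not that
it succeeded.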

I will also post a post-mortem.
-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS

