[Gluster-devel] [Gluster-infra] rebal-all-nodes-migrate.t always fails now

Michael Scherer mscherer at redhat.com
Thu Apr 4 13:19:25 UTC 2019


On Thursday 04 April 2019 at 13:53 +0200, Michael Scherer wrote:
> On Thursday 04 April 2019 at 16:13 +0530, Atin Mukherjee wrote:
> > From what I have seen, any multi-node test case will fail, and the
> > above one is picked first from that group. If I am correct, none of
> > the code fixes will go through regression until this is fixed. I
> > suspect it to be an infra issue again. If we look at
> > https://review.gluster.org/#/c/glusterfs/+/22501/ &
> > https://build.gluster.org/job/centos7-regression/5382/ peer
> > handshaking is stuck as 127.1.1.1 is unable to receive a response
> > back. Did we end up having firewall and other n/w settings screwed
> > up? The test never fails locally.
> 
> The firewall didn't change, and since the start it has had the line
> "-A INPUT -i lo -j ACCEPT", so all traffic on the localhost interface
> works. (I am not even sure that netfilter does anything meaningful on
> the loopback interface, but maybe I am wrong, and I am not keen on
> reading kernel code to find out.)
> 
> 
> Ping seems to work fine as well, so we can exclude a routing issue.
> 
> Maybe we should look at the socket: is it listening on a specific
> address or not?

So, I looked at the first 20 failures, removed all those not related to
rebal-all-nodes-migrate.t, and saw that all of them ran on builder203,
which was freshly reinstalled. As Deepshika noticed today, this builder
had an issue with ipv6, the 2nd issue we were tracking.

In summary, the rpcbind.socket systemd unit listens on ipv6 despite ipv6
being disabled, and the fix is to reload systemd. So far we have no idea
why it happens, but we suspect it might be related to the network issue
we identified, as that happens only after a reboot, which itself happens
only if a build is cancelled/crashed/aborted.
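
For anyone who wants to check a builder for this mismatch, here is a
minimal Python sketch. It is not what we run on the builders, just an
illustration under a few assumptions: that ipv6 is disabled through the
net.ipv6.conf.all.disable_ipv6 sysctl (rather than on the kernel command
line), that rpcbind uses its usual TCP port 111, and that a listener on
that port in /proc/net/tcp6 means the socket unit is bound on ipv6. The
function names are made up for the example.

```python
from pathlib import Path

RPCBIND_PORT = 111  # assumption: rpcbind on its usual port
TCP_LISTEN = "0A"   # state code for LISTEN in /proc/net/tcp and tcp6


def listening_ports(proc_net_text):
    """Return the set of ports in LISTEN state from /proc/net/tcp(6) text."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == TCP_LISTEN:
            # local_address looks like "<hex address>:<hex port>"
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def ipv6_disabled(flag_text):
    """Interpret the content of the disable_ipv6 sysctl file."""
    return flag_text.strip() == "1"


def rpcbind_ipv6_mismatch(flag_text, tcp6_text):
    """True if ipv6 is reported disabled but port 111 still listens on it."""
    return ipv6_disabled(flag_text) and RPCBIND_PORT in listening_ports(tcp6_text)


if __name__ == "__main__":
    flag = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")
    tcp6 = Path("/proc/net/tcp6")
    if flag.exists() and tcp6.exists():
        mismatch = rpcbind_ipv6_mismatch(flag.read_text(), tcp6.read_text())
        print("mismatch" if mismatch else "no mismatch found")
    else:
        print("ipv6 files not present under /proc, nothing to check")
```

Since the parsing is separated from the file reads, the same functions
can be fed output collected from a remote builder.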

I applied the workaround on builder203, so if the culprit is that
specific issue, I guess it is fixed.

I started a test to see how it goes:
https://build.gluster.org/job/centos7-regression/5383/

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS

