[Gluster-devel] spurious regression failures again! [bug-1112559.t]

Thu Jul 24 12:52:44 UTC 2014

>        2) N3 tries to start the brick B2. Now the problem lies here. N3 uses
>           glusterd_resolve_brick() to resolve the UUID of B2->hostname (N2).
>           In glusterd_resolve_brick(), it cannot find N2 in the peerinfo
>           list. Then it checks if N2 is a local loopback address. Since
>           N2 (127.1.1.2) starts with "127", it decides that it's a local
>           loopback address. Thus glusterd_resolve_brick() fills
>           brickinfo->uuid with [UUID3]. Now, as brickinfo->uuid == MY_UUID
>           is true, N3 initiates the brick process B2 with -s 127.1.1.2 and
>           *-posix.glusterd-uuid=[UUID3]. This process dies off immediately,
>           but for a short amount of time it holds on to the --brick-port,
>           for example 49155.

This is the part that seems "off" to me.  If an address doesn't
*exactly* match that on some local interface, it's not local.  When we
implemented the cluster.rc infrastructure so that we could simulate
multi-node testing, we had to root out a bunch of stuff like this, but
apparently some crept back in.  If we just fixed the "127.* == local"
mistake, would that be adequate to prevent these errors?
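
To make the distinction concrete, here is a rough sketch (in Python, not
the actual glusterd C code) contrasting the "starts with 127" heuristic
with an exact match against the addresses a node actually owns. The
function names are hypothetical, and the assumption that N3 owns only
127.1.1.3 is mine, extrapolated from N2 being 127.1.1.2 in the quoted
text.

```python
import ipaddress


def is_local_by_prefix(addr):
    """Heuristic from the quoted text: any address starting with "127" is local."""
    return addr.startswith("127")


def is_local_exact(addr, local_addrs):
    """Treat addr as local only if it exactly equals an address this node owns.

    local_addrs stands in for the node's configured interface addresses
    (hypothetical input; a real check would enumerate the interfaces).
    """
    target = ipaddress.ip_address(addr)
    return any(target == ipaddress.ip_address(a) for a in local_addrs)


# Assumed simulated-cluster layout: N3 owns 127.1.1.3, N2 is 127.1.1.2.
n3_addrs = ["127.1.1.3"]

print(is_local_by_prefix("127.1.1.2"))        # True: N3 wrongly claims N2's brick
print(is_local_exact("127.1.1.2", n3_addrs))  # False: N2's address is not N3's
print(is_local_exact("127.1.1.3", n3_addrs))  # True: N3's own address still matches
```

With the exact comparison, N3 would not set brickinfo->uuid to its own
UUID for B2, so it would never try to start that brick in the first
place.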