[Gluster-devel] spurious regression failures again! [bug-1112559.t]

Anders Blomdell anders.blomdell at control.lth.se
Thu Jul 24 09:00:56 UTC 2014


On 2014-07-24 08:21, Joseph Fernandes wrote:
> Hi All,
> 
> After further investigation we have the root cause for this issue. 
> The root cause is the way in which a new node is added to the cluster.
> 
> Now we have N1(127.1.1.1) and N2(127.1.1.2) as two nodes in the cluster, each having a brick N1:B1 (127.1.1.1 : 49146) and N2:B2 (127.1.1.2 : 49147)
> 
> Now let's peer probe N3(127.1.1.3) from N1:
> 
> 1) A friend request is sent from N1 to N3. N3 adds N1 to its peerinfo list, i.e. N1 and its UUID, say [UUID1]
> 2) N3 gets the brick infos from N1
> 3) N3 tries to start the bricks
>        1) N3 tries to start brick B1 and finds it is not a local brick, using the test MY_UUID == brickinfo->uuid, which is false in this case,
>           as the UUID of brickinfo->hostname (N1) is [UUID1] (as recorded in the peerinfo list) and MY_UUID is [UUID3]. Hence it doesn't start it.
>        2) N3 tries to start brick B2. Now the problem lies here. N3 uses glusterd_resolve_brick() to resolve the UUID of B2->hostname (N2).
>           glusterd_resolve_brick() cannot find N2 in the peerinfo list, so it then checks whether N2 is a local loopback address. Since N2 (127.1.1.2) starts with
>           "127", it decides that it is a local loopback address, and fills brickinfo->uuid with [UUID3]. Now, as brickinfo->uuid == MY_UUID is
>           true, N3 starts the brick process B2 with -s 127.1.1.2 and *-posix.glusterd-uuid=[UUID3]. This process dies off immediately, but for a short
>           time it holds on to the --brick-port, say 49155
> 
> All the above was observed and inferred from the glusterd logs on N3 (with some extra debug messages added).
> 
> Now coming back to our test case, i.e. firing snapshot create and peer probe together: if N2 has been assigned 49155 as the --brick-port for the snapshot brick, it finds that 49155 is already held by another process (the faulty brick process N3:B2 (127.1.1.2 : 49155), which has -s 127.1.1.2 and *-posix.glusterd-uuid=[UUID3]), and hence fails to start the snapshot brick process.
> 
> 1) The error is spurious, as it is purely a matter of chance whether N2 and N3 pick the same port for their brick processes.
> 2) This issue is possible only in a regression-test scenario, as all the nodes are on the same machine, differentiated only by different loopback addresses (127.1.1.*).
> 3) Also, the logic that "127" denotes a local loopback address is not wrong in itself, as glusterds are supposed to run on different machines in real usage.
> 
> Please do share your thoughts on this, and on what a possible fix might be.

Possible solutions (many/all of them probably break important assumptions):

* Use some alias address range instead of 127.*.*.* for testing purposes
* Stop treating localhost as special
* Adopt the systemd LISTEN_FDS approach and have a special program that
  tries to bind to ports and then hands the port over to the proper daemon

/Anders
-- 
Anders Blomdell                  Email: anders.blomdell at control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden
