[Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t

Krishnan Parthasarathi kparthas at redhat.com
Fri Jul 3 06:28:57 UTC 2015



----- Original Message -----
> 
> This is caused because when bind-insecure is turned on (which is now the
> default), the brick may fail to bind to the port assigned by glusterd,
> for example 49192-49195...
> It seems to occur because the rpc_clnt connections are binding to ports in
> the same range, so the brick fails to bind to a port that is already in
> use by someone else.
> 
> This bug already existed before http://review.gluster.org/#/c/11039/ when
> using rdma, i.e. even previously rdma would bind to a port >= 1024 if it
> could not find a free port < 1024, even when bind-insecure was turned off
> (ref commit '0e3fd04e'). Since we don't have tests related to rdma, we did
> not discover this issue earlier.
> 
> http://review.gluster.org/#/c/11039/ exposed the bug we encountered.
> It can now be fixed by http://review.gluster.org/#/c/11512/, which makes
> rpc_clnt request port numbers starting from 65535 in descending order.
> As a result, port clashes are minimized, and it fixes the issues with
> rdma too.

This approach could still surprise the storage admin when glusterfs(d) processes
bind to ports in the range where brick ports are being assigned. We should make this
predictable by reserving brick ports via net.ipv4.ip_local_reserved_ports.
Initially, reserve 50 ports starting at 49152. Subsequently, we could reserve ports
on demand, say 50 more, when the previously reserved range is exhausted.
net.ipv4.ip_local_reserved_ports doesn't interfere with explicit port allocation,
i.e. when the socket binds to a port other than zero. With this option we wouldn't
have to manage port assignment at the process level. Thoughts?
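
As a concrete sketch of the proposal (the exact range here is illustrative; glusterd would decide what to reserve), reserving the first 50 brick ports would look like:

```shell
# Sketch: reserve 50 ports starting at 49152 so the kernel never hands them
# out as ephemeral ports during implicit bind()/connect(). Explicit binds to
# these ports (as bricks do) still succeed. Requires root.
sysctl -w net.ipv4.ip_local_reserved_ports=49152-49201

# To persist across reboots, add the same setting to /etc/sysctl.conf:
#   net.ipv4.ip_local_reserved_ports = 49152-49201
```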
