[Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t

Raghavendra Gowdappa rgowdapp at redhat.com
Thu Jul 2 17:29:15 UTC 2015


I've reverted [1], which turned allow-insecure on by default. The patch seems to have issues, which will be addressed before it is merged again. The revert can be found at [2].

[1] http://review.gluster.org/11274
[2] http://review.gluster.org/11507

Please let me know if the regressions are still failing.

regards,
Raghavendra.


----- Original Message -----
> From: "Atin Mukherjee" <atin.mukherjee83 at gmail.com>
> To: "Prasanna Kalever" <pkalever at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Thursday, July 2, 2015 9:41:33 PM
> Subject: Re: [Gluster-devel] spurious failures	tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
> 
> 
> 
> Thanks Prasanna for the patches :)
> 
> -Atin
> On Jul 2, 2015 9:19 PM, "Prasanna Kalever" < pkalever at redhat.com > wrote:
> 
> 
> 
> This happens because, when bind-insecure is turned on (which is the default
> now), a brick may fail to bind to the port assigned to it by Glusterd, for
> example 49192-49195.
> It seems to occur because the rpc_clnt connections bind to ports in the same
> range, so the brick fails to bind to a port that is already in use.
> 
> This bug already existed before http://review.gluster.org/#/c/11039/ when
> using rdma, i.e. even previously rdma bound to a port >= 1024 if it could not
> find a free port < 1024, even when bind-insecure was turned off (see commit
> '0e3fd04e').
> Since we don't have tests for rdma, we did not discover this issue
> previously.
> 
> http://review.gluster.org/#/c/11039/ exposed the bug we encountered. The bug
> can now be fixed by http://review.gluster.org/#/c/11512/, which makes
> rpc_clnt pick port numbers in descending order starting from 65535. As a
> result, port clashes are minimized, and the change fixes the issue for rdma
> too.
> 
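The descending allocation described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch under review: the function name bind_descending, the loopback host, and the floor of 1024 are assumptions made for the example.

```python
import errno
import socket


def bind_descending(sock, host="127.0.0.1", ceiling=65535, floor=1024):
    """Bind sock to the highest free port in [floor, ceiling].

    Hypothetical helper: walks down from `ceiling` so that client
    sockets stay as far as possible from low-numbered brick ports.
    """
    for port in range(ceiling, floor - 1, -1):
        try:
            sock.bind((host, port))
            return port
        except OSError as exc:
            # Only "port already taken" means "try the next one down".
            if exc.errno != errno.EADDRINUSE:
                raise
    raise OSError(errno.EADDRINUSE, "no free port in requested range")


if __name__ == "__main__":
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(bind_descending(first), bind_descending(second))
    first.close()
    second.close()
```

Because bricks are assigned ports counting up from 49152 and the clients in this sketch count down from 65535, the two ranges only meet when a very large number of ports are in use.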
> Thanks to Raghavendra Talur for help in discovering the real cause
> 
> 
> Regards,
> Prasanna Kalever
> 
> 
> 
> ----- Original Message -----
> From: "Raghavendra Talur" < raghavendra.talur at gmail.com >
> To: "Krishnan Parthasarathi" < kparthas at redhat.com >
> Cc: "Gluster Devel" < gluster-devel at gluster.org >
> Sent: Thursday, July 2, 2015 6:45:17 PM
> Subject: Re: [Gluster-devel] spurious failures
> tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
> 
> 
> 
> On Thu, Jul 2, 2015 at 4:40 PM, Raghavendra Talur < raghavendra.talur at gmail.com > wrote:
> 
> 
> 
> 
> 
> On Thu, Jul 2, 2015 at 10:52 AM, Krishnan Parthasarathi < kparthas at redhat.com > wrote:
> 
> 
> 
> > > 
> > > A port assigned by Glusterd for a brick is found to be in use already by
> > > the brick. Any changes in Glusterd recently which can cause this?
> > > 
> > > Or is it a test infra problem?
> 
> This issue is likely to be caused by http://review.gluster.org/11039.
> This patch changes the port allocation that happens for rpc_clnt based
> connections. Previously, the ports allocated were < 1024. With this change,
> these connections (typically from mount processes, gluster-nfs server
> processes, etc.) could end up using ports that bricks are being assigned.
> 
> IIUC, the intention of the patch was to make server processes lenient to
> inbound messages from ports > 1024. If we don't need to use ports > 1024,
> we could leave the port allocation for rpc_clnt connections as before.
> Alternatively, we could reserve the range of ports starting from 49152 for
> bricks by setting net.ipv4.ip_local_reserved_ports using sysctl(8). This is
> specific to Linux.
> I'm not aware of how this could be done in NetBSD, for instance.
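As a rough illustration of the sysctl idea, the Python sketch below parses a value in the comma-separated "port" or "low-high" format that net.ipv4.ip_local_reserved_ports uses, and checks whether a given port falls inside it. The helper names and the upper bound 49251 are assumptions made for the example; the thread only says the brick range starts at 49152.

```python
def parse_reserved_ports(value):
    """Parse e.g. '49152-49251,50000' into a list of (low, high) tuples."""
    ranges = []
    for item in value.strip().split(","):
        item = item.strip()
        if not item:
            continue  # an empty sysctl value means nothing is reserved
        low, _, high = item.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def is_reserved(port, ranges):
    """Return True if port lies inside any reserved (low, high) range."""
    return any(low <= port <= high for low, high in ranges)


if __name__ == "__main__":
    # Hypothetical setting, e.g. applied with:
    #   sysctl -w net.ipv4.ip_local_reserved_ports=49152-49251
    reserved = parse_reserved_ports("49152-49251")
    print(is_reserved(49192, reserved), is_reserved(1023, reserved))
```

Note that, per the Linux kernel documentation, this setting only excludes ports from automatic assignment (connect(), or bind() with port 0). A process that explicitly binds to a specific port in the range is not prevented from doing so, so this would only help if the clashing connections rely on automatic port selection.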
> 
> 
> It seems this is exactly what's happening.
> 
> I have a question. I get the following data from netstat and grep:
> 
> tcp 0 0 f6be17c0fbf5:1023 f6be17c0fbf5:24007 ESTABLISHED 31516/glusterfsd
> tcp 0 0 f6be17c0fbf5:49152 f6be17c0fbf5:490 ESTABLISHED 31516/glusterfsd
> unix 3 [ ] STREAM CONNECTED 988353 31516/glusterfsd /var/run/gluster/4878d6e905c5f6032140a00cc584df8a.socket
> 
> Here 31516 is the brick pid.
> 
> Looking at the data, line 2 is clear: it shows the connection between the
> brick and a glusterfs client.
> The unix socket on line 3 is also clear: it is the unix socket connection
> that glusterd and the brick process use for communication.
> 
> I am not able to understand line 1: which part of the brick process
> established a tcp connection with glusterd using port 1023?
> Note: this data is from a build which does not have the above-mentioned
> patch.
> 
> 
> The patch which exposed this bug is being reverted until the underlying bug
> is fixed.
> You can monitor the revert patches here:
> master: http://review.gluster.org/11507
> 3.7 branch: http://review.gluster.org/11508
> 
> Please rebase your patches after the above patches are merged to ensure that
> your patches pass regression.
> 
> 
> 
> 
> 
> --
> Raghavendra Talur
> 
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

