[Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
Raghavendra Gowdappa
rgowdapp at redhat.com
Thu Jul 2 17:29:15 UTC 2015
I've reverted [1], which turned allow-insecure on by default. The patch seems to have issues, which will be addressed before it is merged again. The revert can be found at [2].
[1] http://review.gluster.org/11274
[2] http://review.gluster.org/11507
Please let me know if the regressions are still failing.
regards,
Raghavendra.
----- Original Message -----
> From: "Atin Mukherjee" <atin.mukherjee83 at gmail.com>
> To: "Prasanna Kalever" <pkalever at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Thursday, July 2, 2015 9:41:33 PM
> Subject: Re: [Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
>
>
>
> Thanks Prasanna for the patches :)
>
> -Atin
> On Jul 2, 2015 9:19 PM, "Prasanna Kalever" < pkalever at redhat.com > wrote:
>
>
>
> This happens because, when bind-insecure is turned on (which is now the
> default), a brick may fail to bind to the port assigned to it by Glusterd,
> for example 49192-49195.
> It seems to occur because the rpc_clnt connections bind to ports in the
> same range, so the brick fails to bind to a port that is already in use
> by something else.
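
To illustrate the clash described above, here is a minimal Python sketch (not Gluster code). It assumes two plain TCP sockets on the loopback address can stand in for an rpc_clnt connection and a brick; the function name is hypothetical.

```python
import errno
import socket


def demonstrate_port_clash():
    """Return the errno seen when a second socket binds a port already in use."""
    # Stand-in for an rpc_clnt connection that got to the port first.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.bind(("127.0.0.1", 0))  # let the kernel pick any free port
    port = client.getsockname()[1]

    # Stand-in for the brick trying to bind the port it was assigned.
    brick = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        brick.bind(("127.0.0.1", port))
        return None  # no clash observed
    except OSError as exc:
        return exc.errno
    finally:
        brick.close()
        client.close()
```

On Linux the second bind is expected to fail with EADDRINUSE, which matches the symptom of a brick that cannot bind its assigned port.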
>
> This bug already existed before http://review.gluster.org/#/c/11039/ when
> using rdma, i.e. even previously rdma would bind to a port >= 1024 if it
> could not find a free port < 1024, even when bind-insecure was turned off
> (see commit '0e3fd04e').
> Since we don't have tests related to rdma, we did not discover this issue
> previously.
>
> http://review.gluster.org/#/c/11039/ exposed the bug we encountered.
> The bug can now be fixed by http://review.gluster.org/#/c/11512/, which
> makes rpc_clnt pick port numbers in descending order starting from 65535.
> As a result port clashes are minimized, and it fixes the issue in rdma too.
> Thanks to Raghavendra Talur for helping to discover the real cause.
>
>
> Regards,
> Prasanna Kalever
>
>
>
> ----- Original Message -----
> From: "Raghavendra Talur" < raghavendra.talur at gmail.com >
> To: "Krishnan Parthasarathi" < kparthas at redhat.com >
> Cc: "Gluster Devel" < gluster-devel at gluster.org >
> Sent: Thursday, July 2, 2015 6:45:17 PM
> Subject: Re: [Gluster-devel] spurious failures
> tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
>
>
>
> On Thu, Jul 2, 2015 at 4:40 PM, Raghavendra Talur <
> raghavendra.talur at gmail.com > wrote:
>
>
>
>
>
> On Thu, Jul 2, 2015 at 10:52 AM, Krishnan Parthasarathi < kparthas at redhat.com
> > wrote:
>
>
>
> > >
> > > A port assigned by Glusterd for a brick is found to be in use already by
> > > the brick. Any changes in Glusterd recently which can cause this?
> > >
> > > Or is it a test infra problem?
>
> This issue is likely to be caused by http://review.gluster.org/11039
> This patch changes the port allocation that happens for rpc_clnt based
> connections. Previously, the ports allocated were < 1024. With this change,
> these connections (typically mount processes, gluster-nfs server processes,
> etc.) could end up using ports that bricks are being assigned to.
>
> IIUC, the intention of the patch was to make server processes lenient to
> inbound messages from ports > 1024. If we don't need to use ports > 1024,
> we could leave the port allocation for rpc_clnt connections as before.
> Alternatively, we could reserve the range of ports starting from 49152 for
> bricks by setting net.ipv4.ip_local_reserved_ports using sysctl(8). This is
> specific to Linux.
> I'm not aware of how this could be done in NetBSD, for instance.
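
As a rough illustration of the reservation idea above, the Python sketch below builds and parses a value in the comma-separated port/range syntax that net.ipv4.ip_local_reserved_ports accepts. The helper names and the count of 100 brick ports are assumptions, not values from Gluster. Note that, per the kernel documentation, this setting only affects automatically assigned ports, so whether it helps depends on how the rpc_clnt connections pick their ports.

```python
def brick_reservation(first_port=49152, count=100):
    """Build a reserved-ports value covering `count` ports from first_port."""
    return f"{first_port}-{first_port + count - 1}"


def parse_reserved_ports(value):
    """Parse a value such as '49152-49251,8080' into (low, high) tuples."""
    ranges = []
    for item in value.strip().split(","):
        if not item:
            continue
        low, _, high = item.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def is_reserved(port, ranges):
    """Return True if port falls inside any of the reserved ranges."""
    return any(low <= port <= high for low, high in ranges)
```

On a Linux host the current value can be read from /proc/sys/net/ipv4/ip_local_reserved_ports and passed to parse_reserved_ports to check whether a given brick port is covered.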
>
>
> It seems this is exactly what's happening.
>
> I have a question. I get the following data from netstat and grep:
>
> tcp 0 0 f6be17c0fbf5:1023 f6be17c0fbf5:24007 ESTABLISHED 31516/glusterfsd
> tcp 0 0 f6be17c0fbf5:49152 f6be17c0fbf5:490 ESTABLISHED 31516/glusterfsd
> unix 3 [ ] STREAM CONNECTED 988353 31516/glusterfsd
> /var/run/gluster/4878d6e905c5f6032140a00cc584df8a.socket
>
> Here 31516 is the brick pid.
>
> Looking at the data, line 2 is very clear: it shows the connection between
> the brick and a glusterfs client.
> The unix socket on line 3 is also clear: it is the unix socket connection
> that glusterd and the brick process use for communication.
>
> I am not able to understand line 1: which part of the brick process
> established a tcp connection with glusterd using port 1023?
> Note: this data is from a build which does not have the above-mentioned
> patch.
>
>
> The patch which exposed this bug is being reverted till the underlying bug is
> also fixed.
> You can monitor revert patches here
> master: http://review.gluster.org/11507
> 3.7 branch: http://review.gluster.org/11508
>
> Please rebase your patches after the above patches are merged to ensure that
> your patches pass regression.
>
>
>
>
>
> --
> Raghavendra Talur
>
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel