[Gluster-devel] regression failed : snapshot/bug-1316437.t

Avra Sengupta asengupt at redhat.com
Tue Jul 26 07:16:20 UTC 2016


Had a look at the patch. What you are trying to do is to re-use the old 
port and, if that is not successful, to get a new port. I have some 
comments on the patch, but to me this looks mostly fine.
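
For reference, a minimal standalone sketch of that retry idea with plain 
POSIX sockets (not the patch itself; the port number is made up): try to 
bind the previously used port, and if that fails with EADDRINUSE, fall 
back to a fresh one (here the kernel picks it via port 0).

    #include <arpa/inet.h>
    #include <errno.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* bind a TCP socket to the given loopback port (0 = kernel-chosen) */
    static int bind_port(int sock, uint16_t port)
    {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(port);
        return bind(sock, (struct sockaddr *)&addr, sizeof(addr));
    }

    int main(void)
    {
        uint16_t old_port = 49152;   /* hypothetical port used before restart */
        int sock = socket(AF_INET, SOCK_STREAM, 0);

        if (bind_port(sock, old_port) == 0) {
            printf("re-used old port %u\n", (unsigned)old_port);
        } else if (errno == EADDRINUSE) {
            /* someone else (e.g. snapd) took the port meanwhile: get a new one */
            close(sock);
            sock = socket(AF_INET, SOCK_STREAM, 0);
            struct sockaddr_in got;
            socklen_t len = sizeof(got);
            if (bind_port(sock, 0) == 0 &&
                getsockname(sock, (struct sockaddr *)&got, &len) == 0)
                printf("old port busy, got new port %u\n",
                       (unsigned)ntohs(got.sin_port));
        }
        close(sock);
        return 0;
    }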

On 07/25/2016 07:14 PM, Atin Mukherjee wrote:
>
>
> On Mon, Jul 25, 2016 at 7:12 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>
>
>     On Mon, Jul 25, 2016 at 5:37 PM, Atin Mukherjee
>     <amukherj at redhat.com> wrote:
>
>
>
>         On Mon, Jul 25, 2016 at 4:34 PM, Avra Sengupta
>         <asengupt at redhat.com> wrote:
>
>             The crux of the problem is that, as of today, brick
>             processes on restart try to reuse the old port they were
>             using (assuming that no other process will be using it,
>             and without consulting pmap_registry_alloc() before using
>             it). With a recent change, pmap_registry_alloc() reassigns
>             older ports that were used but are now free. Hence snapd
>             now gets a port that was previously used by a brick and
>             tries to bind to it, whereas the older brick process,
>             without consulting the pmap table, blindly tries to use
>             it, and hence we see this problem.
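>
>             To make the clash concrete, here is a minimal standalone
>             sketch with plain POSIX sockets (not the glusterd code; the
>             port number is made up): once the restarted brick has
>             blindly re-bound its old port, the bind() snapd issues for
>             that same port fails with EADDRINUSE, which is exactly the
>             "Port is already in use" error in snapd.log.
>
>                 #include <arpa/inet.h>
>                 #include <errno.h>
>                 #include <netinet/in.h>
>                 #include <stdint.h>
>                 #include <stdio.h>
>                 #include <string.h>
>                 #include <sys/socket.h>
>                 #include <unistd.h>
>
>                 /* bind a TCP socket to the given loopback port */
>                 static int bind_to(int sock, uint16_t port)
>                 {
>                     struct sockaddr_in a;
>                     memset(&a, 0, sizeof(a));
>                     a.sin_family = AF_INET;
>                     a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
>                     a.sin_port = htons(port);
>                     return bind(sock, (struct sockaddr *)&a, sizeof(a));
>                 }
>
>                 int main(void)
>                 {
>                     uint16_t port = 49152;   /* hypothetical brick port */
>                     int brick = socket(AF_INET, SOCK_STREAM, 0);
>                     int snapd = socket(AF_INET, SOCK_STREAM, 0);
>
>                     /* restarted brick blindly re-binds its old port */
>                     bind_to(brick, port);
>                     listen(brick, 1);
>
>                     /* snapd was handed the same "free" port by pmap */
>                     if (bind_to(snapd, port) != 0 && errno == EADDRINUSE)
>                         printf("binding to %u failed: Address already in use\n",
>                                (unsigned)port);
>
>                     close(snapd);
>                     close(brick);
>                     return 0;
>                 }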
>
>             Now coming to the fix, I feel the brick process should not
>             try to get the older port and should just take a new port
>             every time it comes up. We will not run out of ports with
>             this change because pmap now allocates old ports again, and
>             the port previously used by the brick process will
>             eventually be reused. If anyone sees any concern with this
>             approach, please feel free to raise it now.
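>
>             As a rough sketch of what I mean (a toy table standing in
>             for the pmap registry, not the actual pmap_registry_alloc()
>             code; the port range is made up): the restarted brick simply
>             asks the allocator again instead of reusing its cached port,
>             and because freed ports go back into the pool, the pool is
>             never exhausted.
>
>                 #include <stdbool.h>
>                 #include <stdio.h>
>
>                 #define BASE_PORT 49152   /* made-up start of the range */
>                 #define NPORTS    64
>
>                 static bool in_use[NPORTS];   /* toy port-map table */
>
>                 /* hand out the lowest free port, like the new
>                  * allocation behaviour described above */
>                 static int port_alloc(void)
>                 {
>                     for (int i = 0; i < NPORTS; i++)
>                         if (!in_use[i]) {
>                             in_use[i] = true;
>                             return BASE_PORT + i;
>                         }
>                     return -1;   /* pool exhausted */
>                 }
>
>                 int main(void)
>                 {
>                     int old = port_alloc();           /* brick's port before restart */
>                     in_use[old - BASE_PORT] = false;  /* brick went down, port freed  */
>                     int snapd = port_alloc();         /* snapd may get that port      */
>                     int brick = port_alloc();         /* restarted brick asks afresh  */
>                     printf("snapd=%d brick=%d\n", snapd, brick);
>                     return 0;
>                 }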
>
>
>         Looks OK to me, but I'll think it through and get back to you
>         within a day or two if I have any objections.
>
>
>     If we are conservative about bricks binding to a different port
>     on a restart, I have an alternative approach here [1]. It has
>     neither a full-fledged commit message nor a BZ; I've just put it
>     up for your input.
>
>
>     [1] http://review.gluster.org/15005
>
>
>
>             While awaiting feedback from you guys, I have sent this
>             patch (http://review.gluster.org/15001), which moves the
>             said test case to bad tests for now; after we collectively
>             reach a conclusion on the fix, we will remove it from the
>             bad tests.
>
>             Regards,
>             Avra
>
>
>             On 07/25/2016 02:33 PM, Avra Sengupta wrote:
>>             The failure suggests that the port snapd is trying to
>>             bind to is already in use. But snapd has been modified to
>>             use a new port every time. I am looking into this.
>>
>>             On 07/25/2016 02:23 PM, Nithya Balachandran wrote:
>>>             More failures:
>>>             https://build.gluster.org/job/rackspace-regression-2GB-triggered/22452/console
>>>
>>>             I see these messages in the snapd.log:
>>>
>>>             [2016-07-22 05:31:52.482282] I
>>>             [rpcsvc.c:2199:rpcsvc_set_outstanding_rpc_limit]
>>>             0-rpc-service: Configured rpc.outstanding-rpc-limit with
>>>             value 64
>>>             [2016-07-22 05:31:52.482352] W [MSGID: 101002]
>>>             [options.c:954:xl_opt_validate] 0-patchy-server: option
>>>             'listen-port' is deprecated, preferred is
>>>             'transport.socket.listen-port', continuing with correction
>>>             [2016-07-22 05:31:52.482436] E
>>>             [socket.c:771:__socket_server_bind] 0-tcp.patchy-server:
>>>             binding to  failed: Address already in use
>>>             [2016-07-22 05:31:52.482447] E
>>>             [socket.c:774:__socket_server_bind] 0-tcp.patchy-server:
>>>             Port is already in use
>>>             [2016-07-22 05:31:52.482459] W
>>>             [rpcsvc.c:1630:rpcsvc_create_listener] 0-rpc-service:
>>>             listening on transport failed
>>>             [2016-07-22 05:31:52.482469] W [MSGID: 115045]
>>>             [server.c:1061:init] 0-patchy-server: creation of
>>>             listener failed
>>>             [2016-07-22 05:31:52.482481] E [MSGID: 101019]
>>>             [xlator.c:433:xlator_init] 0-patchy-server:
>>>             Initialization of volume 'patchy-server' failed, review
>>>             your volfile again
>>>             [2016-07-22 05:31:52.482491] E [MSGID: 101066]
>>>             [graph.c:324:glusterfs_graph_init] 0-patchy-server:
>>>             initializing translator failed
>>>             [2016-07-22 05:31:52.482499] E [MSGID: 101176]
>>>             [graph.c:670:glusterfs_graph_activate] 0-graph: init failed
>>>
>>>             On Mon, Jul 25, 2016 at 12:00 PM, Ashish Pandey
>>>             <aspandey at redhat.com> wrote:
>>>
>>>                 Hi,
>>>
>>>                 The following test has failed 3 times in the last two days:
>>>
>>>                 ./tests/bugs/snapshot/bug-1316437.t
>>>                 https://build.gluster.org/job/rackspace-regression-2GB-triggered/22445/consoleFull
>>>                 https://build.gluster.org/job/rackspace-regression-2GB-triggered/22470/consoleFull
>>>
>>>                 Please take a look at it and check whether it is a
>>>                 spurious failure or not.
>>>
>>>                 Ashish
>>>
>>>
>>>
>>
>
>
> --
> --Atin
