[Bugs] [Bug 1431176] New: USS is broken when multiplexing is on

bugzilla at redhat.com
Fri Mar 10 14:20:48 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1431176

            Bug ID: 1431176
           Summary: USS is broken when multiplexing is on
           Product: GlusterFS
           Version: 3.10
         Component: glusterd
          Assignee: bugs at gluster.org
          Reporter: jdarcy at redhat.com
                CC: bugs at gluster.org
        Depends On: 1430148



+++ This bug was initially created as a clone of Bug #1430148 +++

This manifests as a test failure in uss.t when we first try to access snap1
through USS.  The underlying problem is described in the commit message for the
patch I'll submit as soon as I have a bug number.

    This was causing USS tests to fail.  The underlying problem here is
    that if we try to queue the attach request too soon after starting a
    brick process then the socket code will get an error trying to write
    to the still-unconnected socket.  Its response is to shut down the
    socket, which causes the queued attach requests to be force-unwound.
    There's nothing to retry them, so they effectively never happen and
    those bricks (second and succeeding for a snapshot) never become
    available.

    We *do* have a retry loop for attach requests, but currently break out
    as soon as a request is queued - not actually sent.  The fix is to
    modify that loop so it will wait some more if the rpc connection isn't
    even complete yet.  Now we break out only when we have a completed
    connection *and* a queued request.
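
The loop condition described above can be illustrated with a minimal
sketch. This is not the glusterd code (which is C); FakeRpc,
is_connected, queue_attach, attach_brick, and the max_tries/delay
parameters are all hypothetical names chosen for illustration. The only
point it shows is that the loop exits successfully only when the
connection is complete *and* the request has been queued.

```python
import time


class FakeRpc:
    """Hypothetical stand-in for a brick's rpc connection (not GlusterFS code)."""

    def __init__(self, polls_until_connected):
        self._polls_left = polls_until_connected
        self.queued = []

    def is_connected(self):
        # Simulate a connection that completes only after a few polls.
        if self._polls_left > 0:
            self._polls_left -= 1
            return False
        return True

    def queue_attach(self, brick):
        # Queueing is only safe once the connection is complete.
        self.queued.append(brick)
        return True


def attach_brick(rpc, brick, max_tries=15, delay=0.0):
    """Return True once the attach request is queued on a connected rpc."""
    for _ in range(max_tries):
        # Break out only with a completed connection *and* a queued request;
        # the short-circuit "and" means nothing is queued before connecting.
        if rpc.is_connected() and rpc.queue_attach(brick):
            return True
        time.sleep(delay)  # wait some more and retry
    return False
```

In this sketch a request is never handed to an unconnected socket, so
there is nothing for a socket shutdown to force-unwind; if the
connection never completes, the caller gets an explicit failure instead
of a silently dropped attach.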

--- Additional comment from Worker Ant on 2017-03-07 18:46:46 EST ---

REVIEW: https://review.gluster.org/16868 (glusterd: don't queue attach reqs
before connecting) posted (#1) for review on master by Jeff Darcy
(jdarcy at redhat.com)

--- Additional comment from Worker Ant on 2017-03-08 11:01:42 EST ---

COMMIT: https://review.gluster.org/16868 committed in master by Jeff Darcy
(jdarcy at redhat.com)
------
commit 53e2c875cf97df8337f7ddb5124df2fc6dd37bca
Author: Jeff Darcy <jdarcy at redhat.com>
Date:   Tue Mar 7 18:36:58 2017 -0500

    glusterd: don't queue attach reqs before connecting

    This was causing USS tests to fail.  The underlying problem here is
    that if we try to queue the attach request too soon after starting a
    brick process then the socket code will get an error trying to write
    to the still-unconnected socket.  Its response is to shut down the
    socket, which causes the queued attach requests to be force-unwound.
    There's nothing to retry them, so they effectively never happen and
    those bricks (second and succeeding for a snapshot) never become
    available.

    We *do* have a retry loop for attach requests, but currently break out
    as soon as a request is queued - not actually sent.  The fix is to
    modify that loop so it will wait some more if the rpc connection isn't
    even complete yet.  Now we break out only when we have a completed
    connection *and* a queued request.

    Change-Id: Ib6be13646f1fa9072b4a944ab5f13e1b29084841
    BUG: 1430148
    Signed-off-by: Jeff Darcy <jdarcy at redhat.com>
    Reviewed-on: https://review.gluster.org/16868
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1430148
[Bug 1430148] USS is broken when multiplexing is on