[Bugs] [Bug 1421937] [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible

bugzilla at redhat.com bugzilla at redhat.com
Thu Feb 16 04:09:40 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1421937



--- Comment #3 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: https://review.gluster.org/16613 committed in master by Raghavendra G
(rgowdapp at redhat.com) 
------
commit 8607f22dcd1bc9b84e452ae90102fa9d345ad3db
Author: Poornima G <pgurusid at redhat.com>
Date:   Tue Feb 14 12:45:36 2017 +0530

    rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport

    Issue:
    When fio is run on multiple clients (each client writing to its own files)
    and the clients meanwhile issue a readdirp, each client that issued a
    readdirp starts to receive upcalls. In this scenario the client
    disconnects with an rpc decode failed error.

    RCA:
    Upcall calls rpcsvc_request_submit to submit the request to socket:
    rpcsvc_request_submit currently:
    rpcsvc_request_submit () {
       iobuf = iobuf_new
       iov = iobuf->ptr
       fill iobuf to contain xdrised upcall content - proghdr
       rpcsvc_callback_submit (..iov..)
       ...
       if (iobuf)
           iobuf_unref (iobuf)
    }

    rpcsvc_callback_submit (... iov...) {
       ...
       iobuf = iobuf_new
       iov1 = iobuf->ptr
       fill iobuf to contain xdrised rpc header - rpchdr
       msg.rpchdr = iov1
       msg.proghdr = iov
       ...
       rpc_transport_submit_request (msg)
       ...
       if (iobuf)
           iobuf_unref (iobuf)
    }

    rpcsvc_callback_submit assumes that once rpc_transport_submit_request()
    returns, the msg has been written to the socket and thus the buffers
    (rpchdr, proghdr) can be freed, which is not the case. Under especially
    high workload, rpc_transport_submit_request() may not be able to write
    to the socket immediately; it then adds the msg to its own queue and
    returns success. Thus we have a use after free for rpchdr and proghdr:
    the client gets garbage rpchdr and proghdr and fails to decode the rpc,
    resulting in a disconnect.

    To prevent this, we need to add the rpchdr and proghdr to an iobref and
    send it in msg:
       iobref_add (iobref, iobufs)
       msg.iobref = iobref;
    The socket layer takes a ref on msg.iobref if it cannot write to the
    socket and is adding the msg to the queue. Thus we do not have a use
    after free.
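
The ownership rule described above can be sketched with a small, self-contained C model. This is only an illustration under stated assumptions, not the GlusterFS code: struct buf, struct bufref, struct pending, transport_submit(), transport_flush() and callback_submit() are hypothetical stand-ins for iobuf, iobref, the socket queue, rpc_transport_submit_request() and rpcsvc_callback_submit(), and the transport here always queues instead of writing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins only; none of these are the real GlusterFS types. */
struct buf {                 /* plays the role of an iobuf */
    int  refs;
    char data[32];
};

struct bufref {              /* plays the role of an iobref */
    int         refs;
    struct buf *bufs[4];
    int         count;
};

struct pending {             /* one queued, not-yet-written message */
    struct bufref *held;
    const char    *hdr;      /* points into a buffer owned via 'held' */
};

static int live_bufs = 0;    /* number of struct buf currently allocated */

static struct buf *buf_new(const char *s) {
    struct buf *b = calloc(1, sizeof(*b));
    if (!b) return NULL;
    b->refs = 1;
    strncpy(b->data, s, sizeof(b->data) - 1);  /* calloc keeps it NUL-terminated */
    live_bufs++;
    return b;
}

static void buf_unref(struct buf *b) {
    if (b && --b->refs == 0) { live_bufs--; free(b); }
}

static struct bufref *bufref_new(void) {
    struct bufref *r = calloc(1, sizeof(*r));
    if (r) r->refs = 1;
    return r;
}

/* The bufref takes its own reference on every buffer added to it. */
static int bufref_add(struct bufref *r, struct buf *b) {
    if (r->count == 4) return -1;
    b->refs++;
    r->bufs[r->count++] = b;
    return 0;
}

static void bufref_unref(struct bufref *r) {
    if (!r || --r->refs > 0) return;
    for (int i = 0; i < r->count; i++) buf_unref(r->bufs[i]);
    free(r);
}

/* Assumed busy socket: queue the message, take a ref, report success. */
static int transport_submit(struct pending *q, struct bufref *r, const char *hdr) {
    r->refs++;               /* this ref keeps the header bytes alive */
    q->held = r;
    q->hdr  = hdr;
    return 0;
}

/* Later, when the socket is writable: "write" the bytes, then drop the ref.
 * n must be greater than zero. */
static void transport_flush(struct pending *q, char *out, size_t n) {
    strncpy(out, q->hdr, n - 1);
    out[n - 1] = '\0';
    bufref_unref(q->held);
    q->held = NULL;
    q->hdr  = NULL;
}

/* Caller: drops its own refs right after submit, as the fixed code may. */
static int callback_submit(struct pending *q, const char *payload) {
    struct buf    *hdr = buf_new(payload);
    struct bufref *r   = bufref_new();
    int ret = -1;

    if (hdr && r && bufref_add(r, hdr) == 0)
        ret = transport_submit(q, r, hdr->data);

    bufref_unref(r);         /* safe: the queue holds its own ref */
    buf_unref(hdr);
    return ret;
}
```

In this model the header buffer is still allocated after callback_submit() returns and is released only when transport_flush() drops the queue's reference. Without the bufref_add() call and the reference taken in transport_submit(), the queued pointer would dangle as soon as the caller unreferenced the buffer, which is the failure described in the RCA.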

    Thank You for discussing, debugging and fixing along:
    Prashanth Pai <ppai at redhat.com>
    Raghavendra G <rgowdapp at redhat.com>
    Rajesh Joseph <rjoseph at redhat.com>
    Kotresh HR <khiremat at redhat.com>
    Mohammed Rafi KC <rkavunga at redhat.com>
    Soumya Koduri <skoduri at redhat.com>

    Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275
    BUG: 1421937
    Signed-off-by: Poornima G <pgurusid at redhat.com>
    Reviewed-on: https://review.gluster.org/16613
    Reviewed-by: Prashanth Pai <ppai at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: soumya k <skoduri at redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
