[Bugs] [Bug 1422788] [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible
bugzilla at redhat.com
Fri Apr 7 12:05:13 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1422788
--- Comment #4 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: https://review.gluster.org/16638 committed in release-3.8 by Niels de Vos (ndevos at redhat.com)
------
commit 982de32c7f559ab57f66a9ee92f884b772bae1e4
Author: Poornima G <pgurusid at redhat.com>
Date: Tue Feb 14 12:45:36 2017 +0530
rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport
Backport of https://review.gluster.org/16613
Issue:
When fio is run on multiple clients (each client writing to its own files)
while the clients also issue readdirp, any client that has issued a
readdirp starts to receive upcalls. In this scenario the client
disconnects with an "rpc decode failed" error.
RCA:
Upcall calls rpcsvc_request_submit to submit the request to socket:
rpcsvc_request_submit currently:
rpcsvc_request_submit () {
    iobuf = iobuf_new
    iov = iobuf->ptr
    fill iobuf to contain xdrised upcall content - proghdr
    rpcsvc_callback_submit (..iov..)
    ...
    if (iobuf)
        iobuf_unref (iobuf)
}

rpcsvc_callback_submit (... iov...) {
    ...
    iobuf = iobuf_new
    iov1 = iobuf->ptr
    fill iobuf to contain xdrised rpc header - rpchdr
    msg.rpchdr = iov1
    msg.proghdr = iov
    ...
    rpc_transport_submit_request (msg)
    ...
    if (iobuf)
        iobuf_unref (iobuf)
}
rpcsvc_callback_submit assumes that once rpc_transport_submit_request()
returns, the msg has been written to the socket and thus the buffers
(rpchdr, proghdr) can be freed, which is not the case. Under especially
high workload, rpc_transport_submit_request() may not be able to write to
the socket immediately, and hence adds the msg to its own queue and returns
success. Thus we have a use-after-free for rpchdr and proghdr. The client
then gets a garbage rpchdr and proghdr and fails to decode the rpc,
resulting in a disconnect.
To prevent this, we need to add the rpchdr and proghdr to an iobref and send
it in msg:
    iobref_add (iobref, iobufs)
    msg.iobref = iobref;
The socket layer takes a ref on msg.iobref if it cannot write to the socket
and is adding the msg to the queue. Thus we do not have a use-after-free.
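The lifetime problem and the fix can be illustrated with a small stand-alone
model. This is only a sketch, not GlusterFS code: toy_buf, toy_queue,
toy_submit and toy_run are hypothetical names invented here, and the "pool
reuse" is simulated by overwriting the buffer when its last reference is
dropped. It assumes, as described above, that the transport queues the
message instead of writing it and only keeps the data alive if it is handed
a reference.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an iobuf: a reference-counted buffer. */
struct toy_buf {
    int  refs;
    char data[32];
};

/* Drop one reference; on the last one, simulate the pool reusing the
 * memory by overwriting the contents (no real free, so no UB here). */
static void toy_buf_unref(struct toy_buf *b)
{
    assert(b->refs > 0);
    if (--b->refs == 0)
        memset(b->data, 'X', sizeof(b->data) - 1);
}

/* Hypothetical stand-in for the transport's pending-write queue. */
struct toy_queue {
    const char     *payload; /* points into the caller's buffer */
    struct toy_buf *held;    /* reference taken by the queue, or NULL */
};

/* Model of a submit that cannot write immediately and queues instead.
 * pass_ref == 1 models msg.iobref being set, so the queue takes a ref. */
static void toy_submit(struct toy_queue *q, struct toy_buf *b, int pass_ref)
{
    q->payload = b->data;
    q->held = NULL;
    if (pass_ref) {
        b->refs++;
        q->held = b;
    }
}

/* Returns 1 if the queued payload is still intact when the queue is
 * finally flushed, 0 if it was overwritten in the meantime. */
static int toy_run(int pass_ref)
{
    struct toy_buf   buf = { .refs = 1 };
    struct toy_queue q;
    int              intact;

    strcpy(buf.data, "rpchdr+proghdr");
    toy_submit(&q, &buf, pass_ref);
    toy_buf_unref(&buf);              /* caller unrefs right after submit */

    /* Later: the socket becomes writable and the queue is flushed. */
    intact = strcmp(q.payload, "rpchdr+proghdr") == 0;
    if (q.held)
        toy_buf_unref(q.held);        /* queue drops its ref after writing */
    return intact;
}
```

In this model toy_run(0) returns 0 (the queued bytes were overwritten before
the flush, the analogue of the garbage rpchdr/proghdr) and toy_run(1)
returns 1 (the extra reference keeps the bytes valid until they are written).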
Thank you for discussing, debugging and fixing along:
Prashanth Pai <ppai at redhat.com>
Raghavendra G <rgowdapp at redhat.com>
Rajesh Joseph <rjoseph at redhat.com>
Kotresh HR <khiremat at redhat.com>
Mohammed Rafi KC <rkavunga at redhat.com>
Soumya Koduri <skoduri at redhat.com>
> Reviewed-on: https://review.gluster.org/16613
> Reviewed-by: Prashanth Pai <ppai at redhat.com>
> Smoke: Gluster Build System <jenkins at build.gluster.org>
> Reviewed-by: soumya k <skoduri at redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275
BUG: 1422788
Signed-off-by: Poornima G <pgurusid at redhat.com>
Reviewed-on: https://review.gluster.org/16638
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Smoke: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Prashanth Pai <ppai at redhat.com>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>