[Bugs] [Bug 1422787] New: [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible
bugzilla at redhat.com
Thu Feb 16 09:19:02 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1422787
Bug ID: 1422787
Summary: [Replicate] "RPC call decoding failed" leading to IO
hang & mount inaccessible
Product: GlusterFS
Version: 3.9
Component: rpc
Keywords: Triaged
Severity: high
Assignee: bugs at gluster.org
Reporter: pgurusid at redhat.com
CC: amukherj at redhat.com, bugs at gluster.org,
ksandha at redhat.com, nchilaka at redhat.com,
rcyriac at redhat.com, rhinduja at redhat.com,
rhs-bugs at redhat.com, rjoseph at redhat.com,
skoduri at redhat.com
Depends On: 1421937, 1422363
Blocks: 1409135, 1416031 (glusterfs-3.10.0)
+++ This bug was initially created as a clone of Bug #1422363 +++
+++ This bug was initially created as a clone of Bug #1421937 +++
+++ This bug was initially created as a clone of Bug #1409135 +++
Description of problem:
RPC failed to decode a message on an NFS mount, leading to an IO hang and an
inaccessible mount. The volume was mounted over NFS and a sequential write with
the O_DIRECT flag was started.
Version-Release number of selected component (if applicable):
3.8.4-8
Logs are placed at
rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>
How reproducible:
Tried Once
Steps to Reproduce:
1. 4 servers and 4 clients, mounted 1:1 with gNFS
2. Daemonize FIO on the 4 clients and start a sequential write with the O_DIRECT flag
3. Shortly after IO started, the tool hung.
4. Multiple errors and warnings appear in nfs.log
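The report does not include the exact fio invocation, so the following is only
a rough sketch of what step 2 might look like. The job name, mount point, block
size, file size, IO engine and pid file path are all assumptions made for
illustration; only "sequential write", "direct" and "daemonize" come from the
steps above.

```python
import shlex


def build_fio_command(mount_point, size="1g", block_size="1m",
                      pidfile="/tmp/fio-seqwrite.pid"):
    """Return an fio argv list for a daemonized sequential O_DIRECT write.

    mount_point, size, block_size and pidfile are hypothetical values,
    not parameters taken from the bug report.
    """
    return [
        "fio",
        "--name=seqwrite",              # hypothetical job name
        f"--directory={mount_point}",   # NFS mount of the volume
        "--rw=write",                   # sequential write
        "--direct=1",                   # open files with O_DIRECT
        f"--bs={block_size}",
        f"--size={size}",
        "--ioengine=sync",
        f"--daemonize={pidfile}",       # run fio in the background
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(build_fio_command("/mnt/testvol")))
```

Printing the command instead of executing it keeps the sketch safe to run on a
machine that has no gNFS mount or fio installed.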
<SNIP>
[2016-12-29 10:27:29.424871] W [xdr-rpc.c:55:xdr_to_rpc_call] 0-rpc: failed to
decode call msg
[2016-12-29 10:27:29.425032] W [rpc-clnt.c:717:rpc_clnt_handle_cbk]
0-testvol-client-0: RPC call decoding failed
[2016-12-29 10:27:29.443275] E [rpc-clnt.c:365:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f9ccdb40682] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f9ccd90675e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f9ccd90686e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f9ccd907fd4] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x94)[0x7f9ccd908864] )))))
0-testvol-client-0: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30))
called at 2016-12-29 10:26:55.289465 (xid=0xa8ddd)
[2016-12-29 10:27:29.443308] E [MSGID: 114031]
[client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-testvol-client-0: remote
operation failed [Transport endpoint is not connected]
[2016-12-29 10:27:29.443571] E [rpc-clnt.c:365:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f9ccdb40682] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f9ccd90675e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f9ccd90686e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f9ccd907fd4] (-->
</SNIP>
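When triaging a log like the excerpt above, it can help to rejoin the wrapped
lines and count messages per source function. The sketch below assumes the
layout visible in the excerpt (timestamp, one-letter level, optional MSGID,
then file:line:function) and is not an official GlusterFS log parser; the
function names are made up for this example.

```python
import re
from collections import Counter

# A record starts with a bracketed timestamp such as "[2016-12-29 10:27:29.424871]".
TS_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} ")

LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]:]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<rest>.*)$"
)


def join_wrapped(lines):
    """Merge continuation lines (mail-wrapped text) into the previous record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TS_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def parse_record(record):
    """Return the named fields of one record, or None if it does not match."""
    match = LOG_LINE.match(record)
    return match.groupdict() if match else None


def count_by_function(records):
    """Count records per (level, function) pair, skipping unparsable ones."""
    parsed = (parse_record(r) for r in records)
    return Counter((p["level"], p["func"]) for p in parsed if p)
```

Feeding the lines of nfs.log through join_wrapped and then count_by_function
gives a quick view of which functions (for example xdr_to_rpc_call or
saved_frames_unwind) dominate the warnings and errors.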
Actual results:
The IO tool hung, and multiple errors and warnings appeared in the logs.
Expected results:
No IO hang should be observed.
Additional info:
[root at gqas004 ~]# gluster v info
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: e2212c28-f04a-4f08-9f17-b0fb74434bbf
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: ip1:/bricks/testvol_brick0
Brick2: ip2:/bricks/testvol_brick1
Brick3: ip3:/bricks/testvol_brick2
Brick4: ip4:/bricks/testvol_brick3
Options Reconfigured:
cluster.use-compound-fops: off
network.remote-dio: off
performance.strict-o-direct: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
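To compare this configuration with another setup, the volume info output above
can be split into general fields, bricks and reconfigured options. This is a
minimal sketch that assumes the "key: value" layout shown here; the helper
names are hypothetical and it does not call the gluster CLI itself.

```python
import re


def parse_volume_info(text):
    """Split volume info text into (info, bricks, options)."""
    info, bricks, options = {}, [], {}
    in_options = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so brick paths like host:/path survive.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Options Reconfigured":
            in_options = True
        elif in_options:
            options[key] = value
        elif key.startswith("Brick") and key[5:].isdigit():
            bricks.append(value)
        elif key != "Bricks":
            info[key] = value
    return info, bricks, options


def brick_layout(info):
    """Parse 'Number of Bricks: 2 x 2 = 4' into a tuple of three integers."""
    match = re.fullmatch(r"(\d+) x (\d+) = (\d+)", info.get("Number of Bricks", ""))
    return tuple(int(g) for g in match.groups()) if match else None
```

For the output above, options["performance.strict-o-direct"] would be "on" and
options["network.remote-dio"] would be "off", which are the settings most
relevant to the O_DIRECT workload in this report.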
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1409135
[Bug 1409135] [Replicate] "RPC call decoding failed" leading to IO hang &
mount inaccessible
https://bugzilla.redhat.com/show_bug.cgi?id=1416031
[Bug 1416031] GlusterFS 3.10 tracker
https://bugzilla.redhat.com/show_bug.cgi?id=1421937
[Bug 1421937] [Replicate] "RPC call decoding failed" leading to IO hang &
mount inaccessible
https://bugzilla.redhat.com/show_bug.cgi?id=1422363
[Bug 1422363] [Replicate] "RPC call decoding failed" leading to IO hang &
mount inaccessible