[Bugs] [Bug 1532842] New: Large directories in disperse volumes with rdma transport can't be accessed with ls
bugzilla at redhat.com
Tue Jan 9 21:42:42 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1532842
Bug ID: 1532842
Summary: Large directories in disperse volumes with rdma
transport can't be accessed with ls
Product: GlusterFS
Version: 3.13
Component: rdma
Severity: high
Assignee: bugs at gluster.org
Reporter: shane at axiomalaska.com
CC: bugs at gluster.org
Created attachment 1379248
--> https://bugzilla.redhat.com/attachment.cgi?id=1379248&action=edit
Script to replicate disperse rdma bug
Description of problem:
In disperse volumes with rdma transport, large directories (containing >= 617
files) can't be listed with `ls`. Attempts to do so result in a "Transport
endpoint is not connected" error, and the following log messages appear in the
mount log:
[2018-01-09 21:33:15.186370] W [MSGID: 103046]
[rdma.c:3604:gf_rdma_decode_header] 0-rpc-transport/rdma: received a msg of
type RDMA_ERROR
[2018-01-09 21:33:15.186411] W [MSGID: 103046]
[rdma.c:4057:gf_rdma_process_recv] 0-rpc-transport/rdma: peer
(10.4.1.60:49152), couldn't encode or decode the msg properly or write chunks
were not provided for replies that were bigger than RDMA_INLINE_THRESHOLD
(2048)
[2018-01-09 21:33:15.186435] W [MSGID: 114031]
[client-rpc-fops.c:2577:client3_3_readdirp_cbk] 0-erasure-client-0: remote
operation failed [Transport endpoint is not connected]
[2018-01-09 21:33:15.186503] W [fuse-bridge.c:2897:fuse_readdirp_cbk]
0-glusterfs-fuse: 74631173: READDIRP => -1 (Transport endpoint is not
connected)
Repeated attempts to ls the directory will cause different peers in the cluster
to be identified in the log message, indicating that the problem is not with a
misconfigured peer.
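To check which peers show up across repeated attempts, the mount log can be tallied with a short script. This is only a sketch: the function name count_error_peers is made up for illustration, and the pattern assumes the log entries look like the gf_rdma_process_recv warning quoted above.

```python
import re
from collections import Counter

# Matches the peer address in warnings such as:
#   [rdma.c:4057:gf_rdma_process_recv] 0-rpc-transport/rdma: peer (10.4.1.60:49152), ...
# \s+ is used between tokens so the pattern also tolerates wrapped lines.
PEER_RE = re.compile(r"gf_rdma_process_recv\]\s+\S+\s+peer\s+\(([^)]+)\)")


def count_error_peers(log_text):
    """Return a Counter mapping each peer address to its number of RDMA error warnings."""
    return Counter(PEER_RE.findall(log_text))
```

Reading the mount log into a string and passing it to count_error_peers should show more than one address if the errors really do move between peers.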
Files in the problem directories can be accessed directly as normal (ls, cat,
etc. work fine on full file paths within the large directories).
Changing the transport type of the disperse volume to tcp and restarting the
volume allows the problem directories to be accessed. The issue does not
occur with distributed volumes, only with disperse volumes.
Version-Release number of selected component (if applicable):
3.13.1
How reproducible:
Extremely.
Steps to Reproduce:
The general approach is outlined below. See the attached
gluster-disperse-rdma-bug.sh for a working script that reproduces the bug.
1. Create and start disperse volume with rdma transport
2. Mount disperse volume
3. Create directory in mounted disperse volume and create 616 empty files
4. Verify that the directory can be accessed with ls
5. Create the 617th file in the test directory
6. Verify that the directory can no longer be accessed with ls
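Steps 3 to 6 can also be approximated with a small probe that adds empty files one at a time and lists the directory after each one. This is a rough sketch, not the attached script: the function names are made up for illustration, the directory is assumed to sit on an already mounted disperse volume, and os.listdir is assumed to hit the same READDIRP path as ls.

```python
import errno
import os


def can_list(directory):
    """Return (True, None) if the directory can be listed, else (False, errno code)."""
    try:
        os.listdir(directory)
        return True, None
    except OSError as exc:
        return False, exc.errno


def find_listing_limit(directory, max_files=700):
    """Create empty files one by one in `directory`.

    Returns the file count at which listing first fails, or None if
    listing still works after max_files files have been created.
    """
    os.makedirs(directory, exist_ok=True)
    for count in range(1, max_files + 1):
        # Creating a file by its full path is expected to keep working,
        # since direct access to files was not affected in the report.
        with open(os.path.join(directory, f"file{count:04d}"), "w"):
            pass
        ok, code = can_list(directory)
        if not ok:
            # "Transport endpoint is not connected" corresponds to ENOTCONN.
            name = errno.errorcode.get(code, "unknown")
            print(f"listing failed at {count} files: {name}")
            return count
    return None
```

Calling find_listing_limit on a new directory inside the mount point would be expected to return 617 on an affected volume, and None on a volume using tcp transport.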
Actual results:
Large directory cannot be accessed with ls
Expected results:
Large directory should be accessible with ls
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.