[Bugs] [Bug 1686399] New: listing a file while writing to it causes deadlock

bugzilla@redhat.com
Thu Mar 7 11:34:23 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1686399

            Bug ID: 1686399
           Summary: listing a file while writing to it causes deadlock
           Product: GlusterFS
           Version: 6
            Status: NEW
         Component: core
          Assignee: bugs@gluster.org
          Reporter: rgowdapp@redhat.com
                CC: bugs@gluster.org
        Depends On: 1674412
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1674412 +++

Description of problem:

The following test case was provided by Nithya.
Create a pure replicate volume and enable the following options:
Volume Name: xvol
Type: Replicate
Volume ID: 095d6083-ea82-4ec9-a3a9-498fbd5f8dbe
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.122.7:/bricks/brick1/xvol-1
Brick2: 192.168.122.7:/bricks/brick1/xvol-2
Brick3: 192.168.122.7:/bricks/brick1/xvol-3
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
performance.parallel-readdir: on
performance.readdir-ahead: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


Fuse mount using:
mount -t glusterfs -o lru-limit=500 -s 192.168.122.7:/xvol /mnt/g1
mkdir /mnt/g1/dirdd

From terminal 1:
cd /mnt/g1/dirdd
while (true); do ls -lR dirdd; done

From terminal 2:
while true; do dd if=/dev/urandom of=/mnt/g1/dirdd/1G.file bs=1M count=1; rm -f
/mnt/g1/dirdd/1G.file; done

On running this test, both dd and ls hang after some time.


--- Additional comment from Raghavendra G on 2019-02-11 10:01:41 UTC ---

(gdb) thr 8
[Switching to thread 8 (Thread 0x7f28072d1700 (LWP 26397))]
#0  0x00007f2813a404cd in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f2813a404cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f2813a3bdcb in _L_lock_812 () from /lib64/libpthread.so.0
#2  0x00007f2813a3bc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f2805e3122f in rda_inode_ctx_get_iatt (inode=0x7f27ec0010b8,
this=0x7f2800012560, attr=0x7f28072d0700) at readdir-ahead.c:286
#4  0x00007f2805e3134d in __rda_fill_readdirp (ctx=0x7f27f800f290,
request_size=<optimized out>, entries=0x7f28072d0890, this=0x7f2800012560) at
readdir-ahead.c:326
#5  __rda_serve_readdirp (this=this@entry=0x7f2800012560,
ctx=ctx@entry=0x7f27f800f290, size=size@entry=4096,
entries=entries@entry=0x7f28072d0890, op_errno=op_errno@entry=0x7f28072d085c)
at readdir-ahead.c:353
#6  0x00007f2805e32732 in rda_fill_fd_cbk (frame=0x7f27f801c1e8,
cookie=<optimized out>, this=0x7f2800012560, op_ret=3, op_errno=2,
entries=<optimized out>, xdata=0x0) at readdir-ahead.c:581
#7  0x00007f2806097447 in client4_0_readdirp_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f27f800f498) at
client-rpc-fops_v2.c:2339
#8  0x00007f28149a29d1 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7f2800051120, pollin=pollin@entry=0x7f280006a180) at
rpc-clnt.c:755
#9  0x00007f28149a2d37 in rpc_clnt_notify (trans=0x7f28000513e0,
mydata=0x7f2800051150, event=<optimized out>, data=0x7f280006a180) at
rpc-clnt.c:922
#10 0x00007f281499f5e3 in rpc_transport_notify (this=this@entry=0x7f28000513e0,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f280006a180)
at rpc-transport.c:542
#11 0x00007f2808d88f77 in socket_event_poll_in (notify_handled=true,
this=0x7f28000513e0) at socket.c:2522
#12 socket_event_handler (fd=<optimized out>, idx=<optimized out>,
gen=<optimized out>, data=0x7f28000513e0, poll_in=<optimized out>,
poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:2924
#13 0x00007f2814c5a926 in event_dispatch_epoll_handler (event=0x7f28072d0e80,
event_pool=0x90d560) at event-epoll.c:648
#14 event_dispatch_epoll_worker (data=0x96f1e0) at event-epoll.c:762
#15 0x00007f2813a39dd5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f2813302b3d in clone () from /lib64/libc.so.6
[Switching to thread 7 (Thread 0x7f2806ad0700 (LWP 26398))]
#0  0x00007f2813a404cd in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f2813a404cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f2813a3bdcb in _L_lock_812 () from /lib64/libpthread.so.0
#2  0x00007f2813a3bc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f2805e2cd85 in rda_mark_inode_dirty (this=this@entry=0x7f2800012560,
inode=0x7f27ec009da8) at readdir-ahead.c:234
#4  0x00007f2805e2f3cc in rda_writev_cbk (frame=0x7f27f800ef48,
cookie=<optimized out>, this=0x7f2800012560, op_ret=131072, op_errno=0,
prebuf=0x7f2806acf870, postbuf=0x7f2806acf910, xdata=0x0)
    at readdir-ahead.c:769
#5  0x00007f2806094064 in client4_0_writev_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f27f801a7f8) at
client-rpc-fops_v2.c:685
#6  0x00007f28149a29d1 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7f2800051120, pollin=pollin@entry=0x7f27f8008320) at
rpc-clnt.c:755
#7  0x00007f28149a2d37 in rpc_clnt_notify (trans=0x7f28000513e0,
mydata=0x7f2800051150, event=<optimized out>, data=0x7f27f8008320) at
rpc-clnt.c:922
#8  0x00007f281499f5e3 in rpc_transport_notify (this=this@entry=0x7f28000513e0,
event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f27f8008320)
at rpc-transport.c:542
#9  0x00007f2808d88f77 in socket_event_poll_in (notify_handled=true,
this=0x7f28000513e0) at socket.c:2522
#10 socket_event_handler (fd=<optimized out>, idx=<optimized out>,
gen=<optimized out>, data=0x7f28000513e0, poll_in=<optimized out>,
poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:2924
#11 0x00007f2814c5a926 in event_dispatch_epoll_handler (event=0x7f2806acfe80,
event_pool=0x90d560) at event-epoll.c:648
#12 event_dispatch_epoll_worker (data=0x96f4b0) at event-epoll.c:762
#13 0x00007f2813a39dd5 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f2813302b3d in clone () from /lib64/libc.so.6


In the writev and readdirp code paths, the inode and fd-ctx locks are acquired
in opposite orders. Each thread ends up holding one lock while waiting for the
other, a classic lock-order inversion, causing a deadlock.

--- Additional comment from Worker Ant on 2019-03-07 11:24:16 UTC ---

REVIEW: https://review.gluster.org/22321 (performance/readdir-ahead: fix
deadlock) posted (#1) for review on master by Raghavendra G


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1674412
[Bug 1674412] listing a file while writing to it causes deadlock