[Bugs] [Bug 1262249] New: Fuse mount crashes with quick read enabled

bugzilla at redhat.com
Fri Sep 11 09:19:40 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1262249

            Bug ID: 1262249
           Summary: Fuse mount crashes with quick read enabled
           Product: GlusterFS
           Version: 3.7.4
         Component: quick-read
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: tribioli at arcetri.inaf.it
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
glusterfs crashes and the volume must be remounted


Version-Release number of selected component (if applicable):
glusterfs-fuse-3.7.4-2.el7.x86_64


How reproducible:
It happens randomly but quite frequently under medium load. 


Steps to Reproduce:
1. Create a two-server replicated volume with 3 bricks on each server
2. Mount the volume with FUSE
3. Set performance.quick-read on (see the command sketch below)
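
For reference, the steps above correspond roughly to the following commands
(hostnames and brick paths are the ones from the volume info further below;
exact values are illustrative, not a verified reproducer):

gluster volume create home_gfs replica 2 \
    castore:/glusterfs/home_gfs/brick1 polluce:/glusterfs/home_gfs/brick1 \
    castore:/glusterfs/home_gfs/brick2 polluce:/glusterfs/home_gfs/brick2 \
    castore:/glusterfs/home_gfs/brick3 polluce:/glusterfs/home_gfs/brick3
gluster volume start home_gfs
gluster volume set home_gfs performance.quick-read on
mount -t glusterfs castore:/home_gfs /export/home/public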

Actual results:
The FUSE mount process crashes on the server under load. It still works on the
other server.

Expected results:
glusterfs should not crash

Additional info:

(gdb) bt
#0  0x00007f44586825f6 in __memcpy_ssse3_back () from /lib64/libc.so.6
#1  0x00007f4447563bc4 in memcpy (__len=<optimized out>,
    __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/bits/string3.h:51
#2  qr_content_extract (xdata=xdata at entry=0x7f445a163774) at quick-read.c:278
#3  0x00007f4447563f94 in qr_lookup_cbk (frame=0x7f44579942c4,
    cookie=<optimized out>, this=0x7f4448016320, op_ret=0, op_errno=117,
    inode_ret=0x7f4444afd434, buf=0x7f444c0628f0, xdata=0x7f445a163774,
    postparent=0x7f444c062b20) at quick-read.c:422
#4  0x00007f444777095c in ioc_lookup_cbk (frame=0x7f44579a1dcc,
    cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,
    op_errno=<optimized out>, inode=0x7f4444afd434, stbuf=0x7f444c0628f0,
    xdata=0x7f445a163774, postparent=0x7f444c062b20) at io-cache.c:260
#5  0x00007f4447dc4f7f in dht_discover_complete (
    this=this at entry=0x7f4448011220,
    discover_frame=discover_frame at entry=0x7f44579906f8) at dht-common.c:304
#6  0x00007f4447dc563a in dht_discover_cbk (frame=0x7f44579906f8,
    cookie=0x7f4457990fb4, this=0x7f4448011220, op_ret=<optimized out>,
    op_errno=117, inode=0x7f4444afd434, stbuf=0x7f4439b0c198,
    xattr=0x7f445a163774, postparent=0x7f4439b0c208) at dht-common.c:439
#7  0x00007f444c1a2bb7 in afr_discover_done (this=<optimized out>,
    frame=0x7f4457990fb4) at afr-common.c:2114
#8  afr_discover_cbk (frame=0x7f4457990fb4, cookie=<optimized out>,
    this=<optimized out>, op_ret=<optimized out>, op_errno=<optimized out>,
    inode=<optimized out>, buf=0x7f444ce08930, xdata=0x7f445a162e28,
    postparent=0x7f444ce089a0) at afr-common.c:2149
#9  0x00007f444c3f1437 in client3_3_lookup_cbk (req=<optimized out>,
    iov=<optimized out>, count=<optimized out>, myframe=0x7f4457993e10)
    at client-rpc-fops.c:2978
#10 0x00007f4459c4eb10 in rpc_clnt_handle_reply (
    clnt=clnt at entry=0x7f44480fd310, pollin=pollin at entry=0x7f4448a51fd0)
    at rpc-clnt.c:766
#11 0x00007f4459c4edcf in rpc_clnt_notify (trans=<optimized out>,
    mydata=0x7f44480fd340, event=<optimized out>, data=0x7f4448a51fd0)
    at rpc-clnt.c:907
#12 0x00007f4459c4a903 in rpc_transport_notify (
    this=this at entry=0x7f444810d010,
    event=event at entry=RPC_TRANSPORT_MSG_RECEIVED,
    data=data at entry=0x7f4448a51fd0) at rpc-transport.c:544
#13 0x00007f444e8eb506 in socket_event_poll_in (this=this at entry=0x7f444810d010)
    at socket.c:2236
#14 0x00007f444e8ee3f4 in socket_event_handler (fd=fd at entry=17,
    idx=idx at entry=6, data=0x7f444810d010, poll_in=1, poll_out=0, poll_err=0)
    at socket.c:2349
#15 0x00007f4459ee17ba in event_dispatch_epoll_handler (event=0x7f444ce08e80,
    event_pool=0x7f445abf2330) at event-epoll.c:575
#16 event_dispatch_epoll_worker (data=0x7f445ac3aeb0) at event-epoll.c:678
#17 0x00007f4458ce8df5 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f445862f1ad in clone () from /lib64/libc.so.6
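
Frame #2 places the crash inside the memcpy in qr_content_extract()
(quick-read.c:278), which appears to copy the small-file content returned by
the bricks in the lookup xdata dictionary. A minimal sketch of that pattern
(simplified and hypothetical, not the verbatim quick-read.c code; header names
follow the libglusterfs source layout) would look like:

#include <string.h>
#include "glusterfs.h"   /* GF_CONTENT_KEY */
#include "dict.h"
#include "mem-pool.h"
#include "mem-types.h"

static void *
qr_content_extract (dict_t *xdata)
{
        data_t *data    = NULL;
        void   *content = NULL;

        /* The bricks can return small-file content in the lookup xdata
         * dict under GF_CONTENT_KEY; quick-read caches it. */
        data = dict_get (xdata, GF_CONTENT_KEY);
        if (!data)
                return NULL;

        content = GF_CALLOC (1, data->len, gf_common_mt_char);
        if (!content)
                return NULL;

        /* Frames #0/#2 of the backtrace: if data->data or data->len is
         * stale or inconsistent here, this memcpy faults. */
        memcpy (content, data->data, data->len);

        return content;
}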

----

Volume Name: home_gfs
Type: Distributed-Replicate
Volume ID: fa5aa52a-8105-47f1-b1d6-f10db8a11330
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: castore:/glusterfs/home_gfs/brick1
Brick2: polluce:/glusterfs/home_gfs/brick1
Brick3: castore:/glusterfs/home_gfs/brick2
Brick4: polluce:/glusterfs/home_gfs/brick2
Brick5: castore:/glusterfs/home_gfs/brick3
Brick6: polluce:/glusterfs/home_gfs/brick3
Options Reconfigured:
performance.quick-read: on
nfs.ports-insecure: on
diagnostics.client-log-level: ERROR
diagnostics.brick-log-level: ERROR
cluster.self-heal-daemon: enable
nfs.disable: on
server.allow-insecure: on
client.bind-insecure: on
network.ping-timeout: 5

The volume is mounted as follows:

castore:/home_gfs on /export/home/public type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
