[Bugs] [Bug 1457058] New: glusterfs client crash on io-cache.so( __ioc_page_wakeup+0x44)

bugzilla at redhat.com bugzilla at redhat.com
Wed May 31 03:35:57 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1457058

            Bug ID: 1457058
           Summary: glusterfs client crash on
                    io-cache.so(__ioc_page_wakeup+0x44)
           Product: GlusterFS
           Version: 3.11
         Component: io-cache
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: nbalacha at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    csaba at redhat.com, moagrawa at redhat.com,
                    nbalacha at redhat.com, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, rnalakka at redhat.com,
                    sbhaloth at redhat.com
        Depends On: 1435357, 1456385
            Blocks: 1457054



+++ This bug was initially created as a clone of Bug #1456385 +++

+++ This bug was initially created as a clone of Bug #1435357 +++

Description of problem:

The GlusterFS fuse client was crashing with the below backtrace.

~~~~~~~~~
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2017-03-22 22:31:32
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.9
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7f45f3c3e1c2]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f45f3c6396d]
/lib64/libc.so.6(+0x35670)[0x7f45f232a670]
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-cache.so(__ioc_page_wakeup+0x44)[0x7f45e525e5b4]
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-cache.so(ioc_inode_wakeup+0x164)[0x7f45e525ffa4]
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-cache.so(ioc_cache_validate_cbk+0x31b)[0x7f45e5257b2b]
/usr/lib64/glusterfs/3.7.9/xlator/performance/read-ahead.so(ra_attr_cbk+0x11a)[0x7f45e566edfa]
/lib64/libglusterfs.so.0(default_fstat_cbk+0x11a)[0x7f45f3c47ada]
/usr/lib64/glusterfs/3.7.9/xlator/cluster/distribute.so(dht_file_attr_cbk+0x1c5)[0x7f45e5aea505]
/usr/lib64/glusterfs/3.7.9/xlator/cluster/replicate.so(afr_fstat_cbk+0x131)[0x7f45e5d27de1]
/usr/lib64/glusterfs/3.7.9/xlator/protocol/client.so(client3_3_fstat_cbk+0x44e)[0x7f45e5fa7f8e]
/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7f45f3a0c990]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1bf)[0x7f45f3a0cc4f]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f45f3a08793]
/usr/lib64/glusterfs/3.7.9/rpc-transport/socket.so(+0x69b4)[0x7f45e86a19b4]
/usr/lib64/glusterfs/3.7.9/rpc-transport/socket.so(+0x95f4)[0x7f45e86a45f4]
/lib64/libglusterfs.so.0(+0x94c0a)[0x7f45f3cacc0a]
/lib64/libpthread.so.0(+0x7dc5)[0x7f45f2aa6dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f45f23ebced]
~~~~~~~~~


* There were a lot of the below messages in the client log just before the
crash.

~~~~~~~~~~
[2017-03-23 08:41:29.936098] W [MSGID: 108027]
[afr-common.c:2250:afr_discover_done] 4-vol-replicate-2: no read subvols for /
The message "W [MSGID: 108027] [afr-common.c:2250:afr_discover_done]
4-vol-replicate-2: no read subvols for /" repeated 90 times between [2017-03-23
08:41:29.936098] and [2017-03-23 08:43:28.210919]
~~~~~~~~~~

* These messages were caused by a metadata split-brain on some directories,
including the root directory "/".




Actual results:

The glusterfs fuse client crashed and the mount point started returning
"Transport endpoint is not connected".


Expected results:

The glusterfs fuse client should not crash.


Additional info:

* An application core dump was collected. It gave the below backtrace:

(gdb) bt
#0  0x00007f45e525e5b4 in __ioc_page_wakeup (page=0x7f43246e1500,
page@entry=0x7f45f17d0d64, op_errno=0) at page.c:960
#1  0x00007f45e525ffa4 in ioc_inode_wakeup (frame=0x7f45e00396c8,
frame@entry=0x7f45f17d0d64, ioc_inode=ioc_inode@entry=0x7f45e0e62160,
stbuf=stbuf@entry=0x7f45e69cca10) at ioc-inode.c:119
#2  0x00007f45e5257b2b in ioc_cache_validate_cbk (frame=0x7f45f17d0d64,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=<optimized
out>, stbuf=<optimized out>, xdata=0x0)
    at io-cache.c:402
#3  0x00007f45e566edfa in ra_attr_cbk (frame=0x7f45f17e22e0, cookie=<optimized
out>, this=<optimized out>, op_ret=0, op_errno=0, buf=0x7f45e69cca10,
xdata=0x0) at read-ahead.c:721
#4  0x00007f45f3c47ada in default_fstat_cbk (frame=0x7f45f17b7188,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
buf=0x7f45e69cca10, xdata=0x0) at defaults.c:1053
#5  0x00007f45e5aea505 in dht_file_attr_cbk (frame=0x7f45f17ba090,
cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,
op_errno=<optimized out>, stbuf=<optimized out>, 
    xdata=0x0) at dht-inode-read.c:214
#6  0x00007f45e5d27de1 in afr_fstat_cbk (frame=0x7f45f17562d8,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
buf=0x7f45e69cca10, xdata=0x0) at afr-inode-read.c:291
#7  0x00007f45e5fa7f8e in client3_3_fstat_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f45f17e1c28) at
client-rpc-fops.c:1574
#8  0x00007f45f3a0c990 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7f45e03547c0, pollin=pollin@entry=0x7f45e1033480) at
rpc-clnt.c:764
#9  0x00007f45f3a0cc4f in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f45e03547f0, event=<optimized out>, data=0x7f45e1033480) at
rpc-clnt.c:905
#10 0x00007f45f3a08793 in rpc_transport_notify (this=<optimized out>,
event=<optimized out>, data=<optimized out>) at rpc-transport.c:546
#11 0x00007f45e86a19b4 in socket_event_poll_in (this=0x7f45e0364440) at
socket.c:2355
#12 0x00007f45e86a45f4 in socket_event_handler (fd=<optimized out>, idx=8,
data=0x7f45e0364440, poll_in=1, poll_out=0, poll_err=0) at socket.c:2469
#13 0x00007f45f3cacc0a in event_dispatch_epoll_handler (event=0x7f45e69cce80,
event_pool=0x7f45f507c350) at event-epoll.c:570
#14 event_dispatch_epoll_worker (data=0x7f45f50d2ff0) at event-epoll.c:678
#15 0x00007f45f2aa6dc5 in start_thread (arg=0x7f45e69cd700) at
pthread_create.c:308
#16 0x00007f45f23ebced in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:113


* It seems like the glusterfs fuse client was crashing in the below function:

 __ioc_page_wakeup (ioc_page_t *page, int32_t op_errno)
    948 {
    949         ioc_waitq_t  *waitq = NULL, *trav = NULL;
    950         call_frame_t *frame = NULL;
    951         int32_t       ret   = -1;
    952 
    953         GF_VALIDATE_OR_GOTO ("io-cache", page, out);
    954 
    955         waitq = page->waitq;
    956         page->waitq = NULL;
    957 
    958         page->ready = 1;
    959 
    960         gf_msg_trace (page->inode->table->xl->name, 0,
    961                       "page is %p && waitq = %p", page, waitq);
    962 
    963         for (trav = waitq; trav; trav = trav->next) {
    964                 frame = trav->data;
    965                 ret = __ioc_frame_fill (page, frame, trav->pending_offset,
    966                                         trav->pending_size, op_errno);
    967                 if (ret == -1) {
    968                         break;
    969                 }                                        
    970         }

--- Additional comment from Nithya Balachandran on 2017-05-29 06:09:42 EDT ---

Core was generated by `/usr/sbin/glusterfs
--volfile-server=redhatstorage.web.skynas.local --volfile-i'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f45e525e5b4 in __ioc_page_wakeup (page=0x7f43246e1500,
page@entry=0x7f45f17d0d64, op_errno=0) at page.c:960
960            gf_msg_trace (page->inode->table->xl->name, 0,
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-4.el7.x86_64
(gdb) bt
#0  0x00007f45e525e5b4 in __ioc_page_wakeup (page=0x7f43246e1500,
page@entry=0x7f45f17d0d64, op_errno=0) at page.c:960
#1  0x00007f45e525ffa4 in ioc_inode_wakeup (frame=0x7f45e00396c8,
frame@entry=0x7f45f17d0d64, ioc_inode=ioc_inode@entry=0x7f45e0e62160, 
    stbuf=stbuf@entry=0x7f45e69cca10) at ioc-inode.c:119
#2  0x00007f45e5257b2b in ioc_cache_validate_cbk (frame=0x7f45f17d0d64,
cookie=<optimized out>, this=<optimized out>, op_ret=0, 
    op_errno=<optimized out>, stbuf=<optimized out>, xdata=0x0) at
io-cache.c:402
#3  0x00007f45e566edfa in ra_attr_cbk (frame=0x7f45f17e22e0, cookie=<optimized
out>, this=<optimized out>, op_ret=0, op_errno=0, buf=0x7f45e69cca10, 
    xdata=0x0) at read-ahead.c:721
#4  0x00007f45f3c47ada in default_fstat_cbk (frame=0x7f45f17b7188,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, 
    buf=0x7f45e69cca10, xdata=0x0) at defaults.c:1053
#5  0x00007f45e5aea505 in dht_file_attr_cbk (frame=0x7f45f17ba090,
cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>, 
    op_errno=<optimized out>, stbuf=<optimized out>, xdata=0x0) at
dht-inode-read.c:214
#6  0x00007f45e5d27de1 in afr_fstat_cbk (frame=0x7f45f17562d8,
cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0,
buf=0x7f45e69cca10, 
    xdata=0x0) at afr-inode-read.c:291
#7  0x00007f45e5fa7f8e in client3_3_fstat_cbk (req=<optimized out>,
iov=<optimized out>, count=<optimized out>, myframe=0x7f45f17e1c28)
    at client-rpc-fops.c:1574
#8  0x00007f45f3a0c990 in rpc_clnt_handle_reply
(clnt=clnt@entry=0x7f45e03547c0, pollin=pollin@entry=0x7f45e1033480) at
rpc-clnt.c:764
#9  0x00007f45f3a0cc4f in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7f45e03547f0, event=<optimized out>, data=0x7f45e1033480) at
rpc-clnt.c:905
#10 0x00007f45f3a08793 in rpc_transport_notify (this=<optimized out>,
event=<optimized out>, data=<optimized out>) at rpc-transport.c:546
#11 0x00007f45e86a19b4 in socket_event_poll_in (this=0x7f45e0364440) at
socket.c:2355
#12 0x00007f45e86a45f4 in socket_event_handler (fd=<optimized out>, idx=8,
data=0x7f45e0364440, poll_in=1, poll_out=0, poll_err=0) at socket.c:2469
#13 0x00007f45f3cacc0a in event_dispatch_epoll_handler (event=0x7f45e69cce80,
event_pool=0x7f45f507c350) at event-epoll.c:570
#14 event_dispatch_epoll_worker (data=0x7f45f50d2ff0) at event-epoll.c:678
#15 0x00007f45f2aa6dc5 in start_thread (arg=0x7f45e69cd700) at
pthread_create.c:308
#16 0x00007f45f23ebced in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) p *page
$1 = {page_lru = {next = 0xbabebabe, prev = 0xcafecafe}, inode = 0x0, priority
= 0x0, dirty = 0 '\000', ready = 1 '\001', vector = 0x0, count = 1, 
  offset = 356384768, size = 131072, waitq = 0x0, iobref = 0x7f45d235fe40,
page_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
      __kind = -1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19
times>, __align = 0}, op_errno = 0, stale = 1 '\001'}


This segfaults in gf_msg_trace:

(gdb) p *page->inode
Cannot access memory at address 0x0



(gdb) f 1
#1  0x00007f45e525ffa4 in ioc_inode_wakeup (frame=0x7f45e00396c8,
frame@entry=0x7f45f17d0d64, ioc_inode=ioc_inode@entry=0x7f45e0e62160, 
    stbuf=stbuf@entry=0x7f45e69cca10) at ioc-inode.c:119
119                                            page_waitq =
(gdb) l
114                    if (waiter_page) {
115                            if (cache_still_valid) {
116                                    /* cache valid, wake up page */
117                                    ioc_inode_lock (ioc_inode);
118                                    {
119                                            page_waitq =
120                                                    __ioc_page_wakeup (waiter_page,
121                                                                       waiter_page->op_errno);
122                                    }
123                                    ioc_inode_unlock (ioc_inode);
(gdb) p waiter_page
$2 = (ioc_page_t *) 0x7f45f17d0d64
(gdb) p *waiter_page
$3 = {page_lru = {next = 0x7f45f15a95ec, prev = 0x0}, inode = 0x7f45f15a9c54,
priority = 0x7f45f17e22f0, dirty = -56 '\310', ready = -106 '\226', 
  vector = 0x7f45e0029c20, count = 0, offset = 4294967296, size = 0, waitq =
0x0, iobref = 0x0, page_lock = {__data = {__lock = 0, __count = 0, 
      __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev =
0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, 
  op_errno = 0, stale = 0 '\000'}


(gdb) f 0
(gdb) p page->inode
$49 = (struct ioc_inode *) 0x0


(gdb) p waitq
$50 = (ioc_waitq_t *) 0x0

No symbol "inode" in current context.
(gdb) l
955            waitq = page->waitq;
956            page->waitq = NULL;
957    
958            page->ready = 1;
959    
960            gf_msg_trace (page->inode->table->xl->name, 0,
961                          "page is %p && waitq = %p", page, waitq);
962    
963            for (trav = waitq; trav; trav = trav->next) {
964                    frame = trav->data;
(gdb) p page
$51 = (ioc_page_t *) 0x7f43246e1500

(gdb) p *page
$52 = {page_lru = {next = 0xbabebabe, prev = 0xcafecafe}, inode = 0x0, priority
= 0x0, 
  dirty = 0 '\000', ready = 1 '\001', vector = 0x0, count = 1, offset =
356384768, size = 131072, 
  waitq = 0x0, iobref = 0x7f45d235fe40, page_lock = {__data = {__lock = 0,
__count = 0, 
      __owner = 0, __nusers = 0, __kind = -1, __spins = 0, __list = {__prev =
0x0, __next = 0x0}}, 
    __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19
times>, 
    __align = 0}, op_errno = 0, stale = 1 '\001'}



This page has already been freed.
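
For what it's worth, the 0xbabebabe / 0xcafecafe values in page_lru above
look like the poison values glusterfs's list_del() writes when it unlinks a
list element, which fits the page having been destroyed by another thread
(ioc_prune, per the thread 2 stack in the next comment). Below is a minimal,
hypothetical standalone sketch of the suspected pattern, not the actual
io-cache code; the names (wakeup_thread, prune_thread, shared_page) are made
up. One thread snapshots the wait queue and remembers the page pointer,
drops the inode lock, and dereferences the page later; the other thread sees
the empty waitq, decides the page is idle, and frees it.

/*
 * Hypothetical, stripped-down reproduction of the suspected race pattern.
 * These are NOT the real io-cache types or functions.
 * Compile with: cc -pthread race-sketch.c -o race-sketch
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct waiter { struct waiter *next; };

struct page {
        struct waiter *waitq;  /* frames waiting on this page                */
        char          *name;   /* stands in for page->inode->table->xl->name */
};

static pthread_mutex_t ioc_inode_lock = PTHREAD_MUTEX_INITIALIZER;
static struct page    *shared_page;

/* Plays the role of ioc_inode_wakeup(): snapshot the page and its wait
 * queue under the lock, drop the lock to do validation work, then come
 * back and dereference the remembered page. */
static void *
wakeup_thread (void *arg)
{
        struct page   *page     = NULL;
        struct waiter *snapshot = NULL;

        (void) arg;

        pthread_mutex_lock (&ioc_inode_lock);
        page        = shared_page;
        snapshot    = page->waitq;
        page->waitq = NULL;                      /* page now looks idle   */
        pthread_mutex_unlock (&ioc_inode_lock);  /* <-- window opens here */

        /* ... cache validation happens here without the lock ... */

        pthread_mutex_lock (&ioc_inode_lock);
        /* If the pruner ran inside the window, 'page' has been freed and
         * this read is a use-after-free, the analogue of the NULL
         * page->inode dereference seen in __ioc_page_wakeup(). */
        printf ("waking %s, waiters=%p\n", page->name, (void *) snapshot);
        pthread_mutex_unlock (&ioc_inode_lock);

        free (snapshot);
        return NULL;
}

/* Plays the role of ioc_prune(): a page whose waitq is empty looks safe
 * to destroy. */
static void *
prune_thread (void *arg)
{
        (void) arg;

        pthread_mutex_lock (&ioc_inode_lock);
        if (shared_page && shared_page->waitq == NULL) {
                free (shared_page->name);
                free (shared_page);
                shared_page = NULL;
        }
        pthread_mutex_unlock (&ioc_inode_lock);
        return NULL;
}

int
main (void)
{
        pthread_t t1, t2;

        shared_page        = calloc (1, sizeof (*shared_page));
        shared_page->name  = strdup ("test-page");
        shared_page->waitq = calloc (1, sizeof (struct waiter));

        pthread_create (&t1, NULL, wakeup_thread, NULL);
        pthread_create (&t2, NULL, prune_thread, NULL);
        pthread_join (t1, NULL);
        pthread_join (t2, NULL);
        return 0;
}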

--- Additional comment from Worker Ant on 2017-05-29 06:15:01 EDT ---

REVIEW: https://review.gluster.org/17410 (perf/ioc: Fix race causing crash when
accessing freed page) posted (#1) for review on master by N Balachandran
(nbalacha at redhat.com)

--- Additional comment from Nithya Balachandran on 2017-05-29 06:19:53 EDT ---

From Mohit:

(gdb) f 0
#0  0x00007f45e525e5b4 in __ioc_page_wakeup (page=0x7f43246e1500,
page@entry=0x7f45f17d0d64, op_errno=0) at page.c:960
960            gf_msg_trace (page->inode->table->xl->name, 0,
(gdb) p page->inode
$1 = (struct ioc_inode *) 0x0
(gdb) f 1
#1  0x00007f45e525ffa4 in ioc_inode_wakeup (frame=0x7f45e00396c8,
frame@entry=0x7f45f17d0d64, 
    ioc_inode=ioc_inode@entry=0x7f45e0e62160, stbuf=stbuf@entry=0x7f45e69cca10)
at ioc-inode.c:119
119                                            page_waitq =
(gdb) p waiter
$2 = (ioc_waitq_t *) 0x7f4325f60140
(gdb) p ioc_inode
$3 = (ioc_inode_t *) 0x7f45e0e62160
(gdb) p *ioc_inode
$4 = {table = 0x7f45e0038f10, ia_size = 1037687610, cache = {page_table =
0x7f4324fe6940, page_lru = {next = 0x7f45d124aad0, 
      prev = 0x7f45d124aad0}, mtime = 1480448949, mtime_nsec = 697732971, tv =
{tv_sec = 1490221892, tv_usec = 31395}}, 
  inode_list = {next = 0x7f45d805d5b8, prev = 0x7f45dbfec4b8}, inode_lru =
{next = 0x7f45e0039020, prev = 0x7f45e0774068}, 
  waitq = 0x0, inode_lock = {__data = {__lock = 2, __count = 0, __owner =
11655, __nusers = 1, __kind = 0, __spins = 0, 
      __list = {__prev = 0x0, __next = 0x0}}, 
    __size = "\002\000\000\000\000\000\000\000\207-\000\000\001", '\000'
<repeats 26 times>, __align = 2}, weight = 1, 
  inode = 0x7f45d78719a8}
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f45d7498700 (LWP 11659))]
#0  __lll_lock_wait () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135    2:    movl    %edx, %eax
(gdb) f 3
#3  0x00007f45e525d181 in ioc_prune (table=0x7f45e0038f10) at page.c:221
221                                    ioc_inode_lock (curr);
(gdb) p curr
$5 = (ioc_inode_t *) 0x7f45e0e62160
(gdb) p index
$6 = 1


>>>>>>>>>>>>>>>>>>>>>>>>

IMO, in ioc_inode_wakeup we are not fetching the waiter queue from the
ioc_inode in the correct way; I think we should fetch the waiter page inside
the while loop at the time of calling __ioc_page_wakeup (see the sketch
below).

Regards
Mohit Agrawal
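
A rough sketch of what that suggestion could look like, written against the
hypothetical wakeup_thread()/prune_thread() example earlier in this bug
rather than the real io-cache code (the actual change is in
https://review.gluster.org/17410): take one waiter at a time from the wait
queue while holding the inode lock, and re-check under that lock that the
page still exists, so a page freed by the pruner is never dereferenced.

/* Hypothetical rework of wakeup_thread() from the earlier sketch; still
 * not the real patch. Each iteration re-validates the page under the
 * lock and detaches exactly one waiter before dropping the lock. */
static void *
wakeup_thread_fixed (void *arg)
{
        struct waiter *w = NULL;

        (void) arg;

        for (;;) {
                pthread_mutex_lock (&ioc_inode_lock);
                if (shared_page == NULL || shared_page->waitq == NULL) {
                        /* page already pruned, or no waiters left */
                        pthread_mutex_unlock (&ioc_inode_lock);
                        break;
                }
                w                  = shared_page->waitq;
                shared_page->waitq = w->next;
                printf ("waking one waiter of %s\n", shared_page->name);
                pthread_mutex_unlock (&ioc_inode_lock);

                free (w);  /* per-waiter work happens outside the lock */
        }
        return NULL;
}

The essential point is that every dereference of the page happens while the
same lock the pruner takes is held, which closes the window described above.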

--- Additional comment from Worker Ant on 2017-05-29 06:22:39 EDT ---

REVIEW: https://review.gluster.org/17410 (perf/ioc: Fix race causing crash when
accessing freed page) posted (#2) for review on master by N Balachandran
(nbalacha at redhat.com)

--- Additional comment from Worker Ant on 2017-05-30 04:54:21 EDT ---

REVIEW: https://review.gluster.org/17410 (perf/ioc: Fix race causing crash when
accessing freed page) posted (#3) for review on master by N Balachandran
(nbalacha at redhat.com)

--- Additional comment from Worker Ant on 2017-05-30 11:51:50 EDT ---

COMMIT: https://review.gluster.org/17410 committed in master by Jeff Darcy
(jeff at pl.atyp.us) 
------
commit 6b6162f7ff93bccef0e615cb490e881168827e1d
Author: N Balachandran <nbalacha at redhat.com>
Date:   Mon May 29 15:21:39 2017 +0530

    perf/ioc: Fix race causing crash when accessing freed page

    ioc_inode_wakeup does not lock the ioc_inode for the duration
    of the operation, leaving a window where ioc_prune could find
    a NULL waitq and hence free the page which ioc_inode_wakeup later
    tries to access.

    Thanks to Mohit for the analysis.

    credit: moagrawa at redhat.com

    Change-Id: I54b064857e2694826d0c03b23f8014e3984a3330
    BUG: 1456385
    Signed-off-by: N Balachandran <nbalacha at redhat.com>
    Reviewed-on: https://review.gluster.org/17410
    Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
    Tested-by: Raghavendra G <rgowdapp at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff at pl.atyp.us>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1435357
[Bug 1435357] [GSS]RHGS 3.1.3 glusterfs client crash on
io-cache.so(__ioc_page_wakeup+0x44)
https://bugzilla.redhat.com/show_bug.cgi?id=1456385
[Bug 1456385] glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44)
https://bugzilla.redhat.com/show_bug.cgi?id=1457054
[Bug 1457054] glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44)