[Bugs] [Bug 1392288] New: gfapi clients crash while using async calls due to double fd_unref

bugzilla at redhat.com bugzilla at redhat.com
Mon Nov 7 06:37:31 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1392288

            Bug ID: 1392288
           Summary: gfapi clients crash while using async calls due to
                    double fd_unref
           Product: GlusterFS
           Version: 3.8
         Component: libgfapi
          Assignee: bugs at gluster.org
          Reporter: rtalur at redhat.com
        QA Contact: sdharane at redhat.com
                CC: bugs at gluster.org, sdharane at redhat.com
        Depends On: 1391086
            Blocks: 1391093, 1392286



+++ This bug was initially created as a clone of Bug #1391086 +++

Description of problem:
Clients crash with backtrace similar to one shown below.

(gdb) bt
#0  0x00007f120b0755f7 in raise () from /lib64/libc.so.6
#1  0x00007f120b076ce8 in abort () from /lib64/libc.so.6
#2  0x00007f120c9d6b9b in dump_core () at ../source3/lib/dumpcore.c:322
#3  0x00007f120c9c9f97 in smb_panic_s3 (why=<optimized out>) at
../source3/lib/util.c:814
#4  0x00007f120eebc57f in smb_panic (why=why at entry=0x7f120ef0354a "internal
error") at ../lib/util/fault.c:166
#5  0x00007f120eebc796 in fault_report (sig=<optimized out>) at
../lib/util/fault.c:83
#6  sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7  <signal handler called>
#8  0x00007f120f11a210 in pthread_spin_lock () from /lib64/libpthread.so.0
#9  0x00007f11f3f78255 in fd_unref () from /lib64/libglusterfs.so.0
#10 0x00007f11f4661a81 in glfs_io_async_cbk () from /lib64/libgfapi.so.0
#11 0x00007f11f4661eef in glfs_preadv_async_cbk () from /lib64/libgfapi.so.0
#12 0x00007f11df5b002d in io_stats_readv_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/debug/io-stats.so
#13 0x00007f11f3fd0c52 in default_readv_cbk () from /lib64/libglusterfs.so.0
#14 0x00007f11f3fd0c52 in default_readv_cbk () from /lib64/libglusterfs.so.0
#15 0x00007f11f3fd0c52 in default_readv_cbk () from /lib64/libglusterfs.so.0
#16 0x00007f11dfdf7787 in ioc_frame_return () from
/usr/lib64/glusterfs/3.8.4/xlator/performance/io-cache.so
#17 0x00007f11dfdf7b2f in ioc_waitq_return () from
/usr/lib64/glusterfs/3.8.4/xlator/performance/io-cache.so
#18 0x00007f11dfdf81dd in ioc_fault_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/performance/io-cache.so
#19 0x00007f11ec413332 in ra_readv_disabled_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/performance/read-ahead.so
#20 0x00007f11f3fd0c52 in default_readv_cbk () from /lib64/libglusterfs.so.0
#21 0x00007f11ec890fe3 in dht_readv_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/cluster/distribute.so
#22 0x00007f11ecac7649 in afr_readv_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so
#23 0x00007f11ecd59773 in client3_3_readv_cbk () from
/usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so
#24 0x00007f11f4449680 in rpc_clnt_handle_reply () from /lib64/libgfrpc.so.0
#25 0x00007f11f444995f in rpc_clnt_notify () from /lib64/libgfrpc.so.0
#26 0x00007f11f4445883 in rpc_transport_notify () from /lib64/libgfrpc.so.0
#27 0x00007f11ed78ceb4 in socket_event_poll_in () from
/usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#28 0x00007f11ed78f365 in socket_event_handler () from
/usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#29 0x00007f11f3fae340 in event_dispatch_epoll_worker () from
/lib64/libglusterfs.so.0
#30 0x00007f120f115dc5 in start_thread () from /lib64/libpthread.so.0
#31 0x00007f120b136ced in clone () from /lib64/libc.so.6


Version-Release number of selected component (if applicable):
mainline


How reproducible:
Always when using async api


Steps to Reproduce:
Will attach a reproducer

--- Additional comment from Worker Ant on 2016-11-02 20:38:50 IST ---

REVIEW: http://review.gluster.org/15768 (gfapi: async fops should unref in
callbacks) posted (#1) for review on master by Raghavendra Talur
(rtalur at redhat.com)

--- Additional comment from Worker Ant on 2016-11-04 13:02:56 IST ---

REVIEW: http://review.gluster.org/15768 (gfapi: async fops should unref in
callbacks) posted (#2) for review on master by Raghavendra Talur
(rtalur at redhat.com)

--- Additional comment from Worker Ant on 2016-11-06 02:02:43 IST ---

COMMIT: http://review.gluster.org/15768 committed in master by Shyamsundar
Ranganathan (srangana at redhat.com) 
------
commit e65738818dd22462ec00dda021566654d1c702b1
Author: Raghavendra Talur <rtalur at redhat.com>
Date:   Wed Nov 2 19:51:26 2016 +0530

    gfapi: async fops should unref in callbacks

    If fd is unref'd at the end of async call then the unref in cbks would
    lead to double unref and possible crash. Removing duplicate unrefs.

    Added unref only in failure cases.

    A simple test case has been added to test async write case. Need to
    extend the same for other async APIs too.

    Details:
    All glfd based calls in libgfapi, except for glfs_open and glfs_close,
    behave in the same way. At the start of the operation, they take a ref
    on glfd and fd. At the end of the operation, they unref it. Async calls
    are a little different as they unref in the cbk function. A successfull
    open call does not unref either the glfd or fd, thereby functioning as a
    reference for a OPEN file object. glfs_close makes a syncop_flush call
    sandwiched between a fd ref and unref(this can be removed, more on this
    below), followed by a call to glfs_mark_glfd_for_deletion which unrefs
    glfd and also calls glfs_fd_destroy as a release function thereby doing
    a unref on fd too.

    Functionally, there is no problem with how everything works when as
    described above. However, it is a little non-intuitive that we need to
    perform a fd_unref as a consequence of a implicit fd_ref that happens
    within glfs_resolve_fd. As we perform a GF_REF_GET(glfd) at the start of
    every operation, it would be worthwhile to remove the fd_ref that
    glfs_resovle_fd takes and do away with explicit fd_unref()s at the end
    of every operation. This is the same reason why we don't need the fd_ref
    in glfs_close. This is however not in the scope of this patch.

    Change-Id: I86b1d3b2ad846b16ea527d541dc82b5e90b0ba85
    BUG: 1391086
    Signed-off-by: Raghavendra Talur <rtalur at redhat.com>
    Reviewed-on: http://review.gluster.org/15768
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
    Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
    Reviewed-by: soumya k <skoduri at redhat.com>
    Reviewed-by: Prasanna Kumar Kalever <pkalever at redhat.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana at redhat.com>

--- Additional comment from Worker Ant on 2016-11-07 12:05:49 IST ---

REVIEW: http://review.gluster.org/15778 (gfapi: async fops should unref in
callbacks) posted (#1) for review on release-3.9 by Raghavendra Talur
(rtalur at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1391086
[Bug 1391086] gfapi clients crash while using async calls due to double
fd_unref
https://bugzilla.redhat.com/show_bug.cgi?id=1391093
[Bug 1391093] [Samba-Crash] : Core logs were generated while working on
random IOs
https://bugzilla.redhat.com/show_bug.cgi?id=1392286
[Bug 1392286] gfapi clients crash while using async calls due to double
fd_unref
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list