[Bugs] [Bug 1225859] New: Glusterfs client crash during fd migration after graph switch

Thu May 28 10:44:53 UTC 2015

https://bugzilla.redhat.com/show_bug.cgi?id=1225859

            Bug ID: 1225859
           Summary: Glusterfs client crash during fd migration after graph
                    switch
           Product: GlusterFS
           Version: 3.7.0
         Component: unclassified
          Assignee: bugs at gluster.org
          Reporter: rgowdapp at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com
        Depends On: 1225323

+++ This bug was initially created as a clone of Bug #1225323 +++

Description of problem:
(gdb) bt
#0  0x00007fe56cb27503 in dht_fsync (frame=0x7fe576d701c4, this=<optimized
out>, fd=0x7fe560002cec, datasync=0, xdata=0x0)
    at ../../../../../xlators/cluster/dht/src/dht-inode-read.c:818
#1  0x00007fe56c8db4b8 in wb_fsync (frame=0x7fe576d70118, this=0x7fe56800a2e0,
fd=0x7fe560002cec, datasync=0, xdata=0x0)
    at ../../../../../xlators/performance/write-behind/src/write-behind.c:1523
#2  0x00007fe56c6ccb3a in ra_fsync (frame=0x7fe576d7006c, this=0x7fe56800b6b0,
fd=0x7fe560002cec, datasync=0, xdata=0x0)
    at ../../../../../xlators/performance/read-ahead/src/read-ahead.c:628
#3  0x00007fe577eefb1d in default_fsync (frame=0x7fe576d7006c,
this=0x7fe56800c9e0, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:1817
#4  0x00007fe577eefb1d in default_fsync (frame=0x7fe576d7006c,
this=0x7fe56800dde0, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:1817
#5  0x00007fe577eefb1d in default_fsync (frame=0x7fe576d7006c,
this=0x7fe56800f0d0, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:1817
#6  0x00007fe577efab3b in default_fsync_resume (frame=0x7fe576d70a80,
this=0x7fe568010540, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:1376
#7  0x00007fe577f14fdd in call_resume (stub=0x7fe5767f7dd4) at
../../../libglusterfs/src/call-stub.c:2576
#8  0x00007fe567dfa578 in open_and_resume (this=this@entry=0x7fe568010540,
fd=fd@entry=0x7fe560002cec, stub=0x7fe5767f7dd4)
    at ../../../../../xlators/performance/open-behind/src/open-behind.c:241
#9  0x00007fe567dfa922 in ob_fsync (frame=0x7fe576d70a80, this=0x7fe568010540,
fd=0x7fe560002cec, flag=<optimized out>, xdata=<optimized out>)
    at ../../../../../xlators/performance/open-behind/src/open-behind.c:498
#10 0x00007fe567becbaf in mdc_fsync (frame=0x7fe576d709d4, this=0x7fe5680118d0,
fd=0x7fe560002cec, datasync=0, xdata=0x0)
    at ../../../../../xlators/performance/md-cache/src/md-cache.c:1669
#11 0x00007fe5679d0f76 in io_stats_fsync (frame=0x7fe576d70270,
this=0x7fe568012c90, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../../../xlators/debug/io-stats/src/io-stats.c:2207
#12 0x00007fe577eefb1d in default_fsync (frame=0x7fe576d70270,
this=0x7fe568014160, fd=0x7fe560002cec, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:1817
#13 0x00007fe577f3040a in syncop_fsync (subvol=subvol@entry=0x7fe568014160,
fd=fd@entry=0x7fe560002cec, dataonly=dataonly@entry=0,
    xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at
../../../libglusterfs/src/syncop.c:2280
#14 0x00007fe56fe0257a in fuse_migrate_fd (this=this@entry=0x1661700,
basefd=basefd@entry=0x7fe560002cec,
    old_subvol=old_subvol@entry=0x7fe568014160,
new_subvol=new_subvol@entry=0x7fe560030be0)
    at ../../../../../xlators/mount/fuse/src/fuse-bridge.c:4423
#15 0x00007fe56fe0271d in fuse_handle_opened_fds (this=0x1661700,
old_subvol=0x7fe568014160, new_subvol=0x7fe560030be0)
    at ../../../../../xlators/mount/fuse/src/fuse-bridge.c:4508
#16 0x00007fe56fe027c9 in fuse_graph_switch_task (data=<optimized out>) at
../../../../../xlators/mount/fuse/src/fuse-bridge.c:4559
(gdb) f 0
#0  0x00007fe56cb27503 in dht_fsync (frame=0x7fe576d701c4, this=<optimized
out>, fd=0x7fe560002cec, datasync=0, xdata=0x0)
    at ../../../../../xlators/cluster/dht/src/dht-inode-read.c:818
818            STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
(gdb) l
813            local->call_cnt = 1;
814            local->rebalance.flags = datasync;
815    
816            subvol = local->cached_subvol;
817    
818            STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
819                        fd, datasync, xdata);
820    
821            return 0;
822    
(gdb) p subvol
$1 = (xlator_t *) 0x0

Version-Release number of selected component (if applicable):
master

How reproducible:
Fairly consistently in some setups

Steps to Reproduce:
1. Run ./tests/performance/open-behind.t in a loop. Sometimes the Gluster mount
crashes after open-behind is switched off. The crash is seen when
"num_graphs" is executed.

Actual results:

Expected results:

Additional info:

RCA: It's a race. If a graph switch happens while an fd is open under the
".meta" subtree, the client crashes when migrating that fd. meta does not
implement fsync, so the call is wound down to the real volume, where the other
translators have no idea how to handle this virtual inode (in the backtrace
above, dht_fsync picks up a NULL cached subvol and dereferences it). The fix is
to implement fsync in meta. As a complete fix, meta should implement all fops,
so that for each call it can check whether the inode belongs to it and wind the
call down only if it does not.
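
The idea behind the fix can be illustrated with a small toy model. This is only
a sketch and not the actual patch: layer_t, toy_fd_t, wind_fsync, dist_fsync and
meta_fsync are hypothetical names invented here, and none of them are GlusterFS
APIs. A layer whose fsync pointer is NULL passes the call to its child, which
mimics an unimplemented fop falling through to the default. Where the real
client dereferenced the NULL subvol and crashed, the toy lower layer returns an
error instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are NOT GlusterFS structures. */
typedef struct layer layer_t;

typedef struct toy_fd {
    bool     is_meta;        /* fd was opened under the virtual ".meta" subtree */
    layer_t *cached_subvol;  /* only set for files that exist on the real volume */
} toy_fd_t;

struct layer {
    const char *name;
    layer_t    *child;
    int       (*fsync)(layer_t *self, toy_fd_t *fd); /* NULL: pass to child */
};

enum { FSYNC_OK = 0, FSYNC_EBADFD = -1 };

/* Send fsync down the stack, skipping layers that do not implement it. */
int wind_fsync(layer_t *l, toy_fd_t *fd)
{
    assert(fd != NULL);
    while (l != NULL && l->fsync == NULL)
        l = l->child;
    if (l == NULL)
        return FSYNC_EBADFD;
    return l->fsync(l, fd);
}

/* Stand-in for a lower layer that needs a cached subvolume for the fd.
 * A virtual fd has none; this toy reports an error instead of crashing. */
int dist_fsync(layer_t *self, toy_fd_t *fd)
{
    (void)self;
    if (fd->cached_subvol == NULL)
        return FSYNC_EBADFD;
    return FSYNC_OK;
}

/* Stand-in for the fix: answer for virtual fds locally and wind the call
 * down only when the fd does not belong to this layer. */
int meta_fsync(layer_t *self, toy_fd_t *fd)
{
    if (fd->is_meta)
        return FSYNC_OK;
    return wind_fsync(self->child, fd);
}
```

In this model, a stack whose meta layer has fsync set to NULL hands a ".meta" fd
to dist_fsync, which fails because cached_subvol is NULL. With fsync set to
meta_fsync, the same fd is answered locally and fds for real files still reach
the lower layer.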

--- Additional comment from Anand Avati on 2015-05-27 02:45:27 EDT ---

REVIEW: http://review.gluster.org/10929 (meta: implement fsync(dir)) posted
(#1) for review on master by Raghavendra G (rgowdapp at redhat.com)

--- Additional comment from Anand Avati on 2015-05-27 07:03:21 EDT ---

COMMIT: http://review.gluster.org/10929 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit d6fc353afce03095c98d67d377eb7ddf334fd42e
Author: Raghavendra G <rgowdapp at redhat.com>
Date:   Wed May 27 12:08:54 2015 +0530

    meta: implement fsync(dir)

    Change-Id: I707c608a9803fe6ef86860ca5578d4d3f63fd2aa
    BUG: 1225323
    Signed-off-by: Raghavendra G <rgowdapp at redhat.com>
    Reviewed-on: http://review.gluster.org/10929
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>

Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1225323
[Bug 1225323] Glusterfs client crash during fd migration after graph switch