[Bugs] [Bug 1173624] New: glusterfs client crashed while migrating the fds

bugzilla at redhat.com bugzilla at redhat.com
Fri Dec 12 14:32:29 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1173624

            Bug ID: 1173624
           Summary: glusterfs client crashed while migrating the fds
           Product: Red Hat Storage
           Version: 3.0
         Component: gluster-snapshot
          Assignee: rjoseph at redhat.com
          Reporter: ssamanta at redhat.com
        QA Contact: storage-qa-internal at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    rabhat at redhat.com
        Depends On: 1172262
             Group: redhat



+++ This bug was initially created as a clone of Bug #1172262 +++

Description of problem:

The glusterfs client crashed with the backtrace below while migrating fds from the
old graph to the new graph.


(gdb) bt
#0  0x00007f00bf7a51c6 in dht_fsync (frame=0x7f00c9351250, this=<value optimized out>, fd=0x366242c, datasync=0, xdata=0x0) at dht-inode-read.c:818
#1  0x00007f00bf560359 in wb_fsync (frame=0x7f00c93513a8, this=0x31b9a50, fd=0x366242c, datasync=0, xdata=0x0) at write-behind.c:1523
#2  0x00007f00bf34fa9a in ra_fsync (frame=0x7f00c93515ac, this=0x31ba4b0, fd=0x366242c, datasync=0, xdata=0x0) at read-ahead.c:628
#3  0x0000003c7e0265cb in default_fsync (frame=0x7f00c93515ac, this=0x31bb000, fd=0x366242c, flags=0, xdata=<value optimized out>) at defaults.c:1795
#4  0x0000003c7e0265cb in default_fsync (frame=0x7f00c93515ac, this=0x31bbc10, fd=0x366242c, flags=0, xdata=<value optimized out>) at defaults.c:1795
#5  0x0000003c7e0265cb in default_fsync (frame=0x7f00c93515ac, this=0x31bc7a0, fd=0x366242c, flags=0, xdata=<value optimized out>) at defaults.c:1795
#6  0x00007f00beb21722 in mdc_fsync (frame=0x7f00c9351d10, this=0x31bd3a0, fd=0x366242c, datasync=0, xdata=0x0) at md-cache.c:1654
#7  0x0000003c7e0265cb in default_fsync (frame=0x7f00c9351d10, this=0x31becc0, fd=0x366242c, flags=0, xdata=<value optimized out>) at defaults.c:1795
#8  0x00007f00be6fd915 in io_stats_fsync (frame=0x7f00c9351908, this=0x31bfa40, fd=0x366242c, flags=0, xdata=0x0) at io-stats.c:2194
#9  0x0000003c7e0265cb in default_fsync (frame=0x7f00c9351908, this=0x31c0760, fd=0x366242c, flags=0, xdata=<value optimized out>) at defaults.c:1795
#10 0x0000003c7e0684e2 in syncop_fsync (subvol=0x31c0760, fd=0x366242c, dataonly=0) at syncop.c:1921
#11 0x00007f00c2ac43e5 in fuse_migrate_fd (this=0x12ec8e0, basefd=0x366242c, old_subvol=0x31c0760, new_subvol=0x368a540) at fuse-bridge.c:4382
#12 0x00007f00c2ac459c in fuse_handle_opened_fds (this=0x12ec8e0, old_subvol=0x31c0760, new_subvol=0x368a540) at fuse-bridge.c:4467
#13 0x00007f00c2ac4649 in fuse_graph_switch_task (data=<value optimized out>) at fuse-bridge.c:4518
#14 0x0000003c7e05c222 in synctask_wrap (old_task=<value optimized out>) at syncop.c:333
#15 0x000000304ac438f0 in ?? () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()
f 0
#0  0x00007f00bf7a51c6 in dht_fsync (frame=0x7f00c9351250, this=<value optimized out>, fd=0x366242c, datasync=0, xdata=0x0) at dht-inode-read.c:818
818            STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
(gdb) l
813            local->call_cnt = 1;
814            local->rebalance.flags = datasync;
815    
816            subvol = local->cached_subvol;
817    
818            STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
819                        fd, datasync, xdata);
820    
821            return 0;
822    
(gdb) p local
$1 = <value optimized out>
(gdb)  p local->cached_subvol 
$1 = (xlator_t *) 0x0
f 6
#6  0x00007f522eaa5dc3 in default_fsync (frame=0x7f521000122c, this=0xfb1280, fd=0x7f5218001dec, flags=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:1795
1795            STACK_WIND_TAIL (frame, FIRST_CHILD(this),
(gdb) p *fd->inode->fd_ctx
There is no member named fd_ctx.
(gdb) p *this
$2 = {name = 0xfb0ac0 "vol-snapview-client", type = 0xfb1da0
"features/snapview-client", next = 0xfaff00, 
  prev = 0xfb2820, parents = 0xfb3b50, children = 0xfb2760, options = 0xfb1cfc,
dlhandle = 0xfb1ea0, 
  fops = 0x7f521fdedc00 <fops>, cbks = 0x7f521fdedee0 <cbks>, dumpops = 0x0,
volume_options = {next = 0xfb25f0, 
    prev = 0xfb25f0}, fini = 0x7f521fbea2aa <svc_forget+228>, init =
0x7f521fbe9f69 <svc_flush+729>, 
  reconfigure = 0x7f521fbe9e49 <svc_flush+441>, mem_acct_init = 0x7f521fbe9ef2
<svc_flush+610>, 
  notify = 0x7f521fbea31c <reconfigure+56>, loglevel = GF_LOG_NONE, latencies =
{{min = 0, max = 0, total = 0, std = 0, 
      mean = 0, count = 0} <repeats 49 times>}, history = 0x0, ctx = 0xf5f010,
graph = 0xf9fad0, itable = 0x0, 
  init_succeeded = 1 '\001', private = 0xfb9eb0, mem_acct = {num_types = 118,
rec = 0xfb8c10}, winds = 0, 
  switched = 0 '\000', local_pool = 0xfb9f00, is_autoloaded = _gf_false}
(gdb) p *fd->inode->_ctx
$3 = {{key = 16454272, xl_key = 0xfb1280}, {value1 = 2, ptr1 = 0x2}, {value2 =
0, ptr2 = 0x0}}
(gdb) p *fd->inode->_ctx->xl_key
$4 = {name = 0xfb0ac0 "vol-snapview-client", type = 0xfb1da0
"features/snapview-client", next = 0xfaff00, 
  prev = 0xfb2820, parents = 0xfb3b50, children = 0xfb2760, options = 0xfb1cfc,
dlhandle = 0xfb1ea0, 
  fops = 0x7f521fdedc00 <fops>, cbks = 0x7f521fdedee0 <cbks>, dumpops = 0x0,
volume_options = {next = 0xfb25f0, 
    prev = 0xfb25f0}, fini = 0x7f521fbea2aa <svc_forget+228>, init =
0x7f521fbe9f69 <svc_flush+729>, 
  reconfigure = 0x7f521fbe9e49 <svc_flush+441>, mem_acct_init = 0x7f521fbe9ef2
<svc_flush+610>, 
  notify = 0x7f521fbea31c <reconfigure+56>, loglevel = GF_LOG_NONE, latencies =
{{min = 0, max = 0, total = 0, std = 0, 
      mean = 0, count = 0} <repeats 49 times>}, history = 0x0, ctx = 0xf5f010,
graph = 0xf9fad0, itable = 0x0, 
  init_succeeded = 1 '\001', private = 0xfb9eb0, mem_acct = {num_types = 118,
rec = 0xfb8c10}, winds = 0, 
  switched = 0 '\000', local_pool = 0xfb9f00, is_autoloaded = _gf_false}
p *fd->inode->_ctx
$5 = {{key = 16454272, xl_key = 0xfb1280}, {value1 = 2, ptr1 = 0x2}, {value2 =
0, ptr2 = 0x0}}
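
The immediate segfault is the STACK_WIND above dereferencing subvol->fops while
local->cached_subvol is NULL. For illustration only (the actual fix is in
snapview-client, see the analysis further down), a defensive check in dht_fsync
of the kind DHT uses in other fd-based fops would look roughly like the sketch
below; the err label and DHT_STACK_UNWIND usage follow the existing style of
dht-inode-read.c, not a specific merged patch:

        subvol = local->cached_subvol;
        if (!subvol) {
                /* no cached subvolume for this fd's inode: fail the
                 * fop instead of dereferencing a NULL pointer */
                gf_log (this->name, GF_LOG_ERROR,
                        "no cached subvolume for fd=%p", fd);
                op_errno = EINVAL;
                goto err;
        }

        STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
                    fd, datasync, xdata);

        return 0;

err:
        DHT_STACK_UNWIND (fsync, frame, -1, op_errno, NULL, NULL, NULL);
        return 0;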



Version-Release number of selected component (if applicable):


How reproducible:

Keep running ls in the .snaps directory while USS is enabled and trigger graph
changes; all that is required is that an fd is open when a graph switch happens
(a libgfapi sketch of the same trigger follows the steps below).

Steps to Reproduce:
1. Create a 2 x 2 distributed-replicate volume and start it.
2. Mount the volume and start I/O on a CIFS mount.
3. Enable USS, copy a large file to the mount point, and take a snapshot.
4. Browse .snaps/<snap-dir>.
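
As an illustration of the trigger, here is a minimal libgfapi sketch of the same
code path: an fd opened on a file under .snaps that is still in use when a graph
switch forces the open fds to be migrated. The file path is a placeholder, the
volume name and server address are taken from this report, and the API shown is
the 3.6-era gfapi; the original crash was hit through a CIFS/Samba mount rather
than a standalone program. Build with something like "cc repro.c -lgfapi".

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <glusterfs/api/glfs.h>

/*
 * Illustrative reproducer sketch (hypothetical file path): keep reading
 * from a fd opened under .snaps while a graph switch (e.g. a
 * "gluster volume set testvol ..." on the server) migrates the open fds
 * to the new graph.
 */
int
main (void)
{
        char        buf[4096];
        glfs_t     *fs = NULL;
        glfs_fd_t  *fd = NULL;
        int         i  = 0;

        fs = glfs_new ("testvol");
        if (!fs)
                return 1;

        glfs_set_volfile_server (fs, "tcp", "10.70.42.244", 24007);
        if (glfs_init (fs) != 0)
                return 1;

        fd = glfs_open (fs, "/.snaps/snap1/largefile", O_RDONLY);
        if (!fd)
                return 1;

        /* trigger a graph switch on the server while this loop runs */
        for (i = 0; i < 100000; i++) {
                if (glfs_pread (fd, buf, sizeof (buf), 0, 0) < 0) {
                        fprintf (stderr, "read failed: %s\n",
                                 strerror (errno));
                        break;
                }
        }

        glfs_close (fd);
        glfs_fini (fs);
        return 0;
}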

Actual results:
There is a glusterfs client crash.

Expected results:
There should not be any crash.

Additional info:

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 34a699ad-81f8-489e-8242-40c5e181901a
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.42.244:/rhs/brick1/vol
Brick2: 10.70.43.6:/rhs/brick2/vol
Brick3: 10.70.42.204:/rhs/brick3/vol
Brick4: 10.70.42.10:/rhs/brick4/vol
Options Reconfigured:
features.quota-deem-statfs: on
features.quota: on
features.uss: on
storage.batch-fsync-delay-usec: 0
server.allow-insecure: on
performance.stat-prefetch: off
performance.readdir-ahead: on
features.barrier: disable
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256


[2014/12/12 06:51:18.905371,  0] modules/vfs_glusterfs.c:550(vfs_gluster_stat)
  glfs_stat(.snaps/snap1/06.Cluster-Translators-Distribute-Stripe.mp4) failed: Transport endpoint is not connected
[2014/12/12 06:51:18.906050,  0] modules/vfs_glusterfs.c:550(vfs_gluster_stat)
  glfs_stat(.snaps) failed: Transport endpoint is not connected
[2014/12/12 06:51:20.720738,  0] lib/fault.c:47(fault_report)
  ===============================================================
[2014/12/12 06:51:20.721126,  0] lib/fault.c:48(fault_report)
  INTERNAL ERROR: Signal 11 in pid 698 (3.6.9-169.2.el6rhs)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2014/12/12 06:51:20.721566,  0] lib/fault.c:50(fault_report)

  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2014/12/12 06:51:20.721830,  0] lib/fault.c:51(fault_report)
  ===============================================================
[2014/12/12 06:51:20.722008,  0] lib/util.c:1117(smb_panic)
  PANIC (pid 698): internal error
[2014/12/12 06:51:20.729958,  0] lib/util.c:1221(log_stack_trace)
  BACKTRACE: 39 stack frames:
   #0 smbd(log_stack_trace+0x1a) [0x7f517f0c1d8a]
   #1 smbd(smb_panic+0x2b) [0x7f517f0c1e5b]
   #2 smbd(+0x41b784) [0x7f517f0b2784]
   #3 /lib64/libc.so.6(+0x3f792326a0) [0x7f517af376a0]
   #4 /usr/lib64/glusterfs/3.6.0.38/xlator/cluster/distribute.so(dht_fsync+0x166) [0x7f516c55a1c6]
   #5 /usr/lib64/glusterfs/3.6.0.38/xlator/performance/write-behind.so(wb_fsync+0x2d9) [0x7f516c315359]
   #6 /usr/lib64/glusterfs/3.6.0.38/xlator/performance/read-ahead.so(ra_fsync+0x1aa) [0x7f516c104a9a]
   #7 /usr/lib64/libglusterfs.so.0(default_fsync+0x7b) [0x7f517c54c5cb]
   #8 /usr/lib64/libglusterfs.so.0(default_fsync+0x7b) [0x7f517c54c5cb]
   #9 /usr/lib64/libglusterfs.so.0(default_fsync+0x7b) [0x7f517c54c5cb]
   #10 /usr/lib64/libglusterfs.so.0(default_fsync+0x7b) [0x7f517c54c5cb]
   #11 /usr/lib64/glusterfs/3.6.0.38/xlator/debug/io-stats.so(io_stats_fsync+0x165) [0x7f51675bd915]
   #12 /usr/lib64/libglusterfs.so.0(default_fsync+0x7b) [0x7f517c54c5cb]
   #13 /usr/lib64/libglusterfs.so.0(syncop_fsync+0x192) [0x7f517c58e522]
   #14 /usr/lib64/libgfapi.so.0(glfs_migrate_fd_safe+0x1e5) [0x7f517c7d5e95]
   #15 /usr/lib64/libgfapi.so.0(__glfs_migrate_fd+0x46) [0x7f517c7d6266]
   #16 /usr/lib64/libgfapi.so.0(__glfs_migrate_openfds+0xb4) [0x7f517c7d6354]
   #17 /usr/lib64/libgfapi.so.0(__glfs_active_subvol+0x80) [0x7f517c7d67f0]
   #18 /usr/lib64/libgfapi.so.0(glfs_active_subvol+0x7f) [0x7f517c7d6a6f]
   #19 /usr/lib64/libgfapi.so.0(glfs_preadv+0x6c) [0x7f517c7d203c]
   #20 /usr/lib64/libgfapi.so.0(glfs_pread+0x1a) [0x7f517c7d238a]
   #21 smbd(read_file+0x133) [0x7f517eda4f13]
   #22 smbd(smbd_smb2_request_process_read+0x524) [0x7f517ee2b934]
   #23 smbd(smbd_smb2_request_dispatch+0x735) [0x7f517ee22ea5]
   #24 smbd(+0x18d506) [0x7f517ee24506]
   #25 smbd(+0x18a7f0) [0x7f517ee217f0]
   #26 smbd(+0x22a782) [0x7f517eec1782]
   #27 smbd(+0x22ab14) [0x7f517eec1b14]
   #28 smbd(+0x229ad9) [0x7f517eec0ad9]
   #29 /usr/lib64/libtevent.so.0(tevent_common_loop_immediate+0xe8) [0x7f517b8d20c8]
   #30 smbd(run_events_poll+0x3c) [0x7f517f0d0f0c]
   #31 smbd(smbd_process+0x7da) [0x7f517ee0d9aa]
   #32 smbd(+0x69715f) [0x7f517f32e15f]
   #33 smbd(run_events_poll+0x377) [0x7f517f0d1247]
   #34 smbd(+0x43a6ff) [0x7f517f0d16ff]
   #35 /usr/lib64/libtevent.so.0(_tevent_loop_once+0x9d) [0x7f517b8d149d]
   #36 smbd(main+0xf3c) [0x7f517f32f45c]
   #37 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f517af23d5d]
   #38 smbd(+0xf4c29) [0x7f517ed8bc29]
[2014/12/12 06:51:20.732844,  0] lib/fault.c:372(dump_core)
  dumping core in /var/log/samba/cores/smbd



Root cause: the snapview-client translator does not implement the fsync fop, so
fsync falls through to default_fsync (frame 6 above shows default_fsync running
in the "vol-snapview-client" xlator).
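
A minimal sketch of what such a fop has to do, assuming the usual snapview-client
layout (svc_inode_ctx_get, NORMAL_INODE and SECOND_CHILD are assumed names here,
and this is not the patch posted at http://review.gluster.org/9258): decide
whether the inode belongs to the regular volume or to the virtual .snaps world
and wind fsync to the matching child, instead of letting default_fsync always
wind to FIRST_CHILD:

int32_t
svc_fsync (call_frame_t *frame, xlator_t *this, fd_t *fd,
           int32_t datasync, dict_t *xdata)
{
        int       inode_type = -1;
        xlator_t *subvolume  = NULL;

        /* Assumed helper: snapview-client records in the inode ctx
         * whether the inode is a regular one or a virtual snapshot
         * inode; the fop must route fsync accordingly. */
        if (svc_inode_ctx_get (this, fd->inode, &inode_type) < 0) {
                STACK_UNWIND_STRICT (fsync, frame, -1, EINVAL,
                                     NULL, NULL, NULL);
                return 0;
        }

        /* first child: regular volume graph; second child: snapd side */
        subvolume = (inode_type == NORMAL_INODE) ?
                     FIRST_CHILD (this) : SECOND_CHILD (this);

        STACK_WIND_TAIL (frame, subvolume, subvolume->fops->fsync,
                         fd, datasync, xdata);
        return 0;
}

Without an fsync entry in its fop table, default_fsync at defaults.c:1795
unconditionally winds to FIRST_CHILD(this), which for an fd opened on a .snaps
file ends up in the regular DHT graph with no cached subvolume, consistent with
the NULL local->cached_subvol seen in frame 0 above.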

--- Additional comment from Anand Avati on 2014-12-09 12:57:48 EST ---

REVIEW: http://review.gluster.org/9258 (features/snapview-client: handle fsync
fop) posted (#1) for review on master by Raghavendra Bhat
(raghavendra at redhat.com)

--- Additional comment from Anand Avati on 2014-12-09 13:01:44 EST ---

REVIEW: http://review.gluster.org/9258 (features/snapview-client: handle fsync
fop) posted (#2) for review on master by Raghavendra Bhat
(raghavendra at redhat.com)

--- Additional comment from Anand Avati on 2014-12-09 13:07:57 EST ---

REVIEW: http://review.gluster.org/9258 (features/snapview-client: handle fsync
fop) posted (#3) for review on master by Raghavendra Bhat
(raghavendra at redhat.com)

--- Additional comment from Anand Avati on 2014-12-10 04:02:03 EST ---

REVIEW: http://review.gluster.org/9258 (features/snapview-client: handle fsync
fop) posted (#4) for review on master by Raghavendra Bhat
(raghavendra at redhat.com)

--- Additional comment from Anand Avati on 2014-12-10 05:44:14 EST ---

REVIEW: http://review.gluster.org/9258 (features/snapview-client: handle fsync
fop) posted (#5) for review on master by Raghavendra Bhat
(raghavendra at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1172262
[Bug 1172262] glusterfs client crashed while migrating the fds