[Bugs] [Bug 1235582] New: snapd crashed due to stack overflow

bugzilla at redhat.com bugzilla at redhat.com
Thu Jun 25 09:01:18 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1235582

            Bug ID: 1235582
           Summary: snapd crashed due to stack overflow
           Product: GlusterFS
           Version: mainline
         Component: core
          Severity: urgent
          Assignee: kparthas at redhat.com
          Reporter: kparthas at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com
        Depends On: 1235571



+++ This bug was initially created as a clone of Bug #1235571 +++

Description of problem:

Snapd crashed on one of the storage nodes.

Version-Release number of selected component (if applicable):

[root@darkknightrises ~]# rpm -qa | grep glusterfs
glusterfs-libs-3.7.1-5.el6rhs.x86_64
glusterfs-api-3.7.1-5.el6rhs.x86_64
samba-vfs-glusterfs-4.1.17-7.el6rhs.x86_64
glusterfs-3.7.1-5.el6rhs.x86_64
glusterfs-fuse-3.7.1-5.el6rhs.x86_64
glusterfs-cli-3.7.1-5.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-5.el6rhs.x86_64
glusterfs-debuginfo-3.7.1-5.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-5.el6rhs.x86_64
glusterfs-server-3.7.1-5.el6rhs.x86_64



How reproducible:

1/1

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Do a FUSE mount and start I/O from the mount point
3. Enable quota and set a usage limit
4. Enable USS
5. Set snap-activate-on-create to enable for snapshots
6. Enable auto-delete for snapshots
7. Set cluster.enable-shared-storage to enable
8. Schedule a snapshot every 5 minutes
[root@darkknightrises ~]# snap_scheduler.py list
JOB_NAME         SCHEDULE         OPERATION        VOLUME NAME      
--------------------------------------------------------------------
job1             */5 * * * *      Snapshot Create  vol0             
9. Edit the scheduled time to every two minutes
10. Restart the ntpd service on all the storage nodes
11. Run gluster v status vol0 and observe the crash


Actual results:

Snapd crashed 

Expected results:

Snapd should not crash

Additional info:

[root@darkknightrises ~]# gluster v info vol0

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 553670d5-8f05-4d41-878e-921d59f117ae
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/rhs/brick1/b01
Brick2: node2:/rhs/brick1/b02
Brick3: node3:/rhs/brick1/b03
Brick4: node4:/rhs/brick1/04
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
features.uss: enable
features.show-snapshot-directory: enable
features.barrier: enable
auto-delete: enable
snap-activate-on-create: enable
cluster.enable-shared-storage: enable

===============================================================

[root@darkknightrises /]# gluster v status vol0
Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/rhs/brick1/b01          49152     0          Y       24731
Brick node2:/rhs/brick1/b02          49152     0          Y       16359
Brick node3:/rhs/brick1/b03          49152     0          Y       10149
Brick node4:/rhs/brick1/04            49152     0          Y       5512 
Snapshot Daemon on localhost                N/A       N/A        N       25133
NFS Server on localhost                     2049      0          Y       25671
Self-heal Daemon on localhost               N/A       N/A        Y       25681
Quota Daemon on localhost                   N/A       N/A        Y       26666
Snapshot Daemon on node3             49153     0          Y       10525
NFS Server on node3                  2049      0          Y       10991
Self-heal Daemon on node3            N/A       N/A        Y       11001
Quota Daemon on node3                N/A       N/A        Y       11853
Snapshot Daemon on node4              49154     0          Y       5918 
NFS Server on node4                   2049      0          Y       6396 
Self-heal Daemon on node4             N/A       N/A        Y       6406 
Quota Daemon on node4                 N/A       N/A        Y       7255 
Snapshot Daemon on node2             49154     0          Y       16714
NFS Server on node2                  2049      0          Y       17201
Self-heal Daemon on node2            N/A       N/A        Y       17211
Quota Daemon on node2                N/A       N/A        Y       18168

bt
========================================
#44214 0x00007f473c1f5def in client_dump_version_cbk (req=<value optimized
out>, iov=<value optimized out>, count=<value optimized out>,
myframe=0x7f4024701c24) at client-handshake.c:1623
 1623                    gf_msg (frame->this->name, GF_LOG_WARNING, ENOTCONN,
 (gdb) up
 #44215 0x00007f475360cb1e in saved_frames_unwind (saved_frames=0x7f45d80022d0)
at rpc-clnt.c:366
 366                     trav->rpcreq->cbkfn (trav->rpcreq, &iov, 1,
trav->frame);
 (gdb) up
 #44216 0x00007f475360cc0e in saved_frames_destroy (frames=0x7f45d80022d0) at
rpc-clnt.c:383
 383             saved_frames_unwind (frames);
 (gdb) down
 #44215 0x00007f475360cb1e in saved_frames_unwind (saved_frames=0x7f45d80022d0)
at rpc-clnt.c:366
 366                     trav->rpcreq->cbkfn (trav->rpcreq, &iov, 1,
trav->frame);
 (gdb) up
 #44216 0x00007f475360cc0e in saved_frames_destroy (frames=0x7f45d80022d0) at
rpc-clnt.c:383
 383             saved_frames_unwind (frames);
 (gdb) up
 #44217 0x00007f475360ccdb in rpc_clnt_connection_cleanup (conn=0x7f45d0e86f40)
at rpc-clnt.c:536
 536             saved_frames_destroy (saved_frames);
 (gdb) up
 #44218 0x00007f475360d29f in rpc_clnt_notify (trans=<value optimized out>,
mydata=0x7f45d0e86f40, event=<value optimized out>, data=<value optimized out>)
at rpc-clnt.c:843
 843                     rpc_clnt_connection_cleanup (conn);
 (gdb) up
 #44219 0x00007f4753608928 in rpc_transport_notify (this=<value optimized out>,
event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:543
 543                     ret = this->notify (this, this->mydata, event, data);
 (gdb) up
 #44220 0x00007f4748370741 in socket_event_poll_err (fd=<value optimized out>,
idx=<value optimized out>, data=0x7f45d0ee01c0, poll_in=<value optimized out>,
poll_out=0, poll_err=0)
     at socket.c:1205
 1205            rpc_transport_notify (this, RPC_TRANSPORT_DISCONNECT, this);
 (gdb) up
 #44221 socket_event_handler (fd=<value optimized out>, idx=<value optimized
out>, data=0x7f45d0ee01c0, poll_in=<value optimized out>, poll_out=0,
poll_err=0) at socket

 ----------------------------------------
 and the frames at the top of the stack:

 #0  0x00007f47521ee06a in vfprintf () from /lib64/libc.so.6
 #1  0x00007f47522a9b10 in __vsnprintf_chk () from /lib64/libc.so.6
 #2  0x00007f47522a9a4a in __snprintf_chk () from /lib64/libc.so.6
 #3  0x00007f4753856a5d in snprintf (buf=0x7f4024f1bb98 "") at
/usr/include/bits/stdio2.h:65
 #4  gf_backtrace_append (buf=0x7f4024f1bb98 "") at common-utils.c:3547
 #5  gf_backtrace_fillframes (buf=0x7f4024f1bb98 "") at common-utils.c:3593
 #6  0x00007f4753856ac1 in gf_backtrace_save (buf=<value optimized out>) at
common-utils.c:3631
 #7  0x00007f475383cfc0 in _gf_log_callingfn (domain=0x7f47538add11 "",
file=<value optimized out>, function=0x7f47538b3850 "gf_mem_set_acct_info",
line=62, level=GF_LOG_ERROR, 
     fmt=0x7f47538b3718 "Assertion failed: xl->mem_acct != NULL") at
logging.c:835
 #8  0x00007f475386dd34 in gf_mem_set_acct_info (xl=0x7f4753ad4120,
alloc_ptr=0x7f401a76ffe8, size=437, type=39, typestr=0x7f47538b35cc
"gf_common_mt_asprintf") at mem-pool.c:62
 #9  0x00007f475386ddf6 in __gf_malloc (size=437, type=39,
typestr=0x7f47538b35cc "gf_common_mt_asprintf") at mem-pool.c:146
 #10 0x00007f475386df04 in gf_vasprintf (string_ptr=0x7f401a770228,
format=0x7f47538adc22 "[%s] %s [%s:%d:%s] %s %d-%s: ", arg=<value optimized
out>) at mem-pool.c:220
 #11 0x00007f475386dfc8 in gf_asprintf (string_ptr=<value optimized out>,
format=<value optimized out>) at mem-pool.c:238
 #12 0x00007f475383d1fe in _gf_log_callingfn (domain=0x7f47538add11 "",
file=<value optimized out>, function=0x7f47538b3850 "gf_mem_set_acct_info",
line=62, level=GF_LOG_ERROR, 
     fmt=0x7f47538b3718 "Assertion failed: xl->mem_acct != NULL") at
logging.c:869
 #13 0x00007f475386dd34 in gf_mem_set_acct_info (xl=0x7f4753ad4120,
alloc_ptr=0x7f401a770488, size=437, type=39, typestr=0x7f47538b35cc
"gf_common_mt_asprintf") at mem-pool.c:62
 #14 0x00007f475386ddf6 in __gf_malloc (size=437, type=39,
typestr=0x7f47538b35cc "gf_common_mt_asprintf") at mem-pool.c:146


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1235571
[Bug 1235571] snapd crashed due to stack overflow