[Bugs] [Bug 1263585] New: Data Tiering: new crash seen with tier rebalance daemon

bugzilla at redhat.com bugzilla at redhat.com
Wed Sep 16 08:55:52 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1263585

            Bug ID: 1263585
           Summary: Data Tiering: new crash seen with tier rebalance daemon
           Product: GlusterFS
           Version: 3.7.4
         Component: tiering
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: nchilaka at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



I created a tiered volume: a 2x2 distributed-replicate hot tier over a
2x(4+2) disperse (EC) cold tier.
I set the promote frequency to 1000 sec and the demote frequency to 100 sec,
and mounted the tiered volume over FUSE.
I then wrote many folders of mp3 files (about 15 folders, each holding 5-6
files on average).

I left the volume idle for some time so that all files would get demoted.
After about 5 hrs, this crash was hit:
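For reference, a rough sketch of the setup described above (command syntax is from memory for the 3.7.x CLI and may differ slightly; brick paths and option values are taken from the volume info further down, but the exact brick ordering here is approximate):

```shell
# Cold tier: 2 x (4+2) disperse volume
gluster volume create gold disperse 6 redundancy 2 \
    zod:/rhs/brick{1..6}/gold yarrow:/rhs/brick{1..6}/gold
gluster volume start gold

# Hot tier: 2x2 distributed-replicate, attached on top
gluster volume attach-tier gold replica 2 \
    yarrow:/rhs/brick6/gold_hot zod:/rhs/brick6/gold_hot \
    yarrow:/rhs/brick7/gold_hot zod:/rhs/brick7/gold_hot

# Promote/demote frequencies as shown under "Options Reconfigured" below
gluster volume set gold cluster.tier-promote-frequency 1000
gluster volume set gold cluster.tier-demote-frequency 100
```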


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
rebalance/gold --xlator-option *d'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, 
    xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at
syncop.c:2823
2823            SYNCOP (subvol, (&args), syncop_ipc_cbk, subvol->fops->ipc,
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64
libcom_err-1.42.9-7.el7.x86_64 libgcc-4.8.3-9.el7.x86_64
libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-22.el7_1.1.x86_64
openssl-libs-1.0.1e-42.el7_1.9.x86_64 pcre-8.32-14.el7.x86_64
sqlite-3.7.17-6.el7_1.1.x86_64 sssd-client-1.12.2-58.el7_1.14.x86_64
xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, 
    xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at
syncop.c:2823
#1  0x00007ff8e7919fb9 in tier_process_brick_cbk (args=<synthetic pointer>, 
    local_brick=0x7ff8db43bc60) at tier.c:548
#2  tier_build_migration_qfile (is_promotion=_gf_false, 
    query_cbk_args=0x7ff79fddee70, args=0x7ff8db43bcc0) at tier.c:608
#3  tier_demote (args=0x7ff8db43bcc0) at tier.c:668
#4  0x00007ff8f8d99df5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007ff8f86e01ad in clone () from /lib64/libc.so.6
(gdb) quit
[root@zod /]# ll core.32719



[root@zod glusterfs]# tail -n 30 gold-tier.log 

[2015-09-16 07:00:00.348553] W [MSGID: 101105]
[gfdb_sqlite3.c:379:apply_sql_params_db] 0-sqlite3: Failed to retrieve
sql-db-autovacuum from params.Assigning default value: none
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-09-16 07:00:00
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.4
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7ff8f9f30fd2]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7ff8f9f4d45d]
/lib64/libc.so.6(+0x35650)[0x7ff8f861f650]
/lib64/libglusterfs.so.0(syncop_ipc+0x140)[0x7ff8f9f82af0]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/tier.so(+0x55fb9)[0x7ff8e7919fb9]
/lib64/libpthread.so.0(+0x7df5)[0x7ff8f8d99df5]
/lib64/libc.so.6(clone+0x6d)[0x7ff8f86e01ad]
---------



Version-Release number of selected component (if applicable):
=============================================================
[root@zod glusterfs]# gluster --version
glusterfs 3.7.4 built on Sep 12 2015 01:35:35
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root@zod glusterfs]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-api-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-debuginfo-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-server-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-cli-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-libs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64


[root@zod glusterfs]# gluster v info gold;gluster v status gold;gluster v rebal gold status;gluster v tier gold status

Volume Name: gold
Type: Tier
Volume ID: ba50bc3f-5b0f-4707-bf4f-cf5a2f5a192b
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/gold_hot
Brick2: zod:/rhs/brick6/gold_hot
Brick3: yarrow:/rhs/brick7/gold_hot
Brick4: zod:/rhs/brick7/gold_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/gold
Brick6: yarrow:/rhs/brick1/gold
Brick7: zod:/rhs/brick2/gold
Brick8: yarrow:/rhs/brick2/gold
Brick9: zod:/rhs/brick3/gold
Brick10: yarrow:/rhs/brick3/gold
Brick11: zod:/rhs/brick4/gold
Brick12: yarrow:/rhs/brick4/gold
Brick13: zod:/rhs/brick5/gold
Brick14: yarrow:/rhs/brick5/gold
Brick15: yarrow:/rhs/brick6/gold
Brick16: zod:/rhs/brick6/gold
Options Reconfigured:
cluster.tier-demote-frequency: 100
cluster.tier-promote-frequency: 1000
features.ctr-enabled: on
performance.io-cache: off
performance.quick-read: off
performance.readdir-ahead: on
Status of volume: gold
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick6/gold_hot           49227     0          Y       739  
Brick zod:/rhs/brick6/gold_hot              49234     0          Y       32646
Brick yarrow:/rhs/brick7/gold_hot           49226     0          Y       713  
Brick zod:/rhs/brick7/gold_hot              49233     0          Y       32622
Cold Bricks:
Brick zod:/rhs/brick1/gold                  49227     0          Y       32303
Brick yarrow:/rhs/brick1/gold               49220     0          Y       32644
Brick zod:/rhs/brick2/gold                  49228     0          Y       32321
Brick yarrow:/rhs/brick2/gold               49221     0          Y       32668
Brick zod:/rhs/brick3/gold                  49229     0          Y       32339
Brick yarrow:/rhs/brick3/gold               49222     0          Y       32686
Brick zod:/rhs/brick4/gold                  49230     0          Y       32357
Brick yarrow:/rhs/brick4/gold               49223     0          Y       32704
Brick zod:/rhs/brick5/gold                  49231     0          Y       32375
Brick yarrow:/rhs/brick5/gold               49224     0          Y       32723
Brick yarrow:/rhs/brick6/gold               49225     0          Y       32747
Brick zod:/rhs/brick6/gold                  49232     0          Y       32393
NFS Server on localhost                     2049      0          Y       11286
NFS Server on yarrow                        2049      0          Y       23678

Task Status of Volume gold
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 8e8a40b4-be43-4593-a192-3f2b7411bd97
Status               : in progress         

     Node   Rebalanced-files        size     scanned   failures   skipped        status   run time in secs
---------   ----------------   ---------   ---------   --------   -------   -----------   ----------------
localhost                  0      0Bytes         108          0         0   in progress             765.00
   yarrow                  0      0Bytes        7344          0         0   in progress           68476.00
volume rebalance: gold: success: 
Node                 Promoted files       Demoted files        Status           
---------            ---------            ---------            ---------        
localhost            0                    0                    in progress      
yarrow               0                    0                    in progress      
volume rebalance: gold: success: 
[root@zod glusterfs]#
