[Bugs] [Bug 1273295] New: [Tier]: glusterfs crashed --volfile-id rebalance/tiervolume

bugzilla at redhat.com bugzilla at redhat.com
Tue Oct 20 07:26:48 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1273295

            Bug ID: 1273295
           Summary: [Tier]: glusterfs crashed --volfile-id
                    rebalance/tiervolume
           Product: GlusterFS
           Version: 3.7.5
         Component: tiering
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: rhinduja at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
=======================

A few cores were reported on the longevity setup, all with the same backtrace:

(gdb) bt
#0  0x00007f0b53de58b1 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1  0x00007f0b46dc2481 in gf_sql_query_function (prep_stmt=0x7f0b203c9348, query_callback=query_callback@entry=0x7f0b473ee150 <tier_gf_query_callback>, _query_cbk_args=_query_cbk_args@entry=0x7f0b28ff8e90)
    at gfdb_sqlite3_helper.c:1157
#2  0x00007f0b46dc3c00 in gf_sqlite3_find_recently_changed_files (db_conn=0x7f0b203d21e0, query_callback=0x7f0b473ee150 <tier_gf_query_callback>, query_cbk_args=0x7f0b28ff8e90, from_time=0x7f0b28ff8e00)
    at gfdb_sqlite3.c:728
#3  0x00007f0b46dbdef1 in find_recently_changed_files (_conn_node=<optimized out>, query_callback=0x7f0b473ee150 <tier_gf_query_callback>, _query_cbk_args=0x7f0b28ff8e90, from_time=0x7f0b28ff8e00)
    at gfdb_data_store.c:551
#4  0x00007f0b473ed179 in tier_process_self_query (local_brick=local_brick@entry=0x7f0b140033f0, args=args@entry=0x7f0b28ff8e10) at tier.c:682
#5  0x00007f0b473edd23 in tier_process_brick (args=0x7f0b28ff8e10, local_brick=0x7f0b140033f0) at tier.c:953
#6  tier_build_migration_qfile (args=args@entry=0x7f0b3cfdec60, query_cbk_args=query_cbk_args@entry=0x7f0b28ff8e90, is_promotion=is_promotion@entry=_gf_true) at tier.c:1028
#7  0x00007f0b473f0822 in tier_promote (args=0x7f0b3cfdec60) at tier.c:1128
#8  0x00007f0b54432dc5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f0b53d791cd in clone () from /lib64/libc.so.6
(gdb) f 2
#2  0x00007f0b46dc3c00 in gf_sqlite3_find_recently_changed_files (db_conn=0x7f0b203d21e0, query_callback=0x7f0b473ee150 <tier_gf_query_callback>, query_cbk_args=0x7f0b28ff8e90, from_time=0x7f0b28ff8e00)
    at gfdb_sqlite3.c:728
728            ret = gf_sql_query_function (prep_stmt, query_callback, query_cbk_args);
(gdb) f 1
#1  0x00007f0b46dc2481 in gf_sql_query_function (prep_stmt=0x7f0b203c9348, query_callback=query_callback@entry=0x7f0b473ee150 <tier_gf_query_callback>, _query_cbk_args=_query_cbk_args@entry=0x7f0b28ff8e90)
    at gfdb_sqlite3_helper.c:1157
1157                                                                   (text_column);
(gdb) l
1152                            /* Get link string. Do shallow copy here
1153                             * query_callback function should do a
1154                             * deep copy and then do operations on this field*/
1155                            gfdb_query_record->_link_info_str = text_column;
1156                            gfdb_query_record->link_info_size = strlen
1157                                                                   (text_column);
1158    
1159                            /* Call the call back function provided*/
1160                            ret = query_callback (gfdb_query_record,
1161                                                           _query_cbk_args);
(gdb) p text_column
$1 = <optimized out>
(gdb) p *text_column
value has been optimized out
(gdb) f 0
#0  0x00007f0b53de58b1 in __strlen_sse2_pminub () from /lib64/libc.so.6
(gdb) f 1
#1  0x00007f0b46dc2481 in gf_sql_query_function (prep_stmt=0x7f0b203c9348, query_callback=query_callback@entry=0x7f0b473ee150 <tier_gf_query_callback>, _query_cbk_args=_query_cbk_args@entry=0x7f0b28ff8e90)
    at gfdb_sqlite3_helper.c:1157
1157                                                                   (text_column);
(gdb) p gfdb_query_record
$2 = (gfdb_query_record_t *) 0x0
(gdb) p *gfdb_query_record
Cannot access memory at address 0x0
(gdb) q
[root@dhcp37-162 glusterfs]#

[root@dhcp37-162 glusterfs]# file /core.23681
/core.23681: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/usr/sbin/glusterfs -s localhost --volfile-id rebalance/tiervolume --xlator-opt'
[root@dhcp37-162 glusterfs]#
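
The session above shows strlen() faulting at gfdb_sqlite3_helper.c:1157, with gfdb_query_record printing as 0x0 and text_column optimized out. One possible cause, which is an assumption and not something the core confirms, is a NULL text_column: sqlite3_column_text() returns NULL when the column value is SQL NULL. The following is only a minimal sketch of a defensive check; the type query_record_sketch_t and the function set_link_info are hypothetical stand-ins for illustration and are not the GlusterFS code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two record fields seen in the gdb listing. */
typedef struct {
    const char *_link_info_str;
    size_t      link_info_size;
} query_record_sketch_t;

/*
 * Store the link string and its length in the record.
 * Returns 0 on success, or -1 if either pointer is NULL, so that
 * strlen() and the field stores are never reached with a NULL pointer.
 */
static int
set_link_info(query_record_sketch_t *rec, const char *text_column)
{
    if (rec == NULL || text_column == NULL)
        return -1;

    /* Shallow copy, mirroring the comment in the original listing. */
    rec->_link_info_str = text_column;
    rec->link_info_size = strlen(text_column);
    return 0;
}
```

With a check like this, a caller could skip or log the offending row instead of crashing. Whether skipping is the right behaviour for the tier migration query is a separate design question.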

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64

Setup Details:
==============
1. Created a 12-node cluster.
2. Created a tiered volume with the hot tier as (6 x 2) and the cold tier as (2 x (6 + 2) = 16).
3. FUSE-mounted the volume on 3 clients: RHEL7.2, RHEL7.1, and RHEL6.7.
4. Started creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, data creation from client 1 and client 2 completed, while data creation from client 3 was still in progress.

6. At this point, the only data being created was 1 file every 10 seconds from client 3.

7. Disabled/enabled quota.

8. The system was idle for a couple of days between step 6 and step 7.

Actual results:
===============

Cores were generated after step 6.


