[Bugs] [Bug 1263200] New: Data Tiering:Setting only promote frequency and no demote frequency causes crash

bugzilla at redhat.com bugzilla at redhat.com
Tue Sep 15 10:11:38 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1263200

            Bug ID: 1263200
           Summary: Data Tiering:Setting only promote frequency and no
                    demote frequency causes crash
           Product: GlusterFS
           Version: 3.7.4
         Component: tiering
          Assignee: bugs at gluster.org
          Reporter: nchilaka at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
=======================
I created a regular volume, created some files (cf1, cf2, cf3) on it, and
started adding Linux kernel files.

I then attached a tier, enabled CTR (features.ctr-enabled) and set the promote
frequency (cluster.tier-promote-frequency) to 10 seconds.
I then tried to modify the cf* files using the touch command.

But the files were not getting promoted at all (separate bz#1262885).

I kept the volume idle for a few hours and then found that a crash had occurred.


Note: Because of another, unrelated issue, I was hitting glusterd crashes. As a
workaround suggested by the developers, I modified/added the options below in
the glusterd.vol file and restarted glusterd.
   option ping-timeout 0
   option event-threads 1

I restarted glusterd a couple of times after this modification.


Another problem after the crash is that the volume info for the EC cold tier is
wrong. It shows a 2-way distributed EC cold tier as 12-way distributed
("12 x (4 + 2) = 12" instead of "2 x (4 + 2) = 12").

Before crash:
[root@zod ~]# gluster v info 9301

Volume Name: 9301
Type: Tier
Volume ID: 0314fa86-49dc-4fbe-925f-8080157a9c8b
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/9301_hot
Brick2: zod:/rhs/brick6/9301_hot
Brick3: yarrow:/rhs/brick7/9301_hot
Brick4: zod:/rhs/brick7/9301_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/9301
Brick6: yarrow:/rhs/brick1/9301
Brick7: zod:/rhs/brick2/9301
Brick8: yarrow:/rhs/brick2/9301
Brick9: zod:/rhs/brick3/9301
Brick10: yarrow:/rhs/brick3/9301
Brick11: zod:/rhs/brick4/9301
Brick12: yarrow:/rhs/brick4/9301
Brick13: zod:/rhs/brick5/9301
Brick14: yarrow:/rhs/brick5/9301
Brick15: yarrow:/rhs/brick6/9301
Brick16: zod:/rhs/brick6/9301
Options Reconfigured:
cluster.tier-promote-frequency: 10
features.ctr-enabled: on
performance.io-cache: off
performance.quick-read: off
performance.readdir-ahead: on




After crash:
Volume Name: 9301
Type: Tier
Volume ID: 0314fa86-49dc-4fbe-925f-8080157a9c8b
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/9301_hot
Brick2: zod:/rhs/brick6/9301_hot
Brick3: yarrow:/rhs/brick7/9301_hot
Brick4: zod:/rhs/brick7/9301_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 12 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/9301
Brick6: yarrow:/rhs/brick1/9301
Brick7: zod:/rhs/brick2/9301
Brick8: yarrow:/rhs/brick2/9301
Brick9: zod:/rhs/brick3/9301
Brick10: yarrow:/rhs/brick3/9301
Brick11: zod:/rhs/brick4/9301
Brick12: yarrow:/rhs/brick4/9301
Brick13: zod:/rhs/brick5/9301
Brick14: yarrow:/rhs/brick5/9301
Brick15: yarrow:/rhs/brick6/9301
Brick16: zod:/rhs/brick6/9301
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.io-cache: off
features.ctr-enabled: on
cluster.tier-promote-frequency: 10
cluster.read-freq-threshold: 0
cluster.write-freq-threshold: 0




core.2839: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from
'/usr/sbin/glusterfs -s localhost --volfile-id rebalance/9301 --xlator-option
*d'
[root@zod /]# gdb /usr/sbin/glusterfs core.2839
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from
/usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 13112]
[New LWP 2844]
[New LWP 2860]
[New LWP 2841]
[New LWP 2842]
[New LWP 2843]
[New LWP 2840]
[New LWP 2863]
[New LWP 2866]
[New LWP 2854]
[New LWP 2859]
[New LWP 2865]
[New LWP 2862]
[New LWP 2864]
[New LWP 2839]
[New LWP 2861]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
rebalance/9301 --xlator-option *d'.
Program terminated with signal 11, Segmentation fault.
#0  tier_build_migration_qfile (is_promotion=_gf_true, 
    query_cbk_args=0x7f3afa854e70, args=0x7f3fee27bca0) at tier.c:607
607            list_for_each_entry (local_brick, args->brick_list, list) {
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64
libcom_err-1.42.9-7.el7.x86_64 libgcc-4.8.3-9.el7.x86_64
libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-22.el7_1.1.x86_64
openssl-libs-1.0.1e-42.el7_1.9.x86_64 pcre-8.32-14.el7.x86_64
sqlite-3.7.17-6.el7_1.1.x86_64 sssd-client-1.12.2-58.el7_1.14.x86_64
xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  tier_build_migration_qfile (is_promotion=_gf_true, 
    query_cbk_args=0x7f3afa854e70, args=0x7f3fee27bca0) at tier.c:607
#1  tier_promote (args=0x7f3fee27bca0) at tier.c:704
#2  0x00007f4003bd0df5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f40035171ad in clone () from /lib64/libc.so.6

