[Bugs] [Bug 1258144] New: Data Tiering: Tier daemon crashed when detach tier start was issued while IOs were happening
bugzilla at redhat.com
Sat Aug 29 15:40:56 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1258144
Bug ID: 1258144
Summary: Data Tiering: Tier daemon crashed when detach tier
start was issued while IOs were happening
Product: GlusterFS
Version: 3.7.3
Component: tiering
Severity: urgent
Assignee: bugs at gluster.org
Reporter: nchilaka at redhat.com
QA Contact: bugs at gluster.org
CC: bugs at gluster.org
Description of problem:
=========================
I created a replicated tier over a dist-rep volume, mounted the volume over NFS,
and turned on CTR.
I generated a fair amount of I/O by untarring a Linux kernel tarball.
The files were demoted after some time, as expected.
I then renamed the existing untarred directory and started the untar again.
While this was going on, I issued a detach tier start.
I noted the following observations:
1) The tier daemon crashed.
2) As a result, both the rebalance tier status and the rebalance status show as
failed, as below:
gluster v rebal g1 status
Node         Rebalanced-files  size    scanned  failures  skipped  status  run time in secs
-----------  ----------------  ------  -------  --------  -------  ------  ----------------
localhost    0                 0Bytes  0        0         0        failed  0.00
10.70.46.36  0                 0Bytes  0        0         0        failed  0.00
3) *IMPORTANT* The IOs nevertheless continued, and the new files were written to
the hot tier only (this could eventually fill the hot tier).
4) After some time, when I issued "gluster v status <vname>", it failed as below:
[root at nag-manual-node1 ~]# gluster v status g1
Commit failed on localhost. Please check the log file for more details.
5) The AFR daemons were also not showing up in ps -ef.
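
The per-node rebalance status from observation 2 can also be checked mechanically. What follows is a minimal Python sketch that is not part of the original report: it parses the status text and lists the nodes whose status is "failed". The function names are hypothetical, and it assumes each row arrives as one line of whitespace-separated columns, as the CLI prints it before mail wrapping.

```python
# Sample rows copied from this report, with the mail-wrapped lines rejoined.
SAMPLE = """\
Node Rebalanced-files size scanned failures skipped status run time in secs
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 0 0Bytes 0 0 0 failed 0.00
10.70.46.36 0 0Bytes 0 0 0 failed 0.00
"""


def parse_rebalance_status(text):
    """Return one dict per node row; header and separator rows are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: a data row has at least 8 columns and a numeric file count.
        # The header and the dashed separator both fail the isdigit() check.
        if len(parts) < 8 or not parts[1].isdigit():
            continue
        rows.append({
            "node": parts[0],
            "rebalanced_files": int(parts[1]),
            "size": parts[2],
            "scanned": int(parts[3]),
            "failures": int(parts[4]),
            "skipped": int(parts[5]),
            # The status may span several words, so take everything between
            # the skipped count and the trailing run time.
            "status": " ".join(parts[6:-1]),
            "run_time_secs": float(parts[-1]),
        })
    return rows


def failed_nodes(text):
    """Names of the nodes whose rebalance status is 'failed'."""
    return [r["node"] for r in parse_rebalance_status(text) if r["status"] == "failed"]


if __name__ == "__main__":
    print(failed_nodes(SAMPLE))
```

In practice the text would come from capturing the output of the status command shown above rather than from an embedded sample.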
Version-Release number of selected component (if applicable):
=================================================================
[root at nag-manual-node1 ~]# rpm -qa|grep gluster
glusterfs-libs-3.7.3-0.82.git6c4096f.el6.x86_64
glusterfs-fuse-3.7.3-0.82.git6c4096f.el6.x86_64
glusterfs-server-3.7.3-0.82.git6c4096f.el6.x86_64
glusterfs-3.7.3-0.82.git6c4096f.el6.x86_64
glusterfs-api-3.7.3-0.82.git6c4096f.el6.x86_64
glusterfs-cli-3.7.3-0.82.git6c4096f.el6.x86_64
python-gluster-3.7.3-0.82.git6c4096f.el6.noarch
glusterfs-client-xlators-3.7.3-0.82.git6c4096f.el6.x86_64
[root at nag-manual-node1 ~]# gluster --version
glusterfs 3.7.3 built on Aug 27 2015 01:23:05
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
Steps to Reproduce:
===================
1. Created a 2x2 volume and started it.
2. Attached a 1x2 replica hot tier and mounted the volume over NFS.
3. Performed a Linux kernel untar.
4. Demotions happened after some time (as expected).
5. Untarred the Linux kernel again, after renaming the old directory.
6. While the untar was in progress, issued a detach-tier start.
7. This caused the tier daemon to crash (and probably the replica daemons too,
but I am not sure: ps -ef didn't show them, yet the files that were still being
untarred were available on both bricks of the hot pair).
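
The steps above can be sketched as a dry-run command list. This is an illustration only, not the reporter's actual script: the host names, brick paths, mount point and tarball name are hypothetical, and the gluster subcommand spellings (attach-tier, detach-tier, features.ctr-enabled) are assumed from the 3.7 CLI and should be checked against "gluster volume help" on the installed build. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def repro_commands(vol, hosts, mnt, tarball):
    """Build (but do not run) the command sequence for the reproduction steps."""
    h1, h2 = hosts
    # Hypothetical brick layout: each replica pair spans both hosts.
    cold = [f"{h}:/bricks/{vol}/cold{i}" for i in (1, 2) for h in (h1, h2)]
    hot = [f"{h}:/bricks/{vol}/hot1" for h in (h1, h2)]
    return [
        # Step 1: create and start a 2x2 distributed-replicate volume.
        ["gluster", "volume", "create", vol, "replica", "2", *cold],
        ["gluster", "volume", "start", vol],
        # Step 2: attach a 1x2 replicated hot tier, turn on CTR, mount over NFS.
        ["gluster", "volume", "attach-tier", vol, "replica", "2", *hot],
        ["gluster", "volume", "set", vol, "features.ctr-enabled", "on"],
        ["mount", "-t", "nfs", "-o", "vers=3", f"{h1}:/{vol}", mnt],
        # Step 3: first untar. Step 4 (demotion) only requires waiting.
        ["tar", "-xf", tarball, "-C", mnt],
        # Step 5: rename the old tree, then untar again. When run for real,
        # this second untar has to be backgrounded so it overlaps with step 6.
        ["mv", f"{mnt}/linux-4.1.6", f"{mnt}/linux-4.1.6.old"],
        ["tar", "-xf", tarball, "-C", mnt],
        # Step 6: start the detach while the untar is still writing.
        ["gluster", "volume", "detach-tier", vol, "start"],
    ]


if __name__ == "__main__":
    cmds = repro_commands("g1", ("node1", "node2"), "/mnt/g1", "linux-4.1.6.tar.xz")
    for cmd in cmds:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the commands as argument lists makes it easy to review the sequence before executing anything against a real cluster.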
CRASH
======
[2015-08-29 16:02:01.669901] E [MSGID: 109037] [tier.c:898:tier_start] 0-g1-tier-dht: Demotion failed!
[2015-08-29 16:02:00.311020] I [MSGID: 109038] [tier.c:350:tier_migrate_using_query_file] 0-g1-tier-dht: Tier 0 src_subvol g1-hot-dht file .gitignore
[2015-08-29 16:02:00.312280] I [MSGID: 109038] [tier.c:109:tier_check_same_node] 0-g1-tier-dht: /linux-4.1.6/.gitignore does not belong to this node
[2015-08-29 16:04:00.698176] I [MSGID: 109038] [tier.c:574:tier_build_migration_qfile] 0-g1-tier-dht: Failed to remove /var/run/gluster/demotequeryfile-20559
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-08-29 16:04:00
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.3
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3560c25936]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x3560c4549f]
/lib64/libc.so.6[0x340e8326a0]
/lib64/libc.so.6[0x340e93372f]
/usr/lib64/libgfdb.so.0(gf_sql_query_function+0xdf)[0x7fa812066dcf]
/usr/lib64/libgfdb.so.0(gf_sqlite3_find_unchanged_for_time+0xd5)[0x7fa81206bb05]
/usr/lib64/libgfdb.so.0(find_unchanged_for_time+0x4f)[0x7fa812065f1f]
/usr/lib64/glusterfs/3.7.3/xlator/cluster/tier.so(+0x5410d)[0x7fa81266f10d]
/usr/lib64/libglusterfs.so.0(dict_foreach_match+0x74)[0x3560c1d2d4]
/usr/lib64/libglusterfs.so.0(dict_foreach+0x18)[0x3560c1d388]
/usr/lib64/glusterfs/3.7.3/xlator/cluster/tier.so(+0x55ea7)[0x7fa812670ea7]
/lib64/libpthread.so.0[0x340ec07a51]
/lib64/libc.so.6(clone+0x6d)[0x340e8e89ad]
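
For triage, the backtrace above can be split into object, symbol and offset fields. The following is a small Python sketch under stated assumptions: the regular expression is fitted to the three frame shapes that appear in this trace (symbol plus offset, offset only, and bare address), and the helper names are hypothetical. It picks out the first symbolized frame that lies outside libglusterfs and libc, which is where the crash handler and signal frames live.

```python
import re

FRAME_RE = re.compile(
    r"^(?P<obj>[^\s(\[]+)"                                # shared object path
    r"(?:\((?P<sym>[^+)]*)\+(?P<off>0x[0-9a-fA-F]+)\))?"  # optional (symbol+offset)
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"                      # absolute address
)

# A subset of the frames from this report.
SAMPLE = """\
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3560c25936]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x3560c4549f]
/lib64/libc.so.6[0x340e8326a0]
/lib64/libc.so.6[0x340e93372f]
/usr/lib64/libgfdb.so.0(gf_sql_query_function+0xdf)[0x7fa812066dcf]
/usr/lib64/libgfdb.so.0(gf_sqlite3_find_unchanged_for_time+0xd5)[0x7fa81206bb05]
/usr/lib64/glusterfs/3.7.3/xlator/cluster/tier.so(+0x5410d)[0x7fa81266f10d]
"""


def parse_frames(text):
    """Parse backtrace lines into dicts; lines that do not match are ignored."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frame = m.groupdict()
            # An empty symbol, as in "(+0x5410d)", means the frame is unsymbolized.
            frame["sym"] = frame["sym"] or None
            frames.append(frame)
    return frames


def first_named_frame(frames, skip=("libglusterfs.so", "libc.so")):
    """First frame with a symbol name whose object is not in the skip list."""
    for frame in frames:
        if frame["sym"] and not any(s in frame["obj"] for s in skip):
            return frame
    return None


if __name__ == "__main__":
    top = first_named_frame(parse_frames(SAMPLE))
    print(top["obj"], top["sym"], top["off"])
```

The unsymbolized tier.so frames keep only their offsets, so resolving them to function names would need the matching debuginfo and a tool such as addr2line.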