[Bugs] [Bug 1611106] New: Glusterd crashed on a few (master) nodes
bugzilla at redhat.com
Thu Aug 2 05:07:58 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1611106
Bug ID: 1611106
Summary: Glusterd crashed on a few (master) nodes
Product: GlusterFS
Version: 4.1
Component: glusterd
Keywords: Reopened
Severity: high
Priority: high
Assignee: bugs at gluster.org
Reporter: khiremat at redhat.com
CC: atumball at redhat.com, bmekala at redhat.com,
bugs at gluster.org, khiremat at redhat.com,
nbalacha at redhat.com, rallan at redhat.com,
rhinduja at redhat.com, rhs-bugs at redhat.com,
sankarshan at redhat.com, storage-qa-internal at redhat.com,
vbellur at redhat.com, vdas at redhat.com
Depends On: 1570586, 1576392
Blocks: 1577868
+++ This bug was initially created as a clone of Bug #1576392 +++
Description of problem:
=======================
Glusterd crashed on a few nodes, and the geo-replication status showed
Created/Active instead of the expected Active/Passive.
The geo-replication session had been started, and its status was
reported as follows:
----------------------------------------------------------------------------------------------
[root at dhcp41-226 scripts]# gluster volume geo-replication master 10.70.41.160::slave status

MASTER NODE      MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------
10.70.41.226     master        /rhs/brick3/b7    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.226     master        /rhs/brick1/b1    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.230     master        /rhs/brick2/b5    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.229     master        /rhs/brick2/b4    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.219     master        /rhs/brick2/b6    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.227     master        /rhs/brick3/b8    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.227     master        /rhs/brick1/b2    root          10.70.41.160::slave    N/A             Created    N/A                N/A
10.70.41.228     master        /rhs/brick3/b9    root          10.70.41.160::slave    10.70.41.160    Active     Changelog Crawl    2018-04-23 06:13:53
10.70.41.228     master        /rhs/brick1/b3    root          10.70.41.160::slave    10.70.42.79     Active     Changelog Crawl    2018-04-23 06:13:53
glusterd logs:
-------------
[2018-04-23 07:34:16.850166] E [mem-pool.c:307:__gf_free]
(-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf)
[0x7f98a9e619cf]
-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5)
[0x7f98a9e64ca5] -->/lib64/libglusterfs.so.0(__gf_free+0xac) [0x7f98b53e268c] )
0-: Assertion failed: GF_MEM_HEADER_MAGIC == header->magic
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash:
2018-04-23 07:34:16
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f98b53ba4d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f98b53c4414]
/lib64/libc.so.6(+0x36280)[0x7f98b3a19280]
/lib64/libc.so.6(gsignal+0x37)[0x7f98b3a19207]
/lib64/libc.so.6(abort+0x148)[0x7f98b3a1a8f8]
/lib64/libc.so.6(+0x78cc7)[0x7f98b3a5bcc7]
/lib64/libc.so.6(+0x7f574)[0x7f98b3a62574]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5)[0x7f98a9e64ca5]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf)[0x7f98a9e619cf]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1bdc2)[0x7f98a9e3bdc2]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x23b6e)[0x7f98a9e43b6e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f98b53f3250]
/lib64/libc.so.6(+0x47fc0)[0x7f98b3a2afc0]
---------
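For context on the assertion above: glusterfs's instrumented allocator
(gf_malloc/gf_strdup in libglusterfs) prepends a bookkeeping header
stamped with GF_MEM_HEADER_MAGIC, and __gf_free validates that stamp
before releasing the block; a failed check aborts the process, which
is the "signal received: 6" in the dump. A minimal standalone sketch
of that pattern (the xf_* names are illustrative, not the actual
mem-pool.c code):

/* Sketch of a magic-header allocator, modeled on the pattern behind
 * the "Assertion failed: GF_MEM_HEADER_MAGIC == header->magic" log.
 * xf_* names are hypothetical; the real code lives in mem-pool.c. */
#include <assert.h>
#include <stdlib.h>

#define XF_MEM_HEADER_MAGIC 0xCAFEBABEu

struct xf_mem_header {
        unsigned int magic;     /* stamp written by the allocator */
        size_t       size;      /* requested allocation size */
};

static void *
xf_malloc(size_t size)
{
        struct xf_mem_header *header = malloc(sizeof(*header) + size);
        if (!header)
                return NULL;
        header->magic = XF_MEM_HEADER_MAGIC;
        header->size = size;
        return header + 1;      /* caller sees memory past the header */
}

static void
xf_free(void *ptr)
{
        struct xf_mem_header *header;
        if (!ptr)
                return;
        header = (struct xf_mem_header *)ptr - 1;
        /* Memory that did not come from xf_malloc has no header, so
         * this comparison fails and abort() raises SIGABRT (6). */
        assert(header->magic == XF_MEM_HEADER_MAGIC);
        free(header);
}

int
main(void)
{
        void *ok = xf_malloc(64);
        xf_free(ok);            /* header intact: check passes */
        return 0;
}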
Version-Release number of selected component (if applicable):
=============================================================
mainline
How reproducible:
=================
1/1
Steps to Reproduce:
===================
1. Create a master and a slave cluster from 6 nodes (each)
2. Create and start the master volume (tiered: cold-tier 1x(4+2) and hot-tier 1x3)
3. Create and start the slave volume (tiered: cold-tier 1x(4+2) and hot-tier 1x3)
4. Enable quota on the master volume
5. Enable shared storage on the master volume
6. Set up a geo-rep session between the master and slave volumes
7. Mount the master volume on a client
8. Create data from the master client
Actual results:
================
Glusterd crashed on a few nodes.
The geo-rep session was in Created/Active state.
Expected results:
=================
Glusterd should not crash.
A geo-rep session that has been started should be in Active/Passive state.
(gdb) bt
#0 0x00007f3fbd4d7e4d in __gf_free () from /lib64/libglusterfs.so.0
#1 0x00007f3fb1ff63de in gd_sync_task_begin () from
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#2 0x00007f3fb1ff6c50 in glusterd_op_begin_synctask () from
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#3 0x00007f3fb1fc3d98 in __glusterd_handle_gsync_set () from
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#4 0x00007f3fb1f38b1e in glusterd_big_locked_handler () from
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so
#5 0x00007f3fbd4e8ad0 in synctask_wrap () from /lib64/libglusterfs.so.0
#6 0x00007f3fbbb1ffc0 in ?? () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
(gdb)
--- Additional comment from Worker Ant on 2018-05-09 07:12:51 EDT ---
REVIEW: https://review.gluster.org/19993 (glusterd/geo-rep: Fix glusterd crash)
posted (#1) for review on master by Kotresh HR
--- Additional comment from Worker Ant on 2018-05-12 05:07:02 EDT ---
COMMIT: https://review.gluster.org/19993 committed in master by "Amar Tumballi"
<amarts at redhat.com> with a commit message- glusterd/geo-rep: Fix glusterd crash
Using strdup instead of gf_strdup crashes during free when a mempool
is in use: gf_free checks the magic number in the allocation header,
and that header is not present if the string was duplicated with
plain strdup.
fixes: bz#1576392
Change-Id: Iab36496554b838a036af9d863e3f5fd07fd9780e
Signed-off-by: Kotresh HR <khiremat at redhat.com>
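In other words, the crash was an allocator-family mismatch:
gd_sync_task_begin freed a string through GF_FREE (__gf_free), but the
string had been duplicated with plain strdup, so it carried no header
for the magic check. A condensed, self-contained variant of the
earlier xf_* sketch showing the bug and the fix (xf_strdup stands in
for gf_strdup, xf_free for GF_FREE; the volume name is made up; here
the header is a single stamped word):

#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define XF_MAGIC 0xCAFEBABEu

static char *
xf_strdup(const char *src)      /* like gf_strdup: headered copy */
{
        size_t len = strlen(src) + 1;
        unsigned int *header = malloc(sizeof(*header) + len);
        if (!header)
                return NULL;
        *header = XF_MAGIC;
        memcpy(header + 1, src, len);
        return (char *)(header + 1);
}

static void
xf_free(char *ptr)              /* like GF_FREE: validates the stamp */
{
        unsigned int *header = (unsigned int *)ptr - 1;
        assert(*header == XF_MAGIC);
        free(header);
}

int
main(void)
{
        char *fixed = xf_strdup("mastervol");   /* the patched path */
        xf_free(fixed);                         /* stamp found: ok */

        char *buggy = strdup("mastervol");      /* the pre-fix path */
        xf_free(buggy);         /* no stamp -> assert fails, SIGABRT */
        return 0;
}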
--- Additional comment from Worker Ant on 2018-05-14 23:05:03 EDT ---
REVISION POSTED: https://review.gluster.org/20019 (glusterd/geo-rep: Fix
glusterd crash) posted (#2) for review on release-3.12 by Kotresh HR
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1570586
[Bug 1570586] Glusterd crashed on a few (master) nodes
https://bugzilla.redhat.com/show_bug.cgi?id=1576392
[Bug 1576392] Glusterd crashed on a few (master) nodes
https://bugzilla.redhat.com/show_bug.cgi?id=1577868
[Bug 1577868] Glusterd crashed on a few (master) nodes