[Bugs] [Bug 1805052] New: Disperse volume : Ganesha crash with IO in 4+2 config when one glusterfsd restart every 600s
bugzilla at redhat.com
Thu Feb 20 07:38:33 UTC 2020
https://bugzilla.redhat.com/show_bug.cgi?id=1805052
Bug ID: 1805052
Summary: Disperse volume : Ganesha crash with IO in 4+2 config
when one glusterfsd restart every 600s
Product: GlusterFS
Version: 5
Status: NEW
Component: disperse
Assignee: bugs at gluster.org
Reporter: pkarampu at redhat.com
CC: aspandey at redhat.com, bugs at gluster.org,
kinglongmee at gmail.com
Depends On: 1729772
Target Milestone: ---
Classification: Community
+++ This bug was initially created as a clone of Bug #1729772 +++
Description of problem:
The LTP ftestxx tests were run against a 4+2 disperse volume.
While the tests ran on the NFS client, a bash script rebooted one node
(a cluster node on which ganesha.nfsd is not running) every 600s.
ganesha.nfsd crashed during name heal:
Core was generated by `/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
/etc/ganesha/ganesha.conf -N N'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f0d5ae8c5a9 in ec_heal_name (frame=0x7f0d57c6ca28,
ec=0x7f0d5b62d280, parent=0x0, name=0x7f0d57537d31 "b",
participants=0x7f0d0dfffe30 "\001\001\001") at ec-heal.c:1685
1685 loc.inode = inode_new(parent->table);
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.6-13.el7.x86_64 dbus-libs-1.10.24-12.el7.x86_64
elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64
glibc-2.17-260.el7.x86_64 gssproxy-0.7.0-21.el7.x86_64
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64
libacl-2.2.51-14.el7.x86_64 libattr-2.4.46-13.el7.x86_64
libblkid-2.23.2-59.el7.x86_64 libcap-2.22-9.el7.x86_64
libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64
libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64
libnfsidmap-0.25-19.el7.x86_64 libselinux-2.5-14.1.el7.x86_64
libuuid-2.23.2-59.el7.x86_64 lz4-1.7.5-2.el7.x86_64
openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64
systemd-libs-219-62.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64
zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f0d5ae8c5a9 in ec_heal_name (frame=0x7f0d57c6ca28,
ec=0x7f0d5b62d280, parent=0x0, name=0x7f0d57537d31 "b",
participants=0x7f0d0dfffe30 "\001\001\001") at ec-heal.c:1685
#1 0x00007f0d5ae93cae in ec_heal_do (this=0x7f0d5b65ac00,
data=0x7f0d24e3c028, loc=0x7f0d24e3c358, partial=0) at ec-heal.c:3050
#2 0x00007f0d5ae94455 in ec_synctask_heal_wrap (opaque=0x7f0d24e3c028)
at ec-heal.c:3139
#3 0x00007f0d6d1268c9 in synctask_wrap () at syncop.c:369
#4 0x00007f0d6c6bf010 in ?? () from /lib64/libc.so.6
#5 0x0000000000000000 in ?? ()
(gdb) frame 1
#1 0x00007f0d5ae93cae in ec_heal_do (this=0x7f0d5b65ac00,
data=0x7f0d24e3c028, loc=0x7f0d24e3c358, partial=0) at ec-heal.c:3050
3050 ret = ec_heal_name(frame, ec, loc->parent, (char *)loc->name,
(gdb) p loc
$1 = (loc_t *) 0x7f0d24e3c358
(gdb) p *loc
$2 = {
path = 0x7f0d57537d00 "/nfsshare/ltp-eZQlnozjnX/ftegVRmbT/ftest05.20436/b",
name = 0x7f0d57537d31 "b", inode = 0x7f0d24255b28, parent = 0x0,
gfid = "\263\341\223\031\301\245I\260\234\334\017\to%\305^",
pargfid = '\000' <repeats 15 times>}
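The trace above shows ec_heal_do passing loc->parent to ec_heal_name while loc->parent is 0x0, so the parent->table dereference at ec-heal.c:1685 faults. As an illustration only, the failing condition can be sketched with simplified stand-in types. The sketch_* names below are hypothetical and are not the real GlusterFS definitions, and this guard is not the change merged in review 23029, which concerns how ctx->loc is handled in ec_fix_open/opendir.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the real GlusterFS types. */
typedef struct sketch_inode_table {
    int id;
} sketch_inode_table_t;

typedef struct sketch_inode {
    sketch_inode_table_t *table;
} sketch_inode_t;

typedef struct sketch_loc {
    const char *name;
    sketch_inode_t *inode;
    sketch_inode_t *parent; /* NULL in the crashing loc from the core */
} sketch_loc_t;

/*
 * Hypothetical precheck: returns 0 when a name heal could safely read
 * loc->parent->table, or -EINVAL when the parent (or name) is missing.
 */
static int
sketch_heal_name_precheck(const sketch_loc_t *loc)
{
    assert(loc != NULL);

    if (loc->parent == NULL || loc->name == NULL)
        return -EINVAL;

    if (loc->parent->table == NULL)
        return -EINVAL;

    return 0;
}
```

A loc shaped like the one printed by gdb (name "b", inode set, parent NULL) would be rejected by this check instead of reaching the dereference. Whether skipping the name heal is the right behavior in that case is a separate question that this sketch does not answer.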
--- Additional comment from Worker Ant on 2019-07-14 13:03:24 UTC ---
REVIEW: https://review.gluster.org/23029 (cluster/ec: do loc_copy from ctx->loc
in fd->lock) posted (#2) for review on master by Kinglong Mee
--- Additional comment from Worker Ant on 2019-07-17 16:24:24 UTC ---
REVIEW: https://review.gluster.org/23029 (cluster/ec: skip updating ctx->loc
again when ec_fix_open/opendir) merged (#4) on master by Kinglong Mee
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1729772
[Bug 1729772] Disperse volume : Ganesha crash with IO in 4+2 config when one
glusterfsd restart every 600s