[Gluster-devel] Update on mgmt_v3-locks.t on NetBSD

Kaushal M kshlmster at gmail.com
Thu Apr 16 14:22:49 UTC 2015


I finally started testing this today after getting my NetBSD VMs set up
correctly.

I ran the test on a NetBSD-6.1.5 VM with a NetBSD-7 kernel, and on a NetBSD-7 VM.

On the NetBSD-6 VM the test always failed, whereas it never failed on
the NetBSD-7 VM.

Tests 11 and 14 always failed because GlusterD crashes and hangs while
exiting.

At mgmt_v3-locks.t:107, the test attempts to kill GlusterD 3:
`kill_glusterd 3`
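As background, a kill helper of this kind typically looks up the daemon's pid from its pidfile and sends SIGTERM. The sketch below is illustrative only: the helper name and pidfile path are assumptions, not the actual tests/*.rc definition.

```shell
#!/bin/sh
# Hypothetical sketch of what a kill_glusterd-style helper does: read the
# daemon's pid from a pidfile and send SIGTERM. The pidfile path and helper
# name are illustrative assumptions, not the real test-harness code.
kill_by_pidfile() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    kill -TERM "$(cat "$pidfile")"
}

# Demo: a dummy long-running process stands in for glusterd.
sleep 300 &
demo_pid=$!
echo "$demo_pid" > /tmp/demo_glusterd.pid

kill_by_pidfile /tmp/demo_glusterd.pid
wait "$demo_pid" 2>/dev/null || true

# A well-behaved daemon is gone after SIGTERM; the bug described here is
# that glusterd instead crashed in its shutdown path and lingered.
if kill -0 "$demo_pid" 2>/dev/null; then result=alive; else result=dead; fi
echo "$result"
rm -f /tmp/demo_glusterd.pid
```

The point of the pattern is that the test only sends the signal; it relies on the daemon's own cleanup path to actually exit, which is exactly what breaks below.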

But GlusterD crashes and hangs while being killed, and never actually
dies. I obtained the following trace. It shows the SIGTERM cleanup path
(glusterfs_sigwaiter -> cleanup_and_exit -> _gf_msg) taking a SIGSEGV
(signum=11) inside libexecinfo's getframeaddr while collecting a
backtrace for the "shutting down" log message, after which the segfault
handler re-enters backtrace():
```
Thread 5 (LWP 3):
#0  0x00007f7ff78746c2 in getframeaddr (level=10) at
../../contrib/libexecinfo/execinfo.c:201
#1  backtrace (buffer=<optimized out>, size=200) at
../../contrib/libexecinfo/execinfo.c:339
#2  0x00007f7ff78210f1 in _gf_msg_backtrace_nomem (level=GF_LOG_ALERT,
stacksize=<optimized out>) at logging.c:1084
#3  0x00007f7ff7837bf9 in gf_print_trace (signum=11,
ctx=0x7f7ff7b01000) at common-utils.c:618
#4  <signal handler called>
#5  0x00007f7ff7874656 in getframeaddr (level=7) at
../../contrib/libexecinfo/execinfo.c:198
#6  backtrace (buffer=<optimized out>, size=5) at
../../contrib/libexecinfo/execinfo.c:339
#7  0x00007f7ff78211c3 in _gf_msg_backtrace (stacksize=<optimized
out>, callstr=0x7f7ff2ffeaf0 "", strsize=5) at logging.c:1118
#8  0x00007f7ff7822cea in _gf_msg (domain=0x416d8e "", file=0x413155
"glusterfsd.c", function=0x415970 "cleanup_and_exit", line=1212,
level=GF_LOG_WARNING, errnum=0, trace=1, msgid=100032,
   fmt=0x4141d8 "received signum (%d), shutting down") at logging.c:2032
#9  0x000000000040855c in cleanup_and_exit (signum=15) at glusterfsd.c:1212
#10 0x0000000000408635 in glusterfs_sigwaiter (arg=<optimized out>) at
glusterfsd.c:1983
#11 0x00007f7ff600b2ce in ?? () from /usr/lib/libpthread.so.1
#12 0x00007f7ff5275d70 in ___lwp_park50 () from /usr/lib/libc.so.12
#13 0x00007f7ff3000000 in ?? ()
#14 0x00007f7ff7ff04c0 in ?? ()
#15 0x0000000111110001 in ?? ()
#16 0x0000000033330003 in ?? ()
#17 0x0000000000000000 in ?? ()
```
Because this GlusterD didn't die, test 11 (`mgmt_v3-locks.t:111 :
EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers`) and test 14
(`mgmt_v3-locks.t:115 : TEST $glusterd_3`) fail.
Test 11 fails because the supposedly dead GlusterD still shows up as
connected in peer status.
Test 14 fails because a new GlusterD process cannot be started while the
old one still exists.
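For anyone unfamiliar with the harness: EXPECT_WITHIN re-runs a check until it prints the expected value or a timeout expires, so test 11 keeps polling `check_peers` for the whole $PROBE_TIMEOUT before giving up. A minimal stand-in for that retry loop (function names here are assumptions, not the actual tests/include.rc code):

```shell
#!/bin/sh
# Illustrative stand-in for an EXPECT_WITHIN-style loop: re-run a check
# every second until it prints the expected value or the timeout (in
# seconds) expires. Not the actual tests/include.rc implementation.
expect_within() {
    timeout="$1"; expected="$2"; shift 2
    while [ "$timeout" -gt 0 ]; do
        if [ "$("$@")" = "$expected" ]; then return 0; fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Demo check that only reports "1" (all peers connected) from its third
# call onward, mimicking peer state taking a moment to settle.
count_file=$(mktemp)
echo 0 > "$count_file"
check_peers_stub() {
    n=$(($(cat "$count_file") + 1))
    echo "$n" > "$count_file"
    if [ "$n" -ge 3 ]; then echo 1; else echo 0; fi
}

if expect_within 10 1 check_peers_stub; then result=PASS; else result=FAIL; fi
echo "$result"
rm -f "$count_file"
```

With a hung glusterd, the real check never reaches the expected value, so the poll runs out the full timeout and the test is marked failed.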

The failure I observed above differs from the one I saw during the
regression run (I'm referencing the regression run for one of my
changes: http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/3140/consoleFull).
During that regression, almost all of the tests in mgmt_v3-locks.t failed.

I'll continue investigating to identify why the crash on kill is happening.

I still don't know why my NetBSD-7 runs of the test are passing. Maybe
the issue has been fixed by some other change.

I performed my tests at commit c07f16656 'tests: early bail-out on bad
status or new core file(s)' plus my change at
https://review.gluster.org/10192.


~kaushal

