[Bugs] [Bug 1603063] ./tests/bugs/glusterd/validating-server-quorum.t is generating core

bugzilla at redhat.com bugzilla at redhat.com
Mon Jul 23 03:10:29 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1603063

Atin Mukherjee <amukherj at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amukherj at redhat.com



--- Comment #1 from Atin Mukherjee <amukherj at redhat.com> ---
I had a look at
https://build.gluster.org/job/regression-test-burn-in/4044/consoleFull which
is a recent report of the same crash. Apparently, during replace-brick, the
glusterd process hit an assertion failure because dst_brickinfo was NULL when
it tried to resolve this brick through glusterd_resolve_brick ().

(gdb) bt
#0  0x00007f523fcf4277 in raise () from ./lib64/libc.so.6
#1  0x00007f523fcf5968 in abort () from ./lib64/libc.so.6
#2  0x00007f523fced096 in __assert_fail_base () from ./lib64/libc.so.6
#3  0x00007f523fced142 in __assert_fail () from ./lib64/libc.so.6
#4  0x00007f523610bb3e in glusterd_resolve_brick (brickinfo=0x0)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-utils.c:1134
#5  0x00007f5236190c79 in glusterd_op_replace_brick (dict=0x7f5228000e78, rsp_dict=0x7f5228015a28)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:521
#6  0x00007f52361e58cb in gd_mgmt_v3_commit_fn (op=GD_OP_REPLACE_BRICK, dict=0x7f5228000e78,
    op_errstr=0x7f5224239338, op_errno=0x7f522423932c, rsp_dict=0x7f5228015a28)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-mgmt.c:310
#7  0x00007f52361e30c9 in glusterd_handle_commit_fn (req=0x7f52240041e8)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:609
#8  0x00007f52360d923b in glusterd_big_locked_handler (req=0x7f52240041e8,
    actor_fn=0x7f52361e2dee <glusterd_handle_commit_fn>)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-handler.c:80
#9  0x00007f52361e4178 in glusterd_handle_commit (req=0x7f52240041e8)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:993
#10 0x00007f52416f2ee4 in synctask_wrap ()
    at /home/jenkins/root/workspace/regression-test-burn-in/libglusterfs/src/syncop.c:375
#11 0x00007f523fd06030 in ?? () from ./lib64/libc.so.6
#12 0x0000000000000000 in ?? ()
(gdb) p src_brickinfo
No symbol "src_brickinfo" in current context.
(gdb) p dst_brickinfo
No symbol "dst_brickinfo" in current context.
(gdb) f 5
#5  0x00007f5236190c79 in glusterd_op_replace_brick (dict=0x7f5228000e78, rsp_dict=0x7f5228015a28)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:521
521     /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-replace-brick.c: No such file or directory.
(gdb) p src_brickinfo
$1 = (glusterd_brickinfo_t *) 0x7f521c02ae70
(gdb) p *src_brickinfo
$2 = {hostname = "127.1.1.2", '\000' <repeats 1014 times>,
  path = "/d/backends/2/patchy2", '\000' <repeats 4074 times>,
  real_path = "/d/backends/2/patchy2", '\000' <repeats 4074 times>,
  device_path = '\000' <repeats 4095 times>,
  mount_dir = "/backends/2/patchy2", '\000' <repeats 4076 times>,
  brick_id = "patchy-client-1", '\000' <repeats 1008 times>, fstype = '\000' <repeats 254 times>,
  mnt_opts = '\000' <repeats 1023 times>, brick_list = {next = 0x7f521c035970, prev = 0x7f521c029d70},
  uuid = "\311\333d\361\323\313Fm\244c\177\321\030Ld\t", port = 0, rdma_port = 0, logfile = 0x0,
  shandle = 0x7f521c03d6a0, status = GF_BRICK_STOPPED, rpc = 0x0, decommissioned = 0,
  vg = '\000' <repeats 4095 times>, caps = 0, snap_status = 0, group = 0,
  jbr_uuid = '\000' <repeats 15 times>, statfs_fsid = 0, fs_share_count = 0, port_registered = false,
  start_triggered = false, restart_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
      __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
    __size = '\000' <repeats 39 times>, __align = 0}}
(gdb) p dst_brickinfo
$3 = (glusterd_brickinfo_t *) 0x0
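
For illustration only: frame #4 shows the assertion in
glusterd_resolve_brick () firing on brickinfo=0x0, which aborts the whole
daemon. Below is a minimal C sketch of the difference between asserting on the
pointer and returning an error to the caller. The type and both function names
are hypothetical stand-ins, not glusterd code, and this is not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for glusterd_brickinfo_t. */
typedef struct {
    const char *hostname;
    int resolved;
} brickinfo_t;

/* Same shape as the crash: asserting on the argument aborts the process
 * (SIGABRT) when the caller passes NULL. */
int resolve_brick_asserting(brickinfo_t *brickinfo)
{
    assert(brickinfo != NULL);
    brickinfo->resolved = 1;
    return 0;
}

/* Alternative shape: hand the NULL back to the caller as an error, so the
 * operation fails but the process keeps running. */
int resolve_brick_checked(brickinfo_t *brickinfo)
{
    if (brickinfo == NULL)
        return -1;
    brickinfo->resolved = 1;
    return 0;
}
```

With the checked variant the replace-brick commit would fail rather than dump
core, but the underlying question of why the pointer is NULL stays open.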

The only way dst_brickinfo could have been overwritten here is a friend import
running at the same time in a separate synctask, and that is exactly what
thread 9 is doing.

Thread 9 (LWP 23474):
#0  0x00007f523fd8356d in nanosleep () from ./lib64/libc.so.6
#1  0x00007f523fd83404 in sleep () from ./lib64/libc.so.6
#2  0x00007f523620113e in glusterd_proc_stop (proc=0x7f5241a360d8, sig=15, flags=4)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-proc-mgmt.c:114
#3  0x00007f52362023f0 in glusterd_svc_stop (svc=0x7f5241a340c0, sig=15)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-svc-mgmt.c:239
#4  0x00007f52362034c8 in glusterd_shdsvc_manager (svc=0x7f5241a340c0, data=0x0, flags=2)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-shd-svc.c:126
#5  0x00007f5236205da3 in glusterd_svcs_manager (volinfo=0x0)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-svc-helper.c:126
#6  0x00007f5236118b7a in glusterd_import_friend_volumes_synctask (opaque=0x7f5224003548)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-utils.c:4817
#7  0x00007f52416f2ee4 in synctask_wrap ()
    at /home/jenkins/root/workspace/regression-test-burn-in/libglusterfs/src/syncop.c:375
#8  0x00007f523fd06030 in ?? () from ./lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

This seems to be a race; however, I'm still not sure why we are hitting it so
frequently now.
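
To make the suspected window concrete, here is a small, single-threaded C
sketch of the general pattern: one task looks up a brick by path while another
task is part-way through rebuilding the volume's brick list, so the lookup
comes back NULL. Everything here (volume_t, brick_t, import_begin, import_add,
find_brick, commit_replace) is a hypothetical simplification for illustration
and does not reflect glusterd's actual data structures or import code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_BRICKS 4

/* Hypothetical, reduced brick and volume records. */
typedef struct {
    char path[64];
} brick_t;

typedef struct {
    brick_t bricks[MAX_BRICKS];
    size_t count;
} volume_t;

/* Import step 1: drop the old brick list before repopulating it. */
void import_begin(volume_t *vol)
{
    vol->count = 0;
}

/* Import step 2: add one brick back to the list. */
void import_add(volume_t *vol, const char *path)
{
    if (vol->count >= MAX_BRICKS)
        return;
    brick_t *b = &vol->bricks[vol->count++];
    snprintf(b->path, sizeof(b->path), "%s", path);
}

/* Look a brick up by path; returns NULL when it is not in the list. */
brick_t *find_brick(volume_t *vol, const char *path)
{
    for (size_t i = 0; i < vol->count; i++) {
        if (strcmp(vol->bricks[i].path, path) == 0)
            return &vol->bricks[i];
    }
    return NULL;
}

/* Commit step of a replace: fails cleanly if the destination brick is
 * missing, e.g. because an import is between step 1 and step 2. */
int commit_replace(volume_t *vol, const char *dst_path)
{
    brick_t *dst = find_brick(vol, dst_path);
    if (dst == NULL)
        return -1;
    return 0;
}
```

If commit_replace runs between import_begin and the matching import_add, the
lookup fails even though the brick exists both before and after the import.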
