[Bugs] [Bug 1627610] glusterd crash in regression build
bugzilla at redhat.com
Tue Sep 11 03:06:41 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1627610
Sanju <srakonde at redhat.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|bugs at gluster.org         |srakonde at redhat.com
--- Comment #1 from Sanju <srakonde at redhat.com> ---
Root Cause:
From Thread 7:
#10 0x00007f50dd2801f9 in glusterd_store_volinfo (volinfo=0x902290, ac=GLUSTERD_VOLINFO_VER_AC_NONE)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1806
From Thread 1:
#10 0x00007f50dd2801f9 in glusterd_store_volinfo (volinfo=0x902290, ac=GLUSTERD_VOLINFO_VER_AC_NONE)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1806
From the above snippets of the "t a a bt" (thread apply all bt) output, we can
see that Thread 7 and Thread 1 are operating on the same volinfo structure
(volinfo=0x902290).
Source code for glusterd_store_volinfo_write:
int32_t
glusterd_store_volinfo_write (int fd, glusterd_volinfo_t *volinfo)
{
        int32_t            ret     = -1;
        gf_store_handle_t *shandle = NULL;

        GF_ASSERT (fd > 0);
        GF_ASSERT (volinfo);
        GF_ASSERT (volinfo->shandle);

        shandle = volinfo->shandle;
        ret = glusterd_volume_exclude_options_write (fd, volinfo);
        if (ret)
                goto out;

        shandle->fd = fd;
        dict_foreach (volinfo->dict, _storeopts, shandle);
        dict_foreach (volinfo->gsync_slaves, _storeslaves, shandle);
        shandle->fd = 0;
out:
        gf_msg_debug (THIS->name, 0, "Returning %d", ret);
        return ret;
}
At Thread 1,
#8 0x00007f50dd27e211 in glusterd_store_volinfo_write (fd=8, volinfo=0x902290)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1157
glusterd_store_volinfo_write calls _storeopts, which in turn calls
gf_store_save_value. _storeopts also asserts that fd > 0. In
glusterd_store_volinfo_write the fd value is 8.
#4 0x00007f50e882b341 in gf_store_save_value (fd=0, key=0x91bff0 "performance.client-io-threads", value=0x8bcc40 "off")
    at /home/jenkins/root/workspace/regression-test-burn-in/libglusterfs/src/store.c:344
From the above we can see that the fd value has become 0 by the time
gf_store_save_value is called.
At Thread 7,
#8 0x00007f50dd27edbf in glusterd_store_brickinfos (volinfo=0x902290, vol_fd=16)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1373
#9 0x00007f50dd27fa35 in glusterd_store_perform_volume_store (volinfo=0x902290)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1613
#10 0x00007f50dd2801f9 in glusterd_store_volinfo (volinfo=0x902290, ac=GLUSTERD_VOLINFO_VER_AC_NONE)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-store.c:1806
#11 0x00007f50dd258a76 in glusterd_restart_bricks (opaque=0x0)
    at /home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-utils.c:6422
#12 0x00007f50e883111e in synctask_wrap ()
    at /home/jenkins/root/workspace/regression-test-burn-in/libglusterfs/src/syncop.c:375
#13 0x00007f50e6e42030 in ?? () from ./lib64/libc.so.6
#14 0x0000000000000000 in ?? ()
In the stack, we can see glusterd_store_perform_volume_store calling
glusterd_store_brickinfos. Before calling glusterd_store_brickinfos,
glusterd_store_perform_volume_store calls glusterd_store_volinfo_write, which
resets shandle->fd to 0 before it returns.
So Thread 7 set the shared fd value to 0, whereas Thread 1 still expects fd > 0.
This happens because glusterd_restart_bricks runs in a separate synctask, as
the Thread 7 backtrace shows (glusterd_restart_bricks at frame #11, under
synctask_wrap).
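The interleaving described above can be replayed in a single thread, which
makes it deterministic. This is only an illustrative sketch: struct
demo_handle and the demo_* functions are hypothetical stand-ins, not glusterd
code, and the fd values 8 and 16 are borrowed from the backtraces purely as
sample values.

```c
#include <assert.h>

/* Hypothetical stand-in for gf_store_handle_t; only the shared fd matters. */
struct demo_handle {
    int fd;
};

/* Start of a write pass: publish the caller's fd on the shared handle. */
static void demo_begin_write(struct demo_handle *h, int fd)
{
    assert(fd > 0);   /* mirrors GF_ASSERT (fd > 0) in the source above */
    h->fd = fd;
}

/* End of a write pass: reset the shared fd, as "shandle->fd = 0" does. */
static void demo_end_write(struct demo_handle *h)
{
    h->fd = 0;
}

/*
 * Replays the suspected interleaving step by step and returns the fd that
 * the first writer would observe in its next callback.
 */
int demo_replay_interleaving(void)
{
    struct demo_handle shared = { 0 };

    demo_begin_write(&shared, 8);   /* writer A (like Thread 1) starts */
    demo_begin_write(&shared, 16);  /* writer B (like Thread 7) overwrites fd */
    demo_end_write(&shared);        /* writer B finishes and resets fd to 0 */

    return shared.fd;               /* writer A's next callback reads this */
}
```

Writer A never called demo_end_write, yet the fd it reads back is no longer the
one it stored, which matches the fd=8 versus fd=0 mismatch between frames #8
and #4 of Thread 1.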
One possible solution is to acquire a lock before writing, so that the write
happens inside a critical section. The solution needs further exploration.
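A minimal sketch of the locking idea, assuming plain POSIX threads and a mutex
stored next to the shared fd. All names here (struct demo_store, struct
demo_writer, demo_writer_run, demo_run_two_writers) are hypothetical, and the
actual fix in glusterd may need a different lock primitive, since synctasks are
not ordinary threads.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical shared store handle: the fd plus a lock guarding it. */
struct demo_store {
    pthread_mutex_t lock;   /* assumed addition: serialises writers */
    int fd;                 /* shared fd, playing the role of shandle->fd */
    long mismatches;        /* times a writer saw an fd it did not store */
};

/* Per-thread arguments: which store to write to and which fd to use. */
struct demo_writer {
    struct demo_store *store;
    int fd;
    int rounds;
};

static void *demo_writer_run(void *arg)
{
    struct demo_writer *w = arg;

    for (int i = 0; i < w->rounds; i++) {
        pthread_mutex_lock(&w->store->lock);
        w->store->fd = w->fd;              /* publish our fd */
        sched_yield();                     /* invite the other writer to run */
        if (w->store->fd != w->fd)         /* would be the failed fd check */
            w->store->mismatches++;
        w->store->fd = 0;                  /* reset before leaving */
        pthread_mutex_unlock(&w->store->lock);
    }
    return NULL;
}

/* Runs two writers on one store; returns the mismatch count, or -1 on error. */
long demo_run_two_writers(int rounds)
{
    struct demo_store store = { .fd = 0, .mismatches = 0 };
    struct demo_writer a = { &store, 8, rounds };
    struct demo_writer b = { &store, 16, rounds };
    pthread_t ta, tb;

    assert(rounds > 0);
    if (pthread_mutex_init(&store.lock, NULL) != 0)
        return -1;
    if (pthread_create(&ta, NULL, demo_writer_run, &a) != 0) {
        pthread_mutex_destroy(&store.lock);
        return -1;
    }
    if (pthread_create(&tb, NULL, demo_writer_run, &b) != 0) {
        pthread_join(ta, NULL);
        pthread_mutex_destroy(&store.lock);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    pthread_mutex_destroy(&store.lock);
    return store.mismatches;
}
```

Because the set, use and reset of fd all happen while the lock is held, neither
writer can observe the other's value, so the mismatch count stays at zero.
Without the lock/unlock pair the same loop would be a data race.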
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.