[Gluster-devel] NetBSD regressions, memory corruption
Venky Shankar
yknev.shankar at gmail.com
Tue Mar 24 17:35:39 UTC 2015
On Mar 24, 2015 10:48 PM, "Emmanuel Dreyfus" <manu at netbsd.org> wrote:
>
> Hi
>
> The merge of http://review.gluster.org/9953/ removed a few crashes from
> NetBSD regression tests, but the thing remains utterly broken since the
> merge of http://review.gluster.org/9708/ though I cannot tell if I have
> bugs left over from this commit or if I face new problems.
>
> Here are the known problems so far:
>
> 1) These need to be merged:
> http://review.gluster.org/9831
> http://review.gluster.org/9944
>
> 2) I still experience memory corruption, which usually crashes glusterfsd
> because some pointer was replaced by the value 0x3. This strikes on iobref
> most of the time, but it can happen elsewhere.
>
> I would be glad if someone could help here. On nbslave70:/autobuild I
> added code to check for iobref/iobuf sanity at random places (by calling
> iobref_sanity()). I do this in synctask_wrap and in STACK_WIND/UNWIND,
> but I have not been able to spot the source of the problem yet.
I'll take a look at this tomorrow.
>
> The weird thing is that memory seems to always be overwritten by the
> same values, and the magic number 0xcafebabe before the buffer is preserved.
> Here is an example, where iobref->iobrefs = 0xbb11a458:
> 0xbb11a44c: 0xcafebabe 0x00000000 0x00000000 0x00000003
> 0xbb11a45c: 0x00000003 0x00000008 0x00000003 0x0000000c
> 0xbb11a46c: 0x00000003 0x0000000e 0x00000003 0x00000010
> 0xbb11a47c: 0x00000003 0x00000009 0x00000003 0x0000000d
> 0xbb11a48c: 0x00000003 0x00000015 0x00000003 0x00000016
> 0xbb11a49c: 0x00000003 0x00000032 0x00000034 0xbb1e2018
> 0xbb11a4ac: 0xcafebabe 0x00000000 0x00000000 0xbb11a5d8
>
>
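In the dump above, the slot at 0xbb11a458 (where iobref->iobrefs points) holds 0x3, and the following slots hold other small integers instead of addresses. A check along these lines might help catch that earlier. This is only a sketch under my own assumptions: ptr_is_suspect(), ptr_array_first_suspect() and the 4096 threshold are hypothetical names and values, not the iobref_sanity() code on nbslave70 and not anything in the GlusterFS tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: assume no valid heap pointer lies in the first page. */
#define SUSPECT_PTR_LIMIT ((uintptr_t)4096)

/* Return 1 when the value is non-NULL yet too small to be a plausible address. */
static int
ptr_is_suspect (const void *p)
{
        uintptr_t v = (uintptr_t)p;

        return (v != 0 && v < SUSPECT_PTR_LIMIT);
}

/* Return the index of the first suspect slot, or -1 when every slot looks sane. */
static int
ptr_array_first_suspect (void *const *slots, size_t count)
{
        size_t i;

        if (!slots)
                return -1;

        for (i = 0; i < count; i++) {
                if (ptr_is_suspect (slots[i]))
                        return (int)i;
        }

        return -1;
}
```

NULL slots are accepted because an unused entry in a pointer array is normally NULL; only small non-zero values such as 0x3 or 0x8 are reported.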
> Additionally, there are two workarounds I had to make for crashes
> that happen sometimes:
> 3) I had to make this change (not yet posted on gerrit) to avoid crashing
> because op = GD_OP_NONE. Things seem to go fine without the test.
> I cannot tell whether this is a cause or a symptom:
>
> diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c b/xlators/mgmt/glusterd/src/glusterd-utils.c
> index 02d2cfb..c06959c 100644
> --- a/xlators/mgmt/glusterd/src/glusterd-utils.c
> +++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
> @@ -8301,15 +8301,12 @@ out:
> int
> glusterd_volume_heal_use_rsp_dict (dict_t *aggr, dict_t *rsp_dict)
> {
> int ret = 0;
> dict_t *ctx_dict = NULL;
> - glusterd_op_t op = GD_OP_NONE;
> + glusterd_op_t op = GD_OP_HEAL_VOLUME;
>
> GF_ASSERT (rsp_dict);
>
> - op = glusterd_op_get_op ();
> - GF_ASSERT (GD_OP_HEAL_VOLUME == op);
> -
> if (aggr) {
> ctx_dict = aggr;
>
>
> 4) Here I crash because this->private = NULL, and here is a
> workaround:
>
> diff --git a/xlators/storage/posix/src/posix.c b/xlators/storage/posix/src/posix.c
> index ae08adc..3918e07 100644
> --- a/xlators/storage/posix/src/posix.c
> +++ b/xlators/storage/posix/src/posix.c
> @@ -913,6 +913,7 @@ posix_opendir (call_frame_t *frame, xlator_t *this,
>
> VALIDATE_OR_GOTO (frame, out);
> VALIDATE_OR_GOTO (this, out);
> + VALIDATE_OR_GOTO (this->private, out);
> VALIDATE_OR_GOTO (loc, out);
> VALIDATE_OR_GOTO (fd, out);
>
>
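For readers not familiar with the guard pattern used in workaround 4, here is a minimal sketch of how a validate-or-goto check turns a NULL private pointer into an early error return instead of a crash. CHECK_OR_GOTO, struct demo_xlator and demo_opendir() are simplified, hypothetical stand-ins written for this illustration; they are not the GlusterFS VALIDATE_OR_GOTO macro or the real posix_opendir().

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for a validate-or-goto guard (not the GlusterFS macro). */
#define CHECK_OR_GOTO(arg, label)               \
        do {                                    \
                if (!(arg)) {                   \
                        errno = EINVAL;         \
                        goto label;             \
                }                               \
        } while (0)

/* Hypothetical translator object with a private data pointer. */
struct demo_xlator {
        void *private_data;
};

/* Return 0 when all arguments are usable, -1 (errno = EINVAL) otherwise. */
static int
demo_opendir (struct demo_xlator *xl)
{
        int ret = -1;

        CHECK_OR_GOTO (xl, out);
        CHECK_OR_GOTO (xl->private_data, out);

        /* Real work would dereference xl->private_data here. */
        ret = 0;
out:
        return ret;
}
```

Such a guard only avoids the crash; it does not explain why the private pointer was NULL in the first place, so it may be hiding the same corruption as in 2).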
>
> 4)
>
>
> --
> Emmanuel Dreyfus
> manu at netbsd.org
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel