<div dir="ltr"><div><div><div>I have not used valgrind before, so I may be wrong here. <br><br></div>I think the VALGRIND_STACK_DEREGISTER call should have come after the GF_FREE of the stack.<br></div>That may explain the instances of &quot;Invalid write&quot; during stack_destroy calls in after.log.<br><br></div>There seem to be numerous issues reported in before.log (I am assuming you did not have the VALGRIND_STACK_REGISTER call in that run).<br>From <a href="http://valgrind.org/docs/manual/manual-core.html" target="_blank">http://valgrind.org/docs/manual/manual-core.html</a>, it looks like valgrind detects the client switching stacks only if the stack pointer register changes by more than 2MB. Is it possible that, since marker is only using a 16k stack, the new stack pointer was less than 2MB away from the current stack pointer? <br>It seems unlikely to me, since we allocate the stack from the heap.<br><div><div><div><div><br></div><div>Did you try a run with the valgrind instrumentation but without changing the stack size? <br></div><div>None of this explains the crash, though. We had seen a memory overrun crash in the same code path on NetBSD earlier but did not follow up then. <br></div><div>Will look further into it.<br></div><div><br><br></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 8, 2017 at 4:51 PM, Kinglong Mee <span dir="ltr">&lt;<a href="mailto:kinglongmee@gmail.com" target="_blank">kinglongmee@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Maybe it&#39;s my fault: I found that valgrind can&#39;t follow context switches (makecontext/swapcontext) by default.<br>
So, I tested with the following patch (it tells valgrind about the new stack via VALGRIND_STACK_REGISTER/VALGRIND_STACK_DEREGISTER).<br>
With it, only some &quot;Invalid read/write&quot; reports from __gf_mem_invalidate remain. Is that right?<br>
So, there is only one problem: without io-threads, the stack size is too small for marker.<br>
Am I right?<br>
<br>
Ps:<br>
valgrind-before.log is the log without the following patch; valgrind-after.log is the log with the patch.<br>
<br>
==35656== Invalid write of size 8<br>
==35656==    at 0x4E8FFD4: __gf_mem_invalidate (mem-pool.c:278)<br>
==35656==    by 0x4E90313: __gf_free (mem-pool.c:334)<br>
==35656==    by 0x4EA4E5B: synctask_destroy (syncop.c:394)<br>
==35656==    by 0x4EA4EDF: synctask_done (syncop.c:412)<br>
==35656==    by 0x4EA58B3: synctask_switchto (syncop.c:673)<br>
==35656==    by 0x4EA596B: syncenv_processor (syncop.c:704)<br>
==35656==    by 0x60B2DC4: start_thread (in /usr/lib64/libpthread-2.17.so)<br>
==35656==    by 0x67A873C: clone (in /usr/lib64/libc-2.17.so)<br>
==35656==  Address 0x1b104931 is 2,068,017 bytes inside a block of size 2,097,224 alloc&#39;d<br>
==35656==    at 0x4C29975: calloc (vg_replace_malloc.c:711)<br>
==35656==    by 0x4E8FA5E: __gf_calloc (mem-pool.c:117)<br>
==35656==    by 0x4EA52F5: synctask_create (syncop.c:500)<br>
==35656==    by 0x4EA55AE: synctask_new1 (syncop.c:576)<br>
==35656==    by 0x143AE0D7: mq_synctask1 (marker-quota.c:1078)<br>
==35656==    by 0x143AE199: mq_synctask (marker-quota.c:1097)<br>
==35656==    by 0x143AE6F6: _mq_create_xattrs_txn (marker-quota.c:1236)<br>
==35656==    by 0x143AE82D: mq_create_xattrs_txn (marker-quota.c:1253)<br>
==35656==    by 0x143B0DCB: mq_inspect_directory_xattr (marker-quota.c:2027)<br>
==35656==    by 0x143B13A8: mq_xattr_state (marker-quota.c:2117)<br>
==35656==    by 0x143A6E80: marker_lookup_cbk (marker.c:2961)<br>
==35656==    by 0x141811E0: up_lookup_cbk (upcall.c:753)<br>
<br>
----------------------- valgrind ------------------------------<wbr>------------<br>
<br>
Don&#39;t forget to install valgrind-devel.<br>
<br>
diff --git a/libglusterfs/src/syncop.c b/libglusterfs/src/syncop.c<br>
index 00a9b57..97b1de1 100644<br>
--- a/libglusterfs/src/syncop.c<br>
+++ b/libglusterfs/src/syncop.c<br>
@@ -10,6 +10,7 @@<br>
<br>
 #include &quot;syncop.h&quot;<br>
 #include &quot;libglusterfs-messages.h&quot;<br>
+#include &lt;valgrind/valgrind.h&gt;<br>
<br>
 int<br>
 syncopctx_setfsuid (void *uid)<br>
@@ -388,6 +389,8 @@ synctask_destroy (struct synctask *task)<br>
         if (!task)<br>
                 return;<br>
<br>
+VALGRIND_STACK_DEREGISTER(<wbr>task-&gt;valgrind_ret);<br>
+<br>
         GF_FREE (task-&gt;stack);<br>
<br>
         if (task-&gt;opframe)<br>
@@ -509,6 +512,8 @@ synctask_create (struct syncenv *env, size_t stacksize, sync<br>
<br>
         newtask-&gt;ctx.uc_stack.ss_sp   = newtask-&gt;stack;<br>
<br>
+       newtask-&gt;valgrind_ret = VALGRIND_STACK_REGISTER(<wbr>newtask-&gt;stack, newtask-<br>
+<br>
         makecontext (&amp;newtask-&gt;ctx, (void (*)(void)) synctask_wrap, 2, newtask)<br>
<br>
         newtask-&gt;state = SYNCTASK_INIT;<br>
diff --git a/libglusterfs/src/syncop.h b/libglusterfs/src/syncop.h<br>
index c2387e6..247325b 100644<br>
--- a/libglusterfs/src/syncop.h<br>
+++ b/libglusterfs/src/syncop.h<br>
@@ -63,6 +63,7 @@ struct synctask {<br>
         int                 woken;<br>
         int                 slept;<br>
         int                 ret;<br>
+       int                 valgrind_ret;<br>
<br>
         uid_t               uid;<br>
         gid_t               gid;<br>
<span class="">diff --git a/xlators/features/marker/src/<wbr>marker-quota.c b/xlators/features/marke<br>
index 902b8e5..f3d2507 100644<br>
--- a/xlators/features/marker/src/<wbr>marker-quota.c<br>
+++ b/xlators/features/marker/src/<wbr>marker-quota.c<br>
@@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, synctask_fn_t task, gf_boole<br>
         }<br>
<br>
         if (spawn) {<br>
-                ret = synctask_new1 (this-&gt;ctx-&gt;env, 1024 * 16, task,<br>
+                ret = synctask_new1 (this-&gt;ctx-&gt;env, 0, task,<br>
                                       mq_synctask_cleanup, NULL, args);<br>
                 if (ret) {<br>
                         gf_log (this-&gt;name, GF_LOG_ERROR, &quot;Failed to spawn &quot;<br>
<br>
<br>
</span><span class="">On 6/8/2017 19:02, Sanoj Unnikrishnan wrote:<br>
&gt; I would still be worried about the Invalid read/write. IMO, whether an illegal access causes a crash depends on whether the page is currently mapped.<br>
&gt; So it could be that there is a use-after-free or out-of-bounds access in the code, and that the location happens to fall in a different (unmapped) page when io-threads is not loaded.<br>
&gt;<br>
&gt; Could you please share the valgrind logs as well.<br>
&gt;<br>
</span><span class="">&gt; On Wed, Jun 7, 2017 at 8:22 PM, Kinglong Mee &lt;<a href="mailto:kinglongmee@gmail.com">kinglongmee@gmail.com</a>&gt; wrote:<br>
&gt;<br>
&gt;     After deleting io-threads from the vols, quota operations (list/set/modify) make glusterfsd crash.<br>
&gt;     I am running CentOS 7 (CentOS Linux release 7.3.1611) with glusterfs 3.8.12.<br>
&gt;     It seems the stack gets corrupted; when testing with the following diff, glusterfsd runs correctly.<br>
&gt;<br>
&gt;     I have two questions:<br>
&gt;     1. With io-threads, valgrind shows many &quot;Invalid read/write&quot; errors.<br>
&gt;        Why does glusterfsd run correctly with io-threads but crash without it?<br>
&gt;<br>
&gt;     2. With the following diff and without io-threads, valgrind also shows many &quot;Invalid read/write&quot; errors,<br>
&gt;        but there is no crash.<br>
&gt;<br>
&gt;     Any comments are welcome.<br>
&gt;<br>
</span>&gt;     Reverting <a href="http://review.gluster.org/11499" rel="noreferrer" target="_blank">http://review.gluster.org/11499</a> seems better than the diff.<br>
<div><div class="h5">&gt;<br>
&gt;     diff --git a/xlators/features/marker/src/<wbr>marker-quota.c b/xlators/features/marke<br>
&gt;     index 902b8e5..f3d2507 100644<br>
&gt;     --- a/xlators/features/marker/src/<wbr>marker-quota.c<br>
&gt;     +++ b/xlators/features/marker/src/<wbr>marker-quota.c<br>
&gt;     @@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, synctask_fn_t task, gf_boole<br>
&gt;              }<br>
&gt;<br>
&gt;              if (spawn) {<br>
&gt;     -                ret = synctask_new1 (this-&gt;ctx-&gt;env, 1024 * 16, task,<br>
&gt;     +                ret = synctask_new1 (this-&gt;ctx-&gt;env, 0, task,<br>
&gt;                                            mq_synctask_cleanup, NULL, args);<br>
&gt;                      if (ret) {<br>
&gt;                              gf_log (this-&gt;name, GF_LOG_ERROR, &quot;Failed to spawn &quot;<br>
&gt;<br>
&gt;     ------------------------------<wbr>-----test steps ------------------------------<wbr>----<br>
&gt;     1. gluster volume create gvtest node1:/test/ node2:/test/<br>
&gt;     2. gluster volume start gvtest<br>
&gt;     3. gluster volume quota enable gvtest<br>
&gt;<br>
&gt;     4. &quot;deletes io-threads from all vols&quot;<br>
&gt;     5. reboot node1 and node2.<br>
&gt;     6. sh quota-set.sh<br>
&gt;<br>
&gt;     # cat quota-set.sh<br>
&gt;     gluster volume quota gvtest list<br>
&gt;     gluster volume quota gvtest limit-usage / 10GB<br>
&gt;     gluster volume quota gvtest limit-usage /1234 1GB<br>
&gt;     gluster volume quota gvtest limit-usage /hello 1GB<br>
&gt;     gluster volume quota gvtest limit-usage /test 1GB<br>
&gt;     gluster volume quota gvtest limit-usage /xyz 1GB<br>
&gt;     gluster volume quota gvtest list<br>
&gt;     gluster volume quota gvtest remove /hello<br>
&gt;     gluster volume quota gvtest remove /test<br>
&gt;     gluster volume quota gvtest list<br>
&gt;     gluster volume quota gvtest limit-usage /test 1GB<br>
&gt;     gluster volume quota gvtest remove /xyz<br>
&gt;     gluster volume quota gvtest list<br>
&gt;<br>
&gt;     -----------------------<wbr>glusterfsd crash without the diff--------------------------<wbr>------<br>
&gt;<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(_gf_msg_backtrace_nomem+<wbr>0xf5)[0x7f6e1e950af1]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(gf_print_trace+0x21f)[<wbr>0x7f6e1e956943]<br>
&gt;     /usr/local/sbin/glusterfsd(<wbr>glusterfsd_print_trace+0x1f)[<wbr>0x409c83]<br>
&gt;     /lib64/libc.so.6(+0x35250)[<wbr>0x7f6e1d025250]<br>
&gt;     /lib64/libc.so.6(gsignal+0x37)<wbr>[0x7f6e1d0251d7]<br>
&gt;     /lib64/libc.so.6(abort+0x148)[<wbr>0x7f6e1d0268c8]<br>
&gt;     /lib64/libc.so.6(+0x74f07)[<wbr>0x7f6e1d064f07]<br>
&gt;     /lib64/libc.so.6(+0x7baf5)[<wbr>0x7f6e1d06baf5]<br>
&gt;     /lib64/libc.so.6(+0x7c3e6)[<wbr>0x7f6e1d06c3e6]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(__gf_free+0x311)[<wbr>0x7f6e1e981327]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(synctask_destroy+0x82)[<wbr>0x7f6e1e995c20]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(synctask_done+0x25)[<wbr>0x7f6e1e995c47]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(synctask_switchto+0xcf)[<wbr>0x7f6e1e996585]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(syncenv_processor+0x60)[<wbr>0x7f6e1e99663d]<br>
&gt;     /lib64/libpthread.so.0(+<wbr>0x7dc5)[0x7f6e1d7a2dc5]<br>
&gt;     /lib64/libc.so.6(clone+0x6d)[<wbr>0x7f6e1d0e773d]<br>
&gt;<br>
&gt;     or<br>
&gt;<br>
&gt;     package-string: glusterfs 3.8.12<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(_gf_msg_backtrace_nomem+<wbr>0xf5)[0x7fa15e623af1]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(gf_print_trace+0x21f)[<wbr>0x7fa15e629943]<br>
&gt;     /usr/local/sbin/glusterfsd(<wbr>glusterfsd_print_trace+0x1f)[<wbr>0x409c83]<br>
&gt;     /lib64/libc.so.6(+0x35250)[<wbr>0x7fa15ccf8250]<br>
&gt;     /lib64/libc.so.6(gsignal+0x37)<wbr>[0x7fa15ccf81d7]<br>
&gt;     /lib64/libc.so.6(abort+0x148)[<wbr>0x7fa15ccf98c8]<br>
&gt;     /lib64/libc.so.6(+0x74f07)[<wbr>0x7fa15cd37f07]<br>
&gt;     /lib64/libc.so.6(+0x7dd4d)[<wbr>0x7fa15cd40d4d]<br>
&gt;     /lib64/libc.so.6(__libc_<wbr>calloc+0xb4)[0x7fa15cd43a14]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(__gf_calloc+0xa7)[<wbr>0x7fa15e653a5f]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(iobref_new+0x2b)[<wbr>0x7fa15e65875a]<br>
&gt;     /usr/local/lib/glusterfs/3.8.<wbr>12/rpc-transport/socket.so(+<wbr>0xa98c)[0x7fa153a8398c]<br>
&gt;     /usr/local/lib/glusterfs/3.8.<wbr>12/rpc-transport/socket.so(+<wbr>0xacbc)[0x7fa153a83cbc]<br>
&gt;     /usr/local/lib/glusterfs/3.8.<wbr>12/rpc-transport/socket.so(+<wbr>0xad10)[0x7fa153a83d10]<br>
&gt;     /usr/local/lib/glusterfs/3.8.<wbr>12/rpc-transport/socket.so(+<wbr>0xb2a7)[0x7fa153a842a7]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(+0x97ea9)[0x7fa15e68eea9]<br>
&gt;     /usr/local/lib/libglusterfs.<wbr>so.0(+0x982c6)[0x7fa15e68f2c6]<br>
&gt;     /lib64/libpthread.so.0(+<wbr>0x7dc5)[0x7fa15d475dc5]<br>
&gt;     /lib64/libc.so.6(clone+0x6d)[<wbr>0x7fa15cdba73d]<br>
&gt;<br>
</div></div>
&gt;<br>
&gt;<br>
</blockquote></div><br></div>
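<div>To make the instrumentation discussed above concrete, here is a minimal standalone sketch of running a function on a heap-allocated stack with makecontext/swapcontext and telling valgrind about that stack. It is not glusterfs code: demo_task, demo_entry and run_task_on_heap_stack are hypothetical names, and the no-op fallback macros are an assumption so the sketch still builds when valgrind-devel is not installed. The free-then-deregister order simply follows the suggestion at the top of this mail and has not been confirmed to be what valgrind needs.</div>

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

/* Use the real client requests when valgrind-devel is installed;
 * otherwise fall back to no-ops (assumption made for this sketch). */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define DEMO_HAVE_VALGRIND_H 1
#  endif
#endif
#ifndef DEMO_HAVE_VALGRIND_H
#  define VALGRIND_STACK_REGISTER(start, end) ((void)(start), (void)(end), 0u)
#  define VALGRIND_STACK_DEREGISTER(id) ((void)(id))
#endif

/* Hypothetical stand-in for struct synctask, reduced to what the demo needs. */
struct demo_task {
    ucontext_t ctx;         /* context that runs on the heap stack */
    ucontext_t caller;      /* context to resume when the task returns */
    char      *stack;       /* heap-allocated stack */
    unsigned   valgrind_id; /* stack id returned by VALGRIND_STACK_REGISTER */
    int        result;
};

/* Single global for simplicity; this demo is not thread-safe. */
static struct demo_task *current_task;

/* Runs on the heap stack; returning resumes uc_link (the caller). */
static void demo_entry(void)
{
    current_task->result = 42;
}

/* Returns the value set by demo_entry, or -1 on failure. */
int run_task_on_heap_stack(size_t stack_size)
{
    struct demo_task task;

    assert(stack_size >= 16 * 1024);
    memset(&task, 0, sizeof task);

    task.stack = calloc(1, stack_size);
    if (task.stack == NULL)
        return -1;

    if (getcontext(&task.ctx) != 0) {
        free(task.stack);
        return -1;
    }
    task.ctx.uc_stack.ss_sp = task.stack;
    task.ctx.uc_stack.ss_size = stack_size;
    task.ctx.uc_link = &task.caller;

    /* Tell valgrind where the new stack lives before switching to it. */
    task.valgrind_id = VALGRIND_STACK_REGISTER(task.stack,
                                               task.stack + stack_size);

    makecontext(&task.ctx, demo_entry, 0);
    current_task = &task;

    if (swapcontext(&task.caller, &task.ctx) != 0)
        task.result = -1;

    /* Order suggested in this thread: free the stack first, then deregister. */
    free(task.stack);
    VALGRIND_STACK_DEREGISTER(task.valgrind_id);

    return task.result;
}
```

<div>Calling run_task_on_heap_stack (64 * 1024) under valgrind, with and without the register/deregister pair, would be one way to compare the reports on a small case before going back to the full glusterfsd logs.</div>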