<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:宋体;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:DengXian;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"\@宋体";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:"\@等线";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:宋体;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:12.0pt;
        font-family:宋体;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:DengXian;
        color:windowtext;}
span.EmailStyle19
        {mso-style-type:personal-compose;
        font-family:DengXian;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:DengXian;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="ZH-CN" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">Hi,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">The reason why I move event_handled to the end of socket_event_poll_in is because if event_handled is called before rpc_transport_pollin_destroy, it allowed another round
of event_dispatch_epoll_handler happen before rpc_transport_pollin_destroy, in this way, when the latter poll in goes to rpc_transport_pollin_destroy , there is a chance that the pollin->iobref has already been destroyed by the first one(there is no lock
destroy for iobref->lock in iobref_destroy by the way). That may cause stuck in “LOCK (&iobref->lock);”<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">I find the one of recent patch<o:p></o:p></span></p>
<p class="MsoNormal" style="text-autospace:none"><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New"">SHA-1: f747d55a7fd364e2b9a74fe40360ab3cb7b11537<o:p></o:p></span></p>
<p class="MsoNormal" style="text-autospace:none"><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New""><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New"">* socket: fix issue on concurrent handle of a socket<o:p></o:p></span></b></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">I think the point is to avoid the concurrent handling of the same socket at the same time, but after my test with this patch this problem also exists, so I think event_handled
is still called too early to allow concurrent handling of the same socket happen, and after move it to the end of socket_event_poll this glusterd stuck issue disappeared.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">cynthia<o:p></o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Raghavendra Gowdappa <rgowdapp@redhat.com>
<br>
<b>Sent:</b> Monday, April 15, 2019 2:36 PM<br>
<b>To:</b> Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou@nokia-sbell.com><br>
<b>Cc:</b> gluster-devel@gluster.org<br>
<b>Subject:</b> Re: glusterd stuck for glusterfs with version 3.12.15<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">On Mon, Apr 15, 2019 at 11:08 AM Zhou, Cynthia (NSB - CN/Hangzhou) <<a href="mailto:cynthia.zhou@nokia-sbell.com">cynthia.zhou@nokia-sbell.com</a>> wrote:<o:p></o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">Ok, thanks for your comment!</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">cynthia</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Raghavendra
Gowdappa <<a href="mailto:rgowdapp@redhat.com" target="_blank">rgowdapp@redhat.com</a>>
<br>
<b>Sent:</b> Monday, April 15, 2019 11:52 AM<br>
<b>To:</b> Zhou, Cynthia (NSB - CN/Hangzhou) <<a href="mailto:cynthia.zhou@nokia-sbell.com" target="_blank">cynthia.zhou@nokia-sbell.com</a>><br>
<b>Cc:</b> <a href="mailto:gluster-devel@gluster.org" target="_blank">gluster-devel@gluster.org</a><br>
<b>Subject:</b> Re: glusterd stuck for glusterfs with version 3.12.15</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Cynthia,<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">On Mon, Apr 15, 2019 at 8:10 AM Zhou, Cynthia (NSB - CN/Hangzhou) <<a href="mailto:cynthia.zhou@nokia-sbell.com" target="_blank">cynthia.zhou@nokia-sbell.com</a>>
wrote:<o:p></o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid windowtext 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">Hi,</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">I made a patch and according to my test, this glusterd stuck issue disappear with my patch. Only need to move event_handled
to the end of socket_event_poll_in function.</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">--- a/rpc/rpc-transport/socket/src/socket.c
</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">+++ b/rpc/rpc-transport/socket/src/socket.c</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">@@ -2305,9 +2305,9 @@ socket_event_poll_in (rpc_transport_t *this, gf_boolean_t notify_handled)</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> }</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">- if (notify_handled && (ret != -1))</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">- event_handled (ctx->event_pool, priv->sock, priv->idx,</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">- priv->gen);</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">@@ -2330,6 +2327,9 @@ socket_event_poll_in (rpc_transport_t *this, gf_boolean_t notify_handled)</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> }</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> pthread_mutex_unlock (&priv->notify.lock);</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> }</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">+ if (notify_handled && (ret != -1))</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">+ event_handled (ctx->event_pool, priv->sock, priv->idx,</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">+ priv->gen);</span><span lang="EN-US"><o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Thanks for this tip. Though this helps in fixing the hang, this change has performance impact. Moving event_handled to end of poll_in means, socket will be added
back for polling of new events only _after_ the rpc is msg is processed by higher layers (like EC) and higher layers can have significant latency for processing the msg. Which means, socket will be out of polling for longer periods of time which decreases
the throughput (number of msgs read per second) affecting performance. However, this experiment definitely indicates there is a codepath where event_handled is not called (and hence causing the hang). I'll go through this codepath again.<o:p></o:p></span></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">Can you check whether patch [1] fixes the issue you are seeing?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">[1] <a href="https://review.gluster.org/#/c/glusterfs/+/22566">
https://review.gluster.org/#/c/glusterfs/+/22566</a><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Thanks for that experiment :).<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid windowtext 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> return ret;</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">}</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">cynthia</span><span lang="EN-US"><o:p></o:p></span></p>
<div>
<div style="border:none;border-top:solid windowtext 1.0pt;padding:3.0pt 0cm 0cm 0cm;border-color:currentcolor">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Zhou,
Cynthia (NSB - CN/Hangzhou) <br>
<b>Sent:</b> Tuesday, April 09, 2019 3:57 PM<br>
<b>To:</b> 'Raghavendra Gowdappa' <<a href="mailto:rgowdapp@redhat.com" target="_blank">rgowdapp@redhat.com</a>><br>
<b>Cc:</b> <a href="mailto:gluster-devel@gluster.org" target="_blank">gluster-devel@gluster.org</a><br>
<b>Subject:</b> RE: glusterd stuck for glusterfs with version 3.12.15</span><span lang="EN-US"><o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian">Can you figure out some possible reason why iobref is corrupted, is it possible that thread 8 has called poll in
and iobref has been relased, but the lock within it is not properly released (as I can not find any free lock operation in iobref_destroy), then thread 9 called
</span><span lang="EN-US">rpc_transport_pollin_destroy again, and so stuck on this lock<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Also, there should not be two thread handling the same socket at the same time, although there has been a patch claimed to tackle this issue.<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">cynthia<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:10.5pt;font-family:DengXian"> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Raghavendra
Gowdappa <<a href="mailto:rgowdapp@redhat.com" target="_blank">rgowdapp@redhat.com</a>>
<br>
<b>Sent:</b> Tuesday, April 09, 2019 3:52 PM<br>
<b>To:</b> Zhou, Cynthia (NSB - CN/Hangzhou) <<a href="mailto:cynthia.zhou@nokia-sbell.com" target="_blank">cynthia.zhou@nokia-sbell.com</a>><br>
<b>Cc:</b> <a href="mailto:gluster-devel@gluster.org" target="_blank">gluster-devel@gluster.org</a><br>
<b>Subject:</b> Re: glusterd stuck for glusterfs with version 3.12.15</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">On Mon, Apr 8, 2019 at 7:42 AM Zhou, Cynthia (NSB - CN/Hangzhou) <<a href="mailto:cynthia.zhou@nokia-sbell.com" target="_blank">cynthia.zhou@nokia-sbell.com</a>>
wrote:<o:p></o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid windowtext 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Hi glusterfs experts,<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Good day!<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">In my test env, sometimes glusterd stuck issue happened, and it is not responding to any gluster commands, when I checked this issue I find that glusterd thread
9 and thread 8 is dealing with the same socket, I thought following patch should be able to solve this issue, however after I merged this patch this issue still exist. When I looked into this code, it seems socket_event_poll_in called event_handled before
rpc_transport_pollin_destroy, I think this gives the chance for another poll for the exactly the same socket. And caused this glusterd stuck issue, also, I find there is no LOCK_DESTROY(&iobref->lock)<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">In iobref_destroy, I think it is better to add destroy lock.<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Following is the gdb info when this issue happened, I would like to know your opinion on this issue, thanks!<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New"">SHA-1: f747d55a7fd364e2b9a74fe40360ab3cb7b11537</span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New""> </span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New"">* socket: fix issue on concurrent handle of a socket</span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New""> </span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New""> </span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New""> </span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:8.0pt;font-family:"Courier New"">GDB INFO:</span></b><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Thread 8 is blocked on pthread_cond_wait, and thread 9 is blocked in iobref_unref, I think
<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Thread 9 (Thread 0x7f9edf7fe700 (LWP 1933)):<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#0 0x00007f9ee9fd785c in __lll_lock_wait () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#1 0x00007f9ee9fda657 in __lll_lock_elision () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#2 0x00007f9eeb24cae6 in iobref_unref (iobref=0x7f9ed00063b0) at iobuf.c:944<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#3 0x00007f9eeafd2f29 in rpc_transport_pollin_destroy (pollin=0x7f9ed00452d0) at rpc-transport.c:123<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#4 0x00007f9ee4fbf319 in socket_event_poll_in (this=0x7f9ed0049cc0, notify_handled=_gf_true) at socket.c:2322<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#5 0x00007f9ee4fbf932 in socket_event_handler (<b><span style="color:red">fd=36</span>,<span style="color:red"> idx=27, gen=4, data=0x7f9ed0049cc0, poll_in=1,
poll_out=0, poll_err=0</span></b>) at socket.c:2471<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#6 0x00007f9eeb2825d4 in event_dispatch_epoll_handler (event_pool=0x17feb00, event=0x7f9edf7fde84) at event-epoll.c:583<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#7 0x00007f9eeb2828ab in event_dispatch_epoll_worker (data=0x180d0c0) at event-epoll.c:659<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#8 0x00007f9ee9fce5da in start_thread () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#9 0x00007f9ee98a4eaf in clone () from /lib64/libc.so.6<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Thread 8 (Thread 0x7f9edffff700 (LWP 1932)):<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#0 0x00007f9ee9fd785c in __lll_lock_wait () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#1 0x00007f9ee9fd2b42 in __pthread_mutex_cond_lock () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#2 0x00007f9ee9fd44a8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#3 0x00007f9ee4fbadab in socket_event_poll_err (this=0x7f9ed0049cc0, gen=4, idx=27) at socket.c:1201<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#4 0x00007f9ee4fbf99c in socket_event_handler (<b><span style="color:red">fd=36, idx=27, gen=4, data=0x7f9ed0049cc0, poll_in=1, poll_out=0, poll_err=0</span></b>)
at socket.c:2480<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#5 0x00007f9eeb2825d4 in event_dispatch_epoll_handler (event_pool=0x17feb00, event=0x7f9edfffee84) at event-epoll.c:583<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#6 0x00007f9eeb2828ab in event_dispatch_epoll_worker (data=0x180cf20) at event-epoll.c:659<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#7 0x00007f9ee9fce5da in start_thread () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#8 0x00007f9ee98a4eaf in clone () from /lib64/libc.so.6<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">(gdb) thread 9<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">[Switching to thread 9 (Thread 0x7f9edf7fe700 (LWP 1933))]<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#0 0x00007f9ee9fd785c in __lll_lock_wait () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">(gdb) bt<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#0 0x00007f9ee9fd785c in __lll_lock_wait () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#1 0x00007f9ee9fda657 in __lll_lock_elision () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#2 0x00007f9eeb24cae6 in iobref_unref (iobref=0x7f9ed00063b0) at iobuf.c:944<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#3 0x00007f9eeafd2f29 in rpc_transport_pollin_destroy (pollin=0x7f9ed00452d0) at rpc-transport.c:123<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#4 0x00007f9ee4fbf319 in socket_event_poll_in (this=0x7f9ed0049cc0, notify_handled=_gf_true) at socket.c:2322<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#5 0x00007f9ee4fbf932 in socket_event_handler (fd=36, idx=27, gen=4, data=0x7f9ed0049cc0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2471<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#6 0x00007f9eeb2825d4 in event_dispatch_epoll_handler (event_pool=0x17feb00, event=0x7f9edf7fde84) at event-epoll.c:583<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#7 0x00007f9eeb2828ab in event_dispatch_epoll_worker (data=0x180d0c0) at event-epoll.c:659<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#8 0x00007f9ee9fce5da in start_thread () from /lib64/libpthread.so.0<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#9 0x00007f9ee98a4eaf in clone () from /lib64/libc.so.6<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">(gdb) frame 2<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">#2 0x00007f9eeb24cae6 in iobref_unref (iobref=0x7f9ed00063b0) at iobuf.c:944<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">944 iobuf.c: No such file or directory.<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">(gdb) print *iobref<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">$1 = {lock = {spinlock = 2, mutex = {__data = {__lock = 2, __count = 222, __owner = -2120437760, __nusers = 1, __kind = 8960, __spins = 512,
<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> __elision = 0, __list = {__prev = 0x4000, __next = 0x7f9ed00063b000}},
<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> __size = "\002\000\000\000\336\000\000\000\000\260\234\201\001\000\000\000\000#\000\000\000\002\000\000\000@\000\000\000\000\000\000\000\260c\000</span>О<span lang="EN-US">\177",
__align = 953482739714}}, ref = -256, iobrefs = 0xffffffffffffffff, alloced = -1, used = -1}<o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">looks like the iobref is corrupted here. It seems to be a use-after-free issue. We need to dig into why a freed iobref is being accessed here.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<blockquote style="border:none;border-left:solid windowtext 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">(gdb) quit<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">A debugging session is active.<o:p></o:p></span></p>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</body>
</html>