<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>RE: [Gluster-users] BUG: After stop and start wrong port is advertised</title>
<style type="text/css">
body
{
font-family: Arial, Verdana, Sans-Serif ! important;
font-size: 12px;
padding: 5px 5px 5px 5px;
margin: 0px;
border-style: none;
background-color: #ffffff;
}
p, ul, li
{
margin-top: 0px;
margin-bottom: 0px;
}
</style>
</head>
<body>
<p>Hello,</p><p> </p><p> </p><p>Will we also suffer from this regression in any of the (previously) fixed 3.10 releases? We kept 3.10 and hope to stay stable :/</p><p><br /><br />Regards</p><p>Jo</p><p> </p><p> </p><blockquote style="border-left: 2px solid #325FBA; padding-left: 5px;margin-left:5px;">-----Original message-----<br /><strong>From:</strong>        Atin Mukherjee &lt;amukherj@redhat.com&gt;<br /><strong>Sent:</strong>        Tue 23-01-2018 05:15<br /><strong>Subject:</strong>        Re: [Gluster-users] BUG: After stop and start wrong port is advertised<br /><strong>To:</strong>        Alan Orth &lt;alan.orth@gmail.com&gt;; <br /><strong>CC:</strong>        Jo Goossens &lt;jo.goossens@hosted-power.com&gt;; gluster-users@gluster.org; <br /><style type="text/css">body { font-family: monospace; }</style> <div dir="ltr"><div>So from the logs, this looks to be a regression caused by commit 635c1c3, and the good news is that this is now fixed in the release-3.12 branch and should be part of 3.12.5.<br /> </div>Commit which fixes this issue:<br /><br /><pre class="gmail-bz_wrap_comment_text" id="gmail-comment_text_6">
COMMIT: <a href="https://review.gluster.org/19146" target="_blank" title="This external link will open in a new window">https://review.gluster.org/19146</a> committed in release-3.12 by "Atin Mukherjee" &lt;<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" target="_blank" title="This external link will open in a new window">amukherj@redhat.com</a>&gt; with a commit message- glusterd: connect to an existing brick process when qourum status is NOT_APPLICABLE_QUORUM

First of all, this patch reverts commit 635c1c3 as the same is causing a regression with bricks not coming up on time when a node is rebooted. This patch tries to fix the problem in a different way by just trying to connect to an existing running brick when quorum status is not applicable.

&gt;mainline patch : <a href="https://review.gluster.org/#/c/19134/" target="_blank" title="This external link will open in a new window">https://review.gluster.org/#/c/19134/</a>

Change-Id: I0efb5901832824b1c15dcac529bffac85173e097
BUG: 1511301
Signed-off-by: Atin Mukherjee &lt;<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" target="_blank" title="This external link will open in a new window">amukherj@redhat.com</a>&gt;<br /><br /><br /></pre><pre class="gmail-bz_wrap_comment_text" id="gmail-comment_text_6"><br /></pre></div><div><br /><div>On Mon, Jan 22, 2018 at 3:15 PM, Alan Orth <span dir="ltr">&lt;<a href="mailto:alan.orth@gmail.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=alan.orth@gmail.com'); return false;" target="_blank" title="This external link will open in a new window">alan.orth@gmail.com</a>&gt;</span> wrote:<br /><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div 
dir="ltr"><div>Ouch! Yes, I see two port-related fixes in the GlusterFS 3.12.3 release notes[0][1][2]. I've attached a tarball of all yesterday's logs from /var/log/glusterd on one of the affected nodes (called "wingu3"). I hope that's what you need.<br /><br />[0] <a href="https://github.com/gluster/glusterfs/blob/release-3.12/doc/release-notes/3.12.3.md" target="_blank" title="This external link will open in a new window">https://github.com/gluster/glusterfs/blob/release-3.12/doc/release-notes/3.12.3.md</a><br />[1] <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1507747" target="_blank" title="This external link will open in a new window">https://bugzilla.redhat.com/show_bug.cgi?id=1507747</a><br />[2] <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1507748" target="_blank" title="This external link will open in a new window">https://bugzilla.redhat.com/show_bug.cgi?id=1507748</a><br /> </div>Thanks,<div><div><br /><div><br /><div><div dir="ltr">On Mon, Jan 22, 2018 at 6:34 AM Atin Mukherjee &lt;<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" target="_blank" title="This external link will open in a new window">amukherj@redhat.com</a>&gt; wrote:</div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">The patch was definitely there in 3.12.3. 
Do you have the glusterd and brick logs handy with you when this happened?</div><div><br /><div>On Sun, Jan 21, 2018 at 10:21 PM, Alan Orth <span dir="ltr"><<a href="mailto:alan.orth@gmail.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=alan.orth@gmail.com'); return false;" target="_blank" title="This external link will open in a new window">alan.orth@gmail.com</a>></span> wrote:<br /><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>For what it's worth, I just updated some CentOS 7 servers from GlusterFS 3.12.1 to 3.12.4 and hit this bug. Did the patch make it into 3.12.4? I had to use Mike Hulsman's script to check the daemon port against the port in the volume's brick info, update the port, and restart glusterd on each node. Luckily I only have four servers! Hoping I don't have to do this every time I reboot!<br /> </div>Regards,</div><div><div><br /><div><div dir="ltr">On Sat, Dec 2, 2017 at 5:23 PM Atin Mukherjee <<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" target="_blank" title="This external link will open in a new window">amukherj@redhat.com</a>> wrote:</div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div><div dir="auto">On Sat, 2 Dec 2017 at 19:29, Jo Goossens <<a href="mailto:jo.goossens@hosted-power.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=jo.goossens@hosted-power.com'); return false;" target="_blank" title="This external link will open in a new window">jo.goossens@hosted-power.com</a>> wrote:</div></div></div></div><div><div><div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div><p>Hello Atin,</p><p> </p><p> </p><p>Could you confirm this should 
have been fixed in 3.10.8? If so we'll test it for sure!</p></div></blockquote><div dir="auto"> </div><div dir="auto">Fix should be part of 3.10.8 which is awaiting release announcement.</div><div dir="auto"> </div></div></div></div><div><div><div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p> </p><p><br /><br />Regards</p></div><div><p>Jo</p><p> </p><p><br /> </p></div><div><blockquote style="border-left:2px solid #325fba;padding-left:5px;margin-left:5px">-----Original message-----<br /><b>From:</b>        Atin Mukherjee <<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" target="_blank" title="This external link will open in a new window">amukherj@redhat.com</a>><br /></blockquote></div><div><blockquote style="border-left:2px solid #325fba;padding-left:5px;margin-left:5px"><b>Sent:</b>        Mon 30-10-2017 17:40<br /><b>Subject:</b>        Re: [Gluster-users] BUG: After stop and start wrong port is advertised<br /><b>To:</b>        Jo Goossens <<a href="mailto:jo.goossens@hosted-power.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=jo.goossens@hosted-power.com'); return false;" target="_blank" title="This external link will open in a new window">jo.goossens@hosted-power.com</a>>; <br /><b>CC:</b>        <a href="mailto:gluster-users@gluster.org" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=gluster-users@gluster.org'); return false;" target="_blank" title="This external link will open in a new window">gluster-users@gluster.org</a>; <br /> <div><br /><div><div dir="auto">On Sat, 28 Oct 2017 at 02:36, Jo Goossens <<a href="mailto:jo.goossens@hosted-power.com" onclick="parent.webclient.openWindow(this, 'createmail', 
'index.php?load=dialog&task=createmail_standard&to=jo.goossens@hosted-power.com'); return false;" title="This external link will open in a new window" target="_blank">jo.goossens@hosted-power.com</a>> wrote:</div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div><p>Hello Atin,</p><p> </p><p> </p><p>I just read it and very happy you found the issue. We really hope this will be fixed in the next 3.10.7 version!</p></div></blockquote><div dir="auto"> </div><div dir="auto">3.10.7 - no I guess as the patch is still in review and 3.10.7 is getting tagged today. You’ll get this fix in 3.10.8. </div><div dir="auto"> </div><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p> </p><p> </p><p> </p><p>PS: Wow nice all that c code and those "goto out" statements (not always considered clean but the best way often I think). Can remember the days I wrote kernel drivers myself in c :)</p><p> </p><p> </p><p>Regards</p></div><div><p>Jo Goossens</p><p> </p><p> </p><p><br /> </p></div><div><blockquote style="border-left:2px solid #325fba;padding-left:5px;margin-left:5px">-----Original message-----<br /><b>From:</b>        Atin Mukherjee <<a href="mailto:amukherj@redhat.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=amukherj@redhat.com'); return false;" title="This external link will open in a new window" target="_blank">amukherj@redhat.com</a>><br /><b>Sent:</b>        Fri 27-10-2017 21:01<br /><b>Subject:</b>        Re: [Gluster-users] BUG: After stop and start wrong port is advertised<br /><b>To:</b>        Jo Goossens <<a href="mailto:jo.goossens@hosted-power.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=jo.goossens@hosted-power.com'); return false;" title="This external link will open in a new window" target="_blank">jo.goossens@hosted-power.com</a>>; <br /><b>CC:</b>        
<a href="mailto:gluster-users@gluster.org" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=gluster-users@gluster.org'); return false;" title="This external link will open in a new window" target="_blank">gluster-users@gluster.org</a>; <br /> </blockquote></div><div><blockquote style="border-left:2px solid #325fba;padding-left:5px;margin-left:5px"><div><div>We (finally) figured out the root cause, Jo!<br /> </div>Patch <a href="https://review.gluster.org/#/c/18579" title="This external link will open in a new window" target="_blank">https://review.gluster.org/#/c/18579</a> posted upstream for review.</div><div><br /><div>On Thu, Sep 21, 2017 at 2:08 PM, Jo Goossens &lt;<a href="mailto:jo.goossens@hosted-power.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=jo.goossens@hosted-power.com'); return false;" title="This external link will open in a new window" target="_blank">jo.goossens@hosted-power.com</a>&gt; wrote:<br /><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div><p>Hi,</p><p> </p><p> </p><p>We use glusterfs 3.10.5 on Debian 9.</p><p> </p><p>When we stop or restart the service, e.g.: service glusterfs-server restart</p><p> </p><p>We see that the wrong port gets advertised afterwards. 
For example:</p><p> </p><p>Before restart:</p><p> </p><div>Status of volume: public</div><div>Gluster process TCP Port RDMA Port Online Pid</div><div>------------------------------------------------------------------------------</div><div>Brick 192.168.140.41:/gluster/public 49153 0 Y 6364</div><div>Brick 192.168.140.42:/gluster/public 49152 0 Y 1483</div><div>Brick 192.168.140.43:/gluster/public 49152 0 Y 5913</div><div>Self-heal Daemon on localhost N/A N/A Y 5932</div><div>Self-heal Daemon on 192.168.140.42 N/A N/A Y 13084</div><div>Self-heal Daemon on 192.168.140.41 N/A N/A Y 15499</div><div> </div><div>Task Status of Volume public</div><div>------------------------------------------------------------------------------</div><div>There are no active volume tasks</div><div> </div><div> </div><div>After restart of the service on one of the nodes (192.168.140.43) the port seems to have changed (but it didn't):</div><div> </div><div><div>root@app3:/var/log/glusterfs# gluster volume status</div><div>Status of volume: public</div><div>Gluster process TCP Port RDMA Port Online Pid</div><div>------------------------------------------------------------------------------</div><div>Brick 192.168.140.41:/gluster/public 49153 0 Y 6364</div><div>Brick 192.168.140.42:/gluster/public 49152 0 Y 1483</div><div>Brick 192.168.140.43:/gluster/public 49154 0 Y 5913</div><div>Self-heal Daemon on localhost N/A N/A Y 4628</div><div>Self-heal Daemon on 192.168.140.42 N/A N/A Y 3077</div><div>Self-heal Daemon on 192.168.140.41 N/A N/A Y 28777</div><div> </div><div>Task Status of Volume public</div><div>------------------------------------------------------------------------------</div><div>There are no active volume tasks</div><div> </div></div><div> </div><div>However the active process is STILL the same pid AND still listening on the old port</div><div> </div><div><div>root@192.168.140.43:/var/log/glusterfs# netstat -tapn | grep gluster</div><div>tcp 0 0 <a href="http://0.0.0.0:49152" 
title="This external link will open in a new window" target="_blank">0.0.0.0:49152</a> 0.0.0.0:* LISTEN 5913/glusterfsd</div><div> </div></div><div> </div><div>The other nodes logs fill up with errors because they can't reach the daemon anymore. They try to reach it on the "new" port instead of the old one:</div><div> </div><div><div>[2017-09-21 08:33:25.225006] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to <a href="http://192.168.140.43:49154" title="This external link will open in a new window" target="_blank">192.168.140.43:49154</a> failed (Connection refused); disconnecting socket</div><div>[2017-09-21 08:33:29.226633] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)</div><div>[2017-09-21 08:33:29.227490] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to <a href="http://192.168.140.43:49154" title="This external link will open in a new window" target="_blank">192.168.140.43:49154</a> failed (Connection refused); disconnecting socket</div><div>[2017-09-21 08:33:33.225849] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)</div><div>[2017-09-21 08:33:33.236395] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to <a href="http://192.168.140.43:49154" title="This external link will open in a new window" target="_blank">192.168.140.43:49154</a> failed (Connection refused); disconnecting socket</div><div>[2017-09-21 08:33:37.225095] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)</div><div>[2017-09-21 08:33:37.225628] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to <a href="http://192.168.140.43:49154" title="This external link will open in a new window" target="_blank">192.168.140.43:49154</a> failed (Connection refused); disconnecting socket</div><div>[2017-09-21 08:33:41.225805] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 
49154 (from 0)</div><div>[2017-09-21 08:33:41.226440] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to <a href="http://192.168.140.43:49154" title="This external link will open in a new window" target="_blank">192.168.140.43:49154</a> failed (Connection refused); disconnecting socket</div><div> </div></div><div>So they now try 49154 instead of the old 49152 </div><div> </div><div>Is this also by design? We had a lot of issues because of this recently. We don't understand why it starts advertising a completely wrong port after stop/start.</div><div> </div><div> </div><div> </div><div> </div><p> </p><p>Regards</p><font color="#888888"><p>Jo Goossens</p><p> </p> </font></div> <br />_______________________________________________<br /> Gluster-users mailing list<br /> <a href="mailto:Gluster-users@gluster.org" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=Gluster-users@gluster.org'); return false;" title="This external link will open in a new window" target="_blank">Gluster-users@gluster.org</a><br /> <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" title="This external link will open in a new window" target="_blank">http://lists.gluster.org/mailman/listinfo/gluster-users</a><br /></blockquote></div></div> </blockquote></div></blockquote></div></div><div>--</div><div>- Atin (atinm)</div> </blockquote></div></blockquote></div></div></div><div dir="ltr">-- </div><div>- Atin (atinm)</div> _______________________________________________<br /> Gluster-users mailing list<br /> <a href="mailto:Gluster-users@gluster.org" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=Gluster-users@gluster.org'); return false;" target="_blank" title="This external link will open in a new window">Gluster-users@gluster.org</a><br /> <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" 
target="_blank" title="This external link will open in a new window">http://lists.gluster.org/mailman/listinfo/gluster-users</a></blockquote></div><br clear="all" /><br />-- </div></div><font color="#888888"><div dir="ltr"><p dir="ltr">Alan Orth<br /> <a href="mailto:alan.orth@gmail.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=alan.orth@gmail.com'); return false;" target="_blank" title="This external link will open in a new window">alan.orth@gmail.com</a><br /> <a href="https://picturingjordan.com" target="_blank" title="This external link will open in a new window">https://picturingjordan.com</a><br /> <a href="https://englishbulgaria.net" target="_blank" title="This external link will open in a new window">https://englishbulgaria.net</a><br /> <a href="https://mjanja.ch" target="_blank" title="This external link will open in a new window">https://mjanja.ch</a></p></div> </font></blockquote></div></div> </blockquote></div></div></div></div></div><div><div><br clear="all" /><br />-- <br /><div dir="ltr"><p dir="ltr">Alan Orth<br /> <a href="mailto:alan.orth@gmail.com" onclick="parent.webclient.openWindow(this, 'createmail', 'index.php?load=dialog&task=createmail_standard&to=alan.orth@gmail.com'); return false;" target="_blank" title="This external link will open in a new window">alan.orth@gmail.com</a><br /> <a href="https://picturingjordan.com" target="_blank" title="This external link will open in a new window">https://picturingjordan.com</a><br /> <a href="https://englishbulgaria.net" target="_blank" title="This external link will open in a new window">https://englishbulgaria.net</a><br /> <a href="https://mjanja.ch" target="_blank" title="This external link will open in a new window">https://mjanja.ch</a></p></div></div></div></blockquote></div></div> </blockquote>
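<p> </p>
<p>For anyone who wants to check a node by hand: the comparison described in the thread (the port shown by "gluster volume status" versus the port the glusterfsd process is actually listening on) could be sketched roughly as below. This is not Mike Hulsman's script, only an illustration. The function names are made up for this example, and it assumes whitespace-separated, single-line rows like the ones quoted in the thread; real output may wrap long brick names, so treat it as a starting point.</p>
<pre>
```python
def parse_status_bricks(status_text, local_host):
    """Map brick PID to (brick, advertised TCP port) for bricks on local_host.

    Assumes rows shaped like the ones quoted in the thread:
    Brick HOST:/PATH TCP_PORT RDMA_PORT ONLINE PID
    """
    advertised = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] != "Brick":
            continue
        brick, tcp_port, pid = fields[1], fields[2], fields[5]
        # PIDs are only meaningful on the node they belong to, so keep
        # only bricks hosted on the node where netstat was run.
        if not brick.startswith(local_host + ":"):
            continue
        if tcp_port.isdigit() and pid.isdigit():
            advertised[int(pid)] = (brick, int(tcp_port))
    return advertised


def parse_netstat_listeners(netstat_text):
    """Map glusterfsd PID to the set of TCP ports it listens on.

    Assumes 'netstat -tapn' rows shaped like:
    tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 5913/glusterfsd
    """
    listening = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[5] != "LISTEN":
            continue
        if not fields[6].endswith("/glusterfsd"):
            continue
        pid = int(fields[6].split("/")[0])
        port = int(fields[3].rsplit(":", 1)[1])
        listening.setdefault(pid, set()).add(port)
    return listening


def find_port_mismatches(status_text, netstat_text, local_host):
    """Return (brick, advertised_port, listening_ports) for each local brick
    whose advertised port is not among the ports its process listens on."""
    advertised = parse_status_bricks(status_text, local_host)
    listening = parse_netstat_listeners(netstat_text)
    mismatches = []
    for pid, (brick, port) in sorted(advertised.items()):
        ports = listening.get(pid)
        if ports is not None and port not in ports:
            mismatches.append((brick, port, sorted(ports)))
    return mismatches
```
</pre>
<p>Feeding it the "after restart" status output and the netstat line from the original report, with 192.168.140.43 as the local host, should flag the brick as advertised on 49154 while listening on 49152. It only reports the mismatch and does not change anything.</p>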
</body>
</html>