<div dir="ltr"><div dir="ltr"><div>Darrell,</div><div><br></div><div>I fully understand that you can&#39;t reproduce it and don&#39;t have the bandwidth to test it again, but would you be able to send us the glusterd logs from all the nodes from when this happened? We would like to go through the logs and get back to you. I would particularly like to see whether something has gone wrong with the transport.socket.listen-port option, but without the log files we can&#39;t find out anything. I hope you understand.<br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 4, 2019 at 9:27 PM Darrell Budic &lt;<a href="mailto:budic@onholyground.com">budic@onholyground.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;">I didn’t follow any specific documents, just a generic rolling upgrade one node at a time. Once the first node didn’t reconnect, I tried to follow the workaround in the bug during the upgrade. 
Basic procedure was:<div><br></div><div>- take 3 nodes that were initially installed with 3.12.x (forget which, but low number) and had been upgraded directly to 5.5 from 3.12.15</div><div>  - op-version was 50400</div><div>- on node A:</div><div>  - yum install centos-release-gluster6</div><div>  - yum upgrade (was some ovirt cockpit components, gluster, and a lib or two this time), hit yes</div><div>  - discover glusterd was dead<br><div>  - systemctl restart glusterd</div><div>  - no peer connections, try iptables -F; systemctl restart glusterd, no change</div><div>- following the workaround in the bug, try iptables -F &amp; restart glusterd on other 2 nodes, no effect</div><div>  - nodes B &amp; C were still connected to each other and all bricks were fine at this point</div><div>- try upgrading other 2 nodes and restarting gluster, no effect (iptables still empty)</div><div>  - lost quota here, so all bricks went offline</div><div>- read logs, not finding much, but looked at glusterd.vol and compared to new versions</div><div>- updated glusterd.vol on A and restarted glusterd</div><div>  - A doesn’t show any connected peers, but both other nodes show A as connected</div><div>- update glusterd.vol on B &amp; C, restart glusterd</div><div>  - all nodes show connected and volumes are active and healing</div><div><br></div><div>The only odd thing in my process was that node A did not have any active bricks on it at the time of the upgrade. It doesn’t seem like this mattered since B &amp; C showed the same symptoms between themselves while being upgraded, but I don’t know. The only log entry that referenced anything about peer connections is included below already.</div><div><br></div><div>Looks like it was related to my glusterd settings, since that’s what fixed it for me. Unfortunately, I don’t have the bandwidth or the systems to test different versions of that specifically, but maybe you guys can on some test resources? 
Otherwise, I’ve got another cluster (my production one!) that’s midway through the upgrade from 3.12.15 -&gt; 5.5. I paused when I started getting multiple brick processes on the two nodes that had gone to 5.5 already. I think I’m going to jump the last node right to 6 to try and avoid that mess, and it has the same glusterd.vol settings. I’ll try and capture its logs during the upgrade and see if there’s any new info, or if it has the same issues as this group did.</div><div><br></div><div>  -Darrell</div><div><br><blockquote type="cite"><div>On Apr 4, 2019, at 2:54 AM, Sanju Rakonde &lt;<a href="mailto:srakonde@redhat.com" target="_blank">srakonde@redhat.com</a>&gt; wrote:</div><br class="gmail-m_2595462862866881005Apple-interchange-newline"><div><div dir="ltr"><div dir="ltr">We did not hit <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1694010" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1694010</a> while upgrading to glusterfs-6. We tested it in different setups and concluded that this issue is seen because of some problem in the setup.</div><div dir="ltr"><br></div><div>Regarding the issue you faced, can you please let us know which documentation you followed for the upgrade? During our testing, we didn&#39;t hit any such issue. We would like to understand what went wrong.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic &lt;<a href="mailto:budic@onholyground.com" target="_blank">budic@onholyground.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Hari-<div><br></div><div>I was upgrading my test cluster from 5.5 to 6 and I hit this bug (<a href="https://bugzilla.redhat.com/show_bug.cgi?id=1694010" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1694010</a><span>)</span> or something similar. 
In my case, the workaround did not work, and I was left with a gluster that had gone into no-quorum mode and stopped all the bricks. There wasn’t much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated the other nodes the same way, things started working again. Maybe a place to look?</div><div><br></div><div>My old config (all nodes):</div><div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">volume management</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    type mgmt/glusterd</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option working-directory /var/lib/glusterd</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport-type socket</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.keepalive-time 10</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.keepalive-interval 2</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.read-fail-log off</span></div><div 
style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option ping-timeout 10</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option event-threads 1</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option rpc-auth-allow-insecure on</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">#   option transport.address-family inet6</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">#   option base-port 49152</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">end-volume</span></div></div><div><span style="font-variant-ligatures:no-common-ligatures"><br></span></div><div><span style="font-variant-ligatures:no-common-ligatures">changed to:</span></div><div><span style="font-variant-ligatures:no-common-ligatures"><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">volume management</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    type mgmt/glusterd</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option working-directory 
/var/lib/glusterd</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport-type socket,rdma</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.keepalive-time 10</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.keepalive-interval 2</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.read-fail-log off</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.socket.listen-port 24007</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option transport.rdma.listen-port 24008</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option ping-timeout 0</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option event-threads 1</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option rpc-auth-allow-insecure on</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span 
style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">#   option lock-timer 180</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">#   option transport.address-family inet6</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">#   option base-port 49152</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">    option max-port  60999</span></div><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">end-volume</span></div><div><span style="font-variant-ligatures:no-common-ligatures"><br></span></div><div><span style="font-variant-ligatures:no-common-ligatures">the only thing I found in the glusterd logs that looks relevant was (repeated for both of the other nodes in this cluster), so no clue why it happened:</span></div><div><span style="font-variant-ligatures:no-common-ligatures"><div style="margin:0px;font-stretch:normal;line-height:normal"><span style="font-variant-ligatures:no-common-ligatures;background-color:rgba(255,255,255,0)">[2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer &lt;ossuary-san&gt; (&lt;0ecbf953-681b-448f-9746-d1c1fe7a0978&gt;), in state &lt;Peer in Cluster&gt;, has disconnected from glusterd.</span></div><div><span style="font-variant-ligatures:no-common-ligatures"><br></span></div></span></div></span></div><div><div><br><blockquote type="cite"><div>On Apr 2, 2019, at 4:53 AM, Atin Mukherjee &lt;<a href="mailto:atin.mukherjee83@gmail.com" target="_blank">atin.mukherjee83@gmail.com</a>&gt; 
wrote:</div><br class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-interchange-newline"><div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><br class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-interchange-newline"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 1 Apr 2019 at 10:28, Hari Gowtham &lt;<a href="mailto:hgowtham@redhat.com" target="_blank">hgowtham@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Comments inline.<br><br>On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay<br>&lt;<a href="mailto:sankarshan.mukhopadhyay@gmail.com" target="_blank">sankarshan.mukhopadhyay@gmail.com</a>&gt; wrote:<br>&gt;<br>&gt; Quite a considerable amount of detail here. Thank you!<br>&gt;<br>&gt; On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham &lt;<a href="mailto:hgowtham@redhat.com" target="_blank">hgowtham@redhat.com</a>&gt; wrote:<br>&gt; &gt;<br>&gt; &gt; Hello Gluster users,<br>&gt; &gt;<br>&gt; &gt; As you are all aware, glusterfs-6 is out. We would like to inform you<br>&gt; &gt; that we have spent a significant amount of time testing<br>&gt; &gt; glusterfs-6 in upgrade scenarios. 
We have done upgrade testing to<br>&gt; &gt; glusterfs-6 from various releases like 3.12, 4.1 and 5.3.<br>&gt; &gt;<br>&gt; &gt; As glusterfs-6 brings in a lot of changes, we wanted to test those portions.<br>&gt; &gt; There were xlators (and respective options to enable/disable them)<br>&gt; &gt; added and deprecated in glusterfs-6 from various versions [1].<br>&gt; &gt;<br>&gt; &gt; We had to check the following upgrade scenarios for all such options<br>&gt; &gt; identified in [1]:<br>&gt; &gt; 1) option never enabled and upgraded<br>&gt; &gt; 2) option enabled and then upgraded<br>&gt; &gt; 3) option enabled and then disabled and then upgraded<br>&gt; &gt;<br>&gt; &gt; We weren&#39;t able to manually check all the combinations for all the options,<br>&gt; &gt; so the options involving enabling and disabling xlators were prioritized.<br>&gt; &gt; Below are the results of the ones tested.<br>&gt; &gt;<br>&gt; &gt; Never enabled and upgraded:<br>&gt; &gt; Checked from 3.12, 4.1 and 5.3 to 6; the upgrade works.<br>&gt; &gt;<br>&gt; &gt; Enabled and upgraded:<br>&gt; &gt; Tested for tier, which is deprecated. It is not a recommended upgrade.<br>&gt; &gt; As expected, the volume won&#39;t be consumable and will have a few more<br>&gt; &gt; issues as well.<br>&gt; &gt; Tested with 3.12, 4.1 and 5.3 to 6 upgrade.<br>&gt; &gt;<br>&gt; &gt; Enabled, then disabled before upgrade:<br>&gt; &gt; Tested for tier with 3.12 and the upgrade went fine.<br>&gt; &gt;<br>&gt; &gt; There is one common issue to note in every upgrade. The node being<br>&gt; &gt; upgraded goes into a disconnected state. You have to flush the iptables<br>&gt; &gt; and then restart glusterd on all nodes to fix this.<br>&gt; &gt;<br>&gt;<br>&gt; Is this something that is written in the upgrade notes? I do not seem<br>&gt; to recall; if not, I&#39;ll send a PR.<br><br>No, this wasn&#39;t mentioned in the release notes. PRs are welcome.<br><br>&gt;<br>&gt; &gt; The testing for enabling new options is still pending. 
The new options<br>&gt; &gt; won&#39;t cause as many issues as the deprecated ones, so this was put at<br>&gt; &gt; the end of the priority list. It would be nice to get contributions<br>&gt; &gt; for this.<br>&gt; &gt;<br>&gt;<br>&gt; Did the range of tests lead to any new issues?<br><br>Yes. In the first round of testing we found an issue and had to postpone the<br>release of 6 until the fix was made available.<br><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1684029" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1684029</a><br><br>We then tested it again after this patch was made available<br>and came across this:<br><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1694010" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1694010</a></blockquote><div dir="auto"><br></div><div dir="auto">This isn’t a bug, as we found that the upgrade worked seamlessly in two different setups. So we have no issues in the upgrade path to the glusterfs-6 release.</div><div dir="auto"><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1694010" rel="noreferrer" target="_blank"></a><br><br>I have mentioned in the second mail how to overcome this situation<br>for now until the fix is available.<br><br>&gt;<br>&gt; &gt; For the disable testing, tier was used as it covers most of the xlators<br>&gt; &gt; that were removed. 
And all of these tests were done on a replica 3 volume.<br>&gt; &gt;<br>&gt;<br>&gt; I&#39;m not sure if the Glusto team is reading this, but it would be<br>&gt; pertinent to understand if the approach you have taken can be<br>&gt; converted into a form of automated testing pre-release.<br><br>I don&#39;t have an answer for this, have CCed Vijay.<br>He might have an idea.<br><br>&gt;<br>&gt; &gt; Note: This is only for upgrade testing of the newly added and removed<br>&gt; &gt; xlators. Does not involve the normal tests for the xlator.<br>&gt; &gt;<br>&gt; &gt; If you have any questions, please feel free to reach us.<br>&gt; &gt;<br>&gt; &gt; [1]<span class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-converted-space"> </span><a href="https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing" rel="noreferrer" target="_blank">https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing</a><br>&gt; &gt;<br>&gt; &gt; Regards,<br>&gt; &gt; Hari and Sanju.<br>&gt; _______________________________________________<br>&gt; Gluster-users mailing list<br>&gt;<span class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-converted-space"> </span><a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>&gt;<span class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-converted-space"> </span><a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br><br><br><br>--<span class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-converted-space"> </span><br>Regards,<br>Hari Gowtham.<br>_______________________________________________<br>Gluster-users mailing list<br><a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br><a 
href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br></blockquote></div></div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">--<span class="gmail-m_2595462862866881005gmail-m_8304304534599826919Apple-converted-space"> </span></span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div dir="ltr" class="gmail-m_2595462862866881005gmail-m_8304304534599826919gmail_signature" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">--Atin</div><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">_______________________________________________</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><span 
style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">Gluster-users mailing list</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><a href="mailto:Gluster-users@gluster.org" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" target="_blank">Gluster-users@gluster.org</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><a href="https://lists.gluster.org/mailman/listinfo/gluster-users" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a></div></blockquote></div><br></div></div>_______________________________________________<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail-m_2595462862866881005gmail_signature"><div dir="ltr"><div>Thanks,<br></div>Sanju<br></div></div>
</div></blockquote></div><br></div></div></blockquote></div>