<div dir="ltr">Hi Ravi,<div><br></div><div>I would like to avoid an offline upgrade, since it would disrupt quite some services.</div><div>Is there anything further I can investigate or do?</div><div><br></div><div>Thanks Olaf</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Op di 3 nov. 2020 om 12:17 schreef Ravishankar N <<a href="mailto:ravishankar@redhat.com">ravishankar@redhat.com</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 02/11/20 8:35 pm, Olaf Buitelaar
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Dear Gluster users,
<div><br>
</div>
<div>I'm trying to upgrade from Gluster 6.10 to 7.8. I've
currently tried this on 2 hosts, but on both the Self-Heal
Daemon refuses to start.</div>
<div>It could be because not all nodes are updated yet, but I'm a
bit hesitant to continue without the Self-Heal Daemon
running.</div>
<div>I'm not using quotas, and I'm not seeing the peer-reject
messages that other users reported on the mailing list.</div>
<div>In fact, gluster peer status and gluster pool list display
all nodes as connected.</div>
<div>Also, gluster v heal <vol> info shows all nodes as
Status: connected, though some report pending heals that
don't really seem to progress. </div>
<div>Only in gluster v status <vol> do the 2 upgraded nodes
report the Self-Heal Daemon as not running:</div>
<div><br>
</div>
<pre>Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Self-heal Daemon on localhost                N/A       N/A        N       N/A
Self-heal Daemon on 10.32.9.5                N/A       N/A        Y       24022
Self-heal Daemon on 10.201.0.4               N/A       N/A        Y       26704
Self-heal Daemon on 10.201.0.3               N/A       N/A        N       N/A
Self-heal Daemon on 10.32.9.4                N/A       N/A        Y       46294
Self-heal Daemon on 10.32.9.3                N/A       N/A        Y       22194
Self-heal Daemon on 10.201.0.9               N/A       N/A        Y       14902
Self-heal Daemon on 10.201.0.6               N/A       N/A        Y       5358
Self-heal Daemon on 10.201.0.5               N/A       N/A        Y       28073
Self-heal Daemon on 10.201.0.7               N/A       N/A        Y       15385
Self-heal Daemon on 10.201.0.1               N/A       N/A        Y       8917
Self-heal Daemon on 10.201.0.12              N/A       N/A        Y       56796
Self-heal Daemon on 10.201.0.8               N/A       N/A        Y       7990
Self-heal Daemon on 10.201.0.11              N/A       N/A        Y       68223
Self-heal Daemon on 10.201.0.10              N/A       N/A        Y       20828
</pre>
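<div><br>
</div>
<div>For reference, these are the commands behind the checks described
above (just a summary sketch; <vol> stands for the affected volume
name):</div>
<pre>gluster peer status               # peer membership and state
gluster pool list                 # should show all nodes as Connected
gluster volume heal <vol> info    # per-brick connection status and pending heals
gluster volume status <vol>       # produces the Self-heal Daemon rows above
</pre>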
<div><br>
</div>
<div>After the upgrade I see the
file /var/lib/glusterd/vols/<vol>/<vol>-shd.vol
being created, which doesn't exist on the 6.10 nodes. </div>
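<div><br>
</div>
<div>A quick way to compare this per node (a sketch, not from the
original report; 10.201.0.3 and 10.201.0.4 are just two addresses taken
from the status output above, presumably one upgraded node and one
still on 6.10):</div>
<pre>for h in 10.201.0.3 10.201.0.4; do
  echo "== $h =="
  ssh "$h" 'ls -l /var/lib/glusterd/vols/*/*-shd.vol 2>/dev/null || echo "no shd volfile"'
done
</pre>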
<div><br>
</div>
<div>In the logs I see these relevant messages:</div>
<div>log: glusterd.log</div>
<div>0-management: Regenerating volfiles due to a max op-version
mismatch or glusterd.upgrade file not being present,
op_version retrieved:60000, max op_version: 70200<br>
</div>
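<div><br>
</div>
<div>(The two op-versions in that message can be confirmed on any node;
this is a sketch of the standard checks, not part of the original
report.)</div>
<pre>gluster volume get all cluster.op-version       # current cluster-wide op-version (60000 here)
gluster volume get all cluster.max-op-version   # highest op-version this build supports (70200)
</pre>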
</div>
</blockquote>
<p>I think this is because of the shd multiplex
(<a href="https://bugzilla.redhat.com/show_bug.cgi?id=1659708" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1659708</a>) added by
Rafi.</p>
<p>Rafi, is there any workaround that works for rolling
upgrades? Or should we just do an offline upgrade of all server
nodes for the shd to come online? <br>
</p>
<p>-Ravi<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>[2020-10-31 21:48:42.256193] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: tier-enabled<br>
[2020-10-31 21:48:42.256232] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-0<br>
[2020-10-31 21:48:42.256240] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-1<br>
[2020-10-31 21:48:42.256246] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-2<br>
[2020-10-31 21:48:42.256251] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-3<br>
[2020-10-31 21:48:42.256256] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-4<br>
[2020-10-31 21:48:42.256261] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-5<br>
[2020-10-31 21:48:42.256266] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-6<br>
[2020-10-31 21:48:42.256271] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-7<br>
[2020-10-31 21:48:42.256276] W [MSGID: 106204]
[glusterd-store.c:3275:glusterd_store_update_volinfo]
0-management: Unknown key: brick-8<br>
<br>
</div>
<div>[2020-10-31 21:51:36.049009] W [MSGID: 106617]
[glusterd-svc-helper.c:948:glusterd_attach_svc] 0-glusterd:
attach failed for glustershd(volume=backups)<br>
[2020-10-31 21:51:36.049055] E [MSGID: 106048]
[glusterd-shd-svc.c:482:glusterd_shdsvc_start] 0-glusterd:
Failed to attach shd svc(volume=backups) to pid=9262<br>
[2020-10-31 21:51:36.049138] E [MSGID: 106615]
[glusterd-shd-svc.c:638:glusterd_shdsvc_restart] 0-management:
Couldn't start shd for vol: backups on restart<br>
[2020-10-31 21:51:36.183133] I [MSGID: 106618]
[glusterd-svc-helper.c:901:glusterd_attach_svc] 0-glusterd:
adding svc glustershd (volume=backups) to existing process
with pid 9262<br>
</div>
<div><br>
</div>
<div>log: glustershd.log
</div>
<div><br>
</div>
<div>[2020-10-31 21:49:55.976120] I [MSGID: 100041]
[glusterfsd-mgmt.c:1111:glusterfs_handle_svc_attach]
0-glusterfs: received attach request for
volfile-id=shd/backups<br>
[2020-10-31 21:49:55.976136] W [MSGID: 100042]
[glusterfsd-mgmt.c:1137:glusterfs_handle_svc_attach]
0-glusterfs: got attach for shd/backups but no active graph
[Invalid argument]<br>
</div>
<div><br>
</div>
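<div>To see which self-heal daemon process those attach requests are
aimed at, a rough check on the upgraded node would be (a sketch, not
from the original report; pid 9262 is the one named in the glusterd.log
above):</div>
<pre>pgrep -af glustershd   # running glustershd processes with full command lines
# compare against the pid glusterd tried to attach the volume to (9262)
</pre>
<div><br>
</div>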
<div>So I suspect something in the logic for the self-heal
daemon has changed, since it now has the new *.vol configuration
for the shd. The question is: is this just a transitional state
until all nodes are upgraded, and thus safe to continue the
update? Or is this something that should be fixed, and if so,
any clues how?</div>
<div><br>
</div>
<div>Thanks, Olaf</div>
</div>
<br>
</blockquote>
</div>
</blockquote></div>