<div dir="ltr">Dear Gluster users,<div><br></div><div>I&#39;m trying to upgrade from Gluster 6.10 to 7.8. So far I&#39;ve upgraded 2 hosts, but on both the Self-Heal Daemon refuses to start.</div><div>It could be because not all nodes are upgraded yet, but I&#39;m a bit hesitant to continue without the Self-Heal Daemon running.</div><div>I&#39;m not using quotas, and I&#39;m not seeing the peer reject messages that other users reported on the mailing list.</div><div>In fact, gluster peer status and gluster pool list both display all nodes as connected.</div><div>Also, gluster v heal &lt;vol&gt; info shows all nodes as Status: connected; however, some report pending heals, which don&#39;t really seem to progress. </div><div>Only in gluster v status &lt;vol&gt; do the 2 upgraded nodes report the Self-Heal Daemon as not running: </div><div><br></div><div>Self-heal Daemon on localhost               N/A       N/A        N       N/A<br>Self-heal Daemon on 10.32.9.5               N/A       N/A        Y       24022<br>Self-heal Daemon on 10.201.0.4              N/A       N/A        Y       26704<br>Self-heal Daemon on 10.201.0.3              N/A       N/A        N       N/A<br>Self-heal Daemon on 10.32.9.4               N/A       N/A        Y       46294<br>Self-heal Daemon on 10.32.9.3               N/A       N/A        Y       22194<br>Self-heal Daemon on 10.201.0.9              N/A       N/A        Y       14902<br>Self-heal Daemon on 10.201.0.6              N/A       N/A        Y       5358<br>Self-heal Daemon on 10.201.0.5              N/A       N/A        Y       28073<br>Self-heal Daemon on 10.201.0.7              N/A       N/A        Y       15385<br>Self-heal Daemon on 10.201.0.1              N/A       N/A        Y       8917<br>Self-heal Daemon on 10.201.0.12             N/A       N/A        Y       56796<br>Self-heal Daemon on 10.201.0.8              N/A       N/A        Y       7990<br>Self-heal Daemon on 10.201.0.11             N/A       N/A        Y       68223<br>Self-heal Daemon on 
10.201.0.10             N/A       N/A        Y       20828<br></div><div><br></div><div>After the upgrade I see the file /var/lib/glusterd/vols/&lt;vol&gt;/&lt;vol&gt;-shd.vol being created, which doesn&#39;t exist on the 6.10 nodes. </div><div><br></div><div>In the logs I see these relevant messages: </div><div>log: glusterd.log</div><div>0-management: Regenerating volfiles due to a max op-version mismatch or glusterd.upgrade file not being present, op_version retrieved:60000, max op_version: 70200<br></div><div><br></div><div>[2020-10-31 21:48:42.256193] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: tier-enabled<br>[2020-10-31 21:48:42.256232] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-0<br>[2020-10-31 21:48:42.256240] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-1<br>[2020-10-31 21:48:42.256246] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-2<br>[2020-10-31 21:48:42.256251] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-3<br>[2020-10-31 21:48:42.256256] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-4<br>[2020-10-31 21:48:42.256261] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-5<br>[2020-10-31 21:48:42.256266] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-6<br>[2020-10-31 21:48:42.256271] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-7<br>[2020-10-31 21:48:42.256276] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-8<br><br></div><div>[2020-10-31 21:51:36.049009] W [MSGID: 106617] 
[glusterd-svc-helper.c:948:glusterd_attach_svc] 0-glusterd: attach failed for glustershd(volume=backups)<br>[2020-10-31 21:51:36.049055] E [MSGID: 106048] [glusterd-shd-svc.c:482:glusterd_shdsvc_start] 0-glusterd: Failed to attach shd svc(volume=backups) to pid=9262<br>[2020-10-31 21:51:36.049138] E [MSGID: 106615] [glusterd-shd-svc.c:638:glusterd_shdsvc_restart] 0-management: Couldn&#39;t start shd for vol: backups on restart<br>[2020-10-31 21:51:36.183133] I [MSGID: 106618] [glusterd-svc-helper.c:901:glusterd_attach_svc] 0-glusterd: adding svc glustershd (volume=backups) to existing process with pid 9262<br></div><div><br></div><div>log: glustershd.log


</div><div><br></div><div>[2020-10-31 21:49:55.976120] I [MSGID: 100041] [glusterfsd-mgmt.c:1111:glusterfs_handle_svc_attach] 0-glusterfs: received attach request for volfile-id=shd/backups<br>[2020-10-31 21:49:55.976136] W [MSGID: 100042] [glusterfsd-mgmt.c:1137:glusterfs_handle_svc_attach] 0-glusterfs: got attach for shd/backups but no active graph [Invalid argument]<br></div><div><br></div><div>So I suspect something in the logic for the self-heal daemon has changed, since it now has the new *.vol configuration for the shd. The question is: is this just a transitional state until all nodes are upgraded, and thus safe to continue the upgrade? Or is this something that should be fixed first, and if so, any clues how?</div><div><br></div><div>Thanks, Olaf</div></div>