For the arb_0 I see only 8 clients, while there should be 12 clients:

Brick : 192.168.0.40:/var/bricks/0/brick
Clients connected : 12

Brick : 192.168.0.41:/var/bricks/0/brick
Clients connected : 12

Brick : 192.168.0.80:/var/bricks/arb_0/brick
Clients connected : 8

Can you try to reconnect them? The simplest way is to kill the arbiter brick process and run 'gluster volume start force', but always verify first that you have both data bricks up and running.
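
(A minimal sketch of that recovery sequence, assuming the volume name gv0 and the arb_0 brick path from this thread; the PID placeholder has to be taken from the live status output:)

  # confirm that both data bricks of the replica set are online first
  gluster volume status gv0

  # stop only the stale arbiter brick process on 192.168.0.80
  # (use the PID listed for 192.168.0.80:/var/bricks/arb_0/brick)
  kill <arbiter-brick-pid>

  # restart the missing brick process so that clients can reconnect
  gluster volume start gv0 force

  # then re-trigger healing
  gluster volume heal gv0
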
Yet, this doesn't explain why the heal daemon is not able to replicate properly.

Best Regards,
Strahil Nikolov


Meanwhile I tried reset-brick on one of the failing arbiters on node2, but with the same results. The behaviour is reproducible; the arbiter stays empty.

node0: 192.168.0.40
node1: 192.168.0.41
node2: 192.168.0.80

volume info:

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 9bafc4d2-d9b6-4b6d-a631-1cf42d1d2559
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x (2 + 1) = 18
Transport-type: tcp
Bricks:
Brick1: 192.168.0.40:/var/bricks/0/brick
Brick2: 192.168.0.41:/var/bricks/0/brick
Brick3: 192.168.0.80:/var/bricks/arb_0/brick (arbiter)
Brick4: 192.168.0.40:/var/bricks/2/brick
Brick5: 192.168.0.80:/var/bricks/2/brick
Brick6: 192.168.0.41:/var/bricks/arb_1/brick (arbiter)
Brick7: 192.168.0.40:/var/bricks/1/brick
Brick8: 192.168.0.41:/var/bricks/1/brick
Brick9: 192.168.0.80:/var/bricks/arb_1/brick (arbiter)
Brick10: 192.168.0.40:/var/bricks/3/brick
Brick11: 192.168.0.80:/var/bricks/3/brick
Brick12: 192.168.0.41:/var/bricks/arb_0/brick (arbiter)
Brick13: 192.168.0.41:/var/bricks/3/brick
Brick14: 192.168.0.80:/var/bricks/4/brick
Brick15: 192.168.0.40:/var/bricks/arb_0/brick (arbiter)
Brick16: 192.168.0.41:/var/bricks/2/brick
Brick17: 192.168.0.80:/var/bricks/5/brick
Brick18: 192.168.0.40:/var/bricks/arb_1/brick (arbiter)
Options Reconfigured:
cluster.min-free-inodes: 6%
cluster.min-free-disk: 2%
performance.md-cache-timeout: 600
cluster.rebal-throttle: lazy
features.scrub-freq: monthly
features.scrub-throttle: lazy
features.scrub: Inactive
features.bitrot: off
cluster.server-quorum-type: none
performance.cache-refresh-timeout: 10
performance.cache-max-file-size: 64MB
performance.cache-size: 781901824
auth.allow: /(192.168.0.*),/usr/andreas(192.168.0.120),/usr/otis(192.168.0.168),/usr/otis(192.168.0.111),/usr/otis(192.168.0.249),/media(192.168.0.*),/virt(192.168.0.*),/cloud(192.168.0.247),/zm(192.168.0.136)
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
cluster.quorum-type: auto
features.cache-invalidation: on
nfs.disable: on
transport.address-family: inet
cluster.self-heal-daemon: on
cluster.server-quorum-ratio: 51%
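
(An aside on the options above: auth.allow is the option that controls which paths may be mounted from which client addresses, so it is one thing worth re-checking against the "authentication error" reported later in this thread for new fuse clients. A read-only way to inspect it, assuming only the volume name gv0 from this thread:)

  # show the auth.allow value glusterd is currently enforcing
  gluster volume get gv0 auth.allow

  # note: 'gluster volume set gv0 auth.allow <list>' replaces the whole
  # comma-separated list, so any change must repeat the existing entries
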
volume status:

Status of volume: gv0
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.40:/var/bricks/0/brick       49155     0          Y       713066
Brick 192.168.0.41:/var/bricks/0/brick       49152     0          Y       2082
Brick 192.168.0.80:/var/bricks/arb_0/brick   49152     0          Y       26186
Brick 192.168.0.40:/var/bricks/2/brick       49156     0          Y       713075
Brick 192.168.0.80:/var/bricks/2/brick       49154     0          Y       325
Brick 192.168.0.41:/var/bricks/arb_1/brick   49157     0          Y       1746903
Brick 192.168.0.40:/var/bricks/1/brick       49157     0          Y       713084
Brick 192.168.0.41:/var/bricks/1/brick       49153     0          Y       14104
Brick 192.168.0.80:/var/bricks/arb_1/brick   49159     0          Y       2314
Brick 192.168.0.40:/var/bricks/3/brick       49153     0          Y       2978692
Brick 192.168.0.80:/var/bricks/3/brick       49155     0          Y       23269
Brick 192.168.0.41:/var/bricks/arb_0/brick   49158     0          Y       1746942
Brick 192.168.0.41:/var/bricks/3/brick       49155     0          Y       897058
Brick 192.168.0.80:/var/bricks/4/brick       49156     0          Y       27433
Brick 192.168.0.40:/var/bricks/arb_0/brick   49152     0          Y       3561115
Brick 192.168.0.41:/var/bricks/2/brick       49156     0          Y       902602
Brick 192.168.0.80:/var/bricks/5/brick       49157     0          Y       29522
Brick 192.168.0.40:/var/bricks/arb_1/brick   49154     0          Y       3561159
Self-heal Daemon on localhost                N/A       N/A        Y       26199
Self-heal Daemon on 192.168.0.41             N/A       N/A        Y       2240635
Self-heal Daemon on 192.168.0.40             N/A       N/A        Y       3912810

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

volume heal info summary:

Brick 192.168.0.40:/var/bricks/0/brick   <--- contains 100177 files in 25015 dirs
Status: Connected
Total Number of entries: 1006
Number of entries in heal pending: 1006
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/0/brick
Status: Connected
Total Number of entries: 1006
Number of entries in heal pending: 1006
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/arb_0/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.40:/var/bricks/2/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/2/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/arb_1/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.40:/var/bricks/1/brick
Status: Connected
Total Number of entries: 1006
Number of entries in heal pending: 1006
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/1/brick
Status: Connected
Total Number of entries: 1006
Number of entries in heal pending: 1006
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/arb_1/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.40:/var/bricks/3/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/3/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/arb_0/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/3/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/4/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.40:/var/bricks/arb_0/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.41:/var/bricks/2/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.80:/var/bricks/5/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 192.168.0.40:/var/bricks/arb_1/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0
client-list:

Client connections for volume gv0
Name                     count
-----                    ------
fuse                     5
gfapi.ganesha.nfsd       3
glustershd               3

total clients for volume gv0 : 11
-----------------------------------------------------------------

all clients: https://pro.hostit.de/nextcloud/index.php/s/tWdHox3aqb3qqbG

failing mnt.log: https://pro.hostit.de/nextcloud/index.php/s/2E2NLnXNsTy7EQe

Thank you.

A.
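
(For reference, the outputs above map onto these CLI calls; a sketch that assumes only the volume name gv0 used in this thread:)

  gluster volume info gv0
  gluster volume status gv0
  gluster volume heal gv0 info summary

  # per-brick client counts (as quoted at the top of this thread) and the
  # per-type client summary shown above
  gluster volume status gv0 clients
  gluster volume status all client-list
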
"Strahil Nikolov" hunter86_bg@yahoo.com – 31 May 2021 05:23
> Can you provide gluster volume info, gluster volume status, gluster volume heal info summary and, most probably, gluster volume status all clients/client-list?
>
> Best Regards,
> Strahil Nikolov
>
> > On Sun, May 30, 2021 at 15:17, a.schwibbe@gmx.net wrote:
> >
> > I am seeking help here after looking for solutions on the web for my distributed-replicated volume.
> >
> > My volume has been in operation since v3.10; I have upgraded through to 7.9, replaced nodes and replaced bricks without a problem. I love it.
> >
> > Finally I wanted to extend my 6x2 distributed-replicated volume with arbiters for better split-brain protection.
> >
> > So I ran add-brick with replica 3 arbiter 1 (as I had a 6x2 volume, I obviously added 6 arbiter bricks), and it successfully converted to 6 x (2 + 1); self-heal immediately started. Looking good.
> >
> > Version: 7.9
> > Number of Bricks: 6 x (2 + 1) = 18
> > cluster.max-op-version: 70200
> > Peers: 3 (node[0..2])
> >
> > Layout
> > |node0  |node1  |node2
> > |brick0 |brick0 |arbit0
> > |arbit1 |brick1 |brick1
> > ....
> >
> > I then recognized that the arbiter bricks on node0 & node1 had been healed successfully.
> > Unfortunately, none of the arbiter bricks on node2 have been healed!
> >
> > I realized that the top-level dir on my arbiter mount point has been created (the mount point /var/bricks/arb_0 now contains the dir "brick"); however, this dir has numeric owner ID 33 on _all_ other bricks, but 0 on this one. The brick dir on the faulty arbiter bricks does contain ".glusterfs", but it has only very few entries. Other than that, "brick" is empty.
> >
> > At that point I changed the brick dir owner with chown to 33:33 and hoped for self-heal to work. It did not.
> > I hoped a rebalance fix-layout would fix things. It did not.
> > I hoped a glusterd restart on node2 (as this is happening to both arbiter bricks on this node exclusively) would help. It did not.
> >
> > Active mount points via nfs-ganesha or fuse continue to work.
> > Existing clients cause errors in the arbiter brick logs on node2 for missing files or dirs, but the clients seem unaffected; r/w operations work.
> >
> > New clients are not able to fuse-mount the volume due to an "authentication error".
> >
> > heal statistics heal-count shows that several hundred files need healing, and this count is rising.
> > Watching df on the arbiter brick mount point on node2 shows a few bytes written every now and then, but they are removed immediately after that.
> >
> > Any help/recommendation from you is highly appreciated.
> >
> > Thank you!
> >
> > A.
> >
> > ________
> >
> > Community Meeting Calendar:
> >
> > Schedule -
> > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> > Bridge: https://meet.google.com/cpu-eiue-hvk
> >
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
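
(On the ownership mismatch described in the original report above — the brick root owned by UID 0 on the failing arbiters versus 33 elsewhere — a common way to compare a failing arbiter with its data bricks is to look at the brick root's owner and its trusted.* extended attributes on each node. A sketch using the brick paths from this thread; getfattr comes from the attr package:)

  # on node0/node1 (the healthy data bricks of the first replica set)
  stat /var/bricks/0/brick
  getfattr -d -m . -e hex /var/bricks/0/brick

  # on node2 (the arbiter brick that stays empty)
  stat /var/bricks/arb_0/brick
  getfattr -d -m . -e hex /var/bricks/arb_0/brick

  # while heals are pending, the data bricks will typically show non-zero
  # trusted.afr.gv0-client-* counters for the arbiter; every brick root
  # should at least carry the same trusted.glusterfs.volume-id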