<div dir="ltr"><div>The heal info is working fine. Here is what's happening:</div><div>When a node goes down, changes can't be written to it. The nodes that stayed up receive those changes and keep track of which files were changed <br></div><div>(note: these changes haven't yet been applied to the node that was down). Once the downed node comes back up, <br></div><div>it doesn't know what happened while it was down. But the other nodes know that a few changes didn't make it to the rebooted node. <br></div><div>So the other nodes blame the rebooted node. This is what heal info shows. The rebooted node doesn't hold any change that went to that node alone,</div><div>so it reports 0 files to be healed, while the other nodes, which have the data, list the files that need healing.</div><div>This is the expected behavior.</div><div>So the heal info reported by the rebooted node is correct.<br></div><div><br></div><div>About healing the file itself:<br></div><div>By design, performing an operation on a file triggers a client-side heal. That's why these files get corrected after the md5sum (I hope this is run from the client side, not on the backend itself).</div><div>So this is expected. <br></div><div>As for heals not happening for a long time, there may be an issue there. <br></div><div><a class="gmail_plusreply" id="plusReplyChip-0" href="mailto:ksubrahm@redhat.com" tabindex="-1">@Karthik Subrahmanya</a> is a better person to help you with this.</div><div><br></div><div>About the higher CPU usage:</div><div>We need information about what is consuming the CPU.</div><div>Glusterd needs to do a bit of handshaking and connecting after a reboot. 
During this, a little bit of data is transferred as well.</div><div>A larger number of nodes can contribute to the spike.</div><div>Similarly, if a heal is in progress, it can increase the usage.</div><div>So we need to know what is consuming the CPU to tell whether it's expected or not.</div><div>If this spike is expected, you can try using cgroups to restrict the CPU usage of that particular process.<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 28, 2020 at 3:02 AM Artem Russakovskii <<a href="mailto:archon810@gmail.com">archon810@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Good news, after upgrading to 5.13 and running this scenario again, the self heal actually succeeded without my intervention following a server reboot.<div><br></div><div>The load was still high during this process, but at least the endless heal issue is resolved.</div><div><br></div><div>I'd still love to hear from the team on managing heal load spikes.<br clear="all"><div><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><br>Sincerely,<br>Artem<br><br>--<br>Founder, <a href="http://www.androidpolice.com" target="_blank">Android Police</a>, <a href="http://www.apkmirror.com/" style="font-size:12.8px" target="_blank">APK Mirror</a><span style="font-size:12.8px">, Illogical Robot LLC</span></div><div dir="ltr"><a href="http://beerpla.net/" target="_blank">beerpla.net</a> | <a href="http://twitter.com/ArtemR" target="_blank">@ArtemR</a><br></div></div></div></div></div></div></div></div></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Apr 26, 2020 at 3:13 PM Artem Russakovskii <<a href="mailto:archon810@gmail.com" target="_blank">archon810@gmail.com</a>> wrote:<br></div><blockquote 
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi all,<br></div><div><br></div><div>I've been observing this problem for a long time now and it's time to finally figure out what's going on.</div><div><br></div><div>We're running gluster 5.11 and have a 10TB 1 x 4 = 4 replicate volume. I'll include its slightly redacted config below.</div><div><br></div><div>When I reboot one of the servers and it goes offline for a bit, when it comes back, heal info tells me there are some files and dirs that are "heal pending". 0 "split-brain" and "possibly healing" - only "heal pending" are >0.</div><div><ol><li>For some reason, the server that was rebooted shows "heal pending" 0. All other servers show "heal pending" with some number, say 65.</li><li>We have cluster.self-heal-daemon enabled.</li><li>The logs are full of "performing entry selfheal" and "completed entry selfheal" messages that continue to print endlessly.</li><li>This "heal pending" number never goes down by itself, but it does if I run some operation on it, like md5sum.</li><li>When the server goes down for reboot and especially when it comes back, the load on ALL servers shoots up through the roof (load of 100+) and ends up bringing everything down, including apache and nginx. My theory is that self-heal kicks in so hard that it kills IO on these attached Linode block devices. However, after some time - say 10 minutes - the load subsides, but the "heal pending" remains and the gluster logs continue to output
"performing entry selfheal" and "completed entry selfheal" messages. This load spike has become a huge issue for us because it brings down the whole site for entire minutes.</li><li>At this point in my investigation, I noticed that the selfheal messages actually repeat for the same gfids over and over.<br><font face="monospace">[2020-04-26 21:32:29.877987] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc<br>[2020-04-26 21:32:29.901246] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2<br>[2020-04-26 21:32:32.171959] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc<br>[2020-04-26 21:32:32.225828] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2<br>[2020-04-26 21:32:33.346990] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc<br>[2020-04-26 21:32:33.374413] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2</font></li><li>I used gfid-resolver.sh from <a href="https://gist.github.com/4392640.git" target="_blank">https://gist.github.com/4392640.git</a> to resolve this gfid to the real location and yup - it was one of the files (a dir actually) listed as "heal pending" in heal info. As soon as I ran md5sum on the file inside (which was also listed in "heal pending"), the log messages stopped repeating for this entry and it disappeared from "heal pending" heal info. 
These were the final log lines:<br><font face="monospace">[2020-04-26 21:32:35.642662] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-SNIP_data1-replicate-0: performing metadata selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc<br>[2020-04-26 21:32:35.658714] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed metadata selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources=0 [1] 2 sinks=3<br>[2020-04-26 21:32:35.686509] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc<br>[2020-04-26 21:32:35.720387] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources=0 [1] 2 sinks=3</font></li></ol></div><div>I have to repeat this song and dance every time I reboot servers and run md5sum on each "heal pending" file or else the messages will continue presumably indefinitely. In the meantime, the files seem to be fine when accessed.</div><div><br></div><div>What I don't understand is:</div><div><ol><li>Why doesn't
gluster
just heal them properly instead of getting stuck? Or maybe this was fixed in v6 or v7, which I haven't upgraded to due to waiting for another unrelated issue to be fixed?</li><li>Why does heal info show 0 "heal pending" files on the server that was rebooted, but all other servers show the same number of "heal pending" entries >0?</li><li>Why are there these insane load spikes upon going down and especially coming back online? Is it related to the issue here? I'm pretty sure that it didn't happen in previous versions of gluster, when this issue didn't manifest - I could easily bring down one of the servers without it creating havoc when it comes back online.</li></ol></div><div>Here's the volume info:<br></div><blockquote style="margin:0px 0px 0px 40px;border:medium none;padding:0px"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Volume Name: SNIP_data1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Type: Replicate</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Status: Started</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Snapshot Count: 0</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Number of Bricks: 1 x 4 = 4</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Transport-type: tcp</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">Bricks:</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Brick1: SNIP:/mnt/SNIP_block1/SNIP_data1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Brick2: SNIP:/mnt/SNIP_block1/SNIP_data1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Brick3: SNIP:/mnt/SNIP_block1/SNIP_data1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Brick4: SNIP:/mnt/SNIP_block1/SNIP_data1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Options Reconfigured:</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.client-io-threads: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">nfs.disable: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">transport.address-family: inet</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.self-heal-daemon: enable</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.cache-size: 1GB</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.lookup-optimize: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">performance.read-ahead: off</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">client.event-threads: 4</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">server.event-threads: 4</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.io-thread-count: 32</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.readdir-optimize: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">features.cache-invalidation: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">features.cache-invalidation-timeout: 600</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.stat-prefetch: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.cache-invalidation: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.md-cache-timeout: 600</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">network.inode-lru-limit: 500000</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.parallel-readdir: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">performance.readdir-ahead: on</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">performance.rda-cache-limit: 256MB</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">network.remote-dio: enable</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">network.ping-timeout: 5</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.quorum-type: fixed</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.quorum-count: 1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.granular-entry-heal: enable</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">cluster.data-self-heal-algorithm: full</blockquote></blockquote><div><br></div><div>Appreciate any insight. Thank you.</div><div><div><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><br>Sincerely,<br>Artem<br><br>--<br>Founder, <a href="http://www.androidpolice.com" target="_blank">Android Police</a>, <a href="http://www.apkmirror.com/" style="font-size:12.8px" target="_blank">APK Mirror</a><span style="font-size:12.8px">, Illogical Robot LLC</span></div><div dir="ltr"><a href="http://beerpla.net/" target="_blank">beerpla.net</a> | <a href="http://twitter.com/ArtemR" target="_blank">@ArtemR</a><br></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>
</blockquote></div>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature">Regards,<br>Hari Gowtham.</div></div>