<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 1 July 2018 at 22:37, Ashish Pandey <span dir="ltr">&lt;<a href="mailto:aspandey@redhat.com" target="_blank">aspandey@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="font-family: times\ new\ roman, new\ york, times, serif; font-size: 12pt; color: rgb(0, 0, 0);"><div><br></div><div>The only problem at the moment is that the arbiter brick is offline. You should focus only on completing maintenance of the arbiter brick ASAP.<br></div><div>Bring this brick up, start a full heal or index heal, and the volume will be in a healthy state.<br></div></div></div></blockquote><div><br></div><div>Doesn&#39;t the arbiter only resolve split-brain situations? None of the files that have been marked for healing are marked as being in split-brain.<br><br></div><div>The arbiter has now been brought back up; however, the problem continues.<br><br></div><div>I&#39;ve found the following information in the client log:<br><br>[2018-07-03 19:09:29.245089] W [MSGID: 108008] [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for &lt;gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2&gt;/hosted-engine.metadata 5e95ba8c-2f12-49bf-be2d-b4baf210d366 on engine-client-1 and b9cd7613-3b96-415d-a549-1dc788a4f94d on engine-client-0<br>[2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 10430040: LOOKUP() /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata =&gt; -1 (Input/output error)<br>[2018-07-03 19:09:30.619000] W [MSGID: 108008] [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for &lt;gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2&gt;/hosted-engine.lockspace 8e86902a-c31c-4990-b0c5-0318807edb8f on engine-client-1 and e5899a4c-dc5d-487e-84b0-9bbc73133c25 on 
engine-client-0<br>[2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 10430656: LOOKUP() /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace =&gt; -1 (Input/output error)<br></div><br></div><div class="gmail_quote">As you can see from the logs I posted previously, neither of those two files, on either of the two servers, have any of gluster&#39;s extended attributes set.<br><br>The arbiter doesn&#39;t have any record of the files in question, as they were created after it went offline.<br><br></div><div class="gmail_quote">How do I fix this? Is it possible to locate the correct gfids somewhere &amp; redefine them on the files manually?<br><br></div><div class="gmail_quote">Cheers,<br></div><div class="gmail_quote"> Doug<br></div><div class="gmail_quote"><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="font-family: times\ new\ roman, new\ york, times, serif; font-size: 12pt; color: rgb(0, 0, 0);"><hr id="gmail-m_-7061863707512342433zwchr"><div style="color:rgb(0,0,0);font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt"><b>From: </b>&quot;Gambit15&quot; &lt;<a href="mailto:dougti%2Bgluster@gmail.com" target="_blank">dougti+gluster@gmail.com</a>&gt;<br><b>To: </b>&quot;Ashish Pandey&quot; &lt;<a href="mailto:aspandey@redhat.com" target="_blank">aspandey@redhat.com</a>&gt;<br><b>Cc: </b>&quot;gluster-users&quot; &lt;<a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>&gt;<br><b>Sent: </b>Monday, July 2, 2018 1:45:01 AM<br><b>Subject: </b>Re: [Gluster-users] Files not healing &amp; missing their extended attributes - Help!<div><div class="gmail-h5"><br><div><br></div><div dir="ltr"><div>Hi Ashish,<br><div><br></div></div><div>The output is below. It&#39;s a rep 2+1 volume. 
The arbiter is offline for maintenance at the moment; however, quorum is met &amp; no files are reported as being in split-brain (it hosts VMs, so files aren&#39;t accessed concurrently).<br></div><div><br>======================<br>[root@v0 glusterfs]# gluster volume info engine<br><div><br></div>Volume Name: engine<br>Type: Replicate<br>Volume ID: 279737d3-3e5a-4ee9-8d4a-<wbr>97edcca42427<br>Status: Started<br>Snapshot Count: 0<br>Number of Bricks: 1 x (2 + 1) = 3<br>Transport-type: tcp<br>Bricks:<br>Brick1: s0:/gluster/engine/brick<br>Brick2: s1:/gluster/engine/brick<br>Brick3: s2:/gluster/engine/arbiter (arbiter)<br>Options Reconfigured:<br>nfs.disable: on<br>performance.readdir-ahead: on<br>transport.address-family: inet<br>performance.quick-read: off<br>performance.read-ahead: off<br>performance.io-cache: off<br>performance.stat-prefetch: off<br>cluster.eager-lock: enable<br>network.remote-dio: enable<br>cluster.quorum-type: auto<br>cluster.server-quorum-type: server<br>storage.owner-uid: 36<br>storage.owner-gid: 36<br>performance.low-prio-threads: 32<br><div><br></div>====================== <br><div><br></div>[root@v0 glusterfs]# gluster volume heal engine info<br>Brick s0:/gluster/engine/brick<br>/__DIRECT_IO_TEST__<br>/98495dbc-a29c-4893-b6a0-<wbr>0aa70860d0c9/ha_agent<br></div><div><div><div>/98495dbc-a29c-4893-b6a0-<wbr>0aa70860d0c9<br> &lt;LIST TRUNCATED FOR BREVITY&gt; <br>Status: Connected<br>Number of entries: 34<br><div><br></div>Brick s1:/gluster/engine/brick<br> &lt;SAME AS ABOVE - TRUNCATED FOR BREVITY&gt; <br>Status: Connected<br>Number of entries: 34<br><div><br></div>Brick s2:/gluster/engine/arbiter<br>Status: Transport endpoint is not connected<br>Number of entries: -<br><div><br></div>======================<br>=== PEER V0 ===<br><div><br></div>[root@v0 glusterfs]# getfattr -m . 
-d -e hex /gluster/engine/brick/<wbr>98495dbc-a29c-4893-b6a0-<wbr>0aa70860d0c9/ha_agent<br>getfattr: Removing leading &#39;/&#39; from absolute path names<br># file: gluster/engine/brick/98495dbc-<wbr>a29c-4893-b6a0-0aa70860d0c9/<wbr>ha_agent<br>security.selinux=<wbr>0x73797374656d5f753a6f626a6563<wbr>745f723a756e6c6162656c65645f74<wbr>3a733000<br>trusted.afr.dirty=<wbr>0x000000000000000000000000<br>trusted.afr.engine-client-2=<wbr>0x0000000000000000000024e8<br>trusted.gfid=<wbr>0xdb9afb92d2bc49ed8e34dcd437ba<wbr>7be2<br>trusted.glusterfs.dht=<wbr>0x000000010000000000000000ffff<wbr>ffff<br><div><br></div>[root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/<wbr>98495dbc-a29c-4893-b6a0-<wbr>0aa70860d0c9/ha_agent/*<br>getfattr: Removing leading &#39;/&#39; from absolute path names<br># file: gluster/engine/brick/98495dbc-<wbr>a29c-4893-b6a0-0aa70860d0c9/<wbr>ha_agent/hosted-engine.<wbr>lockspace<br>security.selinux=<wbr>0x73797374656d5f753a6f626a6563<wbr>745f723a6675736566735f743a7330<wbr>00<br><div><br></div># file: gluster/engine/brick/98495dbc-<wbr>a29c-4893-b6a0-0aa70860d0c9/<wbr>ha_agent/hosted-engine.<wbr>metadata<br>security.selinux=<wbr>0x73797374656d5f753a6f626a6563<wbr>745f723a6675736566735f743a7330<wbr>00 <br><div><br></div></div><div>=== PEER V1 ===<br></div><div><br>[root@v1 glusterfs]# getfattr -m . 
-d -e hex /gluster/engine/brick/<wbr>98495dbc-a29c-4893-b6a0-<wbr>0aa70860d0c9/ha_agent<br>getfattr: Removing leading &#39;/&#39; from absolute path names<br># file: gluster/engine/brick/98495dbc-<wbr>a29c-4893-b6a0-0aa70860d0c9/<wbr>ha_agent<br>security.selinux=<wbr>0x73797374656d5f753a6f626a6563<wbr>745f723a756e6c6162656c65645f74<wbr>3a733000<br>trusted.afr.dirty=<wbr>0x000000000000000000000000<br>trusted.afr.engine-client-2=<wbr>0x0000000000000000000024ec<br>trusted.gfid=<wbr>0xdb9afb92d2bc49ed8e34dcd437ba<wbr>7be2<br>trusted.glusterfs.dht=<wbr>0x000000010000000000000000ffff<wbr>ffff<br><div><br></div>======================<br><div><br></div>cmd_history.log-20180701:<br><div><br></div>[2018-07-01 03:11:38.461175]  : volume heal engine full : SUCCESS<br>[2018-07-01 03:11:51.151891]  : volume heal data full : SUCCESS<br><div><br></div>glustershd.log-20180701:<br></div><div>&lt;LOGS FROM 06/01 TRUNCATED&gt;<br></div><div>[2018-07-01 07:15:04.779122] I [MSGID: 100011] [glusterfsd.c:1396:<wbr>reincarnate] 0-glusterfsd: Fetching the volume file from server... 
<br><div><br></div>glustershd.log:<br>[2018-07-01 07:15:04.779693] I [glusterfsd-mgmt.c:1596:mgmt_<wbr>getspec_cbk] 0-glusterfs: No change in volfile, continuing<br></div><div><br></div><div>That&#39;s the *only* message in glustershd.log today.<br></div><div><br> ====================== <br><div><br></div>[root@v0 glusterfs]# gluster volume status engine<br>Status of volume: engine<br>Gluster process                       <wbr>      TCP Port  RDMA Port  Online  Pid<br>------------------------------<wbr>------------------------------<wbr>------------------<br>Brick s0:/gluster/engine/brick      <wbr>        49154     0          Y       2816<br>Brick s1:/gluster/engine/brick      <wbr>        49154     0          Y       3995<br>Self-heal Daemon on localhost               N/A       N/A        Y       2919<br>Self-heal Daemon on s1                      N/A       N/A        Y       4013<br><div><br></div>Task Status of Volume engine<br>------------------------------<wbr>------------------------------<wbr>------------------<br>There are no active volume tasks<br><div><br></div>====================== <br><div><br></div></div><div>Okay, so actually only the directory ha_agent is listed for healing (not its contents), &amp; that does have attributes set.<br><div><br></div></div><div>Many thanks for the reply!<br><div><br></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 1 July 2018 at 15:34, Ashish Pandey <span dir="ltr">&lt;<a href="mailto:aspandey@redhat.com" target="_blank">aspandey@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="font-family: times\ new\ roman, new\ york, times, serif; font-size: 12pt; color: rgb(0, 0, 0);"><div>You have not mentioned the volume type or configuration, and fixing this issue will require a lot of other information.<br></div><div><br></div><div>1 - What is the type of 
volume and config.<br></div><div>2 - Provide the gluster v &lt;volname&gt; info output<br></div><div>3 - Heal info output<br></div><div>4 - getxattr output of one of the files that needs healing, from all the bricks.<br></div><div>5 - What led to the files needing healing?<br></div><div>6 - gluster v &lt;volname&gt; status<br></div><div>7 - glustershd.log output just after you run a full heal or index heal<br></div><div><br></div><div>----<br></div><div>Ashish<br></div><div><br></div><hr id="gmail-m_-7061863707512342433m_-4349002051472701379zwchr"><div style="color:rgb(0,0,0);font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt"><b>From: </b>&quot;Gambit15&quot; &lt;<a href="mailto:dougti%2Bgluster@gmail.com" target="_blank">dougti+gluster@gmail.com</a>&gt;<br><b>To: </b>&quot;gluster-users&quot; &lt;<a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>&gt;<br><b>Sent: </b>Sunday, July 1, 2018 11:50:16 PM<br><b>Subject: </b>[Gluster-users] Files not healing &amp; missing their extended attributes - Help!<div><div class="gmail-m_-7061863707512342433h5"><br><div><br></div><div dir="ltr"><div><div><div><div><div><div>Hi Guys,<br></div> I had to restart our datacenter yesterday, but since doing so a number of the files on my gluster share have been stuck, marked as healing. After no signs of progress, I manually set off a full heal last night, but after 24hrs, nothing&#39;s happened.<br></div><br>The gluster logs all look normal, and there are no messages about failed connections or heal processes kicking off.<br><div><br></div></div>I checked the listed files&#39; extended attributes on their bricks today, and they only show the selinux attribute. 
There&#39;s none of the trusted.* attributes I&#39;d expect.<br></div>The healthy files on the bricks do have their extended attributes though.<br><div><br></div></div>I&#39;m guessing that perhaps the files somehow lost their attributes, and gluster is no longer able to work out what to do with them? It&#39;s not logged any errors, warnings, or anything else out of the normal though, so I&#39;ve no idea what the problem is or how to resolve it.<br><div><br></div></div>I&#39;ve got 16 hours to get this sorted before the start of work, Monday. Help!<br></div><br></div></div>______________________________<wbr>_________________<br>Gluster-users mailing list<br><a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br></div><div><br></div></div></div></blockquote></div><br></div></div></div></div><div><br></div></div></div></blockquote></div><br></div></div>
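<div>For anyone comparing the outputs in this thread: the trusted.gfid value printed by getfattr -e hex and the gfid in the GFID mismatch warnings are the same 16-byte identifier in two notations. A minimal Python sketch, assuming only the standard library (the helper name gfid_xattr_to_uuid is made up for illustration and is not part of gluster), converts the hex form into the UUID form so the ha_agent directory&#39;s xattr can be matched against the parent gfid in the client log:</div>
<pre>
```python
import uuid


def gfid_xattr_to_uuid(hex_value):
    """Convert a trusted.gfid value as printed by `getfattr -e hex` into UUID notation."""
    # Strip the leading "0x" that getfattr adds in hex mode, if present.
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    # uuid.UUID validates that exactly 32 hex digits remain.
    return str(uuid.UUID(hex=digits))


# trusted.gfid of the ha_agent directory, copied from the getfattr output in this thread.
print(gfid_xattr_to_uuid("0xdb9afb92d2bc49ed8e34dcd437ba7be2"))
```
</pre>
<div>This only translates between notations. It does not say which of the two mismatching gfids on the files is the right one.</div>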