<div dir="ltr"><div><div>Hi Richard,<br><br></div>Thanks for the informations. As you said there is gfid mismatch for the file.<br></div><div>On brick-1 &amp; brick-2 the gfids are same &amp; on brick-3 the gfid is different.<br>This is not considered as split-brain because we have two good copies here.</div><div>Gluster 3.10 does not have a method to resolve this situation other than the<br>manual intervention [1]. Basically what you need to do is remove the file and<br>the gfid hardlink from brick-3 (considering brick-3 entry as bad). Then when<br>you do a lookup for the file from mount it will recreate the entry on the other brick.<br></div><br><div>Form 3.12 we have methods to resolve this situation with the cli option [2] and<br></div><div>with favorite-child-policy [3]. For the time being you can use [1] to resolve this<br></div><div>and if you can consider upgrading to 3.12 that would give you options to handle<br></div><div>these scenarios.<br><br>[1] <a href="http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain">http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain</a><br>[2] <a href="https://review.gluster.org/#/c/17485/">https://review.gluster.org/#/c/17485/</a><br>[3] <a href="https://review.gluster.org/#/c/16878/">https://review.gluster.org/#/c/16878/</a><br><br></div><div>HTH,<br></div><div>Karthik<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 26, 2017 at 12:40 PM, Richard Neuboeck <span dir="ltr">&lt;<a href="mailto:hawk@tbi.univie.ac.at" target="_blank">hawk@tbi.univie.ac.at</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Karthik,<br>

HTH,
Karthik

On Thu, Oct 26, 2017 at 12:40 PM, Richard Neuboeck <hawk@tbi.univie.ac.at> wrote:

Hi Karthik,

thanks for taking a look at this. I'm not working with gluster long
enough to make heads or tails out of the logs. The logs are attached to
this mail and here is the other information:

# gluster volume info home

Volume Name: home
Type: Replicate
Volume ID: fe6218ae-f46b-42b3-a467-5fc6a36ad48a
Status: Started
Snapshot Count: 1
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: sphere-six:/srv/gluster_home/brick
Brick2: sphere-five:/srv/gluster_home/brick
Brick3: sphere-four:/srv/gluster_home/brick
Options Reconfigured:
features.barrier: disable
cluster.quorum-type: auto
cluster.server-quorum-type: server
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-samba-metadata: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 90000
performance.cache-size: 1GB
performance.client-io-threads: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
cluster.server-quorum-ratio: 51%

[root@sphere-four ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
getfattr: Removing leading '/' from absolute path names
# file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x020000000000000059df20a40006f989
trusted.gfid=0xda1c94b1643544b18d5b6f4654f60bf5
trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001

[root@sphere-five ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
getfattr: Removing leading '/' from absolute path names
# file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.home-client-4=0x000000010000000100000000
trusted.bit-rot.version=0x020000000000000059df1f310006ce63
trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001

[root@sphere-six ~]# getfattr -d -e hex -m . /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
getfattr: Removing leading '/' from absolute path names
# file: srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.home-client-4=0x000000010000000100000000
trusted.bit-rot.version=0x020000000000000059df11cd000548ec
trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=0x0000000000009a000000000000000001
trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x00000001

Cheers
Richard

On 26.10.17 07:41, Karthik Subrahmanya wrote:
> Hey Richard,
>
> Could you share the following information please?
> 1. gluster volume info <volname>
> 2. getfattr output of that file from all the bricks
>     getfattr -d -e hex -m . <brickpath/filepath>
> 3. glustershd & glfsheal logs
>
> Regards,
> Karthik
>
> On Thu, Oct 26, 2017 at 10:21 AM, Amar Tumballi <atumball@redhat.com> wrote:
>
>     On a side note, try the recently released health report tool and see
>     if it diagnoses any issues in the setup. Currently you may have to
>     run it on all three machines.
>
>     On 26-Oct-2017 6:50 AM, "Amar Tumballi" <atumball@redhat.com> wrote:
>
>         Thanks for this report. This week many of the developers are at
>         Gluster Summit in Prague; we will be checking this and responding
>         next week. Hope that's fine.
>
>         Thanks,
>         Amar
>
>         On 25-Oct-2017 3:07 PM, "Richard Neuboeck" <hawk@tbi.univie.ac.at> wrote:
>
>             Hi Gluster Gurus,
>
>             I'm using a gluster volume as home for our users. The volume is
>             replica 3, running on CentOS 7, gluster version 3.10
>             (3.10.6-1.el7.x86_64). Clients are running Fedora 26 and also
>             gluster 3.10 (3.10.6-3.fc26.x86_64).
>
>             During the data backup I got an I/O error on one file. Manually
>             checking for this file on a client confirms this:
>
>             ls -l
>             romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/
>             ls: cannot access
>             'romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4':
>             Input/output error
>             total 2015
>             -rw-------. 1 romanoch tbi 998211 Sep 15 18:44 previous.js
>             -rw-------. 1 romanoch tbi  65222 Oct 17 17:57 previous.jsonlz4
>             -rw-------. 1 romanoch tbi 149161 Oct  1 13:46 recovery.bak
>             -?????????? ? ?        ?        ?            ? recovery.baklz4
>
>             Out of curiosity I checked all the bricks for this file. It's
>             present there. Making a checksum shows that the file is
>             different on one of the three replica servers.
>
>             Querying healing information shows that the file should be
>             healed:
>             # gluster volume heal home info
>             Brick sphere-six:/srv/gluster_home/brick
>             /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>             Status: Connected
>             Number of entries: 1
>
>             Brick sphere-five:/srv/gluster_home/brick
>             /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>             Status: Connected
>             Number of entries: 1
>
>             Brick sphere-four:/srv/gluster_home/brick
>             Status: Connected
>             Number of entries: 0
>
>             Manually triggering heal doesn't report an error but also does
>             not heal the file.
>             # gluster volume heal home
>             Launching heal operation to perform index self heal on volume
>             home has been successful
>
>             Same with a full heal:
>             # gluster volume heal home full
>             Launching heal operation to perform full self heal on volume
>             home has been successful
>
>             According to the split brain query that's not the problem:
>             # gluster volume heal home info split-brain
>             Brick sphere-six:/srv/gluster_home/brick
>             Status: Connected
>             Number of entries in split-brain: 0
>
>             Brick sphere-five:/srv/gluster_home/brick
>             Status: Connected
>             Number of entries in split-brain: 0
>
>             Brick sphere-four:/srv/gluster_home/brick
>             Status: Connected
>             Number of entries in split-brain: 0
>
>             I have no idea why this situation arose in the first place and
>             also no idea how to solve this problem. I would highly
>             appreciate any helpful feedback I can get.
>
>             The only mention in the logs matching this file is a rename
>             operation:
>             /var/log/glusterfs/bricks/srv-gluster_home-brick.log:[2017-10-23
>             09:19:11.561661] I [MSGID: 115061]
>             [server-rpc-fops.c:1022:server_rename_cbk] 0-home-server:
>             5266153: RENAME
>             /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.jsonlz4
>             (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.jsonlz4) ->
>             /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>             (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.baklz4), client:
>             romulus.tbi.univie.ac.at-11894-2017/10/18-07:06:07:206366-home-client-3-0-0,
>             error-xlator: home-posix [No data available]
>
>             I enabled directory quotas the same day this problem showed up,
>             but I'm not sure how quotas could have an effect like this
>             (unless maybe the limit is reached, but that's also not the
>             case).
>
>             Thanks again if anyone has an idea.
>             Cheers
>             Richard
>             --
>             /dev/null
>
>             _______________________________________________
>             Gluster-users mailing list
>             Gluster-users@gluster.org
>             http://lists.gluster.org/mailman/listinfo/gluster-users