<div dir="ltr"><div><div><div><div><div><div><div>Hey,<br><br></div>Can you give us the volume info output for this volume?<br></div>Why are you not able to get the xattrs from arbiter brick? It is the same way as you do it on data bricks.<br></div><div>The changelog xattrs are named trusted.afr.virt_images-<wbr>client-{1,2,3} in the getxattr outputs you have provided.<br></div><div>Did you do a remove-brick and add-brick any time? Otherwise it will be trusted.afr.virt_images-<wbr>client-{0,1,2} usually.<br></div><br></div>To overcome this scenario you can do what Ben Turner had suggested. Select the source copy and change the xattrs manually.<br></div>I am suspecting that it has hit the arbiter becoming source for data heal bug. But to confirm that we need the xattrs on the arbiter brick also.<br><br></div>Regards,<br></div>Karthik<br><div><div><div><div><div><div><div><br></div></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 21, 2017 at 9:55 AM, Ben Turner <span dir="ltr">&lt;<a href="mailto:bturner@redhat.com" target="_blank">bturner@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Here is the process for resolving split brain on replica 2:<br>

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/Recovering_from_File_Split-brain.html

It should be pretty much the same for replica 3; you change the xattrs with something like:

# setfattr -n trusted.afr.vol-client-0 -v 0x000000000000000100000000 /gfs/brick-b/a
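
Before changing anything, it can help to dump the current changelog xattrs on every brick (including the arbiter) so you can see which copy each brick is blaming. Something like this on each node should do it (this is the same getfattr invocation used further down in the thread, with the path adjusted to your brick):

# getfattr -d -m . -e hex /<path to brick>/path/to/file

Roughly speaking, a non-zero trusted.afr.<volume>-client-N value on a brick means that brick is holding pending changes against (i.e. blaming) the brick with index N.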

When I try to decide which copy to use I normally run things like:

# stat /<path to brick>/path/to/file

Check out the access and change times of the file on the back-end bricks. I normally pick the copy with the latest access/change times. I'll also check:

# md5sum /<path to brick>/path/to/file

Compare the hashes of the file on both bricks to see if the data actually differs. If the data is the same, it makes choosing the proper replica easier.
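
Once you have picked a source and adjusted the xattrs on the bad copy, something like the following should kick off a heal and let you confirm the entry clears (volume name taken from your output below):

# gluster volume heal virt_images
# gluster volume heal virt_images info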

Any idea how you got into this situation? Did you have a loss of network connectivity? I see you are using server-side quorum; maybe check the logs for any loss of quorum? I wonder if there was a loss of quorum and some sort of race condition was hit:

http://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#server-quorum-and-some-pitfalls

"Unlike in client-quorum where the volume becomes read-only when quorum is lost, loss of server-quorum in a particular node makes glusterd kill the brick processes on that node (for the participating volumes) making even reads impossible."

I wonder if the killing of brick processes could have led to some sort of race condition where writes were serviced on one brick / the arbiter and not the other?
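
To check whether server quorum was actually lost around the reboot, grepping the glusterd log on each node should be enough (the path assumes a default log location):

# grep -i quorum /var/log/glusterfs/glusterd.log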

If you can find a reproducer for this, please open a BZ with it. I have been seeing something similar (I think), but I haven't been able to run the issue down yet.
<span class="HOEnZb"><font color="#888888"><br>
-b<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
----- Original Message -----
> From: "Henrik Juul Pedersen" <hjp@liab.dk>
> To: gluster-users@gluster.org
> Cc: "Henrik Juul Pedersen" <henrik@corepower.dk>
> Sent: Wednesday, December 20, 2017 1:26:37 PM
> Subject: [Gluster-users] Gluster replicate 3 arbiter 1 in split brain. gluster cli seems unaware
>
> Hi,
>
> I have the following volume:
>
> Volume Name: virt_images
> Type: Replicate
> Volume ID: 9f3c8273-4d9d-4af2-a4e7-4cb4a51e3594
> Status: Started
> Snapshot Count: 2
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: virt3:/data/virt_images/brick
> Brick2: virt2:/data/virt_images/brick
> Brick3: printserver:/data/virt_images/brick (arbiter)
> Options Reconfigured:
> features.quota-deem-statfs: on
> features.inode-quota: on
> features.quota: on
> features.barrier: disable
> features.scrub: Active
> features.bitrot: on
> nfs.rpc-auth-allow: on
> server.allow-insecure: on
> user.cifs: off
> features.shard: off
> cluster.shd-wait-qlength: 10000
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> nfs.disable: on
> transport.address-family: inet
> server.outstanding-rpc-limit: 512
>
> After a server reboot (brick 1) a single file has become unavailable:
> # touch fedora27.qcow2
> touch: setting times of 'fedora27.qcow2': Input/output error
>
> Looking at the split-brain status from the client-side CLI:
> # getfattr -n replica.split-brain-status fedora27.qcow2
> # file: fedora27.qcow2
> replica.split-brain-status="The file is not under data or metadata
> split-brain"
>
> However, in the client-side log, a split brain is mentioned:
> [2017-12-20 18:05:23.570762] E [MSGID: 108008]
> [afr-transaction.c:2629:afr_write_txn_refresh_done]
> 0-virt_images-replicate-0: Failing SETATTR on gfid
> 7a36937d-52fc-4b55-a932-99e2328f02ba: split-brain observed.
> [Input/output error]
> [2017-12-20 18:05:23.576046] W [MSGID: 108027]
> [afr-common.c:2733:afr_discover_done] 0-virt_images-replicate-0: no
> read subvols for /fedora27.qcow2
> [2017-12-20 18:05:23.578149] W [fuse-bridge.c:1153:fuse_setattr_cbk]
> 0-glusterfs-fuse: 182: SETATTR() /fedora27.qcow2 => -1 (Input/output
> error)
>
> = Server side
>
> No mention of a possible split brain:
> # gluster volume heal virt_images info split-brain
> Brick virt3:/data/virt_images/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick virt2:/data/virt_images/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick printserver:/data/virt_images/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> The info command shows the file:
> # gluster volume heal virt_images info
> Brick virt3:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
> Brick virt2:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
> Brick printserver:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
>
> The heal and heal full commands do nothing, and I can't find
> anything in the logs about them trying and failing to fix the file.
>
> Trying to manually resolve the split brain from the CLI gives the following:
> # gluster volume heal virt_images split-brain source-brick
> virt3:/data/virt_images/brick /fedora27.qcow2
> Healing /fedora27.qcow2 failed: File not in split-brain.
> Volume heal failed.
>
> The attrs from virt2 and virt3 are as follows:
> [root@virt2 brick]# getfattr -d -m . -e hex fedora27.qcow2
> # file: fedora27.qcow2
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.virt_images-client-1=0x000002280000000000000000
> trusted.afr.virt_images-client-3=0x000000000000000000000000
> trusted.bit-rot.version=0x1d000000000000005a3aa0db000c6563
> trusted.gfid=0x7a36937d52fc4b55a93299e2328f02ba
> trusted.gfid2path.c076c6ac27a43012=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6665646f726132372e71636f7732
> trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00000000a49eb0000000000000000001
> trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
>
> # file: fedora27.qcow2
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.virt_images-client-2=0x000003ef0000000000000000
> trusted.afr.virt_images-client-3=0x000000000000000000000000
> trusted.bit-rot.version=0x19000000000000005a3a9f82000c382a
> trusted.gfid=0x7a36937d52fc4b55a93299e2328f02ba
> trusted.gfid2path.c076c6ac27a43012=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6665646f726132372e71636f7732
> trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00000000a2fbe0000000000000000001
> trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
>
> I don't know how to find similar information from the arbiter...
>
> Versions are the same on all three systems:
> # glusterd --version
> glusterfs 3.12.2
>
> # gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      31202
>
> I might try upgrading to version 3.13.0 tomorrow, but I want to hear
> you out first.
>
> How do I fix this? Do I have to manually change the file attributes?
>
> Also, in the guides for manual resolution through setfattr, all the
> bricks are listed with a "trusted.afr.<volume>-client-<brick>". But in
> my system (as can be seen above), I only see the other bricks. So
> which attributes should be changed into what?
>
>
>
> I hope someone might know a solution. If you need any more information
> I'll try and provide it. I can probably change the virtual machine to
> another image for now.
>
> Best regards,
> Henrik Juul Pedersen
> LIAB ApS
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users