[Gluster-users] GFID Mismatch - Automatic Correction ?

Joe Julian joe at julianfamily.org
Wed Jan 4 06:32:13 UTC 2017


Shouldn't that heal with an odd-man-out strategy? Or are all three GFIDs different?

On January 3, 2017 10:21:31 PM PST, Ravishankar N <ravishankar at redhat.com> wrote:
>
>On 01/04/2017 09:31 AM, Michael Ward wrote:
>>
>> Hey,
>>
>> To give some more context around the initial incident: these systems
>> are hosted in AWS. The gluster brick for each instance is a separate
>> volume from the root volume. On prod-gluster01 a couple of nights ago
>> we experienced massively high read IOPS on the root volume that we are
>> unable to account for (> 200,000 IOPS, when it usually sits between
>> 0 and 100 IOPS). The box became inaccessible as a result and, after
>> approximately 40 minutes with no sign of the IOPS reducing, it was
>> rebooted through the AWS console.
>>
>> The GFID mismatch problems appeared after that. There were initially 
>> ~50 impacted files, but I've fixed all but 1 of them now, which I'm 
>> leaving broken intentionally for further testing if required.
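>>
>> For reference, the usual documented manual fix for a GFID mismatch is
>> roughly the following (a sketch only - the path and GFID are the ones
>> from this incident, and the commands run on the brick holding the
>> stale copy, prod-gluster01 in this case):
>>
>> # remove the stale file and its .glusterfs hard link on the bad brick
>> rm /export/glus_brick0/brick/home/user/.viminfo
>> rm /export/glus_brick0/brick/.glusterfs/1b/86/1b86a5a7-6e88-4f40-be58-3fa33aa9a576
>> # then trigger a heal (or stat the file from a client mount)
>> gluster volume heal gv0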
>>
>> If you don't mind, could you have a look over the information below
>> and identify anything that looks like a problem? We obviously did
>> have a bunch of GFID-mismatched files, which based on your email
>> shouldn't happen.
>>
>> I've included everything I can think of, but if there is something 
>> else you would like to see, please let me know.
>>
>> # gluster volume info gv0
>>
>> Volume Name: gv0
>>
>> Type: Replicate
>>
>> Volume ID: 0ec7c49d-811c-4d4d-a3a9-e4ea9e83000c
>>
>> Status: Started
>>
>> Snapshot Count: 0
>>
>> Number of Bricks: 1 x (2 + 1) = 3
>>
>> Transport-type: tcp
>>
>> Bricks:
>>
>> Brick1: prod-gluster01.fqdn.com:/export/glus_brick0/brick
>>
>> Brick2: prod-gluster02.fqdn.com:/export/glus_brick0/brick
>>
>> Brick3: prod-gluster03.fqdn.com:/export/glus_brick0/brick (arbiter)
>>
>> Options Reconfigured:
>>
>> cluster.favorite-child-policy: none
>>
>> nfs.disable: on
>>
>> performance.readdir-ahead: on
>>
>> client.event-threads: 7
>>
>> server.event-threads: 3
>>
>> performance.cache-size: 256MB
>>
>> cluster.favorite-child-policy is set to none because I reverted the
>> change to majority after it made no difference.
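>>
>> (For reference, toggling that option is just the standard volume-set
>> command - a sketch of the sequence used:)
>>
>> gluster volume set gv0 cluster.favorite-child-policy majority
>> # no change in client behaviour, so revert:
>> gluster volume set gv0 cluster.favorite-child-policy none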
>>
>> [root at prod-gluster01 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user/.viminfo
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user/.viminfo
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.bit-rot.version=0x0200000000000000585756be00024333
>>
>> trusted.gfid=0x1b86a5a76e884f40be583fa33aa9a576
>>
>> [root at prod-gluster02 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user/.viminfo
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user/.viminfo
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-0=0x000000020000000100000000
>>
>> trusted.bit-rot.version=0x020000000000000058593aac000661fa
>>
>> trusted.gfid=0x4931b10977f34496a7cdf8f23809c372
>>
>> [root at prod-gluster03 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user/.viminfo
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user/.viminfo
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-0=0x000000020000000100000000
>>
>> trusted.bit-rot.version=0x020000000000000058585ed6000f2077
>>
>> trusted.gfid=0x4931b10977f34496a7cdf8f23809c372
>>
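>> A quick way to compare the GFID across all three bricks in one go (a
>> sketch - assumes ssh access to the gluster nodes, hostnames as above):
>>
>> for h in prod-gluster01 prod-gluster02 prod-gluster03; do
>>   echo -n "$h: "
>>   ssh "$h" "getfattr -n trusted.gfid -e hex \
>>       /export/glus_brick0/brick/home/user/.viminfo 2>/dev/null" \
>>     | grep trusted.gfid
>> done
>>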
>> Just in case it's useful, here is the getfattr for the parent directory:
>>
>> [root at prod-gluster01 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-1=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-2=0x000000000000000000000000
>>
>> trusted.gfid=0x0a49de7ee4f04aae9fc8a88378e0d193
>>
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root at prod-gluster02 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-0=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-2=0x000000000000000000000000
>>
>> trusted.gfid=0x0a49de7ee4f04aae9fc8a88378e0d193
>>
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root at prod-gluster03 glusterfs]# getfattr -d -m . -e hex 
>> /export/glus_brick0/brick/home/user
>>
>> getfattr: Removing leading '/' from absolute path names
>>
>> # file: export/glus_brick0/brick/home/user
>>
>>
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>
>> trusted.afr.dirty=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-0=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-1=0x000000000000000000000000
>>
>> trusted.afr.gv0-client-2=0x000000000000000000000000
>>
>> trusted.gfid=0x0a49de7ee4f04aae9fc8a88378e0d193
>>
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
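>>
>> To decode the trusted.afr.* values above: each one is three 32-bit
>> big-endian counters - pending data, metadata and entry operations
>> held against the named client/brick. A quick sketch, using the value
>> prod-gluster02 holds against client-0 for the .viminfo file:
>>
>> x=000000020000000100000000    # trusted.afr.gv0-client-0 minus the 0x
>> echo "data=$((16#${x:0:8})) metadata=$((16#${x:8:8})) entry=$((16#${x:16:8}))"
>> # -> data=2 metadata=1 entry=0  (02 and 03 blame 01; 01 blames nobody)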
>>
>> [root at prod-gluster01 bricks]# gluster volume heal gv0 info
>>
>> Brick prod-gluster01.fqdn.com:/export/glus_brick0/brick
>>
>> Status: Connected
>>
>> Number of entries: 0
>>
>> Brick prod-gluster02.fqdn.com:/export/glus_brick0/brick
>>
>> <gfid:4931b109-77f3-4496-a7cd-f8f23809c372>
>>
>> Status: Connected
>>
>> Number of entries: 1
>>
>> Brick prod-gluster03.fqdn.com:/export/glus_brick0/brick
>>
>> <gfid:4931b109-77f3-4496-a7cd-f8f23809c372>
>>
>> Status: Connected
>>
>> Number of entries: 1
>>
>> [root at prod-gluster01 bricks]# gluster volume heal gv0 info split-brain
>>
>> Brick prod-gluster01.fqdn.com:/export/glus_brick0/brick
>>
>> Status: Connected
>>
>> Number of entries in split-brain: 0
>>
>> Brick prod-gluster02.fqdn.com:/export/glus_brick0/brick
>>
>> Status: Connected
>>
>> Number of entries in split-brain: 0
>>
>> Brick prod-gluster03.fqdn.com:/export/glus_brick0/brick
>>
>> Status: Connected
>>
>> Number of entries in split-brain: 0
>>
>> Clients show this in the gluster.log:
>>
>> [2017-01-04 03:13:40.863695] W [MSGID: 108008] 
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] 
>> 0-gv0-replicate-0: GFID mismatch for 
>> <gfid:0a49de7e-e4f0-4aae-9fc8-a88378e0d193>/.viminfo 
>> 4931b109-77f3-4496-a7cd-f8f23809c372 on gv0-client-1 and 
>> 1b86a5a7-6e88-4f40-be58-3fa33aa9a576 on gv0-client-0
>>
>> [2017-01-04 03:13:40.867853] W [fuse-bridge.c:471:fuse_entry_cbk] 
>> 0-glusterfs-fuse: 13067223: LOOKUP() /home/user/.viminfo => -1 
>> (Input/output error)
>>
>> There's no mention of either of the GFIDs for the .viminfo file in
>> /var/log/gluster/*.log or in
>> /var/log/gluster/brick/export-glus_brick0-brick.log.
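>>
>> (To repeat that check, something like the following works - a sketch;
>> adjust the log path to the local install:)
>>
>> grep -r -e 4931b109-77f3-4496-a7cd-f8f23809c372 \
>>         -e 1b86a5a7-6e88-4f40-be58-3fa33aa9a576 /var/log/glusterfs/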
>>
>
>Thanks for the details, Michael. While it does look like a bug, I am
>not sure how we ended up in this state. Either the afr xattrs of the
>parent directory were cleared without a self-heal of .viminfo happening
>from gluster02 or 03 to 01, or they weren't set in the first place when
>the file was recreated on 02 and 03 while 01 was down. If you have some
>steps to re-create the issue, please raise a bug.
>
>Regards,
>Ravi
>
>
>> Thank you very much for your time,
>>
>> Michael Ward
>>
>> From: Ravishankar N [mailto:ravishankar at redhat.com]
>> Sent: Wednesday, 4 January 2017 12:21 PM
>> To: Michael Ward <Michael.Ward at melbourneit.com.au>; 
>> gluster-users at gluster.org
>> Subject: Re: [Gluster-users] GFID Mismatch - Automatic Correction ?
>>
>> On 01/04/2017 06:27 AM, Michael Ward wrote:
>>
>>     Hi,
>>
>>     We have a replicate gluster volume with 2 data nodes plus 1
>>     arbiter node, running gluster 3.8.5. Clients are also using 3.8.5.
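>>
>>     (For context, this is a standard arbiter volume - i.e. it was
>>     created roughly along these lines; a sketch, not the exact
>>     command used:)
>>
>>     gluster volume create gv0 replica 3 arbiter 1 \
>>         prod-gluster01.fqdn.com:/export/glus_brick0/brick \
>>         prod-gluster02.fqdn.com:/export/glus_brick0/brick \
>>         prod-gluster03.fqdn.com:/export/glus_brick0/brick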
>>
>>     One of the data nodes failed the other night, and whilst it was
>>     down, several files were replaced on the second data node /
>>     arbiter (and thus the filesystem path was linked to a new GFID).
>>
>>     When the broken node was restarted, these files were in a GFID
>>     mismatch state. I know how to manually correct them, but was
>>     wondering if there is an automated way?
>>
>> For resolving gfid split-brains, there is no automated way or
>> favorite-child policy. When you say 2 data + 1 arbiter, you are using
>> an actual arbiter volume, right? (As opposed to a replica 2 volume
>> plus a dummy node, which some people refer to as an arbiter for
>> server-quorum.) gfid split-brains should not occur on either
>> replica-3 or arbiter volumes with the steps you described.
>> Regards,
>> Ravi
>>
>>
>>     I thought the cluster.favorite-child-policy volume setting of
>>     majority would work, but it made no difference. Clients were
>>     still getting Input/output error when attempting to access those
>>     files.
>>
>>     Regards,
>>
>>     Michael Ward
>>
>>
>>
>>
>>     _______________________________________________
>>
>>     Gluster-users mailing list
>>
>>     Gluster-users at gluster.org
>>
>>     http://www.gluster.org/mailman/listinfo/gluster-users
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
