[Gluster-users] ... I was able to produce a split-brain...

Ravishankar N ravishankar at redhat.com
Thu Jan 29 04:19:20 UTC 2015


On 01/28/2015 10:58 PM, Ml Ml wrote:
> "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" is a binary file.
>
>
> Here is the output of gluster volume info:
> --------------------------------------------------------------------------------------
>
>
> [root@ovirt-node03 ~]# gluster volume info
>
> Volume Name: RaidVolB
> Type: Replicate
> Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt-node03.example.local:/raidvol/volb/brick
> Brick2: ovirt-node04.example.local:/raidvol/volb/brick
> Options Reconfigured:
> storage.owner-gid: 36
> storage.owner-uid: 36
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> user.cifs: disable
> nfs.disable: on
>
>
>
>
> [root@ovirt-node04 ~]#  gluster volume info
>
> Volume Name: RaidVolB
> Type: Replicate
> Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt-node03.example.local:/raidvol/volb/brick
> Brick2: ovirt-node04.example.local:/raidvol/volb/brick
> Options Reconfigured:
> nfs.disable: on
> user.cifs: disable
> auth.allow: *
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> storage.owner-uid: 36
> storage.owner-gid: 36
>
>
> Here is the getfattr output from node03 and node04:
> --------------------------------------------------------------------------------------
>
>
> [root@ovirt-node03 ~]# getfattr -d -m . -e hex
> /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> getfattr: Removing leading '/' from absolute path names
> # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>
>
>
> [root@ovirt-node04 ~]# getfattr -d -m . -e hex
> /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> getfattr: Removing leading '/' from absolute path names
> # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>

These xattrs indicate that there is no split-brain for the file; heal info
also shows 0 entries on both bricks.
Are you getting an I/O error when you read
"1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" from the mount?
If yes, is there a difference in file size between the two nodes? How about
the contents (check whether the md5sums match)?
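For example (a rough sketch only -- I am assuming the FUSE mount you pasted
below and the brick paths from your volume info; adjust as needed):

# read the file through the mount; an "Input/output error" here would
# point towards split-brain
dd if=/rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids of=/dev/null bs=1M

# then compare the two copies directly on the bricks, once on node03 and
# once on node04
stat -c '%s %Y' /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
md5sum /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
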
> Am I supposed to run those commands on the mounted volume?:
> --------------------------------------------------------------------------------------
> 127.0.0.1:RaidVolB on
> /rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB type fuse.glusterfs
> (rw,default_permissions,allow_other,max_read=131072)
>
>
> At the very beginning I tried removing the file with "rm
> /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids",
> hoping Gluster would then fix itself somehow :)
> It was gone, but it seems to be back again. I don't know if this is any help.
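
That is more or less expected, by the way: every file on a brick also has a
hard link under the brick's .glusterfs directory named after its gfid, and a
heal (or simply a lookup from the mount) can bring the file back from the
other replica. If you want to check, something like this should show it
(the .glusterfs path below is derived from the gfid in your getfattr output,
so treat it as a sketch):

# on node03's brick; a link count of 2 = the file plus its gfid hard link
stat /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
ls -li /raidvol/volb/brick/.glusterfs/1c/15/1c15d0cb-1cca-4627-841c-395f7b712f73

In any case, please avoid deleting files directly on the bricks; if it ever
becomes necessary, do it from the mount.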
>
>
> Here is gluster volume heal RaidVolB info on both nodes:
> --------------------------------------------------------------------------------------
>
> [root@ovirt-node03 ~]#  gluster volume heal RaidVolB info
> Brick ovirt-node03.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> Brick ovirt-node04.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
>
> [root@ovirt-node04 ~]#  gluster volume heal RaidVolB info
> Brick ovirt-node03.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> Brick ovirt-node04.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
>
> Thanks a lot,
> Mario
>
>
>
> On Wed, Jan 28, 2015 at 4:57 PM, Ravishankar N <ravishankar at redhat.com> wrote:
>> On 01/28/2015 08:34 PM, Ml Ml wrote:
>>> Hello Ravi,
>>>
>>> thanks a lot for your reply.
>>>
>>> The data on ovirt-node03 is the copy I want to keep.
>>>
>>> Here is the info collected by following the howto:
>>>
>>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>>
>>>
>>>
>>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info split-brain
>>> Gathering list of split brain entries on volume RaidVolB has been
>>> successful
>>>
>>> Brick ovirt-node03.example.local:/raidvol/volb/brick
>>> Number of entries: 0
>>>
>>> Brick ovirt-node04.example.local:/raidvol/volb/brick
>>> Number of entries: 14
>>> at                    path on brick
>>> -----------------------------------
>>> 2015-01-27 17:33:00 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:34:01 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:35:04 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:36:05 <gfid:cd411b57-6078-4f3c-80d1-0ac1455186a6>/ids
>>> 2015-01-27 17:37:06 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:37:07 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:38:08 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:38:21 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:39:22 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:40:23 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:41:24 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:42:25 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:43:26 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:44:27 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>>
>>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info
>>> Brick ovirt-node03.example.local:/raidvol/volb/brick/
>>> Number of entries: 0
>>>
>>> Brick ovirt-node04.example.local:/raidvol/volb/brick/
>>> Number of entries: 0
>>
>> Hi Mario,
>> Is "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" a file or a directory?
>> Whatever it is, it should be shown in the output of the heal info / heal info
>> split-brain commands on both nodes, but in the output above it is listed
>> only under the node04 brick.
>> Also, heal info is showing zero entries for both bricks, which is strange.
>>
>> Are node03 and node04 bricks of the same replica pair? Can you share the
>> output of 'gluster volume info' for RaidVolB?
>> How did you infer that there is a split-brain? Does accessing the file(s)
>> from the mount give an input/output error?
>>
>>> [root@ovirt-node03 ~]# getfattr -d -m . -e hex
>>> /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
>>> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>> What is the getfattr output of this file on the other brick? The AFR-specific
>> xattrs being all zeros certainly do not indicate a split-brain.
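
(For comparison: in a real data split-brain each brick typically blames the
other, i.e. the data part -- the first 8 hex digits of the changelog -- is
non-zero on both sides. Purely as an illustration, with made-up values, it
would look something like

on node03's brick:  trusted.afr.RaidVolB-client-1=0x000000120000000000000000
on node04's brick:  trusted.afr.RaidVolB-client-0=0x0000000a0000000000000000

where the 24 hex digits are the data / metadata / entry changelogs, 8 digits
each.)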
>>
>>> The "Resetting the relevant changelogs to resolve the split-brain: "
>>> part of the howto is now a little complictaed. Do i have a data or
>>> meta split brain now?
>>> I guess i have a data split brain in my case, right?
>>>
>>> What are my next setfattr commands nowin my case if i want to keep the
>>> data from node03?
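
(To answer this part for completeness: if the xattrs really showed a data
split-brain -- non-zero data counters on both bricks -- and you wanted to keep
node03's copy, the howto boils down to clearing the changelog that blames
node03 on node04's brick and then triggering a heal. Roughly, run on node04
(a sketch only; node03 is client-0 here because it is Brick1 in your volume
info):

setfattr -n trusted.afr.RaidVolB-client-0 -v 0x000000000000000000000000 /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
gluster volume heal RaidVolB

But since your xattrs are already all zeros, there is nothing to reset for
this file.)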
>>>
>>> Thanks a lot!
>>>
>>> Mario
>>>
>>>
>>> On Wed, Jan 28, 2015 at 9:44 AM, Ravishankar N <ravishankar at redhat.com>
>>> wrote:
>>>> On 01/28/2015 02:02 PM, Ml Ml wrote:
>>>>> I want to either take the file from node03 or node04, I really don't
>>>>> mind. Can I not just tell Gluster that it should use one node as the
>>>>> "current" one?
>>>> Policy-based split-brain resolution [1], which does just that, has been
>>>> merged in master and should be available in glusterfs 3.7.
>>>> For the moment, you would have to modify the xattrs on one of the bricks
>>>> and trigger a heal. See
>>>>
>>>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>>> for how to do it.
>>>>
>>>> Hope this helps,
>>>> Ravi
>>>>
>>>> [1] http://review.gluster.org/#/c/9377/
>>>>
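
(A small preview, in case it helps with planning: if I remember the patch
correctly, the 3.7 CLI for this will look roughly like

gluster volume heal RaidVolB split-brain bigger-file <path-of-file>
gluster volume heal RaidVolB split-brain source-brick ovirt-node03.example.local:/raidvol/volb/brick <path-of-file>

i.e. you resolve a file either by picking the bigger copy or by naming the
brick whose copy should win. Treat the exact syntax as tentative until 3.7
is released.)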


