[Gluster-users] ... i was able to produce a split brain...

Ml Ml mliebherr99@googlemail.com
Wed Jan 28 17:28:50 UTC 2015


"/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" is a binary file.


Here is the output of gluster volume info:
--------------------------------------------------------------------------------------


[root@ovirt-node03 ~]# gluster volume info

Volume Name: RaidVolB
Type: Replicate
Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ovirt-node03.example.local:/raidvol/volb/brick
Brick2: ovirt-node04.example.local:/raidvol/volb/brick
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: disable
nfs.disable: on




[root@ovirt-node04 ~]# gluster volume info

Volume Name: RaidVolB
Type: Replicate
Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ovirt-node03.example.local:/raidvol/volb/brick
Brick2: ovirt-node04.example.local:/raidvol/volb/brick
Options Reconfigured:
nfs.disable: on
user.cifs: disable
auth.allow: *
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 36
storage.owner-gid: 36


Here is the getfattr output on node03 and node04:
--------------------------------------------------------------------------------------


[root@ovirt-node03 ~]# getfattr -d -m . -e hex
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
trusted.afr.RaidVolB-client-0=0x000000000000000000000000
trusted.afr.RaidVolB-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73



[root@ovirt-node04 ~]# getfattr -d -m . -e hex
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
getfattr: Removing leading '/' from absolute path names
# file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
trusted.afr.RaidVolB-client-0=0x000000000000000000000000
trusted.afr.RaidVolB-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
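For what it's worth, the 24 hex digits in each trusted.afr value can be read as three big-endian 32-bit counters: data, metadata, and entry pending-operation counts (my understanding of the AFR changelog layout; worth verifying against your Gluster version). A small helper to decode them (decode_afr is just an illustrative name):

```shell
# Decode a trusted.afr changelog value into its three counters.
# Assumption: the 12-byte value is data/metadata/entry pending-op
# counts, in that order -- check against your Gluster version.
decode_afr() {
  v=${1#0x}                      # strip the 0x prefix
  data=$((16#${v:0:8}))          # first 4 bytes: data operations
  meta=$((16#${v:8:8}))          # next 4 bytes: metadata operations
  entry=$((16#${v:16:8}))        # last 4 bytes: entry operations
  echo "data=$data metadata=$meta entry=$entry"
}

decode_afr 0x000000000000000000000000
# -> data=0 metadata=0 entry=0
```

All-zero counters on both bricks mean neither side is blaming the other, which matches Ravi's point below that this doesn't look like a split-brain.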


Am I supposed to run those commands on the FUSE mount instead?:
--------------------------------------------------------------------------------------
127.0.0.1:RaidVolB on
/rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)
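One way to answer Ravi's question about I/O errors below: reading the file through the FUSE mount (not the brick path) should fail with EIO ("Input/output error") if it is genuinely in split-brain. A sketch against the mount shown above (the dd invocation is just illustrative):

```shell
# Read the suspect file through the client mount; a file in
# split-brain should fail here with "Input/output error".
MNT="/rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB"
dd if="$MNT/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" \
   of=/dev/null bs=4k count=1 2>&1 | tail -n1
```

If the read succeeds cleanly, AFR does not consider the file split-brain, whatever the heal-info listing says.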


At the very beginning I thought I could remove the file with "rm
/raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids",
hoping gluster would then fix itself somehow :)
It was gone, but it seems to be back again. Dunno if this is any help.
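A likely reason the file came back: every file on a brick has a hard link under .glusterfs keyed by its gfid, and self-heal can recreate the regular path from the other replica. A sketch for checking that link, using the gfid from the getfattr output above (path layout assumed from the usual .glusterfs/<aa>/<bb>/<gfid> scheme):

```shell
# Locate the gfid hard link for the ids file on the brick.
# The link path is .glusterfs/<first 2 hex chars>/<next 2>/<gfid>.
BRICK=/raidvol/volb/brick
GFID=1c15d0cb-1cca-4627-841c-395f7b712f73
GFID_LINK="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
echo "$GFID_LINK"

# Same inode number as the regular file means they are hard links:
ls -li "$GFID_LINK" 2>/dev/null || echo "gfid link not present here"
```

Removing only the regular path leaves that link (and the other replica) behind, so the file reappearing is expected rather than a sign of corruption.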


Here is the output of gluster volume heal RaidVolB info on both nodes:
--------------------------------------------------------------------------------------

[root@ovirt-node03 ~]# gluster volume heal RaidVolB info
Brick ovirt-node03.example.local:/raidvol/volb/brick/
Number of entries: 0

Brick ovirt-node04.example.local:/raidvol/volb/brick/
Number of entries: 0


[root@ovirt-node04 ~]# gluster volume heal RaidVolB info
Brick ovirt-node03.example.local:/raidvol/volb/brick/
Number of entries: 0

Brick ovirt-node04.example.local:/raidvol/volb/brick/
Number of entries: 0


Thanks a lot,
Mario



On Wed, Jan 28, 2015 at 4:57 PM, Ravishankar N <ravishankar@redhat.com> wrote:
>
> On 01/28/2015 08:34 PM, Ml Ml wrote:
>>
>> Hello Ravi,
>>
>> thanks a lot for your reply.
>>
>> The Data on ovirt-node03 is the one which i want.
>>
>> Here are the infos collected by following the howto:
>>
>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>
>>
>>
>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info split-brain
>> Gathering list of split brain entries on volume RaidVolB has been
>> successful
>>
>> Brick ovirt-node03.example.local:/raidvol/volb/brick
>> Number of entries: 0
>>
>> Brick ovirt-node04.example.local:/raidvol/volb/brick
>> Number of entries: 14
>> at                    path on brick
>> -----------------------------------
>> 2015-01-27 17:33:00 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>> 2015-01-27 17:34:01 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>> 2015-01-27 17:35:04 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>> 2015-01-27 17:36:05 <gfid:cd411b57-6078-4f3c-80d1-0ac1455186a6>/ids
>> 2015-01-27 17:37:06 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:37:07 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:38:08 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:38:21 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:39:22 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:40:23 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:41:24 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:42:25 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:43:26 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> 2015-01-27 17:44:27 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>
>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info
>> Brick ovirt-node03.example.local:/raidvol/volb/brick/
>> Number of entries: 0
>>
>> Brick ovirt-node04.example.local:/raidvol/volb/brick/
>> Number of entries: 0
>
>
> Hi Mario,
> Is "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" a file or a directory?
> Whatever it is, it should be shown in the output of heal info /heal info
> split-brain command of both nodes. But I see it being listed only under
> node03.
> Also, heal info is showing zero entries for both nodes which is strange.
>
> Are node03 and node04 bricks of the same replica pair? Can you share
> 'gluster volume info` of RaidVolB?
> How did you infer that there is a split-brain? Does accessing the file(s)
> from the mount give input/output error?
>
>>
>> [root@ovirt-node03 ~]# getfattr -d -m . -e hex
>> /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> getfattr: Removing leading '/' from absolute path names
>> # file: raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
>> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
>> trusted.afr.dirty=0x000000000000000000000000
>> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>
> What is the getfattr output of this file on the other brick? The afr
> specific xattrs being all zeros certainly don't indicate the possibility
> of a split-brain.
>
>>
>> The "Resetting the relevant changelogs to resolve the split-brain"
>> part of the howto is now a little complicated. Do I have a data or a
>> metadata split-brain now?
>> I guess I have a data split-brain in my case, right?
>>
>> What are my next setfattr commands now in my case if I want to keep
>> the data from node03?
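[Replying inline for reference] A sketch of the changelog-reset step from the split-brain.md howto, assuming a data split-brain with node03 as the source. The client index mapping (node03 = client-0, node04 = client-1, from the brick order in volume info) is an assumption, and since the afr xattrs in this thread are already all zero, this step may not even apply here; double-check before running anything:

```shell
# To keep node03's copy: on the brick being discarded (ovirt-node04),
# zero the changelog xattr that blames the source brick (client-0),
# so heal flows from node03 to node04.
F=/raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
Z=0x000000000000000000000000

# run on ovirt-node04:
setfattr -n trusted.afr.RaidVolB-client-0 -v "$Z" "$F"

# then trigger a heal and re-check from either node:
gluster volume heal RaidVolB
gluster volume heal RaidVolB info split-brain
```

For a metadata split-brain the same reset applies, but only the metadata counter in the value would be nonzero beforehand.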
>>
>> Thanks a lot!
>>
>> Mario
>>
>>
>> On Wed, Jan 28, 2015 at 9:44 AM, Ravishankar N <ravishankar@redhat.com>
>> wrote:
>>>
>>> On 01/28/2015 02:02 PM, Ml Ml wrote:
>>>>
>>>> I want to either take the file from node03 or node04. I really don't
>>>> mind. Can I not just tell gluster that it should use one node as the
>>>> "current" one?
>>>
>>> Policy based split-brain resolution [1] which does just that, has been
>>> merged in master and should be available in glusterfs 3.7.
>>> For the moment, you would have to modify the xattrs on one of the
>>> bricks and trigger a heal. You can see
>>>
>>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>> on how to do it.
>>>
>>> Hope this helps,
>>> Ravi
>>>
>>> [1] http://review.gluster.org/#/c/9377/
>>>
>

