[Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

Karthik Subrahmanya ksubrahm at redhat.com
Thu Mar 21 10:43:26 UTC 2019


Can you attach the "glustershd.log" file (present under "/var/log/glusterfs/")
from both nodes, along with the "stat" and "getfattr -d -m . -e hex
<file-path-on-brick>" output for all the entries listed in the heal info
output, from both bricks?
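
For example, for the directory reported in the heal info output, the commands
would be run directly on each brick (the brick path below is taken from your
heal info output; adjust if it differs):

stat /data/data-cluster/dms/final_archive
getfattr -d -m . -e hex /data/data-cluster/dms/final_archive

For the <gfid:...> entries, if I recall the brick layout correctly, the
corresponding path on the brick is under ".glusterfs", e.g.
/data/data-cluster/.glusterfs/25/6c/256ca960-1601-4f0d-9b08-905c6fd52326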

On Thu, Mar 21, 2019 at 3:54 PM Milos Cuculovic <cuculovic at mdpi.com> wrote:

> Thanks Karthik!
>
> I tried some of the resolution methods from [2], but unfortunately none of
> them worked (I can explain what I tried if needed).
>
> I guess the volume you are talking about is of type replica-2 (1x2).
>
> That’s correct. I’m aware of the arbiter solution but haven’t yet taken the
> time to implement it.
>
> From the info results I posted, how do I know which situation I am in? No
> files are mentioned in split brain, only directories. One brick has 3
> entries and the other has 2.
>
> sudo gluster volume heal storage2 info
> [sudo] password for sshadmin:
> Brick storage3:/data/data-cluster
> <gfid:256ca960-1601-4f0d-9b08-905c6fd52326>
> <gfid:7a63a729-c48f-4a00-9040-c3e2a0710ae6>
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> <gfid:276fec9a-1c9b-4efe-9715-dcf4207e99b0>
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 2
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculovic at mdpi.com <cuculovic at mdpi.com>
> Skype: milos.cuculovic.mdpi
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and delete this message from your system. You may
> not copy this message in its entirety or in part, or disclose its contents
> to anyone.
>
> On 21 Mar 2019, at 10:27, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
>
> Hi,
>
> Note: I guess the volume you are talking about is of type replica-2 (1x2).
> Replica 2 volumes are prone to split-brain. If you can consider converting
> it to arbiter or replica-3, that will handle most of the cases which can
> lead to split-brain. For more information see [1].
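> 
> As a rough sketch (the arbiter host and brick path below are placeholders,
> not taken from your setup), converting a replica 2 volume to arbiter means
> adding one arbiter brick, e.g.:
> 
> gluster volume add-brick storage2 replica 3 arbiter 1 <arbiter-host>:/data/arbiter-brick
> 
> Please check the exact procedure in [1] for your Gluster version.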
>
> Resolving the split-brain: [2] explains how to interpret the heal info
> output and the different ways to resolve split-brains: using the CLI,
> manually, or using the favorite-child-policy.
> If you have an entry split-brain and it is a gfid split-brain (a file/dir
> having different gfids on the replica bricks), then you can use the CLI
> option to resolve it. If a directory is in gfid split-brain in a
> distributed-replicate volume and you are using the source-brick option,
> please make sure the source you use is the brick of this subvolume which
> has the same gfid as the other distribute subvolume(s) where you have the
> correct gfid.
> If you have a type mismatch, then follow the steps in [3] to resolve the
> split-brain.
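> 
> For illustration only (the file path is a placeholder for an affected entry,
> the brick name is taken from your heal info output; see [2] for the exact
> syntax and which policy fits your case):
> 
> gluster volume heal storage2 split-brain latest-mtime <path-inside-volume>
> gluster volume heal storage2 split-brain source-brick storage3:/data/data-cluster <path-inside-volume>
> 
> or, for data/metadata split-brains, a favorite-child-policy can let heals
> pick a winner automatically:
> 
> gluster volume set storage2 cluster.favorite-child-policy mtime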
>
> [1]
> https://docs.gluster.org/en/v3/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
> [2]
> https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
> [3]
> https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#dir-split-brain
>
> HTH,
> Karthik
>
> On Thu, Mar 21, 2019 at 1:45 PM Milos Cuculovic <cuculovic at mdpi.com>
> wrote:
>
>> I was now able to catch the split brain log:
>>
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> <gfid:256ca960-1601-4f0d-9b08-905c6fd52326>
>> <gfid:7a63a729-c48f-4a00-9040-c3e2a0710ae6>
>> /dms/final_archive - Is in split-brain
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> <gfid:276fec9a-1c9b-4efe-9715-dcf4207e99b0>
>> /dms/final_archive - Is in split-brain
>>
>> Status: Connected
>> Number of entries: 2
>>
>> Milos
>>
>> On 21 Mar 2019, at 09:07, Milos Cuculovic <cuculovic at mdpi.com> wrote:
>>
>> For the last 24h, since upgrading one of the servers from 4.0 to 4.1.7, the
>> heal shows this:
>>
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> <gfid:256ca960-1601-4f0d-9b08-905c6fd52326>
>> <gfid:7a63a729-c48f-4a00-9040-c3e2a0710ae6>
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> <gfid:276fec9a-1c9b-4efe-9715-dcf4207e99b0>
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 2
>>
>> The same files stay there. From time to time the status of
>> /dms/final_archive changes to split brain, as the following command shows:
>>
>> sudo gluster volume heal storage2 info split-brain
>> Brick storage3:/data/data-cluster
>> /dms/final_archive
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick storage4:/data/data-cluster
>> /dms/final_archive
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> How can I find out which file is in split brain? The files in
>> /dms/final_archive are not very important; it is fine to remove the ones
>> that differ (or ideally resolve the split brain).
>>
>> I can only see the directory and GFID. Any idea how to resolve this
>> situation? I would like to continue with the upgrade on the 2nd server,
>> and for that the heal needs to be finished, with 0 entries in "sudo gluster
>> volume heal storage2 info".
>>
>> Thank you in advance, Milos.
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>