[Gluster-users] split-brain recovery automation, any plans?
Dmitry Melekhov
dm at belkam.com
Wed Jul 13 04:13:01 UTC 2016
On 13.07.2016 07:44, Pranith Kumar Karampuri wrote:
>
>
> On Tue, Jul 12, 2016 at 9:27 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>
>
> On 12.07.2016 17:38, Pranith Kumar Karampuri wrote:
>> Did you wait for heals to complete before upgrading the second node?
>
> no...
>
>
> So basically, if you have operations in progress on the mount, you
> should wait for heals to complete before you upgrade the second node.
> If all operations on all the mounts are stopped, or you have unmounted
> all the mounts for the volume, then you can upgrade all the servers
> one by one and then the clients. Otherwise it will lead to problems.
> That said, with a 3-way replica it shouldn't cause split-brains, so I
> would like to know the exact steps that led to this problem.
Thank you, this is all I can remember :-(
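
As a side note for anyone retracing this: pending heals can be checked
from any server before moving on to the next node, roughly like this
(VOLNAME is a placeholder for the actual volume name):

    # repeat until every brick reports "Number of entries: 0"
    gluster volume heal VOLNAME info

    # list only entries that are actually in split-brain
    gluster volume heal VOLNAME info split-brain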
> We know of one issue which leads to split-brains in the case of VM
> workloads, where bricks are taken down in a cyclic manner without
> waiting for heals to complete. I wonder if the steps that led to
> split-brain on your setup are similar. We are targeting this for
> future releases...
I guess we hit this...
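
Until that automation exists, individual files can at least be resolved
from the CLI; a minimal sketch, where VOLNAME, FILE, and
SERVER:/BRICKPATH are placeholders:

    # keep the bigger of the conflicting copies as the heal source
    gluster volume heal VOLNAME split-brain bigger-file FILE

    # or explicitly pick one brick's copy as the heal source
    gluster volume heal VOLNAME split-brain source-brick SERVER:/BRICKPATH FILE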
>
>
>>
>> On Tue, Jul 12, 2016 at 3:08 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>> On 12.07.2016 13:31, Pranith Kumar Karampuri wrote:
>>>
>>>
>>> On Mon, Jul 11, 2016 at 2:26 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>
>>> On 11.07.2016 12:47, Gandalf Corvotempesta wrote:
>>>
>>> 2016-07-11 9:54 GMT+02:00 Dmitry Melekhov <dm at belkam.com>:
>>>
>>> We just got split-brain during update to 3.7.13 ;-)
>>>
>>> This is an interesting point.
>>> Could you please tell me which replica count you set?
>>>
>>>
>>> 3
>>>
>>>
>>> With replica 3, split brain should not occur, right?
>>>
>>>
>>> I guess we did something wrong :-)
>>>
>>>
>>> Or is there a bug we never found? Could you please share
>>> details about what you did?
>>
>> upgraded to 3.7.13 from 3.7.11 using yum, while at least one
>> VM was running :-)
>> on all 3 servers, one by one:
>>
>> yum upgrade
>> systemctl stop glusterd
>> then killed the glusterfsd processes using kill
>> and systemctl start glusterd
>>
>> then the next server....
>>
>> after this we tried to restart the VM, but it failed, because we
>> forgot to restart libvirtd, so it was still using the old libraries;
>> I guess this is the point where we got this problem.
>>
>>>
>>> I'm planning a new cluster and I would like to be
>>> protected against split brains.
>>>
>>>
>
> --
> Pranith