[Gluster-users] split-brain recovery automation, any plans?
Dmitry Melekhov
dm at belkam.com
Wed Jul 13 04:13:01 UTC 2016
On 13.07.2016 07:44, Pranith Kumar Karampuri wrote:
>
>
> On Tue, Jul 12, 2016 at 9:27 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>
>
> On 12.07.2016 17:38, Pranith Kumar Karampuri wrote:
>> Did you wait for heals to complete before upgrading the second node?
>
> no...
>
>
> So basically, if you have operations in progress on the mount, you
> should wait for heals to complete before you upgrade the second node.
> If all operations on all the mounts are stopped, or you have unmounted
> all the mounts for the volume, then you can upgrade all the servers
> one by one and then the clients. Otherwise it will lead to problems.
> That said, with a 3-way replica it shouldn't cause split-brains, so I
> would like to know the exact steps that led to this problem.
Thank you, this is all I can remember :-(
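
As a side note for anyone retracing this: pending heals can be checked
from any server before moving on to the next node, roughly like this
(VOLNAME is a placeholder for the actual volume name):

    # repeat until every brick reports "Number of entries: 0"
    gluster volume heal VOLNAME info

    # list only entries that are actually in split-brain
    gluster volume heal VOLNAME info split-brain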
> We know of one issue which leads to split-brains in the case of VM
> workloads, where bricks are taken down in a cyclic manner without
> waiting for heals to complete. I wonder if the steps that led to
> split-brain on your setup are similar. We are targeting this for
> future releases...
I guess we hit this...
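
Until that automation exists, individual files can at least be resolved
from the CLI; a minimal sketch, where VOLNAME, FILE, and
SERVER:/BRICKPATH are placeholders:

    # keep the bigger of the conflicting copies as the heal source
    gluster volume heal VOLNAME split-brain bigger-file FILE

    # or explicitly pick one brick's copy as the heal source
    gluster volume heal VOLNAME split-brain source-brick SERVER:/BRICKPATH FILE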
>
>
>>
>> On Tue, Jul 12, 2016 at 3:08 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>> On 12.07.2016 13:31, Pranith Kumar Karampuri wrote:
>>>
>>>
>>> On Mon, Jul 11, 2016 at 2:26 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>
>>> On 11.07.2016 12:47, Gandalf Corvotempesta wrote:
>>>
>>> 2016-07-11 9:54 GMT+02:00 Dmitry Melekhov <dm at belkam.com>:
>>>
>>> We just got split-brain during update to 3.7.13 ;-)
>>>
>>> This is an interesting point.
>>> Could you please tell me which replica count you set?
>>>
>>>
>>> 3
>>>
>>>
>>> With replica 3, split brain should not occur, right?
>>>
>>>
>>> I guess we did something wrong :-)
>>>
>>>
>>> Or is there a bug we never found? Could you please share
>>> details about what you did?
>>
>> upgraded to 3.7.13 from 3.7.11 using yum, while at least one
>> VM was running :-)
>> on all 3 servers, one by one:
>>
>> yum upgrade
>> systemctl stop glusterd
>> then killed the glusterfsd processes using kill
>> and systemctl start glusterd
>>
>> then the next server....
>>
>> after this we tried to restart the VM, but it failed, because we
>> forgot to restart libvirtd, so it was still using the old libraries;
>> I guess this is the point where we got this problem.
>>
>>>
>>> I'm planning a new cluster and I would like to be
>>> protected against split brains.
>>>
>>>
>
> --
> Pranith