[Gluster-users] gluster volume 3.10.4 hangs

Mon Jul 31 21:29:26 UTC 2017

On 7/31/2017 1:12 AM, Seva Gluschenko wrote:
> Hi folks,
>
>
> I'm running a simple gluster setup with a single volume replicated 
> across two servers, as follows:
>
> Volume Name: gv0
> Type: Replicate
> Volume ID: dd4996c0-04e6-4f9b-a04e-73279c4f112b
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp

> The problem is that when one of the replica servers hung, it 
> caused the whole glusterfs to hang.

Yes, you lost quorum and the system doesn't want you to get a split-brain.

> Could you please give me a hint: is this expected behaviour, or are 
> there any tweaks or server or volume settings that might be altered 
> to change it? Any help would be much appreciated.
>

Add a third replica node (or just an arbiter node if you aren't that 
ambitious or want to save on the kit).
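
A rough sketch of what adding an arbiter to the existing two-brick volume 
might look like. The volume name gv0 is from the original post; the host 
name and brick path are made-up placeholders, and the run wrapper only 
prints the commands unless DRY_RUN=0 is set, so nothing is changed by 
accident. Check the add-brick syntax against the docs for your release 
before running it for real.

```shell
#!/bin/sh
# Volume name from the original post; host and brick path are hypothetical.
VOLUME=gv0
ARBITER_HOST=server3.example.com
ARBITER_BRICK=/data/brick1/gv0-arbiter

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Make the new node part of the trusted pool.
run gluster peer probe "$ARBITER_HOST"

# Convert the 1 x 2 replica into a replica 3 volume whose third brick
# is an arbiter (it stores metadata only, not file data).
run gluster volume add-brick "$VOLUME" replica 3 arbiter 1 "$ARBITER_HOST:$ARBITER_BRICK"

# Confirm the new brick layout.
run gluster volume info "$VOLUME"
```

With three bricks (or two plus an arbiter), losing any single node still 
leaves a majority, which is what lets the volume keep serving I/O.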

That way, when you lose a node, the cluster will pause for 40 seconds 
or so while it figures things out and then continue on.
When the missing node returns, the self-heal will kick in and you will 
be back to 100%.
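
A minimal sketch for checking both halves of that. The pause most likely 
corresponds to the network.ping-timeout volume option (I believe the 
default is 42 seconds, but treat that as an assumption and read the value 
from your own volume). The heal command lists entries still waiting to be 
healed once the node is back. The run wrapper is just a dry-run helper 
for illustration; set DRY_RUN=0 to execute.

```shell
#!/bin/sh
# Volume name from the original post.
VOLUME=gv0

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# How long clients wait on an unresponsive brick before giving up on it.
run gluster volume get "$VOLUME" network.ping-timeout

# After the missing node returns: list entries still pending self-heal.
# An empty list for every brick means the replicas are back in sync.
run gluster volume heal "$VOLUME" info
```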

Your other alternative is to turn off quorum. But that risks 
split-brain. Depending upon your data, that may or may not be a serious 
issue.
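
If you do go that route, something along these lines should do it. This 
is a sketch, assuming the cluster.quorum-type (client side) and 
cluster.server-quorum-type (server side) volume options are the ones in 
play on your volume; look at the current values first so you know what 
to put back. The run wrapper only prints the commands unless DRY_RUN=0 
is set.

```shell
#!/bin/sh
# Volume name from the original post.
VOLUME=gv0

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Record the current settings before changing anything.
run gluster volume get "$VOLUME" cluster.quorum-type
run gluster volume get "$VOLUME" cluster.server-quorum-type

# Disable client-side and server-side quorum enforcement.
# WARNING: with quorum off, both replicas can accept conflicting writes
# while they cannot see each other, which is how split-brain happens.
run gluster volume set "$VOLUME" cluster.quorum-type none
run gluster volume set "$VOLUME" cluster.server-quorum-type none
```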

-wk
