[Bugs] [Bug 1352277] New: a two node glusterfs seems not possible anymore?!

bugzilla at redhat.com bugzilla at redhat.com
Sun Jul 3 10:01:00 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1352277

            Bug ID: 1352277
           Summary: a two node glusterfs seems not possible anymore?!
           Product: GlusterFS
           Version: mainline
         Component: glusterd
          Keywords: Reopened, Triaged
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    joe at julianfamily.org, jules at ispire.me,
                    sasundar at redhat.com
        Depends On: 1347329



+++ This bug was initially created as a clone of Bug #1347329 +++

Description of problem:
A two-node GlusterFS setup no longer seems to be possible?!

Version-Release number of selected component (if applicable):
3.7.11, 3.8

How reproducible:

set: 
cluster.quorum-type: fixed
cluster.quorum-count: 1
cluster.server-quorum-type: none
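
For reference, assuming the volume name nfs-storage used in the steps
below, these options are set with the gluster CLI:

  gluster volume set nfs-storage cluster.quorum-type fixed
  gluster volume set nfs-storage cluster.quorum-count 1
  gluster volume set nfs-storage cluster.server-quorum-type none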


Steps to Reproduce:
1. Shut down (kill) glusterd/glusterfs on the other node
2. Run: gluster volume start nfs-storage
3. Get an error like the one below.

Actual results:

volume start: nfs-storage: failed: Quorum not met. Volume operation not
allowed.

Expected results:

Bricks running on the one remaining node.

Additional info:

Latest Debian Jessie

--- Additional comment from Joe Julian on 2016-06-16 15:37:19 EDT ---

With a 2-server, replica 2, started volume (no other volumes):

Without any quorum settings being set, the brick will not start on boot if the
other server is down. This is a departure from prior behavior and cannot be
worked around.
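
A quick way to see this (assuming the volume name nfs-storage from the
original report) is to check the Online column for the bricks after boot:

  gluster volume status nfs-storage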

Setting cluster.server-quorum-type to none does not allow the brick to be
started by glusterd. It looks like it should, as long as
glusterd_is_any_volume_in_server_quorum returns gf_false. In my test case
there was only one volume, and as long as no volumes are set to "server"
that function should return gf_false.

If cluster.server-quorum-type is set to server and cluster.server-quorum-ratio
is set to 0, the brick will start, but neither nfs nor glustershd starts.
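
For reference, a sketch of the commands for that last test (volume name
assumed; note that cluster.server-quorum-ratio is a cluster-wide option
and is set on "all"):

  gluster volume set nfs-storage cluster.server-quorum-type server
  gluster volume set all cluster.server-quorum-ratio 0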

--- Additional comment from Atin Mukherjee on 2016-06-21 08:11:56 EDT ---

Is there any specific reason why we are considering quorum tunables with a
two-node setup? Ideally the quorum options should only come into play with a
setup of at least three nodes. Also, as Joe pointed out, on a two-node setup,
if one of the nodes goes for a reboot while the other is down, the daemons do
not get spawned, as data consistency is not guaranteed here. The moment the
peer update is received, the daemons are spawned. IMO this is expected
behaviour as per the design. I am closing this bug; please feel free to
reopen if you think otherwise.

--- Additional comment from Jules on 2016-06-21 08:19:21 EDT ---

So what is the "none" switch for if it doesn't function?

--- Additional comment from Atin Mukherjee on 2016-06-21 08:23:11 EDT ---

(In reply to Jules from comment #3)
> So what is the "none" switch for if it doesn't function?

That's the default value. If you don't set the server quorum type to server,
it's basically considered to be off, and that's what it implies here.
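
For what it's worth, the effective value can be verified with volume get
(assuming a gluster release that ships it, 3.7+):

  gluster volume get nfs-storage cluster.server-quorum-type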

--- Additional comment from Jules on 2016-06-21 08:26:35 EDT ---

Have you tested the second part that Joe Julian mentioned?

--- Additional comment from Atin Mukherjee on 2016-06-21 08:55:59 EDT ---

"If cluster.server-quorum-type is set to server and cluster-server-quorum-ratio
is set to 0, the brick will start, but neither nfs nor glustershd start." - is
this valid for a set up having more than 2 nodes?

Anyways, I will test this and get back.

--- Additional comment from Atin Mukherjee on 2016-06-22 01:49:43 EDT ---

I tested the same with a three-node setup, and the bricks don't come up in
that case. As I mentioned earlier, since having quorum tunables on a two-node
setup doesn't make sense to me, I'd not consider this a bug.

--- Additional comment from Joe Julian on 2016-06-22 03:02:31 EDT ---

It breaks prior production behavior with no workaround and should thus be
considered a bug. 

If you want to protect users from themselves by default, I'm all behind this,
but if a user knows the risks and wishes to override the safety defaults to
retain prior behavior, this should be allowed.

--- Additional comment from Atin Mukherjee on 2016-06-22 03:06:05 EDT ---

Well, you always have the option to use volume start force as a workaround in
this case, don't you?
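
For reference, with the volume name from the original report that would be:

  gluster volume start nfs-storage force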

--- Additional comment from Jules on 2016-07-02 05:15:56 EDT ---

(In reply to Atin Mukherjee from comment #9)
> Well, you always have the option to use volume start force as a workaround
> in this case, don't you?

Well, since this needs to be done by manual intervention, it is not a
workaround I would recommend. How about a new config switch to get this
working without the force option, like it was in the past? As Joe Julian
mentioned, a user who knows the risks should be able to override the safety
defaults.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1347329
[Bug 1347329] a two node glusterfs seems not possible anymore?!
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

