[Bugs] [Bug 1235512] New: quorum calculation might go for toss for a concurrent peer probe command

bugzilla at redhat.com bugzilla at redhat.com
Thu Jun 25 03:34:47 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1235512

            Bug ID: 1235512
           Summary: quorum calculation might go for toss for a concurrent
                    peer probe command
           Product: GlusterFS
           Version: 3.7.2
         Component: glusterd
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com
        Depends On: 1232686



+++ This bug was initially created as a clone of Bug #1232686 +++

Description of problem:

The current codebase skips the quorum calculation if any peer is in the
process of joining the cluster (quorum_contrib = QUORUM_WAITING). This can
undermine the server-side quorum feature: operations would be allowed to
proceed even when server-side quorum is not met.
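
The effect can be illustrated with a small shell model. This is only a sketch
and not glusterd code: the function name quorum_met, the peer states
up/down/waiting, the simple-majority rule, and the choice to count a waiting
peer as a non-active member are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical model of the reported behaviour; NOT the glusterd implementation.
# Usage: quorum_met <skip_on_waiting: yes|no> <peer state>...
# Peer states (illustrative only): up, down, waiting.
quorum_met() {
    local skip_on_waiting=$1
    shift
    local total=0 active=0 state
    for state in "$@"; do
        if [ "$state" = "waiting" ] && [ "$skip_on_waiting" = "yes" ]; then
            # Reported behaviour: one waiting peer skips the whole check,
            # so the operation is allowed to proceed.
            return 0
        fi
        total=$((total + 1))
        if [ "$state" = "up" ]; then
            active=$((active + 1))
        fi
    done
    # Assumption: quorum means a simple majority of the counted peers.
    [ "$active" -ge $((total / 2 + 1)) ]
}
```

In this model, `quorum_met yes up down down waiting` succeeds although only
one node is up, while `quorum_met no up down down waiting` reports that quorum
is not met.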

A reliable reproducer is hard to write, because it is difficult to get a peer
into the state where its quorum_contrib is set to QUORUM_WAITING. However, the
following test case might fail at the last volume stop command.

#!/bin/bash
. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc
. $(dirname $0)/../../cluster.rc

cleanup;

TEST launch_cluster 4;

# Probe two peers from node 1 and wait until both are counted.
TEST $CLI_1 peer probe $H2;
TEST $CLI_1 peer probe $H3;

EXPECT_WITHIN $PROBE_TIMEOUT 2 peer_count

# Create and start a volume with server-side quorum enabled.
TEST $CLI_1 volume create $V0 $H1:$B1/$V0 $H2:$B2/$V0
TEST $CLI_1 volume set $V0 cluster.server-quorum-type server
TEST $CLI_1 volume start $V0

# Take down the glusterd instances on nodes 2 and 3.
TEST kill_glusterd 2
TEST kill_glusterd 3

# The volume stop is expected to be rejected.
TEST ! $CLI_1 volume stop $V0;

cleanup;

Version-Release number of selected component (if applicable):
Mainline

How reproducible:
Rare

Steps to Reproduce:
1. Source install gluster
2. Run the above test case in a loop; its last test might fail.
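
Since the failure is rare, step 2 can be automated with a small helper that
repeats a command until it fails. This is a minimal sketch: the helper name
run_until_failure is made up here, and the command used to run the test is
left to the caller.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to MAX times, stopping at the first
# failure.
# Usage: run_until_failure MAX CMD [ARGS...]
run_until_failure() {
    local max=$1
    shift
    local i
    for ((i = 1; i <= max; i++)); do
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
    done
    echo "passed $max iterations"
    return 0
}
```

For example, `run_until_failure 100 prove -vf ./quorum-probe.t` would run the
test up to 100 times, assuming it was saved under the hypothetical name
quorum-probe.t and that prove is the runner in use.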

Actual results:
The last test (TEST ! $CLI_1 volume stop $V0) might fail.

Expected results:
All the test cases should pass every time.

Additional info:

--- Additional comment from Anand Avati on 2015-06-17 06:00:45 EDT ---

REVIEW: http://review.gluster.org/11275 (glusterd: fix quorum calculation
logic) posted (#1) for review on master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Anand Avati on 2015-06-23 02:18:42 EDT ---

REVIEW: http://review.gluster.org/11275 (glusterd: fix quorum calculation
logic) posted (#2) for review on master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Anand Avati on 2015-06-24 02:54:06 EDT ---

REVIEW: http://review.gluster.org/11275 (glusterd: fix quorum calculation
logic) posted (#3) for review on master by Atin Mukherjee (amukherj at redhat.com)

--- Additional comment from Anand Avati on 2015-06-24 23:30:16 EDT ---

COMMIT: http://review.gluster.org/11275 committed in master by Atin Mukherjee
(amukherj at redhat.com) 
------
commit 0be38bdb4007c1bcb51545057e6402f6e14922cd
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Wed Jun 17 14:20:14 2015 +0530

    glusterd: fix quorum calculation logic

    glusterd_get_quorum_cluster_counts () skips quorum calculation if it
    finds any of its peer in QUORUM_WAITING state. This means if any peer
    probe has been triggered and at the same point of time a transaction
    has been initiated, it might pass through the server quorum check
    which it should not.

    Change-Id: I44eda8905eab3349c9ebf2842e7131d4e758a528
    BUG: 1232686
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: http://review.gluster.org/11275
    Reviewed-by: Krishnan Parthasarathi <kparthas at redhat.com>
    Reviewed-by: Anand Nekkunti <anekkunt at redhat.com>
    Tested-by: NetBSD Build System <jenkins at build.gluster.org>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1232686
[Bug 1232686] quorum calculation might go for toss for a concurrent peer
probe command
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

