[Bugs] [Bug 1500835] New: [geo-rep]: Status shows ACTIVE for most workers in EC before it becomes the PASSIVE

bugzilla at redhat.com bugzilla at redhat.com
Wed Oct 11 15:02:13 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1500835

            Bug ID: 1500835
           Summary: [geo-rep]: Status shows ACTIVE for most workers in EC
                    before it becomes the PASSIVE
           Product: GlusterFS
           Version: 3.12
         Component: geo-replication
          Keywords: ZStream
          Severity: low
          Assignee: bugs at gluster.org
          Reporter: khiremat at redhat.com
                CC: bugs at gluster.org, csaba at redhat.com,
                    khiremat at redhat.com, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1460918, 1500284



+++ This bug was initially created as a clone of Bug #1500284 +++

+++ This bug was initially created as a clone of Bug #1460918 +++

Description of problem:
=======================

My understanding is that all the workers try to acquire the lock; the one that
gets it becomes ACTIVE, while the rest stay in Initializing and then go to
PASSIVE. However, with EC (tried once) I observed most of the workers first
showing ACTIVE and only later becoming PASSIVE.
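
For context, here is a minimal sketch of the lock-based election as I
understand it (hypothetical helper, not the actual gsyncd code): every worker
attempts a non-blocking lock, and only the winner should report ACTIVE while
the others report PASSIVE.

# Hypothetical sketch of the lock-based ACTIVE/PASSIVE election (not gsyncd code).
import fcntl

def elect_worker_state(lock_path):
    """Try a non-blocking exclusive lock: the winner is Active, the rest Passive."""
    fd = open(lock_path, "w")
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd, "Active"      # got the lock: this worker syncs changes
    except BlockingIOError:
        return fd, "Passive"     # lock held by another worker: stand by

# Example (hypothetical path): the first caller gets "Active", later callers "Passive".
# _, state = elect_worker_state("/var/run/georep-demo.lock")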

[root at dhcp37-150 scripts]# gluster volume geo-replication master 10.70.37.71::slave status

MASTER NODE     MASTER VOL    MASTER BRICK       SLAVE USER    SLAVE                 SLAVE NODE      STATUS             CRAWL STATUS       LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.150    master        /rhs/brick1/b1     root          10.70.37.71::slave    10.70.37.181    Active             Changelog Crawl    2017-06-13 06:51:59
10.70.37.150    master        /rhs/brick2/b7     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.171    master        /rhs/brick1/b2     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.171    master        /rhs/brick2/b8     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.105    master        /rhs/brick1/b3     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.105    master        /rhs/brick2/b9     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.194    master        /rhs/brick1/b4     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.194    master        /rhs/brick2/b10    root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.42     master        /rhs/brick1/b5     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.42     master        /rhs/brick2/b11    root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.190    master        /rhs/brick1/b6     root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
10.70.37.190    master        /rhs/brick2/b12    root          10.70.37.71::slave    N/A             Initializing...    N/A                N/A
[root at dhcp37-150 scripts]#
[root at dhcp37-150 scripts]# 
[root at dhcp37-150 scripts]# gluster volume geo-replication master 10.70.37.71::slave status

MASTER NODE     MASTER VOL    MASTER BRICK       SLAVE USER    SLAVE                 SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.150    master        /rhs/brick1/b1     root          10.70.37.71::slave    10.70.37.181    Active     Changelog Crawl    2017-06-13 06:52:01
10.70.37.150    master        /rhs/brick2/b7     root          10.70.37.71::slave    10.70.37.181    Active     Changelog Crawl    2017-06-13 06:52:01
10.70.37.42     master        /rhs/brick1/b5     root          10.70.37.71::slave    10.70.37.181    Active     N/A                N/A
10.70.37.42     master        /rhs/brick2/b11    root          10.70.37.71::slave    10.70.37.181    Active     N/A                N/A
10.70.37.190    master        /rhs/brick1/b6     root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.190    master        /rhs/brick2/b12    root          10.70.37.71::slave    10.70.37.71     Active     N/A                N/A
10.70.37.171    master        /rhs/brick1/b2     root          10.70.37.71::slave    10.70.37.71     Active     N/A                N/A
10.70.37.171    master        /rhs/brick2/b8     root          10.70.37.71::slave    10.70.37.71     Active     N/A                N/A
10.70.37.194    master        /rhs/brick1/b4     root          10.70.37.71::slave    10.70.37.71     Active     N/A                N/A
10.70.37.194    master        /rhs/brick2/b10    root          10.70.37.71::slave    10.70.37.71     Active     N/A                N/A
10.70.37.105    master        /rhs/brick1/b3     root          10.70.37.71::slave    10.70.37.181    Active     N/A                N/A
10.70.37.105    master        /rhs/brick2/b9     root          10.70.37.71::slave    10.70.37.181    Active     N/A                N/A
[root at dhcp37-150 scripts]#


[root at dhcp37-150 scripts]# gluster volume geo-replication master 10.70.37.71::slave status

MASTER NODE     MASTER VOL    MASTER BRICK       SLAVE USER    SLAVE                 SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.150    master        /rhs/brick1/b1     root          10.70.37.71::slave    10.70.37.181    Active     Changelog Crawl    2017-06-13 06:52:01
10.70.37.150    master        /rhs/brick2/b7     root          10.70.37.71::slave    10.70.37.181    Active     Changelog Crawl    2017-06-13 06:52:01
10.70.37.190    master        /rhs/brick1/b6     root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.190    master        /rhs/brick2/b12    root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.105    master        /rhs/brick1/b3     root          10.70.37.71::slave    10.70.37.181    Passive    N/A                N/A
10.70.37.105    master        /rhs/brick2/b9     root          10.70.37.71::slave    10.70.37.181    Passive    N/A                N/A
10.70.37.42     master        /rhs/brick1/b5     root          10.70.37.71::slave    10.70.37.181    Passive    N/A                N/A
10.70.37.42     master        /rhs/brick2/b11    root          10.70.37.71::slave    10.70.37.181    Passive    N/A                N/A
10.70.37.194    master        /rhs/brick1/b4     root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.194    master        /rhs/brick2/b10    root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.171    master        /rhs/brick1/b2     root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
10.70.37.171    master        /rhs/brick2/b8     root          10.70.37.71::slave    10.70.37.71     Passive    N/A                N/A
[root at dhcp37-150 scripts]#


Version-Release number of selected component (if applicable):
=============================================================

mainline


How reproducible:
=================

Seen once so far; will try to reproduce again.


Steps to Reproduce:
===================
1. Create a geo-replication session with an EC volume as the master
2. Monitor the status in a loop (see the sketch below)
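
A simple way to monitor the status in a loop (a sketch using the same
master/slave names as in this report; assumes the gluster CLI is in PATH):

# Poll the geo-rep status every few seconds (sketch; adjust interval/volumes).
import subprocess
import time

CMD = ["gluster", "volume", "geo-replication", "master",
       "10.70.37.71::slave", "status"]

while True:
    result = subprocess.run(CMD, capture_output=True, text=True)
    print(result.stdout)
    time.sleep(5)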

--- Additional comment from Worker Ant on 2017-10-10 06:40:17 EDT ---

REVIEW: https://review.gluster.org/18464 (geo-rep: Fix status transition)
posted (#1) for review on master by Kotresh HR (khiremat at redhat.com)

--- Additional comment from Worker Ant on 2017-10-11 06:13:39 EDT ---

COMMIT: https://review.gluster.org/18464 committed in master by Aravinda VK
(avishwan at redhat.com) 
------
commit 3edf926a1bda43879c09694cf3904c214c94c9dc
Author: Kotresh HR <khiremat at redhat.com>
Date:   Tue Oct 10 05:54:04 2017 -0400

    geo-rep: Fix status transition

    The current status transition, shown below, is wrong:

    Created->Initializing->Active->Active/Passive->Stopped

    As soon as the monitor spawns the worker, the state
    is changed from 'Initializing' to 'Active' and then to
    'Active/Passive' based on whether the worker gets the lock
    or not. This is wrong; it should transition directly
    as below:

    Created->Initializing->Active/Passive->Stopped

    Change-Id: Ibf5ca5c4fdf168c403c6da01db60b93f0604aae7
    BUG: 1500284
    Signed-off-by: Kotresh HR <khiremat at redhat.com>
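
In other words, the worker should stay in Initializing until the lock outcome
is known. A rough, hypothetical sketch of the fixed flow (illustration only,
not the actual patch):

# Hypothetical worker loop illustrating the corrected transition:
# Created -> Initializing -> Active/Passive -> Stopped
def run_worker(status, try_lock, replicate, stand_by):
    status.set("Initializing...")
    # Fix: do not report Active while initializing; decide only after the
    # lock attempt, then go straight to Active or Passive.
    if try_lock():
        status.set("Active")
        replicate()
    else:
        status.set("Passive")
        stand_by()
    status.set("Stopped")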


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1460918
[Bug 1460918] [geo-rep]: Status shows ACTIVE for most workers in EC before
it becomes the PASSIVE
https://bugzilla.redhat.com/show_bug.cgi?id=1500284
[Bug 1500284] [geo-rep]: Status shows ACTIVE for most workers in EC before
it becomes the PASSIVE