[Bugs] [Bug 1539657] New: Georeplication tests intermittently fail

bugzilla at redhat.com bugzilla at redhat.com
Mon Jan 29 11:57:30 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1539657

            Bug ID: 1539657
           Summary: Georeplication tests intermittently fail
           Product: GlusterFS
           Version: 4.0
         Component: geo-replication
          Assignee: bugs at gluster.org
          Reporter: nigelb at redhat.com
                CC: bugs at gluster.org, khiremat at redhat.com,
                    srangana at redhat.com
        Depends On: 1537602



+++ This bug was initially created as a clone of Bug #1537602 +++

The tests fail because of a possible configuration issue; Shyam and I narrowed
it down to that. I am going to disable the tests to unblock all the other
reviews.

--- Additional comment from Worker Ant on 2018-01-23 10:18:46 EST ---

REVIEW: https://review.gluster.org/19301 (tests: Disable geo-rep tests) posted
(#1) for review on master by Nigel Babu

--- Additional comment from Shyamsundar on 2018-01-23 10:24:54 EST ---

The test fails in the following runs:

https://build.gluster.org/job/centos6-regression/8602/console
https://build.gluster.org/job/centos6-regression/8604/console
https://build.gluster.org/job/centos6-regression/8607/console
https://build.gluster.org/job/centos6-regression/8608/console
https://build.gluster.org/job/centos6-regression/8612/console

The failure almost always occurs when checking how many nodes are in the
"Active" and "Passive" states:

06:58:02 not ok 22 Got "1" instead of "2", LINENUM:83
06:58:02 FAILED COMMAND: 2 check_status_num_rows Passive
AND/OR
06:58:02 not ok 37 Got "1" instead of "2", LINENUM:102
06:58:02 FAILED COMMAND: 2 check_status_num_rows Passive

After logging in to slave25 and rerunning this test (from a fresh clone of the
sources, etc.), the command output after the step at line 83 looks as follows:


[root at slave25 ~]# gluster volume geo-replication master 127.0.0.1::slave status detail

MASTER NODE                  MASTER VOL    MASTER BRICK           SLAVE USER    SLAVE               SLAVE NODE                   STATUS             CRAWL STATUS       LAST_SYNCED            ENTRY    DATA    META    FAILURES    CHECKPOINT TIME    CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
slave25.cloud.gluster.org    master        /d/backends/master1    root          127.0.0.1::slave    slave25.cloud.gluster.org    Active             Changelog Crawl    2018-01-23 14:17:38    7        0       0       0           N/A                N/A                     N/A
slave25.cloud.gluster.org    master        /d/backends/master2    root          127.0.0.1::slave    N/A                          Faulty             N/A                N/A                    N/A      N/A     N/A     N/A         N/A                N/A                     N/A
slave25.cloud.gluster.org    master        /d/backends/master3    root          127.0.0.1::slave    N/A                          Initializing...    N/A                N/A                    N/A      N/A     N/A     N/A         N/A                N/A                     N/A
slave25.cloud.gluster.org    master        /d/backends/master4    root          127.0.0.1::slave    slave25.cloud.gluster.org    Active             Changelog Crawl    2018-01-23 14:17:38    9        0       0       0           N/A                N/A                     N/A

The status above never recovers, so this is not a timing issue per se.
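
To make the row counts concrete, here is a rough Python sketch that tallies the
STATUS column of output like the above. It is not the test suite's
check_status_num_rows helper; the names count_statuses and SAMPLE, and the idea
of splitting columns on runs of two or more spaces, are assumptions made for
illustration only. The sample is an abridged copy of the output above, cut off
after the STATUS column.

```python
import re
from collections import Counter

# Abridged copy of the status output above: only the first seven columns
# are kept so that the lines stay readable.
SAMPLE = """\
MASTER NODE                  MASTER VOL    MASTER BRICK           SLAVE USER    SLAVE               SLAVE NODE                   STATUS
-------------------------------------------------------------------------------------------------------------------------------------------
slave25.cloud.gluster.org    master        /d/backends/master1    root          127.0.0.1::slave    slave25.cloud.gluster.org    Active
slave25.cloud.gluster.org    master        /d/backends/master2    root          127.0.0.1::slave    N/A                          Faulty
slave25.cloud.gluster.org    master        /d/backends/master3    root          127.0.0.1::slave    N/A                          Initializing...
slave25.cloud.gluster.org    master        /d/backends/master4    root          127.0.0.1::slave    slave25.cloud.gluster.org    Active
"""


def count_statuses(output):
    """Count rows per STATUS value (hypothetical helper, not part of GlusterFS)."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    # Assumption: columns are separated by two or more spaces, while values
    # such as "Changelog Crawl" contain only single spaces.
    header = re.split(r"\s{2,}", lines[0])
    status_col = header.index("STATUS")
    counts = Counter()
    for line in lines[1:]:
        if set(line) == {"-"}:
            continue  # skip the dashed separator line
        fields = re.split(r"\s{2,}", line)
        counts[fields[status_col]] += 1
    return counts


counts = count_statuses(SAMPLE)
for status in ("Active", "Passive", "Faulty", "Initializing..."):
    print(status, counts[status])
```

For the output above this tallies two Active rows, one Faulty row, one
Initializing... row and no Passive rows, so a check that expects two Passive
rows cannot pass while the workers stay in that state.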

Can someone from the geo-rep team take a look at the logs from those runs to
determine what is going wrong and why the status is "Faulty" or
"Initializing...", as that seems to be the start of the test failure?

--- Additional comment from Worker Ant on 2018-01-23 21:11:52 EST ---

COMMIT: https://review.gluster.org/19301 committed in master by "Nigel Babu"
<nigelb at redhat.com> with a commit message- tests: Disable geo-rep tests

These tests are prone to issues at the moment that need further
debugging and fixing.

BUG: 1537602
Change-Id: Ic59ca620925c6f43948b8a751eaddb571b791969
Signed-off-by: Nigel Babu <nigelb at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1537602
[Bug 1537602] Georeplication tests intermittently fail
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

