[Bugs] [Bug 1217928] New: [georep]: Transition from xsync to changelog doesn't happen once the brick is brought online

bugzilla at redhat.com
Sun May 3 04:59:36 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1217928

            Bug ID: 1217928
           Summary: [georep]: Transition from xsync to changelog doesn't
                    happen once the brick is brought online
           Product: GlusterFS
           Version: 3.7.0
         Component: geo-replication
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: avishwan at redhat.com
                CC: aavati at redhat.com, avishwan at redhat.com,
                    bugs at gluster.org, csaba at redhat.com,
                    gluster-bugs at redhat.com, nlevinki at redhat.com,
                    rhinduja at redhat.com, storage-qa-internal at redhat.com,
                    vagarwal at redhat.com
        Depends On: 1201712, 1202649



+++ This bug was initially created as a clone of Bug #1202649 +++

+++ This bug was initially created as a clone of Bug #1201712 +++

Description of problem:
=======================

If a brick goes offline, geo-rep transitions from changelog to xsync because
changelogs cannot be captured. Once the brick is brought back online, xsync
continues to be active and geo-rep does not transition back to changelog:

[2015-03-13 19:20:52.923316] E [repce(agent):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 41,
in scan
    return Changes.cl_scan()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 45,
in cl_scan
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27,
in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 111] Connection refused
[2015-03-13 19:20:52.924300] E [repce(/rhs/brick1/b1):207:__call__]
RepceClient: call 28276:140684070041344:1426254652.92 (scan) failed on peer
with ChangelogException
[2015-03-13 19:20:52.924525] I [resource(/rhs/brick1/b1):1352:service_loop]
GLUSTER: Changelog crawl failed, fallback to xsync
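
The log above is consistent with a fallback that, once taken, is never
revisited. The sketch below is only an illustration of that failure mode, not
the syncdaemon code: `scan`, `run_crawls` and the `brick_states` input are
hypothetical names, and `ChangelogException` is a local stand-in for the
exception named in the traceback.

```python
import errno
import os


class ChangelogException(OSError):
    """Local stand-in for the exception named in the traceback."""


def scan(brick_online):
    # Hypothetical stand-in for the changelog scan call: it fails with
    # "Connection refused" while the brick is offline.
    if not brick_online:
        raise ChangelogException(errno.ECONNREFUSED,
                                 os.strerror(errno.ECONNREFUSED))
    return "changelog"


def run_crawls(brick_states):
    """Return the crawl mode used in each cycle, with a sticky xsync fallback."""
    modes = []
    fallen_back = False
    for online in brick_states:
        if fallen_back:
            # Assumed buggy behaviour: changelog is never retried,
            # even if the brick is online again.
            modes.append("xsync")
            continue
        try:
            modes.append(scan(online))
        except ChangelogException:
            fallen_back = True
            modes.append("xsync")
    return modes


# Brick is up, goes down for one cycle, then comes back.
print(run_crawls([True, False, True, True]))
```

In this model the mode stays at xsync for every cycle after the first failure,
which matches the reported symptom.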

Steps carried:
==============

1. Create a master volume (2x3) from 3 nodes N1, N2, N3, each providing 2 bricks.
2. Start the master volume
3. Create a slave volume (2x2) from 2 nodes S1, S2
4. Start the slave volume
5. Mount the master volume on the client
6. Create and start the georep session between master and slave
7. Copy a huge set of data from the client to the master volume
8. While the copy is in progress, bring bricks offline and online on nodes N1
and N2. Do not bring bricks offline on node N3, so that one brick in each x3
replica stays up throughout.
9. After some time, when all bricks are online, check the geo-rep status and logs

Actual results:
==============

georep status is shown as hybrid, and the logs show that it failed to
transition to changelog and fell back to xsync

--- Additional comment from Anand Avati on 2015-03-17 02:52:33 EDT ---

REVIEW: http://review.gluster.org/9758 ([WIP] geo-rep: Do not fail-back to
xsync if Changelog is failed) posted (#2) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Anand Avati on 2015-03-17 04:54:51 EDT ---

REVIEW: http://review.gluster.org/9758 (geo-rep: Do not fail-back to xsync if
Changelog is failed) posted (#3) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Anand Avati on 2015-03-17 04:56:44 EDT ---

REVIEW: http://review.gluster.org/9758 (geo-rep: Do not fail-back to xsync if
Changelog is failed) posted (#4) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Anand Avati on 2015-04-27 07:37:58 EDT ---

COMMIT: http://review.gluster.org/9758 committed in master by Vijay Bellur
(vbellur at redhat.com) 
------
commit 60f764631971de4357d2f72a8995f844949de8ca
Author: Aravinda VK <avishwan at redhat.com>
Date:   Tue Mar 17 12:18:30 2015 +0530

    geo-rep: Do not fail-back to xsync if Changelog is failed

    Unless change_detector is set to xsync, do not fallback to
    xsync, except during Initial Sync or Partial History.

    When a brick goes down, Changelog exception is raised due
    to which geo-rep fallback to xsync. Even after brick comes
    back geo-rep will not consume Changelog.

    BUG: 1202649
    Change-Id: I1f8ea26ac7735f6ee09b3b143ee3eb66bfc9fc37
    Signed-off-by: Aravinda VK <avishwan at redhat.com>
    Reviewed-on: http://review.gluster.org/9758
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Saravanakumar Arumugam <sarumuga at redhat.com>
    Reviewed-by: Kotresh HR <khiremat at redhat.com>
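
The rule stated in the commit message can be read as a small decision
function. This is a minimal sketch under stated assumptions, not the committed
change: `CrawlPhase` and `may_use_xsync` are hypothetical names, and only the
`change_detector` option and the two exceptions (Initial Sync, Partial
History) come from the commit message.

```python
from enum import Enum


class CrawlPhase(Enum):
    # Hypothetical labels for the situations the commit message mentions.
    INITIAL_SYNC = "initial_sync"
    PARTIAL_HISTORY = "partial_history"
    LIVE_CHANGELOG = "live_changelog"


def may_use_xsync(change_detector, phase):
    """Return True if crawling with xsync is allowed under the stated rule."""
    if change_detector == "xsync":
        # xsync was explicitly configured, so it is always allowed.
        return True
    # Otherwise xsync is allowed only during Initial Sync or Partial History.
    return phase in (CrawlPhase.INITIAL_SYNC, CrawlPhase.PARTIAL_HISTORY)


# A changelog failure during live crawling must not switch to xsync.
print(may_use_xsync("changelog", CrawlPhase.LIVE_CHANGELOG))
```

What the worker does instead of falling back (for example retrying or exiting)
is not stated in the commit message and is left out of the sketch.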


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1201712
[Bug 1201712] [georep]: Transition from xsync to changelog doesn't happen
once the brick is brought online
https://bugzilla.redhat.com/show_bug.cgi?id=1202649
[Bug 1202649] [georep]: Transition from xsync to changelog doesn't happen
once the brick is brought online