[Bugs] [Bug 1577862] New: [geo-rep]: Upgrade fails, session in FAULTY state
bugzilla at redhat.com
Mon May 14 09:59:31 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1577862
Bug ID: 1577862
Summary: [geo-rep]: Upgrade fails, session in FAULTY state
Product: GlusterFS
Version: 3.12
Component: geo-replication
Keywords: Regression
Severity: urgent
Assignee: bugs at gluster.org
Reporter: khiremat at redhat.com
CC: amukherj at redhat.com, bugs at gluster.org,
csaba at redhat.com, rallan at redhat.com,
rhinduja at redhat.com, rhs-bugs at redhat.com,
sankarshan at redhat.com, storage-qa-internal at redhat.com
Depends On: 1569490, 1575490
Blocks: 1474012, 1503137
+++ This bug was initially created as a clone of Bug #1575490 +++
Description of problem:
=======================
While upgrading from Gluster version 3.8 to 3.12, the geo-replication session
went FAULTY, with only one worker remaining ACTIVE.
[root at dhcp42-53 master]# gluster volume geo-replication master 10.70.42.164::slave status
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS    CRAWL STATUS     LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.53     master        /rhs/brick1/b1    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.53     master        /rhs/brick2/b4    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.138    master        /rhs/brick1/b3    root          10.70.42.164::slave    10.70.42.164    Active    History Crawl    N/A
10.70.42.138    master        /rhs/brick2/b6    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.160    master        /rhs/brick1/b2    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
10.70.42.160    master        /rhs/brick2/b5    root          10.70.42.164::slave    N/A             Faulty    N/A              N/A
Traceback in geo-rep logs:
--------------------------------
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 210, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 802, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1676, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 597, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1470, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1370, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1204, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1123, in process_change
    entry_stime_to_update[0])
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 200, in set_field
    return self._update(merger)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 161, in _update
    data = mergerfunc(data)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncdstatus.py", line 194, in merger
    if data[key] == value:
KeyError: 'last_synced_entry'
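The last frames of the traceback can be reproduced in isolation. The following
is a minimal sketch, not the gsyncdstatus.py code: merger_sketch and the sample
status contents are hypothetical, and it assumes the persisted status data
simply has no 'last_synced_entry' key.

```python
import json

# Hypothetical status file contents with no 'last_synced_entry' key
# (illustrative keys and values only, not the real status file layout).
persisted = json.dumps({"last_synced": 0, "crawl_status": "N/A"})


def merger_sketch(data, key, value):
    """Mimic the failing check from the traceback: `if data[key] == value`."""
    if data[key] == value:  # raises KeyError when `key` is absent
        return None  # value unchanged, nothing to write
    data[key] = value
    return json.dumps(data)


data = json.loads(persisted)
try:
    merger_sketch(data, "last_synced_entry", 1525672000)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print("missing key:", missing_key)
```

The direct subscript lookup is what fails here; any key that the loaded
dictionary does not already contain would fail in the same way.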
Version-Release number of selected component (if applicable):
=============================================================
How reproducible:
=================
1/1
Actual results:
===============
Session is FAULTY.
Expected results:
=================
Session should not be FAULTY.
--- Additional comment from Worker Ant on 2018-05-07 02:06:22 EDT ---
REVIEW: https://review.gluster.org/19969 (geo-rep: Fix upgrade issue) posted
(#1) for review on master by Kotresh HR
--- Additional comment from Worker Ant on 2018-05-07 06:17:41 EDT ---
COMMIT: https://review.gluster.org/19969 committed in master by "Aravinda VK"
<avishwan at redhat.com> with a commit message- geo-rep: Fix upgrade issue
Cause and Analysis:
The current version marks the last synced changelog
for entry operations, so that entry operations already
processed in a batch are not re-processed after a
crash/restart of geo-rep. This marker was not present
in previous versions.
The marker is maintained in a dictionary under the
key 'last_synced_entry', and the dictionary is
persisted in the status file. A status file written
by a previous version lacks this key, so after
upgrading to the current version, looking up the
marker failed with KeyError.
Solution:
Load the dictionary with the default keys first, which
cover all keys including the latest ones, and then
load the values from the status file on top of them,
rather than the other way around.
fixes: bz#1575490
Change-Id: Ic654e6f9a3c97f616761f1362f890352a2186fb4
Signed-off-by: Kotresh HR <khiremat at redhat.com>
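The load order described in the commit message can be sketched as follows. This
is a hedged illustration, not the committed patch: DEFAULT_STATUS, load_status,
and the sample values are hypothetical, and the real set of default keys is
defined in gsyncdstatus.py.

```python
import json

# Hypothetical defaults; assumed to include every key the current
# version expects, including the newer 'last_synced_entry'.
DEFAULT_STATUS = {
    "last_synced": 0,
    "last_synced_entry": 0,
    "crawl_status": "N/A",
}


def load_status(raw_json):
    """Start from the defaults, then overlay values read from the status file."""
    data = dict(DEFAULT_STATUS)  # copy so the defaults are never mutated
    if raw_json:
        data.update(json.loads(raw_json))  # persisted values win over defaults
    return data


# Status contents as an older version might have written them:
# no 'last_synced_entry' key (illustrative values only).
old_file = json.dumps({"last_synced": 1525671982, "crawl_status": "History Crawl"})

status = load_status(old_file)
print(status["last_synced_entry"])  # default is used, no KeyError
print(status["last_synced"])        # persisted value is kept
```

Because the defaults are applied first, keys introduced by a newer version are
always present after loading, while values already persisted by the older
version are kept.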
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1474012
[Bug 1474012] [geo-rep]: Incorrect last sync "0" during hystory crawl after
upgrade/stop-start
https://bugzilla.redhat.com/show_bug.cgi?id=1569490
[Bug 1569490] [geo-rep]: in-service upgrade fails, session in FAULTY state
https://bugzilla.redhat.com/show_bug.cgi?id=1575490
[Bug 1575490] [geo-rep]: Upgrade fails, session in FAULTY state