[Bugs] [Bug 1445213] Unable to take snapshot on a geo-replicated volume, even after stopping the session
bugzilla@redhat.com
Tue Apr 25 09:24:28 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1445213
Kotresh HR <khiremat@redhat.com> changed:

           What    |Removed             |Added
----------------------------------------------------------------------------
           Status  |NEW                 |ASSIGNED
           Assignee|bugs@gluster.org    |khiremat@redhat.com
--- Comment #1 from Kotresh HR <khiremat@redhat.com> ---
Description of problem:
========================
Had two 4-node clusters, one acting as master and the other as slave. Both
were part of RHGS-Console. Two geo-rep sessions had been created on the
3.7.9-12 build. Upgraded the RHGS bits to 3.8.4-12 by following the procedure
mentioned in the guide.
Tried to take a snapshot on the master volume, and it failed with: 'the
geo-rep session is running. Please stop before taking a snapshot.' Stopped the
geo-rep session and tried the snapshot again; it failed with the same error,
reporting a running geo-rep session even though the session was stopped.
Found a way to reproduce it consistently:
1. Have a geo-rep session in 'started' state between 'master' and 'slave'
volumes
2. Restart glusterd on one of the master nodes
3. Stop the session between 'master' and 'slave' volumes
4. Take a snapshot on 'master'
Expected result: Snapshot creation should succeed.
Actual result: Snapshot creation fails with the error 'found a running
geo-rep session'.
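The reproduction steps above can be sketched as a shell session. This is a
minimal sketch, not a verified script: it assumes a live gluster cluster with
an existing geo-rep session, and the volume name 'master' and slave endpoint
'slavehost::slave' are illustrative placeholders.

```shell
# 1. Have a geo-rep session in 'started' state between 'master' and 'slave'
gluster volume geo-replication master slavehost::slave start

# 2. Restart glusterd on one of the master nodes
systemctl restart glusterd

# 3. Stop the session between 'master' and 'slave'
gluster volume geo-replication master slavehost::slave stop
gluster volume geo-replication master slavehost::slave status   # shows Stopped

# 4. Take a snapshot on 'master'; with this bug it fails with
#    "geo-replication session is running for the volume master"
gluster snapshot create master_snap1 master no-timestamp
```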
Version-Release number of selected component (if applicable):
============================================================
mainline
How reproducible:
================
Seeing it on 2 of my geo-rep sessions.
Additional info:
=================
[root@dhcp47-26 ~]# gluster v geo-rep status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                                  SLAVE NODE                           STATUS     CRAWL STATUS       LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        dhcp35-100.lab.eng.blr.redhat.com    Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.26                         masterD       /bricks/brick0/masterD_2    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.101                         Active     Changelog Crawl    2017-01-24 11:21:10
10.70.47.26                         mm            /bricks/brick0/mm2          geo           ssh://geo@10.70.35.115::ss                             10.70.35.101                         Active     Changelog Crawl    2017-01-17 11:21:46
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.101                         Active     Changelog Crawl    2017-01-12 11:56:43
10.70.47.60                         masterD       /bricks/brick0/masterD_0    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.115                         Active     Changelog Crawl    2017-01-24 11:21:14
10.70.47.60                         mm            /bricks/brick0/mm0          geo           ssh://geo@10.70.35.115::ss                             10.70.35.115                         Active     Changelog Crawl    2017-01-17 11:21:33
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.115                         Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.27                         masterD       /bricks/brick0/masterD_3    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.100                         Passive    N/A                N/A
10.70.47.27                         mm            /bricks/brick0/mm3          geo           ssh://geo@10.70.35.115::ss                             10.70.35.100                         Passive    N/A                N/A
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.104                         Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.61                         masterD       /bricks/brick0/masterD_1    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.104                         Passive    N/A                N/A
10.70.47.61                         mm            /bricks/brick0/mm1          geo           ssh://geo@10.70.35.115::ss                             10.70.35.104                         Passive    N/A                N/A
[root@dhcp47-26 ~]# gluster v geo-rep masterB dhcp35-100.lab.eng.blr.redhat.com::slaveB status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                        SLAVE NODE                           STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    dhcp35-100.lab.eng.blr.redhat.com    Active    Changelog Crawl    2017-01-12 11:56:35
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.104                         Active    Changelog Crawl    2017-01-12 11:56:35
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.115                         Active    Changelog Crawl    2017-01-12 11:56:35
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.101                         Active    Changelog Crawl    2017-01-12 11:56:43
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster v geo-rep masterB dhcp35-100.lab.eng.blr.redhat.com::slaveB status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                        SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster snap create masterB_snap1
Invalid Syntax.
Usage: snapshot create <snapname> <volname> [no-timestamp] [description <description>] [force]
[root@dhcp47-26 ~]# gluster snap create masterB_snap1 masterB no-timestamp
snapshot create: failed: geo-replication session is running for the volume masterB. Session needs to be stopped before taking a snapshot.
Snapshot command failed
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss status

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE USER    SLAVE                   SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26    mm            /bricks/brick0/mm2    geo           geo@10.70.35.115::ss    10.70.35.101    Active     Changelog Crawl    2017-01-17 11:21:46
10.70.47.27    mm            /bricks/brick0/mm3    geo           geo@10.70.35.115::ss    10.70.35.100    Passive    N/A                N/A
10.70.47.60    mm            /bricks/brick0/mm0    geo           geo@10.70.35.115::ss    10.70.35.115    Active     Changelog Crawl    2017-01-17 11:21:33
10.70.47.61    mm            /bricks/brick0/mm1    geo           geo@10.70.35.115::ss    10.70.35.104    Passive    N/A                N/A
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss stop
Stopping geo-replication session between mm & geo@10.70.35.115::ss has been successful
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss status

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE USER    SLAVE                   SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26    mm            /bricks/brick0/mm2    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.61    mm            /bricks/brick0/mm1    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.27    mm            /bricks/brick0/mm3    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.60    mm            /bricks/brick0/mm0    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
[root@dhcp47-26 ~]#
[root@dhcp47-26 ~]# gluster snap create mm_snap mm
snapshot create: failed: geo-replication session is running for the volume mm. Session needs to be stopped before taking a snapshot.
Snapshot command failed
[root@dhcp47-26 ~]# vim /var/log/glusterfs/geo-replication/mm/ssh%3A%2F%2Fgeo%4010.70.35.115%3Agluster%3A%2F%2F127.0.0.1%3Ass.log
[root@dhcp47-26 ~]#
[root@dhcp47-60 ~]# gluster peer status
Number of Peers: 3

Hostname: dhcp47-27.lab.eng.blr.redhat.com
Uuid: 6eb0185c-cc76-4bd1-a691-2ecb6a652901
State: Peer in Cluster (Connected)

Hostname: 10.70.47.61
Uuid: 3f350e37-69aa-4fc3-b9af-70c4db688721
State: Peer in Cluster (Connected)

Hostname: 10.70.47.26
Uuid: 53883823-cb8e-4da1-b6ee-a53e0ef7cd9a
State: Peer in Cluster (Connected)
[root@dhcp47-60 ~]#
[root@dhcp47-60 ~]# rpm -qa | grep gluster
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
glusterfs-api-3.8.4-12.el7rhgs.x86_64
glusterfs-libs-3.8.4-12.el7rhgs.x86_64
python-gluster-3.8.4-12.el7rhgs.noarch
glusterfs-3.8.4-12.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-12.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-cli-3.8.4-12.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-12.el7rhgs.x86_64
glusterfs-server-3.8.4-12.el7rhgs.x86_64
gluster-nagios-addons-0.2.8-1.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-12.el7rhgs.x86_64
glusterfs-fuse-3.8.4-12.el7rhgs.x86_64
[root@dhcp47-60 ~]# gluster v list
gluster_shared_storage
masterA
masterB
masterD
mm
[root@dhcp47-60 ~]# gluster v info mm
Volume Name: mm
Type: Distributed-Replicate
Volume ID: 4c435eff-24de-4030-a8dc-769bbaf292a4
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.60:/bricks/brick0/mm0
Brick2: 10.70.47.61:/bricks/brick0/mm1
Brick3: 10.70.47.26:/bricks/brick0/mm2
Brick4: 10.70.47.27:/bricks/brick0/mm3
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
nfs.disable: off
transport.address-family: inet
cluster.enable-shared-storage: enable
[root@dhcp47-60 ~]#