[Bugs] [Bug 1443977] Unable to take snapshot on a geo-replicated volume, even after stopping the session

bugzilla at redhat.com bugzilla at redhat.com
Thu Apr 20 11:21:21 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1443977

Kotresh HR <khiremat at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|bugs at gluster.org            |khiremat at redhat.com



--- Comment #1 from Kotresh HR <khiremat at redhat.com> ---
Description of problem:
========================
Had two 4-node clusters, one acting as master and the other as slave. Both
were part of RHGS-Console. Two geo-rep sessions had been created on the
3.7.9-12 build. Upgraded the RHGS bits to 3.8.4-12 by following the procedure
mentioned in the guide.

Tried to take a snapshot on the master volume, and it failed with: 'the geo-rep
session is running. Please stop before taking a snapshot.' Stopped the geo-rep
session and tried to take the snapshot again. It failed with the same error,
reporting that it had found a running geo-rep session, even though the session
was stopped.

Found a way to reproduce it consistently:

1. Have a geo-rep session in 'started' state between the 'master' and 'slave'
   volumes.
2. Restart glusterd on one of the master nodes.
3. Stop the session between the 'master' and 'slave' volumes.
4. Take a snapshot on 'master'.

Expected result: Snapshot creation should succeed.
Actual result: Snapshot creation fails with the error - 'found a running
geo-rep session'
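
The reproduction steps above can be sketched as a small shell script. This is
only an illustration, not part of the original report: the MASTER, SLAVE and
SNAP values are placeholders, the step() helper is hypothetical, and restarting
glusterd through systemctl assumes a systemd-based node. By default the script
only prints the commands; it runs them only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the reproduction steps. MASTER, SLAVE and SNAP are placeholder
# names (assumptions), not values taken from the report.
MASTER="${MASTER:-master}"
SLAVE="${SLAVE:-slavehost::slave}"
SNAP="${SNAP:-master_snap1}"

# Print each command; execute it only when RUN=1 is set in the environment.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

reproduce() {
    # 1. Check that the session is in 'started' state.
    step gluster volume geo-replication "$MASTER" "$SLAVE" status
    # 2. Restart glusterd on this master node (assumes systemd).
    step systemctl restart glusterd
    # 3. Stop the geo-rep session.
    step gluster volume geo-replication "$MASTER" "$SLAVE" stop
    # 4. Attempt the snapshot; per the report this fails on affected builds.
    step gluster snapshot create "$SNAP" "$MASTER"
}

reproduce
```

Running it as RUN=1 MASTER=mm SLAVE=... sh repro.sh on one master node would
execute the four steps in order; without RUN=1 it is a dry run.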

Version-Release number of selected component (if applicable):
============================================================
mainline

How reproducible:
================
Seeing it on 2 of my geo-rep sessions.


Additional info:
=================
[root at dhcp47-26 ~]# gluster v geo-rep status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                                  SLAVE NODE                           STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        dhcp35-100.lab.eng.blr.redhat.com    Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.26                         masterD       /bricks/brick0/masterD_2    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.101                         Active     Changelog Crawl    2017-01-24 11:21:10
10.70.47.26                         mm            /bricks/brick0/mm2          geo           ssh://geo@10.70.35.115::ss                             10.70.35.101                         Active     Changelog Crawl    2017-01-17 11:21:46
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.101                         Active     Changelog Crawl    2017-01-12 11:56:43
10.70.47.60                         masterD       /bricks/brick0/masterD_0    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.115                         Active     Changelog Crawl    2017-01-24 11:21:14
10.70.47.60                         mm            /bricks/brick0/mm0          geo           ssh://geo@10.70.35.115::ss                             10.70.35.115                         Active     Changelog Crawl    2017-01-17 11:21:33
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.115                         Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.27                         masterD       /bricks/brick0/masterD_3    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.100                         Passive    N/A                N/A
10.70.47.27                         mm            /bricks/brick0/mm3          geo           ssh://geo@10.70.35.115::ss                             10.70.35.100                         Passive    N/A                N/A
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          ssh://dhcp35-100.lab.eng.blr.redhat.com::slaveB        10.70.35.104                         Active     Changelog Crawl    2017-01-12 11:56:35
10.70.47.61                         masterD       /bricks/brick0/masterD_1    us2           ssh://us2@dhcp35-100.lab.eng.blr.redhat.com::slaveD    10.70.35.104                         Passive    N/A                N/A
10.70.47.61                         mm            /bricks/brick0/mm1          geo           ssh://geo@10.70.35.115::ss                             10.70.35.104                         Passive    N/A                N/A
[root at dhcp47-26 ~]# gluster v geo-rep masterB dhcp35-100.lab.eng.blr.redhat.com::slaveB status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                        SLAVE NODE                           STATUS    CRAWL STATUS       LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    dhcp35-100.lab.eng.blr.redhat.com    Active    Changelog Crawl    2017-01-12 11:56:35
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.104                         Active    Changelog Crawl    2017-01-12 11:56:35
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.115                         Active    Changelog Crawl    2017-01-12 11:56:35
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    10.70.35.101                         Active    Changelog Crawl    2017-01-12 11:56:43
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster v geo-rep masterB dhcp35-100.lab.eng.blr.redhat.com::slaveB status

MASTER NODE                         MASTER VOL    MASTER BRICK                SLAVE USER    SLAVE                                        SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26                         masterB       /bricks/brick1/masterB_1    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
10.70.47.60                         masterB       /bricks/brick1/masterB_3    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
10.70.47.61                         masterB       /bricks/brick1/masterB_2    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
dhcp47-27.lab.eng.blr.redhat.com    masterB       /bricks/brick1/masterB_0    root          dhcp35-100.lab.eng.blr.redhat.com::slaveB    N/A           Stopped    N/A             N/A
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster snap create masterB_snap1 
Invalid Syntax.
Usage: snapshot create <snapname> <volname> [no-timestamp] [description <description>] [force]
[root at dhcp47-26 ~]# gluster snap create masterB_snap1 masterB no-timestamp
snapshot create: failed: geo-replication session is running for the volume masterB. Session needs to be stopped before taking a snapshot.
Snapshot command failed
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss status

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE USER    SLAVE                   SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26    mm            /bricks/brick0/mm2    geo           geo@10.70.35.115::ss    10.70.35.101    Active     Changelog Crawl    2017-01-17 11:21:46
10.70.47.27    mm            /bricks/brick0/mm3    geo           geo@10.70.35.115::ss    10.70.35.100    Passive    N/A                N/A
10.70.47.60    mm            /bricks/brick0/mm0    geo           geo@10.70.35.115::ss    10.70.35.115    Active     Changelog Crawl    2017-01-17 11:21:33
10.70.47.61    mm            /bricks/brick0/mm1    geo           geo@10.70.35.115::ss    10.70.35.104    Passive    N/A                N/A
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss stop
Stopping geo-replication session between mm & geo@10.70.35.115::ss has been successful
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster v geo-rep mm geo@10.70.35.115::ss status

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE USER    SLAVE                   SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.26    mm            /bricks/brick0/mm2    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.61    mm            /bricks/brick0/mm1    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.27    mm            /bricks/brick0/mm3    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
10.70.47.60    mm            /bricks/brick0/mm0    geo           geo@10.70.35.115::ss    N/A           Stopped    N/A             N/A
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# gluster snap create mm_snap mm
snapshot create: failed: geo-replication session is running for the volume mm. Session needs to be stopped before taking a snapshot.
Snapshot command failed
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# vim /var/log/glusterfs/geo-replication/mm/ssh%3A%2F%2Fgeo%4010.70.35.115%3Agluster%3A%2F%2F127.0.0.1%3Ass.log
[root at dhcp47-26 ~]# 
[root at dhcp47-26 ~]# 
[root at dhcp47-60 ~]# gluster peer status
Number of Peers: 3

Hostname: dhcp47-27.lab.eng.blr.redhat.com
Uuid: 6eb0185c-cc76-4bd1-a691-2ecb6a652901
State: Peer in Cluster (Connected)

Hostname: 10.70.47.61
Uuid: 3f350e37-69aa-4fc3-b9af-70c4db688721
State: Peer in Cluster (Connected)

Hostname: 10.70.47.26
Uuid: 53883823-cb8e-4da1-b6ee-a53e0ef7cd9a
State: Peer in Cluster (Connected)
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# rpm -qa | grep gluster
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
glusterfs-api-3.8.4-12.el7rhgs.x86_64
glusterfs-libs-3.8.4-12.el7rhgs.x86_64
python-gluster-3.8.4-12.el7rhgs.noarch
glusterfs-3.8.4-12.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-12.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-cli-3.8.4-12.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-12.el7rhgs.x86_64
glusterfs-server-3.8.4-12.el7rhgs.x86_64
gluster-nagios-addons-0.2.8-1.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-12.el7rhgs.x86_64
glusterfs-fuse-3.8.4-12.el7rhgs.x86_64
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# gluster v list
gluster_shared_storage
masterA
masterB
masterD
mm
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]# gluster v info mm

Volume Name: mm
Type: Distributed-Replicate
Volume ID: 4c435eff-24de-4030-a8dc-769bbaf292a4
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.60:/bricks/brick0/mm0
Brick2: 10.70.47.61:/bricks/brick0/mm1
Brick3: 10.70.47.26:/bricks/brick0/mm2
Brick4: 10.70.47.27:/bricks/brick0/mm3
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
nfs.disable: off
transport.address-family: inet
cluster.enable-shared-storage: enable
[root at dhcp47-60 ~]# 
[root at dhcp47-60 ~]#
