[Bugs] [Bug 1405909] New: Snapshot: After snapshot restore failure, snapshot goes into inconsistent state

bugzilla at redhat.com bugzilla at redhat.com
Mon Dec 19 06:49:11 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1405909

            Bug ID: 1405909
           Summary: Snapshot: After snapshot restore failure, snapshot
                    goes into inconsistent state
           Product: GlusterFS
           Version: 3.9
         Component: snapshot
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: asengupt at redhat.com
                CC: amukherj at redhat.com, ashah at redhat.com,
                    bugs at gluster.org, rcyriac at redhat.com,
                    rhinduja at redhat.com, rhs-bugs at redhat.com,
                    rjoseph at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1403672, 1404118



+++ This bug was initially created as a clone of Bug #1404118 +++

+++ This bug was initially created as a clone of Bug #1403672 +++

Description of problem:

With reference to bug #1403169: after a snapshot restore failure, the snapshot
goes into an inconsistent state.
The snapshot cannot be activated, because the activate command reports that the
snapshot is already activated, while the snapshot status command shows that
some brick processes are down.


Version-Release number of selected component (if applicable):

glusterfs-3.8.4-7.el7rhgs.x86_64


How reproducible:

100%

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Enable cluster.enable-shared-storage
3. Enable nfs-ganesha
4. Create a snapshot
5. Disable nfs-ganesha
6. Bring down the gluster shared storage volume
7. Restore the snapshot; this command will fail
8. Check the snapshot status, or try taking a clone of the snapshot (see the
   command sketch below)
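
For reference, a rough command-level sketch of the steps above. The volume
name "testvol", the snapshot name "snap1", and the host/brick paths are
placeholders, and the exact nfs-ganesha setup for a given environment may
involve additional configuration:

    # 1. Create and start a 2x2 distributed-replicate volume
    gluster volume create testvol replica 2 host1:/bricks/b1 host2:/bricks/b2 \
        host3:/bricks/b3 host4:/bricks/b4
    gluster volume start testvol

    # 2. Enable the shared storage volume
    gluster volume set all cluster.enable-shared-storage enable

    # 3. Enable nfs-ganesha
    gluster nfs-ganesha enable

    # 4. Create a snapshot
    gluster snapshot create snap1 testvol

    # 5-6. Disable nfs-ganesha and bring down the shared storage volume
    gluster nfs-ganesha disable
    gluster volume set all cluster.enable-shared-storage disable

    # 7. Attempt the restore; this is the command that fails
    gluster snapshot restore snap1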

Actual results:

The clone command fails.
The snapshot status command shows that some brick processes are down, and the
activate command reports that the snapshot is already activated.

Expected results:

The clone command should not fail.


Additional info:



[2016-12-12 06:51:25.513917] E [MSGID: 106122]
[glusterd-snapshot.c:2389:glusterd_snapshot_clone_prevalidate] 0-management:
Failed to pre validate
[2016-12-12 06:51:25.513948] E [MSGID: 106443]
[glusterd-snapshot.c:2405:glusterd_snapshot_clone_prevalidate] 0-management:
One or more bricks are not running. Please run snapshot status command to see
brick status.
Please start the stopped brick and then issue snapshot clone command 
[2016-12-12 06:51:25.513960] W [MSGID: 106443]
[glusterd-snapshot.c:8636:glusterd_snapshot_prevalidate] 0-management: Snapshot
clone pre-validation failed
[2016-12-12 06:51:25.513969] W [MSGID: 106122]
[glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot
Prevalidate Failed
[2016-12-12 06:51:25.513978] E [MSGID: 106122]
[glusterd-mgmt.c:916:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Snapshot on local node
[2016-12-12 06:51:25.513987] E [MSGID: 106122]
[glusterd-mgmt.c:2272:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre
Validation Failed
[2016-12-12 06:51:25.514003] E [MSGID: 106027]
[glusterd-snapshot.c:8113:glusterd_snapshot_clone_postvalidate] 0-management:
unable to find clone clone1 volinfo
[2016-12-12 06:51:25.514012] W [MSGID: 106444]
[glusterd-snapshot.c:9136:glusterd_snapshot_postvalidate] 0-management:
Snapshot create post-validation failed
[2016-12-12 06:51:25.514019] W [MSGID: 106121]
[glusterd-mgmt.c:373:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate
operation failed
[2016-12-12 06:51:25.514027] E [MSGID: 106121]
[glusterd-mgmt.c:1689:glusterd_mgmt_v3_post_validate] 0-management: Post
Validation failed for operation Snapshot on local node
[2016-12-12 06:51:25.514035] E [MSGID: 106122]
[glusterd-mgmt.c:2392:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post
Validation Failed

===========================================================

[2016-12-12 07:02:29.274196] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation
failed on 10.70.36.46. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274267] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation
failed on 10.70.36.71. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274294] E [MSGID: 106116]
[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation
failed on 10.70.44.7. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274328] E [MSGID: 106122]
[glusterd-mgmt.c:979:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed on peers
[2016-12-12 07:02:29.274390] E [MSGID: 106122]
[glusterd-mgmt.c:2272:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre
Validation Failed
===========================================================


[root at rhs-client46 glusterfs]# gluster snapshot status snap1

Snap Name : snap1
Snap UUID : 09114c3e-9ac3-42d7-b8a2-d1c65e0782b8

    Brick Path        :  
10.70.36.70:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick1/b1
    Volume Group      :   RHS_vg1
    Brick Running     :   No
    Brick PID         :   N/A
    Data Percentage   :   0.55
    LV Size           :   199.00g


    Brick Path        :  
10.70.36.71:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick2/b2
    Volume Group      :   RHS_vg1
    Brick Running     :   Yes
    Brick PID         :   11850
    Data Percentage   :   0.57
    LV Size           :   199.00g


    Brick Path        :  
10.70.36.46:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick3/b3
    Volume Group      :   RHS_vg1
    Brick Running     :   Yes
    Brick PID         :   28314
    Data Percentage   :   0.11
    LV Size           :   1.80t


    Brick Path        :  
10.70.44.7:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick4/b4
    Volume Group      :   RHS_vg1
    Brick Running     :   Yes
    Brick PID         :   24756
    Data Percentage   :   0.16
    LV Size           :   926.85g


    Brick Path        :  
10.70.36.70:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick5/b5
    Volume Group      :   RHS_vg2
    Brick Running     :   No
    Brick PID         :   N/A
    Data Percentage   :   0.55
    LV Size           :   199.00g


    Brick Path        :  
10.70.36.71:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick6/b6
    Volume Group      :   RHS_vg2
    Brick Running     :   Yes
    Brick PID         :   11870
    Data Percentage   :   0.57
    LV Size           :   199.00g

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-12-12
02:06:40 EST ---

This bug is automatically being proposed for the current release of Red Hat
Gluster Storage 3 under active development, by setting the release flag
'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-12-12
06:48:13 EST ---

Since this bug has been approved for the RHGS 3.2.0 release of Red Hat Gluster
Storage 3, through release flag 'rhgs-3.2.0+', and through the Internal
Whiteboard entry of '3.2.0', the Target Release is being automatically set to
'RHGS 3.2.0'

--- Additional comment from Rejy M Cyriac on 2016-12-12 08:13:52 EST ---

At the 'RHGS 3.2.0 - Pre-Devel-Freeze Bug Triage' meeting on 12 December, it
was decided that this BZ would be accepted for a fix in the RHGS 3.2.0 release.

--- Additional comment from Worker Ant on 2016-12-13 02:23:59 EST ---

REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to
reassign snap volume ids to bricks) posted (#1) for review on master by Avra
Sengupta (asengupt at redhat.com)

--- Additional comment from Worker Ant on 2016-12-14 04:34:54 EST ---

REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to
reassign snap volume ids to bricks) posted (#2) for review on master by Avra
Sengupta (asengupt at redhat.com)

--- Additional comment from Worker Ant on 2016-12-14 04:41:57 EST ---

REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to
reassign snap volume ids to bricks) posted (#3) for review on master by Avra
Sengupta (asengupt at redhat.com)

--- Additional comment from Worker Ant on 2016-12-15 05:18:47 EST ---

REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to
reassign snap volume ids to bricks) posted (#4) for review on master by Avra
Sengupta (asengupt at redhat.com)

--- Additional comment from Worker Ant on 2016-12-16 04:25:54 EST ---

REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to
reassign snap volume ids to bricks) posted (#5) for review on master by Avra
Sengupta (asengupt at redhat.com)

--- Additional comment from Worker Ant on 2016-12-17 02:07:34 EST ---

COMMIT: http://review.gluster.org/16116 committed in master by Rajesh Joseph
(rjoseph at redhat.com) 
------
commit d0528cf2408533b45383a796d419c49fa96e810b
Author: Avra Sengupta <asengupt at redhat.com>
Date:   Tue Dec 13 11:55:19 2016 +0530

    snapshot: Fix restore rollback to reassign snap volume ids to bricks

    Added further checks to ensure we do not go beyond prevalidate
    when trying to restore a snapshot which has a nfs-ganesha conf
    file, in a cluster where nfs-ganesha is not enabled.

    The error message for the particular scenario is:
    "Snapshot(<snapname>) has a nfs-ganesha export conf
    file. cluster.enable-shared-storage and nfs-ganesha
    should be enabled before restoring this snapshot."

    Change-Id: I1b87e9907e0a5e162f26ef1ca89fe76e8da8610f
    BUG: 1404118
    Signed-off-by: Avra Sengupta <asengupt at redhat.com>
    Reviewed-on: http://review.gluster.org/16116
    Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
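
With this fix, a restore attempted while shared storage and nfs-ganesha are
disabled is expected to be rejected at the prevalidate stage instead of
leaving the snapshot in an inconsistent state. A rough illustration, with the
snapshot name "snap1" substituted and the CLI output format approximated (the
error text itself is quoted from the commit message above):

    gluster snapshot restore snap1
    snapshot restore: failed: Snapshot(snap1) has a nfs-ganesha export conf
    file. cluster.enable-shared-storage and nfs-ganesha should be enabled
    before restoring this snapshot.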


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1403672
[Bug 1403672] Snapshot: After snapshot restore failure , snapshot goes into
inconsistent state
https://bugzilla.redhat.com/show_bug.cgi?id=1404118
[Bug 1404118] Snapshot: After snapshot restore failure , snapshot goes into
inconsistent state
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

