[Bugs] [Bug 1367270] New: [HC]: After bringing down and up of the bricks VM's are getting paused

Tue Aug 16 06:09:47 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1367270

            Bug ID: 1367270
           Summary: [HC]: After bringing down and up of the bricks  VM's
                    are getting paused
           Product: GlusterFS
           Version: 3.7.14
         Component: replicate
          Keywords: Triaged
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: kdhananj at redhat.com
                CC: bugs at gluster.org, mzywusko at redhat.com,
                    pkarampu at redhat.com, rhs-bugs at redhat.com,
                    rmekala at redhat.com, sabose at redhat.com,
                    sasundar at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1333406, 1363721

+++ This bug was initially created as a clone of Bug #1363721 +++

+++ This bug was initially created as a clone of Bug #1333406 +++

Description of problem:
=====================
After bricks are brought down and back up, VMs are getting paused

Version-Release number of selected component (if applicable):
=============
glusterfs-server-3.7.9-2.el7rhgs.x86_64

How reproducible:

Steps to Reproduce:
=====================
1. Create a 1x3 volume and host a few VMs on the gluster volumes
2. Log in to the VMs and run a script to populate data (using dd)
3. While I/O is in progress, bring down one of the bricks; after some time,
bring that brick back up and bring down another brick
4. After some time, bring the downed brick back up and bring down another
brick. During this brick down/up process, a few VMs were observed getting paused
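
For illustration only, the outage pattern in steps 3 and 4 can be written down
as an ordered schedule. This is a hypothetical Python sketch: the function name
and the brick labels are assumptions, and it only produces the order of
actions. It does not stop or start any gluster brick process.

```python
from itertools import cycle, islice


def cyclic_outage_schedule(bricks, rounds):
    """Return (action, brick) steps for a cyclic outage test.

    Each brick is brought back up right before the next one is taken down,
    so at most one brick is down at any time.
    """
    steps = []
    previous = None
    for brick in islice(cycle(bricks), rounds):
        if previous is not None:
            steps.append(("up", previous))
        steps.append(("down", brick))
        previous = brick
    if previous is not None:
        steps.append(("up", previous))
    return steps


if __name__ == "__main__":
    # Hypothetical labels standing in for the three bricks of one volume.
    for action, brick in cyclic_outage_schedule(["brick1", "brick2", "brick3"], 3):
        print(action, brick)
```

The wait between steps ("after some time") is left out on purpose, since the
report does not say how long it was.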

Actual results:
==================
Virtual machines are getting paused 

Expected results:
=================
VMs should not be paused

Additional info:
===================
[root@zod ~]# gluster vol info

Volume Name: data
Type: Replicate
Volume ID: 5021c1f8-0b2f-4b34-92ea-a087afe84ce3
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: server1:/rhgs/data/data-brick1
Brick2: server2:/rhgs/data/data-brick2
Brick3: server3:/rhgs/data/data-brick3
Options Reconfigured:
diagnostics.client-log-level: INFO
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
nfs.disable: on
cluster.shd-max-threads: 16

Volume Name: engine
Type: Replicate
Volume ID: 5e14889a-0ffc-415f-8fbd-259451972c46
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: server1:/rhgs/engine/engine-brick1
Brick2: server2:/rhgs/engine/engine-brick2
Brick3: server3:/rhgs/engine/engine-brick3
Options Reconfigured:
cluster.shd-max-threads: 16
nfs.disable: on
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on

Volume Name: vmstore
Type: Replicate
Volume ID: edd3e117-138e-437b-9e65-319084fecc4b
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: server1:/rhgs/vmstore/vmstore-brick1
Brick2: server2:/rhgs/vmstore/vmstore-brick2
Brick3: server3:/rhgs/vmstore/vmstore-brick3
Options Reconfigured:
cluster.shd-max-threads: 16
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
nfs.disable: on
[root@zod ~]#

--- Additional comment from Sahina Bose on 2016-05-19 05:42:59 EDT ---

This bug is related to a cyclic network outage test that causes a file to end
up in split-brain; it is not a very likely scenario.

--- Additional comment from Krutika Dhananjay on 2016-07-18 01:39:27 EDT ---

(In reply to RajeshReddy from comment #0)
> Expected results:
> =================
> VM's should not be paused 

I wonder whether it is possible at all to keep the VM from pausing in this
scenario. The best we can do is prevent the shard/VM image from going into
split-brain when bricks are brought offline and back online in cyclic order.
The VM(s) will _still_ pause (with EROFS?) at some point, but once the
affected file/shard is healed, I/O can be resumed from inside the VM without
manual intervention to fix a split-brain.

@Pranith: Are the above statements correct? Or is there a way to actually keep
the VM from pausing?

-Krutika

--- Additional comment from Pranith Kumar K on 2016-07-18 06:14:41 EDT ---

You are correct; we can't prevent VMs from getting paused. We only need to make
sure that split-brains don't happen. Please note that this case may leave the
VM image in a very bad state; all we can guarantee is that the file does not go
into split-brain.
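
A toy model may help show why the cyclic pattern is a split-brain risk. It is
a simplified sketch under assumptions that are not taken from this report:
every outage takes exactly one brick down, writes succeed on the remaining
bricks, each surviving brick then "blames" the down brick, and no self-heal
completes between outages. The function names are hypothetical and this is
not AFR's actual changelog logic.

```python
def simulate_outages(bricks, outages):
    """Return {brick: set of bricks it blames} after the given outages.

    Assumption: while `down` is offline, writes land on all other bricks,
    so each of them records pending changes against `down`.
    No heal is modelled between outages.
    """
    blames = {b: set() for b in bricks}
    for down in outages:
        for b in bricks:
            if b != down:
                blames[b].add(down)
    return blames


def heal_sources(blames):
    """Bricks that nobody blames; an empty list models split-brain."""
    accused = set().union(*blames.values())
    return [b for b in blames if b not in accused]


if __name__ == "__main__":
    bricks = ["b1", "b2", "b3"]
    # One outage: b2 and b3 remain usable heal sources.
    print(heal_sources(simulate_outages(bricks, ["b1"])))
    # Full cycle without heal in between: every brick is blamed.
    print(heal_sources(simulate_outages(bricks, ["b1", "b2", "b3"])))
```

In this model a single outage leaves two clean sources, while one full cycle
leaves none, which is the situation the patch under review aims to avoid.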

--- Additional comment from Vijay Bellur on 2016-08-03 09:06:18 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought on and off in cyclic order) posted (#1) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-03 09:07:12 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought off and on in cyclic order) posted (#2) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-04 07:46:41 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought off and on in cyclic order) posted (#3) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-04 22:33:30 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought off and on in cyclic order) posted (#4) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-09 03:30:40 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought off and on in cyclic order) posted (#5) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-09 04:22:24 EDT ---

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when
bricks are brought off and on in cyclic order) posted (#6) for review on master
by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-11 05:42:06 EDT ---

REVIEW: http://review.gluster.org/15145 (cluster/afr: Bug fixes in txn
codepath) posted (#1) for review on master by Krutika Dhananjay
(kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-11 21:23:15 EDT ---

REVIEW: http://review.gluster.org/15145 (cluster/afr: Bug fixes in txn
codepath) posted (#2) for review on master by Krutika Dhananjay
(kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-08-15 06:40:58 EDT ---

COMMIT: http://review.gluster.org/15145 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit 79b9ad3dfa146ef29ac99bf87d1c31f5a6fe1fef
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date:   Fri Aug 5 12:18:05 2016 +0530

    cluster/afr: Bug fixes in txn codepath

    AFR sets transaction.pre_op[] array even before actually doing the
    pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
    but also the failed_subvols[] information before setting the pre_op_done[]
    flag. This patch fixes that.

    Change-Id: I78ccd39106bd4959441821355a82572659e3affb
    BUG: 1363721
    Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-on: http://review.gluster.org/15145
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Ravishankar N <ravishankar at redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
    Reviewed-by: Anuradha Talur <atalur at redhat.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
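
The condition described in the commit message above can be sketched as
follows. This is a hedged illustration in Python, not the patch itself: the
real change is in AFR's C code, and only the names pre_op, failed_subvols and
pre_op_done come from the commit message. The function name and the list
representation are assumptions.

```python
def compute_pre_op_done(pre_op, failed_subvols):
    """Per-subvolume flag: was the pre-op actually done on disk?

    Assumption based on the commit message: pre_op[i] is set before the
    on-disk pre-op is attempted, so it only counts as done when the
    subvolume is not also marked in failed_subvols[i].
    """
    return [
        bool(attempted) and not bool(failed)
        for attempted, failed in zip(pre_op, failed_subvols)
    ]


if __name__ == "__main__":
    # Pre-op was requested on all three subvolumes but failed on the second.
    print(compute_pre_op_done([1, 1, 1], [0, 1, 0]))
```

Looking at pre_op alone would mark all three subvolumes as done in this
example, which is the kind of mismatch the commit message describes.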

Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1333406
[Bug 1333406] [HC]: After bringing down and up of the bricks  VM's are
getting paused
https://bugzilla.redhat.com/show_bug.cgi?id=1363721
[Bug 1363721] [HC]: After bringing down and up of the bricks  VM's are
getting paused