[Bugs] [Bug 1392844] New: Hosted Engine VM paused post replace-brick operation

bugzilla at redhat.com bugzilla at redhat.com
Tue Nov 8 11:10:05 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1392844

            Bug ID: 1392844
           Summary: Hosted Engine VM paused post replace-brick operation
           Product: GlusterFS
           Version: 3.9
         Component: sharding
          Keywords: Triaged
          Severity: high
          Assignee: kdhananj at redhat.com
          Reporter: kdhananj at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, sasundar at redhat.com,
                    storage-qa-internal at redhat.com
        Depends On: 1370350, 1392445



+++ This bug was initially created as a clone of Bug #1392445 +++

+++ This bug was initially created as a clone of Bug #1370350 +++

Description of problem:
-----------------------
After replacing the defunct brick of a replica 3 sharded volume, the hosted
engine VM, whose image resides on that volume, went into the paused state.

The fuse mount logs showed EIO.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHGS 3.1.3
RHV 4.0.2

How reproducible:
-----------------
1/1

Steps to Reproduce:
-------------------
1. Create a replica 3 sharded volume optimized for VM store
2. Create a hosted engine VM with this volume as a 'data domain'
3. After the hosted engine is up and operational, kill one of the bricks of the
volume
4. Add a new node to the cluster
5. Replace the old brick with a new brick on the newly added node (see the
command sketch below)
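
For reference, a minimal CLI sketch of the steps above, assuming the volume
name, brick paths and hostnames from the setup described further down (the
exact commands used in this run were not recorded):

  # 1. Create a replica 3 volume, apply the usual VM-store tuning
  #    (the 'virt' option group plus sharding), and start it.
  gluster volume create enginevol replica 3 \
      10.70.36.73:/rhgs/engine/enginebrick \
      10.70.36.76:/rhgs/engine/enginebrick \
      10.70.36.77:/rhgs/engine/enginebrick
  gluster volume set enginevol group virt
  gluster volume set enginevol features.shard on
  gluster volume set enginevol storage.owner-uid 36
  gluster volume set enginevol storage.owner-gid 36
  gluster volume start enginevol

  # 3. Kill one brick process (its PID is listed by 'gluster volume status').
  gluster volume status enginevol
  kill <pid-of-brick-on-10.70.36.77>

  # 4./5. Probe the new node and replace the dead brick with one hosted on it.
  gluster peer probe yarrow.lab.eng.blr.redhat.com
  gluster volume replace-brick enginevol \
      10.70.36.77:/rhgs/engine/enginebrick \
      yarrow.lab.eng.blr.redhat.com:/rhgs/engine/enginebrick \
      commit force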

Actual results:
---------------
The hosted engine VM went into the paused state; the fuse mount showed EIO
with the error 'Lookup on shard 3 failed.'

Expected results:
-----------------
There should not be any error messages after performing the replace-brick
operation.

--- Additional comment from SATHEESARAN on 2016-08-25 22:38:05 EDT ---


1. Setup details
----------------
   4 nodes in the cluster, all running RHGS 3.1.3 ( glusterfs-3.7.9.10.el7rhgs )
   Hosts are:
   cambridge.lab.eng.blr.redhat.com ( 10.70.37.73 )
   zod.lab.eng.blr.redhat.com ( 10.70.37.76 )
   tettnang.lab.eng.blr.redhat.com ( 10.70.37.77 ) <-- This host has been
   defunct for more than 12 hours

   ** Here is the newly added host **:
   yarrow.lab.eng.blr.redhat.com ( 10.70.37.78 )


2. Peer status
--------------

[root@cambridge ~]# gluster pe s
Number of Peers: 3

Hostname: tettnang-nic2.lab.eng.blr.redhat.com
Uuid: d9cd9f98-6dc3-436f-b6ca-aab0b059fc41
State: Peer in Cluster (Disconnected)
Other names:
10.70.36.77

Hostname: 10.70.36.76
Uuid: 7bd8c3ea-4b88-431f-b62e-a330b9ae6b9a
State: Peer in Cluster (Connected)

Hostname: yarrow.lab.eng.blr.redhat.com
Uuid: 3d98430d-50f3-4aa8-9389-ac133a58c9b3
State: Peer in Cluster (Connected)

----------

[root@cambridge ~]# gluster pool list
UUID                                    Hostname                                State
d9cd9f98-6dc3-436f-b6ca-aab0b059fc41    tettnang-nic2.lab.eng.blr.redhat.com    Disconnected
7bd8c3ea-4b88-431f-b62e-a330b9ae6b9a    10.70.36.76                             Connected
3d98430d-50f3-4aa8-9389-ac133a58c9b3    yarrow.lab.eng.blr.redhat.com           Connected
3c3ad2a9-e8f8-496b-bdde-1c0777552dee    localhost                               Connected

3. Gluster volume details
-------------------------
[root@cambridge ~]# gluster volume info enginevol

Volume Name: enginevol
Type: Replicate
Volume ID: 46a261df-f527-479c-9776-4bdb21fe19b1
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.73:/rhgs/engine/enginebrick
Brick2: 10.70.36.76:/rhgs/engine/enginebrick
Brick3: yarrow.lab.eng.blr.redhat.com:/rhgs/engine/enginebrick <-- new brick
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
performance.strict-o-direct: on
network.ping-timeout: 30
user.cifs: off
nfs.disable: on

Note: the old (replaced) brick was 10.70.36.77:/rhgs/engine/enginebrick

4. volume status
----------------
[root@cambridge ~]# gluster volume status enginevol
Status of volume: enginevol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.36.73:/rhgs/engine/enginebrick  49152     0          Y       18194
Brick 10.70.36.76:/rhgs/engine/enginebrick  49152     0          Y       15595
Brick yarrow.lab.eng.blr.redhat.com:/rhgs/e
ngine/enginebrick                           49152     0          Y       19457
Self-heal Daemon on localhost               N/A       N/A        Y       12863
Self-heal Daemon on yarrow.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       19462
Self-heal Daemon on 10.70.36.76             N/A       N/A        Y       32641

Task Status of Volume enginevol
------------------------------------------------------------------------------
There are no active volume tasks

--- Additional comment from SATHEESARAN on 2016-08-25 22:39:22 EDT ---

self-heal and split-brain info

[root@cambridge ~]# gluster volume heal enginevol info
Brick 10.70.36.73:/rhgs/engine/enginebrick
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.81 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.1 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.6 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.24 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.3 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.72 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.84 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.18 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.7 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.8 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.36 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.4 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.9 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.57 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.39 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.12 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.90 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.42 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.10 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.96 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.21 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.87 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.45 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.99 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.48 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.60 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.15 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.27 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.54 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.69 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.78 
Status: Connected
Number of entries: 31

Brick 10.70.36.76:/rhgs/engine/enginebrick
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.81 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.1 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.6 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.24 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.3 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.72 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.84 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.18 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.7 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.8 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.36 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.4 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.9 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.57 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.39 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.12 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.90 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.42 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.10 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.96 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.21 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.87 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.45 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.99 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.48 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.60 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.15 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.27 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.54 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.69 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.78 
Status: Connected
Number of entries: 31

Brick yarrow.lab.eng.blr.redhat.com:/rhgs/engine/enginebrick
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.81 
/.shard/853758b3-f79d-4114-86b4-c9e4fe1f97db.3 
Status: Connected
Number of entries: 2

[root@cambridge ~]# gluster volume heal enginevol info split-brain
Brick 10.70.36.73:/rhgs/engine/enginebrick
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.36.76:/rhgs/engine/enginebrick
Status: Connected
Number of entries in split-brain: 0

Brick yarrow.lab.eng.blr.redhat.com:/rhgs/engine/enginebrick
Status: Connected
Number of entries in split-brain: 0
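
For completeness, the pending heal on the replaced brick can be triggered and
tracked with the usual commands (a hedged example; the heal clearly had not yet
completed when the output above was captured):

  gluster volume heal enginevol              # kick off an index self-heal
  gluster volume heal enginevol info         # entries still pending heal
  gluster volume heal enginevol info split-brain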

--- Additional comment from SATHEESARAN on 2016-08-25 22:45:40 EDT ---

[2016-08-25 08:16:44.373211] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:16:44.373283] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:16:44.373343] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:16:44.374685] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 2-enginevol-shard: Lookup on
shard 3 failed. Base file gfid = 853758b3-f79d-4114-86b4-c9e4fe1f97db
[Input/output error]
[2016-08-25 08:16:44.374734] W [fuse-bridge.c:2224:fuse_readv_cbk]
0-glusterfs-fuse: 3986689: READ => -1 (Input/output error)
[2016-08-25 08:37:22.183269] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:37:22.183355] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:37:22.183414] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 2-enginevol-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2016-08-25 08:37:22.184586] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 2-enginevol-shard: Lookup on
shard 3 failed. Base file gfid = 853758b3-f79d-4114-86b4-c9e4fe1f97db
[Input/output error]
[2016-08-25 08:37:22.184625] W [fuse-bridge.c:2224:fuse_readv_cbk]
0-glusterfs-fuse: 4023987: READ => -1 (Input/output error)

--- Additional comment from SATHEESARAN on 2016-08-25 22:46:06 EDT ---

Comment 3 is a snippet from the fuse mount log.

--- Additional comment from SATHEESARAN on 2016-08-30 23:05:51 EDT ---

I couldn't hit the issue a second time when Krutika asked for debug-enabled
logs.

Later, Krutika also confirmed that a community user was seeing this problem
too.

--- Additional comment from Krutika Dhananjay on 2016-09-13 09:58:51 EDT ---

(In reply to SATHEESARAN from comment #5)
> I couldn't hit the issue a second time when Krutika asked for debug-enabled
> logs.
> 
> Later, Krutika also confirmed that a community user was seeing this problem
> too.

I stand corrected. I figured out later that that case had granular-entry-heal
enabled and involved the same brick being wiped and healed. The issue there was
a combination of dated documentation and the lack of reset-brick functionality.

I did see logs of the kind you have pasted in comment #3, of lookups failing
with EINVAL, but no input/output errors.
-Krutika

--- Additional comment from Krutika Dhananjay on 2016-09-13 10:06:52 EDT ---

Sas,

Do you have the logs from this run, of the bricks, shds and the clients?

-Krutika

--- Additional comment from SATHEESARAN on 2016-10-14 06:55:56 EDT ---

(In reply to Krutika Dhananjay from comment #7)
> Sas,
> 
> Do you have the logs from this run, of the bricks, shds and the clients?
> 
> -Krutika

Hi Krutika,

I missed saving the logs; I will try to reproduce this issue.

In the meantime, can you infer anything about the problem from the error
messages in comment 3?

--- Additional comment from Krutika Dhananjay on 2016-10-24 03:08:04 EDT ---

Not quite, Sas. The EINVAL seems to be getting propagated by the brick(s),
since protocol/client, which is the lowest layer in the client stack, is
receiving EINVAL over the network.

-Krutika
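
If this is reproduced again, the matching server-side failures should show up
in the brick log of the replaced brick; a hedged example (the log file name is
derived from the brick path and may differ on these hosts):

  grep -E 'server_resolve|server_lookup_cbk' \
      /var/log/glusterfs/bricks/rhgs-engine-enginebrick.log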

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-11-07
05:23:09 EST ---

This bug is automatically being provided 'pm_ack+' for the release flag
'rhgs-3.2.0', the current release of Red Hat Gluster Storage 3 under active
development, having been appropriately marked for the release and having been
provided ACKs from Development and QE.

If the 'blocker' flag had been proposed/set on this BZ, it has now been unset,
since the 'blocker' flag is not valid for the current phase of RHGS 3.2.0
development.

--- Additional comment from Worker Ant on 2016-11-07 09:17:21 EST ---

REVIEW: http://review.gluster.org/15788 (features/shard: Fill loc.pargfid too
for named lookups on individual shards) posted (#1) for review on master by
Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Worker Ant on 2016-11-08 03:45:10 EST ---

REVIEW: http://review.gluster.org/15788 (features/shard: Fill loc.pargfid too
for named lookups on individual shards) posted (#2) for review on master by
Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Worker Ant on 2016-11-08 06:05:31 EST ---

COMMIT: http://review.gluster.org/15788 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit e9023083b3a165390a8cc8fc77253f354744e81a
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date:   Mon Nov 7 16:06:56 2016 +0530

    features/shard: Fill loc.pargfid too for named lookups on individual shards

    On a sharded volume when a brick is replaced while IO is going on, named
    lookup on individual shards as part of read/write was failing with
    ENOENT on the replaced brick, and as a result AFR initiated name heal in
    lookup callback. But since pargfid was empty (which is what this patch
    attempts to fix), the resolution of the shards by protocol/server used
    to fail and the following pattern of logs was seen:

    Brick-logs:

    [2016-11-08 07:41:49.387127] W [MSGID: 115009]
    [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
    for (null) (LOOKUP)
    [2016-11-08 07:41:49.387157] E [MSGID: 115050]
    [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
   
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
    ==> (Invalid argument) [Invalid argument]

    Client-logs:
    [2016-11-08 07:41:27.497687] W [MSGID: 114031]
    [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
    operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
    [Invalid argument]
    [2016-11-08 07:41:27.497755] W [MSGID: 114031]
    [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
    operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
    [Invalid argument]
    [2016-11-08 07:41:27.498500] W [MSGID: 114031]
    [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
    operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
    [Invalid argument]
    [2016-11-08 07:41:27.499680] E [MSGID: 133010]

    Also, this patch makes AFR by itself choose a non-NULL pargfid even if
    its ancestors fail to initialize all pargfid placeholders.

    Change-Id: I5f85b303ede135baaf92e87ec8e09941f5ded6c1
    BUG: 1392445
    Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-on: http://review.gluster.org/15788
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Ravishankar N <ravishankar at redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.org>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1370350
[Bug 1370350] Hosted Engine VM paused post replace-brick operation
https://bugzilla.redhat.com/show_bug.cgi?id=1392445
[Bug 1392445] Hosted Engine VM paused post replace-brick operation
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.

