[Bugs] [Bug 1398888] New: self-heal info command hangs after triggering self-heal

bugzilla at redhat.com
Sun Nov 27 02:20:55 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1398888

            Bug ID: 1398888
           Summary: self-heal info command hangs after triggering
                    self-heal
           Product: GlusterFS
           Version: 3.9
         Component: replicate
          Keywords: Triaged
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: kdhananj at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    rhs-bugs at redhat.com, sasundar at redhat.com,
                    storage-qa-internal at redhat.com
        Depends On: 1396166, 1398566



+++ This bug was initially created as a clone of Bug #1398566 +++

+++ This bug was initially created as a clone of Bug #1396166 +++

Description of problem:
------------------------
After issuing 'gluster volume heal', 'gluster volume heal info' hangs when
compound-fops is enabled on the replica 3 volume

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHEL 7.3
RHGS 3.2.0 interim build (glusterfs-3.8.4-5.el7rhgs)

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Create a replica 3 volume
2. Optimize the volume for the VM store use case
3. Enable compound-fops on the volume
4. Create a VM, and install OS
5. While OS installation is in progress, kill brick1 on server1
6. After the OS installation is complete, bring the brick back up
7. Trigger self-heal on the volume
8. Get the self-heal info

Actual results:
---------------
self-heal info command is hung

Expected results:
-----------------
'self-heal info' should provide the correct information about un-synced entries

Additional info:
----------------
When compound-fops is disabled on the volume, this issue is not seen

--- Additional comment from SATHEESARAN on 2016-11-17 11:19:40 EST ---

1. Cluster info
---------------
There are 3 hosts in the cluster. All of them are VMs installed with the RHGS
interim build on top of RHEL 7.3

[root at Server1 ~]# gluster peer status
Number of Peers: 2

Hostname: server2
Uuid: 209154aa-836f-47c1-8446-a5c5d15eb566
State: Peer in Cluster (Connected)

Hostname: server3
Uuid: e88a05e5-7772-4b31-9b7f-a1de1509adb7
State: Peer in Cluster (Connected)

2. gluster volume info
-----------------------
[root at server1 ~]# gluster volume info

Volume Name: volume1
Type: Replicate
Volume ID: aa01f3d2-4ba2-4747-893e-84058788f1dd
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: server1:/gluster/brick1/b1
Brick2: server2:/gluster/brick1/b1
Brick3: server3:/gluster/brick1/b1
Options Reconfigured:
cluster.granular-entry-heal: on
user.cifs: off
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 107
storage.owner-uid: 107
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on


--- Additional comment from Krutika Dhananjay on 2016-11-22 10:22:39 EST ---

You do have the brick statedumps too, don't you? Could you please attach those
as well?

-Krutika

--- Additional comment from SATHEESARAN on 2016-11-23 02:04:43 EST ---

(In reply to Krutika Dhananjay from comment #7)
> You do have the brick statedumps too, don't you? Could you please attach
> those as well?
> 
> -Krutika

Hi Krutika,

I have mistakenly re-provisioned my third server in the cluster to simulate a
failed-node scenario.

But I have brick statedumps from server1 and server2. I will attach them.

--- Additional comment from Worker Ant on 2016-11-25 05:36:40 EST ---

REVIEW: http://review.gluster.org/15929 (cluster/afr: Fix deadlock due to
compound fops) posted (#1) for review on master by Krutika Dhananjay
(kdhananj at redhat.com)

--- Additional comment from Worker Ant on 2016-11-26 06:10:35 EST ---

COMMIT: http://review.gluster.org/15929 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit 2fe8ba52108e94268bc816ba79074a96c4538271
Author: Krutika Dhananjay <kdhananj at redhat.com>
Date:   Fri Nov 25 15:54:30 2016 +0530

    cluster/afr: Fix deadlock due to compound fops

    When an AFR data transaction is eligible for using
    eager-lock, this is recorded in
    local->transaction.eager_lock_on. However, if the non-blocking
    inodelk attempt (which is a full lock) fails, AFR falls back
    to blocking locks, which are range locks. At this point, the
    per-brick local->transaction.eager_lock[] is reset, but
    local->transaction.eager_lock_on is still true.
    AFR decides to compound the post-op and unlock only after
    confirming that the transaction did not use eager lock (well,
    except for a small bug where local->transaction.locks_acquired[]
    is not considered).

    But within afr_post_op_unlock_do(), AFR again incorrectly sets
    the lock range to a full lock based on the value of
    local->transaction.eager_lock_on. This is a bug and can lead to
    a deadlock: the locks acquired were range locks, but a full
    unlock is sent, so the unlock fails and every other lock request
    (be it from the SHD, other clients, or glfsheal) gets blocked
    forever, which the user perceives as a hang.

    FIX:
    Unconditionally rely on the range locks stored in the inodelk
    object for unlocking when using the compounded post-op + unlock.

    Big thanks to Pranith for helping with the debugging.

    Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67
    BUG: 1398566
    Signed-off-by: Krutika Dhananjay <kdhananj at redhat.com>
    Reviewed-on: http://review.gluster.org/15929
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
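
The decision path the patch corrects is easier to see in isolation. Below is a
minimal, standalone C sketch of the logic described in the commit message; it
is not the actual cluster/afr source, and apart from the names quoted above
(local->transaction.eager_lock_on, afr_post_op_unlock_do()) every type, field
and function in it is illustrative.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative stand-ins for the real AFR structures. */
    typedef struct {
        unsigned long start;  /* offset of the acquired inodelk range     */
        unsigned long len;    /* length of that range; 0 means full lock  */
    } lock_range_t;

    typedef struct {
        bool         eager_lock_on;  /* txn was *eligible* for eager lock */
        lock_range_t acquired;       /* range actually locked after the
                                        fallback to blocking range locks  */
    } txn_t;

    /* Buggy choice (pre-fix): derive the unlock range from eager-lock
     * eligibility. After the fallback, eager_lock_on is still true, so a
     * full-range unlock is sent even though only a range lock is held;
     * the unlock fails and the lock is never released. */
    static lock_range_t unlock_range_buggy(const txn_t *txn)
    {
        lock_range_t full = {0, 0};
        return txn->eager_lock_on ? full : txn->acquired;
    }

    /* Fixed choice: always unlock the range recorded for the inodelk,
     * i.e. exactly what was acquired. */
    static lock_range_t unlock_range_fixed(const txn_t *txn)
    {
        return txn->acquired;
    }

    int main(void)
    {
        /* A transaction that was eligible for eager-lock but fell back
         * to a blocking range lock (numbers are arbitrary). */
        txn_t txn = {
            .eager_lock_on = true,
            .acquired      = {.start = 4096, .len = 131072},
        };

        lock_range_t bad  = unlock_range_buggy(&txn);
        lock_range_t good = unlock_range_fixed(&txn);

        printf("buggy unlock: start=%lu len=%lu (full unlock, lock never released)\n",
               bad.start, bad.len);
        printf("fixed unlock: start=%lu len=%lu (matches the lock actually held)\n",
               good.start, good.len);
        return 0;
    }

The fix corresponds to the unlock_range_fixed() choice: the unlock range is
taken from what was actually acquired, regardless of eager-lock eligibility.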


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1396166
[Bug 1396166] self-heal info command hangs after triggering self-heal
https://bugzilla.redhat.com/show_bug.cgi?id=1398566
[Bug 1398566] self-heal info command hangs after triggering self-heal