[Bugs] [Bug 1245934] New: [RHEV-RHGS] App VMs paused due to IO error caused by split-brain, after initiating remove-brick operation

bugzilla at redhat.com
Thu Jul 23 07:13:26 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1245934

            Bug ID: 1245934
           Summary: [RHEV-RHGS] App VMs paused due to IO error caused by
                    split-brain, after initiating remove-brick operation
           Product: GlusterFS
           Version: 3.7.3
         Component: distribute
          Keywords: Triaged
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: ravishankar at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    nbalacha at redhat.com, ravishankar at redhat.com,
                    rcyriac at redhat.com, rgowdapp at redhat.com,
                    sasundar at redhat.com, ssampat at redhat.com
        Depends On: 1243542, 1244165



Description of problem:
------------------------
On an 8x2 distributed-replicate volume, a remove-brick operation with data
migration was initiated. After a few minutes, all the application VMs with
their disk images on that gluster volume went into the paused state.

Split-brain error messages were noticed in the fuse mount log.

Version
--------
RHEL 6.7 as hypervisor
RHGS 3.1 based on RHEL 7.1

How reproducible:
-----------------
Tried only once

Steps to Reproduce:
-------------------
1. Create a 2x2 distributed-replicate volume
2. Use this gluster volume as the 'Data Domain' for RHEV
3. Create a few App VMs and install an OS on them
4. Remove the bricks on which the App VMs' disk images reside (a CLI sketch
   follows the 'Expected results' section below)

Actual results:
----------------
App VMs went into the **paused** state

Expected results:
-----------------
App VMs should be healthy
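
For reference, the reproduction steps above correspond roughly to the
following gluster CLI sequence. This is only a sketch: the host names, brick
paths and volume name are placeholders, and steps 2-3 (attaching the volume
as a RHEV Data Domain and creating the App VMs) are performed from the RHEV
side rather than the gluster CLI.

    # 1. Create a 2x2 distributed-replicate volume (placeholder hosts/bricks)
    gluster volume create vol1 replica 2 \
        server1:/rhgs/brick1 server2:/rhgs/brick1 \
        server3:/rhgs/brick2 server4:/rhgs/brick2
    gluster volume start vol1

    # 2-3. Configure vol1 as the RHEV Data Domain and create the App VMs (RHEV side)

    # 4. Remove the replica pair holding the VM disk images; data migration starts
    gluster volume remove-brick vol1 \
        server1:/rhgs/brick1 server2:/rhgs/brick1 start
    gluster volume remove-brick vol1 \
        server1:/rhgs/brick1 server2:/rhgs/brick1 status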

--- Additional comment from SATHEESARAN on 2015-07-15 14:17:43 EDT ---

The following error messages are seen in the fuse mount logs:


[2015-07-15 17:49:42.709088] E [MSGID: 114031]
[client-rpc-fops.c:1673:client3_3_finodelk_cbk] 6-vol1-client-0: remote
operation failed [Transport endpoint is not connected]

[2015-07-15 17:49:42.710849] W [MSGID: 114031]
[client-rpc-fops.c:1028:client3_3_fsync_cbk] 6-vol1-client-0: remote operation
failed [Transport endpoint is not connected]
[2015-07-15 17:49:42.710874] W [MSGID: 108035]
[afr-transaction.c:1614:afr_changelog_fsync_cbk] 6-vol1-replicate-0:
fsync(b7d21675-6fd8-472a-b7d9-71d7436c614d) failed on subvolume vol1-client-0.
Transaction was WRITE [Transport endpoint is not connected]
[2015-07-15 17:49:42.710897] W [MSGID: 108001]
[afr-transaction.c:686:afr_handle_quorum] 6-vol1-replicate-0:
b7d21675-6fd8-472a-b7d9-71d7436c614d: Failing WRITE as quorum is not met

[2015-07-15 18:12:15.544061] E [MSGID: 108008]
[afr-transaction.c:1984:afr_transaction] 12-vol1-replicate-5: Failing WRITE on
gfid b7d21675-6fd8-472a-b7d9-71d7436c614d: split-brain observed. [Input/output
error]
[2015-07-15 18:12:15.737906] W [fuse-bridge.c:2273:fuse_writev_cbk]
0-glusterfs-fuse: 293197: WRITE => -1 (Input/output error)
[2015-07-15 18:12:17.022070] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk] 12-vol1-client-5: remote
operation failed. Path:
/c29ec775-c933-4109-87bf-0b7c4373d0a0/images/9ddffb02-b804-4f28-a8fb-df609eaa884a/c7637ade-9c78-4bd7-a9e4-a14913f9060b
(d83a3f9a-7625-4872-b61f-0e4b63922a75) [No such file or directory]
[2015-07-15 18:12:17.022073] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk] 12-vol1-client-4: remote
operation failed. Path:
/c29ec775-c933-4109-87bf-0b7c4373d0a0/images/9ddffb02-b804-4f28-a8fb-df609eaa884a/c7637ade-9c78-4bd7-a9e4-a14913f9060b
(d83a3f9a-7625-4872-b61f-0e4b63922a75) [No such file or directory]
[2015-07-15 18:12:22.952290] W [fuse-bridge.c:2273:fuse_writev_cbk]
0-glusterfs-fuse: 293304: WRITE => -1 (Input/output error)
[2015-07-15 18:12:22.952550] W [fuse-bridge.c:2273:fuse_writev_cbk]
0-glusterfs-fuse: 293306: WRITE => -1 (Input/output error)
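
One way to check whether the files named in these messages are genuinely in
split-brain (rather than hitting a transient error while being migrated) is
the heal-info command; the volume name 'vol1' below is taken from the log
prefixes above, and the brick path is a placeholder:

    gluster volume heal vol1 info split-brain

    # The AFR changelog xattrs of a file can also be inspected directly on each brick
    getfattr -d -m . -e hex /rhgs/brick1/<path-to-file>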


--- Additional comment from Ravishankar N on 2015-07-16 07:33:59 EDT ---

Able to reproduce the issue by running a continuous `dd` into a file from the
fuse mount on a 2x2 volume and reducing it to 1x2, making sure to remove the
replica pair on which the file resides. dd terminated with EIO.

[root at vm2 fuse_mnt]# dd if=/dev/urandom of=file
dd: writing to ‘file’: Input/output error
dd: closing output file ‘file’: Input/output error
[root at vm2 fuse_mnt]# 


The EIO is returned by afr_transaction(), which is not able to find a readable
subvolume for the inode. I need to debug further to see why.

FWIW, there was no data corruption/loss and the migration completed
successfully. New reads/writes to the file were successful.

[root at vm2 fuse_mnt]# echo append>>file
[root at vm2 fuse_mnt]# echo $?
0
[root at vm2 fuse_mnt]# tail -1 file
��_�d�!��aappend
[root at vm2 fuse_mnt]# 
[root at vm2 fuse_mnt]# echo $?
0
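
A rough sketch of that reproduction, assuming a 2x2 volume named vol1
fuse-mounted at /fuse_mnt and placeholder host/brick names (the pair being
removed must be the one that actually holds the file):

    # keep a continuous write going on the fuse mount
    dd if=/dev/urandom of=/fuse_mnt/file bs=1M count=10000 &

    # check which replica pair holds the file's data
    getfattr -n trusted.glusterfs.pathinfo /fuse_mnt/file

    # shrink 2x2 to 1x2 by removing that replica pair; migration of 'file' begins
    gluster volume remove-brick vol1 \
        server1:/rhgs/brick1 server2:/rhgs/brick1 start

    # without the fix below, dd fails with EIO once the file is under migration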


--- Additional comment from Anand Avati on 2015-07-17 06:59:43 EDT ---

REVIEW: http://review.gluster.org/11713 (dht: send lookup even for fd based
operations during rebalance) posted (#1) for review on master by Ravishankar N
(ravishankar at redhat.com)

--- Additional comment from Anand Avati on 2015-07-17 13:04:54 EDT ---

REVIEW: http://review.gluster.org/11713 (dht: send lookup even for fd based
operations during rebalance) posted (#2) for review on master by Ravishankar N
(ravishankar at redhat.com)

--- Additional comment from Anand Avati on 2015-07-19 05:24:44 EDT ---

REVIEW: http://review.gluster.org/11713 (dht: send lookup even for fd based
operations during rebalance) posted (#3) for review on master by Ravishankar N
(ravishankar at redhat.com)

--- Additional comment from Anand Avati on 2015-07-23 02:45:22 EDT ---

COMMIT: http://review.gluster.org/11713 committed in master by Raghavendra G
(rgowdapp at redhat.com) 
------
commit 94372373ee355e42dfe1660a50315adb4f019d64
Author: Ravishankar N <ravishankar at redhat.com>
Date:   Fri Jul 17 16:04:01 2015 +0530

    dht: send lookup even for fd based operations during rebalance

    Problem:
    dht_rebalance_inprogress_task() was not sending lookups to the
    destination subvolume for a file undergoing writes during rebalance. Due to
    this, afr was not able to populate the read_subvol and failed the write
    with EIO.

    Fix:
    Send lookup for fd based operations as well.

    Thanks to Raghavendra G for helping with the RCA.

    Change-Id: I638c203abfaa45b29aa5902ffd76e692a8212a19
    BUG: 1244165
    Signed-off-by: Ravishankar N <ravishankar at redhat.com>
    Reviewed-on: http://review.gluster.org/11713
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: N Balachandran <nbalacha at redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
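
To check whether a given glusterfs source tree or release already carries this
fix, the commit hash or Change-Id from the message above can be searched for,
e.g. (assuming a local clone of the upstream glusterfs repository):

    git log --oneline --grep='I638c203abfaa45b29aa5902ffd76e692a8212a19'
    git tag --contains 94372373ee355e42dfe1660a50315adb4f019d64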


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1243542
[Bug 1243542] [RHEV-RHGS] App VMs paused due to IO error caused by
split-brain, after initiating remove-brick operation
https://bugzilla.redhat.com/show_bug.cgi?id=1244165
[Bug 1244165] [RHEV-RHGS] App VMs paused due to IO error caused by
split-brain, after initiating remove-brick operation
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

