[Bugs] [Bug 1224171] New: Input/Output error with disperse volume when geo-replication is started

bugzilla at redhat.com bugzilla at redhat.com
Fri May 22 09:57:56 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1224171

            Bug ID: 1224171
           Summary: Input/Output error with disperse volume when
                    geo-replication is started
           Product: Red Hat Gluster Storage
           Version: 3.1
         Component: glusterfs-geo-replication
          Severity: urgent
          Assignee: rhs-bugs at redhat.com
          Reporter: byarlaga at redhat.com
        QA Contact: storage-qa-internal at redhat.com
                CC: aavati at redhat.com, avishwan at redhat.com,
                    bugs at gluster.org, byarlaga at redhat.com,
                    csaba at redhat.com, gluster-bugs at redhat.com,
                    nlevinki at redhat.com, pkarampu at redhat.com
        Depends On: 1207712
            Blocks: 1186580 (qe_tracker_everglades)
             Group: redhat



+++ This bug was initially created as a clone of Bug #1207712 +++

Description of problem:
======================
Input/Output errors occur on a disperse volume after geo-replication is started.

Version-Release number of selected component (if applicable):
=============================================================
[root at vertigo ~]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root at vertigo ~]# 

How reproducible:
=================
100%

Steps to Reproduce:
1. Create a 1x(4+2) disperse volume for both the master and the slave.
2. Establish geo-replication between the volumes.
3. Once geo-replication is started, it throws an Input/Output error in the log file.
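
The steps above can be sketched with the gluster CLI as follows (host names, brick paths, and the slave host are taken from the volume info below for illustration; exact syntax may vary by release):

```shell
# On the master cluster: create and start a 1x(4+2) disperse volume
gluster volume create geo-master disperse 6 redundancy 2 \
    ninja:/rhs/brick1/geo-1 vertigo:/rhs/brick1/geo-2 \
    ninja:/rhs/brick2/geo-3 vertigo:/rhs/brick2/geo-4 \
    ninja:/rhs/brick3/geo-5 vertigo:/rhs/brick3/geo-6
gluster volume start geo-master

# Create the same 1x(4+2) layout on the slave cluster as "disperse-slave",
# then set up and start the geo-replication session from the master:
gluster volume geo-replication geo-master dhcp37-164::disperse-slave create push-pem
gluster volume geo-replication geo-master dhcp37-164::disperse-slave start

# Watch the geo-replication log on the master for the Input/Output error
tail -f /var/log/glusterfs/geo-replication/geo-master/*.log
```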

Actual results:
I/O error

Expected results:


Additional info:
================

[root at vertigo ~]# gluster v info geo-master

Volume Name: geo-master
Type: Disperse
Volume ID: fdb55cd4-34e7-4c15-a407-d9a831a09737
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: ninja:/rhs/brick1/geo-1
Brick2: vertigo:/rhs/brick1/geo-2
Brick3: ninja:/rhs/brick2/geo-3
Brick4: vertigo:/rhs/brick2/geo-4
Brick5: ninja:/rhs/brick3/geo-5
Brick6: vertigo:/rhs/brick3/geo-6
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
[root at vertigo ~]# gluster v status geo-master
Status of volume: geo-master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ninja:/rhs/brick1/geo-1               49202     0          Y       4714 
Brick vertigo:/rhs/brick1/geo-2             49203     0          Y       4643 
Brick ninja:/rhs/brick2/geo-3               49203     0          Y       4731 
Brick vertigo:/rhs/brick2/geo-4             49204     0          Y       4660 
Brick ninja:/rhs/brick3/geo-5               49204     0          Y       4748 
Brick vertigo:/rhs/brick3/geo-6             49205     0          Y       4677 
NFS Server on localhost                     2049      0          Y       5224 
NFS Server on ninja                         2049      0          Y       5090 

Task Status of Volume geo-master
------------------------------------------------------------------------------
There are no active volume tasks

[root at vertigo ~]# 

Slave configuration:
====================

[root at dhcp37-164 ~]# gluster v info

Volume Name: disperse-slave
Type: Disperse
Volume ID: 1cbbe781-ee69-4295-bd17-a1dff37637ab
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp37-164:/rhs/brick1/b1
Brick2: dhcp37-95:/rhs/brick1/b1
Brick3: dhcp37-164:/rhs/brick2/b2
Brick4: dhcp37-95:/rhs/brick2/b2
Brick5: dhcp37-164:/rhs/brick3/b3
Brick6: dhcp37-95:/rhs/brick3/b3
[root at dhcp37-164 ~]# gluster v status
Status of volume: disperse-slave
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp37-164:/rhs/brick1/b1             49152     0          Y       4066 
Brick dhcp37-95:/rhs/brick1/b1              49152     0          Y       6988 
Brick dhcp37-164:/rhs/brick2/b2             49153     0          Y       4083 
Brick dhcp37-95:/rhs/brick2/b2              49153     0          Y       7005 
Brick dhcp37-164:/rhs/brick3/b3             49154     0          Y       4100 
Brick dhcp37-95:/rhs/brick3/b3              49154     0          Y       7022 
NFS Server on localhost                     2049      0          Y       4120 
NFS Server on 10.70.37.95                   2049      0          Y       7044 

Task Status of Volume disperse-slave
------------------------------------------------------------------------------
There are no active volume tasks

[root at dhcp37-164 ~]# 

Log file of the master will be attached.

--- Additional comment from Anand Avati on 2015-03-31 16:34:16 EDT ---

REVIEW: http://review.gluster.org/10077 (cluster/ec: Ignore volume-mark key for
comparing dicts) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-03-31 16:34:29 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-03-31 16:34:32 EDT ---

REVIEW: http://review.gluster.org/10079 (cluster/ec: Handle stime, xtime
differently) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-04-10 07:10:09 EDT ---

COMMIT: http://review.gluster.org/10077 committed in master by Vijay Bellur
(vbellur at redhat.com) 
------
commit fcb55d54a62c8d4a2e8ce4596cd462c471f74dd3
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Tue Mar 31 18:09:25 2015 +0530

    cluster/ec: Ignore volume-mark key for comparing dicts

    Change-Id: Id60107e9fb96588d24fa2f3be85c764b7f08e3d1
    BUG: 1207712
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/10077
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>

--- Additional comment from Anand Avati on 2015-04-12 07:09:35 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-04-12 07:13:49 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#3) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-04-28 13:23:34 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#4) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-05-03 22:54:08 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#5) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Anand Avati on 2015-05-04 04:56:17 EDT ---

REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#6) for review on master by Avra Sengupta
(asengupt at redhat.com)

--- Additional comment from Anand Avati on 2015-05-04 22:46:35 EDT ---

COMMIT: http://review.gluster.org/10078 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit c8cd488b794d7abb3d37f32a6d8d0a3b365aa46e
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Tue Mar 31 23:07:09 2015 +0530

    cluster/ec: Fix dictionary compare function

    If both dicts are NULL then equal. If one of the dicts is NULL but the
    other has only ignorable keys then also they are equal. If both dicts
    are non-null then check if for each non-ignorable key, values are same
    or not. value_ignore function is used to skip comparing values for the
    keys which must be present in both the dictionaries but the value
    could be different.

    geo-rep's stime xattr doesn't need to be present in list xattr but
    when getxattr comes on stime xattr even if there aren't enough
    responses with the xattr we should still give out an answer which is
    maximum of the stimes available.

    Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
    BUG: 1207712
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/10078
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
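
The comparison rule the commit message describes could be sketched in Python as follows (a minimal sketch, not the actual cluster/ec C implementation; the volume-mark key name comes from the first patch, and the function names and the `value_ignored` parameter are illustrative):

```python
# Keys ignored entirely when comparing xattr dicts (per the earlier patch,
# the volume-mark key set by geo-replication is one such key).
IGNORABLE = frozenset({"trusted.glusterfs.volume-mark"})

def dicts_equal(d1, d2, ignorable=IGNORABLE, value_ignored=frozenset()):
    """Return True when two xattr dicts match per the rules above."""
    d1 = d1 or {}  # a NULL dict behaves like an empty one
    d2 = d2 or {}
    keys1 = {k for k in d1 if k not in ignorable}
    keys2 = {k for k in d2 if k not in ignorable}
    if keys1 != keys2:  # every non-ignorable key must be present in both
        return False
    # value_ignored keys must exist in both dicts, but their values may differ
    return all(d1[k] == d2[k] for k in keys1 if k not in value_ignored)

def max_stime(responses):
    """Answer a getxattr on stime with the maximum of the stimes available,
    even when some brick responses lack the xattr."""
    stimes = [r for r in responses if r is not None]
    # (seconds, nanoseconds) tuples compare lexicographically, so max() works
    return max(stimes) if stimes else None
```

For example, a NULL dict compares equal to one that carries only the volume-mark key, and a brick missing the stime xattr does not prevent an answer.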

--- Additional comment from Niels de Vos on 2015-05-14 13:27:12 EDT ---

This bug is getting closed because a release has been made available that
should address the reported issue. In case the problem is still not fixed with
glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages
for several distributions should become available in the near future. Keep an
eye on the Gluster Users mailinglist [2] and the update infrastructure for your
distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1207712
[Bug 1207712] Input/Output error with disperse volume when geo-replication
is started