[Bugs] [Bug 1224171] New: Input/Output error with disperse volume when geo-replication is started
bugzilla at redhat.com
Fri May 22 09:57:56 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1224171
Bug ID: 1224171
Summary: Input/Output error with disperse volume when
geo-replication is started
Product: Red Hat Gluster Storage
Version: 3.1
Component: glusterfs-geo-replication
Severity: urgent
Assignee: rhs-bugs at redhat.com
Reporter: byarlaga at redhat.com
QA Contact: storage-qa-internal at redhat.com
CC: aavati at redhat.com, avishwan at redhat.com,
bugs at gluster.org, byarlaga at redhat.com,
csaba at redhat.com, gluster-bugs at redhat.com,
nlevinki at redhat.com, pkarampu at redhat.com
Depends On: 1207712
Blocks: 1186580 (qe_tracker_everglades)
Group: redhat
+++ This bug was initially created as a clone of Bug #1207712 +++
Description of problem:
======================
Geo-replication on a disperse volume throws Input/Output errors in the geo-replication log file once the session is started.
Version-Release number of selected component (if applicable):
=============================================================
[root at vertigo ~]# gluster --version
glusterfs 3.7dev built on Mar 31 2015 01:05:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root at vertigo ~]#
How reproducible:
=================
100%
Steps to Reproduce:
1. Create a 1x(4+2) disperse volume on both the master and the slave.
2. Establish geo-replication between the two volumes (a command sketch of
these steps follows below).
3. Once it is started, Input/Output errors are thrown in the geo-replication
log file.
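For reference, a minimal sketch of the commands for the steps above, using the
host names and brick paths from this report. This is not taken from the
original reproduction session: passwordless SSH from the master node to the
slave node is assumed to be set up beforehand, and 'force' is used only
because multiple bricks of each disperse volume live on the same server.

# Master cluster (run on one of the master nodes):
gluster volume create geo-master disperse 6 redundancy 2 \
    ninja:/rhs/brick1/geo-1 vertigo:/rhs/brick1/geo-2 \
    ninja:/rhs/brick2/geo-3 vertigo:/rhs/brick2/geo-4 \
    ninja:/rhs/brick3/geo-5 vertigo:/rhs/brick3/geo-6 force
gluster volume start geo-master

# Slave cluster (same 1 x (4 + 2) layout):
gluster volume create disperse-slave disperse 6 redundancy 2 \
    dhcp37-164:/rhs/brick1/b1 dhcp37-95:/rhs/brick1/b1 \
    dhcp37-164:/rhs/brick2/b2 dhcp37-95:/rhs/brick2/b2 \
    dhcp37-164:/rhs/brick3/b3 dhcp37-95:/rhs/brick3/b3 force
gluster volume start disperse-slave

# Back on the master: generate the common pem, then create and start
# the geo-replication session.
gluster system:: execute gsec_create
gluster volume geo-replication geo-master dhcp37-164::disperse-slave create push-pem
gluster volume geo-replication geo-master dhcp37-164::disperse-slave start
gluster volume geo-replication geo-master dhcp37-164::disperse-slave status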
Actual results:
Input/Output errors are seen in the geo-replication log file.
Expected results:
Geo-replication should start and sync to the slave without any Input/Output errors.
Additional info:
================
[root at vertigo ~]# gluster v info geo-master
Volume Name: geo-master
Type: Disperse
Volume ID: fdb55cd4-34e7-4c15-a407-d9a831a09737
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: ninja:/rhs/brick1/geo-1
Brick2: vertigo:/rhs/brick1/geo-2
Brick3: ninja:/rhs/brick2/geo-3
Brick4: vertigo:/rhs/brick2/geo-4
Brick5: ninja:/rhs/brick3/geo-5
Brick6: vertigo:/rhs/brick3/geo-6
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
[root at vertigo ~]# gluster v status geo-master
Status of volume: geo-master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ninja:/rhs/brick1/geo-1               49202     0          Y       4714
Brick vertigo:/rhs/brick1/geo-2             49203     0          Y       4643
Brick ninja:/rhs/brick2/geo-3               49203     0          Y       4731
Brick vertigo:/rhs/brick2/geo-4             49204     0          Y       4660
Brick ninja:/rhs/brick3/geo-5               49204     0          Y       4748
Brick vertigo:/rhs/brick3/geo-6             49205     0          Y       4677
NFS Server on localhost                     2049      0          Y       5224
NFS Server on ninja                         2049      0          Y       5090
Task Status of Volume geo-master
------------------------------------------------------------------------------
There are no active volume tasks
[root at vertigo ~]#
Slave configuration:
====================
[root at dhcp37-164 ~]# gluster v info
Volume Name: disperse-slave
Type: Disperse
Volume ID: 1cbbe781-ee69-4295-bd17-a1dff37637ab
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp37-164:/rhs/brick1/b1
Brick2: dhcp37-95:/rhs/brick1/b1
Brick3: dhcp37-164:/rhs/brick2/b2
Brick4: dhcp37-95:/rhs/brick2/b2
Brick5: dhcp37-164:/rhs/brick3/b3
Brick6: dhcp37-95:/rhs/brick3/b3
[root at dhcp37-164 ~]# gluster v status
Status of volume: disperse-slave
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp37-164:/rhs/brick1/b1             49152     0          Y       4066
Brick dhcp37-95:/rhs/brick1/b1              49152     0          Y       6988
Brick dhcp37-164:/rhs/brick2/b2             49153     0          Y       4083
Brick dhcp37-95:/rhs/brick2/b2              49153     0          Y       7005
Brick dhcp37-164:/rhs/brick3/b3             49154     0          Y       4100
Brick dhcp37-95:/rhs/brick3/b3              49154     0          Y       7022
NFS Server on localhost                     2049      0          Y       4120
NFS Server on 10.70.37.95                   2049      0          Y       7044
Task Status of Volume disperse-slave
------------------------------------------------------------------------------
There are no active volume tasks
[root at dhcp37-164 ~]#
Log file of the master will be attached.
--- Additional comment from Anand Avati on 2015-03-31 16:34:16 EDT ---
REVIEW: http://review.gluster.org/10077 (cluster/ec: Ignore volume-mark key for
comparing dicts) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-03-31 16:34:29 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-03-31 16:34:32 EDT ---
REVIEW: http://review.gluster.org/10079 (cluster/ec: Handle stime, xtime
differently) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-04-10 07:10:09 EDT ---
COMMIT: http://review.gluster.org/10077 committed in master by Vijay Bellur
(vbellur at redhat.com)
------
commit fcb55d54a62c8d4a2e8ce4596cd462c471f74dd3
Author: Pranith Kumar K <pkarampu at redhat.com>
Date: Tue Mar 31 18:09:25 2015 +0530
cluster/ec: Ignore volume-mark key for comparing dicts
Change-Id: Id60107e9fb96588d24fa2f3be85c764b7f08e3d1
BUG: 1207712
Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
Reviewed-on: http://review.gluster.org/10077
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
--- Additional comment from Anand Avati on 2015-04-12 07:09:35 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-04-12 07:13:49 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#3) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-04-28 13:23:34 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#4) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-03 22:54:08 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#5) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-04 04:56:17 EDT ---
REVIEW: http://review.gluster.org/10078 (cluster/ec: Fix dictionary compare
function) posted (#6) for review on master by Avra Sengupta
(asengupt at redhat.com)
--- Additional comment from Anand Avati on 2015-05-04 22:46:35 EDT ---
COMMIT: http://review.gluster.org/10078 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com)
------
commit c8cd488b794d7abb3d37f32a6d8d0a3b365aa46e
Author: Pranith Kumar K <pkarampu at redhat.com>
Date: Tue Mar 31 23:07:09 2015 +0530
cluster/ec: Fix dictionary compare function
If both dicts are NULL, then they are equal. If one of the dicts is NULL but
the other has only ignorable keys, then they are also equal. If both dicts
are non-NULL, then check, for each non-ignorable key, whether the values are
the same. The value_ignore function is used to skip comparing values for
keys which must be present in both dictionaries but whose values may differ.
geo-rep's stime xattr doesn't need to be present in listxattr, but when a
getxattr comes on the stime xattr, even if there aren't enough responses
carrying the xattr, we should still give out an answer which is the maximum
of the stimes available.
Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d
BUG: 1207712
Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
Reviewed-on: http://review.gluster.org/10078
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
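To make the comparison rules described in the commit message concrete, here is
a minimal, self-contained C sketch. It is not the actual cluster/ec
implementation (which operates on GlusterFS dict_t structures); the xdict
type, the helper names (xdicts_match, key_is_ignorable, value_is_ignorable)
and the key patterns are hypothetical and only illustrate the rules.

/* Sketch of the dictionary comparison rules; all names are illustrative. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct { const char *key; const char *value; } kv_pair;
typedef struct { const kv_pair *pairs; size_t count; } xdict;

/* Keys whose presence or value may legitimately differ across bricks and
 * are skipped entirely, e.g. geo-rep's volume-mark (illustrative pattern). */
static bool key_is_ignorable(const char *key) {
    static const char prefix[] = "trusted.glusterfs.volume-mark";
    return strncmp(key, prefix, sizeof(prefix) - 1) == 0;
}

/* Keys that must be present in both answers but whose values may differ,
 * e.g. geo-rep stime/xtime; the values are aggregated later (maximum of
 * the available stimes) instead of being required to be identical. */
static bool value_is_ignorable(const char *key) {
    return strstr(key, ".stime") != NULL || strstr(key, ".xtime") != NULL;
}

static const char *lookup(const xdict *d, const char *key) {
    for (size_t i = 0; i < d->count; i++)
        if (strcmp(d->pairs[i].key, key) == 0)
            return d->pairs[i].value;
    return NULL;
}

static bool has_only_ignorable_keys(const xdict *d) {
    for (size_t i = 0; i < d->count; i++)
        if (!key_is_ignorable(d->pairs[i].key))
            return false;
    return true;
}

/* Every non-ignorable key of a must exist in b, and the values must match
 * unless the value for that key is explicitly ignorable. */
static bool subset_matches(const xdict *a, const xdict *b) {
    for (size_t i = 0; i < a->count; i++) {
        const char *key = a->pairs[i].key;
        if (key_is_ignorable(key))
            continue;
        const char *other = lookup(b, key);
        if (other == NULL)
            return false;
        if (!value_is_ignorable(key) && strcmp(a->pairs[i].value, other) != 0)
            return false;
    }
    return true;
}

/* Rules from the commit message: both NULL -> equal; one NULL while the
 * other carries only ignorable keys -> equal; otherwise compare the
 * non-ignorable keys in both directions. */
static bool xdicts_match(const xdict *a, const xdict *b) {
    if (a == NULL && b == NULL)
        return true;
    if (a == NULL)
        return has_only_ignorable_keys(b);
    if (b == NULL)
        return has_only_ignorable_keys(a);
    return subset_matches(a, b) && subset_matches(b, a);
}

int main(void) {
    /* One brick's answer carries only the volume-mark key, another brick's
     * answer carries nothing: under these rules the answers still match,
     * so they are no longer treated as inconsistent. Answers that differ
     * only by such keys being flagged as mismatching is consistent with
     * the I/O errors reported in this bug. */
    kv_pair only_mark[] = { { "trusted.glusterfs.volume-mark.demo", "1" } };
    xdict b = { only_mark, 1 };
    printf("match: %s\n", xdicts_match(NULL, &b) ? "yes" : "no");
    return 0;
}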
--- Additional comment from Niels de Vos on 2015-05-14 13:27:12 EDT ---
This bug is getting closed because a release has been made available that
should address the reported issue. In case the problem is still not fixed with
glusterfs-3.7.0, please open a new bug report.
glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages
for several distributions should become available in the near future. Keep an
eye on the Gluster Users mailinglist [2] and the update infrastructure for your
distribution.
[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1207712
[Bug 1207712] Input/Output error with disperse volume when geo-replication
is started