[Bugs] [Bug 1183716] New: Force replace-brick lead to the persistent write(use dd) return Input/output error

bugzilla at redhat.com bugzilla at redhat.com
Mon Jan 19 14:49:48 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1183716

            Bug ID: 1183716
           Summary: Force replace-brick lead to the persistent write(use
                    dd) return Input/output error
           Product: GlusterFS
           Version: 3.6.1
         Component: disperse
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: xhernandez at datalab.es
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    jiademing.dd at gmail.com, lidi at perabytes.com,
                    xhernandez at datalab.es
        Depends On: 1176062



+++ This bug was initially created as a clone of Bug #1176062 +++

Description of problem:
    I ran mkdir -p /mountpoint/a/b/c and then started dd if=/dev/zero
of=/mountpoint/a/b/c/test.bak bs=1M. While dd was still writing, I ran
replace-brick commit force. The replace-brick succeeded, but the write
returned Input/output error.

Version-Release number of selected component (if applicable):
 glusterfs-master or glusterfs-3.6.2beta1

How reproducible:


Steps to Reproduce:
1. Create a disperse volume with 3 bricks and redundancy 1:

Volume Name: test
Type: Disperse
Volume ID: bfdbfc8e-3dcc-4459-a1e4-9de17df03db5
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: node-1:/sda/
Brick2: node-1:/sdb/
Brick3: node-1:/sdc/
Options Reconfigured:
features.quota: on
performance.high-prio-threads: 64
performance.low-prio-threads: 64
performance.least-prio-threads: 64
performance.normal-prio-threads: 64
performance.io-thread-count: 64
server.allow-insecure: on
features.lock-heal: on
network.ping-timeout: 5
performance.client-io-threads: enable

2. mkdir -p /mountpoint/a/b/c

3. dd if=/dev/zero of=/mountpoint/a/b/c/test.bak bs=1M

4. gluster volume replace-brick test node-1:/sda node-1:/sdd commit force
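
The steps above can be scripted roughly as follows. This is a minimal sketch
and not part of the original report: the variables MOUNT, VOLUME and DRY_RUN,
the helper functions run and repro, and the 5-second delays are assumptions
made for illustration. The volume name "test" and the brick paths are taken
from the volume info above. By default the script only prints the commands.

```shell
#!/bin/sh
# Reproduction sketch (hypothetical helper script, not from the bug report).
MOUNT="${MOUNT:-/mountpoint}"   # assumed FUSE mount point of the volume
VOLUME="${VOLUME:-test}"        # volume name shown in the volume info
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    run mkdir -p "$MOUNT/a/b/c"
    if [ "$DRY_RUN" = "1" ]; then
        run dd if=/dev/zero of="$MOUNT/a/b/c/test.bak" bs=1M
    else
        # Start the continuous write in the background.
        dd if=/dev/zero of="$MOUNT/a/b/c/test.bak" bs=1M &
        dd_pid=$!
        sleep 5   # arbitrary delay so the write is in progress
    fi
    run gluster volume replace-brick "$VOLUME" node-1:/sda node-1:/sdd commit force
    if [ "$DRY_RUN" != "1" ]; then
        sleep 5   # arbitrary delay to give dd time to hit the error
        # Stop dd if it is still running, then report how it exited.
        kill -INT "$dd_pid" 2>/dev/null
        wait "$dd_pid"
        echo "dd exit status: $?"
    fi
}

repro
```

Running it with DRY_RUN=0 on a disposable test cluster executes the commands
for real. dd is started in the background so that the replace-brick happens
while the write is still in progress.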

Actual results:

replace-brick succeeds, but the dd write returns Input/output error.

Expected results:

replace-brick succeeds and the ongoing write continues without errors.

Additional info:

--- Additional comment from jiademing on 2014-12-19 11:30:26 CET ---

I tested and found that a continuous read also has this problem
(glusterfs-master or glusterfs-release-3.6.2beta1).

--- Additional comment from Anand Avati on 2015-01-07 12:51:33 CET ---

REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
posted (#1) for review on master by Xavier Hernandez (xhernandez at datalab.es)

--- Additional comment from jiademing on 2015-01-09 09:08:45 CET ---

(In reply to Anand Avati from comment #2)
> REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
> posted (#1) for review on master by Xavier Hernandez (xhernandez at datalab.es)

I tested this patch. After a forced replace-brick the continuous write keeps
working, but ls /mountpoint occasionally returns Input/output error. Once I
stop the dd write, ls /mountpoint is OK.

--- Additional comment from jiademing on 2015-01-09 10:36:54 CET ---

(In reply to jiademing from comment #3)
> (In reply to Anand Avati from comment #2)
> > REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
> > posted (#1) for review on master by Xavier Hernandez (xhernandez at datalab.es)
> 
> I tested this patch. After a forced replace-brick the continuous write keeps
> working, but ls /mountpoint occasionally returns Input/output error. Once I
> stop the dd write, ls /mountpoint is OK.


Error logs:

[2015-01-09 17:30:04.058135] E [ec-helpers.c:410:ec_loc_setup_path]
3-test-disperse-0: Invalid path '<gfid:060bd8ef-6e58-4fcd-ac21-2c0e85b70e54>'
in loc
[2015-01-09 17:30:04.058165] I [dht-layout.c:663:dht_layout_normalize]
3-test-dht: Found anomalies in <gfid:060bd8ef-6e58-4fcd-ac21-2c0e85b70e54>
(gfid = 060bd8ef-6e58-4fcd-ac21-2c0e85b70e54). Holes=1 overlaps=0
[2015-01-09 17:30:04.058187] W [fuse-resolve.c:147:fuse_resolve_gfid_cbk]
0-fuse: 060bd8ef-6e58-4fcd-ac21-2c0e85b70e54: failed to resolve (Input/output
error)
[2015-01-09 17:30:04.058201] E [fuse-bridge.c:808:fuse_getattr_resume]
0-digioceanfs-fuse: 47449: GETATTR 6883340
(060bd8ef-6e58-4fcd-ac21-2c0e85b70e54) resolution failed
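
When a log contains many such entries, it can help to list only the GFIDs that
failed to resolve. A minimal sketch, assuming that each entry is on a single
line in the actual log file (the entries above are wrapped by the mail client)
and that the local grep supports -o and -E. The function name failed_gfids is
hypothetical.

```shell
# failed_gfids: read a client log on stdin and print the unique GFIDs from
# fuse_resolve_gfid_cbk "failed to resolve" warnings.
failed_gfids() {
    grep 'fuse_resolve_gfid_cbk' |
        grep 'failed to resolve' |
        grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' |
        sort -u
}
```

It would be used as failed_gfids < client.log, where client.log stands for the
FUSE client log of the mount; its actual location depends on the installation.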

--- Additional comment from Xavier Hernandez on 2015-01-09 11:59:53 CET ---

(In reply to jiademing from comment #3)
> (In reply to Anand Avati from comment #2)
> > REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
> > posted (#1) for review on master by Xavier Hernandez (xhernandez at datalab.es)
> 
> I tested this patch. After a forced replace-brick the continuous write keeps
> working, but ls /mountpoint occasionally returns Input/output error. Once I
> stop the dd write, ls /mountpoint is OK.

I've tried to do an ls of <mountpoint>, <mountpoint>/a, <mountpoint>/a/b and
<mountpoint>/a/b/c while the dd was running in background and replace brick had
completed. I haven't seen any Input/Output error. However I've seen that 'ls'
sometimes takes more time than expected to complete. I'll try to see why.

The error logs you show seem to come from a different version of ec (the source
line numbers do not match the current code). I've tested with current master
with this patch applied. Which version are you testing?
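
The ls check described above can be repeated with a small helper while dd is
running. This is a minimal sketch under stated assumptions: the function name
check_ls and its output format are hypothetical, and timing uses whole seconds
from date +%s, so only delays of a second or more are visible.

```shell
# check_ls: list each directory given as an argument and print
# "OK <dir> <seconds>s" or "FAIL <dir> <seconds>s".
# Returns non-zero if any listing failed.
check_ls() {
    rc=0
    for dir in "$@"; do
        start=$(date +%s)
        if ls "$dir" >/dev/null 2>&1; then
            status=OK
        else
            status=FAIL
            rc=1
        fi
        end=$(date +%s)
        echo "$status $dir $((end - start))s"
    done
    return "$rc"
}
```

For the case discussed here it would be called as
check_ls /mountpoint /mountpoint/a /mountpoint/a/b /mountpoint/a/b/c,
possibly in a loop, since the reported error only appears occasionally.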

--- Additional comment from jiademing on 2015-01-12 07:01:00 CET ---

(In reply to Xavier Hernandez from comment #5)
> (In reply to jiademing from comment #3)
> > (In reply to Anand Avati from comment #2)
> > > REVIEW: http://review.gluster.org/9407 (ec: Fix failures with missing files)
> > > posted (#1) for review on master by Xavier Hernandez (xhernandez at datalab.es)
> > 
> > I tested this patch. After a forced replace-brick the continuous write keeps
> > working, but ls /mountpoint occasionally returns Input/output error. Once I
> > stop the dd write, ls /mountpoint is OK.
> 
> I've tried to do an ls of <mountpoint>, <mountpoint>/a, <mountpoint>/a/b and
> <mountpoint>/a/b/c while the dd was running in background and replace brick
> had completed. I haven't seen any Input/Output error. However I've seen that
> 'ls' sometimes takes more time than expected to complete. I'll try to see
> why.
> 
> The error logs you show seem to come from a different version of ec (the
> source line numbers do not match the current code). I've tested with current
> master with this patch applied. Which version are you testing?

Sorry, I had merged this patch by hand. I have now tried master + this patch,
and it works fine.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1176062
[Bug 1176062] Force replace-brick lead to the persistent write(use dd)
return Input/output error

