[Bugs] [Bug 1416321] New: read error: Input/output error but not in split-brain state

bugzilla at redhat.com
Wed Jan 25 09:39:08 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1416321

            Bug ID: 1416321
           Summary: read error: Input/output error but not in split-brain
                    state
           Product: GlusterFS
           Version: 3.8
         Component: core
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: likunbyl at qq.com
                CC: bugs at gluster.org



Created attachment 1244198
  --> https://bugzilla.redhat.com/attachment.cgi?id=1244198&action=edit
volume info and volume status/heal info, brick logs at that time

Description of problem:

One of our GlusterFS clients reported an error: java.io.FileNotFoundException:
/mnt/glusterfs/gps/data/2017-01/02/4.txt (Input/output error).

Then I tried to access this file:
# cd /mnt/glusterfs/gps/data/2017-01/02
# more 4.txt
# cat 4.txt
cat: read error: Input/output error

Then I logged into the bricks that hold this file; the size of 4.txt differed
across them:

core at ab09 /export/gluster/sdb3/vol/gps/data/2017-01/02 $ ls -l 4.txt
-rw-r--r--. 2 root root  7490 Jan  2 00:00 4.txt

core at ac08 /export/gluster/sdb3/vol/gps/data/2017-01/02 $ ls -l 4.txt 
-rw-r--r--. 2 root root 9046 Jan  2 00:00 4.txt

core at ad08 /export/gluster/sdb3/vol/gps/data/2017-01/02 $ ls -l 4.txt 
-rw-r--r--. 2 root root 9046 Jan  2 00:00 4.txt

And the getfattr output:

core at ab09 /export/gluster/sdb3/vol/gps/data/2017-01/02 #  getfattr -m . -d -e
hex 4.txt
# file: 4.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.gvol0-client-51=0x000000000000000000000000
trusted.afr.gvol0-client-52=0x000000000000000000000000
trusted.afr.gvol0-client-53=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005850b03d00072c12
trusted.gfid=0xa61d7b2fd53f4eb9be7a656b0915dbea
trusted.glusterfs.quota.363d3c2d-5ccb-497e-9ecc-6eaf5f435026.contri.1=0x0000000000001e000000000000000001
trusted.pgfid.363d3c2d-5ccb-497e-9ecc-6eaf5f435026=0x00000001

core at ac08 /export/gluster/sdb3/vol/gps/data/2017-01/02 #  getfattr -m . -d -e
hex 4.txt
# file: 4.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.gvol0-client-51=0x000000000000000000000000
trusted.afr.gvol0-client-52=0x000000000000000000000000
trusted.afr.gvol0-client-53=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005850b03d0007711e
trusted.gfid=0xa61d7b2fd53f4eb9be7a656b0915dbea
trusted.glusterfs.quota.363d3c2d-5ccb-497e-9ecc-6eaf5f435026.contri.1=0x00000000000024000000000000000001
trusted.pgfid.363d3c2d-5ccb-497e-9ecc-6eaf5f435026=0x00000001

core at ad08 /export/gluster/sdb3/vol/gps/data/2017-01/02 #  getfattr -m . -d -e
hex 4.txt
# file: 4.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.gvol0-client-51=0x000000000000000000000000
trusted.afr.gvol0-client-52=0x000000000000000000000000
trusted.afr.gvol0-client-53=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005861dcf900054117
trusted.gfid=0xa61d7b2fd53f4eb9be7a656b0915dbea
trusted.glusterfs.quota.363d3c2d-5ccb-497e-9ecc-6eaf5f435026.contri.1=0x00000000000024000000000000000001
trusted.pgfid.363d3c2d-5ccb-497e-9ecc-6eaf5f435026=0x00000001
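
Note that the trusted.afr.gvol0-client-* values are all zero on all three
bricks, so the self-heal daemon sees no pending changelog markers, which is
consistent with heal info reporting nothing even though the sizes disagree.
A minimal sketch for decoding such a value (assuming bash, and the standard
AFR changelog layout of three big-endian 32-bit counters: data, metadata,
entry):

```shell
# decode_afr: split a 0x-prefixed, 24-hex-digit trusted.afr value into
# its three 32-bit pending counters (data, metadata, entry).
decode_afr() {
  local v=${1#0x}                       # strip the 0x prefix
  printf 'data=%d metadata=%d entry=%d\n' \
    "0x${v:0:8}" "0x${v:8:8}" "0x${v:16:8}"
}

decode_afr 0x000000000000000000000000   # -> data=0 metadata=0 entry=0
```

With every counter at zero, no brick "blames" any other, so nothing is queued
for self-heal despite the divergent file contents.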

gluster volume heal info didn’t report anything. 

Although a “gluster volume heal gvol0 full” command handled this and synced the
file, the impact on GlusterFS clients was real, and I suppose it will happen
again in the near future.
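
Since heal info only lists files with nonzero pending xattrs, divergence like
this stays invisible until a client hits EIO. One hedged way to spot it
proactively is to compare checksums of the same file across the replica
bricks directly (a sketch assuming coreutils; the paths passed in would be
the file's location on each brick, e.g. fetched or mounted locally):

```shell
# compare_copies: given the paths of the same file on each replica
# brick, report OK if all checksums agree, MISMATCH otherwise.
compare_copies() {
  local uniq
  uniq=$(md5sum "$@" | awk '{print $1}' | sort -u | wc -l)
  if [ "$uniq" -gt 1 ]; then
    echo "MISMATCH"
  else
    echo "OK"
  fi
}
```

Run periodically over recently written files, this would flag cases like
4.txt above before a consumer trips over them.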

Setup:

OS: CoreOS 1185.5.0
Kubernetes: v1.5.1
Image: official gluster-centos:gluster3u8_centos7
Gluster: 3.8.5

20-node cluster running a distributed-replicated 44x3 volume with 132 bricks

Version-Release number of selected component (if applicable):
3.8.5

How reproducible:
Not always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Will attach the volume info, volume status/heal info, and brick logs from that
time.
