[Bugs] [Bug 1449167] New: After selfheal of brick file size of few files differs

bugzilla at redhat.com bugzilla at redhat.com
Tue May 9 10:56:18 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1449167

            Bug ID: 1449167
           Summary: After selfheal of brick file size of few files differs
           Product: GlusterFS
           Version: 3.10
         Component: selfheal
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: amudhan83 at gmail.com
                CC: bugs at gluster.org



Description of problem:
On an EC (erasure-coded) volume, self-heal started after a brick was replaced
and ran to completion, but the healed brick's disk usage differs from that of
the other bricks in the same set.

Comparing files between a good brick and the healed brick shows that a few
files differ in size on the healed brick.
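
One way such a comparison might be done is sketched below. It assumes GNU find
is available on each brick server and that a tab-separated listing (relative
path, apparent size in bytes, allocated 512-byte blocks) has been saved for
each brick, for example with find . -type f -printf '%P\t%s\t%b\n'. The
function name and the listing file names are hypothetical and are not part of
the original report.

```shell
# Hypothetical helper: print files whose apparent size or allocated block
# count differs between two brick listings.
# Each listing line is: <relative path><TAB><size in bytes><TAB><512-byte blocks>
compare_brick_listings() {
    awk -F '\t' '
        # First listing (good brick): remember size and block count per path.
        NR == FNR { size[$1] = $2; blocks[$1] = $3; next }
        # Second listing (healed brick): report paths whose values differ.
        ($1 in size) && (size[$1] != $2 || blocks[$1] != $3) {
            printf "%s\tgood=%s/%s\thealed=%s/%s\n", $1, size[$1], blocks[$1], $2, $3
        }
    ' "$1" "$2"
}

# Example call (listing file names are placeholders):
# compare_brick_listings good-brick.tsv healed-brick.tsv
```

Each reported line shows size/blocks for the good brick followed by size/blocks
for the healed brick, so a file with equal size but fewer blocks stands out.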

Version-Release number of selected component (if applicable):
3.10.1

The following file shows a size difference after the brick heal.
On the healed brick, ls -l and du -h also report different sizes for it.

===========================
File info from Healed brick 
===========================

du -h /media/disk11/brick11/file1
2.2G    /media/disk11/brick11/file1

ls -lh /media/disk11/brick11/file1
-rw-r--r-- 2 root root 3.5G Nov 10 00:03 /media/disk11/brick11/file1

stat /media/disk11/brick11/file1
  File: ‘/media/disk11/brick11/file1’
  Size: 3661745152      Blocks: 4565608    IO Block: 4096   regular file
Device: 8c1h/2241d      Inode: 5931163503  Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2016-11-09 23:58:07.083459000 +0530
Modify: 2016-11-10 00:03:15.955455000 +0530
Change: 2017-04-23 05:56:33.570068918 +0530
 Birth: -
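
The ls and du figures above can be cross-checked against the stat output. This
sketch assumes stat reports Blocks in 512-byte units, which is the usual case
with GNU stat on Linux but is not stated in the report. If that holds, the
space allocated to the file is well below its apparent size, which would be
consistent with a sparse file; the report itself does not confirm this.

```shell
size=3661745152      # "Size" from stat: apparent size in bytes
blocks=4565608       # "Blocks" from stat, assumed to be 512-byte units

allocated=$((blocks * 512))
echo "apparent bytes:  $size"
echo "allocated bytes: $allocated"
echo "unallocated:     $((size - allocated))"
```

The allocated figure, 2337591296 bytes, is about 2.2 GiB and lines up with the
du -h output, while the apparent size is what ls -lh reports.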


getfattr -m. -e hex -d /media/disk11/brick11/file1
getfattr: Removing leading '/' from absolute path names
# file: media/disk11/brick11/file1
trusted.bit-rot.signature=0x010500000000000000574ef2ff2bba2798a0451de3d9bca857380c1c36a8ca39fc7fd4e8c85dd4e559
trusted.bit-rot.version=0x050000000000000058ef4cad000c2af5
trusted.ec.config=0x0000080a02000200
trusted.ec.size=0x00000006d20e5937
trusted.ec.version=0x00000000000369080000000000036909
trusted.gfid=0xc1fadd2e84c34e5d825d6431cfb17e48

==========================
File info from good brick 
==========================
 ls -lh /media/disk11/brick11/file1
-rw-r--r-- 2 root root 3.5G Nov 10 00:03 /media/disk11/brick11/file1

 du -h /media/disk11/brick11/file1
3.5G    /media/disk11/brick11/file1

getfattr -m. -e hex -d /media/disk11/brick11/file1
getfattr: Removing leading '/' from absolute path names
# file: media/disk11/brick11/file1
trusted.bit-rot.signature=0x010500000000000000b87cccce67fe51c0c2c224459d3987fe6beb2d674264048bf508d793443a6837
trusted.bit-rot.version=0x050000000000000058e10e9d00056438
trusted.ec.config=0x0000080a02000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000006d20e5937
trusted.ec.version=0x00000000000369080000000000036909
trusted.gfid=0xc1fadd2e84c34e5d825d6431cfb17e48
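
The trusted.ec.size and trusted.ec.config values are identical on both bricks
and can be decoded to check the expected on-brick file size. The field layout
used below (brick count in bits 32-39, redundancy in bits 24-31, chunk size in
the low 24 bits) is an assumption about how the disperse translator packs
trusted.ec.config; it is not stated in this report.

```shell
ec_size=$((0x00000006d20e5937))      # trusted.ec.size: logical file size in bytes
ec_config=$((0x0000080a02000200))    # trusted.ec.config

# Assumed layout of trusted.ec.config (not confirmed by the report).
bricks=$(( (ec_config >> 32) & 0xff ))
redundancy=$(( (ec_config >> 24) & 0xff ))
chunk=$(( ec_config & 0xffffff ))
data_bricks=$(( bricks - redundancy ))

# Each stripe holds data_bricks * chunk bytes of file data;
# every brick stores one chunk per stripe.
stripe=$(( data_bricks * chunk ))
stripes=$(( (ec_size + stripe - 1) / stripe ))   # round up to whole stripes
fragment=$(( stripes * chunk ))

echo "logical size:             $ec_size"
echo "layout:                   $data_bricks data + $redundancy redundancy, chunk $chunk"
echo "expected bytes per brick: $fragment"
```

Under these assumptions the expected per-brick size is 3661745152 bytes, which
equals the Size shown by stat on the healed brick. That would suggest the
apparent size there is correct and only the allocated block count is lower.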


How reproducible:

This is the first time I have seen this behaviour in a production environment.

The following took place during the heal process:

1. Files that were about to be healed were being read during the heal.
2. Reading a file from the healing brick was slow, so I killed the healing
brick's process so that a user could download the file. This was done twice,
about a day apart.
3. To speed up the heal, I tried running the command "getfattr -h -n
trusted.ec.heal 'filename'", but that also took a long time to heal the file,
so I stopped it.
4. Besides the brick heal, rebalance fix-layout and the bitrot signer process
were running in the cluster.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:


