[Gluster-devel] Input/output error when files in .shard folder are deleted

Mon Oct 24 10:40:48 UTC 2016

Hi,

I am currently running a simple Gluster setup on a single server node
with multiple disks. I noticed that if I delete all the shard files
under .shard on one replica's backend brick, my application (dd) reports
an Input/output error, even though the volume has 3 replicas.

My Gluster version is 3.7.16.

Gluster volume info:

Volume Name: testHeal
Type: Replicate
Volume ID: 26d16d7f-bc4f-44a6-a18b-eab780d80851
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.123.4:/mnt/sdb_mssd/testHeal2
Brick2: 192.168.123.4:/mnt/sde_mssd/testHeal2
Brick3: 192.168.123.4:/mnt/sdd_mssd/testHeal2
Options Reconfigured:
cluster.self-heal-daemon: on
features.shard-block-size: 16MB
features.shard: on
performance.readdir-ahead: on

dd error

[root@fujitsu05 .shard]# dd of=/home/test if=/mnt/fuseMount/ddTest bs=16M count=20 oflag=direct
dd: error reading ‘/mnt/fuseMount/ddTest’: Input/output error
1+0 records in
1+0 records out
16777216 bytes (17 MB) copied, 0.111038 s, 151 MB/s
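
One possible reading of the output above: with features.shard-block-size
set to 16MB, each 16M dd record maps to exactly one shard, so the first
record would come from the base file and the second from the first file
under .shard, which matches dd copying one record before failing. A
minimal Python sketch of that mapping, assuming the usual sharding layout
(block 0 in the base file, block N in .shard/<base-gfid>.N) and assuming
the shard file shown in this mail belongs to ddTest. The helper names are
hypothetical:

```python
SHARD_BLOCK_SIZE = 16 * 1024 * 1024  # features.shard-block-size: 16MB


def shard_index(offset, block_size=SHARD_BLOCK_SIZE):
    """Return the shard number that holds the given byte offset."""
    return offset // block_size


def shard_location(base_gfid, offset, block_size=SHARD_BLOCK_SIZE):
    """Return where the block would be stored, relative to the brick root.

    Assumption: block 0 is kept in the base file itself, and every later
    block is kept in .shard/<base-gfid>.<index>.
    """
    index = shard_index(offset, block_size)
    if index == 0:
        return "<base file>"
    return f".shard/{base_gfid}.{index}"


if __name__ == "__main__":
    # Assumed to be the gfid of ddTest (taken from the shard file name).
    gfid = "9061198a-eb7e-45a2-93fb-eb396d1b2727"
    for record in range(3):
        print(record, shard_location(gfid, record * SHARD_BLOCK_SIZE))
```

Under these assumptions, dd record 1 (offset 16777216) is the first read
that has to touch the .shard directory.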

In the .shard folder where I deleted all the shard files, I can see
that one shard file has been recreated:

getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
# file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
trusted.afr.testHeal-client-0=0x000000010000000100000000
trusted.afr.testHeal-client-2=0x000000010000000100000000
trusted.gfid=0x41b653f7daa14627b1f91f9e8554ddde
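
To make the trusted.afr values above easier to read, here is a minimal
Python sketch that decodes them. It assumes the usual AFR changelog
layout of three big-endian 32-bit counters (data, metadata, entry), where
a non-zero counter under trusted.afr.<volume>-client-N means this copy
records pending operations against brick N. The function name is
hypothetical:

```python
import struct


def decode_afr_changelog(hex_value):
    """Split an AFR changelog xattr into its pending-operation counters.

    Assumption: the value is 12 bytes, laid out as three big-endian
    unsigned 32-bit integers in the order data, metadata, entry.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output of the recreated shard.
    xattrs = {
        "trusted.afr.testHeal-client-0": "0x000000010000000100000000",
        "trusted.afr.testHeal-client-2": "0x000000010000000100000000",
    }
    for name, value in xattrs.items():
        print(name, decode_afr_changelog(value))
```

If that layout holds, the recreated shard carries one pending data and
one pending metadata operation against both client-0 and client-2.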

However, its gfid is not the same as the one on the other replicas:

getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
# file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.testHeal-client-1=0x000000000000000000000000
trusted.bit-rot.version=0x0300000000000000580dde99000e5e5d
trusted.gfid=0x9ee5c5eed7964a6cb9ac1a1419de5a40
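
The two trusted.gfid values can be compared, and turned into the UUID
form used on the bricks, with a short Python sketch like the one below.
It assumes the common brick layout in which each gfid has an entry at
.glusterfs/<first two hex digits>/<next two>/<uuid>. The helper names are
hypothetical:

```python
import uuid


def gfid_to_uuid(hex_value):
    """Convert a trusted.gfid hex dump (with or without 0x) to UUID text."""
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    return str(uuid.UUID(hex=text))


def glusterfs_path(hex_value):
    """Return the assumed .glusterfs entry for a gfid, relative to the brick."""
    u = gfid_to_uuid(hex_value)
    return f".glusterfs/{u[0:2]}/{u[2:4]}/{u}"


def gfids_match(*hex_values):
    """True when every replica reports the same gfid."""
    return len({gfid_to_uuid(v) for v in hex_values}) == 1


if __name__ == "__main__":
    recreated = "0x41b653f7daa14627b1f91f9e8554ddde"  # brick with deleted shards
    other = "0x9ee5c5eed7964a6cb9ac1a1419de5a40"      # one of the other replicas
    print(gfids_match(recreated, other))
    print(glusterfs_path(recreated))
    print(glusterfs_path(other))
```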

Is this considered a bug?

Regards,

Cwtan