[Gluster-devel] Need help with https://bugzilla.redhat.com/show_bug.cgi?id=1224180
Xavier Hernandez
xhernandez at datalab.es
Tue Sep 13 06:20:27 UTC 2016
Hi Sanoj,
I'm unable to see bug 1224180. Access is restricted.
Not sure what the problem is exactly, but I see that quota is involved.
Currently disperse doesn't play well with quota when the limit is near.
The reason is that not all bricks fail at the same time with EDQUOT due
to small differences in computed space. This causes a valid write to
succeed on some bricks and fail on others. If it fails simultaneously on
more than redundancy bricks but fewer than the number of data bricks,
there's no way to roll back the changes on the bricks that have
succeeded, so the operation is inconsistent and an I/O error is returned.
For example, on a 6:2 configuration (4 data bricks plus 2 redundancy
bricks), if 3 bricks succeed and 3 fail, there aren't enough bricks with
the updated version, but there aren't enough bricks with the old version
either.
If you force 2 bricks to be down, the problem appears more frequently,
since a single EDQUOT failure on one of the remaining bricks is then
enough to trigger it.
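
A rough sketch of that arithmetic (this is not the actual ec code; the
constants and names below are only for illustration, assuming the 6:2
volume above):

/* Sketch of the consistency check described above, for a 6:2
 * dispersed volume: 6 bricks, 2 redundancy, so 4 data fragments are
 * needed to reconstruct either version of the data. Not the real ec
 * implementation. */
#include <stdio.h>

#define BRICKS     6
#define REDUNDANCY 2
#define FRAGMENTS  (BRICKS - REDUNDANCY)   /* 4 fragments needed */

/* Given how many bricks accepted a write (the rest failed with
 * EDQUOT), report which version of the file is still recoverable. */
static const char *write_outcome(int succeeded)
{
    int failed = BRICKS - succeeded;

    if (succeeded >= FRAGMENTS)
        return "new version recoverable (write commits)";
    if (failed >= FRAGMENTS)
        return "old version recoverable (write fails cleanly)";
    /* e.g. 3 succeed / 3 fail: neither version has 4 fragments */
    return "neither version recoverable -> EIO";
}

int main(void)
{
    for (int ok = BRICKS; ok >= 0; ok--)
        printf("%d succeed / %d fail: %s\n",
               ok, BRICKS - ok, write_outcome(ok));
    return 0;
}

Running it shows that for 6:2 the only bad case is exactly 3 failures:
with 2 or fewer the write commits, with 4 or more the old version is
still intact on enough bricks.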
Xavi
On 13/09/16 06:09, Raghavendra Gowdappa wrote:
> +gluster-devel
>
> ----- Original Message -----
>> From: "Sanoj Unnikrishnan" <sunnikri at redhat.com>
>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Ashish Pandey" <aspandey at redhat.com>, xhernandez at datalab.es,
>> "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>> Sent: Monday, September 12, 2016 7:06:59 PM
>> Subject: Need help with https://bugzilla.redhat.com/show_bug.cgi?id=1224180
>>
>> Hello Xavi/Pranith,
>>
>> I have been able to reproduce the BZ with the following steps:
>>
>> gluster volume create v_disp disperse 6 redundancy 2 $tm1:/export/sdb/br1
>> $tm2:/export/sdb/b2 $tm3:/export/sdb/br3 $tm1:/export/sdb/b4
>> $tm2:/export/sdb/b5 $tm3:/export/sdb/b6 force
>> #(Used only 3 nodes, should not matter here)
>> gluster volume start v_disp
>> mount -t glusterfs $tm1:v_disp /gluster_vols/v_disp
>> mkdir /gluster_vols/v_disp/dir1
>> dd if=/dev/zero of=/gluster_vols/v_disp/dir1/x bs=10k count=90000 &
>> gluster v quota v_disp enable
>> gluster v quota v_disp limit-usage /dir1 200MB
>> gluster v quota v_disp soft-timeout 0
>> gluster v quota v_disp hard-timeout 0
>> #optional remove 2 bricks (reproduces more often with this)
>> #pgrep glusterfsd | xargs kill -9
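>> # (For scale: bs=10k x count=90000 is roughly 880 MiB, far beyond the 200MB
>> # limit, so EDQUOT is guaranteed to be hit while dd is still streaming; the
>> # optional kill -9, run on one of the three nodes, takes down the two bricks
>> # hosted there.)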
>>
>> An I/O error appears on stdout when the quota is exceeded, followed by
>> "Disk quota exceeded" errors.
>>
>> Also note that the issue occurs when a flush happens simultaneously with the
>> quota limit being hit, hence it is seen only on some runs.
>>
>> The following are the errors from the logs:
>> [2016-09-12 10:40:02.431568] E [MSGID: 122034]
>> [ec-common.c:488:ec_child_select] 0-v_disp-disperse-0: Insufficient
>> available childs for this request (have 0, need 4)
>> [2016-09-12 10:40:02.431627] E [MSGID: 122037]
>> [ec-common.c:1830:ec_update_size_version_done] 0-Disperse: sku-debug:
>> pre-version=0/0, size=0post-version=1865/1865, size=209571840
>> [2016-09-12 10:40:02.431637] E [MSGID: 122037]
>> [ec-common.c:1835:ec_update_size_version_done] 0-v_disp-disperse-0: Failed
>> to update version and size [Input/output error]
>> [2016-09-12 10:40:02.431664] E [MSGID: 122034]
>> [ec-common.c:417:ec_child_select] 0-v_disp-disperse-0: sku-debug: mask: 36,
>> ec->xl_up 36, ec->node_mask 3f, parent->mask:36, fop->parent->healing:0,
>> id:29
>>
>> [2016-09-12 10:40:02.431673] E [MSGID: 122034]
>> [ec-common.c:480:ec_child_select] 0-v_disp-disperse-0: sku-debug: mask: 36,
>> remaining: 36, healing: 0, ec->xl_up 36, ec->node_mask 3f, parent->mask:36,
>> num:4, minimum: 1, id:29
>>
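>> To make those masks easier to read: each bit in these values is one
>> brick/subvolume. A tiny standalone decoder (not part of gluster, and it
>> assumes the bit order follows the subvolume order) gives:
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> static void decode_mask(const char *name, uint64_t mask, int bricks)
>> {
>>     printf("%-10s 0x%02lx:", name, (unsigned long)mask);
>>     for (int i = 0; i < bricks; i++)
>>         printf(" brick%d=%s", i, (mask & (1ULL << i)) ? "up" : "DOWN");
>>     printf("  (%d up)\n", __builtin_popcountll(mask));
>> }
>>
>> int main(void)
>> {
>>     decode_mask("node_mask", 0x3f, 6);  /* all six bricks         */
>>     decode_mask("xl_up",     0x36, 6);  /* value seen above       */
>>     decode_mask("fop->mask", 0x00, 6);  /* the puzzling zero case */
>>     return 0;
>> }
>>
>> With that reading, 0x3f covers all six bricks, while 0x36 (binary 110110)
>> has bits 0 and 3 clear, i.e. only 4 of the 6 bricks up, which is
>> consistent with the optional kill of the two bricks on one node.
>>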
>> ...
>> [2016-09-12 10:40:02.487302] W [fuse-bridge.c:2311:fuse_writev_cbk]
>> 0-glusterfs-fuse: 41159: WRITE => -1
>> gfid=ee0b4aa1-1f44-486a-883c-acddc13ee318 fd=0x7f1d9c003edc (Input/output
>> error)
>> [2016-09-12 10:40:02.500151] W [MSGID: 122006]
>> [ec-combine.c:206:ec_iatt_combine] 0-v_disp-disperse-0: Failed to combine
>> iatt (inode: 9816911356190712600-9816911356190712600, links: 1-1, uid: 0-0,
>> gid: 0-0, rdev: 0-0, size: 52423680-52413440, mode: 100644-100644)
>> [2016-09-12 10:40:02.500188] N [MSGID: 122029]
>> [ec-combine.c:93:ec_combine_write] 0-v_disp-disperse-0: Mismatching iatt in
>> answers of 'WRITE'
>> [2016-09-12 10:40:02.504551] W [MSGID: 122006]
>> [ec-combine.c:206:ec_iatt_combine] 0-v_disp-disperse-0: Failed to combine
>> iatt (inode: 9816911356190712600-9816911356190712600, links: 1-1, uid: 0-0,
>> gid: 0-0, rdev: 0-0, size: 52423680-52413440, mode: 100644-100644)
>> ....
>> ....
>>
>> [2016-09-12 10:40:02.571272] N [MSGID: 122029]
>> [ec-combine.c:93:ec_combine_write] 0-v_disp-disperse-0: Mismatching iatt in
>> answers of 'WRITE'
>> [2016-09-12 10:40:02.571510] W [MSGID: 122006]
>> [ec-combine.c:206:ec_iatt_combine] 0-v_disp-disperse-0: Failed to combine
>> iatt (inode: 9816911356190712600-9816911356190712600, links: 1-1, uid: 0-0,
>> gid: 0-0, rdev: 0-0, size: 52423680-52413440, mode: 100644-100644)
>> [2016-09-12 10:40:02.571544] N [MSGID: 122029]
>> [ec-combine.c:93:ec_combine_write] 0-v_disp-disperse-0: Mismatching iatt in
>> answers of 'WRITE'
>> [2016-09-12 10:40:02.571772] W [fuse-bridge.c:1290:fuse_err_cbk]
>> 0-glusterfs-fuse: 41160: FLUSH() ERR => -1 (Input/output error)
>>
>> Also, for some fops before the write I noticed the fop->mask field is 0. It's
>> not clear why this happens:
>>
>> [2016-09-12 10:40:02.431561] E [MSGID: 122034]
>> [ec-common.c:480:ec_child_select] 0-v_disp-disperse-0: sku-debug: mask: 0,
>> remaining: 0, healing: 0, ec->xl_up 36, ec->node_mask 3f, parent->mask:36,
>> num:0, minimum: 4, fop->id:34
>> [2016-09-12 10:40:02.431568] E [MSGID: 122034]
>> [ec-common.c:488:ec_child_select] 0-v_disp-disperse-0: Insufficient
>> available childs for this request (have 0, need 4)
>> [2016-09-12 10:40:02.431637] E [MSGID: 122037]
>> [ec-common.c:1835:ec_update_size_version_done] 0-v_disp-disperse-0: Failed
>> to update version and size [Input/output error]
>>
>> Is the zero value of fop->mask related to the mismatch in iatt?
>> Is there any scenario of a race between the write and flush fops?
>> Please suggest how to proceed.
>>
>> Thanks and Regards,
>> Sanoj
>>