[Bugs] [Bug 1235964] Disperse volume: FUSE I/O error after self healing the failed disk files

bugzilla at redhat.com bugzilla at redhat.com
Mon Jul 27 06:35:48 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1235964



--- Comment #4 from Fang Huang <fanghuang.data at yahoo.com> ---
I wrote a test script that triggers the bug.

--
# cat tests/basic/ec/ec-proactive-heal.t   

#!/bin/bash

. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc

cleanup

ec_test_dir=$M0/test

function ec_test_generate_src()
{
   mkdir -p $ec_test_dir
   for i in `seq 0 19`
   do
      dd if=/dev/zero of=$ec_test_dir/$i.c bs=1024 count=2
   done
}

function ec_test_make()
{
   # Simulate a build: copy each .c file to a matching .o file
   for i in *.c
   do
     filename=${i%.*}
     cp "$i" "$filename.o"
   done
}

## step 1
TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 disperse 7 redundancy 3 $H0:$B0/${V0}{0..6}
TEST $CLI volume start $V0
TEST glusterfs --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
EXPECT_WITHIN $CHILD_UP_TIMEOUT "7" ec_child_up_count $V0 0

## step 2
TEST ec_test_generate_src

cd $ec_test_dir
TEST ec_test_make

## step 3
TEST kill_brick $V0 $H0 $B0/${V0}0
TEST kill_brick $V0 $H0 $B0/${V0}1
EXPECT '5' online_brick_count

TEST rm -f *.o
TEST ec_test_make

## step 4
TEST $CLI volume start $V0 force
EXPECT '7' online_brick_count

# active heal
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "[0-9][0-9]*" get_shd_process_pid
TEST $CLI volume heal $V0 full

TEST rm -f *.o
TEST ec_test_make


## step 5
TEST kill_brick $V0 $H0 $B0/${V0}2
TEST kill_brick $V0 $H0 $B0/${V0}3
EXPECT '5' online_brick_count

TEST rm -f *.o 
TEST ec_test_make

EXPECT '5' online_brick_count

## step 6
TEST $CLI volume start $V0 force
EXPECT '7' online_brick_count

# self-healing
TEST rm -f *.o
TEST ec_test_make

TEST pidof glusterd
EXPECT "$V0" volinfo_field $V0 'Volume Name'
EXPECT 'Started' volinfo_field $V0 'Status'
EXPECT '7' online_brick_count

## cleanup
cd
EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0
TEST $CLI volume stop $V0
TEST $CLI volume delete $V0
TEST rm -rf $B0/*

cleanup;
--

I tested on branch release-3.7 at commit b639cb9f62ae, and on master at commit
9442e7bf80f5c. On both branches the I/O error was reported in step 5, during
the two tests "TEST rm -f *.o" and "TEST ec_test_make".
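
For reference, here is a rough sketch of the brick arithmetic behind steps 3 and
5. It assumes the usual disperse rule that a volume with N bricks and
redundancy R needs N - R bricks online, and that the previously killed bricks
are fully healed before the next pair is killed. The helper name
min_bricks_online is made up for illustration and is not part of the test
framework.

```shell
#!/bin/bash
# Hypothetical helper, not part of the GlusterFS test framework.
# For a disperse volume with N bricks and redundancy R, N - R bricks
# hold the data fragments, so at least that many must stay online.
min_bricks_online() {
    local bricks=$1 redundancy=$2
    echo $(( bricks - redundancy ))
}

bricks=7       # "disperse 7" in the volume create command
redundancy=3   # "redundancy 3" in the volume create command
killed=2       # two bricks are killed in step 3 and again in step 5

online=$(( bricks - killed ))
needed=$(min_bricks_online "$bricks" "$redundancy")

if [ "$online" -ge "$needed" ]; then
    echo "online=$online needed=$needed: I/O should succeed"
else
    echo "online=$online needed=$needed: I/O expected to fail"
fi
```

With 5 of 7 bricks online and only 4 required, the I/O error in step 5 would
not be expected if the heal of the first two bricks had completed.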

Please note that if we use the root directory of the mount point, i.e. set
ec_test_dir to $M0, the test always passes.

Hope this helps.
