[Bugs] [Bug 1264831] New: Data Tiering:Loss of data writes(IO error) to an existing file when detach-tier start is issued(seems like writes and detach are mutually exclusive)

bugzilla at redhat.com bugzilla at redhat.com
Mon Sep 21 10:25:04 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1264831

            Bug ID: 1264831
           Summary: Data Tiering:Loss of data writes(IO error) to an
                    existing file when detach-tier start is issued(seems
                    like writes and detach are mutually exclusive)
           Product: GlusterFS
           Version: 3.7.4
         Component: tiering
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: nchilaka at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
========================
If a detach-tier start is issued while writes to a file are in progress, the
remaining writes to that file fail with an I/O error and are lost.
For example, I FUSE-mounted a tiered volume and created files of about 100 MB
each (102400000 bytes) as below:
[root at localhost lanka]# for i in {1..100};do dd if=/dev/urandom of=file.$i bs=1024 count=100000;done

After the first two files had been written completely, and while the third file
was about 70% written (roughly 70 MB), I issued a detach-tier start. Writes to
the third file stopped at that point while the detach-tier went on. After the
detach-tier start completed, dd went on to create the 4th file on the cold
tier. The 3rd file was therefore left incomplete and its remaining writes were
lost.

See the listing of the cold brick directory:
[root at zod ~]# ###############no destach start when file3 is creating##########
[root at zod ~]# ll /rhs/brick*/lank*
/rhs/brick1/lanka:
total 335244
-rw-r--r--. 2 root root 102400000 Sep 21 15:26 file.1
-rw-r--r--. 2 root root 102400000 Sep 21 15:27 file.2
-rw-r--r--. 2 root root  71508992 Sep 21 15:27 file.3   <- writes to this file stopped as soon as detach-tier start was issued (the files were also migrated to this cold brick, as seen here)
-rw-r--r--. 2 root root  57340928 Sep 21 15:28 file.4   <- new file, created after detach-tier start completed

==========Mount point fuse error========
[root at localhost lanka]# for i in {1..100};do dd if=/dev/urandom of=file.$i bs=1024 count=100000;done
100000+0 records in
100000+0 records out
102400000 bytes (102 MB) copied, 53.8811 s, 1.9 MB/s
100000+0 records in
100000+0 records out
102400000 bytes (102 MB) copied, 55.6703 s, 1.8 MB/s
dd: error writing ‘file.3’: Input/output error
69858+0 records in
69857+0 records out
71533568 bytes (72 MB) copied, 41.3222 s, 1.7 MB/s
100000+0 records in
100000+0 records out
102400000 bytes (102 MB) copied, 59.3975 s, 1.7 MB/s
100000+0 records in
100000+0 records out
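
A possible way to spot the truncated file after a run like the one above. This
is only a sketch: list_short_files is a hypothetical helper, not a GlusterFS
tool. It relies on every complete file being bs x count = 1024 x 100000 =
102400000 bytes, as the successful dd runs report:

```shell
#!/bin/sh
# Expected size of one complete file written with: dd bs=1024 count=100000
EXPECTED_BYTES=$((1024 * 100000))   # 102400000

# list_short_files DIR [EXPECTED]
# Hypothetical helper: prints "name size" for every file.* in DIR whose size
# differs from EXPECTED (defaults to EXPECTED_BYTES).
list_short_files() {
    dir=$1
    expected=${2:-$EXPECTED_BYTES}
    for f in "$dir"/file.*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -ne "$expected" ]; then
            echo "${f##*/} $size"
        fi
    done
}
```

It can be pointed at the mount point or at a brick directory, for example
list_short_files /rhs/brick1/lanka. A file that dd is still writing is flagged
as well, so the check is best run after the loop has finished.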





Version-Release number of selected component (if applicable):
===============================================================

[root at zod ~]# gluster --version
glusterfs 3.7.4 built on Sep 19 2015 01:30:43
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root at zod ~]# rpm -qa|grep gluster
glusterfs-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-debuginfo-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-api-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-client-xlators-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-server-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-cli-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-libs-3.7.4-0.43.gitf139283.el7.centos.x86_64
[root at zod ~]# 


[root at zod ~]# gluster v info

Volume Name: lanka
Type: Tier
Volume ID: 258a9a07-43e8-417e-8152-880ca5186f53
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick7/lanka_hot
Brick2: zod:/rhs/brick7/lanka_hot
Brick3: yarrow:/rhs/brick6/lanka_hot
Brick4: zod:/rhs/brick6/lanka_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 3 x 2 = 6
Brick5: zod:/rhs/brick1/lanka
Brick6: yarrow:/rhs/brick1/lanka
Brick7: zod:/rhs/brick2/lanka
Brick8: yarrow:/rhs/brick2/lanka
Brick9: zod:/rhs/brick3/lanka
Brick10: yarrow:/rhs/brick3/lanka
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root at zod ~]# gluster v status
Status of volume: lanka
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick7/lanka_hot          49261     0          Y       23253
Brick zod:/rhs/brick7/lanka_hot             49278     0          Y       8213 
Brick yarrow:/rhs/brick6/lanka_hot          49260     0          Y       23234
Brick zod:/rhs/brick6/lanka_hot             49277     0          Y       8195 
Cold Bricks:
Brick zod:/rhs/brick1/lanka                 49274     0          Y       8015 
Brick yarrow:/rhs/brick1/lanka              49257     0          Y       22961
Brick zod:/rhs/brick2/lanka                 49275     0          Y       8033 
Brick yarrow:/rhs/brick2/lanka              49258     0          Y       22981
Brick zod:/rhs/brick3/lanka                 49276     0          Y       8051 
Brick yarrow:/rhs/brick3/lanka              49259     0          Y       22999
NFS Server on localhost                     2049      0          Y       8232 
Quota Daemon on localhost                   N/A       N/A        Y       8246 
NFS Server on yarrow                        2049      0          Y       23294
Quota Daemon on yarrow                      N/A       N/A        Y       23324

Task Status of Volume lanka
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 4306a687-1a83-4df8-8890-bdec702820c0
Status               : in progress         




Steps to Reproduce:
====================
1. Attach a tier to a volume that has quota enabled.
2. Enable CTR.
3. FUSE-mount the volume.
4. Start creating about 15 files of roughly 100 MB each in a loop using dd.
5. After two files have been created completely, and while the 3rd file is
   being written, issue a detach-tier start.
6. After the detach-tier start completes, the 3rd file create ends abruptly
   with an I/O error on the mount, and the 4th file is created on the cold
   tier.
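
The timing in steps 4 to 6 could be scripted roughly as follows. This is a
sketch under assumptions, not the script used for this report: the mount point
/mnt/lanka and the helpers wait_for_file and write_files are hypothetical, and
steps 1 to 3 are assumed to be done already. The detach-tier command has to run
on a server node, so it is shown only in the usage comment:

```shell
#!/bin/sh
# VOL is the volume name from this report; MNT is a hypothetical mount point.
VOL=${VOL:-lanka}
MNT=${MNT:-/mnt/lanka}

# wait_for_file PATH [TRIES]
# Polls once per second until PATH exists. Returns 0 when it appears,
# 1 after TRIES unsuccessful checks (default 600).
wait_for_file() {
    path=$1
    tries=${2:-600}
    while [ "$tries" -gt 0 ]; do
        if [ -e "$path" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# write_files DIR COUNT
# Same dd invocation as in the report, but stops at the first failed write
# and reports which file it was.
write_files() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        if ! dd if=/dev/urandom of="$dir/file.$i" bs=1024 count=100000; then
            echo "write failed on file.$i" >&2
            return 1
        fi
        i=$((i + 1))
    done
}

# Intended use (not executed here):
#   write_files "$MNT" 15 & writer=$!           # step 4, on the client
#   wait_for_file "$MNT/file.3"                 # step 5: 3rd create has started
#   gluster volume detach-tier "$VOL" start     # step 5, on a server node
#   wait "$writer" || echo "I/O error seen"     # step 6
```

Stopping at the first failed dd, instead of continuing the loop as in the
original command, makes the failing file easier to identify.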

Expected results:
===================
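A seamless detach-tier is required: writes that are in progress should complete
without I/O errors while the tier is being detached.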


