[Bugs] [Bug 1225284] New: Disperse volume: I/O error on client when USS is turned on
bugzilla at redhat.com
Wed May 27 03:07:20 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1225284
Bug ID: 1225284
Summary: Disperse volume: I/O error on client when USS is
turned on
Product: GlusterFS
Version: 3.7.0
Component: disperse
Keywords: Triaged
Severity: high
Priority: high
Assignee: bugs at gluster.org
Reporter: pkarampu at redhat.com
CC: annair at redhat.com, aspandey at redhat.com,
bugs at gluster.org, byarlaga at redhat.com,
gluster-bugs at redhat.com, pkarampu at redhat.com
Depends On: 1188145
Blocks: 1186580 (qe_tracker_everglades), 1214994, 1224112,
1224188
+++ This bug was initially created as a clone of Bug #1188145 +++
Description of problem:
=======================
While IO is in progress on the client (for example, creating
files/directories), turning the USS feature on or off causes an IO error.
Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.7dev built on Jan 29 2015 01:05:44
How reproducible:
=================
100%
Number of volumes :
===================
1
Volume Names :
==============
testvol
Volume on which the particular issue is seen [ if applicable ]
===============================================================
testvol
Type of volumes :
=================
Disperse
Volume options if available :
=============================
[root@dhcp37-120 ~]# gluster volume get testvol all
Option Value
------ -----
cluster.lookup-unhashed on
cluster.min-free-disk 10%
cluster.min-free-inodes 5%
cluster.rebalance-stats off
cluster.subvols-per-directory (null)
cluster.readdir-optimize off
cluster.rsync-hash-regex (null)
cluster.extra-hash-regex (null)
cluster.dht-xattr-name trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid off
cluster.local-volume-name (null)
cluster.weighted-rebalance on
cluster.switch-pattern (null)
cluster.entry-change-log on
cluster.read-subvolume (null)
cluster.read-subvolume-index -1
cluster.read-hash-mode 1
cluster.background-self-heal-count 16
cluster.metadata-self-heal on
cluster.data-self-heal on
cluster.entry-self-heal on
cluster.self-heal-daemon on
cluster.self-heal-window-size 1
cluster.data-change-log on
cluster.metadata-change-log on
cluster.data-self-heal-algorithm (null)
cluster.eager-lock on
cluster.quorum-type none
cluster.quorum-count (null)
cluster.choose-local true
cluster.self-heal-readdir-size 1KB
cluster.post-op-delay-secs 1
cluster.ensure-durability on
cluster.stripe-block-size 128KB
cluster.stripe-coalesce true
diagnostics.latency-measurement off
diagnostics.dump-fd-stats off
diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO
diagnostics.client-log-level INFO
diagnostics.brick-sys-log-level CRITICAL
diagnostics.client-sys-log-level CRITICAL
diagnostics.brick-logger (null)
diagnostics.client-logger (null)
diagnostics.brick-log-format (null)
diagnostics.client-log-format (null)
diagnostics.brick-log-buf-size 5
diagnostics.client-log-buf-size 5
diagnostics.brick-log-flush-timeout 120
diagnostics.client-log-flush-timeout 120
performance.cache-max-file-size 0
performance.cache-min-file-size 0
performance.cache-refresh-timeout 1
performance.cache-priority
performance.cache-size 32MB
performance.io-thread-count 16
performance.high-prio-threads 16
performance.normal-prio-threads 16
performance.low-prio-threads 16
performance.least-prio-threads 1
performance.enable-least-priority on
performance.least-rate-limit 0
performance.cache-size 128MB
performance.flush-behind on
performance.nfs.flush-behind on
performance.write-behind-window-size 1MB
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct off
performance.nfs.strict-o-direct off
performance.strict-write-ordering off
performance.nfs.strict-write-ordering off
performance.lazy-open yes
performance.read-after-open no
performance.read-ahead-page-count 4
performance.md-cache-timeout 1
features.encryption off
encryption.master-key (null)
encryption.data-key-size 256
encryption.block-size 4096
network.frame-timeout 1800
network.ping-timeout 42
network.tcp-window-size (null)
features.lock-heal off
features.grace-timeout 10
network.remote-dio disable
network.tcp-window-size (null)
network.inode-lru-limit 16384
auth.allow *
auth.reject (null)
transport.keepalive (null)
server.allow-insecure (null)
server.root-squash off
server.anonuid 65534
server.anongid 65534
server.statedump-path /var/run/gluster
server.outstanding-rpc-limit 64
features.lock-heal off
features.grace-timeout (null)
server.ssl (null)
auth.ssl-allow *
server.manage-gids off
client.send-gids on
server.gid-timeout 2
server.own-thread (null)
performance.write-behind on
performance.read-ahead on
performance.readdir-ahead off
performance.io-cache on
performance.quick-read on
performance.open-behind on
performance.stat-prefetch on
performance.client-io-threads off
performance.nfs.write-behind on
performance.nfs.read-ahead off
performance.nfs.io-cache off
performance.nfs.quick-read off
performance.nfs.stat-prefetch off
performance.nfs.io-threads off
performance.force-readdirp true
features.file-snapshot off
features.uss on
features.snapshot-directory .snaps
features.show-snapshot-directory off
network.compression off
network.compression.window-size -15
network.compression.mem-level 8
network.compression.min-size 0
network.compression.compression-level -1
network.compression.debug false
features.limit-usage (null)
features.quota-timeout 0
features.default-soft-limit 80%
features.soft-timeout 60
features.hard-timeout 5
features.alert-time 86400
features.quota-deem-statfs off
geo-replication.indexing off
geo-replication.indexing off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
features.quota on
debug.trace off
debug.log-history no
debug.log-file no
debug.exclude-ops (null)
debug.include-ops (null)
debug.error-gen off
debug.error-failure (null)
debug.error-number (null)
debug.random-failure off
debug.error-fops (null)
nfs.enable-ino32 no
nfs.mem-factor 15
nfs.export-dirs on
nfs.export-volumes on
nfs.addr-namelookup off
nfs.dynamic-volumes off
nfs.register-with-portmap on
nfs.outstanding-rpc-limit 16
nfs.port 2049
nfs.rpc-auth-unix on
nfs.rpc-auth-null on
nfs.rpc-auth-allow all
nfs.rpc-auth-reject none
nfs.ports-insecure off
nfs.trusted-sync off
nfs.trusted-write off
nfs.volume-access read-write
nfs.export-dir
nfs.disable false
nfs.nlm on
nfs.acl on
nfs.mount-udp off
nfs.mount-rmtab /var/lib/glusterd/nfs/rmtab
nfs.rpc-statd /sbin/rpc.statd
nfs.server-aux-gids off
nfs.drc off
nfs.drc-size 0x20000
nfs.read-size (1 * 1048576ULL)
nfs.write-size (1 * 1048576ULL)
nfs.readdir-size (1 * 1048576ULL)
features.read-only off
features.worm off
storage.linux-aio off
storage.batch-fsync-mode reverse-fsync
storage.batch-fsync-delay-usec 0
storage.owner-uid -1
storage.owner-gid -1
storage.node-uuid-pathinfo off
storage.health-check-interval 30
storage.build-pgfid off
storage.bd-aio off
cluster.server-quorum-type off
cluster.server-quorum-ratio 0
changelog.changelog off
changelog.changelog-dir (null)
changelog.encoding ascii
changelog.rollover-time 15
changelog.fsync-interval 5
changelog.changelog-barrier-timeout 120
features.barrier disable
features.barrier-timeout 120
locks.trace disable
cluster.disperse-self-heal-daemon enable
[root@dhcp37-120 ~]#
Output of gluster volume info :
===============================
[root@dhcp37-120 ~]# gluster volume info
Volume Name: testvol
Type: Disperse
Volume ID: ad1a31fb-2e69-4d5d-9ae0-d057879b8fd5
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp37-120:/rhs/brick1/b1
Brick2: dhcp37-208:/rhs/brick1/b1
Brick3: dhcp37-178:/rhs/brick1/b1
Brick4: dhcp37-183:/rhs/brick1/b1
Brick5: dhcp37-120:/rhs/brick2/b2
Brick6: dhcp37-208:/rhs/brick2/b2
Options Reconfigured:
features.quota: on
features.uss: on
[root@dhcp37-120 ~]#
Output of gluster volume status :
=================================
[root@dhcp37-120 ~]# gluster volume status
Status of volume: testvol
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick dhcp37-120:/rhs/brick1/b1 49152 Y 16353
Brick dhcp37-208:/rhs/brick1/b1 49156 Y 30577
Brick dhcp37-178:/rhs/brick1/b1 49156 Y 30675
Brick dhcp37-183:/rhs/brick1/b1 49156 Y 30439
Brick dhcp37-120:/rhs/brick2/b2 49153 Y 4957
Brick dhcp37-208:/rhs/brick2/b2 49157 Y 30588
Snapshot Daemon on localhost 49154 Y 16412
NFS Server on localhost 2049 Y 16420
Quota Daemon on localhost N/A Y 16374
Snapshot Daemon on dhcp37-183 49164 Y 26945
NFS Server on dhcp37-183 2049 Y 27899
Quota Daemon on dhcp37-183 N/A Y 30515
Snapshot Daemon on dhcp37-178 49164 Y 8563
NFS Server on dhcp37-178 2049 Y 8571
Quota Daemon on dhcp37-178 N/A Y 30751
Snapshot Daemon on dhcp37-208 49165 Y 9581
NFS Server on dhcp37-208 2049 Y 9589
Quota Daemon on dhcp37-208 N/A Y 30678
Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks
Steps to Reproduce:
===================
1. Create a 1x(4+2) disperse volume.
2. Start creating files/dirs from the client.
3. On one of the storage nodes, toggle the USS feature with 'gluster volume
set testvol uss on/off' (a consolidated sketch follows below).
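A consolidated reproduction sketch, combining the steps above with the
file-creation loop quoted later in this report. The mount point /mnt/testvol
and the toggle count are assumptions for illustration; the report does not
name a mount point:

# On the client (mount point is an assumption):
mount -t glusterfs dhcp37-120:/testvol /mnt/testvol
cd /mnt/testvol
for i in `seq 1 1000`; do dd if=/dev/urandom of=testfile.$i bs=1k count=$i; done &

# On one of the storage nodes, toggle USS while the IO runs:
for i in `seq 1 10`; do
    gluster volume set testvol uss off; sleep 5
    gluster volume set testvol uss on; sleep 5
done

# Watch the dd output on the client for "Input/output error".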
Actual results:
=================
IO error on the client
Expected results:
=================
There should not be any IO error
Additional info:
================
Volume statedumps will be attached to this bug.
--- Additional comment from Bhaskarakiran on 2015-03-09 03:34:01 EDT ---
The commands used to create files/directories are:
for i in `seq 1 1000`; do dd if=/dev/urandom of=testfile.$i bs=1k count=$i; done
for i in `seq 1 1000`; do mkdir dir.$i; done
--- Additional comment from Pranith Kumar K on 2015-03-09 05:19:29 EDT ---
Tried re-creating the issue, but the test case worked fine. It seems this
bug is also fixed by http://review.gluster.com/9717
--- Additional comment from Bhaskarakiran on 2015-05-05 02:06:14 EDT ---
When USS is turned off, IO hangs completely; when it is turned back on, IO
resumes. Moving the bug back.
--- Additional comment from Anand Avati on 2015-05-19 05:07:31 EDT ---
REVIEW: http://review.gluster.org/10787 (cluster/ec: Correctly cleanup delayed
locks) posted (#2) for review on master by Xavier Hernandez
(xhernandez at datalab.es)
--- Additional comment from Anand Avati on 2015-05-19 07:15:53 EDT ---
REVIEW: http://review.gluster.org/10787 (cluster/ec: Correctly cleanup delayed
locks) posted (#3) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-20 04:00:04 EDT ---
COMMIT: http://review.gluster.org/10787 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com)
------
commit 61cfcf65f0d4ad70fc8a47395c583d4b5bf1efbe
Author: Xavier Hernandez <xhernandez at datalab.es>
Date: Thu May 14 20:07:10 2015 +0200
cluster/ec: Correctly cleanup delayed locks
When a delayed lock is pending, a graph switch doesn't correctly
terminate it. This means that the update of the version and size xattrs
is lost, causing EIO errors.
This patch handles the GF_EVENT_PARENT_DOWN event to correctly finish
pending updates before completing the graph switch.
Change-Id: I394f3b8d41df8d83cdd36636aeb62330f30a66d5
BUG: 1188145
Signed-off-by: Xavier Hernandez <xhernandez at datalab.es>
Reviewed-on: http://review.gluster.org/10787
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
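Since the commit above ties the EIO to version/size xattr updates lost
during a graph switch, one rough way to check the fix is to force a graph
switch mid-write (toggling USS does this) and then inspect the disperse
metadata xattrs on the bricks. This is a sketch only; the xattr names
trusted.ec.version and trusted.ec.size are assumptions inferred from the
commit message, so verify them against the ec sources:

# Force a graph switch on the client while a write is in flight:
dd if=/dev/urandom of=/mnt/testvol/probe bs=1k count=10000 &
gluster volume set testvol uss off
wait

# On each brick, dump the disperse xattrs for the file. The trusted.ec.*
# names are assumed from the commit message's "version and size xattrs":
getfattr -d -m 'trusted.ec.' -e hex /rhs/brick1/b1/probe

# Before the fix, a brick could be left holding stale version/size
# values, which ec later surfaces to the client as EIO.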
--- Additional comment from Anand Avati on 2015-05-20 23:43:33 EDT ---
REVIEW: http://review.gluster.org/10868 (cluster/ec: Fix use after free crash)
posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-21 05:07:56 EDT ---
REVIEW: http://review.gluster.org/10868 (cluster/ec: Fix use after free crash)
posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-21 05:11:23 EDT ---
REVIEW: http://review.gluster.org/10868 (cluster/ec: Fix use after free crash)
posted (#3) for review on master by Xavier Hernandez (xhernandez at datalab.es)
--- Additional comment from Anand Avati on 2015-05-21 06:14:24 EDT ---
REVIEW: http://review.gluster.org/10868 (cluster/ec: Fix use after free crash)
posted (#4) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)
--- Additional comment from Anand Avati on 2015-05-21 09:08:28 EDT ---
COMMIT: http://review.gluster.org/10868 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com)
------
commit 0910bab5e5b957e11f356d525eccccfd36d334f9
Author: Pranith Kumar K <pkarampu at redhat.com>
Date: Wed May 20 23:56:17 2015 +0530
cluster/ec: Fix use after free crash
ec_heal creates an ec_fop_data but doesn't run ec_manager.
ec_fop_data_allocate adds this fop to ec->pending_fops; because
ec_manager is not run on this heal fop, it is never removed from
ec->pending_fops, and accessing it after it is freed leads to a crash.
It is better not to add HEAL fops to ec->pending_fops, because we
don't want a graph switch to hang the mount on account of a BIG
file/directory heal.
BUG: 1188145
Change-Id: I8abdc92f06e0563192300ca4abca3909efcca9c3
Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
Reviewed-on: http://review.gluster.org/10868
Reviewed-by: Xavier Hernandez <xhernandez at datalab.es>
Tested-by: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra at redhat.com>
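A rough sketch of how the scenario in this commit could be exercised: make
a large file need heal, then force a graph switch while the heal is in
flight. The brick PID, the file size, and the use of 'start ... force' to
restart the killed brick are assumptions for illustration:

# Kill one brick so a subsequent write needs heal (PID taken from the
# 'gluster volume status' output above, e.g. 16353 for Brick1); a 4+2
# disperse volume tolerates this and the write still succeeds:
kill 16353
dd if=/dev/urandom of=/mnt/testvol/bigfile bs=1M count=4096

# Restart the dead brick and trigger a full heal:
gluster volume start testvol force
gluster volume heal testvol full

# Force a graph switch while the big-file heal runs; before this fix,
# the freed heal fop left in ec->pending_fops could crash the client,
# and waiting on heal fops during the switch could hang the mount:
gluster volume set testvol uss off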
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1188145
[Bug 1188145] Disperse volume: I/O error on client when USS is turned on
https://bugzilla.redhat.com/show_bug.cgi?id=1214994
[Bug 1214994] Disperse volume: Rebalance failed when plain disperse volume
is converted to distributed disperse volume
https://bugzilla.redhat.com/show_bug.cgi?id=1224112
[Bug 1224112] Disperse volume: I/O error on client when USS is turned on
https://bugzilla.redhat.com/show_bug.cgi?id=1224188
[Bug 1224188] Disperse volume: Rebalance failed when plain disperse volume
is converted to distributed disperse volume
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.