[Bugs] [Bug 1229236] New: Data Tiering:rebalance fails on a tiered volume

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 8 10:16:42 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1229236

            Bug ID: 1229236
           Summary: Data Tiering:rebalance fails on a tiered volume
           Product: Red Hat Gluster Storage
           Version: 3.1
         Component: glusterfs
     Sub Component: tiering
          Keywords: Triaged
          Severity: urgent
          Priority: urgent
          Assignee: rhs-bugs at redhat.com
          Reporter: nchilaka at redhat.com
        QA Contact: nchilaka at redhat.com
                CC: annair at redhat.com, bugs at gluster.org,
                    dlambrig at redhat.com, gluster-bugs at redhat.com,
                    josferna at redhat.com, rkavunga at redhat.com
        Depends On: 1205624
            Blocks: 1186580 (qe_tracker_everglades), 1227188, 1221476



+++ This bug was initially created as a clone of Bug #1205624 +++

Description of problem:
=======================
The rebalance operation fails on a tiered volume.
The same operation passes on a regular (non-tiered) volume.


Version-Release number of selected component (if applicable):
============================================================
3.7 upstream nightlies build
http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.777.git2308c07.autobuild/


How reproducible:
=================
Reproduced twice on a tiered volume.


Steps to Reproduce:
==================
1. Create a gluster volume (I created a distribute-type volume) and start it.
2. Create some files on the volume.
3. Attach a tier to the volume using attach-tier.
4. Run a rebalance on the tiered volume using "gluster v rebalance <vol> start".
5. Check the status of the rebalance.
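
The steps above can be sketched as a command sequence. This is a dry-run helper that only builds and prints the gluster invocations; it does not run them. The function name repro_commands is hypothetical, and the split of the nag_vol2 bricks into hot and cold tiers is an assumption based on their paths in the "gluster v info" output in this report.

```python
import shlex


def repro_commands(vol, cold_bricks, hot_bricks):
    """Return the gluster CLI invocations for the reproduction steps as argv lists."""
    return [
        # Step 1: create and start a plain distribute volume.
        ["gluster", "volume", "create", vol, *cold_bricks],
        ["gluster", "volume", "start", vol],
        # Step 2 (creating files) happens on a client mount and is not shown here.
        # Step 3: attach the hot tier.
        ["gluster", "volume", "attach-tier", vol, *hot_bricks],
        # Steps 4 and 5: start the rebalance, then query its status.
        ["gluster", "volume", "rebalance", vol, "start"],
        ["gluster", "volume", "rebalance", vol, "status"],
    ]


if __name__ == "__main__":
    # Brick paths come from the report; which ones are hot or cold is assumed.
    cold = [
        "rhs-client44:/pavanbrick1/nag_vol2/b1",
        "rhs-client37:/pavanbrick1/nag_vol2/b1",
        "rhs-client38:/pavanbrick1/nag_vol2/b1",
    ]
    hot = [
        "rhs-client37:/pavanbrick2/nag_vol2",
        "rhs-client44:/pavanbrick2/nag_vol2/hb1",
    ]
    for argv in repro_commands("nag_vol2", cold, hot):
        print(shlex.join(argv))
```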


Actual results:
===============
The rebalance fails on every node, as shown below:
[root at rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
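
Note that the final line says "success" even though every node reports "failed", so a script cannot rely on that line alone. Below is a minimal sketch of checking the per-node status column instead. The function names are hypothetical, and the column order (node, files, size, scanned, failures, skipped, status, run time) is assumed from the output above.

```python
def parse_rebalance_status(text):
    """Parse `gluster v rebalance <vol> status` output into per-node dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least 8 fields and a numeric file count;
        # this skips the header, the dashed separator and the summary line.
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        rows.append({
            "node": fields[0],
            "rebalanced_files": int(fields[1]),
            "size": fields[2],
            "scanned": int(fields[3]),
            "failures": int(fields[4]),
            "skipped": int(fields[5]),
            # The status may be more than one word, so take everything
            # between the skipped count and the run time.
            "status": " ".join(fields[6:-1]),
            "run_time_secs": float(fields[-1]),
        })
    return rows


def failed_nodes(text):
    """Return the names of nodes whose rebalance status is 'failed'."""
    return [r["node"] for r in parse_rebalance_status(text) if r["status"] == "failed"]


# Rows taken from the report, with whitespace condensed.
SAMPLE = """\
Node Rebalanced-files size scanned failures skipped status run time in secs
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 0 0Bytes 0 0 0 failed 0.00
rhs-client38 0 0Bytes 0 0 0 failed 0.00
rhs-client37 0 0Bytes 0 0 0 failed 0.00
volume rebalance: nag_vol2: success:
"""

if __name__ == "__main__":
    print(failed_nodes(SAMPLE))
```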


Expected results:
================
Rebalance should pass on a tiered volume too.


Additional info (CLI logs):
===========================

[root at rhs-client44 glusterfs]# tail -f nag_vol2-rebalance.log 
-------------------------------------------------------------
[2015-03-25 10:45:33.631882] I [MSGID: 100030] [glusterfsd.c:2288:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7dev
(args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/nag_vol2
--xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes
--xlator-option *dht.assert-no-child-down=yes --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off --xlator-option
*replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on
--xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option
*dht.rebalance-cmd=1 --xlator-option
*dht.node-uuid=1327654c-0521-46f8-8be3-b0f9c183d137 --socket-file
/var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock
--pid-file
/var/lib/glusterd/vols/nag_vol2/rebalance/1327654c-0521-46f8-8be3-b0f9c183d137.pid
-l /var/log/glusterfs/nag_vol2-rebalance.log)
[2015-03-25 10:45:33.642596] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:38.631172] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'node-uuid' for volume 'tier-dht' with value
'1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631207] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'rebalance-cmd' for volume 'tier-dht' with value '1'
[2015-03-25 10:45:38.631222] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'xattr-name' for volume 'tier-dht' with value 'trusted.tier-gfid'
[2015-03-25 10:45:38.631244] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'readdir-optimize' for volume 'tier-dht' with value 'on'
[2015-03-25 10:45:38.631259] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'assert-no-child-down' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631272] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'lookup-unhashed' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631288] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'use-readdirp' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631300] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'node-uuid' for volume 'nag_vol2-hot-dht'
with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631323] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-hot-dht'
with value '1'
[2015-03-25 10:45:38.631337] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'readdir-optimize' for volume
'nag_vol2-hot-dht' with value 'on'
[2015-03-25 10:45:38.631354] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'assert-no-child-down' for volume
'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631367] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'lookup-unhashed' for volume
'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631380] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'use-readdirp' for volume 'nag_vol2-hot-dht'
with value 'yes'
[2015-03-25 10:45:38.631397] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'node-uuid' for volume 'nag_vol2-cold-dht'
with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631415] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'rebalance-cmd' for volume
'nag_vol2-cold-dht' with value '1'
[2015-03-25 10:45:38.631427] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'readdir-optimize' for volume
'nag_vol2-cold-dht' with value 'on'
[2015-03-25 10:45:38.631443] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'assert-no-child-down' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631455] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'lookup-unhashed' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht:
using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32]
0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht:
Initialization of volume 'tier-dht' failed, review your volfile again
[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht:
initializing translator failed
[2015-03-25 10:45:38.633336] E [graph.c:661:glusterfs_graph_activate] 0-graph:
init failed
[2015-03-25 10:45:38.633716] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-:
received signum (0), shutting down
##########################################################################################################################################################
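
In the rebalance log above, the only warning and error records are the "unknown option: write-freq-threshold" warning and the tier-dht initialization failures that follow it. Below is a minimal sketch of pulling those records out of a log that has been line-wrapped by mail, as this one has. The record layout ([timestamp] LEVEL [file:line:function] message) is inferred from the lines in this report, and the function names are hypothetical.

```python
import re

# A new record starts with a bracketed timestamp.
LOG_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ")
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?\[(?P<where>[^\]]+)\] (?P<msg>.*)$"
)


def rejoin(lines):
    """Merge wrapped continuation lines back into one string per log record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if LOG_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def problems(lines, levels=("W", "E")):
    """Return (level, location, message) for records at the given levels."""
    found = []
    for record in rejoin(lines):
        m = LOG_LINE.match(record)
        if m and m.group("level") in levels:
            found.append((m.group("level"), m.group("where"), m.group("msg")))
    return found


# Excerpt from the rebalance log in this report, wrapped as received.
SAMPLE = r"""
[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht:
using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32]
0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht:
Initialization of volume 'tier-dht' failed, review your volfile again
"""

if __name__ == "__main__":
    for level, where, msg in problems(SAMPLE.splitlines()):
        print(level, where, msg)
```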


[root at rhs-client44 glusterfs]#tail -f etc-glusterfs-glusterd.vol.log 
---------------------------------------------------------------------

[2015-03-25 10:45:33.548238] I
[glusterd-utils.c:8923:glusterd_generate_and_set_task_id] 0-management:
Generated task-id e8891b0d-3861-4104-bc96-1510aceed88d for key rebalance-id
[2015-03-25 10:45:38.626851] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-03-25 10:45:38.634730] W [socket.c:642:__socket_rwv] 0-management: readv
on /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock
failed (No data available)
[2015-03-25 10:45:38.730494] I [MSGID: 106007]
[glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance
process for volume nag_vol2 has disconnected.
[2015-03-25 10:45:38.730534] I [mem-pool.c:557:mem_pool_destroy] 0-management:
size=588 max=0 total=0
[2015-03-25 10:45:38.730550] I [mem-pool.c:557:mem_pool_destroy] 0-management:
size=124 max=0 total=0
[2015-03-25 10:45:43.733289] E
[glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to
get index
[2015-03-25 10:45:43.746431] E
[glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to
get index
[2015-03-25 10:45:48.840974] I
[glusterd-handler.c:3970:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume nag_vol2
##########################################################################################################################################################

[root at rhs-client44 glusterfs]# tail -f cli.log 
----------------------------------------------
[2015-03-25 10:45:33.543436] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:33.543559] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:36.412879] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:39.413296] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:42.413729] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:43.879079] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2015-03-25 10:45:48.838839] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:48.839012] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:48.851709] I [input.c:36:cli_batch] 0-: Exiting with: 0




[root at rhs-client44 glusterfs]# gluster v info vol1

Volume Name: vol1
Type: Tier
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/vol1_hot/hb2
Brick2: rhs-client44:/pavanbrick2/vol1_hot/hb2
Brick3: rhs-client44:/pavanbrick1/vol1/b1
Brick4: rhs-client38:/pavanbrick1/vol1/b1
Brick5: rhs-client37:/pavanbrick1/vol1/b1
[root at rhs-client44 glusterfs]# gluster v rebalance start vol1
Usage: volume rebalance <VOLNAME> {{fix-layout start} | {start
[force]|stop|status}}
[root at rhs-client44 glusterfs]# gluster v rebalance vol1 start
volume rebalance: vol1: success: Rebalance on vol1 has been started
successfully. Use rebalance status command to check status of the rebalance
process.
ID: 368050e1-75ab-4332-83c1-f5fb7c4fea41

[root at rhs-client44 glusterfs]# gluster v rebalance vol1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: vol1: success: 
[root at rhs-client44 glusterfs]# gluster v info nag_vol2

Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/nag_vol2
Brick2: rhs-client44:/pavanbrick2/nag_vol2/hb1
Brick3: rhs-client44:/pavanbrick1/nag_vol2/b1
Brick4: rhs-client37:/pavanbrick1/nag_vol2/b1
Brick5: rhs-client38:/pavanbrick1/nag_vol2/b1
[root at rhs-client44 glusterfs]# gluster v status nag_vol2
Status of volume: nag_vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-client37:/pavanbrick2/nag_vol2    49157     0          Y       32000
Brick rhs-client44:/pavanbrick2/nag_vol2/hb
1                                           49157     0          Y       32707
Brick rhs-client44:/pavanbrick1/nag_vol2/b1 49156     0          Y       32535
Brick rhs-client37:/pavanbrick1/nag_vol2/b1 49156     0          Y       31885
Brick rhs-client38:/pavanbrick1/nag_vol2/b1 49155     0          Y       625  
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on rhs-client38                  N/A       N/A        N       N/A  
NFS Server on rhs-client37                  N/A       N/A        N       N/A  

Task Status of Volume nag_vol2
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : e8891b0d-3861-4104-bc96-1510aceed88d
Status               : failed              

[root at rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
[root at rhs-client44 glusterfs]# gluster --version
glusterfs 3.7dev built on Mar 24 2015 01:04:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root at rhs-client44 glusterfs]# gluster pool lsit
unrecognized word: lsit (position 1)
[root at rhs-client44 glusterfs]# gluster pool list
UUID                    Hostname        State
0f5fa6d4-8545-41ec-8f5e-9612fa72262a    rhs-client38    Connected 
456e0cc9-e2fc-44fb-a4ff-aec8fe60cba2    rhs-client37    Connected 
1327654c-0521-46f8-8be3-b0f9c183d137    localhost       Connected

--- Additional comment from Dan Lambright on 2015-04-21 14:44:55 EDT ---

We do not support rebalance with tiered volumes. You need to detach a tier,
then rebalance it. The CLI should probably say this and the rebalance command
should fail gracefully.
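
A rough sketch of the kind of pre-check suggested here: read the "Type:" field from "gluster v info" output and refuse the rebalance up front when it is "Tier". This is only an illustration in Python, not the patch under review. The function names and the error message wording are hypothetical.

```python
def volume_type(info_text):
    """Return the value of the 'Type:' field from `gluster v info` output, or None."""
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Type":
            return value.strip()
    return None


def check_rebalance_allowed(volname, info_text):
    """Return (ok, message); refuses rebalance when the volume is tiered."""
    if volume_type(info_text) == "Tier":
        # Hypothetical message wording, not taken from the gluster CLI.
        return False, (
            f"volume rebalance: {volname}: failed: rebalance is not supported "
            "on a tiered volume; detach the tier first"
        )
    return True, ""


# `gluster v info nag_vol2` output as shown in this report.
INFO = """\
Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/nag_vol2
Brick2: rhs-client44:/pavanbrick2/nag_vol2/hb1
"""

if __name__ == "__main__":
    ok, message = check_rebalance_allowed("nag_vol2", INFO)
    print(ok, message)
```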

--- Additional comment from Anand Avati on 2015-04-23 07:00:02 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#1) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Dan Lambright on 2015-04-24 07:52:36 EDT ---



--- Additional comment from Anand Avati on 2015-04-30 02:15:32 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#2) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-05 02:57:13 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#3) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-05 09:28:58 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#4) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-06 03:25:04 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#5) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Niels de Vos on 2015-05-15 08:57:36 EDT ---

This change should not be in "ON_QA", the patch posted for this bug is only
available in the master branch and not in a release yet. Moving back to
MODIFIED until there is a beta release for the next GlusterFS version.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1205624
[Bug 1205624] Data Tiering:rebalance fails on a tiered volume
https://bugzilla.redhat.com/show_bug.cgi?id=1221476
[Bug 1221476] Data Tiering:rebalance fails on a tiered volume
https://bugzilla.redhat.com/show_bug.cgi?id=1227188
[Bug 1227188] Data Tiering:rebalance fails on a tiered volume