[Bugs] [Bug 1221476] New: Data Tiering:rebalance fails on a tiered volume

bugzilla at redhat.com bugzilla at redhat.com
Thu May 14 06:48:32 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1221476

            Bug ID: 1221476
           Summary: Data Tiering:rebalance fails on a tiered volume
           Product: GlusterFS
           Version: 3.7.0
         Component: tiering
          Keywords: Triaged
          Severity: urgent
          Priority: urgent
          Assignee: bugs at gluster.org
          Reporter: rkavunga at redhat.com
        QA Contact: bugs at gluster.org
                CC: annair at redhat.com, bugs at gluster.org,
                    dlambrig at redhat.com, gluster-bugs at redhat.com,
                    josferna at redhat.com, nchilaka at redhat.com,
                    rkavunga at redhat.com
        Depends On: 1205624
            Blocks: 1186580 (qe_tracker_everglades), 1199352
                    (glusterfs-3.7.0)



+++ This bug was initially created as a clone of Bug #1205624 +++

Description of problem:
=======================
The rebalance operation fails on a tiered volume.
The same operation succeeds on a regular (non-tiered) volume.


Version-Release number of selected component (if applicable):
============================================================
3.7 upstream nightly build
http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.777.git2308c07.autobuild/


How reproducible:
=================
Reproduced twice on a tiered volume.


Steps to Reproduce:
==================
1. Create a gluster volume (I created a distribute volume) and start it.
2. Create some files on the volume.
3. Attach a tier to the volume using attach-tier.
4. Run a rebalance on the tiered volume using "gluster v rebalance <vol> start".
5. Check the status of the rebalance.
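
The steps above can be sketched as a command list. This is only an illustration: it builds the gluster CLI invocations as argv lists and does not run them. The brick paths are taken from the volume info later in this report, the split into cold and hot bricks is an assumption, and the exact attach-tier argument order is assumed from the step text rather than confirmed here. Step 2 (creating files) needs a client mount and is left out.

```python
def repro_commands(volume, cold_bricks, hot_bricks):
    """Return the gluster CLI invocations for the steps as argv lists.

    Nothing is executed. Step 2 (creating files on a mount) is omitted.
    """
    return [
        ["gluster", "volume", "create", volume, *cold_bricks],
        ["gluster", "volume", "start", volume],
        # Assumed syntax: attach-tier <volume> <hot brick>...
        ["gluster", "volume", "attach-tier", volume, *hot_bricks],
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


if __name__ == "__main__":
    cold = [
        "rhs-client44:/pavanbrick1/nag_vol2/b1",
        "rhs-client37:/pavanbrick1/nag_vol2/b1",
        "rhs-client38:/pavanbrick1/nag_vol2/b1",
    ]
    hot = [
        "rhs-client37:/pavanbrick2/nag_vol2",
        "rhs-client44:/pavanbrick2/nag_vol2/hb1",
    ]
    for argv in repro_commands("nag_vol2", cold, hot):
        print(" ".join(argv))
```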


Actual results:
===============
The rebalance fails, as shown below:
[root at rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
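
Note that the last line of the status output reads "success" even though every node reports "failed", so the per-node status column has to be checked. A minimal sketch of such a check follows, assuming the eight-column row layout shown above. The function name is hypothetical, and status values that contain spaces are not handled.

```python
SAMPLE_STATUS = """\
Node Rebalanced-files size scanned failures skipped status run time in secs
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 0 0Bytes 0 0 0 failed 0.00
rhs-client38 0 0Bytes 0 0 0 failed 0.00
rhs-client37 0 0Bytes 0 0 0 failed 0.00
volume rebalance: nag_vol2: success:
"""


def failed_nodes(status_output):
    """Return the node names whose status column reads 'failed'."""
    failed = []
    for line in status_output.splitlines():
        fields = line.split()
        # A data row has 8 columns: node, files, size, scanned,
        # failures, skipped, status, run time.
        if len(fields) != 8:
            continue
        try:
            float(fields[7])  # run time; skips the dashed separator row
        except ValueError:
            continue
        if fields[6] == "failed":
            failed.append(fields[0])
    return failed


if __name__ == "__main__":
    print(failed_nodes(SAMPLE_STATUS))
```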


Expected results:
================
Rebalance should succeed on a tiered volume as well.


Additional info (CLI logs):
===========================

[root at rhs-client44 glusterfs]# tail -f nag_vol2-rebalance.log 
-------------------------------------------------------------
[2015-03-25 10:45:33.631882] I [MSGID: 100030] [glusterfsd.c:2288:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7dev
(args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/nag_vol2
--xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes
--xlator-option *dht.assert-no-child-down=yes --xlator-option
*replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off --xlator-option
*replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on
--xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option
*dht.rebalance-cmd=1 --xlator-option
*dht.node-uuid=1327654c-0521-46f8-8be3-b0f9c183d137 --socket-file
/var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock
--pid-file
/var/lib/glusterd/vols/nag_vol2/rebalance/1327654c-0521-46f8-8be3-b0f9c183d137.pid
-l /var/log/glusterfs/nag_vol2-rebalance.log)
[2015-03-25 10:45:33.642596] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:38.631172] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'node-uuid' for volume 'tier-dht' with value
'1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631207] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'rebalance-cmd' for volume 'tier-dht' with value '1'
[2015-03-25 10:45:38.631222] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'xattr-name' for volume 'tier-dht' with value 'trusted.tier-gfid'
[2015-03-25 10:45:38.631244] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'readdir-optimize' for volume 'tier-dht' with value 'on'
[2015-03-25 10:45:38.631259] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'assert-no-child-down' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631272] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'lookup-unhashed' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631288] I [graph.c:269:gf_add_cmdline_options] 0-tier-dht:
adding option 'use-readdirp' for volume 'tier-dht' with value 'yes'
[2015-03-25 10:45:38.631300] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'node-uuid' for volume 'nag_vol2-hot-dht'
with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631323] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'rebalance-cmd' for volume 'nag_vol2-hot-dht'
with value '1'
[2015-03-25 10:45:38.631337] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'readdir-optimize' for volume
'nag_vol2-hot-dht' with value 'on'
[2015-03-25 10:45:38.631354] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'assert-no-child-down' for volume
'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631367] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'lookup-unhashed' for volume
'nag_vol2-hot-dht' with value 'yes'
[2015-03-25 10:45:38.631380] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-hot-dht: adding option 'use-readdirp' for volume 'nag_vol2-hot-dht'
with value 'yes'
[2015-03-25 10:45:38.631397] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'node-uuid' for volume 'nag_vol2-cold-dht'
with value '1327654c-0521-46f8-8be3-b0f9c183d137'
[2015-03-25 10:45:38.631415] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'rebalance-cmd' for volume
'nag_vol2-cold-dht' with value '1'
[2015-03-25 10:45:38.631427] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'readdir-optimize' for volume
'nag_vol2-cold-dht' with value 'on'
[2015-03-25 10:45:38.631443] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'assert-no-child-down' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631455] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'lookup-unhashed' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.631471] I [graph.c:269:gf_add_cmdline_options]
0-nag_vol2-cold-dht: adding option 'use-readdirp' for volume
'nag_vol2-cold-dht' with value 'yes'
[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht:
using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32]
0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht:
Initialization of volume 'tier-dht' failed, review your volfile again
[2015-03-25 10:45:38.633326] E [graph.c:322:glusterfs_graph_init] 0-tier-dht:
initializing translator failed
[2015-03-25 10:45:38.633336] E [graph.c:661:glusterfs_graph_activate] 0-graph:
init failed
[2015-03-25 10:45:38.633716] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-:
received signum (0), shutting down
##########################################################################################################################################################
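
In the rebalance log above, the first warning ("unknown option: write-freq-threshold") is immediately followed by the tier-dht init failure. The sketch below shows one way to pull the W and E records out of such a log. It assumes the record layout seen here (timestamp, level letter, optional MSGID, source location, message) and rejoins records that were wrapped across lines. The function names are hypothetical.

```python
import re

# A record starts with a bracketed timestamp such as [2015-03-25 10:45:38.633278]
RECORD_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ")
RECORD = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) (?:\[MSGID: \d+\] )?"
    r"\[(?P<src>[^\]]+)\] (?P<msg>.*)$"
)

SAMPLE_LOG = r"""[2015-03-25 10:45:38.632109] I [dht-shared.c:340:dht_init_regex] 0-tier-dht:
using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2015-03-25 10:45:38.633278] W [options.c:1193:xlator_option_init_int32]
0-tier-dht: unknown option: write-freq-threshold
[2015-03-25 10:45:38.633313] E [xlator.c:426:xlator_init] 0-tier-dht:
Initialization of volume 'tier-dht' failed, review your volfile again
"""


def log_records(log_text):
    """Yield one string per log record, joining wrapped continuation lines."""
    current = []
    for line in log_text.splitlines():
        if RECORD_START.match(line) and current:
            yield " ".join(current)
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        yield " ".join(current)


def log_problems(log_text, levels=("W", "E")):
    """Return (level, source, message) for every record at the given levels."""
    found = []
    for record in log_records(log_text):
        match = RECORD.match(record)
        if match and match.group("level") in levels:
            found.append((match.group("level"), match.group("src"), match.group("msg")))
    return found


if __name__ == "__main__":
    for level, src, msg in log_problems(SAMPLE_LOG):
        print(level, src, msg)
```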


[root at rhs-client44 glusterfs]#tail -f etc-glusterfs-glusterd.vol.log 
---------------------------------------------------------------------

[2015-03-25 10:45:33.548238] I
[glusterd-utils.c:8923:glusterd_generate_and_set_task_id] 0-management:
Generated task-id e8891b0d-3861-4104-bc96-1510aceed88d for key rebalance-id
[2015-03-25 10:45:38.626851] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-03-25 10:45:38.634730] W [socket.c:642:__socket_rwv] 0-management: readv
on /var/run/gluster/gluster-rebalance-4f00d705-0ab4-4a6e-8605-15493153db76.sock
failed (No data available)
[2015-03-25 10:45:38.730494] I [MSGID: 106007]
[glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance
process for volume nag_vol2 has disconnected.
[2015-03-25 10:45:38.730534] I [mem-pool.c:557:mem_pool_destroy] 0-management:
size=588 max=0 total=0
[2015-03-25 10:45:38.730550] I [mem-pool.c:557:mem_pool_destroy] 0-management:
size=124 max=0 total=0
[2015-03-25 10:45:43.733289] E
[glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to
get index
[2015-03-25 10:45:43.746431] E
[glusterd-utils.c:8078:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to
get index
[2015-03-25 10:45:48.840974] I
[glusterd-handler.c:3970:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume nag_vol2
##########################################################################################################################################################

[root at rhs-client44 glusterfs]# tail -f cli.log 
----------------------------------------------
[2015-03-25 10:45:33.543436] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:33.543559] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:36.412879] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:39.413296] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:42.413729] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:43.879079] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2015-03-25 10:45:48.838839] I [event-epoll.c:629:event_dispatch_epoll_worker]
0-epoll: Started thread with index 1
[2015-03-25 10:45:48.839012] I [socket.c:2409:socket_event_handler]
0-transport: disconnecting now
[2015-03-25 10:45:48.851709] I [input.c:36:cli_batch] 0-: Exiting with: 0




[root at rhs-client44 glusterfs]# gluster v info vol1

Volume Name: vol1
Type: Tier
Volume ID: 3382e788-ee37-4d6c-b214-8469ca68e376
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/vol1_hot/hb2
Brick2: rhs-client44:/pavanbrick2/vol1_hot/hb2
Brick3: rhs-client44:/pavanbrick1/vol1/b1
Brick4: rhs-client38:/pavanbrick1/vol1/b1
Brick5: rhs-client37:/pavanbrick1/vol1/b1
[root at rhs-client44 glusterfs]# gluster v rebalance start vol1
Usage: volume rebalance <VOLNAME> {{fix-layout start} | {start
[force]|stop|status}}
[root at rhs-client44 glusterfs]# gluster v rebalance vol1 start
volume rebalance: vol1: success: Rebalance on vol1 has been started
successfully. Use rebalance status command to check status of the rebalance
process.
ID: 368050e1-75ab-4332-83c1-f5fb7c4fea41

[root at rhs-client44 glusterfs]# gluster v rebalance vol1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: vol1: success: 
[root at rhs-client44 glusterfs]# gluster v info nag_vol2

Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client37:/pavanbrick2/nag_vol2
Brick2: rhs-client44:/pavanbrick2/nag_vol2/hb1
Brick3: rhs-client44:/pavanbrick1/nag_vol2/b1
Brick4: rhs-client37:/pavanbrick1/nag_vol2/b1
Brick5: rhs-client38:/pavanbrick1/nag_vol2/b1
[root at rhs-client44 glusterfs]# gluster v status nag_vol2
Status of volume: nag_vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-client37:/pavanbrick2/nag_vol2    49157     0          Y       32000
Brick rhs-client44:/pavanbrick2/nag_vol2/hb
1                                           49157     0          Y       32707
Brick rhs-client44:/pavanbrick1/nag_vol2/b1 49156     0          Y       32535
Brick rhs-client37:/pavanbrick1/nag_vol2/b1 49156     0          Y       31885
Brick rhs-client38:/pavanbrick1/nag_vol2/b1 49155     0          Y       625  
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on rhs-client38                  N/A       N/A        N       N/A  
NFS Server on rhs-client37                  N/A       N/A        N       N/A  

Task Status of Volume nag_vol2
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : e8891b0d-3861-4104-bc96-1510aceed88d
Status               : failed              

[root at rhs-client44 glusterfs]# gluster v rebalance nag_vol2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            rhs-client38                0        0Bytes             0             0             0               failed               0.00
                            rhs-client37                0        0Bytes             0             0             0               failed               0.00
volume rebalance: nag_vol2: success: 
[root at rhs-client44 glusterfs]# 
[root at rhs-client44 glusterfs]# 
[root at rhs-client44 glusterfs]# 
[root at rhs-client44 glusterfs]# gluster --version
glusterfs 3.7dev built on Mar 24 2015 01:04:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
[root at rhs-client44 glusterfs]# gluster pool lsit
unrecognized word: lsit (position 1)
[root at rhs-client44 glusterfs]# gluster pool list
UUID                    Hostname        State
0f5fa6d4-8545-41ec-8f5e-9612fa72262a    rhs-client38    Connected 
456e0cc9-e2fc-44fb-a4ff-aec8fe60cba2    rhs-client37    Connected 
1327654c-0521-46f8-8be3-b0f9c183d137    localhost       Connected

--- Additional comment from Dan Lambright on 2015-04-21 14:44:55 EDT ---

We do not support rebalance with tiered volumes. You need to detach a tier,
then rebalance it. The CLI should probably say this and the rebalance command
should fail gracefully.
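
As an illustration of the "fail gracefully" idea, a caller could check the "Type:" line of the volume info output before starting a rebalance. This is only a client-side sketch with hypothetical function names. It is not the change under review for this bug, which would sit in the gluster management code.

```python
SAMPLE_INFO = """\
Volume Name: nag_vol2
Type: Tier
Volume ID: 4f00d705-0ab4-4a6e-8605-15493153db76
Status: Started
Number of Bricks: 5 x 1 = 5
Transport-type: tcp
"""


def volume_type(info_text):
    """Return the value of the 'Type:' line from volume info output, or None."""
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Type":
            return value.strip()
    return None


def rebalance_allowed(info_text):
    """Refuse rebalance on tiered volumes, per the comment above."""
    return volume_type(info_text) != "Tier"


if __name__ == "__main__":
    if not rebalance_allowed(SAMPLE_INFO):
        print("Rebalance is not supported on a tiered volume; detach the tier first.")
```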

--- Additional comment from Anand Avati on 2015-04-23 07:00:02 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#1) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Dan Lambright on 2015-04-24 07:52:36 EDT ---



--- Additional comment from Anand Avati on 2015-04-30 02:15:32 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#2) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-05 02:57:13 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#3) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-05 09:28:58 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#4) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-06 03:25:04 EDT ---

REVIEW: http://review.gluster.org/10349 (tiering: Do not allow some operations
on tiered volume) posted (#5) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1199352
[Bug 1199352] GlusterFS 3.7.0 tracker
https://bugzilla.redhat.com/show_bug.cgi?id=1205624
[Bug 1205624] Data Tiering:rebalance fails on a tiered volume

