[Bugs] [Bug 1221534] New: rebalance failed after attaching the tier to the volume.

bugzilla at redhat.com bugzilla at redhat.com
Thu May 14 10:09:33 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1221534

            Bug ID: 1221534
           Summary: rebalance failed after attaching the tier to the
                    volume.
           Product: GlusterFS
           Version: 3.7.0
         Component: tiering
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: trao at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:

After attaching a tier to the volume, the rebalance status shows failed and no
data is rebalanced between the cold and hot tiers.


Version-Release number of selected component (if applicable):
[root at rhsqa14-vm3 ~]# glusterfs --version
glusterfs 3.7.0beta2 built on May 11 2015 01:27:45
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root at rhsqa14-vm3 ~]# 
[root at rhsqa14-vm3 ~]# 
[root at rhsqa14-vm3 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.0beta2-0.0.el6.x86_64
glusterfs-fuse-3.7.0beta2-0.0.el6.x86_64
glusterfs-rdma-3.7.0beta2-0.0.el6.x86_64
glusterfs-3.7.0beta2-0.0.el6.x86_64
glusterfs-api-3.7.0beta2-0.0.el6.x86_64
glusterfs-cli-3.7.0beta2-0.0.el6.x86_64
glusterfs-geo-replication-3.7.0beta2-0.0.el6.x86_64
glusterfs-extra-xlators-3.7.0beta2-0.0.el6.x86_64
glusterfs-client-xlators-3.7.0beta2-0.0.el6.x86_64
glusterfs-server-3.7.0beta2-0.0.el6.x86_64
[root at rhsqa14-vm3 ~]#


How reproducible:
Easily.

Steps to Reproduce:
1. Create a distributed-replicate volume.
2. FUSE mount it and create some directories.
3. Attach a tier to the volume.
4. Rebalance status shows failed.
5. On the mount point, the previously created directories are missing (see the
   command sketch below).
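
A minimal command-level sketch of these steps, mirroring the two-brick cold
volume and the attach-tier command captured under "Additional info" below; the
client mount point /mnt/test and the directory name are assumptions:

# On one of the servers: create and start the cold volume (bricks as in "Additional info").
gluster volume create test 10.70.46.233:/rhs/brick4/t1 10.70.46.236:/rhs/brick4/t1
gluster volume start test

# On a client: FUSE mount the volume and create a directory (mount point assumed).
mount -t glusterfs 10.70.46.233:/test /mnt/test
mkdir /mnt/test/triveni

# Back on a server: attach the hot tier and check rebalance status.
gluster volume attach-tier test replica 2 \
    10.70.46.233:/rhs/brick3/t2 10.70.46.236:/rhs/brick3/t2 \
    10.70.46.233:/rhs/brick5/t2 10.70.46.236:/rhs/brick5/t2 force
gluster volume rebalance test status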


Actual results:

Rebalance fails on both nodes, and the directories created before attaching
the tier are no longer visible on the mount point.

Expected results:

Rebalance between the cold and hot tiers succeeds, and existing directories
remain visible on the mount point.

Additional info:

[root at rhsqa14-vm1 linux-4.0]# gluster  v info test

Volume Name: test
Type: Distribute
Volume ID: d5b37bbc-af62-48cd-8942-43b5a395da65
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.46.233:/rhs/brick4/t1
Brick2: 10.70.46.236:/rhs/brick4/t1
Options Reconfigured:
performance.readdir-ahead: on 
[root at rhsqa14-vm1 linux-4.0]# 
[root at rhsqa14-vm1 linux-4.0]# gluster v attach-tier test replica 2
10.70.46.233:/rhs/brick3/t2 10.70.46.236:/rhs/brick3/t2
10.70.46.233:/rhs/brick5/t2 10.70.46.236:/rhs/brick5/t2 force
Attach tier is recommended only for testing purposes in this release. Do you
want to continue? (y/n) y
volume attach-tier: success
volume rebalance: test: success: Rebalance on test has been started
successfully. Use rebalance status command to check status of the rebalance
process.
ID: a89419aa-c45b-42ef-8998-1a23cde16efe

[root at rhsqa14-vm1 linux-4.0]# 

[root at rhsqa14-vm1 linux-4.0]# gluster v rebalance test status                   
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------   ----------------
                               localhost                0        0Bytes             0             0             0               failed               0.00
                            10.70.46.236                0        0Bytes             0             0             0               failed               0.00
volume rebalance: test: success:
[root at rhsqa14-vm1 linux-4.0]# 
[root at rhsqa14-vm1 linux-4.0]# gluster v info test

Volume Name: test
Type: Tier
Volume ID: d5b37bbc-af62-48cd-8942-43b5a395da65
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4   
Brick1: 10.70.46.236:/rhs/brick5/t2
Brick2: 10.70.46.233:/rhs/brick5/t2
Brick3: 10.70.46.236:/rhs/brick3/t2
Brick4: 10.70.46.233:/rhs/brick3/t2
Cold Bricks:
Cold Tier Type : Distribute   
Number of Bricks: 2
Brick5: 10.70.46.233:/rhs/brick4/t1
Brick6: 10.70.46.236:/rhs/brick4/t1
Options Reconfigured:
performance.readdir-ahead: on 


Before attaching the tier, on the mount point:
[root at rhsqa14-vm5 disk1]# ls -la
total 4
drwxr-xr-x.  5 root root  133 May 14  2015 .
dr-xr-xr-x. 30 root root 4096 May 14 05:47 ..
-rw-r--r--.  1 root root    0 May 14 05:47 t1
-rw-r--r--.  1 root root    0 May 14 05:47 t2
-rw-r--r--.  1 root root    0 May 14 05:47 t4
drwxr-xr-x.  3 root root   48 May 14  2015 .trashcan
drwxr-xr-x.  2 root root   12 May 14  2015 triveni
[root at rhsqa14-vm5 disk1]#


After attaching the tier, on the mount point:

[root at rhsqa14-vm5 disk1]# ls -la
total 4
drwxr-xr-x.  4 root root  211 May 14  2015 .
dr-xr-xr-x. 30 root root 4096 May 14 05:47 ..
-rw-r--r--.  1 root root    0 May 14 05:47 t1
-rw-r--r--.  1 root root    0 May 14 05:47 t2
-rw-r--r--.  1 root root    0 May 14 05:47 t4
drwxr-xr-x.  3 root root   96 May 14  2015 .trashcan
[root at rhsqa14-vm5 disk1]# 


Rebalance log messages:

[2015-05-14 09:52:37.116716] I [graph.c:269:gf_add_cmdline_options]
0-test-hot-replicate-1: adding option 'data-self-heal' for volume
'test-hot-replicate-1' with value 'off'
[2015-05-14 09:52:37.116736] I [graph.c:269:gf_add_cmdline_options]
0-test-hot-replicate-0: adding option 'readdir-failover' for volume
'test-hot-replicate-0' with value 'off'
[2015-05-14 09:52:37.116747] I [graph.c:269:gf_add_cmdline_options]
0-test-hot-replicate-0: adding option 'entry-self-heal' for volume
'test-hot-replicate-0' with value 'off'
[2015-05-14 09:52:37.116757] I [graph.c:269:gf_add_cmdline_options]
0-test-hot-replicate-0: adding option 'metadata-self-heal' for volume
'test-hot-replicate-0' with value 'off'
[2015-05-14 09:52:37.116767] I [graph.c:269:gf_add_cmdline_options]
0-test-hot-replicate-0: adding option 'data-self-heal' for volume
'test-hot-replicate-0' with value 'off'
[2015-05-14 09:52:37.116778] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'commit-hash' for volume 'test-cold-dht' with
value '2862842620'
[2015-05-14 09:52:37.116788] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'node-uuid' for volume 'test-cold-dht' with
value '87acbf29-e821-48bf-9aa8-bbda9321e609'
[2015-05-14 09:52:37.116798] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'rebalance-cmd' for volume 'test-cold-dht' with
value '6'
[2015-05-14 09:52:37.116809] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'readdir-optimize' for volume 'test-cold-dht'
with value 'on'
[2015-05-14 09:52:37.116820] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'assert-no-child-down' for volume
'test-cold-dht' with value 'yes'
[2015-05-14 09:52:37.116835] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'lookup-unhashed' for volume 'test-cold-dht'
with value 'yes'
[2015-05-14 09:52:37.116846] I [graph.c:269:gf_add_cmdline_options]
0-test-cold-dht: adding option 'use-readdirp' for volume 'test-cold-dht' with
value 'yes'
[2015-05-14 09:52:37.118410] I [dht-shared.c:598:dht_init] 0-tier-dht: dht_init
using commit hash 2862842620
[2015-05-14 09:52:37.120852] E [MSGID: 109037]
[tier.c:1007:tier_load_externals] 0-tier-dht: Error loading libgfdb.so
/usr/lib64/libgfdb.so: cannot open shared object file: No such file or
directory

[2015-05-14 09:52:37.120892] E [MSGID: 109037] [tier.c:1070:tier_init]
0-tier-dht: Could not load externals. Aborting
[2015-05-14 09:52:37.120903] E [xlator.c:426:xlator_init] 0-tier-dht:
Initialization of volume 'tier-dht' failed, review your volfile again
[2015-05-14 09:52:37.120913] E [graph.c:322:glusterfs_graph_init] 0-tier-dht:
initializing translator failed
[2015-05-14 09:52:37.120921] E [graph.c:661:glusterfs_graph_activate] 0-graph:
init failed
[2015-05-14 09:52:37.124323] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-:
received signum (0), shutting down
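
The failure above points at the tier translator being unable to load
libgfdb.so. A quick way to check on the affected node whether the library is
present and which installed package, if any, ships it; this is a diagnostic
sketch, not a confirmed fix:

# Is libgfdb present under /usr/lib64 at all (the path the tier xlator tried)?
ls -l /usr/lib64/libgfdb.so* 2>/dev/null

# If a versioned copy exists, which installed package owns it?
rpm -qf /usr/lib64/libgfdb.so.0 2>/dev/null

# Otherwise, list any gfdb-related files shipped by the installed glusterfs packages.
rpm -qa 'glusterfs*' | xargs rpm -ql | grep -i gfdb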

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.

