[Bugs] [Bug 1211264] New: Data Tiering: glusterd(management) communication issues seen on tiering setup

bugzilla at redhat.com
Mon Apr 13 13:12:35 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1211264

            Bug ID: 1211264
           Summary: Data Tiering: glusterd(management) communication
                    issues seen on tiering setup
           Product: GlusterFS
           Version: mainline
         Component: tiering
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: nchilaka at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
======================
While executing commands like quota enable, attach-tier, detach-tier, etc. on a
cluster with at least one tiered volume, errors are observed while updating the
volume tables on the other nodes of the cluster.
Some examples are:
1) volume remove-brick unknown: failed: Commit failed on localhost. Please
check the log file for more details.

2) Sometimes when a command like detach-tier or quota disable is issued on a
multi-node cluster, the command gets executed only on the local node and the
change fails to get updated in the tables or graphs of the other nodes.

We have sometimes seen this issue even on non-tiered volumes, but only after
tiering commands have been executed on that cluster.
There seems to be an issue with management daemon (glusterd) communication
between nodes.
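
A quick way to check the glusterd-communication theory (a diagnostic sketch;
the log path assumes a default installation) is to compare what each
management daemon believes and to watch its log while issuing the failing
command:

# run on every node; all peers should show "Peer in Cluster (Connected)"
gluster peer status

# run on every node; on a healthy cluster the outputs are identical
gluster volume info disperse

# watch the management daemon log on each node while issuing the command
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log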

In more detail, I issued a detach-tier command from one node's CLI, and the
following is the output seen from each node's respective CLI:

(local node, where I have been executing all the commands so far)
[root at rhs-client6 glusterd]# gluster v info disperse

Volume Name: disperse
Type: Disperse
Volume ID: a6e4f8dd-bbf8-484f-9d1c-c6267899bb0a
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: yarrow:/yarrow_200G_7/disperse
Brick2: yarrow:/yarrow_200G_8/disperse
Brick3: rhs-client6:/brick15/disperse

[root at yarrow glusterd]# gluster v info disperse

Volume Name: disperse
Type: Tier
Volume ID: a6e4f8dd-bbf8-484f-9d1c-c6267899bb0a
Status: Started
Number of Bricks: 2 x 2 = 5
Transport-type: tcp
Bricks:
Brick1: rhs-client6:/brick16/disperse
Brick2: yarrow:/yarrow_ssd_75G_2/disperse
Brick3: yarrow:/yarrow_200G_7/disperse
Brick4: yarrow:/yarrow_200G_8/disperse
Brick5: rhs-client6:/brick15/disperse


It can be clearly seen that the other node has not been updated: the local
node (rhs-client6) already reports the plain Disperse layout, while yarrow
still reports the volume as Type: Tier, with an inconsistent brick count
("2 x 2 = 5").
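
To confirm the divergence outside the CLI (a sketch, assuming glusterd's
default working directory of /var/lib/glusterd), the volume definition each
daemon has persisted to disk can be compared directly:

# run on each node; differing checksums mean the volume stores have diverged
md5sum /var/lib/glusterd/vols/disperse/info

# inspect the stored volume type and brick entries on each node
grep -E '^(type|brick)' /var/lib/glusterd/vols/disperse/info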


Version-Release number of selected component (if applicable):
============================================================
[root at rhs-client6 glusterd]# gluster --version
glusterfs 3.7dev built on Apr 13 2015 07:14:27
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.

[root at rhs-client6 glusterd]# rpm -qa|grep gluster
glusterfs-api-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-libs-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-fuse-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-cli-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-server-3.7dev-0.994.gitf522001.el6.x86_64




How reproducible:
================
Quite easily.


Steps to Reproduce:
==================
1. Install the latest nightly build.
2. Create a cluster with at least two nodes.
3. Create a tiered volume.
4. Try to enable and then disable quota; the issue can be seen. Sometimes even
a detach-tier alone can reproduce it. A hedged command sketch follows.
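
For reference, a minimal command sequence along these lines (hostnames and
brick paths are placeholders, and the attach-tier/detach-tier syntax is the
one in the 3.7 development branch at the time of this report):

gluster volume create testvol disperse 3 redundancy 1 node1:/bricks/b1 node1:/bricks/b2 node2:/bricks/b3 force
gluster volume start testvol
gluster volume attach-tier testvol node1:/bricks/hot1 node2:/bricks/hot2
gluster volume quota testvol enable
gluster volume quota testvol disable
# compare 'gluster volume info testvol' on every node at this point;
# a detach-tier issued here can reproduce the inconsistency as well
gluster volume detach-tier testvol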
