[Bugs] [Bug 1399088] New: geo-replica slave node goes faulty for non-root user session due to fail to locate gluster binary

bugzilla at redhat.com bugzilla at redhat.com
Mon Nov 28 09:30:06 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1399088

            Bug ID: 1399088
           Summary: geo-replica slave node goes faulty for non-root user
                    session due to fail to locate gluster binary
           Product: GlusterFS
           Version: 3.8
         Component: geo-replication
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: avishwan at redhat.com
                CC: amukherj at redhat.com, avishwan at redhat.com,
                    bugs at gluster.org, csaba at redhat.com,
                    pdhange at redhat.com, rhinduja at redhat.com,
                    rhs-bugs at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1382241, 1383898, 1386123
            Blocks: 1388150



+++ This bug was initially created as a clone of Bug #1386123 +++

+++ This bug was initially created as a clone of Bug #1383898 +++

+++ This bug was initially created as a clone of Bug #1382241 +++

Description of problem:
The slave nodes go to the Faulty state because the Popen command fails with the
error "execution of "gluster" failed with ENOENT (No such file or directory)"
when a geo-replication session is started for a non-root user.


How reproducible:
Frequently for non-root user

Steps to Reproduce:
1. Set up the master cluster
2. Set up the slave cluster
3. Create a volume on the master and slave clusters
4. Create a geo-replication session between the master and slave volumes for a
non-root user (an example create command is shown after these steps)
5. Start geo-replication session
# gluster volume geo-replication geovol geouser@slave-node1::geovol start
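
For step 4, a non-root session is typically created with a command of the
following form (same illustrative user, host and volume names as above; the
mountbroker setup that non-root sessions require on the slave is not shown):

# gluster volume geo-replication geovol geouser@slave-node1::geovol create push-pem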


Actual results:
On the master node, the geo-replication log shows the following errors:
[2016-10-06 02:21:42.558072] E [resource(/brick/brick_georepl_01):226:errlog]
Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
/var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S
/tmp/gsyncd-aux-ssh-2jmykC/57bcd6e6cf5884cd7845fa826e0cf3b5.sock
geouser at dell10-pd-gluster-node2 /nonexistent/gsyncd --session-owner
e2a48078-ba0d-4f22-9fbb-482c07654c09 -N --listen --timeout 120
gluster://localhost:geovol" returned with 1, saying:
[2016-10-06 02:21:42.558177] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> Warning: Permanently added 'dell10-pd-gluster-node2,10.74.130.162'
(ECDSA) to the list of known hosts.^M
[2016-10-06 02:21:42.558247] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.305069] I [cli.c:721:main] 0-cli: Started
running /usr/sbin/gluster with version 3.7.9
[2016-10-06 02:21:42.558303] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.305097] I [cli.c:608:cli_rpc_init] 0-cli:
Connecting to remote glusterd at localhost
[2016-10-06 02:21:42.558374] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.376366] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2016-10-06 02:21:42.558440] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.376450] I [socket.c:2472:socket_event_handler]
0-transport: disconnecting now
[2016-10-06 02:21:42.558495] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.376966] I
[cli-rpc-ops.c:6514:gf_cli_getwd_cbk] 0-cli: Received resp to getwd
[2016-10-06 02:21:42.558548] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.376998] I [input.c:36:cli_batch] 0-: Exiting
with: 0
[2016-10-06 02:21:42.558599] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.440936] I [gsyncd(slave):710:main_i] <top>:
syncing: gluster://localhost:geovol
[2016-10-06 02:21:42.558651] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.445895] E
[syncdutils(slave):247:log_raise_exception] <top>: execution of "gluster"
failed with ENOENT (No such file or directory)
[2016-10-06 02:21:42.558701] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> failure: execution of "gluster" failed with ENOENT (No such file or
directory)
[2016-10-06 02:21:42.558753] E [resource(/brick/brick_georepl_01):230:logerr]
Popen: ssh> [2016-10-06 02:21:29.446126] I [syncdutils(slave):220:finalize]
<top>: exiting.

Expected results:
All slave nodes should be in Active/Passive state



--- Additional comment from Prashant Dhange on 2016-10-12 01:33:29 EDT ---

Adding the /usr/sbin path to the PATH environment variable in the non-root
user's .bashrc file on all slave nodes resolves the issue:

1. Edit /home/geouser/.bashrc
2. Add the lines below to /home/geouser/.bashrc:
PATH=/usr/sbin:$PATH
export PATH
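
As a quick check (assuming the same geouser account and hostname as above), the
non-root user should then be able to locate the gluster binary from a
non-interactive SSH session, which is the kind of environment gsyncd runs in on
the slave. Something like the following should print /usr/sbin/gluster once the
PATH change is picked up:

# ssh geouser@slave-node1 'which gluster'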

--- Additional comment from Aravinda VK on 2016-10-13 06:27:33 EDT ---

Debugged this issue. If the geo-replication config file is not readable by the
slave gsyncd, then gconf.gluster_command_dir is substituted with its default,
an empty value.

gluster_bin_path = gluster_command_dir + "gluster"

If gluster_command_dir is empty, then gluster_bin_path will be set to "gluster"
instead of "/usr/sbin/gluster".

The .bashrc step is not required if
/var/lib/glusterd/geo-replication/gsyncd_template.conf is readable by the slave
user.
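
As a rough sketch of what happens (this is not the actual gsyncd code;
read_config_value is a made-up helper for illustration):

    # Simplified illustration only -- not the real gsyncd implementation.
    import errno
    import subprocess

    def read_config_value(conf_file, key, default=""):
        # If conf_file is unreadable (e.g. owned by root with mode 0600 while
        # running as the non-root slave user), fall back to the default silently.
        try:
            with open(conf_file) as f:
                for line in f:
                    if "=" in line and line.split("=", 1)[0].strip() == key:
                        return line.split("=", 1)[1].strip()
        except IOError:
            pass
        return default

    gluster_command_dir = read_config_value(
        "/var/lib/glusterd/geo-replication/gsyncd_template.conf",
        "gluster_command_dir")                  # "" when the file is unreadable

    gluster_bin_path = gluster_command_dir + "gluster"   # "gluster", not "/usr/sbin/gluster"

    try:
        subprocess.call([gluster_bin_path, "--version"])
    except OSError as e:
        if e.errno == errno.ENOENT:
            # A bare "gluster" is resolved through PATH, which for the non-root
            # user does not include /usr/sbin -- hence the ENOENT in the logs.
            raise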

--- Additional comment from Atin Mukherjee on 2016-10-13 12:47:05 EDT ---

(In reply to Aravinda VK from comment #3)
> Debugged this issue. If the geo-replication config file is not readable by the
> slave gsyncd, then gconf.gluster_command_dir is substituted with its default,
> an empty value.
> 
> gluster_bin_path = gluster_command_dir + "gluster"
> 
> If gluster_command_dir is empty, then gluster_bin_path will be set to
> "gluster" instead of "/usr/sbin/gluster".
> 
> The .bashrc step is not required if
> /var/lib/glusterd/geo-replication/gsyncd_template.conf is readable by the
> slave user.

If it is expected that a slave user should have read permission, which
apparently was not the case here, is this a valid bug? If not, can this BZ be
closed?

--- Additional comment from Prashant Dhange on 2016-10-14 02:57:04 EDT ---

Considering that, for the non-root user, the gluster_command_dir value could
not be read due to the gsyncd_template.conf permission issue:

Would it be possible to set the default value for gluster_command_dir to
'/usr/sbin', if there is no harm in making this value the default? Are there
any consequences if we do so?

I am suggesting this change based on the default installation path for the
gluster binaries.
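
A minimal sketch of the suggestion (illustrative only; the config dict below is
a stand-in for whatever values could actually be parsed from the conf file):

    import os

    config = {}   # nothing parsed, e.g. because the conf file is unreadable
    gluster_command_dir = config.get("gluster_command_dir", "/usr/sbin/")
    gluster_bin_path = os.path.join(gluster_command_dir, "gluster")
    print(gluster_bin_path)   # /usr/sbin/gluster even without a readable config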

--- Additional comment from Aravinda VK on 2016-10-17 08:19:40 EDT ---

(In reply to Prashant Dhange from comment #5)
> Considering that, for the non-root user, the gluster_command_dir value could
> not be read due to the gsyncd_template.conf permission issue:
> 
> Would it be possible to set the default value for gluster_command_dir to
> '/usr/sbin', if there is no harm in making this value the default? Are there
> any consequences if we do so?
> 
> I am suggesting this change based on the default installation path for the
> gluster binaries.

Good suggestion to use default values in code instead of blanks. But we don't
know what other problems exist with default values of variables that can't be
read from the default config. The proper fix is to raise an error when
template.conf is not readable, instead of silently continuing without reading
the config values and substituting other values.
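
Roughly, the proposed behaviour would look like this (illustrative sketch, not
the actual patch):

    import os

    conf_file = "/var/lib/glusterd/geo-replication/gsyncd_template.conf"
    if not os.access(conf_file, os.R_OK):
        # Fail loudly instead of silently continuing with empty defaults.
        raise SystemExit("config file %s is not readable" % conf_file)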

--- Additional comment from Aravinda VK on 2016-10-18 04:42:39 EDT ---

Posted Patch to Master http://review.gluster.org/15669

--- Additional comment from Worker Ant on 2016-10-18 08:07:57 EDT ---

REVIEW: http://review.gluster.org/15669 (geo-rep: Assert error if gsyncd conf
file is not readable) posted (#2) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Worker Ant on 2016-10-19 02:03:58 EDT ---

REVIEW: http://review.gluster.org/15669 (geo-rep: Assert error if gsyncd conf
file is not readable) posted (#3) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Worker Ant on 2016-10-19 03:12:16 EDT ---

REVIEW: http://review.gluster.org/15669 (geo-rep: Assert error if gsyncd conf
file is not readable) posted (#4) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Worker Ant on 2016-10-21 06:40:39 EDT ---

REVIEW: http://review.gluster.org/15669 (geo-rep: Upgrade conf file only if it
is session config) posted (#5) for review on master by Aravinda VK
(avishwan at redhat.com)

--- Additional comment from Aravinda VK on 2016-10-21 06:42:52 EDT ---

This upstream patch already asserts if there is any issue while opening the
config file: http://review.gluster.org/14777

Modified patch http://review.gluster.org/15669 to ignore upgrading the config
file if it is not the session config.
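
The idea of the modified patch, roughly (illustrative sketch only;
upgrade_config below is a placeholder, not the real function):

    import os

    TEMPLATE_CONF = "/var/lib/glusterd/geo-replication/gsyncd_template.conf"

    def upgrade_config(conf_file):
        pass   # placeholder: the actual config upgrade would happen here

    def maybe_upgrade_config(conf_file):
        # Only per-session config files are upgraded; the shared template,
        # which may not even be readable by the slave user, is left alone.
        if os.path.abspath(conf_file) == TEMPLATE_CONF:
            return
        upgrade_config(conf_file)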

--- Additional comment from Worker Ant on 2016-10-24 11:26:29 EDT ---

COMMIT: http://review.gluster.org/15669 committed in master by Aravinda VK
(avishwan at redhat.com) 
------
commit 1506c7a98d8d3b31e68d0f214ab331f28ffa9fb5
Author: Aravinda VK <avishwan at redhat.com>
Date:   Tue Oct 18 13:34:57 2016 +0530

    geo-rep: Upgrade conf file only if it is session config

    Ignore config upgrade if it is template config file present in
    /var/lib/glusterd/geo-replication/gsyncd_template.conf

    BUG: 1386123
    Change-Id: I2cbba3103b6801c16ff57f778a90b9a0bb2467cf
    Signed-off-by: Aravinda VK <avishwan at redhat.com>
    Reviewed-on: http://review.gluster.org/15669
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Kotresh HR <khiremat at redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1386123
[Bug 1386123] geo-replica slave node goes faulty for non-root user session
due to fail to locate gluster binary
https://bugzilla.redhat.com/show_bug.cgi?id=1388150
[Bug 1388150] geo-replica slave node goes faulty for non-root user session
due to fail to locate gluster binary