[Bugs] [Bug 1663583] New: Geo-replication fails to open logfile "/var/log/glusterfs/cli.log" on slave.

bugzilla at redhat.com bugzilla at redhat.com
Fri Jan 4 23:33:54 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1663583

            Bug ID: 1663583
           Summary: Geo-replication fails to open logfile
                    "/var/log/glusterfs/cli.log" on slave.
           Product: GlusterFS
           Version: 5
          Hardware: All
                OS: Linux
            Status: NEW
         Component: geo-replication
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: vnosov at stonefly.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



Description of problem: When a non-root account on the slave system is used
over SSH to set up geo-replication, initialization fails on the slave because
it cannot open the logfile /var/log/glusterfs/cli.log. As a result,
geo-replication never becomes active and remains in the "Created" status.


Version-Release number of selected component (if applicable):
GlusterFS 5.2, manually installed from GlusterFS source code.

How reproducible: 100%

In our tests:
Master system IP address: 10.10.60.182.
Slave system DNS name: vn-sc-2.westus2.cloudapp.azure.com
Master volume: master-volume-0007
Slave volume:  slave-volume-0001

Steps to Reproduce:

1. Set up a non-privileged account on the slave system; in our case the
account "nasgorep" was used. Test SSH from the master to the slave:

12592 01/04/2019 14:27:35.101560269 1546640855 command: /usr/bin/ssh
nasgorep at vn-sc-2.westus2.cloudapp.azure.com /bin/pwd
12592 01/04/2019 14:27:35.839285578 1546640855 status=0 /usr/bin/ssh
nasgorep at vn-sc-2.westus2.cloudapp.azure.com /bin/pwd
12640 01/04/2019 14:27:37.847050433 1546640857 command: /usr/bin/ssh -q
-oConnectTimeout=5 nasgorep at vn-sc-2.westus2.cloudapp.azure.com /bin/pwd 2>&1
12640 01/04/2019 14:27:38.596877738 1546640858 status=0 /usr/bin/ssh -q
-oConnectTimeout=5 nasgorep at vn-sc-2.westus2.cloudapp.azure.com /bin/pwd 2>&1
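
For reference, a typical way to prepare such an account would look roughly
like the sketch below (hedged: the useradd flags and the ssh-copy-id step are
assumptions, not commands taken from this report):

# On the slave: create the non-privileged geo-replication account.
[root@VN-SC-2 ~]# useradd -m nasgorep
[root@VN-SC-2 ~]# passwd nasgorep
# On the master: install the master's public key for passwordless SSH.
[root@SC-10-10-63-182 ~]# ssh-copy-id nasgorep@vn-sc-2.westus2.cloudapp.azure.com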


2. Set up geo-replication:


[2019-01-04 22:27:52.449174]  : system:: execute gsec_create : SUCCESS
[2019-01-04 22:28:32.288278]  : volume geo-replication master-volume-0007
nasgorep at vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001 create push-pem
: SUCCESS
[2019-01-04 22:28:39.447194]  : volume geo-replication master-volume-0007
nasgorep at vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001 start : SUCCESS
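
For clarity, the cmd_history entries above correspond to these CLI
invocations on the master (reconstructed from the log, prefix assumed):

/usr/sbin/gluster system:: execute gsec_create
/usr/sbin/gluster volume geo-replication master-volume-0007 nasgorep@vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001 create push-pem
/usr/sbin/gluster volume geo-replication master-volume-0007 nasgorep@vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001 start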


Actual results: 

Checking the status of the geo-replication shows it is in the "Created" state:

[root at SC-10-10-63-182 log]# /usr/sbin/gluster volume geo-replication master-volume-0007 nasgorep at vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001 status detail

(Wide status table rewrapped one field per line for readability.)

MASTER NODE:                  10.10.60.182
MASTER VOL:                   master-volume-0007
MASTER BRICK:                 /exports/master-segment-0015/master-volume-0007
SLAVE USER:                   nasgorep
SLAVE:                        nasgorep at vn-sc-2.westus2.cloudapp.azure.com::slave-volume-0001
SLAVE NODE:                   N/A
STATUS:                       Created
CRAWL STATUS:                 N/A
LAST_SYNCED:                  N/A
ENTRY:                        N/A
DATA:                         N/A
META:                         N/A
FAILURES:                     N/A
CHECKPOINT TIME:              N/A
CHECKPOINT COMPLETED:         N/A
CHECKPOINT COMPLETION TIME:   N/A


Expected results: the state of the geo-replication should be "Active".


Additional info:

On the master system, the geo-replication logfile "gsyncd.log" contains the
following error messages:

[root at SC-10-10-63-182 log]# vi /var/log/glusterfs/geo-replication/master-volume-0005_10.10.60.183_slave-volume-0001/gsyncd.log


……….

[2019-01-04 22:28:40.276280] E [syncdutils(monitor):809:errlog] Popen: command
returned error   cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no
-i /var/lib/glusterd/geo-replication/secret.pem -p 22
nasgorep at vn-sc-2.westus2.cloudapp.azure.com /usr/sbin/gluster --xml
--remote-host=localhost volume info slave-volume-0001  error=255
[2019-01-04 22:28:40.277328] E [syncdutils(monitor):813:logerr] Popen: ssh>
ERROR: failed to create logfile "/var/log/glusterfs/cli.log" (Permission
denied)
[2019-01-04 22:28:40.277438] E [syncdutils(monitor):813:logerr] Popen: ssh>
ERROR: failed to open logfile /var/log/glusterfs/cli.log
…….
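
The failing command can be re-run by hand from the master to confirm the
diagnosis; with the pem key from the log above still in place, we expect the
same two errors and exit status 255 (the transcript below is our
reconstruction, not a capture):

[root@SC-10-10-63-182 ~]# ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
    -i /var/lib/glusterd/geo-replication/secret.pem -p 22 \
    nasgorep@vn-sc-2.westus2.cloudapp.azure.com \
    /usr/sbin/gluster --xml --remote-host=localhost volume info slave-volume-0001
ERROR: failed to create logfile "/var/log/glusterfs/cli.log" (Permission denied)
ERROR: failed to open logfile /var/log/glusterfs/cli.log
[root@SC-10-10-63-182 ~]# echo $?
255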

On the slave, the file "cli.log" has the following attributes:

[root at VN-SC-2 log]# ls -l /var/log/glusterfs/cli.log
-rw------- 1 root root 28552 Jan  4 14:35 /var/log/glusterfs/cli.log

It seems the slave system does not let user "nasgorep" open or create the
logfile "/var/log/glusterfs/cli.log" when the "gluster" call is invoked from
the master system through SSH.
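
One possible temporary workaround (our assumption, not a verified fix) is to
grant the geo-replication user write access to the logfile and the log
directory on the slave, e.g. with POSIX ACLs:

[root@VN-SC-2 ~]# setfacl -m u:nasgorep:rw /var/log/glusterfs/cli.log   # write to the existing logfile
[root@VN-SC-2 ~]# setfacl -m u:nasgorep:rwx /var/log/glusterfs          # allow creating the file if it is rotated away

This widens access to a root-owned log directory, so it is only suitable as a
diagnostic step; a real fix would have the CLI log to a location writable by
the mountbroker user.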

On the slave, the command history log shows:

[root at VN-SC-2 log]# cat /var/log/glusterfs/cmd_history.log
[2019-01-04 22:22:49.893261]  : volume status : SUCCESS
[2019-01-04 22:22:59.929183]  : volume geo-replication status : SUCCESS
[2019-01-04 22:24:25.105256]  : volume create slave-volume-0001 transport tcp
VN-SC-2.cxn2n2xii2hepedpq4siv2akfe.xx.internal.cloudapp.net:/exports/nas-segment-0015/slave-volume-0001
: SUCCESS
[2019-01-04 22:24:25.894525]  : volume set slave-volume-0001
nfs.addr-namelookup off : SUCCESS
[2019-01-04 22:24:26.418826]  : volume reset slave-volume-0001
network.ping-timeout : SUCCESS
[2019-01-04 22:24:27.086126]  : volume set slave-volume-0001 nfs.disable on :
SUCCESS
[2019-01-04 22:24:28.082211]  : volume set slave-volume-0001
performance.stat-prefetch off : SUCCESS
[2019-01-04 22:24:28.873671]  : volume set slave-volume-0001
performance.quick-read off : SUCCESS
[2019-01-04 22:24:29.716433]  : volume set slave-volume-0001 allow-insecure on
: SUCCESS
[2019-01-04 22:24:30.272508]  : volume reset slave-volume-0001
nfs.rpc-auth-allow : SUCCESS
[2019-01-04 22:24:31.303757]  : volume start slave-volume-0001 : SUCCESS
[2019-01-04 22:24:51.709793]  : volume geo-replication status : SUCCESS
[2019-01-04 22:24:52.575909]  : volume status slave-volume-0001 : SUCCESS
[2019-01-04 22:27:55.543280]  : system:: uuid get : SUCCESS
[2019-01-04 22:27:55.574006]  : system:: execute mountbroker.py node-add
slave-volume-0001 nasgorep : SUCCESS
[2019-01-04 22:28:38.062065]  : system:: copy file
/geo-replication/master-volume-0007_slave-volume-0001_common_secret.pem.pub :
SUCCESS
[2019-01-04 22:28:38.296566]  : system:: execute add_secret_pub nasgorep
geo-replication/master-volume-0007_slave-volume-0001_common_secret.pem.pub :
SUCCESS
[2019-01-04 22:31:31.417565]  : volume geo-replication slave-volume-0001 status
: SUCCESS
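
Note: the "mountbroker.py node-add" entry above is the step that registers
the non-root user for geo-replication. On GlusterFS releases that ship the
gluster-mountbroker helper, the equivalent steps would look roughly like this
(a hedged sketch; the mount root and exact options are placeholders and may
vary by version):

[root@VN-SC-2 ~]# gluster-mountbroker setup /var/mountbroker-root geogroup
[root@VN-SC-2 ~]# gluster-mountbroker add slave-volume-0001 nasgorep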
