[Gluster-users] Geo-Replication push-pem actually doesn't append common_secret_pub.pem to authorized_keys file

PEPONNET, Cyril N (Cyril) cyril.peponnet at alcatel-lucent.com
Mon Feb 2 19:00:40 UTC 2015


But now I have strange issue:

After creating the geo-rep session and starting it (from nodeB):

[root@nodeB]#  gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE     MASTER VOL    MASTER BRICK               SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC           myvol         /export/raid/myvol         slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0

[root@nodeB]#  gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE     MASTER VOL    MASTER BRICK               SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA           myvol         /export/raid/myvol         slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeC           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

[root@nodeB]#  gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE     MASTER VOL    MASTER BRICK               SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB           myvol         /export/raid/myvol         slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeA           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC           myvol         /export/raid/myvol         slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

So.

    1/ Why are there 3 master nodes? nodeB should be the only master node.
    2/ Why does the Active status keep rotating from one node to another?


Thanks


--
Cyril Peponnet

On Feb 2, 2015, at 10:40 AM, PEPONNET, Cyril N (Cyril) <cyril.peponnet at alcatel-lucent.com> wrote:

For the record, after adding

operating-version=2

on every node (A, B, C) AND on the slave node, the commands work.
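
For reference, a minimal sketch of where that setting lives (assuming the usual /var/lib/glusterd layout; verify on your installation):

```
# check the op-version glusterd is running with (assumed file location)
grep operating-version /var/lib/glusterd/glusterd.info
# after adding or raising the value on a node, restart glusterd on that node
service glusterd restart
```
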
--
Cyril Peponnet

On Feb 2, 2015, at 9:46 AM, PEPONNET, Cyril N (Cyril) <cyril.peponnet at alcatel-lucent.com> wrote:

More informations here:


I updated the state of the peer in the UUID file located in /v/l/g/peers from state 10 to state 3 (as it is on the other nodes), and now the node is in the cluster.
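
Roughly what I did (a sketch only; the peers directory is assumed to be /var/lib/glusterd/peers):

```
# each peer has a file named after its UUID, containing a "state=" field
grep -H state= /var/lib/glusterd/peers/*   # state=3 matched the healthy nodes
# after editing the value, restart glusterd so the change is picked up
service glusterd restart
```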

gluster system:: execute gsec_create now creates a proper file on the master node with every node's key in it.

Now from there I try to create my geo-replication session between master nodeB and slaveA:

gluster vol geo myvol slave::myvol create push-pem force

From slaveA I got these error messages in the logs:

[2015-02-02 17:19:04.754809] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:19:04.754890] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513547] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:19:07.513632] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513660] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.

On slaveA I have the common pem file transferred to /v/l/g/geo/ with the keys of my 3 nodes from the source site.

But the /root/.ssh/authorized_keys is not populated with this file.

From the log I saw that there is a call to a script:

/var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=myvol is_push_pem=1 pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub slave_ip=slaveA

In this script the following is done:


```
    scp $pub_file $slave_ip:$pub_file_tmp
    ssh $slave_ip "mv $pub_file_tmp $pub_file"
    ssh $slave_ip "gluster system:: copy file /geo-replication/common_secret.pem.pub > /dev/null"
    ssh $slave_ip "gluster system:: execute add_secret_pub > /dev/null"
```

The first two lines pass, the third fails, so the fourth is never executed.

Third command on slaveA:

#gluster system:: copy file /geo-replication/common_secret.pem.pub
One or more nodes do not support the required op version.

# gluster peer status
Number of Peers: 0

from logs:

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:43:29.242524] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:43:29.242610] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

I have for now only one node on my remote site.

Anyway, as this step is done to copy the file across all the cluster members, I can deal without it.

The fourth command is not working:

[root@slaveA geo-replication]# gluster system:: execute add_secret_pub
[2015-02-02 17:44:49.123326] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123381] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.123568] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123588] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.306482] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:44:49.307921] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:44:49.308009] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:44:49.308038] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

==> /var/log/glusterfs/cli.log <==
[2015-02-02 17:44:49.308493] I [input.c:36:cli_batch] 0-: Exiting with: -1


I have only one node… I don't understand the meaning of the error: One or more nodes do not support the required op version.
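
As a workaround for the failing add_secret_pub step, I could append the file by hand (a sketch only; this assumes the copied pem file already contains complete authorized_keys entries, including any command= prefix gsec_create may add):

```
# on slaveA: append the collected public keys to root's authorized_keys
cat /var/lib/glusterd/geo-replication/common_secret.pem.pub >> /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
```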

 --
Cyril Peponnet

On Feb 2, 2015, at 8:49 AM, PEPONNET, Cyril N (Cyril) <cyril.peponnet at alcatel-lucent.com> wrote:

Every node is connected:

[root@nodeA geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)


[root@nodeB ~]# gluster peer status
Number of Peers: 2

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer is connected and Accepted (Connected)

[root@nodeC geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer in Cluster (Connected)

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)


The only difference is the state "Peer is connected and Accepted (Connected)" reported by nodeB for nodeA.

When I execute gluster system:: execute gsec_create from node A or C, I get the 3 nodes' keys in the common pem file. But from nodeB, I only get keys for nodeB and nodeC. This is unfortunate as I try to launch the geo-replication job from nodeB (master).
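
As a quick sanity check of what gsec_create collected (a sketch; the last field is only meaningful if the keys keep their usual user@host comment):

```
# one line per collected public key
wc -l < /var/lib/glusterd/geo-replication/common_secret.pem.pub
# show the comment field of each key to see which host it came from
awk '{print $NF}' /var/lib/glusterd/geo-replication/common_secret.pem.pub
```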


--
Cyril Peponnet

On Feb 2, 2015, at 2:07 AM, Aravinda <avishwan at redhat.com> wrote:

Looks like node C is in a disconnected state. Please let us know the output of `gluster peer status` from all the master nodes and slave nodes.

--
regards
Aravinda

On 01/22/2015 12:27 AM, PEPONNET, Cyril N (Cyril) wrote:
So,

On master node of my 3 node setup:

1) gluster system:: execute gsec_create

in /var/lib/glusterd/geo-replication/common_secret.pem.pub I have the pem pub keys from master node A and node B (not node C).

On node C I don't have anything in /v/l/g/geo/ except the gsync template config.

So here I have an issue.

The only error I saw on node C is:

   [2015-01-21 18:36:41.179601] E [rpc-clnt.c:208:call_bail]
   0-management: bailing out frame type(Peer mgmt) op(--(2)) xid =
   0x23 sent = 2015-01-21 18:26:33.031937. timeout = 600 for
   xx.xx.xx.xx:24007


On node A, the cli.log looks like:

   [2015-01-21 18:49:49.878905] I [socket.c:3561:socket_init]
   0-glusterfs: SSL support is NOT enabled
   [2015-01-21 18:49:49.878947] I [socket.c:3576:socket_init]
   0-glusterfs: using system polling thread
   [2015-01-21 18:49:49.879085] I [socket.c:3561:socket_init]
   0-glusterfs: SSL support is NOT enabled
   [2015-01-21 18:49:49.879095] I [socket.c:3576:socket_init]
   0-glusterfs: using system polling thread
   [2015-01-21 18:49:49.951835] I
   [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
   [2015-01-21 18:49:49.972143] I [input.c:36:cli_batch] 0-: Exiting
   with: 0


If I run gluster system:: execute gsec_create on node C or node B, the common pem key file contains my 3 nodes' pem pub keys. So in some way node A is unable to get the key from node C.

So let’s try to fix this one before going further.

--
Cyril Peponnet

On Jan 20, 2015, at 9:38 PM, Aravinda <avishwan at redhat.com> wrote:

On 01/20/2015 11:01 PM, PEPONNET, Cyril N (Cyril) wrote:
Hi,

I'm ready for new testing: I deleted the geo-rep session between master and slave, and removed the lines in the authorized_keys file on the slave.
I also removed the common secret pem from the slave and from the master. There is only the gsyncd_template.conf in /var/lib/glusterd now.

Here is our setup:

Site A: gluster 3 nodes
Site B: gluster 1 node (for now, a second will come).

I can issue

gluster system:: execute gsec_create

what to check?
common_secret.pem.pub is created in /var/lib/glusterd/geo-replication/common_secret.pem.pub, which should contain the public keys from all master nodes (Site A). It should match the contents of /var/lib/glusterd/geo-replication/secret.pem.pub and /var/lib/glusterd/geo-replication/tar_ssh.pem.pub from each master node.
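
A quick way to verify this (a sketch, using the paths above):

```
# run on each master node; its own keys should appear in the common file
cat /var/lib/glusterd/geo-replication/secret.pem.pub
cat /var/lib/glusterd/geo-replication/tar_ssh.pem.pub
cat /var/lib/glusterd/geo-replication/common_secret.pem.pub
```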


then

gluster geo vol geo_test slave::geo_test create push-pem force (force is needed because the slave vol is smaller than the master vol).

What to check ?
Check for any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log for an RPM installation, or in /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log for a source installation. In case of any errors related to hook execution, run the hook command copied from the log directly. From your previous mail I understand there is some issue while executing the hook script. I will look into the issue in the hook script.
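
For example (a sketch only; use the exact hook command line as it appears in your glusterd log):

```
# re-run the create-post hook by hand with tracing enabled
sh -x /var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh \
    --volname=geo_test is_push_pem=1 \
    pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub \
    slave_ip=slave
```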

I want to use change_detector changelog and not rsync btw.
change_detector is the crawling mechanism. Available options are: changelog and xsync. xsync is FS crawl.
The sync mechanisms available are: rsync and tarssh.
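
For example, once the session exists (a sketch; option names as above, please verify against your gluster version):

```
# choose the crawl mechanism (changelog or xsync)
gluster volume geo-replication geo_test slave::geo_test config change_detector changelog
# switch the sync mechanism from the default rsync to tar over ssh, if desired
gluster volume geo-replication geo_test slave::geo_test config use_tarssh true
```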

Can you guide me to set this up and also debug why it's not working out of the box?

If needed I can get in touch with you through IRC.
Sure. IRC nickname is aravindavk.

Thanks for your help.


--
regards
Aravinda
http://aravindavk.in



