[Gluster-users] geo replication, invalid slave name and gluster 3.5.1

Aravinda avishwan at redhat.com
Thu Jul 17 11:18:07 UTC 2014


On 07/16/2014 02:20 PM, Stefan Moravcik wrote:
> Hello Vishwanath,
>
> thanks for pointing me in the right direction... this was helpful. I 
> thought the password-less ssh connection was set up by glusterfs using 
> the secret.pem on the initial run, but it wasn't. I had to create 
> id_rsa in the /root/.ssh/ directory to be able to ssh to the slave 
> without any -i option...
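> (One way to do that, assuming you reuse the generated key pair; the 
> exact steps here are a sketch, not necessarily what glusterfs expects:
> cp /var/lib/glusterd/geo-replication/secret.pem /root/.ssh/id_rsa
> cp /var/lib/glusterd/geo-replication/secret.pem.pub /root/.ssh/id_rsa.pub
> though a fresh ssh-keygen plus ssh-copy-id works just as well.)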
>
> Great, thanks for that... However, I have an additional question, 
> again a little different from the previous ones... This seems like a 
> bug to me, but you will surely know better.
>
> After I created the geo-replication session and started it, everything 
> looked OK and successful. Then I ran the status command and got this:
>
> MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE                    STATUS    CHECKPOINT STATUS    CRAWL STATUS
> ---------------------------------------------------------------------------------------------------------------------
> 1.1.1.1        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A
> 1.1.1.2        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A
> 1.1.1.3        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A
>
> When I checked the config file, it contains
>
> remote_gsyncd: /nonexistent/gsyncd
>
> I even tried to create symlinks for this, but the faulty status never 
> went away... I found a bug report on Bugzilla: 
> https://bugzilla.redhat.com/show_bug.cgi?id=1105283
Update the conf file manually as follows, then stop and start the 
geo-replication. (Conf file location: 
/var/lib/glusterd/geo-replication/<MASTER VOL>_<SLAVE IP>_<SLAVE VOL>/gsyncd.conf)

remote_gsyncd = /usr/libexec/glusterfs/gsyncd
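
For example, a minimal sketch of that cycle (the conf directory name is 
derived from the session, so adjust it to match yours):

    gluster volume geo-replication myvol1 1.2.3.4::myvol1_slave stop
    # point remote_gsyncd at the real gsyncd binary in the session's gsyncd.conf
    sed -i 's|^remote_gsyncd.*|remote_gsyncd = /usr/libexec/glusterfs/gsyncd|' \
        /var/lib/glusterd/geo-replication/myvol1_1.2.3.4_myvol1_slave/gsyncd.conf
    gluster volume geo-replication myvol1 1.2.3.4::myvol1_slave start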

Let us know if this resolves the issue.


--
regards
Aravinda
http://aravindavk.in

>
> [2014-07-16 07:14:34.718718] E [glusterd-geo-rep.c:2685:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file
> [2014-07-16 07:14:34.718756] E [glusterd-geo-rep.c:2999:glusterd_read_status_file] 0-: Unable to read the statusfile for /shared/myvol1 brick for repository(master), 1.2.3.4::myvol1_slave(slave) session
>
> However, since the symlink is in place, the error message above no 
> longer shows in the log. Actually, there are no more error logs at 
> all, just the faulty status...
>
> Even more interesting: when I changed the configuration from rsync to 
> tar+ssh, it synced the existing files over, but it will not replicate 
> any changes or newly created files...
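> (For reference, the rsync/tar+ssh switch was done via the geo-rep 
> config interface, assuming the use_tarssh option name from the 
> geo-rep docs:
> gluster volume geo-replication myvol1 1.2.3.4::myvol1_slave config use_tarssh true
> and back to rsync with "config use_tarssh false".)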
>
> MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE                    STATUS    CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 1.1.1.1        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A             10001          0                0                0                  0
> 1.1.1.2        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A             0              0                0                0                  0
> 1.1.1.3        myvol1        /shared/myvol1    1.2.3.4::myvol1_slave    faulty    N/A                  N/A             0              0                0                0                  0
>
>
> As you can see, 10001 files were replicated... but if I create a new 
> file or edit existing ones, nothing replicates anymore because of the 
> faulty status. This is true even if I change back from tar+ssh to 
> rsync, restart glusterd, or anything else...
>
> Thank you for all your help, much appreciated
>
> Regards,
> Stefan
>
> On 15/07/14 17:15, M S Vishwanath Bhat wrote:
>> On 15/07/14 18:13, Stefan Moravcik wrote:
>>> Hello Vishwanath
>>>
>>> thank you for your quick reply, but I have a follow-up question if 
>>> that is OK... Maybe it is a different issue and I should open a new 
>>> thread, but I will try to continue using this one...
>>>
>>> So I followed the new documentation... let me show you what I have 
>>> done and what the final error message is...
>>>
>>>
>>> I have 3 servers, node1, node2 and node3, with IPs 1.1.1.1, 1.1.1.2 
>>> and 1.1.1.3.
>>>
>>> I installed glusterfs-server and glusterfs-geo-replication on all 3 
>>> of them, created a replica volume called myvol1, and ran the command
>>>
>>> gluster system:: execute gsec_create
>>>
>>> This created 4 files:
>>> secret.pem
>>> secret.pem.pub
>>> tar_ssh.pem
>>> tar_ssh.pem.pub
>>>
>>> The pub file is different on all 3 nodes, so I copied all 3 
>>> secret.pem.pub keys to the slave's authorized_keys. I tried to ssh 
>>> directly to the slave server from all 3 nodes and got through with 
>>> no problem.
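>>> (Roughly, from each master node, something like:
>>> cat /var/lib/glusterd/geo-replication/secret.pem.pub | ssh root@1.2.3.4 'cat >> /root/.ssh/authorized_keys'
>>> assuming gsec_create put the keys under /var/lib/glusterd/geo-replication/.)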
>>>
>>> So I connected to the slave server and installed glusterfs-server 
>>> and glusterfs-geo-replication there too.
>>>
>>> I started glusterd and created a volume called myvol1_slave.
>>>
>>> Then I peer probed between one of the masters and the slave. This 
>>> showed the volume on my master, and the peer appeared in peer status.
>>>
>>> From here I ran the command from your documentation:
>>>
>>> volume geo-replication myvol1 1.2.3.4::myvol1_slave create push-pem
>>> Passwordless ssh login has not been setup with 1.2.3.4.
>>> geo-replication command failed
>> A couple of things here.
>>
>> I believe this was not clear enough in the docs, and I apologise for 
>> that. But this is the prerequisite for dist-geo-rep:
>>
>> * /There should be password-less ssh set up from at least one node in 
>> the master volume to one node in the slave volume. The geo-rep create 
>> command should be executed from this node, which has password-less 
>> ssh set up to the slave./
>>
>> So in your case, you can set up password-less ssh from 1.1.1.1 (one 
>> master volume node) to 1.2.3.4 (one slave volume node). You can use 
>> "ssh-keygen" and "ssh-copy-id" to do that.
>> After the above step is done, execute "gluster system:: execute 
>> gsec_create". You don't need to copy anything to the slave's 
>> authorized_keys; geo-rep create push-pem takes care of that for you.
>>
>> Now you should execute "gluster volume geo-rep myvol1 
>> 1.2.3.4::myvol1_slave create push-pem" from 1.1.1.1 (because this is 
>> the node that has password-less ssh to 1.2.3.4, as mentioned in the 
>> command).
>>
>> That should create a geo-rep session for you, which can be started 
>> later on.
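>>
>> Putting it together, the full sequence from 1.1.1.1 would look 
>> something like this:
>>
>> ssh-copy-id root@1.2.3.4                  # password-less ssh to one slave node
>> gluster system:: execute gsec_create
>> gluster volume geo-replication myvol1 1.2.3.4::myvol1_slave create push-pem
>> gluster volume geo-replication myvol1 1.2.3.4::myvol1_slave start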
>>
>> And you don't need to peer probe the slave from the master or vice 
>> versa. Logically, the master and slave volumes are in two different 
>> clusters (in two different geographic locations).
>>
>> HTH,
>> Vishwanath
>>
>>>
>>> In the secure log file I could see the connection, though.
>>>
>>> 2014-07-15T13:26:56.083445+01:00 1testlab sshd[23905]: Set /proc/self/oom_score_adj to 0
>>> 2014-07-15T13:26:56.089423+01:00 1testlab sshd[23905]: Connection from 1.1.1.1 port 58351
>>> 2014-07-15T13:26:56.248687+01:00 1testlab sshd[23906]: Connection closed by 1.1.1.1
>>>
>>> and in the logs on one of the masters:
>>>
>>> [2014-07-15 12:26:56.247667] E [glusterd-geo-rep.c:1889:glusterd_verify_slave] 0-: Not a valid slave
>>> [2014-07-15 12:26:56.247752] E [glusterd-geo-rep.c:2106:glusterd_op_stage_gsync_create] 0-: 1.2.3.4::myvol1_slave is not a valid slave volume. Error: Passwordless ssh login has not been setup with 1.2.3.4.
>>> [2014-07-15 12:26:56.247772] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication Create' failed on localhost : Passwordless ssh login has not been setup with 1.2.3.4.
>>>
>>> There are no logs on the other masters in the cluster, nor on the 
>>> slave...
>>>
>>> I even tried with the force option, but got the same result... I 
>>> disabled the firewall and SELinux just to make sure those parts of 
>>> the system were not interfering. I searched Google for the same 
>>> problem and found one instance... 
>>> http://irclog.perlgeek.de/gluster/2014-01-16 but again no answer or 
>>> solution.
>>>
>>> Thank you for your time and help.
>>>
>>> Best regards,
>>> Stefan
>>>
>>> On 15/07/14 12:26, M S Vishwanath Bhat wrote:
>>>> On 15/07/14 15:08, Stefan Moravcik wrote:
>>>>> Hello Guys,
>>>>>
>>>>> I have been trying to set up geo-replication in our glusterfs test 
>>>>> environment and ran into a problem with the message "invalid slave 
>>>>> name".
>>>>>
>>>>> So first things first...
>>>>>
>>>>> I have 3 nodes configured in a cluster, set up as a replica. On 
>>>>> this cluster I have created a volume called, let's say, myvol1. So 
>>>>> far everything works and looks good...
>>>>>
>>>>> The next step was to create a geo-replication off-site, so I 
>>>>> followed this documentation: 
>>>>> http://www.gluster.org/community/documentation/index.php/HowTo:geo-replication 
>>>>>
>>>> Those are the old docs. I have edited the page to mention that it 
>>>> describes the old geo-rep.
>>>>
>>>> Please refer to 
>>>> https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_distributed_geo_rep.md 
>>>> or 
>>>> https://medium.com/@msvbhat/distributed-geo-replication-in-glusterfs-ec95f4393c50 
>>>> for latest distributed-geo-rep documentation.
>>>>>
>>>>> I had peered the slave server, created secret.pem, was able to ssh 
>>>>> without a password, and tried to create the geo-replication 
>>>>> session with the command from the documentation, but got the 
>>>>> following error:
>>>>>
>>>>> on master:
>>>>> gluster volume geo-replication myvol1 1.2.3.4:/shared/myvol1_slave start
>>>>>
>>>>> on master:
>>>>> [2014-07-15 09:15:37.188701] E [glusterd-geo-rep.c:4083:glusterd_get_slave_info] 0-: Invalid slave name
>>>>> [2014-07-15 09:15:37.188827] W [dict.c:778:str_to_data] (-->/usr/lib64/glusterfs/3.5.1/xlator/mgmt/glusterd.so(glusterd_op_stage_gsync_create+0x1e2) [0x7f979e20f1f2] (-->/usr/lib64/glusterfs/3.5.1/xlator/mgmt/glusterd.so(glusterd_get_slave_details_confpath+0x116) [0x7f979e20a306] (-->/usr/lib64/libglusterfs.so.0(dict_set_str+0x1c) [0x7f97a322045c]))) 0-dict: value is NULL
>>>>> [2014-07-15 09:15:37.188837] E [glusterd-geo-rep.c:3995:glusterd_get_slave_details_confpath] 0-: Unable to store slave volume name.
>>>>> [2014-07-15 09:15:37.188849] E [glusterd-geo-rep.c:2056:glusterd_op_stage_gsync_create] 0-: Unable to fetch slave or confpath details.
>>>>> [2014-07-15 09:15:37.188861] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication Create' failed on localhost
>>>>>
>>>>> There are no logs on the slave whatsoever.
>>>>> I also tried different documentation with "create push-pem" and 
>>>>> got the very same problem as above...
>>>>>
>>>>> I tried to start it as node:/path/to/dir, and also created a 
>>>>> volume on the slave and started it as node:/slave_volume_name, 
>>>>> always with the same result...
>>>>>
>>>>> I tried to search for a solution and found this: 
>>>>> http://fpaste.org/114290/04117421/
>>>>>
>>>>> It was a different user with the very same problem... The issue 
>>>>> was raised on the IRC channel but never answered...
>>>>>
>>>>> This is a fresh install of 3.5.1, so no upgrade should be needed, 
>>>>> I guess... Any help solving this problem would be appreciated.
>>>> From what you have described, it looks like your slave is not a 
>>>> gluster volume. In the latest geo-rep, the slave has to be a 
>>>> gluster volume; glusterfs no longer supports a plain directory as a 
>>>> slave.
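>>>>
>>>> For example, on the slave side (a sketch; the brick path here is 
>>>> hypothetical):
>>>>
>>>> gluster volume create myvol1_slave 1.2.3.4:/bricks/myvol1_slave
>>>> gluster volume start myvol1_slave
>>>>
>>>> and then create the geo-rep session against 1.2.3.4::myvol1_slave 
>>>> rather than a plain path.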
>>>>
>>>> Please follow the new documentation and try once more.
>>>>
>>>> HTH
>>>>
>>>> Best Regards,
>>>> Vishwanath
>>>>
>>>>>
>>>>> Thank you and best regards,
>>>>> Stefan
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
