[Gluster-users] Can't start geo-replication with version 3.6.3

M S Vishwanath Bhat msvbhat at gmail.com
Fri Jul 3 11:39:39 UTC 2015


On 3 July 2015 at 16:45, John Ewing <johnewing1 at gmail.com> wrote:

> I am only allowing port 22 inbound on the slave server. I thought that
> the traffic would be tunnelled over ssh; is this not the case?
>

Well, to mount the volume the client needs to communicate with glusterd
(which listens on port 24007), so that port needs to be open. Once mounted,
the client also talks to the bricks directly, so the brick port (49152 in
your case) should be open too.
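
For example, on the slave, something along these lines should be enough for
a quick test (this assumes plain iptables on CentOS 6.x; if the box is in
EC2, the security group needs equivalent inbound rules from the master's
addresses):

# glusterd management port
iptables -I INPUT -p tcp --dport 24007 -j ACCEPT
# brick port, as shown by 'gluster volume status'
iptables -I INPUT -p tcp --dport 49152 -j ACCEPT
service iptables save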

Best Regards,
Vishwanath


>
> Thanks
>
> J.
>
>
> On Fri, Jul 3, 2015 at 12:09 PM, M S Vishwanath Bhat <msvbhat at gmail.com>
> wrote:
>
>>
>>
>> On 3 July 2015 at 15:54, John Ewing <johnewing1 at gmail.com> wrote:
>>
>>> Hi Vishwanath,
>>>
>>> The slave volume is definitely started
>>>
>>> [root@ip-192-168-4-55 ~]# gluster volume start myvol
>>> volume start: myvol: failed: Volume myvol already started
>>>
>>
>> IIRC, geo-rep create first tries to mount the slave volume on the master
>> at a temporary location.
>>
>> Can you try mounting the slave volume on the master manually once? If the
>> mount doesn't work, there might be firewall restrictions on the slave
>> side. Can you please check that as well?
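>>
>> A quick manual check from the master could look like this (/mnt/slave-test
>> is just an example scratch mount point):
>>
>> mkdir -p /mnt/slave-test
>> mount -t glusterfs X.X.X.X:/myvol /mnt/slave-test
>> umount /mnt/slave-test
>>
>> If that mount hangs or times out, it is almost certainly a firewall issue.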
>>
>> Hope it helps...
>>
>> Best Regards,
>> Vishwanath
>>
>>
>>> [root@ip-192-168-4-55 ~]# gluster volume status
>>> Status of volume: myvol
>>> Gluster process                                         Port    Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.4.55:/export/xvdb1/brick                  49152   Y       9972
>>> NFS Server on localhost                                 2049    Y       12238
>>>
>>> Task Status of Volume myvol
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>>
>>> Anyone have any debugging suggestions ?
>>>
>>> Thanks
>>>
>>> John.
>>>
>>> On Thu, Jul 2, 2015 at 8:46 PM, M S Vishwanath Bhat <msvbhat at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On 2 July 2015 at 21:33, John Ewing <johnewing1 at gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to build a new geo-replicated cluster using CentOS 6.6 and
>>>>> Gluster 3.6.3.
>>>>>
>>>>> I've got as far as creating a replicated volume with two peers on site,
>>>>> and a slave volume in EC2.
>>>>>
>>>>> I've set up passwordless ssh from one of the pair to the slave server,
>>>>> and I've run
>>>>>
>>>>> gluster system:: execute gsec_create
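>>>>>
>>>>> For the passwordless ssh I just did the usual key setup, roughly
>>>>> (assuming root is used on both ends):
>>>>>
>>>>> ssh-keygen
>>>>> ssh-copy-id root@X.X.X.X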
>>>>>
>>>>>
>>>>> When I try to create the geo-replication relationship between the
>>>>> servers I get:
>>>>>
>>>>>
>>>>> gluster volume geo-replication myvol X.X.X.X::myvol create push-pem force
>>>>>
>>>>> Unable to fetch slave volume details. Please check the slave cluster and slave volume.
>>>>> geo-replication command failed
>>>>>
>>>>
>>>> I remember seeing this error when the slave volume is either not
>>>> created, not started, or not present on your x.x.x.x host.
>>>>
>>>> Can you check if the slave volume is started?
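>>>>
>>>> For example, on the slave something like this should show whether the
>>>> volume exists and is in the Started state:
>>>>
>>>> gluster volume info myvol
>>>> gluster volume status myvol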
>>>>
>>>> Best Regards,
>>>> Vishwanath
>>>>
>>>>
>>>>>
>>>>>
>>>>> The geo-replication-slaves log file from the master looks like this
>>>>>
>>>>>
>>>>> [2015-07-02 15:13:37.324823] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-myvol-client-0: changing port to 49152 (from 0)
>>>>> [2015-07-02 15:13:37.334874] I [client-handshake.c:1413:select_server_supported_programs] 0-myvol-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>> [2015-07-02 15:13:37.335419] I [client-handshake.c:1200:client_setvolume_cbk] 0-myvol-client-0: Connected to myvol-client-0, attached to remote volume '/export/sdb1/brick'.
>>>>> [2015-07-02 15:13:37.335493] I [client-handshake.c:1210:client_setvolume_cbk] 0-myvol-client-0: Server and Client lk-version numbers are not same, reopening the fds
>>>>> [2015-07-02 15:13:37.336050] I [MSGID: 108005] [afr-common.c:3669:afr_notify] 0-myvol-replicate-0: Subvolume 'myvol-client-0' came back up; going online.
>>>>> [2015-07-02 15:13:37.336170] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-myvol-client-1: changing port to 49152 (from 0)
>>>>> [2015-07-02 15:13:37.336298] I [client-handshake.c:188:client_set_lk_version_cbk] 0-myvol-client-0: Server lk version = 1
>>>>> [2015-07-02 15:13:37.343247] I [client-handshake.c:1413:select_server_supported_programs] 0-myvol-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>> [2015-07-02 15:13:37.343964] I [client-handshake.c:1200:client_setvolume_cbk] 0-myvol-client-1: Connected to myvol-client-1, attached to remote volume '/export/sdb1/brick'.
>>>>> [2015-07-02 15:13:37.344043] I [client-handshake.c:1210:client_setvolume_cbk] 0-myvol-client-1: Server and Client lk-version numbers are not same, reopening the fds
>>>>> [2015-07-02 15:13:37.351151] I [fuse-bridge.c:5080:fuse_graph_setup] 0-fuse: switched to graph 0
>>>>> [2015-07-02 15:13:37.351491] I [client-handshake.c:188:client_set_lk_version_cbk] 0-myvol-client-1: Server lk version = 1
>>>>> [2015-07-02 15:13:37.352078] I [fuse-bridge.c:4009:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.14
>>>>> [2015-07-02 15:13:37.355056] I [afr-common.c:1477:afr_local_discovery_cbk] 0-myvol-replicate-0: selecting local read_child myvol-client-0
>>>>> [2015-07-02 15:13:37.396403] I [fuse-bridge.c:4921:fuse_thread_proc] 0-fuse: unmounting /tmp/tmp.NPixVv7xk9
>>>>> [2015-07-02 15:13:37.396922] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
>>>>> [2015-07-02 15:13:37.396970] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/tmp/tmp.NPixVv7xk9'.
>>>>> [2015-07-02 15:13:37.412584] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-glusterfs: Started running glusterfs version 3.6.3 (args: glusterfs --xlator-option=*dht.lookup-unhashed=off --volfile-server X.X.X.X --volfile-id myvol -l /var/log/glusterfs/geo-replication-slaves/slave.log /tmp/tmp.am6rnOYxE7)
>>>>> [2015-07-02 15:14:40.423812] E [socket.c:2276:socket_connect_finish] 0-glusterfs: connection to X.X.X.X:24007 failed (Connection timed out)
>>>>> [2015-07-02 15:14:40.424077] E [glusterfsd-mgmt.c:1811:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: X.X.X.X (Transport endpoint is not connected)
>>>>> [2015-07-02 15:14:40.424119] I [glusterfsd-mgmt.c:1817:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
>>>>> [2015-07-02 15:14:40.424557] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (1), shutting down
>>>>> [2015-07-02 15:14:40.424626] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/tmp/tmp.am6rnOYxE7'.
>>>>>
>>>>>
>>>>> I'm confused by the error message about not being able to connect to
>>>>> the slave on port 24007. Shouldn't it be connecting over ssh?
>>>>>
>>>>> Thanks
>>>>>
>>>>> John.
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>
>>
>

