[Gluster-users] unable to get Geo-replication working

Scot Kreienkamp SKreien at la-z-boy.com
Mon Apr 30 15:56:21 UTC 2012


In my case it is already installed.

[root at hptv3130 ~]# rpm -qa|grep -i gluster
glusterfs-fuse-3.2.6-1.x86_64
glusterfs-debuginfo-3.2.6-1.x86_64
glusterfs-geo-replication-3.2.6-1.x86_64
glusterfs-rdma-3.2.6-1.x86_64
glusterfs-core-3.2.6-1.x86_64

Any more ideas?

Scot Kreienkamp
Senior Systems Engineer
skreien at la-z-boy.com

-----Original Message-----
From: Greg Swift [mailto:gregswift at gmail.com]
Sent: Monday, April 30, 2012 10:41 AM
To: Scot Kreienkamp
Cc: Mohit Anchlia; gluster-users at gluster.org
Subject: Re: [Gluster-users] unable to get Geo-replication working

I spent a lot of time troubleshooting this setup.  The resolution for
me was making sure the glusterfs-geo-replication software was
installed on the target system.

http://docs.redhat.com/docs/en-US/Red_Hat_Storage/2/html/User_Guide/chap-User_Guide-Geo_Rep-Preparation-Minimum_Reqs.html
States: Before deploying Geo-replication, you must ensure that both
Master and Slave are Red Hat Storage instances.

I realize that, read strictly literally, this tells you that you need
the geo-replication software on the slave; however, it would make more
sense to state it clearly. A geo-replication target that is not running
glusterfs just needs glusterfs-{core,geo-replication}, not a full RH
Storage instance.
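
The package requirement described above can be checked mechanically. What
follows is only a minimal sketch, assuming the output of `rpm -qa` on the
target has been captured as text. The helper name missing_georep_packages
and the REQUIRED tuple are made up for illustration and are not part of
GlusterFS.

```python
# Hypothetical helper: scans captured `rpm -qa` output for the two packages
# named above (glusterfs-core and glusterfs-geo-replication).
REQUIRED = ("glusterfs-core", "glusterfs-geo-replication")


def missing_georep_packages(rpm_qa_output):
    """Return the required package names that are absent from the listing."""
    installed = rpm_qa_output.split()
    return [
        name for name in REQUIRED
        # rpm -qa prints name-version-release.arch, so require "name-" followed
        # by a digit; this avoids matching a longer package name by accident.
        if not any(
            pkg.startswith(name + "-")
            and pkg[len(name) + 1:len(name) + 2].isdigit()
            for pkg in installed
        )
    ]


listing = """glusterfs-fuse-3.2.6-1.x86_64
glusterfs-geo-replication-3.2.6-1.x86_64
glusterfs-core-3.2.6-1.x86_64"""
print(missing_georep_packages(listing))  # prints []
```

An empty list means both packages are present, which matches the rpm
listing at the top of this message.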

-greg

On Fri, Apr 27, 2012 at 12:26, Scot Kreienkamp <SKreien at la-z-boy.com> wrote:
> I am trying to set up geo-replication between a gluster volume and a
> non-gluster volume, yes.  The command I used to start geo-replication is:
>
>
>
> gluster volume geo-replication RMSNFSMOUNT hptv3130:/nfs start
>
>
>
>
>
> Scot Kreienkamp
>
> Senior Systems Engineer
>
> skreien at la-z-boy.com
>
>
>
> From: Mohit Anchlia [mailto:mohitanchlia at gmail.com]
> Sent: Friday, April 27, 2012 12:46 PM
> To: Scot Kreienkamp
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] unable to get Geo-replication working
>
>
>
> Are you trying to set up geo-replication between gluster volume ->
> non-gluster volume, or is it between gluster volume -> gluster volume?
>
>
>
> It looks like there might be a configuration issue here. Can you share the
> steps or script you used to configure geo-replication?
>
>
>
>
> On Fri, Apr 27, 2012 at 8:18 AM, Scot Kreienkamp <SKreien at la-z-boy.com>
> wrote:
>
> Sure....
>
>
>
> [root at retv3130 RMSNFSMOUNT]# gluster peer status
>
> Number of Peers: 1
>
>
>
> Hostname: retv3131
>
> Uuid: 450cc731-60be-47be-a42d-d856a03dac01
>
> State: Peer in Cluster (Connected)
>
>
>
>
>
> [root at hptv3130 ~]# gluster peer status
>
> No peers present
>
>
>
>
>
> [root at retv3130 ~]# gluster volume geo-replication RMSNFSMOUNT root at hptv3130:/nfs status
>
> MASTER               SLAVE                                              STATUS
> --------------------------------------------------------------------------------
> RMSNFSMOUNT          root at hptv3130:/nfs                                 faulty
>
>
>
>
>
>
>
>
>
> Scot Kreienkamp
>
> Senior Systems Engineer
>
> skreien at la-z-boy.com
>
>
>
> From: Mohit Anchlia [mailto:mohitanchlia at gmail.com]
> Sent: Friday, April 27, 2012 10:58 AM
> To: Scot Kreienkamp
> Subject: Re: [Gluster-users] unable to get Geo-replication working
>
>
>
> Can you look at the status of "gluster volume geo-replication MASTER SLAVE
> status"? Also, run gluster peer status on both MASTER and SLAVE and paste
> the results here.
>
> On Fri, Apr 27, 2012 at 6:53 AM, Scot Kreienkamp <SKreien at la-z-boy.com>
> wrote:
>
> Hey everyone,
>
>
>
> I'm trying to get geo-replication working from a two-brick replicated volume
> to a single directory on a remote host.  I can ssh as either root or
> georep-user to the destination as either georep-user or root with no
> password using the default ssh command given by the config command: ssh
> -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
> /etc/glusterd/geo-replication/secret.pem.  All the glusterfs rpms are
> installed on the remote host.  There are no firewalls running on any of the
> hosts and no firewalls in between them.  The remote_gsyncd command is
> correct, as I can copy and paste it to the command line and run it on both
> source hosts and the destination host.  I'm using the current production
> version of glusterfs, 3.2.6, with rsync 3.0.9, fuse 2.8.3 (rpms installed),
> OpenSSH 5.3, and Python 2.6.6 on RHEL 6.2.  The remote directory is set to
> 777 (world read-write), so there are no permission errors.
>
>
>
> I'm using this command to start replication: gluster volume geo-replication
> RMSNFSMOUNT hptv3130:/nfs start
>
>
>
> Whenever I try to initiate geo-replication the status goes to starting for
> about 30 seconds, then goes to faulty.  On the slave I get these messages
> repeating in the geo-replication-slaves log:
>
>
>
> [2012-04-27 09:37:59.485424] I [resource(slave):201:service_loop] FILE:
> slave listening
>
> [2012-04-27 09:38:05.413768] I [repce(slave):60:service_loop] RepceServer:
> terminating on reaching EOF.
>
> [2012-04-27 09:38:15.35907] I [resource(slave):207:service_loop] FILE:
> connection inactive for 120 seconds, stopping
>
> [2012-04-27 09:38:15.36382] I [gsyncd(slave):302:main_i] <top>: exiting.
>
> [2012-04-27 09:38:19.952683] I [gsyncd(slave):290:main_i] <top>: syncing:
> file:///nfs
>
> [2012-04-27 09:38:19.955024] I [resource(slave):201:service_loop] FILE:
> slave listening
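
The log entries quoted in this thread share one layout: a bracketed
timestamp, a one-letter level, a bracketed origin, then the message. A
minimal sketch for picking out the entries worth reading first is below.
The regular expression is inferred from these excerpts only, the function
names are hypothetical, and it assumes that W and E mark warnings and
errors. Wrapped lines need to be rejoined before parsing.

```python
import re

# Layout inferred from the excerpts in this thread:
# [timestamp] LEVEL [origin] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<origin>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of the fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines):
    """Keep only entries logged at level W or E."""
    parsed = (parse_log_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in ("W", "E")]


sample = [
    "[2012-04-27 09:37:59.485424] I [resource(slave):201:service_loop] "
    "FILE: slave listening",
    "[2012-04-26 16:16:41.332690] E [syncdutils:133:log_raise_exception] "
    "<top>: FAIL:",
]
for entry in problems(sample):
    print(entry["timestamp"], entry["level"], entry["message"])
```

Run against the two sample lines, only the E entry from syncdutils is
printed; the "slave listening" line is informational and is skipped.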
>
>
>
>
>
> I get these messages in etc-glusterfs-glusterd.vol.log on the slave:
>
>
>
> [2012-04-27 09:39:23.667930] W [socket.c:1494:__socket_proto_state_machine]
> 0-socket.management: reading from socket failed. Error (Transport endpoint
> is not connected), peer (127.0.0.1:1021)
>
> [2012-04-27 09:39:43.736138] I
> [glusterd-handler.c:3226:glusterd_handle_getwd] 0-glusterd: Received getwd
> req
>
> [2012-04-27 09:39:43.740749] W [socket.c:1494:__socket_proto_state_machine]
> 0-socket.management: reading from socket failed. Error (Transport endpoint
> is not connected), peer (127.0.0.1:1023)
>
>
>
> From searching the list, I understand that message is benign and can be
> ignored, though.
>
>
>
>
>
> Here are tails of all the logs on one of the sources:
>
>
>
> [root at retv3130 RMSNFSMOUNT]# tail
> ssh%3A%2F%2Fgeorep-user%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.gluster.log
>
> +------------------------------------------------------------------------------+
>
> [2012-04-26 16:16:40.804047] E [socket.c:1685:socket_connect_finish]
> 0-RMSNFSMOUNT-client-1: connection to  failed (Connection refused)
>
> [2012-04-26 16:16:40.804852] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
> 0-RMSNFSMOUNT-client-0: changing port to 24009 (from 0)
>
> [2012-04-26 16:16:44.779451] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
> 0-RMSNFSMOUNT-client-1: changing port to 24010 (from 0)
>
> [2012-04-26 16:16:44.855903] I
> [client-handshake.c:1090:select_server_supported_programs]
> 0-RMSNFSMOUNT-client-0: Using Program GlusterFS 3.2.6, Num (1298437),
> Version (310)
>
> [2012-04-26 16:16:44.856893] I [client-handshake.c:913:client_setvolume_cbk]
> 0-RMSNFSMOUNT-client-0: Connected to 10.170.1.222:24009, attached to remote
> volume '/nfs'.
>
> [2012-04-26 16:16:44.856943] I [afr-common.c:3141:afr_notify]
> 0-RMSNFSMOUNT-replicate-0: Subvolume 'RMSNFSMOUNT-client-0' came back up;
> going online.
>
> [2012-04-26 16:16:44.866734] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse:
> switched to graph 0
>
> [2012-04-26 16:16:44.867391] I [fuse-bridge.c:3241:fuse_thread_proc] 0-fuse:
> unmounting /tmp/gsyncd-aux-mount-8zMs0J
>
> [2012-04-26 16:16:44.868538] W [glusterfsd.c:727:cleanup_and_exit]
> (-->/lib64/libc.so.6(clone+0x6d) [0x31494e5ccd] (-->/lib64/libpthread.so.0()
> [0x3149c077f1]
> (-->/opt/glusterfs/3.2.6/sbin/glusterfs(glusterfs_sigwaiter+0x17c)
> [0x40477c]))) 0-: received signum (15), shutting down
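
The log file names in these tails appear to be the slave URL with its
special characters percent-encoded. A short sketch using only the Python
standard library turns one back into readable form, which makes it easier
to tell which user and host a given log belongs to. The file name is copied
from the tail command above.

```python
from urllib.parse import unquote

# File name copied from the tail command in this thread.
encoded = "ssh%3A%2F%2Fgeorep-user%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.gluster.log"

decoded = unquote(encoded)
print(decoded)  # prints ssh://georep-user@10.2.1.60:file:///nfs.gluster.log
```

So this particular log belongs to the session that connects as georep-user,
while the ssh%3A%2F%2Froot%40... files belong to the session that connects
as root.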
>
> [root at retv3130 RMSNFSMOUNT]# tail
> ssh%3A%2F%2Fgeorep-user%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.log
>
> [2012-04-26 16:16:39.263871] I [gsyncd:290:main_i] <top>: syncing:
> gluster://localhost:RMSNFSMOUNT -> ssh://georep-user@hptv3130:/nfs
>
> [2012-04-26 16:16:41.332690] E [syncdutils:133:log_raise_exception] <top>:
> FAIL:
>
> Traceback (most recent call last):
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/syncdutils.py",
> line 154, in twrap
>
>     tf(*aa)
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/repce.py",
> line 117, in listen
>
>     rid, exc, res = recv(self.inf)
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/repce.py",
> line 41, in recv
>
>     return pickle.load(inf)
>
> EOFError
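
The EOFError at the bottom of this traceback comes from pickle.load, which
raises it when the input ends before a complete object has been read. The
sketch below reproduces that with an empty in-memory stream standing in for
the channel to the remote side. This is an illustration only: the function
name read_reply is hypothetical, and the example says nothing about why the
remote end closed the channel.

```python
import io
import pickle


def read_reply(stream):
    """Return the next pickled object, or None if the stream ended first."""
    try:
        return pickle.load(stream)
    except EOFError:
        # Nothing arrived before the stream was closed.
        return None


# A stream that is already at end-of-file, as if the peer had exited.
print(read_reply(io.BytesIO(b"")))

# A stream that carries one complete pickled tuple.
print(read_reply(io.BytesIO(pickle.dumps((1, None, "ok")))))
```

In other words, the traceback says the master got no reply at all, which
lines up with the "terminating on reaching EOF" entry in the slave log.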
>
> [root at retv3130 RMSNFSMOUNT]# tail
> ssh%3A%2F%2Froot%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.gluster.log
>
> [2012-04-27 09:48:42.892842] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
> 0-RMSNFSMOUNT-client-1: changing port to 24010 (from 0)
>
> [2012-04-27 09:48:43.120749] I
> [client-handshake.c:1090:select_server_supported_programs]
> 0-RMSNFSMOUNT-client-0: Using Program GlusterFS 3.2.6, Num (1298437),
> Version (310)
>
> [2012-04-27 09:48:43.121489] I [client-handshake.c:913:client_setvolume_cbk]
> 0-RMSNFSMOUNT-client-0: Connected to 10.170.1.222:24009, attached to remote
> volume '/nfs'.
>
> [2012-04-27 09:48:43.121515] I [afr-common.c:3141:afr_notify]
> 0-RMSNFSMOUNT-replicate-0: Subvolume 'RMSNFSMOUNT-client-0' came back up;
> going online.
>
> [2012-04-27 09:48:43.132904] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse:
> switched to graph 0
>
> [2012-04-27 09:48:43.133704] I [fuse-bridge.c:2927:fuse_init]
> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel
> 7.13
>
> [2012-04-27 09:48:43.135797] I
> [afr-common.c:1520:afr_set_root_inode_on_first_lookup]
> 0-RMSNFSMOUNT-replicate-0: added root inode
>
> [2012-04-27 09:48:44.533289] W [fuse-bridge.c:2517:fuse_xattr_cbk]
> 0-glusterfs-fuse: 8:
> GETXATTR(trusted.glusterfs.9de3c1c8-a753-45a1-8042-b6a4872c5c3c.xtime) / =>
> -1 (Transport endpoint is not connected)
>
> [2012-04-27 09:48:44.544934] I [fuse-bridge.c:3241:fuse_thread_proc] 0-fuse:
> unmounting /tmp/gsyncd-aux-mount-uXCybC
>
> [2012-04-27 09:48:44.545879] W [glusterfsd.c:727:cleanup_and_exit]
> (-->/lib64/libc.so.6(clone+0x6d) [0x31494e5ccd] (-->/lib64/libpthread.so.0()
> [0x3149c077f1]
> (-->/opt/glusterfs/3.2.6/sbin/glusterfs(glusterfs_sigwaiter+0x17c)
> [0x40477c]))) 0-: received signum (15), shutting down
>
> [root at retv3130 RMSNFSMOUNT]# tail
> ssh%3A%2F%2Froot%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.log
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
> line 34, in lgetxattr
>
>     return cls._query_xattr( path, siz, 'lgetxattr', attr)
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
> line 26, in _query_xattr
>
>     cls.raise_oserr()
>
>   File
> "/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
> line 16, in raise_oserr
>
>     raise OSError(errn, os.strerror(errn))
>
> OSError: [Errno 107] Transport endpoint is not connected
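
On Linux, errno 107 is ENOTCONN, which is where the "Transport endpoint is
not connected" text comes from. The sketch below mirrors the shape of the
raise_oserr line in the traceback using only the standard library; it is
not the syncdaemon code itself, and the numeric value of ENOTCONN differs
on other platforms.

```python
import errno
import os


def raise_oserr(errn):
    # Same shape as the raise shown in the traceback above.
    raise OSError(errn, os.strerror(errn))


try:
    raise_oserr(errno.ENOTCONN)
except OSError as exc:
    # errorcode maps the number back to its symbolic name.
    print(errno.errorcode[exc.errno], "-", exc.strerror)
```

Matching on errno.ENOTCONN rather than on the number 107 keeps a check
like this portable.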
>
> [2012-04-27 09:49:14.846837] I [monitor(monitor):59:monitor] Monitor:
> ------------------------------------------------------------
>
> [2012-04-27 09:49:14.847898] I [monitor(monitor):60:monitor] Monitor:
> starting gsyncd worker
>
> [2012-04-27 09:49:14.930681] I [gsyncd:290:main_i] <top>: syncing:
> gluster://localhost:RMSNFSMOUNT -> ssh://hptv3130:/nfs
>
>
>
>
>
> I'm out of ideas.  I've satisfied all the requirements I can find, and I'm
> not seeing anything in the logs that makes any sense to me as an error that
> I can fix.  Can anyone help?
>
>
>
> Thanks!
>
>
>
> Scot Kreienkamp
>
> skreien at la-z-boy.com
>
>
>
>
>
>
> This message is intended only for the individual or entity to which it is
> addressed. It may contain privileged, confidential information which is
> exempt from disclosure under applicable laws. If you are not the intended
> recipient, please note that you are strictly prohibited from disseminating
> or distributing this information (other than to the intended recipient) or
> copying this information. If you have received this communication in error,
> please notify us immediately by e-mail or by telephone at the above number.
> Thank you.
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>





