[Gluster-users] glusterd service fails to start from AWS AMI

Carlos Capriotti capriotti.carlos at gmail.com
Tue Mar 4 22:29:31 UTC 2014


I don't want to sound simplistic, but this seems to be name-resolution/network
related; the "resolve brick failed in restore" line at the end of your log
points that way.

Again, I DO know your email ends with redhat.com, but just to make sure:
what distro is Gluster running on? I've never dealt with Amazon's platform,
so my ignorance there is abundant.

The reason I'm asking is that I am stress-testing my first (on-premises)
install, and I ran into a problem that I'm choosing to ignore for now but
will have to solve eventually: DNS resolution stops working after a while.
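
When it happens, this is roughly how I narrow it down (the hostname below is
a placeholder; substitute one of your peer names):

    getent hosts node2.example.com   # resolves via /etc/nsswitch.conf (hosts file, then DNS)
    dig +short node2.example.com     # queries the nameservers in /etc/resolv.conf directly
    cat /etc/resolv.conf             # check nothing has rewritten it behind your back

If getent works but dig fails (or vice versa), that tells you whether the
hosts file or the resolver is the problem.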

I am using CentOS 6.5 with Gluster 3.4.2. I have a bonded NIC made from two
physical ones, plus a third NIC for management.

I realized that, even though I had manually configured all the interfaces,
disabled user control (maybe this is the culprit), disabled NetworkManager's
access to them, and even tried to update resolv.conf, name resolution does
not work after a reboot.
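
For reference, my files ended up looking roughly like this. This is only a
sketch: the device names, addresses, and bonding options are placeholders
for my setup:

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- placeholder values
    DEVICE=bond0
    TYPE=Bond
    BONDING_OPTS="mode=4 miimon=100"
    BOOTPROTO=none
    IPADDR=10.0.0.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    NM_CONTROLLED=no    # keep NetworkManager away from the interface
    USERCTL=no          # non-root users cannot bring it up or down
    PEERDNS=no          # stop the init scripts from rewriting /etc/resolv.conf

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- one of the two slaves
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes
    NM_CONTROLLED=no
    USERCTL=no

    # /etc/resolv.conf -- maintained by hand once PEERDNS=no is set everywhere
    search example.com      # placeholder domain
    nameserver 10.0.0.2     # placeholder nameserver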

While the NICs were still managed by NetworkManager and/or DHCP, all went
fine, but after tailoring my ifcfg-* files, DNS went south.

You said your name resolution does work. Maybe add an entry to your hosts
file anyway, just to test?
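
Something along these lines on every node. The addresses and names are
placeholders; the hostnames should match exactly what 'gluster peer status'
reports:

    # /etc/hosts -- placeholder addresses and names
    10.0.0.11   node1.example.com   node1
    10.0.0.12   node2.example.com   node2
    10.0.0.13   node3.example.com   node3
    10.0.0.14   node4.example.com   node4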

Another thought would be moving to 3.4.2 instead of 3.4.0.
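
A quick sanity check of what you actually have installed (assumes an
rpm-based distro):

    rpm -q glusterfs glusterfs-server   # installed package versions
    glusterfs --version                 # version the binary itself reports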

Just wanted to share.

KR,

Carlos


On Tue, Mar 4, 2014 at 10:45 PM, Jon Cope <jcope at redhat.com> wrote:

> Hello all.
>
> I have a working replica 2 cluster (4 nodes) up and running happily over
> Amazon EC2.  My end goal is to create AMIs of each machine and then quickly
> reproduce the same, but new, cluster from those AMIs.  Essentially, I'd
> like a cluster "template".
>
> - Assigned the original instances' Elastic IPs to the new machines to
>   reduce resolution issues.
> - Passwordless SSH works on initial boot across all machines.
> - Node1: no evident issue; starts with glusterd running.
> - Node1: 'gluster peer status' returns the correct public DNS hostnames for
>   the peer nodes, with Status: Disconnected (since the service is off on
>   them).
>
> Since my goal is to create a cluster template, reinstalling gluster for
> each node, though it'll probably work, isn't an optimal solution.
>
> Thank You
>
> #  Node2: etc-glusterfs-glusterd.vol.log
> #  Begins at 'service glusterd start' command entry
>
> [2014-03-04 21:20:30.532138] I [glusterfsd.c:2024:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version
> 3.4.0.44rhs (/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
> [2014-03-04 21:20:30.539331] I [glusterd.c:1020:init] 0-management: Using
> /var/lib/glusterd as working directory
> [2014-03-04 21:20:30.542578] I [socket.c:3485:socket_init]
> 0-socket.management: SSL support is NOT enabled
> [2014-03-04 21:20:30.542603] I [socket.c:3500:socket_init]
> 0-socket.management: using system polling thread
> [2014-03-04 21:20:30.543203] C [rdma.c:4099:gf_rdma_init]
> 0-rpc-transport/rdma: Failed to get IB devices
> [2014-03-04 21:20:30.543342] E [rdma.c:4990:init] 0-rdma.management:
> Failed to initialize IB Device
> [2014-03-04 21:20:30.543375] E [rpc-transport.c:320:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2014-03-04 21:20:30.543471] W [rpcsvc.c:1387:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2014-03-04 21:20:37.116571] I
> [glusterd-store.c:1388:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 2
> [2014-03-04 21:20:37.120082] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-0
> [2014-03-04 21:20:37.120118] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-1
> [2014-03-04 21:20:37.120137] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-2
> [2014-03-04 21:20:37.120154] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-3
> [2014-03-04 21:20:37.761785] I
> [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect
> returned 0
> [2014-03-04 21:20:37.765059] I
> [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect
> returned 0
> [2014-03-04 21:20:37.767677] I
> [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect
> returned 0
> [2014-03-04 21:20:37.767783] I [rpc-clnt.c:974:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-03-04 21:20:37.767850] I [socket.c:3485:socket_init] 0-management:
> SSL support is NOT enabled
> [2014-03-04 21:20:37.767866] I [socket.c:3500:socket_init] 0-management:
> using system polling thread
> [2014-03-04 21:20:37.772356] I [rpc-clnt.c:974:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-03-04 21:20:37.772441] I [socket.c:3485:socket_init] 0-management:
> SSL support is NOT enabled
> [2014-03-04 21:20:37.772459] I [socket.c:3500:socket_init] 0-management:
> using system polling thread
> [2014-03-04 21:20:37.776131] I [rpc-clnt.c:974:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-03-04 21:20:37.776185] I [socket.c:3485:socket_init] 0-management:
> SSL support is NOT enabled
> [2014-03-04 21:20:37.776201] I [socket.c:3500:socket_init] 0-management:
> using system polling thread
> [2014-03-04 21:20:37.780363] E
> [glusterd-store.c:2548:glusterd_resolve_all_bricks] 0-glusterd: resolve
> brick failed in restore
> [2014-03-04 21:20:37.780395] E [xlator.c:423:xlator_init] 0-management:
> Initialization of volume 'management' failed, review your volfile again
> [2014-03-04 21:20:37.780410] E [graph.c:292:glusterfs_graph_init]
> 0-management: initializing translator failed
> [2014-03-04 21:20:37.780422] E [graph.c:479:glusterfs_graph_activate]
> 0-graph: init failed
> [2014-03-04 21:20:37.780723] W [glusterfsd.c:1097:cleanup_and_exit]
> (-->/usr/sbin/glusterd(main+0x6b1) [0x406a91]
> (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x405247]
> (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x106) [0x405156]))) 0-:
> received signum (0), shutting down
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>

