[Gluster-users] Problems with Gluster

Brad Hubbard bhubbard at redhat.com
Sun Aug 3 22:38:07 UTC 2014


On 08/04/2014 02:56 AM, McKenzie, Stan wrote:
> Hi Brad --
>
> Thanks for the response.  I've tried what you recommended on node40 and here is the output:
>
>
> ==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
> [2014-08-03 09:25:01.405662] I [glusterfsd.c:1493:main] 0-/opt/glusterfs/3.2.5/sbin/glusterd: Started running /opt/glusterfs/3.2.5/sbin/glusterd version 3.2.5
> [2014-08-03 09:25:01.408622] I [glusterd.c:550:init] 0-management: Using /etc/glusterd as working directory
> [2014-08-03 09:25:01.410117] E [rpc-transport.c:677:rpc_transport_load] 0-rpc-transport: /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
> [2014-08-03 09:25:01.410141] E [rpc-transport.c:681:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
> [2014-08-03 09:25:01.410156] W [rpcsvc.c:1288:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
> [2014-08-03 09:25:01.410272] I [glusterd.c:88:glusterd_uuid_init] 0-glusterd: retrieved UUID: 7690fd99-5ed4-4a45-bb3d-7ab54831b543
> [2014-08-03 09:25:57.414360] E [common-utils.c:125:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
> [2014-08-03 09:25:57.414408] E [name.c:253:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host nodei.localdomain

Which DNS server do these machines use to resolve the addresses of the
other nodes? Can you confirm that DNS resolves the names correctly, in
particular nodei.localdomain, which failed in the log above?
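
One quick way to confirm this is to ask the system resolver for each peer
name directly. The sketch below is only an illustration and not part of
glusterd: it calls getaddrinfo (the same library call that fails in the log
above) for a list of hostnames. The function name check_hosts and its
resolver parameter are made up for this example, and the port is only a
placeholder since a lookup does not depend on it.

```python
import socket


def check_hosts(hosts, resolver=socket.getaddrinfo):
    """Return {hostname: sorted list of addresses, or None if the lookup failed}."""
    results = {}
    for host in hosts:
        try:
            # 24007 is the glusterd management port; any port works for a lookup.
            infos = resolver(host, 24007)
        except socket.gaierror:
            # Same failure class as "getaddrinfo failed" in the glusterd log.
            results[host] = None
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the resolved IP address.
        results[host] = sorted({info[4][0] for info in infos})
    return results
```

Running check_hosts on each node with the list of peer names (for example
nodei.localdomain from the log) should show a None entry for any name that
the node cannot resolve.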

Continued below.

> pending frames:
>
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash: 2014-08-03 09:25:57
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.2.5
> /lib64/libc.so.6[0x30844302d0]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x27)[0x2b131fbcd877]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_peer_rpc_notify+0x1b4)[0x2b131fbb9a04]
> /opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_start+0x17)[0x2b131df7e8a7]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_rpc_create+0xff)[0x2b131fbba74f]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_add+0x2d7)[0x2b131fbbaad7]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_store_retrieve_peers+0x3d2)[0x2b131fbff4c2]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_restore+0x78)[0x2b131fc00dd8]
> /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(init+0xd12)[0x2b131fbb6312]
> /opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(xlator_init+0x58)[0x2b131dd19488]
> /opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x2b131dd48501]
> /opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x88)[0x2b131dd48688]
> /opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_process_volfp+0x103)[0x404033]
> /opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_volumes_init+0x18b)[0x40424b]
> /opt/glusterfs/3.2.5/sbin/glusterd(main+0x419)[0x405299]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x308441d9c4]
> /opt/glusterfs/3.2.5/sbin/glusterd[0x403649]

This looks like https://bugzilla.redhat.com/show_bug.cgi?id=787516, which
is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=786006. One
of the potential causes stated there is:
"- glusterd attempts to restore all the peers.
- One of the peer's ip/hostname is unreachable."

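The backtrace is consistent with that: the crash happens under
glusterd_restore and glusterd_store_retrieve_peers, that is, while glusterd
is restoring its stored peers. Once again, this may point to a DNS name
resolution issue, so please investigate that thoroughly.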
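
To see every name glusterd failed to resolve, rather than only the last one
before the crash, the log can be scanned for the "DNS resolution failed on
host" message. This is a rough sketch that assumes the message format stays
as in the log quoted above; the function name failed_hosts is made up for
illustration.

```python
import re
from collections import Counter

# Matches the message format seen in etc-glusterfs-glusterd.vol.log above.
FAILED_RE = re.compile(r"DNS resolution failed on host (\S+)")


def failed_hosts(lines):
    """Count how often each hostname appears in a DNS failure message."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example use (path taken from the log output quoted above):
# with open("/var/log/glusterfs/etc-glusterfs-glusterd.vol.log") as log:
#     print(failed_hosts(log))
```

Any hostname that shows up in the result is a candidate to check against
the DNS server the node is configured to use.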

Cheers,
Brad

> ---------
>
> -----Original Message-----
> From: Brad Hubbard [mailto:bhubbard at redhat.com]
> Sent: Saturday, August 02, 2014 4:37 PM
> To: McKenzie, Stan; gluster-users at gluster.org
> Subject: Re: [Gluster-users] Problems with Gluster
>
> On 08/02/2014 01:33 AM, McKenzie, Stan wrote:
>
>>
>> *When I ssh to some nodes I get an error "-bash:
>> /act/Modules/3.2.6/init/bash:  No such file or directory
>>
>> -bash: module:  command not found".   On other nodes when I ssh I get
>> normal login.
>
> Have you verified the file "/act/Modules/3.2.6/init/bash" exists on each peer and is not corrupted/truncated?
>
> You could also try something like this on node40.
>
> # tail -f /var/log/glusterfs/*.log &
>
> Ignore anything output until after you run the following.
>
> # service glusterd start
>
> Paste the output somewhere we can view it if it's too large to post in an email.
>
> Cheers,
> Brad
>
>


-- 

Kindest Regards,

Brad Hubbard
Senior Software Maintenance Engineer
Red Hat Global Support Services
Asia Pacific Region


