[Gluster-users] Setup gluster 3.1.2 ok but probe fails

Tue Jan 18 04:22:51 UTC 2011

On 01/17/2011 10:57 PM, Anand Avati wrote:
> Looks like you have a stale process running. Can you force kill all
> gluster daemons, rm -rf /etc/glusterd and start fresh? Please ensure
> name resolution works fine between the hosts.
>
> Avati
>

Primary:

    # ps -ef | grep gluster
    root       807     1  0 01:00 ?        00:00:00 /usr/local/sbin/glusterd -p /var/run/glusterd.pid

Secondary:

    # ps -ef | grep gluster
    root      1045     1  0 00:52 ?        00:00:00 /usr/local/sbin/glusterd -p /var/run/glusterd.pid

I don't see any stale processes.  Nothing was marked defunct.

I stopped all daemons and confirmed with ps that nothing was running.
I then ran rm -rf /etc/glusterd/ on both servers.
I can successfully ping between the servers, both by hostname and by IP.
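
Ping only exercises ICMP, so on its own it does not show that the TCP port glusterd listens on is reachable. A firewall between the hosts could pass ping and still drop that traffic. A minimal sketch of a TCP-level check, in Python, might look like the following. The function name is made up for illustration, and port 24007 is an assumption taken from the probe request in the log further down.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any failure (refused, timed out, unresolvable hostname) returns False,
    so this does not tell the failure modes apart.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers socket.timeout, ConnectionRefusedError and socket.gaierror.
        return False
```

Calling `tcp_port_reachable("<secondary hostname or IP>", 24007)` on the primary, and the reverse on the secondary, would be one way to use it. A False result while ping still works would suggest that something is filtering the port, rather than a basic reachability problem.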

I restarted the daemons and retried the probe, but I still have the same
problem.

Here is the primary log:

    [2011-01-18 04:11:56.852521] I [glusterfsd.c:672:cleanup_and_exit] glusterfsd: shutting down
    [2011-01-18 04:13:32.713646] I [glusterd.c:275:init] management: Using /etc/glusterd as working directory
    [2011-01-18 04:13:32.714529] E [socket.c:322:__socket_server_bind] tcp.management: binding to  failed: Address already in use
    [2011-01-18 04:13:32.714544] E [socket.c:325:__socket_server_bind] tcp.management: Port is already in use
    [2011-01-18 04:13:32.714607] I [glusterd.c:96:glusterd_uuid_init] glusterd: generated UUID: a39c5d2f-dac2-436b-b715-425becf9075c
    Given volfile:
    +------------------------------------------------------------------------------+
      1: volume management
      2:     type mgmt/glusterd
      3:     option working-directory /etc/glusterd
      4:     option transport-type socket,tcp,rdma
      5:     option transport.socket.keepalive-time 10
      6:     option transport.socket.keepalive-interval 2
      7: end-volume
      8:

    +------------------------------------------------------------------------------+
    [2011-01-18 04:13:55.74921] I [glusterd-handler.c:562:glusterd_handle_cli_probe] glusterd: Received CLI probe req 10.XXX.58.95 24007
    [2011-01-18 04:13:55.76532] I [glusterd-handler.c:397:glusterd_friend_find] glusterd: Unable to find hostname: 10.XXX.58.95
    [2011-01-18 04:13:55.76550] I [glusterd-handler.c:2615:glusterd_probe_begin] glusterd: Unable to find peerinfo for host: 10.XXX.58.95 (24007)
    [2011-01-18 04:13:55.78817] W [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing 'option transport-type'. defaulting to "socket"
    [2011-01-18 04:13:55.79386] I [glusterd-handler.c:2597:glusterd_friend_add] glusterd: connect returned 0
    [2011-01-18 04:14:16.78380] E [socket.c:1661:socket_connect_finish] management: connection to  failed (Connection timed out)
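
The log shows two separate symptoms: the bind error at startup ("Address already in use") and the connect timeout during the probe. For the first one, a rough way to see whether anything is holding the management port is to try binding it while glusterd is stopped. The sketch below is in Python. The function name is hypothetical, and 24007 is assumed to be the management port only because that is the port in the probe request above.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now.

    True means nothing is currently bound there. False usually means
    another socket holds the port, but it can also result from lingering
    TIME_WAIT connections or from missing privileges on low ports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            # SO_REUSEADDR is deliberately not set, so an existing
            # listener on the same address makes bind() fail.
            s.bind((host, port))
        except OSError:
            return False
    return True
```

With every gluster daemon stopped, `port_is_free(24007)` returning False on either server would suggest that some other process still owns the port. If it returns True, the bind error probably has a different cause.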

Anything else I can try?