[Gluster-users] Setup gluster 3.1.2 ok but probe fails

Tue Jan 18 04:32:56 UTC 2011

On 01/17/2011 11:22 PM, Gerry Reno wrote:
> On 01/17/2011 10:57 PM, Anand Avati wrote:
>   
>> Looks like you have a stale process running. Can you force kill all
>> gluster daemons, rm -rf /etc/glusterd and start fresh? Please ensure
>> name resolution works fine between the hosts.
>>
>> Avati
>>
>>     
> Primary:
>
>     # ps -ef | grep gluster
>     root       807     1  0 01:00 ?        00:00:00
>     /usr/local/sbin/glusterd -p /var/run/glusterd.pid
>
>
> Secondary:
>
>     # ps -ef | grep gluster
>     root      1045     1  0 00:52 ?        00:00:00
>     /usr/local/sbin/glusterd -p /var/run/glusterd.pid
>
> I don't see any stale processes.  Nothing was marked defunct.
>
> I stopped all daemons and checked with ps that nothing was running. 
> I did rm -rf /etc/glusterd/ on both servers.
> I can successfully ping between the servers both by hostname and by IP.
>
> I restarted the daemons and retried the probe and still have the same
> problem.
>
> Here is the primary log:
>
>     [2011-01-18 04:11:56.852521] I [glusterfsd.c:672:cleanup_and_exit]
>     glusterfsd: shutting down
>     [2011-01-18 04:13:32.713646] I [glusterd.c:275:init] management:
>     Using /etc/glusterd as working directory
>     [2011-01-18 04:13:32.714529] E [socket.c:322:__socket_server_bind]
>     tcp.management: binding to  failed: Address already in use
>     [2011-01-18 04:13:32.714544] E [socket.c:325:__socket_server_bind]
>     tcp.management: Port is already in use
>     [2011-01-18 04:13:32.714607] I [glusterd.c:96:glusterd_uuid_init]
>     glusterd: generated UUID: a39c5d2f-dac2-436b-b715-425becf9075c
>     Given volfile:
>     +------------------------------------------------------------------------------+
>       1: volume management
>       2:     type mgmt/glusterd
>       3:     option working-directory /etc/glusterd
>       4:     option transport-type socket,tcp,rdma
>       5:     option transport.socket.keepalive-time 10
>       6:     option transport.socket.keepalive-interval 2
>       7: end-volume
>       8:
>
>     +------------------------------------------------------------------------------+
>     [2011-01-18 04:13:55.74921] I
>     [glusterd-handler.c:562:glusterd_handle_cli_probe] glusterd:
>     Received CLI probe req 10.XXX.58.95 24007
>     [2011-01-18 04:13:55.76532] I
>     [glusterd-handler.c:397:glusterd_friend_find] glusterd: Unable to
>     find hostname: 10.XXX.58.95
>     [2011-01-18 04:13:55.76550] I
>     [glusterd-handler.c:2615:glusterd_probe_begin] glusterd: Unable to
>     find peerinfo for host: 10.XXX.58.95 (24007)
>     [2011-01-18 04:13:55.78817] W
>     [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing
>     'option transport-type'. defaulting to "socket"
>     [2011-01-18 04:13:55.79386] I
>     [glusterd-handler.c:2597:glusterd_friend_add] glusterd: connect
>     returned 0
>     [2011-01-18 04:14:16.78380] E [socket.c:1661:socket_connect_finish]
>     management: connection to  failed (Connection timed out)
>
>
>
> Anything else I can try?
>
>   

I just rebooted both instances and checked the log right after boot.
The daemons are set to start during the boot sequence.
There is still some kind of connection problem.
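
Ping works between the servers, but the earlier log ends with a
connection timeout. Ping (ICMP) says nothing about whether a TCP port is
open, so one thing that could be checked is whether TCP port 24007 (the
port in the probe request) is reachable from each server. A minimal
sketch, assuming Python 3 is available on both hosts. The function name
tcp_reachable and the 3-second timeout are made up for illustration and
are not part of gluster:

```python
import socket

GLUSTERD_PORT = 24007  # management port as it appears in the probe log line


def tcp_reachable(host, port=GLUSTERD_PORT, timeout=3.0):
    """Return (True, "") if a TCP connection succeeds, else (False, reason).

    Hypothetical helper for illustration only; not a gluster tool.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # covers timeouts, refused connections and name resolution failures
        return False, str(exc)


if __name__ == "__main__":
    import sys

    # usage: python3 check_port.py <peer-hostname-or-ip> [...]
    for host in sys.argv[1:]:
        ok, reason = tcp_reachable(host)
        print(host, "reachable" if ok else "unreachable: " + reason)
```

Run from the primary against the secondary and the other way around. A
timeout here while ping succeeds would point at something between the
hosts filtering the port (a firewall rule, for example) rather than at
name resolution.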

Primary log:

    [2011-01-18 04:27:53.706382] I [glusterfsd.c:672:cleanup_and_exit]
    glusterfsd: shutting down
    [2011-01-18 04:28:09.699032] I [glusterd.c:275:init] management:
    Using /etc/glusterd as working directory
    [2011-01-18 04:28:09.729714] E [socket.c:322:__socket_server_bind]
    tcp.management: binding to  failed: Address already in use
    [2011-01-18 04:28:09.729751] E [socket.c:325:__socket_server_bind]
    tcp.management: Port is already in use
    [2011-01-18 04:28:09.731413] I [glusterd.c:87:glusterd_uuid_init]
    glusterd: retrieved UUID: a39c5d2f-dac2-436b-b715-425becf9075c
    [2011-01-18 04:28:09.734698] E
    [glusterd-store.c:1446:glusterd_store_retrieve_peers] management:
    key: 0x248e0e0, and value: (nil)
    Given volfile:
    +------------------------------------------------------------------------------+
      1: volume management
      2:     type mgmt/glusterd
      3:     option working-directory /etc/glusterd
      4:     option transport-type socket,tcp,rdma
      5:     option transport.socket.keepalive-time 10
      6:     option transport.socket.keepalive-interval 2
      7: end-volume
      8:

    +------------------------------------------------------------------------------+
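
Both primary logs contain "Address already in use" from
__socket_server_bind. Whether that is harmless or a sign that another
process holds the management port is not clear from the log alone. A
rough way to test it, assuming Python 3 and run only while glusterd is
stopped (otherwise glusterd itself will hold the port), is to try
binding the port directly. The helper name port_in_use is hypothetical:

```python
import errno
import socket


def port_in_use(port, addr="0.0.0.0"):
    """Return True if binding addr:port fails with EADDRINUSE.

    Hypothetical helper for illustration; it only tries a bind and
    immediately releases the socket again.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((addr, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # any other error (e.g. permissions) is reported as-is
    finally:
        sock.close()
    return False


if __name__ == "__main__":
    # 24007 is the port shown in the probe request log line
    print("24007 in use:", port_in_use(24007))
```

If this prints True with every gluster daemon stopped, some other
process is bound to the port. If it prints False, the bind error in the
log would have to come from something glusterd does itself at startup.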