[Gluster-users] Unable to peer probe after upgrade to 3.3

Fri Aug 3 11:09:19 UTC 2012

Hi Dan,

I had a setup where I manually edited the username to "rahul" and the password to "hinduja". A volume reset was done after editing the file. It looks like the format of the credentials does not matter (UUIDs in this case).
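
To check whether a given vol info file already carries these entries, something like the following minimal Python sketch could be used. It assumes the file consists of plain key=value lines, as in the excerpt quoted later in this thread. `parse_info` and `missing_auth_keys` are hypothetical helper names, not part of GlusterFS, and the sample text is illustrative only.

```python
# Sketch only: assumes the vol info file is plain "key=value" lines.
# The helper names are hypothetical and not part of GlusterFS.
REQUIRED_KEYS = ("username", "password")


def parse_info(text):
    """Parse key=value lines into a dict, skipping blank or malformed lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def missing_auth_keys(text):
    """Return the required auth keys that are absent or empty in the text."""
    entries = parse_info(text)
    return [key for key in REQUIRED_KEYS if not entries.get(key)]


# Illustrative text for a volume with no auth entries.
before = "volume-id=d018547d-ceeb-4a55-b5fa-2237deaf572f\n"
# The same text after adding the entries by hand, as described above.
after = before + "username=rahul\npassword=hinduja\n"

print(missing_auth_keys(before))
print(missing_auth_keys(after))
```

To run this against a real volume, read the contents of /var/lib/glusterd/vols/<vol-name>/info and pass them to `missing_auth_keys`. An empty list means both entries are present.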

Please try this out.

Copied the gluster-users list.

Thanks,
Rahul Hinduja

----- Original Message -----
From: "Dan Bretherton" <d.a.bretherton at reading.ac.uk>
To: "Rahul Hinduja" <rhinduja at redhat.com>
Cc: vs at redhat.com, "Sudhir Dharanendraiah" <sdharane at redhat.com>
Sent: Friday, August 3, 2012 3:50:21 PM
Subject: Re: [Gluster-users] Unable to peer probe after upgrade to 3.3

Hello Rahul,
Thanks for looking into this for me.  You are correct; my volume files 
don't have username and password entries.

I created a test volume on a machine with 3.3 freshly installed and the 
username and password entries look like this.

volume-id=d018547d-ceeb-4a55-b5fa-2237deaf572f
username=5b1c5055-98b5-4b5b-a448-29eb7ac878b3

Please can you tell me how to generate these long strings.  Are they 
randomly generated and can they be the same for every volume?  Can I 
copy and paste the above into my existing volume files?
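
On the question of how such strings could be generated: both values above parse as version-4 UUIDs, which are random by construction. That is consistent with glusterd generating them randomly, but it does not prove it, and nothing here says whether reusing one value across volumes is advisable. A minimal Python sketch, using only the standard `uuid` module, that checks the version of the strings above and produces fresh values in the same shape follows. `uuid_version` and `new_auth_lines` are hypothetical helper names, not GlusterFS tooling.

```python
import uuid

# Values copied from the test volume shown above.
samples = [
    "d018547d-ceeb-4a55-b5fa-2237deaf572f",  # volume-id
    "5b1c5055-98b5-4b5b-a448-29eb7ac878b3",  # username
]


def uuid_version(text):
    """Return the UUID version number encoded in the string."""
    return uuid.UUID(text).version


def new_auth_lines():
    """Build username/password lines from freshly generated random UUIDs."""
    return [
        "username=%s" % uuid.uuid4(),
        "password=%s" % uuid.uuid4(),
    ]


for value in samples:
    print(value, "-> version", uuid_version(value))

for line in new_auth_lines():
    print(line)
```

Each call to `new_auth_lines` yields different values, so the output would need to be generated once per volume and pasted into that volume's info file by hand.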

By the way I notice that your reply wasn't copied to the gluster-users 
list.  Was that deliberate?  I think it would be useful to update 
everyone on the list with the solution to my problem, so please let me 
know if you would be happy for me to CC gluster-users next time.

Regards,
Dan Bretherton

-- 
Mr. D.A. Bretherton
Computer System Manager
Environmental Systems Science Centre (ESSC)
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 7BE (or RG6 6AL for postal service deliveries)
UK
Tel. +44 118 378 5205, Fax: +44 118 378 6413

On 08/03/2012 08:48 AM, Rahul Hinduja wrote:
> Hi Dan,
>
> Can you please confirm whether your vol info file has the "username and password" entries for the volumes after upgrade from 3.2 to 3.3?
>
> This issue is probably because the 3.2 volume config lacks the "username/password" authentication entries, which are required by 3.3.
>
> The vol info file should be located at "/var/lib/glusterd/vols/<vol-name>/info". If the entries (username and password) are not present, then my workaround is:
>
> # gluster vol stop
> # service glusterd stop
> (Modify the vol info files manually by adding the username/password auth entries.)
> # service glusterd start
> # gluster vol reset
> # gluster vol start force
>
> Thanks,
> Rahul Hinduja
>
>
>
> ----- Original Message -----
> From: "Dan Bretherton"<d.a.bretherton at reading.ac.uk>
> To: "Harry Mangalam"<hjmangalam at gmail.com>
> Cc: "gluster-users"<gluster-users at gluster.org>
> Sent: Thursday, August 2, 2012 10:38:54 PM
> Subject: Re: [Gluster-users] Unable to peer probe after upgrade to 3.3
>
> Hello Harry,
> Thanks for that suggestion.  That machine is indeed in a Rocks cluster,
> and I used it as an example because until recently I was adding the
> Rocks cluster nodes as GlusterFS peers so I could NFS mount from localhost.
>
>> Can your
>> machines do DNS lookups and reverse lookups to each other (ie names
>> resolve to the correct IP #s and vice versa)?
> Yes, I just tested forward and reverse lookups on them all using pdsh,
> for all addresses and hostnames.
>
> To avoid muddying the waters with Rocks cluster issues I tried "gluster
> peer probe" again, this time with a spare storage server which has only
> one network interface.  Unfortunately the result was the same.  I
> checked that the firewall and SELinux were disabled on all machines first.
>
> -Dan.
>
> On 08/02/2012 04:49 PM, Harry Mangalam wrote:
>> Based on the error log, I'd guess at a DNS problem.  Can your
>> machines do DNS lookups and reverse lookups to each other (ie names
>> resolve to the correct IP #s and vice versa)?  Based on your
>> hostnames, it looks like you're running on a ROCKS cluster so you
>> might have competing (or incorrect) DNS info (cluster DNS vs
>> institutional DNS vs /etc/hosts info).
>>
>> It shouldn't be the case in a cluster but firewalls can obviously be a problem.
>>
>> hjm
>>
>> On Thu, Aug 2, 2012 at 8:21 AM, Dan Bretherton
>> <d.a.bretherton at reading.ac.uk>   wrote:
>>> Dear All-
>>> My recent upgrade from 3.2.6 to 3.3.0 went well, but now I can't add new
>>> peers to the cluster.  I can create a new peer group of servers all with 3.3
>>> freshly installed, but if any one of them was upgraded from 3.2 the "gluster
>>> peer probe" commands just hang for a while and return nothing. Following
>>> that, "gluster peer status" results in output like the following for the new
>>> peer being added.
>>>
>>> Hostname: compute-0-4.nerc-essc.ac.uk
>>> Uuid: 111612e4-537b-49b4-9e88-2e0e1bae7fdf
>>> State: Establishing Connection (Connected)
>>>
>>> Errors like these are produced in etc-glusterfs-glusterd.vol.log.
>>>
>>> [2012-08-02 13:00:53.553927] I
>>> [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 0-glusterd: Cleared local
>>> lock
>>> [2012-08-02 15:55:19.244849] I
>>> [glusterd-handler.c:679:glusterd_handle_cli_probe] 0-glusterd: Received CLI
>>> probe req compute-0-4.nerc-essc.ac.uk 24007
>>> [2012-08-02 15:55:19.357191] I [glusterd-handler.c:423:glusterd_friend_find]
>>> 0-glusterd: Unable to find hostname: compute-0-4.nerc-essc.ac.uk
>>> [2012-08-02 15:55:19.357261] I
>>> [glusterd-handler.c:2222:glusterd_probe_begin] 0-glusterd: Unable to find
>>> peerinfo for host: compute-0-4.nerc-essc.ac.uk (24007)
>>> [2012-08-02 15:55:19.385050] I [glusterd-handler.c:2204:glusterd_friend_add]
>>> 0-management: connect returned 0
>>> [2012-08-02 15:55:19.387162] E [socket.c:1715:socket_connect_finish]
>>> 0-management: connection to  failed (Connection refused)
>>> [2012-08-02 15:55:19.387239] I
>>> [glusterd-handler.c:2400:glusterd_xfer_cli_probe_resp] 0-glusterd: Responded
>>> to CLI, ret: 0
>>> [2012-08-02 15:55:19.387274] I [mem-pool.c:576:mem_pool_destroy]
>>> 0-management: size=2236 max=0 total=0
>>> [2012-08-02 15:55:19.387294] I [mem-pool.c:576:mem_pool_destroy]
>>> 0-management: size=124 max=0 total=0
>>> [2012-08-02 15:55:33.026866] I
>>> [glusterd-handler.c:813:glusterd_handle_cli_list_friends] 0-glusterd:
>>> Received cli list req
>>> [2012-08-02 15:55:49.766295] I
>>> [glusterd-handler.c:679:glusterd_handle_cli_probe] 0-glusterd: Received CLI
>>> probe req compute-0-4.nerc-essc.ac.uk 24007
>>> [2012-08-02 15:55:49.841049] I [glusterd-handler.c:423:glusterd_friend_find]
>>> 0-glusterd: Unable to find hostname: compute-0-4.nerc-essc.ac.uk
>>> [2012-08-02 15:55:49.841101] I
>>> [glusterd-handler.c:2222:glusterd_probe_begin] 0-glusterd: Unable to find
>>> peerinfo for host: compute-0-4.nerc-essc.ac.uk (24007)
>>> [2012-08-02 15:55:49.857231] I [glusterd-handler.c:2204:glusterd_friend_add]
>>> 0-management: connect returned 0
>>> [2012-08-02 15:55:49.857804] I
>>> [glusterd-handshake.c:397:glusterd_set_clnt_mgmt_program] 0-: Using Program
>>> glusterd mgmt, Num (1238433), Version (2)
>>> [2012-08-02 15:55:49.857840] I
>>> [glusterd-handshake.c:403:glusterd_set_clnt_mgmt_program] 0-: Using Program
>>> Peer mgmt, Num (1238437), Version (2)
>>> [2012-08-02 15:55:49.868300] I
>>> [glusterd-rpc-ops.c:218:glusterd3_1_probe_cbk] 0-glusterd: Received probe
>>> resp from uuid: 111612e4-537b-49b4-9e88-2e0e1bae7fdf, host:
>>> compute-0-4.nerc-essc.ac.uk
>>> [2012-08-02 15:55:49.868344] I [glusterd-handler.c:411:glusterd_friend_find]
>>> 0-glusterd: Unable to find peer by uuid
>>> [2012-08-02 15:55:49.868406] E [glusterd-sm.c:1022:glusterd_friend_sm]
>>> 0-glusterd: handler returned: -1
>>> [2012-08-02 15:55:49.868425] I
>>> [glusterd-rpc-ops.c:286:glusterd3_1_probe_cbk] 0-glusterd: Received resp to
>>> probe req
>>>
>>> In /etc/glusterd/peers a file with the name of the machine being added is
>>> produced, like this example.
>>>
>>> [root@bdan10 peers]# cat compute-0-4.nerc-essc.ac.uk
>>> uuid=00000000-0000-0000-0000-000000000000
>>> state=0
>>> hostname1=compute-0-4.nerc-essc.ac.uk
>>>
>>> However the machine in question does have a valid uuid as shown below.
>>>
>>> [root@compute-0-4 etc]# cat /var/lib/glusterd/glusterd.info
>>> UUID=111612e4-537b-49b4-9e88-2e0e1bae7fdf
>>>
>>> This one had GlusterFS 3.3 freshly installed and was not upgraded from 3.2.
>>> On this machine the command "gluster peer status" outputs the following.
>>>
>>> Number of Peers: 1
>>>
>>> Hostname: 192.171.166.92
>>> Uuid: 00000000-0000-0000-0000-000000000000
>>> State: Establishing Connection (Connected)
>>>
>>> The IP address shown refers to the server where "gluster peer probe" was
>>> executed.
>>>
>>> I tried restarting glusterd on all the servers but it didn't make any
>>> difference, and doing the "peer probe" from a different server in the
>>> cluster had the same result.  Has anyone else experienced this problem and
>>> is there a solution or work-around?  All suggestions would be much
>>> appreciated.
>>>
>>> Regards,
>>> Dan.
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>