[Gluster-users] strange hostname issue on volume create command with famous Peer in Cluster state error message

Ercan Aydoğan ercan.aydogan at gmail.com
Tue Feb 6 17:22:17 UTC 2018


I forgot to say that after running

> root at pri:~# gluster peer probe third.ostechnix.lan

I also ran, from the node 2 terminal:

> root at sec:~# gluster peer probe pri.ostechnix.lan
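
To confirm both probes took effect, you can run gluster peer status on any node; on a healthy 3-node pool each peer should show up roughly like this (sec's UUID taken from the logs below, third's elided):

> root at pri:~# gluster peer status
> Number of Peers: 2
> 
> Hostname: sec.ostechnix.lan
> Uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
> State: Peer in Cluster (Connected)
> 
> Hostname: third.ostechnix.lan
> Uuid: ...
> State: Peer in Cluster (Connected)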


and I checked that the firewall rules are still valid after the reboot.

To persist them I used:

apt-get install iptables-persistent
netfilter-persistent save
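
For reference, a minimal iptables rule set for Gluster 3.x would look roughly like this, assuming the default ports (TCP 24007 for glusterd, 24008 for management, and one port per brick from 49152 upward; widen the brick range to match your brick count):

iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT
iptables -A INPUT -p tcp --dport 49152:49160 -j ACCEPT
netfilter-persistent save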


> On 6 Feb 2018, at 19:37, Ercan Aydoğan <ercan.aydogan at gmail.com> wrote:
> 
> I changed /etc/hosts on every node so that each node's own hostname resolves to 127.0.0.1. On node 1 (pri) it now reads:
> 
> 127.0.0.1        pri.ostechnix.lan     pri
> 51.15.90.60      sec.ostechnix.lan     sec
> 163.172.151.120  third.ostechnix.lan   third
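> 
> Correspondingly, on node 2 (sec) the file would map sec to loopback, e.g.:
> 
> 51.15.77.14      pri.ostechnix.lan     pri
> 127.0.0.1        sec.ostechnix.lan     sec
> 163.172.151.120  third.ostechnix.lan   third
> 
> and likewise on node 3 (third).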
> 
> then 
> 
> root at pri:~# apt-get purge glusterfs-server
> root at pri:~# rm -rf /var/lib/glusterd/
> root at pri:~# rm -rf /var/log/glusterfs/
> root at pri:~# apt-get install glusterfs-server
> root at pri:~# apt-mark hold glusterfs*
> root at pri:~# reboot now
> root at pri:~# gluster peer probe sec.ostechnix.lan
> peer probe: success.
> root at pri:~# gluster peer probe third.ostechnix.lan
> peer probe: success.
> root at pri:/var/log/glusterfs# gluster volume create myvol1 replica 3 transport tcp pri.ostechnix.lan:/gluster/brick1/mpoint1 sec.ostechnix.lan:/gluster/brick1/mpoint1 third.ostechnix.lan:/gluster/brick1/mpoint1 force
> volume create: myvol1: success: please start the volume to access data 
> 
> 
> and the volume was created:
> root at pri:~# gluster volume list
> myvol1
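> 
> Per the "please start the volume" message, the remaining steps would be along these lines (the mount point /mnt/glusterfs is just an example):
> 
> root at pri:~# gluster volume start myvol1
> root at pri:~# mkdir -p /mnt/glusterfs
> root at pri:~# mount -t glusterfs pri.ostechnix.lan:/myvol1 /mnt/glusterfs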
> 
> Thank you @Serkan and @Atin
> 
> 
> 
> 
> 
>> On 6 Feb 2018, at 19:01, Atin Mukherjee <amukherj at redhat.com> wrote:
>> 
>> I'm guessing there's something wrong w.r.t. address resolution on node 1. From the logs it's quite clear to me that node 1 is unable to resolve the address configured in /etc/hosts, whereas the other nodes can. Could you paste the gluster peer status output from all the nodes?
>> 
>> Also, can you please check if you're able to ping "pri.ostechnix.lan" from node 1 only? Does volume create go through if you use the IP instead of the hostname?
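>> 
>> For instance, with the IPs from your /etc/hosts substituted for the hostnames:
>> 
>> gluster volume create myvol1 replica 2 transport tcp 51.15.77.14:/gluster/brick1/mpoint1 51.15.90.60:/gluster/brick1/mpoint1 force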
>> 
>> 
>> On Tue, Feb 6, 2018 at 7:31 PM, Ercan Aydoğan <ercan.aydogan at gmail.com <mailto:ercan.aydogan at gmail.com>> wrote:
>> Hello,
>> 
>> I installed GlusterFS 3.11.3 on three Ubuntu 16.04 nodes. All machines have the same /etc/hosts.
>> 
>> node 1 hostname
>> pri.ostechnix.lan
>> 
>> node 2 hostname
>> sec.ostechnix.lan
>> 
>> node 3 hostname
>> third.ostechnix.lan
>> 
>> 
>> 51.15.77.14      pri.ostechnix.lan     pri
>> 51.15.90.60      sec.ostechnix.lan     sec
>> 163.172.151.120  third.ostechnix.lan   third
>> 
>> The volume create command is:
>> 
>> root at pri:/var/log/glusterfs# gluster volume create myvol1 replica 2 transport tcp pri.ostechnix.lan:/gluster/brick1/mpoint1 sec.ostechnix.lan:/gluster/brick1/mpoint1 force
>> Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
>> Do you still want to continue?
>>  (y/n) y
>> volume create: myvol1: failed: Host pri.ostechnix.lan is not in 'Peer in Cluster' state
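>> 
>> (As an aside, the arbiter layout that warning refers to would be created with syntax along these lines, the third brick holding only metadata:
>> 
>> gluster volume create myvol1 replica 3 arbiter 1 transport tcp pri.ostechnix.lan:/gluster/brick1/mpoint1 sec.ostechnix.lan:/gluster/brick1/mpoint1 third.ostechnix.lan:/gluster/brick1/mpoint1)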
>> 
>> node 1's glusterd.log is here:
>> 
>> root at pri:/var/log/glusterfs# cat glusterd.log 
>> [2018-02-06 13:28:37.638373] W [glusterfsd.c:1331:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f0232faa6ba] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xe5) [0x55e17938a8c5] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55e17938a6e4] ) 0-: received signum (15), shutting down
>> [2018-02-06 13:29:41.260479] I [MSGID: 100030] [glusterfsd.c:2476:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.11.3 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
>> [2018-02-06 13:29:41.284367] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
>> [2018-02-06 13:29:41.284462] I [MSGID: 106479] [glusterd.c:1469:init] 0-management: Using /var/lib/glusterd as working directory
>> [2018-02-06 13:29:41.300804] W [MSGID: 103071] [rdma.c:4591:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>> [2018-02-06 13:29:41.300969] W [MSGID: 103055] [rdma.c:4898:init] 0-rdma.management: Failed to initialize IB Device
>> [2018-02-06 13:29:41.301098] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>> [2018-02-06 13:29:41.301190] W [rpcsvc.c:1660:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>> [2018-02-06 13:29:41.301214] E [MSGID: 106243] [glusterd.c:1693:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>> [2018-02-06 13:29:44.621889] E [MSGID: 101032] [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info. [No such file or directory]
>> [2018-02-06 13:29:44.621967] E [MSGID: 101032] [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info. [No such file or directory]
>> [2018-02-06 13:29:44.621971] I [MSGID: 106514] [glusterd-store.c:2215:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 31100
>> [2018-02-06 13:29:44.625749] I [MSGID: 106194] [glusterd-store.c:3772:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
>> Final graph:
>> +------------------------------------------------------------------------------+
>>   1: volume management
>>   2:     type mgmt/glusterd
>>   3:     option rpc-auth.auth-glusterfs on
>>   4:     option rpc-auth.auth-unix on
>>   5:     option rpc-auth.auth-null on
>>   6:     option rpc-auth-allow-insecure on
>>   7:     option transport.socket.listen-backlog 128
>>   8:     option event-threads 1
>>   9:     option ping-timeout 0
>>  10:     option transport.socket.read-fail-log off
>>  11:     option transport.socket.keepalive-interval 2
>>  12:     option transport.socket.keepalive-time 10
>>  13:     option transport-type rdma
>>  14:     option working-directory /var/lib/glusterd
>>  15: end-volume
>>  16:  
>> +------------------------------------------------------------------------------+
>> [2018-02-06 13:29:44.628451] I [MSGID: 101190] [event-epoll.c:602:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2018-02-06 13:46:38.530154] I [MSGID: 106487] [glusterd-handler.c:1484:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
>> [2018-02-06 13:47:05.745357] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req sec.ostechnix.lan 24007
>> [2018-02-06 13:47:05.746465] I [MSGID: 106129] [glusterd-handler.c:3623:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: sec.ostechnix.lan (24007)
>> [2018-02-06 13:47:05.751131] W [MSGID: 106062] [glusterd-handler.c:3399:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>> [2018-02-06 13:47:05.751179] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>> [2018-02-06 13:47:05.751345] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-management: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
>> [2018-02-06 13:47:05.751902] I [MSGID: 106498] [glusterd-handler.c:3549:glusterd_friend_add] 0-management: connect returned 0
>> [2018-02-06 13:47:05.769054] E [MSGID: 101032] [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info. [No such file or directory]
>> [2018-02-06 13:47:05.769160] I [MSGID: 106477] [glusterd.c:190:glusterd_uuid_generate_save] 0-management: generated UUID: 476b754c-24cd-4816-a630-99c1b696a9e6
>> [2018-02-06 13:47:05.806715] I [MSGID: 106511] [glusterd-rpc-ops.c:261:__glusterd_probe_cbk] 0-management: Received probe resp from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada, host: sec.ostechnix.lan
>> [2018-02-06 13:47:05.806764] I [MSGID: 106511] [glusterd-rpc-ops.c:421:__glusterd_probe_cbk] 0-glusterd: Received resp to probe req
>> [2018-02-06 13:47:05.816670] I [MSGID: 106493] [glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada, host: sec.ostechnix.lan, port: 0
>> [2018-02-06 13:47:05.831231] I [MSGID: 106163] [glusterd-handshake.c:1309:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31100
>> [2018-02-06 13:47:05.845025] I [MSGID: 106490] [glusterd-handler.c:2890:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
>> [2018-02-06 13:47:05.845156] I [MSGID: 106493] [glusterd-handler.c:2953:__glusterd_handle_probe_query] 0-glusterd: Responded to sec.ostechnix.lan, op_ret: 0, op_errno: 0, ret: 0
>> [2018-02-06 13:47:05.852449] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
>> [2018-02-06 13:47:05.855861] I [MSGID: 106493] [glusterd-handler.c:3799:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to sec.ostechnix.lan (0), ret: 0, op_ret: 0
>> [2018-02-06 13:47:05.863598] I [MSGID: 106492] [glusterd-handler.c:2717:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
>> [2018-02-06 13:47:05.863674] I [MSGID: 106502] [glusterd-handler.c:2762:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
>> [2018-02-06 13:47:05.866932] I [MSGID: 106493] [glusterd-rpc-ops.c:700:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
>> [2018-02-06 13:48:28.152542] I [MSGID: 106487] [glusterd-handler.c:1484:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
>> [2018-02-06 13:48:57.283814] E [MSGID: 106429] [glusterd-utils.c:1156:glusterd_brickinfo_new_from_brick] 0-management: Failed to convert hostname pri.ostechnix.lan to uuid
>> [2018-02-06 13:48:57.283871] E [MSGID: 106301] [glusterd-syncop.c:1322:gd_stage_op_phase] 0-management: Staging of operation 'Volume Create' failed on localhost : Host pri.ostechnix.lan is not in 'Peer in Cluster' state
>> [2018-02-06 13:49:33.997818] E [MSGID: 106429] [glusterd-utils.c:1156:glusterd_brickinfo_new_from_brick] 0-management: Failed to convert hostname pri.ostechnix.lan to uuid
>> [2018-02-06 13:49:33.997826] E [MSGID: 106301] [glusterd-syncop.c:1322:gd_stage_op_phase] 0-management: Staging of operation 'Volume Create' failed on localhost : Host pri.ostechnix.lan is not in 'Peer in Cluster' state
>> 
>> I need advice on how to fix this issue.
>> 
>> Thanks.
>> 
>> 
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>> 
> 
