[Gluster-users] strange hostname issue on volume create command with famous Peer in Cluster state error message

Atin Mukherjee amukherj at redhat.com
Tue Feb 6 16:01:50 UTC 2018


I'm guessing there's something wrong w.r.t. address resolution on node 1.
From the logs it's quite clear to me that node 1 is unable to resolve the
address configured in /etc/hosts, whereas the other nodes do. Could you
paste the gluster peer status output from all the nodes?

Also, can you please check whether you're able to ping "pri.ostechnix.lan"
from node 1 itself? Does volume create go through if you use the IPs instead
of the hostnames?
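
Something along these lines from each node would help narrow it down (the
names, IPs and brick paths below are taken from your mail, so adjust if
anything differs on your setup):

    gluster peer status
    gluster pool list
    getent hosts pri.ostechnix.lan sec.ostechnix.lan third.ostechnix.lan
    ping -c 2 pri.ostechnix.lan

And, to take name resolution out of the picture completely, the same create
with the IPs:

    gluster volume create myvol1 replica 2 transport tcp \
        51.15.77.14:/gluster/brick1/mpoint1 \
        51.15.90.60:/gluster/brick1/mpoint1 force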


On Tue, Feb 6, 2018 at 7:31 PM, Ercan Aydoğan <ercan.aydogan at gmail.com>
wrote:

> Hello,
>
> I installed GlusterFS 3.11.3 on three Ubuntu 16.04 nodes. All machines
> have the same /etc/hosts.
>
> node1 hostname
> pri.ostechnix.lan
>
> node2 hostname
> sec.ostechnix.lan
>
> node3 hostname
> third.ostechnix.lan
>
>
> 51.15.77.14     pri.ostechnix.lan     pri
> 51.15.90.60      sec.ostechnix.lan     sec
> 163.172.151.120  third.ostechnix.lan   third
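>
> (For reference, resolution against these entries can be double-checked on
> each node with getent; given the hosts file above it should return, e.g.:)
>
> root at pri:~# getent hosts pri.ostechnix.lan
> 51.15.77.14     pri.ostechnix.lan pri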
>
> The volume create command is:
>
> root at pri:/var/log/glusterfs# gluster volume create myvol1 replica 2
> transport tcp pri.ostechnix.lan:/gluster/brick1/mpoint1
> sec.ostechnix.lan:/gluster/brick1/mpoint1 force
> Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to
> avoid this. See: https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
> Do you still want to continue? (y/n) y
> volume create: myvol1: failed: Host pri.ostechnix.lan is not in 'Peer in
> Cluster' state
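>
> (As an aside, the arbiter layout the warning points to would look roughly
> like this, assuming a matching brick path also exists on the third node:)
>
> gluster volume create myvol1 replica 3 arbiter 1 transport tcp \
>     pri.ostechnix.lan:/gluster/brick1/mpoint1 \
>     sec.ostechnix.lan:/gluster/brick1/mpoint1 \
>     third.ostechnix.lan:/gluster/brick1/mpoint1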
>
> Node 1's glusterd.log is here:
>
> root at pri:/var/log/glusterfs# cat glusterd.log
> [2018-02-06 13:28:37.638373] W [glusterfsd.c:1331:cleanup_and_exit]
> (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f0232faa6ba]
> -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xe5) [0x55e17938a8c5]
> -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55e17938a6e4] ) 0-:
> received signum (15), shutting down
> [2018-02-06 13:29:41.260479] I [MSGID: 100030] [glusterfsd.c:2476:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.11.3
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
> [2018-02-06 13:29:41.284367] I [MSGID: 106478] [glusterd.c:1422:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2018-02-06 13:29:41.284462] I [MSGID: 106479] [glusterd.c:1469:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2018-02-06 13:29:41.300804] W [MSGID: 103071] [rdma.c:4591:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
> [2018-02-06 13:29:41.300969] W [MSGID: 103055] [rdma.c:4898:init]
> 0-rdma.management: Failed to initialize IB Device
> [2018-02-06 13:29:41.301098] W [rpc-transport.c:350:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2018-02-06 13:29:41.301190] W [rpcsvc.c:1660:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2018-02-06 13:29:41.301214] E [MSGID: 106243] [glusterd.c:1693:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2018-02-06 13:29:44.621889] E [MSGID: 101032]
> [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to
> /var/lib/glusterd/glusterd.info. [No such file or directory]
> [2018-02-06 13:29:44.621967] E [MSGID: 101032]
> [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to
> /var/lib/glusterd/glusterd.info. [No such file or directory]
> [2018-02-06 13:29:44.621971] I [MSGID: 106514] [glusterd-store.c:2215:glusterd_restore_op_version]
> 0-management: Detected new install. Setting op-version to maximum : 31100
> [2018-02-06 13:29:44.625749] I [MSGID: 106194] [glusterd-store.c:3772:
> glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps
> list.
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2:     type mgmt/glusterd
>   3:     option rpc-auth.auth-glusterfs on
>   4:     option rpc-auth.auth-unix on
>   5:     option rpc-auth.auth-null on
>   6:     option rpc-auth-allow-insecure on
>   7:     option transport.socket.listen-backlog 128
>   8:     option event-threads 1
>   9:     option ping-timeout 0
>  10:     option transport.socket.read-fail-log off
>  11:     option transport.socket.keepalive-interval 2
>  12:     option transport.socket.keepalive-time 10
>  13:     option transport-type rdma
>  14:     option working-directory /var/lib/glusterd
>  15: end-volume
>  16:
> +------------------------------------------------------------------------------+
> [2018-02-06 13:29:44.628451] I [MSGID: 101190] [event-epoll.c:602:event_dispatch_epoll_worker]
> 0-epoll: Started thread with index 1
> [2018-02-06 13:46:38.530154] I [MSGID: 106487] [glusterd-handler.c:1484:__
> glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
> [2018-02-06 13:47:05.745357] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe]
> 0-glusterd: Received CLI probe req sec.ostechnix.lan 24007
> [2018-02-06 13:47:05.746465] I [MSGID: 106129] [glusterd-handler.c:3623:glusterd_probe_begin]
> 0-glusterd: Unable to find peerinfo for host: sec.ostechnix.lan (24007)
> [2018-02-06 13:47:05.751131] W [MSGID: 106062] [glusterd-handler.c:3399:
> glusterd_transport_inet_options_build] 0-glusterd: Failed to get
> tcp-user-timeout
> [2018-02-06 13:47:05.751179] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2018-02-06 13:47:05.751345] W [MSGID: 101002] [options.c:954:xl_opt_validate]
> 0-management: option 'address-family' is deprecated, preferred is
> 'transport.address-family', continuing with correction
> [2018-02-06 13:47:05.751902] I [MSGID: 106498] [glusterd-handler.c:3549:glusterd_friend_add]
> 0-management: connect returned 0
> [2018-02-06 13:47:05.769054] E [MSGID: 101032]
> [store.c:433:gf_store_handle_retrieve] 0-: Path corresponding to
> /var/lib/glusterd/glusterd.info. [No such file or directory]
> [2018-02-06 13:47:05.769160] I [MSGID: 106477]
> [glusterd.c:190:glusterd_uuid_generate_save] 0-management: generated
> UUID: 476b754c-24cd-4816-a630-99c1b696a9e6
> [2018-02-06 13:47:05.806715] I [MSGID: 106511] [glusterd-rpc-ops.c:261:__glusterd_probe_cbk]
> 0-management: Received probe resp from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada,
> host: sec.ostechnix.lan
> [2018-02-06 13:47:05.806764] I [MSGID: 106511] [glusterd-rpc-ops.c:421:__glusterd_probe_cbk]
> 0-glusterd: Received resp to probe req
> [2018-02-06 13:47:05.816670] I [MSGID: 106493] [glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk]
> 0-glusterd: Received ACC from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada,
> host: sec.ostechnix.lan, port: 0
> [2018-02-06 13:47:05.831231] I [MSGID: 106163]
> [glusterd-handshake.c:1309:__glusterd_mgmt_hndsk_versions_ack]
> 0-management: using the op-version 31100
> [2018-02-06 13:47:05.845025] I [MSGID: 106490] [glusterd-handler.c:2890:__glusterd_handle_probe_query]
> 0-glusterd: Received probe from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
> [2018-02-06 13:47:05.845156] I [MSGID: 106493] [glusterd-handler.c:2953:__glusterd_handle_probe_query]
> 0-glusterd: Responded to sec.ostechnix.lan, op_ret: 0, op_errno: 0, ret: 0
> [2018-02-06 13:47:05.852449] I [MSGID: 106490] [glusterd-handler.c:2539:__
> glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from
> uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
> [2018-02-06 13:47:05.855861] I [MSGID: 106493] [glusterd-handler.c:3799:glusterd_xfer_friend_add_resp]
> 0-glusterd: Responded to sec.ostechnix.lan (0), ret: 0, op_ret: 0
> [2018-02-06 13:47:05.863598] I [MSGID: 106492] [glusterd-handler.c:2717:__glusterd_handle_friend_update]
> 0-glusterd: Received friend update from uuid: 1c041dbb-bad3-4158-97b7-
> fe47cddadada
> [2018-02-06 13:47:05.863674] I [MSGID: 106502] [glusterd-handler.c:2762:__glusterd_handle_friend_update]
> 0-management: Received my uuid as Friend
> [2018-02-06 13:47:05.866932] I [MSGID: 106493] [glusterd-rpc-ops.c:700:__glusterd_friend_update_cbk]
> 0-management: Received ACC from uuid: 1c041dbb-bad3-4158-97b7-fe47cddadada
> [2018-02-06 13:48:28.152542] I [MSGID: 106487] [glusterd-handler.c:1484:__
> glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
> [2018-02-06 13:48:57.283814] E [MSGID: 106429] [glusterd-utils.c:1156:
> glusterd_brickinfo_new_from_brick] 0-management: Failed to convert
> hostname pri.ostechnix.lan to uuid
> [2018-02-06 13:48:57.283871] E [MSGID: 106301] [glusterd-syncop.c:1322:gd_stage_op_phase]
> 0-management: Staging of operation 'Volume Create' failed on localhost :
> Host pri.ostechnix.lan is not in 'Peer in Cluster' state
> [2018-02-06 13:49:33.997818] E [MSGID: 106429] [glusterd-utils.c:1156:
> glusterd_brickinfo_new_from_brick] 0-management: Failed to convert
> hostname pri.ostechnix.lan to uuid
> [2018-02-06 13:49:33.997826] E [MSGID: 106301] [glusterd-syncop.c:1322:gd_stage_op_phase]
> 0-management: Staging of operation 'Volume Create' failed on localhost :
> Host pri.ostechnix.lan is not in 'Peer in Cluster' state
>
> I need advice on how to fix this issue.
>
> Thanks.
>