[Gluster-users] Re: Glusterd can't start up

何亦军 heyijun at greatwall.com.cn
Thu Jun 11 07:18:11 UTC 2015


Thanks Atin, that's good news.

So I just have to wait for the new version.

In fact, I don't want to upgrade to 3.7.1, but I need to repair my faulty servers, and only the new version 3.7.1 is in the repos. Why can't the repos store multiple versions? My original version is 3.6.2.


-----Original Message-----
From: Atin Mukherjee [mailto:amukherj at redhat.com]
Sent: June 11, 2015 14:57
To: 何亦军; gluster-users at gluster.org
Subject: Re: [Gluster-users] Glusterd can't start up

This is an issue with 3.7.1: the rebalance code path in glusterd is broken.
The fix will be released in 3.7.2.
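
As an illustration of how the quoted backtrace points at the rebalance path, here is a minimal Python sketch that lists the frames located in glusterd.so. It assumes backtrace lines have the form library(symbol+offset)[address] as in the quoted log; the helper name glusterd_frames and the sample list are hypothetical and exist only for this sketch.

```python
import re

# Matches frames like: /path/lib.so(symbol+0x191)[0x7f15c9053931]
# The symbol part may be empty, e.g. /lib64/libc.so.6(+0x470f0)[0x...].
FRAME_RE = re.compile(
    r"^(?P<lib>\S+)\((?P<sym>[^+)]*)\+0x[0-9a-f]+\)\[0x[0-9a-f]+\]$"
)


def glusterd_frames(lines):
    """Return the symbol names of frames inside glusterd.so, in log order."""
    symbols = []
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if m and m.group("lib").endswith("glusterd.so") and m.group("sym"):
            symbols.append(m.group("sym"))
    return symbols


# Frames copied from the quoted crash log, with wrapped lines joined.
sample = [
    "/lib64/libc.so.6(__snprintf_chk+0x78)[0x7f15d2c88248]",
    "/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so"
    "(glusterd_volume_defrag_restart+0x191)[0x7f15c9053931]",
    "/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so"
    "(glusterd_restart_rebalance+0x82)[0x7f15c9059aa2]",
    "/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so"
    "(glusterd_spawn_daemons+0x4f)[0x7f15c9059b1f]",
    "/lib64/libc.so.6(+0x470f0)[0x7f15d2bc40f0]",
]

print(glusterd_frames(sample))
```

The innermost glusterd frames are glusterd_volume_defrag_restart and glusterd_restart_rebalance, reached from glusterd_spawn_daemons at startup, which matches the rebalance code path named above.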

~Atin

On 06/11/2015 12:21 PM, 何亦军 wrote:
> Hi all,
> 
> My glusterfs pool was updated from 3.6.2 to 3.7.1; the node servers run CentOS 7.1.1503.
> Some servers work well, but on one server glusterd fails to start. Can anyone help me?
> 
> Some messages are below:
> 
> [root at gwgfs02 bricks]# systemctl status glusterd
> glusterd.service - GlusterFS, a clustered file-system server
>    Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled)
>    Active: failed (Result: signal) since Thu 2015-06-11 14:37:10 CST; 3s ago
>   Process: 4166 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid (code=exited, status=0/SUCCESS)
>  Main PID: 4167 (code=killed, signal=ABRT)
> 
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: llistxattr 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: setfsid 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: spinlock 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: epoll.h 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: xattr.h 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: st_atim.tv_nsec 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: package-string: glusterfs 3.7.1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: ---------
> Jun 11 14:37:10 gwgfs02 systemd[1]: glusterd.service: main process exited, code=killed, status=6/ABRT
> Jun 11 14:37:10 gwgfs02 systemd[1]: Unit glusterd.service entered failed state.
> 
> Some log entries from etc-glusterfs-glusterd.vol.log:
> [2015-06-11 06:37:10.187333] W [rdma.c:4493:__gf_rdma_ctx_create] 
> 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such 
> device)
> [2015-06-11 06:37:10.187357] W [rdma.c:4793:init] 0-rdma.management: 
> Failed to initialize IB Device
> [2015-06-11 06:37:10.187367] W 
> [rpc-transport.c:358:rpc_transport_load] 0-rpc-transport: 'rdma' 
> initialization failed
> [2015-06-11 06:37:10.187473] W [rpcsvc.c:1595:rpcsvc_transport_create] 
> 0-rpc-service: cannot create listener, initing the transport failed
> [2015-06-11 06:37:10.187490] E [glusterd.c:1515:init] 0-management: 
> creation of 1 listeners failed, continuing with succeeded transport
> [2015-06-11 06:37:10.188848] I 
> [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: 
> geo-replication module not installed in the system
> [2015-06-11 06:37:10.189361] I 
> [glusterd-store.c:1986:glusterd_restore_op_version] 0-glusterd: 
> retrieved op-version: 30700
> [2015-06-11 06:37:10.189475] I [glusterd.c:154:glusterd_uuid_init] 
> 0-management: retrieved UUID: d79c0a67-155b-43a8-8b51-151cc97aa4da
> [2015-06-11 06:37:10.189557] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.189769] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout 
> to 600
> [2015-06-11 06:37:10.189931] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.190093] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.190287] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.190515] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-snapd: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.467359] I 
> [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 
> 0-management: connect returned 0
> [2015-06-11 06:37:10.467437] I 
> [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 
> 0-management: connect returned 0
> [2015-06-11 06:37:10.467493] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.471021] W [socket.c:923:__socket_keepalive] 
> 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid 
> argument
> [2015-06-11 06:37:10.471039] E [socket.c:3015:socket_connect] 
> 0-management: Failed to set keep-alive: Invalid argument
> [2015-06-11 06:37:10.471159] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.474425] W [socket.c:923:__socket_keepalive] 
> 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 14, Invalid 
> argument
> [2015-06-11 06:37:10.474442] E [socket.c:3015:socket_connect] 
> 0-management: Failed to set keep-alive: Invalid argument
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2:     type mgmt/glusterd
>   3:     option rpc-auth.auth-glusterfs on
>   4:     option rpc-auth.auth-unix on
>   5:     option rpc-auth.auth-null on
>   6:     option transport.socket.listen-backlog 128
>   7:     option ping-timeout 30
>   8:     option transport.socket.read-fail-log off
>   9:     option transport.socket.keepalive-interval 2
> 10:     option transport.socket.keepalive-time 10
> 11:     option transport-type rdma
> 12:     option working-directory /var/lib/glusterd
> 13: end-volume
> 14:
> +------------------------------------------------------------------------------+
> [2015-06-11 06:37:10.476457] I 
> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started 
> thread with index 1
> [2015-06-11 06:37:10.553448] I 
> [glusterd-rpc-ops.c:464:__glusterd_friend_add_cbk] 0-glusterd: 
> Received ACC from uuid: b80f71d0-6944-4236-af96-e272a1f7e739, host: 
> 192.168.0.61, port: 0
> [2015-06-11 06:37:10.572277] I 
> [glusterd-handler.c:2587:__glusterd_handle_friend_update] 0-glusterd: 
> Received friend update from uuid: b80f71d0-6944-4236-af96-e272a1f7e739
> [2015-06-11 06:37:10.572312] I 
> [glusterd-handler.c:2630:__glusterd_handle_friend_update] 
> 0-management: Received my uuid as Friend
> [2015-06-11 06:37:10.572628] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already 
> stopped
> [2015-06-11 06:37:10.572673] W [socket.c:3059:socket_connect] 0-nfs: 
> Ignore failed connection attempt on , (No such file or directory)
> [2015-06-11 06:37:10.573149] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd 
> already stopped
> [2015-06-11 06:37:10.575894] W [socket.c:3059:socket_connect] 
> 0-glustershd: Ignore failed connection attempt on , (No such file or 
> directory)
> [2015-06-11 06:37:10.578510] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad 
> already stopped
> [2015-06-11 06:37:10.581415] W [socket.c:3059:socket_connect] 
> 0-quotad: Ignore failed connection attempt on , (No such file or 
> directory)
> [2015-06-11 06:37:10.581496] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd 
> already stopped
> [2015-06-11 06:37:10.581539] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub 
> already stopped
> [2015-06-11 06:37:10.584198] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting 
> frame-timeout to 600
> [2015-06-11 06:37:10.588633] I 
> [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> pending frames:
> frame : type(0) op(0)
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 6
> time of crash:
> 2015-06-11 06:37:10
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.7.1
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f15d41c0d92]
> /lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f15d41db9ed]
> /lib64/libc.so.6(+0x35650)[0x7f15d2bb2650]
> /lib64/libc.so.6(gsignal+0x37)[0x7f15d2bb25d7]
> /lib64/libc.so.6(abort+0x148)[0x7f15d2bb3cc8]
> /lib64/libc.so.6(+0x75e07)[0x7f15d2bf2e07]
> /lib64/libc.so.6(__fortify_fail+0x37)[0x7f15d2c8aa57]
> /lib64/libc.so.6(+0x10bc10)[0x7f15d2c88c10]
> /lib64/libc.so.6(+0x10b32b)[0x7f15d2c8832b]
> /lib64/libc.so.6(__snprintf_chk+0x78)[0x7f15d2c88248]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_volume_defrag_restart+0x191)[0x7f15c9053931]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_restart_rebalance+0x82)[0x7f15c9059aa2]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_spawn_daemons+0x4f)[0x7f15c9059b1f]
> /lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7f15d41fb482]
> /lib64/libc.so.6(+0x470f0)[0x7f15d2bc40f0]
> ---------
> 
> Some log entries from data-brick1-vol01.log:
> [2015-06-11 06:37:10.602714] I 
> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started 
> thread with index 1
> [2015-06-11 06:37:10.612919] W [socket.c:642:__socket_rwv] 
> 0-glusterfs: readv on 192.168.0.62:24007 failed (Connection reset by 
> peer)
> [2015-06-11 06:37:10.613503] E [rpc-clnt.c:362:saved_frames_unwind] 
> (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f1074730ee6] 
> (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f10744ff36e] 
> (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f10744ff47e] 
> (--> 
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f1074500e0c] 
> (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f10745015c8] ))))) 
> 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) 
> op(GETSPEC(2)) called at 2015-06-11 06:37:10.602886 (xid=0x1)
> [2015-06-11 06:37:10.613550] E 
> [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch 
> volume file (key:vol01.gwgfs02.data-brick1-vol01)
> [2015-06-11 06:37:10.613599] W [glusterfsd.c:1219:cleanup_and_exit] 
> (--> 0-: received signum (0), shutting down
> [2015-06-11 06:37:10.618382] I [socket.c:3358:socket_submit_request] 
> 0-glusterfs: not connected (priv->connected = 0)
> [2015-06-11 06:37:10.618406] W [rpc-clnt.c:1566:rpc_clnt_submit] 
> 0-glusterfs: failed to submit rpc-request (XID: 0x2 Program: Gluster 
> Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)
> 
> 
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 

--
~Atin

