[Gluster-users] Glusterd can't start up

何亦军 heyijun at greatwall.com.cn
Thu Jun 11 06:51:35 UTC 2015


Hi all,

I upgraded my GlusterFS pool from 3.6.2 to 3.7.1. The node servers run CentOS 7.1.1503.
Some servers work fine after the upgrade, but on one server (gwgfs02) glusterd fails to start. Can anyone help me?

Some output from that server is below:

[root@gwgfs02 bricks]# systemctl status glusterd
glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled)
   Active: failed (Result: signal) since Thu 2015-06-11 14:37:10 CST; 3s ago
  Process: 4166 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid (code=exited, status=0/SUCCESS)
 Main PID: 4167 (code=killed, signal=ABRT)

Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: llistxattr 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: setfsid 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: spinlock 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: epoll.h 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: xattr.h 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: st_atim.tv_nsec 1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: package-string: glusterfs 3.7.1
Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: ---------
Jun 11 14:37:10 gwgfs02 systemd[1]: glusterd.service: main process exited, code=killed, status=6/ABRT
Jun 11 14:37:10 gwgfs02 systemd[1]: Unit glusterd.service entered failed state.

Some log output from etc-glusterfs-glusterd.vol.log:
[2015-06-11 06:37:10.187333] W [rdma.c:4493:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
[2015-06-11 06:37:10.187357] W [rdma.c:4793:init] 0-rdma.management: Failed to initialize IB Device
[2015-06-11 06:37:10.187367] W [rpc-transport.c:358:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2015-06-11 06:37:10.187473] W [rpcsvc.c:1595:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2015-06-11 06:37:10.187490] E [glusterd.c:1515:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2015-06-11 06:37:10.188848] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2015-06-11 06:37:10.189361] I [glusterd-store.c:1986:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30700
[2015-06-11 06:37:10.189475] I [glusterd.c:154:glusterd_uuid_init] 0-management: retrieved UUID: d79c0a67-155b-43a8-8b51-151cc97aa4da
[2015-06-11 06:37:10.189557] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-06-11 06:37:10.189769] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-06-11 06:37:10.189931] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-06-11 06:37:10.190093] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-06-11 06:37:10.190287] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-06-11 06:37:10.190515] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2015-06-11 06:37:10.467359] I [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2015-06-11 06:37:10.467437] I [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2015-06-11 06:37:10.467493] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-06-11 06:37:10.471021] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2015-06-11 06:37:10.471039] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-11 06:37:10.471159] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-06-11 06:37:10.474425] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 14, Invalid argument
[2015-06-11 06:37:10.474442] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option transport.socket.listen-backlog 128
  7:     option ping-timeout 30
  8:     option transport.socket.read-fail-log off
  9:     option transport.socket.keepalive-interval 2
 10:     option transport.socket.keepalive-time 10
 11:     option transport-type rdma
 12:     option working-directory /var/lib/glusterd
 13: end-volume
 14:
+------------------------------------------------------------------------------+
[2015-06-11 06:37:10.476457] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-06-11 06:37:10.553448] I [glusterd-rpc-ops.c:464:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: b80f71d0-6944-4236-af96-e272a1f7e739, host: 192.168.0.61, port: 0
[2015-06-11 06:37:10.572277] I [glusterd-handler.c:2587:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: b80f71d0-6944-4236-af96-e272a1f7e739
[2015-06-11 06:37:10.572312] I [glusterd-handler.c:2630:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2015-06-11 06:37:10.572628] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2015-06-11 06:37:10.572673] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on , (No such file or directory)
[2015-06-11 06:37:10.573149] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2015-06-11 06:37:10.575894] W [socket.c:3059:socket_connect] 0-glustershd: Ignore failed connection attempt on , (No such file or directory)
[2015-06-11 06:37:10.578510] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2015-06-11 06:37:10.581415] W [socket.c:3059:socket_connect] 0-quotad: Ignore failed connection attempt on , (No such file or directory)
[2015-06-11 06:37:10.581496] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2015-06-11 06:37:10.581539] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2015-06-11 06:37:10.584198] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-06-11 06:37:10.588633] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash:
2015-06-11 06:37:10
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f15d41c0d92]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f15d41db9ed]
/lib64/libc.so.6(+0x35650)[0x7f15d2bb2650]
/lib64/libc.so.6(gsignal+0x37)[0x7f15d2bb25d7]
/lib64/libc.so.6(abort+0x148)[0x7f15d2bb3cc8]
/lib64/libc.so.6(+0x75e07)[0x7f15d2bf2e07]
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f15d2c8aa57]
/lib64/libc.so.6(+0x10bc10)[0x7f15d2c88c10]
/lib64/libc.so.6(+0x10b32b)[0x7f15d2c8832b]
/lib64/libc.so.6(__snprintf_chk+0x78)[0x7f15d2c88248]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_volume_defrag_restart+0x191)[0x7f15c9053931]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_restart_rebalance+0x82)[0x7f15c9059aa2]
/usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_spawn_daemons+0x4f)[0x7f15c9059b1f]
/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7f15d41fb482]
/lib64/libc.so.6(+0x470f0)[0x7f15d2bc40f0]
---------

Some log output from data-brick1-vol01.log:
[2015-06-11 06:37:10.602714] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-06-11 06:37:10.612919] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 192.168.0.62:24007 failed (Connection reset by peer)
[2015-06-11 06:37:10.613503] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f1074730ee6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f10744ff36e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f10744ff47e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f1074500e0c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f10745015c8] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2015-06-11 06:37:10.602886 (xid=0x1)
[2015-06-11 06:37:10.613550] E [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:vol01.gwgfs02.data-brick1-vol01)
[2015-06-11 06:37:10.613599] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-: received signum (0), shutting down
[2015-06-11 06:37:10.618382] I [socket.c:3358:socket_submit_request] 0-glusterfs: not connected (priv->connected = 0)
[2015-06-11 06:37:10.618406] W [rpc-clnt.c:1566:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)

