[Gluster-users] glusterfs 3.1.1 troubles on SLES11 SP1

Markus Fröhlich markus.froehlich at xidras.com
Thu Jan 13 19:22:26 UTC 2011


I have two servers running SLES11 SP1 x86_64, and I compiled the latest version of glusterfs, 3.1.1.
The firewall is disabled on both nodes and they are on the same network.

I put both hostnames in the hosts file so that each node can resolve the other's hostname correctly:
192.168.8.104   virt-zabbix-02
192.168.8.105   virt-zabbix-03
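
One way to double-check entries like these is to parse the hosts text and confirm that both node names map to an address. This is only a sketch: `parse_hosts` and `HOSTS_TEXT` are hypothetical names, and the text is copied from the two entries above rather than read from the real hosts file.

```python
def parse_hosts(text):
    """Map every hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


# Assumption: the two entries quoted in this post, not the full hosts file.
HOSTS_TEXT = """\
192.168.8.104   virt-zabbix-02
192.168.8.105   virt-zabbix-03
"""

hosts = parse_hosts(HOSTS_TEXT)
missing = [n for n in ("virt-zabbix-02", "virt-zabbix-03") if n not in hosts]
print(hosts, missing)
```

An empty `missing` list only says that the file contains both names; it does not prove that the resolver on each node actually consults this file first.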

This is my config on both nodes, "/etc/glusterfs/glusterd.vol":
volume management
     type mgmt/glusterd
     option working-directory /etc/glusterd
     option transport-type socket,rdma
     option transport.socket.keepalive-time 10
     option transport.socket.keepalive-interval 2
end-volume

virt-zabbix-02# gluster peer status
No peers present

log:
[2011-01-13 19:53:31.576554] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: 
Received cli list req

This is okay, but when I try to add the other node to the cluster, the "glusterfsd" dies on 
"virt-zabbix-02", where I type the command, and a core dump file is generated:
virt-zabbix-02# gluster peer probe virt-zabbix-03

log virt-zabbix-02:
[2011-01-13 19:54:29.284735] I [glusterd-handler.c:563:glusterd_handle_cli_probe] glusterd: Received 
CLI probe req virt-zabbix-03 24007
[2011-01-13 19:54:29.285110] I [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to 
find hostname: virt-zabbix-03
[2011-01-13 19:54:29.285136] I [glusterd-handler.c:2618:glusterd_probe_begin] glusterd: Unable to 
find peerinfo for host: virt-zabbix-03 (24007)
[2011-01-13 19:54:29.287625] W [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing 
'option transport-type'. defaulting to "socket"
[2011-01-13 19:54:29.288496] I [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect 
returned 0
[2011-01-13 19:54:29.293369] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: 
Friend virt-zabbix-03 found.. state: 0
[2011-01-13 19:54:29.302062] I [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received 
probe resp from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a, host: virt-zabbix-03
[2011-01-13 19:54:29.302097] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to 
find peer by uuid
[2011-01-13 19:54:29.302111] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: 
Friend virt-zabbix-03 found.. state: 0
pending frames:

patchset: v3.1.1
signal received: 11
time of crash: 2011-01-13 19:54:29
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.1
/lib64/libc.so.6(+0x329e0)[0x7f1cbbb589e0]
/usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7f1cbc4c506c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7f1cbc4ca878]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7f1cba4203be]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7f1cba424f3b]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7f1cba40db17]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7f1cba40d675]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7f1cba4281f5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7f1cbc4c9a94]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7f1cbc4c9cd8]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7f1cbc4c4f2e]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7f1cba1def9f]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7f1cba1df0d4]
/usr/lib64/libglusterfs.so.0(+0x3a384)[0x7f1cbc70b384]
/usr/sbin/glusterd(main+0x23c)[0x4055dc]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f1cbbb44bc6]
/usr/sbin/glusterd[0x4032c9]
---------

log virt-zabbix-03:
[2011-01-13 19:54:29.296723] I [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: 
Received probe from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941
[2011-01-13 19:54:29.296802] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to 
find peer by uuid
[2011-01-13 19:54:29.297224] I [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to 
find hostname: 192.168.8.104
[2011-01-13 19:54:29.297278] I [glusterd-handler.c:2401:glusterd_handle_probe_query] glusterd: 
Unable to find peerinfo for host: 192.168.8.104 (24007)
[2011-01-13 19:54:29.300119] W [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing 
'option transport-type'. defaulting to "socket"
[2011-01-13 19:54:29.304856] I [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect 
returned 0
[2011-01-13 19:54:29.304994] I [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: 
Responded to virt-zabbix-03, op_ret: 0, op_errno: 0, ret: 0
[2011-01-13 19:54:35.314773] E [socket.c:1656:socket_connect_finish] management: connection to 
192.168.8.104:24007 failed (Connection refused)


So I start the "glusterfsd" on virt-zabbix-02 again. A few seconds later the glusterfsd dies on the 
other node, virt-zabbix-03, and a core dump file is generated there as well.

log virt-zabbix-02:
[2011-01-13 19:57:08.911495] I [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: 
Received probe from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
[2011-01-13 19:57:08.911559] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to 
find peer by uuid
[2011-01-13 19:57:08.911643] I [glusterd-utils.c:2140:glusterd_friend_find_by_hostname] glusterd: 
Friend 192.168.8.105 found.. state: 0
[2011-01-13 19:57:08.911715] I [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: 
Responded to 192.168.8.104, op_ret: 0, op_errno: 0, ret: 0
[2011-01-13 19:57:11.956152] E [socket.c:1656:socket_connect_finish] management: connection to 
192.168.8.105:24007 failed (Connection refused)


log virt-zabbix-03:
[2011-01-13 19:57:08.913897] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: 
Friend 192.168.8.104 found.. state: 0
[2011-01-13 19:57:08.915052] I [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received 
probe resp from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941, host: 192.168.8.104
[2011-01-13 19:57:08.915085] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to 
find peer by uuid
[2011-01-13 19:57:08.915100] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: 
Friend 192.168.8.104 found.. state: 0
pending frames:

patchset: v3.1.1
signal received: 11
time of crash: 2011-01-13 19:57:08
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.1
/lib64/libc.so.6(+0x329e0)[0x7fe84e6ee9e0]
/usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7fe84f05b06c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7fe84f060878]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7fe84cfb63be]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7fe84cfbaf3b]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7fe84cfa3b17]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7fe84cfa3675]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7fe84cfbe1f5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7fe84f05fa94]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7fe84f05fcd8]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7fe84f05af2e]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7fe84cd74f9f]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7fe84cd750d4]
/usr/lib64/libglusterfs.so.0(+0x3a384)[0x7fe84f2a1384]
/usr/sbin/glusterd(main+0x23c)[0x4055dc]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7fe84e6dabc6]
/usr/sbin/glusterd[0x4032c9]
---------
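
Both backtraces appear to walk the same call chain, with rpc_transport_connect as the last named frame before the signal handler. The sketch below reduces a backtrace to its symbol names so that two crashes can be compared. The helper name `frame_symbols` and the regular expression are assumptions based on the frame format shown above, not part of glusterfs, and `SAMPLE` holds only a few frames copied from the second trace.

```python
import re

# Assumed frame layout: object(symbol+offset)[address]
FRAME_RE = re.compile(r"^(?P<obj>\S+?)\((?P<sym>[^)]*)\)\[(?P<addr>0x[0-9a-f]+)\]$")


def frame_symbols(backtrace_text):
    """Return the named symbols of a backtrace, innermost frame first."""
    symbols = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # frames without "(...)" carry no symbol name
        sym = match.group("sym")
        if not sym or sym.startswith("+"):
            continue  # only an offset into the object, no symbol
        symbols.append(sym.split("+", 1)[0])
    return symbols


SAMPLE = """\
/lib64/libc.so.6(+0x329e0)[0x7fe84e6ee9e0]
/usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7fe84f05b06c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7fe84f060878]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7fe84cfb63be]
/usr/sbin/glusterd[0x4032c9]
"""

symbols = frame_symbols(SAMPLE)
print(symbols)
```

Running the same function over both full traces and comparing the two lists would show whether the crashes really follow an identical path.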


Starting the glusterfsd on virt-zabbix-03 again makes the glusterfsd on virt-zabbix-02 die, and so on, 
so I make sure the daemon is stopped on both hosts.
The peer files generated on the two nodes differ: one is named after the hostname, the other after 
the IP:
virt-zabbix-02:#  cat /etc/glusterd/peers/virt-zabbix-03
uuid=
state=0
hostname1=virt-zabbix-03

virt-zabbix-03:# cat /etc/glusterd/peers/192.168.8.104
uuid=
state=0
hostname1=192.168.8.104


I see that the uuid is empty in both files, so I fill it in with the UUID from the other node's 
"/etc/glusterd/glusterd.info" file:
virt-zabbix-02:/ # cat /etc/glusterd/glusterd.info
UUID=a9b660c5-456d-4e96-9bdd-d23c917ae941
virt-zabbix-03:/ # cat etc/glusterd/glusterd.info
UUID=255540da-4b86-46f2-963c-3214e2c5e28a

virt-zabbix-02:/ # cat /etc/glusterd/peers/virt-zabbix-03
uuid=255540da-4b86-46f2-963c-3214e2c5e28a
state=0
hostname1=virt-zabbix-03

virt-zabbix-03:/ # cat /etc/glusterd/peers/192.168.8.104
uuid=a9b660c5-456d-4e96-9bdd-d23c917ae941
state=0
hostname1=192.168.8.104
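
The manual edit can be checked with a short script. This is a minimal sketch under stated assumptions: `parse_kv` and `peer_uuid_matches` are hypothetical helpers, the file contents are pasted in as strings from the output above instead of being read from /etc/glusterd, and the files are assumed to contain nothing but plain key=value lines.

```python
def parse_kv(text):
    """Parse plain key=value lines into a dict (no quoting or escaping)."""
    result = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            result[key.strip()] = value.strip()
    return result


def peer_uuid_matches(peer_file_text, remote_info_text):
    """True if the peer file has a non-empty uuid equal to the remote UUID."""
    peer_uuid = parse_kv(peer_file_text).get("uuid", "")
    remote_uuid = parse_kv(remote_info_text).get("UUID", "")
    return bool(peer_uuid) and peer_uuid == remote_uuid


# Contents as shown in this post (peer file on virt-zabbix-02, info file on virt-zabbix-03).
PEER_BEFORE = "uuid=\nstate=0\nhostname1=virt-zabbix-03\n"
PEER_AFTER = "uuid=255540da-4b86-46f2-963c-3214e2c5e28a\nstate=0\nhostname1=virt-zabbix-03\n"
REMOTE_INFO = "UUID=255540da-4b86-46f2-963c-3214e2c5e28a\n"

print(peer_uuid_matches(PEER_BEFORE, REMOTE_INFO))
print(peer_uuid_matches(PEER_AFTER, REMOTE_INFO))
```

The check says nothing about the state=0 field, which the edit leaves untouched.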


Now I start "glusterfsd" on both nodes again; both daemons keep running and I can run the command:
virt-zabbix-02:/ # gluster peer status
Number of Peers: 1

Hostname: virt-zabbix-03
Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
State: Establishing Connection (Connected)

I'd like to create my first test volume:
gluster volume create mytest transport tcp virt-zabbix-02:/gfs1 virt-zabbix-03:/gfs1
Creation of volume mytest has been unsuccessful
Host virt-zabbix-03 not connected

log virt-zabbix-02:
[2011-01-13 20:11:10.706931] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: 
Received cli list req
[2011-01-13 20:12:20.950199] I [glusterd-handler.c:785:glusterd_handle_create_volume] glusterd: 
Received create volume req
[2011-01-13 20:12:20.950907] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: 
Friend virt-zabbix-03 found.. state: 0
[2011-01-13 20:12:20.950935] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend 
found.. state: Establishing Connection
[2011-01-13 20:12:20.950950] E [glusterd-utils.c:2324:glusterd_new_brick_validate] glusterd: Host 
virt-zabbix-03 not connected
[2011-01-13 20:12:20.951005] E [glusterd-handler.c:906:glusterd_handle_create_volume] glusterd: 
Unlock on opinfo failed

Nothing is logged on virt-zabbix-03.

Not connected? Strange! The status info again:
virt-zabbix-02:/ # gluster peer status
Number of Peers: 1

Hostname: virt-zabbix-03
Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
State: Establishing Connection (Connected)

log virt-zabbix-02:
[2011-01-13 20:13:24.601901] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: 
Received cli list req
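
The status output reports "Connected" in parentheses, yet the state in front of it is still "Establishing Connection", which may be what the volume create check objects to. The sketch below splits the two parts apart. `parse_peer_status` is a hypothetical helper written against the output format shown above, and treating "Peer in Cluster" as the state of a fully joined peer is an assumption that this post does not confirm.

```python
import re


def parse_peer_status(output):
    """Parse `gluster peer status` text into a list of peer dicts."""
    peers = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("State:"):
            # Split "State: <state> (<link>)" into its two parts.
            match = re.match(r"State:\s*(.*?)\s*\((.*)\)\s*$", line)
            if match:
                current["state"], current["link"] = match.group(1), match.group(2)
            else:
                current["state"] = line.split(":", 1)[1].strip()
                current["link"] = None
    return peers


STATUS = """\
Number of Peers: 1

Hostname: virt-zabbix-03
Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
State: Establishing Connection (Connected)
"""

peers = parse_peer_status(STATUS)
# Assumption: a fully joined peer would report "Peer in Cluster".
not_joined = [p["hostname"] for p in peers if p.get("state") != "Peer in Cluster"]
print(peers, not_joined)
```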


So I restart the glusterfsd on virt-zabbix-03, and the daemon on virt-zabbix-02 dies again.

Does anyone have an idea what is going wrong?

kind regards









