[Gluster-users] glusterfs 3.1.1 troubles on SLES11 SP1

Amar Tumballi amar at gluster.com
Fri Jan 14 05:56:45 UTC 2011


Hi Markus,

This is the first time I have come across this particular backtrace/crash.
I am looking into it now and have filed a bug at
http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2293

In the meantime, can you try the steps below and see if they fix the issue:

* stop all gluster processes (glusterd/glusterfs/glusterfsd)

* move the glusterd config directory aside (on both machines)

bash# mv /etc/glusterd /etc/glusterd.old

* start glusterd on both machines, then run gluster peer probe again
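
The move step above can be sketched as a small shell helper. This is only a
sketch: the function name backup_glusterd_dir is hypothetical (it is not part
of gluster), and it assumes you have already stopped all gluster processes
yourself, since how they are stopped depends on your init setup.

```shell
#!/bin/sh
# Sketch: move a glusterd working directory aside so that glusterd
# starts from a clean state. Run on both machines, as root, with all
# gluster processes (glusterd/glusterfs/glusterfsd) already stopped.

# backup_glusterd_dir DIR
#   Renames DIR to DIR.old. Fails if DIR is missing or if DIR.old
#   already exists, so an earlier backup is never overwritten.
backup_glusterd_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "nothing to move: $dir does not exist" >&2
        return 1
    fi
    if [ -e "$dir.old" ]; then
        echo "refusing to overwrite existing $dir.old" >&2
        return 1
    fi
    mv "$dir" "$dir.old" && echo "moved $dir to $dir.old"
}

# Example:
#   backup_glusterd_dir /etc/glusterd
```

The check for an existing .old directory is there so that repeating the step
does not throw away the first backup, which still holds the original peer
files.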

Let me know the output.
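
When collecting the output, it may also help to check whether the newly
created peer files again contain an empty uuid= line, as in the listing in
your mail. A minimal sketch, assuming the peer files live under
/etc/glusterd/peers as shown there; list_empty_uuid_peers is a hypothetical
helper name, not a gluster command.

```shell
#!/bin/sh
# Sketch: report peer files whose "uuid=" line has no value.

# list_empty_uuid_peers DIR
#   Prints the file name of every regular file in DIR that contains
#   a line consisting only of "uuid=" (optionally followed by blanks).
list_empty_uuid_peers() {
    peers_dir=$1
    for f in "$peers_dir"/*; do
        # Skip anything that is not a regular file (also covers an
        # empty directory, where the glob stays unexpanded).
        [ -f "$f" ] || continue
        if grep -q '^uuid=[[:space:]]*$' "$f"; then
            basename "$f"
        fi
    done
}

# Example:
#   list_empty_uuid_peers /etc/glusterd/peers
```

No output means every peer file found has a non-empty uuid.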

Regards,
Amar


2011/1/14 Markus Fröhlich <markus.froehlich at xidras.com>

> I have two servers running SLES11 SP1 x86_64 and compiled the latest
> version of glusterfs, 3.1.1.
> The firewall is disabled on both nodes and they are on the same network.
>
> I put both hostnames in the hosts file, so that each node can resolve the
> other's hostname correctly:
> 192.168.8.104   virt-zabbix-02
> 192.168.8.105   virt-zabbix-03
>
> This is my config on both nodes ("/etc/glusterfs/glusterd.vol"):
> volume management
>    type mgmt/glusterd
>    option working-directory /etc/glusterd
>    option transport-type socket,rdma
>    option transport.socket.keepalive-time 10
>    option transport.socket.keepalive-interval 2
> end-volume
>
> virt-zabbix-02# gluster peer status
> No peers present
>
> log:
> [2011-01-13 19:53:31.576554] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
>
> This is okay, but when I try to add the other node to the cluster, the
> "glusterfsd" dies on "virt-zabbix-02", where I typed the command, and a
> core dump file is generated:
> virt-zabbix-02# gluster peer probe virt-zabbix-03
>
> log virt-zabbix-02:
> [2011-01-13 19:54:29.284735] I
> [glusterd-handler.c:563:glusterd_handle_cli_probe] glusterd: Received CLI
> probe req virt-zabbix-03 24007
> [2011-01-13 19:54:29.285110] I
> [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find
> hostname: virt-zabbix-03
> [2011-01-13 19:54:29.285136] I
> [glusterd-handler.c:2618:glusterd_probe_begin] glusterd: Unable to find
> peerinfo for host: virt-zabbix-03 (24007)
> [2011-01-13 19:54:29.287625] W [rpc-transport.c:849:rpc_transport_load]
> rpc-transport: missing 'option transport-type'. defaulting to "socket"
> [2011-01-13 19:54:29.288496] I
> [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
> [2011-01-13 19:54:29.293369] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> [2011-01-13 19:54:29.302062] I
> [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp
> from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a, host: virt-zabbix-03
> [2011-01-13 19:54:29.302097] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:54:29.302111] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> pending frames:
>
> patchset: v3.1.1
> signal received: 11
> time of crash: 2011-01-13 19:54:29
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.1.1
> /lib64/libc.so.6(+0x329e0)[0x7f1cbbb589e0]
> /usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7f1cbc4c506c]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7f1cbc4ca878]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7f1cba4203be]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7f1cba424f3b]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7f1cba40db17]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7f1cba40d675]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7f1cba4281f5]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7f1cbc4c9a94]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7f1cbc4c9cd8]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7f1cbc4c4f2e]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7f1cba1def9f]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7f1cba1df0d4]
> /usr/lib64/libglusterfs.so.0(+0x3a384)[0x7f1cbc70b384]
> /usr/sbin/glusterd(main+0x23c)[0x4055dc]
> /lib64/libc.so.6(__libc_start_main+0xe6)[0x7f1cbbb44bc6]
> /usr/sbin/glusterd[0x4032c9]
> ---------
>
> log virt-zabbix-03:
> [2011-01-13 19:54:29.296723] I
> [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received
> probe from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941
> [2011-01-13 19:54:29.296802] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:54:29.297224] I
> [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find
> hostname: 192.168.8.104
> [2011-01-13 19:54:29.297278] I
> [glusterd-handler.c:2401:glusterd_handle_probe_query] glusterd: Unable to
> find peerinfo for host: 192.168.8.104 (24007)
> [2011-01-13 19:54:29.300119] W [rpc-transport.c:849:rpc_transport_load]
> rpc-transport: missing 'option transport-type'. defaulting to "socket"
> [2011-01-13 19:54:29.304856] I
> [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
> [2011-01-13 19:54:29.304994] I
> [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to
> virt-zabbix-03, op_ret: 0, op_errno: 0, ret: 0
> [2011-01-13 19:54:35.314773] E [socket.c:1656:socket_connect_finish]
> management: connection to 192.168.8.104:24007 failed (Connection refused)
>
>
> So I start the "glusterfsd" on virt-zabbix-02 again - a few seconds later
> the glusterfsd dies on the other node, virt-zabbix-03, and a core dump
> file is generated there as well.
>
> log virt-zabbix-02:
> [2011-01-13 19:57:08.911495] I
> [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received
> probe from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> [2011-01-13 19:57:08.911559] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:57:08.911643] I
> [glusterd-utils.c:2140:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.105 found.. state: 0
> [2011-01-13 19:57:08.911715] I
> [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to
> 192.168.8.104, op_ret: 0, op_errno: 0, ret: 0
> [2011-01-13 19:57:11.956152] E [socket.c:1656:socket_connect_finish]
> management: connection to 192.168.8.105:24007 failed (Connection refused)
>
>
> log virt-zabbix-03:
> [2011-01-13 19:57:08.913897] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.104 found.. state: 0
> [2011-01-13 19:57:08.915052] I
> [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp
> from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941, host: 192.168.8.104
> [2011-01-13 19:57:08.915085] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:57:08.915100] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.104 found.. state: 0
> pending frames:
>
> patchset: v3.1.1
> signal received: 11
> time of crash: 2011-01-13 19:57:08
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.1.1
> /lib64/libc.so.6(+0x329e0)[0x7fe84e6ee9e0]
> /usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7fe84f05b06c]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7fe84f060878]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7fe84cfb63be]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7fe84cfbaf3b]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7fe84cfa3b17]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7fe84cfa3675]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7fe84cfbe1f5]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7fe84f05fa94]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7fe84f05fcd8]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7fe84f05af2e]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7fe84cd74f9f]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7fe84cd750d4]
> /usr/lib64/libglusterfs.so.0(+0x3a384)[0x7fe84f2a1384]
> /usr/sbin/glusterd(main+0x23c)[0x4055dc]
> /lib64/libc.so.6(__libc_start_main+0xe6)[0x7fe84e6dabc6]
> /usr/sbin/glusterd[0x4032c9]
> ---------
>
>
> Starting the glusterfsd on virt-zabbix-03 again makes the glusterfsd on
> virt-zabbix-02 die, and so on,
> so I made sure the daemon was stopped on both hosts.
> The peer files generated on the two nodes differ: one is named after the
> hostname, the other after the IP:
> virt-zabbix-02:#  cat /etc/glusterd/peers/virt-zabbix-03
> uuid=
> state=0
> hostname1=virt-zabbix-03
>
> virt-zabbix-03:# cat /etc/glusterd/peers/192.168.8.104
> uuid=
> state=0
> hostname1=192.168.8.104
>
>
> I see that the uuid is empty in both files, so I fill it in with the uuid
> from the other node's "/etc/glusterd/glusterd.info" file:
> virt-zabbix-02:/ # cat /etc/glusterd/glusterd.info
> UUID=a9b660c5-456d-4e96-9bdd-d23c917ae941
> virt-zabbix-03:/ # cat etc/glusterd/glusterd.info
> UUID=255540da-4b86-46f2-963c-3214e2c5e28a
>
> virt-zabbix-02:/ # cat /etc/glusterd/peers/virt-zabbix-03
> uuid=255540da-4b86-46f2-963c-3214e2c5e28a
> state=0
> hostname1=virt-zabbix-03
>
> virt-zabbix-03:/ # cat /etc/glusterd/peers/192.168.8.104
> uuid=a9b660c5-456d-4e96-9bdd-d23c917ae941
> state=0
> hostname1=192.168.8.104
>
>
> Now I start "glusterfsd" on both nodes again; both daemons keep running
> and I can run the command:
> virt-zabbix-02:/ # gluster peer status
> Number of Peers: 1
>
> Hostname: virt-zabbix-03
> Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> State: Establishing Connection (Connected)
>
> I'd like to create my first test volume:
> gluster volume create mytest transport tcp virt-zabbix-02:/gfs1
> virt-zabbix-03:/gfs1
> Creation of volume mytest has been unsuccessful
> Host virt-zabbix-03 not connected
>
> log virt-zabbix-02:
> [2011-01-13 20:11:10.706931] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
> [2011-01-13 20:12:20.950199] I
> [glusterd-handler.c:785:glusterd_handle_create_volume] glusterd: Received
> create volume req
> [2011-01-13 20:12:20.950907] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> [2011-01-13 20:12:20.950935] I
> [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend
> found.. state: Establishing Connection
> [2011-01-13 20:12:20.950950] E
> [glusterd-utils.c:2324:glusterd_new_brick_validate] glusterd: Host
> virt-zabbix-03 not connected
> [2011-01-13 20:12:20.951005] E
> [glusterd-handler.c:906:glusterd_handle_create_volume] glusterd: Unlock on
> opinfo failed
>
> no logfiles on virt-zabbix-03
>
> not connected? strange! status info again:
> virt-zabbix-02:/ # gluster peer status
> Number of Peers: 1
>
> Hostname: virt-zabbix-03
> Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> State: Establishing Connection (Connected)
>
> log virt-zabbix-02:
> [2011-01-13 20:13:24.601901] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
>
>
> So I restart the glusterfsd on virt-zabbix-03 and the daemon on
> virt-zabbix-02 dies again.
>
> Does anyone have any idea what's going wrong?
>
> kind regards

