[Gluster-users] 3.3qa3 - rdma failing to start - how do I completely clear config?

Iain Buchanan iainbuc at gmail.com
Tue Jun 4 07:42:59 UTC 2013


I've been testing GlusterFS (3.3qa3) and have managed to get into a situation where it won't start up. I've tried removing and reinstalling it (removing the package, manually wiping the /var/lib/glusterd and /var/log/glusterfs folders, then reinstalling the package).  With no volumes set up, I tried to set up a peer, which failed with "unknown errno 107".  After restarting glusterfs-server I see a steady stream of events in the log:

First there is a load of messages like this:

[2013-06-04 07:31:22.644256] E [rdma.c:4658:gf_rdma_event_handler] 0-rpc-transport/rdma: rdma.management: pollin received on tcp socket (peer: 127.0.0.1:61) after handshake is complete
[2013-06-04 07:31:22.644314] W [rdma.c:4521:gf_rdma_handshake_pollerr] (-->/usr/sbin/glusterd(main+0x35a) [0x7f5ec03dc47a] (-->/usr/lib/libglusterfs.so.0(+0x3c0b7) [0x7f5ebff760b7] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(+0x5140) [0x7f5ebb685140]))) 0-rpc-transport/rdma: rdma.management: peer (127.0.0.1:61) disconnected, cleaning up

Then it stabilises with these messages:

3dc47a] (-->/usr/lib/libglusterfs.so.0(+0x3c0b7) [0x7f5ebff760b7] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(+0x5140) [0x7f5ebb685140]))) 0-rpc-transport/rdma: rdma.management: peer (127.0.0.1:52511) disconnected, cleaning up
[2013-06-04 07:32:35.015360] E [rpcsvc.c:491:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
[2013-06-04 07:32:35.015384] W [rdma.c:3216:gf_rdma_pollin_notify] 0-rpc-transport/rdma: transport_notify failed
[2013-06-04 07:32:35.015396] W [rdma.c:3331:gf_rdma_recv_request] 0-rpc-transport/rdma: pollin notification failed
[2013-06-04 07:32:35.015411] W [rdma.c:3411:gf_rdma_process_recv] 0-rpc-transport/rdma: receiving a request from peer (192.168.0.62:54064) failed
[2013-06-04 07:32:35.015441] W [rdma.c:4187:gf_rdma_disconnect] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f5ebf8f8e9a] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(+0xc05e) [0x7f5ebb68c05e] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(gf_rdma_process_recv+0xef) [0x7f5ebb68ba8f]))) 0-rdma.management: disconnect called (peer:192.168.0.62:54064)
[2013-06-04 07:32:35.015525] W [rdma.c:4521:gf_rdma_handshake_pollerr] (-->/usr/sbin/glusterd(main+0x35a) [0x7f5ec03dc47a] (-->/usr/lib/libglusterfs.so.0(+0x3c0b7) [0x7f5ebff760b7] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(+0x5140) [0x7f5ebb685140]))) 0-rpc-transport/rdma: rdma.management: peer (192.168.0.62:54064) disconnected, cleaning up
[2013-06-04 07:32:35.049460] E [rpcsvc.c:491:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request
[2013-06-04 07:32:35.049479] W [rdma.c:3216:gf_rdma_pollin_notify] 0-rpc-transport/rdma: transport_notify failed
[2013-06-04 07:32:35.049490] W [rdma.c:3331:gf_rdma_recv_request] 0-rpc-transport/rdma: pollin notification failed
[2013-06-04 07:32:35.049500] W [rdma.c:3411:gf_rdma_process_recv] 0-rpc-transport/rdma: receiving a request from peer (192.168.0.62:54065) failed
[2013-06-04 07:32:35.049534] W [rdma.c:4187:gf_rdma_disconnect] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f5ebf8f8e9a] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(+0xc05e) [0x7f5ebb68c05e] (-->/usr/lib/glusterfs/3.3git/rpc-transport/rdma.so(gf_rdma_process_recv+0xef) [0x7f5ebb68ba8f]))) 0-rdma.management: disconnect called (peer:192.168.0.62:54065)

(192.168.0.62 is the machine I attempted to peer it with.)
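
Since the same handful of messages repeat, it may help to tally them rather than read the raw stream. The following is a minimal Python sketch, assuming only the line layout visible in the excerpts above ("[timestamp] LEVEL [file:line:function] ..."); the function name summarize and the idea of passing the log path on the command line are my own assumptions, not part of GlusterFS.

```python
import re
import sys
from collections import Counter

# Header layout as seen in the excerpts above:
# [timestamp] LEVEL [file.c:line:function] rest-of-message
LOG_HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)

# Peer addresses appear as "peer (ip:port)", "(peer:ip:port)" or "(peer: ip:port)".
PEER = re.compile(r"peer:?\s*\(?(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def summarize(lines):
    """Return (count per (level, function), count per peer IP)."""
    by_func = Counter()
    by_peer = Counter()
    for line in lines:
        header = LOG_HEADER.match(line)
        if header:
            by_func[(header.group("level"), header.group("func"))] += 1
        # Peers are counted on any line, even one whose header is missing.
        peer = PEER.search(line)
        if peer:
            by_peer[peer.group(1)] += 1
    return by_func, by_peer


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log file path is supplied by the caller; none is assumed here.
    with open(sys.argv[1], errors="replace") as fh:
        funcs, peers = summarize(fh)
    for (level, func), n in funcs.most_common():
        print(f"{n:6d}  {level}  {func}")
    for ip, n in peers.most_common():
        print(f"{n:6d}  peer {ip}")
```

Run against the glusterd log, this would show at a glance whether the errors come mostly from 127.0.0.1 or from 192.168.0.62, and which functions are reporting them.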

Is there something else that needs to be cleared for a reinstall?  (I've not set up a volume at this point, so I'm guessing old file attributes are not an issue.)  The last thing I did before reinstalling was to attempt to add a brick to a test volume over rdma, with replica set to the number of bricks. I was hoping it would copy the data across, but it pretty much locked up the machine - I'll retry with less data next time.
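
To rule out an incomplete wipe, here is a minimal Python sketch that lists whatever is still present under the two folders mentioned above. It only checks those two paths; whether GlusterFS keeps state anywhere else is exactly what I don't know, and the helper name leftovers is made up for illustration.

```python
from pathlib import Path

# The two folders wiped by hand before reinstalling (taken from the text above).
STATE_DIRS = ["/var/lib/glusterd", "/var/log/glusterfs"]


def leftovers(dirs):
    """Map each directory to a sorted list of entries still under it.

    A value of None means the directory itself does not exist.
    """
    found = {}
    for d in dirs:
        root = Path(d)
        if root.is_dir():
            found[d] = sorted(str(p.relative_to(root)) for p in root.rglob("*"))
        else:
            found[d] = None
    return found


if __name__ == "__main__":
    for d, entries in leftovers(STATE_DIRS).items():
        if entries is None:
            print(f"{d}: absent")
        elif not entries:
            print(f"{d}: empty")
        else:
            print(f"{d}: {len(entries)} entries remain")
            for e in entries:
                print(f"    {e}")
```

Running it after removing the package and before reinstalling would confirm that both folders really are gone or empty. Reading them may need root.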

Iain

