[Gluster-users] glusterfs 3.1.1 rdma module crashing when mounting volume

Jeremy Stout stout.jeremy at gmail.com
Fri Dec 31 18:44:58 UTC 2010


This looks like the same issue I experienced earlier in the month. I
would suggest rebuilding GlusterFS from source using the patch that
was posted here:
http://gluster.org/pipermail/gluster-users/2010-December/006141.html

The patch resolved the queue creation issues I was experiencing.
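
To check whether a client log shows the same failure chain before rebuilding, a small sketch like the one below can pull the error-level entries logged from rdma.c out of the log. It assumes each log entry sits on a single line in the file (the entries quoted below are wrapped by the mail client), and the helper name rdma_errors and the SAMPLE lines are only illustrative; SAMPLE is copied from the quoted log with the wrapping undone.

```python
import re

# Assumed entry layout, based on the log lines quoted in this thread:
# [timestamp] LEVEL [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def rdma_errors(lines):
    """Return (source, message) pairs for E-level entries logged from rdma.c."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "E" and m.group("src").startswith("rdma.c:"):
            hits.append((m.group("src"), m.group("msg")))
    return hits


# Entries taken from the quoted log, rejoined onto single lines.
SAMPLE = [
    "[2010-12-24 22:45:11.516902] W [io-stats.c:1644:init] test-volume: "
    "dangling volume. check volfile",
    "[2010-12-24 22:45:11.527333] E [rdma.c:2066:rdma_create_cq] "
    "rpc-transport/rdma: test-volume-client-1: creation of send_cq failed",
    "[2010-12-24 22:45:11.527529] E [rdma.c:3771:rdma_get_device] "
    "rpc-transport/rdma: test-volume-client-1: could not create CQ",
]

if __name__ == "__main__":
    for src, msg in rdma_errors(SAMPLE):
        print(src, "->", msg)
```

To run it against a real log, pass the open log file (or its lines) to rdma_errors in place of SAMPLE. Entries from rdma_create_cq and rdma_get_device like the ones above would suggest the same CQ creation failure.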

Jeremy Stout

On Fri, Dec 31, 2010 at 1:02 PM, Joerg Blank <j.blank at fz-juelich.de> wrote:
> Hi all,
>
> We have a small HPC cluster and I tried to harness the spare disk space
> of our compute nodes to take some load off the cluster's NFS server.
>
> I started off using glusterfs 3.0.x as packaged with Debian Squeeze (all
> nodes use this version).
>
> I tried updating to glusterfs 3.1.x using the prepackaged files [1]
> from gluster.org, but found out I was no longer able to use the
> Infiniband interconnect, because the packages seem to be compiled
> without rdma support.
>
> To get the faster interconnect back I repackaged glusterfs 3.1.1 from
> the source tarball and installed it on all nodes. However, rdma crashes
> when mounting a volume on the head node [2], although it works fine from
> the compute nodes. The only significant difference with respect to
> InfiniBand is that the head node uses a different HCA:
>
> Work Nodes:
> 02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0
> 5GT/s - IB QDR / 10GigE] (rev b0)
>
> Head Node:
> 06:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx
> HCA] (rev 20)
>
>
> If anyone has an idea how to get this working, please let me know.
>
> Regards,
>
> Jörg Blank
>
>
> [1] http://download.gluster.com/pub/gluster/glusterfs/3.1/LATEST/Debian/
>
> [2] Backtrace from logs:
>
> [2010-12-24 22:45:11.516902] W [io-stats.c:1644:init] test-volume:
> dangling volume. check volfile
> [2010-12-24 22:45:11.516943] W [dict.c:1204:data_to_str] dict: @data=(nil)
> [2010-12-24 22:45:11.516955] W [dict.c:1204:data_to_str] dict: @data=(nil)
> [2010-12-24 22:45:11.527333] E [rdma.c:2066:rdma_create_cq]
> rpc-transport/rdma: test-volume-client-1: creation of send_cq failed
> [2010-12-24 22:45:11.527529] E [rdma.c:3771:rdma_get_device]
> rpc-transport/rdma: test-volume-client-1: could not create CQ
> [2010-12-24 22:45:11.527541] E [rdma.c:3957:rdma_init]
> rpc-transport/rdma: could not create rdma device for mthca0
> [2010-12-24 22:45:11.527611] E [rdma.c:4789:init] test-volume-client-1:
> Failed to initialize IB Device
> [2010-12-24 22:45:11.527623] E [rpc-transport.c:971:rpc_transport_load]
> rpc-transport: 'rdma' initialization failed
> pending frames:
>
> patchset: v3.1.1
> signal received: 11
> time of crash: 2010-12-24 22:45:11
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.1.1
> /lib/libc.so.6(+0x321e0)[0x7f8e2c3ea1e0]
> /lib/libc.so.6(+0x7a126)[0x7f8e2c432126]
> /usr/lib/glusterfs/3.1.1/rpc-transport/rdma.so(init+0x37c)[0x7f8e28956e7c]
> /usr/lib/libgfrpc.so.0(rpc_transport_load+0x365)[0x7f8e2cd5a035]
> /usr/lib/libgfrpc.so.0(rpc_clnt_new+0xf9)[0x7f8e2cd5de59]
> /usr/lib/glusterfs/3.1.1/xlator/protocol/client.so(client_init_rpc+0xa9)[0x7f8e29c32b09]
> /usr/lib/glusterfs/3.1.1/xlator/protocol/client.so(init+0xf1)[0x7f8e29c32cb1]
> /usr/lib/libglusterfs.so.0(xlator_init+0x58)[0x7f8e2cf7c978]
> /usr/lib/libglusterfs.so.0(glusterfs_graph_init+0x35)[0x7f8e2cfa5b05]
> /usr/lib/libglusterfs.so.0(glusterfs_graph_activate+0x38)[0x7f8e2cfa5c48]
> /usr/sbin/glusterfs(glusterfs_process_volfp+0xba)[0x40447a]
> /usr/sbin/glusterfs(mgmt_getspec_cbk+0xc7)[0x405cc7]
> /usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f8e2cd5cb75]
> /usr/lib/libgfrpc.so.0(rpc_clnt_notify+0xc9)[0x7f8e2cd5cdc9]
> /usr/lib/libgfrpc.so.0(rpc_transport_notify+0x2d)[0x7f8e2cd57d7d]
> /usr/lib/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f8e2a870c94]
> /usr/lib/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0xb3)[0x7f8e2a870d63]
> /usr/lib/libglusterfs.so.0(+0x3a272)[0x7f8e2cf9d272]
> /usr/sbin/glusterfs(main+0x247)[0x4054c7]
> /lib/libc.so.6(__libc_start_main+0xfd)[0x7f8e2c3d6c4d]
> /usr/sbin/glusterfs[0x403179]
>
>


