[Gluster-users] RDMA Failure with 3.1.3
scandic
scandic at rootblog.org
Wed Apr 6 20:31:16 UTC 2011
I just installed GlusterFS 3.1.3 on Arch Linux x86_64 with Kernel 2.6.28.2 and libibverbs 1.1.4. It's a 2 Node replicated setup over rdma:
gluster> peer status
Number of Peers: 1
Hostname: easyStor-252
Uuid: 6766b083-8a12-41a8-984e-dd3ca2f1c536
State: Peer in Cluster (Connected)
gluster>
gluster> volume info all
Volume Name: Storage-1
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: rdma
Bricks:
Brick1: easyStor-251:/var/export
Brick2: easyStor-252:/var/export
Options Reconfigured:
auth.allow: 10.1.*
gluster>
When I try to mount using "mount -t glusterfs easyStor-251:/Storage-1 /mnt" I get no output. When I try to chdir to /mnt or call df the process hangs.
Here's my logfile content:
[2011-04-06 21:20:50.678395] W [io-stats.c:1644:init] 0-Storage-1: dangling volume. check volfile
[2011-04-06 21:20:50.678516] W [dict.c:1205:data_to_str] 0-dict: @data=(nil)
[2011-04-06 21:20:50.678545] W [dict.c:1205:data_to_str] 0-dict: @data=(nil)
[2011-04-06 21:20:50.682292] C [rdma.c:3953:rdma_init] 0-rpc-transport/rdma: No IB devices found
[2011-04-06 21:20:50.682371] E [rdma.c:4826:init] 0-Storage-1-client-1: Failed to initialize IB Device
[2011-04-06 21:20:50.682397] E [rpc-transport.c:849:rpc_transport_load] 0-rpc-tr
ansport: 'rdma' initialization failed
pending frames:
patchset: v3.1.3
signal received: 6
So it seems it can't find the IB device. According to lsmod output the modules are loaded:
Module Size Used by
xprtrdma 38186 0
svcrdma 31379 0
sunrpc 190513 2 xprtrdma,svcrdma
rdma_ucm 12356 0
rdma_cm 29317 3 xprtrdma,svcrdma,rdma_ucm
iw_cm 7033 1 rdma_cm
ib_addr 4965 1 rdma_cm
ib_ucm 11398 0
ib_uverbs 27035 2 rdma_ucm,ib_ucm
ib_umad 10506 0
ib_ipoib 71040 0
ib_cm 30040 3 rdma_cm,ib_ucm,ib_ipoib
ib_sa 18440 4 rdma_ucm,rdma_cm,ib_ipoib,ib_cm
ipv6 277133 27 bridge,ib_addr,ib_ipoib
ib_mad 35558 4 ib_umad,ib_cm,ib_sa,ib_mthca
ib_core 44827 13 xprtrdma,svcrdma,rdma_ucm,rdma_cm,iw_cm,ib_ucm,ib_uverbs,ib_umad,ib_ipoib,ib_cm,ib_sa,ib_mthca,ib_mad
fuse 64671 3
And also dmesg tells me the HCA has been found:
ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)
ib_mthca: Initializing 0000:0a:00.0
ib_mthca 0000:0a:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
ib_mthca 0000:0a:00.0: setting latency timer to 64
ib_mthca 0000:0a:00.0: HCA FW version 1.0.800 is old (1.2.000 is current).
ib_mthca 0000:0a:00.0: If you have problems, try updating your HCA FW.
ib_mthca 0000:0a:00.0: irq 68 for MSI/MSI-X
ib_mthca 0000:0a:00.0: irq 69 for MSI/MSI-X
ib_mthca 0000:0a:00.0: irq 70 for MSI/MSI-X
There's no firewall active. Would be great if someone can lead me to the right direction.
More information about the Gluster-users
mailing list