[Gluster-users] Fedora 11 - 2.6.31 Kernel - Fuse 2.8.0 - Infiniband

Mickey Mazarick mic at digitaltadpole.com
Tue Sep 22 19:23:41 UTC 2009


I had some difficulty getting OFED 1.3 working on kernel 2.6.27 about 6 
months back.  It took some patching, but I did find that you need to 
have SRQ enabled for it to work. The ibv_srq_pingpong test app was a 
good test for whether it would work with Gluster or not.

I also had to upgrade the firmware on the Mellanox cards I have to 
enable SRQ (shared receive queue).
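
A rough way to check both points from the shell is sketched below. It 
assumes the libibverbs utilities (ibv_devinfo, ibv_srq_pingpong) are 
installed, and it assumes that an HCA without SRQ support reports 
"max_srq: 0" in the verbose ibv_devinfo output. The function name 
srq_supported is made up for this example, and mthca0 is simply the 
device name from the ibv_devices output quoted below.

```shell
# Hypothetical helper: reads `ibv_devinfo -v` output on stdin and succeeds
# only if at least one max_srq line is present and none of them is zero.
# Assumption: an HCA whose firmware lacks SRQ support reports max_srq as 0.
srq_supported() {
    awk '
        $1 == "max_srq:" { seen = 1; if ($2 + 0 == 0) bad = 1 }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Usage on a node with the IB stack loaded:
#   ibv_devinfo -v | srq_supported && echo "SRQ looks available"
#
# Then run the SRQ pingpong test between two nodes:
#   server$ ibv_srq_pingpong -d mthca0
#   client$ ibv_srq_pingpong -d mthca0 <server-ip>
```

If the pingpong test hangs or errors out between two nodes, I would not 
expect the ib-verbs transport to work between them either.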

-Mic

Nathan Stratton wrote:
>
> Hate to post again, but anyone have any ideas on this?
>
> -Nathan
>
> On Fri, 18 Sep 2009, Nathan Stratton wrote:
>
>>
>> Has anyone been able to get Infiniband working with 2.6.31 kernel and 
>> fuse 2.8.0? My config works fine on my Centos 2.6.18 box, so I know 
>> that is ok.
>>
>> Infiniband looks good:
>>
>> [root at xen1 src]# lsmod |grep ib
>> ib_ucm                 13752  0
>> ib_uverbs              32256  2 rdma_ucm,ib_ucm
>> ib_ipoib               68880  0
>> ib_mthca              123700  0
>>
>> [root at xen1 src]# ibv_devices
>>    device                 node GUID
>>    ------              ----------------
>>    mthca0              0005ad00000327e8
>>
>> Gluster looks like it starts OK, but I can't touch the mount and 
>> after a while it times out. Debug logs:
>>
>>
>> [2009-09-18 19:36:17] D [glusterfsd.c:354:_get_specfp] glusterfs: 
>> loading volume file /usr/local/etc/glusterfs/glusterfs.vol
>> ================================================================================ 
>>
>> Version      : glusterfs 2.0.6 built on Sep 18 2009 09:54:43
>> TLA Revision : v2.0.6
>> Starting Time: 2009-09-18 19:36:17
>> Command line : glusterfs -L DEBUG -l /var/log/glusterfs.log 
>> --disable-direct-io-mode /share
>> PID          : 8303
>> System name  : Linux
>> Nodename     : xen1.hou.blinkmind.com
>> Kernel Release : 2.6.31
>> Hardware Identifier: x86_64
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+ 
>>
>>  1: volume brick0
>>  2:  type protocol/client
>>  3:  option transport-type ib-verbs/client
>>  4:  option remote-host 172.16.0.200
>>  5:  option remote-port 6997
>>  6:  option transport.address-family inet/inet6
>>  7:  option remote-subvolume brick
>>  8: end-volume
>>  9:
>> 10: volume mirror0
>> 11:  type protocol/client
>> 12:  option transport-type ib-verbs/client
>> 13:  option remote-host 172.16.0.201
>> 14:  option remote-port 6997
>> 15:  option transport.address-family inet/inet6
>> 16:  option remote-subvolume brick
>> 17: end-volume
>> 18:
>> 19: volume brick1
>> 20:  type protocol/client
>> 21:  option transport-type ib-verbs/client
>> 22:  option remote-host 172.16.0.202
>> 23:  option remote-port 6997
>> 24:  option transport.address-family inet/inet6
>> 25:  option remote-subvolume brick
>> 26: end-volume
>> 27:
>> 28: volume mirror1
>> 29:  type protocol/client
>> 30:  option transport-type ib-verbs/client
>> 31:  option remote-host 172.16.0.203
>> 32:  option remote-port 6997
>> 33:  option transport.address-family inet/inet6
>> 34:  option remote-subvolume brick
>> 35: end-volume
>> 36:
>> 37: volume brick2
>> 38:  type protocol/client
>> 39:  option transport-type ib-verbs/client
>> 40:  option remote-host 172.16.0.204
>> 41:  option remote-port 6997
>> 42:  option transport.address-family inet/inet6
>> 43:  option remote-subvolume brick
>> 44: end-volume
>> 45:
>> 46: volume mirror2
>> 47:  type protocol/client
>> 48:  option transport-type ib-verbs/client
>> 49:  option remote-host 172.16.0.205
>> 50:  option remote-port 6997
>> 51:  option transport.address-family inet/inet6
>> 52:  option remote-subvolume brick
>> 53: end-volume
>> 54:
>> 55: volume block0
>> 56:  type cluster/replicate
>> 57:  subvolumes brick0 mirror0
>> 58: end-volume
>> 59:
>> 60: volume block1
>> 61:  type cluster/replicate
>> 62:  subvolumes brick1 mirror1
>> 63: end-volume
>> 64:
>> 65: volume block2
>> 66:  type cluster/replicate
>> 67:  subvolumes brick2 mirror2
>> 68: end-volume
>> 69:
>> 70: volume unify
>> 71:  type cluster/distribute
>> 72:  subvolumes block0 block1 block2
>> 73: end-volume
>> 74:
>>
>> +------------------------------------------------------------------------------+ 
>>
>> [2009-09-18 19:36:17] D [glusterfsd.c:1205:main] glusterfs: running 
>> in pid 8303
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick0: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick0: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror0: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror0: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick1: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick1: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror1: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror1: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick2: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick2: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> brick2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror2: 
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror2: 
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: 
>> attempt to load file 
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] 
>> mirror2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got 
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] N [glusterfsd.c:1224:main] glusterfs: 
>> Successfully started
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got 
>> GF_EVENT_CHILD_UP
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> brick2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] 
>> mirror2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] D [dht-common.c:820:dht_lookup] unify: no 
>> subvolume in layout for path=/, checking on all the subvols to see if 
>> it is a directory
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: 
>> lookup of / on block0 returned error (Transport endpoint is not 
>> connected)
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: 
>> lookup of / on block1 returned error (Transport endpoint is not 
>> connected)
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: 
>> lookup of / on block2 returned error (Transport endpoint is not 
>> connected)
>> [2009-09-18 20:06:18] D [fuse-bridge.c:2385:fuse_root_lookup_cbk] 
>> fuse: first lookup on root failed.
>> [2009-09-18 20:06:18] W [fuse-bridge.c:1841:fuse_statfs_cbk] 
>> glusterfs-fuse: 2: ERR => -1 (Transport endpoint is not connected)
>>
>>
>>
>>> <>
>> Nathan Stratton                                CTO, BlinkMind, Inc.
>> nathan at robotics.net                         nathan at blinkmind.com
>> http://www.robotics.net                        http://www.blinkmind.com
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>



