[Gluster-users] Fedora 11 - 2.6.31 Kernel - Fuse 2.8.0 - Infiniband
Mickey Mazarick
mic at digitaltadpole.com
Tue Sep 22 19:23:41 UTC 2009
I had some difficulty getting OFED 1.3 working on kernel 2.6.27 about 6
months back. It took some patching, but I did find that you need to
have SRQ enabled for it to work. The ibv_srq_pingpong test app was a
good test for whether it would work with Gluster or not.
I also had to upgrade the firmware on the Mellanox cards I have to
enable SRQ (shared receive queue).
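
A rough way to check SRQ support before involving Gluster at all is
sketched below in Python. It is only an illustration under stated
assumptions: that `ibv_devinfo -v` prints one `hca_id:` line per adapter
followed by a `max_srq:` line, and that a zero or missing value means the
driver/firmware combination does not expose SRQ. The function names
(read_devinfo, srq_limits, srq_supported) are made up for this example.

```python
import re
import subprocess


def read_devinfo():
    """Return the verbose output of ibv_devinfo (needs the libibverbs utilities)."""
    return subprocess.run(
        ["ibv_devinfo", "-v"], capture_output=True, text=True, check=True
    ).stdout


def srq_limits(devinfo_text):
    """Map each HCA name to its reported max_srq value.

    Assumes 'hca_id: <name>' and 'max_srq: <n>' lines; a device with no
    max_srq line is recorded as None.
    """
    limits = {}
    current = None
    for line in devinfo_text.splitlines():
        m = re.match(r"\s*hca_id:\s*(\S+)", line)
        if m:
            current = m.group(1)
            limits[current] = None
            continue
        # The colon right after 'max_srq' keeps max_srq_wr / max_srq_sge out.
        m = re.match(r"\s*max_srq:\s*(\d+)", line)
        if m and current is not None:
            limits[current] = int(m.group(1))
    return limits


def srq_supported(limits):
    """True only if at least one device was found and all report max_srq > 0."""
    return bool(limits) and all(v is not None and v > 0 for v in limits.values())
```

On a host with the libibverbs utilities installed,
srq_supported(srq_limits(read_devinfo())) returning False would be a hint
to look at the firmware first. Running ibv_srq_pingpong between two nodes
is still the more direct test.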
-Mic
Nathan Stratton wrote:
>
> Hate to post again, but anyone have any ideas on this?
>
> -Nathan
>
> On Fri, 18 Sep 2009, Nathan Stratton wrote:
>
>>
>> Has anyone been able to get Infiniband working with 2.6.31 kernel and
>> fuse 2.8.0? My config works fine on my CentOS 2.6.18 box, so I know
>> that is ok.
>>
>> Infiniband looks good:
>>
>> [root@xen1 src]# lsmod | grep ib
>> ib_ucm 13752 0
>> ib_uverbs 32256 2 rdma_ucm,ib_ucm
>> ib_ipoib 68880 0
>> ib_mthca 123700 0
>>
>> [root@xen1 src]# ibv_devices
>> device node GUID
>> ------ ----------------
>> mthca0 0005ad00000327e8
>>
>> Gluster looks like it starts OK, but I can't touch the mount and
>> after a while it times out. Debug logs:
>>
>>
>> [2009-09-18 19:36:17] D [glusterfsd.c:354:_get_specfp] glusterfs:
>> loading volume file /usr/local/etc/glusterfs/glusterfs.vol
>> ================================================================================
>>
>> Version : glusterfs 2.0.6 built on Sep 18 2009 09:54:43
>> TLA Revision : v2.0.6
>> Starting Time: 2009-09-18 19:36:17
>> Command line : glusterfs -L DEBUG -l /var/log/glusterfs.log
>> --disable-direct-io-mode /share
>> PID : 8303
>> System name : Linux
>> Nodename : xen1.hou.blinkmind.com
>> Kernel Release : 2.6.31
>> Hardware Identifier: x86_64
>>
>> Given volfile:
>> +------------------------------------------------------------------------------+
>>
>> 1: volume brick0
>> 2: type protocol/client
>> 3: option transport-type ib-verbs/client
>> 4: option remote-host 172.16.0.200
>> 5: option remote-port 6997
>> 6: option transport.address-family inet/inet6
>> 7: option remote-subvolume brick
>> 8: end-volume
>> 9:
>> 10: volume mirror0
>> 11: type protocol/client
>> 12: option transport-type ib-verbs/client
>> 13: option remote-host 172.16.0.201
>> 14: option remote-port 6997
>> 15: option transport.address-family inet/inet6
>> 16: option remote-subvolume brick
>> 17: end-volume
>> 18:
>> 19: volume brick1
>> 20: type protocol/client
>> 21: option transport-type ib-verbs/client
>> 22: option remote-host 172.16.0.202
>> 23: option remote-port 6997
>> 24: option transport.address-family inet/inet6
>> 25: option remote-subvolume brick
>> 26: end-volume
>> 27:
>> 28: volume mirror1
>> 29: type protocol/client
>> 30: option transport-type ib-verbs/client
>> 31: option remote-host 172.16.0.203
>> 32: option remote-port 6997
>> 33: option transport.address-family inet/inet6
>> 34: option remote-subvolume brick
>> 35: end-volume
>> 36:
>> 37: volume brick2
>> 38: type protocol/client
>> 39: option transport-type ib-verbs/client
>> 40: option remote-host 172.16.0.204
>> 41: option remote-port 6997
>> 42: option transport.address-family inet/inet6
>> 43: option remote-subvolume brick
>> 44: end-volume
>> 45:
>> 46: volume mirror2
>> 47: type protocol/client
>> 48: option transport-type ib-verbs/client
>> 49: option remote-host 172.16.0.205
>> 50: option remote-port 6997
>> 51: option transport.address-family inet/inet6
>> 52: option remote-subvolume brick
>> 53: end-volume
>> 54:
>> 55: volume block0
>> 56: type cluster/replicate
>> 57: subvolumes brick0 mirror0
>> 58: end-volume
>> 59:
>> 60: volume block1
>> 61: type cluster/replicate
>> 62: subvolumes brick1 mirror1
>> 63: end-volume
>> 64:
>> 65: volume block2
>> 66: type cluster/replicate
>> 67: subvolumes brick2 mirror2
>> 68: end-volume
>> 69:
>> 70: volume unify
>> 71: type cluster/distribute
>> 72: subvolumes block0 block1 block2
>> 73: end-volume
>> 74:
>>
>> +------------------------------------------------------------------------------+
>>
>> [2009-09-18 19:36:17] D [glusterfsd.c:1205:main] glusterfs: running
>> in pid 8303
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick0:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick0:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror0:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror0:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror0: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick1:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick1:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror1:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror1:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror1: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick2:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick2:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> brick2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror2:
>> defaulting frame-timeout to 30mins
>> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror2:
>> defaulting ping-timeout to 10
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport:
>> attempt to load file
>> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so
>> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate]
>> mirror2: no range check required for 'option remote-port 6997'
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got
>> GF_EVENT_PARENT_UP, attempting connect on transport
>> [2009-09-18 19:36:17] N [glusterfsd.c:1224:main] glusterfs:
>> Successfully started
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got
>> GF_EVENT_CHILD_UP
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror0: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror1: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> brick2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
>> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk]
>> mirror2: setvolume failed (Transport endpoint is not connected)
>> [2009-09-18 20:06:18] D [dht-common.c:820:dht_lookup] unify: no
>> subvolume in layout for path=/, checking on all the subvols to see if
>> it is a directory
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify:
>> lookup of / on block0 returned error (Transport endpoint is not
>> connected)
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify:
>> lookup of / on block1 returned error (Transport endpoint is not
>> connected)
>> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify:
>> lookup of / on block2 returned error (Transport endpoint is not
>> connected)
>> [2009-09-18 20:06:18] D [fuse-bridge.c:2385:fuse_root_lookup_cbk]
>> fuse: first lookup on root failed.
>> [2009-09-18 20:06:18] W [fuse-bridge.c:1841:fuse_statfs_cbk]
>> glusterfs-fuse: 2: ERR => -1 (Transport endpoint is not connected)
>>
>>
>>
>> Nathan Stratton CTO, BlinkMind, Inc.
>> nathan at robotics.net nathan at blinkmind.com
>> http://www.robotics.net http://www.blinkmind.com
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>