[Gluster-users] Performance gluster 3.2.5 + QLogic Infiniband

Bryan Whitehead driver at megahappy.net
Sun May 6 02:40:49 UTC 2012


> I am copying the "gluster volume info all" output below.
>
> Sorry about the confusion with IPoIB and RDMA.
>
> I am using RDMA on the main-volume (mount -t glusterfs
> cfdstorage01-ib:main-volume.rdma /cfd/data) and IPoIB on backup volume
> (mount -t glusterfs cfdstorage01-ib:backup-volume /cfd/data)  as gluster
> will not let me do two RDMA mounts at once.
>
> The confusion probably arose because I was initially setting up the cluster
> with the Gigabit Ethernet host names as peers. When I used rdma in that
> setup I could not get much more than 100 MB/s. Only when I replaced the GbE
> host names with the IPoIB ones did performance increase to about 400-500
> MB/s (single-threaded dd with a 1 MB block size). With RDMA I am getting up
> to about 900 MB/s.

Interesting. So are you using an IPoIB hostname (a hostname resolving to an
IP on the IPoIB card) for the peer probe <hostname>, but then specifying
RDMA as the transport?

When I did that, my performance was pretty lackluster and unchanged from
using the regular hostnames on the gig card. I basically turned rdma off
completely to get better performance (using IPs from the IPoIB network
cards).
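
In other words, something like this? (The create command below is just my
guess at how you might have built the volume; the hostnames, brick paths,
transport type and mount line are taken from your volume info and earlier
mail.)

    # peers probed by their IPoIB hostnames, not the GbE ones
    gluster peer probe cfdstorage02-ib
    gluster peer probe cfdstorage03-ib
    gluster peer probe cfdstorage04-ib

    # volume created with both transports enabled
    gluster volume create main-volume transport tcp,rdma \
        cfdstorage01-ib:/export/main/0 cfdstorage02-ib:/export/main/0 \
        cfdstorage03-ib:/export/main/0 cfdstorage04-ib:/export/main/0 \
        cfdstorage01-ib:/export/main/1 cfdstorage02-ib:/export/main/1 \
        cfdstorage03-ib:/export/main/1 cfdstorage04-ib:/export/main/1
    gluster volume start main-volume

    # client mount selecting the rdma transport via the .rdma suffix
    mount -t glusterfs cfdstorage01-ib:main-volume.rdma /cfd/data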

> Hope that answers your questions?

How does the 900 MB/s compare with your raw I/O done directly (local
filesystem vs gluster)?
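
For comparison, something like the following would give a baseline (the
count is arbitrary and /export/main/0 is just one of your brick
filesystems; adjust to taste):

    # raw write to the local brick filesystem
    dd if=/dev/zero of=/export/main/0/ddtest bs=1M count=4096 conv=fdatasync

    # the same write through the gluster mount
    dd if=/dev/zero of=/cfd/data/ddtest bs=1M count=4096 conv=fdatasync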

> Michael.
>
> Volume Name: backup-volume
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp,rdma
> Bricks:
> Brick1: cfdstorage01-ib:/export/backup/0
> Brick2: cfdstorage02-ib:/export/backup/0
> Brick3: cfdstorage03-ib:/export/backup/0
> Brick4: cfdstorage04-ib:/export/backup/0
> Brick5: cfdstorage01-ib:/export/backup/1
> Brick6: cfdstorage02-ib:/export/backup/1
> Brick7: cfdstorage03-ib:/export/backup/1
> Brick8: cfdstorage04-ib:/export/backup/1
> Options Reconfigured:
> performance.io-thread-count: 16
> auth.allow: 10.*,192.*
>
> Volume Name: main-volume
> Type: Distribute
> Status: Started
> Number of Bricks: 8
> Transport-type: tcp,rdma
> Bricks:
> Brick1: cfdstorage01-ib:/export/main/0
> Brick2: cfdstorage02-ib:/export/main/0
> Brick3: cfdstorage03-ib:/export/main/0
> Brick4: cfdstorage04-ib:/export/main/0
> Brick5: cfdstorage01-ib:/export/main/1
> Brick6: cfdstorage02-ib:/export/main/1
> Brick7: cfdstorage03-ib:/export/main/1
> Brick8: cfdstorage04-ib:/export/main/1
> Options Reconfigured:
> performance.io-thread-count: 32
> auth.allow: 10.*,192.*
>
> On 04/25/2012 07:10 AM, Bryan Whitehead wrote:
>>
>> I'm confused: you said "everything works ok (IPoIB)" but later you
>> state you are using RDMA? Can you post details of your setup? Maybe
>> the output from gluster volume info <volumename>?
>>
>> On Sat, Apr 21, 2012 at 1:40 AM, Michael Mayer <michael at mayer.cx> wrote:
>>>
>>> Hi all,
>>>
>>> thanks for your suggestions,
>>>
>>> I think I have "solved" the performance issue now. I had a few too many
>>> kernel patches included. I am back to the stock RHEL 5.8 kernel with the
>>> stock QLogic OFED and everything works ok (IPoIB). My original intent was
>>> to explore cachefs on RHEL 5 by building a 2.6.32 kernel, but while
>>> cachefs worked like a treat, gluster performance was as bad as reported
>>> previously, so I will go without cachefs for now and reintroduce it in an
>>> OS upgrade later on.
>>>
>>> I even have a nicely working RDMA setup now and, using that, performance
>>> is consistently 900 MB/s and above.
>>>
>>> Since I have two volumes exported by the same bricks, it seems I can only
>>> get one of them to use RDMA; the other will then refuse to mount unless I
>>> leave rdma off for it. That is not a real problem for now, as the second
>>> volume is only used for backup purposes.
>>>
>>> Michael.
>>>
>>> On 04/12/2012 01:13 AM, Fabricio Cannini wrote:
>>>
>>> Hi there
>>>
>>> The only time I set up a gluster "distributed scratch" like Michael is
>>> doing (3.0.5 Debian packages), I too chose IPoIB, simply because I could
>>> not get rdma working at all.
>>> Time was short and IPoIB "just worked" well enough for our demand at the
>>> time, so I didn't look into the issue. Plus, pinging and ssh'ing into a
>>> node through the IB interface comes in handy when diagnosing and fixing
>>> networking issues.
>>>
>>> On Wednesday, April 11, 2012, Sabuj Pattanayek <sabujp at gmail.com>
>>> wrote:
>>>>
>>>> I wonder if it's possible to have both rdma and ipoib served by a
>>>> single glusterfsd so I can test this? I guess so, since it's just a
>>>> tcp mount?
>>>>
>>>> On Wed, Apr 11, 2012 at 1:43 PM, Harry Mangalam <harry.mangalam at uci.edu>
>>>> wrote:
>>>>>
>>>>> On Tuesday 10 April 2012 15:47:08 Bryan Whitehead wrote:
>>>>>
>>>>>> With my InfiniBand setup I found my performance was much better by
>>>>>> setting up a TCP network over InfiniBand and then using pure tcp as
>>>>>> the transport with my gluster volume. For the life of me I couldn't
>>>>>> get rdma to beat tcp.
>>>>>
>>>>> Thanks for that data point, Bryan.
>>>>>
>>>>> Very interesting. Is this a common experience? The RDMA experience has
>>>>> not been a very smooth one for me, and doing everything with IPoIB
>>>>> would save a lot of headaches, especially if it's also higher
>>>>> performance.
>>>>>
>>>>> hjm
>>>>>
>>>>> --
>>>>>
>>>>> Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
>>>>>
>>>>> [ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
>>>>>
>>>>> 415 South Circle View Dr, Irvine, CA, 92697 [shipping]
>>>>>
>>>>> MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
>>>>>
>>>>> --
>>>>>
>>>>>
>



More information about the Gluster-users mailing list