[Gluster-users] Performance gluster 3.2.5 + QLogic Infiniband

Michael Mayer michael at mayer.cx
Thu Apr 26 17:08:45 UTC 2012


Hi Bryan,

I am copying the "gluster volume info all" output below.

Sorry about the confusion with IPoIB and RDMA.

I am using RDMA on main-volume (mount -t glusterfs 
cfdstorage01-ib:main-volume.rdma /cfd/data) and IPoIB on backup-volume 
(mount -t glusterfs cfdstorage01-ib:backup-volume /cfd/data), as gluster 
will not let me do two RDMA mounts at once.

The confusion probably arose because I initially set up the cluster 
with the Gigabit Ethernet host names as peers. When I used rdma in that 
setup I could not get much more than 100 MB/s. Only when I replaced the 
GbE host names with the IPoIB ones did performance increase to about 
400 to 500 MB/s (single-thread dd with 1 MB block size). With RDMA I am 
getting up to about 900 MB/s.
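
A comparable single-threaded sequential write with 1 MB blocks can be approximated in Python. This is only a minimal sketch, not the dd invocation used for the numbers above (which is not given here). The function name, the default sizes and the example path are hypothetical. Unlike a plain dd run, which does not sync by default, the timing below includes an fsync.

```python
import os
import time


def write_throughput_mb_s(path, total_mb=1024, block_mb=1):
    """Sequentially write total_mb MiB in block_mb MiB chunks; return MiB/s."""
    block = b"\0" * (block_mb * 1024 * 1024)
    n_blocks = total_mb // block_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the flush to storage in the timing
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (n_blocks * block_mb) / elapsed


# Hypothetical usage on a mounted volume (the path is an assumption):
# print(write_throughput_mb_s("/cfd/data/throughput_test.bin"))
```

Running it once against the IPoIB mount and once against the RDMA mount, with a file large enough to exceed client-side caching, would give figures that can be compared in the same way as the dd results.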

I hope that answers your questions.

Michael.



Volume Name: backup-volume
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp,rdma
Bricks:
Brick1: cfdstorage01-ib:/export/backup/0
Brick2: cfdstorage02-ib:/export/backup/0
Brick3: cfdstorage03-ib:/export/backup/0
Brick4: cfdstorage04-ib:/export/backup/0
Brick5: cfdstorage01-ib:/export/backup/1
Brick6: cfdstorage02-ib:/export/backup/1
Brick7: cfdstorage03-ib:/export/backup/1
Brick8: cfdstorage04-ib:/export/backup/1
Options Reconfigured:
performance.io-thread-count: 16
auth.allow: 10.*,192.*



Volume Name: main-volume
Type: Distribute
Status: Started
Number of Bricks: 8
Transport-type: tcp,rdma
Bricks:
Brick1: cfdstorage01-ib:/export/main/0
Brick2: cfdstorage02-ib:/export/main/0
Brick3: cfdstorage03-ib:/export/main/0
Brick4: cfdstorage04-ib:/export/main/0
Brick5: cfdstorage01-ib:/export/main/1
Brick6: cfdstorage02-ib:/export/main/1
Brick7: cfdstorage03-ib:/export/main/1
Brick8: cfdstorage04-ib:/export/main/1
Options Reconfigured:
performance.io-thread-count: 32
auth.allow: 10.*,192.*
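
The "4 x 2 = 8" brick count of backup-volume can be read as four replica pairs. The following is a minimal sketch, assuming the usual GlusterFS rule that consecutive bricks in the listed order form one replica set. The helper name replica_sets is hypothetical and not part of any gluster tool.

```python
def replica_sets(bricks, replica_count):
    """Group bricks, in listed order, into consecutive sets of replica_count."""
    if len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of replica_count")
    return [bricks[i:i + replica_count]
            for i in range(0, len(bricks), replica_count)]


# Brick list copied from the backup-volume output above.
backup_bricks = [
    "cfdstorage01-ib:/export/backup/0",
    "cfdstorage02-ib:/export/backup/0",
    "cfdstorage03-ib:/export/backup/0",
    "cfdstorage04-ib:/export/backup/0",
    "cfdstorage01-ib:/export/backup/1",
    "cfdstorage02-ib:/export/backup/1",
    "cfdstorage03-ib:/export/backup/1",
    "cfdstorage04-ib:/export/backup/1",
]

pairs = replica_sets(backup_bricks, 2)
for pair in pairs:
    print(" <-> ".join(pair))
```

Under that assumption every pair spans two different servers (cfdstorage01 with cfdstorage02, cfdstorage03 with cfdstorage04), so each file on backup-volume would have its two copies on separate hosts. main-volume is plain Distribute, so no such pairing applies there.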



On 04/25/2012 07:10 AM, Bryan Whitehead wrote:
> I'm confused: you said "everything works ok (IPoIB)" but later you
> state you are using RDMA. Can you post details of your setup? Maybe
> the output from gluster volume info <volumename>?
>
> On Sat, Apr 21, 2012 at 1:40 AM, Michael Mayer <michael at mayer.cx> wrote:
>> Hi all,
>>
>> Thanks for your suggestions.
>>
>> I think I have "solved" the performance issue now. I had a few too many
>> kernel patches included. I am back to the stock RHEL 5.8 kernel with stock
>> QLogic OFED and everything works ok (IPoIB). My original intent was to
>> explore cachefs on RHEL5 by building a 2.6.32 kernel, but while cachefs
>> worked a treat, performance for gluster was as bad as reported
>> previously. So I will go without cachefs for now and reintroduce it in
>> an OS upgrade later on.
>>
>> I even have a nicely working rdma setup now and, using that, performance
>> is consistently 900 MB/s or more.
>>
>> Since I have two volumes exported by the same bricks, it seems I can only get
>> one of them to use RDMA; the other then refuses to mount unless I mount it
>> without rdma. That is not a real problem for now, as
>> the second one is only used for backup purposes.
>>
>> Michael
>>
>> On 04/12/2012 01:13 AM, Fabricio Cannini wrote:
>>
>> Hi there
>>
>> The only time I set up a gluster "distributed scratch" like Michael is doing
>> (3.0.5 Debian packages), I too chose IPoIB, simply because I could not get
>> rdma working at all.
>> Time was short and IPoIB "just worked" well enough for our demand at the
>> time, so I didn't look into this issue. Plus, pinging and ssh'ing into a
>> node through the IB interface comes in handy when diagnosing and fixing
>> networking issues.
>>
>> On Wednesday, April 11, 2012, Sabuj Pattanayek <sabujp at gmail.com>
>> wrote:
>>> I wonder if it's possible to have both rdma and ipoib served by a
>>> single glusterfsd so I can test this? I guess so, since it's just a
>>> tcp mount?
>>>
>>> On Wed, Apr 11, 2012 at 1:43 PM, Harry Mangalam <harry.mangalam at uci.edu>
>>> wrote:
>>>> On Tuesday 10 April 2012 15:47:08 Bryan Whitehead wrote:
>>>>
>>>>> with my infiniband setup I found my performance was much better by
>>>>> setting up a TCP network over infiniband and then using pure tcp as
>>>>> the transport with my gluster volume. For the life of me I couldn't
>>>>> get rdma to beat tcp.
>>>> Thanks for that data point, Bryan.
>>>>
>>>> Very interesting. Is this a common experience? The RDMA experience has
>>>> not been a very smooth one for me, and doing everything with IPoIB would
>>>> save a lot of headaches, especially if it's also higher performance.
>>>>
>>>> hjm
>>>>
>>>> --
>>>>
>>>> Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
>>>>
>>>> [ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
>>>>
>>>> 415 South Circle View Dr, Irvine, CA, 92697 [shipping]
>>>>
>>>> MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
>>>>
>>>> --
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>



