[Gluster-users] rdma or tcp?

Anand Babu Periasamy ab at gluster.com
Mon Apr 4 21:59:19 UTC 2011

My recommendation is TCP. For most application needs, TCP is just
fine. On 1GigE, TCP/IP processing on the host CPU is hardly a
bottleneck. Network latency, disk latency and memory copy operations
are the usual bottlenecks. With scale-out storage, you will have
multiple 1GigE ports to parallelize across. Scale-out 10GigE is going
to be even better, and 10GigE cost has come down. I am always in favor
of ease of use and commodity components. Ethernet is getting closer to
InfiniBand's potential with its Data Center Bridging (lossless
Ethernet) specification. The TCP/IP stack is going to get thinner in
the future, and you will get close to RDMA-like efficiency with socket
emulation over RDMA. The RoCE specification brings RDMA to Ethernet
over DCB.

InfiniBand is great today if you can take RDMA all the way to your
client nodes. If your applications mostly talk RDMA, it gets even
better. This is what happens in supercomputing / HPC with scientific
computing applications. They mostly use MPI over RDMA, and all nodes
are connected via IB. InfiniBand is cheap, scalable and proven.
However, its TCP/IP over IB (IPoIB) performance sucks. Mellanox and
QLogic are the only two prominent vendors behind IB. I don't see IB
gaining wide adoption outside of the HPC community in the future.
Mellanox is moving towards unified Ethernet+IB NICs.

So, stick to Ethernet unless you are running MPI-based scientific
apps. 1GigE scale-out is good enough for moderate needs. 10GigE is
compelling, at least on the server side. Client nodes can still talk
1GigE and uplink to storage servers via 10GigE.
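
As a rough sketch of the arithmetic behind the 1GigE-client,
10GigE-server layout: the Python below estimates how many 1GigE
clients one 10GigE server port can feed at line rate, and what each
client gets once the port is oversubscribed. The function names are
made up for illustration, and the numbers are nominal link speeds
only; protocol overhead, disk throughput and replication traffic are
all ignored.

```python
# Back-of-envelope sketch using nominal link speeds only (assumption:
# no protocol overhead, no disk or replication limits).

def clients_per_uplink(server_gbps: float, client_gbps: float) -> int:
    """Clients that can run at full line rate before the server port saturates."""
    return int(server_gbps // client_gbps)


def per_client_share_gbps(server_gbps: float, client_gbps: float,
                          n_clients: int) -> float:
    """Bandwidth each client sees when n_clients hit one server port at once."""
    if n_clients <= 0:
        raise ValueError("n_clients must be positive")
    # A client can never exceed its own link speed.
    return min(client_gbps, server_gbps / n_clients)


if __name__ == "__main__":
    print(clients_per_uplink(10, 1))          # full-rate 1GigE clients per 10GigE port
    print(per_client_share_gbps(10, 1, 40))   # share per client at 4x oversubscription
```

With these nominal figures, a single 10GigE server port is not the
limit until more than ten 1GigE clients are pulling from it at full
speed at the same moment.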

For reference, a Dell 24-port 10GigE switch costs between 8k and 12k
USD, depending on copper vs. fiber. Arista and Cisco Nexus are the
high-end scalable switches.
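
To put that switch price in per-port terms, here is a small Python
sketch. It uses only the 8k to 12k USD range and the 24-port count
quoted above; the function name is hypothetical, and cables, optics
and NICs are not included.

```python
# Per-port cost from the quoted switch price range (switch only;
# cables, optics and NICs are extra and not modeled here).

def cost_per_port(switch_price_usd: float, ports: int) -> float:
    """Switch price divided evenly across its ports."""
    if ports <= 0:
        raise ValueError("ports must be positive")
    return switch_price_usd / ports


if __name__ == "__main__":
    low = cost_per_port(8000, 24)
    high = cost_per_port(12000, 24)
    print(f"{low:.0f} to {high:.0f} USD per port")
```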

Hope this helps.

On Mon, Apr 4, 2011 at 1:07 PM, isdtor <isdtor at gmail.com> wrote:
> Is there a document with some guidelines for setting up bricks with
> tcp or rdma transport?
> I'm looking at a new deployment where the storage cluster hosts
> connect via 10GigE, but clients are on 1GigE. Over time, there will be
> 10GigE clients, but the majority will remain on 1GigE. In this setup,
> should the storage bricks use tcp or rdma?
> If tcp is the better choice, and at some point in the future all
> clients are 10GigE, should the bricks be rebuilt with rdma?
> Are there any downsides to using rdma even in a purely 1GigE network?

Anand Babu Periasamy
Blog [http://www.unlocksmith.org]

Imagination is more important than knowledge --Albert Einstein
