[Gluster-devel] Replicate/AFR Using Broadcast/Multicast?

Beat Rubischon beat at 0x1b.ch
Wed Oct 13 12:22:41 UTC 2010


Hi Gordan!

Quoting <gordan at bobich.net> (13.10.10 10:06):

> What sort of a cluster are you running with that many nodes? RHCS?
> Heartbeat? Something else entirely? In what arrangement?

High performance clusters. The main target Gluster was made for :-)

>> Even the most expensive GigE switch chassis could be killed by 125+ MBytes
>> of traffic which is almost nothing :-)
> Sounds like a typical example of cost not being a good measure of
> quality and performance. :)

It's simply a technical limit. Think about what broadcast is and how it
passes through a switch: every broadcast frame is forwarded to every port.
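
To put rough numbers on this: a switch floods each broadcast frame out of
all ports except the one it arrived on, so every port has to carry the sum
of what all the other nodes broadcast. A minimal Python sketch of that
arithmetic follows. The function name and the 125 MBytes/s line rate
(1 Gbit/s divided by 8, framing overhead ignored) are illustrative
assumptions, not measurements.

```python
GIGE_LINE_RATE_MBYTES = 125.0  # assumption: 1 Gbit/s / 8, framing overhead ignored


def broadcast_port_utilisation(other_senders, mbytes_per_sender):
    """Fraction of one GigE port's egress capacity used by broadcast traffic.

    Each broadcast frame is flooded to every other port, so a port has to
    emit the combined broadcast traffic of all other senders.
    A result of 1.0 or more means the port is saturated.
    """
    return other_senders * mbytes_per_sender / GIGE_LINE_RATE_MBYTES


if __name__ == "__main__":
    # One node broadcasting at line rate already fills every other port.
    print(broadcast_port_utilisation(1, 125.0))
```

In this model the size of the switch backplane does not help, because the
bottleneck is the individual port.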

>> In Infiniband...
> Sure, but historically in the networking space, non-ethernet
> technologies have always been niche, cost ineffective in terms of
> price/performance and only had a temporary performance advantage.

Right. You may be surprised, but the price per port is much lower in the
Infiniband world than in the 10GigE world. When using GlusterFS inside a
datacenter, Infiniband could be a good choice.

> Right now more storage nodes means slower storage, and that should
> really be addressed.

Wrong, assuming you use a "distribute" concept. Say 10 clients talk to 5
servers. Storing a file means the client writes the file to one of the
servers; reading works the same way. So the bandwidth of the servers adds
up. With GigE this means you'll have about 600 MBytes/s of network
bandwidth. Additional servers will add additional bandwidth, as long as you
scale not only the servers but also the clients. One small exception: the
lookup of a file must be directed to all servers. That is one of the reasons
why GlusterFS is "better" for a small number of large files than for a
large number of small files.
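
The arithmetic behind the 600 MBytes/s figure can be sketched in a few lines
of Python. This is a simplified model, not a benchmark: the function name is
hypothetical, and the 120 MBytes/s of usable payload per GigE link is an
assumption chosen to match the figure above.

```python
def distribute_bandwidth(servers, clients, link_mbytes=120.0):
    """Rough aggregate bandwidth of a "distribute" setup, in MBytes/s.

    Each file lives on exactly one server, so the server links add up.
    The total is capped by whichever side has fewer network links,
    which is why servers and clients have to be scaled together.
    link_mbytes is an assumed usable payload rate per GigE link.
    """
    return min(servers, clients) * link_mbytes


if __name__ == "__main__":
    print(distribute_bandwidth(servers=5, clients=10))   # 5 server links
    print(distribute_bandwidth(servers=20, clients=10))  # capped by 10 clients
```

The model ignores the lookup traffic that goes to all servers, so it
overstates the result for workloads with many small files.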

Right when you use a "replicate" concept. Your client has to write to both
members of the replica. Additional replicas will consume additional
bandwidth. But hey, who needs more than two replicas? BTW: the servers will
never talk to each other. It's always the client that transfers the data.
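
Because the client itself sends every byte once per replica, its own link is
the limit for writes. A minimal sketch of that cost, again with a
hypothetical function name and an assumed 120 MBytes/s of usable payload per
GigE link:

```python
def client_write_throughput(replicas, link_mbytes=120.0):
    """Rough write throughput one client can reach, in MBytes/s.

    The client pushes the same data to every replica over its single
    link, so the usable rate is the link rate divided by the replica count.
    link_mbytes is an assumed usable payload rate per GigE link.
    """
    if replicas < 1:
        raise ValueError("need at least one copy of the data")
    return link_mbytes / replicas


if __name__ == "__main__":
    print(client_write_throughput(replicas=2))  # half the link per copy
```

Reads are not affected in this model, since a file only has to be fetched
from one member of the replica.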

The perfect solution is probably a "distribute" over a "replicate". Mirror
the files over two bricks. Use your mirrors to build a large filesystem with
distribute. Your performance will scale with the number of bricks but you'll
keep the stability of a fully redundant setup.
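
The layout can be illustrated with a small Python sketch: bricks are grouped
into mirrors, and each file name is mapped to exactly one mirror. The brick
names and helper functions are hypothetical, and the CRC32-modulo placement
is only a stand-in for illustration. It is not the hashing GlusterFS
actually uses.

```python
import zlib


def build_mirrors(bricks, replicas=2):
    """Group bricks into mirrors of `replicas` members each."""
    if len(bricks) % replicas != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [tuple(bricks[i:i + replicas])
            for i in range(0, len(bricks), replicas)]


def place(filename, mirrors):
    """Pick the mirror that stores `filename` (illustrative hash only)."""
    index = zlib.crc32(filename.encode("utf-8")) % len(mirrors)
    return mirrors[index]


if __name__ == "__main__":
    mirrors = build_mirrors(["srv1", "srv2", "srv3", "srv4"])
    for name in ("a.dat", "b.dat", "c.dat"):
        # Every file ends up on both bricks of exactly one mirror.
        print(name, "->", place(name, mirrors))
```

Adding another pair of bricks adds another mirror for files to spread over,
while every file still exists on two bricks.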

Beat

-- 
     \|/                           Beat Rubischon <beat at 0x1b.ch>
   ( 0^0 )                             http://www.0x1b.ch/~beat/
oOO--(_)--OOo---------------------------------------------------
My experiences, thoughts and dreams: http://www.0x1b.ch/blog/





