[Gluster-users] gluster client performance
Pavan T C
tcp at gluster.com
Tue Jul 26 09:03:28 UTC 2011
On Tuesday 26 July 2011 03:42 AM, John Lalande wrote:
> I'm new to Gluster, but am trying to get it set up on a new compute
> cluster we're building. We picked Gluster for one of our cluster file
> systems (we're also using Lustre for fast scratch space), but the
> Gluster performance has been so bad that I think maybe we have a
> configuration problem -- perhaps we're missing a tuning parameter that
> would help, but I can't find anything in the Gluster documentation --
> all the tuning info I've found seems geared toward Gluster 2.x.
> For some background, our compute cluster has 64 compute nodes. The
> gluster storage pool has 10 Dell PowerEdge R515 servers, each with 12 x
> 2 TB disks. We have another 16 Dell PowerEdge R515s used as Lustre
> storage servers. The compute and storage nodes are all connected via QDR
> Infiniband. Both Gluster and Lustre are set to use RDMA over Infiniband.
> We are using OFED version 1.5.2-20101219, Gluster 3.2.2 and CentOS 5.5
> on both the compute and storage nodes.
I would need some more information about your setup to estimate the
performance you should get with your gluster setup.
1. Can you provide details of how the disks are connected to the storage
boxes? Is it via FC? What RAID configuration, if any, are they using?
2. What disk bandwidth are you getting on the local filesystem on
a given storage node? That is, pick any one of the 10 storage servers
dedicated to Gluster storage and run dd as below:
Write bandwidth measurement:
dd if=/dev/zero of=/export_directory/10g_file bs=128K count=80000 oflag=direct
Read bandwidth measurement:
dd if=/export_directory/10g_file of=/dev/null bs=128K count=80000 iflag=direct
[The above command is doing a direct IO of 10GB via your backend FS -
3. What is the IB bandwidth that you are getting between the compute
node and the glusterfs storage node? You can run the tool "rdma_bw" to
get the details:
On the server, run:
# rdma_bw -b
[-b measures bi-directional bandwidth]
On the compute node, run:
# rdma_bw -b <server>
[If you have not already installed it, rdma_bw is available via -
Let's start with this, and I will ask for more if necessary.
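
If it helps, the dd measurement in item 2 could be wrapped in a small
shell function along these lines. This is only a sketch: the function
name disk_bw and the test file name dd_bw_test_file are made up for
illustration, and it assumes GNU dd, where oflag=direct and iflag=direct
request direct I/O. The defaults (bs=128K, count=80000) mirror the
commands above.

```shell
# Sketch only: disk_bw and dd_bw_test_file are hypothetical names.
# usage: disk_bw DIR [COUNT] [WRITE_OPERAND] [READ_OPERAND]
# Writes COUNT blocks of 128K into DIR, reads them back, and prints the
# summary line that dd reports for each pass.
disk_bw() {
    dir=$1
    count=${2:-80000}             # 80000 x 128K is roughly 10 GB
    wflag=${3:-oflag=direct}      # assumption: GNU dd, bypass the page cache on write
    rflag=${4:-iflag=direct}      # assumption: GNU dd, bypass the page cache on read
    file="$dir/dd_bw_test_file"

    echo "write:"
    # dd prints its statistics on stderr; keep only the last (summary) line.
    # Note that piping through tail hides dd's exit status.
    dd if=/dev/zero of="$file" bs=128K count="$count" "$wflag" 2>&1 | tail -n 1

    echo "read:"
    dd if="$file" of=/dev/null bs=128K count="$count" "$rflag" 2>&1 | tail -n 1

    rm -f "$file"
}
```

On a storage node this would be called as "disk_bw /export_directory",
where /export_directory stands for the brick's backend directory. If the
backend filesystem rejects direct I/O, passing conv=fdatasync as the
third argument at least makes dd flush the data before it reports the
write rate.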
> Oddly, it seems like there's some sort of bottleneck on the client side
> -- for example, we're only seeing about 50 MB/s write throughput from a
> single compute node when writing a 10GB file. But, if we run multiple
> simultaneous writes from multiple compute nodes to the same Gluster
> volume, we get 50 MB/s from each compute node. However, running multiple
> writes from the same compute node does not increase throughput. The
> compute nodes have 48 cores and 128 GB RAM, so I don't think the issue
> is with the compute node hardware.
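
To make the single-node comparison above repeatable, the multiple
simultaneous writes from one compute node could be scripted roughly as
follows. This is a sketch under assumptions, not the procedure actually
used for the numbers quoted: parallel_write and the pw_test_N file names
are hypothetical, and conv=fdatasync assumes GNU dd.

```shell
# Sketch only: parallel_write and pw_test_N are hypothetical names.
# usage: parallel_write DIR N [COUNT] [DD_OPERAND]
# Starts N dd writers at once, each writing COUNT blocks of 128K to its
# own file in DIR, and prints one dd summary line per writer.
parallel_write() {
    dir=$1
    n=$2
    count=${3:-80000}             # 80000 x 128K is roughly 10 GB per writer
    flag=${4:-conv=fdatasync}     # assumption: GNU dd, flush data before reporting

    i=1
    while [ "$i" -le "$n" ]; do
        # Each writer runs in the background; only its summary line is kept.
        dd if=/dev/zero of="$dir/pw_test_$i" bs=128K count="$count" "$flag" 2>&1 | tail -n 1 &
        i=$((i + 1))
    done

    wait                          # block until every writer has finished
    rm -f "$dir"/pw_test_*
}
```

For example, "parallel_write /mnt/gluster 4" would start four writers,
with /mnt/gluster standing in for wherever the volume is mounted on the
compute node. Adding up the per-writer rates gives an approximate
aggregate to compare against the single-stream figure.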
> With Lustre, on the same hardware, with the same version of OFED, we're
> seeing write throughput on that same 10 GB file as follows: 476 MB/s
> single stream write from a single compute node and aggregate performance
> of more like 2.4 GB/s if we run simultaneous writes. That leads me to
> believe that we don't have a problem with RDMA, otherwise Lustre, which
> is also using RDMA, should be similarly affected.
> We have tried both xfs and ext4 for the backend file system on the
> Gluster storage nodes (we're currently using ext4). We went with
> distributed (not distributed striped) for the Gluster volume -- the
> thought was that if there was a catastrophic failure of one of the
> storage nodes, we'd only lose the data on that node; presumably with
> distributed striped you'd lose any data striped across that volume,
> unless I have misinterpreted the documentation.
> So ... what's expected/normal throughput for Gluster over QDR IB to a
> relatively large storage pool (10 servers / 120 disks)? Does anyone have
> suggested tuning tips for improving performance?