[Gluster-users] Gluster 3.1.1 issues over RDMA and HPC environment
Fabricio Cannini
fcannini at gmail.com
Mon Feb 7 19:26:32 UTC 2011
On Sunday, 06 February 2011, at 16:35:45, Claudio Baeza Retamal wrote:
Hi.
> Dear friends,
>
> I have several stability and reliability problems in a small-to-medium
> sized cluster; my configuration is the following:
>
> 66 compute nodes (IBM idataplex, X5550, 24 GB RAM)
> 1 access node (front end)
> 1 master node (queue manager and monitoring)
> 2 servers for I/O with GlusterFS configured in distributed mode (4 TB in
> total)
>
> All computers have a Mellanox ConnectX QDR (40 Gbps) dual port
> 1 QLogic 12800-180 switch, 7 leaves of 24 ports each and two double
> spines with QSFP plugs
>
> CentOS 5.5 and xCAT as cluster manager
> OFED 1.5.1
> Gluster 3.1.1 over InfiniBand
I have a smaller, but relatively similar setup, and am facing the same issues
as Claudio.
- 1 frontend node (2 Intel Xeon 5420, 16GB DDR2 ECC RAM, 4TB of raw disk
space) with 2 "Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s -
IB DDR]"
- 1 storage node (2 Intel Xeon 5420, 24GB DDR2 ECC RAM, 8TB of raw disk
space) with 2 "Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s -
IB DDR]"
- 22 compute nodes (2 Intel Xeon 5420, 16GB DDR2 ECC RAM, 750GB of raw
disk space) with 1 "InfiniBand: Mellanox Technologies MT25204 [InfiniHost III
Lx HCA]"
Each compute node has a 615GB /glstfs partition; together they serve a
Gluster volume of ~3.1TB mounted on /scratch on all nodes and the frontend,
using the stock GlusterFS 3.0.5 packages from Debian Squeeze 6.0.
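For reference, the mount itself is nothing exotic; it looks roughly like this
on each client (a sketch only: the volfile path is the Debian default and the
mount point is from my setup):

  # /etc/fstab entry on every client (GlusterFS 3.0.x volfile-based mount)
  /etc/glusterfs/glusterfs.vol  /scratch  glusterfs  defaults,_netdev  0  0

  # or mounted by hand:
  # mount -t glusterfs /etc/glusterfs/glusterfs.vol /scratch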
> When the cluster is fully loaded with applications that use MPI heavily,
> in combination with other applications that do a lot of I/O to the file
> system, GlusterFS stops working.
> Also, when gendb runs the InterProScan bioinformatics application with 128
> or more jobs, GlusterFS dies or disconnects clients randomly, so some
> applications shut down because they can no longer see the file system.
>
> This does not happen with Gluster over TCP (1 Gbps Ethernet), nor with
> Lustre 1.8.5 over InfiniBand; under the same conditions Lustre works
> fine.
>
> My question is: does any documentation exist with more specific
> information on GlusterFS tuning?
>
> I have only found basic information for configuring Gluster, but nothing
> more in-depth (i.e. for experts). I think there must be some option in
> GlusterFS to handle this situation; moreover, other people should have the
> same problems, since we replicated the configuration at another site with
> the same results.
> Perhaps the question is about Gluster scalability: how many clients are
> recommended per Gluster server when using RDMA and an InfiniBand fabric
> at 40 Gbps?
>
> I would appreciate any help. I want to use Gluster, but stability and
> reliability are very important for us.
I have "solved" it , by taking out of the executing queue the first node that
was listed in the client file '/etc/glusterfs/glusterfs.vol'.
And this what i *think* is the reason it worked:
I can't find it now, but i saw in the 3.0 docs that " ... the first hostname
found in the client config file acts as a lock server for the whole volume...".
In other words, the first hostname found in the client config coordinates the
locking/unlocking of files in the whole volume. This way, the node does not
accepts any job, and can dedicate its processing power solely as a 'lock
server'.
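To make that concrete, here is roughly what such a client volfile looks like
(a sketch only; the hostnames and brick name are made up, and only the
ordering of the protocol/client volumes matters for the point above):

  # Sketch of a 3.0-style client volfile; the first protocol/client volume
  # listed ("node01" here) is the one the 3.0 docs describe as acting as
  # the lock server for the volume.
  volume node01
    type protocol/client
    option transport-type ib-verbs        # or tcp
    option remote-host node01
    option remote-subvolume brick
  end-volume

  volume node02
    type protocol/client
    option transport-type ib-verbs
    option remote-host node02
    option remote-subvolume brick
  end-volume

  # ... node03 through node22 ...

  volume scratch
    type cluster/distribute
    subvolumes node01 node02              # ... through node22
  end-volume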
It may well be the case that Gluster is not yet as optimized for InfiniBand
as it is for Ethernet, too; I just can't say.
I am also unable to find how to specify something like this in the Gluster
config: "node n is a lock server for nodes a,b,c,d". Does anybody know if it
is possible?
Hope this helps you somehow, and helps to improve Gluster performance over
IB/RDMA.