[Gluster-users] adding to gluster a node with 24TB of disk and 16GB RAM

Tue Jan 4 06:09:31 UTC 2011

Hello again,

Ideally you could run a benchmark of your application and use
blktrace+seekwatcher <http://oss.oracle.com/%7Emason/seekwatcher/> to
capture and view some really accurate IO stats and then tune
accordingly. Other than that it's complete guesswork. You have about
50G of potential FS cache there, which is 0.2% of your physical data
capacity. It all depends then on your cache hit rates (hits are limited
by network performance) and your ability to handle the cache misses.
Simply running 'iostat' on your backend nodes during a benchmark is
good enough as well.
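
If iostat isn't installed on a node, roughly the same numbers can be
pulled from /proc/diskstats by hand. The following is only a minimal
sketch, assuming a Linux backend; the function names (parse_diskstats,
io_rates, sample) are made up for illustration and are not part of any
Gluster or sysstat tooling.

```python
import time

# /proc/diskstats counts sectors in fixed 512-byte units.
SECTOR_BYTES = 512


def parse_diskstats(text):
    """Return {device: (reads, sectors_read, writes, sectors_written)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        # fields: major minor name reads merged sectors ms writes merged sectors ...
        stats[fields[2]] = (int(fields[3]), int(fields[5]),
                            int(fields[7]), int(fields[9]))
    return stats


def io_rates(before, after, seconds):
    """Per-device IOPS and MB/s between two parsed snapshots."""
    rates = {}
    for dev, new in after.items():
        if dev not in before:
            continue
        delta = [n - o for n, o in zip(new, before[dev])]
        rates[dev] = {
            "read_iops": delta[0] / seconds,
            "read_MBps": delta[1] * SECTOR_BYTES / 1e6 / seconds,
            "write_iops": delta[2] / seconds,
            "write_MBps": delta[3] * SECTOR_BYTES / 1e6 / seconds,
        }
    return rates


def sample(seconds=5, path="/proc/diskstats"):
    """Take two snapshots `seconds` apart and return the rates."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(seconds)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return io_rates(before, after, seconds)
```

Calling sample(10) on a storage node while the benchmark runs gives one
dictionary per block device; watching how the IOPS and MB/s move under
load is the point, not the absolute values.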

From your network stats it appears that access is actually quite low
(64Mbit/s in, 43Mbit/s out), so most of it might be ending up in your
FS cache. Even if it isn't, there is no way that load will saturate
your drives, even if it's 100% small-block random operations.
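
For scale, the sequential side of that comparison can be sketched in a
few lines of Python. The 300MB/s figure is the raw sequential estimate
from earlier in this thread and the 4x1Gbit trunk is the one described
in the quoted mail; the variable names are just illustrative. It says
nothing about the random case, which depends on request sizes that
haven't been measured yet.

```python
# Figures quoted in this thread.
IN_MBIT, OUT_MBIT = 64, 43     # average switch traffic, Mbit/s
TRUNK_MBIT = 4 * 1000          # 4x1Gbit trunk between the two switches
RAW_SEQ_MBYTE = 300            # rough raw sequential estimate, MB/s

# Each direction of a full-duplex trunk is counted separately.
in_share = IN_MBIT / TRUNK_MBIT
out_share = OUT_MBIT / TRUNK_MBIT

# 8 bits per byte: combined traffic expressed in MB/s.
combined_mbyte = (IN_MBIT + OUT_MBIT) / 8
disk_share = combined_mbyte / RAW_SEQ_MBYTE

print(f"trunk in:  {in_share:.1%}")
print(f"trunk out: {out_share:.1%}")
print(f"combined:  {combined_mbyte:.1f} MB/s, "
      f"{disk_share:.1%} of the sequential estimate")
```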

By bonding <http://en.wikipedia.org/wiki/Channel_bonding> I mean
aggregation, trunking or teaming, depending on what networking school
you went to. My rough and totally inaccurate back-of-a-napkin numbers
are meant to indicate that 1Gbit probably won't be enough, and you might
need to consider two or more Gbit interfaces. In my testing with six
servers I can kill a Gbit interface pretty easily (but that's not
with your app of course).
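
A napkin calculation of that kind might look like the sketch below. The
function name links_needed and the 0.9 efficiency factor (an allowance
for protocol overhead) are assumptions for illustration, not measured
values; plug in whatever throughput your own benchmark shows.

```python
import math


def links_needed(disk_mbyte_per_s, link_mbit=1000, efficiency=0.9):
    """Number of links needed to carry a given disk throughput.

    disk_mbyte_per_s: throughput the disks can deliver, in MB/s
    link_mbit:        nominal speed of one link, in Mbit/s
    efficiency:       assumed usable fraction of the nominal speed
    """
    required_mbit = disk_mbyte_per_s * 8
    return math.ceil(required_mbit / (link_mbit * efficiency))


# 300 MB/s is the raw sequential estimate from earlier in this thread.
print(links_needed(300))   # Gbit links for 300 MB/s
print(links_needed(100))   # Gbit links for 100 MB/s
```

With these assumptions 300MB/s needs three Gbit links and 100MB/s fits
on one, which is why a single interface runs out quickly once the
disks are actually busy.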

Long story short, the answer to your original question, "Any guide line
we should follow for calculating the memory requirements", is no. It's
all about your specific application requirements (and the money you're
willing to spend).

The only advice I'd give then is:

    * Be sure to monitor your IO and know exactly what the numbers mean
      and what causes them.
    * Have a capacity plan with an eye on what you need to address each
      of the possible eventualities:

       1. Network throughput/latency - More/faster ports.
       2. Disk sequential read/write - More spindles or flash.
       3. Disk random read/write - More spindles or flash.
       4. File System cache misses - RAM increases on storage nodes.
       5. Single storage node overload - More nodes, or striping the
          affected file.
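
One hypothetical way to keep such a plan next to your monitoring data
is a simple lookup like the sketch below. The resource names, the
0-to-1 "pressure" scale and the 0.8 threshold are all assumptions made
up for illustration; only the remedies come from the list above.

```python
# Remedies taken from the capacity-plan list; keys are illustrative names.
REMEDIES = {
    "network": "More/faster ports",
    "disk_sequential": "More spindles or flash",
    "disk_random": "More spindles or flash",
    "cache_misses": "More RAM on storage nodes",
    "single_node": "More nodes, or stripe the file",
}


def bottlenecks(pressure, threshold=0.8):
    """Return (resource, remedy) pairs for resources at or over threshold.

    pressure: {resource: fraction of capacity in use, 0.0 to 1.0}
    threshold: hypothetical level at which a resource counts as a problem
    """
    return [(name, REMEDIES[name])
            for name, used in pressure.items()
            if name in REMEDIES and used >= threshold]


# Example with made-up readings: only the network is near its limit.
print(bottlenecks({"network": 0.9, "disk_random": 0.3}))
```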

On 12/30/2010 08:51 PM, admin iqtc wrote:
> Hi,
>
> Sorry Mark, but I don't understand exactly what you need. Could you give me
> an example of the information you're asking for?
>
> Regarding bonding, don't worry, all the current 5 machines are bonded(1gbit
> each interface) to the switch, and the new machine would be installed the
> same way.
>
> That switch load is from the HPC clusters to the gluster. The info is from
> the trunking interface in the switch. Our network topology is as follows:
> each gluster server (and the new one) is connected with bonding to an L2
> switch, then from that switch 4x1gbit cables go to an L3 switch. Both
> switches are configured for those 4 cables to be trunked. The traffic load I
> gave you is from the L3 switch.
>
> We may expand that trunking some day, but for now we aren't having any
> trouble..
>
> Thanks
>
> 2010/12/28 Mark "Naoki" Rogers <mrogers at valuecommerce.co.jp>
>
>> Hi,
>>
>> Your five machines should get you raw speeds of at least 300MB/s sequential
>> and 300-500 random IOPS; your file-system cache alters things depending on
>> access patterns. Without knowing about those patterns I can't guess as to
>> the most beneficial disk/memory ratios for you. If possible run some
>> synthetic benchmarks for base-lining and then try to benchmark your
>> application. Even if it's only a limited benchmark, that's OK; you can still
>> extrapolate from there.
>>
>> The first thing you might hit though could be the 1Gbit interfaces so keep
>> an eye on those and perhaps have a plan to bond them, and get ready to think
>> about 10G on the larger one if needed.
>>
>> Right now it seems the switch load is light, is that per port to the
>> storage bricks?
>>
>>
>>
>> On 12/28/2010 05:38 PM, admin iqtc wrote:
>>
>>> Hi,
>>>
>>> sorry for not giving more information on the first mail.
>>>
>>> The setup would be straight distributed. The disks are SATA2 7200RPM. ATM
>>> the 5 machines we're currently running have 5 disks of 1TB(4TB with RAID5)
>>> each. The new machine would have 12 disks of 2TB with RAID5 as well, so 23TB
>>> approx.
>>>
>>> We're using gluster for storage of an HPC cluster. That means: Data gets
>>> copied from gluster and to gluster all the time. For example, looking at the
>>> traffic on the switch, the average is 64Mbit/s IN (that is, writing) and
>>> 43Mbit/s OUT (that is, reading). That is across the 5 machines.
>>>
>>> Is this enough?
>>>
>>> Thanks!
>>>