RDP rdp.com at gmail.com
Fri Jan 20 13:08:09 UTC 2012

We are in the process of building a GlusterFS-based NAS on the Amazon EC2
cloud using EBS volumes. I'd like to hear opinions, advice, and experiences
from others who have done this.

Our scenario is simple. We need to store a large number of static web files,
such as images (less than 2 MB per file), and we don't need any redundancy
for this. But we need very high throughput under high concurrency.

I know from past experience that to maximize I/O on Amazon EBS one needs to
stripe 4 EBS disks per instance with RAID0 and use XFS (I haven't
benchmarked ext4 or btrfs yet).
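
For reference, the per-server RAID0 + XFS step could be sketched roughly as
below. This only builds and prints the commands, it does not run them. The
device names (/dev/sdf through /dev/sdi), the array name /dev/md0 and the
mount point /export/brick are assumptions for illustration, not the values
from our actual setup.

```python
def raid0_xfs_commands(devices, md_device="/dev/md0", mount_point="/export/brick"):
    """Build the commands that stripe `devices` into one RAID0 array with XFS.

    Returns a list of argument lists; nothing is executed here.
    """
    if len(devices) < 2:
        raise ValueError("RAID0 needs at least two devices")
    return [
        # Assemble the EBS volumes into a single striped (RAID0) array.
        ["mdadm", "--create", md_device, "--level=0",
         f"--raid-devices={len(devices)}", *devices],
        # Put an XFS filesystem on the array.
        ["mkfs.xfs", md_device],
        # Mount it where the GlusterFS brick would live (hypothetical path).
        ["mkdir", "-p", mount_point],
        ["mount", md_device, mount_point],
    ]


# Hypothetical device names for the 4 EBS volumes attached to one instance.
ebs_devices = ["/dev/sdf", "/dev/sdg", "/dev/sdh", "/dev/sdi"]

for cmd in raid0_xfs_commands(ebs_devices):
    print(" ".join(cmd))
```

Chunk size and XFS mount options are left at their defaults here; those
would need benchmarking for a small-file workload.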

Now I am trying to find the ideal setup.

The first question is whether having more servers in a GlusterFS cluster
with fewer disks per server is better than having fewer servers with more
disks. Note that this question is context-specific: we are not looking for
any redundancy, so only distribute and stripe are needed.

Now assume that for the tests we take 2 cc1.4xlarge servers, each with 4 EBS
disks. Which of the following would be better?

   1. Create RAID0 across the 4 disks on each server and then use the 2
   RAID0 devices in distribute mode.
   2. Create a GlusterFS stripe volume with the 4 disks on each server and
   then use the 2 servers in distribute mode.
   3. Create 4 stripe volumes, each using one disk from one server and one
   disk from the other, then use distribute across the 4 stripe volumes.
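
To make the three layouts concrete, here is a rough sketch of the brick
ordering each one implies. It assumes that the gluster CLI groups consecutive
bricks on the command line into stripe sets of the given stripe count and
then distributes across those sets, which is my understanding of how
distributed striped volumes are built. The hostnames, brick paths and volume
name are hypothetical.

```python
SERVERS = ["server1", "server2"]               # hypothetical hostnames
DISKS = ["disk1", "disk2", "disk3", "disk4"]   # hypothetical per-EBS brick dirs


def brick(server, name):
    """Format a brick as host:/path (the /export prefix is an assumption)."""
    return f"{server}:/export/{name}"


def option1():
    # One RAID0-backed brick per server, plain distribute (no stripe count).
    return None, [brick(s, "raid0") for s in SERVERS]


def option2():
    # stripe 4: each stripe set is the 4 disks of a single server.
    return 4, [brick(s, d) for s in SERVERS for d in DISKS]


def option3():
    # stripe 2: each stripe set pairs the same-numbered disk on both servers.
    return 2, [brick(s, d) for d in DISKS for s in SERVERS]


def create_command(volume, stripe, bricks):
    """Build a `gluster volume create` command line as a string."""
    parts = ["gluster", "volume", "create", volume]
    if stripe:
        parts += ["stripe", str(stripe)]
    return " ".join(parts + bricks)


for number, layout in enumerate((option1, option2, option3), start=1):
    stripe, bricks = layout()
    print(f"Option {number}: {create_command('webfiles', stripe, bricks)}")
```

The only difference between options 2 and 3 in this sketch is the brick
order and stripe count: option 2 keeps every stripe set local to one server,
while option 3 spreads every stripe set across both.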

Any input would be highly appreciated.

