[Gluster-users] GlusterFS performance questions for Amazon EC2 deployment

Craig Box craig.box at gmail.com
Wed Jun 30 14:22:33 UTC 2010

OK, so this brings me to Plan B.  (Feel free to suggest a plan C if you can.)

I want to have six nodes, three in each availability zone, replicate a
Mercurial repository.  Here's some art:

[gluster c/s] [gluster c/s] | [gluster c/s] [gluster c/s]
           [gluster s]      |      [gluster s]
              [OCFS 2]      |      [OCFS 2]
              [ DRBD ] ----------- [ DRBD ]

DRBD does the cross-AZ replication, with a three-node GlusterFS
cluster inside each AZ.  That way, if any one machine goes down, all
the remaining nodes should still be able to access the files.

Sound believable?
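
To make the "any one machine going down" claim a bit more concrete, here is a rough Python sketch of the diagram.  It is only a toy model, not a test of real GlusterFS or DRBD behaviour.  The host names are made up, and it rests on two assumptions of mine: a node can reach the files as long as some brick in its own AZ is up, and cross-AZ replication needs both DRBD nodes up.  Which nodes actually hold a brick is a parameter, since that is the part I am least sure about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str       # hypothetical host name, not from a real deployment
    az: str         # availability zone label
    has_drbd: bool  # True for the [gluster s] node sitting on OCFS2/DRBD


# Layout copied from the diagram: two client/server nodes plus one
# DRBD-backed server per availability zone.
NODES = [
    Node("a-cs1", "A", False),
    Node("a-cs2", "A", False),
    Node("a-s", "A", True),
    Node("b-cs1", "B", False),
    Node("b-cs2", "B", False),
    Node("b-s", "B", True),
]


def can_reach_files(node, failed, bricks):
    """ASSUMPTION: a node can reach the files if any brick in its own AZ
    (possibly its own) is still up."""
    return any(b.az == node.az and b not in failed for b in bricks)


def cross_az_sync_up(failed):
    """ASSUMPTION: cross-AZ replication needs both DRBD nodes up."""
    return all(n not in failed for n in NODES if n.has_drbd)


def single_failure_report(bricks=None):
    """Map each failed node's name to a pair:
    (all survivors can reach the files, cross-AZ sync still running)."""
    bricks = list(NODES) if bricks is None else list(bricks)
    report = {}
    for down in NODES:
        failed = {down}
        survivors = [n for n in NODES if n not in failed]
        report[down.name] = (
            all(can_reach_files(n, failed, bricks) for n in survivors),
            cross_az_sync_up(failed),
        )
    return report


if __name__ == "__main__":
    for name, (files_ok, sync_ok) in single_failure_report().items():
        print(f"{name} down: files reachable={files_ok}, cross-AZ sync={sync_ok}")
```

Under these assumptions the claim only holds if all three nodes in an AZ carry a full replica.  Passing only the DRBD nodes as bricks, e.g. single_failure_report([n for n in NODES if n.has_drbd]), makes the loss of a [gluster s] node cut off the rest of its AZ.  Either way, losing a DRBD node pauses the cross-AZ sync in this model.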


On Tue, Jun 29, 2010 at 5:16 PM, Count Zero <countz at gmail.com> wrote:
> My short (and probably disappointing) answer is that, despite all my attempts, weeks of research trying to improve the performance, and asking here on the mailing lists, I have failed to make it work over WAN, and the authoritative answer was that "WAN is in the works".
> So for now, until WAN is officially supported, keep it working within the same zone, and use some other replication method to synchronize the two zones.
> On Jun 29, 2010, at 7:12 PM, Craig Box wrote:
>> Hi all,
>> Spent the day reading the docs, blog posts, this mailing list, and
>> lurking on IRC, but still have a few questions to ask.
>> My goal is to implement a cross-availability-zone file system in
>> Amazon EC2, and ensure that even if one server goes down, or is
>> rebooted, all clients can continue, reading from/writing to a
>> secondary server.
>> The primary purpose is to share some data files for running a web site
>> for an open source project - a Mercurial repository and some shared
>> data, such as wiki images - but the main code/images/CSS etc for the
>> site will be stored on each instance and managed by version control.
>> As we have 150GB ephemeral storage (aka instance store, as opposed to
>> EBS) free on each instance, I thought it might be good if we were to
>> use that as the POSIX backend for Gluster, and have a complete copy of
>> the Mercurial repository on each system, with each client using its
>> local brick as the read subvolume for speed.  That way, you don't need
>> to go to the network for reads, which ought to be far more common than
>> writes.
>> We want to have the files available to seven servers, four in one AZ
>> and three in another.
>> I think it best if we maximise client performance, rather than
>> replication speed; if one of our nodes is a few seconds behind, it's
>> not the end of the world, but if it consistently takes a few seconds
>> on every file write, that would be irritating.
>> Some questions which I hope someone can answer:
>> 1. Somewhat obviously, when we turn on replication and introduce a
>> second server, write speed to the volume drops drastically.  If we use
>> client-side replication, we can have redundancy in servers.  Does this
>> mean that the GlusterFS client blocks until it has written to
>> every server?  If we changed to server-side replication, would this
>> background the replication overhead?
>> 2. If we were to use server-side replication, should we use the
>> write-behind translator in the server stack?
>> 3. I was originally using 3.0.2 packaged with Ubuntu 10.04, and have
>> tried upgrading to 3.0.5rc7 (as suggested on this list) for better
>> performance with the quick-read translator, and other fixes.  However,
>> this actually seemed to make write performance *worse*!  Should this
>> be expected?
>> (Our write test is totally scientific *cough*: we cp -a a directory of
>> files onto the mounted volume.)
>> 4. Should I expect a different performance pattern using the instance
>> storage, rather than an EBS volume?  I found this post helpful -
>> http://www.sirgroane.net/2010/03/tuning-glusterfs-for-apache-on-ec2/ -
>> but it talks more about reading files than writing them, and it writes
>> off some translators as not useful because of the way EBS works.
>> 5. Is cluster/replicate even the right answer?  Could we do something
>> with cluster/distribute - is this, in effect, a RAID 10?  It doesn't
>> seem that replicate could possibly scale up to the number of nodes you
>> hear about other people using GlusterFS with.
>> 6. Could we do something crafty where you read directly from the POSIX
>> volume but you do all your writes through GlusterFS?  I see it's
>> unsupported, but I guess that is just because you might get old data
>> by reading the disk, rather than the client.
>> Any advice that anyone can provide is welcome, and my thanks in advance!
>> Regards
>> Craig
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
