[Gluster-users] question re. current state of art/practice

Miles Fidelman mfidelman at meetinghouse.net
Thu Feb 16 20:23:20 UTC 2012


Thomas,

That's exactly the kind of info I was looking for.  Thanks!

Think I'll go experiment a bit with the Gluster 3.3 beta, and see what 
kind of results I can get.

Miles

Thomas Jackson wrote:
> Just as a small warning, Sheepdog is still a long way from production
> ready - the biggest problem I can see is that it won't de-allocate blocks
> once an image is deleted! It definitely has promise for the future, though.
>
> We set up a 4-node cluster using Gluster with KVM late last year, and it
> has been running along quite nicely for us.
>
> To quote myself from a forum post I made a few weeks ago (note, prices are
> in Australian dollars):
>
> In short, we built a VM cluster without a traditional SAN based on free
> Linux-based software for a comparatively small amount of money. The company
> has had some reasonably explosive growth over the past 18 months, so the
> pressure has been on to deliver more power to the various business units
> without breaking the bank.
>
> We've been running VMware reasonably successfully, so the obvious option was
> to just do that again. We called our friendly Dell rep for a quote on the
> traditional approach - a SAN, some servers, dual switches etc. - which came
> back at around $40,000. That wasn't going to fly, so we needed to get
> creative.
>
> The requirements for this particular cluster were fairly modest, immediate
> need for a mix of ~20 moderate-use Windows and Linux VMs, ability to scale
> 3-4x in the short term, cheaper is better and full redundancy is a must.
>
> Gluster lets you create a virtual "SAN" across multiple nodes, either
> replicating everything to every node (RAID1), distributing each file to a
> separate node (JBOD / RAID0 I guess you could call it at a stretch) or a
> combination of the two (called distribute/replicate in Gluster terms). After
> running it on a few old boxes as a test, we decided to take the plunge and
> build up the new cluster using it.
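>
> For anyone who wants to try it, the setup is only a handful of commands.
> Roughly what we ran (hostnames, volume name and brick paths here are made
> up, and the syntax is from the 3.2 docs, so check it against your version):
>
>   # 2x2 distribute/replicate: bricks are paired in the order given,
>   # so node1/node2 mirror each other, as do node3/node4
>   # (node1..node4 and /data/brick are example names)
>   gluster volume create vmstore replica 2 transport tcp \
>       node1:/data/brick node2:/data/brick \
>       node3:/data/brick node4:/data/brick
>   gluster volume start vmstore
>   gluster volume info vmstore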
>
> The tools of choice are KVM for the actual virtualisation and GlusterFS to
> run the storage, all built on top of Debian.
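>
> Each host then mounts the volume with the native FUSE client and points
> libvirt's image storage at it - something like this (the mountpoint is just
> libvirt's default, use whatever suits):
>
>   # mounting from localhost works because every host is also a server
>   mount -t glusterfs localhost:/vmstore /var/lib/libvirt/images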
>
> The hardware
> We're very much a Dell house, so this all lives on Dell R710 servers, each
> with
> * 6 core Xeon
> * 24GB RAM
> * 6x 300GB 15k SAS drives
> * Intel X520-DA2 dual port 10 GigE NIC (SFP+ twinax cables)
> * Usual DRAC module, ProSupport etc.
>
> All connected together using a pair of Dell 8024F 10 Gig eth switches
> (active / backup config). All up, the price was in the region of $20,000.
> Much better.
>
> Gluster - storage
> Gluster is still a bit rough around the edges, but it does do what we needed
> reasonably well. The main problem is that if a node has to resync (self-heal
> in Gluster terms), it locks the WHOLE file for reading across all nodes
> until the sync is finished. If you have a lot of big VM images, this can
> mean that the storage for them "disappears" as far as KVM is concerned while
> the sync happens, leading the VM to hard crash. Even with 10 Gig Eth and
> fast disks, moving several-hundred-GB images takes a while. Currently, if a
> node goes offline, we leave it dead until we can shut down all of the VMs
> and bring it back up gracefully. This has only happened once so far (a
> hardware problem), but it is certainly something that worries us.
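>
> (If it helps anyone: when we do bring a node back, the 3.2-era way to kick
> off the self-heal is just to stat every file from a client mount - straight
> out of the Gluster admin guide, if I remember right:
>
>   # walk the client mount so every file gets checked and healed
>   find /var/lib/libvirt/images -noleaf -print0 | xargs --null stat >/dev/null
>
> That is exactly when the whole-file locking above starts to bite.)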
>
> This is apparently being fixed in the next major release (3.3), due out in
> the middle of this year. It is also worth noting that Gluster was recently
> acquired by Red Hat, which bodes well for the project's future.
>
> We did have a few stability problems with earlier versions, but 3.2.5 has
> run smoothly with everything we've thrown at it so far.
>
> KVM - hypervisor
> Very powerful, performs better than VMware in our testing, and totally free.
> We use libvirt with some custom apps to manage everything, but 99% of it is
> set-and-forget. For those who want a nice GUI in Windows, sadly there is
> nothing that is 100% there yet. For those of us who prefer to use command
> lines, virsh from libvirt works very well.
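>
> To give you an idea, day-to-day it is little more than this (guest names
> are made up):
>
>   virsh list --all            # all guests and their current state
>   virsh start winvm01         # boot a guest
>   virsh shutdown winvm01      # clean ACPI shutdown
>   virsh dominfo winvm01       # quick CPU/memory summary for a guest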
>
> The only problem is that live snapshots don't currently exist - the VM has
> to be paused or shut down to take a snapshot. We've
> tested wedging LVM under the storage bricks, which lets us snapshot the base
> storage and grab our backups from there. It seems to work OK on the test
> boxes, but it isn't in production yet. Until we can get a long enough window
> to set that up, we're doing ad-hoc manual snapshots of the VMs, usually
> before/after a major change when we have an outage window already.
> Day-to-day data is backed up the traditional way.
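>
> For completeness, the flow we tested on the lab boxes looks roughly like
> this (VG/LV names, paths and sizes are all made up, and again, this is the
> experimental setup, not production):
>
>   virsh suspend winvm01                 # pause so the image is consistent
>   lvcreate --snapshot --size 20G \
>       --name brick-snap /dev/vg0/brick  # snapshot the LV under the brick
>   virsh resume winvm01                  # guest is only paused for seconds
>   mount -o ro /dev/vg0/brick-snap /mnt/snap
>   cp /mnt/snap/winvm01.img /backup/     # copy the image off at leisure
>   umount /mnt/snap
>   lvremove -f /dev/vg0/brick-snap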
>
> What else did we look at?
> There are a number of other shared storage / cluster storage apps out there
> (Ceph, Lustre, DRBD etc), all of which have their own little problems.
>
> There are also a lot of different hypervisors out there (Xen being the main
> other player), but KVM fit our needs perfectly.
>
> We looked at doing 4x bonded gigabit ethernet for the storage - but Dell
> offered us a deal where stepping up to 10 Gig Eth wasn't much more
> expensive, and it meant that we could avoid the fun of dealing with bonded
> links. Realistically, bonded gigabit would have done the job for us, but
> the deal we were offered made it silly to consider. If memory serves, the
> 8024F units were brand new at the time, so I think Dell was trying to get
> some into the wild.
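>
> (For reference, a bonded setup on Debian would have meant the ifenslave
> package plus a stanza along these lines in /etc/network/interfaces - we
> never tested this, since we went the 10 Gig route, and the address is just
> an example:
>
>   auto bond0
>   iface bond0 inet static
>       address 10.0.0.11
>       netmask 255.255.255.0
>       bond-slaves eth0 eth1 eth2 eth3
>       bond-mode 802.3ad
>       bond-miimon 100
>
> plus matching LACP config on both switches - exactly the sort of fun we
> were glad to skip.)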
>
> Conclusion
> There are certainly a few rough edges (primarily to do with storage), but
> nothing show-stopping for what we needed at the moment. In the end, the
> business is happy, so we're happy.
>
> I'd definitely say this approach is at least worth a look for anyone
> building a small VM cluster. Gluster is still a bit too immature for me to
> be 100% happy with it, but I think it's going to be perfect in 6 months'
> time. At the moment, it is fine to use in production with a bit of care
> and knowledge of the limitations.
>
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org
> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Miles Fidelman
> Sent: Friday, 17 February 2012 5:47 AM
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] question re. current state of art/practice
>
> Brian Candler wrote:
>> On Wed, Feb 15, 2012 at 08:22:18PM -0500, Miles Fidelman wrote:
>>> We've been running a 2-node, high-availability cluster - basically
>>> xen w/ pacemaker and DRBD for replicating disks.  We recently
>>> purchased 2 additional servers, and I'm thinking about combining
>>> all 4 machines into a 4-node cluster - which takes us out of DRBD
>>> space and requires some other kind of filesystem replication.
>>>
>>> Gluster, Ceph, Sheepdog, and XtreemFS seem to keep coming up as
>>> things that might work, but... Sheepdog is too tied to KVM
>> ... although if you're considering changing DRBD->Gluster, then changing
>> Xen->KVM is perhaps worth considering too?
> Considering it, but... Sheepdog doesn't seem to have the support that
> Gluster does, and my older servers don't have the processor extensions
> necessary to run KVM.  Sigh....
>
>>> i.  Is it now reasonable to consider running Gluster and Xen on the
>>> same boxes, without hitting too much of a performance penalty?
>> I have been testing Gluster on 24-disk nodes:
>>     - 2 HBAs per node (one 16-port and one 8-port)
>>     - single CPU chip (one node is dual-core i3, one is quad-core Xeon)
>>     - 8GB RAM
>>     - 10G ethernet
>> and however I hit it, the CPU is mostly idle. I think the issue for
>> you is more likely to be one of latency rather than throughput or CPU
>> utilisation, and if you have multiple VMs accessing the disk
>> concurrently then latency becomes less important.
>>
>> However, I should add that I'm not running VMs on top of this, just
>> doing filesystem tests (and mostly reads at this stage).
>>
>> For what gluster 3.3 will bring to the table, see this:
>> http://community.gluster.org/q/can-i-use-glusterfs-as-an-alternative-network-storage-backing-for-vm-hosting/
> Thanks!  That gives me some hard info.  I'm starting to think waiting for
> 3.3 is a very good idea.  Might start playing with the beta.
>
> Miles
>
> --
> In theory, there is no difference between theory and practice.
> In practice, there is.   .... Yogi Berra
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


-- 
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra




