[Gluster-users] GlusterFS replication over WAN
cbd at adfitech.com
Wed Aug 13 21:48:40 UTC 2008
Keith Freedman wrote:
> At 01:29 PM 8/13/2008, Collin Douglas wrote:
>> How does GlusterFS' AFR translator work over a WAN connection? I know
>> the simple answer is that it depends on the connection and what's being
>> replicated but I'm interested in understanding how it works in a low
>> bandwidth situation.
> it works fine from my personal experience. you can probably move data
> at close to wire speed. I think you'll have different experiences
> depending on whether or not you're using server or client AFR.
> My personal preference is to use server based AFR, only because this
> really simplifies client configuration and when you have multiple
> clients acting on the same data sets, when you set up new clients, if
> they're not AFRing correctly you can get some out of sync data.
>> We are building a new storage model for our imaging system in which
>> GlusterFS is a candidate.
>> Current design ideas consist of a tier 1 storage where our current
>> "pipeline" of active files would be stored and operated on. Once the
>> files are in a relative static state, they would be moved to tier 2
>> storage. It's this tier 2 storage that would need WAN replication. In
>> this way we would limit the amount of data somewhat that has to be
>> replicated as the pipeline files are in an almost constant state of flux.
>> The idea is to have an array on site (let's call it building 1), one
>> next door in another building (building 2) and one a few hundred miles
>> away. The local arrays will be connected via Infiniband or fiber and
>> the remote one via a 10MB link.
>> It seems logical that I would want to replicate from building 1 to
>> building 2 and then have another AFR configuration at building 2 to
>> replicate to the remote site. Does this fit best practices for
>> GlusterFS or should I use more of a hub and spoke method?
> I'm not sure I'm conceptualizing your setup quite right, but if this is
> what you mean:
> for live active files that are being worked on, AFR building 1 to
> building 2. Let's call this /active
> so the /active filesystem would be an AFR fs using servers in building
> 1 and 2. your 10MB infiniband should be more than sufficient.
> For the completed files to have them replicated, you move them to
> /backup which is a volume afr'ed from building 2 to offsite building
> C. This is probably over lan speed? hopefully faster than T1, but
> since you're moving smaller amounts of data it should be fine.
I think you conceptualized it quite well. /active would only be
mirrored from building 1 to building 2 over 20GB/sec Infiniband while
/backup would be replicated from building 1 to building 2 and remote
with remote being connected via a 10MB/sec fiber link.
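
To put rough numbers on the two links, here is a minimal back-of-the-envelope sketch. It assumes the figures above are bytes per second (10 megabytes/sec for the remote link, 20 gigabytes/sec for Infiniband) and ignores protocol overhead, latency and contention. The 50 GB batch size is a hypothetical value chosen only for illustration, not a number from our system.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Ideal time to move size_bytes at a sustained rate, ignoring all overhead."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_sec


MB = 10**6
GB = 10**9

WAN_RATE = 10 * MB  # assumption: "10MB/sec" read as 10 megabytes per second
IB_RATE = 20 * GB   # assumption: "20GB/sec" read as 20 gigabytes per second

batch = 50 * GB     # hypothetical amount of data moved to /backup in one go

wan_hours = transfer_seconds(batch, WAN_RATE) / 3600
ib_seconds = transfer_seconds(batch, IB_RATE)
print(f"remote link: {wan_hours:.2f} h, Infiniband: {ib_seconds:.2f} s")
```

If the link speeds are actually quoted in bits per second, divide both rates by 8 and the transfer times grow accordingly.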
> Remember AFR is a real time replication, so what you can't have is this:
> /active being AFR building 1 to building 2, while also having the
> building 2 server set up to afr /active to building C.
> I believe this would cause the files once they get updated on building
> 2 to get AFR'ed to building C. I *think* this would make the afr to
> building 1 as slow as the building C afr, but I'm not positive.
That's what I am unsure about as well. I was hoping that a
configuration like that (building 1 AFRs to building 2 which then AFRs
to remote) would take place asynchronously so that replications from
building 1 to building 2 would be quick and then the data that arrives
in building 2 would be able to take its time getting to the remote site
without causing problems with building 1's filesystem.
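
The difference between the two behaviours can be illustrated with a toy latency model. This is only a sketch of the general idea of synchronous versus asynchronous replication, not a description of how AFR is actually implemented. The function name and all of the timing values are hypothetical.

```python
def write_latency_us(local_us: int, synchronous_hops_us: list[int]) -> int:
    """Latency seen by the writer when every listed hop must acknowledge
    the write before it returns. Hops not listed are assumed to be
    replicated in the background and add nothing to the writer's wait."""
    return local_us + sum(synchronous_hops_us)


# Hypothetical per-write costs in microseconds.
LOCAL_DISK = 1_000      # write on the building 1 array
TO_BUILDING_2 = 200     # round trip over the local interconnect
TO_REMOTE = 40_000      # round trip over the long-distance link

# Chain is fully synchronous: building 1 waits for building 2,
# which in turn waits for the remote site.
fully_sync = write_latency_us(LOCAL_DISK, [TO_BUILDING_2, TO_REMOTE])

# Only the local hop is synchronous; the remote copy catches up later.
remote_async = write_latency_us(LOCAL_DISK, [TO_BUILDING_2])

print(f"fully synchronous chain: {fully_sync} us per write")
print(f"remote hop asynchronous: {remote_async} us per write")
```

In this model the fully synchronous chain is dominated by the slowest hop, which is the concern raised above. The asynchronous case keeps building 1 fast, at the cost of the remote copy lagging behind.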
>> I am interested to hear what people think.
>> Collin Douglas
>> Adfitech, Inc