[Gluster-users] GlusterFS replication over WAN
Amar S. Tumballi
amar at zresearch.com
Wed Aug 13 22:43:04 UTC 2008
Thanks, Keith, for the reply.
Your idea is perfect in my opinion too (i.e., have afr on the server side,
and separate mountpoints for /active and /backup).
Collin,
Yes, glusterfs sends write calls to all nodes in parallel, and it is also
asynchronous. The application is notified that a write is complete only
when afr says the write is complete. We did make some changes to return as
soon as the first write request completed. But the problem was that when a
large file was being written, glusterfs consumed all memory and died: the
write buffers were waiting on the slower link to complete, while the
application kept sending data because the faster link kept succeeding.
Hence we switched back to the behavior where a write is always as fast as
your slowest link, whereas reads can always be done from the local volume
and hence will be faster.
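To make the trade-off above concrete, here is a rough back-of-the-envelope
sketch in Python. It is not GlusterFS code: the function name, the link
speeds, and the file size are made-up assumptions, and it only models a
steady stream of writes going to two replicas.

```python
def simulate(file_mb, link_mb_per_s, ack_on_first):
    """Toy model of replicated writes (hypothetical, not GlusterFS internals).

    Returns (seconds until the application finishes writing,
             peak MB buffered while waiting on the slowest link).
    """
    fastest = max(link_mb_per_s)
    slowest = min(link_mb_per_s)
    # The application is paced by whichever completion unblocks it:
    # the first replica to finish, or the last one.
    app_rate = fastest if ack_on_first else slowest
    app_seconds = file_mb / app_rate
    # While the application writes, data destined for the slowest link
    # piles up in memory at (app_rate - slowest) MB per second.
    peak_backlog_mb = (app_rate - slowest) * app_seconds
    return app_seconds, peak_backlog_mb


# Assumed example: a 1000 MB file, a 100 MB/s local link, a 1 MB/s WAN link.
print(simulate(1000, [100, 1], ack_on_first=True))
print(simulate(1000, [100, 1], ack_on_first=False))
```

With these assumed numbers, returning on the first completed write lets the
application finish quickly but leaves almost the whole file sitting in write
buffers for the slow link, while waiting for the slowest link keeps the
backlog at zero at the cost of writing at WAN speed.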
Let me get back with more info after discussing this situation with the
team, and how to handle it in a better way.
Regards,
Amar
2008/8/13 Collin Douglas <cbd at adfitech.com>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
>
>
> Keith Freedman wrote:
> > At 01:29 PM 8/13/2008, Collin Douglas wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> How does GlusterFS' AFR translator work over a WAN connection? I know
> >> the simple answer is that it depends on the connection and what's being
> >> replicated, but I'm interested in understanding how it works in a
> >> low-bandwidth situation.
> >
> > It works fine in my personal experience. You can probably move data
> > at close to wire speed. I think you'll have different experiences
> > depending on whether you're using server or client AFR.
> > My personal preference is to use server-based AFR, because it really
> > simplifies client configuration. When you have multiple clients
> > acting on the same data sets and you set up new clients, any client
> > that is not AFRing correctly can leave you with out-of-sync data.
> >
> >> We are building a new storage model for our imaging system in which
> >> GlusterFS is a candidate.
> >>
> >> Current design ideas consist of a tier 1 storage where our current
> >> "pipeline" of active files would be stored and operated on. Once the
> >> files are in a relatively static state, they would be moved to tier 2
> >> storage. It's this tier 2 storage that would need WAN replication. In
> >> this way we would somewhat limit the amount of data that has to be
> >> replicated, as the pipeline files are in an almost constant state of
> >> flux.
> >>
> >> The idea is to have an array on site (let's call it building 1), one
> >> next door in another building (building 2) and one a few hundred miles
> >> away. The local arrays will be connected via Infiniband or fiber and
> >> the remote one via a 10MB link.
> >>
> >> It seems logical that I would want to replicate from building 1 to
> >> building 2 and then have another AFR configuration at building 2 to
> >> replicate to the remote site. Does this fit best practices for
> >> GlusterFS or should I use more of a hub and spoke method?
> >
> > I'm not sure I'm conceptualizing your setup quite right. If this is
> > what you mean:
> > for live active files that are being worked on, AFR building 1 to
> > building 2. Let's call this /active
> > so the /active filesystem would be an AFR fs using servers in building
> > 1 and 2. your 10MB infiniband should be more than sufficient.
> > To have the completed files replicated, you move them to
> > /backup, which is a volume afr'ed from building 2 to offsite building
> > C. This is probably over lan speed? Hopefully faster than T1, but since
> > you're moving smaller amounts of data it should be fine.
> >
> I think you conceptualized it quite well. /active would only be
> mirrored from building 1 to building 2 over 20GB/sec Infiniband while
> /backup would be replicated from building 1 to building 2 and remote
> with remote being connected via a 10MB/sec fiber link.
> > Remember AFR is a real time replication, so what you can't have is this:
> > /active being AFR building 1 to building 2, while also having the
> > building 2 server set up to afr /active to building C.
> >
> > I believe this would cause the files, once they get updated on building
> > 2, to get AFR'ed to building C. I *think* this would make the afr to
> > building 1 as slow as the building C afr, but I'm not positive.
> >
> That's what I am unsure about as well. I was hoping that a
> configuration like that (building 1 AFRs to building 2 which then AFRs
> to remote) would take place asynchronously so that replications from
> building 1 to building 2 would be quick and then the data that arrives
> in building 2 would be able to take its time getting to the remote site
> without causing problems with building 1's filesystem.
>
>
> >
> >> I am interested to hear what people think.
> >>
> >> Collin Douglas
> >> Adfitech, Inc
> >>
> >>
> >>
> >> -----BEGIN PGP SIGNATURE-----
> >> Version: PGP Universal 2.6.3
> >> Charset: ISO-8859-1
> >>
> >> wsBVAwUBSKNEIfvgUY49IQeAAQjeAggAolKbmc5xw9f0BOqp4Uo0NH7VuhK9n8ol
> >> D9wquJ85fecI08BfoTuCLOjH7oviayZBCqNC+CzQm21QZP1hTGBisGrJUJ87rscc
> >> MLici37YmtQC+ItAWdzqUq33bGgNp+T+HiJbYmX3AE0PY19vC0YUOK9QYM0hMosc
> >> cWyeVCfPpM0SM4/ND83zyO6bjv9QkD+JpGoQMOwPMNnY055kdFlWRo7tDaUjhdhN
> >> GQkTtoEmNWQL1uzZ4sibUudKD8YXBFaY/GGbRLorrkNJHZepYH3VM/saCMR7pjo+
> >> cVh2Mek5F+IdwyH8Zg6Vks73RAQtUaJQHMonx7g7Y2dY/A2l/p3beg==
> >> =y84r
> >> -----END PGP SIGNATURE-----
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users at gluster.org
> >> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
> >
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: PGP Universal 2.6.3
> Charset: ISO-8859-1
>
> wsBVAwUBSKNWtfvgUY49IQeAAQjk2gf9FTVvPSFe8ohlgNSqF+eP4JHiwywtNEE+
> d2j2eMguY1UCE4YpupgcvsPq6G75A8Qoig2nGxtu0AMIam6MP2DmNBfLQV7pl1or
> ErlxdfdQlX6Z26GBrOUPpiiLvgfZYjX6QRaBe2DqFb4vKSqfn3c1ztLcNLx3dilS
> jajPK3oYXid1rY1ZKngbTdiSdCceQ505D1sv6SB2AqDyAVSBvu2vQPgpMt4mMRAJ
> AsVgLVTT0yudLTyHOXEkeAt3QHsWLFXqBytgqI8sn7CSfr85jo41JpeCFUyiVP07
> DWV6r8r9F8tHLkUILBbCz2SwYClWeKhMjA2KA8emTcWwsPx7ZeOLug==
> =eCoE
> -----END PGP SIGNATURE-----
>
--
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!