[Gluster-users] GlusterFS replication over WAN

Collin Douglas cbd at adfitech.com
Thu Aug 14 18:43:41 UTC 2008


Amar S. Tumballi wrote:
> Thanks, Keith, for the reply.
>  Your idea is perfect in my opinion too (i.e., have AFR on the server
> side, with separate mountpoints for /active and /backup).
>
> Collin,
>  Yes, GlusterFS sends write calls to all nodes in parallel, and it is
> asynchronous. When the application gets notified that a write is
> complete depends on when AFR says the write is complete. We did make
> some changes to return as soon as the first write request completed,
> but the problem was that when a large file was being written, GlusterFS
> consumed all memory and died: the write buffers sat waiting on the
> slower link to complete, while the application kept sending data
> because the faster link was succeeding sooner.
>
>  Hence we switched back to the behavior where a write is always as slow
> as your slowest link, whereas reads can always be served from the local
> volume and so will be faster.
>
OK. So if I use AFR to replicate from server1 to both server2 and
server3, the system will be as slow as the weakest link. That makes
sense. (A sketch of that single-AFR layout is below, to make sure I
have the picture right.)
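As a minimal, untested volfile sketch of that layout (GlusterFS
1.3-era syntax; the hostnames server1/server2/server3, the export path
/data/export, and the volume names are placeholders of mine, not
anything from this thread):

    # On server1: one AFR volume spanning the local brick and two
    # remote bricks. Every write must be acknowledged by all three
    # subvolumes, so the WAN link to server3 gates every write.
    volume local-brick
      type storage/posix
      option directory /data/export
    end-volume

    volume server2-brick
      type protocol/client
      option transport-type tcp/client
      option remote-host server2          # fast local link
      option remote-subvolume brick
    end-volume

    volume server3-brick
      type protocol/client
      option transport-type tcp/client
      option remote-host server3          # slow WAN link
      option remote-subvolume brick
    end-volume

    volume mirror
      type cluster/afr
      subvolumes local-brick server2-brick server3-brick
    end-volume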

What about an AFR from server1 to server2, and then a second AFR
configuration on server2 that replicates to server3, where server3 is
behind the slow link? Would this free server1 from the burden of
waiting on the slow link, or would it just cause more problems? A
sketch of the chained layout I mean follows.
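Again as a rough, untested sketch with the same placeholder names as
above (protocol/server export stanzas omitted for brevity), the
chained arrangement I am asking about would look something like:

    # On server2: mirror server2's local copy to server3 over the WAN.
    volume local-brick
      type storage/posix
      option directory /data/export
    end-volume

    volume server3-brick
      type protocol/client
      option transport-type tcp/client
      option remote-host server3          # slow WAN link
      option remote-subvolume brick
    end-volume

    volume mirror-wan
      type cluster/afr
      subvolumes local-brick server3-brick
    end-volume

    # On server1: AFR the local brick with server2's exported
    # "mirror-wan" volume. The open question is whether server1's
    # writes would still block until server2 has in turn written
    # to server3.
    volume server2-mirror
      type protocol/client
      option transport-type tcp/client
      option remote-host server2          # fast local link
      option remote-subvolume mirror-wan
    end-volume

    volume mirror-local
      type cluster/afr
      subvolumes local-brick server2-mirror
    end-volume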


>  Let me get back to you with more information after discussing with
> the team how to handle this situation better.
>
> Regards,
> Amar
>
> 2008/8/13 Collin Douglas <cbd at adfitech.com>
>
>
>     Keith Freedman wrote:
>     > At 01:29 PM 8/13/2008, Collin Douglas wrote:
>     >> How does GlusterFS's AFR translator work over a WAN
>     >> connection? I know the simple answer is that it depends on the
>     >> connection and on what is being replicated, but I'm interested
>     >> in understanding how it works in a low-bandwidth situation.
>     >
>     > It works fine in my personal experience; you can probably move
>     > data at close to wire speed. I think you'll have different
>     > experiences depending on whether you're using server- or
>     > client-side AFR. My personal preference is server-based AFR,
>     > because it really simplifies client configuration: when you have
>     > multiple clients acting on the same data sets and a new client
>     > isn't AFRing correctly, you can end up with out-of-sync data.
>     >
>     >> We are building a new storage model for our imaging system, and
>     >> GlusterFS is a candidate.
>     >>
>     >> Current design ideas consist of a tier 1 storage where our
>     >> current "pipeline" of active files would be stored and operated
>     >> on. Once the files are in a relatively static state, they would
>     >> be moved to tier 2 storage. It's this tier 2 storage that would
>     >> need WAN replication. In this way we would somewhat limit the
>     >> amount of data that has to be replicated, since the pipeline
>     >> files are in an almost constant state of flux.
>     >>
>     >> The idea is to have an array on site (let's call it building
>     >> 1), one next door in another building (building 2), and one a
>     >> few hundred miles away. The local arrays will be connected via
>     >> InfiniBand or fiber, and the remote one via a 10MB link.
>     >>
>     >> It seems logical that I would want to replicate from building
>     >> 1 to building 2 and then have another AFR configuration at
>     >> building 2 to replicate to the remote site. Does this fit best
>     >> practices for GlusterFS, or should I use more of a hub-and-spoke
>     >> method?
>     >
>     > I'm not sure I'm conceptualizing your setup quite right. If
>     > this is what you mean: for the live, active files that are being
>     > worked on, you AFR building 1 to building 2; let's call this
>     > /active. So the /active filesystem would be an AFR fs using
>     > servers in buildings 1 and 2, and your 10MB InfiniBand should be
>     > more than sufficient.
>     > For the completed files, to have them replicated you move them
>     > to /backup, which is a volume AFR'ed from building 2 to offsite
>     > building C. That link is probably below LAN speed? Hopefully
>     > faster than a T1, but since you're moving smaller amounts of
>     > data it should be fine.
>     >
>     I think you conceptualized it quite well. /active would only be
>     mirrored from building 1 to building 2 over a 20GB/sec InfiniBand
>     link, while /backup would be replicated from building 1 to
>     building 2 and to the remote site, the remote site being connected
>     via a 10MB/sec fiber link.
>     > Remember AFR is real-time replication, so what you can't have
>     > is this: /active being AFR'ed from building 1 to building 2,
>     > while also having the building 2 server set up to AFR /active to
>     > building C.
>     >
>     > I believe this would cause the files, once they get updated on
>     > building 2, to get AFR'ed to building C. I *think* this would
>     > make the AFR to building 1 as slow as the building C AFR, but
>     > I'm not positive.
>     >
>     That's what I am unsure about as well. I was hoping that a
>     configuration like that (building 1 AFRs to building 2, which then
>     AFRs to the remote site) would run asynchronously, so that
>     replication from building 1 to building 2 would be quick, and the
>     data arriving in building 2 could take its time getting to the
>     remote site without causing problems for building 1's filesystem.
>
>
>     >
>     >> I am interested to hear what people think.
>     >>
>     >> Collin Douglas
>     >> Adfitech, Inc
>     >>
>
> -- 
> Amar Tumballi
> Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!




