[Gluster-users] Replicate Over VPN

Brock Nanson brock at nanson.net
Thu Mar 13 23:50:06 UTC 2014


Yeah... I found Joe Julian's "do's and don'ts" blog post too late; it
pretty much says I shouldn't have started down this road.  But I have
started down the road, so I'd like to make the best of it. (
http://joejulian.name/blog/glusterfs-replication-dos-and-donts/)

So now I'm wondering what I can do to speed things up as much as possible.
 First of all, I'll describe why I'm in this situation.

I need to maintain the same read/write access to all files in two
geographically distant offices.  Essentially, the way Autodesk sets up
project files makes moving them between offices problematic, so GlusterFS
with a fast connection would solve the problem.  But the VPN connection
will be slow for the next year (under 10 Mbit/s, hopefully 100 Mbit/s
eventually, which still isn't ideally fast).  The nature of the file use
is such that it would be surprising if a file were accessed from both
offices at the same time.  99% of the time I could disconnect the bricks
from each other, reconnect at the end of the day, and the self-heal would
do fine after hours with no split-brain problems.  Except for that 1%, I
could even rsync...
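
For context, my test setup is just a two-brick replica volume, one brick
per office, created along these lines (the hostnames and paths below are
placeholders for my actual servers):

  # Two-brick replica, one brick in each office
  gluster volume create projects replica 2 \
      office-a:/export/brick1/projects \
      office-b:/export/brick1/projects
  gluster volume start projects

  # Each office mounts through its local server
  mount -t glusterfs office-a:/projects /mnt/projects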

In testing (Samba on top of GlusterFS 3.4.2, Ubuntu 12.04 LTS), I'm
seeing two issues: 1) browsing folders is slow; and 2) file writes
complete at approximately the speed of the VPN connection.

Before I can attempt to improve the situation (if that's even possible),
I'd like to know if my understandings are correct!

1) My reading suggests that every brick is consulted for a file listing
when a folder is browsed by the client, which would make the latency of
the link the bottleneck.  Does this actually happen?  Is there a way to
prevent it?  If bricks are supposedly exact replicas of each other, why
get the file listings from all the other bricks in the volume instead of
trusting the local one?  If discrepancies were actually found, wouldn't
that suggest a bigger problem with the replication?
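
The only knobs I've turned up so far are the FUSE metadata caching
timeouts and stat-prefetch; as far as I can tell these only cache results
rather than stopping the cross-link lookups, but for reference this is
what I've been experimenting with (the timeout values are guesses, not
tuned):

  # Longer FUSE metadata caching on the client mount
  mount -t glusterfs -o attribute-timeout=30,entry-timeout=30 \
      office-a:/projects /mnt/projects

  # Volume-level metadata caching
  gluster volume set projects performance.stat-prefetch on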

2) I've seen it suggested that a write isn't considered complete until it
has completed on all bricks in the volume, and my write speeds would seem
to confirm this.  Is that correct, and is there any way to cache the data
and let it trickle over the link in the background?  I'm thinking of the
performance.write-behind-window-size setting and the like.  It would be
nice if something like DRBD Protocol A could be implemented, where a
write is considered complete as soon as the fast local one is done.  I
realize the potential for data loss if something goes wrong, but in my
case the self-heal would take care of almost every scenario I can
envision.
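
Concretely, the write-behind settings I was planning to experiment with
look like the following (option names from 'gluster volume set help'; the
window size is a guess, and my understanding is that write-behind still
won't hide the synchronous replicated write across the WAN):

  # Client-side write buffering; flush-behind also lets close() return
  # before the buffered data has been flushed
  gluster volume set projects performance.write-behind on
  gluster volume set projects performance.write-behind-window-size 4MB
  gluster volume set projects performance.flush-behind on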

Geo-replication would seem to be the ideal solution, except that it
apparently only works in one direction (although I understand there were
hopes of making it bidirectional in 3.4.0).
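
If it did what I need, starting it would be simple enough; as I read the
3.4 admin guide it's something like this (the slave path is hypothetical,
and I haven't actually run it):

  # One-way geo-replication from the office A volume to a
  # directory on the office B server
  gluster volume geo-replication projects office-b:/data/projects-backup start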

So are there any configuration tricks (write-behind, compression, etc.)
that might help me out?  Is there a way to fool geo-replication into
working in both directions, given that my application doesn't see heavy
concurrent read/write activity and some reasonable amount of risk is
acceptable?
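
On the compression front, the only thing I've found is the new CDC
translator in 3.4, which I gather is experimental and enabled something
like this (untested on my end):

  # Experimental on-the-wire compression (CDC translator)
  gluster volume set projects network.compression on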