[Gluster-users] gluster ha/replication/disaster recover(dr translator) wish list

Mon Jan 26 20:51:05 UTC 2009

At 10:36 AM 1/26/2009, Prabhu Ramachandran wrote:
> > Here, if the network connection fails and is back up in short periods of
> > time you'll always be experiencing delays as gluster is often waiting for
> > timeouts, then the server is visible again, it auto-heals, then it's not
> > visible and it has to timeout.
> > It'll likely work just fine, but this will seem pretty slow (but no
> > moreso than an NFS mount behind a faulty connection I suppose).
>
>Are you saying I can't use the local files on machine A (even if I am
>not touching any files on B) when the network is down even though all I
>am doing is reading and perhaps writing locally on A?  That could be a
>bit of a problem in my case since I usually keep any local builds on the
>partitions I'd like to share.  There could be other problems when one
>machine is down for maintenance for example (or the disk crashes).

I'm not saying that.  You can use the files on the local machine, and
it will work; it'll just have to wait for network timeouts to decide
that machine B is down before it continues.

I think I'm a bit confused about what you're trying to do.
If you have A and B grouped as an HA brick, then writing to a local
file gets replicated to the remote machine and vice versa.  So for
all intents and purposes these should be thought of as the same filesystem.

In this case, if machine B is down for maintenance, you can update
files on machine A just fine.  Then, when machine B is back online,
the files will start to auto-heal.

The problem would be if the network between A and B is down.  If you
update files on A (say, run a build) and simultaneously update files
on B (run a build there too), then when the network comes back up the
same files might have changed on both A and B, and this would cause a
split-brain.
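
To make the two cases concrete, here is a toy sketch in Python.  It is
not how gluster implements replication or auto-heal; the Replica
class, the pending counter and the write/heal functions are all names
I made up for illustration.  The only assumption is that each side
remembers whether it accepted writes while the other was unreachable.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    content: str = ""
    pending: int = 0  # writes accepted while the peer was unreachable


def write(local, peer, data, network_up):
    """Write on one replica; mirror to the peer only if it is reachable."""
    local.content = data
    if network_up:
        peer.content = data
    else:
        local.pending += 1


def heal(a, b):
    """Reconcile two replicas once the network is back."""
    if a.pending and b.pending:
        return "split-brain"  # both changed independently: needs a human
    if a.pending:
        b.content, a.pending = a.content, 0
        return "healed from " + a.name
    if b.pending:
        a.content, b.pending = b.content, 0
        return "healed from " + b.name
    return "in sync"


# Only A is written during the outage: B is simply brought up to date.
a, b = Replica("A", "v1"), Replica("B", "v1")
write(a, b, "build on A", network_up=False)
print(heal(a, b))  # healed from A

# Both sides are written during the outage: there is no safe winner.
a, b = Replica("A", "v1"), Replica("B", "v1")
write(a, b, "build on A", network_up=False)
write(b, a, "build on B", network_up=False)
print(heal(a, b))  # split-brain
```

The point of the sketch is only that healing is unambiguous as long as
one side stayed untouched during the outage.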

But if you only ever update on A, then it doesn't matter.  When the
network is down, B will only have local access to the versions of the
files as of the last time the two were in sync, but it will still
serve them.

> > Things can be further complicated if you have some clients that can see
> > SERVER 1 and other clients that only see SERVER 2.  If this happens,
> > then you will increase the likelihood of a split brain situation and
> > things will go wrong when it tries to auto-heal (most likely requiring
> > manual intervention to get back to a stable state).
>
>Ahh, but in my case I don't have that problem there are only two
>machines (currently at any rate).
>
> > so the replication features of gluster/HA will most likely solve your
> > problem.
> > if you have specific concerns, post a volume config to the group so
> > people can advise you on a specific configuration.
>
>Well, I posted my configuration a while ago here:
>
>http://gluster.org/pipermail/gluster-users/20081217/000899.html
>
>The attachment is scrubbed which makes it a pain to get from there.  I
>enclose the relevant parts below.  Many thanks.

I don't see any problems with your config, other than that if your
network connection is very sporadic, you'll often be caught waiting
for timeouts, which will make things seem slower.

What you seem to want is for gluster to serve local files instantly,
but it can't, because in HA mode it needs to know either that the
replica is down or that it has the most current version.  If your
network is spotty, it will constantly be waiting to decide that the
network is down before continuing.  That's likely the concern that
was raised earlier.

But if it's more likely that the network will be down for a while and
then up for a while, it's not a big deal and you should be just fine.
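
A rough way to see the difference, as a Python sketch: assume (my
simplification, not measured gluster behaviour) that the client pays
one full timeout each time the link goes from up to down, and nothing
extra while it stays down.  The 10-second timeout and the
stall_seconds function are hypothetical, chosen only to show the
effect.

```python
def stall_seconds(link_states, timeout_s):
    """Total time spent waiting on timeouts.

    link_states holds one boolean per time step, True meaning link up.
    One timeout is charged per up-to-down transition.
    """
    stalls = 0
    previous_up = True
    for up in link_states:
        if previous_up and not up:
            stalls += 1
        previous_up = up
    return stalls * timeout_s


TIMEOUT_S = 10  # hypothetical value, not a gluster default

flapping = [True, False] * 6             # link drops six separate times
long_outage = [True] * 6 + [False] * 6   # link drops once, same downtime

print(stall_seconds(flapping, TIMEOUT_S))     # 60
print(stall_seconds(long_outage, TIMEOUT_S))  # 10
```

Both links are down for the same number of steps, but under this
assumption the flapping one costs six timeouts and the long outage
costs one.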