[Gluster-users] Regarding Replicated Volume

Ted Miller tmiller at hcjb.org
Thu Mar 20 21:28:03 UTC 2014


On 3/19/2014 3:06 PM, Cary Tsai wrote:
> Hi There:
> New to GlusterFS and could not find the answer in the documentation,
> so I hope I can get the answer from the mailing list.
>
> Let's say we have two web servers:
> One in Seattle, WA and another one is in Chapel Hill, NC.
> So I create a 'replicated' volume with one brick in WA and another brick
> in NC.
> I assume the web server in both WA and NC can mount the 'replicated' volume.
> There are 2 HTTP/Get calls from CA and NY.
> We assume CA's HTTP/Get is sent to web server in WA and
> NY's HTTP/Get is sent to web server in NC.
>
> My question is: does the web server in WA definitely get the data
> from the brick in WA? If not, is there any way to configure it so the
> web server in WA definitely gets data from the brick in WA?
The answer to your basic question is either "yes" or "they're working on
it".  I know a while back it was on the "to do" list, but I am not sure
whether the patch is done or, if so, whether it has made it into production
code.  But, the last I heard, yes, we were heading in that direction.
Contrary to what someone else said, no, your scenario is not the only one
where this is desirable.  In almost any active situation using mostly-read
files, reading from your local replicated disk is much faster, and also
reduces network activity.

The big make-or-break questions as to whether this will work for you are:
* How much do you write? (more=more problem)
* Are these files sort-of/almost WORM files (Write Once, Read Many)?  WORM is
better, RAM is worse.
* Do both servers write?  (Only one is better)
* Do you modify files?  (More modification = more headaches)
* Do you replace/update files?  (Yes = more grief)

The critical issue is timing.  Gluster has various operations where it has to 
communicate with all nodes, and the process cannot move forward until all 
nodes answer.  Gluster is designed for all nodes to be connected by gigabit
(1 Gb/s) or faster networking, so your cross-continental link is outside the
use case the developers are targeting.  This always applies to writes, i.e. when a write
occurs, it has to finish on both servers, probably with several commands 
issued, and each time it cannot go on to the next step until the distant 
server finishes.  There are certain read operations where gluster checks to 
make sure that things match between all servers. I hear reference to the 
"stat" call as being one that can be slow, but I can't say I fully understand 
what it does.  I think I understand that an 'ls' command does not include the 
'stat' call, but the 'ls -l' does include the 'stat' call, so a 'ls -l' 
command on a directory with hundreds or thousands of files can take MUCH 
longer than an 'ls' call to that same directory.
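
To make the 'ls' versus 'ls -l' difference concrete, here is a minimal Python
sketch of the two access patterns.  It only assumes what is described above:
a plain listing reads directory entries, while a long listing also asks for
metadata on every file.  The function names are made up for illustration, and
on a local disk the two timings will be close; the gap would only show up on
a mount where each metadata lookup is expensive.

```python
import os
import tempfile
import time


def names_only(path):
    """Like plain 'ls': read directory entries, no per-file metadata."""
    return sorted(os.listdir(path))


def names_with_stat(path):
    """Like 'ls -l': one lstat() call per entry to get size, mtime, etc."""
    result = []
    for name in sorted(os.listdir(path)):
        st = os.lstat(os.path.join(path, name))
        result.append((name, st.st_size, st.st_mtime))
    return result


def timed(func, path):
    """Return (elapsed_seconds, result) for one call of func(path)."""
    start = time.perf_counter()
    result = func(path)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    # Hypothetical demo directory; point this at a gluster mount to compare.
    with tempfile.TemporaryDirectory() as demo_dir:
        for i in range(1000):
            with open(os.path.join(demo_dir, "file%04d.txt" % i), "w") as f:
                f.write("x")
        t_ls, names = timed(names_only, demo_dir)
        t_ll, rows = timed(names_with_stat, demo_dir)
        print("names only: %d entries in %.4f s" % (len(names), t_ls))
        print("with lstat: %d entries in %.4f s" % (len(rows), t_ll))
```

The second function makes one extra call per file, so its cost grows with the
number of files in the directory.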

IF your web site is doing read-only access to your file system, and it is not 
triggering any calls that make gluster do a difference check between your two 
servers, it might work.
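
The timing concern for writes can be put into rough numbers.  The sketch
below is only a back-of-envelope model, not a description of what gluster
actually does internally: it assumes each write consists of some number of
blocking steps, and that each step costs one full round trip to the distant
server.  The round-trip times and the step count are invented for
illustration.

```python
def estimated_write_seconds(round_trips, rtt_ms):
    """Time one write spends waiting if each step needs a full round trip
    to the slowest replica before the next step can start."""
    return round_trips * rtt_ms / 1000.0


if __name__ == "__main__":
    # All numbers below are assumptions for illustration, not measurements.
    lan_rtt_ms = 0.5      # assumed round trip on a local gigabit LAN
    wan_rtt_ms = 70.0     # assumed round trip between WA and NC
    steps_per_write = 5   # assumed number of blocking steps per write

    for label, rtt in (("LAN", lan_rtt_ms), ("WAN", wan_rtt_ms)):
        secs = estimated_write_seconds(steps_per_write, rtt)
        print("%s: about %.4f s per write" % (label, secs))
```

Measuring the real round-trip time between your two sites (with ping, for
example) and plugging it in gives a first idea of whether your write load is
tolerable.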

If
1. You do not require absolute real-time synchronization between the servers
AND
2. You can do all the writes on one of the two servers
then
you should probably look at Geo-replication.  Geo-replication is a one-way 
process, where all the changes happen on one end, and they are reflected on 
the other end.  It is designed to handle slower network links, and allows you 
to keep the two sites in close-to real-time synchronization.  How close to 
real time will depend on your server write load, and you would have to 
describe what you are doing and let some of the folks here give you their 
experience in similar situations.  At least you are within the intended 
use-case, so the developers will be receptive to any problems you have, and 
they may get fixed.

Another caution (based on painfully learned experience).  If you decide to 
try a regular (not Geo-Replicated) system, I advise that you store your data 
on a third machine somewhere, ESPECIALLY if both machines are updating files 
at the same time.  Otherwise, it seems that it is only a matter of time 
before you will be struggling with a split-brain situation.  When you face 
your first split-brain, you will wish you had never run into one.

Ted Miller
Elkhart, IN, USA