[Gluster-devel] ping timeout
Gordan Bobic
gordan at bobich.net
Tue Mar 23 20:05:00 UTC 2010
On 03/23/2010 07:23 PM, Ed W wrote:
> On 18/03/2010 16:59, Christopher Hawkins wrote:
>> I see what you mean. Hopefully that behavior is fixed in 3.0. Though
>> in my case, I would still like fast disconnect because the data mirror
>> is active / passive. There should be no problems for glusterfs to
>> figure out which side has the new data because only one server will be
>> receiving writes at any given time.
>
> I'm not an active Glusterfs user yet, but what worries me about gluster
> is this very casual attitude to split brain... Other cluster solutions
> take outages extremely seriously to the point they fence off the downed
> server until it's guaranteed back into a synchronised state...
>
> The issue is that once the servers diverge you are just asking for some
> circumstance which will cause the older file to be served (causing data
> loss). Simple scenario is that one server goes down, files update on the
> second server, then first server comes back up and second server goes
> down, result is out of date files being served...
>
> Once a machine has gone down then it should be fenced off and not be
> allowed to serve files again until it's fully synced - otherwise you are
> just asking for a set of circumstances (however unlikely) to cause the
> out of date data to be served...
>
> A superb solution would be for the replication tracker to actually log
> and mark dirty anything it can't fully replicate. When the replication
> partner comes back up these could then be treated as a priority sync
> list to get the servers back up to date?
Unfortunately, you CANNOT guarantee correctness or the absence of data
loss once split-brain has happened. The only way to get those guarantees
is to prevent split-brain in the first place, by fencing.
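
To make the scenario quoted above concrete, here is a minimal Python sketch of a two-node mirror. It is only an illustration under stated assumptions: the Replica and Mirror classes, the fenced flag and run_scenario() are all hypothetical names, not anything glusterfs provides, and the "fencing" here is just a flag that stops a node serving until it is resynced.

```python
class Replica:
    """One side of a hypothetical two-node mirror."""
    def __init__(self, name):
        self.name = name
        self.up = True
        self.fenced = False  # assumption: set when the node has missed a write
        self.data = {}


class Mirror:
    def __init__(self, a, b, fencing):
        self.nodes = [a, b]
        self.fencing = fencing

    def write(self, path, value):
        for node in self.nodes:
            if node.up and not node.fenced:
                node.data[path] = value
            elif self.fencing:
                # The node missed this write, so it must not serve
                # anything again until it has been resynced.
                node.fenced = True

    def read(self, path):
        for node in self.nodes:
            if node.up and not node.fenced:
                return node.data.get(path)
        raise IOError("no in-sync replica available")


def run_scenario(fencing):
    a, b = Replica("a"), Replica("b")
    mirror = Mirror(a, b, fencing)
    mirror.write("file", "v1")   # both nodes hold v1
    a.up = False                 # first server goes down
    mirror.write("file", "v2")   # update lands on the second server only
    a.up = True                  # first server comes back...
    b.up = False                 # ...and the second server goes down
    try:
        return mirror.read("file")
    except IOError:
        return "refused"


print("without fencing:", run_scenario(fencing=False))
print("with fencing:   ", run_scenario(fencing=True))
```

Without fencing the read is answered with the stale copy; with fencing the read is refused, which trades availability for correctness. That trade is the whole point of fencing.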
You need to look at your requirements. If you need bulletproof
guarantees of correctness and no data loss, you will have to use a
proper clustered file system (e.g. GFS (1,2), OCFS (1,2), or Veritas,
maybe even PeerFS). If your use case is such that you can get away with
cutting the odd corner for slightly reduced complexity, then glfs is
good enough, bizarre as it may be to consider that the potential for
data loss can be deemed "good enough" for a file system.
Gordan