[Gluster-devel] solutions for split brain situation
Gordan Bobic
gordan at bobich.net
Thu Sep 17 00:33:52 UTC 2009
Mark Mielke wrote:
> In case it is of any use to others, here is the list I had worked out
> before when doing my analysis:
[...]
Since the issue of alternatives has been raised - you missed two file
systems in your summary:
1) SeznamFS. No POSIX locking, and concurrent writes are fraught with
race conditions (by design), but it is quite useful for some use-cases,
and it is stable. FUSE-based. The file system is replicated using the
same paradigm as MySQL's replication (serialized write stream logs).
2) PeerFS. POSIX, commercial, relatively expensive, replicated,
block-level, native kernel driver. The thing that killed it for me is
that there is no way to resize the file system without doing a full
dump/restore of the data - which is prohibitive with multi-TB data
stores replicated over a WAN.
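To make the "serialized write stream log" paradigm from (1) a bit more
concrete, here is a minimal Python sketch. It is not SeznamFS code and
does not reflect its on-disk or wire format; the names WriteOp, Master,
Replica and apply_op are all made up for illustration. The only idea it
shows is that every write gets a sequence number in a single log, and a
replica replays that log in order from the last position it applied.

```python
from dataclasses import dataclass, field


@dataclass
class WriteOp:
    """One entry in the serialized write log (hypothetical layout)."""
    seq: int
    path: str
    offset: int
    data: bytes


def apply_op(files, op):
    """Apply a single write to an in-memory 'file system' (a dict)."""
    buf = files.setdefault(op.path, bytearray())
    if len(buf) < op.offset:
        # Pad with zero bytes if the write starts past the current end.
        buf.extend(b"\0" * (op.offset - len(buf)))
    buf[op.offset:op.offset + len(op.data)] = op.data


@dataclass
class Master:
    log: list = field(default_factory=list)
    files: dict = field(default_factory=dict)

    def write(self, path, offset, data):
        # Sequence numbers start at 1 and define the replay order.
        op = WriteOp(len(self.log) + 1, path, offset, data)
        apply_op(self.files, op)
        self.log.append(op)
        return op


@dataclass
class Replica:
    applied: int = 0  # sequence number of the last op replayed
    files: dict = field(default_factory=dict)

    def catch_up(self, log):
        # Replay only the ops this replica has not seen yet, in log order.
        for op in log[self.applied:]:
            apply_op(self.files, op)
            self.applied = op.seq


if __name__ == "__main__":
    master = Master()
    master.write("a.txt", 0, b"hello")
    master.write("a.txt", 3, b"p!")
    replica = Replica()
    replica.catch_up(master.log)
    print(replica.files == master.files)
```

Because the result depends only on the order of entries in the log, a
replica that replays the same log ends up with the same content. The
sketch says nothing about what happens when several nodes write at once.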
As for Coda - you say there is no further development being done on it,
but that is because it is complete and stable. I _almost_ ended up
using it, but there were a few things that finally pushed me over toward
a hybrid SeznamFS/GlusterFS solution.
1) No POSIX locking and no user/group based permissions.
2) Cannot sensibly be used as a home directory because it has to be
mounted by the user after logging in, due to its externally bolted-on
security system (in its defence, it is a _global_ file system by
design, so POSIX doesn't really fit with its security paradigm).
3) The metadata is limited to something like 1MB/directory. This
includes all file names the directory contains, so it is unsuitable for
Maildirs or large source code directories.
4) The files are kept as files with the same content, but not under the
same names (file names in Coda's backing store are just numbers),
whereas SeznamFS and GlusterFS conveniently keep the original names.
This makes data recovery more difficult when things go wrong, compared
to SeznamFS and GlusterFS.
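As a rough back-of-envelope check on point 3, the Python sketch below
estimates how many entries fit in a directory under a fixed metadata
budget. Everything numeric here is an assumption: the 1 MiB budget is
only my "something like 1MB" figure, and the average name length and
per-entry overhead are hypothetical placeholders, not values taken from
Coda.

```python
def max_entries(budget_bytes, avg_name_len, per_entry_overhead):
    """Upper bound on directory entries that fit in the metadata budget."""
    return budget_bytes // (avg_name_len + per_entry_overhead)


if __name__ == "__main__":
    budget = 1024 * 1024   # assumed ~1MB of metadata per directory
    name_len = 70          # assumed average Maildir-style file name length
    overhead = 30          # hypothetical per-entry bookkeeping, in bytes
    print(max_entries(budget, name_len, overhead))
```

With long file names the bound drops quickly, which is why a busy
Maildir or a large source tree is a poor fit for such a limit.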
Gordan