[Gluster-devel] solutions for split brain situation

Michael Cassaniti m.cassaniti at gmail.com
Tue Sep 15 23:45:43 UTC 2009


2009/9/16 Anand Avati <avati at gluster.com>

> > No, not really. In fact every other comment about glusterfs(d) reads
> > like "this is a standard application regarding the fs, therefore it
> > cannot be responsible for problem A or bug B". Now, if it is to be
> > judged as one of many applications on the one hand, then it should be
> > able to cope with situations that every standard application can cope
> > with as well - other applications using the same fs.
>
> glusterfs is a standard application regarding the fs; therefore, it
> cannot be responsible for problems showing up in the kernel. glusterfs
> is not expected to work properly if you modify the backend export
> directory directly, bypassing the mountpoint. This is the baseline
> premise for using glusterfs.
>
> > _The_ advantage of the whole glusterfs concept is exactly that it is
> > _no_ fs with its own special disk layout. It (should) run(s) on top
> > of an existing fs that can be used just like any fs may be used -
> > including backup (with rsync or whatever), restore and file
> > operations of any kind.
>
> glusterfs uses a disk-based filesystem as its backend. This in no way
> implies that it can share the backend with other applications and work
> without problems. glusterfs needs _exclusive_ access to this export
> directory. That is how it is designed to work. If you back up one
> backend, you can restore it only as that very backend. What you are
> trying to do is use one backend as another subvolume's backend. If you
> expect copying over the backend while skipping the xattrs, and then
> modifying those very files from the mountpoint, to just work, then the
> expectation is improperly set. Please copy in all your content only
> from the mountpoint.
>
> > If subvolumes are indeed closed storage then they would be in no way
> > different from nbd, enbd, whatever-nbd. For various reasons we don't
> > want these solutions.
>
> GlusterFS is surely not a solution where you can freely modify the
> backend directly. For proper operation of the filesystem, the only
> supported mode of usage is through the mountpoint. Any modification
> you make to the backend directly is done at your own risk.
>
> Avati

Stephan,
As Avati has tried to make clear, GlusterFS with the cluster/replicate
translator relies very heavily on the backend filesystem's support for
extended attributes. These extended attributes are what GlusterFS uses
to know whether a file is more up to date on one brick than on another
when performing a self-heal.
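
For illustration, here is a rough, untested sketch of how you could
inspect those attributes on a backend file. The attribute name
(trusted.afr.<subvolume>) and the three-counter layout are my
assumptions from reading the 2.x sources; it needs Python >= 3.3 on
Linux and root, since the trusted.* namespace is not readable by
ordinary users:

import os
import struct

path = "/export/brick/some/file"  # hypothetical backend path

for name in os.listxattr(path):
    if not name.startswith("trusted.afr."):
        continue
    raw = os.getxattr(path, name)
    # Assumption: three big-endian 32-bit pending counters per replica
    # subvolume -- data, metadata and entry changes not yet confirmed
    # on that subvolume.
    data, meta, entry = struct.unpack(">III", raw)
    print(name, "data:", data, "metadata:", meta, "entry:", entry)

Roughly speaking, non-zero counters mean the named subvolume has
changes pending against it, which is what self-heal acts on.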

I feel that you have not read the Understanding AFR Translator page
(http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator).
This page should explain exactly how GlusterFS performs self-heal, and
should help you understand its use of extended attributes. There used
to be a way to initialise a brick for use with GlusterFS by manually
setting the appropriate extended attributes. I do not know if this is
still supported.
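
If you want to experiment anyway, what it amounted to (again an
untested sketch; the subvolume names are made up and must match your
volume spec) was stamping zeroed pending counters onto every entry in
the brick, as root, on a volume that is not in use:

import os
import struct

ZERO = struct.pack(">III", 0, 0, 0)  # no pending data/metadata/entry changes
SUBVOLUMES = ("remote1", "remote2")  # hypothetical; match your volume spec

for dirpath, dirnames, filenames in os.walk("/export/brick"):
    for entry in dirnames + filenames:
        full = os.path.join(dirpath, entry)
        for subvol in SUBVOLUMES:
            # trusted.* xattrs can only be written as root
            os.setxattr(full, "trusted.afr." + subvol, ZERO)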

You are welcome to read the source code for a more in-depth
understanding of the particular extended attributes used by GlusterFS
for the replicate translator.

Don't try bypassing the mountpoint to perform file operations,
_period_. You can always have a replicate mountpoint configured on the
server (i.e. a client for replicate) alongside the server side, and NFS
should run on top of this replicate mountpoint. This (poor) graphic may
help; note that everything is running on the same machine:

|      NFS       |
------------------
|GlusterFS Client|
------------------
|GlusterFS Server|
------------------
| POSIX Storage  |
------------------
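
A minimal pair of volume specs for that stack might look like the
following. The volume names, addresses and the second brick are made
up, and the exact option names may differ between GlusterFS versions,
so check them against your own spec files:

# server side: export the backend directory -- never touch it directly
volume posix
  type storage/posix
  option directory /export/brick
end-volume

volume server
  type protocol/server
  option transport-type tcp
  subvolumes posix
  option auth.addr.posix.allow *
end-volume

# client side, on the same machine: replicate across this brick and a
# second (hypothetical) one, then mount the result
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host 127.0.0.1
  option remote-subvolume posix
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.2   # hypothetical second server
  option remote-subvolume posix
end-volume

volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume

NFS then exports the directory where this client spec is mounted, never
/export/brick itself.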

Regards,
Michael Cassaniti