[Gluster-devel] solutions for split brain situation

Wed Sep 16 15:04:18 UTC 2009

> Let's make a trivial setup, lots of data for webservers and some ftp servers
> for feeding in and deleting old. The first thing in sight: compared to the
> reads there are very few writes, mostly sequential logfiles. And another
> thing: most of the data does not get read nor written the whole day long.
> This is a pretty common example I would say. Since really very few changes are
> going on compared to the total amount of stored data you may call the
> situation pseudo-static.
> What would you expect in that setup? Let's say the bad boys (ftp servers) are
> local feeds and not going over glusterfs for some unknown reason.
> What do they really do to the data? They delete (the data is gone afterwards,
> so there is no problem at all), they write new files. It should be very simple
> for glusterfs to detect a local fed new file, because it has no xattribs at
> all (assuming every glusterfs-fed file has some (*)). So basically all you
> have to do is try to write-lock the file on the backend store, create its
> xattribs default, unlock and do a stat for self-healing the other subvolumes -
> let's call such a thing "import".
> Does that really sound unsolvable? (For simplicity we assume such local feeds
> only on the first subvolume, and the cluster being replicate)

1) There is a race condition in what you describe. Since you mentioned 30 years in development, I assume you know what that means. Consider this:
You are "locally feeding" file "x" on server1. During this, the same file gets created via the mountpoint on server2. What would you expect to happen in a filesystem that aims for full POSIX compliance on atomic operations?

2) The example you give doesn't, in any way, provide justification for not copying the file in via the mountpoint in the first place.
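To make the race in point 1 concrete, here is a toy sketch. It is not GlusterFS code and does not model how replicate dispatches operations; the two temporary directories merely stand in for the backing stores on server1 and server2, and the helper names are made up for illustration. It shows that an exclusive create is only atomic within a single backend, so a local feed and a mountpoint-side create can both "win":

```python
import os
import tempfile


def create_exclusive(backend_dir, name, data):
    """Atomically create `name` inside ONE backend directory.

    Returns True if this caller created the file, False if it already existed.
    """
    path = os.path.join(backend_dir, name)
    try:
        # O_EXCL makes the create fail if the file exists, but only
        # within this directory on this backend.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True


def simulate_race():
    """Hypothetical stand-in for two backends that have not yet synchronised."""
    with tempfile.TemporaryDirectory() as server1, \
            tempfile.TemporaryDirectory() as server2:
        # Local feed writes straight into server1's backing store.
        local_won = create_exclusive(server1, "x", b"local feed")
        # Meanwhile "x" is created on server2, which knows nothing of the above.
        mount_won = create_exclusive(server2, "x", b"mountpoint")
        with open(os.path.join(server1, "x"), "rb") as a, \
                open(os.path.join(server2, "x"), "rb") as b:
            diverged = a.read() != b.read()
    return local_won, mount_won, diverged


print(simulate_race())  # (True, True, True): both creates succeed, contents differ
```

Neither side sees an error, yet the two copies of "x" disagree, and nothing recorded at creation time says which one should win.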

> (*) IF not every glusterfs file has xattribs then "import" is even simpler and
> can be done by just stat'ing. This case sounds pretty automagically happening
> on first touching of the new file over glusterfs mountpoint.

Not quite - you are forgetting the directory metadata, which is necessary to keep track of created/deleted files.

> Another story: the backup 
> I am pretty astonished that you all talk about backing up the xattribs. But
> according to your own clean philosophy there should be no problem for backups
> without xattribs as long as they are read in from the glusterfs mountpoint.

Yes, so far nothing astonishing there - you need either the snapshot of backing store incl. xattrs, or the mount point sourced data.

> Since other applications do not honor the xattribs either that can only mean
> that a backup must be a complete snapshot without them.

No more than in any other setting, if you are reading from the mount point. From the backend incl. xattrs, it should be a snapshot to ensure consistent metadata state.

> Backup with xattribs in this sense can only be useful at all if read local
> from the backend store to be able to recover that backend later on - including
> the information hidden in the xattribs. But since you would not want to deal
> with local data at all this should be no backup method at all.

You are extrapolating, and incorrectly. This scenario (backup of a snapshot including xattrs) would work fine. It is equivalent to a server rejoining the cluster after an outage.

> Even from my bad boy position I would not backup xattribs via local feed.  The
> reason for me lies in restore. If I local-restore a file without xattribs
> I give glusterfs a realistic chance to notice that this is a local fed file and
> should probably be handled like discussed above ("import").

You missed the point somewhere. If you are backing up a snapshot of the backing store, you SHOULD back up and restore the xattrs. The important thing is for data and metadata to be in a consistent state. As long as that is the case, the files will self-heal correctly when the restored server rejoins the cluster.
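As a rough illustration of keeping data and metadata together, here is a minimal Python sketch (Linux only). The `capture` and `restore` helpers are hypothetical, and the default `trusted.` prefix is an assumption about where the relevant attributes live; adjust it to whatever your backend actually uses. Reading `trusted.*` attributes normally requires root. The sketch is only meaningful when nothing modifies the file between the two reads, which is exactly why it should be run against a snapshot or a quiesced backend:

```python
import os


def capture(path, prefix="trusted."):
    """Read a file's bytes and its extended attributes under `prefix` together."""
    with open(path, "rb") as f:
        data = f.read()
    # Assumption: the metadata we care about lives under `prefix`.
    xattrs = {
        name: os.getxattr(path, name)
        for name in os.listxattr(path)
        if name.startswith(prefix)
    }
    return {"data": data, "xattrs": xattrs}


def restore(path, entry):
    """Write the bytes back, then reapply the saved extended attributes."""
    with open(path, "wb") as f:
        f.write(entry["data"])
    for name, value in entry["xattrs"].items():
        os.setxattr(path, name, value)
```

In practice a backup tool with xattr support does the same job; the point is only that the file contents and the attributes are saved and restored as one unit.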

> But if I
> local-restore a file with xattribs it is likely that these contain a currently
> invalid state.

Sure, which is why you have to either snapshot before the backup or, better, stop the server process on the server you are backing up from.

> My guess is that this will harm glusterfs more than not having
> xattribs for the file at all because there is possibly no good way to find out
> the invalid state.

Sounds like the problem is that you are expecting (hoping for?) correct results when following incorrect procedures. If you stick with the approaches I outlined above for the use-cases you mentioned, it will do what you want.

Gordan