[Gluster-devel] Client side AFR race conditions?

Wed May 7 16:40:06 UTC 2008

gordan at bobich.net wrote:
> On Wed, 7 May 2008, Anand Avati wrote:
> 
>>>> The only way I see to ensure data integrity is to have some arbiter vet
>>>> all writes.  You can try to make that arbiter redundant, but good luck
>>>> making it actually distributed.
>>>
>>> I've seen the distributed arbiter done in proprietary software, so it
>>> must be possible.  The design is pretty clear to me, but I have no idea
>>> where to start integrating the idea into glusterfs, though gluster's the
>>> closest thing to what I need that I've seen in open source.
>>
>> Can you give some details/links? We would be interested to learn about 
>> it.
> 
> I suspect what was referred to was a system where the locks are notified 
> to every host, not an actually load sharing system. DLM (RHCS/GFS) does 
> it by multicasting, presumably with acknowledgements being returned from 
> each connected node. I've not looked at the DLM protocol in great 
> detail, so I don't know what the details are.

Actually, I was thinking of WANdisco's Multi-site CVS/SVN/MySQL 
mirroring software.  It's not generalized to the point of being a disk 
load sharing system, exactly, but I think the concept and the problems 
are the same.  They use a quorum locking model and basically journal the 
transaction with whichever server they are wrapping for later replay on 
the other servers.

There used to be a white paper on WANdisco's protocol online (I haven't 
looked recently).  I didn't know much about DLM (and, after reading what 
documentation I could find online just now, I don't feel like I know 
much more), but it sounds like DLM uses a similar quorum model for locking.

As for the versioning (and perhaps this is relevant to the discussion 
taking place in another thread), I don't see how this can be done 
without meta-data journaling, so why not make things even simpler and 
share a unique version number between all entities changed in a 
transaction?  For any server to acquire an implicit write lock, the 
quorum must agree to increment a global transaction ID (which could also 
be stored in the FS as the version number of each directory and/or file 
changed).  As long as a given system knows that its journal/replay is 
up-to-date with the latest transaction ID according to the quorum, it 
can trust a file's content without consulting a file-specific revision 
number.
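
To make the idea concrete, here is a rough sketch of the quorum step in Python.  It is only an illustration of what I described above, not WANdisco's protocol or anything that exists in glusterfs; `Node`, `vote` and `propose_transaction` are names I made up, and a real implementation would need networking, timeouts and persistent state.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One replica (hypothetical). last_txid is the highest ID it has accepted."""
    name: str
    last_txid: int = 0
    reachable: bool = True

    def vote(self, proposed_txid: int) -> bool:
        # An unreachable node cannot vote; a reachable node accepts only
        # a proposal strictly newer than anything it has already seen.
        if not self.reachable or proposed_txid <= self.last_txid:
            return False
        self.last_txid = proposed_txid
        return True


def propose_transaction(nodes):
    """Return the new global transaction ID if a majority agrees, else None.

    Holding the returned ID is the implicit write lock for that transaction.
    """
    highest = max((n.last_txid for n in nodes if n.reachable), default=0)
    proposed = highest + 1
    votes = sum(1 for n in nodes if n.vote(proposed))
    # Majority of ALL nodes, not just the reachable ones, so that two
    # partitions can never both believe they hold the lock.
    return proposed if votes > len(nodes) // 2 else None


if __name__ == "__main__":
    cluster = [Node("a"), Node("b"), Node("c")]
    print(propose_transaction(cluster))
    cluster[2].reachable = False
    print(propose_transaction(cluster))
    cluster[1].reachable = False
    print(propose_transaction(cluster))
```

Note that in this sketch a failed round still advances the ID on the nodes that voted for it.  That should be harmless, since the IDs only need to be unique and increasing, not contiguous.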

If a server is not completely up-to-date, it would at least have 
to synchronize the meta-data journal and consult it to find out whether a 
requested file has any pending writes, and so decide whether it needs to 
synchronize the file before serving it.
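
The check a lagging server would make could look something like the following Python sketch.  The journal layout (a list of `(txid, path)` pairs) and the function names are assumptions of mine for illustration only, and the sketch assumes the meta-data journal itself has already been brought up to the quorum's latest transaction ID.

```python
def files_needing_sync(journal, local_txid):
    """Paths touched by transactions this server has not replayed yet.

    journal: iterable of (txid, path) entries (hypothetical layout).
    local_txid: last transaction ID replayed locally.
    """
    return {path for txid, path in journal if txid > local_txid}


def can_serve(path, journal, local_txid, quorum_txid):
    """True if the local copy of `path` can be trusted as-is."""
    if local_txid >= quorum_txid:
        # Fully caught up: every file is trustworthy without a
        # per-file revision check.
        return True
    # Lagging: the file is safe only if no pending transaction touches it.
    return path not in files_needing_sync(journal, local_txid)


if __name__ == "__main__":
    journal = [(1, "a.txt"), (2, "b.txt"), (3, "a.txt")]
    # Server has replayed up to transaction 1; the quorum is at 3.
    print(can_serve("a.txt", journal, 1, 3))
    print(can_serve("c.txt", journal, 1, 3))
```

The point is that only the (small) journal has to be current before a server can answer requests; file data can be pulled lazily for just the files that have pending writes.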

Regards,

Derek
-- 
Derek R. Price
Solutions Architect
Ximbiot, LLC <http://ximbiot.com>
Get CVS and Subversion Support from Ximbiot!

v: +1 248.835.1260
f: +1 248.246.1176