[Gluster-devel] directory lock durning file self-healing

Gordan Bobic gordan at bobich.net
Tue Aug 4 13:42:18 UTC 2009


On Tue, 4 Aug 2009 09:23:31 -0400 (EDT), Brent A Nelson <brent at phys.ufl.edu> wrote:
> On Tue, 4 Aug 2009, Vikas Gorur wrote:
> 
>>
>> Virtualization environments are one of our major focus areas right now.
>> We have contributed to the oVirt project to add GlusterFS support
>>
>> (http://git.et.redhat.com/?p=ovirt-server.git;a=shortlog;h=refs/heads/next).
>> We hope you'll use GlusterFS too!
> 
> Isn't there also an issue where, if a file is open on a replicated
> GlusterFS, the process accessing the file will get an error if the node
> it had the file open on goes down (even though there's a replica
> available)? Or is there already a way around that, or one in the works?

I think one of the servers going away is transparent in AFR (well, apart
from the transport time-out, during which operations will block).
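
To be clear about what I mean: the blocking comes from the protocol/client
time-out, and the failover itself from cluster/replicate. A rough sketch of
the relevant client volfile bits (server names, brick names and the time-out
value here are purely illustrative, not from any real setup):

  volume remote1
    type protocol/client
    option transport-type tcp
    option remote-host server1
    option remote-subvolume brick
    # how long operations block before the server is given up on
    # (illustrative value)
    option transport-timeout 10
  end-volume

  volume remote2
    type protocol/client
    option transport-type tcp
    option remote-host server2
    option remote-subvolume brick
    option transport-timeout 10
  end-volume

  volume replicate
    type cluster/replicate
    subvolumes remote1 remote2
  end-volume

With something along those lines, losing server1 should just mean operations
block for up to the time-out and then carry on against server2.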

> If this is an issue, then a virtual machine open from an image on a
> replicated GlusterFS would think it lost its filesystem and freak out,
> and you'd have to kill and restart the virtual machine.

A bigger problem would likely be that if one server has a VM running, it
will have the image open - which means that when another server comes back
online, the image won't be able to self-heal, because healing doesn't work
on files that are open. That can leave the data non-redundant at exactly the
point when you expect it to be redundant again.

Worse, have a look at this:
http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170

If you optimize your setup with read-subvolume so that you always read
locally, you won't even be able to open() a file that is open remotely but
out of sync locally. Or you might get a stale local version of the contents
(which is arguably worse than an outright failure).
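
By read-subvolume I mean the cluster/replicate option that pins reads to one
child, roughly like this (a sketch only - the subvolume names are made up):

  volume replicate
    type cluster/replicate
    # always satisfy reads from the local child (illustrative name)
    option read-subvolume local-brick
    subvolumes local-brick remote-brick
  end-volume

The bug above is what you hit when local-brick's copy is stale but is still
the one your reads are being served from.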

Gordan
