[Gluster-devel] Improving real world performance by moving files closer to their target workloads

gordan at bobich.net gordan at bobich.net
Fri May 16 09:52:38 UTC 2008


On Fri, 16 May 2008, Luke McGregor wrote:

> I think I'm getting a little closer to understanding this. So the
> metadata (file attributes, AFR settings, etc.) is stored as a header on
> the actual file, and the central namespace cache stores information to
> do with file lookup, the node it's stored on, etc. Does this sound correct?

The look-up of file location is done by the hash. The namespace only 
serves to present a unified view of all the individual merged stores.
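To illustrate what I mean by look-up via hash, here is a minimal sketch. 
It is not the actual GlusterFS hash or scheduler code; the function name 
pick_node, the node names and the use of MD5 are all assumptions made up 
for the example. The point is only that the location follows from the 
path itself, so no central table has to be consulted.

```python
import hashlib


def pick_node(path, nodes):
    """Map a file path to one node by hashing the path.

    Hypothetical scheme for illustration only: hash the path, take the
    first 8 bytes as an integer, and reduce it modulo the node count.
    """
    digest = hashlib.md5(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Every peer that knows the same node list computes the same answer.
nodes = ["node-a", "node-b", "node-c"]
chosen = pick_node("/data/report.txt", nodes)
```

Note that a plain modulo like this remaps most files whenever the node 
list changes, which is one reason a static mapping makes moving files 
towards their workload awkward.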

> If this is the case, is it possible to update this namespace info as
> the file is accessed, or will that be difficult as they are currently
> considered static? I can see this as a potential issue where the local
> and central caches may have consistency issues.

There are no central caches. The nodes are all equal peers. You would have 
to keep them all in sync. At that rate, you might as well do a broadcast 
(or multicast) to establish who has a file when it's not available 
locally. I'm also not sure that this would be a big problem - the 
broadcast and the corresponding responses would only need to be done when:
1) A file is being opened and it isn't available locally (see if another 
node has it).
2) A file is being deleted due to local store filling up (see if the file 
is sufficiently redundant in the network to allow us to delete it from the 
local store).
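
The two cases above could be sketched roughly as follows. This is an 
in-memory simulation only, with no real networking and nothing taken from 
the GlusterFS code base: the Node class, who_has, open_file, can_evict and 
the min_copies threshold are all hypothetical names chosen for the 
example. The "broadcast" is just a loop over the other peers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    files: set = field(default_factory=set)

    def has_file(self, path):
        return path in self.files


def who_has(path, peers):
    """Simulate a broadcast: ask every peer, collect those that hold the file."""
    return [p for p in peers if p.has_file(path)]


def open_file(path, local, peers):
    """Case 1: only query the peers when the file is not available locally."""
    if local.has_file(path):
        return local.name
    holders = who_has(path, peers)
    return holders[0].name if holders else None


def can_evict(path, peers, min_copies=2):
    """Case 2: allow a local delete only if enough other peers hold a copy.

    min_copies is an assumed redundancy threshold, not a GlusterFS setting.
    """
    return len(who_has(path, peers)) >= min_copies


a = Node("a", {"/f"})
b = Node("b", {"/f", "/g"})
c = Node("c", {"/f"})

remote_source = open_file("/g", a, [b, c])  # a lacks /g, so peers are asked
safe_to_drop = can_evict("/f", [b, c])      # b and c both still hold /f
```

In a real implementation the responses would arrive asynchronously and 
some peers might not answer at all, so the eviction check would need a 
timeout and should treat silence as "no copy".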

> Would this problem be
> lessened any by distributing the namespace cache? (I'm assuming a DHT
> type solution.) Would this just mean that consistency problems would
> occur in the instance of a node failure?

I'm not sure that there is a namespace "cache" per se. I think the file 
open call is just routed according to the hash.

Gordan




