[Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume

Xavier Hernandez xhernandez at datalab.es
Tue Jul 22 13:41:56 UTC 2014


On Tuesday 22 July 2014 07:04:54 Jeff Darcy wrote:
> > One possible solution is to convert directories into files managed by
> > storage/posix (some changes will also be required in dht and afr
> > probably).  We will have full control about the format of this file,
> > so we'll be able to use the directory offset that we want to avoid
> > interferences with upper xlators in readdir(p) calls. This will also
> > allow to optimize directory accesses and even minimize or solve the
> > problem of renames.
> 
> Unfortunately, most of the problems with renames involve multiple
> directories and/or multiple bricks, so changing how we store directory
> information within a brick won't solve those particular problems.

I know there are many problems that this solution doesn't address. I should
have specified that I was talking about the case where a rename leaves a file
on the wrong brick (needing a rebalance to move it to the right place). With
this implementation it would be easier to avoid that problem.

> > Additionally, this will give the same reliability to directories that
> > files have (replicated or dispersed).
> 
> If it's within storage/posix then it's well below either replication or
> dispersal.  I think there's the kernel of a good idea here, but it's
> going to require changes to multiple components (and how they relate to
> one another).

Well, even if this is implemented in storage/posix, many other components,
including dht, afr and ec, will need to be aware of it to make it work, as
you said.

It's a big change, but the potential benefits are also very interesting. I
haven't analyzed all the details yet, but I think that many of the
complexities in directory management inside dht and afr could be eliminated
or simplified significantly. It could also improve directory browsing speed.
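
To make the "directory as a file with our own offsets" idea a bit more
concrete, here is a rough sketch. It is purely illustrative: the DirFile
class and its methods are hypothetical names invented for this example, not
anything that exists in storage/posix, and it assumes the simplest possible
format (one slot per entry, with removed entries left behind as holes so
that offsets already handed out stay valid).

```python
class DirFile:
    """Hypothetical directory stored as a flat sequence of slots.

    The readdir offset of an entry is simply its slot number + 1, so the
    brick fully controls the offset space and upper layers can rely on it.
    """

    def __init__(self):
        self.slots = []   # slot i holds a name, or None for a removed entry
        self.index = {}   # name -> slot number, for fast lookups

    def add(self, name):
        if name in self.index:
            raise FileExistsError(name)
        self.slots.append(name)
        self.index[name] = len(self.slots) - 1

    def remove(self, name):
        slot = self.index.pop(name)   # raises KeyError if the name is missing
        self.slots[slot] = None       # leave a hole; other offsets don't move

    def readdir(self, offset=0, count=2):
        """Return up to `count` (name, next_offset) pairs from `offset` on."""
        entries = []
        for slot in range(offset, len(self.slots)):
            if len(entries) == count:
                break
            name = self.slots[slot]
            if name is not None:
                entries.append((name, slot + 1))
        return entries


d = DirFile()
for name in ("a", "b", "c", "d"):
    d.add(name)

first = d.readdir(0, 2)              # first batch of entries
d.remove("a")                        # directory changes between two calls
second = d.readdir(first[-1][1], 2)  # resume from the last returned offset
```

In this sketch, removing an entry between two readdir calls does not shift
the remaining offsets, so the second call resumes exactly where the first
one stopped. A real format would also need to reclaim holes and handle
everything that dht, afr and ec require, which is not covered here.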

Xavi


