[Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume

Xavier Hernandez xhernandez at datalab.es
Tue Jul 22 07:51:58 UTC 2014


On Monday 21 July 2014 13:14:46 Jeff Darcy wrote:
> Perhaps it's time to revisit the idea of making assumptions about d_off
> values and twiddling them back and forth, vs. maintaining a precise
> mapping between our values and local-FS values.
> 
> http://review.gluster.org/#/c/4675/
> 
> That patch is old and probably incomplete, but at the time it worked
> just as well as the one that led us into the current situation.

I think directory handling has a lot of issues, not only the problem of big
offsets. The most important one will be scalability as the number of bricks
grows.

Maybe we should try to find a better solution to address all these problems at 
once.

One possible solution is to convert directories into files managed by
storage/posix (some changes will probably also be required in dht and afr).
We will have full control over the format of this file, so we'll be able to
use whatever directory offsets we want and avoid interference with upper
xlators in readdir(p) calls. This will also allow us to optimize directory
accesses and even minimize or solve the problem of renames.
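
To make the idea a bit more concrete, here is a rough sketch in Python. It is
not storage/posix code: the record layout, the DirFile class and its method
names are all made up for illustration. It assumes fixed-size records, so that
d_off is simply the slot index plus one, a value we choose ourselves instead
of inheriting it from the local filesystem.

```python
import io
import struct

# Hypothetical on-disk record: 1-byte "in use" flag + 255-byte padded name.
RECORD = struct.Struct("<B255s")


class DirFile:
    """Directory stored as a flat file of fixed-size records (illustrative)."""

    def __init__(self, backing=None):
        # Any seekable binary file object works; default to an in-memory one.
        self.f = backing if backing is not None else io.BytesIO()

    def _count(self):
        # Number of record slots currently in the file.
        return self.f.seek(0, io.SEEK_END) // RECORD.size

    def add(self, name):
        """Append an entry and return its d_off (slot index + 1)."""
        slot = self._count()
        self.f.seek(slot * RECORD.size)
        self.f.write(RECORD.pack(1, name.encode()))
        return slot + 1

    def remove(self, name):
        """Mark the entry as free; other entries keep their offsets."""
        raw = name.encode()
        for slot in range(self._count()):
            self.f.seek(slot * RECORD.size)
            used, stored = RECORD.unpack(self.f.read(RECORD.size))
            if used and stored.rstrip(b"\0") == raw:
                self.f.seek(slot * RECORD.size)
                self.f.write(RECORD.pack(0, b""))
                return True
        return False

    def readdir(self, d_off=0, count=16):
        """Return up to `count` (name, d_off) pairs found after d_off."""
        out = []
        total = self._count()
        slot = d_off  # d_off of the last entry seen == index of the next slot
        while slot < total and len(out) < count:
            self.f.seek(slot * RECORD.size)
            used, stored = RECORD.unpack(self.f.read(RECORD.size))
            if used:
                out.append((stored.rstrip(b"\0").decode(), slot + 1))
            slot += 1
        return out


d = DirFile()
for name in ("a", "b", "c"):
    d.add(name)
first_page = d.readdir(0, count=2)       # entries "a" and "b"
d.remove("b")                            # "c" keeps its offset
next_page = d.readdir(first_page[-1][1]) # resume after "b"
```

In this sketch the offsets are small, dense integers that stay stable when
other entries are removed, so a readdir can always resume from the last d_off
it saw. A real format would also need to reuse free slots, handle long names
and index by name, none of which is shown here.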

Additionally, this will give directories the same reliability that files
have (replicated or dispersed).

Obviously this is an important architectural change at the brick level, but I
think its benefits are worth it.

Xavi

