[Gluster-devel] Feature review: Improved rebalance performance

Tue Jul 1 09:55:51 UTC 2014

----- Original Message -----
> From: "Xavier Hernandez" <xhernandez at datalab.es>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Cc: "Shyamsundar Ranganathan" <srangana at redhat.com>, gluster-devel at gluster.org
> Sent: Tuesday, July 1, 2014 3:10:29 PM
> Subject: Re: [Gluster-devel] Feature review: Improved rebalance performance
> 
> On Tuesday 01 July 2014 02:37:34 Raghavendra Gowdappa wrote:
> > > Another thing to consider for future versions is to change the current
> > > DHT to use consistent hashing, and even to change the hash input (using
> > > the gfid instead of a hash of the name would solve the rename problem).
> > > Consistent hashing would drastically reduce the number of files that
> > > need to be moved, and it already solves some of the current problems.
> > > This change needs a lot of thinking, though.
> > 
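
To make the "fewer files need to be moved" claim above concrete, here is a minimal sketch comparing a plain modulo placement with a consistent-hash ring when a fifth node is added. Everything in it is an assumption for illustration: the names `h32`, `build_ring`, `ring_placement`, the `brickN` node names, the MD5-based hash and the 64 virtual points per node are all hypothetical, and the modulo scheme is only a stand-in for "most assignments change when the node count changes", not a model of DHT's actual layout.

```python
import bisect
import hashlib


def h32(key: str) -> int:
    """Stable 32-bit hash for the sketch (not the hash DHT uses)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


def modulo_placement(key, nodes):
    """Naive placement: the owner depends directly on the node count."""
    return nodes[h32(key) % len(nodes)]


def build_ring(nodes, vnodes=64):
    """Place `vnodes` points per node on a ring, sorted by hash value."""
    ring = sorted((h32(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))
    points = [point for point, _ in ring]
    owners = [owner for _, owner in ring]
    return points, owners


def ring_placement(key, ring):
    """Owner is the first ring point after the key's hash, wrapping around."""
    points, owners = ring
    idx = bisect.bisect_right(points, h32(key)) % len(points)
    return owners[idx]


def moved_fraction(keys, place_before, place_after):
    """Fraction of keys whose owner differs between the two placements."""
    moved = sum(1 for k in keys if place_before(k) != place_after(k))
    return moved / len(keys)


if __name__ == "__main__":
    old_nodes = [f"brick{i}" for i in range(4)]
    new_nodes = old_nodes + ["brick4"]
    keys = [f"file-{i}" for i in range(10000)]
    old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)

    print("modulo:", moved_fraction(
        keys,
        lambda k: modulo_placement(k, old_nodes),
        lambda k: modulo_placement(k, new_nodes)))
    print("ring:  ", moved_fraction(
        keys,
        lambda k: ring_placement(k, old_ring),
        lambda k: ring_placement(k, new_ring)))
```

With the ring, a key changes owner only when one of the new node's points lands between the key and its previous owner, so every moved key goes to the new node and the rest stay put. With modulo placement, a key stays only when its hash gives the same remainder for both node counts.
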
> > The problem with using the gfid for hashing instead of the name is that we
> > run into a chicken-and-egg problem. Before lookup, we cannot know the gfid
> > of the file, and to look up the file, we need the gfid to find the node on
> > which the file resides. Of course, this problem would go away if we did
> > the lookup (maybe just during fresh lookups) on all the nodes, but that
> > slows down fresh lookups and may not be acceptable.
> 
> I think it's not so problematic, and the benefits would be considerable.
> 
> The gfid of the root directory is always known. This means that we could
> always do a lookup on root by gfid.
> 
> I haven't tested it but as I understand it, when you want to do a getxattr on
> a file inside a subdirectory, for example, the kernel will issue lookups on
> all intermediate directories to check,

Yes, but how does DHT handle these lookups? Are you suggesting that we wind the lookup call to all subvolumes (since, without the gfid, we don't know which subvolume the file is on)?
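
A toy model of the cost in question, as a sketch only: with name-based hashing a fresh lookup can be sent to one subvolume, while with gfid-based hashing the gfid is not yet known, so the lookup has to be wound to every subvolume. The `Subvolume` class, the helper names and the modulo placement are hypothetical and do not reflect the real DHT translator code.

```python
import hashlib
import uuid


def h32(key: str) -> int:
    """Stable 32-bit hash for the sketch (not the hash DHT uses)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class Subvolume:
    """Toy subvolume: maps (parent_gfid, basename) -> gfid and counts lookups."""

    def __init__(self, name):
        self.name = name
        self.entries = {}
        self.lookups = 0

    def lookup(self, parent_gfid, basename):
        self.lookups += 1
        return self.entries.get((parent_gfid, basename))


def create_by_name_hash(subvols, parent_gfid, basename):
    """Place the entry on the subvolume chosen by hashing the name."""
    gfid = str(uuid.uuid4())
    subvols[h32(basename) % len(subvols)].entries[(parent_gfid, basename)] = gfid
    return gfid


def lookup_by_name_hash(subvols, parent_gfid, basename):
    """The name alone selects the subvolume, so one lookup is enough."""
    return subvols[h32(basename) % len(subvols)].lookup(parent_gfid, basename)


def create_by_gfid_hash(subvols, parent_gfid, basename):
    """Place the entry on the subvolume chosen by hashing the gfid."""
    gfid = str(uuid.uuid4())
    subvols[h32(gfid) % len(subvols)].entries[(parent_gfid, basename)] = gfid
    return gfid


def fresh_lookup_by_gfid_hash(subvols, parent_gfid, basename):
    """The gfid is unknown, so the lookup is wound to every subvolume."""
    replies = [sv.lookup(parent_gfid, basename) for sv in subvols]
    return next((gfid for gfid in replies if gfid is not None), None)
```

In this model a fresh lookup costs one call with name hashing and `len(subvols)` calls with gfid hashing, which is the slowdown being asked about.
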

> at least, the access rights before
> finally reading the xattr of the file. This means that we can get and cache
> the gfids of all intermediate directories in the process.
> 
> Even if there's some operation that does not issue a previous lookup, we
> could do that lookup ourselves if the gfid is not cached. Of course, if many
> operations did not issue a previous lookup, this solution would not work
> well, but I think this is not the case.
> 
> I'll try to do some tests to see if this is correct.
> 
> Xavi
>
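
The caching idea quoted above can be sketched as follows: start from the well-known root gfid, resolve each path component in turn, and remember the gfid of every component so that later operations under the same directories need fewer fresh lookups. `GfidCache` and `backend_lookup` are hypothetical names, and the sketch deliberately leaves open how each fresh lookup is routed to a subvolume, which is the question raised in the inline reply.

```python
# The root directory of a GlusterFS volume has this fixed gfid.
ROOT_GFID = "00000000-0000-0000-0000-000000000001"


class GfidCache:
    """Resolve paths component by component, caching (parent_gfid, name) -> gfid."""

    def __init__(self, backend_lookup):
        # backend_lookup(parent_gfid, name) -> gfid or None (hypothetical callable)
        self._lookup = backend_lookup
        self._cache = {}
        self.fresh_lookups = 0

    def resolve(self, path):
        gfid = ROOT_GFID
        for name in (part for part in path.split("/") if part):
            key = (gfid, name)
            if key not in self._cache:
                # Cache miss: issue the lookup ourselves, then remember the result.
                self.fresh_lookups += 1
                child = self._lookup(gfid, name)
                if child is None:
                    raise FileNotFoundError(path)
                self._cache[key] = child
            gfid = self._cache[key]
        return gfid


if __name__ == "__main__":
    # Toy namespace: /a/b/file1 and /a/b/file2, with made-up gfids.
    namespace = {
        (ROOT_GFID, "a"): "gfid-a",
        ("gfid-a", "b"): "gfid-b",
        ("gfid-b", "file1"): "gfid-file1",
        ("gfid-b", "file2"): "gfid-file2",
    }
    cache = GfidCache(lambda parent, name: namespace.get((parent, name)))
    cache.resolve("/a/b/file1")   # three fresh lookups: a, b, file1
    cache.resolve("/a/b/file2")   # one more: only file2 is new
    print(cache.fresh_lookups)
```

The sketch ignores cache invalidation (renames, unlinks, changes made by other clients), which a real implementation would have to handle.
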