[Gluster-devel] Improving real world performance by moving files closer to their target workloads

Wed May 21 14:48:10 UTC 2008

On Tue, 20 May 2008, Martin Fick wrote:

>>> My point was that, as I understood your algorithm,
>>> a client would not know which nodes contained a
>>> certain file until all nodes had been
>>> contacted.  So, while the actual bandwidth, even
>>> to consult thousands of nodes, might be small
>>> relative to file transfer bandwidth, the client
>>> can't assume it has a complete answer until it
>>> gets all the replies, meaning requests to downed
>>> nodes have timed out.
>>
>> I agree that waiting for all nodes could be an issue
>> in case of downed nodes, and I concur that quorum
>> would be a good work-around.
>
>
> I think that it might help to stay focused in these
> types of discussions.  Luke was concerned with
> increasing performance, not reliability.  Yes, the
> idea of an AFR like unify translator was brought up,
> and it would be neat to be able to have HA and
> performance.  However, it might be helpful for now to
> deal with these as two separate issues.

IMO, separation of issues should only happen when the entire design is 
worked out. Otherwise you end up with two half-solutions which either 
don't integrate properly or don't make a whole when they're put together.

> Having said that, downed nodes will currently (in the
> non migration scenario) affect unify, they will
> effectively shut down the whole cluster.  The current
> way to "fix" this is to use AFR underneath unify.  So,
> if a unify translator were modified to be able to
> migrate files for performance, we are no worse off
> than we currently are with the unify translator if one
> of those nodes goes down.

I don't think that a migrating unify translator alone would be the 
solution. It would need to be able to handle multiple instances of the 
same file, too, which would mean that this single translator would need to 
include functionality of both unify and AFR, but in a looser way 
(variable AFR per file, if you will).
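To make "variable AFR per file" a bit more concrete, here is a minimal Python sketch of the bookkeeping such a combined translator would need. It is not GlusterFS code: the class name PlacementTable, its methods, and the node names are all hypothetical and only illustrate that each file carries its own replica set rather than a volume-wide replica count.

```python
class PlacementTable:
    """Hypothetical per-file replica map (illustration only, not a GlusterFS API)."""

    def __init__(self):
        # path -> set of node names currently holding a copy of that file
        self._replicas = {}

    def add_replica(self, path, node):
        """Record that `node` now holds a copy of `path`."""
        self._replicas.setdefault(path, set()).add(node)

    def drop_replica(self, path, node):
        """Forget the copy on `node`; never drop the last remaining copy."""
        holders = self._replicas.get(path, set())
        if node not in holders:
            return False
        if len(holders) == 1:
            raise ValueError("refusing to drop the last copy of " + path)
        holders.remove(node)
        return True

    def locate(self, path):
        """Return the nodes holding `path`, sorted for stable output."""
        return sorted(self._replicas.get(path, set()))


table = PlacementTable()
table.add_replica("/data/hot.db", "node1")
table.add_replica("/data/hot.db", "node2")   # heavily used file: two copies
table.add_replica("/data/cold.log", "node3")  # rarely used file: one copy
print(table.locate("/data/hot.db"))
print(table.locate("/data/cold.log"))
```

The unify part is the lookup (which nodes have the file), the AFR part is keeping more than one holder in sync, and the replica count differs from file to file. Keeping the copies consistent is the hard part and is deliberately left out of this sketch.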

Quorum locking might be a separate issue, but I don't think the translator 
itself would be separable.
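
As a rough illustration of the quorum work-around discussed above, the following Python sketch accepts a lookup result as soon as a strict majority of nodes has answered, instead of waiting for requests to downed nodes to time out. The function name quorum_lookup and the shape of the replies are assumptions made for this example. It also assumes that any change in file placement was itself acknowledged by a majority of nodes; without that, a majority of replies would not guarantee a complete answer.

```python
def quorum_lookup(replies, total_nodes):
    """Hypothetical helper, not part of GlusterFS.

    replies: dict mapping node name -> True/False (does the node hold the
             file), containing only the nodes that have answered so far.
    total_nodes: number of nodes in the cluster, including downed ones.

    Returns the sorted list of holders once a strict majority has answered,
    or None while there is no quorum yet.
    """
    if len(replies) * 2 <= total_nodes:
        return None  # not enough answers to trust the result
    return sorted(node for node, has_file in replies.items() if has_file)


# Five-node cluster, two nodes down and never answering.
print(quorum_lookup({"n1": True, "n2": False}, 5))
print(quorum_lookup({"n1": True, "n2": False, "n3": True}, 5))
```

With two of five replies there is no quorum and the client keeps waiting; with three it can proceed without the two silent nodes.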

> So why try and solve this issue here?  What you really
> are talking about is solving AFR's issues, not issues
> with the migration solution.  I agree that AFR could
> use some enhancements to deal with split brain, but
> that seems out of the scope of a migration type
> solution aimed at improving performance.  I suggest
> that Luke should pursue increasing performance with
> migration and making that work well without adding
> additional constraints to his problem.

Are you saying that files should be migrated without duplication? I guess 
that _could_ be done with just a unify translator mod, but it'd require 
some heuristics to check what nodes request which files most frequently, 
and then migrate based on that. This may be insufficient, however, if 
there are multiple nodes that could greatly benefit from having the file 
locally, at which point you have no choice but to consider a more complex 
translator that has to include AFR functionality on a per-file basis.
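
The kind of heuristic meant here could be sketched as follows. This is only a minimal Python illustration: AccessTracker, the margin of 2.0 and the share of 0.3 are made-up names and values, not anything that exists in unify. It counts accesses per file and per node, proposes a migration target for the single-copy case, and reports when several nodes each account for a large share of the accesses, which is the case where a single copy stops being enough.

```python
from collections import Counter, defaultdict


class AccessTracker:
    """Hypothetical per-file access statistics (illustration only)."""

    def __init__(self):
        # path -> Counter mapping node name -> number of accesses
        self._hits = defaultdict(Counter)

    def record(self, path, node):
        """Note one access to `path` originating from `node`."""
        self._hits[path][node] += 1

    def migration_target(self, path, current_holder, margin=2.0):
        """Return the node the single copy should move to, or None.

        The file moves only if the busiest node accesses it at least
        `margin` times as often as the current holder does.
        """
        hits = self._hits.get(path)
        if not hits:
            return None
        # Sort by name first so that ties are broken deterministically.
        node, count = max(sorted(hits.items()), key=lambda kv: kv[1])
        if node == current_holder:
            return None
        if count >= margin * max(hits[current_holder], 1):
            return node
        return None

    def contenders(self, path, share=0.3):
        """Return nodes that each make at least `share` of all accesses."""
        hits = self._hits.get(path)
        if not hits:
            return []
        total = sum(hits.values())
        return sorted(n for n, c in hits.items() if c / total >= share)


tracker = AccessTracker()
for _ in range(10):
    tracker.record("/data/hot.db", "node2")
for _ in range(2):
    tracker.record("/data/hot.db", "node1")
print(tracker.migration_target("/data/hot.db", "node1"))
```

If contenders() returns more than one node for a file, moving the single copy to either of them leaves the other paying the remote-access cost, so migration alone cannot help both.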

> The simplest migration solution is to not tolerate
> downed nodes!  If you do not tolerate them, you do not
> have locking/split brain types of issues to resolve.

This would be the "no redundancy" solution. As I said above, that could be 
done with just a unify translator mod (unify can't handle downed nodes 
without data loss, either). But that assumes only one node ever 
intensively accesses one file, which I don't think is likely to be a 
typical case.

> Simply migrate a file where it is needed and never
> leave a copy behind where it can get out of sync.  If
> you want HA, install AFR under each subvolume.  If you
> want to solve split brain issues with AFR (I hope we
> can,) start another thread. :)  Once AFR split brain
> issues are resolved in glusterfs, merging AFR and
> Luke's potential merging unify translator should be a
> much easier and well defined task!

I don't think the problems are as separable as you are implying, at least 
not without making compromises in both parts that cannot be made up by 
stacking the two.

Gordan