[Gluster-devel] Improving real world performance by moving files closer to their target workloads

Mon Jun 2 18:59:06 UTC 2008

--- Derek Price <derek at ximbiot.com> wrote:
> Martin Fick wrote:
> > Except that as I pointed out in my other email, I
> > do not think that this would actually make reading
> > any better than today's caching translator, and if
> > so, then simply improving the caching translator
> > should suffice.  And, it would make writes much 
> > more complicated and in fact probably slower than 
> > what unify does today.  So where is the gain?
> 
> Hrm.  Rereading your comments above, only #2 below
> is directly relevant to cache efficiency, if that is
> all you are interested in, but this design would
> have some other advantages, listed below.  Why do
> you think this model would slow down writes?

How would it not slow down writes?  Coordinating
a write to many servers is slower than writing to
a single server, hence all the discussion here
and on the AFR threads about how to coordinate
things with the least impact.  However much this
impact is reduced, it will surely still cost more
than writing to a single server, so there is no
performance improvement for writes, only a
potential slowdown.
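
To illustrate the argument, here is a minimal sketch of the
latency reasoning.  It assumes a synchronous write that is sent
to all replicas in parallel and completes only when the slowest
one acknowledges.  The function names, the latency figures, and
the coordination_overhead_ms parameter are hypothetical and are
not taken from glusterfs.

```python
def single_server_write_latency(latency_ms):
    """Latency of a write that goes to exactly one server."""
    return latency_ms


def replicated_write_latency(replica_latencies_ms, coordination_overhead_ms=0.0):
    """Latency of a synchronous write sent to all replicas in parallel.

    Assumption: the write is only complete once every replica has
    acknowledged it, so the slowest replica sets the pace. Any locking
    or ordering cost is modelled as a flat, hypothetical overhead.
    """
    if not replica_latencies_ms:
        raise ValueError("need at least one replica")
    return max(replica_latencies_ms) + coordination_overhead_ms


# Made-up per-server latencies in milliseconds.
latencies = [2.0, 3.5, 2.7]
single = single_server_write_latency(latencies[0])
replicated = replicated_write_latency(latencies, coordination_overhead_ms=0.5)
print(single, replicated)
```

Under these assumptions the replicated write can never be faster
than a write to the fastest single server; at best it ties when
there is one replica and no coordination overhead.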

I do not disagree that AFR-like functionality for
redundancy is a good thing, but as I keep trying 
to clarify, this really is not what Luke was 
asking about or suggesting.  His objective did 
not seem to be to add redundancy, but rather to 
improve performance.  I was suggesting to be 
clear about the objectives and not to mix the two
(or, if you do mix them, be clear that there is
some gain to be had over working on the problems
separately).  

Naturally if one is focusing on redundancy (AFR),
performance would also be an objective, but the 
reverse is not so natural.  So if you want to 
improve the feature set/performance of the AFR
translator, great!  Personally, I am currently 
more concerned with eliminating split brain from 
AFR than increasing its flexibility.  I think 
that ensuring a working simple design is more 
important than trying to come up with more 
advanced features.  

But I can't tell you what to focus on.  My
rantings have primarily been an attempt to make
this whole discussion a little more focused
and to ensure that the objectives and
advantages / disadvantages of any suggested
solutions are clear, since it seems that many
concepts with potentially conflicting impacts
are being mixed.

> 2.  All known copies could be kept up-to-date via
> some sort of differential algorithm (at least for 
> appends, this would be easy). Using the current read
> cache, I think that if a large file gets written 
> then the entire file must be recopied over the
> network to any caching readers?

I don't know how it currently works, but if
the read cache could use improving, that seems
like a valuable, well-focused, clear objective,
and I would encourage it. :)
I also think that this would be easier to do
than trying to improve read performance by
adding AFR-like features to the unify
translator.  Not only might it be easier,
it would also benefit many more scenarios, a nice
design effect of the current modularity of
glusterfs!
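
For the append case mentioned in point 2, a differential update
could be sketched roughly as follows.  This assumes the file has
changed only by appending, so a cached copy is a strict prefix of
the source.  The helper names are hypothetical and this is not how
the existing read cache is known to work.

```python
def append_delta(source: bytes, cached_len: int) -> bytes:
    """Return only the bytes appended since the cached copy was taken.

    Assumption: the cached copy is a prefix of `source`, i.e. the
    file was only ever appended to, never rewritten in place.
    """
    if cached_len > len(source):
        raise ValueError("cached copy is longer than source; not a pure append")
    return source[cached_len:]


def apply_delta(cached: bytes, delta: bytes) -> bytes:
    """Bring a cached copy up to date by appending the delta."""
    return cached + delta


source = b"log line 1\nlog line 2\n"
cached = b"log line 1\n"

delta = append_delta(source, len(cached))
updated = apply_delta(cached, delta)
print(len(delta), "bytes sent instead of", len(source))
```

Only the appended tail crosses the network here, rather than the
whole file.  Detecting in-place rewrites would need something more,
such as per-block checksums.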

Cheers,

-Martin