[Gluster-users] Brick Preference

Jeff Darcy jdarcy at redhat.com
Thu Jun 24 17:23:54 UTC 2010


On 06/24/2010 12:49 PM, Andy Pace wrote:
> When is a good time to defrag? Is it something I should cron
> (hourly? Daily?) to make sure files are evenly distributed amongst
> each brick?

It's totally up to you. The way I just outlined is a bit disruptive.
We're doing it to recover from what looks like inconsistent state caused
by following earlier suggestions. In fact, I'd even go so far as to say
that the xattr removal from the client side in scale-n-defrag.sh is
highly likely to be the culprit here and should be avoided. There should
definitely be a better way to do this, and I've even implemented my own
DHT-like translator largely to avoid this specific problem of skewed
distribution based on an outdated per-directory DHT map. With that
translator the manual map removal/reconstruction would never be
necessary, but it's not production-ready yet so maybe I should shut up
about it. ;)

If your server/brick setup hasn't changed, then the defrag process won't
do anything but waste bandwidth; the files will end up exactly where
they were before. OTOH, the distribution should be sufficiently random
that one brick will not be overutilized while others remain empty. (The
only exception should be with very few very large files, and if that's
your use case then you might want to consider using the stripe
translator to distribute at the block level.) It's only if your
DHT-level setup has changed that defrag will have any effect, and then
you need to do the manual map removal/reconstruction part. In any case,
I'd suggest doing it on a rolling basis. Pick a subdirectory, do all of
the operations (server-side and client-side) to rebalance that, then
move on to the next one. You could also parallelize it by doing N
subdirectories at once, but avoiding filesystem-wide operations will
minimize the disruption to users.
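
As a rough illustration of the two points above (not Gluster's actual DHT
code): the sketch below places each file by hashing its name, so an
unchanged brick list always yields the same placement, while a changed
brick list moves files. It also shows one way to split subdirectories into
groups of N for a rolling rebalance. The MD5-modulo placement, the brick
names, and the helpers brick_for, placement, and batches are all
assumptions made up for this example; real DHT uses per-directory hash
ranges stored in xattrs.

```python
import hashlib
from itertools import islice


def brick_for(name, bricks):
    """Pick a brick from a hash of the file name.

    Stand-in hash (MD5 modulo brick count), NOT Gluster's real algorithm.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def placement(names, bricks):
    """Map every file name to the brick it would land on."""
    return {n: brick_for(n, bricks) for n in names}


def batches(items, n):
    """Yield successive groups of at most n items (e.g. subdirectories)."""
    it = iter(items)
    while True:
        group = list(islice(it, n))
        if not group:
            return
        yield group


# Hypothetical file and brick names, for illustration only.
names = [f"file{i:04d}" for i in range(1000)]
three = ["brick1", "brick2", "brick3"]

before = placement(names, three)
again = placement(names, three)               # same setup: nothing moves
grown = placement(names, three + ["brick4"])  # changed setup: files move

unchanged_moves = sum(1 for n in names if before[n] != again[n])
changed_moves = sum(1 for n in names if before[n] != grown[n])
print("moved with same bricks:", unchanged_moves)
print("moved after adding a brick:", changed_moves)

# Rolling rebalance: handle two hypothetical subdirectories at a time.
for group in batches(["dirA", "dirB", "dirC", "dirD", "dirE"], 2):
    print("rebalance next:", group)
```

The per-group loop is where the server-side and client-side steps for each
subdirectory would go; they are left out here because they depend on the
setup.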



