[Gluster-users] DHT, pre-existing data unevenly distributed
Dan Bretherton
dab at mail.nerc-essc.ac.uk
Fri Apr 16 14:00:07 UTC 2010
I have been using DHT to join together two large filesystems (2.5TB)
containing pre-existing data. I solved the problem of ls not seeing
all the files by running "rsync --dry-run" from the individual brick
directories to the glusterfs mounted volume, which forces a lookup of
every file through the mount. I am using glusterfs-3.0.2 with
"option lookup-unhashed yes" for DHT on the client. All seemed to be
well until the volume started to get nearly full; despite also using
"option min-free-disk 10%", one of the bricks became 100% full,
preventing any further writes to the whole volume. I managed to get
going again by manually transferring some data from one server to the
other so that the two are now more evenly balanced, but I would like
to find a more permanent solution. I would also like to know whether
this sort of thing is to be expected with DHT when pre-existing data
is not evenly distributed across the bricks. I have included my
client volume file at the bottom of this message.
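
For anyone wanting to try the same trick, the command was roughly of
this form, run once per brick (the paths here are only placeholders
for the brick export directory and the client mount point):

    rsync -a --dry-run /export/brick1/ /mnt/glusterfs/

The dry run never copies anything; it just walks the tree and stats
each file through the mount, which was enough to make the
pre-existing files show up in ls.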
I tried using the unify translator instead, even though it is
supposedly now obsolete, but glusterfs crashed (segfault) when I tried
to mount the volume. I thought perhaps unify was no longer supported
in 3.0.2, so I didn't pursue that option any further. However, if
unify turns out to be better than DHT for pre-existing data
situations, I will have to find out what went wrong. Should I be
using the unify translator instead of DHT for pre-existing data that
is unevenly distributed across bricks? If I can continue with DHT,
can I stop using "option lookup-unhashed yes" at some point?
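
For reference, a unify volume in place of my distribute one would, as
far as I understand it, look roughly like the sketch below. The
namespace volume name and the scheduler choice are just placeholders,
not necessarily what I had when it crashed:

    volume unify
      type cluster/unify
      # unify needs a separate, dedicated namespace volume; "ns" here
      # is a placeholder for a small brick exported like the others
      option namespace ns
      option scheduler rr
      subvolumes romulus perseus
    end-volume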
Regards,
Dan Bretherton
####
## Client vol file
volume romulus
  type protocol/client
  option transport-type tcp
  option remote-host romulus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume perseus
  type protocol/client
  option transport-type tcp
  option remote-host perseus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume distribute
  type cluster/distribute
  option min-free-disk 10%
  option lookup-unhashed yes
  subvolumes romulus perseus
end-volume

volume io-threads
  type performance/io-threads
  #option thread-count 8 # default is 16
  subvolumes distribute
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 1GB
  subvolumes io-threads
end-volume

volume main
  type performance/stat-prefetch
  subvolumes io-cache
end-volume
--
Mr. D.A. Bretherton
Reading e-Science Centre
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 6AL
UK
Tel. +44 118 378 7722
Fax: +44 118 378 6413