[Gluster-users] Parallel cp?

Brian Candler B.Candler at pobox.com
Sat Feb 4 23:11:54 UTC 2012

I reckon that to quickly copy one glusterfs volume to another, I will need a
multi-threaded 'cp'.  That is, something which will take the list of files
from readdir() and copy batches of N of them in parallel.  This is so I can
keep all the component spindles busy.
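
To make the idea concrete, here is a rough Python sketch of the sort of thing I mean. It is only an illustration, not an existing tool: the name parallel_copy and the default of 8 workers are made up for this example. It walks the source tree, recreates the directories, and hands each file to a thread pool so that several copies are in flight at once. It assumes regular files only (symlinks are followed rather than preserved) and it queues every file before waiting on the results, so a very large tree would need batching.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def parallel_copy(src, dst, workers=8):
    """Copy the tree under src into dst, running up to `workers` copies at once.

    Hypothetical helper for illustration; returns the number of files copied.
    """
    copied = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for dirpath, dirnames, filenames in os.walk(src):
            # Recreate the directory structure serially, before queuing files.
            rel = os.path.relpath(dirpath, src)
            target_dir = os.path.normpath(os.path.join(dst, rel))
            os.makedirs(target_dir, exist_ok=True)
            for name in filenames:
                # Each file copy is an independent task for the pool.
                futures.append(pool.submit(shutil.copy2,
                                           os.path.join(dirpath, name),
                                           os.path.join(target_dir, name)))
        for future in futures:
            future.result()  # re-raises any exception from the worker
            copied += 1
    return copied
```

Something like parallel_copy("/mnt/vol1", "/mnt/vol2", workers=16) would be the intended use, with both paths being GlusterFS mount points (those paths are placeholders). Whether this actually keeps all the spindles busy depends on the order in which the files come back, which is what question 2 below is about.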

Question 1: does such a thing exist already in the open source world?

Question 2: for a DHT volume, does readdir() return the files in a
round-robin fashion, i.e. one from brick 1, one from brick 2, one from brick
3 etc? Or does it return all the results from one brick, followed by all the
results from the second brick, and so on? Or something indeterminate?

Alternatively: is it possible to determine for each file which brick it
resides on?

(I don't think it's in an extended attribute; I tried 'getfattr -d' on a
file, both on the GlusterFS mount and on the underlying brick, and couldn't
see anything)



P.S. I did look in the source, and I couldn't figure out how dht_do_readdir
works.  But it does have a slightly disconcerting comment:

/* TODO: do proper readdir */

