[Gluster-users] Script and tips for parallelizing rsync

Dan Mons dmons at cuttingedge.com.au
Thu Jul 10 01:27:07 UTC 2014

We do something similar for our nightly backups (100TB between two
Gluster setups).

Each of our 6 Gluster nodes gets a set of top level folders
(representing each department in the org), and within each we thread
based on folders in the top level of each major section.  That nets us
around 200+ rsync threads, which makes the nightly sync happen a lot
faster.

I played around with parallel rsync, but could never make it work the
way I wanted.  Just doing a simple "ls -d * | while read DIR ; do
rsync /$DIR/ remote:/$DIR/ & done" works out far better.
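
For anyone wanting to try the same idea, here is a minimal sketch of that
loop, assuming plain POSIX sh.  The function name parallel_sync, the
max_jobs limit, and the RSYNC override variable are illustrative names and
not part of our actual backup scripts.  It also assumes rsync's -a
(archive) flag, so that each directory is copied recursively.  It starts
one rsync per top-level directory and waits after every batch of max_jobs,
so the number of concurrent transfers stays bounded.

```shell
#!/bin/sh
# Sketch only: one rsync per top-level directory, run in bounded batches.
# parallel_sync, max_jobs, and RSYNC are hypothetical names for illustration.

RSYNC="${RSYNC:-rsync}"    # command to run; can be overridden for a dry test

parallel_sync() (
    src_root=$1            # local parent directory, e.g. /data
    dest_root=$2           # destination parent, e.g. remote:/data
    max_jobs=${3:-4}       # how many transfers to run at once (assumed default)
    running=0

    for dir in "$src_root"/*/; do
        [ -d "$dir" ] || continue      # glob matched nothing, or not a directory
        name=$(basename "$dir")

        # Trailing slashes: sync the directory's contents into the
        # same-named directory on the destination.
        $RSYNC -a "$src_root/$name/" "$dest_root/$name/" &

        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait                       # let the current batch finish
            running=0
        fi
    done

    wait                               # wait for the final partial batch
)

# Example call (hypothetical paths):
#   parallel_sync /data remote:/data 8
```

Note that this sketch does not collect the exit status of each rsync, and
waiting per batch means one slow directory holds up the rest of its batch.
Names with spaces are handled because the loop uses a glob rather than
parsing the output of ls.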


Dan Mons
Unbreaker of broken things
Cutting Edge

On 9 July 2014 21:42, Alan Orth <alan.orth at gmail.com> wrote:
> Hi,
> I recently had a RAID failure on one of my Gluster replicas; luckily the
> other replica was ok, and I could re-sync all the data to the bad node's
> bricks.  I used rsync to pre-seed the brick data, rather than having
> Gluster's self-heal daemon try to figure it out.
> It turns out I had way more files than I realized, which exposed some
> problems with "traditional" rsync invocation.  I found some clever ways
> to optimize the transfer and speed up the process, and wrote up my
> experiences on my blog:
> http://mjanja.co.ke/2014/07/parallelizing-rsync/
> Hope this helps someone!
> --
> Alan Orth
> alan.orth at gmail.com
> http://alaninkenya.org
> http://mjanja.co.ke
> "I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." -Bjarne Stroustrup, inventor of C++
> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
