[Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
David Robinson
david.robinson at corvidtec.com
Mon Apr 27 21:21:08 UTC 2015
I am also having a terrible time with rsync and gluster. The vast
majority of my time is spent figuring out what to sync... This sync
takes 17 hours even though very little data is being transferred:

sent 120,523 bytes  received 74,485,191,265 bytes  1,210,720.02 bytes/sec
total size is 27,589,660,889,910  speedup is 370.40
------ Original Message ------
From: "Ben Turner" <bturner at redhat.com>
To: "Ernie Dunbar" <maillist at lightspeed.ca>
Cc: "Gluster Users" <gluster-users at gluster.org>
Sent: 4/27/2015 4:52:35 PM
Subject: Re: [Gluster-users] Disastrous performance with rsync to
mounted Gluster volume.
>----- Original Message -----
>> From: "Ernie Dunbar" <maillist at lightspeed.ca>
>> To: "Gluster Users" <gluster-users at gluster.org>
>> Sent: Monday, April 27, 2015 4:24:56 PM
>> Subject: Re: [Gluster-users] Disastrous performance with rsync to
>>mounted Gluster volume.
>>
>> On 2015-04-24 11:43, Joe Julian wrote:
>>
>> >> This should get you where you need to be. Before you start to
>> >> migrate the data, maybe do a couple of dd runs and send me the
>> >> output so we can get an idea of how your cluster performs:
>> >>
>> >> time `dd if=/dev/zero of=<gluster-mount>/myfile bs=1024k count=1000; sync`
>> >> echo 3 > /proc/sys/vm/drop_caches
>> >> dd if=<gluster-mount>/myfile of=/dev/null bs=1024k count=1000
>> >>
>> >> If you are using gigabit and glusterfs mounts with replica 2 you
>> >> should get ~55 MB/sec writes and ~110 MB/sec reads. With NFS you
>> >> will take a bit of a hit, since NFS doesn't know where files live
>> >> the way glusterfs does.
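[The suggested benchmark can be scripted end to end. This is a sketch, not anything from the thread: TARGET defaults to a temp dir here and should be pointed at the actual glusterfs mount, and the cache drop needs root.]

```shell
# Sketch of the write/read benchmark above. TARGET and COUNT are
# assumptions: TARGET should be the glusterfs mount in practice.
TARGET="${TARGET:-$(mktemp -d)}"
COUNT="${COUNT:-1000}"

# Write test: COUNT x 1 MiB of zeros; conv=fsync makes dd flush before
# it reports a rate, so the number reflects data reaching the bricks.
dd if=/dev/zero of="$TARGET/myfile" bs=1024k count="$COUNT" conv=fsync 2>&1 | tail -n1

# Drop the page cache so the read actually travels the wire (root only):
#   echo 3 > /proc/sys/vm/drop_caches

# Read test: note the full file path; reading the bare mount point fails.
dd if="$TARGET/myfile" of=/dev/null bs=1024k count="$COUNT" 2>&1 | tail -n1
```
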
>>
>> After copying our data and doing a couple of very slow rsyncs, I did
>> your speed test and came back with these results:
>>
>> 1048576 bytes (1.0 MB) copied, 0.0307951 s, 34.1 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=1024 bs=1024; sync
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0298592 s, 35.1 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=1024 bs=1024; sync
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0501495 s, 20.9 MB/s
>> root at backup:/home/webmailbak# echo 3 > /proc/sys/vm/drop_caches
>> root at backup:/home/webmailbak# # dd if=/mnt/testfile of=/dev/null
>> bs=1024k count=1000
>> 1+0 records in
>> 1+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0124498 s, 84.2 MB/s
>>
>>
>> Keep in mind that this is an NFS share over the network.
>>
>> I've also noticed that if I increase the count of those writes, the
>> transfer speed increases as well:
>>
>> 2097152 bytes (2.1 MB) copied, 0.036291 s, 57.8 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0362724 s, 57.8 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0360319 s, 58.2 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.127219 s, 82.4 MB/s
>> root at backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile
>> count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.128671 s, 81.5 MB/s
>
>This is correct: there is overhead with small files, and the smaller
>the file, the less throughput you get. That said, since the files are
>smaller you should get more files/second but fewer MB/second. I have
>found that below 16k, file size doesn't matter: you will get the same
>number of 16k files per second as you do 1k files.
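[A quick way to see the files-per-second behaviour described above is to time a batch of small writes. The directory, file count, and 16k size here are illustrative choices, not anything from the thread; point DIR at the gluster mount to measure the real volume.]

```shell
# Time a batch of small-file writes to estimate files/sec.
# DIR is an assumption; it defaults to a throwaway temp dir.
DIR="${DIR:-$(mktemp -d)}"
N=200

start=$(date +%s)
i=1
while [ "$i" -le "$N" ]; do
    dd if=/dev/zero of="$DIR/f$i" bs=16k count=1 2>/dev/null
    i=$((i + 1))
done
sync
elapsed=$(( $(date +%s) - start ))
echo "wrote $N 16k files in ${elapsed}s"
```
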
>
>>
>>
>> However, the biggest stumbling block for rsync seems to be changes to
>> directories. I'm unsure about what exactly it's doing (probably
>> changing last-access times?), but these minor writes seem to take a
>> very long time when normally they would not. Actual file copies (as
>> in, the very files that are actually new within those same
>> directories) appear to take quite a lot less time than the directory
>> updates.
>
>Dragons be here! Access time is not kept in sync across the replicas
>(IIRC; someone correct me if I am wrong!), and each time a dir is read
>from a different brick, I bet the access time is different.
>
>>
>> For example:
>>
>> # time rsync -av --inplace --whole-file --ignore-existing \
>>     --delete-after gromm/* /mnt/gromm/
>> building file list ... done
>> Maildir/                    ## This part takes a long time.
>> Maildir/.INBOX.Trash/
>> Maildir/.INBOX.Trash/cur/
>> Maildir/.INBOX.Trash/cur/1429836077.H817602P21531.pop.lightspeed.ca:2,S
>> Maildir/.INBOX.Trash/tmp/   ## The previous three lines took nearly no time at all.
>> Maildir/cur/                ## This takes a long time.
>> Maildir/cur/1430160436.H952679P13870.pop.lightspeed.ca:2,S
>> Maildir/new/
>> Maildir/tmp/                ## The previous lines again take no time at all.
>> deleting Maildir/cur/1429836077.H817602P21531.pop.lightspeed.ca:2,S
>>                             ## This delete did take a while.
>> sent 1327634 bytes  received 75 bytes  59009.29 bytes/sec
>> total size is 624491648  speedup is 470.35
>>
>> real 0m26.110s
>> user 0m0.140s
>> sys 0m1.596s
>>
>>
>> So, rsync reports that it wrote 1327634 bytes at 59 kBytes/sec, and
>> the whole operation took 26 seconds, all to write two files of around
>> 20-30 kBytes each and delete one.
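[Those summary numbers are internally consistent, which helps read them: rsync's "speedup" is total size divided by bytes actually moved, and its bytes/sec figure uses rsync's internal transfer timer (about 22.5 s here), not the 26 s wall clock, the gap presumably being the slow directory handling. A quick recomputation:]

```shell
# Recompute rsync's summary figures from the log above.
awk 'BEGIN {
    sent = 1327634; received = 75; total = 624491648
    moved = sent + received
    printf "speedup %.2f\n", total / moved      # rsync printed 470.35
    printf "timer %.1f s\n", moved / 59009.29   # vs the 26 s wall clock
}'
```
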
>>
>> The last rsync took around 56 minutes, when normally such an rsync
>> would have taken 5-10 minutes writing over the network via ssh.
>
>It may have something to do with the access times not being in sync
>across replicated pairs. Maybe someone has experience with this; could
>it be tripping up rsync?
>
>-b
>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>