[Gluster-users] Slow performance from simple tar -x && rm -r benchmark

Tue Mar 20 06:47:45 UTC 2012

I'm going to start off by saying that I misstated earlier: I must have
been doing my *many-file* tests *inside* VMs running on top of glusterfs.
I'll post a loopback test later this week.
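
For reference, here is a rough sketch of what that loopback test could
look like, following John's suggestion quoted below (big file with dd,
loopback mount, then the same tar/rm run). The image path, mount point,
and choice of ext4 are my assumptions, not something I've run yet, and
it only prints the commands unless DRY_RUN=0 is set (as root):

```shell
#!/bin/bash
# Sketch of a loopback test on a gluster mount.
# IMG, MNT and the ext4 choice are assumptions for illustration only.
IMG=${IMG:-/var/lib/libvirt/images/loop-test.img}  # hypothetical backing file on the gluster mount
MNT=${MNT:-/mnt/loop-test}                         # hypothetical mount point
SIZE_MB=${SIZE_MB:-20480}                          # 20 GB, as in the quoted suggestion
TARBALL=${TARBALL:-/root/linux-3.3.tar}            # same tarball as in the run below

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

loopback_test() {
    run dd if=/dev/zero of="$IMG" bs=1M count="$SIZE_MB"   # create the backing file
    run mkfs.ext4 -F -q "$IMG"                             # -F: target is a regular file
    run mkdir -p "$MNT"
    run mount -o loop "$IMG" "$MNT"                        # attach and mount via loopback
    run bash -c "cd '$MNT' && time (tar xf '$TARBALL'; sync; rm -rf linux-3.3)"
    run umount "$MNT"
}

loopback_test
```

Only one machine can mount the image at a time, so this says something
about VM-style workloads, not about shared small-file access.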

Anyhow, here is my run:

[root at lab0-v3 ~]# gluster volume info

Volume Name: images
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 169.254.253.11:/gluster/images
Brick2: 169.254.253.10:/gluster/images
Options Reconfigured:
performance.io-thread-count: 64
[root at lab0-v3 ~]# mount | grep images
/dev/mapper/storage-images on /gluster/images type xfs (rw)
lab0-v3.us.vorstack.net:images on /var/lib/libvirt/images type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)
[root at lab0-v3 ~]# dd if=/dev/zero of=/var/lib/libvirt/images/junk.del bs=1M count=1000 conv=fsync oflag=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 4.41501 s, 238 MB/s
[root at lab0-v3 ~]# ls -al linux-3.3.tar.bz2
-rw-r--r-- 1 root root 78963640 Mar 19 01:16 linux-3.3.tar.bz2
[root at lab0-v3 ~]# bunzip2 linux-3.3.tar.bz2
[root at lab0-v3 ~]# ls -al linux-3.3.tar
-rw-r--r-- 1 root root 466083840 Mar 19 01:16 linux-3.3.tar
[root at lab0-v3 ~]# cd /tmp/
[root at lab0-v3 tmp]# time bash -c 'tar xf /root/linux-3.3.tar ; sync ; rm -rf linux-3.3'

real	0m3.777s
user	0m0.127s
sys	0m1.415s
[root at lab0-v3 images]# time bash -c 'tar xf /root/linux-3.3.tar ; sync ; rm -rf linux-3.3'

real	2m22.159s
user	0m0.484s
sys	0m3.898s
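
To put a number on the gap between the two runs, here is a small sketch
that turns the "real" values from time into seconds and divides them.
The helper names to_seconds and slowdown are made up for this example,
and it only handles the XmY.ZZZs format that bash prints:

```shell
# Convert a bash `time` value such as "2m22.159s" into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower the first timing is than the second.
slowdown() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", slow / fast }'
}

# "real" time in the images directory vs. "real" time in /tmp, from the run above
slowdown 2m22.159s 0m3.777s    # prints 37.6
```

So the tar/rm run in the images directory took roughly 37 times as long
as the same run in /tmp.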

On Mon, Mar 19, 2012 at 3:06 PM, John Lauro <john.lauro at covenanteyes.com> wrote:
> Bryan, can you reproduce this test and see what times you get?  Just extracting a tar file of the linux kernel...  I had the same problem when I looked into gluster a few months ago.
>
> I would be interested if you can get acceptable performance for this type of operation.
>
> From what I could tell, such slow performance is normal when creating lots and lots of small files.  That is why gluster isn't recommended for home directories, etc...  Reading lots of small files I think is ok, and updating large files is ok, but creating lots of small files is terrible.
>
> To show that large files are fine, you can try the following experiment.  Create a large file (e.g. a 20 GB file with dd), make a loopback device out of it, and then mount the loopback device.  You can then get decent performance out of that loopback running on gluster, but obviously only one machine would be able to mount that loopback device at a time...  That test shows it might work ok for hosting virtual machines.
>
>
> ----- Original Message -----
> From: "Bryan Whitehead" <driver at megahappy.net>
> To: "Chris Webb" <chris at arachsys.com>
> Cc: gluster-users at gluster.org
> Sent: Monday, March 19, 2012 5:00:02 PM
> Subject: Re: [Gluster-users] Slow performance from simple tar -x && rm -r benchmark
>
> I have a number of labs I test my glusterfs installs on, from
> Infiniband 40G w/switch to some cheap $800 boxes on a gig
> network.
>
> None of them exhibit the poor performance I'm seeing in your post - so
> I'm just throwing out the differences I'm seeing with your config vs
> mine.
>
> Another option you might want to try is increasing the max number of threads:
>
> gluster volume set <name> performance.io-thread-count 64
>
>
> On Mon, Mar 19, 2012 at 2:34 AM, Chris Webb <chris at arachsys.com> wrote:
>> Bryan Whitehead <driver at megahappy.net> writes:
>>
>>> I didn't see any sync's after the tar/rm commands...
>>
>> By default, ext4 flushes both metadata and data every five seconds, so a
>> post-benchmark sync tends to make little difference on a reasonably large test,
>> but for completeness:
>>
>>  # time bash -c 'tar xfz ~/linux-3.3-rc7.tgz; sync; rm -rf linux-3.3-rc7; sync'
>>  real    0m23.826s
>>  user    0m20.681s
>>  sys     0m2.392s
>>
>> vs
>>
>>  # time bash -c 'tar xfz ~/linux-3.3-rc7.tgz; sync; rm -rf linux-3.3-rc7; sync'
>>
>>  real    4m24.067s
>>  user    0m24.692s
>>  sys     0m7.588s
>>
>> showing very similar timings and the same effect.
>>
>>> try using xfs instead of ext4.
>>
>> I'll build the xfs tooling, add kernel support, and give this a go, but I'm
>> surprised you think changing the underlying filesystem would eliminate the big
>> gap between native and gluster performance. I could imagine it improving both
>> somewhat, but if anything, I'd expect a higher performance filesystem to
>> amplify the differences. Do you think that glusterfs does something that's
>> particularly expensive on ext4, much more expensive than the operations proxied
>> through it?
>>
>> Best wishes,
>>
>> Chris.