[Gluster-users] gluster local vs local = gluster x4 slower
Jeremy Enos
jenos at ncsa.uiuc.edu
Mon Mar 29 18:21:41 UTC 2010
Got a chance to run your suggested test:
##############GLUSTER SINGLE DISK##############
[root at ac33 gjenos]# dd bs=4096 count=32768 if=/dev/zero of=./filename.test
32768+0 records in
32768+0 records out
134217728 bytes (134 MB) copied, 8.60486 s, 15.6 MB/s
[root at ac33 gjenos]#
[root at ac33 gjenos]# cd /export/jenos/
##############DIRECT SINGLE DISK##############
[root at ac33 jenos]# dd bs=4096 count=32768 if=/dev/zero of=./filename.test
32768+0 records in
32768+0 records out
134217728 bytes (134 MB) copied, 0.21915 s, 612 MB/s
[root at ac33 jenos]#
For any workload that can see a cache benefit, Gluster's performance
can't compare. Is it even using the cache?
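(For what it's worth, the 612 MB/s direct-disk number is mostly page
cache at work- 134 MB written in 0.22 s never really touched the platter.
A cache-neutral version of the same comparison, assuming GNU dd, would
force the data out at the end, or bypass the cache on the write path:

# flush file data to disk before dd reports its timing
dd bs=4096 count=32768 if=/dev/zero of=./filename.test conv=fdatasync
# or bypass the page cache entirely via O_DIRECT
dd bs=4096 count=32768 if=/dev/zero of=./filename.test oflag=direct
)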
This is the client vol file I used for that test:
[root at ac33 jenos]# cat /etc/glusterfs/ghome.vol
#-----------IB remotes------------------
volume ghome
type protocol/client
option transport-type tcp/client
option remote-host ac33
option remote-subvolume ibstripe
end-volume
#------------Performance Options-------------------
volume readahead
type performance/read-ahead
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
end-volume
volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
end-volume
volume cache
type performance/io-cache
option cache-size 2GB
subvolumes writebehind
end-volume
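
One thing I'm wondering about (a guess on my part, not verified): whether
the FUSE client is running in direct-io mode and bypassing the kernel page
cache entirely. If this glusterfs release supports the flag, remounting
with it disabled might be worth a try (the mount point below is just a
placeholder):

glusterfs --volfile=/etc/glusterfs/ghome.vol --direct-io-mode=disable /mnt/ghome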
Any suggestions appreciated. thx-
Jeremy
On 3/26/2010 6:09 PM, Bryan Whitehead wrote:
> One more thought: it looks like (from your emails) you are always running
> the gluster test first. Maybe the tarball is being read from disk
> during the gluster test and then from cache when you run the
> direct-disk test.
>
> What if you just pull a chunk of 0's off /dev/zero?
>
> dd bs=4096 count=32768 if=/dev/zero of=./filename.test
>
> or stick the tar in a ramdisk?
>
> (or run the benchmark 10 times for each, drop the best and the worst,
> and average the remaining 8)
>
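> Something like this would do the 10-run average (rough sketch- the
> tarball path and test dir are placeholders):
>
> for i in $(seq 1 10); do
>     rm -rf ./untar-test && mkdir ./untar-test
>     ( cd ./untar-test && /usr/bin/time -f "%e" \
>         tar xzf /path/to/test.tgz ) 2>> times.txt
> done
> # drop the best and the worst, average the remaining 8
> sort -n times.txt | sed '1d;$d' | awk '{ s += $1 } END { print s / NR }'
>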
> I'd also be curious whether adding another node would halve the time,
> and whether adding another two would halve it again. I guess that
> depends on whether striping or just replication is being used.
> (Unfortunately I don't have access to more than one test box right now.)
>
>> On Wed, Mar 24, 2010 at 11:06 PM, Jeremy Enos <jenos at ncsa.uiuc.edu> wrote:
>
>> For completeness:
>>
>> ##############GLUSTER SINGLE DISK NO PERFORMANCE OPTIONS##############
>> [root at ac33 gjenos]# time (tar xzf
>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
>>
>> real 0m41.052s
>> user 0m7.705s
>> sys 0m3.122s
>> ##############DIRECT SINGLE DISK##############
>> [root at ac33 gjenos]# cd /export/jenos
>> [root at ac33 jenos]# time (tar xzf
>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
>>
>> real 0m22.093s
>> user 0m6.932s
>> sys 0m2.459s
>> [root at ac33 jenos]#
>>
>> The performance options don't appear to be the problem. So the question
>> stands: how do I get the disk-cache advantage through the Gluster-mounted
>> filesystem? It seems to be the key to the large performance difference.
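>>
>> A quick sanity check (just a sketch- the file name is a placeholder)
>> would be reading a file back through the mount twice and comparing; if
>> the second read isn't RAM-speed, the page cache isn't being used at all:
>>
>> dd if=./somefile of=/dev/null bs=4096   # cold read
>> dd if=./somefile of=/dev/null bs=4096   # repeat- fast only if cached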
>>
>> Jeremy
>>
>> On 3/24/2010 4:47 PM, Jeremy Enos wrote:
>>
>>> Good suggestion- I hadn't tried that yet. It brings them much closer.
>>>
>>> ##############GLUSTER SINGLE DISK##############
>>> [root at ac33 gjenos]# time (tar xzf
>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
>>>
>>> real 0m32.089s
>>> user 0m6.516s
>>> sys 0m3.177s
>>> ##############DIRECT SINGLE DISK##############
>>> [root at ac33 gjenos]# cd /export/jenos/
>>> [root at ac33 jenos]# time (tar xzf
>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
>>>
>>> real 0m25.089s
>>> user 0m6.850s
>>> sys 0m2.058s
>>> ##############DIRECT SINGLE DISK CACHED##############
>>> [root at ac33 jenos]# time (tar xzf
>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz )
>>>
>>> real 0m8.955s
>>> user 0m6.785s
>>> sys 0m1.848s
>>>
>>>
>>> Oddly, I'm also seeing better performance on the gluster side than in
>>> previous tests (it used to be ~39 s). The direct-disk time is obviously
>>> benefiting from the cache. There is still a difference, but most of it
>>> disappears with the cache advantage removed. That said, the relative
>>> performance issue still exists with Gluster. What can be done to make it
>>> benefit from the cache the same way direct disk does?
>>> thx-
>>>
>>> Jeremy
>>>
>>> P.S.
>>> I'll be posting results w/ performance options completely removed from
>>> gluster as soon as I get a chance.
>>>
>>> Jeremy
>>>
>>> On 3/24/2010 4:23 PM, Bryan Whitehead wrote:
>>>
>>>> I'd like to see results with this:
>>>>
>>>> time ( tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz &&
>>>> sync )
>>>>
>>>> I've found that local filesystems seem to use the cache very heavily.
>>>> The untarred files could mostly be sitting in RAM with the local fs,
>>>> versus going through FUSE (which might do many more synced flushes to
>>>> disk?).
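>>>>
>>>> One way to take the cache out of both runs (sketch, needs root): flush
>>>> and drop the page cache before each test:
>>>>
>>>> sync                               # flush dirty pages first
>>>> echo 3 > /proc/sys/vm/drop_caches  # drop pagecache + dentries/inodes
>>>> time ( tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )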
>>>>
>>>> On Wed, Mar 24, 2010 at 2:25 AM, Jeremy Enos <jenos at ncsa.uiuc.edu> wrote:
>>>>
>>>>> I also neglected to mention that the underlying filesystem is ext3.
>>>>>
>>>>> On 3/24/2010 3:44 AM, Jeremy Enos wrote:
>>>>>
>>>>>> I haven't tried it with all performance options disabled yet- I can
>>>>>> try that tomorrow when the resource frees up. I was actually asking
>>>>>> first, before blindly trying different configuration matrices, in case
>>>>>> there's a clear direction I should take with it. I'll let you know.
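>>>>>>
>>>>>> For what it's worth, the stripped-down test would be just the
>>>>>> protocol/client volume with the three performance translators
>>>>>> removed- something like:
>>>>>>
>>>>>> volume ghome
>>>>>>   type protocol/client
>>>>>>   option transport-type ib-verbs/client
>>>>>>   option remote-host acfs
>>>>>>   option remote-subvolume raid
>>>>>> end-volume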
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>> On 3/24/2010 2:54 AM, Stephan von Krawczynski wrote:
>>>>>>
>>>>>>> Hi Jeremy,
>>>>>>>
>>>>>>> have you tried to reproduce this with all performance options
>>>>>>> disabled? They are possibly not a good idea on a local system.
>>>>>>> What local fs do you use?
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Stephan
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 23 Mar 2010 19:11:28 -0500
>>>>>>> Jeremy Enos <jenos at ncsa.uiuc.edu> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Stephan is correct- I primarily did this test to demonstrate the
>>>>>>>> overhead I'm trying to eliminate. It's pronounced enough that it
>>>>>>>> can be seen on a single-disk / single-node configuration, which is
>>>>>>>> good in a way (so anyone can easily repro).
>>>>>>>>
>>>>>>>> My distributed/clustered solution would be ideal if it were fast
>>>>>>>> enough for small-block I/O as well as large-block. I was hoping that
>>>>>>>> single-node systems would achieve that, hence the single-node test.
>>>>>>>> Because the single-node test performed poorly, I eventually reduced
>>>>>>>> down to a single disk to see if the overhead could still be seen,
>>>>>>>> and it clearly can be. Perhaps it's something in my configuration?
>>>>>>>> I've pasted my config files below.
>>>>>>>> thx-
>>>>>>>>
>>>>>>>> Jeremy
>>>>>>>>
>>>>>>>> ######################glusterfsd.vol######################
>>>>>>>> volume posix
>>>>>>>> type storage/posix
>>>>>>>> option directory /export
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume locks
>>>>>>>> type features/locks
>>>>>>>> subvolumes posix
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume disk
>>>>>>>> type performance/io-threads
>>>>>>>> option thread-count 4
>>>>>>>> subvolumes locks
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume server-ib
>>>>>>>> type protocol/server
>>>>>>>> option transport-type ib-verbs/server
>>>>>>>> option auth.addr.disk.allow *
>>>>>>>> subvolumes disk
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume server-tcp
>>>>>>>> type protocol/server
>>>>>>>> option transport-type tcp/server
>>>>>>>> option auth.addr.disk.allow *
>>>>>>>> subvolumes disk
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> ######################ghome.vol######################
>>>>>>>>
>>>>>>>> #-----------IB remotes------------------
>>>>>>>> volume ghome
>>>>>>>> type protocol/client
>>>>>>>> option transport-type ib-verbs/client
>>>>>>>> # option transport-type tcp/client
>>>>>>>> option remote-host acfs
>>>>>>>> option remote-subvolume raid
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> #------------Performance Options-------------------
>>>>>>>>
>>>>>>>> volume readahead
>>>>>>>> type performance/read-ahead
>>>>>>>> option page-count 4 # 2 is default option
>>>>>>>> option force-atime-update off # default is off
>>>>>>>> subvolumes ghome
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume writebehind
>>>>>>>> type performance/write-behind
>>>>>>>> option cache-size 1MB
>>>>>>>> subvolumes readahead
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> volume cache
>>>>>>>> type performance/io-cache
>>>>>>>> option cache-size 1GB
>>>>>>>> subvolumes writebehind
>>>>>>>> end-volume
>>>>>>>>
>>>>>>>> ######################END######################
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/23/2010 6:02 AM, Stephan von Krawczynski wrote:
>>>>>>>>
>>>>>>>>> On Tue, 23 Mar 2010 02:59:35 -0600 (CST)
>>>>>>>>> "Tejas N. Bhise"<tejas at gluster.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Out of curiosity, if you want to do stuff only on one machine,
>>>>>>>>>> why do you want to use a distributed, multi-node, clustered
>>>>>>>>>> file system?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Because what he does is a very good way to show the overhead
>>>>>>>>> produced only by glusterfs and nothing else (i.e. no network
>>>>>>>>> involved). A pretty relevant test scenario, I would say.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Stephan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Am I missing something here?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Tejas.
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Jeremy Enos"<jenos at ncsa.uiuc.edu>
>>>>>>>>>> To: gluster-users at gluster.org
>>>>>>>>>> Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai,
>>>>>>>>>> Kolkata,
>>>>>>>>>> Mumbai, New Delhi
>>>>>>>>>> Subject: [Gluster-users] gluster local vs local = gluster x4 slower
>>>>>>>>>>
>>>>>>>>>> This test is pretty easy to replicate anywhere- it only takes one
>>>>>>>>>> disk, one machine, one tarball. Untarring directly to local disk
>>>>>>>>>> is about 4.5x faster than going through gluster. At first I
>>>>>>>>>> thought this might be due to a slow host (2.4 GHz Opteron). But
>>>>>>>>>> it's not- the same configuration on a much faster machine (dual
>>>>>>>>>> 3.33 GHz Xeons) yields the performance below.
>>>>>>>>>>
>>>>>>>>>> ####THIS TEST WAS TO A LOCAL DISK THRU GLUSTER####
>>>>>>>>>> [root at ac33 jenos]# time tar xzf
>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>>>>>>>>
>>>>>>>>>> real 0m41.290s
>>>>>>>>>> user 0m14.246s
>>>>>>>>>> sys 0m2.957s
>>>>>>>>>>
>>>>>>>>>> ####THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)####
>>>>>>>>>> [root at ac33 jenos]# cd /export/jenos/
>>>>>>>>>> [root at ac33 jenos]# time tar xzf
>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>>>>>>>>
>>>>>>>>>> real 0m8.983s
>>>>>>>>>> user 0m6.857s
>>>>>>>>>> sys 0m1.844s
>>>>>>>>>>
>>>>>>>>>> ####THESE ARE TEST FILE DETAILS####
>>>>>>>>>> [root at ac33 jenos]# tar tzvf
>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz |wc -l
>>>>>>>>>> 109
>>>>>>>>>> [root at ac33 jenos]# ls -l
>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>>>>>>>> -rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32
>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>>>>>>>> [root at ac33 jenos]#
>>>>>>>>>>
>>>>>>>>>> These are the relevant performance options I'm using in my .vol
>>>>>>>>>> file:
>>>>>>>>>>
>>>>>>>>>> #------------Performance Options-------------------
>>>>>>>>>>
>>>>>>>>>> volume readahead
>>>>>>>>>> type performance/read-ahead
>>>>>>>>>> option page-count 4 # 2 is default option
>>>>>>>>>> option force-atime-update off # default is off
>>>>>>>>>> subvolumes ghome
>>>>>>>>>> end-volume
>>>>>>>>>>
>>>>>>>>>> volume writebehind
>>>>>>>>>> type performance/write-behind
>>>>>>>>>> option cache-size 1MB
>>>>>>>>>> subvolumes readahead
>>>>>>>>>> end-volume
>>>>>>>>>>
>>>>>>>>>> volume cache
>>>>>>>>>> type performance/io-cache
>>>>>>>>>> option cache-size 1GB
>>>>>>>>>> subvolumes writebehind
>>>>>>>>>> end-volume
>>>>>>>>>>
>>>>>>>>>> What can I do to improve gluster's performance?
>>>>>>>>>>
>>>>>>>>>> Jeremy
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>