[Gluster-devel] Performance question.

Chris Johnson johnson at nmr.mgh.harvard.edu
Wed Nov 21 18:26:24 UTC 2007


On Wed, 21 Nov 2007, Anand Avati wrote:

      I'll try and find out.

      Also, is it the case that glusterfs will always be noticeably
slower than NFS?
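
      For reference, a minimal sketch of the client-side stack suggested
further down the thread, with write-behind and io-cache layered over the
protocol/client volume.  The volume names writebehind and iocache are
hypothetical, the 256MB cache-size is only the example figure floated below,
and the order of the two translators is an assumption, since the thread does
not settle which should come first.

```
# Client-side volfile sketch (assumed layout, not a tested configuration)
volume client
   type protocol/client
   option transport-type tcp/client
   option remote-host xxx.xxx.xxx.xxx
   option remote-subvolume readahead
end-volume

# write-behind loaded on the client, as suggested in the thread
volume writebehind
   type performance/write-behind
   option aggregate-size 131072   # in bytes; value copied from the server config
   subvolumes client
end-volume

# io-cache loaded on the client; 256MB is the example size, adjust to spare RAM
volume iocache
   type performance/io-cache
   option cache-size 256MB
   subvolumes writebehind
end-volume
```

With this layout the matching write-behind block would come out of the
server config, so that the translator is loaded on one side only.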

> Chris,
>  what cache-size did you configure in io-cache? Is it possible to share
> throughput benchmarks using dd (both read and write) also what is the
> io-zone performance at 128kB reclen?
>
> avati
>
> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>>
>> On Wed, 21 Nov 2007, Chris Johnson wrote:
>>
>> Ok, caching and write-behind moved to the client side.  There is some
>> improvement.
>>
>>
>>                                                          random   random     bkwd   record   stride
>>        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>>    131072       32      312      312      361      363     1453      322      677      320      753      312      312      369      363
>>
>> but as you can see it's marginal.  Is this typical, i.e. being an
>> order of magnitude slower than NFS?
>>
>>> On Wed, 21 Nov 2007, Anand Avati wrote:
>>>
>>>     See I asked if there was a philosophy about how to build a stack.
>>> Never got a response until now.
>>>
>>>     Caching won't help in the real application, I don't believe.
>>> Mostly it's read, crunch, write.  If I'm wrong here please let me
>>> know.  Although I don't believe it will hurt.  I'll give moving
>>> write-behind and io-cache to the client a try and see what happens.  Does it
>>> matter how they're stacked, i.e. which comes first?
>>>
>>>> You should also be loading io-cache on the client side with a decent
>>>> cache-size (like 256MB? depends on how much RAM you have to spare). this
>>>> will help re-read improve a lot.
>>>>
>>>> avati
>>>>
>>>> 2007/11/21, Anand Avati <avati at zresearch.com>:
>>>>>
>>>>> Chris,
>>>>>  you should really be loading write-behind on the client side, that is what
>>>>> improves write performance the most. Do let us know the results with
>>>>> write-behind on the client side.
>>>>>
>>>>> avati
>>>>>
>>>>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>>>>>>
>>>>>>       Hi, again,
>>>>>>
>>>>>>       I asked about stack building philosophy.  Apparently there isn't
>>>>>> one.  So I tried a few things.  The configs are down at the end here.
>>>>>>
>>>>>>       Two systems, CentOS5, both running fuse-devel-2.7.0-1 gluster
>>>>>> enhanced, glusterfs-1.3.5-2.  Both have gigabit ethernet, server runs
>>>>>> a SATABeast.  Currently I get the following from iozone.
>>>>>>
>>>>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
>>>>>>
>>>>>>
>>>>>>                                                          random   random     bkwd   record   stride
>>>>>>        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>>>>>>    131072       32      589      587      345      343      818      621      757      624      845      592      591      346      366
>>>>>>
>>>>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
>>>>>> RAID card gives this
>>>>>>
>>>>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
>>>>>>
>>>>>>
>>>>>>                                                          random   random     bkwd   record   stride
>>>>>>        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>>>>>>    131072       32       27       26      292       11       11       24      542        9      539       30       28      295       11
>>>>>>
>>>>>> And you can see that the NFS system is faster.  Is this because of the
>>>>>> hardware 3ware RAID, or is NFS really that much faster here?  Is there
>>>>>> a better way to stack this that would improve things?  I also tried with
>>>>>> and without striping.  No noticeable difference in gluster performance.
>>>>>>
>>>>>>       Help appreciated.
>>>>>>
>>>>>> ============  server config
>>>>>>
>>>>>> volume brick1
>>>>>>    type storage/posix
>>>>>>    option directory /home/sdm1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick2
>>>>>>    type storage/posix
>>>>>>    option directory /home/sdl1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick3
>>>>>>    type storage/posix
>>>>>>    option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick4
>>>>>>    type storage/posix
>>>>>>    option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume ns-brick
>>>>>>    type storage/posix
>>>>>>    option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume stripe1
>>>>>>   type cluster/stripe
>>>>>>   subvolumes brick1 brick2
>>>>>> # option block-size *:10KB,
>>>>>> end-volume
>>>>>>
>>>>>> volume stripe2
>>>>>>   type cluster/stripe
>>>>>>   subvolumes brick3 brick4
>>>>>> # option block-size *:10KB,
>>>>>> end-volume
>>>>>>
>>>>>> volume unify0
>>>>>>   type cluster/unify
>>>>>>   subvolumes stripe1 stripe2
>>>>>>   option namespace ns-brick
>>>>>>   option scheduler rr
>>>>>> # option rr.limits.min-disk-free 5
>>>>>> end-volume
>>>>>>
>>>>>> volume iot
>>>>>>   type performance/io-threads
>>>>>>   subvolumes unify0
>>>>>>   option thread-count 8
>>>>>> end-volume
>>>>>>
>>>>>> volume writebehind
>>>>>>    type performance/write-behind
>>>>>>    option aggregate-size 131072 # in bytes
>>>>>>    subvolumes iot
>>>>>> end-volume
>>>>>>
>>>>>> volume readahead
>>>>>>    type performance/read-ahead
>>>>>> #  option page-size 65536 ### in bytes
>>>>>>    option page-size 128kb ### in bytes
>>>>>> #  option page-count 16 ### memory cache size is page-count x page-size per file
>>>>>>    option page-count 2 ### memory cache size is page-count x page-size per file
>>>>>>    subvolumes writebehind
>>>>>> end-volume
>>>>>>
>>>>>> volume server
>>>>>>    type protocol/server
>>>>>>    subvolumes readahead
>>>>>>    option transport-type tcp/server     # For TCP/IP transport
>>>>>> #  option client-volume-filename /etc/glusterfs/glusterfs-client.vol
>>>>>>    option auth.ip.readahead.allow *
>>>>>> end-volume
>>>>>>
>>>>>>
>>>>>> ============  client config
>>>>>>
>>>>>> volume client
>>>>>>    type protocol/client
>>>>>>    option transport-type tcp/client
>>>>>>    option remote-host xxx.xxx.xxx.xxx
>>>>>>    option remote-subvolume readahead
>>>>>> end-volume
>>>>>>
>>>>>>
>>>>>> -------------------------------------------------------------------------------
>>>>>> Chris Johnson               |Internet: johnson at nmr.mgh.harvard.edu
>>>>>> Systems Administrator       |Web:      http://www.nmr.mgh.harvard.edu/~johnson
>>>>>> NMR Center                  |Voice:    617.726.0949
>>>>>> Mass. General Hospital      |FAX:      617.726.7422
>>>>>> 149 (2301) 13th Street      |A compromise is a solution nobody is happy with.
>>>>>> Charlestown, MA., 02129 USA |     Observation, Unknown
>>>>>> -------------------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-devel mailing list
>>>>>> Gluster-devel at nongnu.org
>>>>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> It always takes longer than you expect, even when you take into account
>>>>> Hofstadter's Law.
>>>>>
>>>>> -- Hofstadter's Law
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>

------------------------------------------------------------------------------- 
Chris Johnson               |Internet: johnson at nmr.mgh.harvard.edu
Systems Administrator       |Web:      http://www.nmr.mgh.harvard.edu/~johnson
NMR Center                  |Voice:    617.726.0949
Mass. General Hospital      |FAX:      617.726.7422
149 (2301) 13th Street      |"A good engineer never reinvents the wheel when
Charlestown, MA., 02129 USA |an existing one with modifications will do." Me 
-------------------------------------------------------------------------------




