[Gluster-devel] Performance question.
Anand Avati
avati at zresearch.com
Wed Nov 21 18:22:10 UTC 2007
Chris,
What cache-size did you configure in io-cache? Is it possible to share
throughput benchmarks using dd (both read and write)? Also, what is the
iozone performance at a 128kB reclen?
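
For reference, a client-side vol-file with io-cache stacked over write-behind might look like the sketch below. The volume names, cache-size, and aggregate-size values are illustrative only (not a tested recommendation), and the remote-host is a placeholder:

```
volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host xxx.xxx.xxx.xxx   # placeholder server address
  option remote-subvolume readahead
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072         # in bytes
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  option cache-size 256MB              # illustrative; depends on spare RAM
  subvolumes writebehind
end-volume
```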
avati
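
The dd benchmark requested above could be run along these lines (a sketch: MNT is a placeholder for the GlusterFS mount point, and the sizes here are small for illustration; use a file larger than RAM for a meaningful result):

```shell
# Sketch of a sequential dd read/write throughput test.
# MNT is a placeholder -- point it at your GlusterFS mount.
MNT=${MNT:-/tmp}

# Sequential write: 128 kB blocks; conv=fsync makes dd flush before
# reporting, so the timing reflects real disk/network cost.
dd if=/dev/zero of="$MNT/ddtest" bs=128k count=64 conv=fsync

# Sequential read of the same file back.
dd if="$MNT/ddtest" of=/dev/null bs=128k

rm -f "$MNT/ddtest"
```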
2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>
> On Wed, 21 Nov 2007, Chris Johnson wrote:
>
> Ok, caching and write-behind moved to the client side. There is some
> improvement.
>
>
>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> 131072      32    312      312   361     363      1453        322        677          320          753     312       312    369      363
>
> But as you can see, the improvement is marginal. Is this typical, i.e. is
> GlusterFS normally an order of magnitude slower than NFS?
>
> > On Wed, 21 Nov 2007, Anand Avati wrote:
> >
> > See, I asked if there was a philosophy about how to build a stack.
> > Never got a response until now.
> >
> > Caching won't help in the real application, I don't believe.
> > Mostly it's read, crunch, write. If I'm wrong here please let me
> > know, although I don't believe it will hurt. I'll try moving
> > write-behind and io-cache to the client and see what happens. Does it
> > matter how they're stacked, i.e. which comes first?
> >
> >> You should also be loading io-cache on the client side with a decent
> >> cache-size (like 256MB? Depends on how much RAM you have to spare). This
> >> will help re-read performance a lot.
> >>
> >> avati
> >>
> >> 2007/11/21, Anand Avati <avati at zresearch.com>:
> >>>
> >>> Chris,
> >>> you should really be loading write-behind on the client side; that is
> >>> what improves write performance the most. Do let us know the results
> >>> with write-behind on the client side.
> >>>
> >>> avati
> >>>
> >>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>>>
> >>>> Hi, again,
> >>>>
> >>>> I asked about stack-building philosophy. Apparently there
> >>>> isn't one, so I tried a few things. The configs are down at the end here.
> >>>>
> >>>> Two systems, CentOS5, both running fuse-devel-2.7.0-1 (Gluster
> >>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the
> >>>> server runs a SATABeast. Currently I get the following from iozone.
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
> >>>>
> >>>>
> >>>>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> >>>> 131072      32    589      587   345     343       818        621        757          624          845     592       591    346      366
> >>>>
> >>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
> >>>> RAID card gives this
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
> >>>>
> >>>>
> >>>>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> >>>> 131072      32     27       26   292      11        11         24        542            9          539      30        28    295       11
> >>>>
> >>>> And you can see that the NFS system is faster (iozone -N reports
> >>>> microseconds per operation, so lower numbers are faster). Is this
> >>>> because of the hardware 3ware RAID, or is NFS really that much faster
> >>>> here? Is there a better way to stack this that would improve things?
> >>>> I tried with and without striping; no noticeable difference in
> >>>> Gluster performance.
> >>>>
> >>>> Help appreciated.
> >>>>
> >>>> ============ server config
> >>>>
> >>>> volume brick1
> >>>> type storage/posix
> >>>> option directory /home/sdm1
> >>>> end-volume
> >>>>
> >>>> volume brick2
> >>>> type storage/posix
> >>>> option directory /home/sdl1
> >>>> end-volume
> >>>>
> >>>> volume brick3
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume brick4
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume ns-brick
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume stripe1
> >>>> type cluster/stripe
> >>>> subvolumes brick1 brick2
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume stripe2
> >>>> type cluster/stripe
> >>>> subvolumes brick3 brick4
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume unify0
> >>>> type cluster/unify
> >>>> subvolumes stripe1 stripe2
> >>>> option namespace ns-brick
> >>>> option scheduler rr
> >>>> # option rr.limits.min-disk-free 5
> >>>> end-volume
> >>>>
> >>>> volume iot
> >>>> type performance/io-threads
> >>>> subvolumes unify0
> >>>> option thread-count 8
> >>>> end-volume
> >>>>
> >>>> volume writebehind
> >>>> type performance/write-behind
> >>>> option aggregate-size 131072 # in bytes
> >>>> subvolumes iot
> >>>> end-volume
> >>>>
> >>>> volume readahead
> >>>> type performance/read-ahead
> >>>> # option page-size 65536  ### in bytes
> >>>> option page-size 128kb    ### in bytes
> >>>> # option page-count 16    ### memory cache size is page-count x page-size per file
> >>>> option page-count 2       ### memory cache size is page-count x page-size per file
> >>>> subvolumes writebehind
> >>>> end-volume
> >>>>
> >>>> volume server
> >>>> type protocol/server
> >>>> subvolumes readahead
> >>>> option transport-type tcp/server # For TCP/IP transport
> >>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
> >>>> option auth.ip.readahead.allow *
> >>>> end-volume
> >>>>
> >>>>
> >>>> ============ client config
> >>>>
> >>>> volume client
> >>>> type protocol/client
> >>>> option transport-type tcp/client
> >>>> option remote-host xxx.xxx.xxx.xxx
> >>>> option remote-subvolume readahead
> >>>> end-volume
> >>>>
> >>>>
> >>>>
> -------------------------------------------------------------------------------
> >>>> Chris Johnson               |Internet: johnson at nmr.mgh.harvard.edu
> >>>> Systems Administrator       |Web: http://www.nmr.mgh.harvard.edu/~johnson
> >>>> NMR Center                  |Voice: 617.726.0949
> >>>> Mass. General Hospital      |FAX: 617.726.7422
> >>>> 149 (2301) 13th Street      |A compromise is a solution nobody is happy with.
> >>>> Charlestown, MA., 02129 USA |Observation, Unknown
> -------------------------------------------------------------------------------
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Gluster-devel mailing list
> >>>> Gluster-devel at nongnu.org
> >>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> It always takes longer than you expect, even when you take into account
> >>> Hofstadter's Law.
> >>>
> >>> -- Hofstadter's Law
> >>
> >>
> >
> >
>
--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.
-- Hofstadter's Law