[Gluster-devel] Performance question.
Anand Avati
avati at zresearch.com
Wed Nov 21 18:38:59 UTC 2007
> I'll try and find out.
>
> Also, is it the case that glusterfs will always be noticeably
> slower than NFS?
For metadata operations NFS can be faster. But for file I/O our tests show
that most of the time GlusterFS is faster, especially for block sizes >
64K.

You should try io-cache configured with enough cache-size to fit your
dataset in RAM with some headroom (say 256MB in your case).
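
Something like this on the client side should do it. This is an untested
sketch: the remote-host address is a placeholder, and the option names
should be checked against the translator docs for the release you are
running.

  volume client
    type protocol/client
    option transport-type tcp/client
    option remote-host xxx.xxx.xxx.xxx   # your server's address (placeholder)
    option remote-subvolume readahead    # name of the exported server volume
  end-volume

  volume writebehind
    type performance/write-behind
    option aggregate-size 131072         # flush writes in 128KB chunks
    subvolumes client
  end-volume

  volume iocache
    type performance/io-cache
    option cache-size 256MB              # large enough to hold your dataset
    option page-size 128KB
    subvolumes writebehind
  end-volume

  # then mount the topmost volume, e.g. (path is a placeholder):
  #   glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs

Order matters in the sense that each translator only sees the operations
passed down by the one above it; with io-cache on top, repeated reads are
served straight from the cache before write-behind or the network is ever
involved.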
thanks,
avati
> Chris,
> > what cache-size did you configure in io-cache? Is it possible to share
> > throughput benchmark results using dd (both read and write)? Also, what
> > is the iozone performance at 128KB reclen?
> >
> > avati
> >
> > 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>
> >> On Wed, 21 Nov 2007, Chris Johnson wrote:
> >>
> >> OK, caching and write-behind have been moved to the client side. There
> >> is some improvement.
> >>
> >>
> >>                                                random  random    bkwd  record  stride
> >>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>   131072      32     312     312     361     363    1453     322     677     320     753     312      312     369     363
> >>
> >> but as you can see, the improvement is marginal. Is this typical,
> >> i.e., being an order of magnitude slower than NFS?
> >>
> >>> On Wed, 21 Nov 2007, Anand Avati wrote:
> >>>
> >>> See, I asked if there was a philosophy about how to build a stack
> >>> and never got a response until now.
> >>>
> >>> Caching won't help in the real application, I don't believe.
> >>> Mostly it's read, crunch, write. If I'm wrong here please let me
> >>> know, although I don't believe it will hurt. I'll try moving
> >>> write-behind and io-cache to the client and see what happens. Does it
> >>> matter how they're stacked, i.e., which comes first?
> >>>
> >>>> You should also be loading io-cache on the client side with a decent
> >>>> cache-size (like 256MB? depends on how much RAM you have to spare).
> >>>> This will help re-reads improve a lot.
> >>>>
> >>>> avati
> >>>>
> >>>> 2007/11/21, Anand Avati <avati at zresearch.com>:
> >>>>>
> >>>>> Chris,
> >>>>> you should really be loading write-behind on the client side; that
> >>>>> is what improves write performance the most. Do let us know the
> >>>>> results with write-behind on the client side.
> >>>>>
> >>>>> avati
> >>>>>
> >>>>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>>>>>
> >>>>>> Hi, again,
> >>>>>>
> >>>>>> I asked about stack-building philosophy. Apparently there isn't
> >>>>>> one. So I tried a few things. The configs are down at the end here.
> >>>>>>
> >>>>>> Two systems, CentOS5, both running fuse-devel-2.7.0-1 (gluster
> >>>>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the
> >>>>>> server runs a SATABeast. Currently I get the following from iozone.
> >>>>>>
> >>>>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
> >>>>>>
> >>>>>>
> >>>>>>                                                random  random    bkwd  record  stride
> >>>>>>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>>>>>   131072      32     589     587     345     343     818     621     757     624     845     592      591     346     366
> >>>>>>
> >>>>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
> >>>>>> RAID card gives this
> >>>>>>
> >>>>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
> >>>>>>
> >>>>>>
> >>>>>>                                                random  random    bkwd  record  stride
> >>>>>>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>>>>>   131072      32      27      26     292      11      11      24     542       9     539      30       28     295      11
> >>>>>>
> >>>>>> And you can see that the NFS system is faster. Is this because of
> >>>>>> the hardware 3ware RAID, or is NFS really that much faster here? Is
> >>>>>> there a better way to stack this that would improve things? And I
> >>>>>> tried with and without striping; no noticeable difference in gluster
> >>>>>> performance.
> >>>>>>
> >>>>>> Help appreciated.
> >>>>>>
> >>>>>> ============ server config
> >>>>>>
> >>>>>> volume brick1
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdm1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick2
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdl1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick3
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick4
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume ns-brick
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume stripe1
> >>>>>> type cluster/stripe
> >>>>>> subvolumes brick1 brick2
> >>>>>> # option block-size *:10KB,
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume stripe2
> >>>>>> type cluster/stripe
> >>>>>> subvolumes brick3 brick4
> >>>>>> # option block-size *:10KB,
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume unify0
> >>>>>> type cluster/unify
> >>>>>> subvolumes stripe1 stripe2
> >>>>>> option namespace ns-brick
> >>>>>> option scheduler rr
> >>>>>> # option rr.limits.min-disk-free 5
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume iot
> >>>>>> type performance/io-threads
> >>>>>> subvolumes unify0
> >>>>>> option thread-count 8
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume writebehind
> >>>>>> type performance/write-behind
> >>>>>> option aggregate-size 131072 # in bytes
> >>>>>> subvolumes iot
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume readahead
> >>>>>> type performance/read-ahead
> >>>>>> # option page-size 65536 ### in bytes
> >>>>>> option page-size 128kb ### in bytes
> >>>>>> # option page-count 16 ### memory cache size is page-count x page-size per file
> >>>>>> option page-count 2 ### memory cache size is page-count x page-size per file
> >>>>>> subvolumes writebehind
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume server
> >>>>>> type protocol/server
> >>>>>> subvolumes readahead
> >>>>>> option transport-type tcp/server # For TCP/IP transport
> >>>>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
> >>>>>> option auth.ip.readahead.allow *
> >>>>>> end-volume
> >>>>>>
> >>>>>>
> >>>>>> ============ client config
> >>>>>>
> >>>>>> volume client
> >>>>>> type protocol/client
> >>>>>> option transport-type tcp/client
> >>>>>> option remote-host xxx.xxx.xxx.xxx
> >>>>>> option remote-subvolume readahead
> >>>>>> end-volume
> >>>>>>
> >>>>>>
> >>>>>>
> -------------------------------------------------------------------------------
> Chris Johnson |Internet: johnson at nmr.mgh.harvard.edu
> Systems Administrator |Web:
> http://www.nmr.mgh.harvard.edu/~johnson
> NMR Center |Voice: 617.726.0949
> Mass. General Hospital |FAX: 617.726.7422
> 149 (2301) 13th Street |"A good engineer never reinvents the wheel
> when
> Charlestown, MA., 02129 USA |an existing one with modifications will do."
> Me
>
> -------------------------------------------------------------------------------
>
--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.
-- Hofstadter's Law