[Gluster-devel] Performance question.
Anand Avati
avati at zresearch.com
Wed Nov 21 18:38:59 UTC 2007
> I'll try and find out.
>
> Also, is it the case that glusterfs will always be noticeably
> slower than NFS?
For metadata operations NFS can be faster. But for file I/O our tests show
that most of the time GlusterFS is faster, especially for block sizes >
64K.

You should try io-cache configured with enough cache-size to fit your
dataset in RAM with some headroom (say 256MB in your case).
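
Something like this on the client side should do it. This is an untested
sketch: the remote-host address is a placeholder, and the option names
should be checked against the translator docs for the release you are
running.

  volume client
    type protocol/client
    option transport-type tcp/client
    option remote-host xxx.xxx.xxx.xxx   # your server's address (placeholder)
    option remote-subvolume readahead    # name of the exported server volume
  end-volume

  volume writebehind
    type performance/write-behind
    option aggregate-size 131072         # flush writes in 128KB chunks
    subvolumes client
  end-volume

  volume iocache
    type performance/io-cache
    option cache-size 256MB              # large enough to hold your dataset
    option page-size 128KB
    subvolumes writebehind
  end-volume

  # then mount the topmost volume, e.g. (path is a placeholder):
  #   glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs

Order matters in the sense that each translator only sees the operations
passed down by the one above it; with io-cache on top, repeated reads are
served straight from the cache before write-behind or the network is ever
involved.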
thanks,
avati
> Chris,
> > what cache-size did you configure in io-cache? Is it possible to share
> > throughput benchmark results using dd (both read and write)? Also, what
> > is the iozone performance at 128KB reclen?
> >
> > avati
> >
> > 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>
> >> On Wed, 21 Nov 2007, Chris Johnson wrote:
> >>
> >> OK, caching and write-behind have been moved to the client side. There
> >> is some improvement.
> >>
> >>
> >>                                                random  random    bkwd  record  stride
> >>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>   131072      32     312     312     361     363    1453     322     677     320     753     312      312     369     363
> >>
> >> but as you can see, the improvement is marginal. Is this typical,
> >> i.e., being an order of magnitude slower than NFS?
> >>
> >>> On Wed, 21 Nov 2007, Anand Avati wrote:
> >>>
> >>> See, I asked if there was a philosophy about how to build a stack
> >>> and never got a response until now.
> >>>
> >>> Caching won't help in the real application, I don't believe.
> >>> Mostly it's read, crunch, write. If I'm wrong here please let me
> >>> know, although I don't believe it will hurt. I'll try moving
> >>> write-behind and io-cache to the client and see what happens. Does it
> >>> matter how they're stacked, i.e., which comes first?
> >>>
> >>>> You should also be loading io-cache on the client side with a decent
> >>>> cache-size (like 256MB? depends on how much RAM you have to spare).
> >>>> This will help re-reads improve a lot.
> >>>>
> >>>> avati
> >>>>
> >>>> 2007/11/21, Anand Avati <avati at zresearch.com>:
> >>>>>
> >>>>> Chris,
> >>>>> you should really be loading write-behind on the client side; that
> >>>>> is what improves write performance the most. Do let us know the
> >>>>> results with write-behind on the client side.
> >>>>>
> >>>>> avati
> >>>>>
> >>>>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>>>>>
> >>>>>> Hi, again,
> >>>>>>
> >>>>>> I asked about stack-building philosophy. Apparently there isn't
> >>>>>> one. So I tried a few things. The configs are down at the end here.
> >>>>>>
> >>>>>> Two systems, CentOS5, both running fuse-devel-2.7.0-1 (gluster
> >>>>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the
> >>>>>> server runs a SATABeast. Currently I get the following from iozone.
> >>>>>>
> >>>>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
> >>>>>>
> >>>>>>
> >>>>>>                                                random  random    bkwd  record  stride
> >>>>>>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>>>>>   131072      32     589     587     345     343     818     621     757     624     845     592      591     346     366
> >>>>>>
> >>>>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
> >>>>>> RAID card gives this
> >>>>>>
> >>>>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
> >>>>>>
> >>>>>>
> >>>>>>                                                random  random    bkwd  record  stride
> >>>>>>       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
> >>>>>>   131072      32      27      26     292      11      11      24     542       9     539      30       28     295      11
> >>>>>>
> >>>>>> And you can see that the NFS system is faster. Is this because of
> >>>>>> the hardware 3ware RAID, or is NFS really that much faster here? Is
> >>>>>> there a better way to stack this that would improve things? And I
> >>>>>> tried with and without striping; no noticeable difference in gluster
> >>>>>> performance.
> >>>>>>
> >>>>>> Help appreciated.
> >>>>>>
> >>>>>> ============ server config
> >>>>>>
> >>>>>> volume brick1
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdm1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick2
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdl1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick3
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume brick4
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume ns-brick
> >>>>>> type storage/posix
> >>>>>> option directory /home/sdk1
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume stripe1
> >>>>>> type cluster/stripe
> >>>>>> subvolumes brick1 brick2
> >>>>>> # option block-size *:10KB,
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume stripe2
> >>>>>> type cluster/stripe
> >>>>>> subvolumes brick3 brick4
> >>>>>> # option block-size *:10KB,
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume unify0
> >>>>>> type cluster/unify
> >>>>>> subvolumes stripe1 stripe2
> >>>>>> option namespace ns-brick
> >>>>>> option scheduler rr
> >>>>>> # option rr.limits.min-disk-free 5
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume iot
> >>>>>> type performance/io-threads
> >>>>>> subvolumes unify0
> >>>>>> option thread-count 8
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume writebehind
> >>>>>> type performance/write-behind
> >>>>>> option aggregate-size 131072 # in bytes
> >>>>>> subvolumes iot
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume readahead
> >>>>>> type performance/read-ahead
> >>>>>> # option page-size 65536 ### in bytes
> >>>>>> option page-size 128kb ### in bytes
> >>>>>> # option page-count 16 ### memory cache size is page-count x page-size per file
> >>>>>> option page-count 2 ### memory cache size is page-count x page-size per file
> >>>>>> subvolumes writebehind
> >>>>>> end-volume
> >>>>>>
> >>>>>> volume server
> >>>>>> type protocol/server
> >>>>>> subvolumes readahead
> >>>>>> option transport-type tcp/server # For TCP/IP transport
> >>>>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
> >>>>>> option auth.ip.readahead.allow *
> >>>>>> end-volume
> >>>>>>
> >>>>>>
> >>>>>> ============ client config
> >>>>>>
> >>>>>> volume client
> >>>>>> type protocol/client
> >>>>>> option transport-type tcp/client
> >>>>>> option remote-host xxx.xxx.xxx.xxx
> >>>>>> option remote-subvolume readahead
> >>>>>> end-volume
> >>>>>>
> >>>>>>
> >>>>>>
> -------------------------------------------------------------------------------
> Chris Johnson |Internet: johnson at nmr.mgh.harvard.edu
> Systems Administrator |Web:
> http://www.nmr.mgh.harvard.edu/~johnson
> NMR Center |Voice: 617.726.0949
> Mass. General Hospital |FAX: 617.726.7422
> 149 (2301) 13th Street |"A good engineer never reinvents the wheel
> when
> Charlestown, MA., 02129 USA |an existing one with modifications will do."
> Me
>
> -------------------------------------------------------------------------------
>
--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.
-- Hofstadter's Law