[Gluster-devel] Performance question.
Anand Avati
avati at zresearch.com
Wed Nov 21 18:22:10 UTC 2007
Chris,
What cache-size did you configure in io-cache? Is it possible to share
throughput benchmarks using dd (both read and write)? Also, what is the
iozone performance at a 128kB reclen?
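
For reference, a client-side vol-file with io-cache stacked over write-behind might look like the sketch below. The volume names, cache-size, and aggregate-size values are illustrative only (not a tested recommendation), and the remote-host is a placeholder:

```
volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host xxx.xxx.xxx.xxx   # placeholder server address
  option remote-subvolume readahead
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072         # in bytes
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  option cache-size 256MB              # illustrative; depends on spare RAM
  subvolumes writebehind
end-volume
```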
avati
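
The dd benchmark requested above could be run along these lines (a sketch: MNT is a placeholder for the GlusterFS mount point, and the sizes here are small for illustration; use a file larger than RAM for a meaningful result):

```shell
# Sketch of a sequential dd read/write throughput test.
# MNT is a placeholder -- point it at your GlusterFS mount.
MNT=${MNT:-/tmp}

# Sequential write: 128 kB blocks; conv=fsync makes dd flush before
# reporting, so the timing reflects real disk/network cost.
dd if=/dev/zero of="$MNT/ddtest" bs=128k count=64 conv=fsync

# Sequential read of the same file back.
dd if="$MNT/ddtest" of=/dev/null bs=128k

rm -f "$MNT/ddtest"
```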
2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>
> On Wed, 21 Nov 2007, Chris Johnson wrote:
>
> Ok, caching and write-behind moved to the client side. There is some
> improvement.
>
>
>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> 131072      32    312      312   361     363      1453        322        677          320          753     312       312    369      363
>
> But as you can see, the improvement is marginal. Is this typical, i.e. is
> GlusterFS normally an order of magnitude slower than NFS?
>
> > On Wed, 21 Nov 2007, Anand Avati wrote:
> >
> > See, I asked if there was a philosophy about how to build a stack.
> > Never got a response until now.
> >
> > Caching won't help in the real application, I don't believe.
> > Mostly it's read, crunch, write. If I'm wrong here please let me
> > know, although I don't believe it will hurt. I'll try moving
> > write-behind and io-cache to the client and see what happens. Does it
> > matter how they're stacked, i.e. which comes first?
> >
> >> You should also be loading io-cache on the client side with a decent
> >> cache-size (like 256MB? Depends on how much RAM you have to spare). This
> >> will help re-read performance a lot.
> >>
> >> avati
> >>
> >> 2007/11/21, Anand Avati <avati at zresearch.com>:
> >>>
> >>> Chris,
> >>> you should really be loading write-behind on the client side; that is
> >>> what improves write performance the most. Do let us know the results
> >>> with write-behind on the client side.
> >>>
> >>> avati
> >>>
> >>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
> >>>>
> >>>> Hi, again,
> >>>>
> >>>> I asked about stack-building philosophy. Apparently there
> >>>> isn't one, so I tried a few things. The configs are down at the end here.
> >>>>
> >>>> Two systems, CentOS5, both running fuse-devel-2.7.0-1 (Gluster
> >>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the
> >>>> server runs a SATABeast. Currently I get the following from iozone.
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
> >>>>
> >>>>
> >>>>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> >>>> 131072      32    589      587   345     343       818        621        757          624          845     592       591    346      366
> >>>>
> >>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
> >>>> RAID card gives this
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
> >>>>
> >>>>
> >>>>     KB  reclen  write  rewrite  read  reread  rnd-read  rnd-write  bkwd-read  rec-rewrite  stride-read  fwrite  frewrite  fread  freread
> >>>> 131072      32     27       26   292      11        11         24        542            9          539      30        28    295       11
> >>>>
> >>>> And you can see that the NFS system is faster (iozone -N reports
> >>>> microseconds per operation, so lower numbers are faster). Is this
> >>>> because of the hardware 3ware RAID, or is NFS really that much faster
> >>>> here? Is there a better way to stack this that would improve things?
> >>>> I tried with and without striping; no noticeable difference in
> >>>> Gluster performance.
> >>>>
> >>>> Help appreciated.
> >>>>
> >>>> ============ server config
> >>>>
> >>>> volume brick1
> >>>> type storage/posix
> >>>> option directory /home/sdm1
> >>>> end-volume
> >>>>
> >>>> volume brick2
> >>>> type storage/posix
> >>>> option directory /home/sdl1
> >>>> end-volume
> >>>>
> >>>> volume brick3
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume brick4
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume ns-brick
> >>>> type storage/posix
> >>>> option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume stripe1
> >>>> type cluster/stripe
> >>>> subvolumes brick1 brick2
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume stripe2
> >>>> type cluster/stripe
> >>>> subvolumes brick3 brick4
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume unify0
> >>>> type cluster/unify
> >>>> subvolumes stripe1 stripe2
> >>>> option namespace ns-brick
> >>>> option scheduler rr
> >>>> # option rr.limits.min-disk-free 5
> >>>> end-volume
> >>>>
> >>>> volume iot
> >>>> type performance/io-threads
> >>>> subvolumes unify0
> >>>> option thread-count 8
> >>>> end-volume
> >>>>
> >>>> volume writebehind
> >>>> type performance/write-behind
> >>>> option aggregate-size 131072 # in bytes
> >>>> subvolumes iot
> >>>> end-volume
> >>>>
> >>>> volume readahead
> >>>> type performance/read-ahead
> >>>> # option page-size 65536  ### in bytes
> >>>> option page-size 128kb    ### in bytes
> >>>> # option page-count 16    ### memory cache size is page-count x page-size per file
> >>>> option page-count 2       ### memory cache size is page-count x page-size per file
> >>>> subvolumes writebehind
> >>>> end-volume
> >>>>
> >>>> volume server
> >>>> type protocol/server
> >>>> subvolumes readahead
> >>>> option transport-type tcp/server # For TCP/IP transport
> >>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
> >>>> option auth.ip.readahead.allow *
> >>>> end-volume
> >>>>
> >>>>
> >>>> ============ client config
> >>>>
> >>>> volume client
> >>>> type protocol/client
> >>>> option transport-type tcp/client
> >>>> option remote-host xxx.xxx.xxx.xxx
> >>>> option remote-subvolume readahead
> >>>> end-volume
> >>>>
> >>>>
> >>>>
> -------------------------------------------------------------------------------
> >>>> Chris Johnson               |Internet: johnson at nmr.mgh.harvard.edu
> >>>> Systems Administrator       |Web: http://www.nmr.mgh.harvard.edu/~johnson
> >>>> NMR Center                  |Voice: 617.726.0949
> >>>> Mass. General Hospital      |FAX: 617.726.7422
> >>>> 149 (2301) 13th Street      |A compromise is a solution nobody is happy with.
> >>>> Charlestown, MA., 02129 USA |Observation, Unknown
> -------------------------------------------------------------------------------
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Gluster-devel mailing list
> >>>> Gluster-devel at nongnu.org
> >>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> It always takes longer than you expect, even when you take into account
> >>> Hofstadter's Law.
> >>>
> >>> -- Hofstadter's Law
> >>
> >>
> >
> >
>
--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.
-- Hofstadter's Law