[Gluster-devel] Performance question.
Chris Johnson
johnson at nmr.mgh.harvard.edu
Wed Nov 21 17:07:44 UTC 2007
On Wed, 21 Nov 2007, Chris Johnson wrote:
Ok, caching and write-behind moved to the client side. There is some
improvement.
                                              random  random    bkwd  record  stride
    KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
131072      32     312     312     361     363    1453     322     677     320     753     312      312     369     363
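For reference, the client-side stack I'm testing now looks roughly like this (a sketch, not my exact file; the server IP is elided, and the 256MB cache size is just the suggested starting point):

```
volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host xxx.xxx.xxx.xxx     # server IP, elided
  option remote-subvolume readahead
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072           # in bytes, same as on the server
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  option cache-size 256MB                # tune to however much RAM is spare
  subvolumes writebehind
end-volume
```

Translators stack bottom-up: each volume names the one below it in subvolumes, and the topmost volume is the one that gets mounted.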
but as you can see the improvement is marginal. Is this typical, i.e. is
GlusterFS normally an order of magnitude slower than NFS?
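For anyone comparing these numbers: with -N iozone reports microseconds per operation, so lower is better, and a rough conversion back to throughput looks like this (a quick sketch; the helper name is made up):

```python
# Convert iozone -N output (microseconds per operation) into approximate
# throughput. With -N, SMALLER numbers are faster: GlusterFS's 312 us per
# 32 KB write is slower than NFS's 27 us. Note the NFS figure works out
# faster than gigabit wire speed, so it's almost certainly the client
# page cache being measured, not the disks.

def usec_per_op_to_mb_per_sec(usec_per_op: float, reclen_kb: int) -> float:
    """Approximate MB/s given latency per record and record length."""
    bytes_per_op = reclen_kb * 1024
    ops_per_sec = 1_000_000 / usec_per_op
    return bytes_per_op * ops_per_sec / (1024 * 1024)

# GlusterFS write: 312 us per 32 KB record -> ~100 MB/s
print(round(usec_per_op_to_mb_per_sec(312, 32)))
# NFS write: 27 us per 32 KB record -> ~1157 MB/s (cache speed, not disk)
print(round(usec_per_op_to_mb_per_sec(27, 32)))
```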
> On Wed, 21 Nov 2007, Anand Avati wrote:
>
> See, I asked if there was a philosophy about how to build a stack.
> Never got a response until now.
>
> I don't believe caching will help in the real application.
> Mostly it's read, crunch, write. If I'm wrong here please let me
> know, although I don't believe it will hurt either. I'll try moving
> write-behind and io-cache to the client and see what happens. Does it
> matter how they're stacked, i.e. which comes first?
>
>> You should also load io-cache on the client side with a decent
>> cache-size (256MB, say; it depends on how much RAM you have to spare). This
>> will improve re-read performance a lot.
>>
>> avati
>>
>> 2007/11/21, Anand Avati <avati at zresearch.com>:
>>>
>>> Chris,
>>> you should really be loading write-behind on the client side; that is what
>>> improves write performance the most. Do let us know the results with
>>> write-behind on the client side.
>>>
>>> avati
>>>
>>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>>>>
>>>> Hi, again,
>>>>
>>>> I asked about stack-building philosophy. Apparently there isn't
>>>> one, so I tried a few things. The configs are at the end of this message.
>>>>
>>>> Two systems, both CentOS 5, both running fuse-devel-2.7.0-1 (gluster
>>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the server
>>>> runs a SATABeast. Currently I get the following from iozone.
>>>>
>>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
>>>>
>>>>
>>>>                                               random  random    bkwd  record  stride
>>>>     KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
>>>> 131072      32     589     587     345     343     818     621     757     624     845     592      591     346     366
>>>>
>>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
>>>> RAID card gives this
>>>>
>>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
>>>>
>>>>
>>>>                                               random  random    bkwd  record  stride
>>>>     KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
>>>> 131072      32      27      26     292      11      11      24     542       9     539      30       28     295      11
>>>>
>>>> As you can see, the NFS system is faster. Is this because of the
>>>> hardware 3ware RAID, or is NFS really that much faster here? Is there
>>>> a better way to stack this that would improve things? I also tried with
>>>> and without striping; no noticeable difference in gluster performance.
>>>>
>>>> Help appreciated.
>>>>
>>>> ============ server config
>>>>
>>>> volume brick1
>>>> type storage/posix
>>>> option directory /home/sdm1
>>>> end-volume
>>>>
>>>> volume brick2
>>>> type storage/posix
>>>> option directory /home/sdl1
>>>> end-volume
>>>>
>>>> volume brick3
>>>> type storage/posix
>>>> option directory /home/sdk1
>>>> end-volume
>>>>
>>>> volume brick4
>>>> type storage/posix
>>>> option directory /home/sdk1
>>>> end-volume
>>>>
>>>> volume ns-brick
>>>> type storage/posix
>>>> option directory /home/sdk1
>>>> end-volume
>>>>
>>>> volume stripe1
>>>> type cluster/stripe
>>>> subvolumes brick1 brick2
>>>> # option block-size *:10KB,
>>>> end-volume
>>>>
>>>> volume stripe2
>>>> type cluster/stripe
>>>> subvolumes brick3 brick4
>>>> # option block-size *:10KB,
>>>> end-volume
>>>>
>>>> volume unify0
>>>> type cluster/unify
>>>> subvolumes stripe1 stripe2
>>>> option namespace ns-brick
>>>> option scheduler rr
>>>> # option rr.limits.min-disk-free 5
>>>> end-volume
>>>>
>>>> volume iot
>>>> type performance/io-threads
>>>> subvolumes unify0
>>>> option thread-count 8
>>>> end-volume
>>>>
>>>> volume writebehind
>>>> type performance/write-behind
>>>> option aggregate-size 131072 # in bytes
>>>> subvolumes iot
>>>> end-volume
>>>>
>>>> volume readahead
>>>> type performance/read-ahead
>>>> # option page-size 65536 ### in bytes
>>>> option page-size 128kb ### in bytes
>>>> # option page-count 16 ### memory cache size is page-count x page-size per file
>>>> option page-count 2 ### memory cache size is page-count x page-size per file
>>>> subvolumes writebehind
>>>> end-volume
>>>>
>>>> volume server
>>>> type protocol/server
>>>> subvolumes readahead
>>>> option transport-type tcp/server # For TCP/IP transport
>>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
>>>> option auth.ip.readahead.allow *
>>>> end-volume
>>>>
>>>>
>>>> ============ client config
>>>>
>>>> volume client
>>>> type protocol/client
>>>> option transport-type tcp/client
>>>> option remote-host xxx.xxx.xxx.xxx
>>>> option remote-subvolume readahead
>>>> end-volume
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at nongnu.org
>>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>>
>>>
>>>
>>>
>>> --
>>> It always takes longer than you expect, even when you take into account
>>> Hofstadter's Law.
>>>
>>> -- Hofstadter's Law
>>
>>
>
-------------------------------------------------------------------------------
Chris Johnson |Internet: johnson at nmr.mgh.harvard.edu
Systems Administrator |Web: http://www.nmr.mgh.harvard.edu/~johnson
NMR Center |Voice: 617.726.0949
Mass. General Hospital |FAX: 617.726.7422
149 (2301) 13th Street |Fifty percent of all doctors graduated in the
Charlestown, MA., 02129 USA |lower half of the class. Observation
-------------------------------------------------------------------------------
More information about the Gluster-devel mailing list