[Gluster-devel] Performance question.
Chris Johnson
johnson at nmr.mgh.harvard.edu
Wed Nov 21 16:29:26 UTC 2007
On Wed, 21 Nov 2007, Anand Avati wrote:
See, I asked if there was a philosophy about how to build a stack.
Never got a response until now.
Caching won't help in the real application, I don't believe.
Mostly it's read, crunch, write. If I'm wrong here please let me
know. I don't believe it will hurt, though. I'll try moving
write-behind and io-cache to the client and see what happens. Does it
matter how they're stacked, i.e., which comes first?
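
For concreteness, here's roughly the client-side stack I have in mind
(a sketch only; the volume names and the remote-subvolume are guesses,
and the sizes are the ones mentioned in this thread):

volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host xxx.xxx.xxx.xxx   # the server, as in the client config below
  option remote-subvolume iot          # assumes the server exports its io-threads volume
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072         # bytes, same value as the server config below
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  option cache-size 256MB              # the size suggested below
  subvolumes writebehind
end-volume

As I understand it, the topmost volume (iocache here) is the one that
gets mounted, so reads hit io-cache first and write-behind sits just
above the network.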
> You should also be loading io-cache on the client side with a decent
> cache-size (like 256MB? depends on how much RAM you have to spare). This
> will improve re-read performance a lot.
>
> avati
>
> 2007/11/21, Anand Avati <avati at zresearch.com>:
>>
>> Chris,
>> you should really be loading write-behind on the client side; that is what
>> improves write performance the most. Do let us know the results with
>> write-behind on the client side.
>>
>> avati
>>
>> 2007/11/21, Chris Johnson <johnson at nmr.mgh.harvard.edu>:
>>>
>>> Hi, again,
>>>
>>> I asked about stack-building philosophy. Apparently there isn't
>>> one. So I tried a few things. The configs are down at the end here.
>>>
>>> Two systems, both CentOS5, running the gluster-enhanced
>>> fuse-devel-2.7.0-1 and glusterfs-1.3.5-2. Both have gigabit ethernet;
>>> the server runs a SATABeast. Currently I get the following from
>>> iozone (note that -N reports results in microseconds per operation,
>>> so lower is better):
>>>
>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
>>>
>>>
>>>                                                       random   random     bkwd   record   stride
>>>      KB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>>>  131072      32      589      587      345      343      818      621      757      624      845      592      591      346      366
>>>
>>> Now, a similar test using NFS on a CentOS4.4 system running a 3ware
>>> RAID card gives this:
>>>
>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
>>>
>>>
>>>                                                       random   random     bkwd   record   stride
>>>      KB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>>>  131072      32       27       26      292       11       11       24      542        9      539       30       28      295       11
>>>
>>> And you can see that the NFS system is faster. Is this because of
>>> the hardware 3ware RAID, or is NFS really that much faster here? Is
>>> there a better way to stack this that would improve things? I also
>>> tried with and without striping; no noticeable difference in gluster
>>> performance.
>>>
>>> Help appreciated.
>>>
>>> ============ server config
>>>
>>> volume brick1
>>> type storage/posix
>>> option directory /home/sdm1
>>> end-volume
>>>
>>> volume brick2
>>> type storage/posix
>>> option directory /home/sdl1
>>> end-volume
>>>
>>> volume brick3
>>> type storage/posix
>>> option directory /home/sdk1
>>> end-volume
>>>
>>> volume brick4
>>> type storage/posix
>>> option directory /home/sdk1
>>> end-volume
>>>
>>> volume ns-brick
>>> type storage/posix
>>> option directory /home/sdk1
>>> end-volume
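>>>
>>> # (note that brick3, brick4, and ns-brick above all share /home/sdk1;
>>> # the unify namespace at least should probably get its own directory)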
>>>
>>> volume stripe1
>>> type cluster/stripe
>>> subvolumes brick1 brick2
>>> # option block-size *:10KB,
>>> end-volume
>>>
>>> volume stripe2
>>> type cluster/stripe
>>> subvolumes brick3 brick4
>>> # option block-size *:10KB,
>>> end-volume
>>>
>>> volume unify0
>>> type cluster/unify
>>> subvolumes stripe1 stripe2
>>> option namespace ns-brick
>>> option scheduler rr
>>> # option rr.limits.min-disk-free 5
>>> end-volume
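>>>
>>> # (so new files round-robin between stripe1 and stripe2, and each
>>> # file's data is then striped across that pair of bricks)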
>>>
>>> volume iot
>>> type performance/io-threads
>>> subvolumes unify0
>>> option thread-count 8
>>> end-volume
>>>
>>> volume writebehind
>>> type performance/write-behind
>>> option aggregate-size 131072 # in bytes
>>> subvolumes iot
>>> end-volume
>>>
>>> volume readahead
>>> type performance/read-ahead
>>> # option page-size 65536   ### in bytes
>>> option page-size 128kb     ### in bytes
>>> # option page-count 16     ### memory cache is page-count x page-size per file
>>> option page-count 2        ### memory cache is page-count x page-size per file
>>> subvolumes writebehind
>>> end-volume
>>>
>>> volume server
>>> type protocol/server
>>> subvolumes readahead
>>> option transport-type tcp/server # For TCP/IP transport
>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
>>> option auth.ip.readahead.allow *
>>> end-volume
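>>>
>>> If the performance translators really belong on the client, I assume
>>> the server side could shrink to export iot directly, something like
>>> this (untested sketch; the auth line just follows the
>>> auth.ip.<volume>.allow pattern used above):
>>>
>>> volume server
>>>   type protocol/server
>>>   subvolumes iot
>>>   option transport-type tcp/server
>>>   option auth.ip.iot.allow *
>>> end-volume
>>>
>>> The client's remote-subvolume would then be "iot" instead of
>>> "readahead".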
>>>
>>>
>>> ============ client config
>>>
>>> volume client
>>> type protocol/client
>>> option transport-type tcp/client
>>> option remote-host xxx.xxx.xxx.xxx
>>> option remote-subvolume readahead
>>> end-volume
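>>>
>>> (For reference, the volume is mounted with the usual client command;
>>> the spec-file path here is a guess:
>>>
>>> glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
>>>
>>> and iozone then runs against that mount point.)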
>>>
>>
>>
>>
>> --
>> It always takes longer than you expect, even when you take into account
>> Hofstadter's Law.
>>
>> -- Hofstadter's Law
-------------------------------------------------------------------------------
Chris Johnson |Internet: johnson at nmr.mgh.harvard.edu
Systems Administrator |Web: http://www.nmr.mgh.harvard.edu/~johnson
NMR Center |Voice: 617.726.0949
Mass. General Hospital |FAX: 617.726.7422
149 (2301) 13th Street |For all sad words of tongue or pen, the saddest
Charlestown, MA., 02129 USA |are these: "It might have been". John G. Whittier
-------------------------------------------------------------------------------