[Gluster-devel] write-behind tuning

Jordan Mendler jmendler at ucla.edu
Fri May 16 18:02:10 UTC 2008


Thanks, Amar,

If I were to also add read-ahead to this setup, how would you recommend I
configure page-size and page-count? Is page-size simply the filesystem's
(XFS in this case) block size? What would you recommend for these
8-core/8GB machines?
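
Something like the sketch below is what I have in mind (option names taken
from the wiki; the values are placeholders and are exactly what I am unsure
about, and I am assuming it sits on top of the write-behind volume):

  # read-ahead sketch -- values are placeholders, not recommendations
  volume readahead
    type performance/read-ahead
    option page-size 256KB
    option page-count 4
    subvolumes writebehind
  end-volume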

Also, the wiki says there is no real benefit on GigE; is this true, or is it
still worth configuring?

Lastly, are there any issues using this with booster?

Thanks,
Jordan


On Thu, May 15, 2008 at 3:41 PM, Amar S. Tumballi <amar at zresearch.com>
wrote:

>
>
> On Wed, May 14, 2008 at 12:15 AM, Jordan Mendler <jmendler at ucla.edu>
> wrote:
>
>> Hi all,
>>
>> I am in the process of implementing write-behind and had some questions.
>>
>> 1) I've been told aggregate-size should be between 0 and 4MB. What is the
>> downside to making it large? In our case (a backup server) I would think
>> the bigger the better, since we are doing lots of consecutive/parallel
>> rsyncs of a combination of tons of small files and some very large files.
>> The only downside I can see is that small transfers are not distributed as
>> evenly, since a large write will go to only one brick instead of half of
>> the write going to each brick. Perhaps someone can clarify.
>>
>
> Currently we have an upper limit in our protocol translator of at most 4MB
> of data in one request/reply packet. Hence if you use write-behind on the
> client side (as in most cases), a larger aggregate-size will fail when it
> tries to send the bigger packet.
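>
> For illustration, a client-side write-behind volume might look roughly like
> this (a sketch only; the 1MB value and the volume names are just examples
> that stay under the 4MB cap):
>
>   # client-side write-behind, layered over an example unify volume
>   volume writebehind
>     type performance/write-behind
>     option aggregate-size 1MB
>     subvolumes unify0
>   end-volume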
>
>
>
>>
>> 2) What does flush-behind do? What is the advantage of having it on, and
>> what is the advantage of having it off?
>>
> This option is there to improve performance when handling lots of small
> files: the close()/flush() call is pushed to the background, so the client
> can move on to the next request. It is 'off' by default.
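>
> If you want it on, it is just one more option in the write-behind volume
> sketched above:
>
>   option flush-behind on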
>
>
>>
>> 3) write-behind on the client aggregates small writes into larger ones. Is
>> there any purpose to doing it on the server side? If so, how is this
>> helpful?
>>
> Yes. Generally it helps when the writes arrive in very small chunks, since
> aggregating them before they hit the disk reduces disk-head seek time.
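>
> On the server side that would mean layering write-behind over the storage
> brick before exporting it, roughly like this (a sketch; the directory path
> and volume names are placeholders):
>
>   # server-side sketch: write-behind directly above the posix brick
>   volume posix1
>     type storage/posix
>     option directory /data/export
>   end-volume
>
>   volume brick1
>     type performance/write-behind
>     subvolumes posix1
>   end-volume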
>
>
>>
>> 4) Should write-behind be done on a brick-by-brick basis on the client, or
>> is it fine to do it after the unify? (It seems like it would be fine, since
>> this would consolidate small writes before sending them to the scheduler.)
>>
>
> Yes, the behavior will be the same wherever you put it, but it is advised
> to do it after unify as that reduces the complexity of the spec file.
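>
> That is, in the client spec the layering would be roughly as below (a
> sketch; the scheduler, namespace and volume names are placeholders):
>
>   # unify first, then a single write-behind on top of it
>   volume unify0
>     type cluster/unify
>     option scheduler rr
>     option namespace brick-ns
>     subvolumes brick1 brick2
>   end-volume
>
>   volume writebehind
>     type performance/write-behind
>     subvolumes unify0
>   end-volume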
>
>
>> Hardware-wise, we currently have two 16x1TB hardware RAID6 servers (each
>> with 8 cores and 8GB of RAM). Each acts as both a server and a unify
>> client. The underlying filesystem is currently XFS on Linux, ~13TB each.
>> The interconnect is GigE, and eventually we will have more external
>> clients, though for now we are just using the servers as clients. My
>> current client config is below. Any other suggestions are also
>> appreciated.
>>
>
> Spec file looks good.
>
>
> Regards,
> --
> Amar Tumballi
> Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!


