[Gluster-users] Effect of performance tuning options

JF Le Fillâtre jean-francois.lefillatre at uni.lu
Fri Mar 6 11:58:57 UTC 2015


Hello all,

I am currently trying to tune the performance of a Gluster volume that I
have just created, and I am wondering what is the exact effect of some
of the tuning options.

Overview of the volume, with the options that I have modified:

======================================================================
glusterfs 3.6.2 built on Jan 22 2015 12:59:57


Volume Name: live
Type: Distribute
Volume ID: 81c3d212-e43b-4460-8b5d-b743992a01eb
Status: Started
Number of Bricks: 8
Transport-type: tcp
Bricks:
Brick1: stor104:/zfs/brick0/brick
Brick2: stor104:/zfs/brick1/brick
Brick3: stor104:/zfs/brick2/brick
Brick4: stor104:/zfs/brick3/brick
Brick5: stor106:/zfs/brick0/brick
Brick6: stor106:/zfs/brick1/brick
Brick7: stor106:/zfs/brick2/brick
Brick8: stor106:/zfs/brick3/brick
Options Reconfigured:
performance.flush-behind: on
performance.client-io-threads: on
performance.cache-refresh-timeout: 10
nfs.disable: on
nfs.addr-namelookup: off
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
cluster.min-free-disk: 1%
cluster.data-self-heal-algorithm: full
performance.io-thread-count: 64
performance.write-behind-window-size: 4MB
performance.cache-size: 1GB
======================================================================

2 servers, 4 bricks per server. Bandwidth is a 2x10Gb trunked link on the
client side, and one 10Gb link per server.
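For reference, the non-default options in the overview above were applied one at a time with the standard CLI, along these lines (volume name and values exactly as listed):

```shell
# Applied against the existing "live" volume; values as in the overview above
gluster volume set live performance.io-thread-count 64
gluster volume set live performance.write-behind-window-size 4MB
gluster volume set live performance.cache-size 1GB
gluster volume set live performance.client-io-threads on
gluster volume set live performance.flush-behind on
```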

Now, the questions I still haven't found an answer for are:

1) for the thread count on the server side (performance.io-thread-count),
is it per brick, per server or for the whole volume? During my tests I
saw the number of threads on the servers increase, but it seems to be
dynamic, and I never saw the configured maximum number of threads created.

2) when I activate client-io-threads, I see that the same thread count
is used for the clients. The only way to modify it for the clients only
is to edit the volume files by hand, correct?
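If hand-editing is indeed the way, my understanding is that the change would go in the io-threads translator section of the client volfile, something like the fragment below (translator and subvolume names are my assumption from the generated volfiles for a distribute volume, so please correct me if I'm wrong):

```
volume live-io-threads
    type performance/io-threads
    option thread-count 16
    subvolumes live-dht
end-volume
```

with "live-dht" standing for whatever subvolume the io-threads translator actually sits on in the generated file.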

3) as for the client cache, if I remember correctly FUSE filesystems are
not cached by the kernel VFS layer, in which case everything hinges on
the performance.cache-size option. Given that the cache is refreshed at
regular intervals, have any tests been done to measure the network and
CPU load impact of large caches?

4) as far as I understand, the Samba backend hooks directly into the
FUSE module. Therefore it should benefit from all optimizations done for
the TCP FUSE client, correct?

5) is there any known issue with activating both client-io-threads and
flush-behind?

6) is there any other obvious (or not) tuning knob that I have missed?

And finally, the question I shouldn't ask: is there any way to dump the
current values of all possible parameters? Google points me to various
past threads on that topic, yet nothing seems to have changed on that
front...
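For the record, the closest I've found so far: "volume info" only shows the options that have been reconfigured, and "volume set help" lists the available options with descriptions, but neither seems to dump the full set of effective current values:

```shell
# Shows only the reconfigured options, not the full effective set
gluster volume info live

# Lists the known options with their descriptions
gluster volume set help
```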

Thank you in advance for your answers.
Regards,
JF


-- 

 Jean-François Le Fillâtre
 -------------------------------
 HPC Systems Administrator
 LCSB - University of Luxembourg
 -------------------------------
 PGP KeyID 0x134657C6
