[Gluster-devel] performance improvements

Kevan Benson kbenson at a-1networks.com
Tue Oct 23 17:02:42 UTC 2007


Vincent Régnard wrote:
> Hi all,
>
> We are presently trying to tune our non-gluster configuration to 
> improve glusterfs performance.  My config is gluster 
> 1.3.7/fuse 2.7.0-glfs5, Linux 2.6.16.55.  We have 3 clients and 3 
> servers on a 100 Mbit network with a 5 ms round trip between clients 
> and servers.  The 3 clients replicate with AFR on the client side 
> over the 3 servers.
>
> We have a read/write throughput benchmark (dbench) of between 2 and 5 MB/s.

I imagine your clients and servers are the same systems?  Otherwise, 
5 MB/s shouldn't be possible on a 100 Mbit network.  If one of the 
three AFR locations being written to is local, that leaves you 100 Mbit 
to write the two other copies, or about 11.11 MB/s total at line 
saturation (what I usually see, at least).  Since that's split across 
two copies, it's about 5.5 MB/s max.  If all three AFR subvolumes are 
remote, it's 11.11 MB/s split three ways.
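
The back-of-the-envelope numbers, assuming roughly 11.11 MB/s of 
usable payload on a saturated 100 Mbit link:

    100 Mbit/s / 8        = 12.5 MB/s raw, ≈ 11.11 MB/s usable payload
    11.11 MB/s / 2 copies ≈ 5.5 MB/s  (one AFR subvolume local)
    11.11 MB/s / 3 copies ≈ 3.7 MB/s  (all three subvolumes remote)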

> The afr synchronisation using the "find -mtime -1 -type f -exec 
> head -c1" trick takes approximately 30 minutes for a 20 GB filesystem 
> with 300,000 files, which seems too long to be acceptable for us. 
> I'd like to tune some parameters to increase performance.

30 minutes when the other AFR subvolumes don't have any data and it 
all needs to be synced, or 30 minutes when they are all already in 
sync?  This time is going to depend heavily on how many files you 
have, not just the total size (the command will probably take a second 
or less on twenty 1 GB files that are already in sync on all servers).
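
For anyone following along, the full version of that trick is usually 
something like the following (the mount point here is just an 
example); it forces an open() on every recently modified file, which 
is what triggers AFR's self-heal:

    find /mnt/glusterfs -mtime -1 -type f -exec head -c1 {} \; >/dev/null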

> I can imagine that reducing the round trip between servers might 
> help?  But I cannot actually do anything about that.  The only thing 
> I might be able to do is to configure some QoS.  Do you have any 
> suggestions about how we should do that?  Would giving priority to 
> tcp/6996 between clients and servers really help?

Separate network connections to each AFR subvolume.  VLAN your 
switches and implement separate logical networks for each connection 
to the AFR subvolumes, using secondary (or even tertiary) NICs in each 
client.  You can effectively double or triple your throughput while 
increasing redundancy.
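
As a sketch of that layout (the addresses, interface assignments, and 
volume names here are made up), the client spec file would point each 
protocol/client volume at a server address on its own VLAN:

    volume server1
      type protocol/client
      option transport-type tcp/client
      option remote-host 10.0.1.10    # server1's address on VLAN 10 (eth1)
      option remote-subvolume brick
    end-volume

    volume server2
      type protocol/client
      option transport-type tcp/client
      option remote-host 10.0.2.10    # server2's address on VLAN 20 (eth2)
      option remote-subvolume brick
    end-volume

If you still want to experiment with QoS instead, a minimal Linux tc 
sketch (assuming eth0 and the default GlusterFS port 6996) would be 
something like:

    # put traffic to the GlusterFS port in the highest-priority band
    tc qdisc add dev eth0 root handle 1: prio
    tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
        match ip dport 6996 0xffff flowid 1:1

That only helps if the link is actually contended by other traffic, 
though; it won't buy you anything past line rate.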

> At the (Linux) kernel level, could acting on the PREEMPTION MODEL 
> and CONFIG_HZ settings produce an improvement?
>
> Our present config is as follows:
>
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_BKL=y
>
> # CONFIG_HZ_100 is not set
> CONFIG_HZ_250=y
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=250
>
> Is it better to prefer SMP to non-SMP kernel builds?  (We presently 
> have SMP enabled for our dual-cores.)  What impact would deactivating 
> SMP have on glusterfs performance?
>
> We use LinuxThreads (glibc 2.3) and have no NPTL support; can this 
> influence performance as well?
>
> We naturally already have the gluster performance translators in our 
> configuration (io-threads, io-cache, read-ahead and write-behind).
>
> Thanks in advance for your comments or suggestions.
>
> Vincent.

I think your problem is more an architecture limitation than kernel 
scheduling.  There's a cost for redundancy, and it's performance.  
It's just much easier to scale glusterfs performance by adding more 
hardware.

-- 

-Kevan Benson
-A-1 Networks




