[Gluster-users] performance

Strahil Nikolov hunter86_bg at yahoo.com
Tue Aug 4 04:00:06 UTC 2020



On 4 August 2020 at 6:01:17 GMT+03:00, Computerisms Corporation <bob at computerisms.ca> wrote:
>Hi Gurus,
>
>I have been trying to wrap my head around performance improvements on
>my gluster setup, and I don't seem to be making any progress.  I mean
>forward progress.  Making it worse takes practically no effort at all.
>
>My gluster is distributed-replicated across 6 bricks and 2 servers,
>with 
>an arbiter on each server.  I designed it like this so I have an 
>expansion path to more servers in the future (like the staggered
>arbiter diagram in the Red Hat documentation).  gluster v info output
>is below.
>
>I have compiled gluster 7.6 from sources on both servers.

There is a 7.7 release which fixes some things. Why do you have to compile it from source?

>Servers are 6-core/3.4GHz with 32 GB RAM, no swap, and SSD and gigabit
>network connections.  They are running Debian, and are being used as
>redundant web servers.  There are some 3 million files on the Gluster
>storage, averaging 130KB/file.

This type of workload is called 'metadata-intensive'.
There are some recommendations for this type of workload:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
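
The headline recommendation on that page for small-file workloads is the metadata-cache group profile. A minimal sketch of enabling it on your volume (I have not benchmarked it on a workload like yours, so treat it as a starting point rather than a fix):

    # applies the bundle of md-cache related options in one step
    gluster volume set webisms group metadata-cache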

Keep an eye on the section that mentions dirty-ratio = 5 & dirty-background-ratio = 2.
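
A minimal sketch of applying those kernel knobs (standard sysctl settings; the file name under /etc/sysctl.d/ is my own choice):

    # check what the boxes currently use
    sysctl vm.dirty_ratio vm.dirty_background_ratio
    # apply the values from the guide
    sysctl -w vm.dirty_ratio=5
    sysctl -w vm.dirty_background_ratio=2
    # persist across reboots
    printf 'vm.dirty_ratio = 5\nvm.dirty_background_ratio = 2\n' > /etc/sysctl.d/90-gluster.conf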


>Currently only one of the two servers is serving web services.  There
>are well over 100 sites, and Apache server-status claims around 5 hits
>per second, depending on time of day, so a fair bit of logging going
>on.  The gluster is only holding website data and config files that
>will be common between the two servers, no databases or anything like
>that on the Gluster.
>
>When the serving server is under load, load average is consistently
>12-20.  glusterfs is always at the top with 150%-250% cpu, and each of
>the 3 bricks at roughly 50-70%, so consistently pegging 4 of the 6
>cores.  Apache processes will easily eat up all the rest of the cpus
>after that.  And web page response time is underwhelming at best.
>
>Interestingly, mostly because it is not something I have ever
>experienced before, software interrupts sit between 1 and 5 on each
>core, but the last core is usually sitting around 20.  I have never
>encountered a high load average where the si number was ever
>significant.  I have googled the crap out of that (as well as gluster
>performance in general); there are nearly limitless posts about what it
>is, but I have yet to see one thing explaining what to do about it.

There is an explanation about that in the link I provided above:

Configuring a higher event threads value than the available processing units could again cause context switches on these threads. As a result, reducing the number deduced from the previous step to a number that is less than the available processing units is recommended.
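
In your case that means: you have 6 cores, but server.event-threads and client.event-threads are both set to 8, so the event threads alone can oversubscribe the CPUs. Dropping them below the core count would be my first experiment, e.g.:

    gluster volume set webisms server.event-threads 4
    gluster volume set webisms client.event-threads 4

(4 is just an example value; measure and adjust.)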


>Sadly 
>I can't really shut down the gluster process to confirm if that is the 
>cause, but it's a pretty good bet, I think.
>
>When the system is not under load, glusterfs will be running at around 
>100% with each of the 3 bricks around 35%, so using 2 cores when doing 
>not much of anything.
>
>nload shows the network cards rarely climb above 300 Mbps unless I am
>doing a direct file transfer between the servers, in which case it gets
>right up to the 1Gbps limit.  RAM is never above 15GB unless I am
>causing it to happen.  atop shows a disk busy percentage; it is often
>above 50% and sometimes will hit 100%, but it is nowhere near as
>consistently showing excessive usage like the cpu cores are.  The cpu
>definitely seems to be the bottleneck.
>
>When I found out about the groups directory, I figured one of those
>must be useful to me, but as best as I can tell they are not.  But I am
>really hoping that someone has configured a system like mine and has a
>good group file they might share for this situation, or a peek at their
>volume info output?
>
>or maybe this is really just about as good as I should expect?  Maybe 
>the fix is that I need more/faster cores?  I hope not, as that isn't 
>really an option.
>
>Anyway, here is my volume info as promised.
>
>root at mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info
>
>Volume Name: webisms
>Type: Distributed-Replicate
>Volume ID: 261901e7-60b4-4760-897d-0163beed356e
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 2 x (2 + 1) = 6
>Transport-type: tcp
>Bricks:
>Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
>Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
>Options Reconfigured:
>auth.allow: xxxx
>performance.client-io-threads: off
>nfs.disable: on
>storage.fips-mode-rchecksum: on
>transport.address-family: inet
>performance.stat-prefetch: on
>network.inode-lru-limit: 200000
>performance.write-behind-window-size: 4MB
>performance.readdir-ahead: on
>performance.io-thread-count: 64
>performance.cache-size: 8GB
>server.event-threads: 8
>client.event-threads: 8
>performance.nl-cache-timeout: 600

As 'storage.fips-mode-rchecksum' uses SHA-256 checksums, you can try disabling it, which should fall back to the less CPU-intensive MD5. Yet, I have never played with that option ...
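
If you do want to test it, it is a single volume option (again, untested by me, so try it in a quiet window first):

    gluster volume set webisms storage.fips-mode-rchecksum off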


Check the RH page about the tunings and try different values for the event threads.
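
To judge whether a given change actually helps, gluster's built-in profiler shows per-brick latency for each FOP (it adds a little overhead, so only keep it running for short measurement windows):

    gluster volume profile webisms start
    # let the web load run for a few minutes, then:
    gluster volume profile webisms info
    gluster volume profile webisms stop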


Best Regards,
Strahil Nikolov

