[Gluster-users] performance

Tue Aug 4 19:48:51 UTC 2020

Hi Artem,

I would also like this recipe.  If you have any comments on my answer to 
Strahil, I would love to hear them.

On 2020-08-03 9:42 p.m., Artem Russakovskii wrote:
> I tried putting all web files (specifically WordPress php and static 
> files as well as various cache files) on gluster before, and the results 
> were miserable on a busy site - our usual ~8-10 load quickly turned into 
> 100+ and killed everything.
> 
> I had to go back to running just the user uploads (which are static 
> files in the WordPress uploads/ dir) on gluster and using rsync (via 
> lsyncd) for the frequently executed php / cache.
> 
> I'd love to figure this out as well and tune gluster for heavy reads and 
> moderate writes, but I haven't cracked that recipe yet.
> 
> On Mon, Aug 3, 2020, 8:08 PM Computerisms Corporation 
> <bob at computerisms.ca> wrote:
> 
>     Hi Gurus,
> 
>     I have been trying to wrap my head around performance improvements on
>     my gluster setup, and I don't seem to be making any progress.  I mean
>     forward progress.  Making it worse takes practically no effort at all.
> 
>     My gluster is distributed-replicated across 6 bricks and 2 servers,
>     with an arbiter on each server.  I designed it like this so I have an
>     expansion path to more servers in the future (like the staggered
>     arbiter diagram in the Red Hat documentation).  gluster v info output
>     is below.  I have compiled gluster 7.6 from sources on both servers.
> 
>     Servers are 6-core/3.4 GHz with 32 GB RAM, no swap, SSD storage, and
>     gigabit network connections.  They are running Debian and are being
>     used as redundant web servers.  There are some 3 million files on the
>     Gluster storage, averaging 130 KB/file.  Currently only one of the two
>     servers is serving web services.  There are well over 100 sites, and
>     Apache server-status claims around 5 hits per second, depending on
>     time of day, so there is a fair bit of logging going on.  The gluster
>     is only holding website data and config files that will be common
>     between the two servers; there are no databases or anything like that
>     on the Gluster.
> 
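
A rough back-of-envelope on those numbers, as a sketch only: it takes the file count and the average size at face value (both are approximations from the paragraph above) and uses decimal units. The function name is made up for illustration.

```python
def dataset_size_gb(file_count, avg_file_kb):
    """Approximate total data size in decimal GB from a count and an average size in KB."""
    return file_count * avg_file_kb / 1_000_000  # 1 GB = 1,000,000 KB (decimal)


if __name__ == "__main__":
    # Figures quoted above: about 3 million files averaging about 130 KB each.
    total_gb = dataset_size_gb(3_000_000, 130)
    print(f"approximate data on the volume: {total_gb:.0f} GB")
```

So the volume holds on the order of a few hundred GB, far more than fits in the 32 GB of RAM on each server; how much of that is the actively served working set is not known from these figures.
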
>     When the serving server is under load, the load average is
>     consistently 12-20.  glusterfs is always at the top with 150%-250%
>     CPU, and each of the 3 bricks is at roughly 50-70%, so together they
>     consistently peg 4 of the 6 cores.  Apache processes will easily eat
>     up all the rest of the CPUs after that, and web page response time is
>     underwhelming at best.
> 
>     Interestingly, mostly because it is not something I have ever
>     experienced before, software interrupts (si) sit between 1 and 5 on
>     each core, but the last core is usually sitting around 20.  I have
>     never encountered a high load average where the si number was
>     significant.  I have googled the crap out of that (as well as gluster
>     performance in general); there are nearly limitless posts about what
>     it is, but I have yet to see one that explains what to do about it.
>     Sadly I can't really shut down the gluster process to confirm that it
>     is the cause, but it's a pretty good bet, I think.
> 
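
One way to see which kind of software interrupt is piling up on that last core, without stopping gluster, is to look at /proc/softirqs. The following is a minimal sketch, assuming a Linux kernel that exposes that file in its usual layout (a header row of CPU names, then one row per softirq type). The function names are invented for illustration, and the guess that NET_RX is the interesting row is only an assumption, not something confirmed here.

```python
import os


def parse_softirqs(text):
    """Parse /proc/softirqs-style text into (cpu_names, {softirq_name: [count per cpu]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        name, _, rest = ln.partition(":")
        counts = [int(x) for x in rest.split()]
        table[name.strip()] = counts[:len(cpus)]
    return cpus, table


def share_by_cpu(cpus, table, name):
    """Return {cpu: fraction of this softirq type that was handled on that cpu}."""
    counts = table[name]
    total = sum(counts) or 1  # avoid dividing by zero for an unused softirq type
    return {cpu: n / total for cpu, n in zip(cpus, counts)}


if __name__ == "__main__" and os.path.exists("/proc/softirqs"):
    with open("/proc/softirqs") as fh:
        cpus, table = parse_softirqs(fh.read())
    # NET_RX is only a guess at the relevant row; print any row of interest.
    if "NET_RX" in table:
        for cpu, share in share_by_cpu(cpus, table, "NET_RX").items():
            print(f"{cpu}: {share:.1%} of NET_RX")
```

The counters are cumulative since boot, so this shows where softirqs have landed historically rather than the current rate; running it twice a few seconds apart and comparing the counts gives a better picture of what is happening under load.
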
>     When the system is not under load, glusterfs will be running at around
>     100% with each of the 3 bricks around 35%, so using 2 cores when doing
>     not much of anything.
> 
>     nload shows the network cards rarely climb above 300 Mbps unless I am
>     doing a direct file transfer between the servers, in which case it gets
>     right up to the 1 Gbps limit.  RAM use is never above 15 GB unless I am
>     causing it to happen.  atop shows a disk busy percentage that is often
>     above 50% and sometimes hits 100%, but it is nowhere near as
>     consistently high as the CPU cores are.  The CPU definitely seems to
>     be the bottleneck.
> 
>     When I found out about the groups directory, I figured one of those
>     must be useful to me, but as best as I can tell they are not.  I am
>     really hoping that someone has configured a system like mine and has a
>     good group file they might share for this situation, or a peek at
>     their volume info output?
> 
>     Or maybe this is really just about as good as I should expect?  Maybe
>     the fix is that I need more/faster cores?  I hope not, as that isn't
>     really an option.
> 
>     Anyway, here is my volume info as promised.
> 
>     root@mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info
> 
>     Volume Name: webisms
>     Type: Distributed-Replicate
>     Volume ID: 261901e7-60b4-4760-897d-0163beed356e
>     Status: Started
>     Snapshot Count: 0
>     Number of Bricks: 2 x (2 + 1) = 6
>     Transport-type: tcp
>     Bricks:
>     Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
>     Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
>     Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
>     Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
>     Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
>     Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
>     Options Reconfigured:
>     auth.allow: xxxx
>     performance.client-io-threads: off
>     nfs.disable: on
>     storage.fips-mode-rchecksum: on
>     transport.address-family: inet
>     performance.stat-prefetch: on
>     network.inode-lru-limit: 200000
>     performance.write-behind-window-size: 4MB
>     performance.readdir-ahead: on
>     performance.io-thread-count: 64
>     performance.cache-size: 8GB
>     server.event-threads: 8
>     client.event-threads: 8
>     performance.nl-cache-timeout: 600
> 
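
When comparing option lists like the one above between systems, it can help to turn the "Options Reconfigured" block into a dictionary and check for settings that depend on a feature that is not switched on. This is a sketch only: the helper names and the DEPENDS_ON table are made up for illustration, and the one entry in it rests on the assumption that performance.nl-cache is off unless it is set explicitly in this gluster version.

```python
# Hypothetical table: tuning option -> feature option it is assumed to depend on.
DEPENDS_ON = {
    "performance.nl-cache-timeout": "performance.nl-cache",
}


def parse_options(volume_info):
    """Extract the 'Options Reconfigured' key/value pairs from gluster v info text."""
    options = {}
    in_options = False
    for raw in volume_info.splitlines():
        line = raw.lstrip("> \t").strip()  # tolerate email quote markers
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


def orphaned_settings(options):
    """List tuning options whose assumed parent feature is not explicitly 'on'."""
    return [key for key, feature in DEPENDS_ON.items()
            if key in options and options.get(feature) != "on"]


if __name__ == "__main__":
    sample = """Options Reconfigured:
performance.io-thread-count: 64
performance.nl-cache-timeout: 600
"""
    print(orphaned_settings(parse_options(sample)))
```

A hit from this check only means the feature was not listed as reconfigured; the effective value can be confirmed on the server with gluster volume get webisms performance.nl-cache.
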
> 
>     -- 
>     Bob Miller
>     Cell: 867-334-7117
>     Office: 867-633-3760
>     Office: 867-322-0362
>     www.computerisms.ca <http://www.computerisms.ca>
>