[Gluster-users] performance

Tue Aug 4 04:42:45 UTC 2020

I tried putting all web files (specifically WordPress PHP and static files
as well as various cache files) on gluster before, and the results were
miserable on a busy site - our usual load average of ~8-10 quickly turned
into 100+ and killed everything.

I had to go back to running just the user uploads (which are static files
in the WordPress uploads/ dir) on gluster and using rsync (via lsyncd) for
the frequently executed PHP / cache files.

I'd love to figure this out as well and tune gluster for heavy reads and
moderate writes, but I haven't cracked that recipe yet.

On Mon, Aug 3, 2020, 8:08 PM Computerisms Corporation <bob at computerisms.ca>
wrote:

> Hi Gurus,
>
> I have been trying to wrap my head around performance improvements on my
> gluster setup, and I don't seem to be making any progress.  I mean
> forward progress.  Making it worse takes practically no effort at all.
>
> My gluster is distributed-replicated across 6 bricks and 2 servers, with
> an arbiter on each server.  I designed it like this so I have an
> expansion path to more servers in the future (like the staggered arbiter
> diagram in the red hat documentation).  gluster v info output is below.
> I have compiled gluster 7.6 from sources on both servers.
>
> Servers are 6-core/3.4GHz with 32 GB RAM, no swap, SSD storage, and
> gigabit network connections.  They are running Debian, and are being
> used as redundant web servers.  There are some 3 million files on the
> Gluster storage, averaging 130KB/file.  Currently only one of the two
> servers is serving web services.  There are well over 100 sites, and
> Apache server-status claims around 5 hits per second, depending on time
> of day, so there is a fair bit of logging going on.  The gluster is only
> holding website data and config files that will be common between the
> two servers, no databases or anything like that on the Gluster.
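
For scale, the file count and average size above can be turned into a
rough total.  This is only a back-of-the-envelope sketch: it assumes
decimal units (1 KB = 1000 bytes) and an even spread of files across the
two replica sets, neither of which is stated in the post.

```python
# Back-of-the-envelope size of the data set described above.
# Assumptions (not from the post): decimal units, even distribution.
FILES = 3_000_000        # "some 3 million files"
AVG_FILE_KB = 130        # "averaging 130KB/file"
REPLICA_SETS = 2         # "2 x (2 + 1)" in the volume info

total_gb = FILES * AVG_FILE_KB / 1_000_000
per_set_gb = total_gb / REPLICA_SETS

print(f"total data: ~{total_gb:.0f} GB")
print(f"per replica set: ~{per_set_gb:.0f} GB")
```
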
>
> When the serving server is under load, the load average is consistently
> 12-20.  glusterfs is always at the top with 150%-250% CPU, and each of
> the 3 bricks is at roughly 50-70%, so gluster is consistently pegging 4
> of the 6 cores.  Apache processes will easily eat up the rest of the
> CPUs after that, and web page response time is underwhelming at best.
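
The "4 of the 6 cores" estimate can be checked by adding up the quoted
percentages.  A small sketch, assuming these are top-style figures where
100% means one fully busy core:

```python
# Rough CPU accounting from the figures quoted above.
# Assumption: top-style percentages, 100% == one fully busy core.
CORES = 6
GLUSTERFS_PCT = (150, 250)   # glusterfs process, low and high
BRICK_PCT = (50, 70)         # each brick process, low and high
BRICKS = 3

low = (GLUSTERFS_PCT[0] + BRICKS * BRICK_PCT[0]) / 100
high = (GLUSTERFS_PCT[1] + BRICKS * BRICK_PCT[1]) / 100

print(f"gluster uses {low:.1f} to {high:.1f} of {CORES} cores")
print(f"left for apache: {CORES - high:.1f} to {CORES - low:.1f} cores")
```

So the quoted range works out to somewhere between 3 and 4.6 cores,
which brackets the "4 of 6" figure.
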
>
> Interestingly, mostly because it is not something I have ever
> experienced before, software interrupts sit between 1 and 5 on each
> core, but the last core is usually sitting around 20.  Have never
> encountered a high load average where the si number was ever
> significant.  I have googled the crap out of that (as well as gluster
> performance in general), there are nearly limitless posts about what it
> is, but have yet to see one thing to explain what to do about it.  Sadly
> I can't really shut down the gluster process to confirm if that is the
> cause, but it's a pretty good bet, I think.
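
One way to narrow down the uneven software interrupt load is to look at
which softirq class dominates on the busy core.  The sketch below parses
the Linux /proc/softirqs table.  Note that these counters are cumulative
since boot, so this is not the same number as the si percentage in top;
for a rate you would sample twice and subtract.  The function names are
made up for illustration.

```python
import os


def parse_softirqs(text):
    """Parse /proc/softirqs text into {softirq_name: [count_per_cpu, ...]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = [int(x) for x in rest.split()]
        table[name.strip()] = counts[:n_cpus]
    return table


def busiest_cpu(table):
    """Return (cpu_index, total_count, top_softirq_name) for the busiest CPU."""
    n_cpus = len(next(iter(table.values())))
    totals = [sum(row[c] for row in table.values()) for c in range(n_cpus)]
    cpu = max(range(n_cpus), key=totals.__getitem__)
    top = max(table, key=lambda name: table[name][cpu])
    return cpu, totals[cpu], top


# Only runs on systems that expose /proc/softirqs (i.e. Linux).
if os.path.exists("/proc/softirqs"):
    with open("/proc/softirqs") as f:
        cpu, total, top = busiest_cpu(parse_softirqs(f.read()))
    print(f"CPU{cpu} has handled the most softirqs ({total}), mostly {top}")
```

If the dominant class on the last core turned out to be NET_RX, that
would point at network receive processing landing on a single core
rather than at gluster itself, but that is a guess until measured.
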
>
> When the system is not under load, glusterfs will be running at around
> 100% with each of the 3 bricks around 35%, so using 2 cores when doing
> not much of anything.
>
> nload shows the network cards rarely climb above 300 Mbps unless I am
> doing a direct file transfer between the servers, in which case it gets
> right up to the 1Gbps limit.  RAM is never above 15GB unless I am
> causing it to happen.  atop shows a disk busy percentage that is often
> above 50% and sometimes hits 100%, but it is nowhere near as
> consistently high as the CPU cores are.  The CPU definitely seems to be
> the bottleneck.
>
> When I found out about the groups directory, I figured one of those must
> be useful to me, but as best as I can tell they are not.  But I am
> really hoping that someone has configured a system like mine and has a
> good group file they might share for this situation, or a peek at their
> volume info output?
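
When comparing a group file against a running volume, it can help to see
exactly which options it would change.  A minimal sketch, assuming group
files are plain text with one key=value pair per line (as the ones
shipped in the glusterd groups directory appear to be).  The group
contents below are made up for illustration and are not a tuning
recommendation.

```python
def parse_group_file(text):
    """Parse group-file text (one key=value per line) into a dict."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def diff_options(current, group):
    """Return {option: (current_value, group_value)} for each option the group changes."""
    return {
        key: (current.get(key), value)
        for key, value in group.items()
        if current.get(key) != value
    }


# Three of the reconfigured options from the volume info in this thread.
current = {
    "performance.stat-prefetch": "on",
    "network.inode-lru-limit": "200000",
    "performance.nl-cache-timeout": "600",
}

# HYPOTHETICAL group contents, for illustration only.
example_group = """\
performance.stat-prefetch=on
network.inode-lru-limit=50000
performance.nl-cache=on
"""

changes = diff_options(current, parse_group_file(example_group))
for key, (have, want) in changes.items():
    # None means the option is not in the reconfigured list (still at default).
    print(f"{key}: currently {have}, group would set {want}")
```
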
>
> Or maybe this is really just about as good as I should expect?  Maybe
> the fix is that I need more/faster cores?  I hope not, as that isn't
> really an option.
>
> Anyway, here is my volume info as promised.
>
> root@mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info
>
> Volume Name: webisms
> Type: Distributed-Replicate
> Volume ID: 261901e7-60b4-4760-897d-0163beed356e
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
> Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
> Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
> Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
> Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
> Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
> Options Reconfigured:
> auth.allow: xxxx
> performance.client-io-threads: off
> nfs.disable: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> performance.stat-prefetch: on
> network.inode-lru-limit: 200000
> performance.write-behind-window-size: 4MB
> performance.readdir-ahead: on
> performance.io-thread-count: 64
> performance.cache-size: 8GB
> server.event-threads: 8
> client.event-threads: 8
> performance.nl-cache-timeout: 600
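
To make the staggered arbiter layout easier to see, the brick list above
can be grouped into its replica sets.  This sketch relies on gluster
forming replica sets from consecutive bricks in listed order, and uses
the brick lines with the wrapped "(arbiter)" tags rejoined.

```python
import re
from collections import Counter, defaultdict

# Brick list from the volume info above, "(arbiter)" tags rejoined.
BRICKS = """\
Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
"""

REPLICA_SET_SIZE = 3   # "2 x (2 + 1)": two data bricks plus one arbiter

pattern = re.compile(r"^Brick(\d+): ([^:]+):(\S+)( \(arbiter\))?$")
layout = defaultdict(list)
for line in BRICKS.splitlines():
    m = pattern.match(line)
    if not m:
        continue
    number, host, path, arbiter = m.groups()
    replica_set = (int(number) - 1) // REPLICA_SET_SIZE
    layout[replica_set].append((host, "arbiter" if arbiter else "data"))

# Count brick processes per server, arbiters included.
per_host = Counter(host for members in layout.values() for host, _ in members)

for replica_set, members in sorted(layout.items()):
    print(f"replica set {replica_set}: {members}")
print(dict(per_host))
```

Each server ends up hosting three brick processes (two data bricks and
one arbiter), which lines up with the "each of 3 bricks" CPU figures
reported earlier in the message.
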
>
>
> --
> Bob Miller
> Cell: 867-334-7117
> Office: 867-633-3760
> Office: 867-322-0362
> www.computerisms.ca