[Gluster-users] performance
Computerisms Corporation
bob at computerisms.ca
Tue Aug 4 19:48:51 UTC 2020
Hi Artem,
I would also like this recipe. If you have any comments on my answer to
Strahil, I would love to hear them...
On 2020-08-03 9:42 p.m., Artem Russakovskii wrote:
> I tried putting all web files (specifically WordPress php and static
> files as well as various cache files) on gluster before, and the results
> were miserable on a busy site - our usual ~8-10 load quickly turned into
> 100+ and killed everything.
>
> I had to go back to running just the user uploads (which are static
> files in the WordPress uploads/ dir) on gluster and using rsync (via
> lsyncd) for the frequently executed php / cache.
>
> I'd love to figure this out as well and tune gluster for heavy reads and
> moderate writes, but I haven't cracked that recipe yet.
>
> On Mon, Aug 3, 2020, 8:08 PM Computerisms Corporation
> <bob at computerisms.ca> wrote:
>
> Hi Gurus,
>
> I have been trying to wrap my head around performance improvements on my
> gluster setup, and I don't seem to be making any progress. I mean
> forward progress. Making it worse takes practically no effort at all.
>
> My gluster is distributed-replicated across 6 bricks and 2 servers, with
> an arbiter on each server. I designed it like this so I have an
> expansion path to more servers in the future (like the staggered arbiter
> diagram in the Red Hat documentation). gluster v info output is below.
> I have compiled gluster 7.6 from sources on both servers.
>
> Servers are 6-core/3.4 GHz with 32 GB RAM, no swap, SSDs, and gigabit
> network connections. They are running Debian and are being used as
> redundant web servers. There are some 3 million files on the Gluster
> storage, averaging 130 KB/file. Currently only one of the two servers is
> serving web services. There are well over 100 sites, and apache
> server-status claims around 5 hits per second, depending on time of day,
> so a fair bit of logging is going on. The gluster is only holding website
> data and config files that will be common between the two servers; no
> databases or anything like that on the Gluster.
>
> When the serving server is under load, the load average is consistently
> 12-20. glusterfs is always at the top with 150%-250% cpu, and each of the
> 3 bricks at roughly 50-70%, so consistently pegging 4 of the 6 cores.
> apache processes will easily eat up all the rest of the cpus after that.
> And web page response time is underwhelming at best.
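
As a rough sanity check on the "4 of the 6 cores" estimate, the top figures
quoted above can simply be added up. This is only a sketch: the constant and
function names are made up for illustration, and it assumes the "glusterfs"
entry is a single client process while each of the three bricks is a separate
process on the same server.

```python
# Percent-of-one-core figures as quoted from top (low, high).
GLUSTERFS_CLIENT = (150, 250)  # assumed: the single glusterfs client process
BRICK = (50, 70)               # assumed: each brick process
BRICKS_PER_SERVER = 3
TOTAL_CORES = 6


def cores_used(client, brick, n_bricks):
    """Return a (low, high) estimate of cores consumed by gluster processes."""
    low = (client[0] + n_bricks * brick[0]) / 100
    high = (client[1] + n_bricks * brick[1]) / 100
    return low, high


if __name__ == "__main__":
    low, high = cores_used(GLUSTERFS_CLIENT, BRICK, BRICKS_PER_SERVER)
    print(f"gluster: roughly {low:.1f} to {high:.1f} of {TOTAL_CORES} cores")
```

Under those assumptions the range brackets the "4 cores" figure, which leaves
apache somewhere between one and a half and three cores.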
>
> Interestingly, mostly because it is not something I have ever
> experienced before, software interrupts sit between 1 and 5 on each
> core, but the last core is usually sitting around 20. I have never
> encountered a high load average where the si number was ever
> significant. I have googled the crap out of that (as well as gluster
> performance in general); there are nearly limitless posts about what it
> is, but I have yet to see one thing to explain what to do about it. Sadly
> I can't really shut down the gluster process to confirm if that is the
> cause, but it's a pretty good bet, I think.
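
One way to see which kind of software interrupt is piling up on that last
core might be to read /proc/softirqs, which on Linux lists per-CPU counts for
each softirq type. A minimal sketch follows; the function names are
hypothetical, and the counters are cumulative since boot, so this shows the
long-run distribution rather than the live si percentage that top reports.

```python
import os


def parse_softirqs(text):
    """Parse /proc/softirqs text into {softirq type: {cpu name: count}}."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = [int(field) for field in rest.split()]
        table[name.strip()] = dict(zip(cpus, counts))
    return table


def busiest_cpu(table):
    """Return {softirq type: (cpu, share of total)} for the busiest CPU per type."""
    result = {}
    for name, per_cpu in table.items():
        total = sum(per_cpu.values())
        if total == 0:
            continue  # this softirq type has never fired
        cpu = max(per_cpu, key=per_cpu.get)
        result[name] = (cpu, per_cpu[cpu] / total)
    return result


if __name__ == "__main__" and os.path.exists("/proc/softirqs"):
    with open("/proc/softirqs") as handle:
        summary = busiest_cpu(parse_softirqs(handle.read()))
    for name, (cpu, share) in sorted(summary.items()):
        print(f"{name:>10}: {cpu} handles {share:.0%}")
```

If one type, for example NET_RX, turns out to be handled almost entirely by a
single CPU, that would at least narrow down where the si number comes from.
Taking two samples a few seconds apart and subtracting them would give current
rates instead of totals since boot.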
>
> When the system is not under load, glusterfs will be running at around
> 100% with each of the 3 bricks around 35%, so using 2 cores when doing
> not much of anything.
>
> nload shows the network cards rarely climb above 300 Mbps unless I am
> doing a direct file transfer between the servers, in which case it gets
> right up to the 1 Gbps limit. RAM is never above 15 GB unless I am
> causing it to happen. atop shows a disk busy percentage; it is often
> above 50% and sometimes will hit 100%, but it is nowhere near as
> consistently excessive as the cpu usage is. The cpu
> definitely seems to be the bottleneck.
>
> When I found out about the groups directory, I figured one of those must
> be useful to me, but as best as I can tell they are not. But I am
> really hoping that someone has configured a system like mine and has a
> good group file they might share for this situation, or a peek at their
> volume info output?
>
> Or maybe this is really just about as good as I should expect? Maybe
> the fix is that I need more/faster cores? I hope not, as that isn't
> really an option.
>
> Anyway, here is my volume info as promised.
>
> root at mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info
>
> Volume Name: webisms
> Type: Distributed-Replicate
> Volume ID: 261901e7-60b4-4760-897d-0163beed356e
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
> Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
> Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
> Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
> Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
> Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
> Options Reconfigured:
> auth.allow: xxxx
> performance.client-io-threads: off
> nfs.disable: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> performance.stat-prefetch: on
> network.inode-lru-limit: 200000
> performance.write-behind-window-size: 4MB
> performance.readdir-ahead: on
> performance.io-thread-count: 64
> performance.cache-size: 8GB
> server.event-threads: 8
> client.event-threads: 8
> performance.nl-cache-timeout: 600
>
>
> --
> Bob Miller
> Cell: 867-334-7117
> Office: 867-633-3760
> Office: 867-322-0362
> www.computerisms.ca