<div dir="auto">I tried putting all web files (specifically WordPress PHP and static files as well as various cache files) on gluster before, and the results were miserable on a busy site - our usual load average of ~8-10 quickly turned into 100+ and killed everything.<div dir="auto"><br></div><div dir="auto">I had to go back to running just the user uploads (which are static files in the WordPress uploads/ dir) on gluster and using rsync (via lsyncd) for the frequently executed PHP / cache files.</div><div dir="auto"><br></div><div dir="auto">I&#39;d love to figure this out as well and tune gluster for heavy reads and moderate writes, but I haven&#39;t cracked that recipe yet. </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Aug 3, 2020, 8:08 PM Computerisms Corporation &lt;<a href="mailto:bob@computerisms.ca">bob@computerisms.ca</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Gurus,<br>

<br>

I have been trying to wrap my head around performance improvements on my <br>

gluster setup, and I don&#39;t seem to be making any progress.  I mean <br>

forward progress.  making it worse takes practically no effort at all.<br>

<br>

My gluster is distributed-replicated across 6 bricks and 2 servers, with <br>

an arbiter on each server.  I designed it like this so I have an <br>

expansion path to more servers in the future (like the staggered arbiter <br>

diagram in the Red Hat documentation).  The gluster v info output is below. <br>

I have compiled gluster 7.6 from sources on both servers.<br>

<br>

Servers are 6-core/3.4GHz with 32 GB RAM, no swap, SSD storage, and gigabit <br>

network connections.  They are running Debian, and are being used as <br>

redundant web servers.  There are some 3 million files on the Gluster <br>

storage, averaging 130KB/file.  Currently only one of the two servers is <br>

serving web services.  There are well over 100 sites, and apache <br>

server-status claims around 5 hits per second, depending on time of day, <br>

so a fair bit of logging going on.  The gluster is only holding website <br>

data and config files that will be common between the two servers, no <br>

databases or anything like that on the Gluster.<br>

<br>

When the serving server is under load, the load average is consistently <br>

12-20.  glusterfs is always at the top with 150%-250% cpu, and each of 3 <br>

bricks at roughly 50-70%, so consistently pegging 4 of the 6 cores. <br>

apache processes will easily eat up all the rest of the cpus after that. <br>

  And web page response time is underwhelming at best.<br>
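
As a quick sanity check on the figures above, the reported ranges can be summed. This is only a sketch, and it assumes the percentages are top-style numbers where 100% equals one core:<br>

```python
# Figures quoted in the paragraph above.
# Assumption: top-style percentages, where 100% == one fully busy core.
CORES = 6
BRICKS = 3
glusterfs_pct = (150, 250)  # the glusterfs process
brick_pct = (50, 70)        # each of the 3 brick processes

low = glusterfs_pct[0] + BRICKS * brick_pct[0]
high = glusterfs_pct[1] + BRICKS * brick_pct[1]

print(f"gluster total: {low}-{high}% "
      f"= {low / 100:.1f}-{high / 100:.1f} of {CORES} cores")
```

The midpoint of that range is close to the "4 of the 6 cores" estimate.<br>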

<br>

Interestingly, mostly because it is not something I have ever <br>

experienced before, software interrupts sit between 1 and 5 on each <br>

core, but the last core is usually sitting around 20.  Have never <br>

encountered a high load average where the si number was ever <br>

significant.  I have googled the crap out of that (as well as gluster <br>

performance in general); there are nearly limitless posts about what it <br>

is, but I have yet to see one that explains what to do about it.  Sadly <br>

I can&#39;t really shut down the gluster process to confirm if that is the <br>

cause, but it&#39;s a pretty good bet, I think.<br>
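
One way to narrow down which kind of softirq is piling up on that last core is to read /proc/softirqs, which lists per-CPU counters for each softirq class (NET_RX, NET_TX, TIMER and so on). The following is a minimal sketch, assuming a Linux host; the function names are made up for illustration, and the counters are cumulative since boot, so two samples taken a few seconds apart would need to be compared to see current rates:<br>

```python
def read_softirqs(path="/proc/softirqs"):
    """Return the raw text of the kernel's per-CPU softirq counters."""
    with open(path) as f:
        return f.read()


def parse_softirqs(text):
    """Parse /proc/softirqs text into {softirq_name: {cpu_name: count}}."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, *counts = line.split()
        table[name.rstrip(":")] = dict(zip(cpus, map(int, counts)))
    return table


def busiest_cpu(table):
    """For each softirq class, name the CPU with the highest counter."""
    return {name: max(per_cpu, key=per_cpu.get)
            for name, per_cpu in table.items()}
```

Calling busiest_cpu(parse_softirqs(read_softirqs())) on the serving machine would show whether one class, for example NET_RX, is concentrated on a single core.<br>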

<br>

When the system is not under load, glusterfs will be running at around <br>

100% with each of the 3 bricks around 35%, so using 2 cores when doing <br>

not much of anything.<br>

<br>

nload shows the network cards rarely climb above 300 Mbps unless I am <br>

doing a direct file transfer between the servers, in which case it gets <br>

right up to the 1Gbps limit.  RAM is never above 15GB unless I am <br>

causing it to happen.  atop shows a disk busy percentage that is often <br>

above 50% and sometimes hits 100%, but it is nowhere near as <br>

consistently high as the usage on the cpu cores.  The cpu <br>

definitely seems to be the bottleneck.<br>

<br>

When I found out about the groups directory, I figured one of those must <br>

be useful to me, but as best as I can tell they are not.  But I am <br>

really hoping that someone has configured a system like mine and has a <br>

good group file they might share for this situation, or a peek at their <br>

volume info output?<br>

<br>

Or maybe this is really just about as good as I should expect?  Maybe <br>

the fix is that I need more/faster cores?  I hope not, as that isn&#39;t <br>

really an option.<br>

<br>

Anyway, here is my volume info as promised.<br>

<br>

root@mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info<br>

<br>

Volume Name: webisms<br>

Type: Distributed-Replicate<br>

Volume ID: 261901e7-60b4-4760-897d-0163beed356e<br>

Status: Started<br>

Snapshot Count: 0<br>

Number of Bricks: 2 x (2 + 1) = 6<br>

Transport-type: tcp<br>

Bricks:<br>

Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0<br>

Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0<br>

Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)<br>

Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1<br>

Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1<br>

Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)<br>

Options Reconfigured:<br>

auth.allow: xxxx<br>

performance.client-io-threads: off<br>

nfs.disable: on<br>

storage.fips-mode-rchecksum: on<br>

transport.address-family: inet<br>

performance.stat-prefetch: on<br>

network.inode-lru-limit: 200000<br>

performance.write-behind-window-size: 4MB<br>

performance.readdir-ahead: on<br>

performance.io-thread-count: 64<br>

performance.cache-size: 8GB<br>

server.event-threads: 8<br>

client.event-threads: 8<br>

performance.nl-cache-timeout: 600<br>
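
For a sense of scale, two of the limits above can be set against the dataset described earlier. This is a rough sketch only: it assumes decimal units (1 KB = 1000 bytes), and it says nothing about how much of the file set is actually hot at any one time:<br>

```python
# Figures from the message above.
# Assumption: decimal units (1 KB = 1000 bytes, 1 GB = 1000**3 bytes).
FILES = 3_000_000
AVG_FILE_BYTES = 130 * 1000
CACHE_SIZE_BYTES = 8 * 1000**3  # performance.cache-size: 8GB
INODE_LRU_LIMIT = 200_000       # network.inode-lru-limit: 200000

dataset_bytes = FILES * AVG_FILE_BYTES

print(f"dataset size:        ~{dataset_bytes / 1000**3:.0f} GB")
print(f"cache-size vs data:  {CACHE_SIZE_BYTES / dataset_bytes:.1%}")
print(f"inode limit vs files: {INODE_LRU_LIMIT / FILES:.1%}")
```

Both ratios come out in the single digits, so whether these settings help would depend on how concentrated the reads are on a small set of files.<br>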

<br>

<br>

-- <br>

Bob Miller<br>

Cell: 867-334-7117<br>

Office: 867-633-3760<br>

Office: 867-322-0362<br>

<a href="http://www.computerisms.ca" rel="noreferrer noreferrer" target="_blank">www.computerisms.ca</a><br>


</blockquote></div>