[Gluster-users] Performance in VM guests when hosting VM images on Gluster

Brian Foster bfoster at redhat.com
Fri Mar 1 18:01:39 UTC 2013


On 03/01/2013 11:48 AM, Torbjørn Thorsen wrote:
> On Thu, Feb 28, 2013 at 4:54 PM, Brian Foster <bfoster at redhat.com> wrote:
...
> 
> Thanks for taking the time to investigate and explain all this,
> I'm very grateful for that.
> 

No problem.

> I had two clients set up to do a 2GB sync write, then sleep 3600 seconds.
> The writes were to loop devices backed by a file on Gluster.
> The writes were staggered to avoid overlap, and I created
> the loop devices with a few hours between them.
> 
> Client 1:
> 00:50:49 - 2.1 GB copied, 53.2285 s, 39.4 MB/s
> 01:51:43 - 2.1 GB copied, 53.943 s, 38.9 MB/s
> 02:52:37 - 2.1 GB copied, 619.662 s, 3.4 MB/s
> 04:02:56 - 2.1 GB copied, 675.043 s, 3.1 MB/s
> 
> Client 2:
> 00:12:27 - 2.1 GB copied, 51.0465 s, 41.1 MB/s
> 01:13:18 - 2.1 GB copied, 59.8881 s, 35.0 MB/s
> 02:14:18 - 2.1 GB copied, 817.714 s, 2.6 MB/s
> 03:27:55 - 2.1 GB copied, 609.257 s, 3.4 MB/s
> 
> I have more data points, but the change is there.
> Somewhere between 01:51 and 02:14, the transfer rate
> seems to go from ~40MB/s to ~3MB/s.
> 
> This is consistent with the findings from a couple of days ago,
> where things had gotten slow over night.
> 
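If I'm reading your setup right, the test boils down to something like
the sketch below (mount point, backing file and loop device are
placeholders, so substitute your own):

# create the backing file on the gluster fuse mount and map it
dd if=/dev/zero of=/mnt/gluster/backing.img bs=1M count=4096
losetup /dev/loop0 /mnt/gluster/backing.img

# staggered 2GB sync writes, one per hour; dd prints the
# "2.1 GB copied, ..." throughput lines quoted above
while true; do
    dd if=/dev/zero of=/dev/loop0 bs=1M count=2048 oflag=sync
    sleep 3600
done
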
> Removing and re-adding the loop device gets us back to ~40MB/s,
> without me noticing any error messages or latency due to flushing.
> I didn't measure if there was any flushing going on, but "losetup -d"
> returned pretty much immediately.
> All writes are done with sync, so I don't quite understand how cache
> flushing comes in.
> 

Flushing doesn't seem to be a factor; I was just noting previously that
the only slowdown I saw in my brief tests was associated with flushing.

Note again though that loop seems to flush on close(). I suspect a
reason for this is so 'losetup -d' can return immediately, but that's
just a guess. IOW, if you hadn't used oflag=sync, the close() issued by
dd before it actually exits would result in flushing the buffers
associated with the loop device to the backing store. You are using
oflag=sync, so that doesn't really matter.
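
To make the distinction concrete (the device path is a placeholder):

# O_SYNC: every write is pushed through to the backing file as it happens
dd if=/dev/zero of=/dev/loop0 bs=1M count=2048 oflag=sync

# no O_SYNC: data can sit in the loop device's cache and, per the above,
# gets flushed to the backing store when dd close()s the device
dd if=/dev/zero of=/dev/loop0 bs=1M count=2048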

> 
> On one client, I had a slow loop device from yesterday and a fresh,
> new loop device going.
> The transfer rate for the old was about ~3.5MB/s, while the new had ~40MB/s.
> 
> I set up a script that cleared the Gluster stats, did a 60-second dd
> write, ran "sync", and then dumped the Gluster profiling stats.
> Gluster was idle except for this script.
> Looking at those results, we can clearly see that there is a
> difference in the behavior.
> 
> $ ./parse-gluster-info.py slow-and-fast/auto-slow
> Brick: storage01.gluster.trollweb.net:/srv/gluster/brick1
> Block size           reads      writes
> 4096b+               0          49152
> 131072b+             7          0
> 
> Brick: storage02.gluster.trollweb.net:/srv/gluster/brick1
> Block size           reads      writes
> 4096b+               0          49152
> 
> $ ./parse-gluster-info.py slow-and-fast/auto-fast
> Brick: storage01.gluster.trollweb.net:/srv/gluster/brick0
> Block size           reads      writes
> 4096b+               0          106
> 8192b+               0          68
> 16384b+              0          305
> 32768b+              0          2261
> 65536b+              0          21285
> 131072b+             0          466
> 
> Brick: storage02.gluster.trollweb.net:/srv/gluster/brick0
> Block size           reads      writes
> 4096b+               0          106
> 8192b+               0          68
> 16384b+              0          305
> 32768b+              0          2261
> 65536b+              0          21285
> 131072b+             22         466
> 
> We see these are on different bricks.
> I made new loop devices until one was backed by brick1,
> and the transfer rate was as expected:
> $ ./parse-gluster-info.py slow-and-fast/auto-fast-loop3
> Brick: storage01.gluster.trollweb.net:/srv/gluster/brick1
> Block size           reads      writes
> 4096b+               0          165
> 8192b+               0          99
> 16384b+              0          70
> 32768b+              0          805
> 65536b+              0          19086
> 131072b+             0          851
> 
> Brick: storage02.gluster.trollweb.net:/srv/gluster/brick1
> Block size           reads      writes
> 4096b+               0          165
> 8192b+               0          99
> 16384b+              0          70
> 32768b+              0          805
> 65536b+              0          19086
> 131072b+             22         851
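
As an aside, rather than creating loop files until one happens to land on
the brick you want, the client should be able to tell you directly which
brick(s) back a given file via the pathinfo xattr, assuming your version
exposes it (mount point and file name are placeholders):

getfattr -n trusted.glusterfs.pathinfo /mnt/gluster/backing.img

That is queried on the fuse mount, not on the brick itself.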
> 
> The script and gluster profile output can be
> downloaded at http://torbjorn-dev.trollweb.net/gluster/.
> 
> To me it seems that a fresh loop device does mostly 64kb writes,
> and at some point during a 24-hour window changes to doing 4kb writes?
> 

Yeah, interesting data. One thing I was curious about is whether
write-behind or some other caching translator was behind this one way or
another (including the possibility that the higher throughput is
actually the one due to a bug, rather than the other way around). If I
understand the io-stats translator correctly, however, these request-size
metrics should match the size of the requests coming into gluster, which
suggests something else is going on.
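
For anyone reproducing this, those per-request-size counters are the
volume profiling output, presumably gathered with something like this
(the volume name is a placeholder):

gluster volume profile vmimages start
# ... run the 60-second dd against the loop device ...
gluster volume profile vmimages info    # per-brick block size histogram
gluster volume profile vmimages stop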

Regardless, I think it's best to narrow the problem down and rule out as
much as possible. Could you try some of the commands in my previous
email to disable performance translators and see if it affects
throughput? For example, does disabling any particular translator
degrade throughput consistently (even on new loop devices)? If so, does
re-enabling a particular translator enhance throughput on an already
mapped and "degraded" loop (without unmapping/remapping the loop)?
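
I.e., something along these lines, with your volume name swapped in:

# disable the client-side performance translators one at a time
gluster volume set vmimages performance.write-behind off
gluster volume set vmimages performance.io-cache off
gluster volume set vmimages performance.read-ahead off
gluster volume set vmimages performance.quick-read off

# ... and back on again, to see whether a "degraded" loop recovers
gluster volume set vmimages performance.write-behind on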

Also, what gluster and kernel versions are you on?
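
The output of the following is enough:

glusterfs --version
uname -r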

Brian

> 
> --
> Best regards
> Torbjørn Thorsen
> Developer / operations engineer
> 
> Trollweb Solutions AS
> - Professional Magento Partner
> www.trollweb.no
> 
> Phone, daytime: +47 51215300
> Phone, evenings/weekends: for customers with a service agreement
> 
> Visiting address: Luramyrveien 40, 4313 Sandnes
> Postal address: Maurholen 57, 4316 Sandnes
> 
> Note that all our standard terms and conditions always apply
> 



