[Bugs] [Bug 1467614] Gluster read/write performance improvements on NVMe backend
bugzilla at redhat.com
Thu Nov 9 15:03:39 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1467614
Krutika Dhananjay <kdhananj at redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags| |needinfo?(mpillai at redhat.com)
--- Comment #46 from Krutika Dhananjay <kdhananj at redhat.com> ---
(In reply to Manoj Pillai from comment #44)
> (In reply to Manoj Pillai from comment #40)
> > Back to the client-side analysis. Single-client, single brick runs. Client
> > system separate from server, 10GbE interconnect. io-thread-count=4.
> >
> > Trying an approach where I'm doing runs with increasing number of concurrent
> > jobs -- 24, 48 and 96 -- and attaching mutrace to the glusterfs client in
> > each case. Comparing mutrace output for the runs
>
> Looking at this more, I'm not convinced that lock contention is the root
> cause of the IOPs limit we are hitting...
>
> fio output for run with 24 concurrent jobs:
> read: IOPS=23.7k, BW=92.4Mi (96.9M)(6144MiB/66483msec)
> clat (usec): min=209, max=2951, avg=1010.89, stdev=62.19
> lat (usec): min=209, max=2952, avg=1011.16, stdev=62.18
>
> fio output for run with 48 concurrent jobs:
> read: IOPS=23.5k, BW=91.8Mi (96.3M)(6144MiB/66932msec)
> clat (usec): min=237, max=4431, avg=2038.93, stdev=120.72
> lat (usec): min=237, max=4431, avg=2039.21, stdev=120.71
>
> IOPs is about the same but avg latency is double in the 48 jobs case, up by
> about 1ms. If lock contention is what is limiting IOPS, we should have seen
> a huge spike in contention times reported by mutrace for the run with 48
> jobs. mutrace outputs are attached in comment #42 and comment #43.
> Contention times reported vary from run to run, but I'm not seeing the kind
> of increases I'd expect if lock contention was the root cause.
>
> I think we need to look for other possible bottlenecks as well.
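[The fio numbers quoted above are consistent with Little's law (throughput = concurrency / latency): doubling the jobs doubles the latency while IOPS stays flat, which is the signature of a fixed-capacity bottleneck rather than added per-op work. A quick sanity check, assuming each job keeps one request in flight:]

```python
# Little's law sanity check on the fio outputs quoted above:
# IOPS ~= concurrent jobs / average completion latency.
runs = {24: 1011.16e-6, 48: 2039.21e-6}  # jobs -> avg lat (seconds)
for jobs, lat in runs.items():
    iops = jobs / lat
    print(f"{jobs} jobs: predicted {iops / 1000:.1f}k IOPS")
# Both predictions land on ~23.5-23.7k IOPS, matching the measured numbers,
# so the system is saturated at that rate regardless of offered concurrency.
```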
So I compared the mutrace output attachments from comment #42 and comment #43,
and I see that the contention time of the lock in iobuf_pool_new() increased by
about 10 seconds when the number of concurrent jobs went from 24 to 48 (with 24
jobs it was 28283ms, or ~28s, and with 48 jobs it is 38450ms, or ~38s). Just
for my own understanding - based on your calculations in comment #44, is a 10s
increase not big enough for this to be taken seriously?
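[One way to weigh the 10s figure is to amortize the total contention time over the ops in each run. The op count below is an assumption derived from the fio output: 6144MiB transferred at what the BW/IOPS ratio implies is ~4KiB per read, i.e. about 1.57M ops per run:]

```python
# Amortize the mutrace contention totals over the (assumed) op count.
# 6144 MiB / 4 KiB per read -> 1,572,864 ops per run.
ops = 6144 * 1024 // 4
for jobs, contention_s in ((24, 28.283), (48, 38.450)):
    per_op_us = contention_s / ops * 1e6
    print(f"{jobs} jobs: ~{per_op_us:.0f} us of lock wait per op")
# The increase works out to roughly 6-7 us per op between the two runs,
# which is small next to the ~1000 us jump in average completion latency.
```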
-Krutika