[Bugs] [Bug 1266476] New: RFE : Feature: Periodic FOP statistics dumps for v3.6.x/v3.7.x

bugzilla at redhat.com bugzilla at redhat.com
Fri Sep 25 11:23:58 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1266476

            Bug ID: 1266476
           Summary: RFE : Feature: Periodic FOP statistics dumps for
                    v3.6.x/v3.7.x
           Product: GlusterFS
           Version: mainline
         Component: core
          Keywords: FutureFeature, Triaged
          Severity: medium
          Priority: medium
          Assignee: bugs at gluster.org
          Reporter: asengupt at redhat.com
                CC: asengupt at redhat.com, bengland at redhat.com,
                    bugs at gluster.org, gluster-bugs at redhat.com,
                    rwareing at fb.com, sshreyas at fb.com
        Depends On: 1261700



+++ This bug was initially created as a clone of Bug #1261700 +++

Description of problem:
This patch adds periodic JSON dumps of FOP latency and hit-rate statistics
from the io-stats translator.  The dump period is set with the
diagnostics.stats-dump-interval <dump interval sec> option, and the dumps are
stored in /var/lib/glusterd/stats under their respective FUSE, gNFSd or brick
instance.

This is immensely useful for reliably ferreting out diagnostics & performance
metrics from GlusterFS for injection into a robust analytics backend for future
analysis or alarming.  It is in heavy use here at Facebook.
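
As an illustration of the "injection into an analytics backend" step, here is
a minimal Python sketch that walks a stats directory, parses each JSON dump and
flattens it into dotted metric names.  The schema of the dumps is not shown in
this bug, so the sketch assumes nothing about key names; the helper names
flatten and collect_dumps are hypothetical and not part of the patch.

```python
import json
from pathlib import Path


def flatten(value, prefix=""):
    """Yield (dotted_key, leaf_value) pairs from nested JSON data."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, value


def collect_dumps(stats_dir):
    """Parse every readable JSON file under stats_dir into flat metric dicts."""
    results = {}
    for path in sorted(Path(stats_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text())
        except (OSError, ValueError):
            # A dump may be mid-write or not JSON at all; skip it this round.
            continue
        results[str(path)] = dict(flatten(data))
    return results
```

Calling collect_dumps("/var/lib/glusterd/stats") would return one flat
dictionary per dump file, keyed by file path, which can then be forwarded to
whatever backend is in use.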

The patch applies cleanly to the release-3.6 and release-3.7 branches as of
this bug's creation.

Version-Release number of selected component (if applicable):
v3.6.x or v3.7.x; it should be trivial to port to master.

How reproducible:
100%

Steps to Reproduce:
N/A

Actual results:


Expected results:


Additional info:

--- Additional comment from Ben England on 2015-09-22 16:55:47 EDT ---

Richard, 

This is an extremely good idea.  I have had to parse gluster volume profile
output and it is extremely hard to do.  JSON would make it much easier.  Also,
the io-stats translator can run client-side, so you get client-side latency,
not server-side.  It would be great if /usr/sbin/gluster could initiate the
profiling so we didn't have to edit a volfile.

Can you provide an attachment with JSON output from the patch so that lazy
folks like me can see what it looks like?

thx

-Ben England, Perf. Engr., Red Hat

--- Additional comment from  on 2015-09-22 17:25 EDT ---



--- Additional comment from  on 2015-09-22 17:34:08 EDT ---

Added example output.  Also, this is automatically engaged when either of these
options is enabled:

diagnostics.latency-measurement
diagnostics.count-fop-hits

and 

diagnostics.stats-dump-interval

...is set to something non-zero.
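
A rough Python sketch of turning these options on from a script follows.  It
assumes the options are set through the usual "gluster volume set <VOLNAME>
<KEY> <VALUE>" command and that "on" is the enabling value.  The helper names
are hypothetical, and the interval option name is passed in by the caller
rather than hard-coded.  Both measurement options are enabled here, although
either one is enough according to the comment above.

```python
import subprocess


def build_enable_commands(volume, interval_option, interval_sec):
    """Return the gluster CLI argument lists that turn on periodic dumps."""
    if interval_sec <= 0:
        raise ValueError("the dump interval must be non-zero to enable dumps")
    settings = [
        ("diagnostics.latency-measurement", "on"),
        ("diagnostics.count-fop-hits", "on"),
        (interval_option, str(interval_sec)),
    ]
    return [["gluster", "volume", "set", volume, key, value]
            for key, value in settings]


def apply_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Building the argument lists separately from running them makes it possible to
log or review the commands before apply_commands touches a live volume.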

We run with these enabled 24x7 on all clusters, and as you have noted it's
extremely powerful to be able to look at performance from all layers of the
stack (FUSE client, gNFSd and bricks).  And with lockless counters (also in
this patch) we haven't observed any perf hit.

--- Additional comment from Avra Sengupta on 2015-09-25 07:23:19 EDT ---

Cloning this bug to master.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1261700
[Bug 1261700] RFE : Feature: Periodic FOP statistics dumps for
v3.6.x/v3.7.x

