[Gluster-devel] Feature: Tunable FOP sampling for v3.6.x/v3.7.x
rwareing at fb.com
Thu Sep 10 19:36:18 UTC 2015
Following up on the FOP statistics dump feature, here's our FOP sampling patch as well. This feature allows you to sample FOPs at a 1:N ratio so that they can later be analyzed to track down misbehaving clients, calculate P99/P95 FOP service times, audit traffic and probably other things I'm forgetting to mention.
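As a rough illustration of the P95/P99 use case, here is a minimal Python sketch. It assumes the sampled latencies have already been parsed into a plain list of numbers; the on-disk sample format is not described in this mail, so parsing is left out. The function name `percentile` and the nearest-rank method are my own choices for illustration, not something taken from the patch.

```python
import math
from typing import Sequence


def percentile(latencies: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies)
    # 1-based rank into the sorted samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical latencies; real values would come from the sample dumps.
    lat = [float(v) for v in range(1, 101)]
    print("P95:", percentile(lat, 95))
    print("P99:", percentile(lat, 99))
```

Keep in mind that with a 1:N sampling interval the percentiles are estimates over the sampled FOPs only, so a larger N means a coarser estimate.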
The patch can be had @ https://bugzilla.redhat.com/show_bug.cgi?id=1262092 (it does require the FOP stats dump patch to work, so patch that first!)
Here are the details from the patch commit description:
debug/io-stats: FOP sampling feature
- Using sampling feature you can record details about every Nth FOP.
The fields in each sample are: FOP type, hostname, uid, gid, FOP priority,
port and time taken (latency) to fulfill the request.
- Implemented using a ring buffer which is not allocated (malloc/calloc) in
the IO path; this should make the sampling process pretty cheap.
- DNS resolution done @ dump time not @ sample time for performance w/
- Metrics can be used for both diagnostics, traffic/IO profiling as well
as P95/P99 calculations
- To control this feature there are two new volume options:
diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means
sample every FOP, 100 means sample every 100th FOP
diagnostics.fop-sample-buf-size - The size (in bytes) of the ring
buffer used to store the samples. In the event more samples
are collected in the stats dump interval than can be held in this buffer,
the oldest samples shall be discarded. Samples are stored in the log
directory under /var/log/glusterfs/samples.
- Uses DNS cache written by sshreyas at fb.com (Thank-you!), the DNS cache
TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option
and defaults to 24hrs.
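
The sampling scheme described above (record every Nth FOP into a preallocated ring buffer, discard the oldest samples on overflow, drain at dump time) can be sketched roughly as follows. This is a Python illustration, not the patch's C code: the names `Sample` and `FopSampler` and the field names are hypothetical, and the capacity here is a number of slots, whereas the real diagnostics.fop-sample-buf-size option is described as a size in bytes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    # Fields mirror the list in the commit description; names are assumed.
    fop_type: str
    client: str       # stored as-is; name resolution is deferred to dump time
    uid: int
    gid: int
    priority: int
    port: int
    latency_us: float


class FopSampler:
    """Record every Nth FOP into a fixed-size ring buffer (illustrative)."""

    def __init__(self, interval: int, capacity: int) -> None:
        if interval < 1 or capacity < 1:
            raise ValueError("interval and capacity must be >= 1")
        self.interval = interval
        # The slot array is created once, up front, not per FOP.
        self.slots: List[Optional[Sample]] = [None] * capacity
        self.seen = 0      # FOPs observed so far
        self.written = 0   # samples stored since the last dump

    def record(self, sample: Sample) -> bool:
        """Store the sample if this is the Nth FOP; return True if stored."""
        self.seen += 1
        if self.seen % self.interval != 0:
            return False
        # Wrap around, overwriting the oldest slot once the buffer is full.
        self.slots[self.written % len(self.slots)] = sample
        self.written += 1
        return True

    def dump(self) -> List[Sample]:
        """Return retained samples oldest-first and reset the buffer."""
        cap = len(self.slots)
        count = min(self.written, cap)
        start = (self.written - count) % cap
        out = [self.slots[(start + i) % cap] for i in range(count)]
        self.written = 0
        return [s for s in out if s is not None]


if __name__ == "__main__":
    sampler = FopSampler(interval=2, capacity=3)
    for i in range(1, 11):
        sampler.record(Sample("WRITE", "10.0.0.1", 0, 0, 0, 1023, float(i)))
    # Five FOPs were sampled but only the three newest fit in the buffer.
    print([s.latency_us for s in sampler.dump()])
```

With an interval of 2 and three slots, ten FOPs yield five samples, of which the two oldest are overwritten before the dump.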
Thanks go to David Hasson for reviewing the code at our end, and Shreyas Siravara for his (high performance) DNS cache implementation. The direction we (and by "we" I really mean Shreyas) plan on taking this work is load shaping/throttling based on host prefixes, uids, gids etc., since this patch exposes this information in a concise manner which is out of the IO path.