<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 09/01/2017 10:57 AM, Amar Tumballi
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAHxyDdNawp_rTc-vcze+VYKHSvz3Y4zXrqB6q2oL0V7_fKtFBw@mail.gmail.com">
<div dir="ltr">Disclaimer: This email is long and took
significant time to write. Please take the time to read and review it and
give feedback, so that we can get some metrics-related tasks done by
Gluster 4.0.
<div><br>
</div>
<div>---</div>
<div><b>* History:</b></div>
<div><br>
</div>
<div>To understand what is happening inside a GlusterFS process,
we have, over the years, opened many bugs, written some code
around statedump, and put effort into the io-stats translator
to improve Gluster's monitoring capabilities.</div>
<div><br>
</div>
<div>But surely more is required! A glimpse of it
is captured in [1], [2], [3] &amp; [4]. I also sent an
email to this group [5] about the possibilities for capturing this
information.</div>
<div><br>
</div>
<div><b>* Current problem:</b></div>
<div><br>
</div>
<div>When we talk about metrics or monitoring, we have to
consider handing this data to a tool that can record
the readings periodically; without a graph over time, no
metric makes sense. So the first challenge is how
to get the metrics out. Should getting them out of each
process require interacting with 'glusterd', or should we use signals?
This leads us to <b>'challenge #1'.</b></div>
<div><br>
</div>
<div>Next: should we depend on io-stats to do the reporting?
If yes, how do we get information from between any two layers?
Should we load io-stats between all the nodes of the
translator graph, or should we use the STACK_WIND/UNWIND
framework to get the details? This is our <b>'challenge #2'.</b></div>
<div><br>
</div>
<div>Once the above decision is taken, the next question
is: what about 'metrics' from other translators? Who gives them
out (i.e., dumps them)? Why do we need something similar to
statedump, and can't we read the info from statedump itself? But
when we say 'metrics', we mean a key with a number
associated with it; statedump has a lot more, and no fixed format. If
it is different from statedump, then how should
translator code give out metrics? This is our <b>'challenge
#3'.</b></div>
<div><br>
</div>
<div>If we get a solution to the above challenges, then I guess we
are in decent shape for further development. Let's go through
them one by one, in detail.</div>
<div><br>
</div>
<div><b>* Problems and proposed solutions:</b></div>
<div><br>
</div>
<div><b>a) how to dump metrics data?</b></div>
<div><br>
</div>
<div>Currently, I propose the signal handler way, as it lets
us choose which processes we
capture information from, and it will be much faster than
communicating through another tool. Considering that we need
to take these metrics every 10 seconds or so, we
need an efficient way to get them out.</div>
<div><br>
</div>
<div>But even there we have a challenge, because both the
USR1 and USR2 signals are already taken: USR1 for statedump and
USR2 for toggling latency monitoring. It makes
sense for statedump to keep USR1, but toggling
options should technically (and for correctness) be handled
by glusterd volume set options, through our 'reconfigure()' framework in
graph-switch. A proposal is in GitHub issue #303 [6]. </div>
<div><br>
</div>
<div>If we are good with the above proposal, then we can use
USR2 for the metrics dump. The next issue is the format of
the file itself, which we will discuss at the end of the
email.</div>
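<div><br>
</div>
<div>A minimal sketch of the signal-driven dump described above, written
in Python purely for illustration (GlusterFS itself is C). The counter
names, the default path and every function name here are hypothetical; the
only ideas taken from the text are that USR2 triggers the dump and that the
output is one key and one number per line. The handler only sets a flag, and
the file is written later from the main loop.</div>
<pre>
```python
import os
import signal
import tempfile
import time

# Hypothetical counters; real values would come from the translators.
METRICS = {"fop.lookup.count": 0, "fop.write.count": 0}

# Hypothetical default location for the dump file.
DEFAULT_PATH = os.path.join(
    tempfile.gettempdir(), "glusterfs-metrics.%d.txt" % os.getpid()
)

dump_requested = False


def request_dump(signum, frame):
    """Signal handler: only record the request; the main loop does the I/O."""
    global dump_requested
    dump_requested = True


def install_handler():
    """Assumes USR2 has been freed from latency toggling, as proposed in [6]."""
    signal.signal(signal.SIGUSR2, request_dump)


def dump_metrics(path=DEFAULT_PATH):
    """Write a timestamp comment followed by one 'key value' pair per line."""
    with open(path, "w") as out:
        out.write("# dumped at %d\n" % int(time.time()))
        for key in sorted(METRICS):
            out.write("%s %d\n" % (key, METRICS[key]))


def service_pending_dump(path=DEFAULT_PATH):
    """Dump only if a signal asked for it; returns True when a dump was written."""
    global dump_requested
    if not dump_requested:
        return False
    dump_requested = False
    dump_metrics(path)
    return True
```
</pre>
<div>A process would call install_handler() once at startup and
service_pending_dump() from its event loop; a collector would then send
USR2 to just the pids it cares about every 10 seconds or so.</div>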
</div>
</blockquote>
<br>
How about using UDP to push data from Gluster processes?<br>
<br>
- Signal handling is required only to reload the vol file when
metrics are enabled/disabled<br>
- Push to a pre-configured UDP address (socket file); if a listener
exists it captures the metrics, otherwise the UDP message is lost<br>
- Does not affect I/O performance, since the send is asynchronous and
the sender does not check for errors<br>
- If metrics are lost, no data/I/O is impacted. We don't need
crash consistency or high accuracy while collecting metrics<br>
- The receiver can take in data quickly, without blocking incoming data, and
can produce outputs in different formats asynchronously<br>
<br>
Usage:<br>
- Enable metrics using a volume option (volopt)<br>
- Start the UDP server (metrics receiver), for example
`./gluster-metrics-receiver` (the socket file should be predefined, say
`/var/run/gluster/metrics.socket`)<br>
<br>
Limitations:<br>
As with the `strace` command, running two receivers at the same time against
the same socket is not possible. Multiple receivers can be run by using a
different socket address for each process/pid, for example
`metrics.&lt;pid&gt;.socket`, with the receiver listening via
`./gluster-metrics-receiver -p &lt;pid&gt;`<br>
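<br>
A rough sketch of the push model, in Python for illustration only. It
assumes that "socket file" means a Unix-domain datagram socket, which has
the same fire-and-forget behaviour as UDP; all function names are
hypothetical and this is not the actual gluster-metrics-receiver. The sender
never blocks and treats a missing listener as a dropped metric.<br>
<pre>
```python
import os
import socket


def socket_path(run_dir, pid=None):
    """One shared socket file, or one per pid when several receivers run."""
    if pid is None:
        return os.path.join(run_dir, "metrics.socket")
    return os.path.join(run_dir, "metrics.%d.socket" % pid)


def open_receiver(path):
    """Bind a datagram socket to the socket file; one receiver per path."""
    if os.path.exists(path):
        os.unlink(path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    sock.setblocking(False)
    return sock


def open_sender():
    """Non-blocking sender, so a slow receiver can never stall I/O paths."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock


def push_metric(sender, path, key, value):
    """Fire-and-forget: a missing listener or full buffer just drops the metric."""
    try:
        sender.sendto(("%s %d" % (key, value)).encode("ascii"), path)
        return True
    except OSError:
        return False


def drain(receiver):
    """Return every datagram currently queued, without blocking."""
    messages = []
    while True:
        try:
            messages.append(receiver.recv(4096).decode("ascii"))
        except BlockingIOError:
            return messages
```
</pre>
The return value of push_metric() exists only so the sketch can be
exercised; a real sender would ignore it, which is the point of the design.<br>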
<br>
<blockquote type="cite"
cite="mid:CAHxyDdNawp_rTc-vcze+VYKHSvz3Y4zXrqB6q2oL0V7_fKtFBw@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>NOTE: The above approach is already implemented in the
'experimental' branch, excluding the handling of [6].</div>
<div><br>
</div>
<div><b>b) where to measure the latency and fops counts?</b></div>
<div><br>
</div>
<div>One possible way is to load io-stats between all
the nodes, but that has its own limitations. Mainly: how do we
configure options in each of these translators, and will having so
many translators slow down operations? (That is, each creates one extra
'frame' for every fop, so in a graph of 20 xlators there would be
20 extra frame creations for a single fop.)</div>
<div><br>
</div>
<div>I propose we handle this in the 'STACK_WIND/UNWIND' macros
themselves, and provide a placeholder in the
translator structure to store all this data. This is cleaner, and no
changes are required in the code base other than in 'stack.h' (and
some in 'xlator.h').</div>
<div><br>
</div>
<div>Also, we can provide 'option monitoring enable' (or
disable) as a default option for every translator, and
handle it at xlator_init() time itself. (This is not a
blocker for 4.0, but good to have.) The idea is proposed in GitHub
issue #304 [7]. </div>
<div><br>
</div>
<div>NOTE: This approach already works pretty well in the
'experimental' branch, excluding [7]. Depending on feedback,
we can improve it further.</div>
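<div><br>
</div>
<div>To make the idea in (b) concrete, here is a small Python sketch
of accounting done at the wind/unwind point, with the counters stored on the
translator itself. It is an illustration only: the real STACK_WIND/UNWIND
are asynchronous C macros with callbacks, while this sketch is synchronous,
and the class, the stack_wind() helper and the 'monitoring' attribute are
hypothetical names.</div>
<pre>
```python
import time


class Xlator:
    """Hypothetical translator node that stores its own per-fop counters."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child
        self.monitoring = True  # stands in for 'option monitoring enable'
        self.stats = {}  # maps fop name to a count and a total latency

    def handle(self, fop, payload):
        """Default behaviour: pass the fop down, or answer at the bottom."""
        if self.child is None:
            return payload
        return stack_wind(self.child, fop, payload)


def stack_wind(xl, fop, payload, clock=time.perf_counter_ns):
    """Call into xl and charge the round trip to xl's own counters."""
    if not xl.monitoring:
        return xl.handle(fop, payload)
    start = clock()
    try:
        return xl.handle(fop, payload)
    finally:
        entry = xl.stats.setdefault(fop, {"count": 0, "latency_ns": 0})
        entry["count"] += 1
        entry["latency_ns"] += clock() - start
```
</pre>
<div>No extra node or frame is added to the graph: each translator
simply ends up with a count and a cumulative latency per fop, measured from
the point where the call enters it to the point where the reply
returns.</div>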
<div><br>
</div>
<div><b>c) framework for xlators to provide private metrics</b></div>
<div><br>
</div>
<div>One possible solution is to use the statedump functions. But to
cause the least disruption to existing code, I propose adding two new
xlator methods, 'dump_metrics()' and 'reset_metrics()',
which can be dl_open()'d into the xlator structure.</div>
<div><br>
</div>
<div>'dump_metrics()' dumps the private metrics in the expected
format and is called from the global dump-metrics
framework. 'reset_metrics()' would be called from a CLI
command when someone wants to restart metrics from 0 to check
or validate a few things in a running cluster. This helps
debuggability.</div>
<div><br>
</div>
<div>Further feedback welcome.</div>
<div><br>
</div>
<div>NOTE: Sample code is already implemented in the
'experimental' branch, and the protocol/server xlator uses this
framework to dump metrics from the rpc layer and client
connections.</div>
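<div><br>
</div>
<div>The two hooks from (c) could look roughly like this. It is a
Python sketch with invented class and counter names, not the code in the
'experimental' branch. The framework looks each hook up by name and skips
translators that do not define it, which loosely mirrors resolving an
optional symbol when the translator is loaded.</div>
<pre>
```python
class Xlator:
    """Hypothetical translator; subclasses may define the two optional hooks."""

    def __init__(self, name):
        self.name = name


class ServerXlator(Xlator):
    """Example translator exposing private counters (names are invented)."""

    def __init__(self, name):
        super().__init__(name)
        self.connections = 0
        self.rpc_requests = 0

    def dump_metrics(self, emit):
        emit("%s.connections" % self.name, self.connections)
        emit("%s.rpc_requests" % self.name, self.rpc_requests)

    def reset_metrics(self):
        # Only the cumulative counter restarts; live connections are kept.
        self.rpc_requests = 0


def dump_all(xlators):
    """Global framework: collect 'key value' lines from xlators with the hook."""
    lines = []
    for xl in xlators:
        hook = getattr(xl, "dump_metrics", None)
        if hook is not None:
            hook(lambda key, value: lines.append("%s %d" % (key, value)))
    return lines


def reset_all(xlators):
    """What a CLI-triggered reset would call for every loaded translator."""
    for xl in xlators:
        hook = getattr(xl, "reset_metrics", None)
        if hook is not None:
            hook()
```
</pre>
<div>Translators without private metrics need no changes at all, which
is the "least disruption" property the proposal is after.</div>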
<div><br>
</div>
<div><b>d) format of the 'metrics' file.</b></div>
<div><br>
</div>
<div>If you want data that can be plotted on a graph, you need a key
(a string) and a value (a number), collected
over time. So this file should output data for monitoring
systems and not really for debuggability. We have
'statedump' for debuggability.</div>
<div><br>
</div>
<div>So, I propose a plain text file, where data would be dumped
like below.</div>
<div><br>
</div>
<div>```</div>
<div># anything starting from # would be treated as comment.</div>
<div>&lt;key&gt;&lt;space&gt;&lt;value&gt;</div>
<div># anything after the value would be ignored.</div>
<div>```</div>
<div>Any better solutions are welcome. Ideally, we should keep
this friendly for external projects to consume, like tendrl
[8], graphite, prometheus, etc. Also note that once we agree
on the format, it will be very hard to change, as external
projects will be using it.</div>
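<div><br>
</div>
<div>For a sense of what consuming this format involves, here is a
short Python sketch of a reader. It follows only the three rules stated
above (comment lines start with #, each data line is a key, a space and a
value, and anything after the value is ignored); the function name and the
choice to skip malformed lines silently are assumptions.</div>
<pre>
```python
def parse_metrics(text):
    """Parse 'key value' lines; skip comments, blanks and malformed lines."""
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 1:
            continue  # a key with no value
        try:
            # Only the first two fields matter; the rest of the line is ignored.
            metrics[fields[0]] = float(fields[1])
        except ValueError:
            continue  # the value was not a number
    return metrics
```
</pre>
<div>Because the result is a flat mapping from string keys to numbers,
forwarding it to graphite, prometheus or a similar system is a matter of
renaming keys rather than reparsing.</div>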
<div><br>
</div>
<div>I would like to hear the feedback from people who are
experienced with monitoring systems here.</div>
<div><br>
</div>
<div>NOTE: The above format works fine with the 'glustermetrics'
project [9] and is working decently in the 'experimental' branch. </div>
<div><br>
</div>
<div>------</div>
<div><br>
</div>
<div><b>* Discussions:</b></div>
<div><br>
</div>
<div>Let me know how you all want to take the discussion
forward. </div>
<div><br>
</div>
<div>Should we move to GitHub and discuss each issue there? Or
should I rebase and send the current patches from experimental
to the 'master' branch and discuss in our review system? Or
should we continue over email here?</div>
<div><br>
</div>
<div>Regards,</div>
<div>Amar</div>
<div><br>
</div>
<div>References:</div>
<div><br>
</div>
<div>[1] - <a
href="https://github.com/gluster/glusterfs/issues/137"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/137</a></div>
<div>[2] - <a
href="https://github.com/gluster/glusterfs/issues/141"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/141</a></div>
<div>[3] - <a
href="https://github.com/gluster/glusterfs/issues/275"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/275</a></div>
<div>[4] - <a
href="https://github.com/gluster/glusterfs/issues/168"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/168</a></div>
<div>[5] - <a
href="http://lists.gluster.org/pipermail/maintainers/2017-August/002954.html"
moz-do-not-send="true">http://lists.gluster.org/pipermail/maintainers/2017-August/002954.html</a>
(last email of the thread).</div>
<div>[6] - <a
href="https://github.com/gluster/glusterfs/issues/303"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/303</a></div>
<div>[7] - <a
href="https://github.com/gluster/glusterfs/issues/304"
moz-do-not-send="true">https://github.com/gluster/glusterfs/issues/304</a><br
clear="all">
<div>[8] - <a href="https://github.com/Tendrl"
moz-do-not-send="true">https://github.com/Tendrl</a></div>
<div>[9] - <a href="https://github.com/amarts/glustermetrics"
moz-do-not-send="true">https://github.com/amarts/glustermetrics</a></div>
<div><br>
</div>
-- <br>
<div class="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Amar Tumballi (amarts)<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<pre wrap="">_______________________________________________
maintainers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:maintainers@gluster.org">maintainers@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://lists.gluster.org/mailman/listinfo/maintainers">http://lists.gluster.org/mailman/listinfo/maintainers</a>
</pre>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
regards
Aravinda VK
<a class="moz-txt-link-freetext" href="http://aravindavk.in">http://aravindavk.in</a>
</pre>
</body>
</html>