[Gluster-devel] Re-thinking gluster regression logging

Mon Jul 2 11:58:19 UTC 2018

Hello folks,

Deepshikha is working on getting the distributed-regression testing into
production. This is a good time to discuss how we log our regression runs.
We tend to go with the approach of "get as many logs as possible" and then
we try to make sense of them when something fails.

In a setup where we distribute the tests to 10 machines, that means
fetching logs from 10 machines and trying to make sense of them. Granted,
the number of files will most likely remain the same, since a successful
test is only run once. A failed test, however, is re-attempted two more
times on different machines, so we will now have duplicates.

I have a couple of suggestions and I'd like to see what people think.
1. We stop creating a tar of tars for the logs and just tar the
/var/log/glusterfs folder at the end of the run. That will probably achieve
better compression.
2. We could stream the logs to a service like ELK that we host. This means
no more tarballs. It also lets us test any logging improvements we plan to
make for Gluster in one place.
3. I've been looking at Citellus[1] to write parsers that help us identify
critical problems. This could be a way for us to build a repo of parsers
that can identify common gluster issues.
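
As a rough illustration of suggestion 1, here is a minimal Python sketch that
packs a whole log directory into one gzip-compressed tarball in a single pass.
The function name archive_logs and the output path are hypothetical and not
part of the current regression scripts; only /var/log/glusterfs comes from the
suggestion above.

```python
import tarfile
from pathlib import Path


def archive_logs(log_dir, dest):
    """Pack log_dir (recursively) into one gzip-compressed tarball at dest.

    Hypothetical helper: a single archive of the whole directory, instead
    of a tarball that contains other tarballs.
    """
    log_dir = Path(log_dir)
    with tarfile.open(dest, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name
        tar.add(log_dir, arcname=log_dir.name)
    return Path(dest)


# Example call at the end of a run (output path is an assumption):
# archive_logs("/var/log/glusterfs", "/tmp/glusterfs-logs.tar.gz")
```

Because every file goes through one compression stream, the compressor can
exploit repetition across log files, which is the likely source of the better
compression mentioned in suggestion 1.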

Perhaps our solution would be a mix of 2 and 3. Ideally, I'd like us to
avoid archiving tarballs to debug regression issues in the future.
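
To give a feel for what a parser in the spirit of suggestion 3 might look
like, here is a small standalone Python sketch. It does not use the Citellus
plugin interface. It assumes log lines of the form "[timestamp] LEVEL rest",
with a single-letter level such as E or C; both that shape and the name
find_problems are assumptions made for illustration.

```python
import re

# Assumed line shape: "[timestamp] LEVEL rest-of-line", LEVEL is one letter.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+(?P<rest>.*)$")


def find_problems(lines, levels=("E", "C")):
    """Return (line_number, level, rest) for lines at the given levels.

    Hypothetical helper: lines that do not match the assumed shape are
    skipped rather than treated as errors.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            hits.append((number, match.group("level"), match.group("rest")))
    return hits
```

A collection of small checks like this, each looking for one known failure
signature, is the kind of parser repo suggestion 3 has in mind.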

[1]: https://github.com/citellusorg/citellus

-- 
nigelb