[Gluster-devel] Regression tests and improvement ideas
Raghavendra Talur
rtalur at redhat.com
Wed Jun 17 10:56:13 UTC 2015
Hi,
MSV Bhat and I had presented in Gluster Design Summit some ideas about
improving our testing infrastructure.
Here is the link to the slides: http://redhat.slides.com/rtalur/distaf#
Here are the same suggestions,
1. *A .t file for a bug*
When a community user discovers a bug in Gluster, they contact us over
IRC or email and eventually end up filing a bug in Bugzilla.
Many times we ourselves find a bug that we don't know the fix for, or
that is not in our own module, and we too end up filing a bug in
Bugzilla.
It would be more helpful to instead write a .t test that reproduces the
bug and add it to, say, a /tests/bug/yet-to-be-fixed/ folder in the
gluster repo. As part of bug triage we could do the same for bugs
filed by community users.
*What do we get?*
a. It becomes very easy for a new developer to pick up that bug and fix
it: when the .t passes, the bug is fixed.
b. The regression runs on daily patch sets would skip this folder, but on
a nightly basis we could run the tests in this folder to see if any of
them got fixed while we were fixing something else. Yay!
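Such a reproducer could follow the same conventions as the existing tests under tests/ (a sketch only: the volume layout, the include.rc path depth, and the final assertion below are placeholders, not an actual bug):

```shell
#!/bin/bash
# Hypothetical reproducer for an as-yet-unfixed bug, using the usual
# helpers from tests/include.rc (TEST, EXPECT, cleanup). Everything
# below the mount line is a placeholder for the real symptom.
. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc

cleanup;

TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0,1}
TEST $CLI volume start $V0
TEST glusterfs --volfile-server=$H0 --volfile-id=$V0 $M0

# The buggy behaviour goes here: this check fails today, and the whole
# .t starts passing exactly when the bug is fixed.
TEST touch $M0/testfile
EXPECT "testfile" ls $M0    # placeholder assertion for the actual symptom

cleanup;
```

This only runs on a regression host with a Gluster build installed, which is exactly where the nightly yet-to-be-fixed run would execute it.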
2. *New gerrit/review work flow*
Our gerrit setup currently averages 2 hours per regression run.
Due to the long queue of commits, the turnaround time is around 4-6
hours. Kaushal has proposed ways to reduce the turnaround time further
in this thread: http://www.spinics.net/lists/gluster-devel/msg15798.html.
3. *Make sure tests can be done in docker and run in parallel*
To reduce the time for one test run from 2 hours, we can look at running
tests in parallel. I did a prototype and got the test time down to 40
minutes on a VM with 16 GB RAM and 4 cores.
Currently blocked on:
Some of the tests fail in docker while they pass in a VM.
Note that it is the .t files that fail; Gluster itself works fine in
docker. I need some help on this. More on this in a mail I will be
sending to gluster-devel later today.
*What do we get?*
Running 4 docker containers on our laptops can by itself reduce the time
taken by test runs to about 90 minutes. Running them on powerful
machines brings it down to 40 minutes, as seen in the prototype.
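The parallel run could be as simple as a round-robin split of the .t files across N containers; the image name gluster/regression and the prove invocation below are assumptions, not the actual prototype:

```shell
#!/bin/bash
# Round-robin split of test files across N workers; prints the file
# names assigned to worker i (0-based). Reads names on stdin.
split_chunk() {  # usage: split_chunk N i
    awk -v n="$1" -v i="$2" '(NR - 1) % n == i'
}

N=4
find tests -name '*.t' 2>/dev/null | sort > /tmp/all-tests

# Run each chunk in its own container. The image name gluster/regression
# is a placeholder; the real image would carry the build and test deps.
if command -v docker >/dev/null 2>&1 && [ -s /tmp/all-tests ]; then
    for i in $(seq 0 $((N - 1))); do
        split_chunk "$N" "$i" < /tmp/all-tests > "/tmp/chunk.$i"
        docker run --rm -v "$PWD":/src -v "/tmp/chunk.$i":/chunk \
            gluster/regression \
            bash -c 'cd /src && while read -r t; do prove "$t"; done < /chunk' &
    done
    wait   # wall time ~= slowest chunk, not the sum of all chunks
fi
```

The round-robin split keeps the chunks roughly equal in count; with the run times from the .def files proposed below we could balance by duration instead.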
4. *Test definitions for every .t*
Maybe the time has come to upgrade our test infra to have tests with
test definitions. Every .t file could have a corresponding .def file,
a JSON/YAML/XML config that defines the requirements of the test:
- type of volume
- whether special knowledge of brick size is required
- which repo source folders should trigger this test
- running time
- test RUN level
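For example, a .def file for a DHT test might look like this (the field names and values are illustrative, not a settled schema):

```yaml
# tests/dht/rebalance.t.def -- hypothetical companion to rebalance.t
volume:
  type: distribute
  bricks: 4
  special_brick_size: false    # no special knowledge of brick size needed
triggers:                      # source folders that should trigger this test
  - xlators/cluster/dht/
  - libglusterfs/
running_time: 300              # seconds, used for ordering test runs
run_level: 2                   # 0 = basic smoke test, higher = more complex
```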
*What do we get?*
a. Run a partial set of tests on a commit, based on git log and the test
definitions, and run the complete regression nightly.
b. Order the test run by running times. Combined with the
fail-on-first-failure setting we have, this makes us fail as early as
possible.
c. Order tests by functionality level: a basic mount.t test should run
before a complex DHT test that makes use of a FUSE mount. Again, this
helps us fail as early as possible in failure scenarios.
d. With knowledge of the type of volume and the number of bricks
required, we can re-use volumes created for one test in subsequent
tests. Even the cleanup() function we have takes time. DiSTAF already
has an equivalent function, use_existing_else_create_new.
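Benefit (a) could be sketched as: take the changed paths from git, match them against each test's trigger folders, and run only the tests that match (the test/trigger pairs below are hypothetical examples, not real mappings):

```shell
#!/bin/bash
# For each "test:trigger-prefix" pair, print the test if any changed
# path (read from stdin, one per line) starts with the trigger prefix.
# The pairs here are hypothetical; in practice they come from the .def
# files.
tests_for_paths() {
    local pairs=(
        "tests/dht/rebalance.t:xlators/cluster/dht/"
        "tests/afr/self-heal.t:xlators/cluster/afr/"
        "tests/basic/mount.t:libglusterfs/"
    )
    local changed pair t prefix
    changed=$(cat)
    for pair in "${pairs[@]}"; do
        t=${pair%%:*}
        prefix=${pair#*:}
        if printf '%s\n' "$changed" | grep -q "^$prefix"; then
            printf '%s\n' "$t"
        fi
    done
}

# In CI this would be fed by something like:
#   git diff --name-only origin/master... | tests_for_paths
```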
5. *Testing GFAPI*
We don't have a good test framework for gfapi as of today.
However, with the recent design proposal at
https://docs.google.com/document/d/1yuRLRbdccx_0V0UDAxqWbz4g983q5inuINHgM1YO040/edit?usp=sharing
and
Craig Cabrey from Facebook developing a set of coreutils using
GFAPI, as mentioned here:
http://www.spinics.net/lists/gluster-devel/msg15753.html
I guess we have it well covered :)
Reviews and suggestions welcome!
Thanks,
Raghavendra Talur