[Gluster-devel] bug-1432542-mpx-restart-crash.t failures

Thu Aug 2 16:43:36 UTC 2018

On 08/01/2018 11:10 PM, Nigel Babu wrote:
> Hi Shyam,
> 
> Amar and I sat down to debug this failure[1] this morning. There was a
> bit of fun looking at the logs. It looked like the test restarted
> itself. The first log entry is at 16:20:03. This test has a timeout of
> 400 seconds, which puts the deadline at around 16:26:43.
> 
> However, if you account for the fact that we log from the second step or
> so, it looks like the test timed out and we restarted it. The first log
> entry is from a few steps in, so this makes sense. I think your patch[2] to
> increase the timeout to 800 seconds is the right way forward.
> 
> The last step before the timeout is this:
> [2018-07-30 16:26:29.160943]  : volume stop patchy-vol17 : SUCCESS
> [2018-07-30 16:26:40.222688]  : volume delete patchy-vol17 : SUCCESS
> 
> There are 20 volumes, so it really needs at least a 90 second bump. I'm
> estimating 30 seconds per volume to clean up. You probably want to add
> some extra time so it passes on lcov as well. So right now the 800 second
> clean up looks good.
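
As a rough check of the arithmetic quoted above, here is a minimal sketch. It assumes the volumes are numbered 1 through 20 and cleaned up in order, so that three remain after patchy-vol17. The 30 seconds per volume is Nigel's estimate, not a measurement, and the variable names are only for illustration. Since the first log entry is from a few steps in, the computed deadline is approximate.

```python
from datetime import datetime, timedelta

# Timestamps copied from the log lines quoted above.
first_log_entry = datetime(2018, 7, 30, 16, 20, 3)
last_delete = datetime(2018, 7, 30, 16, 26, 40)  # volume delete patchy-vol17

timeout = timedelta(seconds=400)            # timeout before the patch
per_volume_cleanup = timedelta(seconds=30)  # estimate, not measured

total_volumes = 20
last_cleaned = 17  # assumption: volumes are cleaned up in order, 1..20

deadline = first_log_entry + timeout
slack_at_last_delete = deadline - last_delete
remaining_cleanup = (total_volumes - last_cleaned) * per_volume_cleanup

print("deadline:", deadline.time())
print("slack left after vol17 delete:", slack_at_last_delete)
print("estimated cleanup still needed:", remaining_cleanup)
```

Under those assumptions, about 90 seconds of cleanup were still pending when the deadline arrived, which matches the "at least a 90 second bump" estimate.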

Unfortunately, the timeout bump still does not clear lcov; see:
https://build.gluster.org/job/line-coverage/401/console
https://build.gluster.org/job/line-coverage/400/console
https://build.gluster.org/job/line-coverage/406/console

The first test passes, but then it fails again as part of the full run.

The patch also pushes up the EXPECT_WITHIN to 120 seconds... :(

> 
> [1]: https://build.gluster.org/job/regression-test-burn-in/4051/
> [2]: https://review.gluster.org/#/c/20568/2
> -- 
> nigelb