[Gluster-Maintainers] [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)

Shyam Ranganathan srangana at redhat.com
Mon Aug 13 18:32:19 UTC 2018

On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
> ./tests/basic/stats-dump.t		TBD

This test fails as follows:

  01:07:31 not ok 20 , LINENUM:42
  01:07:31 FAILED COMMAND: grep .queue_size

  18:35:43 not ok 21 , LINENUM:43
  18:35:43 FAILED COMMAND: grep .queue_size

Basically, when grepping the stats dump, the test does not find the
second pattern, "queue_size", in the dump of one or the other brick.

That seems wrong: if the test found "aggr.fop.write.count", it stands to
reason that it found a stats dump. Further, the test case sleeps for 2
seconds and the dump interval is 1 second, so a dump should already have
been written by the time the grep runs.

The only possible reason I can see for this failure is that the io-stats
dumper thread had just (re)opened the file to overwrite its content. That
fopen uses the mode "w+", which truncates the file, so if the grep CLI
opened the file at the same moment it found no content.
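
To illustrate the suspected window, here is a minimal sketch in Python
(not the io-stats C code). It relies only on the fact that mode "w+"
truncates an existing file at open time, before anything is written. The
helper name and the dump contents are made up for the example; only the
two key names come from the test.

```python
import os
import tempfile


def read_during_rewrite(path, new_content):
    """Reopen path with mode "w+" and read it back before writing.

    Hypothetical helper: it stands in for the dumper thread reopening
    the dump file while a reader (the grep) opens it at the same time.
    """
    with open(path, "w+") as dump:       # file is truncated right here
        with open(path) as reader:       # the "grep" arriving mid-rewrite
            seen_mid_rewrite = reader.read()
        dump.write(new_content)
    with open(path) as reader:           # a reader arriving after the dump
        seen_after = reader.read()
    return seen_mid_rewrite, seen_after


# Made-up dump contents, assumed only for this illustration.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("aggr.fop.write.count: 1\nqueue_size: 0\n")

mid, after = read_during_rewrite(
    path, "aggr.fop.write.count: 2\nqueue_size: 0\n")
os.remove(path)

print(repr(mid))                 # empty string: the old content is gone
print("queue_size" in after)     # True once the new dump is written
```

A reader that lands between the open and the write sees an empty file,
which would match a grep that finds nothing.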

I will be filing a bug and posting a fix that retries the grep in a loop,
to avoid the potential race that I see above as the cause.
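
Roughly, the retry idea looks like the sketch below. It is written in
Python only to show the shape; the real change would go into the shell
test, and the function name, attempt count and delay are all assumptions.

```python
import os
import tempfile
import threading
import time


def grep_with_retry(path, pattern, attempts=10, delay=0.1):
    """Return True once pattern shows up in path, retrying on a miss.

    Hypothetical helper: a miss may just mean the file was caught while
    being rewritten, so wait briefly and look again before giving up.
    """
    for attempt in range(attempts):
        try:
            with open(path) as f:
                if pattern in f.read():
                    return True
        except FileNotFoundError:
            pass                          # not written yet, retry
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


def write_dump(path):
    """Stand-in for the dumper thread finishing its write a bit later."""
    with open(path, "w") as f:
        f.write("queue_size: 0\n")        # made-up dump content


# Start from an empty file, as if the dump had just been truncated.
fd, path = tempfile.mkstemp()
os.close(fd)
writer = threading.Timer(0.1, write_dump, args=(path,))
writer.start()

found = grep_with_retry(path, "queue_size")
writer.join()
os.remove(path)

missing = grep_with_retry(path, "queue_size", attempts=2, delay=0)
print(found, missing)
```

The loop still fails the check if the pattern never appears, so a real
regression would not be hidden, only the short truncation window.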

Other ideas/causes welcome!

Also, this has failed in both mux and non-mux environments.
Runs with failure:
(no logs)

(has logs)

