[Gluster-Maintainers] [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)
Niels de Vos
ndevos at redhat.com
Mon Aug 13 19:34:56 UTC 2018
On Mon, Aug 13, 2018 at 02:32:19PM -0400, Shyam Ranganathan wrote:
> On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> > As a means of keeping the focus going and squashing the remaining tests
> > that were failing sporadically, request each test/component owner to,
> >
> > - respond to this mail changing the subject (testname.t) to the test
> > name that they are responding to (adding more than one in case they have
> > the same RCA)
> > - with the current RCA and status of the same
> >
> > List of tests and current owners as per the spreadsheet that we were
> > tracking are:
> >
> > ./tests/basic/stats-dump.t TBD
>
> This test fails as follows:
>
> 01:07:31 not ok 20 , LINENUM:42
> 01:07:31 FAILED COMMAND: grep .queue_size
> /var/lib/glusterd/stats/glusterfsd__d_backends_patchy1.dump
>
> 18:35:43 not ok 21 , LINENUM:43
> 18:35:43 FAILED COMMAND: grep .queue_size
> /var/lib/glusterd/stats/glusterfsd__d_backends_patchy2.dump
>
> Basically when grep'ing for a pattern in the stats dump it is not
> finding the second grep pattern of "queue_size" in one or the other bricks.
>
> The above seems incorrect, if it found "aggr.fop.write.count" it stands
> to reason that it found a stats dump, further there is a 2 second sleep
> as well in the test case and the dump interval is 1 second.
>
> The only reason for this to fail could hence possibly be that the file
> was just (re)opened (by the io-stats dumper thread) for overwriting
> content, at which point the fopen uses the mode "w+", and the file was
> hence truncated, and the grep CLI also opened the file at the same time,
> and hence found no content.
This sounds like a dangerous approach in any case. Truncating a file
while there are potential other readers should probably not be done. I
wonder if there is a good reason for this.
A safer solution would be to create a new temporary file, write the
stats to that and once done rename it to the expected filename. Any
process reading from the 'old' file will have its file-descriptor open
and can still read the previous, but consistent contents.
Niels
More information about the maintainers
mailing list