[Gluster-Maintainers] [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)

Niels de Vos ndevos at redhat.com
Mon Aug 13 19:34:56 UTC 2018


On Mon, Aug 13, 2018 at 02:32:19PM -0400, Shyam Ranganathan wrote:
> On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> > As a means of keeping the focus going and squashing the remaining tests
> > that were failing sporadically, request each test/component owner to,
> > 
> > - respond to this mail changing the subject (testname.t) to the test
> > name that they are responding to (adding more than one in case they have
> > the same RCA)
> > - with the current RCA and status of the same
> > 
> > List of tests and current owners as per the spreadsheet that we were
> > tracking are:
> > 
> > ./tests/basic/stats-dump.t		TBD
> 
> This test fails as follows:
> 
>   01:07:31 not ok 20 , LINENUM:42
>   01:07:31 FAILED COMMAND: grep .queue_size /var/lib/glusterd/stats/glusterfsd__d_backends_patchy1.dump
> 
>   18:35:43 not ok 21 , LINENUM:43
>   18:35:43 FAILED COMMAND: grep .queue_size /var/lib/glusterd/stats/glusterfsd__d_backends_patchy2.dump
> 
> Basically when grep'ing for a pattern in the stats dump it is not
> finding the second grep pattern of "queue_size" in one or the other bricks.
> 
> The above seems incorrect: if it found "aggr.fop.write.count", it stands
> to reason that it found a stats dump. Further, there is a 2 second sleep
> in the test case and the dump interval is 1 second.
> 
> The only reason for this to fail could hence possibly be that the file
> was just (re)opened (by the io-stats dumper thread) for overwriting
> content, at which point the fopen uses the mode "w+", and the file was
> hence truncated, and the grep CLI also opened the file at the same time,
> and hence found no content.

This sounds like a dangerous approach in any case. Truncating a file
while there are potential other readers should probably not be done. I
wonder if there is a good reason for this.
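
To illustrate the window described in the quoted RCA, here is a minimal
sketch in Python (not the C code io-stats actually uses). It assumes the
writer opens the dump with mode "w+" and that a reader such as grep opens
the same path before the writer has written anything. The file name and
the counter values are made up for the example.

```python
import os
import tempfile


def read_during_rewrite(path):
    """Return what a reader sees after the writer's open("w+") but before its first write."""
    with open(path, "w+") as writer:       # "w+" truncates the file at open time
        with open(path) as reader:         # stands in for a concurrent grep
            seen = reader.read()
        # Only now does the writer put the new dump in place.
        writer.write("aggr.fop.write.count 2\nqueue_size 0\n")
    return seen


workdir = tempfile.mkdtemp()
dump = os.path.join(workdir, "stats.dump")  # hypothetical file name

# A previous, complete dump is already on disk.
with open(dump, "w") as f:
    f.write("aggr.fop.write.count 1\nqueue_size 0\n")

seen = read_during_rewrite(dump)
print(repr(seen))
```

The reader gets an empty string even though a complete dump existed a
moment earlier, so a grep that lands in that window would fail.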

A safer solution would be to create a new temporary file, write the
stats to that, and once done rename it to the expected filename. Any
process reading from the 'old' file keeps its file descriptor open and
can still read the previous, but consistent, contents.
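
A minimal sketch of the write-then-rename approach, in Python rather than
the C used by io-stats. The helper name write_atomically, the file name
and the dump contents are hypothetical. It assumes a POSIX system, where
renaming within one filesystem replaces the target atomically and an
already-open descriptor keeps referring to the old file.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write text to a temporary file next to path, then rename it over path."""
    # The temporary file must live on the same filesystem as the target.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path), prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())           # make sure the data is on disk first
        os.replace(tmp, path)              # readers see the old or the new file, never a partial one
    except BaseException:
        os.unlink(tmp)                     # do not leave a stray temporary file behind
        raise


workdir = tempfile.mkdtemp()
dump = os.path.join(workdir, "stats.dump")  # hypothetical file name
write_atomically(dump, "queue_size 1\n")

reader = open(dump)                         # a reader opens the current dump
write_atomically(dump, "queue_size 2\n")    # the dumper publishes a new one
old_view = reader.read()                    # still the previous, complete contents
reader.close()

with open(dump) as f:
    new_view = f.read()                     # a fresh open sees the new dump

print(repr(old_view), repr(new_view))
```

With this scheme there is no moment at which the expected filename refers
to an empty or half-written file.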

Niels

