[Gluster-devel] [Outreachy] Need help with running bench test on Gluster cluster.

Wed Oct 12 16:48:32 UTC 2016

I took the liberty of adding [Outreachy] to the subject, which should 
make the thread easier for others to search for. If anyone objects, let 
me know and I will post a single response here removing it for future 
mails.

<rest inline>

On 10/11/2016 03:53 PM, Menaka Mohan wrote:
> Hi,
>
>
> I am Menaka M. I am interested in participating in this round of
> Outreachy under Gluster. I am new to this open source world. Kindly help
> me with the following query.

Welcome!

>
>
> I have set up the Gluster development environment with two servers and
> one client. I am trying to run the basic bench test on the Gluster
> cluster from this GitHub repo
> <https://github.com/gluster/gbench/tree/master/bench-tests/bt-0000-0001>. I
> also have IOZone installed. While trying to run the provided script, i
> get the following error. I was trying to identify the cause of the
> error. Kindly help me with that.

It looks like you have made progress (going by your IRC ping about 
clients.ioz and its contents). I guess you got CLIENTS, SERVERS and some 
other prerequisites right. It also looks like you either figured out how 
to set up rsh for iozone, or exported the RSH environment variable as 
'ssh'. So, good progress there.

When you get the time, it would be nice if you could add these 
additional prerequisites to the README. You would do that through a 
GitHub pull request [1].

>
> So, I learned more about IOZone and also the performance testing section
> in the Gluster docs. With that knowledge and to learn more, I have gone
> through the code and running iozone commands mentioned in the
> GlusterBench.py script individually.
>
> If I am asking a very basic thing, apologies. I will quickly learn things.

Nope! Not necessarily basic, as I got some failures too when using the 
latest iozone binaries/sources. One such failure was at [2]: the latest 
iozone reports results as "kB/..." rather than "KB/...", and so the 
parsing failed. Anyway, I made a local fix to my python script for the 
same (will push the change out soonish).
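
To illustrate the kind of fix I mean, here is a minimal sketch. It is 
not the actual change to GlusterBench.py: the sample result lines and 
the name parse_throughput are assumptions made for illustration. The 
only point is that the unit is matched case-insensitively.

```python
import re

# Assumed shape of an iozone throughput summary line; the exact text may
# differ between iozone versions.
THROUGHPUT_RE = re.compile(
    r"=\s*([0-9]+(?:\.[0-9]+)?)\s*kB/sec",
    re.IGNORECASE,  # accept both "KB/sec" and "kB/sec"
)


def parse_throughput(line):
    """Return the throughput as a float, or None if the line does not match."""
    match = THROUGHPUT_RE.search(line)
    return float(match.group(1)) if match else None


# Illustrative lines only, not copied from a real run.
old_style = "Children see throughput for 4 initial writers = 123456.78 KB/sec"
new_style = "Children see throughput for 4 initial writers = 123456.78 kB/sec"
print(parse_throughput(old_style), parse_throughput(new_style))
```

Returning None on a non-matching line mirrors the default seen in the 
log, so the caller can tell "no result found" apart from a real number.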

So coming to your test failure:

- Could you post your full log using something like fpaste [3] and share 
the link? That saves me from making assumptions.

- I hit a similar failure in my tests, see [4]. This run seems to have 
hit some ssh connection issue, because of which a sample was left as 
"None" (which is the default)

- In your case *all* samples are left as "None". So I suspect a more 
generic parsing failure of the results as obtained from *every* iozone run

- To detect what failed and why, it would be better to take a look at 
the entire log

- Additionally, you could also add a few prints within the 
extract_iozone_result function to debug the root cause of failure
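
As a rough sketch of the kind of prints I have in mind: the helper 
below is hypothetical (debug_iozone_output and its keyword argument are 
names made up for this mail, and I am not reproducing 
extract_iozone_result here). It assumes you have the captured iozone 
output as one string, and shows which lines, if any, look like result 
lines.

```python
def debug_iozone_output(raw_output, keyword="throughput"):
    """Print every line of captured output and flag likely result lines.

    Hypothetical helper: call it on whatever string the script captured
    from iozone, just before the parsing step.
    """
    candidates = []
    for number, line in enumerate(raw_output.splitlines(), start=1):
        is_candidate = keyword in line.lower()
        marker = "RESULT?" if is_candidate else "       "
        # repr() makes stray tabs or carriage returns visible.
        print("%s %3d: %r" % (marker, number, line))
        if is_candidate:
            candidates.append(line)
    if not candidates:
        print("No line mentions %r; iozone may have failed before "
              "reporting results." % keyword)
    return candidates
```

If no line is flagged, the problem is probably in running iozone itself 
(for example the remote shell setup) rather than in the parsing.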

>
> Regards,
> Menaka M
>
> ----------------------------------------------------------------------------------------------------------------------------------
>
> python GlusterBench.py -s 64 -f 10000 -n 5 -v
>
> Number threads = 4
> Client list = HadoopSlave4
>
> Running IOZone with 64KB record size and 4 threads,  Creating an 8 GB
> file with every thread.
> Running smallfile with 64KB files, creating 10000 files.
> Running squential IOZone tests, starting with sequential writes.
>
> About to gather sample --> 0
>
>     Iozone: Performance Test of File I/O
>             Version $Revision: 3.429 $
>         Compiled for 64 bit mode.
>         Build: linux-AMD64
>
>     Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
>                  Al Slater, Scott Rhine, Mike Wisner, Ken Goss
>                  Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
>                  Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
>                  Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,

Were there any additional output lines here? I do understand that you 
have kept the output terse later, but to understand parsing errors, the 
output from iozone would be helpful.

>
> Adding the current sample to the list: None

The above line is where the parsing failed: the routine should return a 
number, not the default value None.

> Dropping cache
> /root/sync-drop-caches.sh: 13: /root/sync-drop-caches.sh: Bad substitution

I think the default shell in your environment is not bash, and that 
leads to the above "Bad substitution". On a hunch I am providing a Stack 
Overflow link for the same [5]. This should not affect the test result, 
but I could be mistaken.
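
One way to check this hunch is sketched below. I am not describing how 
GlusterBench.py invokes the script; resolve_shell and bash_command are 
hypothetical helper names. The sketch prints what /bin/sh really points 
to, and builds a command line that runs the script under bash 
explicitly.

```python
import os


def resolve_shell(path="/bin/sh"):
    """Return the real binary behind `path` (for example dash or bash)."""
    return os.path.realpath(path)


def bash_command(script_path):
    """Build an argument list that runs the script under bash explicitly."""
    return ["bash", script_path]


print("/bin/sh resolves to:", resolve_shell())
print("Suggested invocation:",
      " ".join(bash_command("/root/sync-drop-caches.sh")))
```

If /bin/sh resolves to something other than bash, running the script by 
hand with the suggested invocation should show whether the error goes 
away.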

> Cleaning up files.
>
> About to gather sample --> 1
>
>     /* I removed the Iozone and Contributors header */
>
> Adding the current sample to the list: None
> Dropping cache
> /root/sync-drop-caches.sh: 13: /root/sync-drop-caches.sh: Bad substitution
> Cleaning up files.
>
> About to gather sample --> 2
>
> Adding the current sample to the list: None
> Dropping cache
> /root/sync-drop-caches.sh: 13: /root/sync-drop-caches.sh: Bad substitution
> Cleaning up files.
>
> About to gather sample --> 3
>
> Adding the current sample to the list: None
> Dropping cache
> /root/sync-drop-caches.sh: 13: /root/sync-drop-caches.sh: Bad substitution
> Cleaning up files.
>
> About to gather sample --> 4
>
> Adding the current sample to the list: None
> Dropping cache
> /root/sync-drop-caches.sh: 13: /root/sync-drop-caches.sh: Bad substitution
> No cleanup on 4 iteration.
>
> The results for sequential writes are: [None, None, None, None, None]

And it looks like every instance has had some problem, causing the 
entire array of results to contain only None, as printed above.
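
The TypeError at the end of your log is only a symptom of those None 
entries. As a sketch, and not the script's actual find_average, an 
averaging routine could report the missing samples directly instead. 
The name find_average_checked is hypothetical.

```python
def find_average_checked(samples):
    """Average the samples, failing with a clear message if any is missing."""
    if not samples:
        raise ValueError("no samples were collected")
    missing = [index for index, value in enumerate(samples) if value is None]
    if missing:
        raise ValueError(
            "no result parsed for sample(s) %s; check the iozone output"
            % missing
        )
    # float() accepts both numbers and numeric strings such as "123456.78".
    return sum(float(value) for value in samples) / len(samples)
```

With your results, [None, None, None, None, None], this would stop with 
a message naming samples 0 to 4 rather than a TypeError from int().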

>
> Traceback (most recent call last):
>   File "GlusterBench.py", line 255, in <module>
>     main()
>   File "GlusterBench.py", line 76, in main
>     average_seq_write = find_average(result1)
>   File "GlusterBench.py", line 250, in find_average
>     total = total + int(samples_in[x])
> TypeError: int() argument must be a string or a number, not 'NoneType'
>
> -----------------------------------------------------------------------------------------------------------------------------------------
>

Shyam

[1] GitHub pull requests: 
https://help.github.com/articles/about-pull-requests/

[2] Parsing assert in the script, when using latest iozone: 
https://github.com/gluster/gbench/blob/master/bench-tests/bt-0000-0001/GlusterBench.py#L178

[3] Fedora paste: https://paste.fedoraproject.org/

[4] Similar error in my setup: 
http://paste.fedoraproject.org/449085/89758147/

[5] Possible help for the bash "Bad substitution" error: 
http://stackoverflow.com/questions/20615217/bash-bad-substitution