[Gluster-devel] uss.t in master doing bad things to our regression test VM's

Vijaikumar M vmallika at redhat.com
Thu Feb 19 12:27:57 UTC 2015


Hi Justin,

I have submitted patch 'http://review.gluster.org/#/c/9703/', which uses a
different approach to generate the random string.
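For context, the likely culprit is the unbounded `cat /dev/urandom | tr -dc a-zA-Z | fold -w 8` pipeline visible in the ps output below: in a multibyte locale, tr can discard every input byte, so the downstream reader never sees a complete line and cat spins at 100% CPU forever. A bounded sketch of the kind of replacement that avoids this (an illustration only, not necessarily the exact change in review 9703):

```shell
#!/bin/sh
# Bounded random-name generator (illustrative sketch, not the exact
# fix from review 9703).  Instead of streaming /dev/urandom without
# limit and hoping the filter produces a line, read a fixed number of
# bytes and hex-encode them: the pipeline always terminates and
# always yields a fixed-length string, regardless of locale.
rand_name=$(od -An -tx1 -N4 /dev/urandom | tr -d ' \n')
echo "$rand_name"    # 8 hex characters, e.g. 3fa91c07
```

Because `od -N4` reads exactly four bytes, there is no open-ended producer left running even if a later stage in the pipeline misbehaves.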

Thanks,
Vijay



On Thursday 19 February 2015 05:21 PM, Vijaikumar M wrote:
>
> On Wednesday 18 February 2015 10:42 PM, Justin Clift wrote:
>> Hi Vijaikumar,
>>
>> As part of investigating what is going wrong with our VMs in Rackspace,
>> I created several new VMs (11 of them) and started a full regression
>> test run on them.
>>
>> They're all hitting a major problem with uss.t.  Part of it does a "cat"
>> on /dev/urandom... which is taking several hours at 100% of a CPU. :(
>>
>> Here is output from "ps -ef f" on one of them:
>>
>> root  12094  1287  0 13:23 ? S   0:00  \_ /bin/bash /opt/qa/regression.sh
>> root  12101 12094  0 13:23 ? S   0:00      \_ /bin/bash ./run-tests.sh
>> root  12116 12101  0 13:23 ? S   0:01          \_ /usr/bin/perl /usr/bin/prove -rf --timer ./tests
>> root    382 12116  0 14:13 ? S   0:00              \_ /bin/bash ./tests/basic/uss.t
>> root   1713   382  0 14:14 ? S   0:00                  \_ /bin/bash ./tests/basic/uss.t
>> root   1714  1713 96 14:14 ? R 166:31                      \_ cat /dev/urandom
>> root   1715  1713  2 14:14 ? S   5:04                      \_ tr -dc a-zA-Z
>> root   1716  1713  9 14:14 ? S  16:31                      \_ fold -w 8
>>
>> And from top:
>>
>> top - 17:09:19 up  3:50,  1 user,  load average: 1.04, 1.03, 1.00
>> Tasks: 240 total,   3 running, 237 sleeping,   0 stopped,   0 zombie
>> Cpu0  :  4.3%us, 95.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu1  :  8.1%us, 15.9%sy,  0.0%ni, 76.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Mem:   1916672k total,  1119544k used,   797128k free,   114976k buffers
>> Swap:        0k total,        0k used,        0k free,   427032k cached
>>
>>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>>   1714 root      20   0 98.6m  620  504 R 96.0  0.0 169:00.94 cat
>>    137 root      20   0 36100 1396 1140 S 15.9  0.1  37:01.55 plymouthd
>>   1716 root      20   0 98.6m  712  616 S 10.0  0.0  16:46.55 fold
>>   1715 root      20   0 98.6m  636  540 S  2.7  0.0   5:08.95 tr
>>      9 root      20   0     0    0    0 S  0.3  0.0   0:00.59 ksoftirqd/1
>>      1 root      20   0 19232 1128  860 S  0.0  0.1   0:00.93 init
>>      2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
>>
>> Your name is on the commit which added the code, but that was months ago.
>>
>> No idea why it's suddenly being a problem.  Do you have any idea?
>>
>> I am going to shut down all of these new test VMs except one, which I can
>> give you (or anyone) access to, if that would help find and fix the
>> problem.
> I am not sure why this is suddenly causing a problem.
> I can remove the 'cat /dev/urandom' pipeline and use a different approach
> to test this particular case.
>
> Thanks,
> Vijay
>
>>
>> Btw, this is pretty important. ;)
>>
>> + Justin
>>
>> -- 
>> GlusterFS - http://www.gluster.org
>>
>> An open source, distributed file system scaling to several
>> petabytes, and handling thousands of clients.
>>
>> My personal twitter: twitter.com/realjustinclift
>>
>


