[Gluster-infra] regression machines reporting slowly ? here is the reason ...

Prasanna Kalever pkalever at redhat.com
Sun Apr 24 10:35:40 UTC 2016


On Sun, Apr 24, 2016 at 7:11 AM, Vijay Bellur <vbellur at redhat.com> wrote:
> On Sat, Apr 23, 2016 at 9:30 AM, Prasanna Kalever <pkalever at redhat.com> wrote:
>> Hi all,
>>
>> I noticed that our regression machines are reporting back really slowly,
>> especially CentOS and Smoke.
>>
>> I found that most of the slaves are marked offline; this could be the
>> biggest reason.
>>
>>
>
> Regression machines are scheduled to be offline if there are no active
> jobs. I wonder if the slowness is related to LVM or related factors as
> detailed in a recent thread?

Hi Vijay,

Honestly, I was not aware that the machines move to the offline state
by themselves; I thought they just went idle. Thanks for sharing this
information.
But we still need to reclaim most of the machines:


CentOS slaves:     Only (2/14) slaves are online [1]

slave20.cloud.gluster.org (online)
slave21.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave22.cloud.gluster.org (online)
slave23.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave24.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave25.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave26.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave27.cloud.gluster.org [Offline Reason: Disconnected by rastar :
rastar taking this down for pranith. Needed for debugging with tar
issue.  Apr 20, 2016 3:44:14 AM]
slave28.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave29.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]

slave32.cloud.gluster.org [Offline Reason: idle]
slave33.cloud.gluster.org [Offline Reason: idle]
slave34.cloud.gluster.org [Offline Reason: idle]

slave46.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]




Smoke slaves:      Only (2/15) slaves are online [2]

slave20.cloud.gluster.org (online)
slave21.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave22.cloud.gluster.org (online)
slave23.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave24.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave25.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave26.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave27.cloud.gluster.org [Offline Reason: Disconnected by rastar :
rastar taking this down for pranith. Needed for debugging with tar
issue.  Apr 20, 2016 3:44:14 AM]
slave28.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave29.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]

slave32.cloud.gluster.org [Offline Reason: idle]
slave33.cloud.gluster.org [Offline Reason: idle]
slave34.cloud.gluster.org [Offline Reason: idle]

slave46.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave47.cloud.gluster.org [Offline Reason: idle]



NetBSD slaves:     Only (6/11) slaves are online [3]

nbslave71.cloud.gluster.org (online)
nbslave72.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
nbslave74.cloud.gluster.org [Offline Reason: Disconnected by kaushal
Mar 21, 2016 10:59:43 PM]
nbslave75.cloud.gluster.org (online)
nbslave77.cloud.gluster.org (online)
nbslave79.cloud.gluster.org (online)

nbslave7c.cloud.gluster.org (online)
nbslave7g.cloud.gluster.org [Offline Reason: Disconnected by rastar :
anoop is using this to debug netbsd related issue Mar 29, 2016 2:27:20
AM]
nbslave7h.cloud.gluster.org [Offline Reason: Disconnected by kaushal
Apr 13, 2016 3:15:06 AM]
nbslave7i.cloud.gluster.org [Offline Reason: Disconnected by jdarcy :
Consistently generating spurious failures due to ping timeouts. This
costs people *hours* for a platform nobody uses except as a test for
perfused.
Feb 27, 2016 9:09:09 PM]
nbslave7j.cloud.gluster.org (online)









For CentOS regressions: 9/14 slaves are completely down [not just idle]
For Smoke: 9/15 slaves are completely down, which is a sizeable number
For NetBSD regressions: we need to reclaim 5/11 slaves, which is also a sizeable number
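
For anyone who wants to recompute these tallies, here is a minimal
Python sketch. It assumes the node status text is pasted in the same
form as the lists above (the CentOS list is used as input). The function
name `tally` and the three buckets (online / idle / down) are only
illustrative and are not something Jenkins itself provides.

```python
import re
from collections import Counter

# CentOS node statuses, pasted as they appear in the list above.
CENTOS = """\
slave20.cloud.gluster.org (online)
slave21.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave22.cloud.gluster.org (online)
slave23.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave24.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave25.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave26.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave27.cloud.gluster.org [Offline Reason: Disconnected by rastar :
rastar taking this down for pranith. Needed for debugging with tar
issue.  Apr 20, 2016 3:44:14 AM]
slave28.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave29.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
slave32.cloud.gluster.org [Offline Reason: idle]
slave33.cloud.gluster.org [Offline Reason: idle]
slave34.cloud.gluster.org [Offline Reason: idle]
slave46.cloud.gluster.org [Offline Reason: This node is offline
because Jenkins failed to launch the slave agent on it.]
"""

# A new entry starts with a hostname; any other line continues the previous one.
ENTRY = re.compile(r"^(\S+\.cloud\.gluster\.org)\s+(.*)$")


def tally(listing):
    """Count nodes that are online, idle-offline, or down for another reason."""
    entries = []
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif entries:
            # Wrapped "Offline Reason" text belongs to the last hostname seen.
            entries[-1][1] += " " + line

    counts = Counter()
    for _host, status in entries:
        if status == "(online)":
            counts["online"] += 1
        elif status == "[Offline Reason: idle]":
            counts["idle"] += 1
        else:
            counts["down"] += 1
    return counts


counts = tally(CENTOS)
print(f"online={counts['online']} idle={counts['idle']} "
      f"down={counts['down']} total={sum(counts.values())}")
# prints: online=2 idle=3 down=9 total=14
```

Only the "down" bucket needs someone to step in; the idle nodes should
come back on their own once jobs are queued, as Vijay pointed out.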










[1] https://build.gluster.org/label/rackspace_regression_2gb/
[2] https://build.gluster.org/label/smoke_tests/
[3] https://build.gluster.org/label/netbsd7_regression/







>
> -Vijay

