<div dir="ltr">Maybe I should change the log message from 'Checking rebalance status' to 'Logging rebalance status', because that is all the first 'rebalance status' command does: it executes 'rebalance status'. wait_for_rebalance_to_complete then validates that rebalance is 'completed' within 5 minutes (the default timeout). If that makes sense, I will make those changes along with introducing the delay between 'start' and 'status'.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 30, 2017 at 4:26 PM, Atin Mukherjee <span dir="ltr">&lt;<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Wed, Aug 30, 2017 at 4:23 PM, Shwetha Panduranga <span dir="ltr">&lt;<a href="mailto:spandura@redhat.com" target="_blank">spandura@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">This is the first check, where we just execute 'rebalance status'. That's the command which failed and hence failed the test case. If you look at the test case, the next step is wait_for_rebalance_to_complete (status --xml). This is where we keep executing rebalance status for up to 5 minutes, waiting for rebalance to complete. Even before waiting for rebalance, the first execution of the status command failed. Hence the test case failed. <br></div></blockquote><div><br></div></span><div>Cool. So there is still a problem in the test case. We can't assume rebalance status will report success immediately after rebalance start, and I've explained why in the earlier thread. 
Why do we need to do an intermediate check of rebalance status before going for wait_for_rebalance_to_complete ?</div><div><div class="h5"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"></div><div class="m_4032547901176452982gmail-HOEnZb"><div class="m_4032547901176452982gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 30, 2017 at 4:07 PM, Atin Mukherjee <span dir="ltr"><<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><pre class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-console-output"><span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> # Start Rebalance
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> <a href="http://g.log.info" target="_blank">g.log.info</a>("Starting Rebalance on the volume")
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> ret, _, _ = rebalance_start(self.mnode, self.volname)
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> self.assertEqual(ret, 0, ("Failed to start rebalance on the volume "
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> "%s", self.volname))
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> <a href="http://g.log.info" target="_blank">g.log.info</a>("Successfully started rebalance on the volume %s",
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> self.volname)
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span>
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> # Check Rebalance status
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> <a href="http://g.log.info" target="_blank">g.log.info</a>("Checking Rebalance status")
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> ret, _, _ = rebalance_status(self.mnode, self.volname)
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span> self.assertEqual(ret, 0, ("Failed to get rebalance status for the "
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span>> "volume %s", self.volname))
<span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223gmail-timestamp"><b>14:15:57</b> </span>E AssertionError: ('Failed to get rebalance status for the volume %s', 'testvol_distributed-dispersed<wbr>')</pre> <br></div>The above is the snippet extracted from <a href="https://ci.centos.org/view/Gluster/job/gluster_glusto/377/console" target="_blank">https://ci.centos.org/view/Glu<wbr>ster/job/gluster_glusto/377/co<wbr>nsole</a><br><br></div>If we had run the rebalance status check multiple times, I would have seen multiple rebalance_status failure entries, or at least a difference in the timestamps, wouldn't I?<br><br></div><div class="m_4032547901176452982gmail-m_8877050191075345148HOEnZb"><div class="m_4032547901176452982gmail-m_8877050191075345148h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 30, 2017 at 3:39 PM, Shwetha Panduranga <span dir="ltr">&lt;<a href="mailto:spandura@redhat.com" target="_blank">spandura@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div>Case:<br><br></div>1) add-brick while IO is in progress, then wait for 30 seconds<br><br></div>2) Trigger rebalance<br><br></div>3) Execute 'rebalance status' (there is no time delay between 2) and 3))<br><br></div>4) wait_for_rebalance_to_complete (this gets the XML output of rebalance status and keeps checking, every 10 seconds for up to 5 minutes, for the status to be 'complete'. The 5-minute wait time can be passed as a parameter)<br><br></div>At every step we check the exit status of the command. If the exit status is non-zero, we fail the test case. 
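<div><br>The polling in step 4 can be sketched roughly as below. This is only an illustration under stated assumptions, not the glusto-tests implementation: <code>wait_until_completed</code>, the <code>get_status</code> callable and its (exit_code, status) return shape are hypothetical names made up for this example. The sketch treats a non-zero exit code as "not ready yet" and retries it, instead of failing on the first attempt as step 3 does.</div>
<pre>
```python
import time
from typing import Callable, Tuple


def wait_until_completed(
    get_status: Callable[[], Tuple[int, str]],
    timeout: float = 300.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it reports 'completed' or the timeout elapses.

    get_status is a hypothetical callable returning (exit_code, status).
    A non-zero exit code is retried rather than treated as a failure.
    """
    waited = 0.0
    while True:
        ret, status = get_status()
        if ret == 0 and status == "completed":
            return True
        # Give up once another full interval would exceed the timeout.
        if waited + interval > timeout:
            return False
        sleep(interval)
        waited += interval
```
</pre>
<div>With the defaults this polls every 10 seconds for up to 5 minutes, matching the numbers in step 4. The <code>sleep</code> parameter exists only so the loop can be exercised without real delays.</div>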
<br><span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223HOEnZb"><font color="#888888"><br></font></span></div><span class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223HOEnZb"><font color="#888888">-Shwetha</font></span></div><div class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223HOEnZb"><div class="m_4032547901176452982gmail-m_8877050191075345148m_2808310302129524223h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 30, 2017 at 6:06 AM, Sankarshan Mukhopadhyay <span dir="ltr"><<a href="mailto:sankarshan.mukhopadhyay@gmail.com" target="_blank">sankarshan.mukhopadhyay@gmail<wbr>.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span>On Wed, Aug 30, 2017 at 6:03 AM, Atin Mukherjee <<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>> wrote:<br>
><br>
> On Wed, 30 Aug 2017 at 00:23, Shwetha Panduranga &lt;<a href="mailto:spandura@redhat.com" target="_blank">spandura@redhat.com</a>&gt;<br>
> wrote:<br>
>><br>
>> Hi Shyam, we are already doing it. We wait for rebalance status to be<br>
>> complete. We loop: we keep checking whether the status is complete for '20'<br>
>> minutes or so.<br>
><br>
><br>
> Are you saying that in this test rebalance status was executed multiple times<br>
> until it succeeded? If yes, then the test shouldn't have failed. Can I get<br>
> access to the complete set of logs?<br>
<br>
</span>Would you not prefer to look at the specific test under discussion as well?<br>
<span>______________________________<wbr>_________________<br>
Gluster-devel mailing list<br>
<a href="mailto:Gluster-devel@gluster.org" target="_blank">Gluster-devel@gluster.org</a><br>
</span><a href="http://lists.gluster.org/mailman/listinfo/gluster-devel" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-devel</a><br>
</blockquote></div><br></div>
</div></div><br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>