<div>Soumya,</div><div>I should have mentioned this in my first email: the VIP was always able to fail over to the remaining nodes. But in many of my tests, the failed-over IP did not carry over the NFS client state, so to the client it always looked like the NFS server was unavailable.</div><div><br></div><div>Thanks for your response. Any pointers on where to look would be great. Lately, I also found that the choice of NFS client played a significant role in my tests, unfortunately...</div><div><br></div><div><br><div class="gmail_quote"><div>On Tue, May 9, 2017 at 11:21 PM Soumya Koduri &lt;<a href="mailto:skoduri@redhat.com">skoduri@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

<br>

On 05/10/2017 04:18 AM, ML Wong wrote:<br>

&gt; While I&#39;m troubleshooting the failover of NFS-Ganesha, the failover is<br>

&gt; always successful when I shut down the NFS-Ganesha service while the<br>

&gt; OS is running. However, it always fails when I do either a shutdown -r<br>

&gt; or a power reset.<br>

&gt;<br>

&gt; During the failure, the NFS client just hung; you could not even do<br>

&gt; a &quot;df&quot; or &quot;ls&quot; on the mount point. The share would eventually fail over to<br>

&gt; the expected remaining node, usually after 15 - 20 minutes.<br>

<br>

The time taken by the pacemaker/corosync services to determine that a node is<br>

down is usually longer than in the service-down case. But yes, it<br>

shouldn&#39;t take more than a couple of minutes.<br>

<br>

Could you please check (maybe by constantly querying) how long it<br>

takes for the virtual IP to fail over, using either the &#39;pcs status&#39; or &#39;ip<br>

a&#39; command? If the IP failover happens quickly and it is just the NFS<br>

clients taking time to respond, then note that we have added use of the portblock<br>

feature to speed up client reconnects post failover. The fixes are<br>

available (from release-3.9). But before upgrading, I suggest checking whether<br>

the delay is with the IP failover or with client reconnects post failover.<br>

<br>

Thanks,<br>

Soumya<br>

<br>

&gt;<br>

&gt; Running on CentOS 7, gluster 3.7.1x, NFS-Ganesha 2.3.0.x. I currently<br>

&gt; don&#39;t have the resources to upgrade, but if all the experts here think<br>

&gt; that&#39;s the only route, I guess I will have to make a case ...<br>

&gt;<br>

&gt; Thanks in advance!<br>

&gt;<br>


</blockquote></div></div>
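<div><br></div><div>For the timing check suggested in the quoted reply (constantly querying until the virtual IP shows up), here is a minimal sketch of how it could be done from a surviving node. It assumes the <code>ip</code> command from iproute2 is on the PATH; the helper names <code>vip_present</code> and <code>wait_for_vip</code> and the address passed on the command line are illustrative only and not part of pacemaker or NFS-Ganesha. It only measures how long the IP takes to appear on the local node, not how long the NFS clients take to reconnect.</div>

```python
import subprocess
import sys
import time


def vip_present(ip_addr_output: str, vip: str) -> bool:
    """Return True if `vip` is listed as an IPv4 address in the given
    `ip -o -4 addr show` output (one address per line)."""
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # One-line format looks like:
        #   2: eth0    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0 ...
        if "inet" in fields:
            idx = fields.index("inet")
            if idx + 1 < len(fields):
                addr = fields[idx + 1].split("/")[0]
                if addr == vip:
                    return True
    return False


def wait_for_vip(vip: str, interval: float = 1.0, timeout: float = 1800.0):
    """Poll the local node until `vip` appears.

    Returns the elapsed seconds, or None if `timeout` passes first.
    The defaults are arbitrary assumptions, not recommended values.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        result = subprocess.run(
            ["ip", "-o", "-4", "addr", "show"],
            capture_output=True, text=True, check=True,
        )
        if vip_present(result.stdout, vip):
            return time.monotonic() - start
        time.sleep(interval)
    return None


if __name__ == "__main__":
    # Usage: python3 wait_vip.py <virtual-ip>
    # Start this on a surviving node right before resetting the other node.
    elapsed = wait_for_vip(sys.argv[1])
    if elapsed is None:
        print("VIP did not appear before the timeout")
    else:
        print(f"VIP appeared after {elapsed:.1f} s")
```

<div>If the reported time is short while the client still hangs for many minutes, that would point at client reconnects rather than the IP failover itself.</div>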