<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Feb 28, 2018 at 2:49 AM, J. Bruce Fields <span dir="ltr">&lt;<a href="mailto:bfields@fieldses.org" target="_blank">bfields@fieldses.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Mon, Feb 26, 2018 at 11:20:49AM +0530, Raghavendra G wrote:<br>

&gt; On Fri, Feb 23, 2018 at 6:33 AM, J. Bruce Fields &lt;<a href="mailto:bfields@fieldses.org">bfields@fieldses.org</a>&gt;<br>

&gt; wrote:<br>

&gt;<br>

&gt; &gt; On Thu, Feb 22, 2018 at 01:17:58PM +0530, Raghavendra G wrote:<br>

</span><div><div class="h5">&gt; &gt; &gt; For a local filesystem, we may not end up in ESTALE errors. But, when<br>

&gt; &gt; rmdir<br>

&gt; &gt; &gt; is executed from multiple clients of a network fs (like NFS, Glusterfs),<br>

&gt; &gt; &gt; unlink or rmdir can easily fail with ESTALE as the other rm invocation<br>

&gt; &gt; &gt; could&#39;ve deleted it. I think this is what has happened in bugs like:<br>

&gt; &gt; &gt; <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1546717" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/<wbr>show_bug.cgi?id=1546717</a><br>

&gt; &gt; &gt; <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1245065" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/<wbr>show_bug.cgi?id=1245065</a><br>

&gt; &gt; &gt;<br>

&gt; &gt; &gt; This in fact was the earlier motivation to convert ESTALE into ENOENT, so<br>

&gt; &gt; &gt; that rm would ignore it. Now that I reverted the fix, looks like the bug<br>

&gt; &gt; &gt; has promptly resurfaced :)<br>

&gt; &gt; &gt;<br>

&gt; &gt; &gt; There is one glitch though. Bug 1245065 mentions that some parts of<br>

&gt; &gt; &gt; directory structure remain undeleted. From my understanding, at least one<br>

&gt; &gt; &gt; instance of rm (which is racing ahead of all others causing others to<br>

&gt; &gt; &gt; fail), should&#39;ve deleted the directory structure completely. Though, I<br>

&gt; &gt; need<br>

&gt; &gt; &gt; to understand the directory traversal done by rm to find whether there<br>

&gt; &gt; are<br>

&gt; &gt; &gt; cyclic dependencies between two rms causing both of them to fail.<br>

&gt; &gt;<br>

&gt; &gt; I don&#39;t see how you could avoid that.  The clients are each caching<br>

&gt; &gt; multiple subdirectories of the tree, and there&#39;s no guarantee that 1<br>

&gt; &gt; client has fresher caches of every subdirectory.  There&#39;s also no<br>

&gt; &gt; guarantee that the client that&#39;s ahead stays ahead--another client that<br>

&gt; &gt; sees which objects the first client has already deleted can leapfrog<br>

&gt; &gt; ahead.<br>

&gt; &gt;<br>

&gt;<br>

&gt; What are the drawbacks of applications (like rm) treating ESTALE as equivalent<br>

&gt; to ENOENT? It seems to me, from the application perspective they both<br>

&gt; convey similar information. If rm could ignore ESTALE just like it does for<br>

&gt; ENOENT, probably we don&#39;t run into this issue.<br>

<br>

</div></div>That might work.  Or, maybe better, take &quot;ESTALE&quot; as a sign that the<br>

parent directory is gone and give up on trying to remove further entries<br>

from it.<br>

<br>

Could you remind me why this is a priority, anyway?  A quick look at the<br>

bz&#39;s suggests they&#39;re both artificial tests.  Were they motivated by<br>

a customer problem originally?  Apologies if we&#39;ve already been over<br>

this....<br></blockquote><div><br></div><div>It&#39;s an artificial test, not motivated by any user&#39;s real-world scenario. But I was not sure whether such a use case could show up in real-world workloads, hence I was trying to debug it. Have you seen such real-world workloads on NFS?<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class="HOEnZb"><font color="#888888"><br>

--b.<br>

</font></span></blockquote></div><br></div></div>
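<div>To make the two options discussed above concrete, here is a rough Python sketch (not what rm does today, and not a proposed patch). It assumes ESTALE is treated like ENOENT, and that an ESTALE on unlink means the parent directory is probably gone, so the loop stops removing further entries from it. The names remove_tree and ALREADY_GONE and the injectable unlink/rmdir parameters are hypothetical, used only for illustration.</div>
<pre>
```python
import errno
import os

# Assumption under discussion: ESTALE is treated like ENOENT, i.e. as
# "another client already removed this object".
ALREADY_GONE = {errno.ENOENT, errno.ESTALE}


def remove_tree(path, unlink=os.unlink, rmdir=os.rmdir):
    """Best-effort recursive remove that tolerates concurrent removers.

    Returns a list of (path, errno) pairs for the errors that were ignored.
    Any other error is re-raised. unlink/rmdir are injectable only so the
    ESTALE path can be exercised on a local filesystem.
    """
    ignored = []

    def attempt(op, target):
        # Run one removal call; return its errno if it was ignored, else None.
        try:
            op(target)
        except OSError as e:
            if e.errno not in ALREADY_GONE:
                raise
            ignored.append((target, e.errno))
            return e.errno
        return None

    def walk(directory):
        try:
            names = os.listdir(directory)
        except OSError as e:
            if e.errno not in ALREADY_GONE:
                raise
            ignored.append((directory, e.errno))
            return
        for name in names:
            child = os.path.join(directory, name)
            if os.path.isdir(child) and not os.path.islink(child):
                walk(child)
            elif attempt(unlink, child) == errno.ESTALE:
                # Variant from the quoted reply: take ESTALE as a sign that
                # the parent is gone and give up on its remaining entries.
                break
        attempt(rmdir, directory)

    if os.path.isdir(path) and not os.path.islink(path):
        walk(path)
    else:
        attempt(unlink, path)
    return ignored
```
</pre>
<div>With this shape, a racing remover that loses ends up with a non-empty list of ignored errors rather than a failure, which is the behaviour the ENOENT conversion was originally trying to get.</div>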