[Gluster-devel] Moratorium on new patch acceptance

Vijaikumar M vmallika at redhat.com
Thu May 21 13:33:23 UTC 2015



On Thursday 21 May 2015 06:48 PM, Shyam wrote:
> On 05/21/2015 04:04 AM, Vijaikumar M wrote:
>> On Tuesday 19 May 2015 09:50 PM, Shyam wrote:
>>> On 05/19/2015 11:23 AM, Vijaikumar M wrote:
>>>
>>> Did that (in the attached script that I sent) and it still failed.
>>>
>>> Please note:
>>> - This dd command passes (or fails with EDQUOT)
>>>   - dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=512 count=10240
>>> oflag=append oflag=sync conv=fdatasync
>>>   - We can even drop append and fdatasync, as sync sends a commit per
>>> block written which is better for the test and quota enforcement,
>>> whereas fdatasync does one in the end and sometimes fails (with larger
>>> block sizes, say 1M)
>>>   - We can change bs to [512 - 256k]
>>>
>>> - This dd command fails (or writes all the data)
>>>   - dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=3M count=2 oflag=append
>>> oflag=sync conv=fdatasync
>>>
>>> The reasoning is that when we write a larger block size, NFS sends in
>>> multiple 256k chunks to write and then sends the commit before the
>>> next block. As a result if we exceed quota in the *last block* that we
>>> are writing, we *may* fail. If we exceed quota in the last but one
>>> block we will pass.
>>>
>>> Hope this shorter version explains it better.
>>>
>>> (VijayM is educating me on quota (over IM), and it looks like the
>>> quota update happens as a synctask in the background, so post the
>>> flush (NFS commit) we may still have a race)
>>>
>>> Post education solution:
>>> - Quota updates the on-disk xattr in a synctask; as a result, if we
>>> exceed quota in the (n-1)th block, there is no guarantee that the nth
>>> block will fail, as the synctask may not have completed
>>>
>>> So I think we need to do the following for the quota based tests
>>> (expanding on the provided patch, 
>>> http://review.gluster.org/#/c/10811/ )
>>> - First dd that exceeds quota (with either oflag=sync or
>>> conv=fdatasync so that we do not see any flush behind or write behind
>>> effects) to be done without checks
>>> - Next check in an EXPECT_WITHIN that quota is exceeded (maybe add
>>> checks on the just created/appended file w.r.t its minimum size that
>>> would make it exceed the quota)
>>> - Then do a further dd to a new file or append to an existing file to
>>> get the EDQUOT error
>>> - Proceed with whatever the test case needs to do next
>>>
>>> Suggestions?
>>>
>>
>> Here is my analysis of the spurious failure in the test case
>> tests/bugs/distribute/bug-1161156.t.
>> In release-3.7, marker was refactored to use synctask for background
>> accounting.
>> I ran the tests below with different combinations and found that
>> parallel writes cause the spurious failure.
>> I have filed bug 1223658 to track the parallel write issue with quota.
>
> I agree with the observations; they tally with mine. Just one addition:
> when we write 256k or less, the writes become serial because NFS writes in
> 256k chunks, and due to oflag=sync it follows up with a flush, correct?
>
Yes
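
To put rough numbers on this, here is a small sketch of the arithmetic. It
assumes the 256k NFS write size mentioned in this thread; the variable and
function names are made up for illustration. It only shows that both dd
variants write the same total amount of data, while a 3M block is split into
many chunks and a 256k block is a single chunk.

```shell
# Assumption: 256k is the NFS write chunk size discussed in this thread.
chunk=$((256 * 1024))

# Number of chunk-sized writes needed for one dd block of "$1" bytes
# (rounded up).
chunks_per_block() {
    echo $(( ($1 + chunk - 1) / chunk ))
}

big_bs=$((3 * 1024 * 1024))   # bs=3M   count=2
small_bs=$((256 * 1024))      # bs=256k count=24

big_chunks=$(chunks_per_block "$big_bs")
small_chunks=$(chunks_per_block "$small_bs")

big_total=$((big_bs * 2))
small_total=$((small_bs * 24))

echo "bs=3M:   $big_chunks chunks per block, $big_total bytes in total"
echo "bs=256k: $small_chunks chunk per block, $small_total bytes in total"
```

So the two commands differ only in how many chunks can be outstanding for a
single block, not in how much data is written.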

> Test (2) is interesting, even with marker foreground updates (which is 
> still in the UNWIND path), we observe failures. Do we know why? My 
> analysis/understanding of the same is that we have more in flight IOs 
> that passed quota enforcement (due to accounting on the UNWIND path), 
> does this bear any merit post your tests?
>
Yes, my understanding is the same: the failures are most likely due to more
in-flight IOs, and it makes little difference whether the marker does its
updates in the background or the foreground.
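
For the staged test suggested earlier in the thread (write past the limit,
wait until the condition is visible, then expect EDQUOT), the "wait" step
could look roughly like the sketch below. This is not the test framework's
EXPECT_WITHIN; wait_for and size_reached are hypothetical helpers, and the
file-size check is only a stand-in for whatever quota check the real test
would use.

```shell
# wait_for TIMEOUT EXPECTED COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND about once per second until its output
# equals EXPECTED, or give up once TIMEOUT seconds have elapsed.
wait_for() {
    local timeout="$1"
    local expected="$2"
    shift 2
    local elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Stand-in check (assumption): print Y once FILE holds at least MIN bytes.
size_reached() {
    local size=$(( $(wc -c < "$1") ))
    if [ "$size" -ge "$2" ]; then echo Y; else echo N; fi
}

# Demo on a local temp file: a delayed background write plays the role of
# accounting that completes some time after the first dd has returned.
tmp=$(mktemp)
( sleep 1; dd if=/dev/zero of="$tmp" bs=512 count=4 2>/dev/null ) &
wait_for 5 Y size_reached "$tmp" 2048
result=$?
wait
rm -f "$tmp"
echo "wait_for returned $result"
```

Only after such a wait succeeds would the test issue the second dd and
expect it to fail.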


>>
>>
>> 1) Parallel writes and Marker background update (Test always fails)
>>      TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=3M count=2
>> conv=fdatasync oflag=sync oflag=append
>>
>>      NFS client breaks 3M writes into multiple 256k chunks and does
>> parallel writes
>>
>> 2) Parallel writes and Marker foreground update (Test always fails)
>>      TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=3M count=2
>> conv=fdatasync oflag=sync oflag=append
>>
>>      Made a marker code change to account quota in foreground (without
>> synctask)
>>
>> 3) Serial writes and Marker background update (Test passed 100/100 
>> times)
>>      TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=256k count=24
>> conv=fdatasync oflag=sync oflag=append
>>
>>      Using smaller block size (256k), so that NFS client reduces
>> parallel writes
>>
>> 4) Serial writes and Marker foreground update (Test passed 100/100 
>> times)
>>      TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=256k count=24
>> conv=fdatasync oflag=sync oflag=append
>>
>>      Using smaller block size (256k), so that NFS client reduces
>> parallel writes
>>      Made a marker code change to account quota in foreground (without
>> synctask)
>>
>> 5) Parallel writes on release-3.6 (Test always fails)
>>      TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=3M count=2
>> conv=fdatasync oflag=sync oflag=append
>>      Moved marker xlator above IO-Threads in the graph.
>>
>> Thanks,
>> Vijay


