[Gluster-devel] Problems with ec/nfs.t in regression tests

Fri Feb 13 04:32:10 UTC 2015

On 02/13/2015 12:07 AM, Niels de Vos wrote:
> On Thu, Feb 12, 2015 at 11:39:51PM +0530, Pranith Kumar Karampuri wrote:
>> On 02/12/2015 11:34 PM, Pranith Kumar Karampuri wrote:
>>> On 02/12/2015 08:15 PM, Xavier Hernandez wrote:
>>>> I've investigated further and the problem seems worse.
>>>>
>>>> It seems that NFS sends a huge number of requests without waiting for
>>>> answers (I've seen more than 1400 requests in flight). Many factors
>>>> probably influence the load this causes, and ec could be one of them,
>>>> but the problem is not exclusive to ec: I've repeated the test using
>>>> replica 3 and replica 2 volumes and it still happens.
>>>>
>>>> The test basically writes a file to an NFS mount using 'dd'. The file
>>>> has a size of 1GB. With a smaller file, the test passes successfully.
>>> Using an NFS client and the gluster NFS server on the same machine with
>>> big-file dd operations is known to cause hangs. anon-fd-quota.t used to
>>> give similar problems, so we changed the test to not involve NFS mounts.
>> I don't recollect the exact scenario. Avati found the memory allocation
>> deadlock in 2010, when I had just joined gluster, and Raghavendra Bhat
>> raised the bug then. I've CCed him on the thread in case he knows the
>> exact scenario.
> This is a well-known issue. When a system is under memory pressure, it
> will try to flush dirty pages from the VFS. The NFS-client will send the
> dirty pages over the network to the NFS-server. Unfortunately, the
> NFS-server needs to allocate memory to handle the WRITE procedures,
> which adds to the memory pressure. This creates a loop that will most
> often hang the system.
Yes. This was it :-). Seems like Xavi and Shyam found the reason for the 
failure though, which is not this.

Pranith
>
> Mounting with "-o sync", or flushing outstanding I/O from the client
> side should normally be sufficient to prevent these issues.
Nice, didn't know about this.
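
For reference, a minimal sketch of the "flush outstanding I/O from the
client side" idea, assuming the writer is a small Python script rather
than dd. The function name write_with_periodic_fsync and its chunk_bytes
and sync_every parameters are made up for illustration; they are not part
of the gluster test framework. Calling fsync every few chunks should keep
the amount of dirty data the client holds for the file bounded:

```python
import os


def write_with_periodic_fsync(path, total_bytes, chunk_bytes=1 << 20, sync_every=8):
    """Write total_bytes of zeros to path, calling fsync every sync_every chunks.

    Hypothetical helper: the periodic fsync asks the kernel to write back
    the dirty pages of this file before more data is queued.
    """
    chunk = b"\0" * chunk_bytes
    written = 0
    chunks_since_sync = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            # os.write may write fewer bytes than requested, so advance
            # by the count it actually reports.
            written += os.write(fd, chunk[:n])
            chunks_since_sync += 1
            if chunks_since_sync >= sync_every:
                os.fsync(fd)
                chunks_since_sync = 0
        # Final flush so nothing is left dirty when the file is closed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

On the test setup this would be pointed at a file on the NFS mount (for
example /mnt/nfs/0/test). Mounting with "-o sync" as Niels suggests needs
no change to the writer at all, at the cost of slower writes.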

Pranith
>
> Niels
>
>> Pranith
>>> Pranith
>>>> One important thing to note is that I'm not using powerful servers (a
>>>> dual-core Intel Atom), but this problem shouldn't happen anyway. It can
>>>> even happen on more powerful servers if they are busy doing other things
>>>> (maybe this is what's happening on the jenkins slaves).
>>>>
>>>> I think that this causes some NFS requests to time out. This can be seen
>>>> in /var/log/messages (there are many of these messages):
>>>>
>>>> Feb 12 15:18:45 celler01 kernel: nfs: server gf01.datalab.es not
>>>> responding, timed out
>>>>
>>>> The nfs log also has many errors:
>>>>
>>>> [2015-02-12 14:18:45.132905] E [rpcsvc.c:1257:rpcsvc_submit_generic]
>>>> 0-rpc-service: failed to submit message (XID: 0x7be78dbe, Program: NFS3,
>>>> ProgVers: 3, Proc: 7) to rpc-transport (socket.nfs-server)
>>>> [2015-02-12 14:18:45.133009] E [nfs3.c:565:nfs3svc_submit_reply]
>>>> 0-nfs-nfsv3: Reply submission failed
>>>>
>>>> Additionally, this causes disconnections from NFS that are not handled
>>>> correctly, which leaves a thread stuck in an infinite loop (I haven't
>>>> analyzed this problem deeply, but it looks like an attempt to use an
>>>> already disconnected socket). After a while, I get this error in the
>>>> nfs log:
>>>>
>>>> [2015-02-12 14:20:19.545429] C
>>>> [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] 0-patchy-client-0:
>>>> server 192.168.200.61:49152 has not responded in the last 42 seconds,
>>>> disconnecting.
>>>>
>>>> The console executing the test shows this (here nfs.t creates a replica 3
>>>> volume instead of a dispersed one):
>>>>
>>>> # ./run-tests.sh tests/basic/ec/nfs.t
>>>>
>>>> ... GlusterFS Test Framework ...
>>>>
>>>> Running tests in file ./tests/basic/ec/nfs.t
>>>> [14:12:52] ./tests/basic/ec/nfs.t .. 8/10 dd: error writing
>>>> ‘/mnt/nfs/0/test’: Input/output error
>>>> [14:12:52] ./tests/basic/ec/nfs.t .. 9/10
>>>> not ok 9
>>>> [14:12:52] ./tests/basic/ec/nfs.t .. Failed 1/10 subtests
>>>> [14:27:41]
>>>>
>>>> Test Summary Report
>>>> -------------------
>>>> ./tests/basic/ec/nfs.t (Wstat: 0 Tests: 10 Failed: 1)
>>>> Failed test: 9
>>>> Files=1, Tests=10, 889 wallclock secs ( 0.13 usr 0.02 sys + 1.29 cusr
>>>> 3.45 csys = 4.89 CPU)
>>>> Result: FAIL
>>>> Failed tests ./tests/basic/ec/nfs.t
>>>>
>>>> Note that the test takes almost 15 minutes to complete.
>>>>
>>>> Is there any way to limit the number of requests NFS sends without
>>>> having received an answer?
>>>>
>>>> Xavi
>>>>
>>>> On 02/11/2015 04:20 PM, Shyam wrote:
>>>>> On 02/11/2015 09:40 AM, Xavier Hernandez wrote:
>>>>>> Hi,
>>>>>>
>>>>>> it seems that there are some failures of the ec/nfs.t test in the
>>>>>> regression tests. Investigating, I've found that the problem does not
>>>>>> seem to happen before the multi-threaded epoll patch (commit 5e25569e)
>>>>>> is applied.
>>>>> This has an interesting history of failures: on the regression runs for
>>>>> the MT epoll patch, this test (i.e. ec/nfs.t) did not fail (there were
>>>>> others, but not nfs.t).
>>>>>
>>>>> The patch that allows configuration of MT epoll is where this started
>>>>> failing, around Feb 5th (it later passed); see the patchset 7 failures
>>>>> on http://review.gluster.org/#/c/9488/
>>>>>
>>>>> I state the above as it may help narrow down the changes (maybe in EC)
>>>>> that could have caused it.
>>>>>
>>>>> Also, in the latter commit there was an error in configuring the number
>>>>> of threads, so all regression runs would have run with a single epoll
>>>>> thread. The MT epoll patch (http://review.gluster.org/#/c/3842/) had
>>>>> this hard-coded, so it would have run with 2 threads, but it did not
>>>>> show the issue.
>>>>>
>>>>> Again, I state the above because it suggests that this is not a
>>>>> race/bug/problem exposed by the multi-threaded nature of epoll, but of
>>>>> course it needs investigation.
>>>>>
>>>>>> I'm not sure whether this patch is the cause or whether it has revealed
>>>>>> some bug in ec or another xlator.
>>>>> I guess we can reproduce this issue? If so I would try setting
>>>>> client.event-threads on master branch to 1, restarting the volume and
>>>>> then running the test (as a part of the test itself maybe) to eliminate
>>>>> the possibility that MT epoll is causing it.
>>>>>
>>>>> I doubt that MT epoll is causing it, as the runs failed on the
>>>>> configuration patch (http://review.gluster.org/#/c/9488/), which had
>>>>> the thread count as 1 due to a bug in that code.
>>>>>
>>>>>> I can try to identify it (any help will be appreciated), but it may
>>>>>> take some time. Would it be better to remove the test in the meantime?
>>>>> I am checking if this is reproducible on my machine, so that I can
>>>>> possibly see what is going wrong.
>>>>>
>>>>> Shyam
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel