[Gluster-devel] Shall we revert quota-anon-fd.t?
Niels de Vos
ndevos at redhat.com
Wed Jun 11 10:58:46 UTC 2014
On Wed, Jun 11, 2014 at 01:31:04PM +0530, Vijay Bellur wrote:
> On 06/11/2014 10:45 AM, Pranith Kumar Karampuri wrote:
> >
> >On 06/11/2014 09:45 AM, Vijay Bellur wrote:
> >>On 06/11/2014 08:21 AM, Pranith Kumar Karampuri wrote:
> >>>hi,
> >>> I see that quota-anon-fd.t is causing too many spurious failures. I
> >>>think we should revert it and raise a bug so that it can be fixed and
> >>>committed again along with the fix.
> >>>
> >>
> >>I think we can do that. The problem here stems from the fact that
> >>nfs can deadlock when the client and server are on the same node and
> >>system memory utilization is high. We also need to look into other
> >>nfs tests to determine if there are similar possibilities.
> >
> >I doubt it is because of that; there are so many nfs mount tests,
>
> I have been following this problem closely on b.g.o. This backtrace
> does indicate dd being hung:
>
> INFO: task dd:6039 blocked for more than 120 seconds.
> Not tainted 2.6.32-431.3.1.el6.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> dd D ffff880028100840 0 6039 5704 0x00000080
> ffff8801f843faa8 0000000000000286 ffff8801ffffffff 01eff88bb6f58e28
> ffff8801db96bb80 ffff8801f8213590 00000000036c74dc ffffffffac6f4edf
> ffff8801faf11af8 ffff8801f843ffd8 000000000000fbc8 ffff8801faf11af8
> Call Trace:
> [<ffffffff810a70b1>] ? ktime_get_ts+0xb1/0xf0
> [<ffffffff8111f940>] ? sync_page+0x0/0x50
> [<ffffffff815280b3>] io_schedule+0x73/0xc0
> [<ffffffff8111f97d>] sync_page+0x3d/0x50
> [<ffffffff81528b7f>] __wait_on_bit+0x5f/0x90
> [<ffffffff8111fbb3>] wait_on_page_bit+0x73/0x80
> [<ffffffff8109b330>] ? wake_bit_function+0x0/0x50
> [<ffffffff81135c05>] ? pagevec_lookup_tag+0x25/0x40
> [<ffffffff8111ffdb>] wait_on_page_writeback_range+0xfb/0x190
> [<ffffffff811201a8>] filemap_write_and_wait_range+0x78/0x90
> [<ffffffff811baa4e>] vfs_fsync_range+0x7e/0x100
> [<ffffffff811bab1b>] generic_write_sync+0x4b/0x50
> [<ffffffff81122056>] generic_file_aio_write+0xe6/0x100
> [<ffffffffa042f20e>] nfs_file_write+0xde/0x1f0 [nfs]
> [<ffffffff81188c8a>] do_sync_write+0xfa/0x140
> [<ffffffff8152a825>] ? page_fault+0x25/0x30
> [<ffffffff8109b2b0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff8128ec6f>] ? __clear_user+0x3f/0x70
> [<ffffffff8128ec51>] ? __clear_user+0x21/0x70
> [<ffffffff812263d6>] ? security_file_permission+0x16/0x20
> [<ffffffff81188f88>] vfs_write+0xb8/0x1a0
> [<ffffffff81189881>] sys_write+0x51/0x90
> [<ffffffff810e1e6e>] ? __audit_syscall_exit+0x25e/0x290
> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
>
> I have seen dd in uninterruptible sleep on b.g.o. There are
> also instances [1] where anon-fd-nfs has run for 6000+
> seconds. This definitely points to the nfs deadlock.
[1] is a run where nfs.drc is still enabled. I'd like to know if you
have seen other, more recent runs that include
http://review.gluster.org/8004 (which disables nfs.drc by default).
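For runs that do not have that change yet, the option can also be
turned off by hand on the test volume, roughly like this (the volume
name "patchy" is just what the regression tests typically use, adjust
as needed):

    # disable the NFS duplicate-request-cache for the test volume
    gluster volume set patchy nfs.drc off
    # check that it shows up under "Options Reconfigured"
    gluster volume info patchy | grep nfs.drc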
Are there backtraces from the same time where alloc_pages() and/or
try_to_free_pages() are listed? The blocking of the writer (here: dd)
likely depends on memory allocations needed on the receiving end
(here: the nfs-server). This is a relatively common issue for the Linux
kernel NFS server when loopback-mounts are used under memory pressure.
A nice description and proposed solution of this has recently been
posted to LWN.net:
- http://lwn.net/Articles/595652/
This solution is on the client side (the NFS client in the Linux
kernel), and from a quick cursory look it should help prevent these
issues for Gluster's NFS server too. But I don't think the patches have
been merged yet.
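In the meantime, if someone catches the hang again on b.g.o, it would
help to capture the kernel stacks of all blocked tasks at that moment.
Nothing Gluster specific, standard sysrq/procfs should do:

    # assuming sysrq is enabled (kernel.sysrq = 1):
    # dump the stacks of all uninterruptible (D state) tasks into dmesg
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 200
    # or look at a single stuck process, e.g. the hanging dd
    cat /proc/$(pidof dd)/stack

If alloc_pages()/try_to_free_pages() show up in the NFS server process
or in kswapd at the same time, that would support the memory pressure
theory.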
> >only
> >this one has been failing for the past 2-3 days.
>
> It is a function of the system memory consumption and what the oom
> killer decides to kill. If NFS or a glusterfsd process gets killed,
> then the test unit will fail. If the test can continue until the
> system reclaims memory, it can possibly succeed.
>
> However, there could be other possibilities and we need to root
> cause them as well.
Yes, I agree. It would help if there were a known way to trigger the OOM
so that the investigation can be done on a system other than
build.gluster.org. Does anyone know of steps that reliably reproduce
this kind of issue?
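For what it's worth, something along these lines is what I would try
first to force the loopback-mount + memory pressure combination. This
is only a rough, untested sketch; the volume, brick and mount paths are
just placeholders:

    # create a small volume and mount it over the Gluster NFS server
    # on the same host (loopback mount, like the regression test setup)
    gluster volume create patchy $(hostname):/d/bricks/patchy force
    gluster volume start patchy
    mkdir -p /mnt/nfs
    mount -t nfs -o vers=3,nolock $(hostname):/patchy /mnt/nfs
    # keep memory tight (a VM with very little RAM, or a memory hog
    # running alongside) and push a large synced write through the mount
    dd if=/dev/zero of=/mnt/nfs/bigfile bs=1M count=4096 conv=fsync

If that alone does not trigger it, artificially increasing memory
pressure while the dd is running might.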
Thanks,
Niels
>
>
> -Vijay
>
> [1] http://build.gluster.org/job/regression/4783/console
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel