[Gluster-devel] GlusterFS 3.3.0 and NFS client failures

Fri Mar 22 10:04:16 UTC 2013

Hello,

  This is a problem that I've been chipping at on and off for a while and 
it's finally cost me one recording too many - I just want to get it cured - any
help would be greatly appreciated.

  I'm using the kernel NFS client on a number of Linux machines (four, I believe), to map back to two Gluster 3.3.0 shares.

  I have seen Linux Mint and Ubuntu machines of various generations and
configurations (one is 64bit) hang intermittently on either one of the two 
Gluster shares on "access" (I can't say if it's writing or not - the below log 
is for a write).  But by far the most common failure example is my MythTV
Backend server.  It has 5 tuners pulling down up to a gigabyte per hour 
each directly to an NFS share from Gluster 3.3.0 with two local 3TB 
drives in a "distribute" volume.  It also re-parses each recording for Ad 
filtering, so the share gets a good thrashing.  The myth backend box would 
fail (hang the system) once every 2-4 days.

The backend server was also updating its NIC via DHCP.  I have been using an MTU of 1460 and each DHCP event would thus result in this note in syslog;
 [  12.248640] r8169: WARNING! Changing of MTU on this NIC may lead to frame reception errors!

I changed the DHCP MTU to 1500 and didn't see an improvement.  So, the
last change I made was a hard-coded address and default MTU (of 1500). 
The most recent trial saw a 13 day run time which is well outside the norm,
but it still borked (one test only - may have been lucky).
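
To confirm that the static configuration really left the NIC at the MTU I expect, something like the following could be run on the client.  It is only a sketch, assuming Python 3 and the standard Linux sysfs layout (/sys/class/net/<interface>/mtu); the function name read_mtus is made up for illustration.

```python
import os

def read_mtus(sysfs_net="/sys/class/net"):
    """Return {interface_name: mtu} for every interface exposing an mtu file."""
    mtus = {}
    if not os.path.isdir(sysfs_net):
        return mtus
    for iface in sorted(os.listdir(sysfs_net)):
        mtu_file = os.path.join(sysfs_net, iface, "mtu")
        try:
            with open(mtu_file) as fh:
                mtus[iface] = int(fh.read().strip())
        except (OSError, ValueError):
            # Skip entries that are not interfaces or have unreadable values.
            continue
    return mtus

if __name__ == "__main__":
    for name, mtu in read_mtus().items():
        print(f"{name}: mtu {mtu}")
```

Running it before and after a DHCP renewal would show whether the MTU is being changed underneath the NFS mount.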

>> syslog burp;
[1204800.908075] INFO: task mythbackend:21353 blocked for more than 120 seconds.
[1204800.908084] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1204800.908091] mythbackend   D f6af9d28     0 21353      1 0x00000000
[1204800.908107]  f6af9d38 00000086 00000002 f6af9d28 f6a4e580 c05d89e0 c08c3700 c08c3700
[1204800.908123]  bd3a4320 0004479c c08c3700 c08c3700 bd3a0e4e 0004479c 00000000 c08c3700
[1204800.908138]  c08c3700 f6a4e580 00000001 f6a4e580 c2488700 f6af9d80 f6af9d48 c05c6e51
[1204800.908152] Call Trace:
[1204800.908170]  [<c05c6e51>] io_schedule+0x61/0xa0
[1204800.908180]  [<c01d9c4d>] sync_page+0x3d/0x50
[1204800.908190]  [<c05c761d>] __wait_on_bit+0x4d/0x70
[1204800.908197]  [<c01d9c10>] ? sync_page+0x0/0x50
[1204800.908211]  [<c01d9e71>] wait_on_page_bit+0x91/0xa0
[1204800.908221]  [<c0165e60>] ? wake_bit_function+0x0/0x50
[1204800.908229]  [<c01da1f4>] filemap_fdatawait_range+0xd4/0x150
[1204800.908239]  [<c01da3c7>] filemap_write_and_wait_range+0x77/0x80
[1204800.908248]  [<c023aad4>] vfs_fsync_range+0x54/0x80
[1204800.908257]  [<c023ab5e>] generic_write_sync+0x5e/0x80
[1204800.908265]  [<c01dbda1>] generic_file_aio_write+0xa1/0xc0
[1204800.908292]  [<fb0bc94f>] nfs_file_write+0x9f/0x200 [nfs]
[1204800.908303]  [<c0218454>] do_sync_write+0xa4/0xe0
[1204800.908314]  [<c032e626>] ? apparmor_file_permission+0x16/0x20
[1204800.908324]  [<c0302a74>] ? security_file_permission+0x14/0x20
[1204800.908333]  [<c02185d2>] ? rw_verify_area+0x62/0xd0
[1204800.908342]  [<c02186e2>] vfs_write+0xa2/0x190
[1204800.908350]  [<c02183b0>] ? do_sync_write+0x0/0xe0
[1204800.908359]  [<c0218fa2>] sys_write+0x42/0x70
[1204800.908367]  [<c05c90a4>] syscall_call+0x7/0xb
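
For anyone wanting to pull these reports out of a saved syslog, here is a rough sketch (Python 3 assumed).  The function names and the regular expressions are my own illustration, written against the line format shown above; they have not been checked against other kernel versions.

```python
import re

# Hung-task header, e.g.
# "[1204800.908075] INFO: task mythbackend:21353 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# Call-trace frame, e.g. "[<fb0bc94f>] nfs_file_write+0x9f/0x200 [nfs]".
# A leading "? " marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unsure>\? )?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)

def parse_hung_tasks(lines):
    """Group each hung-task header with the trace frames that follow it."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "uptime_s": float(header["ts"]),
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "timeout_s": int(header["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            reliable = frame["unsure"] is None
            current["frames"].append((frame["sym"], frame["mod"], reliable))
    return reports

def blocked_in_nfs(report):
    """True if any reliable frame belongs to the nfs kernel module."""
    return any(mod == "nfs" and ok for _sym, mod, ok in report["frames"])
```

Fed the trace above, this reports mythbackend (pid 21353) blocked with nfs_file_write on the stack; the timestamp of 1204800 seconds is about 13.9 days of uptime, which lines up with the 13 day run.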

This might suggest a hardware fault on the Myth Backend host (like the
NIC) but I don't believe that to be the case because I've seen the same
issue on other clients.  I suspect that they are much more rare because
the data volume on those clients pales in comparison to the Myth Backend
process (virtual guests, etc - light work - months between failures, doesn't
feel time related).

The only cure is a hard reset (of the host with the NFS client) as any FS 
operation on that share hangs - including df, ls, sync and umount - so the 
system fails to shut down.
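
Because anything that touches the share hangs along with it, a watchdog has to do its probing in a throwaway child process and must never wait on that child.  A minimal sketch of that idea, assuming Python 3 on the client; probe_mount, the 10 second timeout and the poll interval are illustrative choices, not anything from Gluster or MythTV.

```python
import subprocess
import sys
import time

def probe_mount(path, timeout_s=10.0, poll_s=0.2):
    """Return True if a statvfs() on `path` completes within timeout_s."""
    # The statvfs runs in a child so that only the child can get stuck
    # in uninterruptible sleep if the NFS server stops answering.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        rc = child.poll()          # non-blocking check
        if rc is not None:
            return rc == 0
        time.sleep(poll_s)
    # Timed out.  Send SIGKILL but deliberately do NOT call wait():
    # a task blocked on the dead mount may never exit, and waiting
    # on it would hang this process too.
    child.kill()
    return False

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/"
    print("responsive" if probe_mount(target) else "hung or missing")
```

Run from cron against the mount point, this would at least timestamp when the share stopped responding without wedging the monitor itself.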

The kernel on the Myth Backend host isn't new ..

>> uname -a;
Linux jupiter 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC 2010 i686 GNU/Linux

Is there a known good/bad version for the kernel/NFS client?  Am I under that bar?

The GlusterFS NFS server is an embedded platform (Saturn) that has been running for 74 days;

>> uptime output;
08:39:07 up 74 days, 22:16,  load average: 0.87, 0.94, 0.94

It is a much more modern platform;

>> uname -a;
Linux (none) 3.2.14 #1 SMP Tue Apr 10 12:46:47 EST 2012 i686 GNU/Linux

It has had one error in all of that time;
>> dmesg output;
Pid: 4845, comm: glusterfsd Not tainted 3.2.14 #1
Call Trace:
 [<c10512d0>] __rcu_pending+0x64/0x294
 [<c1051640>] rcu_check_callbacks+0x87/0x98
 [<c1034521>] update_process_times+0x2d/0x58
 [<c1047bdf>] tick_periodic+0x63/0x65
 [<c1047c2d>] tick_handle_periodic+0x17/0x5e
 [<c1015ae9>] smp_apic_timer_interrupt+0x67/0x7a
 [<c1b2a691>] apic_timer_interrupt+0x31/0x40

.. this occurred months ago.

Unfortunately due to its embedded nature, there are no logs coming from 
this platform, only a looped buffer for syslog (and gluster doesn't seem to 
syslog).  In previous discussions here (months ago) you'll see where I was 
working to disable/remove logging from GlusterFS so that I could keep it 
alive in an embedded environment - this is the current run configuration.

The Myth Backend host only mounts one of the two NFS shares, but I've seen
the fault on the hosts that only mount the other - so I'm reluctant to believe
that it's a hardware failure at the drive level on the Saturn / Gluster 
server.

The /etc/fstab entry for this share, on the Myth Backend host, is;

  saturn:/recordings /var/lib/mythtv/saturn_recordings nfs nfsvers=3,rw,rsize=8192,wsize=8192,hard,intr,sync,dirsync,noac,noatime,nodev,nosuid 0  0

When I softened this to async with soft failures (a config taken straight 
from the Gluster site/FAQ) it crashed out in a much shorter time-frame 
(less than a day, one test only - may have been unlucky);

  saturn:/recordings /var/lib/mythtv/saturn_recordings nfs defaults,_netdev,nfsvers=3,proto=tcp 0  0
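
To make the difference between the two entries explicit, here is a small sketch (Python 3 assumed) that splits the option field of each fstab line and compares them.  The helper names are hypothetical, and it only compares the option strings as written; it does not try to work out what "defaults" expands to on a given client.

```python
def parse_fstab_line(line):
    """Split one fstab line into device, mount point, type and an options dict."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 whitespace-separated fields")
    device, mountpoint, fstype, opts = fields[:4]
    options = {}
    for opt in opts.split(","):
        key, _, value = opt.partition("=")
        options[key] = value or None   # flags such as "hard" carry no value
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}

def diff_options(a, b):
    """Return (only in a, only in b, present in both with different values)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed

STRICT = ("saturn:/recordings /var/lib/mythtv/saturn_recordings nfs "
          "nfsvers=3,rw,rsize=8192,wsize=8192,hard,intr,sync,dirsync,"
          "noac,noatime,nodev,nosuid 0  0")
RELAXED = ("saturn:/recordings /var/lib/mythtv/saturn_recordings nfs "
           "defaults,_netdev,nfsvers=3,proto=tcp 0  0")

if __name__ == "__main__":
    removed, added, changed = diff_options(
        parse_fstab_line(STRICT)["options"],
        parse_fstab_line(RELAXED)["options"],
    )
    print("dropped:", removed)
    print("added:  ", added)
    print("changed:", changed)
```

Note that the second entry never names soft or async explicitly, so whatever behaviour it got came from the client's defaults rather than from anything spelled out in fstab.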

Other than the high-use Myth Backend host I've failed to accurately nail 
down the trigger for this issue - which is making diagnostics painful (I like 
my TV too much to do more than reboot the failed box - and heaven forbid
the dad that fails to record Peppa Pig!).

Any thoughts?  Beyond enabling logs on the Saturn side ...  

Is it possible this is a bug that was reaped in later versions of Gluster?

Appreciate being set straight ..

Cheers,

--
Ian Latter
Late night coder ..
http://midnightcode.org/