[Gluster-devel] GlusterFS 3.3.0 and NFS client failures

Ian Latter ian.latter at midnightcode.org
Sun Mar 24 01:35:52 UTC 2013


Thanks for the extra queries;

> 1) when the system was "hung", was the client still flushing data to the
> nfs server? 

I have no more detail beyond what I've already supplied.  The other two 
applications I've seen fail are VMware Player starting a guest from a 
Gluster NFS share (2nd most common, has happened four or five times) 
and "cp" from bash (3rd most common, has happened twice) - each 
on different hosts (NFS clients).

> Any network activity? 

On the Myth backend server: no NFS network activity occurs after the
event, but I can still SSH into it (and everything is fine in that session;
you just can't perform any NFS I/O or you'll hang that session on the
blocked I/O).
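
Next time it hangs I could grab some client-side state from that SSH
session without touching the hung mount - along these lines (just a
sketch; none of these should block on the dead share):

  nfsstat -rc                  # client RPC counters - are retransmits climbing?
  cat /proc/self/mountstats    # per-mount NFS op and latency stats
  netstat -tn | grep :2049     # is the TCP session to the NFS server still up?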

On the Gluster NFS server (Saturn) there are no issues at all - other 
devices on the network continue to access NFS shares from that box.
And no action is required on the Gluster NFS server for the failed host 
to reconnect and operate normally again.

> The backtrace below shows that the system was just waiting for a long
> time for a write to complete.

Note that this state never recovers.  Sometimes it's days before I find that 
the backend server has hung (as in, it has been in a hung state for days - 
like when it failed the day after I left for an overseas trip).  It doesn't 
reset; all future IOPS on that share hang.
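
If it helps, next time it wedges I can also dump the stacks of every
blocked task (not just the one the hung-task detector happened to flag)
from the SSH session - something like:

  echo 1 > /proc/sys/kernel/sysrq    # enable SysRq if it isn't already
  echo w > /proc/sysrq-trigger       # "show blocked tasks" -> dmesg/syslog
  dmesg | tail -n 200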

So one diagnostic option is for me to write up a script that just creates and 
destroys a shed-load of data on a Gluster NFS share with strace 
running over it (see the sketch below) .. but your comment suggests that 
if a "cp" hangs on write with output like the below then you'll be 
unimpressed.  What about a network trace instead?
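
For the script option, roughly what I have in mind (an untested sketch -
the mount point is the one from my fstab, sizes and paths are arbitrary):

  #!/bin/sh
  # Churn files on the Gluster NFS mount under strace until something wedges.
  MNT=/var/lib/mythtv/saturn_recordings
  i=0
  while true; do
      i=$((i + 1))
      strace -f -tt -o /tmp/nfs-churn.$i.strace \
          dd if=/dev/zero of="$MNT/churn.$i" bs=1M count=1024 conv=fsync
      rm -f "$MNT/churn.$i"
  done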


> 2) anything in the gluster nfs logs?

There are no logs on Saturn other than syslog and the console 
(dmesg), and I'm not seeing any Gluster or NFS entries in either.  Is there 
a way to get Gluster to log to syslog rather than to on-disk files?  I only 
have a couple of megabytes of disk space available.


> 3) is it possible DHCP assigned a different IP while renewing lease?

No.  As a backend service it is issued a static DHCP assignment based 
on its MAC address, and in the final case (13 days uptime) I removed 
DHCP requesting completely by hard-coding the IPv4 address in 
Ubuntu's /etc/network* scripts on the Myth Backend host (see the stanza 
below).  I.e. it fails even without a DHCP process involved.
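
For reference, the hard-coded configuration is along these lines in
/etc/network/interfaces (the interface name and addresses here are
placeholders, not my real ones):

  auto eth0
  iface eth0 inet static
      address 192.168.1.50
      netmask 255.255.255.0
      gateway 192.168.1.1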


The feeling I get is that this is a rare issue - in which case I'm looking
for something domestic, like a hardware fault or something bespoke in 
my configuration.  The problem reminds me of the error we found in one 
of the Gluster modules that would kill replication for some files over 2GB 
but not all, so one of my other thoughts was that it may be related to this 
(Saturn) being a 32-bit platform.  However, I have another site running 
Saturn that doesn't have this problem, so I'm reluctant to blame the 
kernel-to-Gluster stack either.  But seeing the failures in three NFS clients 
makes it look like a Gluster/Saturn-side issue.

Let me try a packet capture and we'll see if there's anything odd in there.
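
Probably something like this on the Myth backend (eth0 and the file
rotation sizes are assumptions; port 2049 is NFS over TCP):

  tcpdump -i eth0 -s 0 -C 100 -W 20 -w /tmp/nfs-trace \
      host saturn and port 2049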

The next failure is due in a week or so.  If I get a chance I'll also write up
a test script to see if I can force a failure (it might also give a view on the 
volume of data required to trigger an event, and if it works it will give me
the ability to separate testing from my production kit).



Thanks,




----- Original Message -----
>From: "Anand Avati" <anand.avati at gmail.com>
>To: "Ian Latter" <ian.latter at midnightcode.org>
>Subject:  Re: [Gluster-devel] GlusterFS 3.3.0 and NFS client failures
>Date: Sat, 23 Mar 2013 16:40:38 -0700
>
> Do you have any more details, like -
> 
> 1) when the system was "hung", was the client still flushing data to the
> nfs server? Any network activity? The backtrace below shows that the system
> was just waiting for a long time for a write to complete.
> 
> 2) anything in the gluster nfs logs?
> 
> 3) is it possible DHCP assigned a different IP while renewing lease?
> 
> Avati
> 
> On Fri, Mar 22, 2013 at 3:04 AM, Ian Latter <ian.latter at midnightcode.org> wrote:
> 
> > Hello,
> >
> >
> >   This is a problem that I've been chipping away at on and off for a while
> > and it's finally cost me one recording too many - I just want to get it
> > cured - any help would be greatly appreciated.
> >
> >   I'm using the kernel NFS client on a number of Linux machines (four, I
> > believe), to map back to two Gluster 3.3.0 shares.
> >
> >   I have seen Linux Mint and Ubuntu machines of various generations and
> > configurations (one is 64-bit) hang intermittently on either one of the two
> > Gluster shares on "access" (I can't say if it's writing or not - the log
> > below is for a write).  But by far the most common failure example is my
> > MythTV Backend server.  It has 5 tuners pulling down up to a gigabyte per
> > hour each directly to an NFS share from Gluster 3.3.0 with two local 3TB
> > drives in a "distribute" volume.  It also re-parses each recording for ad
> > filtering, so the share gets a good thrashing.  The Myth backend box would
> > fail (hang the system) once every 2-4 days.
> >
> > The backend server was also configuring its NIC via DHCP.  I had been using
> > an MTU of 1460, and each DHCP event would thus result in this note in syslog;
> >  [  12.248640] r8169: WARNING! Changing of MTU on this NIC may lead to
> > frame reception errors!
> >
> > I changed the DHCP MTU to 1500 and didn't see an improvement.  So the
> > last change I made was a hard-coded address and the default MTU (of 1500).
> > The most recent trial saw a 13-day run time, which is well outside the norm,
> > but it still borked (one test only - may have been lucky).
> >
> > >> syslog burp;
> > [1204800.908075] INFO: task mythbackend:21353 blocked for more than 120
> > seconds.
> > [1204800.908084] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [1204800.908091] mythbackend   D f6af9d28     0 21353      1 0x00000000
> > [1204800.908107]  f6af9d38 00000086 00000002 f6af9d28 f6a4e580 c05d89e0
> > c08c3700 c08c3700
> > [1204800.908123]  bd3a4320 0004479c c08c3700 c08c3700 bd3a0e4e 0004479c
> > 00000000 c08c3700
> > [1204800.908138]  c08c3700 f6a4e580 00000001 f6a4e580 c2488700 f6af9d80
> > f6af9d48 c05c6e51
> > [1204800.908152] Call Trace:
> > [1204800.908170]  [<c05c6e51>] io_schedule+0x61/0xa0
> > [1204800.908180]  [<c01d9c4d>] sync_page+0x3d/0x50
> > [1204800.908190]  [<c05c761d>] __wait_on_bit+0x4d/0x70
> > [1204800.908197]  [<c01d9c10>] ? sync_page+0x0/0x50
> > [1204800.908211]  [<c01d9e71>] wait_on_page_bit+0x91/0xa0
> > [1204800.908221]  [<c0165e60>] ? wake_bit_function+0x0/0x50
> > [1204800.908229]  [<c01da1f4>] filemap_fdatawait_range+0xd4/0x150
> > [1204800.908239]  [<c01da3c7>] filemap_write_and_wait_range+0x77/0x80
> > [1204800.908248]  [<c023aad4>] vfs_fsync_range+0x54/0x80
> > [1204800.908257]  [<c023ab5e>] generic_write_sync+0x5e/0x80
> > [1204800.908265]  [<c01dbda1>] generic_file_aio_write+0xa1/0xc0
> > [1204800.908292]  [<fb0bc94f>] nfs_file_write+0x9f/0x200 [nfs]
> > [1204800.908303]  [<c0218454>] do_sync_write+0xa4/0xe0
> > [1204800.908314]  [<c032e626>] ? apparmor_file_permission+0x16/0x20
> > [1204800.908324]  [<c0302a74>] ? security_file_permission+0x14/0x20
> > [1204800.908333]  [<c02185d2>] ? rw_verify_area+0x62/0xd0
> > [1204800.908342]  [<c02186e2>] vfs_write+0xa2/0x190
> > [1204800.908350]  [<c02183b0>] ? do_sync_write+0x0/0xe0
> > [1204800.908359]  [<c0218fa2>] sys_write+0x42/0x70
> > [1204800.908367]  [<c05c90a4>] syscall_call+0x7/0xb
> >
> > This might suggest a hardware fault on the Myth Backend host (like the
> > NIC), but I don't believe that to be the case because I've seen the same
> > issue on other clients.  I suspect those failures are much rarer because
> > the data volume on those clients pales in comparison to the Myth Backend
> > workload (virtual guests, etc - light work - months between failures, and
> > it doesn't feel time-related).
> >
> > The only cure is a hard reset (of the host with the NFS client) as any FS
> > operation on that share hangs - including df, ls, sync and umount - so the
> > system fails to shutdown.
> >
> > The kernel on the Myth Backend host isn't new ..
> >
> > >> uname -a;
> > Linux jupiter 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC
> > 2010 i686 GNU/Linux
> >
> > Is there a known good/bad version for the kernel/NFS client?  Am I under
> > that bar?
> >
> >
> > The GlusterFS NFS server is an embedded platform (Saturn) that has been
> > running for 74 days;
> >
> > >> uptime output;
> > 08:39:07 up 74 days, 22:16,  load average: 0.87, 0.94, 0.94
> >
> > It is a much more modern platform;
> >
> > >> uname -a;
> > Linux (none) 3.2.14 #1 SMP Tue Apr 10 12:46:47 EST 2012 i686 GNU/Linux
> >
> > It has had one error in all of that time;
> > >> dmesg output;
> > Pid: 4845, comm: glusterfsd Not tainted 3.2.14 #1
> > Call Trace:
> >  [<c10512d0>] __rcu_pending+0x64/0x294
> >  [<c1051640>] rcu_check_callbacks+0x87/0x98
> >  [<c1034521>] update_process_times+0x2d/0x58
> >  [<c1047bdf>] tick_periodic+0x63/0x65
> >  [<c1047c2d>] tick_handle_periodic+0x17/0x5e
> >  [<c1015ae9>] smp_apic_timer_interrupt+0x67/0x7a
> >  [<c1b2a691>] apic_timer_interrupt+0x31/0x40
> >
> > .. this occurred months ago.
> >
> > Unfortunately, due to its embedded nature, there are no logs coming from
> > this platform, only a looped buffer for syslog (and gluster doesn't seem to
> > log to syslog).  In previous discussions here (months ago) you'll see where
> > I was working to disable/remove logging from GlusterFS so that I could keep
> > it alive in an embedded environment - this is the current run configuration.
> >
> > The Myth Backend host only mounts one of the two NFS shares, but I've seen
> > the fault on the hosts that only mount the other - so I'm reluctant to
> > believe that it's a hardware failure at the drive level on the Saturn /
> > Gluster server.
> >
> > The /etc/fstab entry for this share, on the Myth Backend host, is;
> >
> >   saturn:/recordings /var/lib/mythtv/saturn_recordings nfs
> > nfsvers=3,rw,rsize=8192,wsize=8192,hard,intr,sync,dirsync,noac,noatime,nodev,nosuid
> > 0  0
> >
> > When I softened this to async with soft failures (a config taken straight
> > from the Gluster site/FAQ) it crashed out in a much shorter time-frame
> > (less than a day, one test only - may have been unlucky);
> >
> >   saturn:/recordings /var/lib/mythtv/saturn_recordings nfs
> > defaults,_netdev,nfsvers=3,proto=tcp 0  0
> >
> >
> > Other than the high-use Myth Backend host I've failed to accurately nail
> > down the trigger for this issue - which is making diagnostics painful (I
> > like my TV too much to do more than reboot the failed box - and heaven
> > forbid the dad that fails to record Peppa Pig!).
> >
> >
> > Any thoughts?  Beyond enabling logs on the Saturn side ...
> >
> > Is it possible this is a bug that was reaped in later versions of Gluster?
> >
> > Appreciate being set straight ..
> >
> >
> >
> >
> >
> > Cheers,
> >
> >
> >
> >
> > --
> > Ian Latter
> > Late night coder ..
> > http://midnightcode.org/
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at nongnu.org
> > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> >
> 


--
Ian Latter
Late night coder ..
http://midnightcode.org/



