[Bugs] [Bug 1301401] New: RFE: FEATURE: Lock revocation for POSIX xlator

Sun Jan 24 20:35:22 UTC 2016

https://bugzilla.redhat.com/show_bug.cgi?id=1301401

            Bug ID: 1301401
           Summary: RFE: FEATURE: Lock revocation for POSIX xlator
           Product: GlusterFS
           Version: 3.7.6
         Component: locks
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: rwareing at fb.com
                CC: bugs at gluster.org

Created attachment 1117722
  --> https://bugzilla.redhat.com/attachment.cgi?id=1117722&action=edit
Clean patch for v3.7.6 tag in github repo.

Description of problem:
Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability
and eventual complete unavailability due to failures in releasing entry/inode
locks in a timely manner.

Classic symptoms of this are increased brick (and/or gNFSd) memory usage due to
the high number of (lock request) frames piling up in the processes.  The
failure-mode results in bricks eventually slowing down to a crawl due to
swapping, or OOMing due to complete memory exhaustion; during this period the
entire cluster can begin to fail.  End-users will experience this as hangs on
the filesystem, first in a specific region of the file-system and ultimately
the entire filesystem as the offending brick begins to turn into a zombie (i.e.
not quite dead, but not quite alive either).

Currently, these situations must be handled by an administrator detecting &
intervening via the "clear-locks" CLI command.  Unfortunately this doesn't
scale for large numbers of clusters, and it depends on the correct (external)
detection of the locks piling up (for which there is little signal other than
state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help
track down crashes due to stale locks, and prove the utility of the lock
revocation feature.  It does this by silently dropping 1% of unlock requests;
simulating bugs or mis-behaving clients.

The feature is activated via:
features.locks-monkey-unlocking <on/off>

You'll see the message
"[<timestamp>] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING
(forcing stuck lock)!" in the logs indicating a request has been dropped.
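
As a rough illustration of the behaviour described above (not the code from the
attached patch), the sketch below models dropping 1% of unlock requests. The
function names, the use of a seeded random generator and the simulate() helper
are all assumptions made for this example; only the 1% figure and the on/off
switch come from the description.

```python
import random

# The report states that 1% of unlock requests are silently dropped.
DROP_PERCENT = 1


def should_drop_unlock(rng, enabled):
    """Return True when this unlock request should be silently dropped.

    Hypothetical helper: 'enabled' stands in for the
    features.locks-monkey-unlocking option.
    """
    if not enabled:
        return False
    # randrange(100) yields 0..99, so exactly one value in a hundred matches.
    return rng.randrange(100) < DROP_PERCENT


def simulate(n_requests, enabled, seed=0):
    """Count how many of n_requests unlock requests would be dropped."""
    rng = random.Random(seed)
    return sum(should_drop_unlock(rng, enabled) for _ in range(n_requests))


if __name__ == "__main__":
    print("dropped with monkey-unlocking off:", simulate(100_000, False))
    print("dropped with monkey-unlocking on: ", simulate(100_000, True))
```

With the option off nothing is dropped; with it on, roughly one request in a
hundred is lost, which is enough to leave a stale lock behind fairly quickly
under a write-heavy workload.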

2. Lock revocation - Once enabled, this feature will revoke a contended lock
either by the amount of time the lock has been held, how many other lock
requests are waiting on the lock to be freed, or some combination of both. 
Clients that lose their locks are notified by receiving EAGAIN (sent back to
their callback function).

The feature is activated via these options:
features.locks-revocation-secs <integer; 0 to disable>
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked <integer>

Recommended settings: 1800 seconds for the time-based timeout (give clients
the benefit of the doubt).  Choosing a max-blocked value requires some
experimentation depending on your workload, but values in the hundreds to low
thousands are generally reasonable (it's normal for many tens of locks to be
taken out when files are being written at high throughput).
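
To make the revocation criteria concrete, here is a minimal sketch of the
decision as described above: a contended lock is revoked when it has been held
too long, when too many requests are blocked behind it, or both. This is not
taken from the patch. The class and function names are hypothetical, the use of
">=" for both thresholds is an assumption, and so is treating a max-blocked
value of 0 as "disabled" (the description only states that for
locks-revocation-secs). The locks-revocation-clear-all option is not modelled.

```python
from dataclasses import dataclass


@dataclass
class RevocationPolicy:
    """Hypothetical stand-in for the volume options in the description."""
    revocation_secs: int = 0  # features.locks-revocation-secs; 0 disables
    max_blocked: int = 0      # features.locks-revocation-max-blocked
                              # (0 meaning "disabled" is an assumption)


def should_revoke(policy, held_secs, blocked_count):
    """Decide whether a held lock should be revoked.

    held_secs:     how long the current holder has had the lock
    blocked_count: how many other lock requests are waiting on it
    """
    # Only contended locks are candidates for revocation.
    if blocked_count <= 0:
        return False
    too_old = policy.revocation_secs > 0 and held_secs >= policy.revocation_secs
    too_many = policy.max_blocked > 0 and blocked_count >= policy.max_blocked
    return too_old or too_many


if __name__ == "__main__":
    policy = RevocationPolicy(revocation_secs=1800, max_blocked=500)
    print(should_revoke(policy, held_secs=1800, blocked_count=1))
    print(should_revoke(policy, held_secs=10, blocked_count=3))
```

An uncontended lock is never revoked in this sketch, no matter how long it has
been held, which matches the description's wording that only a contended lock
is revoked.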

Version-Release number of selected component (if applicable):
Clean patch set provided for GlusterFS v3.7.6; v3.6 patches are available upon
request.

How reproducible:
- Without using monkey-unlocking these situations are extremely difficult to
reproduce.
- 100% by turning on monkey-unlocking; a crash bug was immediately detected
using this feature (and a fix is included with this patch: changes to
xlators/features/locks/src/clear.c).

Steps to Reproduce:
First you will need TWO fuse mounts for this repro.  Call them /mnt/patchy1 &
/mnt/patchy2.

1. Enable monkey unlocking on the volume:
gluster vol set patchy features.locks-monkey-unlocking on

2. From the "patchy1" mount, use dd or some other utility to begin writing to a
file.  Eventually the dd will hang due to the dropped unlock requests; this
simulates the broken client.  Run:

for i in {1..1000}; do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10; done

...this will eventually hang as the unlock request has been lost.

3. Go to another window and set up the mount "patchy2" at /mnt/patchy2, and
observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang due to the
inability of the client to take out the required lock.

4. Next, restart the test, this time enabling lock revocation; use a timeout of
2-5 seconds for testing: 'gluster vol set patchy features.locks-revocation-secs
<2-5>'

5. This time, wait 2-5 seconds before executing step 3 above.  Observe that
the access to the file now succeeds, and the writes on patchy1 unblock until
they hit another failed unlock request due to "monkey-unlocking".
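
The expected outcome of steps 2-5 can be sketched with a toy model of a single
exclusive lock. This is an illustration only, not the locks xlator: ToyLock and
its methods are hypothetical, time is passed in explicitly rather than read
from a clock, and "patchy1" simply never unlocks, which stands in for an unlock
request dropped by monkey-unlocking.

```python
import errno


class ToyLock:
    """One exclusive lock with a FIFO of blocked requesters (toy model)."""

    def __init__(self, revocation_secs):
        self.revocation_secs = revocation_secs  # 0 disables revocation
        self.holder = None
        self.granted_at = None
        self.waiters = []
        self.notifications = []  # (client, errno) pairs sent to losers

    def lock(self, client, now):
        """Request the lock at time 'now'; return True if it was granted."""
        if self.holder is None:
            self.holder, self.granted_at = client, now
            return True
        self.waiters.append(client)
        self.tick(now)
        return self.holder == client

    def tick(self, now):
        """Revoke the lock if it is contended and held past the timeout."""
        expired = (
            self.revocation_secs > 0
            and self.holder is not None
            and self.waiters
            and now - self.granted_at >= self.revocation_secs
        )
        if expired:
            # The losing client is told via EAGAIN, as in the description.
            self.notifications.append((self.holder, errno.EAGAIN))
            self.holder, self.granted_at = self.waiters.pop(0), now


if __name__ == "__main__":
    stuck = ToyLock(revocation_secs=0)
    stuck.lock("patchy1", now=0)
    print("without revocation:", stuck.lock("patchy2", now=5))

    fixed = ToyLock(revocation_secs=3)
    fixed.lock("patchy1", now=0)
    print("with revocation:   ", fixed.lock("patchy2", now=5))
```

Without revocation the second client stays blocked for as long as the first
never unlocks (step 3); with a 3 second timeout the second client's request
made at t=5 is granted and the first client is notified with EAGAIN (step 5).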

Actual results:
n/a

Expected results:
n/a

Additional info:
