[Gluster-devel] [Review request] write-behind to retry failed syncs

Thu Nov 19 06:43:53 UTC 2015

For ease of access, I am posting the summary from the commit message below:

1. When sync fails, the cached-write is still preserved unless there
   is a flush/fsync waiting on it.
2. When a sync fails and there is a flush/fsync waiting on the
   cached-write, the cache is thrown away and no further retries will
   be made. In other words flush/fsync act as barriers for all the
   previous writes. All previous writes are either successfully
   synced to backend or forgotten in case of an error. Without such a
   barrier fop (especially flush, which is issued prior to a close), we
   would end up retrying forever, even after the fd is closed.
3. If a fop is waiting on a cached-write and syncing to the backend
   fails, the waiting fop is failed.
4. Sync failures when no fop is waiting are ignored and are not
   propagated to the application.
5. The effect of repeated sync failures is that there will be no
   cache for future writes, so they cannot be written behind.
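
To make rules 1-4 concrete, here is a minimal Python sketch of the
decision taken when a sync fails with a transient error. It is a toy
model, not the write-behind translator's code: CachedWrite,
on_sync_failure and the waiter names are hypothetical, and rule 2 is
simplified to dropping only the write whose sync failed.

```python
import errno
from dataclasses import dataclass, field

# Transient errors named in the summary.
TRANSIENT = {errno.EDQUOT, errno.ENOSPC, errno.ENOTCONN}
# Fops that act as barriers for all previous writes.
BARRIERS = {"flush", "fsync"}


@dataclass(eq=False)  # compare by identity, so list.remove() drops this exact write
class CachedWrite:
    offset: int
    data: bytes
    # Names of fops currently waiting on this write (hypothetical model).
    waiters: list = field(default_factory=list)


def on_sync_failure(cache, write, err):
    """Handle a failed sync of `write` with a transient error.

    Returns a list of (fop, err) pairs for the fops that get failed.
    """
    if err not in TRANSIENT:
        raise ValueError("this sketch only covers transient errors")

    # Rule 3: every fop waiting on the cached write is failed.
    # Rule 4: with no waiters, this list is empty and nothing is propagated.
    failed = [(fop, err) for fop in write.waiters]
    has_barrier = any(fop in BARRIERS for fop in write.waiters)
    write.waiters.clear()

    if has_barrier:
        # Rule 2: a flush/fsync was waiting, so forget the write; no more retries.
        cache.remove(write)
    # Rule 1: otherwise the write stays in `cache` and can be retried later.
    return failed
```

With no waiters the call returns an empty list and the write stays
cached; once a flush is waiting, the same failure fails the flush and
removes the write from the cache.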

The above algorithm handles transient errors (EDQUOT, ENOSPC,
ENOTCONN). Handling of non-transient errors is slightly different, as
described below:
1. Throw away the write-buffer, so that the cache is freed. This means
   no retries are made for non-transient errors. Also, since the cache
   is freed, future writes can be written behind.
2. Retain the request till an fsync or flush. This means all future
   operations on failed regions will fail till an fsync/flush. This is
   conservative error handling that forces the application to learn
   that a written-behind write has failed and to take remedial action,
   such as rolling back to the last fsync and retrying all the writes
   from that point.
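
The non-transient case can be sketched in the same toy style: the data
buffer is dropped, but a small record of the failed request is kept so
that later operations touching the same region fail until a barrier
arrives. FdState, FailedRegion and their methods are hypothetical names.
That the fsync/flush itself returns the stored error is an assumption of
this sketch; the summary only says the failures persist till then.

```python
import errno
from dataclasses import dataclass


@dataclass
class FailedRegion:
    offset: int
    size: int
    err: int  # errno of the failed written-behind write


class FdState:
    """Per-fd bookkeeping for failed requests (hypothetical model)."""

    def __init__(self):
        self.failed = []

    def record_failure(self, offset, size, err):
        # Step 1: the data buffer is not stored here at all, so the cache
        # is free for future writes. Step 2: only the request is retained.
        self.failed.append(FailedRegion(offset, size, err))

    def check(self, offset, size):
        """Return the stored errno if [offset, offset+size) overlaps a
        failed region, else 0."""
        for region in self.failed:
            if offset < region.offset + region.size and region.offset < offset + size:
                return region.err
        return 0

    def barrier(self):
        """Model an fsync/flush: forget all failed requests.

        Assumption: the barrier reports the first stored error (0 if none).
        """
        err = self.failed[0].err if self.failed else 0
        self.failed.clear()
        return err
```

After record_failure(0, 4096, errno.EIO), a check on any range
overlapping the first 4096 bytes returns EIO, while ranges outside it
return 0. Calling barrier() clears the record, so subsequent checks
succeed again.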