[Bugs] [Bug 1739427] New: An Input/Output error happens on a disperse volume when doing unaligned writes to a sparse file

bugzilla at redhat.com bugzilla at redhat.com
Fri Aug 9 10:03:48 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1739427

            Bug ID: 1739427
           Summary: An Input/Output error happens on a disperse volume
                    when doing unaligned writes to a sparse file
           Product: GlusterFS
           Version: 7
            Status: NEW
         Component: disperse
          Keywords: Reopened
          Assignee: bugs at gluster.org
          Reporter: pkarampu at redhat.com
                CC: bugs at gluster.org, jahernan at redhat.com
        Depends On: 1730715
            Blocks: 1731448, 1732779
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1730715 +++

Description of problem:

When a write that is not aligned to the stripe size is issued concurrently with
other writes on a sparse file of a disperse volume, an EIO error can be returned
in some cases.

Version-Release number of selected component (if applicable): mainline


How reproducible:

randomly

Steps to Reproduce:
1. Create a disperse volume
2. Create an empty file
3. Write to two non-overlapping areas of the file with unaligned offsets
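
The steps above can be sketched as a small Python script. This is only an
illustrative sketch: the function name and the default offsets (10 and 1M,
taken from the example further down) are assumptions, and the path would need
to point to a file on the mounted disperse volume. Since the failure is
random, several runs may be needed to observe it.

```python
import os
import threading

def concurrent_unaligned_writes(path, low_offset=10, high_offset=1024 * 1024):
    """Write to two non-overlapping, unaligned offsets of an empty file.

    Returns a dict mapping each offset to the number of bytes written, or to
    the OSError raised by that write (e.g. EIO on an affected volume).
    """
    # Step 2: create an empty file (truncate it if it already exists).
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
    results = {}

    def write_at(offset):
        try:
            results[offset] = os.pwrite(fd, b"x" * 10, offset)
        except OSError as exc:
            results[offset] = exc

    try:
        # Step 3: issue both writes in parallel from separate threads.
        threads = [threading.Thread(target=write_at, args=(off,))
                   for off in (low_offset, high_offset)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return results
```

On a healthy file system both entries of the result are 10. On an affected
volume the entry for the lower offset may instead hold an OSError with EIO.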

Actual results:

In some cases the write to the lower offset fails with EIO.

Expected results:

Both writes should succeed.

Additional info:

EC doesn't allow concurrent writes on overlapping areas; they are serialized.
However, non-overlapping writes are serviced in parallel. When a write is not
aligned, EC first needs to read the entire chunk from disk, apply the modified
fragment and write it back.
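
The read-modify-write cycle for an unaligned write can be illustrated with a
minimal in-memory sketch. This is not the EC implementation: the function
names are hypothetical, the file is modelled as a plain bytearray, and the
stripe size of 2048 bytes used in the usage note is an assumption made only
for illustration.

```python
def stripe_aligned_range(offset, length, stripe_size):
    """Return (start, end) of the smallest stripe-aligned range covering
    the bytes [offset, offset + length)."""
    start = (offset // stripe_size) * stripe_size
    # Round the end of the write up to the next stripe boundary.
    end = -(-(offset + length) // stripe_size) * stripe_size
    return start, end

def read_modify_write(store, offset, data, stripe_size):
    """Apply an unaligned write to `store` (a bytearray) chunk-wise."""
    start, end = stripe_aligned_range(offset, len(data), stripe_size)
    # Simplification: missing bytes of the chunk are treated as zeros.
    if len(store) < end:
        store.extend(b"\0" * (end - len(store)))
    chunk = bytearray(store[start:end])       # 1. read the entire chunk
    rel = offset - start
    chunk[rel:rel + len(data)] = data         # 2. apply the modified part
    store[start:end] = chunk                  # 3. write the chunk back
    return start, end
```

With an assumed stripe size of 2048, a 3-byte write at offset 10 touches the
range (0, 2048), so the bytes at offsets 0 to 9 have to be read first even
though the application never asked for them.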

Suppose we have a 4+2 disperse volume.

The problem appears on sparse files because a write to an offset implicitly
creates data at all offsets below it. For example, if a file is empty and we
read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one
byte at offset 1M and retry the same read, the system call will return 10 bytes
(all containing 0's).
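
This behaviour is not specific to Gluster and can be checked on a local file
system. The following sketch (the function name is made up for the example)
performs the same read before and after the write at offset 1M:

```python
import os
import tempfile

def sparse_read_demo():
    """Read 10 bytes at offset 10 before and after a 1-byte write at 1M."""
    fd, path = tempfile.mkstemp()
    try:
        before = os.pread(fd, 10, 10)       # file is empty: nothing to read
        os.pwrite(fd, b"x", 1024 * 1024)    # implicitly extends the file
        after = os.pread(fd, 10, 10)        # now inside the file: a hole
        return before, after
    finally:
        os.close(fd)
        os.unlink(path)
```

The first read returns an empty result, while the second one returns 10 zero
bytes, because offset 10 now lies inside a hole of the sparse file.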

So if we have two writes, the first one at offset 10 and the second one at
offset 1M, EC will send both in parallel because they do not overlap. However,
the first one will try to read missing data from the first chunk (i.e. offset 0
to 9) to recombine the entire chunk and do the final write. This read will
happen in parallel with the write to 1M. What could happen is that 3 bricks
process the write before the read, and the other 3 process the read before the
write. The first 3 bricks will return 10 bytes, while the other 3 will return
0 (because the file on those bricks has not been expanded yet).
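
The interleaving can be reproduced with a toy model in which each brick only
tracks the size of its copy of the file. The class and function names are
hypothetical, and the model uses the simplified byte counts from the
description above rather than real fragment sizes.

```python
class Brick:
    """Toy brick: remembers only the current size of its file."""

    def __init__(self):
        self.size = 0

    def write(self, offset, length):
        # A write beyond the end of the file expands it.
        self.size = max(self.size, offset + length)

    def read(self, offset, length):
        # Return the number of bytes a read would deliver.
        return max(0, min(length, self.size - offset))

def race(write_first_count, total=6):
    """Return the read sizes seen when `write_first_count` bricks process
    the write at 1M before the read, and the rest process the read first."""
    answers = []
    for i in range(total):
        brick = Brick()
        if i < write_first_count:
            brick.write(1024 * 1024, 1)
            answers.append(brick.read(0, 10))
        else:
            answers.append(brick.read(0, 10))
            brick.write(1024 * 1024, 1)
    return answers
```

In this model race(3) gives three answers of 10 bytes and three answers of 0
bytes, which is the split described above.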

When EC tries to recombine the answers from the bricks, it can't, because it
needs at least 4 consistent answers to recover the data. So this read fails
with an EIO error. The error is propagated to the parent write, which is
aborted, and EIO is returned to the application.
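
The consistency requirement can be sketched as follows. This is a simplified
model, not the actual EC code: the function name is hypothetical, and each
brick's answer is reduced to the number of bytes it returned.

```python
import errno
import os
from collections import Counter

def combine_answers(answers, required):
    """Return the answer shared by at least `required` bricks.

    Raises OSError(EIO) when no group of consistent answers is large
    enough to recover the data.
    """
    value, count = Counter(answers).most_common(1)[0]
    if count < required:
        raise OSError(errno.EIO, os.strerror(errno.EIO))
    return value
```

For a 4+2 volume `required` is 4. Four answers of 10 bytes and two of 0 bytes
can still be combined, but a 3/3 split raises EIO, which is what the unaligned
write ends up returning to the application.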

--- Additional comment from Worker Ant on 2019-07-17 12:58:56 UTC ---

REVIEW: https://review.gluster.org/23066 (cluster/ec: fix EIO error for
concurrent writes on sparse files) posted (#1) for review on master by Xavi
Hernandez

--- Additional comment from Worker Ant on 2019-07-24 10:20:48 UTC ---

REVIEW: https://review.gluster.org/23066 (cluster/ec: fix EIO error for
concurrent writes on sparse files) merged (#4) on master by Pranith Kumar
Karampuri

--- Additional comment from Worker Ant on 2019-07-27 06:41:19 UTC ---

REVIEW: https://review.gluster.org/23113 (cluster/ec: fix EIO error for
concurrent writes on sparse files) posted (#1) for review on release-6 by lidi


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1730715
[Bug 1730715] An Input/Output error happens on a disperse volume when doing
unaligned writes to a sparse file
https://bugzilla.redhat.com/show_bug.cgi?id=1731448
[Bug 1731448] [GSS] An Input/Output error happens on a disperse volume when
doing unaligned writes to a sparse file
https://bugzilla.redhat.com/show_bug.cgi?id=1732779
[Bug 1732779] [GSS] An Input/Output error happens on a disperse volume when
doing unaligned writes to a sparse file
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

