[Bugs] [Bug 1243655] New: Sharding - Use (f)xattrop (as opposed to (f)setxattr) to update shard size and block count

Thu Jul 16 03:20:11 UTC 2015

https://bugzilla.redhat.com/show_bug.cgi?id=1243655

            Bug ID: 1243655
           Summary: Sharding - Use (f)xattrop (as opposed to (f)setxattr)
                    to update shard size and block count
           Product: GlusterFS
           Version: 3.7.3
         Component: sharding
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: kdhananj at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org
        Depends On: 1232391

+++ This bug was initially created as a clone of Bug #1232391 +++

Description of problem:

Running iozone on a sharded volume fails with EBADFD.
From the strace output of iozone, it was found that the application ended up
reading fewer bytes than it was expecting to, and bailing out with EBADFD.
On closer inspection of the file on the backend, it was found that the 'size'
xattr was reflecting a smaller number than the actual size of the file.

Turns out this can happen when write-behind flushes the cached writes in one go,
causing them to hit the disk out of order as the different io-threads perform
the writes in parallel, without any serialisation. For example, when a write in
the range [0-100] races with a write on the same file in the [101-200] range,
the second write could hit the disk before the first. The second write request
would then persist the file size as 200, followed by the first write request
persisting the size as 100 bytes, leading to incorrect file size accounting.
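
The race above can be illustrated with a minimal sketch. This is only a toy
model, not the shard translator's code: the function name and the idea of
representing the size xattr as a single integer that each write overwrites are
assumptions made for illustration.

```python
# Toy model (hypothetical): the size xattr is one integer, and every write
# overwrites it with the file size that write computed (setxattr-style).
def apply_set_updates(sizes_in_disk_order):
    size_xattr = 0
    for size in sizes_in_disk_order:
        size_xattr = size  # absolute overwrite: the last update to land wins
    return size_xattr

first_write = 100   # write in [0-100] persists size 100
second_write = 200  # write in [101-200] persists size 200

in_order = apply_set_updates([first_write, second_write])
out_of_order = apply_set_updates([second_write, first_write])

print(in_order)      # size recorded when the writes land in order
print(out_of_order)  # size recorded when the second write lands first
```

When the updates land out of order, the recorded size ends up smaller than the
data actually written, which matches the short read seen by iozone.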

This bug can also be hit when the application performing I/O is itself
multi-threaded.

The solution involves using xattrop (adding/subtracting only the delta byte
count) as opposed to setxattr to update the size.
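
Why the delta approach is insensitive to ordering can be sketched as follows.
Again this is a toy model under assumptions: the function name is hypothetical,
and each write is assumed to already know how many bytes it added to the file.

```python
from itertools import permutations

# Toy model (hypothetical): xattrop-style update, where each write adds only
# its own delta to the stored size instead of overwriting the whole value.
def apply_xattrop_deltas(deltas_in_disk_order, initial_size=0):
    size_xattr = initial_size
    for delta in deltas_in_disk_order:
        size_xattr += delta  # addition is commutative, so order is irrelevant
    return size_xattr

# Three writes that grew the file by 100, 100 and 50 bytes respectively.
deltas = (100, 100, 50)

# Try every possible order in which the updates could hit the disk.
results = {apply_xattrop_deltas(order) for order in permutations(deltas)}
print(results)
```

Every ordering yields the same final size, because the stored value is the sum
of the deltas rather than whatever the last writer happened to set.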

Note that even with this approach, things could go wrong if two or more threads
of an application, while writing past the end of a file, create holes in
overlapping regions. The holes' contribution to the file size could then be
counted more than once, leading to incorrect file size accounting. This will be
tackled by restructuring the /.shard backend, which is a bigger change and will
come much later.
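
One way the double counting could arise is sketched below. This is an assumed
scenario, not taken from the shard code: two writes past the end of the file
are assumed to compute their deltas against the same observed size, so the
shared hole region is added twice. All names are hypothetical.

```python
# Toy model (hypothetical): each racing write computes its delta as
# "end offset of my write - file size I observed", and both writes observed
# the same size before either delta was applied.
def size_after_racing_extends(observed_size, write_ends):
    size_xattr = observed_size
    for end in write_ends:
        size_xattr += end - observed_size  # includes the hole before the write
    return size_xattr

observed_size = 100
write_ends = [250, 400]  # both writes start beyond the current end of file

accounted = size_after_racing_extends(observed_size, write_ends)
actual = max(write_ends)  # the real file size is the furthest byte written

print(accounted, actual)
```

In this model the recorded size exceeds the real size, since the region between
the old end of file and the first write is counted once per racing write.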

--- Additional comment from Anand Avati on 2015-06-30 09:27:29 EDT ---

REVIEW: http://review.gluster.org/11467 (features/shard: Use xattrop (as
opposed to setxattr) for updates to size xattr) posted (#1) for review on
master by Krutika Dhananjay (kdhananj at redhat.com)

--- Additional comment from Anand Avati on 2015-07-15 03:43:31 EDT ---

REVIEW: http://review.gluster.org/11467 (features/shard: Use xattrop (as
opposed to setxattr) for updates to size xattr) posted (#3) for review on
master by wangzhen (linux_wz at 163.com)

--- Additional comment from Anand Avati on 2015-07-15 04:27:01 EDT ---

REVIEW: http://review.gluster.org/11467 (features/shard: Use xattrop (as
opposed to setxattr) for updates to size xattr) posted (#4) for review on
master by Krutika Dhananjay (kdhananj at redhat.com)

Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1232391
[Bug 1232391] Sharding - Use (f)xattrop (as opposed to (f)setxattr) to
update shard size and block count