[Bugs] [Bug 1261716] New: Sharding - read/write performance improvements for VM workload

bugzilla at redhat.com bugzilla at redhat.com
Thu Sep 10 03:25:28 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1261716

            Bug ID: 1261716
           Summary: Sharding - read/write performance improvements for VM
                    workload
           Product: GlusterFS
           Version: 3.7.4
         Component: sharding
          Keywords: Triaged
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: kdhananj at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org, pcuzner at redhat.com,
                    sabose at redhat.com
        Depends On: 1258905
            Blocks: 1258386 (Gluster-HC-1), 1261706 (glusterfs-3.7.5)



+++ This bug was initially created as a clone of Bug #1258905 +++

Description of problem:

Paul Cuzner, in his testing of sharding in a hyperconverged environment,
observed a 3x increase in write latency and a 2x increase in read latency.
Among the many things the shard translator does in every WRITEV and READV fop,
one network operation that can be eliminated is the LOOKUP on the zeroth shard
to fetch the size and block_count xattr. Since the VM workload is a
single-writer use case, and the client that wrote to a file is always the one
that reads it, the size and block_count xattr could be cached (and kept
up to date) in the inode ctx of the main file, eliminating the need for the
extra LOOKUP every time.
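
To make the caching idea concrete, here is a minimal sketch in C. It is not
the actual shard translator code - the shard_size_cache_t type and the
lookup_zeroth_shard()/get_file_size() names are invented for illustration -
but it shows how the values fetched by the first LOOKUP could live in the
main file's inode ctx and be reused by later WRITEV/READV fops:

    /*
     * Minimal sketch (not the actual shard translator code) of caching the
     * size and block_count values in a per-inode context so that the LOOKUP
     * on the zeroth shard can be skipped on subsequent WRITEV/READV fops.
     * All type and function names here are hypothetical.
     */
    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t size;        /* file size in bytes, as stored in the xattr */
        uint64_t block_count; /* allocated blocks, as stored in the xattr   */
        bool     valid;       /* true once populated by an initial LOOKUP   */
    } shard_size_cache_t;

    /* Stand-in for the network LOOKUP on the zeroth shard. */
    static void lookup_zeroth_shard(shard_size_cache_t *cache)
    {
        /* The real translator would read the size/block_count xattr over
         * the network here; this sketch just fakes the values. */
        cache->size = 10 * 1024 * 1024;
        cache->block_count = 2560;
        cache->valid = true;
        printf("LOOKUP sent over the network\n");
    }

    /* Fetch size/block_count, going to the network only on a cache miss. */
    static void get_file_size(shard_size_cache_t *cache,
                              uint64_t *size, uint64_t *block_count)
    {
        if (!cache->valid)
            lookup_zeroth_shard(cache); /* only the first fop pays the cost */
        *size = cache->size;
        *block_count = cache->block_count;
    }

    int main(void)
    {
        /* This cache would live in the main file's inode ctx. */
        shard_size_cache_t ctx = { 0 };
        uint64_t size, blocks;

        get_file_size(&ctx, &size, &blocks); /* triggers one LOOKUP   */
        get_file_size(&ctx, &size, &blocks); /* served from the cache */
        printf("size=%" PRIu64 " blocks=%" PRIu64 "\n", size, blocks);
        return 0;
    }

In the real translator the cached values would of course have to be updated
after every WRITEV that changes them, which is why this is safe only under
the single-writer assumption.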

The other place where a network fop can be avoided is the XATTROP, when a
WRITEV does not change the file size or block count.
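
A rough, hypothetical sketch of that check (the function name and parameters
below are invented for illustration, not taken from the shard translator):

    /* Hypothetical predicate (names invented for illustration): a WRITEV
     * only needs the size/block_count update (XATTROP) on the zeroth shard
     * if it grows the file or allocates new blocks; a plain overwrite
     * inside the existing allocation can skip that network fop. */
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    static bool writev_needs_xattrop(uint64_t offset, uint64_t len,
                                     uint64_t cur_size, uint64_t cur_blocks,
                                     uint64_t blocks_after_write)
    {
        bool grows_file  = (offset + len) > cur_size;
        bool adds_blocks = blocks_after_write > cur_blocks;
        return grows_file || adds_blocks;
    }

    int main(void)
    {
        /* Overwrite of already-allocated data: no XATTROP needed. */
        assert(!writev_needs_xattrop(0, 4096, 1048576, 256, 256));
        /* Append past EOF: the size changes, so XATTROP is still required. */
        assert(writev_needs_xattrop(1048576, 4096, 1048576, 256, 257));
        return 0;
    }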

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Krutika Dhananjay on 2015-09-01 21:57:23 EDT ---

(In reply to Krutika Dhananjay from comment #0)
> Description of problem:
> 
> Paul Cuzner, in his testing of sharding in a hyperconverged environment,
> observed a 3x increase in write latency and a 2x increase in read latency.
> Among the many things the shard translator does in every WRITEV and READV
> fop, one network operation that can be eliminated is the LOOKUP on the
> zeroth shard to fetch the size and block_count xattr. Since the VM workload
> is a single-writer use case, and the client that wrote to a file is always
> the one that reads it, the size and block_count xattr could be cached (and
> kept up to date) in the inode ctx of the main file, eliminating the need
> for the extra LOOKUP every time.
> 

Forgot to add that the credit for this idea above goes to Pranith Kumar K.

-Krutika


--- Additional comment from Paul Cuzner on 2015-09-01 22:54:59 EDT ---

(In reply to Krutika Dhananjay from comment #0)
> Description of problem:
> 
> Paul Cuzner, in his testing of sharding in a hyperconverged environment,
> observed a 3x increase in write latency and a 2x increase in read latency.
> Among the many things the shard translator does in every WRITEV and READV
> fop, one network operation that can be eliminated is the LOOKUP on the
> zeroth shard to fetch the size and block_count xattr. Since the VM workload
> is a single-writer use case, and the client that wrote to a file is always
> the one that reads it, the size and block_count xattr could be cached (and
> kept up to date) in the inode ctx of the main file, eliminating the need
> for the extra LOOKUP every time.
> 
> The other place where a network fop can be avoided is the XATTROP, when a
> WRITEV does not change the file size or block count.

This sounds great, but I have to ask about shared vdisks - for example, RHEV
supports vdisk sharing across VMs. Typically this would mean that the disk is
only ever online to one VM at a time, but I wanted to make sure that this use
case has been considered.

--- Additional comment from Vijay Bellur on 2015-09-08 07:59:02 EDT ---

REVIEW: http://review.gluster.org/12126 (features/shard: Performance
improvements in IO path) posted (#1) for review on master by Krutika Dhananjay
(kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2015-09-09 06:46:07 EDT ---

REVIEW: http://review.gluster.org/12126 (features/shard: Performance
improvements in IO path) posted (#2) for review on master by Krutika Dhananjay
(kdhananj at redhat.com)

--- Additional comment from Vijay Bellur on 2015-09-09 06:46:09 EDT ---

REVIEW: http://review.gluster.org/12138 (features/shard: Performance
improvements in IO path - Part 2) posted (#1) for review on master by Krutika
Dhananjay (kdhananj at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1258386
[Bug 1258386] [TRACKER] Gluster Hyperconvergence - Phase 1
https://bugzilla.redhat.com/show_bug.cgi?id=1258905
[Bug 1258905] Sharding - read/write performance improvements for VM
workload
https://bugzilla.redhat.com/show_bug.cgi?id=1261706
[Bug 1261706] GlusterFS 3.7.5 release tracker