[Bugs] [Bug 1294969] New: Large system file distribution is broken

bugzilla at redhat.com bugzilla at redhat.com
Thu Dec 31 11:08:11 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1294969

            Bug ID: 1294969
           Summary: Large system file distribution is broken
           Product: GlusterFS
           Version: 3.7.7
         Component: distribute
          Keywords: ZStream
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: sabansal at redhat.com
                CC: bugs at gluster.org, bugzilla.redhat at koetsier.net,
                    gluster-bugs at redhat.com, hamiller at redhat.com,
                    jgeraert at redhat.com, mhergaar at redhat.com,
                    rgowdapp at redhat.com, sabansal at redhat.com,
                    spalai at redhat.com, srangana at redhat.com,
                    storage-qa-internal at redhat.com
        Depends On: 1281946, 1282751



+++ This bug was initially created as a clone of Bug #1282751 +++

+++ This bug was initially created as a clone of Bug #1281946 +++


DHT layout computation uses a count of 1MB chunks to denote the size of a
single brick. When these chunks are totaled, the int32 value overflows and
causes incorrect chunk computation, giving rise to an overflowing layout every
few bricks. (The sort order of the above layout would look slightly incorrect
to a DHT developer, as a fresh layout should be sorted by subvolume name.)

The function being referred to where this overflow occurs is:
dht_selfheal_layout_new_directory
 - total_size here overflows when adding chunks from each brick pair
 - hence chunk becomes a larger value, as a result we do not end up with
disjoint layout ranges
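
A minimal sketch of the overflow described above, not the actual DHT code:
the helper names and the brick sizes are hypothetical, and the per-brick
counts are assumed to be 1MB chunks held in 32-bit values. Three bricks of
2^31 chunks each (2 PB per brick in binary units) are enough to wrap a 32-bit
total, while a 64-bit total keeps the true sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total the per-brick chunk counts in a 32-bit
 * accumulator. Unsigned arithmetic wraps modulo 2^32, so the result is
 * silently too small once the real sum exceeds 0xffffffff. */
uint32_t total_chunks_u32(const uint32_t *chunks, size_t n)
{
    assert(chunks != NULL || n == 0);

    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += chunks[i];
    return total;
}

/* Hypothetical helper: the same sum with a 64-bit accumulator. Each
 * 32-bit count is widened before the addition, so nothing wraps. */
uint64_t total_chunks_u64(const uint32_t *chunks, size_t n)
{
    assert(chunks != NULL || n == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += chunks[i];
    return total;
}
```

Called on an array of three counts of 2147483648 each, the 32-bit version
returns 2147483648 (the true sum 6442450944 reduced modulo 2^32) and the
64-bit version returns 6442450944. A total that is too small is what makes
the per-chunk share too large, so the per-brick ranges add up to more than
the hash space and cannot stay disjoint.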

To fix the issue, this computation needs to be fixed to handle total chunks
beyond 32 bit integer. Looking at possible solutions here.


> 
> To fix the issue, this computation needs to be fixed to handle total chunks
> beyond 32 bit integer. Looking at possible solutions here.

Won't using an unsigned 64 bit type for variables total_size, chunks (and
relevant variables) fix the issue? With 64 bit, we can handle around
17179869184.0 PB, which should be sufficient.
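
The figure above can be checked with a short calculation, assuming binary
units (1 MB = 2^20 bytes, 1 PB = 2^50 bytes) and one chunk per 1MB:

```latex
\begin{align*}
2^{64}\ \text{chunks} \times 2^{20}\ \tfrac{\text{bytes}}{\text{chunk}}
  &= 2^{84}\ \text{bytes} \\
\frac{2^{84}\ \text{bytes}}{2^{50}\ \text{bytes/PB}}
  &= 2^{34}\ \text{PB} = 17179869184\ \text{PB}
\end{align*}
```

For comparison, under the same assumptions a 32-bit count tops out at
2^32 chunks, that is 2^52 bytes or 4 PB.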

Currently the max size is 0xffffffff. With the increase in the total size,
would we need to increase the max size as well?

--- Additional comment from Vijay Bellur on 2015-11-17 05:45:46 EST ---

REVIEW: http://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#1) for review on master by Sakshi Bansal

--- Additional comment from Shyamsundar on 2015-11-17 10:26:51 EST ---

@Sakshi, Please open up initial Description, or provide a summary of the
initial description that is not private.

--- Additional comment from Vijay Bellur on 2015-11-17 13:20:19 EST ---

REVIEW: http://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#2) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2015-11-17 23:42:54 EST ---

REVIEW: http://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#3) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2015-11-18 04:16:18 EST ---

REVIEW: https://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#4) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2015-11-18 05:54:16 EST ---

REVIEW: http://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#5) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2015-11-23 05:20:24 EST ---

REVIEW: http://review.gluster.org/12597 (dht: changing variable type to avoid
overflow) posted (#6) for review on master by Raghavendra G
(rgowdapp at redhat.com)

--- Additional comment from Vijay Bellur on 2015-11-26 10:59:04 EST ---

REVIEW: http://review.gluster.org/12597 (dht : changing variable type to avoid
overflow) posted (#7) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2015-11-27 00:04:31 EST ---

COMMIT: http://review.gluster.org/12597 committed in master by Raghavendra G
(rgowdapp at redhat.com) 
------
commit 6b315b87d80cf681b976d78b444c761fc3a1caa7
Author: Sakshi Bansal <sabansal at redhat.com>
Date:   Tue Nov 17 15:11:40 2015 +0530

    dht : changing variable type to avoid overflow

    For layout computation we find total size of the cluster
    and store it in an unsigned 32 bit variable. For large
    clusters this value may overflow which leads to wrong
    computations and for some bricks the layout may overflow.
    Hence using unsigned 64 bit to handle large values.

    Change-Id: I7c3ba26ea2c4158065ea9e74705a7ede1b6759c7
    BUG: 1282751
    Signed-off-by: Sakshi Bansal <sabansal at redhat.com>
    Reviewed-on: http://review.gluster.org/12597
    Reviewed-by: Susant Palai <spalai at redhat.com>
    Tested-by: NetBSD Build System <jenkins at build.gluster.org>
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1281946
[Bug 1281946] Large system file distribution is broken
https://bugzilla.redhat.com/show_bug.cgi?id=1282751
[Bug 1282751] Large system file distribution is broken
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.