[Gluster-users] Questions about expanding a volume
craig at gluster.com
Fri Nov 12 05:27:09 UTC 2010
Great questions, thanks. If you don't rebalance you will get lots of stub files, and creating and redirecting through stub files will slow your environment down. When you add one node you will always move 50% of your files during a rebalance.
Counterintuitively (I think), the more nodes you add at a single time, the fewer files get moved. Details are below. You can use cron to schedule when to run the rebalance: when load is getting high, run "volume rebalance <VOLNAME> stop"; when load is low, run "volume rebalance <VOLNAME> start". The rebalance will start again where it stopped.
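The stop/start scheduling described above could be sketched as a small Python script that cron runs every few minutes. This is only an illustration, not a tested tool: the load thresholds and the function names (decide_action, rebalance_command, run_once) are made up for the example, and it assumes the gluster CLI is on the PATH and that the load average of the local machine is a fair proxy for cluster load.

```python
import os
import subprocess

# Hypothetical thresholds on the 1-minute load average; tune for your hardware.
HIGH_LOAD = 4.0
LOW_LOAD = 1.0


def decide_action(load, high=HIGH_LOAD, low=LOW_LOAD):
    """Return "stop" when load is high, "start" when it is low, else None."""
    if load >= high:
        return "stop"
    if load <= low:
        return "start"
    return None  # in between: leave the rebalance in its current state


def rebalance_command(volname, action):
    """Build the CLI call for 'volume rebalance <VOLNAME> start|stop'."""
    return ["gluster", "volume", "rebalance", volname, action]


def run_once(volname):
    """Check the local load once and start or stop the rebalance accordingly."""
    load_1min = os.getloadavg()[0]
    action = decide_action(load_1min)
    if action is not None:
        # check=False: the CLI may report an error if the rebalance is already
        # in the requested state; this sketch simply ignores that.
        subprocess.run(rebalance_command(volname, action), check=False)
    return action
```

A cron job would call run_once("<VOLNAME>") with the real volume name. The gap between the two thresholds is there so the rebalance is not toggled on every small change in load.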
Basic Assumptions:
Distribute spreads all the files equally across all the nodes :O
Existing nodes in the cluster are a set of "N" nodes
New nodes being added to cluster are a set of "M" nodes.
N+M will be the total number of nodes in new volume configuration.
Total files in the cluster before rebalance "X"
Number of files on each existing node is "J" = (X / N)
Number of files on each node after rebalance/scaling is "K" = (X / (N+M))
K * M = Z (Total Number of Files on set of M nodes after rebalance/scaling)
J * N = X (Total files in the cluster before rebalance/scaling)
Z / N = Y (Total Number of Files moved from each existing node after rebalance/scaling)
( Y / J ) * 100 = Percentage of Files moved from each of the 'N' nodes after rebalance/scaling.
( J - Y ) / J * 100 = Percentage of Files remaining on each of the 'N' nodes after rebalance/scaling
NOTE: "N" is not just the number of nodes but the total number of sub-volumes for the "distribute" translator. "M" is the number of additional sub-volumes added before starting the rebalance and scaling.
So for multiple exports from a single server, calculate the total moved from that server by multiplying by the number of exports.
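As a worked example, the formulas above can be evaluated directly. The sketch below uses the same symbols (N, M, X, J, K, Z, Y) under the same idealized assumption of perfectly even distribution; the file count is a made-up number and the function names are hypothetical. The second helper sums Z / X over a series of expansions. It is offered only as one possible reading of the remark that adding more nodes at a single time moves fewer files, comparing one batch against several single-node expansions.

```python
from fractions import Fraction


def rebalance_stats(n, m, x):
    """Evaluate the formulas for n existing and m new sub-volumes, x files."""
    j = Fraction(x, n)        # J: files on each existing sub-volume before
    k = Fraction(x, n + m)    # K: files on each sub-volume after
    z = k * m                 # Z: files that end up on the m new sub-volumes
    y = z / n                 # Y: files moved off each existing sub-volume
    return {
        "J": j, "K": k, "Z": z, "Y": y,
        "pct_moved": y / j * 100,
        "pct_kept": (j - y) / j * 100,
    }


def total_moved_fraction(n, steps):
    """Sum of Z / X over successive expansions (idealized model only)."""
    total = Fraction(0)
    for m in steps:
        total += Fraction(m, n + m)  # Z / X for this expansion step
        n += m
    return total


# 3 sub-volumes plus 1 new one; 1,200,000 files is a hypothetical count.
stats = rebalance_stats(n=3, m=1, x=1_200_000)
print(float(stats["pct_moved"]), float(stats["pct_kept"]))  # 25.0 75.0

# Growing from 3 to 6 sub-volumes: three single additions vs one batch.
print(float(total_moved_fraction(3, [1, 1, 1])))  # about 0.617 of X in total
print(float(total_moved_fraction(3, [3])))        # 0.5 of X
```

In this model the percentage moved from each existing sub-volume is M / (N+M) and does not depend on X. Real rebalance behaviour may differ from these idealized figures.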
Senior Systems Engineer
From: "John Lao" <jlao at cloud9analytics.com>
To: gluster-users at gluster.org
Sent: Wednesday, November 10, 2010 1:36:02 PM
Subject: [Gluster-users] Questions about expanding a volume
I am currently running glusterfs 3.1 with 3 bricks in distribute mode and I am thinking of adding a 4th brick. How does gluster treat a new brick when it is added to an existing volume? If I do not rebalance the volume will it send all/most new data to the new brick or will it still distribute it evenly?
Also, what's the performance impact on the volume when running a rebalance? We have about 5.5TB of data, most files are less than 1 meg.