[Gluster-users] Gluster scalability [Was: Adding new storage nodes to existing GlusterFS network]

Tue Sep 28 17:34:25 UTC 2010

On 9/28/2010 10:26 AM, Emmanuel Noobadmin wrote:
> After following Roland's thread
> (http://gluster.org/pipermail/gluster-users/2010-September/005311.html),
> I'm wondering if this means there's a limit to how scalable gluster is
> if we start small.
>
> It seems that every time a new brick is added, the scale and defrag
> script must be run. Since we're going over the network, for those of
> us starting on low budget interconnect, i.e. Gigabit Ethernet, it
> would take a long while.
>
> Let's say I'm using 4x1.5TB drives for a 4.5TB RAID 5 storage brick,
> starting with four in replicate/distribute. So effectively 9TB of
> space for the gluster network. Now say we hit 90% capacity and add four
> new 4.5TB bricks. Am I correct to understand that the scale and defrag
> script would cause around 6TB of data to be spread around, twice
> since it's replicate, assuming the remaining 2TB get to stay where
> they were?
>
> If the network was able to sustain 30MB/s, that would take around 48
> hours of continuous operation to complete. Since the cluster is
> unlikely to be idle and there is bound to be some overheads, would
> that be closer to 72hrs in reality?
>
> Now it seems to me that since the scale and defrag would redistribute
> the chunks all over the new nodes, the next set of four would take 2x
> as long (97~145hrs) since there is more data now. Then the next
> group of four would take 3x as long (146~220hrs), or about a week.
>
> At some point, it seems that adding a new set of nodes may cause a
> scale/defrag time so long that the organisation may have to add a new
> set before it finishes?
>
> It doesn't seem to make sense so what am I actually getting wrong?
In part it depends on your network infrastructure, in particular your
switches/routers.  The 30MB/s you mentioned is (or should be) per
interface.  Yes, with more nodes there is more data to move around, but
there are also more interfaces involved in moving the data.  As long as
you don't come close to saturating your switches/routers, it should (at
least in theory) take roughly the same time regardless of how many nodes
are involved (assuming that each node keeps the same capacity and hardware).
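
To put rough numbers on that reasoning, here is a back-of-the-envelope
sketch in Python. It is only an idealised model, not anything Gluster
itself computes: the function name rebalance_hours is made up for
illustration, units are decimal (1TB = 1,000,000MB), transfers are
assumed to run perfectly in parallel across interfaces, and the switch
is assumed never to saturate. The 6TB and 30MB/s figures come from the
question above; the interface counts (4 and 8) are illustrative
assumptions.

```python
def rebalance_hours(data_tb, per_interface_mb_s, interfaces):
    """Idealised hours needed to move data_tb terabytes.

    Assumes every interface sustains per_interface_mb_s at the same time
    and that the switch is never the bottleneck (hypothetical model).
    """
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    if per_interface_mb_s <= 0 or interfaces <= 0:
        raise ValueError("throughput and interface count must be positive")
    total_mb = data_tb * 1_000_000            # decimal TB -> MB
    aggregate_mb_s = per_interface_mb_s * interfaces
    return total_mb / aggregate_mb_s / 3600   # seconds -> hours


if __name__ == "__main__":
    # Everything funnelled through a single 30MB/s interface.
    print(f"1 interface,  6TB:  {rebalance_hours(6, 30, 1):.1f} h")
    # Illustrative: the same data spread over 4 interfaces.
    print(f"4 interfaces, 6TB:  {rebalance_hours(6, 30, 4):.1f} h")
    # Twice the data, but also twice the interfaces.
    print(f"8 interfaces, 12TB: {rebalance_hours(12, 30, 8):.1f} h")
```

In this model, doubling both the data to move and the number of
interfaces leaves the estimate unchanged, which is the "roughly the same
time" case.  Real rebalances will be slower because of protocol
overhead, disk contention and client traffic.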