[Gluster-users] Practical limits to the number of volumes?
Toby Corkindale
toby.corkindale at strategicdata.com.au
Wed May 22 01:30:17 UTC 2013
On 21/05/13 22:45, Joseph Santaniello wrote:
> Hello All,
>
> I am exploring options for deploying a Gluster system, and one
> possible scenario we are contemplating would involve potentially
> thousands (1-2000) of volumes with a correspondingly large number of
> mounts.
>
> Is there any intrinsic reason why this would be a bad idea with Gluster?
Two thoughts occur to me - firstly, memory consumption:
Gluster spawns a process for every volume on the servers and for every
mount on the client. So you'd end up with a lot of glusterfs processes
running on each machine. That's a lot of context switching for the
kernel to do, and they're going to use a non-negligible amount of memory.
I'm not actually sure what the real-world memory requirement per
process is. On a couple of machines I just checked, it looks like
somewhere between 15 and 30M (VmRSS minus VmLib), but your mileage may
vary.
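If you want to measure it on your own boxes, a rough Python sketch
along these lines (assuming Linux /proc, and that the brick and mount
processes all have names starting with "gluster") prints VmRSS minus
VmLib for every glusterfs process:

    import glob, re

    def kb(status_text, field):
        # pull a "VmXxx:  12345 kB" line out of /proc/<pid>/status
        m = re.search(r'^%s:\s+(\d+) kB' % field, status_text, re.M)
        return int(m.group(1)) if m else 0

    for path in glob.glob('/proc/[0-9]*/status'):
        try:
            text = open(path).read()
        except OSError:
            continue                      # process exited while scanning
        if not text.startswith('Name:\tgluster'):
            continue                      # only glusterfs/glusterfsd
        mb = (kb(text, 'VmRSS') - kb(text, 'VmLib')) // 1024
        print('pid %s: ~%d MB' % (path.split('/')[2], mb))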
If your memory use per gluster process is just 24M, that's still 48G
of RAM required to launch a couple of thousand of them. If it turns
out they need more like 128M each, that's a quarter of a terabyte of
memory required per machine.
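To spell out that arithmetic in a couple of lines of Python (24M is
just the middle of the range I measured, so substitute your own
numbers):

    per_process_mb = 24    # measured VmRSS minus VmLib; adjust to taste
    volumes = 2000         # one glusterfs process per volume per server
    total_mb = per_process_mb * volumes
    print('~%d GB per server' % (total_mb // 1000))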
The second thing that worries me is that Gluster's recovery mechanism
doesn't have anything to prevent simultaneous recovery across all the
volumes on a node. As a result, as soon as a bad node rejoins the
cluster, all 2000 of your volumes will start rebuilding at once,
causing massive random I/O load, and all your clients will starve.
That happens to me even with just a couple of dozen volumes, so I hate
to think how it'd go with thousands!
-Toby