[Gluster-users] some thoughts please on setting up a software archive based on glusterfs

webmaster at securitywonks.org webmaster at securitywonks.org
Tue Jul 29 22:41:01 UTC 2008

Dear Keith

> At 12:21 PM 7/29/2008, webmaster at securitywonks.org wrote:
>>happy to hear you posted in "Who's using Gluster", all the best :) please
>>let me know once your experiment results are stable and all manual
>>things are automated :)
> our goal is to have around November, a fully (or mostly) self
> managing clustered cpanel installation (running over gluster)
> hopefully that timeframe will work.
>>if I have to add multiple gluster clients, how do I do it? Is the only
>>way to use multiple dedicated servers?
>>Or is it economical to set up multiple VPSes on one physical server and
>>use them as multiple gluster clients? (I am just trying to make it
>>economical if possible, while trying to gain some extra performance.)
>>I am just trying to think of different ways to do this economically;
>>what do you say, sir?
> I do understand the love affair with VPSes and virtual machines;
> however, they don't usually solve performance issues, and generally
> result in reduced performance.
> Let's take your scenario:
> You use memcached to cache database info.  This uses RAM.  You also
> will want to use local disk to cache gluster files.
> If you take a server with 8GB of RAM and a 500GB drive, you can
> cache 400GB+ of filesystem data and you can load up multiple
> memcached processes (I think each one can address 2GB of RAM?), so in
> a single machine you can cache 6-7GB of DB stuff in memory (also, the
> OS will use whatever extra RAM it has to cache).
> Now, split that up into 4 virtual machines.
> You now have, per machine, 100GB of disk cache, and less than 2GB of
> memory for caching (more likely 1 or 1.5).
> So, in your case, running one larger instance is going to provide
> MUCH better performance than splitting your resources.
> The only advantage to virtualizing in this manner is if you're also
> partitioning your data, and then you might want different virtual
> machines doing different things, so you can optimize each unit for
> each particular function.
> For example... you might have one webserver instance serving small
> image files, another serving large software package files, and
> possibly another for the database.  This way you can allocate more
> RAM to the DB instance, and more disk for caching to the file server
> instances.  Possibly more for the image server and less for the large
> file server (since that's likely to get more cache misses anyway,
> just focus on optimizing the gluster config for the large files and
> allow more caching space for the small ones).
> But simply virtualizing your hardware and cloning your config will
> have a negative impact on performance overall.
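
To make the arithmetic in the quoted scenario concrete, here is a rough Python sketch. The function name and the 1GB-per-guest OS overhead are my own assumptions for illustration; only the 8GB RAM, 400GB disk cache and 4-way split come from the mail above.

```python
def split_resources(ram_gb, disk_cache_gb, vm_count, os_overhead_gb=1.0):
    """Divide one server's RAM and disk cache evenly across vm_count guests.

    os_overhead_gb is an assumed per-guest cost for the OS and services;
    it is a placeholder, not a measured figure.
    """
    ram_per_vm = ram_gb / vm_count
    return {
        "disk_cache_gb": disk_cache_gb / vm_count,
        "ram_gb": ram_per_vm,
        # RAM left for memcached and page cache after the assumed overhead
        "cache_ram_gb": max(ram_per_vm - os_overhead_gb, 0.0),
    }


single = split_resources(ram_gb=8, disk_cache_gb=400, vm_count=1)
quarter = split_resources(ram_gb=8, disk_cache_gb=400, vm_count=4)
print("one instance :", single)
print("each of 4 VMs:", quarter)
```

Under these assumptions one instance keeps about 7GB of RAM and the whole 400GB for caching, while each of four guests gets about 1GB of RAM and 100GB of disk cache, which is in line with the estimate quoted above.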

I currently plan to host the screenshots, icons, web server, database server
and gluster client on one server with a quad-core processor, 8GB
RAM and a 500GB HDD (if required, I will take a 750GB HDD). For now, to
simplify the setup, I will simply host screenshots, icons etc. on separate
cpanel accounts and not go for the VPS model (even though we could set up
lighttpd for image serving and apache/php for dynamic page serving etc.).
If required, I will keep another copy of the web server elsewhere and
route requests between them in round-robin fashion. As we already have
memcache and mysql, that will be fine for the initial start.
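
For illustration only, round-robin routing between two copies of the web server can be sketched in Python as below. The hostnames are hypothetical placeholders, and in practice this is usually done with multiple DNS records or a load balancer rather than in application code.

```python
from itertools import cycle


def make_round_robin(backends):
    """Return a function that hands out backends in rotating order."""
    pool = cycle(backends)  # endless iterator over the backend list
    return lambda: next(pool)


# Hypothetical hostnames for the two web server copies.
pick = make_round_robin(["web1.example.org", "web2.example.org"])
order = [pick() for _ in range(4)]
print(order)
```

Each call to `pick()` returns the next server in turn, so requests alternate between the two copies.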

Coming to the glusterfs servers, I plan on a dual-core processor server
with 4GB RAM and a 500GB HDD; I hope that will do fine.

>>need to think about Alu translator, once again then, thanks for your
>>input on this,
> I'm sure those with more familiarity with the translator can give
> better advice, but as I understand things, it may not help in your
> particular situation.

I will keep this in mind when I ask their team.
>>what is your hardware configuration? It will be helpful to know; please
>>share the configuration details if you like, we will be glad to know
> one server pair are Athlon 64 uniprocessors with 2GB RAM each.
> another pair are slightly less speedy processors with 1GB RAM each.
> Admittedly my configuration isn't the most powerful, but it works just
> fine, and I get reasonable performance out of them.
> I don't load 1000's of hosts onto my cpanel servers, as I'm not in
> the commodity web sale business.  I'd imagine for those customers
> memory and possibly cpu would need to be much improved.

That configuration is OK for normal hosting. Adding some more RAM would
make life more comfortable.

>>but if we cache on all gluster clients, at the end of the day, I suspect
>>these may become like regular file servers; any update may not only
>>trigger synchronisation between glusterfs servers, but also an update of
>>the gluster client caches. Please share your thoughts and observations
>>further, thank you
> the clients behave the same when caching as the servers with afr, as
> far as I know...
> when a file is requested, the gluster client asks the server for its
> file's version and timestamp, and compares them with its own.  if its
> copy is the same (or newer, I presume), it serves from the local disk
> cache; if not, it fetches from the server and updates its cache.
This is the same behaviour as memcache. I used to wonder when a
memcache-like solution would be available for file hosting; I feel gluster
will fulfil that need. I hope more redundancy features become available as
it matures.
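
The cache check Keith describes above can be sketched in Python roughly as follows. This is only my illustration of the described behaviour, not GlusterFS code or its API: `FileMeta`, `is_cache_fresh` and `read_file` are hypothetical names, the server and cache are plain dictionaries, and treating "same or newer" as a check on both version and timestamp is my assumption.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    version: int
    mtime: float  # modification timestamp, seconds since the epoch


def is_cache_fresh(cached, remote):
    """True if the cached copy is the same as, or newer than, the server's."""
    return cached.version >= remote.version and cached.mtime >= remote.mtime


def read_file(path, cache, server):
    """Return (data, source), where source is 'cache' or 'server'.

    Both stores map path -> (FileMeta, bytes). A real client would ask the
    server for the metadata only and fetch the data just on a miss.
    """
    remote_meta, remote_data = server[path]
    entry = cache.get(path)
    if entry is not None and is_cache_fresh(entry[0], remote_meta):
        return entry[1], "cache"
    cache[path] = (remote_meta, remote_data)  # refresh the local copy
    return remote_data, "server"


server = {"pkg.tar.gz": (FileMeta(1, 100.0), b"v1")}
cache = {}
first = read_file("pkg.tar.gz", cache, server)   # miss: fetched from server
second = read_file("pkg.tar.gz", cache, server)  # hit: served from cache
server["pkg.tar.gz"] = (FileMeta(2, 200.0), b"v2")
third = read_file("pkg.tar.gz", cache, server)   # stale: fetched again
print(first, second, third)
```

Files that are written once and rarely changed stay on the hit path, which is why a software archive should suit this kind of caching.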

> If you have an environment where your files are updated constantly,
> you won't benefit from the cache, but as you describe your
> environment, I'd imagine mostly things are loaded once and left alone.
> You're serving software downloads--the software doesn't change
> frequently.  You'll simply add more, right?

Agreed. Software titles get updated occasionally: some people release
minor versions regularly and some release updated versions
infrequently. Since we will host more software over time, cache
updates will occur regularly, but each update will touch different content.

>>what is your recommended configuration of gluster clients?
> I think if they're simply clients with a web server, you would
> benefit more from larger disk (if you'll benefit from caching) and
> wouldn't need as much memory.
>>memcached needs code changes, it will be helpful, try it, it will support
>>the cause, many big sites use it
> yes.. I'm not in control of most of the code my clients run, so I
> haven't bothered with it.
>>thank you my friend, will you notify me, when your cpanel setup is ready
>>with automation?
> I'll send you a note, but again, it likely won't be until after October.

All the best with your cluster based on the cpanel and gluster combination.
> Keith

With Best Regards
Raghu Veer
