[Gluster-users] building 4-nodes cluster

Roman Hlynovskiy roman.hlynovskiy at gmail.com
Mon Aug 11 09:40:14 UTC 2008


Hello Keith,

OK, thanks. We will run stress tests with PHP and check whether the
same situation applies to our configuration.
Did this semaphore issue occur only with some specific number of
simultaneous connections, or was it a matter of "luck" :) ?


2008/8/11 Keith Freedman <freedman at freeformit.com>:
> I'll let one of the devs respond to your specific config.
>
> There are a couple cautions ...
> if you're running PHP, you'll want to modify your php.ini to put
> session_save_path on shared storage.  If someone's session starts on server
> one and the browser directs them to server two, their session is missing
>  (either that, or use DB-based sessions).
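
For illustration, pointing PHP's session files at the shared mount would look something like this in php.ini (the mount point here is hypothetical, not from the original mail):

```
; store sessions on the GlusterFS mount so any load-balanced node can see them
session.save_handler = files
session.save_path = "/mnt/glusterfs/php_sessions"
```

DB-based sessions sidestep shared-storage locking entirely, at the cost of extra database traffic.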
>
> I've noticed some problems with this configuration, in that it seems PHP
> likes to create semaphores all the time.  These get created in
> session_save_path.   There seem to be some cases where processes sometimes
> block on a semaphore from the other server.
>
> I haven't been able to figure out exactly why, and it may be exclusive to my
> configuration, but it's something to watch out for.
> You might end up with unkillable PHP processes blocked in iowait.  The
> only solution has been to kill gluster and remount the filesystem.  This
> only takes a second, but it's inconvenient, and until you realize it's
> happening, any process that tries to access the same files will block too,
> eventually consuming all your spare httpd processes.
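
Until the cause is tracked down, a quick way to spot the symptom is to look for processes stuck in uninterruptible sleep (state D), which is how an iowait-blocked PHP process shows up; this one-liner is illustrative and ps column layout can vary between systems:

```shell
# print the pid and command of every process in uninterruptible sleep (D state)
ps axo pid,stat,comm | awk '$2 ~ /^D/ { print $1, $3 }'
```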
>
> Keith
>
> At 10:51 PM 8/10/2008, Roman Hlynovskiy wrote:
>>
>> Hello everyone,
>>
>> We want to build a cluster of 4 web servers. FTP and HTTP will be
>> load-balanced, so we will never know which node will serve ftp/http
>> traffic.
>> Since we don't want to lose any functionality when one of the
>> servers goes out of order, we have devised the following
>> architecture:
>>  - each server will have 2 data bricks and 1 namespace brick
>>  - each server's second data brick is AFRed with the first data
>> brick of the next server
>>  - all namespace bricks are AFRed
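
Spelled out from the client config below, that data-brick pairing forms a ring across the four servers (server numbering is mine):

```
afr01 = server1/brick2 + server2/brick1
afr02 = server2/brick2 + server3/brick1
afr03 = server3/brick2 + server4/brick1
afr04 = server4/brick2 + server1/brick1   # the ring closes back to server1
```

so losing any single server still leaves every AFR pair with one live replica.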
>>
>> We've tried to follow the recommendations from the wiki, and the
>> following configs have been created:
>> ------------------------------- begin server config
>> -------------------------------------------
>>
>> #
>> # Object Storage Brick 1
>> #
>>
>> # low-level brick pointing to physical folder
>> volume posix1
>>        type storage/posix
>>        option directory /mnt/os1/export
>> end-volume
>>
>> # put support for fcntl over brick
>> volume locks1
>>        type features/posix-locks
>>        subvolumes posix1
>>        option mandatory on
>> end-volume
>>
>> # put additional io threads for this brick
>> volume brick1
>>        type performance/io-threads
>>        option thread-count 4
>>        option cache-size 32MB
>>        subvolumes locks1
>> end-volume
>>
>> #
>> # Object Storage Brick 2
>> #
>>
>> # low-level brick pointing to physical folder
>> volume posix2
>>        type storage/posix
>>        option directory /mnt/os2/export
>> end-volume
>>
>> # put support for fcntl over brick
>> volume locks2
>>        type features/posix-locks
>>        subvolumes posix2
>>        option mandatory on
>> end-volume
>>
>> # put additional io threads for this brick
>> volume brick2
>>        type performance/io-threads
>>        option thread-count 4
>>        option cache-size 32MB
>>        subvolumes locks2
>> end-volume
>>
>> #
>> # Metadata Storage
>> #
>>
>> volume brick1ns
>>        type storage/posix
>>        option directory /mnt/ms1
>> end-volume
>>
>> #
>> # Volume to export
>> #
>>
>> volume server
>>        type protocol/server
>>        subvolumes brick1 brick2 brick1ns
>>        option transport-type tcp/server
>>        option auth.ip.brick1.allow *
>>        option auth.ip.brick2.allow *
>>        option auth.ip.brick1ns.allow *
>> end-volume
>>
>> ------------------------------- end server config
>> -------------------------------------------
>>
>> and client config from one of the nodes
>>
>> ------------------------------- begin client config
>> -------------------------------------------
>>
>> ### begin x-346-01 ###
>>
>> volume brick01
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.11
>>  option remote-subvolume brick1
>> end-volume
>>
>> volume brick02
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.11
>>  option remote-subvolume brick2
>> end-volume
>>
>> volume brick01ns
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.11
>>  option remote-subvolume brick1ns
>> end-volume
>>
>> ### end x-346-01 ###
>>
>>
>>
>> ### begin x-346-02 ###
>>
>> volume brick03
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.21
>>  option remote-subvolume brick1
>> end-volume
>>
>> volume brick04
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.21
>>  option remote-subvolume brick2
>> end-volume
>>
>> volume brick03ns
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.21
>>  option remote-subvolume brick1ns
>> end-volume
>>
>> ### end x-346-02 ###
>>
>>
>>
>> ### begin x-346-03 ###
>>
>> volume brick05
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.31
>>  option remote-subvolume brick1
>> end-volume
>>
>> volume brick06
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.31
>>  option remote-subvolume brick2
>> end-volume
>>
>> volume brick05ns
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.31
>>  option remote-subvolume brick1ns
>> end-volume
>>
>> ### end x-346-03 ###
>>
>>
>>
>> ### begin x-346-04 ###
>>
>> volume brick07
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.41
>>  option remote-subvolume brick1
>> end-volume
>>
>> volume brick08
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.41
>>  option remote-subvolume brick2
>> end-volume
>>
>> volume brick07ns
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 192.168.252.41
>>  option remote-subvolume brick1ns
>> end-volume
>>
>> ### end x-346-04 ###
>>
>>
>>
>> ### afr bricks ###
>>
>> volume afr01
>>  type cluster/afr
>>  subvolumes brick02 brick03
>> end-volume
>>
>> volume afr02
>>  type cluster/afr
>>  subvolumes brick04 brick05
>> end-volume
>>
>> volume afr03
>>  type cluster/afr
>>  subvolumes brick06 brick07
>> end-volume
>>
>> volume afr04
>>  type cluster/afr
>>  subvolumes brick08 brick01
>> end-volume
>>
>> volume afrns
>>  type cluster/afr
>>  subvolumes brick01ns brick03ns brick05ns brick07ns
>> end-volume
>>
>> ### unify ###
>>
>> volume unify
>>  type cluster/unify
>>  option namespace afrns
>>  option scheduler nufa
>>  option nufa.local-volume-name brick03
>>  option nufa.local-volume-name brick04
>>  option nufa.limits.min-free-disk 5%
>>  subvolumes afr01 afr02 afr03 afr04
>> end-volume
>>
>> ------------------------------- end client config
>> -------------------------------------------
>>
>> Everything seems to be working fine, but we want to know whether there
>> are any alternatives to this configuration, and whether some additional
>> optimizations could be applied.
>> Is there any mechanism to split one file over more than 2 nodes?
>> Do we need the read-ahead translator if we use nufa with local-volume
>> options? What about write-behind? Did we miss something else?
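
On the file-splitting question: GlusterFS also ships a cluster/stripe translator that chunks a single file across several subvolumes. A minimal sketch follows; the block size and the reuse of the afr volumes are illustrative only, and striping trades away some of unify/AFR's operational simplicity:

```
volume stripe0
 type cluster/stripe
 option block-size *:1MB        # stripe all files in 1MB chunks (illustrative)
 subvolumes afr01 afr02 afr03 afr04
end-volume
```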
>>
>>
>> --
>> ...WBR, Roman Hlynovskiy
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
>
>



-- 
...WBR, Roman Hlynovskiy

