[Gluster-devel] Confused on AFR, where does it happen client or server
Sascha Ottolski
ottolski at web.de
Tue Jan 8 07:55:21 UTC 2008
On Tuesday, 8 January 2008 at 06:06:30, Anand Avati wrote:
> > > Brandon,
> > > who does the copy is decided by where the AFR translator is loaded. If
> > > you have AFR loaded on the client side, then the client does the two
> > > writes. You can also have AFR loaded on the server side, and have the
> > > server do the replication. Translators can be loaded anywhere (client
> > > or server, anywhere in the graph). You need to think more along the
> > > lines of how you can 'program glusterfs' rather than how to 'configure
> > > glusterfs'.
> >
> > Performance-wise, which is better? Or does it make sense one way vs.
> > the other based on number of clients?
>
> Depends. If the interconnect between server and client is precious, then
> have the servers replicate (load AFR on the server side), with replication
> happening over a separate network. This is also a good option if your
> servers are interconnected with high-speed networks like InfiniBand.
>
> If your servers have just one network interface (no separate network
> for replication) and your client apps are IO bound, then it does not
> matter where you load AFR; both ways would give the same performance.
>
> avati
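To make the two placements concrete before my numbers: the difference is
just where the cluster/afr volume sits in the translator graph. Below is a
stripped-down sketch with placeholder host, path and volume names (not my
real setup; my full configs follow further down).

Client-side AFR - the afr volume lives in the client volfile, so the client
connects to both servers and performs every write twice:

volume remote1
type protocol/client
option transport-type tcp/client
option remote-host server1 # placeholder hostname
option remote-subvolume brick
end-volume

volume remote2
type protocol/client
option transport-type tcp/client
option remote-host server2 # placeholder hostname
option remote-subvolume brick
end-volume

volume mirror
type cluster/afr
subvolumes remote1 remote2 # the client does both writes
end-volume

Server-side AFR - in server1's volfile the afr volume sits on top of the
local disk plus a protocol/client connection to server2, so server1
forwards each write to its peer:

volume local
type storage/posix
option directory /data # placeholder path
end-volume

volume peer
type protocol/client
option transport-type tcp/client
option remote-host server2 # placeholder hostname
option remote-subvolume brick
end-volume

volume mirror
type cluster/afr
subvolumes local peer # the server replicates to its peer
end-volume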
I did a simple test recently which suggests that there is a significant
performance difference: a comparison of client-side vs. server-side AFR
with bonnie, on a setup with one client and two servers running tla
patch628, connected over gigabit Ethernet; please see my results below.

There was also a posting on this list with a lot of test results
suggesting that server-side AFR is fastest:
http://lists.nongnu.org/archive/html/gluster-devel/2007-08/msg00136.html

In my own results, though, client-side AFR comes out ahead in most of the
tests (notably sequential input and the create/delete tests), while
sequential output was somewhat faster with server-side AFR. I should note
that I'm not sure whether the chosen setup (the two servers AFR-ing each
other) has a negative impact on performance, so any comments on this would
be highly appreciated. The configs for both tests are included below.
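The numbers were produced with bonnie; the exact command line is not
reproduced here, but a run of this kind would look roughly like the
following (mount point and user are only placeholders, and -s matches the
31968M test size shown in the output):

bonnie++ -d /mnt/glusterfs -s 31968 -n 16 -u nobody
# -d: directory on the GlusterFS mount to test in
# -s: size of the sequential test data in MB
# -n: number of files (x1024) for the create/delete tests
# -u: user to run as when started as root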
server side afr (I hope it stays readable):
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
stf-db22     31968M 31438  43 35528   0   990   0 32375  43 41107   1  38.1   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16    34   0   416   0   190   0    35   0   511   0   227   0
client side afr:
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
stf-db22     31968M 27583  38 31518   0   862   0 49522  63 56388   2  28.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   418   0  2225   1   948   1   455   0  2305   1   947   0
server side afr config:
glusterfs-server.vol.server_afr:
volume fsbrick1
type storage/posix
option directory /data1
end-volume
volume fsbrick2
type storage/posix
option directory /data2
end-volume
volume nsfsbrick1
type storage/posix
option directory /data-ns1
end-volume
volume brick1
type performance/io-threads
option thread-count 8
option queue-limit 1024
subvolumes fsbrick1
end-volume
volume brick2
type performance/io-threads
option thread-count 8
option queue-limit 1024
subvolumes fsbrick2
end-volume
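# brick1r connects to the other server's brick2; afr1 below pairs it with
# the local brick1, so each server replicates its writes to its peer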
volume brick1r
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume brick2
end-volume
volume afr1
type cluster/afr
subvolumes brick1 brick1r
# option replicate *:2 # obsolete with tla snapshot
end-volume
### Add network serving capability to above bricks.
volume server
type protocol/server
option transport-type tcp/server # For TCP/IP transport
option listen-port 6996 # Default is 6996
option client-volume-filename /etc/glusterfs/glusterfs-client.vol
subvolumes afr1 nsfsbrick1
option auth.ip.afr1.allow * # Allow clients to access the afr1 volume
option auth.ip.brick2.allow * # Allow the peer server's brick1r to reach brick2
option auth.ip.nsfsbrick1.allow * # Allow access to the namespace brick
end-volume
glusterfs-client.vol.test.server_afr:
volume fsc1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.10
option remote-subvolume afr1
end-volume
volume fsc2
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume afr1
end-volume
volume ns1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.10
option remote-subvolume nsfsbrick1
end-volume
volume ns2
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume nsfsbrick1
end-volume
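# the unify namespace itself is mirrored across both servers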
volume afrns
type cluster/afr
subvolumes ns1 ns2
end-volume
volume bricks
type cluster/unify
subvolumes fsc1 fsc2
option namespace afrns
option scheduler alu
option alu.limits.min-free-disk 5% # Stop creating files when free space is below 5%
option alu.limits.max-open-files 10000
option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
option alu.disk-usage.entry-threshold 2GB # Units in KB, MB and GB are allowed
option alu.disk-usage.exit-threshold 60MB # Units in KB, MB and GB are allowed
option alu.open-files-usage.entry-threshold 1024
option alu.open-files-usage.exit-threshold 32
option alu.stat-refresh.interval 10sec
end-volume
volume readahead
type performance/read-ahead
option page-size 256KB
option page-count 2
subvolumes bricks
end-volume
volume write-behind
type performance/write-behind
option aggregate-size 1MB
subvolumes readahead
end-volume
-----------------------------------------------------------------------
client side afr config:
glusterfs-server.vol.client_afr:
volume fsbrick1
type storage/posix
option directory /data1
end-volume
volume fsbrick2
type storage/posix
option directory /data2
end-volume
volume nsfsbrick1
type storage/posix
option directory /data-ns1
end-volume
volume brick1
type performance/io-threads
option thread-count 8
option queue-limit 1024
subvolumes fsbrick1
end-volume
volume brick2
type performance/io-threads
option thread-count 8
option queue-limit 1024
subvolumes fsbrick2
end-volume
### Add network serving capability to above bricks.
volume server
type protocol/server
option transport-type tcp/server # For TCP/IP transport
option listen-port 6996 # Default is 6996
option client-volume-filename /etc/glusterfs/glusterfs-client.vol
subvolumes brick1 brick2 nsfsbrick1
option auth.ip.brick1.allow * # Allow access to "brick" volume
option auth.ip.brick2.allow * # Allow access to "brick" volume
option auth.ip.nsfsbrick1.allow * # Allow access to "brick" volume
end-volume
glusterfs-client.vol.test.client_afr:
volume fsc1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.10
option remote-subvolume brick1
end-volume
volume fsc1r
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.10
option remote-subvolume brick2
end-volume
volume fsc2
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume brick1
end-volume
volume fsc2r
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume brick2
end-volume
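# each afr pair mirrors a brick on 10.10.1.10 with one on 10.10.1.99, so
# the client itself writes every file to both servers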
volume afr1
type cluster/afr
subvolumes fsc1 fsc2r
# option replicate *:2 # obsolete with tla snapshot
end-volume
volume afr2
type cluster/afr
subvolumes fsc2 fsc1r
# option replicate *:2 # obsolete with tla snapshot
end-volume
volume ns1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.10
option remote-subvolume nsfsbrick1
end-volume
volume ns2
type protocol/client
option transport-type tcp/client
option remote-host 10.10.1.99
option remote-subvolume nsfsbrick1
end-volume
volume afrns
type cluster/afr
subvolumes ns1 ns2
# option replicate *:2 # obsolete with tla snapshot
end-volume
volume bricks
type cluster/unify
subvolumes afr1 afr2
option namespace afrns
option scheduler alu
option alu.limits.min-free-disk 5% # Stop creating files when free space is below 5%
option alu.limits.max-open-files 10000
option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
option alu.disk-usage.entry-threshold 2GB # Units in KB, MB and GB are allowed
option alu.disk-usage.exit-threshold 60MB # Units in KB, MB and GB are allowed
option alu.open-files-usage.entry-threshold 1024
option alu.open-files-usage.exit-threshold 32
option alu.stat-refresh.interval 10sec
end-volume
volume readahead
type performance/read-ahead
option page-size 256KB
option page-count 2
subvolumes bricks
end-volume
volume write-behind
type performance/write-behind
option aggregate-size 1MB
subvolumes readahead
end-volume
Cheers, Sascha