[Gluster-users] Best Practices for Gluster Replication
Burnash, James
jburnash at knight.com
Thu May 13 13:43:26 UTC 2010
Chris,
Excellent, and thanks - that was exactly what I was looking for.
James
-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Christopher Hawkins
Sent: Thursday, May 13, 2010 8:23 AM
To: gluster-users
Subject: Re: [Gluster-users] Best Practices for Gluster Replication
I have followed that debate before. The impression I got was that if you handle replication in the clients, they can fail over from one server to the next if the original goes down (after the timeout).
But if you handle it on the server side, then when a server goes down the only way to get HA for the clients is to implement something external like round-robin DNS. Other than this issue, I think either way is technically acceptable. If memory serves, this is the reason client-side is the "default" or preferred setup.
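As a rough, untested sketch (borrowing two of the hostnames from your config), server-side replication would mean moving the replicate translator into each server's export volfile, something like this:

# hypothetical addition to glusterfsd.vol on jc1letgfs13-pfs1
volume remote-mirror
# points at the partner server's exported brick
type protocol/client
option transport-type tcp
option remote-host jc1letgfs14-pfs1
option remote-subvolume brick1
end-volume

volume mirror-0
# replicate the local brick with the remote copy
type cluster/replicate
subvolumes brick1 remote-mirror
end-volume

The server would then export mirror-0 instead of brick1, and each client would talk to a single server through one protocol/client volume - which is exactly the single point of failure I mentioned, unless you put round-robin DNS in front of it.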
Chris
----- "James Burnash" <jburnash at knight.com> wrote:
> I know it's only been a day, and I understand that people are busy -
> but nobody has anything to share on this subject?
>
> It seems like it would be a good thing to at least understand
> why implementing it on the back end would be a bad idea ...
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org
> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Burnash,
> James
> Sent: Wednesday, May 12, 2010 10:30 AM
> To: gluster-users at gluster.org
> Subject: [Gluster-users] Best Practices for Gluster Replication
>
> Greetings List,
>
> I've searched through the Gluster wiki and a lot of threads to try to
> answer this question, but so far no real luck.
>
> Simply put - is it better to have replication handled by the clients,
> or by the bricks themselves?
>
> Volgen for a RAID 1 solution creates a config file that does the
> mirroring on the client side - which I would take as an implicit
> endorsement from the Gluster team (great team, BTW). However, it seems
> to me that if the bricks replicated between themselves on our 10Gb
> storage network, it could save a lot of bandwidth for the clients and
> conceivably save them CPU cycles and I/O as well.
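>
> (To put rough numbers on it: with client-side RAID 1, a client writing
> a 1 GB file pushes 2 GB out its 1Gb link - one copy to each brick in
> the mirror pair - whereas server-side replication would cost the client
> 1 GB and move the second copy over the 10Gb storage network.)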
>
> Client machines have 1Gb connections to the storage network, and are
> running CentOS 5.2.
> Server machines have 10Gb connections to the storage network, and are
> running CentOS 5.4.
>
> Glusterfs.vol:
> ## file auto generated by /usr/bin/glusterfs-volgen (mount.vol)
> # Cmd line:
> # $ /usr/bin/glusterfs-volgen --name testfs --raid 1 jc1letgfs13-pfs1:/export/read-write jc1letgfs14-pfs1:/export/read-write jc1letgfs15-pfs1:/export/read-write jc1letgfs16-pfs1:/export/read-write jc1letgfs17-pfs1:/export/read-write jc1letgfs18-pfs1:/export/read-write
>
> # RAID 1
> # TRANSPORT-TYPE tcp
> volume jc1letgfs17-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs17-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume jc1letgfs18-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs18-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume jc1letgfs13-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs13-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume jc1letgfs15-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs15-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume jc1letgfs16-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs16-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume jc1letgfs14-pfs1-1
> type protocol/client
> option transport-type tcp
> option remote-host jc1letgfs14-pfs1
> option transport.socket.nodelay on
> option transport.remote-port 6996
> option remote-subvolume brick1
> end-volume
>
> volume mirror-0
> type cluster/replicate
> subvolumes jc1letgfs13-pfs1-1 jc1letgfs14-pfs1-1
> end-volume
>
> volume mirror-1
> type cluster/replicate
> subvolumes jc1letgfs15-pfs1-1 jc1letgfs16-pfs1-1
> end-volume
>
> volume mirror-2
> type cluster/replicate
> subvolumes jc1letgfs17-pfs1-1 jc1letgfs18-pfs1-1
> end-volume
>
> volume distribute
> type cluster/distribute
> subvolumes mirror-0 mirror-1 mirror-2
> end-volume
>
> volume readahead
> type performance/read-ahead
> option page-count 4
> subvolumes distribute
> end-volume
>
> volume iocache
> type performance/io-cache
> option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
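> # (the backtick expression works out to roughly one fifth of total
> # RAM: MemTotal is reported in kB, and kB / 5120 = MB / 5)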
> option cache-timeout 1
> subvolumes readahead
> end-volume
>
> volume quickread
> type performance/quick-read
> option cache-timeout 1
> option max-file-size 64kB
> subvolumes iocache
> end-volume
>
> volume writebehind
> type performance/write-behind
> option cache-size 4MB
> subvolumes quickread
> end-volume
>
> volume statprefetch
> type performance/stat-prefetch
> subvolumes writebehind
> end-volume
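>
> (The performance translators above stack bottom-up - each one names the
> previous volume as its subvolume - so statprefetch, the topmost volume,
> is the one the client actually mounts.)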
>
> Glusterfsd.vol:
> ## file auto generated by /usr/bin/glusterfs-volgen (export.vol)
> # Cmd line:
> # $ /usr/bin/glusterfs-volgen --name testfs jc1letgfs13-pfs1:/export/read-write jc1letgfs14-pfs1:/export/read-write jc1letgfs15-pfs1:/export/read-write
>
> volume posix1
> type storage/posix
> option directory /export/read-write
> end-volume
>
> volume locks1
> type features/locks
> subvolumes posix1
> end-volume
>
> volume brick1
> type performance/io-threads
> option thread-count 8
> subvolumes locks1
> end-volume
>
> volume server-tcp
> type protocol/server
> option transport-type tcp
> option auth.addr.brick1.allow *
> option transport.socket.listen-port 6996
> option transport.socket.nodelay on
> subvolumes brick1
> end-volume
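>
> For reference, clients mount the client volfile directly, e.g. (the
> paths here are just our local layout):
>
> glusterfs -f /etc/glusterfs/glusterfs.vol /mnt/glusterfs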
>
> James Burnash
>
>
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users