[Gluster-users] Gluster and bonding
Jorick Astrego
jorick at netbulae.eu
Mon Feb 25 12:44:04 UTC 2019
Hi,
Well no, mode 5 and mode 6 also have fault tolerance and don't need any
special switch config.
A quick Google search:
https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance
Bonding Mode 5 (balance-tlb) works by looking at all the devices in
the bond, and sending out the slave with the least current traffic
load. Traffic is only received by one slave (the "primary slave").
If a slave is lost, that slave is not considered for transmission,
so this mode is fault-tolerant.
Bonding Mode 6 (balance-alb) works as above, except incoming ARP
requests are intercepted by the bonding driver, and the bonding
driver generates ARP replies so that external hosts are tricked into
sending their traffic into one of the other bonding slaves instead
of the primary slave. If many hosts in the same broadcast domain
contact the bond, then traffic should balance roughly evenly into
all slaves.
If a slave is lost in Mode 6, then it may take some time for a
remote host to time out its ARP table entry and send a new ARP
request. A TCP or SCTP retransmission tends to lead to an ARP request
fairly quickly, but a UDP datagram does not, and will rely on the
usual ARP table refresh. So Mode 6 /is/ fault tolerant, but
convergence on slave loss may take some time depending on the Layer
4 protocol used.
If you are worried about fast fault tolerance, then consider using
Mode 4 (802.3ad aka LACP) which negotiates link aggregation between
the bond and the switch, and constantly updates the link status
between the aggregation partners. Mode 4 also has configurable load
balance hashing so is better for in-order delivery of TCP streams
compared to Mode 5 or Mode 6.
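To illustrate why a transmit hash helps with in-order delivery, here is a minimal Python sketch. It is loosely modelled on the idea behind the layer2 hash policy (XOR of the MAC addresses, modulo the number of slaves) and is not the kernel's actual code; the function names and the MAC addresses are made up for the example.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC like '52:54:00:00:00:01' to an integer."""
    return int(mac.replace(":", ""), 16)


def pick_slave(src_mac: str, dst_mac: str, slaves: list) -> str:
    """Pick a transmit slave from a hash of the two MAC addresses.

    Simplified assumption: hash = src XOR dst, slave = hash mod count.
    The same address pair always hashes to the same slave, so the
    frames of one flow leave through one link and are not reordered.
    """
    if not slaves:
        raise ValueError("bond has no slaves")
    flow_hash = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    return slaves[flow_hash % len(slaves)]


if __name__ == "__main__":
    bond = ["eth0", "eth1"]  # hypothetical slave names
    print(pick_slave("52:54:00:00:00:01", "52:54:00:00:00:02", bond))
    print(pick_slave("52:54:00:00:00:01", "52:54:00:00:00:03", bond))
```

Different peers can land on different slaves, but a single peer stays pinned to one slave for as long as the slave list does not change.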
https://wiki.linuxfoundation.org/networking/bonding
  * *balance-tlb or 5*
    Adaptive transmit load balancing: channel bonding that does not
    require any special switch support. The outgoing traffic is
    distributed according to the current load (computed relative to the
    speed) on each slave. Incoming traffic is received by the current
    slave. *If the receiving slave fails, another slave takes over the
    MAC address of the failed receiving slave.*
      o Prerequisite:
          1. Ethtool support in the base drivers for retrieving the
             speed of each slave.
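As a rough illustration of the balance-tlb selection described above, the following Python sketch picks the active slave with the lowest load relative to its speed and simply skips slaves whose link is down. The Slave class, its fields and the numbers are assumptions for the example, not the bonding driver's data structures.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    speed_mbps: int      # link speed, as ethtool would report it
    load_mbps: float     # current transmit load (hypothetical measurement)
    link_up: bool = True


def pick_tx_slave(slaves: list) -> Slave:
    """Return the active slave with the least load relative to its speed."""
    candidates = [s for s in slaves if s.link_up]
    if not candidates:
        raise RuntimeError("no active slave in bond")
    return min(candidates, key=lambda s: s.load_mbps / s.speed_mbps)


if __name__ == "__main__":
    bond = [Slave("eth0", 1000, 600.0), Slave("eth1", 1000, 200.0)]
    print(pick_tx_slave(bond).name)   # least loaded slave
    bond[1].link_up = False           # simulate a cable pull on eth1
    print(pick_tx_slave(bond).name)   # traffic moves to the remaining slave
```

A failed slave is just dropped from the candidate list, which is why this mode keeps transmitting after a link is lost.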
  * *balance-alb or 6*
    Adaptive load balancing: *includes balance-tlb plus receive load
    balancing* (rlb) for IPV4 traffic, and does not require any special
    switch support. The receive load balancing is achieved by ARP
    negotiation.
      o The bonding driver intercepts the ARP Replies sent by the local
        system on their way out and overwrites the source hardware
        address with the unique hardware address of one of the slaves in
        the bond such that different peers use different hardware
        addresses for the server.
      o Receive traffic from connections created by the server is also
        balanced. When the local system sends an ARP Request the bonding
        driver copies and saves the peer's IP information from the ARP
        packet.
      o When the ARP Reply arrives from the peer, its hardware address
        is retrieved and the bonding driver initiates an ARP reply to
        this peer assigning it to one of the slaves in the bond.
      o A problematic outcome of using ARP negotiation for balancing is
        that each time that an ARP request is broadcast it uses the
        hardware address of the bond. Hence, peers learn the hardware
        address of the bond and the balancing of receive traffic
        collapses to the current slave. This is handled by sending
        updates (ARP Replies) to all the peers with their individually
        assigned hardware address such that the traffic is
        redistributed. Receive traffic is also redistributed when a new
        slave is added to the bond and when an inactive slave is
        re-activated. The receive load is distributed sequentially
        (round robin) among the group of highest speed slaves in the bond.
      o When a link is reconnected or a new slave joins the bond the
        receive traffic is redistributed among all active slaves in the
        bond by initiating ARP Replies with the selected mac address to
        each of the clients. The updelay parameter (detailed below) must
        be set to a value equal or greater than the switch's forwarding
        delay so that the ARP Replies sent to the peers will not be
        blocked by the switch.
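The round robin receive balancing described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: slaves are plain (mac, speed_mbps, link_up) tuples, the peers and MAC addresses are invented, and the real driver does this by sending ARP Replies rather than by building a dictionary.

```python
from itertools import cycle


def assign_peers(peers: list, slaves: list) -> dict:
    """Map each peer IP to the MAC of one slave, round robin.

    Only active slaves in the highest speed group are used. Calling
    this again after a link change models the redistribution step.
    """
    active = [s for s in slaves if s[2]]
    if not active:
        raise RuntimeError("no active slave in bond")
    top_speed = max(speed for _, speed, _ in active)
    fastest = [mac for mac, speed, _ in active if speed == top_speed]
    return dict(zip(peers, cycle(fastest)))


if __name__ == "__main__":
    peers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    slaves = [("52:54:00:aa:00:01", 1000, True),
              ("52:54:00:aa:00:02", 1000, True)]
    print(assign_peers(peers, slaves))
    # After losing the second slave every peer is pointed at the first one.
    slaves[1] = ("52:54:00:aa:00:02", 1000, False)
    print(assign_peers(peers, slaves))
```

Until a peer has received the updated ARP Reply (or its ARP entry has timed out) it keeps sending to the old MAC address, which is the convergence delay mentioned for Mode 6.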
On 2/25/19 1:16 PM, Martin Toth wrote:
> Hi Alex,
>
> you have to use bond mode 4 (LACP - 802.3ad) in order to achieve
> redundancy of cables/ports/switches. I suppose this is what you want.
>
> BR,
> Martin
>
>> On 25 Feb 2019, at 11:43, Alex K <rightkicktech at gmail.com
>> <mailto:rightkicktech at gmail.com>> wrote:
>>
>> Hi All,
>>
>> I was asking if it is possible to have the two separate cables
>> connected to two different physical switches. When trying mode6 or
>> mode1 in this setup gluster was refusing to start the volumes, giving
>> me "transport endpoint is not connected".
>>
>> server1: cable1 ---------------- switch1 --------------------- server2: cable1
>>                                     |
>> server1: cable2 ---------------- switch2 --------------------- server2: cable2
>>
>> Both switches are connected with each other also. This is done to
>> achieve redundancy for the switches.
>> When disconnecting cable2 from both servers, then gluster was happy.
>> What could be the problem?
>>
>> Thanx,
>> Alex
>>
>>
>> On Mon, Feb 25, 2019 at 11:32 AM Jorick Astrego <jorick at netbulae.eu
>> <mailto:jorick at netbulae.eu>> wrote:
>>
>> Hi,
>>
>> We use bonding mode 6 (balance-alb) for GlusterFS traffic
>>
>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
>>
>> Preferred bonding mode for Red Hat Gluster Storage client is
>> mode 6 (balance-alb), this allows client to transmit writes
>> in parallel on separate NICs much of the time.
>>
>> Regards,
>>
>> Jorick Astrego
>>
>> On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
>>> 23.02.2019 19:54, Alex K wrote:
>>>> Hi all,
>>>>
>>>> I have a replica 3 setup where each server was configured with
>>>> dual interfaces in mode 6 bonding. All cables were connected
>>>> to one common network switch.
>>>>
>>>> To add redundancy to the switch, and avoid being a single point
>>>> of failure, I connected each second cable of each server to a
>>>> second switch. This turned out to not function as gluster was
>>>> refusing to start the volume logging "transport endpoint is
>>>> disconnected" although all nodes were able to reach each other
>>>> (ping) in the storage network. I switched the mode to mode 1
>>>> (active/passive) and initially it worked, but after a reboot of
>>>> the whole cluster the same issue appeared. Gluster is not starting
>>>> the volumes.
>>>>
>>>> Isn't active/passive supposed to work like that? Can one have
>>>> such redundant network setup or are there any other recommended
>>>> approaches?
>>>>
>>>
>>> Yes, we use LACP, which I guess is mode 4 (we use teamd); it
>>> is, no doubt, the best way.
>>>
>>>
>>>> Thanx,
>>>> Alex
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>>
>>
>>
>> Met vriendelijke groet, With kind regards,
>>
>> Jorick Astrego
>> Netbulae Virtualization Experts
>> ------------------------------------------------------------------------
>> Tel: 053 20 30 270 info at netbulae.eu <mailto:info at netbulae.eu>
>> Staalsteden 4-3A KvK 08198180
>> Fax: 053 20 30 271 www.netbulae.eu <http://www.netbulae.eu/>
>> 7547 TA Enschede BTW NL821234584B01
>>
>>
>> ------------------------------------------------------------------------
>>
>
>
Met vriendelijke groet, With kind regards,
Jorick Astrego
Netbulae Virtualization Experts
----------------
Tel: 053 20 30 270 info at netbulae.eu Staalsteden 4-3A KvK 08198180
Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
----------------