Add documentation about NB DB driver

This includes the option to use the OVN-Cluster for routing instead of the
kernel. It also updates the supportability matrix to better reflect the
current status, and slightly reorganizes the documentation structure.

Change-Id: If8fb9a42f74511e9f70a25d7c08dce99c20c3f10

parent 6678aa5250
commit f94c041e7a
BIN doc/images/ovn-cluster-overview.png (new binary file, 44 KiB; not shown)
@@ -22,62 +22,67 @@ The next sections highlight the options and features supported by each driver

BGP Driver (SB)
---------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants           | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+==========================+====================+=======================+===========+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP to dummy NIC isolated in a VRF | Ingress: ip rules, and ip routes on the  | Yes                      | Yes                | No                    | Yes       |
|                 |                                                     |                                          | routing table associated with OVS        |                          | (expose_ipv6_gua   |                       |           |
|                 |                                                     |                                          | Egress: OVS flow to change MAC           | (expose_tenant_networks) | _tenant_networks)  |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+
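
In the underlay method, the "Adding IP to dummy NIC isolated in a VRF" step in
the "Expose with" column boils down to a handful of iproute2 commands. A
minimal sketch, assuming the agent defaults used later in this document
(``bgp-vrf``, ``bgp-nic``, routing table 10):

.. code-block:: ini

   # Sketch only: device and table names are the documented defaults, not fixed.
   $ ip link add bgp-vrf type vrf table 10
   $ ip link set bgp-vrf up
   $ ip link add bgp-nic type dummy
   $ ip link set bgp-nic master bgp-vrf
   $ ip link set bgp-nic up
   # Exposing a VM IP then reduces to adding it to the dummy device:
   $ ip addr add 172.24.4.226/32 dev bgp-nic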

BGP Driver (NB)
---------------

OVN version 23.09 is required to expose tenant networks and OVN load
balancers (ovn-lb), because the CR-LRP port chassis information in the NB DB
is only available starting in that version
(https://bugzilla.redhat.com/show_bug.cgi?id=2107515).

The following table lists the various methods you can use to expose the
networks/IPs, how they expose the IPs and the tenant networks, and whether
OVS-DPDK and hardware offload (HWOL) are supported.

+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Exposing Method | Description                                         | Expose with                               | Wired with                               | Expose Tenants or GUA    | OVS-DPDK/HWOL Support | Supported     |
+=================+=====================================================+===========================================+==========================================+==========================+=======================+===============+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP to dummy NIC isolated in a VRF. | Ingress: ip rules, and ip routes on the  | Yes                      | No                    | Yes           |
|                 |                                                     |                                           | routing table associated to OVS          |                          |                       |               |
|                 |                                                     |                                           | Egress: OVS-flow to change MAC           | (expose_tenant_networks) |                       |               |
+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| L2VNI           | Extends the L2 segment on a given VNI.              | No need to expose it, automatic with the  | Ingress: vxlan + bridge device           | N/A                      | No                    | No            |
|                 |                                                     | FRR configuration and the wiring.         | Egress: nothing                          |                          |                       |               |
+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| VRF             | Expose IPs on a given VRF (vni id).                 | Add IPs to dummy NIC associated to the    | Ingress: vxlan + bridge device           | Yes                      | No                    | No            |
|                 |                                                     | VRF device (lo_VNI_ID).                   | Egress: flow to redirect to VRF device   | (Not implemented)        |                       |               |
+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Dynamic         | Mix of the previous. Depending on annotations it    | Mix of the previous three.                | Ingress: mix of all the above            | Depends on the method    | No                    | No            |
|                 | exposes IPs differently and on different VNIs.      |                                           | Egress: mix of all the above             | used                     |                       |               |
+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| OVN             | Make use of an extra OVN cluster (per node) instead | Adding IP to dummy NIC isolated in a VRF  | Ingress: OVN routes, OVS flow (MAC tweak)| Yes                      | Yes                   | Yes. Only for |
|                 | of kernel routing -- exposing the IPs with BGP is   | (as it only supports the underlay         | Egress: OVN routes and policies,         | (Not implemented)        |                       | IPv4 and flat |
|                 | the same as before.                                 | option).                                  | and OVS flow (MAC tweak)                 |                          |                       | provider      |
|                 |                                                     |                                           |                                          |                          |                       | networks      |
+-----------------+-----------------------------------------------------+-------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
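
For the OVN exposing method, the routes and policies that the extra OVN
cluster installs could be inspected with the standard OVN tooling. A sketch
only; ``bgp-router`` is a placeholder name for the extra cluster's logical
router, not a name the agent defines:

.. code-block:: ini

   # Placeholder logical router name.
   $ ovn-nbctl lr-route-list bgp-router
   $ ovn-nbctl lr-policy-list bgp-router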

BGP Stretched Driver (SB)
-------------------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP routes to default VRF table.   | Ingress: ip rules, and ip routes on the  | Yes            | No                 | No                    | Yes       |
|                 |                                                     |                                          | routing table associated to OVS          |                |                    |                       |           |
|                 |                                                     |                                          | Egress: OVS-flow to change MAC           |                |                    |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+

EVPN Driver (SB)
----------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| VRF             | Expose IPs on a given VRF (vni id) -- requires      | Add IPs to dummy NIC associated to the   | Ingress: vxlan + bridge device           | Yes            | No                 | No                    | No        |
|                 | networking-bgpvpn or manual NB DB inputs.           | VRF device (lo_VNI_ID).                  | Egress: flow to redirect to VRF device   |                |                    |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+

163 doc/source/contributor/agent_deployment.rst Normal file
@@ -0,0 +1,163 @@

Agent deployment
~~~~~~~~~~~~~~~~

The BGP mode (for both NB and SB drivers) exposes the VMs and LBs on provider
networks or with FIPs, as well as VMs on tenant networks if the
``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks``
configuration options are enabled.

The agent needs to be deployed on all the nodes where VMs can be created, as
well as on the networker nodes (i.e., where OVN router gateway ports can be
allocated):

- For VMs and Amphora load balancers on provider networks or with FIPs,
  the IP is exposed on the node where the VM (or amphora) is deployed.
  Therefore the agent needs to be running on the compute nodes.

- For VMs on tenant networks (with the ``expose_tenant_networks`` or
  ``expose_ipv6_gua_tenant_networks`` configuration options enabled), the
  agent needs to be running on the networker nodes. In OpenStack, with OVN
  networking, the N/S traffic to the tenant VMs (without FIPs) needs to go
  through the networker nodes, more specifically the one hosting the
  chassisredirect OVN port (cr-lrp), connecting the provider network to the
  OVN virtual router. Hence, the VM IPs are advertised through BGP on that
  node, and from there the traffic follows the normal path to the OpenStack
  compute node where the VM is located -- through the tunnel.

- Similarly, for OVN load balancers the IPs are exposed on the networker node.
  In this case the ARP request for the VIP is replied to by the OVN router
  gateway port, therefore the traffic needs to be injected into the OVN
  overlay at that point too. Hence the agent needs to be running on the
  networker nodes for OVN load balancers.

As an example of how to start the OVN BGP Agent on the nodes, see the commands
below:

.. code-block:: ini

   $ python setup.py install
   $ cat bgp-agent.conf
   # sample configuration that can be adapted based on needs
   [DEFAULT]
   debug=True
   reconcile_interval=120
   expose_tenant_networks=True
   # expose_ipv6_gua_tenant_networks=True
   # for SB DB driver
   driver=ovn_bgp_driver
   # for NB DB driver
   #driver=nb_ovn_bgp_driver
   bgp_AS=64999
   bgp_nic=bgp-nic
   bgp_vrf=bgp-vrf
   bgp_vrf_table_id=10
   ovsdb_connection=tcp:127.0.0.1:6640
   address_scopes=2237917c7b12489a84de4ef384a2bcae

   [ovn]
   ovn_nb_connection = tcp:172.17.0.30:6641
   ovn_sb_connection = tcp:172.17.0.30:6642

   [agent]
   root_helper=sudo ovn-bgp-agent-rootwrap /etc/ovn-bgp-agent/rootwrap.conf
   root_helper_daemon=sudo ovn-bgp-agent-rootwrap-daemon /etc/ovn-bgp-agent/rootwrap.conf

   $ sudo bgp-agent --config-dir bgp-agent.conf
   Starting BGP Agent...
   Loaded chassis 51c8480f-c573-4c1c-b96e-582f9ca21e70.
   BGP Agent Started...
   Ensuring VRF configuration for advertising routes
   Configuring br-ex default rule and routing tables for each provider network
   Found routing table for br-ex with: ['201', 'br-ex']
   Sync current routes.
   Add BGP route for logical port with ip 172.24.4.226
   Add BGP route for FIP with ip 172.24.4.199
   Add BGP route for CR-LRP Port 172.24.4.221
   ....

.. note::

   If you only want to expose the IPv6 GUA tenant IPs, then remove the option
   ``expose_tenant_networks`` and add ``expose_ipv6_gua_tenant_networks=True``
   instead.

.. note::

   If you want to filter the tenant networks to be exposed by some specific
   address scopes, add the list of address scopes to the
   ``address_scopes=XXX`` option. If no filtering should be applied, just
   remove the line.

Note that the OVN BGP Agent operates under the following assumptions:

- A dynamic routing solution, in this case FRR, is deployed and
  advertises/withdraws routes added/deleted to/from certain local interfaces,
  in this case the ones associated with the VRF created to that end. As only
  VM and load balancer IPs need to be advertised, FRR needs to be configured
  with the proper filtering so that only /32 (or /128 for IPv6) IPs are
  advertised. A sample config for FRR is:

.. code-block:: ini

   frr version 7.5
   frr defaults traditional
   hostname cmp-1-0
   log file /var/log/frr/frr.log debugging
   log timestamp precision 3
   service integrated-vtysh-config
   line vty

   router bgp 64999
     bgp router-id 172.30.1.1
     bgp log-neighbor-changes
     bgp graceful-shutdown
     no bgp default ipv4-unicast
     no bgp ebgp-requires-policy

     neighbor uplink peer-group
     neighbor uplink remote-as internal
     neighbor uplink password foobar
     neighbor enp2s0 interface peer-group uplink
     neighbor enp3s0 interface peer-group uplink

     address-family ipv4 unicast
       redistribute connected
       neighbor uplink activate
       neighbor uplink allowas-in origin
       neighbor uplink prefix-list only-host-prefixes out
     exit-address-family

     address-family ipv6 unicast
       redistribute connected
       neighbor uplink activate
       neighbor uplink allowas-in origin
       neighbor uplink prefix-list only-host-prefixes out
     exit-address-family

   ip prefix-list only-default permit 0.0.0.0/0
   ip prefix-list only-host-prefixes permit 0.0.0.0/0 ge 32

   route-map rm-only-default permit 10
     match ip address prefix-list only-default
     set src 172.30.1.1

   ip protocol bgp route-map rm-only-default

   ipv6 prefix-list only-default permit ::/0
   ipv6 prefix-list only-host-prefixes permit ::/0 ge 128

   route-map rm-only-default permit 11
     match ipv6 address prefix-list only-default
     set src f00d:f00d:f00d:f00d:f00d:f00d:f00d:0004

   ipv6 protocol bgp route-map rm-only-default

   ip nht resolve-via-default

- The relevant provider OVS bridges are created and configured with a loopback
  IP address (e.g., 1.1.1.1/32 for IPv4), and proxy ARP/NDP is enabled on
  their kernel interface.
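
As an illustration of that last assumption (a sketch only; ``br-ex`` and the
``1.1.1.1/32`` address are placeholders, not values the agent configures), the
bridge preparation could look like:

.. code-block:: ini

   # Placeholder bridge name and loopback IP.
   $ ip addr add 1.1.1.1/32 dev br-ex
   $ sysctl -w net.ipv4.conf.br-ex.proxy_arp=1
   $ sysctl -w net.ipv6.conf.br-ex.proxy_ndp=1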

66 doc/source/contributor/bgp_advertising.rst Normal file
@@ -0,0 +1,66 @@

BGP Advertisement
+++++++++++++++++

The OVN BGP Agent (both SB and NB drivers) is in charge of triggering FRR
(an IP routing protocol suite for Linux which includes protocol daemons for
BGP, OSPF, and RIP, among others) to advertise/withdraw directly connected
routes via BGP. To do that, when the agent starts, it ensures that:

- The FRR local instance is reconfigured to leak routes for a new VRF. To do
  that it uses the ``vtysh`` shell: it connects to the existing FRR socket
  (``--vty_socket`` option) and executes the following commands, passing them
  through a file (``-c FILE_NAME`` option) -- see the illustrative invocation
  after this list:

  .. code-block:: ini

     router bgp {{ bgp_as }}
       address-family ipv4 unicast
         import vrf {{ vrf_name }}
       exit-address-family

       address-family ipv6 unicast
         import vrf {{ vrf_name }}
       exit-address-family

     router bgp {{ bgp_as }} vrf {{ vrf_name }}
       bgp router-id {{ bgp_router_id }}
       address-family ipv4 unicast
         redistribute connected
       exit-address-family

       address-family ipv6 unicast
         redistribute connected
       exit-address-family

- There is a VRF created (the one leaked in the previous step), by default
  named ``bgp-vrf``.

- There is a dummy interface (by default named ``bgp-nic``) associated with
  the previously created VRF device.

- ARP/NDP is enabled on the OVS provider bridges by adding an IP to them.
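
A sketch of what the first step could look like from the command line; the
socket directory and the commands file name are illustrative placeholders,
not values mandated by the agent:

.. code-block:: ini

   # Placeholders: socket directory and the generated commands file.
   $ vtysh --vty_socket /run/frr/ -c /tmp/leak-vrf-commands
   # The leaked VRF can then be inspected from FRR, e.g.:
   $ vtysh -c 'show ip bgp vrf bgp-vrf'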

Then, to expose the VM/LB IPs as they are created (or upon initialization or
re-sync), since the FRR configuration has the ``redistribute connected``
option enabled, the only action needed to expose an IP (or withdraw it) is to
add it to (or remove it from) the ``bgp-nic`` dummy interface. The agent then
relies on Zebra to do the BGP advertisement, as Zebra detects the
addition/deletion of the IP on the local interface and advertises/withdraws
the route:

.. code-block:: ini

   $ ip addr add IPv4/32 dev bgp-nic
   $ ip addr add IPv6/128 dev bgp-nic

.. note::

   As we also want to be able to expose VMs connected to tenant networks
   (when the ``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks``
   configuration options are enabled), there is a need to expose the Neutron
   router gateway port (CR-LRP on OVN) so that the traffic to VMs in tenant
   networks is injected into the OVN overlay through the node that is hosting
   that port.
@@ -1,603 +0,0 @@

..
    This work is licensed under a Creative Commons Attribution 3.0 Unported
    License.

    http://creativecommons.org/licenses/by/3.0/legalcode

    Convention for heading levels in Neutron devref:
    =======  Heading 0 (reserved for the title in a document)
    -------  Heading 1
    ~~~~~~~  Heading 2
    +++++++  Heading 3
    '''''''  Heading 4
    (Avoid deeper levels because they do not render well.)

=======================================
OVN BGP Agent: Design of the BGP Driver
=======================================

Purpose
-------

The purpose of this document is to present the design decisions behind
the BGP Driver for the Networking OVN BGP agent.

The main purpose of adding support for BGP is to be able to expose Virtual
Machine (VM) and Load Balancer (LB) IPs through the BGP dynamic protocol
when they either have a Floating IP (FIP) associated or are booted/created
on a provider network -- and also on tenant networks if a flag is enabled.

Overview
--------

With the increase of virtualized/containerized workloads it is becoming more
and more common to use pure layer-3 Spine and Leaf network deployments in
datacenters. There are several benefits to this, such as reduced complexity
at scale, reduced failure domains, and limited broadcast traffic, among
others.

The OVN BGP Agent is a Python based daemon that runs on each node
(e.g., OpenStack controllers and/or compute nodes). It connects to the OVN
SouthBound DataBase (OVN SB DB) to detect the specific events it needs to
react to, and then leverages FRR to expose the routes towards the VMs, and
kernel networking capabilities to redirect the traffic arriving on the nodes
to the OVN overlay.

.. note::

   Note this is only intended for the N/S traffic; the E/W traffic will work
   exactly the same as before, i.e., VMs are connected through Geneve
   tunnels.

The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This simple design allows the agent to implement different drivers, depending
on what OVN SB DB events are being watched (watcher examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (driver examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).

A driver implements the support for BGP capabilities. It ensures both VMs and
LBs on provider networks or with Floating IPs associated can be
exposed through BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.

A common driver API is defined exposing the following methods:

- ``expose_ip`` and ``withdraw_ip``: used to expose/withdraw IPs for local
  OVN ports.

- ``expose_remote_ip`` and ``withdraw_remote_ip``: used to expose/withdraw IPs
  through another node when the VM/Pod is running on a different node.
  For example, for VMs on tenant networks where the traffic needs to be
  injected through the OVN router gateway port.

- ``expose_subnet`` and ``withdraw_subnet``: used to expose/withdraw subnets
  through the local node.

Proposed Solution
-----------------

To support BGP functionality the OVN BGP Agent includes a driver
that performs the extra steps required for exposing the IPs through BGP on
the right nodes and steering the traffic to/from the node from/to the OVN
overlay. In order to configure which driver to use, one should set the
``driver`` configuration option in the ``bgp-agent.conf`` file.

This driver requires a watcher to react to the BGP-related events.
In this case, the BGP actions will be triggered by events related to the
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
The information in those tables gets modified by actions related to VM or LB
creation/deletion, as well as FIP association/disassociation to/from them.

Then, the agent performs some actions in order to ensure those VMs are
reachable through BGP:

- Traffic between nodes or BGP Advertisement: these are the actions needed to
  expose the BGP routes and make sure all the nodes know how to reach the
  VM/LB IP on the nodes.

- Traffic within a node or redirecting traffic to/from OVN overlay: these are
  the actions needed to redirect the traffic to/from a VM to the OVN Neutron
  networks, when traffic reaches the node where the VM is or on its way
  out of the node.

The code for the BGP driver is located at
``drivers/openstack/ovn_bgp_driver.py``, and its associated watcher can be
found at ``drivers/openstack/watchers/bgp_watcher.py``.

OVN SB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the BGP driver detects the relevant events on the
OVN SB DB and calls the driver functions to configure BGP and Linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or
  ``expose_ipv6_gua_tenant_networks`` for exposing only IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant IPs
   will be advertised.

The BGP watcher detects OVN Southbound Database events at the ``Port_Binding``
and ``Load_Balancer`` tables. It creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, which all the events
watched for BGP use as the base (inherit from).

The specific events to react to are:

- ``PortBindingChassisCreatedEvent``: detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  attached to the OVN chassis where the agent is running. This is the case for
  VM or amphora LB ports on the provider networks, VM or amphora LB ports on
  tenant networks with a FIP associated, and Neutron gateway router ports
  (CR-LRPs). It calls the ``expose_ip`` driver method to perform the needed
  actions to expose it.

- ``PortBindingChassisDeletedEvent``: detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  detached from the OVN chassis where the agent is running. This is the case
  for VM or amphora LB ports on the provider networks, VM or amphora LB ports
  on tenant networks with a FIP associated, and Neutron gateway router ports
  (CR-LRPs). It calls the ``withdraw_ip`` driver method to perform the needed
  actions to withdraw the exposed BGP route.

- ``FIPSetEvent``: detects when a patch port gets its nat_addresses field
  updated (e.g., an action related to FIP NATing). If so, and the associated
  VM port is on the local chassis, the event is processed by the agent: the
  required ip rule gets created and the IP is also exposed via BGP. It calls
  the ``expose_ip`` driver method, including the associated_port information,
  to perform the required actions.

- ``FIPUnsetEvent``: same as the previous one, but when the nat_addresses
  field gets an IP deleted. It calls the ``withdraw_ip`` driver method to
  perform the required actions.

- ``SubnetRouterAttachedEvent``: detects when a patch port gets created.
  This means a subnet is attached to a router. In the
  ``expose_tenant_networks`` case, if the chassis is the one hosting the
  cr-lrp port for the router where the port is getting created, then the event
  is processed by the agent and the needed actions (ip rules and routes, and
  ovs rules) for exposing the IPs on that network are performed. This event
  calls the driver_api ``expose_subnet``. The same happens if
  ``expose_ipv6_gua_tenant_networks`` is used, but then the IPs are only
  exposed if they are IPv6 global.

- ``SubnetRouterDetachedEvent``: same as the previous one, but for the
  deletion of the port. It calls ``withdraw_subnet``.

- ``TenantPortCreateEvent``: detects when a port of type ``""`` (empty
  double-quotes) or ``virtual`` gets updated. If that port is not on a
  provider network, and the chassis where the event is processed has the
  LogicalRouterPort for the network and the OVN router gateway port where the
  network is connected to, then the event is processed and the actions to
  expose it through BGP are triggered. It calls ``expose_remote_ip``, as in
  this case the IPs are exposed through the node with the OVN router gateway
  port, instead of where the VM is.

- ``TenantPortDeleteEvent``: same as the previous one, but for the deletion
  of the port. It calls ``withdraw_remote_ip``.

- ``OVNLBMemberUpdateEvent``: this event is required to handle the OVN load
  balancers created on the provider networks. It detects when datapaths
  are added/removed to/from the ``Load_Balancer`` entries. This happens when
  members are added/removed -- their respective datapaths are added into the
  ``Load_Balancer`` table entry. The event is only processed on the nodes with
  the relevant OVN router gateway ports, as that is where it needs to get
  exposed to be injected into the OVN overlay. It calls
  ``expose_ovn_lb_on_provider`` when the second datapath is added (the first
  one belongs to the VIP (i.e., the provider network), while the second one
  belongs to the load balancer member -- note all the load balancer members
  are expected to be connected through the same router to the provider
  network). And it calls ``withdraw_ovn_lb_on_provider`` when that member gets
  deleted (only one datapath left) or the event type is ROW_DELETE, meaning
  the whole load balancer is deleted.

Driver Logic
~~~~~~~~~~~~

The BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs on tenant networks should be reachable too -- although not directly on
the node where they are created, but through one of the network gateway
chassis nodes.
The same happens with ``expose_ipv6_gua_tenant_networks``, but only for IPv6
GUA ranges. In addition, if the config option ``address_scopes`` is set, only
the tenant networks with a matching address_scope will be exposed.

To accomplish this, it needs to ensure that:

- VM and LB IPs can be advertised on a node where the traffic can be
  injected into the OVN overlay, in this case either the node hosting the VM
  or the node where the router gateway port is scheduled (see the limitations
  subsection).

- Once the traffic reaches the specific node, the traffic is redirected to the
  OVN overlay by leveraging kernel networking.

Driver API
++++++++++

The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the ip rules and routes, and ovs flows needed
  to redirect the traffic to the OVN overlay. It also ensures FRR exposes
  the required IP through BGP.

- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (ip rules and route)
  to ensure traffic can go from the node to the OVN overlay, and vice versa,
  for IPs within the tenant subnet CIDR.

- ``withdraw_subnet``: removes the above kernel networking configuration.

- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
  hosting the OVN gateway port for the router where the VM is connected.
  It ensures traffic destined to the VM IP arrives at this node by exposing
  the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
  the traffic is redirected to the OVN overlay once on the node.

- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
  through BGP from the node.

In addition, it also implements these two extra methods for the OVN load
balancers on the provider networks:

- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to
  ensure traffic is forwarded from the node to the OVN overlay, as well as to
  expose the VIP through BGP.

- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop advertising
  the load balancer VIP.

Limitations
-----------

The following limitations apply:

- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with Floating IPs associated to them will get exposed. For the VMs on
  tenant networks, the flag ``address_scopes`` should be used for filtering
  what subnets to expose -- which should also be used to ensure there are no
  overlapping IPs.

- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
  using address scopes and subnet pools.

- Network traffic is steered by kernel routing (ip routes and rules), therefore
  OVS-DPDK, where the kernel space is skipped, is not supported.

- Network traffic is steered by kernel routing (ip routes and rules), therefore
  SR-IOV, where the hypervisor is skipped, is not supported.

- In OpenStack with OVN networking, the N/S traffic to the ovn-octavia VIPs on
  the provider networks, or to the FIPs associated with the VIPs on tenant
  networks, needs to go through the networker nodes (the ones hosting the
  Neutron Router Gateway Ports, i.e., the chassisredirect cr-lrp ports, for
  the router connecting the load balancer members to the provider network).
  Therefore, the entry point into the OVN overlay needs to be one of those
  networker nodes, and consequently the VIPs (or FIPs to VIPs) are exposed
  through them. From those nodes the traffic will follow the normal tunneled
  path (Geneve tunnel) to the OpenStack compute node where the selected member
  is located.

78 doc/source/contributor/bgp_traffic_redirection.rst Normal file
@@ -0,0 +1,78 @@

Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++

Besides the VM/LB IP being exposed on a specific node (either the one hosting
the VM/LB or the one with the OVN router gateway port), the OVN BGP Agent is
in charge of configuring the Linux kernel networking and OVS so that the
traffic can be injected into the OVN overlay, and vice versa. To do that, when
the agent starts, it ensures that:

- ARP/NDP is enabled on the OVS provider bridges by adding an IP to them.

- There is a routing table associated with each OVS provider bridge
  (adds an entry at /etc/iproute2/rt_tables) -- see the sketch after this
  list.

- If the provider network is a VLAN network, a VLAN device connected
  to the bridge is created, and it has ARP and NDP enabled.

- Extra OVS flows at the OVS provider bridges are cleaned up.
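
For instance (a sketch only; ``br-ex``, table ``201``, and VLAN id ``101``
are placeholders), the routing table and VLAN steps could look like:

.. code-block:: ini

   # Placeholder names and ids throughout.
   $ grep br-ex /etc/iproute2/rt_tables
   201 br-ex
   $ ip link add link br-ex name br-ex.101 type vlan id 101
   $ ip link set br-ex.101 up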

Then, either upon events or due to a (re)sync (regularly or during start up),
it:

- Adds an IP rule to apply specific routing table routes,
  in this case the one associated with the OVS provider bridge:

  .. code-block:: ini

     $ ip rule
     0:      from all lookup local
     1000:   from all lookup [l3mdev-table]
     *32000: from all to IP lookup br-ex*   # br-ex is the OVS provider bridge
     *32000: from all to CIDR lookup br-ex* # for VMs in tenant networks
     32766:  from all lookup main
     32767:  from all lookup default

- Adds an IP route at the OVS provider bridge routing table so that the
  traffic is routed to the OVS provider bridge device:

  .. code-block:: ini

     $ ip route show table br-ex
     default dev br-ex scope link
     *CIDR via CR-LRP_IP dev br-ex*    # for VMs in tenant networks
     *CR-LRP_IP dev br-ex scope link*  # for the VM in tenant network redirection
     *IP dev br-ex scope link*         # IPs on provider or FIPs

- Adds a static ARP entry for the OVN router gateway ports (CR-LRP) so that
  the traffic is steered to OVN via br-int -- this is because OVN does not
  reply to ARP requests outside its L2 network:

  .. code-block:: ini

     $ ip neigh
     ...
     CR-LRP_IP dev br-ex lladdr CR-LRP_MAC PERMANENT
     ...

- For IPv6, instead of the static ARP entry, an NDP proxy is added, for the
  same reason:

  .. code-block:: ini

     $ ip -6 neigh add proxy CR-LRP_IP dev br-ex

- Finally, in order to properly send the traffic out of the OVN overlay
  through kernel networking and out of the node, the OVN BGP Agent needs
  to add a new flow at the OVS provider bridges so that the destination MAC
  address is changed to the MAC address of the OVS provider bridge
  (``actions=mod_dl_dst:OVN_PROVIDER_BRIDGE_MAC,NORMAL``):

  .. code-block:: ini

     $ sudo ovs-ofctl dump-flows br-ex
     cookie=0x3e7, duration=77.949s, table=0, n_packets=0, n_bytes=0, priority=900,ip,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
     cookie=0x3e7, duration=77.937s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
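
A sketch of how an equivalent flow could be installed by hand (the bridge,
port, and MAC address here are placeholders taken from the example output
above):

.. code-block:: ini

   # Placeholders: br-ex, patch-provnet-1, and the destination MAC.
   $ sudo ovs-ofctl add-flow br-ex \
       "priority=900,ip,in_port=patch-provnet-1,actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL"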

310 doc/source/contributor/drivers/bgp_mode_design.rst Normal file
@@ -0,0 +1,310 @@

.. _bgp_driver:

===================================================================
[SB DB] OVN BGP Agent: Design of the BGP Driver with kernel routing
===================================================================

Purpose
-------

The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or are booted or created on a provider network. The same functionality
is available on project networks, when a special flag is set.

This document presents the design decisions behind the BGP Driver for the
Networking OVN BGP agent.

Overview
--------

With the growing popularity of virtualized and containerized workloads,
it is common to use pure Layer 3 spine and leaf network deployments in data
centers. This practice reduces scaling complexity, shrinks failure domains,
and limits broadcast traffic.

The southbound OVN BGP agent is a Python-based daemon that runs on each
OpenStack Controller and Compute node.
The agent monitors the Open Virtual Network (OVN) southbound database
for certain VM and floating IP (FIP) events.
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
to advertise the IP address or FIP associated with the VM.
The agent also triggers actions that route the external traffic to the OVN
overlay.
Because the agent uses a multi-driver implementation, you can configure the
agent for the specific infrastructure that runs on top of OVN, such as OSP or
Kubernetes and OpenShift.

.. note::

   It is only intended for the N/S traffic; the E/W traffic will work
   exactly the same as before, i.e., VMs are connected through geneve
   tunnels.


This design simplicity enables the agent to implement different drivers,
depending on what OVN SB DB events are being watched (watchers examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (drivers examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).

A driver implements the support for BGP capabilities. It ensures that both VMs
and LBs on provider networks or associated floating IPs are exposed through
BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.

A common driver API is defined exposing these methods:

- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
  OVN ports.

- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
  through another node when the VM or pods are running on a different node.
  For example, used for VMs on tenant networks where the traffic needs to be
  injected through the OVN router gateway port.

- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets
  through the local node.


Proposed Solution
-----------------

To support BGP functionality the OVN BGP Agent includes a driver
that performs the extra steps required for exposing the IPs through BGP on
the correct nodes and steering the traffic to/from the node from/to the OVN
overlay. To configure the OVN BGP agent to use the BGP driver, set the
``driver`` configuration option in the ``bgp-agent.conf`` file to
``ovn_bgp_driver``.

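As an illustration, the relevant snippet of ``bgp-agent.conf`` could look as
follows (the tenant flag is optional and shown only as an example):

.. code-block:: ini

   [DEFAULT]
   driver=ovn_bgp_driver
   expose_tenant_networks=True
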
The BGP driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events related to the
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.

Then, the agent performs some actions in order to ensure those VMs are
reachable through BGP:

- Traffic between nodes or BGP Advertisement: These are the actions needed to
  expose the BGP routes and make sure all the nodes know how to reach the
  VM/LB IP on the nodes.

- Traffic within a node or redirecting traffic to/from OVN overlay: These are
  the actions needed to redirect the traffic to/from a VM to the OVN Neutron
  networks, when traffic reaches the node where the VM is or in their way
  out of the node.

The code for the BGP driver is located at
``ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/bgp_watcher.py``.


OVN SB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the BGP driver detects the relevant events on the
OVN SB DB to call the driver functions to configure BGP and linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or if the
  ``expose_ipv6_gua_tenant_networks`` one is enabled for only exposing
  IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant
   IPs are advertised.


It creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, which all the events
watched for BGP use as the base (inherit from).

The BGP watcher reacts to the following events:

- ``PortBindingChassisCreatedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  attached to the OVN chassis where the agent is running. This is the case for
  VM or amphora LB ports on the provider networks, VM or amphora LB ports on
  tenant networks with a FIP associated, and neutron gateway router ports
  (CR-LRPs). It calls the ``expose_ip`` driver method to perform the needed
  actions to expose it.

- ``PortBindingChassisDeletedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  detached from the OVN chassis where the agent is running. This is the case
  for VM or amphora LB ports on the provider networks, VM or amphora LB ports
  on tenant networks with a FIP associated, and neutron gateway router ports
  (CR-LRPs). It calls the ``withdraw_ip`` driver method to perform the needed
  actions to withdraw the exposed BGP route.

- ``FIPSetEvent``: Detects when a Port_Binding entry of type ``patch`` gets
  its ``nat_addresses`` field updated (e.g., an action related to FIP NATing).
  When this happens, and the associated VM port is on the local chassis, the
  event is processed by the agent: the required IP rule gets created and the
  IP is (BGP) exposed. It calls the ``expose_ip`` driver method, including
  the associated_port information, to perform the required actions.

- ``FIPUnsetEvent``: Same as the previous one, but when the ``nat_addresses``
  field gets an IP deleted. It calls the ``withdraw_ip`` driver method to
  perform the required actions.

- ``SubnetRouterAttachedEvent``: Detects when a Port_Binding entry of type
  ``patch`` gets created. This means a subnet is attached to a router.
  In the ``expose_tenant_networks`` case, if the chassis is the one having the
  cr-lrp port for the router where the port is getting created, then the event
  is processed by the agent and the needed actions (IP rules and routes, and
  OVS rules) for exposing the IPs on that network are performed. This event
  calls the driver API ``expose_subnet``. The same happens if
  ``expose_ipv6_gua_tenant_networks`` is used, but then the IPs are only
  exposed if they are IPv6 global.

- ``SubnetRouterDetachedEvent``: Same as ``SubnetRouterAttachedEvent``,
  but for the deletion of the port. It calls ``withdraw_subnet``.

- ``TenantPortCreateEvent``: Detects when a port of type ``""`` (empty
  double-quotes) or ``virtual`` gets updated. If that port is not on a
  provider network, and the chassis where the event is processed has the
  ``LogicalRouterPort`` for the network and the OVN router gateway port where
  the network is connected to, then the event is processed and the actions to
  expose it through BGP are triggered. It calls ``expose_remote_ip``,
  because in this case the IPs are exposed through the node with the OVN
  router gateway port, instead of the node where the VM is located.

- ``TenantPortDeleteEvent``: Same as ``TenantPortCreateEvent``, but for
  the deletion of the port. It calls ``withdraw_remote_ip``.

- ``OVNLBMemberUpdateEvent``: This event is required to handle the OVN load
  balancers created on the provider networks. It detects when new datapaths
  are added/removed to/from the ``Load_Balancer`` entries. This happens when
  members are added/removed, which triggers the addition/deletion of their
  datapaths into the ``Load_Balancer`` table entry.
  The event is only processed in the nodes with
  the relevant OVN router gateway ports, because that is where it needs to be
  exposed to be injected into the OVN overlay.
  ``OVNLBMemberUpdateEvent`` calls ``expose_ovn_lb_on_provider`` only when the
  second datapath is added. The first datapath belongs to the VIP for the
  provider network, while the second one belongs to the load balancer member.
  ``OVNLBMemberUpdateEvent`` calls ``withdraw_ovn_lb_on_provider`` when the
  second datapath is deleted, or the entire load balancer is deleted (event
  type is ``ROW_DELETE``).

.. note::

   All the load balancer members are expected to be connected through the
   same router to the provider network.


Driver Logic
~~~~~~~~~~~~

The BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too -- although instead of directly
on the node they are created on, through one of the network gateway chassis
nodes. The same happens with ``expose_ipv6_gua_tenant_networks`` but only for
IPv6 GUA ranges. In addition, if the config option ``address_scopes`` is set,
only the tenant networks with a matching ``address_scope`` will be exposed.

To accomplish the network configuration and advertisement, the driver ensures:

- VM and LB IPs can be advertised in a node where the traffic could be
  injected into the OVN overlay, in this case either the node hosting the VM
  or the node where the router gateway port is scheduled (see the limitations
  subsection).

- Once the traffic reaches the specific node, the traffic is redirected to the
  OVN overlay by leveraging kernel networking.

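One way to verify that the expected routes are indeed being advertised from a
node is to query the FRR BGP daemon directly; a quick check, assuming FRR's
``vtysh`` shell is available on the node:

.. code-block:: ini

   $ sudo vtysh -c 'show ip bgp'

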
.. include:: ../bgp_advertising.rst


.. include:: ../bgp_traffic_redirection.rst


Driver API
++++++++++

The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the IP rules and routes, and OVS flows needed
  to redirect the traffic to the OVN overlay. It also ensures FRR exposes
  the required IP through BGP.

- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (IP rules and route)
  to ensure traffic can go from the node to the OVN overlay, and vice versa,
  for IPs within the tenant subnet CIDR (see the command sketch below).

- ``withdraw_subnet``: removes the above kernel networking configuration.

- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
  hosting the OVN gateway port for the router where the VM is connected.
  It ensures traffic destined to the VM IP arrives at this node by exposing
  the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
  the traffic is redirected to the OVN overlay once on the node.

- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
  through BGP from the node.

The driver API implements these additional methods for OVN load balancers on
provider networks:

- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to
  ensure traffic is forwarded from the node to the OVN overlay and to expose
  the VIP through BGP.

- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop advertising
  the load balancer VIP.

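As a rough illustration of the kernel configuration that ``expose_subnet``
manages, reusing the placeholder names from the examples earlier in this
document (``CIDR``, ``CR-LRP_IP``, ``br-ex``), the equivalent manual commands
could be:

.. code-block:: ini

   $ ip rule add to CIDR table br-ex priority 32000
   $ ip route add CIDR via CR-LRP_IP dev br-ex table br-ex

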
.. include:: ../agent_deployment.rst


Limitations
-----------

The following limitations apply:

- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with floating IPs associated with them will get exposed. For the VMs in
  the tenant networks, the flag ``address_scopes`` should be used for
  filtering what subnets to expose -- which should also be used to ensure no
  overlapping IPs.

- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
  using address scopes and subnet pools.

- Network traffic is steered by kernel routing (IP routes and rules), therefore
  OVS-DPDK, where the kernel space is skipped, is not supported.

- Network traffic is steered by kernel routing (IP routes and rules), therefore
  SR-IOV, where the hypervisor is skipped, is not supported.

- In OpenStack with OVN networking the N/S traffic to the ovn-octavia VIPs on
  the provider or the FIPs associated with the VIPs on tenant networks needs to
  go through the networking nodes (the ones hosting the Neutron Router Gateway
  Ports, i.e., the chassisredirect cr-lrp ports, for the router connecting the
  load balancer members to the provider network). Therefore, the entry point
  into the OVN overlay needs to be one of those networking nodes, and
  consequently the VIPs (or FIPs to VIPs) are exposed through them. From those
  nodes the traffic follows the normal tunneled path (Geneve tunnel) to
  the OpenStack compute node where the selected member is located.
@@ -12,9 +12,9 @@
''''''' Heading 4
(Avoid deeper levels because they do not render well.)

========================================
Design of OVN BGP Agent with EVPN Driver
========================================
=========================================================
Design of OVN BGP Agent with EVPN Driver (kernel routing)
=========================================================

Purpose
-------
@@ -96,7 +96,7 @@ watcher detects it).
The overall architecture and integration between the ``networking-bgpvpn``
and the ``networking-bgp-ovn`` agent are shown in the next figure:

.. image:: ../../images/networking-bgpvpn_integration.png
.. image:: ../../../images/networking-bgpvpn_integration.png
   :alt: integration components
   :align: center
   :width: 100%
@@ -409,7 +409,7 @@ The next figure shows the N/S traffic flow through the VRF to the VM,
including information regarding the OVS flows on the provider bridge (br-ex),
and the routes on the VRF routing table.

.. image:: ../../images/evpn_traffic_flow.png
.. image:: ../../../images/evpn_traffic_flow.png
   :alt: integration components
   :align: center
   :width: 100%
12
doc/source/contributor/drivers/index.rst
Normal file
@@ -0,0 +1,12 @@

==========================
BGP Drivers Documentation
==========================

.. toctree::
   :maxdepth: 1

   bgp_mode_design
   nb_bgp_mode_design
   ovn_bgp_mode_design
   evpn_mode_design
   bgp_mode_stretched_l2_design
386
doc/source/contributor/drivers/nb_bgp_mode_design.rst
Normal file
@@ -0,0 +1,386 @@

.. _nb_bgp_driver:

======================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with kernel routing
======================================================================

Purpose
-------

The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or are booted or created on a provider network.
The same functionality is available on project networks, when a special
flag is set.

This document presents the design decisions behind the NB BGP Driver for
the Networking OVN BGP agent.

Overview
--------

With the growing popularity of virtualized and containerized workloads,
it is common to use pure Layer 3 spine and leaf network deployments in
data centers. This practice reduces scaling complexity, shrinks failure
domains, and limits broadcast traffic.

The northbound OVN BGP agent is a Python-based daemon that runs on each
OpenStack Controller and Compute node.
The agent monitors the Open Virtual Network (OVN) northbound database
for certain VM and floating IP (FIP) events.
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
to advertise the IP address or FIP associated with the VM.
The agent also triggers actions that route the external traffic to the OVN
overlay.
Unlike its predecessor, the (southbound) OVN BGP agent, the northbound OVN
BGP agent uses the northbound database API, which is more stable than the
southbound database API because the former is isolated from internal changes
to core OVN.

.. note::

   The northbound OVN BGP agent driver is only intended for the N/S traffic;
   the E/W traffic will work exactly the same as before, i.e., VMs are
   connected through geneve tunnels.


The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This design simplicity enables the agent to implement different drivers,
depending on what OVN NB DB events are being watched (watchers examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (drivers examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).

A driver implements the support for BGP capabilities. It ensures that both VMs
and LBs on provider networks or associated floating IPs are exposed through
BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.

A common driver API is defined exposing these methods:

- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
  OVN ports.

- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
  through another node when the VM or pods are running on a different node.
  For example, used for VMs on tenant networks where the traffic needs to be
  injected through the OVN router gateway port.

- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets
  through the local node.


Proposed Solution
-----------------

To support BGP functionality the NB OVN BGP Agent includes a new driver
that performs the steps required for exposing the IPs through BGP on
the correct nodes and steering the traffic to/from the node from/to the OVN
overlay.
To configure the OVN BGP agent to use the northbound OVN BGP driver, in the
``bgp-agent.conf`` file, set the value of ``driver`` to ``nb_ovn_bgp_driver``.

This driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events related to the
``Logical_Switch_Port``, ``Logical_Router_Port`` and ``Load_Balancer``
OVN NB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.

Then, the agent performs these actions to ensure the VMs are reachable through
BGP:

- Traffic between nodes or BGP Advertisement: These are the actions needed to
  expose the BGP routes and make sure all the nodes know how to reach the
  VM/LB IP on the nodes. This is exactly the same as in the initial OVN BGP
  driver (see :ref:`bgp_driver`).

- Traffic within a node or redirecting traffic to/from OVN overlay (wiring):
  These are the actions needed to redirect the traffic to/from a VM to the OVN
  neutron networks, when traffic reaches the node where the VM is or in their
  way out of the node.

The code for the NB BGP driver is located at
``ovn_bgp_agent/drivers/openstack/nb_ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/nb_bgp_watcher.py``.

Note this new driver also allows different ways of wiring the node to the OVN
overlay. These are configurable through the option ``exposing_method``, where
for now you can select:

- ``underlay``: using kernel routing (what we describe in this document), same
  as supported by the driver at :ref:`bgp_driver`.

- ``ovn``: using an extra OVN cluster per node to perform the routing at
  OVN/OVS level instead of kernel, therefore enabling datapath acceleration
  (Hardware Offloading and OVS-DPDK). More information about this mechanism
  at :ref:`ovn_routing`.

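As an illustration, a minimal ``bgp-agent.conf`` for this driver could look
as follows (all values except ``driver`` are optional and shown only as an
example; ``underlay`` is the default ``exposing_method``):

.. code-block:: ini

   [DEFAULT]
   driver=nb_ovn_bgp_driver
   exposing_method=underlay
   expose_tenant_networks=True
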
OVN NB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the BGP driver detects the relevant events on the
OVN NB DB to call the driver functions to configure BGP and linux kernel
networking accordingly.

.. note::

   Linux kernel networking is used when the default ``exposing_method``
   (``underlay``) is selected. If ``ovn`` is selected instead, OVN routing is
   used instead of kernel routing. For more details on this see
   :ref:`ovn_routing`.

The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or if the
  ``expose_ipv6_gua_tenant_networks`` one is enabled for only exposing
  IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant
   IPs are advertised.


The NB BGP watcher watches events on the following OVN NB DB tables:

- ``Logical_Switch_Port``

- ``Logical_Router_Port``

- ``Load_Balancer``

Besides the previously existing ``OVNLBEvent`` class, the NB BGP watcher has
new event classes named ``LSPChassisEvent`` and ``LRPChassisEvent`` that
all the events watched for the NB BGP driver use as the base (inherit from).

The specific events it reacts to are:

- ``LogicalSwitchPortProviderCreateEvent``: Detects when a VM or an amphora LB
  port, i.e., a logical switch port of type ``""`` (empty double-quotes) or
  ``virtual``, comes up or gets attached to the OVN chassis where the agent is
  running. If the port is on a provider network, the driver calls the
  ``expose_ip`` driver method to perform the needed actions to expose the port
  (wire and advertise). If the port is on a tenant network, the driver
  dismisses the event.

- ``LogicalSwitchPortProviderDeleteEvent``: Detects when a VM or an amphora LB
  port, i.e., a logical switch port of type ``""`` (empty double-quotes) or
  ``virtual``, goes down or gets detached from the OVN chassis where the agent
  is running. If the port is on a provider network, the driver calls the
  ``withdraw_ip`` driver method to perform the needed actions to withdraw the
  port (withdraw and unwire). If the port is on a tenant network, the driver
  dismisses the event.

- ``LogicalSwitchPortFIPCreateEvent``: Similar to
  ``LogicalSwitchPortProviderCreateEvent`` but focusing on the changes on the
  FIP information on the Logical Switch Port external_ids.
  It calls the ``expose_fip`` driver method to perform the needed actions to
  expose the floating IP (wire and advertise).

- ``LogicalSwitchPortFIPDeleteEvent``: Same as the previous one but for
  withdrawing FIPs. In this case it is similar to
  ``LogicalSwitchPortProviderDeleteEvent`` but instead calls the
  ``withdraw_fip`` driver method to perform the needed actions to withdraw the
  floating IP (withdraw and unwire).

- ``LocalnetCreateDeleteEvent``: Detects creation/deletion of OVN localnet
  ports, which indicates the creation/deletion of provider networks. This
  triggers a resync (``sync`` method) action to perform the base configuration
  needed for the provider networks, such as OVS flows or ARP/NDP
  configurations.

- ``ChassisRedirectCreateEvent``: Similar to
  ``LogicalSwitchPortProviderCreateEvent`` but with the focus on logical router
  ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
  ports. The driver calls ``expose_ip``, which performs additional steps to
  also expose IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires OVN 23.09 or later.

- ``ChassisRedirectDeleteEvent``: Similar to
  ``LogicalSwitchPortProviderDeleteEvent`` but with the focus on logical router
  ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
  ports. The driver calls ``withdraw_ip``, which performs additional steps to
  also withdraw IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires OVN 23.09 or later.

- ``LogicalSwitchPortSubnetAttachEvent``: Detects Logical Switch Ports of type
  ``router`` (connecting a Logical Switch to a Logical Router) and checks if
  the associated router is associated to the local chassis, i.e., if the
  CR-LRP of the router is located on the local chassis. If that is the case,
  the ``expose_subnet`` driver method is called, which is in charge of the
  wiring needed for the IPs on that subnet (set of IP routes and rules).

- ``LogicalSwitchPortSubnetDetachEvent``: Similar to
  ``LogicalSwitchPortSubnetAttachEvent`` but for unwiring the subnet, so it
  calls the ``withdraw_subnet`` driver method.

- ``LogicalSwitchPortTenantCreateEvent``: Detects when a logical switch port
  of type ``""`` (empty double-quotes) or ``virtual`` gets updated, similar to
  ``LogicalSwitchPortProviderCreateEvent``. It checks if the network associated
  to the VM is exposed on the local chassis (meaning its cr-lrp is also local).
  If that is the case, it calls ``expose_remote_ip``, which manages the
  advertising of the IP -- there is no need for wiring, as that is done when
  the subnet is exposed by the ``LogicalSwitchPortSubnetAttachEvent`` event.

- ``LogicalSwitchPortTenantDeleteEvent``: Similar to
  ``LogicalSwitchPortTenantCreateEvent`` but for withdrawing IPs, so it calls
  ``withdraw_remote_ip``.

- ``OVNLBCreateEvent``: Detects Load_Balancer events and processes them only
  if the Load_Balancer entry has associated VIPs and the router is local to
  the chassis.
  If the VIP or router is added on a provider network, the driver calls
  ``expose_ovn_lb_vip`` to expose and wire the VIP.
  If the VIP or router is added on a tenant network, the driver calls
  ``expose_ovn_lb_vip`` to only expose the VIP.
  If a floating IP is added, then the driver calls ``expose_ovn_lb_fip`` to
  expose and wire the FIP.

- ``OVNLBDeleteEvent``: If the VIP or router is removed from a provider
  network, the driver calls ``withdraw_ovn_lb_vip`` to withdraw and unwire
  the VIP. If the VIP or router is removed from a tenant network,
  the driver calls ``withdraw_ovn_lb_vip`` to only withdraw the VIP.
  If a floating IP is removed, then the driver calls ``withdraw_ovn_lb_fip``
  to withdraw and unwire the FIP.


Driver Logic
~~~~~~~~~~~~

The NB BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too -- although instead of directly
on the node they are created on, through one of the network gateway chassis
nodes. The same happens with ``expose_ipv6_gua_tenant_networks`` but only for
IPv6 GUA ranges. In addition, if the config option ``address_scopes`` is set,
only the tenant networks with a matching ``address_scope`` will be exposed.

.. note::

   To be able to expose tenant networks, OVN version 23.09 or newer is
   needed.

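A quick way to check the version of the OVN tools installed on the node
(which usually matches the deployed OVN version, although the authoritative
source is the deployment itself):

.. code-block:: ini

   $ ovn-nbctl --version
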
To accomplish the network configuration and advertisement, the driver ensures:

- VM and LB IPs can be advertised in a node where the traffic can be injected
  into the OVN overlay: either in the node that hosts the VM or in the node
  where the router gateway port is scheduled. (See the "limitations"
  subsection.)

- After the traffic reaches the specific node, kernel networking redirects the
  traffic to the OVN overlay, if the default ``underlay`` exposing method is
  used.


.. include:: ../bgp_advertising.rst


.. include:: ../bgp_traffic_redirection.rst


Driver API
++++++++++

The NB BGP driver implements the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the IP rules and routes, and OVS flows needed
  to redirect the traffic to the OVN overlay. It also ensures that FRR exposes
  the required IP by using BGP.

- ``withdraw_ip``: removes the configuration (IP rules/routes, OVS flows)
  from the ``expose_ip`` method to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (IP rules and route)
  to ensure traffic can go from the node to the OVN overlay (and back)
  for IPs within the tenant subnet CIDR.

- ``withdraw_subnet``: removes the kernel networking configuration added by
  ``expose_subnet``.

- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
  hosting the OVN gateway port for the router where the VM is connected.
  It ensures traffic directed to the VM IP arrives at this node by exposing
  the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
  the traffic is redirected to the OVN overlay after it arrives on the node.

- ``withdraw_remote_ip``: removes the configuration added by
  ``expose_remote_ip``.

In addition, the driver also implements extra methods for the FIPs and the
OVN load balancers:

- ``expose_fip`` and ``withdraw_fip``: equivalent to ``expose_ip`` and
  ``withdraw_ip`` but for FIPs.

- ``expose_ovn_lb_vip``: adds kernel networking configuration to ensure
  traffic is forwarded from the node with the associated cr-lrp to the OVN
  overlay, as well as to expose the VIP through BGP in that node.

- ``withdraw_ovn_lb_vip``: removes the above steps to stop advertising
  the load balancer VIP.

- ``expose_ovn_lb_fip`` and ``withdraw_ovn_lb_fip``: for exposing the FIPs
  associated with OVN load balancers. This is similar to
  ``expose_fip``/``withdraw_fip`` but taking into account that the FIP must be
  exposed on the node with the cr-lrp for the router associated with the load
  balancer.


.. include:: ../agent_deployment.rst


Limitations
-----------

The following limitations apply:

- OVN 23.09 or later is needed to support exposing tenant network IPs and
  OVN load balancers.

- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with floating IPs associated with them are exposed. For the VMs in the
  tenant networks, use the flag ``address_scopes`` to filter which subnets to
  expose, which also prevents having overlapping IPs.

- In the currently implemented exposing methods (``underlay`` and
  ``ovn``) there is no support for overlapping CIDRs, so this must be
  avoided, e.g., by using address scopes and subnet pools.

- For the default exposing method (``underlay``) the network traffic is steered
  by kernel routing (IP routes and rules), therefore OVS-DPDK, where the kernel
  space is skipped, is not supported. With the ``ovn`` exposing method
  the routing is done at OVN level, so this limitation does not exist.
  More details in :ref:`ovn_routing`.

- For the default exposing method (``underlay``) the network traffic is steered
  by kernel routing (IP routes and rules), therefore SR-IOV, where the
  hypervisor is skipped, is not supported. With the ``ovn`` exposing method
  the routing is done at OVN level, so this limitation does not exist.
  More details in :ref:`ovn_routing`.

- In OpenStack with OVN networking the N/S traffic to the ovn-octavia VIPs on
  the provider or the FIPs associated with the VIPs on tenant networks needs to
  go through the networking nodes (the ones hosting the Neutron Router Gateway
  Ports, i.e., the chassisredirect cr-lrp ports, for the router connecting the
  load balancer members to the provider network). Therefore, the entry point
  into the OVN overlay needs to be one of those networking nodes, and
  consequently the VIPs (or FIPs to VIPs) are exposed through them. From those
  nodes the traffic will follow the normal tunneled path (Geneve tunnel) to
  the OpenStack compute node where the selected member is located.
265
doc/source/contributor/drivers/ovn_bgp_mode_design.rst
Normal file
@@ -0,0 +1,265 @@

.. _ovn_routing:

===================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with OVN routing
===================================================================

This is an extension of the NB OVN BGP Driver which adds a new
``exposing_method`` named ``ovn`` to make use of OVN routing, instead of
relying on kernel routing.

Purpose
-------

The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or are booted or created on a provider network.
The same functionality is available on project networks, when a special
flag is set.

This document presents the design decisions behind the extensions on the
NB OVN BGP Driver to support OVN routing instead of kernel routing,
therefore enabling datapath acceleration.


Overview
--------

The main goal is to make the BGP capabilities of the OVN BGP Agent compliant
with OVS-DPDK and HWOL. To do that we need to move to OVN/OVS what the OVN BGP
Agent is currently doing with kernel networking -- redirecting traffic to/from
the OpenStack OVN overlay.

To accomplish this goal, the following is required:

- Ensure that incoming traffic gets redirected from the physical NICs to the
  OVS integration bridge (br-int) through one or more OVS provider bridges
  (br-ex) without using kernel routes and rules.

- Ensure the outgoing traffic gets redirected to the physical NICs without
  using the default kernel routes.

- Expose the IPs in the same way as we did before.

The third point is simple as it is already being done, but for the first two
points OVN virtual routing capabilities are needed, ensuring the traffic gets
routed from the NICs to the OpenStack overlay and vice versa.

Proposed Solution
-----------------

To avoid placing kernel networking in the middle of the datapath and blocking
acceleration, the proposed solution mandates locating a separate OVN cluster
on each node that manages the needed virtual infrastructure between the
OpenStack networking overlay and the physical network.
Because routing occurs at OVN/OVS level, this proposal makes it possible
to support hardware offloading (HWOL) and OVS-DPDK.

The next figure shows the proposed cluster required to manage the OVN virtual
networking infrastructure on each node.

.. image:: ../../../images/ovn-cluster-overview.png
   :alt: OVN Routing integration
   :align: center
   :width: 100%

In a standard deployment ``br-int`` is directly connected to the OVS external
bridge (``br-ex``) where the physical NICs are attached.
By contrast, in the default BGP driver solution (see :ref:`nb_bgp_driver`),
the physical NICs are not directly attached to ``br-ex``; kernel networking
(IP routes and IP rules) redirects the traffic to ``br-ex``.
The OVN routing architecture proposes the following mapping:

- ``br-int`` connects to an external (from the OpenStack perspective) OVS
  bridge (``br-osp``).

- ``br-osp`` does not have any physical resources attached, just patch
  ports connecting it to ``br-int`` and ``br-bgp``.

- ``br-bgp`` is the integration bridge managed by the extra OVN cluster
  deployed per node. This is where the virtual OVN resources are created
  (routers and switches). It creates mappings to ``br-osp`` and ``br-ex``
  (patch ports).

- ``br-ex`` keeps being the external bridge, where the physical NICs are
  attached (as in default environments without BGP). But instead of being
  directly connected to ``br-int``, it is connected to ``br-bgp``. Note for
  ECMP purposes, each NIC is attached to a different ``br-ex`` device
  (``br-ex`` and ``br-ex-2``).

The virtual OVN resources require the following:

- Logical Router (``bgp-router``): manages the routing that was
  previously done in the kernel networking layer between both networks
  (physical and OpenStack OVN overlay). It has two connections (i.e., Logical
  Router Ports) towards the ``bgp-ex-X`` Logical Switches to add support for
  ECMP (only one switch is required, but you must have several in case of
  ECMP), and one connection to the ``bgp-osp`` Logical Switch to ensure
  traffic to/from the OpenStack networking overlay.

- Logical Switch (``bgp-ex``): is connected to the ``bgp-router``, and has
  a localnet port to connect it to ``br-ex`` and therefore the physical NICs.
  There is one Logical Switch per NIC (``bgp-ex`` and ``bgp-ex-2``).

- Logical Switch (``bgp-osp``): is connected to the ``bgp-router``, and has
  a localnet port to connect it to ``br-osp`` to enable it to send traffic to
  and from the OpenStack OVN overlay.

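The agent creates these resources on the per-node cluster itself; purely as an
illustration of the topology, equivalent manual ``ovn-nbctl`` calls could look
as follows (the LRP name, MAC and CIDR are made-up example values, while the
rest of the names are taken from the figure and examples in this document):

.. code-block:: ini

   $ ovn-nbctl lr-add bgp-router
   $ ovn-nbctl ls-add bgp-ex
   $ ovn-nbctl ls-add bgp-osp
   # connect bgp-osp to the router and enable the ARP proxy on the LSP
   $ ovn-nbctl lrp-add bgp-router lrp-bgp-osp 40:44:00:00:00:06 172.16.0.1/16
   $ ovn-nbctl lsp-add bgp-osp bgp-router-openstack
   $ ovn-nbctl lsp-set-type bgp-router-openstack router
   $ ovn-nbctl lsp-set-options bgp-router-openstack router-port=lrp-bgp-osp arp_proxy=172.16.0.0/16
   # localnet port connecting bgp-osp to the br-osp OVS bridge
   $ ovn-nbctl lsp-add bgp-osp localnet-osp
   $ ovn-nbctl lsp-set-type localnet-osp localnet
   $ ovn-nbctl lsp-set-addresses localnet-osp unknown
   $ ovn-nbctl lsp-set-options localnet-osp network_name=osp
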
The following OVS flows are required on both OVS bridges:

- ``br-ex-X`` bridges: require a flow to ensure only the traffic
  targeted for OpenStack provider networks is redirected to the OVN cluster.

  .. code-block:: ini

     cookie=0x3e7, duration=942003.114s, table=0, n_packets=1825, n_bytes=178850, priority=1000,ip,in_port=eth1,nw_dst=172.16.0.0/16 actions=mod_dl_dst:52:54:00:30:93:ea,output:"patch-bgp-ex-lo"

- ``br-osp`` bridge: requires a flow for each OpenStack provider network to
  change the destination MAC to the one on the router port in the OVN cluster
  and to properly manage traffic that is routed to the OVN cluster.

  .. code-block:: ini

     cookie=0x3e7, duration=942011.971s, table=0, n_packets=8644, n_bytes=767152, priority=1000,ip,in_port="patch-provnet-0" actions=mod_dl_dst:40:44:00:00:00:06,NORMAL


OVN NB DB Events
~~~~~~~~~~~~~~~~

The OVN northbound database events that the driver monitors are the same as
the ones for the NB DB driver with the ``underlay`` exposing mode.
See :ref:`nb_bgp_driver`. The main difference between the two drivers is
that the wiring actions are simplified for the OVN routing driver.


Driver Logic
~~~~~~~~~~~~

As with the other BGP drivers or ``exposing modes`` (:ref:`bgp_driver`,
:ref:`nb_bgp_driver`), the NB DB Driver with the ``ovn`` exposing mode enabled
(i.e., using ``OVN routing`` instead of relying on ``kernel networking``)
is in charge of exposing the IPs with BGP and of the networking configuration
to ensure that VMs and LBs on provider networks or with FIPs can be reached
through BGP (N/S traffic). Similarly, if the ``expose_tenant_networks`` flag
is enabled, VMs in tenant networks should be reachable too -- although instead
of directly on the node they are created on, through one of the network
gateway chassis nodes. The same happens with
``expose_ipv6_gua_tenant_networks`` but only for IPv6 GUA ranges.
In addition, if the config option ``address_scopes`` is set, only the tenant
networks with a matching ``address_scope`` will be exposed.

To accomplish this, the driver needs to configure the extra per-node OVN
cluster to ensure that:

- VM and LB IPs can be advertised in a node where the traffic could be
  injected into the OVN overlay through the extra OVN cluster (instead of the
  kernel routing) -- either in the node hosting the VM or the node where the
  router gateway port is scheduled.

- Once the traffic reaches the specific node, the traffic is redirected to the
  OVN overlay by using the extra per-node OVN cluster with the proper OVN
  configuration. To do this it needs to create Logical Switches, Logical
  Routers and the routing configuration between them (routes and policies).

.. include:: ../bgp_advertising.rst


Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++

As explained before, the main idea of this exposing mode is to leverage OVN
routing instead of kernel routing. For the traffic going out the steps are
as follows:

- If the (OpenStack) OVN cluster knows the destination MAC, then this works
  as in a deployment without BGP or OVN cluster support (no ARP needed, the
  MAC is directly used). If the MAC is unknown but within the same provider
  network range(s), the ARP gets replied to by the Logical Switch Port on the
  ``bgp-osp`` LS thanks to enabling ``arp_proxy`` on it. And if it is a
  different range, the router replies because it has default routes to the
  outside.
  The flow at ``br-osp`` is in charge of changing the destination MAC to the
  one on the Logical Router Port of the ``bgp-router`` LR.

- The previous step takes the traffic to the extra per-node OVN cluster, where
  the default (ECMP) routes are used to send the traffic to the external
  Logical Switch and from there to the physical NICs attached to the external
  OVS bridge(s) (``br-ex``, ``br-ex-2``). In case of a MAC known by OpenStack,
  instead of the default routes, a Logical Router Policy gets applied so that
  traffic is forced to be redirected out (through the LRPs connected to the
  external LS) when coming through the internal LRP (the one connected to
  OpenStack).

And for the traffic coming in:

- The traffic hits the OVS flow added at the ``br-ex-X`` bridge(s), which
  redirects the traffic to the per-node OVN cluster, changing the destination
  MAC to the one at the related ``br-ex`` device; these MACs are the same ones
  used for the OVN cluster Logical Router Ports. This takes the traffic to the
  OVN router.

- After that, thanks to having ``arp_proxy`` enabled on the LSP on
  ``bgp-osp``, the traffic is redirected there. And due to a limitation in the
  functionality of ``arp_proxy``, an extra static MAC binding entry needs to
  be added in the cluster so that the VM MAC is used as the destination
  instead of the LSP's own MAC, which would lead to dropping the traffic on
  the LS pipeline.

  .. code-block:: ini

     _uuid               : 6e1626b3-832c-4ee6-9311-69ebc15cb14d
     ip                  : "172.16.201.219"
     logical_port        : bgp-router-openstack
     mac                 : "fa:16:3e:82:ee:19"
     override_dynamic_mac: true

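  Such an entry could be created with the generic ``ovn-nbctl create``
  command; as a sketch, reusing the values from the record above:

  .. code-block:: ini

     $ ovn-nbctl create Static_MAC_Binding logical_port=bgp-router-openstack \
         ip=172.16.201.219 mac='"fa:16:3e:82:ee:19"' override_dynamic_mac=true
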
Driver API
++++++++++

This is the very same as in the NB DB driver with the ``underlay`` exposing
mode. See :ref:`nb_bgp_driver`.


Agent deployment
~~~~~~~~~~~~~~~~

The deployment is similar to the NB DB driver with the ``underlay`` exposing
method, but with some extra configuration. See :ref:`nb_bgp_driver` for the
base.

You need to set the exposing method in the ``DEFAULT`` section, plus the extra
configuration for the local OVN cluster that performs the routing, including
the range for the provider networks to expose/handle:

.. code-block:: ini

   [DEFAULT]
   exposing_method=ovn

   [local_ovn_cluster]
   ovn_nb_connection=unix:/run/ovn/ovnnb_db.sock
   ovn_sb_connection=unix:/run/ovn/ovnsb_db.sock
   external_nics=eth1,eth2
   peer_ips=100.64.1.5,100.65.1.5
   provider_networks_pool_prefixes=172.16.0.0/16

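A quick sanity check that the per-node OVN cluster databases are reachable on
the configured sockets, assuming the OVN client tools are installed on the
node:

.. code-block:: ini

   $ ovn-nbctl --db=unix:/run/ovn/ovnnb_db.sock show

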
Limitations
-----------

The following limitations apply:

- OVN 23.06 or later is needed.

- Tenant networks, subnets and ovn-loadbalancers are not yet supported, and
  will require OVN 23.09 or newer.

- IPv6 is not yet supported.

- ECMP is not properly working because there is no support for BFD at the
  ovn-cluster, which means that if one of the routes goes away the OVN
  cluster won't react to it and there will be traffic disruption.

- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
  using address scopes and subnet pools.
@@ -5,7 +5,7 @@
.. toctree::
   :maxdepth: 2

   bgp_mode_design
   evpn_mode_design
   bgp_mode_stretched_l2_design
   bgp_supportability_matrix
   drivers/index
   agent_deployment
   bgp_advertising
   bgp_traffic_redirection

@@ -10,10 +10,11 @@ Welcome to the documentation of OVN BGP Agent
Contents:

.. toctree::
   :maxdepth: 2
   :maxdepth: 3

   readme
   contributor/index
   bgp_supportability_matrix

Indices and tables
==================