Add documentation about NB DB driver

This includes the option to use the OVN-Cluster for routing
instead of the kernel.

It also updates the supportability matrix to better reflect the
current status, and slightly reorganizes the documentation
structure.

Change-Id: If8fb9a42f74511e9f70a25d7c08dce99c20c3f10
Luis Tomas Bolivar 2023-12-13 08:06:21 +01:00
parent 6678aa5250
commit f94c041e7a
14 changed files with 1339 additions and 656 deletions


@ -22,62 +22,67 @@ The next sections highlight the options and features supported by each driver
BGP Driver (SB)
---------------
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-------------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Implemented |
+=================+=====================================================+==========================================+==========================================+==========================+====================+=======================+=============+
| Underlay | Expose IPs on the default underlay network | Adding IP to dummy nic isolated in a VRF | Ingress: ip rules, and ip routes on the | Yes | Yes | No | Yes |
| | | | routing table associated with OVS | | (expose_ipv6_gua | | |
| | | | Egress: OVS flow to change MAC | (expose_tenant_networks) | _tenant_networks) | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-------------+
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+==========================+====================+=======================+===========+
| Underlay | Expose IPs on the default underlay network. | Adding IP to dummy NIC isolated in a VRF | Ingress: ip rules, and ip routes on the | Yes | Yes | No | Yes |
| | | | routing table associated with OVS | | (expose_ipv6_gua | | |
| | | | Egress: OVS flow to change MAC | (expose_tenant_networks) | _tenant_networks) | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+
BGP Driver (NB)
---------------
Note until RFE on OVN (https://bugzilla.redhat.com/show_bug.cgi?id=2107515)
is implemented there is no option to expose tenant networks as we do not know
where the CR-LRP port is associated to.
OVN version 23.09 is required to expose tenant networks and ovn-lb, because
CR-LRP port chassis information in the NB DB is only available in that
version (https://bugzilla.redhat.com/show_bug.cgi?id=2107515).
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+-------------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants or GUA | OVS-DPDK/HWOL Support | Implemented |
+=================+=====================================================+==========================================+==========================================+==========================+=======================+=============+
| Underlay | Expose IPs on the default underlay network | Adding IP to dummy nic isolated in a VRF | Ingress: ip rules, and ip routes on the | No support until OVN | No | Yes |
| | | | routing table associated to ovs | has information about | | |
| | | | Egress: ovs-flow to change mac | the CR-LRP chassis on | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+ the SB DB +-----------------------+-------------+
| L2VNI | Extends the L2 segment on a given VNI | No need to expose it, automatic with the | Ingress: vxlan + bridge device | | No | No |
| | | FRR configuration and the wiring | Egress: nothing | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+ +-----------------------+-------------+
| VRF | Expose IPs on a given VRF (vni id) | Add IPs to dummy nic associated to the | Ingress: vxlan + bridge device | | No | No |
| | | VRF device (lo_VNI_ID) | Egress: flow to redirect to VRF device | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+ +-----------------------+-------------+
| Dynamic | Mix of the previous, depending on annotations it | Mix of the previous three | Ingress: mix of all the above | | No | No |
| | exposes it differently and on different VNIs | | Egress: mix of all the above | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+ +-----------------------+-------------+
| OVN-Cluster | Make use of an extra OVN cluster (per node) instead | Adding IP to dummy nic isolated in a VRF | Ingress: ovn routes, ovs flow (mac tweak)| | Yes | No |
| | of kernel routing -- exposing the IPs with BGP is | (as it only supports the underlay option)| Egress: ovn routes and policies, | | | |
| | the same as before | | and ovs flow (mac tweak) | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+-------------+
The following table lists the various methods you can use to expose the
networks/IPs, how they expose the IPs and the tenant networks, and whether
OVS-DPDK and hardware offload (HWOL) are supported.
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants or GUA | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+==========================+=======================+===============+
| Underlay | Expose IPs on the default underlay network. | Adding IP to dummy NIC isolated in a VRF.| Ingress: ip rules, and ip routes on the | Yes | No | Yes |
| | | | routing table associated to OVS | | | |
| | | | Egress: OVS-flow to change MAC | (expose_tenant_networks) | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| L2VNI | Extends the L2 segment on a given VNI. | No need to expose it, automatic with the | Ingress: vxlan + bridge device | N/A | No | No |
| | | FRR configuration and the wiring. | Egress: nothing | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| VRF | Expose IPs on a given VRF (vni id). | Add IPs to dummy NIC associated to the | Ingress: vxlan + bridge device | Yes | No | No |
| | | VRF device (lo_VNI_ID). | Egress: flow to redirect to VRF device | (Not implemented) | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Dynamic | Mix of the previous. Depending on annotations it | Mix of the previous three. | Ingress: mix of all the above | Depends on the method | No | No |
| | exposes IPs differently and on different VNIs. | | Egress: mix of all the above | used | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| OVN | Make use of an extra OVN cluster (per node) instead | Adding IP to dummy NIC isolated in a VRF | Ingress: OVN routes, OVS flow (MAC tweak)| Yes | Yes | Yes. Only for |
| | of kernel routing -- exposing the IPs with BGP is | (as it only supports the underlay | Egress: OVN routes and policies, | (Not implemented) | | ipv4 and flat |
| | the same as before. | option). | and OVS flow (MAC tweak) | | | provider |
| | | | | | | networks |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
BGP Stretched Driver (SB)
-------------------------
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-------------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Implemented |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+=============+
| Underlay | Expose IPs on the default underlay network | Adding IP routes to default VRF table | Ingress: ip rules, and ip routes on the | Yes | No | No | Yes |
| | | | routing table associated to ovs | | | | |
| | | | Egress: ovs-flow to change mac | | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-------------+
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| Underlay | Expose IPs on the default underlay network. | Adding IP routes to default VRF table. | Ingress: ip rules, and ip routes on the | Yes | No | No | Yes |
| | | | routing table associated to OVS | | | | |
| | | | Egress: OVS-flow to change MAC | | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
EVPN Driver (SB)
----------------
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-------------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Implemented |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+=============+
| VRF | Expose IPs on a given VRF (vni id) -- requires | Add IPs to dummy nic associated to the | Ingress: vxlan + bridge device | Yes | No | No | No |
| | networking-bgpvpn or manual NB DB inputs | VRF device (lo_VNI_ID) | Egress: flow to redirect to VRF device | | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-------------+
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description | Expose with | Wired with | Expose Tenants | Expose only GUA | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| VRF | Expose IPs on a given VRF (vni id) -- requires | Add IPs to dummy NIC associated to the | Ingress: vxlan + bridge device | Yes | No | No | No |
| | networking-bgpvpn or manual NB DB inputs. | VRF device (lo_VNI_ID). | Egress: flow to redirect to VRF device | | | | |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+


@ -0,0 +1,163 @@
Agent deployment
~~~~~~~~~~~~~~~~
The BGP mode (for both NB and SB drivers) exposes the VMs and LBs in provider
networks or with FIPs, as well as VMs on tenant networks if
``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks`` configuration
options are enabled.
The agent must be deployed on all the nodes where VMs can be created,
as well as on the networker nodes (i.e., where OVN router gateway ports can be
allocated):
- For VMs and Amphora load balancers on provider networks or with FIPs,
the IP is exposed on the node where the VM (or amphora) is deployed.
Therefore the agent needs to be running on the compute nodes.
- For VMs on tenant networks (with ``expose_tenant_networks`` or
``expose_ipv6_gua_tenant_networks`` configuration options enabled), the agent
needs to be running on the networker nodes. In OpenStack, with OVN
networking, the N/S traffic to the tenant VMs (without FIPs) needs to go
through the networker nodes, more specifically the one hosting the
chassisredirect OVN port (cr-lrp), connecting the provider network to the
OVN virtual router. Hence, the VM IPs are advertised through BGP on that
node, and from there the traffic follows the normal path to the OpenStack
compute node where the VM is located, through the tunnel.
- Similarly, for OVN load balancers the IPs are exposed on the networker node.
In this case the ARP request for the VIP is answered by the OVN router
gateway port, so the traffic needs to be injected into the OVN overlay
at that point too.
Therefore the agent needs to be running on the networker nodes for OVN
load balancers.
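The placement rules above can be condensed into a small decision function. This is only an illustrative sketch: the ``Node`` enum, the ``kind`` labels and the function name are hypothetical and do not come from the agent's code, and amphorae are treated as regular VMs.

```python
from enum import Enum


class Node(Enum):
    COMPUTE = "compute"      # node hosting the VM (or amphora)
    NETWORKER = "networker"  # node hosting the router gateway port (cr-lrp)


def exposing_node(kind: str, on_provider_or_fip: bool) -> Node:
    """Return the node type where the IP is advertised.

    kind: "vm" (includes amphora load balancers) or "ovn_lb".
    on_provider_or_fip: True if the port is on a provider network
    or has a FIP associated.
    """
    if kind == "ovn_lb":
        # The VIP is answered by the OVN router gateway port.
        return Node.NETWORKER
    if kind == "vm":
        # Tenant VMs without FIPs are reached through the cr-lrp node.
        return Node.COMPUTE if on_provider_or_fip else Node.NETWORKER
    raise ValueError(f"unknown workload kind: {kind}")
```

Since both node types can end up exposing IPs, the agent has to run on the compute nodes and on the networker nodes.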
As an example of how to start the OVN BGP Agent on the nodes, see the commands
below:
.. code-block:: ini
$ python setup.py install
$ cat bgp-agent.conf
# sample configuration that can be adapted based on needs
[DEFAULT]
debug=True
reconcile_interval=120
expose_tenant_networks=True
# expose_ipv6_gua_tenant_networks=True
# for SB DB driver
driver=ovn_bgp_driver
# for NB DB driver
#driver=nb_ovn_bgp_driver
bgp_AS=64999
bgp_nic=bgp-nic
bgp_vrf=bgp-vrf
bgp_vrf_table_id=10
ovsdb_connection=tcp:127.0.0.1:6640
address_scopes=2237917c7b12489a84de4ef384a2bcae
[ovn]
ovn_nb_connection = tcp:172.17.0.30:6641
ovn_sb_connection = tcp:172.17.0.30:6642
[agent]
root_helper=sudo ovn-bgp-agent-rootwrap /etc/ovn-bgp-agent/rootwrap.conf
root_helper_daemon=sudo ovn-bgp-agent-rootwrap-daemon /etc/ovn-bgp-agent/rootwrap.conf
$ sudo bgp-agent --config-dir bgp-agent.conf
Starting BGP Agent...
Loaded chassis 51c8480f-c573-4c1c-b96e-582f9ca21e70.
BGP Agent Started...
Ensuring VRF configuration for advertising routes
Configuring br-ex default rule and routing tables for each provider network
Found routing table for br-ex with: ['201', 'br-ex']
Sync current routes.
Add BGP route for logical port with ip 172.24.4.226
Add BGP route for FIP with ip 172.24.4.199
Add BGP route for CR-LRP Port 172.24.4.221
....
.. note::
If you only want to expose the IPv6 GUA tenant IPs, then remove the option
``expose_tenant_networks`` and add ``expose_ipv6_gua_tenant_networks=True``
instead.
.. note::
If you want to filter the tenant networks to be exposed by some specific
address scopes, add the list of address scopes to the ``address_scopes=XXX``
option. If no filtering should be applied, just remove the line.
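The address scope filtering described in the note can be pictured with the following sketch. The function names are hypothetical, and parsing the option as a comma-separated list is an assumption about the configuration format, not something taken from the agent's code.

```python
from typing import List, Optional


def parse_address_scopes(raw: str) -> List[str]:
    """Split a comma-separated option value into a list of scope IDs.

    The comma-separated format is an assumption made for this sketch.
    """
    return [item.strip() for item in raw.split(",") if item.strip()]


def network_is_exposed(network_scope: Optional[str],
                       address_scopes: List[str]) -> bool:
    """Apply the address scope filter to a tenant network."""
    if not address_scopes:
        # Option not set: no filtering, every tenant network qualifies.
        return True
    # Option set: only networks whose scope is listed qualify.
    return network_scope in address_scopes
```

For instance, with ``address_scopes=2237917c7b12489a84de4ef384a2bcae`` only tenant networks in that address scope pass the filter.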
Note that the OVN BGP Agent operates under the following assumptions:
- A dynamic routing solution, in this case FRR, is deployed and
advertises/withdraws routes added/deleted to/from certain local interfaces,
in this case the ones associated to the VRF created to that end. As only VM
and load balancer IPs need to be advertised, FRR needs to be configured with
the proper filtering so that only /32 (or /128 for IPv6) IPs are advertised.
A sample config for FRR is:
.. code-block:: ini
frr version 7.5
frr defaults traditional
hostname cmp-1-0
log file /var/log/frr/frr.log debugging
log timestamp precision 3
service integrated-vtysh-config
line vty
router bgp 64999
bgp router-id 172.30.1.1
bgp log-neighbor-changes
bgp graceful-shutdown
no bgp default ipv4-unicast
no bgp ebgp-requires-policy
neighbor uplink peer-group
neighbor uplink remote-as internal
neighbor uplink password foobar
neighbor enp2s0 interface peer-group uplink
neighbor enp3s0 interface peer-group uplink
address-family ipv4 unicast
redistribute connected
neighbor uplink activate
neighbor uplink allowas-in origin
neighbor uplink prefix-list only-host-prefixes out
exit-address-family
address-family ipv6 unicast
redistribute connected
neighbor uplink activate
neighbor uplink allowas-in origin
neighbor uplink prefix-list only-host-prefixes out
exit-address-family
ip prefix-list only-default permit 0.0.0.0/0
ip prefix-list only-host-prefixes permit 0.0.0.0/0 ge 32
route-map rm-only-default permit 10
match ip address prefix-list only-default
set src 172.30.1.1
ip protocol bgp route-map rm-only-default
ipv6 prefix-list only-default permit ::/0
ipv6 prefix-list only-host-prefixes permit ::/0 ge 128
route-map rm-only-default permit 11
match ipv6 address prefix-list only-default
set src f00d:f00d:f00d:f00d:f00d:f00d:f00d:0004
ipv6 protocol bgp route-map rm-only-default
ip nht resolve-via-default
- The relevant provider OVS bridges are created and configured with a loopback
IP address (e.g., 1.1.1.1/32 for IPv4), and proxy ARP/NDP is enabled on their
kernel interface.
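To make the effect of the ``only-host-prefixes`` prefix lists in the FRR sample concrete, the sketch below mimics how a ``permit <base> ge <length>`` entry is evaluated: a route matches when it falls inside the base prefix and its prefix length is at least ``ge``. The function name is hypothetical, and this is a simplified model rather than FRR code.

```python
import ipaddress


def matches_prefix_list(route: str, base: str, ge: int) -> bool:
    """Simplified model of 'prefix-list ... permit <base> ge <ge>'."""
    net = ipaddress.ip_network(route, strict=False)
    base_net = ipaddress.ip_network(base)
    if net.version != base_net.version:
        return False
    # The route must be inside the base prefix and long enough.
    return net.subnet_of(base_net) and net.prefixlen >= ge


# Host routes pass the filter, whole subnets do not.
print(matches_prefix_list("172.24.4.226/32", "0.0.0.0/0", 32))
print(matches_prefix_list("172.24.4.0/24", "0.0.0.0/0", 32))
```

With ``ge 32`` for IPv4 and ``ge 128`` for IPv6, only single-host prefixes are sent to the uplink peers.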


@ -0,0 +1,66 @@
BGP Advertisement
+++++++++++++++++
The OVN BGP Agent (both SB and NB drivers) is in charge of triggering FRR
(IP routing protocol suite for Linux which includes protocol daemons for BGP,
OSPF, RIP, among others) to advertise/withdraw directly connected routes via
BGP. To do that, when the agent starts, it ensures that:
- FRR local instance is reconfigured to leak routes for a new VRF. To do that
it uses ``vtysh shell``. It connects to the existing FRR socket
(``--vty_socket`` option) and executes the following commands, passing them
through a file (``-c FILE_NAME`` option):
.. code-block:: ini
router bgp {{ bgp_as }}
address-family ipv4 unicast
import vrf {{ vrf_name }}
exit-address-family
address-family ipv6 unicast
import vrf {{ vrf_name }}
exit-address-family
router bgp {{ bgp_as }} vrf {{ vrf_name }}
bgp router-id {{ bgp_router_id }}
address-family ipv4 unicast
redistribute connected
exit-address-family
address-family ipv6 unicast
redistribute connected
exit-address-family
- There is a VRF created (the one leaked in the previous step), by default
with name ``bgp-vrf``.
- There is a dummy interface (by default named ``bgp-nic``), associated to
the previously created VRF device.
- ARP/NDP is enabled on the OVS provider bridges by adding an IP to them.
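As an illustration of how the VRF leak template from the first item gets its placeholders filled in, here is a minimal sketch that substitutes the ``{{ name }}`` markers using only the standard library. The ``render`` helper is hypothetical, the indentation inside the template is an assumption, and the values are taken from the sample configuration shown earlier in the deployment section.

```python
import re

LEAK_VRF_TEMPLATE = """\
router bgp {{ bgp_as }}
 address-family ipv4 unicast
  import vrf {{ vrf_name }}
 exit-address-family
 address-family ipv6 unicast
  import vrf {{ vrf_name }}
 exit-address-family
router bgp {{ bgp_as }} vrf {{ vrf_name }}
 bgp router-id {{ bgp_router_id }}
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 address-family ipv6 unicast
  redistribute connected
 exit-address-family
"""


def render(template: str, **values) -> str:
    """Replace each '{{ name }}' marker with the matching value."""
    def substitute(match):
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


config = render(LEAK_VRF_TEMPLATE, bgp_as=64999, vrf_name="bgp-vrf",
                bgp_router_id="172.30.1.1")
print(config)
```

The rendered text is what would be written to the file handed to ``vtysh``.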
Then, to expose the VM/LB IPs as they are created (or upon
initialization or re-sync), since the FRR configuration has the
``redistribute connected`` option enabled, the only action needed to expose an
IP (or withdraw it) is to add it to (or remove it from) the ``bgp-nic`` dummy
interface. The agent then relies on Zebra to do the BGP advertisement, as Zebra
detects the addition/deletion of the IP on the local interface and
advertises/withdraws the route:
.. code-block:: ini
$ ip addr add IPv4/32 dev bgp-nic
$ ip addr add IPv6/128 dev bgp-nic
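The choice between the ``/32`` and ``/128`` suffix follows from the IP version. A small sketch that builds the equivalent command line is shown next. The helper name is hypothetical, and how the agent itself programs the interface is not covered here.

```python
import ipaddress
from typing import List


def ip_addr_add_cmd(ip: str, nic: str = "bgp-nic") -> List[str]:
    """Build the 'ip addr add' command for a host route on the dummy NIC."""
    addr = ipaddress.ip_address(ip)
    # Host prefixes only: /32 for IPv4, /128 for IPv6.
    prefixlen = 32 if addr.version == 4 else 128
    return ["ip", "addr", "add", f"{addr}/{prefixlen}", "dev", nic]


print(" ".join(ip_addr_add_cmd("172.24.4.226")))
print(" ".join(ip_addr_add_cmd("2001:db8::1")))
```

Withdrawing a route is the mirror operation, removing the same address from the interface.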
.. note::
As we also want to be able to expose VMs connected to tenant networks
(when ``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks``
configuration options are enabled), there is a need to expose the Neutron
router gateway port (CR-LRP on OVN) so that the traffic to VMs in tenant
networks is injected into OVN overlay through the node that is hosting
that port.


@ -1,603 +0,0 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
Convention for heading levels in Neutron devref:
======= Heading 0 (reserved for the title in a document)
------- Heading 1
~~~~~~~ Heading 2
+++++++ Heading 3
''''''' Heading 4
(Avoid deeper levels because they do not render well.)
=======================================
OVN BGP Agent: Design of the BGP Driver
=======================================
Purpose
-------
The purpose of this document is to present the design decisions behind
the BGP Driver for the Networking OVN BGP agent.
The main purpose of adding support for BGP is to be able to expose Virtual
Machines (VMs) and Load Balancers (LBs) IPs through BGP dynamic protocol
when they either have a Floating IP (FIP) associated or are booted/created
on a provider network -- also in tenant networks if a flag is enabled.
Overview
--------
With the increase of virtualized/containerized workloads it is becoming more
and more common to use pure layer-3 Spine and Leaf network deployments at
datacenters. There are several benefits of this, such as reduced complexity at
scale, reduced failure domains, limiting broadcast traffic, among others.
The OVN BGP Agent is a Python based daemon that runs on each node
(e.g., OpenStack controllers and/or compute nodes). It connects to the OVN
SouthBound DataBase (OVN SB DB) to detect the specific events it needs to
react to, and then leverages FRR to expose the routes towards the VMs, and
kernel networking capabilities to redirect the traffic arriving on the nodes
to the OVN overlay.
.. note::
Note it is only intended for the N/S traffic, the E/W traffic will work
exactly the same as before, i.e., VMs are connected through geneve
tunnels.
The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This simple design allows the agent to implement different drivers, depending
on what OVN SB DB events are being watched (watchers examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (drivers examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).
A driver implements the support for BGP capabilities. It ensures both VMs and
LBs on provider networks or with Floating IPs associated can be
exposed through BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.
A common driver API is defined exposing the next methods:
- ``expose_ip`` and ``withdraw_ip``: used to expose/withdraw IPs for local
OVN ports.
- ``expose_remote_ip`` and ``withdraw_remote_ip``: used to expose/withdraw IPs
through another node when the VM/Pod is running on a different node.
For example for VMs on tenant networks where the traffic needs to be
injected through the OVN router gateway port.
- ``expose_subnet`` and ``withdraw_subnet``: used to expose/withdraw subnets through
the local node.
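The shape of such a common API can be sketched as an abstract base class. Only the method names come from the list above. The class names and every signature are assumptions made for illustration and may differ from ``ovn_bgp_agent/drivers/driver_api.py``.

```python
import abc


class DriverAPI(abc.ABC):
    """Hypothetical stand-in for the common driver API."""

    @abc.abstractmethod
    def expose_ip(self, ips, row, associated_port=None):
        """Expose IPs of a local OVN port."""

    @abc.abstractmethod
    def withdraw_ip(self, ips, row, associated_port=None):
        """Withdraw IPs of a local OVN port."""

    @abc.abstractmethod
    def expose_remote_ip(self, ips, row):
        """Expose IPs of a port running on a different node."""

    @abc.abstractmethod
    def withdraw_remote_ip(self, ips, row):
        """Withdraw IPs of a port running on a different node."""

    @abc.abstractmethod
    def expose_subnet(self, ip, row):
        """Expose a subnet through the local node."""

    @abc.abstractmethod
    def withdraw_subnet(self, ip, row):
        """Withdraw a subnet from the local node."""


class RecordingDriver(DriverAPI):
    """Toy driver that only records which API calls it receives."""

    def __init__(self):
        self.calls = []

    def expose_ip(self, ips, row, associated_port=None):
        self.calls.append(("expose_ip", tuple(ips)))

    def withdraw_ip(self, ips, row, associated_port=None):
        self.calls.append(("withdraw_ip", tuple(ips)))

    def expose_remote_ip(self, ips, row):
        self.calls.append(("expose_remote_ip", tuple(ips)))

    def withdraw_remote_ip(self, ips, row):
        self.calls.append(("withdraw_remote_ip", tuple(ips)))

    def expose_subnet(self, ip, row):
        self.calls.append(("expose_subnet", (ip,)))

    def withdraw_subnet(self, ip, row):
        self.calls.append(("withdraw_subnet", (ip,)))
```

Keeping the watchers coded against the abstract class is what lets different drivers react to the same events in different ways.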
Proposed Solution
-----------------
To support BGP functionality the OVN BGP Agent includes a driver
that performs the extra steps required for exposing the IPs through BGP on
the right nodes and steering the traffic to/from the node from/to the OVN
overlay. In order to configure which driver to use, one should set the
``driver`` configuration option in the ``bgp-agent.conf`` file.
This driver requires a watcher to react to the BGP-related events.
In this case, the BGP actions are triggered by events related to the
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
The information in those tables gets modified by actions related to VMs or LBs
creation/deletion, as well as FIPs association/disassociation to/from them.
Then, the agent performs some actions in order to ensure those VMs are
reachable through BGP:
- Traffic between nodes or BGP Advertisement: These are the actions needed to
expose the BGP routes and make sure all the nodes know how to reach the
VM/LB IP on the nodes.
- Traffic within a node or redirecting traffic to/from OVN overlay: These are
the actions needed to redirect the traffic to/from a VM to the OVN neutron
networks, when traffic reaches the node where the VM is or on its way
out of the node.
The code for the BGP driver is located at
``drivers/openstack/ovn_bgp_driver.py``, and its associated watcher can be
found at ``drivers/openstack/watchers/bgp_watcher.py``.
OVN SB DB Events
~~~~~~~~~~~~~~~~
The watcher associated to the BGP driver detects the relevant events on the
OVN SB DB to call the driver functions to configure BGP and Linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:
- VMs or LBs created/deleted on provider networks
- FIPs association/disassociation to VMs or LBs
- VMs or LBs created/deleted on tenant networks (if the
``expose_tenant_networks`` configuration option is enabled, or if
``expose_ipv6_gua_tenant_networks`` is enabled to expose only IPv6 GUA ranges)
.. note::
If the ``expose_tenant_networks`` flag is enabled, the status of
``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant IPs
will be advertised.
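The precedence between the two flags can be sketched as follows. The function name is hypothetical, and the global unicast check uses the ``2000::/3`` range, which is the IANA global unicast allocation. How the agent itself detects GUA addresses is not shown in this document.

```python
import ipaddress

# IANA IPv6 global unicast allocation.
IPV6_GUA = ipaddress.ip_network("2000::/3")


def tenant_ip_is_exposed(ip: str,
                         expose_tenant_networks: bool,
                         expose_ipv6_gua_tenant_networks: bool) -> bool:
    """Decide whether a tenant IP is advertised, given the two flags."""
    if expose_tenant_networks:
        # This flag wins: every tenant IP is advertised.
        return True
    if expose_ipv6_gua_tenant_networks:
        addr = ipaddress.ip_address(ip)
        # Only IPv6 global unicast addresses qualify.
        return addr.version == 6 and addr in IPV6_GUA
    return False
```

With only the GUA flag enabled, an IPv4 tenant address such as ``10.0.0.5`` or an IPv6 unique local address such as ``fd00::5`` is not advertised.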
The BGP watcher detects OVN Southbound Database events at the ``Port_Binding``
and ``Load_Balancer`` tables. It creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, that all the events
watched for BGP use as the base (inherit from).
The specific defined events to react to are:
- ``PortBindingChassisCreatedEvent``: Detects when a port of type
``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
attached to the OVN chassis where the agent is running. This is the case for
VM or amphora LB ports on the provider networks, VM or amphora LB ports on
tenant networks with a FIP associated, and neutron gateway router ports
(CR-LRPs). It calls ``expose_ip`` driver method to perform the needed
actions to expose it.
- ``PortBindingChassisDeletedEvent``: Detects when a port of type
``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
detached from the OVN chassis where the agent is running. This is the case
for VM or amphora LB ports on the provider networks, VM or amphora LB ports
on tenant networks with a FIP associated, and neutron gateway router ports
(CR-LRPs). It calls ``withdraw_ip`` driver method to perform the needed
actions to withdraw the exposed BGP route.
- ``FIPSetEvent``: Detects when a patch port gets its ``nat_addresses`` field
updated (e.g., action related to FIPs NATing). If so, and the associated
VM port is on the local chassis, the event is processed by the agent: the
required ip rule gets created and the IP is exposed through BGP. It calls the
``expose_ip`` driver method, including the associated_port information, to
perform the required actions.
- ``FIPUnsetEvent``: Same as the previous one, but when the ``nat_addresses``
field gets an IP deleted. It calls the ``withdraw_ip`` driver method to
perform the required actions.
- ``SubnetRouterAttachedEvent``: Detects when a patch port gets created.
This means a subnet is attached to a router. In the ``expose_tenant_networks``
case, if the chassis is the one having the cr-lrp port for that router where
the port is getting created, then the event is processed by the agent and the
needed actions (ip rules and routes, and ovs rules) for exposing the IPs on
that network are performed. This event calls the driver_api
``expose_subnet``. The same happens if ``expose_ipv6_gua_tenant_networks``
is used, but then, the IPs are only exposed if they are IPv6 global.
- ``SubnetRouterDetachedEvent``: Same as previous one, but for the deletion
of the port. It calls ``withdraw_subnet``.
- ``TenantPortCreateEvent``: Detects when a port of type ``""`` (empty
double-quotes) or ``virtual`` gets updated. If that port is not on a
provider network, and the chassis where the event is processed has the
LogicalRouterPort for the network and the OVN router gateway port where the
network is connected to, then the event is processed and the actions to
expose it through BGP are triggered. It calls the ``expose_remote_ip`` as in
this case the IPs are exposed through the node with the OVN router gateway
port, instead of where the VM is.
- ``TenantPortDeleteEvent``: Same as previous one, but for the deletion of the
port. It calls ``withdraw_remote_ip``.
- ``OVNLBMemberUpdateEvent``: This event is required to handle the OVN load
balancers created on the provider networks. It detects when new datapaths
are added/removed to/from the ``Load_Balancer`` entries. This happens when
members are added/removed -- their respective datapaths are added into the
``Load_Balancer`` table entry. The event is only processed in the nodes with the
relevant OVN router gateway ports, as it is where it needs to get exposed to
be injected into OVN overlay. It calls ``expose_ovn_lb_on_provider`` when the
second datapath is added (first one is the one belonging to the VIP (i.e.,
the provider network), while the second one belongs to the load balancer
member -- note all the load balancer members are expected to be connected
through the same router to the provider network). And it calls
``withdraw_ovn_lb_on_provider`` when that member gets deleted (only one
datapath left) or the event type is ROW_DELETE, meaning the whole
load balancer is deleted.
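The event list above boils down to a mapping from events to driver methods, plus a small rule for OVN load balancers based on the number of datapaths. The sketch below restates it. The dictionary, the function name and the ``ROW_DELETE`` string value are stand-ins chosen for illustration, not the watcher's actual code.

```python
from typing import Optional

# Event class name -> driver method it ends up calling (from the list above).
EVENT_TO_DRIVER_METHOD = {
    "PortBindingChassisCreatedEvent": "expose_ip",
    "PortBindingChassisDeletedEvent": "withdraw_ip",
    "FIPSetEvent": "expose_ip",
    "FIPUnsetEvent": "withdraw_ip",
    "SubnetRouterAttachedEvent": "expose_subnet",
    "SubnetRouterDetachedEvent": "withdraw_subnet",
    "TenantPortCreateEvent": "expose_remote_ip",
    "TenantPortDeleteEvent": "withdraw_remote_ip",
}

ROW_DELETE = "delete"  # stand-in value for the row deletion event type


def ovn_lb_action(event_type: str, old_datapaths: int,
                  new_datapaths: int) -> Optional[str]:
    """Pick the driver method for an OVNLBMemberUpdateEvent, if any."""
    if event_type == ROW_DELETE:
        # The whole load balancer is gone.
        return "withdraw_ovn_lb_on_provider"
    if old_datapaths < 2 and new_datapaths == 2:
        # Second datapath added: the first member joined the VIP datapath.
        return "expose_ovn_lb_on_provider"
    if old_datapaths > 1 and new_datapaths == 1:
        # Only the VIP datapath is left: the member was deleted.
        return "withdraw_ovn_lb_on_provider"
    return None
```

Any other datapath change, such as a third datapath appearing, leaves the exposed state untouched in this simplified model.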
Driver Logic
~~~~~~~~~~~~
The BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too -- although instead of directly
in the node they are created, through one of the network gateway chassis nodes.
The same happens with ``expose_ipv6_gua_tenant_networks`` but only for IPv6
GUA ranges. In addition, if the config option ``address_scopes`` is set only
the tenant networks with matching corresponding address_scope will be exposed.
To accomplish this, it needs to ensure that:
- VM and LBs IPs can be advertised in a node where the traffic could be
injected into the OVN overlay, in this case either the node hosting the VM
or the node where the router gateway port is scheduled (see limitations
subsection).
- Once the traffic reaches the specific node, the traffic is redirected to the
OVN overlay by leveraging kernel networking.
BGP Advertisement
+++++++++++++++++
The OVN BGP Agent is in charge of triggering FRR (IP routing protocol
suite for Linux which includes protocol daemons for BGP, OSPF, RIP,
among others) to advertise/withdraw directly connected routes via BGP.
To do that, when the agent starts, it ensures that:
- FRR local instance is reconfigured to leak routes for a new VRF. To do that
it uses ``vtysh shell``. It connects to the existing FRR socket (
``--vty_socket`` option) and executes the next commands, passing them through
a file (``-c FILE_NAME`` option):
.. code-block:: ini
LEAK_VRF_TEMPLATE = '''
router bgp {{ bgp_as }}
address-family ipv4 unicast
import vrf {{ vrf_name }}
exit-address-family
address-family ipv6 unicast
import vrf {{ vrf_name }}
exit-address-family
router bgp {{ bgp_as }} vrf {{ vrf_name }}
bgp router-id {{ bgp_router_id }}
address-family ipv4 unicast
redistribute connected
exit-address-family
address-family ipv6 unicast
redistribute connected
exit-address-family
'''
- There is a VRF created (the one leaked in the previous step), by default
with name ``bgp_vrf``.
- There is a dummy interface type (by default named ``bgp-nic``), associated to
the previously created VRF device.
- Ensure ARP/NDP is enabled at OVS provider bridges by adding an IP to it
Then, to expose the VMs/LB IPs as they are created (or upon
initialization or re-sync), since the FRR configuration has the
``redistribute connected`` option enabled, the only action needed to expose it
(or withdraw it) is to add it (or remove it) from the ``bgp-nic`` dummy interface.
Then it relies on Zebra to do the BGP advertisement, as Zebra detects the
addition/deletion of the IP on the local interface and advertises/withdraws
the route:
.. code-block:: ini
$ ip addr add IPv4/32 dev bgp-nic
$ ip addr add IPv6/128 dev bgp-nic
.. note::
As we also want to be able to expose VM connected to tenant networks
(when ``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks``
configuration options are enabled), there is a need to expose the Neutron
router gateway port (CR-LRP on OVN) so that the traffic to VMs on tenant
networks is injected into OVN overlay through the node that is hosting
that port.
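As a rough illustration of the advertisement step above, the following Python
sketch builds the ``ip addr`` commands that add or remove a host address on
the dummy interface. The helper name ``bgp_nic_command`` is hypothetical and
is not part of the agent; only the ``bgp-nic`` device name and the /32 and
/128 prefix lengths come from the text above.

.. code-block:: python

    import ipaddress


    def bgp_nic_command(ip, action="add", nic="bgp-nic"):
        """Return the argv list that adds/removes a host address on the NIC.

        Hypothetical helper for illustration only; the real agent may
        configure the interface through a different mechanism.
        """
        if action not in ("add", "del"):
            raise ValueError("action must be 'add' or 'del'")
        addr = ipaddress.ip_address(ip)
        # Host routes only: /32 for IPv4 and /128 for IPv6.
        prefix_len = 32 if addr.version == 4 else 128
        return ["ip", "addr", action, f"{addr}/{prefix_len}", "dev", nic]


    # Expose an IPv4 address, then withdraw an IPv6 one.
    print(" ".join(bgp_nic_command("172.24.4.226")))
    print(" ".join(bgp_nic_command("2001:db8::10", action="del")))

Because FRR is configured with ``redistribute connected``, running the
resulting command is enough for the route to be advertised or withdrawn.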
Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++
Once the VM/LB IP is exposed in a specific node (either the one hosting the
VM/LB or the one with the OVN router gateway port), the OVN BGP Agent is in
charge of configuring the linux kernel networking and OVS so that the traffic
can be injected into the OVN overlay, and vice versa. To do that, when the
agent starts, it ensures that:
- ARP/NDP is enabled at OVS provider bridges by adding an IP to it
- There is a routing table associated to each OVS provider bridge
(adds entry at /etc/iproute2/rt_tables)
- If provider network is a VLAN network, a VLAN device connected
to the bridge is created, and it has ARP and NDP enabled.
- Cleans up extra OVS flows at the OVS provider bridges
Then, either upon events or due to (re)sync (regularly or during start up), it:
- Adds an IP rule to apply specific routing table routes,
in this case the one associated to the OVS provider bridge:
.. code-block:: ini
$ ip rule
0: from all lookup local
1000: from all lookup [l3mdev-table]
*32000: from all to IP lookup br-ex* # br-ex is the OVS provider bridge
*32000: from all to CIDR lookup br-ex* # for VMs in tenant networks
32766: from all lookup main
32767: from all lookup default
- Adds an IP route at the OVS provider bridge routing table so that the traffic is
routed to the OVS provider bridge device:
.. code-block:: ini
$ ip route show table br-ex
default dev br-ex scope link
*CIDR via CR-LRP_IP dev br-ex* # for VMs in tenant networks
*CR-LRP_IP dev br-ex scope link* # for the VM in tenant network redirection
*IP dev br-ex scope link* # IPs on provider or FIPs
- Adds a static ARP entry for the OVN router gateway ports (CR-LRP) so that the
traffic is steered to OVN via br-int -- this is because OVN does not reply
to ARP requests outside its L2 network:
.. code-block:: ini
$ ip nei
...
CR-LRP_IP dev br-ex lladdr CR-LRP_MAC PERMANENT
...
- For IPv6, instead of the static ARP entry, an NDP proxy is added, same
reasoning:
.. code-block:: ini
$ ip -6 nei add proxy CR-LRP_IP dev br-ex
- Finally, in order to properly send the traffic out from the OVN overlay
to kernel networking to be sent out of the node, the OVN BGP Agent needs
to add a new flow at the OVS provider bridges so that the destination MAC
address is changed to the MAC address of the OVS provider bridge
(``actions=mod_dl_dst:OVN_PROVIDER_BRIDGE_MAC,NORMAL``):
.. code-block:: ini
$ sudo ovs-ofctl dump-flows br-ex
cookie=0x3e7, duration=77.949s, table=0, n_packets=0, n_bytes=0, priority=900,ip,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
cookie=0x3e7, duration=77.937s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
Driver API
++++++++++
The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:
- ``expose_ip``: creates all the ip rules and routes, and ovs flows needed
to redirect the traffic to OVN overlay. It also ensures FRR exposes through
BGP the required IP.
- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.
- ``expose_subnet``: add kernel networking configuration (ip rules and route)
to ensure traffic can go from the node to the OVN overlay, and vice versa,
for IPs within the tenant subnet CIDR.
- ``withdraw_subnet``: removes the above kernel networking configuration.
- ``expose_remote_ip``: BGP expose VM tenant network IPs through the chassis
hosting the OVN gateway port for the router where the VM is connected.
It ensures traffic destined to the VM IP arrives at this node by exposing
the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
the traffic is redirected to the OVN overlay once on the node.
- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
through BGP from the node.
In addition, it also implements these 2 extra ones for the OVN load
balancers on the provider networks:
- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to ensure
traffic is forwarded from the node to the OVN overlay as well as to expose
the VIP through BGP.
- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop advertising
the load balancer VIP.
Agent deployment
~~~~~~~~~~~~~~~~
The BGP mode exposes the VMs and LBs in provider networks or with
FIPs, as well as VMs on tenant networks if ``expose_tenant_networks`` or
``expose_ipv6_gua_tenant_networks`` configuration options are enabled.
There is a need to deploy the agent in all the nodes where VMs can be created
as well as in the networker nodes (i.e., where OVN router gateway ports can be
allocated):
- For VMs and Amphora load balancers on provider networks or with FIPs,
the IP is exposed on the node where the VM (or amphora) is deployed.
Therefore the agent needs to be running on the compute nodes.
- For VMs on tenant networks (with ``expose_tenant_networks`` or
``expose_ipv6_gua_tenant_networks`` configuration options enabled), the agent
needs to be running on the networker nodes. In OpenStack, with OVN
networking, the N/S traffic to the tenant VMs (without FIPs) needs to go
through the networking nodes, more specifically the one hosting the
chassisredirect ovn port (cr-lrp), connecting the provider network to the
OVN virtual router. Hence, the VM IP is advertised through BGP in that
node, and from there it follows the normal path to the OpenStack compute
node where the VM is located -- the Geneve tunnel.
- Similarly, for OVN load balancer the IPs are exposed on the networker node.
In this case the ARP request for the VIP is replied by the OVN router
gateway port, therefore the traffic needs to be injected into OVN overlay
at that point too.
Therefore the agent needs to be running on the networker nodes for OVN
load balancers.
As an example of how to start the OVN BGP Agent on the nodes, see the commands
below:
.. code-block:: ini
$ python setup.py install
$ cat bgp-agent.conf
# sample configuration that can be adapted based on needs
[DEFAULT]
debug=True
reconcile_interval=120
expose_tenant_networks=True
# expose_ipv6_gua_tenant_networks=True
driver=osp_bgp_driver
address_scopes=2237917c7b12489a84de4ef384a2bcae
$ sudo bgp-agent --config-dir bgp-agent.conf
Starting BGP Agent...
Loaded chassis 51c8480f-c573-4c1c-b96e-582f9ca21e70.
BGP Agent Started...
Ensuring VRF configuration for advertising routes
Configuring br-ex default rule and routing tables for each provider network
Found routing table for br-ex with: ['201', 'br-ex']
Sync current routes.
Add BGP route for logical port with ip 172.24.4.226
Add BGP route for FIP with ip 172.24.4.199
Add BGP route for CR-LRP Port 172.24.4.221
....
.. note::
If you only want to expose the IPv6 GUA tenant IPs, then remove the option
``expose_tenant_networks`` and add ``expose_ipv6_gua_tenant_networks=True``
instead.
.. note::
If you want to filter the tenant networks to be exposed by some specific
address scopes, add the list of address scopes to ``address_scopes=XXX``
section. If no filtering should be applied, just remove the line.
Note that the OVN BGP Agent operates under the next assumptions:
- A dynamic routing solution, in this case FRR, is deployed and
advertises/withdraws routes added/deleted to/from certain local interface,
in this case the ones associated to the VRF created to that end. As only VM
and load balancer IPs need to be advertised, FRR needs to be configured with
the proper filtering so that only /32 (or /128 for IPv6) IPs are advertised.
A sample config for FRR is:
.. code-block:: ini
frr version 7.0
frr defaults traditional
hostname cmp-1-0
log file /var/log/frr/frr.log debugging
log timestamp precision 3
service integrated-vtysh-config
line vty
router bgp 64999
bgp router-id 172.30.1.1
bgp log-neighbor-changes
bgp graceful-shutdown
no bgp default ipv4-unicast
no bgp ebgp-requires-policy
neighbor uplink peer-group
neighbor uplink remote-as internal
neighbor uplink password foobar
neighbor enp2s0 interface peer-group uplink
neighbor enp3s0 interface peer-group uplink
address-family ipv4 unicast
redistribute connected
neighbor uplink activate
neighbor uplink allowas-in origin
neighbor uplink prefix-list only-host-prefixes out
exit-address-family
address-family ipv6 unicast
redistribute connected
neighbor uplink activate
neighbor uplink allowas-in origin
neighbor uplink prefix-list only-host-prefixes out
exit-address-family
ip prefix-list only-default permit 0.0.0.0/0
ip prefix-list only-host-prefixes permit 0.0.0.0/0 ge 32
route-map rm-only-default permit 10
match ip address prefix-list only-default
set src 172.30.1.1
ip protocol bgp route-map rm-only-default
ipv6 prefix-list only-default permit ::/0
ipv6 prefix-list only-host-prefixes permit ::/0 ge 128
route-map rm-only-default permit 11
match ipv6 address prefix-list only-default
set src f00d:f00d:f00d:f00d:f00d:f00d:f00d:0004
ipv6 protocol bgp route-map rm-only-default
ip nht resolve-via-default
- The relevant provider OVS bridges are created and configured with a loopback
IP address (eg. 1.1.1.1/32 for IPv4), and proxy ARP/NDP is enabled on their
kernel interface. In the case of OpenStack this is done by TripleO directly.
Limitations
-----------
The following limitations apply:
- There is no API to decide what to expose, all VMs/LBs on providers or with
Floating IPs associated to them will get exposed. For the VMs in the tenant
networks, the flag ``address_scopes`` should be used for filtering what
subnets to expose -- which should be also used to ensure no overlapping IPs.
- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
using address scopes and subnet pools.
- Network traffic is steered by kernel routing (ip routes and rules), therefore
OVS-DPDK, where the kernel space is skipped, is not supported.
- Network traffic is steered by kernel routing (ip routes and rules), therefore
SRIOV, where the hypervisor is skipped, is not supported.
- In OpenStack with OVN networking the N/S traffic to the ovn-octavia VIPs on
the provider or the FIPs associated to the VIPs on tenant networks needs to
go through the networking nodes (the ones hosting the Neutron Router Gateway
Ports, i.e., the chassisredirect cr-lrp ports, for the router connecting the
load balancer members to the provider network). Therefore, the entry point
into the OVN overlay needs to be one of those networking nodes, and
consequently the VIPs (or FIPs to VIPs) are exposed through them. From those
nodes the traffic will follow the normal tunneled path (Geneve tunnel) to
the OpenStack compute node where the selected member is located.


@ -0,0 +1,78 @@
Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++
Besides the VM/LB IP being exposed in a specific node (either the one hosting
the VM/LB or the one with the OVN router gateway port), the OVN BGP Agent is in
charge of configuring the linux kernel networking and OVS so that the traffic
can be injected into the OVN overlay, and vice versa. To do that, when the
agent starts, it ensures that:
- ARP/NDP is enabled on OVS provider bridges by adding an IP to it
- There is a routing table associated to each OVS provider bridge
(adds entry at /etc/iproute2/rt_tables)
- If the provider network is a VLAN network, a VLAN device connected
to the bridge is created, and it has ARP and NDP enabled.
- Cleans up extra OVS flows at the OVS provider bridges
Then, either upon events or due to (re)sync (regularly or during start up), it:
- Adds an IP rule to apply specific routing table routes,
in this case the one associated to the OVS provider bridge:
.. code-block:: ini
$ ip rule
0: from all lookup local
1000: from all lookup [l3mdev-table]
*32000: from all to IP lookup br-ex* # br-ex is the OVS provider bridge
*32000: from all to CIDR lookup br-ex* # for VMs in tenant networks
32766: from all lookup main
32767: from all lookup default
- Adds an IP route at the OVS provider bridge routing table so that the traffic is
routed to the OVS provider bridge device:
.. code-block:: ini
$ ip route show table br-ex
default dev br-ex scope link
*CIDR via CR-LRP_IP dev br-ex* # for VMs in tenant networks
*CR-LRP_IP dev br-ex scope link* # for the VM in tenant network redirection
*IP dev br-ex scope link* # IPs on provider or FIPs
- Adds a static ARP entry for the OVN router gateway ports (CR-LRP) so that the
traffic is steered to OVN via br-int -- this is because OVN does not reply
to ARP requests outside its L2 network:
.. code-block:: ini
$ ip neigh
...
CR-LRP_IP dev br-ex lladdr CR-LRP_MAC PERMANENT
...
- For IPv6, instead of the static ARP entry, an NDP proxy is added, same
reasoning:
.. code-block:: ini
$ ip -6 neigh add proxy CR-LRP_IP dev br-ex
- Finally, in order to properly send the traffic out from the OVN overlay
to kernel networking to be sent out of the node, the OVN BGP Agent needs
to add a new flow at the OVS provider bridges so that the destination MAC
address is changed to the MAC address of the OVS provider bridge
(``actions=mod_dl_dst:OVN_PROVIDER_BRIDGE_MAC,NORMAL``):
.. code-block:: ini
$ sudo ovs-ofctl dump-flows br-ex
cookie=0x3e7, duration=77.949s, table=0, n_packets=0, n_bytes=0, priority=900,ip,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
cookie=0x3e7, duration=77.937s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
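The redirection steps above can be sketched in Python as command and flow
builders. This is only an illustration under stated assumptions: the function
names ``redirect_commands`` and ``mac_rewrite_flow`` are hypothetical, the
agent itself may program the kernel and OVS through other means, and using a
port name in ``in_port`` assumes a reasonably recent ``ovs-ofctl``. The rule
priority, cookie, flow priority, and bridge name are taken from the sample
outputs above.

.. code-block:: python

    import ipaddress


    def redirect_commands(ip, bridge="br-ex", rule_priority=32000):
        """Build the ip rule/route argv lists for one exposed host IP.

        Hypothetical helper: it mirrors the sample ``ip rule`` and
        ``ip route show table br-ex`` outputs, not the agent's own code.
        """
        addr = ipaddress.ip_address(ip)
        family = "-4" if addr.version == 4 else "-6"
        host = f"{addr}/{addr.max_prefixlen}"
        return [
            # Send traffic for this IP to the bridge routing table.
            ["ip", family, "rule", "add", "to", host,
             "lookup", bridge, "priority", str(rule_priority)],
            # Inside that table, route the IP through the bridge device.
            ["ip", family, "route", "add", str(addr),
             "dev", bridge, "scope", "link", "table", bridge],
        ]


    def mac_rewrite_flow(patch_port, bridge_mac, ipv6=False,
                         cookie="0x3e7", priority=900):
        """Build the flow that rewrites the destination MAC on egress."""
        proto = "ipv6" if ipv6 else "ip"
        return (f"cookie={cookie},priority={priority},{proto},"
                f"in_port={patch_port},"
                f"actions=mod_dl_dst:{bridge_mac},NORMAL")


    for cmd in redirect_commands("172.24.4.226"):
        print(" ".join(cmd))
    print(mac_rewrite_flow("patch-provnet-1", "3a:f7:e9:54:e8:4d"))

The flow string is in the form accepted by ``ovs-ofctl add-flow BRIDGE FLOW``;
one flow per IP family is needed, matching the two entries in the dump above.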


@ -0,0 +1,310 @@
.. _bgp_driver:
===================================================================
[SB DB] OVN BGP Agent: Design of the BGP Driver with kernel routing
===================================================================
Purpose
-------
The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VMs) and load balancer (LBs) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or are booted or created on a provider network. The same functionality
is available on project networks, when a special flag is set.
This document presents the design decision behind the BGP Driver for the
Networking OVN BGP agent.
Overview
--------
With the growing popularity of virtualized and containerized workloads,
it is common to use pure Layer 3 spine and leaf network deployments in data
centers. This practice reduces scaling complexity, shrinks failure domains,
and limits broadcast traffic.
The southbound OVN BGP agent is a Python-based daemon that runs on each
OpenStack Controller and Compute node.
The agent monitors the Open Virtual Network (OVN) southbound database
for certain VM and floating IP (FIP) events.
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
to advertise the IP address or FIP associated with the VM.
The agent also triggers actions that route the external traffic to the OVN
overlay.
Because the agent uses a multi-driver implementation, you can configure the
agent for the specific infrastructure that runs on top of OVN, such as OSP or
Kubernetes and OpenShift.
.. note::
The agent is only intended for the N/S traffic. The E/W traffic works
exactly the same as before, i.e., VMs are connected through Geneve
tunnels.
This design simplicity enables the agent to implement different drivers,
depending on what OVN SB DB events are being watched (watchers examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (drivers examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).
A driver implements the support for BGP capabilities. It ensures that both VMs
and LBs on provider networks or associated floating IPs are exposed through BGP.
In addition, VMs on tenant networks can be also exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.
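The ``address_scopes`` filtering described above can be sketched as follows.
This is a minimal illustration, not the driver's implementation: the function
name ``exposed_tenant_networks`` and the mapping of network names to address
scope IDs are assumptions made for the example.

.. code-block:: python

    def exposed_tenant_networks(networks, address_scopes=None):
        """Return the names of the tenant networks that would be exposed.

        ``networks`` maps a network name to its address scope ID (or None).
        Hypothetical helper for illustration only.
        """
        if not address_scopes:
            # No filter configured: every tenant network is exposed.
            return sorted(networks)
        allowed = set(address_scopes)
        return sorted(
            name for name, scope in networks.items() if scope in allowed
        )


    networks = {"net-a": "scope-1", "net-b": "scope-2", "net-c": None}
    print(exposed_tenant_networks(networks))
    print(exposed_tenant_networks(networks, ["scope-1"]))

With no filter all three networks are returned, while configuring
``scope-1`` leaves only ``net-a``.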
A common driver API is defined exposing these methods:
- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
OVN ports.
- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
through another node when the VM or pods are running on a different node.
For example, use for VMs on tenant networks where the traffic needs to be
injected through the OVN router gateway port.
- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets
through the local node.
Proposed Solution
-----------------
To support BGP functionality the OVN BGP Agent includes a driver
that performs the extra steps required for exposing the IPs through BGP on
the correct nodes and steering the traffic to/from the node from/to the OVN
overlay. To configure the OVN BGP agent to use the BGP driver set the
``driver`` configuration option in the ``bgp-agent.conf`` file to
``ovn_bgp_driver``.
The BGP driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events related to
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.
Then, the agent performs some actions in order to ensure those VMs are
reachable through BGP:
- Traffic between nodes or BGP Advertisement: These are the actions needed to
expose the BGP routes and make sure all the nodes know how to reach the
VM/LB IP on the nodes.
- Traffic within a node or redirecting traffic to/from OVN overlay: These are
the actions needed to redirect the traffic to/from a VM to the OVN Neutron
networks, when traffic reaches the node where the VM is or in their way
out of the node.
The code for the BGP driver is located at
``ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/bgp_watcher.py``.
OVN SB DB Events
~~~~~~~~~~~~~~~~
The watcher associated with the BGP driver detects the relevant events on the
OVN SB DB to call the driver functions to configure BGP and linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:
- VMs or LBs created/deleted on provider networks
- FIPs association/disassociation to VMs or LBs
- VMs or LBs created/deleted on tenant networks (if the
``expose_tenant_networks`` configuration option is enabled, or if
``expose_ipv6_gua_tenant_networks`` is enabled to expose only IPv6 GUA ranges)
.. note::
If the ``expose_tenant_networks`` flag is enabled, the status of
``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant IPs
are advertised.
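The precedence between the two flags can be sketched in Python as below. The
function name ``tenant_ip_is_exposed`` is hypothetical, and using
``ipaddress``'s ``is_global`` property as the test for an IPv6 GUA is an
assumption made for this example rather than a statement about how the agent
decides.

.. code-block:: python

    import ipaddress


    def tenant_ip_is_exposed(ip, expose_tenant_networks=False,
                             expose_ipv6_gua_tenant_networks=False):
        """Decide whether a tenant IP would be advertised (illustrative)."""
        if expose_tenant_networks:
            # This flag wins: every tenant IP is advertised.
            return True
        if expose_ipv6_gua_tenant_networks:
            addr = ipaddress.ip_address(ip)
            # Assumption: treat globally routable IPv6 addresses as GUA.
            return addr.version == 6 and addr.is_global
        return False


    print(tenant_ip_is_exposed("10.0.0.5", expose_tenant_networks=True))
    print(tenant_ip_is_exposed("10.0.0.5",
                               expose_ipv6_gua_tenant_networks=True))

The first call returns ``True`` because all tenant IPs are advertised, while
the second returns ``False`` because the address is not IPv6.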
It creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, that all the events
watched for BGP use as the base (inherit from).
The BGP watcher reacts to the following events:
- ``PortBindingChassisCreatedEvent``: Detects when a port of type
``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
attached to the OVN chassis where the agent is running. This is the case for
VM or amphora LB ports on the provider networks, VM or amphora LB ports on
tenant networks with a FIP associated, and neutron gateway router ports
(CR-LRPs). It calls ``expose_ip`` driver method to perform the needed
actions to expose it.
- ``PortBindingChassisDeletedEvent``: Detects when a port of type
``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
detached from the OVN chassis where the agent is running. This is the case
for VM or amphora LB ports on the provider networks, VM or amphora LB ports
on tenant networks with a FIP associated, and neutron gateway router ports
(CR-LRPs). It calls ``withdraw_ip`` driver method to perform the needed
actions to withdraw the exposed BGP route.
- ``FIPSetEvent``: Detects when a Port_Binding entry of type ``patch`` gets
its ``nat_addresses`` field updated (e.g., action related to FIPs NATing).
When true, and the associated VM port is on the local chassis, the event
is processed by the agent and the required IP rule gets created and its
IP is (BGP) exposed. It calls the ``expose_ip`` driver method, including
the associated_port information, to perform the required actions.
- ``FIPUnsetEvent``: Same as previous, but when the ``nat_addresses`` field gets
an IP deleted. It calls the ``withdraw_ip`` driver method to perform the
required actions.
- ``SubnetRouterAttachedEvent``: Detects when a Port_Binding entry of type
``patch`` port gets created. This means a subnet is attached to a router.
In the ``expose_tenant_networks``
case, if the chassis is the one having the cr-lrp port for that router where
the port is getting created, then the event is processed by the agent and the
needed actions (ip rules and routes, and ovs rules) for exposing the IPs on
that network are performed. This event calls the driver API
``expose_subnet``. The same happens if ``expose_ipv6_gua_tenant_networks``
is used, but then, the IPs are only exposed if they are IPv6 global.
- ``SubnetRouterDetachedEvent``: Same as ``SubnetRouterAttachedEvent``,
but for the deletion of the port. It calls ``withdraw_subnet``.
- ``TenantPortCreateEvent``: Detects when a port of type ``""`` (empty
double-quotes) or ``virtual`` gets updated. If that port is not on a
provider network, and the chassis where the event is processed has the
``LogicalRouterPort`` for the network and the OVN router gateway port where
the network is connected to, then the event is processed and the actions to
expose it through BGP are triggered. It calls the ``expose_remote_ip``
because in this case the IPs are exposed through the node with the OVN router
gateway port, instead of the node where the VM is located.
- ``TenantPortDeleteEvent``: Same as ``TenantPortCreateEvent``, but for
the deletion of the port. It calls ``withdraw_remote_ip``.
- ``OVNLBMemberUpdateEvent``: This event is required to handle the OVN load
balancers created on the provider networks. It detects when new datapaths
are added/removed to/from the ``Load_Balancer`` entries. This happens when
members are added/removed which triggers the addition/deletion of their
datapaths into the ``Load_Balancer`` table entry.
The event is only processed in the nodes with
the relevant OVN router gateway ports, because it is where it needs to get
exposed to be injected into OVN overlay.
``OVNLBMemberUpdateEvent`` calls ``expose_ovn_lb_on_provider`` only when the
second datapath is added. The first datapath belongs to the VIP for the
provider network, while the second one belongs to the load balancer member.
``OVNLBMemberUpdateEvent`` calls ``withdraw_ovn_lb_on_provider`` when the
second datapath is deleted, or the entire load balancer is deleted (event
type is ``ROW_DELETE``).
.. note::
All the load balancer members are expected to be connected through the same
router to the provider network.
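The decision taken by ``OVNLBMemberUpdateEvent`` can be summarised with the
following Python sketch. It is a simplified model under stated assumptions:
the function ``lb_member_action`` and its arguments are hypothetical, and
``ROW_DELETE`` is a local stand-in constant for the event type rather than an
import from the OVSDB library used by the agent.

.. code-block:: python

    ROW_UPDATE = "update"  # local stand-in for the event type constants
    ROW_DELETE = "delete"


    def lb_member_action(event_type, old_datapaths, new_datapaths):
        """Return the driver method name to call, or None to do nothing."""
        if event_type == ROW_DELETE:
            # The whole load balancer was deleted.
            return "withdraw_ovn_lb_on_provider"
        had_member = len(old_datapaths) >= 2
        has_member = len(new_datapaths) >= 2
        if not had_member and has_member:
            # Second datapath added: VIP datapath plus a member datapath.
            return "expose_ovn_lb_on_provider"
        if had_member and not has_member:
            # Back to only the VIP (provider network) datapath.
            return "withdraw_ovn_lb_on_provider"
        return None


    print(lb_member_action(ROW_UPDATE, ["provider"], ["provider", "tenant"]))
    print(lb_member_action(ROW_UPDATE, ["provider", "tenant"], ["provider"]))

Updates that do not cross the two-datapath threshold return ``None``, which
reflects that the VIP is exposed only once and withdrawn only once.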
Driver Logic
~~~~~~~~~~~~
The BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too -- although instead of directly
in the node they are created, through one of the network gateway chassis nodes.
The same happens with ``expose_ipv6_gua_tenant_networks`` but only for IPv6
GUA ranges. In addition, if the config option ``address_scopes`` is set, only
the tenant networks with matching corresponding ``address_scope`` will be
exposed.
To accomplish the network configuration and advertisement, the driver ensures:
- VM and LBs IPs can be advertised in a node where the traffic could be
injected into the OVN overlay, in this case either the node hosting the VM
or the node where the router gateway port is scheduled (see limitations
subsection).
- Once the traffic reaches the specific node, the traffic is redirected to the
OVN overlay by leveraging kernel networking.
.. include:: ../bgp_advertising.rst
.. include:: ../bgp_traffic_redirection.rst
Driver API
++++++++++
The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:
- ``expose_ip``: creates all the IP rules and routes, and OVS flows needed
to redirect the traffic to the OVN overlay. It also ensures FRR exposes
through BGP the required IP.
- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.
- ``expose_subnet``: add kernel networking configuration (IP rules and route)
to ensure traffic can go from the node to the OVN overlay, and vice versa,
for IPs within the tenant subnet CIDR.
- ``withdraw_subnet``: removes the above kernel networking configuration.
- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
hosting the OVN gateway port for the router where the VM is connected.
It ensures traffic destined to the VM IP arrives at this node by exposing
the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
the traffic is redirected to the OVN overlay once on the node.
- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
through BGP from the node.
The driver API implements these additional methods for OVN load balancers on
provider networks:
- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to ensure
traffic is forwarded from the node to the OVN overlay and to expose
the VIP through BGP.
- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop advertising
the load balancer VIP.
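To make the shape of this interface more concrete, the sketch below models it
as a Python abstract base class with a toy in-memory implementation. The
class names and the single-argument signatures are assumptions made for the
example; the actual signatures in ``driver_api.py`` may differ, and the load
balancer methods are omitted for brevity.

.. code-block:: python

    import abc


    class DriverSketch(abc.ABC):
        """Hypothetical stand-in for the interface in driver_api.py."""

        @abc.abstractmethod
        def expose_ip(self, ip): ...

        @abc.abstractmethod
        def withdraw_ip(self, ip): ...

        @abc.abstractmethod
        def expose_subnet(self, cidr): ...

        @abc.abstractmethod
        def withdraw_subnet(self, cidr): ...

        @abc.abstractmethod
        def expose_remote_ip(self, ip): ...

        @abc.abstractmethod
        def withdraw_remote_ip(self, ip): ...


    class InMemoryDriver(DriverSketch):
        """Toy driver that records state instead of touching the system."""

        def __init__(self):
            self.advertised = set()      # IPs announced through BGP
            self.routed_subnets = set()  # CIDRs with kernel routing set up

        def expose_ip(self, ip):
            self.advertised.add(ip)

        def withdraw_ip(self, ip):
            self.advertised.discard(ip)

        def expose_subnet(self, cidr):
            self.routed_subnets.add(cidr)

        def withdraw_subnet(self, cidr):
            self.routed_subnets.discard(cidr)

        def expose_remote_ip(self, ip):
            self.advertised.add(ip)

        def withdraw_remote_ip(self, ip):
            self.advertised.discard(ip)


    driver = InMemoryDriver()
    driver.expose_subnet("10.0.0.0/24")
    driver.expose_remote_ip("10.0.0.5")
    print(sorted(driver.advertised), sorted(driver.routed_subnets))

Keeping the interface abstract is what lets a watcher call the same methods
regardless of which driver is configured.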
.. include:: ../agent_deployment.rst
Limitations
-----------
The following limitations apply:
- There is no API to decide what to expose, all VMs/LBs on providers or with
floating IPs associated with them will get exposed. For the VMs in the tenant
networks, the flag ``address_scopes`` should be used for filtering what
subnets to expose -- which should be also used to ensure no overlapping IPs.
- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
using address scopes and subnet pools.
- Network traffic is steered by kernel routing (IP routes and rules), therefore
OVS-DPDK, where the kernel space is skipped, is not supported.
- Network traffic is steered by kernel routing (IP routes and rules), therefore
SR-IOV, where the hypervisor is skipped, is not supported.
- In OpenStack with OVN networking the N/S traffic to the ovn-octavia VIPs on
the provider or the FIPs associated to the VIPs on tenant networks needs to
go through the networking nodes (the ones hosting the Neutron Router Gateway
Ports, i.e., the chassisredirect cr-lrp ports, for the router connecting the
load balancer members to the provider network). Therefore, the entry point
into the OVN overlay needs to be one of those networking nodes, and
consequently the VIPs (or FIPs to VIPs) are exposed through them. From those
nodes the traffic follows the normal tunneled path (Geneve tunnel) to
the OpenStack compute node where the selected member is located.


@ -12,9 +12,9 @@
''''''' Heading 4
(Avoid deeper levels because they do not render well.)
========================================
Design of OVN BGP Agent with EVPN Driver
========================================
=========================================================
Design of OVN BGP Agent with EVPN Driver (kernel routing)
=========================================================
Purpose
-------
@ -96,7 +96,7 @@ watcher detects it).
The overall architecture and integration between the ``networking-bgpvpn``
and the ``networking-bgp-ovn`` agent are shown in the next figure:
.. image:: ../../images/networking-bgpvpn_integration.png
.. image:: ../../../images/networking-bgpvpn_integration.png
:alt: integration components
:align: center
:width: 100%
@ -409,7 +409,7 @@ The next figure shows the N/S traffic flow through the VRF to the VM,
including information regarding the OVS flows on the provider bridge (br-ex),
and the routes on the VRF routing table.
.. image:: ../../images/evpn_traffic_flow.png
.. image:: ../../../images/evpn_traffic_flow.png
:alt: integration components
:align: center
:width: 100%


@ -0,0 +1,12 @@
==========================
BGP Drivers Documentation
==========================
.. toctree::
:maxdepth: 1
bgp_mode_design
nb_bgp_mode_design
ovn_bgp_mode_design
evpn_mode_design
bgp_mode_stretched_l2_design


@ -0,0 +1,386 @@
.. _nb_bgp_driver:
======================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with kernel routing
======================================================================
Purpose
-------
The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VMs) and load balancer (LBs) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or are booted or created on a provider network.
The same functionality is available on project networks, when a special
flag is set.
This document presents the design decisions behind the NB BGP Driver for
the Networking OVN BGP agent.
Overview
--------
With the growing popularity of virtualized and containerized workloads,
it is common to use pure Layer 3 spine and leaf network deployments in
data centers. This practice reduces scaling complexity, shrinks failure
domains, and limits broadcast traffic.
The northbound OVN BGP agent is a Python-based daemon that runs on each
OpenStack Controller and Compute node.
The agent monitors the Open Virtual Network (OVN) northbound database
for certain VM and floating IP (FIP) events.
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
to advertise the IP address or FIP associated with the VM.
The agent also triggers actions that route the external traffic to the OVN
overlay.
Unlike its predecessor, the (southbound) OVN BGP agent, the northbound OVN BGP
agent uses the northbound database API which is more stable than the southbound
database API because the former is isolated from internal changes to core OVN.
.. note::
   The northbound OVN BGP agent driver is only intended for N/S traffic.
   E/W traffic works exactly the same as before, i.e., VMs are
   connected through Geneve tunnels.
The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This design simplicity enables the agent to implement different drivers,
depending on what OVN NB DB events are being watched (watchers examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (drivers examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).
A driver implements the support for BGP capabilities. It ensures that both VMs
and LBs on provider networks or associated Floating IPs are exposed through
BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.
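The filtering rule described above can be sketched in a few lines. This is
only an illustration under stated assumptions: the function name and its
arguments are hypothetical and do not correspond to the agent's actual code,
and a network is assumed to carry a single address scope identifier.

.. code-block:: python

   def is_tenant_network_exposed(network_scope, configured_scopes):
       """Decide whether a tenant network is exposed (hypothetical helper).

       network_scope: address scope ID of the tenant network, or None.
       configured_scopes: values of the ``address_scopes`` option (may be empty).
       """
       if not configured_scopes:
           # No filter configured: all tenant networks are exposed.
           return True
       # Filter configured: only networks with a matching scope are exposed.
       return network_scope in configured_scopes


   print(is_tenant_network_exposed("scope-a", []))
   print(is_tenant_network_exposed("scope-b", ["scope-a"]))

With an empty ``address_scopes`` value the first call accepts the network,
while the second call rejects it because its scope is not in the list.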
A common driver API is defined exposing these methods:
- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
OVN ports.
- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
through another node when the VM or pods are running on a different node.
For example, use for VMs on tenant networks where the traffic needs to be
injected through the OVN router gateway port.
- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets through
the local node.
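As a rough illustration of the shape of this API, the six methods listed
above could be expressed as an abstract base class. The class names and the
method arguments (``ips``, ``subnet``, ``row``) are assumptions made for this
sketch; the real signatures are the ones defined in
``ovn_bgp_agent/drivers/driver_api.py``.

.. code-block:: python

   import abc


   class DriverAPISketch(abc.ABC):
       """Hypothetical mirror of the common driver API (not the real class)."""

       @abc.abstractmethod
       def expose_ip(self, ips, row):
           """Expose IPs of a local OVN port."""

       @abc.abstractmethod
       def withdraw_ip(self, ips, row):
           """Withdraw IPs of a local OVN port."""

       @abc.abstractmethod
       def expose_remote_ip(self, ips, row):
           """Expose IPs of a workload running on a different node."""

       @abc.abstractmethod
       def withdraw_remote_ip(self, ips, row):
           """Withdraw IPs of a workload running on a different node."""

       @abc.abstractmethod
       def expose_subnet(self, subnet, row):
           """Expose a subnet through the local node."""

       @abc.abstractmethod
       def withdraw_subnet(self, subnet, row):
           """Withdraw a subnet from the local node."""


   class RecordingDriver(DriverAPISketch):
       """Toy driver that records calls instead of configuring FRR or wiring."""

       def __init__(self):
           self.calls = []

       def _record(self, action, target):
           self.calls.append((action, target))

       def expose_ip(self, ips, row):
           self._record("expose_ip", tuple(ips))

       def withdraw_ip(self, ips, row):
           self._record("withdraw_ip", tuple(ips))

       def expose_remote_ip(self, ips, row):
           self._record("expose_remote_ip", tuple(ips))

       def withdraw_remote_ip(self, ips, row):
           self._record("withdraw_remote_ip", tuple(ips))

       def expose_subnet(self, subnet, row):
           self._record("expose_subnet", subnet)

       def withdraw_subnet(self, subnet, row):
           self._record("withdraw_subnet", subnet)

A concrete driver only needs to provide these methods. The watcher can then
call them without knowing how the wiring is implemented.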
Proposed Solution
-----------------
To support BGP functionality the NB OVN BGP Agent includes a new driver
that performs the steps required for exposing the IPs through BGP on
the correct nodes and steering the traffic to/from the node from/to the OVN
overlay.
To configure the OVN BGP agent to use the northbound OVN BGP driver, in the
``bgp-agent.conf`` file, set the value of ``driver`` to ``nb_ovn_bgp_driver``.
This driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events on the
``Logical_Switch_Port``, ``Logical_Router_Port`` and ``Load_Balancer``
OVN NB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.
Then, the agent performs these actions to ensure the VMs are reachable through
BGP:
- Traffic between nodes or BGP Advertisement: These are the actions needed to
expose the BGP routes and make sure all the nodes know how to reach the
VM/LB IP on the nodes. This is exactly the same as in the initial OVN BGP
Driver (see :ref:`bgp_driver`)
- Traffic within a node or redirecting traffic to/from OVN overlay (wiring):
  These are the actions needed to redirect the traffic to/from a VM to the OVN
  neutron networks, when traffic reaches the node where the VM is or is on its
  way out of the node.
The code for the NB BGP driver is located at
``ovn_bgp_agent/drivers/openstack/nb_ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/nb_bgp_watcher.py``.
Note this new driver also allows different ways of wiring the node to the OVN
overlay. These are configurable through the option ``exposing_method``, where
for now you can select:
- ``underlay``: using kernel routing (what we describe in this document), same
as supported by the driver at :ref:`bgp_driver`.
- ``ovn``: using an extra OVN cluster per node to perform the routing at
  OVN/OVS level instead of kernel, therefore enabling datapath acceleration
  (Hardware Offloading and OVS-DPDK). More information about this mechanism
  at :ref:`ovn_routing`.
OVN NB DB Events
~~~~~~~~~~~~~~~~
The watcher associated with the BGP driver detects the relevant events on the
OVN NB DB to call the driver functions to configure BGP and linux kernel
networking accordingly.
.. note::
   Linux kernel networking is used with the default ``exposing_method``
   (``underlay``). If ``ovn`` is selected, OVN routing replaces kernel
   routing. For more details on this see :ref:`ovn_routing`.
The following events are watched and handled by the BGP watcher:
- VMs or LBs created/deleted on provider networks
- FIPs association/disassociation to VMs or LBs
- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or if
  ``expose_ipv6_gua_tenant_networks`` is enabled to expose only IPv6 GUA
  ranges)
.. note::
   If the ``expose_tenant_networks`` flag is enabled, the value of
   ``expose_ipv6_gua_tenant_networks`` is irrelevant, as all the tenant IPs
   are advertised.
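The interaction between the two flags can be sketched as follows. This is a
simplified illustration, not the agent's implementation: the function name is
hypothetical, and an IPv6 GUA is assumed here to be any address within
``2000::/3``, which may differ from the check the agent actually performs.

.. code-block:: python

   import ipaddress

   # Assumption for this sketch: global unicast addresses are those in 2000::/3.
   GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


   def should_expose_tenant_ip(ip, expose_tenant_networks,
                               expose_ipv6_gua_tenant_networks):
       """Return True if a tenant network IP would be advertised."""
       if expose_tenant_networks:
           # All tenant IPs are advertised, whatever the GUA flag says.
           return True
       if expose_ipv6_gua_tenant_networks:
           addr = ipaddress.ip_address(ip)
           return addr.version == 6 and addr in GLOBAL_UNICAST
       return False


   print(should_expose_tenant_ip("10.0.0.5", True, False))
   print(should_expose_tenant_ip("fd00::1", False, True))

The first call advertises an IPv4 tenant address because
``expose_tenant_networks`` is set. The second rejects a unique local IPv6
address because only GUA ranges are allowed in that mode.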
The NB BGP watcher reacts to the following events:
- ``Logical_Switch_Port``
- ``Logical_Router_Port``
- ``Load_Balancer``
Besides the previously existing ``OVNLBEvent`` class, the NB BGP watcher has
new event classes named ``LSPChassisEvent`` and ``LRPChassisEvent`` that
all the events watched for NB BGP driver use as the base (inherit from).
The specific defined events to react to are:
- ``LogicalSwitchPortProviderCreateEvent``: Detects when a VM or an amphora LB
  port, logical switch ports of type ``""`` (empty double-quotes) or
``virtual``, comes up or gets attached to the OVN chassis where the agent is
running. If the ports are on a provider network, then the driver calls the
``expose_ip`` driver method to perform the needed actions to expose the port
(wire and advertise). If the port is on a tenant network, the driver
dismisses the event.
- ``LogicalSwitchPortProviderDeleteEvent``: Detects when a VM or an amphora LB
  port, logical switch ports of type ``""`` (empty double-quotes) or ``virtual``,
goes down or gets detached from the OVN chassis where the agent is running.
If the ports are on a provider network, then the driver calls the
``withdraw_ip`` driver method to perform the needed actions to withdraw the
port (withdraw and unwire). If the port is on a tenant network, the driver
dismisses the event.
- ``LogicalSwitchPortFIPCreateEvent``: Similar to
  ``LogicalSwitchPortProviderCreateEvent`` but focusing on the changes on the
  FIP information on the Logical Switch Port external_ids.
  It calls the ``expose_fip`` driver method to perform the needed actions to
  expose the floating IP (wire and advertise).
- ``LogicalSwitchPortFIPDeleteEvent``: Same as the previous one but for
  withdrawing FIPs. It is similar to ``LogicalSwitchPortProviderDeleteEvent``
  but instead calls the ``withdraw_fip`` driver method to perform the needed
  actions to withdraw the floating IP (withdraw and unwire).
- ``LocalnetCreateDeleteEvent``: Detects creation/deletion of OVN localnet
ports, which indicates the creation/deletion of provider networks. This
triggers a resync (``sync`` method) action to perform the base configuration
needed for the provider networks, such as OVS flows or arp/ndp
configurations.
- ``ChassisRedirectCreateEvent``: Similar to
``LogicalSwitchPortProviderCreateEvent`` but with the focus on logical router
ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
ports. The driver calls ``expose_ip`` which performs additional steps to also
expose IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires OVN 23.09 or later.
- ``ChassisRedirectDeleteEvent``: Similar to
``LogicalSwitchPortProviderDeleteEvent`` but with the focus on logical router
ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
ports. The driver calls ``withdraw_ip`` which performs additional steps to
also withdraw IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires OVN 23.09 or later.
- ``LogicalSwitchPortSubnetAttachEvent``: Detects Logical Switch Ports of type
``router`` (connecting Logical Switch to Logical Router) and checks if the
associated router is associated to the local chassis, i.e., if the CR-LRP of
the router is located in the local chassis. If that is the case, the
``expose_subnet`` driver method is called which is in charge of the wiring
needed for the IPs on that subnet (set of IP routes and rules).
- ``LogicalSwitchPortSubnetDetachEvent``: Similar to
  ``LogicalSwitchPortSubnetAttachEvent`` but for unwiring the subnet, so it
  calls the ``withdraw_subnet`` driver method.
- ``LogicalSwitchPortTenantCreateEvent``: Detects when a logical switch port
  of type ``""`` (empty double-quotes) or ``virtual`` comes up, similar to
  ``LogicalSwitchPortProviderCreateEvent``. It checks if the network associated
to the VM is exposed in the local chassis (meaning its cr-lrp is also local).
If that is the case, it calls ``expose_remote_ip``, which manages the
advertising of the IP -- there is no need for wiring, as that is done when
the subnet is exposed by ``LogicalSwitchPortSubnetAttachEvent`` event.
- ``LogicalSwitchPortTenantDeleteEvent``: Similar to
  ``LogicalSwitchPortTenantCreateEvent`` but for withdrawing IPs.
  It calls ``withdraw_remote_ip``.
- ``OVNLBCreateEvent``: Detects Load_Balancer events and processes them only
if the Load_Balancer entry has associated VIPs and the router is local to
the chassis.
If the VIP or router is added to a provider network, the driver calls
``expose_ovn_lb_vip`` to expose and wire the VIP or router.
If the VIP or router is added to a tenant network, the driver calls
``expose_ovn_lb_vip`` to only expose the VIP or router.
If a floating IP is added, then the driver calls ``expose_ovn_lb_fip`` to
expose and wire the FIP.
- ``OVNLBDeleteEvent``: If the VIP or router is removed from a provider
network, the driver calls ``withdraw_ovn_lb_vip`` to withdraw and unwire
  the VIP or router. If the VIP or router is removed from a tenant network,
the driver calls ``withdraw_ovn_lb_vip`` to only withdraw the VIP or router.
If a floating IP is removed, then the driver calls ``withdraw_ovn_lb_fip``
to withdraw and unwire the FIP.
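To make the matching logic of these events more concrete, the following
sketch mimics the decision taken for a
``LogicalSwitchPortProviderCreateEvent``. It is a simplified model under
stated assumptions: ``PortSketch`` and ``handle_provider_port_up`` are
hypothetical names, the real watcher works on OVN NB DB rows, and the driver
call is only recorded in a list here.

.. code-block:: python

   from dataclasses import dataclass

   # Port types handled by the provider create/delete events.
   EXPOSABLE_PORT_TYPES = ("", "virtual")


   @dataclass
   class PortSketch:
       """Hypothetical, flattened view of a Logical_Switch_Port row."""
       name: str
       port_type: str
       chassis: str
       on_provider_network: bool
       ip: str


   def handle_provider_port_up(port, local_chassis, driver_calls):
       """Record an ``expose_ip`` call if the event applies to this port."""
       if port.port_type not in EXPOSABLE_PORT_TYPES:
           return False
       if port.chassis != local_chassis:
           # The port is bound to another node; nothing to do locally.
           return False
       if not port.on_provider_network:
           # Ports on tenant networks are dismissed by this event.
           return False
       driver_calls.append(("expose_ip", port.ip))
       return True


   calls = []
   vm = PortSketch("vm-1", "", "node-1", True, "172.16.0.10")
   handle_provider_port_up(vm, "node-1", calls)
   print(calls)

The delete event follows the same checks and would record ``withdraw_ip``
instead.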
Driver Logic
~~~~~~~~~~~~
The NB BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too. They are reached through one
of the network gateway chassis nodes rather than directly through the node
where they are created.
The same happens with ``expose_ipv6_gua_tenant_networks`` but only for IPv6
GUA ranges. In addition, if the config option ``address_scopes`` is set, only
the tenant networks with matching corresponding ``address_scope`` will be
exposed.
.. note::
   To be able to expose tenant networks, OVN 23.09 or newer is needed.
To accomplish the network configuration and advertisement, the driver ensures:
- VM and LBs IPs can be advertised in a node where the traffic can be injected
into the OVN overlay: either in the node that hosts the VM or in the node
  where the router gateway port is scheduled. (See the "limitations"
  subsection.)
- After the traffic reaches the specific node, kernel networking redirects the
traffic to the OVN overlay, if the default ``underlay`` exposing method is
used.
.. include:: ../bgp_advertising.rst
.. include:: ../bgp_traffic_redirection.rst
Driver API
++++++++++
The NB BGP driver implements the ``driver_api.py`` interface with the
following functions:
- ``expose_ip``: creates all the IP rules and routes, and OVS flows needed
to redirect the traffic to OVN overlay. It also ensures that FRR exposes
the required IP by using BGP.
- ``withdraw_ip``: removes the configuration (IP rules/routes, OVS flows)
from ``expose_ip`` method to withdraw the exposed IP.
- ``expose_subnet``: adds kernel networking configuration (IP rules and route)
to ensure traffic can go from the node to the OVN overlay (and back)
for IPs within the tenant subnet CIDR.
- ``withdraw_subnet``: removes kernel networking configuration added by
``expose_subnet``.
- ``expose_remote_ip``: exposes VM tenant network IPs through BGP on the
  chassis hosting the OVN gateway port for the router where the VM is connected.
It ensures traffic directed to the VM IP arrives at this node by exposing
the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
the traffic is redirected to the OVN overlay after it arrives on the node.
- ``withdraw_remote_ip``: removes the configuration added by
``expose_remote_ip``.
And in addition, the driver also implements extra methods for the FIPs and the
OVN load balancers:
- ``expose_fip`` and ``withdraw_fip`` which are equivalent to ``expose_ip`` and
``withdraw_ip`` but for FIPs.
- ``expose_ovn_lb_vip``: adds kernel networking configuration to ensure
traffic is forwarded from the node with the associated cr-lrp to the OVN
overlay, as well as to expose the VIP through BGP in that node.
- ``withdraw_ovn_lb_vip``: removes the above steps to stop advertising
the load balancer VIP.
- ``expose_ovn_lb_fip`` and ``withdraw_ovn_lb_fip``: for exposing the FIPs
associated to ovn loadbalancers. This is similar to
``expose_fip/withdraw_fip`` but taking into account that it must be exposed
on the node with the cr-lrp for the router associated to the loadbalancer.
.. include:: ../agent_deployment.rst
Limitations
-----------
The following limitations apply:
- OVN 23.09 or later is needed to support exposing tenant networks IPs and
OVN loadbalancers.
- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with floating IPs associated with them are exposed. For the VMs in the
  tenant networks, use the flag ``address_scopes`` to filter which subnets to
  expose, which also helps prevent overlapping IPs.
- In the currently implemented exposing methods (``underlay`` and
``ovn``) there is no support for overlapping CIDRs, so this must be
avoided, e.g., by using address scopes and subnet pools.
- For the default exposing method (``underlay``) the network traffic is steered
by kernel routing (ip routes and rules), therefore OVS-DPDK, where the kernel
space is skipped, is not supported. With the ``ovn`` exposing method
  the routing is done at OVN level, so this limitation does not exist.
More details in :ref:`ovn_routing`.
- For the default exposing method (``underlay``) the network traffic is steered
by kernel routing (ip routes and rules), therefore SRIOV, where the hypervisor
is skipped, is not supported. With the ``ovn`` exposing method
  the routing is done at OVN level, so this limitation does not exist.
More details in :ref:`ovn_routing`.
- In OpenStack with OVN networking the N/S traffic to the ovn-octavia VIPs on
the provider or the FIPs associated with the VIPs on tenant networks needs to
go through the networking nodes (the ones hosting the Neutron Router Gateway
Ports, i.e., the chassisredirect cr-lrp ports, for the router connecting the
load balancer members to the provider network). Therefore, the entry point
into the OVN overlay needs to be one of those networking nodes, and
consequently the VIPs (or FIPs to VIPs) are exposed through them. From those
nodes the traffic will follow the normal tunneled path (Geneve tunnel) to
the OpenStack compute node where the selected member is located.


@ -0,0 +1,265 @@
.. _ovn_routing:
===================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with OVN routing
===================================================================
This is an extension of the NB OVN BGP Driver which adds a new
``exposing_method`` named ``ovn`` to make use of OVN routing, instead of
relying on Kernel routing.
Purpose
-------
The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
routing protocol when these IP addresses are either associated with a
floating IP (FIP) or belong to VMs or LBs created on a provider network.
The same functionality is available on project networks, when a special
flag is set.
This document presents the design decisions behind the extensions on the
NB OVN BGP Driver to support OVN routing instead of kernel routing,
therefore enabling datapath acceleration.
Overview
--------
The main goal is to make the BGP capabilities of OVN BGP Agent compliant with
OVS-DPDK and HWOL. To do that we need to move to OVN/OVS what the OVN BGP
Agent is currently doing with Kernel networking -- redirect traffic to/from
the OpenStack OVN Overlay.
To accomplish this goal, the following is required:
- Ensure that incoming traffic gets redirected from the physical NICs to the OVS
  integration bridge (br-int) through one or more OVS provider bridges (br-ex)
without using kernel routes and rules.
- Ensure the outgoing traffic gets redirected to the physical NICs without
using the default kernel routes.
- Expose the IPs in the same way as we did before.
The third point is simple as it is already being done, but for the first two
points OVN virtual routing capabilities are needed, ensuring the traffic gets
routed from the NICs to the OpenStack Overlay and vice versa.
Proposed Solution
-----------------
To avoid placing kernel networking in the middle of the datapath and blocking
acceleration, the proposed solution mandates locating a separate OVN cluster
on each node that manages the needed virtual infrastructure between the
OpenStack networking overlay and the physical network.
Because routing occurs at OVN/OVS level, this proposal makes it possible
to support hardware offloading (HWOL) and OVS-DPDK.
The next figure shows the proposed cluster required to manage the OVN virtual
networking infrastructure on each node.
.. image:: ../../../images/ovn-cluster-overview.png
:alt: OVN Routing integration
:align: center
:width: 100%
In a standard deployment ``br-int`` is directly connected to the OVS external
bridge (``br-ex``) where the physical NICs are attached.
By contrast, in the default BGP driver solution (see :ref:`nb_bgp_driver`),
the physical NICs are not directly attached to ``br-ex``. Instead, kernel
networking (ip routes and ip rules) redirects the traffic to ``br-ex``.
The OVN routing architecture proposes the following mapping:
- ``br-int`` connects to an external (from the OpenStack perspective) OVS bridge
(``br-osp``).
- ``br-osp`` does not have any physical resources attached, just patch
  ports connecting it to ``br-int`` and ``br-bgp``.
- ``br-bgp`` is the integration bridge managed by the extra OVN cluster
  deployed per node. This is where the virtual OVN resources are created
(routers and switches). It creates mappings to ``br-osp`` and ``br-ex``
(patch ports).
- ``br-ex`` remains the external bridge, where the physical NICs are
  attached (as in default environments without BGP). But instead of being
  directly connected to ``br-int``, it is connected to ``br-bgp``. Note that
  for ECMP purposes, each NIC is attached to a different ``br-ex`` device
  (``br-ex`` and ``br-ex-2``).
The following virtual OVN resources are required:
- Logical Router (``bgp-router``): manages the routing that was
previously done in the kernel networking layer between both networks
(physical and OpenStack OVN overlay). It has two connections (i.e., Logical
Router Ports) towards the ``bgp-ex-X`` Logical Switches to add support for ECMP
(only one switch is required but you must have several in case of ECMP),
and one connection to the ``bgp-osp`` Logical Switch to ensure traffic
to/from the OpenStack networking overlay.
- Logical Switch (``bgp-ex``): is connected to the ``bgp-router``, and has
a localnet to connect it to ``br-ex`` and therefore the physical NICs. There
is one Logical Switch per NIC (``bgp-ex`` and ``bgp-ex-2``).
- Logical Switch (``bgp-osp``): is connected to the ``bgp-router``, and has
a localnet to connect it to ``br-osp`` to enable it to send traffic to
and from the OpenStack OVN overlay.
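The mapping between NICs, OVS bridges and logical switches described above
can be summarized with a small planning helper. This is only a sketch: the
function is hypothetical, and the ``-N`` suffix for a third or later NIC is an
assumption extrapolated from the ``br-ex``/``br-ex-2`` and
``bgp-ex``/``bgp-ex-2`` names used in this document.

.. code-block:: python

   def plan_bgp_topology(nics):
       """Return the per-node resources implied by a list of external NICs."""

       def suffix(index):
           # The first device has no suffix; later ones get "-2", "-3", ...
           return "" if index == 0 else f"-{index + 1}"

       external = [
           {
               "nic": nic,
               "ovs_bridge": f"br-ex{suffix(i)}",
               "logical_switch": f"bgp-ex{suffix(i)}",
           }
           for i, nic in enumerate(nics)
       ]
       return {
           "logical_router": "bgp-router",
           "internal": {"ovs_bridge": "br-osp", "logical_switch": "bgp-osp"},
           "external": external,
           # One router port per external switch plus one towards bgp-osp.
           "router_ports": len(external) + 1,
       }


   plan = plan_bgp_topology(["eth1", "eth2"])
   print([entry["ovs_bridge"] for entry in plan["external"]])
   print(plan["router_ports"])

For the two-NIC ECMP case this yields the ``br-ex`` and ``br-ex-2`` bridges
and three Logical Router Ports on ``bgp-router``.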
The following OVS flows are required on both OVS bridges:
- ``br-ex-X`` bridges: require a flow to ensure only the traffic
  targeted for OpenStack provider networks is redirected to the OVN cluster.
.. code-block:: ini
cookie=0x3e7, duration=942003.114s, table=0, n_packets=1825, n_bytes=178850, priority=1000,ip,in_port=eth1,nw_dst=172.16.0.0/16 actions=mod_dl_dst:52:54:00:30:93:ea,output:"patch-bgp-ex-lo"
- ``br-osp`` bridge: requires a flow for each OpenStack provider network to
  replace the destination MAC with the one of the router port in the OVN
  cluster and to properly manage traffic that is routed to the OVN cluster.
.. code-block:: ini
cookie=0x3e7, duration=942011.971s, table=0, n_packets=8644, n_bytes=767152, priority=1000,ip,in_port="patch-provnet-0" actions=mod_dl_dst:40:44:00:00:00:06,NORMAL
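As an illustration of how the ``br-ex-X`` flow relates to the configuration,
the following sketch builds a flow specification string in ``ovs-ofctl
add-flow`` syntax from its inputs. The helper name and its parameters are
hypothetical, and the way the agent actually installs its flows may differ.
The cookie and priority defaults are taken from the flow dumps shown above.

.. code-block:: python

   import ipaddress


   def build_br_ex_flow(in_port, provider_prefix, router_port_mac, patch_port,
                        cookie=0x3E7, priority=1000):
       """Build the flow that redirects provider traffic to the OVN cluster."""
       prefix = ipaddress.ip_network(provider_prefix)
       if prefix.version != 4:
           # The ``ip`` match below only covers IPv4 traffic.
           raise ValueError("only IPv4 prefixes are handled in this sketch")
       return (
           f"cookie={cookie:#x},priority={priority},ip,in_port={in_port},"
           f"nw_dst={prefix},"
           f"actions=mod_dl_dst:{router_port_mac},output:{patch_port}"
       )


   print(build_br_ex_flow("eth1", "172.16.0.0/16",
                          "52:54:00:30:93:ea", "patch-bgp-ex-lo"))

Only traffic whose destination falls within the provider prefix matches, so
other traffic arriving on the NIC is not sent to the OVN cluster.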
OVN NB DB Events
~~~~~~~~~~~~~~~~
The OVN northbound database events that the driver monitors are the same as
the ones for the NB DB driver with the ``underlay`` exposing mode.
See :ref:`nb_bgp_driver`. The main difference between the two drivers is
that the wiring actions are simplified for the OVN routing driver.
Driver Logic
~~~~~~~~~~~~
As with the other BGP drivers or ``exposing modes`` (:ref:`bgp_driver`,
:ref:`nb_bgp_driver`) the NB DB Driver with the ``ovn`` exposing mode enabled
(i.e., enabling ``OVN routing`` instead of relying on ``Kernel networking``)
is in charge of exposing the IPs with BGP and of the networking configuration
to ensure that VMs and LBs on provider networks or with FIPs can be reached
through BGP (N/S traffic). Similarly, if the ``expose_tenant_networks`` flag
is enabled, VMs in tenant networks should be reachable too. They are reached
through one of the network gateway chassis nodes rather than directly through
the node where they are created. The same happens with
``expose_ipv6_gua_tenant_networks`` but only for IPv6 GUA ranges.
In addition, if the config option ``address_scopes`` is set, only the tenant
networks with a matching ``address_scope`` will be exposed.
To accomplish this, it needs to configure the extra per node ovn cluster to
ensure that:
- VM and LBs IPs can be advertised in a node where the traffic could be injected
into the OVN overlay through the extra ovn cluster (instead of the Kernel
routing) -- either in the node hosting the VM or the node where the router
gateway port is scheduled.
- Once the traffic reaches the specific node, the traffic is redirected to the
OVN overlay by using the extra ovn cluster per node with the proper OVN
configuration. To do this it needs to create Logical Switches, Logical
Routers and the routing configuration between them (routes and policies).
.. include:: ../bgp_advertising.rst
Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++
As explained before, the main idea of this exposing mode is to leverage OVN
routing instead of kernel routing. For outgoing traffic, the steps are as
follows:
- If the (OpenStack) OVN cluster knows the destination MAC, traffic is handled
  as in a deployment without BGP or OVN cluster support (no ARP is needed, the
  MAC is used directly). If the MAC is unknown but the destination is within
  the same provider network range(s), the ARP request is answered by the
  Logical Switch Port on the ``bgp-osp`` LS, thanks to ``arp_proxy`` being
  enabled on it. If the destination is in a different range, the router
  replies because it has default routes to the outside.
  The flow at ``br-osp`` is in charge of replacing the destination MAC with
  the one of the Logical Router Port on the ``bgp-router`` LR.
- The previous step takes the traffic to the extra OVN cluster per node, where
the default (ECMP) routes are used to send the traffic to the external
Logical Switch and from there to the physical nics attached to the external
  OVS bridge(s) (``br-ex``, ``br-ex-2``). If the MAC is known by OpenStack,
  a Logical Router Policy is applied instead of the default routes, so that
  traffic coming in through the internal LRP (the one connected to OpenStack)
  is forced out through the LRPs connected to the external LS.
And for the incoming traffic:
- The traffic hits the OVS flow added at the ``br-ex-X`` bridge(s), which
  redirects it to the per-node OVN cluster by replacing the destination MAC
  with the one of the related ``br-ex`` device, which is the same MAC used by
  the OVN cluster Logical Router Ports. This takes the traffic to the OVN
  router.
- After that, because ``arp_proxy`` is enabled on the LSP on ``bgp-osp``,
  the traffic is redirected there. Due to a limitation in ``arp_proxy``,
  an extra static MAC binding entry must be added in the cluster so that the
  VM MAC is used as the destination instead of the LSP's own MAC, which would
  lead to the traffic being dropped in the LS pipeline.
.. code-block:: ini
_uuid : 6e1626b3-832c-4ee6-9311-69ebc15cb14d
ip : "172.16.201.219"
logical_port : bgp-router-openstack
mac : "fa:16:3e:82:ee:19"
override_dynamic_mac: true
Driver API
++++++++++
This is the very same as in the NB DB driver with the ``underlay`` exposing
mode. See :ref:`nb_bgp_driver`.
Agent deployment
~~~~~~~~~~~~~~~~
The deployment is similar to the NB DB driver with the ``underlay`` exposing
method but with some extra configuration. See :ref:`nb_bgp_driver` for the base.
You need to set the exposing method in the DEFAULT section and add the extra
configuration for the local OVN cluster that performs the routing, including
the range for the provider networks to expose/handle:
.. code-block:: ini
[DEFAULT]
exposing_method=ovn
[local_ovn_cluster]
ovn_nb_connection=unix:/run/ovn/ovnnb_db.sock
ovn_sb_connection=unix:/run/ovn/ovnsb_db.sock
external_nics=eth1,eth2
peer_ips=100.64.1.5,100.65.1.5
provider_networks_pool_prefixes=172.16.0.0/16
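A quick way to sanity-check such a configuration before starting the agent is
to parse it and validate the list-valued options. The sketch below uses the
standard library ``configparser`` as a stand-in for the agent's own
configuration handling, and the ``load_local_cluster_settings`` helper is
hypothetical.

.. code-block:: python

   import configparser
   import ipaddress

   # Same options as in the example configuration above.
   SAMPLE_CONF = "\n".join([
       "[DEFAULT]",
       "exposing_method=ovn",
       "[local_ovn_cluster]",
       "ovn_nb_connection=unix:/run/ovn/ovnnb_db.sock",
       "ovn_sb_connection=unix:/run/ovn/ovnsb_db.sock",
       "external_nics=eth1,eth2",
       "peer_ips=100.64.1.5,100.65.1.5",
       "provider_networks_pool_prefixes=172.16.0.0/16",
   ])


   def _split(value):
       """Split a comma-separated option into a list of stripped items."""
       return [item.strip() for item in value.split(",") if item.strip()]


   def load_local_cluster_settings(text):
       """Parse and validate the options used by the ``ovn`` exposing method."""
       parser = configparser.ConfigParser()
       parser.read_string(text)
       method = parser.get("DEFAULT", "exposing_method", fallback="underlay")
       if method != "ovn":
           raise ValueError("exposing_method must be 'ovn' for this sketch")
       cluster = parser["local_ovn_cluster"]
       return {
           "external_nics": _split(cluster["external_nics"]),
           # ip_address/ip_network raise ValueError on malformed values.
           "peer_ips": [ipaddress.ip_address(ip)
                        for ip in _split(cluster["peer_ips"])],
           "provider_prefixes": [
               ipaddress.ip_network(prefix)
               for prefix in _split(cluster["provider_networks_pool_prefixes"])
           ],
       }


   settings = load_local_cluster_settings(SAMPLE_CONF)
   print(settings["external_nics"])

Malformed peer IPs or prefixes raise ``ValueError`` at parse time, which is
easier to diagnose than a wiring failure at runtime.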
Limitations
-----------
The following limitations apply:
- OVN 23.06 or later is needed.
- Tenant networks, subnets and OVN load balancers are not yet supported, and
  will require OVN 23.09 or later.
- IPv6 is not yet supported.
- ECMP does not work properly because there is no support for BFD at the
  ovn-cluster, which means that if one of the routes goes away the OVN cluster
  won't react to it and there will be traffic disruption.
- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
using address scopes and subnet pools.


@ -5,7 +5,7 @@
.. toctree::
:maxdepth: 2
bgp_mode_design
evpn_mode_design
bgp_mode_stretched_l2_design
bgp_supportability_matrix
drivers/index
agent_deployment
bgp_advertising
bgp_traffic_redirection


@ -10,10 +10,11 @@ Welcome to the documentation of OVN BGP Agent
Contents:
.. toctree::
:maxdepth: 2
:maxdepth: 3
readme
contributor/index
bgp_supportability_matrix
Indices and tables
==================