Fix octavia-interface timeout

Added Restart=on-failure policy to octavia-interface systemd unit
Added octavia_interface_wait_timeout variable to control
TimeoutStartSec in octavia-interface systemd unit

Change-Id: I9de6c27131ce78e85aac56ea5d91d9740fd58354
Closes-Bug: 2067036
Victor Chembaev 2024-05-24 11:15:40 +03:00
parent 4ab17364e4
commit a3fcc07c7a
3 changed files with 32 additions and 0 deletions

@@ -7,6 +7,10 @@ After=docker.service
Type=oneshot
User=root
Group=root
Restart=on-failure
{% if octavia_interface_wait_timeout is defined %}
TimeoutStartSec={{ octavia_interface_wait_timeout }}
{% endif %}
RemainAfterExit=true
ExecStartPre=/sbin/ip link set dev {{ octavia_network_interface }} address {{ port_info.port.mac_address }}
ExecStart=/sbin/dhclient -v {{ octavia_network_interface }} -cf /etc/dhcp/octavia-dhclient.conf

@@ -437,6 +437,24 @@ Add ``octavia_network_type`` to ``globals.yml`` and set the value to ``tenant``
Next, follow the deployment instructions as normal.
Failure handling
----------------

On large deployments, where the neutron-openvswitch-agent sync can take
more than 5 minutes, the ``octavia-interface.service`` systemd unit may fail
because it times out before the ``o-hm0`` interface is attached to ``br-int``
or the Octavia management VXLAN is configured on that host. In this case, add
``octavia_interface_wait_timeout`` to ``globals.yml`` and set it to the new
timeout in seconds:

.. code-block:: yaml

   octavia_interface_wait_timeout: 1800

On deployments with up to 2500 network ports per network node, the sync
process can take up to 30 minutes. Choose this value according to your
deployment size.
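
As an illustration, assuming ``octavia_interface_wait_timeout: 1800`` is set
as in the example above, the ``[Service]`` section of the rendered
``octavia-interface.service`` unit would be expected to contain directives
along these lines. This is only a sketch derived from the unit template: the
``Exec*`` lines are omitted, and when the variable is not defined no
``TimeoutStartSec`` line is rendered, so the systemd default applies.

.. code-block:: ini

   [Service]
   Type=oneshot
   User=root
   Group=root
   Restart=on-failure
   # Rendered from octavia_interface_wait_timeout (value in seconds)
   TimeoutStartSec=1800
   RemainAfterExit=true
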
OVN provider
============

@@ -0,0 +1,10 @@
---
fixes:
  - |
    Fixes 2067036.
    Added ``octavia_interface_wait_timeout`` to control the
    ``octavia-interface.service`` start timeout, so that the unit can wait
    until the openvswitch agent sync has finished and
    ``octavia-lb-net`` is reachable from the host.
    Also set the restart policy for this unit to ``on-failure``.
    `LP#2067036 <https://launchpad.net/bugs/2067036>`__