
Route domains will allow multiple L3 networks to self organize static routes. This allows additions of L3 networks with manually updating previously deployed networks. - Fix a YamlIngester error with labels on NetworkLinks - Update the CLI --block option - Add routedomains attribue to Networks in schema and object model - Add routedomain documentation - Add unit test for routedomain route generation - Add unit test coverage reporting Change-Id: I059d2eae6da84c4f9ad909f0287432e6cf0970d0
497 lines
18 KiB
ReStructuredText
497 lines
18 KiB
ReStructuredText
..
|
|
Copyright 2017 AT&T Intellectual Property.
|
|
All Rights Reserved.
|
|
|
|
Licensed under the Apache License, Version 2.0 (the "License"); you may
|
|
not use this file except in compliance with the License. You may obtain
|
|
a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
|
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
|
License for the specific language governing permissions and limitations
|
|
under the License.
|
|
|
|
.. _topology_label:
|
|
|
|
=======================
|
|
Authoring Site Topology
|
|
=======================
|
|
|
|
Drydock uses a YAML-formatted site topology definition to configure
|
|
downstream drivers to provision baremetal nodes. This topology describes
|
|
the networking configuration of a site as well as the set of node configurations
|
|
that will be deployed. A node configuration consists of network attachment,
|
|
network addressing, local storage, kernel selection and configuration and
|
|
metadata.
|
|
|
|
The best source for a sample of the YAML schema for a topology is the unit
|
|
test input `source </tests/yaml_samples/fullsite.yaml>`_ in
|
|
``./tests/yaml_samples/fullsite.yaml``.
|
|
|
|
Defining Networking
|
|
===================
|
|
|
|
Network definitions in the topology are described by two document types:
|
|
NetworkLink and Network. NetworkLink describes a physical or logical link
|
|
between a node and switch. It is concerned with attributes that must be agreed
|
|
upon by both endpoints: bonding, media speed, trunking, etc. A Network describes
|
|
the layer 2 and layer 3 networks accessible over a link.
|
|
|
|
Network Links
|
|
-------------
|
|
|
|
The NetworkLink document defines layer 1 and layer 2 attributes that should be
|
|
in-sync between the node and the switch. Each link can support a single untagged
|
|
VLAN and 0 or more tagged VLANs.
|
|
|
|
Example YAML schema of the NetworkLink spec:
|
|
|
|
.. code:: yaml
|
|
|
|
spec:
|
|
bonding:
|
|
mode: 802.3ad
|
|
hash: layer3+4
|
|
peer_rate: slow
|
|
mtu: 9000
|
|
linkspeed: auto
|
|
trunking:
|
|
mode: 802.1q
|
|
allowed_networks:
|
|
- public
|
|
- mgmt
|
|
|
|
``bonding`` describes combining multiple physical links into a single logical
|
|
link (aka LAG or link aggregation group).
|
|
|
|
* ``mode``: What bonding mode to configure
|
|
|
|
* ``disabled``: Do not configure a bond
|
|
* ``802.3ad``: Use 802.3ad dynamic aggregation (aka LACP)
|
|
* ``active-backup``: Use static active/standby bonding
|
|
* ``balanced-rr``: Use static round-robin bonding
|
|
|
|
For a ``mode`` of ``802.3ad`` the optional attributes below are available:
|
|
|
|
* ``hash``: The link selection hash. Supported values are ``layer3+4``,
|
|
``layer2+3``, ``layer2``. Default is ``layer3+4``
|
|
* ``peer_rate``: How frequently to send LACP control frames. Supported values
|
|
are ``fast`` and ``slow``. Default is ``fast``
|
|
* ``mon_rate``: Interval between checking link state in milliseconds.
|
|
Default is ``100``
|
|
* ``up_delay``: Delay in milliseconds between a link coming up and being marked
|
|
up in the bond. Must be greater than ``mon_rate``. Default is ``200``
|
|
* ``down_delay``: Delay in milliseconds between a link going down and being
|
|
marked down in the bond. Must be greater than ``mon_rate``.
|
|
Default is ``200``
|
|
|
|
``mtu`` is the maximum transmission unit for the link. It must be equal or
|
|
greater than the MTU of any VLAN interfaces using the link. Default is ``1500``.
|
|
|
|
``linkspeed`` is the physical layer speed and duplex. Recommended to always be
|
|
``auto``
|
|
|
|
``trunking`` describes how multiple layer 2 networks will be multiplexed on the
|
|
link.
|
|
|
|
* ``mode``: Can be ``disabled`` for no trunking or ``802.1q`` for standard
|
|
VLAN tagging
|
|
* ``default_network``: For ``mode: disabled``, this is the single network on
|
|
the link. For ``mode: 802.1q`` this is optionally the network accessed by
|
|
untagged frames.
|
|
|
|
``allowed_networks`` is a sequence of network names listing all networks allowed
|
|
on this link. Each Network can be listed on one and only one NetworkLink.
|
|
|
|
Network
|
|
-------
|
|
|
|
The Network document defines the layer 2 and layer 3 networks nodes will access.
|
|
Each Network is accessible over exactly one NetworkLink. However that
|
|
NetworkLink can be attached to different interfaces on different nodes to
|
|
support changing hardware configurations.
|
|
|
|
Example YAML schema of the Network spec:
|
|
|
|
.. code:: yaml
|
|
|
|
spec:
|
|
vlan: '102'
|
|
mtu: 1500
|
|
cidr: 172.16.3.0/24
|
|
routedomain: storage
|
|
ranges:
|
|
- type: static
|
|
start: 172.16.3.15
|
|
end: 172.16.3.200
|
|
- type: dhcp
|
|
start: 172.16.3.201
|
|
end: 172.16.3.254
|
|
routes:
|
|
- subnet: 0.0.0.0/0
|
|
gateway: 172.16.3.1
|
|
metric: 10
|
|
- gateawy: 172.16.3.2
|
|
metric: 10
|
|
routedomain: storage
|
|
dns:
|
|
domain: sitename.example.com
|
|
servers: 8.8.8.8
|
|
|
|
If a Network is accessible over a NetworkLink using 802.1q VLAN tagging, the
|
|
``vlan`` attribute specified the VLAN tag for this Network. It should be omitted
|
|
for non-tagged Networks.
|
|
|
|
``mtu`` is the maximum transmission unit for this Network. Must be equal or less
|
|
than the ``mtu`` defined for the hosting NetworkLink. Can be omitted to default
|
|
to the NetworkLink ``mtu``.
|
|
|
|
``cidr`` is the classless inter-domain routing address for the network.
|
|
|
|
``routedomain`` is a logical grouping of L3 networks such that a network that
|
|
describes a static route for accessing the route domain will yield a list of
|
|
static routes for all the networks in the routedomain. See the description
|
|
of ``routes`` below for more information.
|
|
|
|
``ranges`` defines a sequence of IP addresses within the defined ``cidr``.
|
|
Ranges cannot overlap.
|
|
|
|
* ``type``: The type of address range.
|
|
|
|
* ``static``: A range used for static, explicit address assignments for
|
|
nodes.
|
|
* ``dhcp``: A range used for assigning DHCP addresses. Note that a network
|
|
being used for PXE booting must have a DHCP range defined.
|
|
* ``reserved``: A range of addresses that will not be used by MaaS.
|
|
|
|
* ``start``: The starting IP of the range, inclusive.
|
|
* ``end``: The last IP of the range, inclusive
|
|
|
|
``routes`` defines a list of static routes to be configured on nodes attached to
|
|
this network. The routes can defined in one of two ways: an explicit destination
|
|
``subnet`` where the route will be configured exactly as described or a destination
|
|
``routedomain`` where Drydock will calculate all the destination L3 subnets for the
|
|
routedomain and add routes for each of them using the ``gateway`` and ``metric``
|
|
defined.
|
|
|
|
* ``subnet``: Destination CIDR for the route
|
|
* ``gateway``: The gateway IP on this Network to use for accessing the destination
|
|
* ``metric``: The metric or weight for this route
|
|
* ``routedomain``: Use this route's gateway and metric for accessing networks in the
|
|
defined routedomain.
|
|
|
|
``dns`` is used for specifying the list of DNS servers to use if this network
|
|
is the primary network for the node.
|
|
|
|
* ``servers``: A comma-separated list of IP addresses to use for DNS resolution
|
|
* ``domain``: A domain that can be used for automated registration of IP
|
|
addresses assigned from this Network
|
|
|
|
DHCP Relay
|
|
~~~~~~~~~~
|
|
|
|
DHCP relaying is used when a DHCP server is not attached to the same layer 2
|
|
broadcast domain as nodes that are being PXE booted. The DHCP requests from the
|
|
node are consumed by the relay (generally configured on a top-of-rack switch)
|
|
which then encapsulates the request in layer 3 routing and sends it to an
|
|
upstream DHCP server. The Network spec supports a ``dhcp_relay`` key for
|
|
Networks that should relay DHCP requests.
|
|
|
|
* The Network must have a configured DHCP relay, this is *not* configured by
|
|
Drydock or MaaS.
|
|
* The ``upstream_target`` IP address must be a host IP address for a MaaS rack
|
|
controller
|
|
* The Network must have a defined DHCP address range.
|
|
* The upstream target network must have a defined DHCP address range.
|
|
|
|
The ``dhcp_relay`` stanza:
|
|
|
|
.. code:: yaml
|
|
|
|
dhcp_relay:
|
|
upstream_target: 172.16.4.100
|
|
|
|
Defining Node Configuration
|
|
===========================
|
|
|
|
Node configuration is defined in three documents: ``HostProfile``,
|
|
``HardwareProfile`` and ``BaremetalNode``. ``HardwareProfile`` defines
|
|
attributes directly related to hardware configuration such as card-slot layout
|
|
and firmware levels. ``HostProfile`` is a generic definition for how a node
|
|
should be configured such that many nodes can reference a single ``HostProfile``
|
|
and each will be configured identically. A ``BaremetalNode`` is a concrete
|
|
reference to the particular physical node. The ``BaremetalNode`` definition will
|
|
reference a ``HostProfile`` and can then extend or override any of the
|
|
configuration values.
|
|
|
|
Example ``HostProfile`` and ``BaremetalNode`` configuration:
|
|
|
|
.. code:: yaml
|
|
|
|
---
|
|
apiVersion: 'drydock/v1'
|
|
kind: HostProfile
|
|
metadata:
|
|
name: defaults
|
|
region: sitename
|
|
date: 17-FEB-2017
|
|
author: sh8121@att.com
|
|
spec:
|
|
# configuration values
|
|
---
|
|
apiVersion: 'drydock/v1'
|
|
kind: HostProfile
|
|
metadata:
|
|
name: compute_node
|
|
region: sitename
|
|
date: 17-FEB-2017
|
|
author: sh8121@att.com
|
|
spec:
|
|
host_profile: defaults
|
|
# compute_node customizations to defaults
|
|
---
|
|
apiVersion: 'drydock/v1'
|
|
kind: BaremetalNode
|
|
metadata:
|
|
name: compute01
|
|
region: sitename
|
|
date: 17-FEB-2017
|
|
author: sh8121@att.com
|
|
spec:
|
|
host_profile: compute_node
|
|
# configuration customization specific to single node compute01
|
|
|
|
|
|
In the above example, the *compute_node* ``HostProfile`` adopts all values from
|
|
the *defaults* ``HostProfile`` and can then override defined values or append
|
|
additional values. ``BaremetalNode`` *compute01* then adopts all values from the
|
|
*compute_node* ``HostProfile`` (which includes all the configuration items it
|
|
adopted from *defaults*) and can then again override or append any
|
|
configuration that is specific to that node.
|
|
|
|
Defining Node Interfaces and Network Addressing
|
|
===============================================
|
|
|
|
Node network attachment can be described in a ``HostProfile`` or a
|
|
``BaremetalNode`` document. Node addressing is allowed only in a
|
|
``BaremetalNode`` document. If a ``HostProfile`` or ``BaremetalNode`` needs to
|
|
remove a defined interface from an inherited configuration, it can set the
|
|
mapping value for the interface name to ``null``.
|
|
|
|
Once the interface attachments to networks is defined, ``HostProfile`` and
|
|
``BaremetalNode`` specs must define a ``primary_network`` attribute to denote
|
|
which network the node should use as the primary route.
|
|
|
|
Interfaces
|
|
----------
|
|
|
|
Interfaces for a node can be described in either a ``HostProfile`` or
|
|
``BaremetalNode`` definition. This will attach a defined NetworkLink to a host
|
|
interface and define which Networks should be configured to use that interface.
|
|
|
|
Example interface definition YAML schema:
|
|
|
|
.. code:: yaml
|
|
|
|
interfaces:
|
|
pxe:
|
|
device_link: pxe
|
|
labels:
|
|
pxe: true
|
|
slaves:
|
|
- prim_nic01
|
|
networks:
|
|
- pxe
|
|
bond0:
|
|
device_link: gp
|
|
slaves:
|
|
- prim_nic01
|
|
- prim_nic02
|
|
networks:
|
|
- mgmt
|
|
- private
|
|
|
|
Each key in the interfaces mapping is a defined interface. The key is the name
|
|
that will be used on the deployed node for the interface. The value must be a
|
|
mapping defining the interface configuration or ``null`` to denote removal of
|
|
that interface for an inherited configuration.
|
|
|
|
* ``device_link``: The name of the defined NetworkLink that will be attached to
|
|
this interface. The NetworkLink definition includes part of the interface
|
|
configuration such as bonding.
|
|
* ``labels``: Metadata for describing this interface.
|
|
* ``slaves``: The list of hardware interfaces used for creating this interface.
|
|
This value can be a device alias defined in the HardwareProfile or the kernel
|
|
name of the hardware interface. For bonded interfaces, this would list all the
|
|
slaves. For non-bonded interfaces, this should list the single hardware
|
|
interface used.
|
|
* ``networks``: This is the list of networks to enable on this interface. If
|
|
multiple networks are listed, the NetworkLink attached to this interface must
|
|
have trunking enabled or the design validation will fail.
|
|
|
|
Addressing
|
|
----------
|
|
|
|
Addressing for a node can only be defined in a ``BaremetalNode`` definition. The
|
|
``addressing`` stanza simply defines a static IP address or ``dhcp`` for each
|
|
network a node should have a configured layer 3 interface on. It is a valid
|
|
design to omit networks from the ``addressing`` stanza, in that case the
|
|
interface attached to the omitted network will be configured as link up with no
|
|
address.
|
|
|
|
Example ``addressing`` YAML schema:
|
|
|
|
.. code:: yaml
|
|
|
|
addressing:
|
|
- network: pxe
|
|
address: dhcp
|
|
- network: mgmt
|
|
address: 172.16.1.21
|
|
- network: private
|
|
address: 172.16.2.21
|
|
- network: oob
|
|
address: 172.16.100.21
|
|
|
|
|
|
Defining Node Storage
|
|
=====================
|
|
|
|
Storage can be defined in the ``storage`` stanza of either a HostProfile or
|
|
BaremetalNode document. The storage configuration can describe the creation of
|
|
partitions on physical disks, the assignment of physical disks and/or partitions
|
|
to volume groups, and the creation of logical volumes. Drydock will make a best
|
|
effort to parse out system-level storage such as the root filesystem or boot
|
|
filesystem and take appropriate steps to configure them in the active node
|
|
provisioning driver. At a minimum, the storage configuration *must* contain
|
|
a root filesystem partition.
|
|
|
|
Example YAML schema of the ``storage`` stanza:
|
|
|
|
.. code:: yaml
|
|
|
|
storage:
|
|
physical_devices:
|
|
sda:
|
|
labels:
|
|
bootdrive: true
|
|
partitions:
|
|
- name: 'root'
|
|
size: '10g'
|
|
bootable: true
|
|
filesystem:
|
|
mountpoint: '/'
|
|
fstype: 'ext4'
|
|
mount_options: 'defaults'
|
|
- name: 'boot'
|
|
size: '1g'
|
|
filesystem:
|
|
mountpoint: '/boot'
|
|
fstype: 'ext4'
|
|
mount_options: 'defaults'
|
|
sdb:
|
|
volume_group: 'log_vg'
|
|
volume_groups:
|
|
log_vg:
|
|
logical_volumes:
|
|
- name: 'log_lv'
|
|
size: '500m'
|
|
filesystem:
|
|
mountpoint: '/var/log'
|
|
fstype: 'xfs'
|
|
mount_options: 'defaults'
|
|
|
|
Schema
|
|
------
|
|
|
|
The ``storage`` stanza can contain two top-level keys: ``physical_devices`` and
|
|
``volume_groups``. The latter is optional.
|
|
|
|
Physical Devices and Partitions
|
|
-------------------------------
|
|
|
|
A physical device can either be carved up in partitions (including a single
|
|
partition consuming the entire device) or added to a volume group as a physical
|
|
volume. Each key in the ``physical_devices`` mapping represents a device on a
|
|
node. The key should either be a device alias defined in the HardwareProfile or
|
|
the name of the device published by the OS. The value of each key must be a
|
|
mapping with the following keys
|
|
|
|
* ``labels``: A mapping of key/value strings providing generic labels for the
|
|
device
|
|
* ``partitions``: A sequence of mappings listing the partitions to be created on
|
|
the device. The mapping is described below. Incompatible with the
|
|
``volume_group`` specification.
|
|
* ``volume_group``: A volume group name to add the device to as a physical
|
|
volume. Incompatible with the ``partitions`` specification.
|
|
|
|
Partition
|
|
~~~~~~~~~
|
|
|
|
A partition mapping describes a GPT partition on a physical disk. It can be left
|
|
as a raw block device or formatted and mounted as a filesystem.
|
|
|
|
* ``name``: Metadata describing the partition in the topology
|
|
* ``size``: The size of the partition. See the *Size Format* section below
|
|
* ``bootable``: Boolean whether this partition should be the bootable device
|
|
* ``part_uuid``: A UUID4 formatted UUID to assign to the partition. If not
|
|
specified one will be generated
|
|
* ``filesystem``: An optional mapping describing how the partition should be
|
|
formatted and mounted
|
|
|
|
* ``mountpoint``: Where the filesystem should be mounted. If not specified
|
|
the partition will be left as a raw device
|
|
* ``fstype``: The format of the filesystem. Defaults to ext4
|
|
* ``mount_options``: fstab style mount options. Default is 'defaults'
|
|
* ``fs_uuid``: A UUID4 formatted UUID to assign to the filesystem. If not
|
|
specified one will be generated
|
|
* ``fs_label``: A filesystem label to assign to the filesystem. Optional.
|
|
|
|
Size Format
|
|
~~~~~~~~~~~
|
|
|
|
The size specification for a partition or logical volume is formed from three
|
|
parts:
|
|
|
|
* The first character can optionally be ``>`` indicating that the size specified
|
|
is a minimum and the calculated size should be at least the minimum and should
|
|
take the rest of the available space on the physical device or volume group.
|
|
* The second part is the numeric portion and must be an integer
|
|
* The third part is a label
|
|
|
|
* m|M|mb|MB: Megabytes or 10^6 * the numeric
|
|
* g|G|gb|GB: Gigabytes or 10^9 * the numeric
|
|
* t|T|tb|TB: Terabytes or 10^12 * the numeric
|
|
* %: The percentage of total device or volume group space
|
|
|
|
Volume Groups and Logical Volumes
|
|
---------------------------------
|
|
|
|
Logical volumes can be used to create RAID-0 volumes spanning multiple physical
|
|
disks or partitions. Each key in the ``volume_groups`` mapping is a name
|
|
assigned to a volume group. This name must be specified as the ``volume_group``
|
|
attribute on one or more physical devices or partitions or the configuration is
|
|
invalid. Each mapping value is another mapping describing the volume group.
|
|
|
|
* ``vg_uuid``: A UUID4 format uuid applied to the volume group. If not
|
|
specified, one is generated
|
|
* ``logical_volumes``: A sequence of mappings listing the logical volumes to be
|
|
created in the volume group
|
|
|
|
Logical Volume
|
|
~~~~~~~~~~~~~~
|
|
|
|
A logical volume is a RAID-0 volume. Using logical volumes for ``/`` and
|
|
``/boot`` is supported
|
|
|
|
* ``name``: Required field. Used as the logical volume name.
|
|
* ``size``: The logical volume size. See *Size Format* above for details.
|
|
* ``lv_uuid``: A UUID4 format uuid applied to the logical volume: If not
|
|
specified, one is generated
|
|
* ``filesystem``: A mapping specifying how the logical volume should be
|
|
formatted and mounted. See the *Partition* section above for filesystem
|
|
details.
|