nodepool/doc/source/configuration.rst
James E. Blair fd454706ca Add delete-after-upload option
This allows operators to delete large diskimage files after uploads
are complete, in order to save space.

A setting is also provided to keep certain formats, so that if
operators would like to delete large formats such as "raw" while
retaining a qcow2 copy (which, in an emergency, could be used to
inspect the image, or manually converted and uploaded for use),
that is possible.

Change-Id: I97ca3422044174f956d6c5c3c35c2dbba9b4cadf
2024-03-09 06:51:56 -08:00


Configuration

Nodepool reads its configuration from /etc/nodepool/nodepool.yaml by default. The configuration file follows the standard YAML syntax with a number of sections defined with top level keys. For example, a full configuration file may have the diskimages, labels, and providers sections:

diskimages:
  ...
labels:
  ...
providers:
  ...

The following drivers are available.

aws azure gce ibmvpc kubernetes openshift openshift-pods openstack static metastatic

The following sections are available. All are required unless otherwise indicated.

Options

webapp

Define the webapp endpoint port and listen address

port

The port to provide basic status information

listen_address

Listen address for web app
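
As a minimal sketch, a webapp section that sets both keys might look like the following. The port number and address are illustrative assumptions, not values stated in this document.

```yaml
webapp:
  # Illustrative values only; choose a port and address that suit the host.
  port: 8005
  listen_address: '0.0.0.0'
```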

elements-dir

If an image is configured to use diskimage-builder and glance to locally create and upload images, then a collection of diskimage-builder elements must be present. The elements-dir parameter indicates a directory that holds one or more elements.

images-dir

Images generated with diskimage-builder must be written to disk. The images-dir parameter names the directory in which to write them.

Note

The builder daemon creates a UUID to uniquely identify itself and to mark the image builds in ZooKeeper that it owns. The UUID is stored in a file named builder_id.txt in the directory named by the images-dir option. If this file does not exist, it is created on builder startup and a UUID is generated automatically.

build-log-dir

The builder will store build logs in this directory. It will create one file for each build, named <image>-<build-id>.log; for example, fedora-0000000004.log. It defaults to /var/log/nodepool/builds.

build-log-retention

At the start of each build, the builder will remove old build logs if they exceed this value. This option specifies how many will be kept (usually you will see one more, as deletion happens before starting a new build). By default, the last 7 old build logs are kept. Set this to -1 to disable removal of logs.
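
The builder-related options above are all top-level keys, so they could be combined as in the sketch below. The elements-dir and images-dir paths are hypothetical; the build-log-dir and build-log-retention values are simply the defaults described above written out explicitly.

```yaml
# Hypothetical locations for elements and built images.
elements-dir: /etc/nodepool/elements
images-dir: /opt/nodepool_dib
# Defaults, shown explicitly for clarity.
build-log-dir: /var/log/nodepool/builds
build-log-retention: 7
```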

zookeeper-servers

Lists the ZooKeeper servers used for coordinating information between nodepool workers.

zookeeper-servers:
  - host: zk1.example.com
    port: 2181
    chroot: /nodepool

Each entry is a dictionary with the following keys

host

A ZooKeeper host

port

Port to talk to ZooKeeper

chroot

The root path under which ZooKeeper paths are interpreted; all paths are treated as relative to it. This key is optional and has no default.

zookeeper-tls

To use TLS connections with ZooKeeper, provide this dictionary with the following keys:

cert

The path to the PEM encoded certificate.

key

The path to the PEM encoded key.

ca

The path to the PEM encoded CA certificate.

zookeeper-timeout

The ZooKeeper session timeout, in seconds.
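
A sketch of a TLS-enabled ZooKeeper configuration follows. The certificate paths, the port number, and the timeout value are assumptions chosen for illustration; substitute the values used by your own ZooKeeper deployment.

```yaml
zookeeper-servers:
  - host: zk1.example.com
    # Assumed TLS port; use whatever port your servers listen on.
    port: 2281
zookeeper-tls:
  # Hypothetical paths to PEM-encoded files.
  cert: /etc/nodepool/certs/client.pem
  key: /etc/nodepool/certs/client.key
  ca: /etc/nodepool/certs/ca.pem
# Session timeout in seconds (illustrative value).
zookeeper-timeout: 10
```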

labels

Defines the types of nodes that should be created. Jobs should be written to run on nodes of a certain label. Example:

labels:
  - name: my-precise
    max-ready-age: 3600
    min-ready: 2
  - name: multi-precise
    min-ready: 2

Each entry is a dictionary with the following keys

name

Unique name used to tie jobs to those instances.

max-ready-age

Maximum number of seconds the node shall be in ready state. If this is exceeded the node will be deleted. A value of 0 disables this.

min-ready

Minimum number of instances that should be in a ready state. Nodepool always creates more nodes as necessary in response to demand, but setting min-ready can speed processing by attempting to keep nodes on-hand and ready for immediate use. min-ready is best-effort based on available capacity and is not a guaranteed allocation. The default of 0 means that nodepool will only create nodes of this label when there is demand. Set to -1 to have the label considered disabled, so that no nodes will be created at all.
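
The three kinds of min-ready value described above can be contrasted in one sketch. The label names are hypothetical.

```yaml
labels:
  # Try to keep two nodes ready ahead of demand (best effort).
  - name: warm-label
    min-ready: 2
  # The default: nodes are created only when there is demand.
  - name: on-demand-label
    min-ready: 0
  # Disabled: no nodes of this label are created at all.
  - name: retired-label
    min-ready: -1
```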

max-hold-age

Maximum number of seconds a node shall be in "hold" state. If this is exceeded the node will be deleted. A value of 0 disables this.

This setting is applied to all nodes, regardless of label or provider.

diskimages

This section lists the images to be built using diskimage-builder. The name of the diskimage is mapped to the providers.[openstack].diskimages section of the provider, to determine which providers should receive uploads of each image. The diskimage will be built in every format required by the providers with which it is associated. Because Nodepool needs to know which formats to build, the diskimage will only be built if it appears in at least one provider.

To remove a diskimage from the system entirely, remove all associated entries in providers.[openstack].diskimages and remove its entry from diskimages. All uploads will be deleted as well as the files on disk.

If multiple builders are used to build disjoint images, the diskimage stanza for every image must be present on every builder. However, each builder may have different providers configured, and a given builder will only build images used by its configured providers.

A sample configuration section is illustrated below.

diskimages:
  - name: base
    abstract: True
    elements:
      - vm
      - simple-init
      - openstack-repos
      - nodepool-base
      - cache-devstack
      - cache-bindep
      - growroot
      - infra-package-needs
    env-vars:
      TMPDIR: /opt/dib_tmp
      DIB_CHECKSUM: '1'
      DIB_IMAGE_CACHE: /opt/dib_cache

  - name: ubuntu-bionic
    parent: base
    pause: False
    rebuild-age: 86400
    elements:
      - ubuntu-minimal
    release: bionic
    username: zuul
    env-vars:
      DIB_APT_LOCAL_CACHE: '0'
      DIB_DISABLE_APT_CLEANUP: '1'
      FS_TYPE: ext3

  - name: ubuntu-focal
    parent: ubuntu-bionic
    release: focal
    env-vars:
      DIB_DISABLE_APT_CLEANUP: '0'

  - name: centos-8
    parent: base
    pause: True
    rebuild-age: 86400
    formats:
      - raw
      - tar
    elements:
      - centos-minimal
      - epel
    release: '8'
    username: centos
    env-vars:
      FS_TYPE: xfs

Each entry is a dictionary with the following keys

name

Identifier to reference the disk image in providers.[openstack].diskimages and labels.

abstract

An abstract entry is used to group common configuration together, but will not create any actual image. A diskimage marked as abstract should be inherited from in another diskimage via its diskimages.parent attribute.

An abstract entry can have a diskimages.parent attribute as well; values will merge down.

parent

A parent diskimage entry to inherit from. Any values from the parent will be populated into this image. Setting any fields in the current image will override the parent values except for the following:

  • diskimages.env-vars: new keys are additive, any existing keys from the parent will be overwritten by values in the current diskimage (i.e. Python update() semantics for a dictionary).
  • diskimages.elements: values are additive; the list of elements from the parent will be extended with any values in the current diskimage. Note that the element list passed to diskimage-builder is not ordered; elements specify their own dependencies and diskimage-builder builds a graph from that, not the command-line order.

Note that a parent diskimage may also have its own parent, creating a chain of inheritance. See also diskimages.abstract for defining common configuration that does not create a diskimage.
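
The merge rules for diskimages.env-vars and diskimages.elements can be traced with a small sketch. The entries reuse names from the sample above, trimmed down for illustration; the trailing comments show the settings that result from the rules as described here.

```yaml
diskimages:
  - name: base
    abstract: True
    elements:
      - vm
      - simple-init
    env-vars:
      TMPDIR: /opt/dib_tmp
      DIB_CHECKSUM: '1'

  - name: ubuntu-bionic
    parent: base
    elements:
      - ubuntu-minimal
    release: bionic
    env-vars:
      DIB_CHECKSUM: '0'
      FS_TYPE: ext3

# Effective settings for ubuntu-bionic after inheritance:
#   elements: vm, simple-init, ubuntu-minimal   (parent list extended)
#   env-vars: TMPDIR=/opt/dib_tmp               (inherited)
#             DIB_CHECKSUM='0'                  (overridden by the child)
#             FS_TYPE=ext3                      (added by the child)
#   release:  bionic
```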

formats

The list of formats to build is normally automatically created based on the needs of the providers to which the image is uploaded. To build images even when no providers are configured or to build additional formats which you know you may need in the future, list those formats here.

In case the diskimage is not used by any provider and no formats are configured, the image won't be built.

delete-after-upload

When set to True, the builder will delete the disk image file from disk after all uploads are complete. If more than one format is built, this is performed individually for each format (so that, for example, if all vhd uploads are complete but not qcow2, then the qcow2 files will remain while vhd are deleted).

The diskimage manifest directories are retained as long as any uploaded image is present.

keep-formats

If diskimages.delete-after-upload is set, this setting may be used to retain images of certain formats even after their uploads are complete. For example, setting this value to ["qcow2"] will retain qcow2 images while deleting all other formats.
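
As a sketch of the two options working together, the entry below builds raw and qcow2 formats. Once all raw uploads are complete the raw file is deleted, while the qcow2 file is retained. The image name and element are borrowed from the sample above for illustration.

```yaml
diskimages:
  - name: ubuntu-focal
    formats:
      - raw
      - qcow2
    # Delete each format's file once all of its uploads are complete...
    delete-after-upload: True
    # ...except for the formats listed here.
    keep-formats:
      - qcow2
    elements:
      - ubuntu-minimal
    release: focal
```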

rebuild-age

If the current diskimage is older than this value (in seconds), then nodepool will attempt to rebuild it. Defaults to 86400 (24 hours).

release

Specifies the distro to be used as a base image to build the image using diskimage-builder.

build-timeout

How long (in seconds) to wait for the diskimage build before giving up. The default is 8 hours.

elements

Lists the elements to include when building the image. Elements are looked up in the elements-dir path set in the same configuration file.

env-vars

Arbitrary environment variables that will be available in the spawned diskimage-builder child process.

pause

When set to True, nodepool-builder will not build the diskimage.

username

The username that a consumer should use when connecting to the node.

python-path

The path of the default python interpreter. Used by Zuul to set ansible_python_interpreter. The special value auto will direct Zuul to use inbuilt Ansible logic to select the interpreter on Ansible >=2.8, and default to /usr/bin/python2 for earlier versions.

shell-type

The shell type of the node's default shell executable. Used by Zuul to set ansible_shell_type. This setting should not be used unless the default shell is a non-Bourne (sh) compatible shell, e.g. csh or fish. For a windows image with the experimental connection-type ssh, cmd or powershell should be set and reflect the node's DefaultShell configuration.
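
A sketch of the connection-related diskimage settings follows. The image name is hypothetical, and the entry assumes an image whose default shell is csh; other required keys such as elements are omitted for brevity.

```yaml
diskimages:
  # Hypothetical image whose default shell is not Bourne-compatible.
  - name: csh-default-image
    username: zuul
    # Let Ansible (>= 2.8) select the interpreter.
    python-path: auto
    # Only needed because the default shell is csh.
    shell-type: csh
```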

dib-cmd

Configure the command called to create this disk image. By default this is just disk-image-create; i.e. it will use the first match in $PATH. For example, you may want to override this with a fully qualified path to an alternative executable if a custom diskimage-builder is installed in another virtualenv.

Note

Any wrapping scripts or similar should consider that the command-line or environment arguments to disk-image-create are not considered an API and may change.
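
For instance, pointing dib-cmd at a diskimage-builder installed in a separate virtualenv might look like this sketch. The virtualenv path is hypothetical.

```yaml
diskimages:
  - name: ubuntu-focal
    # Hypothetical virtualenv location; use the full path to your own install.
    dib-cmd: /opt/dib-venv/bin/disk-image-create
    elements:
      - ubuntu-minimal
    release: focal
```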

metadata

This provides default values for the provider-specific metadata or tags values which can be set for diskimage uploads to specific providers. This is a dictionary of arbitrary key/value pairs. Avoid the use of nodepool_ as a key prefix since Nodepool uses this for internal values.
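
A sketch of default metadata on a diskimage is shown below. The keys and values are arbitrary examples; note that none of them starts with nodepool_.

```yaml
diskimages:
  - name: ubuntu-focal
    # Arbitrary key/value pairs used as defaults for provider uploads.
    metadata:
      team: infra
      purpose: ci
```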

providers

Lists the providers Nodepool should use. Each provider is associated with one of the drivers listed below.

Each entry is a dictionary with the following keys

name

Name of the provider

max-concurrency

Maximum number of node requests that this provider is allowed to handle concurrently. The default, if not specified, is to have no maximum. Since each node request is handled by a separate thread, this can be useful for limiting the number of threads used by the nodepool-launcher daemon.

priority

The priority of this provider (a lesser number is a higher priority). Nodepool launchers will yield requests to other provider pools with a higher priority as long as they are not paused. This means that in general, higher priority pools will reach quota first before lower priority pools begin to be used.

This setting provides the default for each provider pool, but the value can be overridden in the pool configuration.

driver

The driver type.

aws

For details on the extra options required and provided by the AWS driver, see the separate section aws-driver

azure

For details on the extra options required and provided by the Azure driver, see the separate section azure-driver

gce

For details on the extra options required and provided by the GCE driver, see the separate section gce-driver

kubernetes

For details on the extra options required and provided by the kubernetes driver, see the separate section kubernetes-driver

openshift

For details on the extra options required and provided by the openshift driver, see the separate section openshift-driver

openshiftpods

For details on the extra options required and provided by the openshiftpods driver, see the separate section openshift-pods-driver

openstack

For details on the extra options required and provided by the OpenStack driver, see the separate section openstack-driver

static

For details on the extra options required and provided by the static driver, see the separate section static-driver
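
The common provider keys can be sketched as follows. The provider names and numeric values are hypothetical, and the fragment is deliberately incomplete: each driver requires additional options that are described in its own section.

```yaml
providers:
  # Preferred provider: the lesser priority number wins.
  - name: example-cloud
    driver: openstack
    # Handle at most ten node requests at once.
    max-concurrency: 10
    priority: 10
    # Driver-specific options omitted.

  # Used once higher-priority pools have reached quota or are paused.
  - name: fallback-cloud
    driver: openstack
    priority: 20
    # Driver-specific options omitted.
```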

tenant-resource-limits

A list of global resource limits enforced per tenant (e.g. Zuul tenants).

These limits are calculated on a best-effort basis. Because of parallelism within launcher instances, and especially with multiple launcher instances, the limits are not guaranteed to be exact.

tenant-resource-limits:
  - tenant-name: example-tenant
    max-servers: 10
    max-cores: 200
    max-ram: 16565
    'L-43DA4232': 448

Each entry is a dictionary with the following keys. Any other keys are interpreted as driver-specific resource limits (otherwise specified as max-resources in the provider configuration). The only driver that currently supports additional resource limits is AWS.

tenant-name

A tenant name corresponding, e.g., to a Zuul tenant.

max-servers

The maximum number of servers a tenant can allocate.

max-cores

The maximum number of CPU cores a tenant can allocate.

max-ram

The maximum amount of main memory (RAM) a tenant can allocate.

max-volumes

The maximum number of volumes a tenant can allocate. Currently only used by the OpenStack driver.

max-volume-gb

The maximum total size in gigabytes of volumes a tenant can allocate. Currently only used by the OpenStack driver.
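
As a sketch, the volume limits can be combined with the other per-tenant keys like this. The tenant name and all numbers are illustrative, and the volume limits only take effect with the OpenStack driver, as noted above.

```yaml
tenant-resource-limits:
  - tenant-name: example-tenant
    max-servers: 10
    max-cores: 200
    # Volume limits; currently only used by the OpenStack driver.
    max-volumes: 20
    max-volume-gb: 500
```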