
Blueprint to support multiple ssh auth strategies. Change-Id: Ib882fd2c9354b91c5069a35ec74003e0259fec3f
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
http://creativecommons.org/licenses/by/3.0/legalcode
Multiple strategies for ssh access to VMs
https://blueprints.launchpad.net/tempest/+spec/ssh-auth-strategy
Different strategies for ssh access to VMs in tests.
Problem description
Ssh access to created servers is in several cases key to properly validate the result of an API call or a scenario (use case) test. This is true for compute but not limited to it. Network and volume verification must often rely on test servers, and ssh access to the VM helps significantly for the verification.
Support for ssh access to VMs in tempest tests is both heterogeneous and incomplete. Not all tests honour the same config options. The existing run_ssh option is only taken into account by some of the tests, the compute API ones. Not all tests use the same strategy for ssh access, and several tests do not perform any ssh verification at all. The reason is often that ssh verification is a common source of "flakiness" and timeouts in tests, and allocating the resources required for ssh verification can be expensive.
Proposed change
Consolidate the available configuration options and make sure they are honoured everywhere. Configuration shall be declarative, i.e. tempest users shall configure how they expect ssh to work, and if that is not compatible with the deployed cloud, tempest shall raise an InvalidConfiguration exception. Improve the configuration help text to guide configuration for instance validation.
Current configuration options relevant to instance validation are:
- CONF.auth.allow_tenant_isolation: affects the fixed network name
- CONF.compute.[image|image_alt]_ssh_user
- CONF.compute.image_ssh_password: not image specific, and it's used by only two tests, without checking against the ssh_auth_method
- CONF.compute.image_alt_ssh_password: unused
- CONF.compute.run_ssh
- CONF.compute.ssh_auth_method: used for resource setup by API compute tests, but not honoured by the tests. The image[_alt]_ssh[user|password] settings are meant to be used when this is set to "configured". At the moment this is neither enforced nor documented
- CONF.compute.ssh_connect_method: used for resource setup by API compute tests, not honoured by the tests. When set to floating, it should be verified that a floating IP range is configured
- CONF.compute.ssh_user: currently used for ssh verification by most API and scenario tests, which is a problem because configuration supports different images, each with its own ssh user
- CONF.compute.ping_timeout: used by scenario tests only
- CONF.compute.ssh_timeout: used by RemoteClient
- CONF.compute.ssh_channel_timeout: used by RemoteClient
- CONF.compute.fixed_network_name: used by API and scenario tests. It's the name of the network for the primary IP with nova networking; or with neutron networking when tenant isolation is disabled. The logic, as implemented by test_list_server_filters, shall be moved to a helper and reused everywhere. It may be used for ssh validation only if floating IPs are disabled
- CONF.compute.network_for_ssh: used by RemoteClient and some scenario tests to discover an IP for ssh validation. It can be used if floating IP for ssh is disabled, in which case the fixed_network_name could be used as well; except for the case of multi-nic testing, which would require more logic anyway to enable the 2nd nic
- CONF.compute.ip_version_for_ssh: used by RemoteClient. It should be overridable via parameter instead of one config for all tests
- CONF.compute.use_floatingip_for_ssh: used by some scenario tests, duplicate of ssh_connect_method, which is not used at the moment
- CONF.compute.path_to_private_key: unused
- CONF.network.tenant_network_reachable: used by scenario tests. In some cases it's used for tests that want to verify both tenant and public network connectivity. In other cases it's used to find out which IP is to be used for instance validation, which overlaps with the ssh_connect_method
- CONF.network.public_network_id: used for allocation of floating IPs when neutron is enabled
Target configuration shall include a new group "validation", used for all options related to validation of API call results, and the following options:
- CONF.validation.connect_method: default ssh method. Tests may still use a different method if they want to do so (fixed or floating)
- CONF.validation.auth_method: default auth method. Tests may still use a different method if they want to do so (only ssh key supported for now). Additional methods will be handled in a separate spec
- CONF.validation.ip_version_for_ssh: default IP version for ssh
- CONF.validation.*timeout (for ping, connect and ssh)
- CONF.*.*ssh_user (for the various images available)
- CONF.network.fixed_network_name: default fixed network name; this parameter is only valid in case of nova network (with flat networking), and for now with pre-provisioned accounts. Once the bp test-accounts-continued is implemented this may still be used as default fixed network name if not specified in accounts.yaml
- CONF.network.floating_network_name: default floating network name, used to allocate floating IPs when neutron is enabled. Deprecates CONF.network.public_network_id
- CONF.network.tenant_network_reachable: used when the configured ssh_connect_method is "fixed". If this is set to false, raise an InvalidConfiguration exception
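The consistency checks described above could be sketched roughly as follows. This is a minimal illustration, not the tempest implementation: `ValidationSettings` and `check_validation_settings` are hypothetical names, the `"keypair"` value and the defaults are assumptions, `InvalidConfiguration` is redefined locally as a stand-in for the tempest exception, and plain Python values replace the real CONF object.

```python
from dataclasses import dataclass


class InvalidConfiguration(Exception):
    """Local stand-in for the tempest exception of the same name."""


@dataclass
class ValidationSettings:
    # Hypothetical container; names mirror the proposed options and the
    # default values are illustrative assumptions.
    connect_method: str = "floating"        # CONF.validation.connect_method
    auth_method: str = "keypair"            # CONF.validation.auth_method
    tenant_network_reachable: bool = False  # CONF.network.tenant_network_reachable


def check_validation_settings(settings):
    """Raise InvalidConfiguration when the declared settings cannot work."""
    if settings.connect_method not in ("fixed", "floating"):
        raise InvalidConfiguration(
            "unknown connect_method: %r" % settings.connect_method)
    if settings.auth_method != "keypair":
        # Only ssh keys are in scope for now; other methods are left to a
        # separate spec.
        raise InvalidConfiguration(
            "unsupported auth_method: %r" % settings.auth_method)
    if (settings.connect_method == "fixed"
            and not settings.tenant_network_reachable):
        raise InvalidConfiguration(
            "connect_method is 'fixed' but tenant networks are not reachable")
```

Running such a check once, before any resources are allocated, would let a misconfigured run fail early instead of surfacing as ssh timeouts in individual tests.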
Configuration options that are renamed or that are planned for removal should go through the deprecation process.
A few options are image specific: image name, ssh user / password, typical time to boot / ssh. Such options would be better handled in a dedicated images.yaml file rather than in tempest.conf. This will be handled in a separate spec.
Define helper functions that read, validate and process the configuration, which in future will help decouple create_test_server from CONF, for migration to tempest-lib.
Extend the existing RemoteClient to provide tools for:
- ping: attempts a single ping to a target server
- connect: attempts a single TCP connect on a generic port to a target server
- ssh: attempts a single ssh connection to a target server
- validation: validates a server by using a configurable sequence of the above; takes care of retries and timeouts
Bits of implementation for that are already available in scenario tests. They should be consolidated in RemoteClient.
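The split between single-attempt checks and a retrying validation step could look roughly like the sketch below. The names `tcp_connect` and `validate` are hypothetical and are not existing RemoteClient methods; the ping and ssh checks are left out and would be supplied as further callables. A single deadline shared by the whole sequence is one possible design, chosen here only for brevity.

```python
import socket
import time


def tcp_connect(host, port, timeout=5.0):
    """Attempt a single TCP connect; return True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def validate(checks, timeout=60.0, interval=1.0,
             sleep=time.sleep, clock=time.monotonic):
    """Run (name, check) pairs in order, retrying each until it passes.

    All checks share one deadline. Returns None when every check passed,
    or the name of the check that was still failing at the deadline.
    """
    deadline = clock() + timeout
    for name, check in checks:
        while not check():
            if clock() >= deadline:
                return name
            sleep(interval)
    return None
```

A caller could pass, for example, `[("connect", lambda: tcp_connect(ip, 22))]` as the sequence; the `sleep` and `clock` parameters exist so that the retry loop can be exercised in unit tests without real waiting.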
Define a validation_resources function, similar to the existing network_resources, to be used in the class level resource_setup, which allocates the required reusable resources, such as: a key pair, a security group with rules in it, and a floating ip. It returns all the resources in the form of a dict, ready to be used in create_test_server. Tests which use more than one server will allocate additional floating IPs on demand. Once bp test-accounts-continued is implemented as well we may consider consolidating validation_resources and network_resources.
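As a rough illustration of the shape such a function might take, the sketch below allocates the three resources and returns them in a dict. The `client` object and its `create_keypair`, `create_security_group_with_rules` and `create_floating_ip` methods are hypothetical, as are the keyword flags; `FakeClient` exists only to make the example self-contained.

```python
import itertools


class FakeClient:
    """In-memory stand-in for the real service clients (illustrative only)."""

    def __init__(self):
        self._ids = itertools.count(1)

    def create_keypair(self):
        return "key-%d" % next(self._ids)

    def create_security_group_with_rules(self):
        # Assumed to open ICMP and ssh, as the spec requires.
        return "sg-%d" % next(self._ids)

    def create_floating_ip(self):
        return "198.51.100.%d" % next(self._ids)


def validation_resources(client, keypair=True, security_group=True,
                         floating_ip=True):
    """Allocate reusable validation resources and return them as a dict."""
    resources = {}
    if keypair:
        resources["key_name"] = client.create_keypair()
    if security_group:
        resources["security_group"] = (
            client.create_security_group_with_rules())
    if floating_ip:
        resources["floating_ip"] = client.create_floating_ip()
    return resources
```

The dict keys follow the items the spec lists for the resources parameter of create_test_server (key_name, security_group, floating_ip), so the result can be handed over unchanged.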
Centralize create_test_server, and make sure all tests use this central implementation. Add the following features:
- it includes an sshable boolean parameter in the create_test_server helper function, which defaults to False. If set to True it ensures the server is created with all the required resources associated, e.g. that it has a public key injected, an IP address on a public network, and a security group that allows for ICMP and ssh communication. The default of False ensures that resources are used only when required.
- it accepts a resources dict with reusable items, which can be: a key_name, a security_group with rules for ssh and icmp in, a floating_ip. These are passed in as parameters in preparation for the migration to tempest-lib.
- it extends the valid values for wait_until with new types of wait abilities: PINGABLE and SSHABLE. For instance if an SSHABLE server is requested the create method takes care of performing basic ssh validation as well.
- it returns a tuple (created_server, remote_client), where the remote client is already initialized with access resources such as public key, admin password, IP address, ssh account name.
    def create_test_server(self, client, wait_until=None, sshable=False,
                           resources=None, **kwargs):
        validation = False
        if sshable and run_ssh:
            # read config via helpers
            # process result, extend kwargs, but do not override:
            #   public_key: if key_name not defined use from resources or create
            #   sg rules: use from resources, or create sg with rules and append
            #   network name: append to network dict
            #   floating ip: use from resources or allocate one
            validation = True
        (...)
        server = servers_client.create_server(**kwargs)
        # wait for status
        if ip_type == 'floating':
            # attach an IP
        remote = None
        if validation:
            # build params based on helpers above
            remote = RemoteClient(**params)
            # wait for status (extended: ping / connect / ssh)
        return server, remote

    def test_foo(self):
        server, remote = servers.create_test_server(
            sshable=True, wait_until='SSHABLE')
        remote.write_to_console("I could do something more useful")
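The "extend kwargs, but do not override" step in the pseudocode above is the part most easily made concrete. The sketch below is one possible reading of it, under stated assumptions: `apply_validation_resources` is a hypothetical helper name, and the security group is appended in the `{"name": ...}` form commonly used in server create requests. The floating IP is deliberately not merged, since the pseudocode attaches it after the server is created.

```python
def apply_validation_resources(kwargs, resources):
    """Return a copy of kwargs extended with validation resources.

    Values the caller already set are never overridden.
    """
    merged = dict(kwargs)
    resources = resources or {}
    if "key_name" in resources:
        # Keep the caller's key if one was given explicitly.
        merged.setdefault("key_name", resources["key_name"])
    if "security_group" in resources:
        # Append to, rather than replace, any caller-supplied groups.
        groups = list(merged.get("security_groups", []))
        entry = {"name": resources["security_group"]}
        if entry not in groups:
            groups.append(entry)
        merged["security_groups"] = groups
    return merged
```

Returning a new dict instead of mutating the caller's kwargs keeps the helper free of side effects, which fits the goal of decoupling create_test_server from global state.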
A server can still be made ssh-able "by-hand" for more complex scenarios, such as hot-plug tests, where the server may only be connected at a later stage to a public network.
In case a test class contains tests which make use of ssh-able servers, network resources must be prepared for the tenant (if not yet available), so that it is possible to have network access to the VM.
Alternatives
As run_ssh is currently disabled, an alternative could be to completely drop ssh verification from API tests. However a number of cases cannot really be verified unless ssh verification is on (e.g. reboot, rebuild, config drive).
Implementation
Assignee(s)
- Primary assignee: Andrea Frittoli <andrea.frittoli@hp.com>
- Other assignees: Nithya Ganesan <nithya.ganesan@hp.com>, Joseph Lanoux <joseph.lanoux@hp.com>
Milestones
- Target Milestone for completion: Kilo-2
Work Items
- Introduce new configuration options, and helpers to read them
- Create a validation_resources function
- Create shared create_test_server function
- Create shared ssh verification function / extend RemoteClient
- Migrate tests to the new format (multiple patches)
- Deprecate unused / removed configuration options
- Set up experimental / periodic jobs that run with validation enabled
- The aim is to promote both run_ssh and sshable to be True by default, and to keep the code path healthy until that happens
Dependencies
None