
::

  Copyright 2015 Hewlett-Packard Development Company, L.P.

  This work is licensed under a Creative Commons Attribution 3.0
  Unported License.
  http://creativecommons.org/licenses/by/3.0/legalcode

========================================
shade: A library that understands clouds
========================================

Infra uses multiple clouds and, as a result, has learned a lot about what it
takes to do so. In the interest of being good citizens, that knowledge should
live in a reusable library rather than inside of nodepool.

Problem Description
===================

As much as OpenStack promises a utopian future where an application can be
written once and target multiple clouds that run OpenStack, the reality is
that deployer choice leaks through the abstractions to the point where the
end user must know about it. As a result, applications need a priori
knowledge about each cloud, and need complex logic to handle even the
differences that are discoverable.

The current user interface libraries, `python-*client`, are particularly
user-unfriendly, as they were primarily written with server-to-server
communication in mind. Each was also designed completely differently, so
an application which uses more than one OpenStack feature quickly becomes
confusing to write.

In addition to Infra, `ansible` has a set of modules that focus on creating
and managing cloud resources. As part of using `ansible` to orchestrate
`puppet`, it only makes sense for Infra to use `ansible` to manage its
cloud resources as well, which means that the logic Infra has learned about
managing them should be applicable there too. Specifics on using `ansible`
for that purpose are out of scope of this spec, but `ansible` upstream as a
consumer is an important design consideration.

Proposed Change
===============

The `shade` library will handle all of this. It will contain the logic learned
from `nodepool` and, moving forward, any new complex cloud manipulation logic
that `nodepool` needs. `nodepool` should be considered the primary user of
`shade`.

To that end, `shade` must support constructs like application-based API rate
limiting and caching appropriate for long-lived connections.
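
As a rough illustration of these two constructs, the following is a minimal
sketch and not `shade` code. The names `RateLimiter` and `TTLCache`, and their
parameters, are assumptions made up for this example. The clock and sleep
functions are injectable only so that the behavior is easy to test.

.. code-block:: python

    import time


    class RateLimiter:
        """Allow at most `rate` calls per second by sleeping between calls."""

        def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
            self._interval = 1.0 / rate
            self._clock = clock
            self._sleep = sleep
            self._next_allowed = None

        def wait(self):
            """Block until the next call is allowed, then reserve a slot."""
            now = self._clock()
            if self._next_allowed is not None and now < self._next_allowed:
                self._sleep(self._next_allowed - now)
                now = self._next_allowed
            self._next_allowed = now + self._interval


    class TTLCache:
        """Remember one value per key for `ttl` seconds."""

        def __init__(self, ttl, clock=time.monotonic):
            self._ttl = ttl
            self._clock = clock
            self._entries = {}

        def get_or_call(self, key, func):
            """Return the cached value for `key`, calling `func` if stale."""
            now = self._clock()
            entry = self._entries.get(key)
            if entry is not None and now - entry[0] < self._ttl:
                return entry[1]
            value = func()
            self._entries[key] = (now, value)
            return value

A long-lived consumer such as `nodepool` could call `wait()` before each API
request and wrap list operations in `get_or_call()`, so that repeated polling
does not result in repeated remote calls.
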

A consumer of `shade` should never need to put in logic such as "if my cloud
supports X, then do Y, else Z". There are two situations in which such logic
might arise.

First, there may be two or more ways of performing the same logical action.
An example is getting a floating IP, which could be the purview of
`neutron` or of `nova`. `shade` should present a general `create_floating_ip`
to the user and hide all details about where the address came from.

Second, there is functionality that simply does not exist on a cloud.
For example, some clouds are deployed without trove. In that case, the user
will receive an error message stating that the selected cloud does not support
managing trove resources.
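
Both situations can be sketched with a small facade. This is an illustration
under stated assumptions, not the `shade` implementation: the `Cloud` class,
its `services` argument, `UnsupportedServiceError` and `list_databases` are
hypothetical names, and the backend methods return placeholder data instead of
calling `neutron` or `nova`.

.. code-block:: python

    class UnsupportedServiceError(Exception):
        """Raised when the selected cloud does not offer a service."""


    class Cloud:
        """Hypothetical facade; `services` names what the cloud offers."""

        def __init__(self, name, services):
            self.name = name
            self._services = set(services)

        def _require(self, service):
            if service not in self._services:
                raise UnsupportedServiceError(
                    "cloud %s does not support managing %s resources"
                    % (self.name, service))

        def create_floating_ip(self):
            # The caller never learns which service supplied the address.
            if "neutron" in self._services:
                return self._neutron_create_floating_ip()
            return self._nova_create_floating_ip()

        def _neutron_create_floating_ip(self):
            # Placeholder: a real backend would call the network API here.
            return {"floating_ip_address": "203.0.113.10"}

        def _nova_create_floating_ip(self):
            # Placeholder: a real backend would call the compute API here.
            return {"floating_ip_address": "203.0.113.20"}

        def list_databases(self):
            # Functionality that may simply not exist on this cloud.
            self._require("trove")
            return []

The consumer calls `create_floating_ip()` the same way on every cloud and gets
the same shape of result back. A missing service surfaces as one clear error
rather than as a branch in the consumer's code.
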

The `python-*client` libraries are not written with end users in mind. Their
primary use case is enabling server-to-server communication. As such, they
make a set of assumptions that are not in keeping with a consumer point of
view. Their use should be replaced by `python-openstacksdk` once it is ready.
However, it is not ready yet, so in the meantime the `python-*client`
libraries need to be used. As the future plan is to replace them, all objects
and exceptions they return should be expressly hidden, even though masking
exceptions is considered poor form.
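
One way to hide client exceptions is to translate them at the library
boundary. The sketch below is an assumption about how that could look, not a
description of `shade` itself: `CloudException`, `hide_client_errors` and
`get_server` are hypothetical names, and `FakeClientError` merely stands in
for whatever an underlying client library might raise.

.. code-block:: python

    import functools


    class CloudException(Exception):
        """The only exception type this hypothetical library lets escape."""


    class FakeClientError(Exception):
        """Stand-in for an exception raised by an underlying client."""


    def hide_client_errors(func):
        """Re-raise any foreign exception as a CloudException."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except CloudException:
                raise
            except Exception as exc:
                # The original error is chained so it stays debuggable,
                # but callers only ever need to catch CloudException.
                raise CloudException(
                    "%s failed: %s" % (func.__name__, exc)) from exc
        return wrapper


    @hide_client_errors
    def get_server(server_id):
        # Simulate the underlying client failing.
        raise FakeClientError("404 for %s" % server_id)

With this shape, swapping the underlying client library later does not change
which exceptions a consumer has to handle.
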

A future state could be imagined where `shade` and `python-openstacksdk` merge,
but it does not seem to be the primary concern of either library at the moment.
If it did happen, it would likely be as a "simple" API or something on top of
or to the side of the rest of the SDK. The reason for this is largely that
`python-openstacksdk` is more concerned with providing an SDK to program the
OpenStack APIs with, while `shade` is more concerned with hiding the ways in
which deployers have chosen to do things that leak through the API. It is
likely that a future state where `shade` is deprecated is one in which the
issues it deals with are handled in the server APIs themselves. At that point,
a separate layer of business logic is no longer needed.

Passthrough access to the underlying `Client` objects is useful for phased
adoption of `shade`. Before 1.0 is released, this access should either be
considered for removal or hidden behind a warning that can be disabled. This
ensures that a user has to explicitly opt in, knowing that the `Client`
objects are not part of the API.
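
A warning that can be disabled maps naturally onto Python's standard
`warnings` module. The following is a minimal sketch under assumptions:
`PassthroughWarning`, `CloudWithPassthrough` and the `nova_client` property
are hypothetical names chosen for illustration.

.. code-block:: python

    import warnings


    class PassthroughWarning(UserWarning):
        """Emitted when a caller reaches for an underlying client object."""


    class CloudWithPassthrough:
        """Hypothetical wrapper that still exposes its client object."""

        def __init__(self, nova_client):
            self._nova_client = nova_client

        @property
        def nova_client(self):
            warnings.warn(
                "nova_client is a passthrough and not part of the API",
                PassthroughWarning, stacklevel=2)
            return self._nova_client

A user who knowingly opts in can silence the warning with
`warnings.simplefilter("ignore", PassthroughWarning)`, while everyone else is
told that they have stepped outside the supported interface.
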

`ansible` is the second user of `shade`. The main addition this brings is the
need for idempotent operations. The `ansible` modules must have enough in the
API to be able to provide that without large amounts of repeated logic in the
modules themselves. In fact, most of the `ansible` modules should actually
contain very little code that is not related to `ansible` argument processing
or interpretation of results into a suitable format.
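
An idempotent operation can be sketched as an "ensure" call that reports
whether it changed anything. This is an illustration only: `InMemoryCloud` and
its methods, including `ensure_network`, are hypothetical and keep their state
in a dictionary instead of talking to a cloud.

.. code-block:: python

    class InMemoryCloud:
        """Hypothetical stand-in for a cloud, holding networks by name."""

        def __init__(self):
            self._networks = {}

        def get_network(self, name):
            return self._networks.get(name)

        def create_network(self, name):
            network = {"name": name}
            self._networks[name] = network
            return network

        def ensure_network(self, name):
            """Return (network, changed); create only if it is missing."""
            existing = self.get_network(name)
            if existing is not None:
                return existing, False
            return self.create_network(name), True

With an API of this shape, a module would only need to parse its arguments,
call the "ensure" method, and pass the `changed` flag on in its result.
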

Finally, it is not `shade`'s purpose in life to express what is or is not
OpenStack, nor to be involved in such categorizations. Its job is to improve
the end user's experience. For that reason, `shade` should take a maximal
approach to including support for things. If someone wants to add support
for `designate` or `magnum` or `manila` or whatever, that's awesome.

It is a conscious and active decision to not use a plugin interface for this.
Because, again, `shade` exists to reduce the cognitive burden on the user, the
user should not have to know to install plugins to be able to use their cloud.
The two main reasons for pluggable clients in the past have been:

* Strict policies on what is 'Integrated'
* To enable proprietary extensions

The first is no longer a problem for OpenStack broadly, and even if it were,
it would still not be a practical issue for an Infra project.

The second is the thing that will ultimately cause OpenStack to die if it is
allowed to continue. While the right of people to choose to destroy all the
goodness in the world is an important right for them to have, there is no need
for Infra to involve itself in such a tragedy.

Anything that's in `shade` needs to be testable by running `shade` functional
tests against a devstack in the Infra gates.

There is currently one exception to the rule of being testable in the Infra
gates: the Rackspace Task API for Glance does not work in devstack, so we
cannot test it. We have an exception for this because at the moment,
`nodepool` must use that API, and it is an API that exists in glance, even if
the backing code is broken. However, the general rule stands, and any
violations of that rule need to be carefully considered exceptions - and
probably accompanied by a large amount of complaining.

Alternatives
------------

We could skip writing a library and write all of our logic directly in
nodepool. This is problematic because it prevents a lot of really useful code
and logic from being easily reusable by the community at large.

We could write all of the logic directly in the ansible modules upstream and
then have nodepool turn into an engine which consumes the ansible modules. This
is more tempting, but ansible does not support long-lived objects, which means
that we'd be execing ansible on every operation, which seems rather extreme. It
also means that people not using ansible would be unable to benefit from the
logic.

We could improve the client libraries or `python-openstacksdk`. We've tried to
include richer logic in the client libraries and have been told it's not what
they are for. The `python-openstacksdk` is still young and we've been told it's
not ready for production use yet. We need some of the logic for `shade` now,
so the timescale for getting it done in `python-openstacksdk` isn't very
workable.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  mordred

Additional assignee(s):
  Shrews
  greghaynes
  dguerri
  TheJulia
  Spamaps

Gerrit Topic
------------

`shade` is a library itself, so there is no dedicated gerrit topic.

Work Items
----------

* Implement Image uploading for nodepool
* Get to feature parity with nodepool on floating-ips and server creation
* Implement ansible modules for every function in shade

Repositories
------------

openstack-infra/shade

Servers
-------

None

DNS Entries
-----------

None

Documentation
-------------

`shade` needs developer documentation of its API.

Security
--------

None

Testing
-------

`shade` should have both unit tests and functional tests. The functional
tests should run against devstack VMs. If a developer chooses to, they should
be able to manually run functional tests against live clouds, since the purpose
of shade is to enable use of myriad clouds, not to support or expose
theoretical APIs.

Dependencies
============

None