From 547d3b49ab16c5aa7a41cd943591209c8a6f899d Mon Sep 17 00:00:00 2001 From: Monty Taylor Date: Mon, 1 Jun 2015 12:12:50 -0500 Subject: [PATCH] Add spec for shade We've already done a lot of this, but let's get the vision down on paper, shall we? Change-Id: I30b6fb32e53786d2fe504b78182e1c18c0a17364 --- doc/source/index.rst | 1 + specs/shade.rst | 216 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 217 insertions(+) create mode 100644 specs/shade.rst diff --git a/doc/source/index.rst b/doc/source/index.rst index eb71eb9..6fac971 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -38,6 +38,7 @@ permits. specs/nodepool-launch-workers specs/nodepool-workers specs/public_hiera + specs/shade specs/storyboard_integration_tests specs/storyboard_story_tags specs/storyboard_subscription_pub_sub diff --git a/specs/shade.rst b/specs/shade.rst new file mode 100644 index 0000000..ffddc55 --- /dev/null +++ b/specs/shade.rst @@ -0,0 +1,216 @@ +:: + + Copyright 2015 Hewlett-Packard Development Company, L.P. + + This work is licensed under a Creative Commons Attribution 3.0 + Unported License. + http://creativecommons.org/licenses/by/3.0/legalcode + +======================================== +shade: A library that understands clouds +======================================== + +Infra uses multiple clouds and as a result has learned a lot about what needs +to be done to do that. In the interest of being good citizens, instead of that +knowledge being inside of nodepool, it should be in a reusable library. + +Problem Description +=================== + +As much as OpenStack promises a utopian future where an application can be +written once and target multiple clouds that run OpenStack, the reality is +that deployer choice leaks through the abstractions to the point where the +end user must know about it. This causes logic to require a-priori knowledge +about clouds, as well as complex logic even on discoverable differences. 
+ +The current user interface libraries, `python-*client`, are particularly +user-unfriendly, as they were primarily written with server-to-server +communication in mind. They were also each designed completely differently, +so that an application which uses more than one OpenStack feature quickly +becomes confusing to write. + +In addition to Infra, `ansible` has a set of modules that focus on creating +and managing cloud resources. As part of using `ansible` to orchestrate +`puppet`, it only makes sense for Infra to use `ansible` to manage its +resources, which means that the logic Infra has learned about how that works +should be applicable. Specifics on using `ansible` for that purpose are +out of scope of this spec, but `ansible` upstream as a consumer is an +important design consideration. + +Proposed Change +=============== + +The `shade` library will handle all of this. It will contain the logic learned +from `nodepool` and, moving forward, it will contain any new complex cloud +manipulation logic that `nodepool` needs. It should be considered that +`nodepool` is `shade's` primary user. + +To that end, `shade` must support constructs like application-based API rate +limiting and caching appropriate for long-lived connections. + +A consumer of `shade` should never need to put in logic such as "if my cloud +supports X, then do Y, else Z". There are two situations in which such logic +might arise. + +Firstly, there are two or more ways of doing the same logical action. +An example is getting a floating IP, which could be the purview of +`neutron` or of `nova`. `shade` should present a general `create_floating_ip` +to the user and hide all details about where it came from. + +Secondly, there is functionality that simply does not exist on a cloud. +For example, some clouds are deployed without trove. In that case, the user +will receive an error message stating that the selected cloud does not support +managing trove resources.
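The two situations above can be illustrated with a minimal sketch. Only the
`create_floating_ip` name comes from this spec; `CloudSketch`,
`UnsupportedServiceError`, `list_databases`, the private helpers and the
addresses are hypothetical stand-ins, and no real client library is called.

```python
class UnsupportedServiceError(Exception):
    """Hypothetical error for a service the selected cloud does not deploy."""


class CloudSketch:
    """Illustrative stand-in for a cloud connection; not shade's real class."""

    def __init__(self, name, services):
        self.name = name
        self.services = set(services)

    def _require(self, service):
        # One place that turns "service missing" into a clear, uniform error.
        if service not in self.services:
            raise UnsupportedServiceError(
                "cloud %r does not support managing %s resources"
                % (self.name, service)
            )

    def _floating_ip_from_neutron(self):
        # Placeholder for a call into the network service.
        return {"address": "203.0.113.10"}

    def _floating_ip_from_nova(self):
        # Placeholder for a call into the compute service.
        return {"address": "203.0.113.20"}

    def create_floating_ip(self):
        # Situation one: two ways to do the same logical action. The
        # caller gets the same result shape and never learns which
        # service provided the address.
        if "neutron" in self.services:
            return self._floating_ip_from_neutron()
        self._require("nova")
        return self._floating_ip_from_nova()

    def list_databases(self):
        # Situation two: the functionality may not exist on this cloud.
        self._require("trove")
        return []


if __name__ == "__main__":
    for cloud in (CloudSketch("alpha", ["nova", "neutron"]),
                  CloudSketch("beta", ["nova"])):
        print(cloud.name, cloud.create_floating_ip())
```

The consumer code in the loop is identical for both clouds; the "if my cloud
supports X" branch lives inside the library rather than in the caller.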
+ +The `python-*client` libraries are not written with end users in mind. They +have, as their primary use case, the enabling of server to server +communication. As such, they make a set of assumptions that is not in keeping +with a consumer point of view. Their use should be replaced by +`python-openstacksdk` once it is ready. However, it is not, so in the +meantime the `python-*client` libraries need to be used. As the future plan is to +replace them, all objects and exceptions they return should be expressly +hidden, even though masking exceptions is considered poor form. + +A future state could be imagined where `shade` and `python-openstacksdk` merge, +but it does not seem to be the primary concern of either library at the moment. +If it did happen, it would likely be as a "simple" API or something on top of +or to the side of the rest of the SDK. The reason for this is largely that +`python-openstacksdk` is more concerned with providing an SDK to program the +OpenStack APIs with - and `shade` is more concerned with hiding the ways in +which deployers have chosen to do things that leak through the API. It is +likely that a future state where `shade` is deprecated is one in which the +issues it deals with are bundled into the server APIs. In this instance, a +layer of business logic is not needed. + +Passthrough access to the underlying `Client` objects is useful for phased +adoption of `shade`. Before 1.0 is released, this access should either be removed or +hidden behind a disableable warning. This is to ensure a user has to explicitly +opt in, knowing that these objects are not part of the API. + +`ansible` is the second user of `shade`. The main addition this brings is the +need for idempotent operations. The `ansible` modules must have enough in the +API to be able to provide that without large amounts of repeated logic in the +modules themselves.
In fact, most of the `ansible` modules should actually +contain very little code that is not related to `ansible` argument processing +or interpretation of results into a suitable format. + +Finally, it is not `shade's` purpose in life to express what is or is not +OpenStack, nor to be involved in such categorizations. Its job is to improve +the end user's experience. For that reason, `shade` should take a maximal +approach to including support for things. If someone wants to add support +for `designate` or `magnum` or `manila` or whatever, that's awesome. + +It is a conscious and active decision to not use a plugin interface for this. +Again, because `shade` exists to reduce the cognitive burden on the user, the +user should not have to know to install plugins to be able to use their cloud. +The two main reasons for pluggable clients in the past were: + +* Strict policies on what is 'Integrated' +* To enable proprietary extensions + +The first is no longer a problem for OpenStack broadly, and even if it were, +it would still not be a practical issue for an Infra project. + +The second is the thing that will ultimately cause OpenStack to die if it is +allowed to continue. While the right of people to choose to destroy all the +goodness in the world is an important right for them to have, there is no need +for Infra to involve itself in such a tragedy. + +Anything that's in `shade` needs to be testable by running `shade` functional +tests against a devstack in the Infra gates. + +There is currently one exception to the testable-in-Infra-gates rule, which is that +the Rackspace Task API for Glance does not work in devstack, so we cannot test +it. We have an exception for this because at the moment, `nodepool` must use +that API, and it is an API that exists in glance, even if the backing code +is broken. However, the general rule stands, and any violations of that rule +need to be carefully considered exceptions - and probably accompanied by a +large amount of complaining.
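The idempotency requirement that `ansible` brings to this section can be
sketched as follows. This is an illustration under stated assumptions, not
`shade`'s API: `InMemoryCloud`, `get_server`, `create_server` and
`ensure_server` are hypothetical names, and the cloud is a dictionary rather
than a real service.

```python
class InMemoryCloud:
    """Hypothetical stand-in for a cloud; stores servers in a dict."""

    def __init__(self):
        self._servers = {}

    def get_server(self, name):
        # Returns the server record, or None if it does not exist.
        return self._servers.get(name)

    def create_server(self, name, flavor):
        server = {"name": name, "flavor": flavor}
        self._servers[name] = server
        return server


def ensure_server(cloud, name, flavor):
    """Return (server, changed); create the server only if it is missing."""
    existing = cloud.get_server(name)
    if existing is not None:
        return existing, False
    return cloud.create_server(name, flavor), True


if __name__ == "__main__":
    cloud = InMemoryCloud()
    print(ensure_server(cloud, "web1", "small"))  # first call creates
    print(ensure_server(cloud, "web1", "small"))  # second call is a no-op
```

With the check-then-create logic in the library, a module built on it would
mostly map its arguments to `ensure_server` and report the `changed` flag.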
+ +Alternatives +------------ + +We could skip writing a library and write all of our logic directly in +nodepool. This is problematic because it causes a lot of really useful code +and logic to not be easily reusable by the community at large. + +We could write all of the logic directly in the ansible modules upstream and +then have nodepool turn into an engine which consumes the ansible modules. This +is more tempting, but ansible does not support long-lived objects, which means +that we'd be exec'ing ansible on every operation, which seems rather extreme. It +also means that people not using ansible would be unable to benefit from the +logic. + +We could improve the client libraries or `python-openstacksdk`. We've tried to +include richer logic in the client libraries and have been told it's not what +they are for. The `python-openstacksdk` is still young and we've been told it's +not ready for production use yet. We need some of the logic for `shade` now, +so the timescale for getting it done in `python-openstacksdk` isn't very +workable. + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee: + mordred + +Additional assignee(s): + Shrews + greghaynes + dguerri + TheJulia + Spamaps + +Gerrit Topic +------------ + +`shade` is a library itself, so there is no dedicated gerrit topic. + +Work Items +---------- + +* Implement Image uploading for nodepool +* Get to feature parity with nodepool on floating-ips and server creation +* Implement ansible modules for every function in shade + +Repositories +------------ + +openstack-infra/shade + +Servers +------- + +None + +DNS Entries +----------- + +None + +Documentation +------------- + +`shade` needs developer documentation of its API + +Security +-------- + +None + +Testing +------- + +`shade` should have both unit tests and functional tests. The functional +tests should run against devstack VMs.
If a developer chooses to, they should +be able to manually run functional tests against live clouds, since the purpose +of shade is to enable use of myriad clouds, not to support or expose +theoretical APIs. + +Dependencies +============ + +None