Spec for new feature, secure-oslo-messaging.rpc messages

This blueprint defines the new feature to secure the oslo messaging
messages with symmetric keys.

Change-Id: If0146f08b3c5ad49a277963fcc685f5192d92edb
Partially-Implements: secure-oslo-messaging-messages
Amrith Kumar 2016-10-30 04:56:53 -04:00 committed by Amrith Kumar
parent 806f8be82d
commit b2b2c59989

..
    This work is licensed under a Creative Commons Attribution 3.0 Unported
    License.

    http://creativecommons.org/licenses/by/3.0/legalcode

    Sections of this template were taken directly from the Nova spec
    template at:
    https://github.com/openstack/nova-specs/blob/master/specs/juno-template.rst

=================================
Secure oslo.messaging.rpc message
=================================

.. sectnum::
.. contents::

Trove utilizes oslo_messaging.rpc to perform RPC calls, and the
transport underlying this is oslo_messaging. Messages sent on
oslo.messaging are currently treated as genuine. There is a benefit to
adding a layer of validation that will ensure that the RPC calls are
in fact genuine. We propose that the RPC calls be encrypted with
unique keys.

Launchpad Blueprint:
https://blueprints.launchpad.net/trove/+spec/secure-oslo-messaging-messages

Problem Description
===================

Messages sent on oslo.messaging are currently treated as genuine by
the recipient. Because the names of the topics used are predictable,
a person with sufficient knowledge of Trove could, for example,
compromise a guest instance or otherwise obtain credentials to connect
to RabbitMQ (or whichever transport underlies oslo.messaging) and then
send messages to, for example, the task manager while impersonating
the API service. Safeguards already exist to contain the scope of such
an attack, such as requiring that the message contain a valid keystone
token with the appropriate access, but this is still a point of
vulnerability.

Currently, when a client wishes to make an asynchronous RPC (cast()),
the method name and parameters are marshalled and sent down to
oslo_messaging.rpc. It is the responsibility of oslo_messaging.rpc to
transmit the information to the remote side, and then find and invoke
the method specified. After the cast() is invoked on the client side,
the next thing that is seen by the consumer of oslo_messaging.rpc is
an invocation of the desired method on the server side.

The same thing happens for a synchronous RPC (call()) with the
additional step of the client blocking, the server completing the
operation and sending a response to the client, and the client
receiving that and unblocking.

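
The difference between the two invocation styles can be illustrated
with a toy, in-process model. This is a sketch only: ``ToyRpcServer``
and ``Counter`` are hypothetical names, a ``queue.Queue`` stands in
for the message transport, and none of this is oslo.messaging code.

.. code-block:: python

    import queue
    import threading
    from concurrent.futures import Future


    class Counter:
        """Hypothetical endpoint whose methods are invoked remotely."""

        def __init__(self):
            self.value = 0

        def increment(self, by):
            self.value += by

        def read(self):
            return self.value


    class ToyRpcServer:
        """Single worker thread that dispatches queued requests in order."""

        def __init__(self, endpoint):
            self._endpoint = endpoint
            self._inbox = queue.Queue()  # stand-in for the transport
            threading.Thread(target=self._serve, daemon=True).start()

        def _serve(self):
            while True:
                method, kwargs, reply = self._inbox.get()
                try:
                    result = getattr(self._endpoint, method)(**kwargs)
                    if reply is not None:
                        reply.set_result(result)
                except Exception as exc:
                    # Only call() has somewhere to deliver a failure.
                    if reply is not None:
                        reply.set_exception(exc)

        def cast(self, method, **kwargs):
            """Asynchronous: enqueue the request and return at once."""
            self._inbox.put((method, kwargs, None))

        def call(self, method, timeout=5, **kwargs):
            """Synchronous: enqueue the request and block for the reply."""
            reply = Future()
            self._inbox.put((method, kwargs, reply))
            return reply.result(timeout=timeout)


    if __name__ == "__main__":
        server = ToyRpcServer(Counter())
        server.cast("increment", by=2)  # no result comes back
        print(server.call("read"))      # blocks until the server answers

In this model the only client-visible difference is whether a reply
object travels with the request, which is also why only call() has a
response that needs to be secured.
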

Proposed Change
===============

After experimenting with several alternative approaches, we
propose to implement custom serializers (and deserializers) which can
be provided to oslo_messaging.rpc.

All messages sent, and the return values of RPC call(), will be
serialized through these custom methods, which will encrypt the
content. Due to a bug (``Failure to use serializer in exception``,
see References), if an RPC function raises an exception, the
exception is not encrypted.


How does TroveRPCDispatcher verify legitimacy of a message
----------------------------------------------------------

The proposed implementation relies on cryptography and unique keys for
the control plane and the guests. We propose to use symmetric keys for
the purpose of encryption.

Trove has the following entities that are party to RPC invocations:

- Trove API Service (client)
- Trove Taskmanager Service (client and server)
- Trove Conductor Service (server)
- Trove Guestagent (client and server)

When an RPC call() or cast() is made, the client invokes the
serializer which will encrypt all arguments. When received on the
server side, oslo_messaging.rpc will invoke the deserializer which
will decrypt the arguments.
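
The round trip described above can be sketched as follows. The class
below only mirrors the ``serialize_entity`` and ``deserialize_entity``
method names of the oslo.messaging serializer interface; it does not
import the library, and ``EncryptingSerializer``, ``toy_encrypt`` and
``toy_decrypt`` are hypothetical names. The cipher is a deliberately
simple standard-library stand-in (a SHA-256 keystream with an HMAC
tag) so that the sketch runs anywhere; it is not the AES-CBC scheme
named in the configuration options and must not be used to protect
real traffic.

.. code-block:: python

    import hashlib
    import hmac
    import json
    import secrets


    def _subkeys(key: bytes):
        """Derive separate encryption and MAC keys from one shared key."""
        return (hashlib.sha256(b"enc" + key).digest(),
                hashlib.sha256(b"mac" + key).digest())


    def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
        """Toy keystream: SHA-256 over key, nonce and a block counter."""
        out = b""
        counter = 0
        while len(out) < length:
            block = key + nonce + counter.to_bytes(8, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]


    def toy_encrypt(key: bytes, plaintext: bytes) -> str:
        """Stand-in cipher (NOT AES-CBC); returns nonce+tag+body as hex."""
        enc_key, mac_key = _subkeys(key)
        nonce = secrets.token_bytes(16)
        stream = _keystream(enc_key, nonce, len(plaintext))
        body = bytes(a ^ b for a, b in zip(plaintext, stream))
        tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
        return (nonce + tag + body).hex()


    def toy_decrypt(key: bytes, token: str) -> bytes:
        """Reverse toy_encrypt; reject anything made with another key."""
        enc_key, mac_key = _subkeys(key)
        raw = bytes.fromhex(token)
        nonce, tag, body = raw[:16], raw[16:48], raw[48:]
        expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("message was not produced with this key")
        stream = _keystream(enc_key, nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream))


    class EncryptingSerializer:
        """Hypothetical serializer that encrypts each RPC argument."""

        def __init__(self, key: str):
            self._key = key.encode("utf-8")

        def serialize_entity(self, ctxt, entity):
            # Client side: JSON-encode the value, then encrypt it.
            payload = json.dumps(entity).encode("utf-8")
            return toy_encrypt(self._key, payload)

        def deserialize_entity(self, ctxt, entity):
            # Server side: decrypt, then JSON-decode the value.
            return json.loads(toy_decrypt(self._key, entity).decode("utf-8"))

A message serialized with one key deserializes only on a peer holding
the same key. A peer configured with a different key gets a
``ValueError`` instead of a method invocation, which is the property
the proposal relies on to tell genuine messages from forged ones.
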

It is assumed that the control plane is secure and the control plane
symmetric key is secure. If that key is compromised, none of the
protections described here hold.

In communication with the guest agent, each guest has a unique
symmetric key that is generated by the control plane and passed to the
guest at launch.


Securing the response
---------------------

As described earlier, a response to a call() method will be secured in
the same way as the request. As observed earlier, due to the bug
``Failure to use serializer in exception``, an exception raised by an
RPC function is not (currently) being serialized and will therefore be
returned unencrypted. If and when that bug is fixed in
oslo_messaging.rpc, this exposure will be removed.


Control plane key
-----------------

The control plane key is constructed at system initialization
time. The key is stored on the control plane (in the configuration
file).

If the control plane consists of multiple machines, then the control
plane services on all machines must have access to the control plane
key.


Getting keys to the guest instance
----------------------------------

On instance launch, the guest key is created and passed to the guest
as an injected file. We assume that the mechanism for file injection
is secure in that it cannot be intercepted and compromised by a bad
actor.

A unique key is created for each instance.

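
One way to create such a per-instance key is sketched below.
``generate_instance_key`` is a hypothetical helper; the 36-character
alphanumeric format is an assumption modeled on the sample default
values shown under Configuration, not a format mandated by this spec.

.. code-block:: python

    import secrets
    import string

    # Assumption: same alphabet and length as the sample defaults.
    KEY_ALPHABET = string.ascii_letters + string.digits
    KEY_LENGTH = 36


    def generate_instance_key(length: int = KEY_LENGTH) -> str:
        """Return a fresh random key drawn from a CSPRNG."""
        return "".join(secrets.choice(KEY_ALPHABET) for _ in range(length))


    if __name__ == "__main__":
        # Each instance gets its own, independently generated key.
        keys = {name: generate_instance_key()
                for name in ("instance-a", "instance-b")}
        for name, key in keys.items():
            print(name, len(key))

The ``secrets`` module is used rather than ``random`` because the key
must be unpredictable to anyone who can observe the transport.
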

Why is this secure?
-------------------

We make two assumptions above; these are:

(a) The control plane is secure, the control plane key is not
    compromised, and

(b) The transmission of the guest key to the guest is secure and is
    not compromised.

These are meaningful and reasonable assumptions to make given the
architecture of an OpenStack system.

Should a guest be compromised, the bad actor can connect to the
underlying transport (say Rabbit), but the only key they hold is that
guest's own. All other traffic they can see consists of encrypted
messages that they cannot decrypt.


Configuration
-------------

The control plane key is stored on the control plane in a secure way,
and there are configuration options to tell each service where to find
it.

Each guest instance will have a key that is stored securely on the
instance, and a configuration setting will tell the guestagent where
to find it.

.. code-block:: python

    from oslo_config import cfg

    # Fragment of the option list; the sample defaults must be
    # replaced in any real deployment.
    cfg.StrOpt('tm_rpc_encr_key',
               default='bzH6y0SGmjuoY0FNSTptrhgieGXNDX6PIhvz',
               help='OpenSSL aes_cbc key for taskmanager RPC encryption.'),
    cfg.StrOpt('inst_rpc_key_encr_key',
               default='emYjgHFqfXNB1NGehAFIUeoyw4V4XwWHEaKP',
               help='OpenSSL aes_cbc key to encrypt instance keys in DB.'),
    # No default: each guest receives its own key at launch.
    cfg.StrOpt('instance_rpc_encr_key',
               help='OpenSSL aes_cbc key for instance RPC encryption.'),

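
Database
--------

The guest key for each guest instance will be stored in the
database. A table instance_keys is proposed for this.

+---------------+--------------+------+-----+---------+-------+
| Field         | Type         | Null | Key | Default | Extra |
+===============+==============+======+=====+=========+=======+
| id            | varchar(64)  | NO   | PRI | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| instance_id   | varchar(64)  | NO   | UNI | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| encrypted_key | varchar(255) | NO   |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| created       | datetime     | NO   |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| updated       | datetime     | NO   |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| deleted       | tinyint(1)   | NO   |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
| deleted_at    | datetime     | YES  |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
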
The guest instance keys are encrypted and stored in the encrypted_key
column. A foreign key constraint links instance_id with
instances.id. A unique constraint on instance_id is placed on this
table.
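
The constraints described above can be tried out with a small
sketch. It uses an in-memory SQLite database from the Python standard
library purely for illustration (Trove's own database and migration
tooling are not shown), and ``open_db`` and ``store_key`` are
hypothetical helpers.

.. code-block:: python

    import sqlite3
    import uuid
    from datetime import datetime, timezone

    # SQLite rendering of the proposed table; a one-column ``instances``
    # table stands in for the real one so the foreign key has a target.
    SCHEMA = """
    CREATE TABLE instances (id VARCHAR(64) PRIMARY KEY);
    CREATE TABLE instance_keys (
        id            VARCHAR(64)  NOT NULL PRIMARY KEY,
        instance_id   VARCHAR(64)  NOT NULL UNIQUE REFERENCES instances(id),
        encrypted_key VARCHAR(255) NOT NULL,
        created       DATETIME     NOT NULL,
        updated       DATETIME     NOT NULL,
        deleted       TINYINT(1)   NOT NULL,
        deleted_at    DATETIME
    );
    """


    def open_db() -> sqlite3.Connection:
        """Create an in-memory database holding the two tables."""
        conn = sqlite3.connect(":memory:")
        conn.execute("PRAGMA foreign_keys = ON")
        conn.executescript(SCHEMA)
        return conn


    def store_key(conn, instance_id: str, encrypted_key: str) -> None:
        """Insert the (already encrypted) key for one instance."""
        now = datetime.now(timezone.utc).isoformat()
        conn.execute(
            "INSERT INTO instance_keys VALUES (?, ?, ?, ?, ?, 0, NULL)",
            (str(uuid.uuid4()), instance_id, encrypted_key, now, now),
        )


    if __name__ == "__main__":
        conn = open_db()
        conn.execute("INSERT INTO instances VALUES ('inst-1')")
        store_key(conn, "inst-1", "ciphertext-a")
        try:
            store_key(conn, "inst-1", "ciphertext-b")
        except sqlite3.IntegrityError as exc:
            print("second key rejected:", exc)

The unique constraint is what guarantees that an instance never has
two competing keys on record.
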

Public API
----------

No changes to the public API.

Public API Security
-------------------

No changes.

Python API
----------

No changes.

CLI (python-troveclient)
------------------------

No changes.


Internal API
------------

The internal API (from the perspective of developers, and invocations)
will remain unaffected by this change as the implementation seeks to
work below the Trove code entirely. The messages on the wire, however,
will be radically different, and code must be in place to ensure that
encrypting and non-encrypting clients and servers know how to
interoperate.


Guest Agent
-----------

The guestagent will receive its key as a part of the configdrive/boot
process and can use it to decrypt all messages.

Alternatives
------------

Several alternatives were considered, prototyped, and abandoned. A
short summary of each is provided below.

(a) We proposed to the oslo_messaging.rpc team that they implement a
    lightweight message signing and encryption mechanism in their code
    by providing callbacks which would allow the consumer (Trove) to
    perform the signing and encryption. The oslo_messaging team did
    not want to go this route as they felt that the message included
    other private data structures which we (the consumer) could modify
    and cause unexpected behavior.

(b) We proposed that oslo_messaging.rpc allow consumers to provide a
    custom dispatcher for messages on the receiver side. With this
    implementation, a signature or message encryption could be
    performed on the client side and intercepted and reversed on the
    server side, allowing us to have minimal changes on the server
    side. Again, the oslo_messaging.rpc team felt that the dispatcher
    was a private data structure and they did not feel that we should
    be encapsulating it.

(c) We prototyped and experimented with a change where each RPC
    endpoint would be decorated and the decorator would provide a
    mechanism to construct the proper parameters and the invocation of
    the RPC method. The client side change would be identical to (b),
    but the server side change would involve a change to every RPC
    method to add the decorator. In addition, the call context would
    not be encrypted in this approach, and it was abandoned.

(d) We were advised that we should NOT be using oslo_messaging.rpc the
    way we are using it, as it was only intended for use on the
    control plane, and that we should instead make the guest an RPC
    server. Unfortunately, that is not what we need. In Trove, the
    guest agent is an extension of the control plane and not well
    suited to a REST-based communication strategy. What we need is an
    RPC mechanism, and it is unfortunate that oslo_messaging.rpc does
    not seem to provide a secure one.


Dashboard Impact (UX)
=====================

None.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  amrith

Dashboard assignee:
  none

Milestones
----------

Ocata-1

Work Items
----------

- Implement code on control plane and guest
- Implement changes to devstack plugin to create control plane key
- Implement unit tests
- Implement upgrade handling
- Update documentation


Upgrade Implications
====================

Minimal upgrade implications are anticipated; code is proposed that
handles this transition.

1. The control plane key will be generated and persisted on all
   control plane nodes.

2. When guests are upgraded, a key will be sent to them as part of the
   nova migrate process.

The APIs will be revved one major version to account for this.


Dependencies
============

There is an assumed dependency on the RPC API versioning, which has now
merged.

Testing
=======

Unit tests will be needed (see Work Items); details to follow.

Documentation Impact
====================

Documentation will need to be updated (see Work Items); details to
follow.


References
==========

``Failure to use serializer in exception``:
https://bugs.launchpad.net/oslo.messaging/+bug/1648254

Appendix
========

Any additional technical information and data.