Spec for new feature, secure-oslo-messaging.rpc messages
This blueprint defines the new feature to secure the oslo messaging messages with symmetric keys.

Change-Id: If0146f08b3c5ad49a277963fcc685f5192d92edb
Partially-Implements: secure-oslo-messaging-messages

specs/ocata/secure-oslo-messaging-messages.rst

..
    This work is licensed under a Creative Commons Attribution 3.0 Unported
    License.

    http://creativecommons.org/licenses/by/3.0/legalcode

    Sections of this template were taken directly from the Nova spec
    template at:
    https://github.com/openstack/nova-specs/blob/master/specs/juno-template.rst

=================================
Secure oslo.messaging.rpc message
=================================

.. sectnum::
.. contents::

Trove uses oslo_messaging.rpc to perform RPC calls, and the transport
underlying this is oslo_messaging. Messages sent on oslo.messaging are
currently treated as genuine by the recipient. There is a benefit to
adding a layer of validation that ensures that the RPC calls are in
fact genuine. We propose that the RPC calls be encrypted with unique
keys.

Launchpad Blueprint:
https://blueprints.launchpad.net/trove/+spec/secure-oslo-messaging-messages

Problem Description
===================

Messages sent on oslo.messaging are currently treated as genuine by
the recipient. Because the names of the topics used are predictable, a
person with sufficient knowledge of Trove could, for example,
compromise a guest instance or otherwise obtain credentials to connect
to RabbitMQ (or whichever transport underlies oslo.messaging) and then
send messages to, for example, the task manager while impersonating
the API service. Safeguards are already in place to contain the scope
of such an attack, such as requiring that the message contain a valid
keystone token with the appropriate access, but this is still a point
of vulnerability.

Currently, when a client wishes to make an asynchronous RPC (cast()),
the method name and parameters are marshalled and sent down to
oslo_messaging.rpc. It is the responsibility of oslo_messaging.rpc to
transmit the information to the remote side, and then find and invoke
the method specified. After the cast() is invoked on the client side,
the next thing that is seen by the consumer of oslo_messaging.rpc is
an invocation of the desired method on the server side.

The same thing happens for a synchronous RPC (call()), with the
additional steps of the client blocking, the server completing the
operation and sending a response to the client, and the client
receiving that response and unblocking.

Proposed Change
===============

After experimenting with several alternative approaches, we propose to
implement custom serializers (and deserializers) which can be provided
to oslo_messaging.rpc.

All messages sent, and all return values of an RPC call(), will be
serialized through these custom methods, which will encrypt the
content. Because of the bug ``Failure to use serializer in exception``
(see References), an exception thrown by an RPC function does not pass
through the serializer and is therefore not encrypted.

How does TroveRPCDispatcher verify legitimacy of a message
----------------------------------------------------------

The proposed implementation relies on cryptography and unique keys for
the control plane and the guests. We propose to use symmetric keys for
the purpose of encryption.

Trove has the following entities who are party to RPC invocations:

- Trove API Service (client)
- Trove Taskmanager Service (client and server)
- Trove Conductor Service (server)
- Trove Guestagent (client and server)

When an RPC call() or cast() is made, the client invokes the
serializer which will encrypt all arguments. When received on the
server side, oslo_messaging.rpc will invoke the deserializer which
will decrypt the arguments.

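The serializer round trip described above can be sketched as follows.
This is a minimal, standard-library-only illustration and not the
proposed implementation. The class mirrors the ``serialize_entity`` and
``deserialize_entity`` hooks of ``oslo_messaging.Serializer`` without
subclassing it; the names ``EncryptingSerializer``, ``encrypt`` and
``decrypt`` are hypothetical; and the cipher is a stand-in built from
HMAC-SHA256 so that the example runs without OpenSSL bindings. A real
deployment would use a vetted cipher such as the AES-CBC that the
configuration options refer to.

.. code-block:: python

    import base64
    import hashlib
    import hmac
    import json
    import os


    def _derive(key, label):
        """Derive an independent sub-key for encryption or authentication."""
        return hmac.new(key.encode("utf-8"), label, hashlib.sha256).digest()


    def _keystream(enc_key, nonce, length):
        """HMAC-SHA256 in counter mode; a stand-in for the real cipher."""
        blocks = []
        for counter in range((length + 31) // 32):
            blocks.append(hmac.new(enc_key, nonce + counter.to_bytes(8, "big"),
                                   hashlib.sha256).digest())
        return b"".join(blocks)[:length]


    def encrypt(key, plaintext):
        """Encrypt, then authenticate; returns ASCII text for the wire."""
        nonce = os.urandom(16)
        stream = _keystream(_derive(key, b"enc"), nonce, len(plaintext))
        body = bytes(p ^ s for p, s in zip(plaintext, stream))
        tag = hmac.new(_derive(key, b"mac"), nonce + body,
                       hashlib.sha256).digest()
        return base64.b64encode(nonce + tag + body).decode("ascii")


    def decrypt(key, token):
        """Verify the tag first, then decrypt; ValueError if the tag fails."""
        raw = base64.b64decode(token)
        nonce, tag, body = raw[:16], raw[16:48], raw[48:]
        expected = hmac.new(_derive(key, b"mac"), nonce + body,
                            hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("message was not produced with this key")
        stream = _keystream(_derive(key, b"enc"), nonce, len(body))
        return bytes(c ^ s for c, s in zip(body, stream))


    class EncryptingSerializer(object):
        """Hypothetical serializer: encrypts on the client, decrypts on the server."""

        def __init__(self, key):
            self._key = key

        def serialize_entity(self, ctxt, entity):
            # Called for each argument (and for the return value of call()).
            return encrypt(self._key, json.dumps(entity).encode("utf-8"))

        def deserialize_entity(self, ctxt, entity):
            # Fails with ValueError unless the sender used the same key.
            return json.loads(decrypt(self._key, entity).decode("utf-8"))

In this sketch both sides construct the serializer with the same
symmetric key. A receiver holding a different key gets a ``ValueError``
instead of the arguments, which is the property the rest of this
section relies on.
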
It is assumed that the control plane is secure and the control plane
symmetric key is secure. If it is compromised, then all bets are off.

In communication with the guest agent, each guest has a unique
symmetric key that is generated by the control plane and passed to the
guest at launch.

Securing the response
---------------------

As described earlier, a response to a call() method will be secured in
the same way as the request. As observed earlier, due to a bug, an
exception thrown by an RPC function is not (currently) being
serialized and will therefore be returned unencrypted. When (if) that
bug is fixed in oslo_messaging.rpc, this exposure will be minimized.

Control plane key
-----------------

The control plane key is constructed at system initialization
time. The key is stored on the control plane (in the configuration
file).

If the control plane consists of multiple machines, then the control
plane services on all machines must have access to the control plane
key.

Getting keys to the guest instance
----------------------------------

On instance launch, the guest key is created and passed to the guest
as an injected file. We assume that the mechanism for file injection
is secure in that it cannot be intercepted and compromised by a bad
actor.

A unique key is created for each instance.

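Per-instance key creation could look like the sketch below. It is an
illustration under stated assumptions, not the proposed code: the
function names are hypothetical, and the 36-character alphanumeric
format is only inferred from the sample defaults shown under
Configuration.

.. code-block:: python

    import secrets
    import string

    # Assumption: same alphabet and length as the sample keys in this spec.
    KEY_ALPHABET = string.ascii_letters + string.digits
    KEY_LENGTH = 36


    def generate_instance_key(length=KEY_LENGTH):
        """Return a fresh random key from a cryptographically secure source."""
        return "".join(secrets.choice(KEY_ALPHABET) for _ in range(length))


    def create_instance_keys(instance_ids):
        """Create one independent key per instance id."""
        return {instance_id: generate_instance_key()
                for instance_id in instance_ids}

Because every key is drawn independently, learning one guest's key
reveals nothing about the key of any other guest or of the control
plane.
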
Why is this secure?
-------------------

We make two assumptions above; these are:

(a) The control plane is secure, the control plane key is not
    compromised, and
(b) The transmission of the guest key to the guest is secure and is
    not compromised.

These are meaningful and reasonable assumptions to make given the
architecture of an OpenStack system.

Should a guest be compromised, the bad actor can connect to the
underlying transport (say Rabbit), but apart from traffic for that one
guest, all they will be able to see are encrypted messages that they
cannot decrypt, because they hold neither the control plane key nor
the key of any other guest.

Configuration
-------------

The control plane key is stored on the control plane in a secure way
and there are configuration options to tell each service where to find
it.

Each guest instance will have a key that is stored securely on the
instance, and a configuration setting will tell the guestagent where
to find it.

.. code-block:: python

    from oslo_config import cfg

    # Fragment of the option list; each entry is one new setting.
    cfg.StrOpt('tm_rpc_encr_key',
               default='bzH6y0SGmjuoY0FNSTptrhgieGXNDX6PIhvz',
               help='OpenSSL aes_cbc key for taskmanager RPC encryption.'),
    cfg.StrOpt('inst_rpc_key_encr_key',
               default='emYjgHFqfXNB1NGehAFIUeoyw4V4XwWHEaKP',
               help='OpenSSL aes_cbc key to encrypt instance keys in DB.'),
    cfg.StrOpt('instance_rpc_encr_key',
               help='OpenSSL aes_cbc key for instance RPC encryption.'),

Database
--------

The guest key for each guest instance will be stored in the
database. A table instance_keys is proposed for this.

+---------------+--------------+------+-----+---------+-------+
| Field         | Type         | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| id            | varchar(64)  | NO   | PRI | NULL    |       |
| instance_id   | varchar(64)  | NO   | UNI | NULL    |       |
| encrypted_key | varchar(255) | NO   |     | NULL    |       |
| created       | datetime     | NO   |     | NULL    |       |
| updated       | datetime     | NO   |     | NULL    |       |
| deleted       | tinyint(1)   | NO   |     | NULL    |       |
| deleted_at    | datetime     | YES  |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+

The guest instance keys are encrypted and stored in the encrypted_key
column. A foreign key constraint links instance_id with
instances.id. A unique constraint on instance_id is placed on this
table.

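The two constraints can be exercised with the following sketch. It
uses an in-memory SQLite database purely so that the example is
self-contained; that choice, the reduced ``instances`` table, and the
helper names ``open_db`` and ``add_key`` are assumptions made for
illustration and are not the migration this spec proposes.

.. code-block:: python

    import sqlite3

    SCHEMA = """
    CREATE TABLE instances (id VARCHAR(64) PRIMARY KEY);
    CREATE TABLE instance_keys (
        id            VARCHAR(64)  NOT NULL PRIMARY KEY,
        instance_id   VARCHAR(64)  NOT NULL UNIQUE REFERENCES instances (id),
        encrypted_key VARCHAR(255) NOT NULL,
        created       DATETIME     NOT NULL,
        updated       DATETIME     NOT NULL,
        deleted       TINYINT(1)   NOT NULL,
        deleted_at    DATETIME
    );
    """


    def open_db():
        """Create an in-memory database with both tables."""
        conn = sqlite3.connect(":memory:")
        # SQLite only enforces foreign keys when this pragma is on.
        conn.execute("PRAGMA foreign_keys = ON")
        conn.executescript(SCHEMA)
        return conn


    def add_key(conn, key_id, instance_id, encrypted_key):
        """Store the (already encrypted) key for one instance."""
        conn.execute(
            "INSERT INTO instance_keys"
            " (id, instance_id, encrypted_key, created, updated, deleted)"
            " VALUES (?, ?, ?, datetime('now'), datetime('now'), 0)",
            (key_id, instance_id, encrypted_key))

With this schema, a second key for the same instance, or a key for an
instance that does not exist, is rejected with
``sqlite3.IntegrityError``.
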
Public API
----------

No changes to the public API.

Public API Security
-------------------

No changes.

Python API
----------

No changes.

CLI (python-troveclient)
------------------------

No changes.

Internal API
------------

The internal API (from the perspective of developers, and invocations)
will remain unaffected by this change as the implementation seeks to
work below the Trove code entirely. The messages on the wire, however,
will be radically different, and code must be in place to ensure that
encrypting and non-encrypting clients and servers know how to
interoperate.

Guest Agent
-----------

The guestagent will receive its key as a part of the configdrive/boot
process and can use it to decrypt all messages addressed to it.

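On the guest side, picking up the injected key might look like the
sketch below. The spec does not define the injected file's name or
layout, so the INI format, the ``[DEFAULT]`` section and the function
names here are assumptions; only the option name
``instance_rpc_encr_key`` comes from the Configuration section.

.. code-block:: python

    import configparser


    def parse_instance_key(text):
        """Return instance_rpc_encr_key from the text of an injected file."""
        # Interpolation is disabled so a '%' in a key is taken literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(text)
        key = parser.get("DEFAULT", "instance_rpc_encr_key", fallback=None)
        if not key:
            raise RuntimeError("no instance_rpc_encr_key was injected")
        return key


    def load_instance_key(path):
        """Read the injected file at a (hypothetical) path and parse it."""
        with open(path) as handle:
            return parse_instance_key(handle.read())

Failing loudly when the key is absent keeps a guest from silently
running without encryption in this sketch; how a real guestagent
should behave in that case is a question for the upgrade handling.
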
Alternatives
------------

Several alternatives were considered, prototyped, and abandoned. A
short summary of each is provided below.

(a) We proposed to the oslo_messaging.rpc team to implement a
    lightweight message signing and encryption mechanism in their code
    by providing a mechanism of callbacks which would allow the
    consumer (trove) to perform the signing and encryption. The
    oslo_messaging team did not want to go this route as they felt
    that the message included other private data structures which we
    (the consumer) could modify and cause unexpected behavior.
(b) We proposed that oslo_messaging.rpc allow consumers to provide a
    custom dispatcher for messages on the receiver side. With this
    implementation, a signature or message encryption could be
    performed on the client side and intercepted on the server side
    and reversed, allowing us to have minimal changes on the server
    side. Again, the oslo_messaging.rpc team felt that the dispatcher
    was a private data structure and they did not feel that we should
    be encapsulating it.
(c) We prototyped and experimented with a change where each RPC
    endpoint would be decorated and the decorator would provide a
    mechanism to construct the proper parameters and the invocation to
    the RPC method. The client side change would be identical to (b)
    but the server side change would involve a change to every RPC
    method to add the decorator. In addition, the call context would
    not be encrypted in this approach and it was abandoned.
(d) We were advised that we should NOT be using oslo_messaging.rpc the
    way we are using it as it was only intended for use on the control
    plane, and that we should instead make the guest an RPC
    server. Unfortunately, that is not what we need. In Trove, the
    guest agent is an extension of the control plane and not well
    suited to a REST based communication strategy. What we need is an
    RPC mechanism, and it is sad that oslo_messaging.rpc can't seem to
    provide a secure one.

Dashboard Impact (UX)
=====================

None.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  amrith

Dashboard assignee:
  none

Milestones
----------

Ocata-1

Work Items
----------

- Implement code on control plane and guest
- Implement changes to devstack plugin to create control plane key
- Implement unit tests
- Implement upgrade handling
- Update documentation

Upgrade Implications
====================

Minimal upgrade implications are anticipated; code is proposed that
handles this transition.

1. The control plane key will be generated and persisted on all
   control plane nodes.
2. When guests are upgraded, a key will be sent to them as part of the
   nova migrate process.

The APIs will be revved one major version to account for this.

Dependencies
============

There is an assumed dependency on the RPC API versioning which has now
merged.

Testing
=======

Yes, we will need tests; unit tests are listed under Work Items.

Documentation Impact
====================

Documentation will need updating as well; details to follow.

References
==========

``Failure to use serializer in exception``: https://bugs.launchpad.net/oslo.messaging/+bug/1648254

Appendix
========

Any additional technical information and data.