6c1442c385
Right now every controller rotates fernet keys. This is nice because
should any controller die, we know the remaining ones will rotate the
keys. However, we are currently over-rotating the keys.

When we over-rotate keys, we get logs like this:

    This is not a recognized Fernet token <token> TokenNotFound

Most clients can recover and get a new token, but some clients (like
Nova passing tokens to other services) can't do that because they don't
have the password to regenerate a new token.

With three controllers, in the crontab in keystone-fernet we see the
once-a-day rotation correctly staggered across the three controllers:

    ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
    0 0 * * * /usr/bin/fernet-rotate.sh
    ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
    0 8 * * * /usr/bin/fernet-rotate.sh
    ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
    0 16 * * * /usr/bin/fernet-rotate.sh

Currently with three controllers we have this keystone config:

    [token]
    expiration = 86400 (although, keystone default is one hour)
    allow_expired_window = 172800 (this is the keystone default)

    [fernet_tokens]
    max_active_keys = 4

Currently, kolla-ansible configures key rotation according to the
following:

    rotation_interval = token_expiration / num_hosts

This means we rotate keys more quickly the more hosts we have, which
doesn't make much sense.

Keystone docs state:

    max_active_keys =
        ((token_expiration + allow_expired_window) / rotation_interval) + 2

For details see:
https://docs.openstack.org/keystone/stein/admin/fernet-token-faq.html

Rotation is based on pushing out a staging key, so should any server
start using that key, other servers will consider that valid. Then each
server in turn starts using the staging key, each in turn demoting the
existing primary key to a secondary key. Eventually you prune the
secondary keys when there is no token in the wild that would need to be
decrypted using that key. So this all makes sense.

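To make the mismatch concrete, the Keystone formula can be applied to the three-controller values quoted above. This is only a sketch: the function name `required_active_keys` is hypothetical, and rounding the division up with `ceil()` is an assumption, since the formula as quoted does not say how to round.

```python
import math


def required_active_keys(token_expiration, allow_expired_window,
                         rotation_interval):
    """Apply the Keystone FAQ formula; all arguments are in seconds.

    Rounding up with ceil() is an assumption of this sketch.
    """
    lifetime = token_expiration + allow_expired_window
    return int(math.ceil(lifetime / float(rotation_interval))) + 2


token_expiration = 86400       # [token] expiration from the config above
allow_expired_window = 172800  # [token] allow_expired_window from above
num_hosts = 3

# Old kolla-ansible behaviour: rotation_interval = token_expiration / num_hosts
old_interval = token_expiration // num_hosts
needed = required_active_keys(token_expiration, allow_expired_window,
                              old_interval)
print(old_interval, needed)  # 28800 11
```

Under these assumptions the old schedule would call for 11 active keys, while the config above sets max_active_keys = 4, which is consistent with keys being pruned while tokens encrypted with them are still in use.
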
This change adds new variables for fernet_token_allow_expired_window and
fernet_key_rotation_interval, so that we can correctly calculate the
number of active keys. We now set the default rotation interval so as to
minimise the number of active keys to 3: one primary, one secondary, one
buffer.

This change also fixes the fernet cron job generator, which was broken
in the following cases:

* requesting an interval of more than 1 day resulted in no jobs
* requesting an interval of more than 60 minutes, unless an exact
  multiple of 60 minutes, resulted in no jobs

It should now be possible to request any interval up to a week divided
by the number of hosts.

Change-Id: I10c82dc5f83653beb60ddb86d558c5602153341a
Closes-Bug: #1809469
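
As a rough illustration of the round-robin staggering the cron generator is meant to produce, the sketch below handles only the simplest case, where each host's own period fits within one day. It is not the generator itself. The names `daily_schedule` and `crontab_lines` are hypothetical, and the crontab line layout is assumed from the `0 8 * * * /usr/bin/fernet-rotate.sh` entries quoted in the commit message.

```python
HOUR = 60
DAY = 24 * HOUR


def daily_schedule(host_index, total_hosts, interval_mins):
    """Return (hour, minute) rotation times within one day for one host.

    Simplified sketch: only per-host periods of up to one day are handled.
    """
    period = interval_mins * total_hosts  # each host rotates once per period
    if period > DAY:
        raise ValueError("sketch only covers per-host periods up to one day")
    times = []
    t = interval_mins * host_index  # offset that staggers the hosts
    while t + interval_mins <= DAY:
        times.append((t // HOUR, t % HOUR))
        t += period
    return times


def crontab_lines(times, command="/usr/bin/fernet-rotate.sh"):
    """Render (hour, minute) pairs as crontab lines (layout is assumed)."""
    return ["{} {} * * * {}".format(minute, hour, command)
            for hour, minute in times]


# Three hosts and a key rotation every 480 minutes (8 hours) overall.
for index in range(3):
    print(crontab_lines(daily_schedule(index, 3, 480)))
```

With three hosts and a 480-minute interval, this yields one entry per host at 00:00, 08:00 and 16:00, matching the staggered crontabs shown in the commit message.
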
125 lines
4.1 KiB
Python
#!/usr/bin/env python

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This module creates a list of cron intervals for a node in a group of nodes
# to ensure each node runs a cron in round robin style.

from __future__ import print_function
import argparse
import json
import sys

MINUTE_SPAN = 1
HOUR_SPAN = 60
DAY_SPAN = 24 * HOUR_SPAN
WEEK_SPAN = 7 * DAY_SPAN


class RotationIntervalTooLong(Exception):
    pass


def json_exit(msg=None, failed=False, changed=False):
    if type(msg) is not dict:
        msg = {'msg': str(msg)}
    msg.update({'failed': failed, 'changed': changed})
    print(json.dumps(msg))
    sys.exit()


def generate(host_index, total_hosts, total_rotation_mins):
    min = '*'  # 0-59
    hour = '*'  # 0-23
    day = '*'  # 0-6 (day of week)
    crons = []

    if host_index >= total_hosts:
        return crons

    # We need to rotate the key every total_rotation_mins minutes.
    # When there are N hosts, each host should rotate once every N *
    # total_rotation_mins minutes, in a round-robin manner.
    # We can generate a cycle for index 0, then add an offset specific to each
    # host.
    # NOTE: Minor under-rotation is better than over-rotation since tokens
    # may become invalid if keys are over-rotated.
    host_rotation_mins = total_rotation_mins * total_hosts
    host_rotation_offset = total_rotation_mins * host_index

    # Can't currently rotate less than once per week.
    if total_rotation_mins > WEEK_SPAN:
        msg = ("Unable to schedule fernet key rotation with an interval "
               "greater than 1 week divided by the number of hosts")
        raise RotationIntervalTooLong(msg)

    # Build crons for multiples of a day
    elif host_rotation_mins > DAY_SPAN:
        time = host_rotation_offset
        while time + total_rotation_mins <= WEEK_SPAN:
            day = time // DAY_SPAN
            # Hour within the day (0-23); `time % HOUR_SPAN` would repeat
            # the minute value here.
            hour = (time % DAY_SPAN) // HOUR_SPAN
            min = time % HOUR_SPAN
            crons.append({'min': min, 'hour': hour, 'day': day})

            time += host_rotation_mins

    # Build crons for multiples of an hour
    elif host_rotation_mins > HOUR_SPAN:
        time = host_rotation_offset
        while time + total_rotation_mins <= DAY_SPAN:
            hour = time // HOUR_SPAN
            min = time % HOUR_SPAN
            crons.append({'min': min, 'hour': hour, 'day': day})

            time += host_rotation_mins

    # Build crons for multiples of a minute
    else:
        time = host_rotation_offset
        while time + total_rotation_mins <= HOUR_SPAN:
            min = time // MINUTE_SPAN
            crons.append({'min': min, 'hour': hour, 'day': day})

            time += host_rotation_mins

    return crons


def main():
    parser = argparse.ArgumentParser(description='''Creates a list of cron
        intervals for a node in a group of nodes to ensure each node runs
        a cron in round robin style.''')
    parser.add_argument('-t', '--time',
                        help='Time in minutes for a key rotation cycle',
                        required=True,
                        type=int)
    parser.add_argument('-i', '--index',
                        help='Index of host starting from 0',
                        required=True,
                        type=int)
    parser.add_argument('-n', '--number',
                        help='Number of hosts',
                        required=True,
                        type=int)
    args = parser.parse_args()
    try:
        jobs = generate(args.index, args.number, args.time)
    except Exception as e:
        json_exit(str(e), failed=True)
    json_exit({'cron_jobs': jobs})


if __name__ == "__main__":
    main()