Pulling from the etherpad at the Denver 2019 PTG and inserting into the template format. Much more detail can be fleshed out, but this gives a framework to start from. Change-Id: If15093cb80f1230f3a626253676c88d162cbb2b2 Story: 2005751 Task: 33433
Auto-scale Compute to Balance Resource Usage
- As a deployer and operator of OpenStack I want to be able to configure highly available autoscaling services with Free Open Source Software.
- As an operator of OpenStack I want to be able to add additional compute nodes to my cluster from a pool of available bare metal inventory automatically in response to resource consumption within my cloud.
- As an operator of OpenStack I want to be able to remove compute nodes from my cluster and return them to the pool of available bare metal inventory nodes in response to an excess quantity of compute resource availability within my cloud.
- As an app deployer I want to automatically scale-in one app to free up physical infra to scale-out another app which needs the resources more. More generally, I want to scale various apps in/out/up/down based on load/priority/custom policy, subject to some global resource constraints.
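The scale-out and scale-in stories above can be read as a threshold rule over cluster utilization. The following is a minimal sketch under stated assumptions, not an implementation of any OpenStack service: the `ClusterState` fields, the 80%/40% thresholds, and the `min_nodes` floor are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of the cloud; field names are illustrative."""
    used_vcpus: int
    total_vcpus: int
    spare_bare_metal_nodes: int  # nodes waiting in the bare metal pool
    compute_nodes: int           # nodes currently enrolled as compute


def scaling_decision(state, scale_out_above=0.80, scale_in_below=0.40,
                     min_nodes=1):
    """Return "scale_out", "scale_in", or "hold".

    The thresholds are assumptions. The gap between them acts as a
    hysteresis band so the cluster does not flap between adding and
    removing nodes.
    """
    if state.total_vcpus <= 0:
        # No capacity data; do nothing rather than divide by zero.
        return "hold"
    utilization = state.used_vcpus / state.total_vcpus
    if utilization > scale_out_above and state.spare_bare_metal_nodes > 0:
        return "scale_out"   # enroll a node from the bare metal pool
    if utilization < scale_in_below and state.compute_nodes > min_nodes:
        return "scale_in"    # return a node to the bare metal pool
    return "hold"
```

A real deployment would likely look at more than vCPU utilization (memory, disk, pending instance requests) and would need to drain workloads from a node before returning it to the pool.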
Problem description
Global constraints: As an app deployer I want to automatically scale-in one app to free up physical infra to scale-out another app which needs the resources more.
More generally, I want to scale various apps in/out/up/down based on load/priority/custom policy, subject to some global resource constraints.
- Sort of like pre-emptible resources or something like that?
- Yes, but maybe more dynamic and more levels of priority. One workload may be high priority in one load condition, but become low priority under a different load condition.
- Ah, interesting! Like each autoscale group would have some concept of priority and timeframe (critical from 0900 to 1700, medium priority from 1800 to 2000, low priority from 2100 to 0800).
- Could be something like that. Here's a more concrete example:
- I have two apps, A and B. Both apps are monitored for request completion time.
- App A has the targets: good: 0-10 ms; ok: 10-30 ms; bad: > 30 ms.
- App B has the targets: good: 0-100 ms; ok: 100-500 ms; bad: > 500 ms.
- Based on the current load condition and request completion time, I want to allocate the physical compute resource between the two apps based on some optimization criteria.
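The two-app example above can be sketched as follows. This is only an illustration under assumptions: the source leaves the optimization criteria open, so the rule used here (move one unit of capacity from the app in the better latency band to the app in the worse one) is a placeholder, and the names `LatencyTarget`, `TARGETS`, and `rebalance` are hypothetical. Boundary values (exactly 10 ms or 30 ms for App A) are assigned to the better band, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyTarget:
    """Upper bounds of the "good" and "ok" bands, in milliseconds."""
    good_max_ms: float
    ok_max_ms: float

    def classify(self, observed_ms):
        # Boundary values fall into the better band (an assumption).
        if observed_ms <= self.good_max_ms:
            return "good"
        if observed_ms <= self.ok_max_ms:
            return "ok"
        return "bad"


# Targets taken from the example in the text.
TARGETS = {
    "A": LatencyTarget(good_max_ms=10, ok_max_ms=30),
    "B": LatencyTarget(good_max_ms=100, ok_max_ms=500),
}
SEVERITY = {"good": 0, "ok": 1, "bad": 2}


def rebalance(observed_ms):
    """Return (donor_app, recipient_app), or None if no move is needed.

    observed_ms maps app name to its current request completion time.
    Placeholder policy: shift capacity from the app in the best band
    to the app in the worst band, only when the bands differ.
    """
    status = {app: TARGETS[app].classify(ms) for app, ms in observed_ms.items()}
    worst = max(status, key=lambda app: SEVERITY[status[app]])
    best = min(status, key=lambda app: SEVERITY[status[app]])
    if SEVERITY[status[worst]] == SEVERITY[status[best]]:
        return None  # both apps are in the same band; leave allocation alone
    return best, worst
```

For instance, with App A at 45 ms and App B at 80 ms, `rebalance({"A": 45, "B": 80})` selects B as the donor and A as the recipient. A production policy would also have to weigh priority and the global resource constraints mentioned earlier.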
This use case was raised at the Denver 2019 PTG: https://etherpad.openstack.org/p/DEN-auto-scaling-SIG
OpenStack projects used
- ...
- ...