============================================
Auto-scale Compute to Balance Resource Usage
============================================

* As a deployer and operator of OpenStack I want to be able to configure
  highly available autoscaling services with Free Open Source Software.

* As an operator of OpenStack I want to be able to add additional compute
  nodes to my cluster from a pool of available bare metal inventory,
  automatically, in response to resource consumption within my cloud.

* As an operator of OpenStack I want to be able to remove compute nodes from
  my cluster and return them to the pool of available bare metal inventory
  nodes in response to an excess of available compute resources within my
  cloud.

* As an app deployer I want to automatically scale in one app to free up
  physical infrastructure so that another app which needs the resources more
  can scale out. More generally, I want to scale various apps in/out/up/down
  based on load, priority or a custom policy, subject to some global resource
  constraints.

Problem description
===================

* Global constraints: As an app deployer I want to automatically scale in one
  app to free up physical infrastructure so that another app which needs the
  resources more can scale out. More generally, I want to scale various apps
  in/out/up/down based on load, priority or a custom policy, subject to some
  global resource constraints.

  * Sort of like pre-emptible resources or something like that?

  * Yes, but maybe more dynamic, with more levels of priority. One workload
    may be high priority under one load condition, but become low priority
    under a different load condition.

  * Ah, interesting! So each autoscaling group would have some concept of
    priority and timeframe (for example critical from 0900-1700, medium
    priority from 1800-2000, low priority from 2100-0800).

  * Could be something like that. Here is a more concrete example:

    * I have two apps, A and B. Both apps are monitored for request
      completion time.
    * App A has the targets: good: 0-10 ms; ok: 10-30 ms; bad: > 30 ms.
    * App B has the targets: good: 0-100 ms; ok: 100-500 ms; bad: > 500 ms.
    * Based on the current load condition and request completion times, I
      want to allocate the physical compute resources between the two apps
      according to some optimization criteria (see the sketch at the end of
      this section).

This use case was called out at the Denver 2019 PTG -
https://etherpad.openstack.org/p/DEN-auto-scaling-SIG
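
To make the "global constraints" idea above easier to discuss, here is a
minimal, illustrative sketch in plain Python. It is not tied to any
particular OpenStack service, and all names in it (``LatencyTarget``,
``AppPolicy``, ``plan_scaling``) are hypothetical; how the latency
measurements would actually be gathered and how the resulting node deltas
would be applied is exactly what the sections below are intended to capture.

.. code-block:: python

    # Illustrative sketch only: the policy types and the planning function
    # below are hypothetical and not part of any OpenStack API.

    from dataclasses import dataclass


    @dataclass
    class LatencyTarget:
        """Request-completion-time bands for one app, in milliseconds."""
        good_below_ms: float  # "good" if latency < good_below_ms
        ok_below_ms: float    # "ok" if latency < ok_below_ms, else "bad"

        def classify(self, latency_ms: float) -> str:
            if latency_ms < self.good_below_ms:
                return "good"
            if latency_ms < self.ok_below_ms:
                return "ok"
            return "bad"


    @dataclass
    class AppPolicy:
        name: str
        target: LatencyTarget
        priority: int  # higher number == more important right now


    def plan_scaling(apps, latencies_ms, free_nodes):
        """Return a dict of {app name: node delta} under a global constraint.

        Apps whose latency is "bad" want one extra node; apps whose latency
        is "good" can offer one node back. If there are not enough free
        nodes, the lowest-priority "good" app is scaled in to free capacity
        for a higher-priority app -- the "global constraints" idea from the
        problem description.
        """
        wants = [a for a in apps
                 if a.target.classify(latencies_ms[a.name]) == "bad"]
        gives = [a for a in apps
                 if a.target.classify(latencies_ms[a.name]) == "good"]

        plan = {a.name: 0 for a in apps}

        # Serve the highest-priority needy apps first.
        for needy in sorted(wants, key=lambda a: a.priority, reverse=True):
            if free_nodes > 0:
                free_nodes -= 1
            elif gives:
                # Reclaim a node from the lowest-priority app that can spare one.
                donor = min(gives, key=lambda a: a.priority)
                gives.remove(donor)
                plan[donor.name] -= 1
            else:
                continue  # nothing left to allocate; needy app keeps its size
            plan[needy.name] += 1

        return plan


    if __name__ == "__main__":
        app_a = AppPolicy("A", LatencyTarget(10, 30), priority=2)
        app_b = AppPolicy("B", LatencyTarget(100, 500), priority=1)

        # App A is over its 30 ms target, app B is comfortably under 100 ms,
        # and there are no free bare metal nodes left in the pool.
        print(plan_scaling([app_a, app_b], {"A": 45.0, "B": 60.0}, free_nodes=0))
        # -> {'A': 1, 'B': -1}: scale app B in by one node so app A can scale out.

The only point of the sketch is that the scaling decision is taken jointly
across applications against a shared pool of nodes, rather than by each
autoscaling group in isolation.
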

OpenStack projects used
=======================

.. Please provide a list of projects (OpenStack and otherwise) which may be
   used in order to implement this use case. If no implementation exists yet,
   suggestions are sufficient here.

* ...
* ...

Inputs and decision-making
==========================

.. Describe how decisions about when/how to auto-scale are taken. In
   particular, list any other components or inputs which may provide
   additional context to help determine the correct action.

Auto-scaling
============

.. Describe how the auto-scaling may occur. If there may be different
   approaches available, please list them all.

Existing implementation(s)
==========================

.. If there are one or more existing implementations of this use case, please
   give as many details as possible, in order that operators can re-implement
   the use case in their own clouds. However, any information is better than
   no information! Linking to external documents is perfectly acceptable.

Future work
===========

.. Please link from here to any relevant specs. If a cross-project spec is
   required, it can be placed under ../specs/ in this repository.
   Please also make sure that any linked specs contain back-links to this use
   case for maximum discoverability.

Dependencies
============

.. - Include specific references to specs and/or blueprints in
     auto-scaling-sig, or in other projects, that this one either depends on
     or is related to.

   - Does this feature require any new library dependencies or code otherwise
     not included in OpenStack? Or does it depend on a specific version of a
     library?