Bump workers_pool_size to 300 and remove queueing of tasks
Especially in a single-conductor environment, the number of threads should be larger than max_concurrent_deploy; otherwise the latter cannot be reached in practice, or reaching it will cause issues with heartbeats. In addition, this change fixes an issue with how we use futurist. Due to a misunderstanding, we ended up setting the workers pool size to 100 and then also allowing 100 more requests to be queued. To put it shortly, this change moves from 100 threads + 100 queued to 300 threads and no queue.

Partial-Bug: #2038438
Change-Id: I1aeeda89a8925fbbc2dae752742f0be4bc23bee0
parent db549850e0
commit 224cdd726c
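For context, the queueing behaviour mentioned in the commit message comes from futurist's check_and_reject hook. Below is a minimal sketch (not part of this change) of the difference between the old and the new settings; it uses futurist.ThreadPoolExecutor so it runs without eventlet, while the conductor itself uses GreenThreadPoolExecutor with the same arguments, and the worker function and pool sizes are made up for illustration.

    import time

    import futurist
    from futurist import rejection

    # Old behaviour: up to workers_pool_size items could sit in the backlog
    # on top of the workers_pool_size threads, roughly:
    #   rejector = rejection.reject_when_reached(100)
    #
    # New behaviour: effectively no queue. The threshold is 1 rather than 0
    # because futurist checks for rejection before enqueuing the new item.
    rejector = rejection.reject_when_reached(1)
    executor = futurist.ThreadPoolExecutor(max_workers=2,
                                           check_and_reject=rejector)

    def work():
        time.sleep(1)

    try:
        for i in range(5):
            executor.submit(work)
            time.sleep(0.1)  # give the worker threads time to pick items up
    except futurist.RejectedSubmission:
        # Once both workers are busy and one item is already waiting,
        # further submissions are rejected instead of being queued.
        print('submission %d rejected: pool is full' % i)
    executor.shutdown()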
@@ -125,9 +125,12 @@ class BaseConductorManager(object):
         self._keepalive_evt = threading.Event()
         """Event for the keepalive thread."""
 
-        # TODO(dtantsur): make the threshold configurable?
-        rejection_func = rejection.reject_when_reached(
-            CONF.conductor.workers_pool_size)
+        # NOTE(dtantsur): do not allow queuing work. Given our model, it's
+        # better to reject an incoming request with HTTP 503 or reschedule
+        # a periodic task that end up with hidden backlog that is hard
+        # to track and debug. Using 1 instead of 0 because of how things are
+        # ordered in futurist (it checks for rejection first).
+        rejection_func = rejection.reject_when_reached(1)
         self._executor = futurist.GreenThreadPoolExecutor(
             max_workers=CONF.conductor.workers_pool_size,
             check_and_reject=rejection_func)
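The hunk above only changes how the executor is constructed; the rejection itself surfaces to callers of the executor as futurist.RejectedSubmission. A rough sketch of how such a rejection can be turned into the "no free workers" error that the comment refers to (the wrapper and exception names here are illustrative, not taken from this change):

    import futurist

    class NoFreeWorker(Exception):
        # Illustrative stand-in for the conductor-side error that the API
        # eventually reports to clients as HTTP 503.
        pass

    def spawn_worker(executor, func, *args, **kwargs):
        # Submit work and convert futurist's rejection into a service-level
        # error instead of letting requests pile up in a hidden backlog.
        try:
            return executor.submit(func, *args, **kwargs)
        except futurist.RejectedSubmission:
            raise NoFreeWorker('worker pool is exhausted, try again later')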
@@ -22,7 +22,7 @@ from ironic.common.i18n import _
 
 opts = [
     cfg.IntOpt('workers_pool_size',
-               default=100, min=3,
+               default=300, min=3,
                help=_('The size of the workers greenthread pool. '
                       'Note that 2 threads will be reserved by the conductor '
                       'itself for handling heart beats and periodic tasks. '
releasenotes/notes/workers-20ca5c225c1474e0.yaml (new file, 25 lines)
@@ -0,0 +1,25 @@
+---
+issues:
+  - |
+    When configuring a single-conductor environment, make sure the size of
+    the worker pool (``[conductor]workers_pool_size``) is larger than the
+    maximum parallel deployments (``[conductor]max_concurrent_deploy``).
+    This was not the case by default previously (the options used to be set
+    to 100 and 250 respectively).
+upgrade:
+  - |
+    Because of a fix in the internal worker pool handling, you may now start
+    seeing requests rejected with HTTP 503 under a very high load earlier than
+    before. In this case, try increasing the ``[conductor]workers_pool_size``
+    option or consider adding more conductors.
+  - |
+    The default worker pool size (the ``[conductor]workers_pool_size`` option)
+    has been increased from 100 to 300. You may want to consider increasing
+    it even further if your environment allows that.
+fixes:
+  - |
+    Fixes handling new requests when the maximum number of internal workers
+    is reached. Previously, after reaching the maximum number of workers
+    (100 by default), we would queue the same number of requests (100 again).
+    This was not intentional, and now Ironic no longer queues requests if
+    there are no free threads to run them.
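For operators, the guidance above amounts to keeping the worker pool larger than the deployment concurrency in ironic.conf; the values below only mirror the new defaults and are an example, not a recommendation for every environment:

    [conductor]
    workers_pool_size = 300
    max_concurrent_deploy = 250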