From 9169085db781e2a27bdfc36e45a4d0feeb9bca53 Mon Sep 17 00:00:00 2001
From: Julia Kreger
Date: Sat, 18 Jul 2020 12:51:29 -0700
Subject: [PATCH] Extend base build timeouts

Our ramdisks have swelled, and are taking anywhere from 500-700
seconds to even reach the point where IPA is starting up.

This means that a 900 second build timeout is cutting it close, and
intermittent performance degradation in CI means that a job may fail
simply because it is colliding with the timeout.

One example I deconstructed today where a 900 second timeout was in
effect:

* 08:21:41 Tempest job starts
* 08:21:46 Nova instance requested
* Compute service requests ironic to do the thing.
* Ironic downloads IPA and stages it - ~20-30 seconds
* VM boots and loads ipxe ~30 seconds.
* 08:23:22 - ipxe downloads kernel/ramdisk (time should be completion
  unless apache has changed logging behavior for requests.)
* 08:26:28 - Kernel at 120 second marker and done decompressing
  the ramdisk.
* ~08:34:30 - Kernel itself hit the six hundred second runtime
  marker and hasn't even started IPA.
* 08:35:02 - Ironic declares the deploy failed due to wait timeout.
  ([conductor]deploy_callback_timeout hit at 700 seconds.)
* 08:35:32 - Nova fails the build saying it can't be scheduled.

(Note, I started adding times to figure out the window for myself, so
they are incomplete above.)

The time we can account for in the job is about 14 minutes or 840
seconds. As such, our existing defaults are just not enough to handle
the ramdisk size AND variance in cloud performance.

Change-Id: I4f9db300e792980059c401fce4c37a68c438d7c0
---
 zuul.d/ironic-jobs.yaml | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/zuul.d/ironic-jobs.yaml b/zuul.d/ironic-jobs.yaml
index afed7ae92c..d50a8fd464 100644
--- a/zuul.d/ironic-jobs.yaml
+++ b/zuul.d/ironic-jobs.yaml
@@ -35,14 +35,15 @@
         FORCE_CONFIG_DRIVE: True
         INSTALL_TEMPEST: False # Don't install a tempest package globaly
         VIRT_DRIVER: ironic
-        BUILD_TIMEOUT: 900
+        BUILD_TIMEOUT: 1800
         IRONIC_BAREMETAL_BASIC_OPS: True
         IRONIC_BUILD_DEPLOY_RAMDISK: False
-        IRONIC_CALLBACK_TIMEOUT: 700
+        IRONIC_CALLBACK_TIMEOUT: 1800
+        IRONIC_PXE_BOOT_RETRY_TIMEOUT: 900
         IRONIC_DEPLOY_DRIVER: ipmi
         IRONIC_INSPECTOR_BUILD_RAMDISK: False
         IRONIC_INSPECTOR_TEMPEST_INTROSPECTION_TIMEOUT: 1200
-        IRONIC_TEMPEST_BUILD_TIMEOUT: 900
+        IRONIC_TEMPEST_BUILD_TIMEOUT: 1800
         IRONIC_TEMPEST_WHOLE_DISK_IMAGE: False
         IRONIC_VM_COUNT: 1
         IRONIC_VM_EPHEMERAL_DISK: 1
@@ -741,8 +742,8 @@
         PUBLIC_PHYSICAL_NETWORK: public
         PUBLIC_PROVIDERNET_TYPE: flat
         Q_USE_PROVIDERNET_FOR_PUBLIC: True
-        BUILD_TIMEOUT: 1440
-        IRONIC_TEMPEST_BUILD_TIMEOUT: 1440
+        BUILD_TIMEOUT: 2000
+        IRONIC_TEMPEST_BUILD_TIMEOUT: 2000
         IRONIC_PING_TIMEOUT: 1440
 
 # NOTE(rpittau): OLD TINYIPA JOBS