Update dib stats
This updates dib stats after creating a dashboard to use them. Firstly, the individual return codes and runtime for each image type are unnecessary, because they call come from the same invocation of dib. While it is definitely useful to track the size of each output image, the overall status for a build is only a single value. This moves these duplciated values to ".status.<rc|duration>". Unfortunately, there's really no way to say "what was the time of the last non-null value" in grafana+graphite [1]. This means you can't do something useful like show a singlestat of the relative time of the last build "X hours ago" using the timer value. We can work around this by putting the timestamp of the last build in a gauge value; this monotonically increases and is easy to turn into a relative time. [1] https://github.com/grafana/grafana/issues/10550 Change-Id: Ia9518b6faecb30d45e0509bda4a9b2ab7fdc6261
This commit is contained in:
parent
c68dbb9636
commit
6fa73eac26
@ -286,7 +286,27 @@ Nodepool builder
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.<ext>.size
|
||||
:type: gauge
|
||||
|
||||
This stat reports the size of the built image in bytes.
|
||||
This stat reports the size of the built image in bytes. ``ext`` is
|
||||
based on the formats of the images created for the build, for
|
||||
example ``qcow2``, ``raw``, ``vhd``, etc.
|
||||
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.status.rc
|
||||
:type: gauge
|
||||
|
||||
Return code of the last DIB run. Zero is successful, non-zero is
|
||||
unsuccessful.
|
||||
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.status.duration
|
||||
:type: timer
|
||||
|
||||
Time the last DIB run for this image build took, in ms
|
||||
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.status.last_build
|
||||
:type: gauge
|
||||
|
||||
The UNIX timestamp of the last time a build for this image
|
||||
returned. This can be useful for presenting a relative time ("X
|
||||
hours ago") in a dashboard.
|
||||
|
||||
.. zuul:stat:: nodepool.image_update.<image name>.<provider name>
|
||||
:type: counter, timer
|
||||
@ -294,16 +314,6 @@ Nodepool builder
|
||||
Number of image uploads to a specific provider in the cloud plus the time in
|
||||
seconds spent to upload the image.
|
||||
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.<ext>.rc
|
||||
:type: gauge
|
||||
|
||||
Return code of the DIB.
|
||||
|
||||
.. zuul:stat:: nodepool.dib_image_build.<diskimage_name>.<ext>.duration
|
||||
:type: timer
|
||||
|
||||
Time the DIB run took in ms
|
||||
|
||||
Nodepool launcher
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
@ -888,12 +888,13 @@ class BuildWorker(BaseWorker):
|
||||
|
||||
if self._statsd:
|
||||
# report result to statsd
|
||||
for ext in img_types.split(','):
|
||||
key_base = 'nodepool.dib_image_build.%s.%s' % (
|
||||
diskimage.name, ext)
|
||||
pipeline.gauge(key_base + '.rc', rc)
|
||||
pipeline.timing(key_base + '.duration',
|
||||
int(build_time * 1000))
|
||||
key_base = 'nodepool.dib_image_build.%s.status' % (
|
||||
diskimage.name)
|
||||
pipeline.timing(key_base + '.duration',
|
||||
int(build_time * 1000))
|
||||
pipeline.gauge(key_base + '.rc', rc)
|
||||
pipeline.gauge(key_base + '.last_build',
|
||||
int(time.time()))
|
||||
pipeline.send()
|
||||
|
||||
return build_data
|
||||
|
@ -308,10 +308,11 @@ class TestNodePoolBuilder(tests.DBTestCase):
|
||||
self.waitForImage('fake-provider', 'fake-image')
|
||||
# Make sure our cleanup worker properly removes the first build.
|
||||
self.waitForBuildDeletion('fake-image', '0000000001')
|
||||
self.assertReportedStat('nodepool.dib_image_build.fake-image.qcow2.rc',
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.status.rc',
|
||||
'127', 'g')
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.qcow2.duration', None, 'ms')
|
||||
'fake-image.status.duration', None, 'ms')
|
||||
|
||||
def test_diskimage_build_only(self):
|
||||
configfile = self.setup_config('node_diskimage_only.yaml')
|
||||
@ -322,12 +323,15 @@ class TestNodePoolBuilder(tests.DBTestCase):
|
||||
|
||||
self.assertEqual(build_tar._formats, ['tar'])
|
||||
self.assertEqual(build_default._formats, ['qcow2'])
|
||||
self.assertReportedStat('nodepool.dib_image_build.fake-image.tar.rc',
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.status.rc',
|
||||
'0', 'g')
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.tar.duration', None, 'ms')
|
||||
'fake-image.status.duration', None, 'ms')
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.tar.size', '4096', 'g')
|
||||
self.assertReportedStat('nodepool.dib_image_build.'
|
||||
'fake-image.status.last_build', None, 'g')
|
||||
|
||||
def test_diskimage_build_formats(self):
|
||||
configfile = self.setup_config('node_diskimage_formats.yaml')
|
||||
|
12
releasenotes/notes/dib-stats-update-c4efeab13753e090.yaml
Normal file
12
releasenotes/notes/dib-stats-update-c4efeab13753e090.yaml
Normal file
@ -0,0 +1,12 @@
|
||||
---
|
||||
upgrade:
|
||||
- The diskimage-builder stats have been reworked to be more useful.
|
||||
The return code and duration is now stored in
|
||||
``nodepool.dib_image-build.<diskimage_name>.status.<rc|duration>``;
|
||||
previously this was split for each image format. This is
|
||||
unnecessary and confusing since the results will always be the
|
||||
same, since all formats are generated from the same
|
||||
diskimage-builder run. An additional gauge
|
||||
``nodepool.dib_image_build.<diskimage_name>.status.last_build`` is
|
||||
added to make it easy to show relative time of builds in
|
||||
dashboards.
|
Loading…
Reference in New Issue
Block a user