Create a tool for recreating missing metric_id in Cassandra
In some rare cases it is possible for a row in Cassandra for metrics
to have no value for metric_id or created_at, though it may still
have updated_at and the other required columns. This tool is for
recreating the metric_id from the other required columns.

An additional 'persister-check-missing-metric-id.py' tool is provided
which can be run to see if there are missing metric_id values that
need to be recreated.

Please see the README.rst for usage directions.

Story: 2005305
Task: 30611
Change-Id: I0593558407c8c773d728bbd035dde91310b59be3
(cherry picked from commit 09af9bff91e5a109d5bcd204f3647aa4fe023fe9)
This commit is contained in:
parent
105c2e1b29
commit
0aaa7d357f
@ -0,0 +1,88 @@
persister-recreate-metric-id
============================

In some rare cases, it is possible to have metric rows in the Cassandra
database which do not have a metric_id. Due to the nature of TSDBs,
it is valid to have sparse data, but versions of the Monasca API up
through Rocky do not handle this well and produce an ERROR when they
retrieve such a row and try to decode it.

For further reading - https://storyboard.openstack.org/#!/story/2005305

This tool runs through the metrics table in Cassandra, identifies rows
that are missing a metric_id, and uses an UPDATE operation to recreate
the metric_id based on other values. The metric_id is calculated from
a hash of the region, tenant_id, metric_name and dimensions, so it can
be recreated.
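
The hash described above can be sketched in a few lines of Python. This is a
minimal sketch that mirrors the calculation used in
`persister-recreate-metric-id.py`; the function name `recreate_metric_id` and
the sample values are hypothetical, and the tab-separated `name\tvalue`
dimension strings are an assumption about how dimensions are stored.

```python
import hashlib


def recreate_metric_id(region, tenant_id, metric_name, dimensions):
    """Return the 20-byte metric_id for one metrics row.

    The key columns are joined with NUL separators, hashed with SHA-1,
    and the hex digest is converted to raw bytes.
    """
    hash_string = '%s\0%s\0%s\0%s' % (region, tenant_id, metric_name,
                                      '\0'.join(dimensions))
    metric_id = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
    return bytearray.fromhex(metric_id)


# Hypothetical sample row; dimension strings are assumed to be "name\tvalue".
sample_id = recreate_metric_id('region1', 'tenant-a', 'cpu.idle_perc',
                               ['hostname\tnode1', 'service\tmonitoring'])
print(len(sample_id))  # a SHA-1 digest is always 20 bytes
```

Because the inputs are the row's own key columns, repeating the calculation
on the same row yields the same metric_id each time.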

Every effort has been made to ensure this is a safe process, and it
should be safe to run the tool multiple times. However, it is provided
AS IS and you should use it at your own risk.

Usage
=====

Steps to use this tool:

- Log in to one node where monasca-persister is deployed.
- Identify the installation path to monasca-persister. This may be a
  virtual environment such as
  `/opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister`
  or as in devstack `/opt/stack/monasca-persister/monasca_persister/`.
- Identify the existing configuration for monasca-persister. If using a
  Java deployment, it may be in `/opt/stack/service/monasca/etc/persister-config.yml`
  or in devstack `/etc/monasca/persister.conf`.
- Copy and modify the config template file.

::

    cp persister-recreate.ini /opt/stack/service/monasca/etc/persister-recreate.ini
    vi /opt/stack/service/monasca/etc/persister-recreate.ini

- Copy the values from the monasca-persister config into the new .ini,
  particularly the password. In some cases, the single IP for the
  management network of one of the Cassandra nodes may need to be given,
  rather than the list of hostnames as specified in the .yml.
- Copy the `persister-recreate-metric-id.py` and `persister-check-missing-metric-id.py`
  files into place with the monasca-persister code.

::

    cp persister-*-metric-id.py /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister

- Ensure the `mon-persister` user has permission to access
  `persister-recreate.ini` and both `persister-*-metric-id.py` scripts.
- Invoke the check tool to generate a log of rows needing repair.

::

    sudo -u mon-persister /opt/stack/venv/monasca-<version>/bin/python /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister/persister-check-missing-metric-id.py --config-file /opt/stack/service/monasca/etc/persister-recreate.ini

- Review the logged output. If the output is as expected, then invoke
  the `persister-recreate-metric-id.py` tool to repair the rows.

::

    sudo -u mon-persister /opt/stack/venv/monasca-<version>/bin/python /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister/persister-recreate-metric-id.py --config-file /opt/stack/service/monasca/etc/persister-recreate.ini

- Once the repair has been verified as successful, the configuration file
  may be deleted.
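
Before invoking the tools, it can help to confirm that no `<...>` placeholder
from the template is left in the new .ini. The following is a minimal sketch,
assuming the file follows the layout of the `persister-recreate.ini` template;
the helper name `unfilled_options` is hypothetical and is not part of
monasca-persister.

```python
import configparser
import re

# A value that is still wrapped in angle brackets has not been filled in.
PLACEHOLDER = re.compile(r'^<.*>$')

# Abbreviated copy of the template, used here only as sample input.
TEMPLATE = """\
[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository:MetricCassandraRepository

[cassandra]
# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>
port = 9042
user = mon_persister
password = <password from persister-config.yml>
"""


def unfilled_options(ini_text):
    """Return (section, option) pairs whose value is still a placeholder."""
    # Interpolation is disabled so a '%' in a password is read literally;
    # only '=' is treated as a delimiter because metrics_driver contains ':'.
    parser = configparser.ConfigParser(interpolation=None, delimiters=('=',))
    parser.read_string(ini_text)
    missing = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if PLACEHOLDER.match(value.strip()):
                missing.append((section, option))
    return missing


print(unfilled_options(TEMPLATE))
```

For a real file, the text could be read with `open(path).read()` and passed to
the same helper; an empty list means every placeholder has been replaced.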


License
=======

Copyright (c) 2019 SUSE LLC

Licensed under the Apache License, Version 2.0 (the “License”); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at

::

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an “AS IS” BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
@ -0,0 +1,156 @@
# (C) Copyright 2019 SUSE LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Persister check for missing metric_id tool

This tool is designed to find the rare instance when a metric_id
has been removed from a row in Cassandra. That can cause issues
when monasca-api retrieves the metric and tries to decode it.

Configure this tool by copying the Monasca Persister settings from
/opt/stack/service/monasca/etc/persister-config.yml into a config
.ini file (see template).

Start the tool as a stand-alone process by running
'sudo -u mon-persister <venv python> \
<venv path>/site-packages/monasca_persister/persister-check-missing-metric-id.py \
--config-file <config file>'

When done, you may delete the config .ini file.

Template for .ini file (suggested /opt/stack/service/monasca/etc/persister-recreate.ini)
[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository: \
    MetricCassandraRepository

[cassandra]

# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>

"""
import sys

from oslo_log import log

from monasca_persister import config

from monasca_persister.repositories.cassandra import connection_util

LOG = log.getLogger(__name__)

METRIC_ALL_CQL = ('select region, tenant_id, metric_name, dimensions, '
                  'dimension_names, created_at, metric_id, updated_at '
                  'from metrics')


def usage():
    usage = """Monasca Persister Check for Missing metric_id Tool

    Used to find rows that are missing a metric_id, which in rare cases may
    be deleted from a row. The metric_id is a hash of other fields, and thus
    can be recreated. Note this tool is only for use with Cassandra storage
    installs.

    Please see the included README.rst for more details about creating an
    appropriate configuration file.

    To get a report of what rows are missing values (execute as mon-persister user):
        persister-check-missing-metric-id.py --config-file <path>/persister-recreate.ini

    """
    print(usage)


def main():
    """persister check for missing metric_id tool."""

    config.parse_args()

    try:
        LOG.info('Starting check of metric_id consistency.')

        # Connection setup
        # rocky style - note that we don't deliver pike style
        _cluster = connection_util.create_cluster()
        _session = connection_util.create_session(_cluster)

        metric_all_stmt = _session.prepare(METRIC_ALL_CQL)

        rows = _session.execute(metric_all_stmt)

        # if rows:
        #     LOG.info('First - {}'.format(rows[0]))
        #     # LOG.info('First name {} and id {}'.format(
        #     #     rows[0].metric_name, rows[0].metric_id))  # metric_id can't be logged raw

        # Bit of a misnomer - "null" is not in the cassandra db
        missing_value_rows = []
        for row in rows:
            if row.metric_id is None:
                LOG.info('Row with missing metric_id - {}'.format(row))
                missing_value_rows.append(row)

                # check created_at
                if row.created_at is None and row.updated_at is not None:
                    LOG.info("Metric created_at was also None.")

                # TODO(joadavis) update the updated_at timestamp to now

                # recreate metric id
                # copied from metrics_repository.py
                hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                                  row.metric_name,
                                                  '\0'.join(row.dimensions))
                # metric_id = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
                # id_bytes = bytearray.fromhex(metric_id)

                LOG.info("Recreated hash for metric id: {}".format(hash_string))
                # LOG.info("new id_bytes {}".format(id_bytes))  # can't unicode decode for logging

        # LOG.info("of {} rows there are {} missing metric_id".format(len(rows), len(null_rows)))
        if len(missing_value_rows) > 0:
            LOG.warning("--> There were {} rows missing metric_id.".format(
                len(missing_value_rows)))
            LOG.warning("    Those rows have NOT been updated.\n"
                        "    Please run the persister-recreate-metric-id "
                        "tool to repair the rows.")
        else:
            LOG.info("No missing metric_ids were found, no changes made.")

        LOG.info('Done with metric_id consistency check.')

        return 0

    except Exception:
        LOG.exception('Error! Exiting.')
        # Return a non-zero status so sys.exit() reports the failure.
        return 1


if __name__ == "__main__":
    sys.exit(main())
@ -0,0 +1,182 @@
# (C) Copyright 2019 SUSE LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Persister Recreate metric_id

This tool is designed to 'fix' the rare instance when a metric_id
has been removed from a row in Cassandra. That can cause issues
when monasca-api retrieves the metric and tries to decode it.

Configure this tool by copying the Monasca Persister settings from
/opt/stack/service/monasca/etc/persister-config.yml into a config
.ini file (see template).

Start the tool as a stand-alone process by running
'sudo -u mon-persister <venv python> \
<venv path>/site-packages/monasca_persister/persister-recreate-metric-id.py \
--config-file <config file>'

When done, you may delete the config .ini file.

Template for .ini file (suggested /opt/stack/service/monasca/etc/persister-recreate.ini)
[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository: \
    MetricCassandraRepository

[cassandra]

# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>

"""
import hashlib
import sys

from oslo_config import cfg
from oslo_log import log

from monasca_persister import config

from monasca_persister.repositories.cassandra import connection_util
from monasca_persister.repositories.cassandra.metrics_repository import METRICS_INSERT_CQL


LOG = log.getLogger(__name__)

METRIC_ALL_CQL = ('select region, tenant_id, metric_name, dimensions, '
                  'dimension_names, created_at, metric_id, updated_at '
                  'from metrics')


def usage():
    usage = """Monasca Persister Recreate metric_id Tool

    Used to recreate a metric_id, which in rare cases may be deleted
    from a row. The metric_id is a hash of other fields, and thus can be
    recreated. Note this tool is only for use with Cassandra storage installs.

    Please see the included README.rst for more details about creating an
    appropriate configuration file.

    persister-recreate-metric-id [-h] --config-file <ini>
        -h --help   Prints this
        --config-file <ini>   (Required) Configuration file as described in README.rst

    Example

    To repair rows (execute as mon-persister user):
        persister-recreate-metric-id.py --config-file <path>/persister-recreate.ini

    """
    print(usage)


def main():
    """persister recreate metric_id tool."""

    config.parse_args()
    conf = cfg.CONF

    try:
        LOG.info('Starting check and repair of metric_id consistency.')

        # Connection setup
        # rocky style - note that we don't deliver pike style
        _cluster = connection_util.create_cluster()
        _session = connection_util.create_session(_cluster)
        # retention_policy is in days; the TTL passed to the update is in seconds
        _retention = conf.cassandra.retention_policy * 24 * 3600

        metric_all_stmt = _session.prepare(METRIC_ALL_CQL)
        metric_repair_stmt = _session.prepare(METRICS_INSERT_CQL)

        rows = _session.execute(metric_all_stmt)

        # if rows:
        #     LOG.info('First - {}'.format(rows[0]))
        #     # LOG.info('First name {} and id {}'.format(
        #     #     rows[0].metric_name, rows[0].metric_id))  # metric_id can't be logged raw

        # Bit of a misnomer - "null" is not in the cassandra db
        missing_value_rows = []
        for row in rows:
            if row.metric_id is None:
                LOG.info('Row with missing metric_id - {}'.format(row))
                missing_value_rows.append(row)

                # check created_at
                fixed_created_at = row.created_at
                if row.created_at is None and row.updated_at is not None:
                    LOG.info("Metric created_at was also None, repairing.")
                    fixed_created_at = row.updated_at

                # TODO(joadavis) update the updated_at timestamp to now

                # recreate metric id
                # copied from metrics_repository.py
                hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                                  row.metric_name,
                                                  '\0'.join(row.dimensions))
                metric_id = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
                id_bytes = bytearray.fromhex(metric_id)

                LOG.info("Recreated hash for metric id: {}".format(hash_string))
                # LOG.info("new id_bytes {}".format(id_bytes))  # can't unicode decode for logging

                # execute cql
                metric_repair_bound_stmt = metric_repair_stmt.bind(
                    (_retention,
                     id_bytes,
                     fixed_created_at,
                     row.updated_at,
                     row.region,
                     row.tenant_id,
                     row.metric_name,
                     row.dimensions,
                     row.dimension_names))

                _session.execute(metric_repair_bound_stmt)

        # LOG.info("of {} rows there are {} missing metric_id".format(len(rows), len(null_rows)))
        if len(missing_value_rows) > 0:
            LOG.warning("--> There were {} rows missing metric_id.".format(
                len(missing_value_rows)))
            LOG.warning("    Those rows have been updated.")
        else:
            LOG.info("No missing metric_ids were found, no changes made.")

        LOG.info('Done with metric_id consistency check and repair.')

        return 0

    except Exception:
        LOG.exception('Error! Exiting.')
        # Return a non-zero status so sys.exit() reports the failure.
        return 1


if __name__ == "__main__":
    sys.exit(main())
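
The repair loop can be exercised without a Cassandra connection by separating
the per-row decisions from the database calls. The following is a sketch under
stated assumptions: `Row` is a stand-in for the driver's result rows,
`plan_repairs` is a hypothetical helper that is not part of the tool, and the
integer timestamps are placeholders for the real timestamp values. The tuple
order follows the `bind` call in the tool.

```python
import collections
import hashlib

# Stand-in for a driver row; only the columns the tool selects.
Row = collections.namedtuple(
    'Row', 'region tenant_id metric_name dimensions dimension_names '
           'created_at metric_id updated_at')


def plan_repairs(rows, retention_days):
    """Yield one tuple of bind values per row that is missing its metric_id."""
    ttl_seconds = retention_days * 24 * 3600  # days to seconds
    for row in rows:
        if row.metric_id is not None:
            continue  # row is intact, nothing to repair
        # Fall back to updated_at when created_at is missing as well.
        created_at = (row.created_at if row.created_at is not None
                      else row.updated_at)
        hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                          row.metric_name,
                                          '\0'.join(row.dimensions))
        id_bytes = bytearray.fromhex(
            hashlib.sha1(hash_string.encode('utf8')).hexdigest())
        yield (ttl_seconds, id_bytes, created_at, row.updated_at,
               row.region, row.tenant_id, row.metric_name,
               row.dimensions, row.dimension_names)


# Hypothetical sample data: the first row needs repair, the second does not.
rows = [
    Row('region1', 't1', 'cpu.idle_perc', ['hostname\tnode1'], ['hostname'],
        None, None, 1550000000),
    Row('region1', 't1', 'mem.free_mb', ['hostname\tnode1'], ['hostname'],
        1540000000, b'\x01' * 20, 1550000000),
]
repairs = list(plan_repairs(rows, retention_days=45))
print(len(repairs))
```

Keeping the planning step free of side effects makes it possible to review the
values that would be written before any UPDATE is executed.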
@ -0,0 +1,22 @@
[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository:MetricCassandraRepository

[cassandra]

# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>