Keep only 7 days of records in ElasticSearch

We have been running out of disk recently, with some indices requiring
more than 400GB of space per copy. Actual disk space requirements are
double that because we run with one replica of each index. On top of
that, the goal is for any five of our six Elasticsearch nodes to have
enough space for all of our data so that we are resilient to losing a
node.

Napkin math:

  400GB/day * 10 days * 2 copies = ~8TB of disk
  400GB/day *  7 days * 2 copies = ~5.6TB of disk

Each of the six ES nodes has 1TB of disk allocated to ES, for 6TB in
total (5TB with one node down), so 5.6TB should get us just under the
limit. As for handling a node outage, weekends tend not to produce as
many records, so our actual usage should be a little lower than that.
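
Spelling that arithmetic out (a quick sketch in Python; the 400GB/day
figure, replica count, and node sizes are the rough assumptions from
this message, not measured values):

  # Sketch of the retention arithmetic above. All inputs are the rough
  # figures from this commit message, not measured values.
  GB_PER_DAILY_INDEX = 400   # ~400GB per daily logstash index copy
  COPIES = 2                 # primary + 1 replica
  NODES = 6
  TB_PER_NODE = 1.0          # disk allocated to ES on each node

  def footprint_tb(retention_days):
      return GB_PER_DAILY_INDEX * retention_days * COPIES / 1000.0

  total_tb = NODES * TB_PER_NODE           # 6TB with all nodes up
  degraded_tb = (NODES - 1) * TB_PER_NODE  # 5TB with one node down

  for days in (10, 7):
      print("%2d days -> ~%.1fTB (cluster: %.0fTB, one node down: %.0fTB)"
            % (days, footprint_tb(days), total_tb, degraded_tb))
  # 10 days -> ~8.0TB  (over the 6TB cluster total)
  #  7 days -> ~5.6TB  (fits in 6TB; weekends should keep us near 5TB)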

Change-Id: Ie677bd47a9886870bc83876d2407742133299861
Author: Clark Boylan  2020-02-06 13:50:56 -08:00
Commit: bd752a0bfe (parent: 7227bcf879)

@@ -40,17 +40,7 @@ class openstack_project::elasticsearch_node (
     version => '1.7.6',
   }
-  cron { 'delete_old_es_indices':
-    ensure      => 'absent',
-    user        => 'root',
-    hour        => '2',
-    minute      => '0',
-    command     => 'curl -sS -XDELETE "http://localhost:9200/logstash-`date -d \'10 days ago\' +\%Y.\%m.\%d`/" > /dev/null',
-    environment => 'PATH=/usr/bin:/bin:/usr/sbin:/sbin',
-  }
   class { 'logstash::curator':
-    keep_for_days => '10',
+    keep_for_days => '7',
   }
 }
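
For reference, the cron job removed above implemented retention by
issuing an HTTP DELETE against each dated logstash index; the
keep_for_days setting in logstash::curator now drives the same cleanup.
A minimal standalone sketch of that delete-by-date approach
(hypothetical script, not the curator implementation; assumes daily
logstash-YYYY.MM.DD indices and ES on localhost:9200 as in the old
cron):

  #!/usr/bin/env python3
  # Hypothetical sketch of the delete-old-indices pattern the removed
  # cron used: DELETE each daily logstash-YYYY.MM.DD index older than
  # the retention window. Not the logstash::curator implementation.
  from datetime import date, timedelta

  import requests

  ES_URL = "http://localhost:9200"  # local ES endpoint, as in the old cron
  KEEP_FOR_DAYS = 7                 # matches the new keep_for_days value
  LOOKBACK = 30                     # extra days back to sweep for stragglers

  def delete_old_indices():
      today = date.today()
      for age in range(KEEP_FOR_DAYS, KEEP_FOR_DAYS + LOOKBACK):
          index = "logstash-{:%Y.%m.%d}".format(today - timedelta(days=age))
          resp = requests.delete("{}/{}".format(ES_URL, index))
          # 404 means the index is already gone; anything else is unexpected.
          if resp.status_code not in (200, 404):
              print("failed to delete {}: {}".format(index, resp.status_code))

  if __name__ == "__main__":
      delete_old_indices()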