Ian Wienand 339cbf4c3d mirror-update: stats for vos release of mirrors
It's difficult to know if a release process is running too long when
we don't have a history of how long it should run for.

This is mostly the stats function from run_all.sh that has been
sending stats about runtimes there.  Wrap it in a vos_release function
with some minor refactoring, and update the scripts.

As noted inline, there are already release timer stats going to
afs.release.<volume> for the periodic release of docs/tarballs etc.

Change-Id: I3d79d1a0997af8977050b7f6e7cf3b7578cc8491
2020-04-09 14:34:35 +10:00

#!/bin/bash -e
# Copyright 2016 Red Hat, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
source /usr/share/mirror-update/functions.sh
MIRROR_VOLUME=$1
function echo_ts {
    printf "%(%Y-%m-%d %H:%M:%S)T | %s\n" -1 "$@"
}
if [[ ${NO_TIMEOUT:-0} -eq 1 ]]; then
    echo_ts "Running interactively"
    set -x
    TIMEOUT=""
else
    TIMEOUT="timeout -k 2m 30m"
fi
BASE="/afs/.openstack.org/mirror/fedora"
# NOTE(pabelanger): #fedora-admin:
# tibbs | I run pubmirror[12].math.uh.edu.
# tibbs | It polls the masters every ten minutes.
# NOTE(ianw): 2018-11 we dropped "-p" from the rsync commands
# because upstream started putting setgid bits on directories,
# which you have to have admin permissions in AFS to set.
# https://pagure.io/releng/issue/7921
MIRROR="rsync://pubmirror2.math.uh.edu/fedora-buffet/fedora/linux"
K5START="k5start -t -f /etc/fedora.keytab service/fedora-mirror -- ${TIMEOUT}"
echo_ts "----- START FEDORA MIRROR RSYNC RUN -----"
# Purge old releases
echo_ts "Purging old mirrors"
$K5START rm -rf $BASE/releases/29 $BASE/updates/29
for REPO in releases/30 releases/31; do
    # The targets are directories, so test with -d rather than -f
    if ! [ -d $BASE/$REPO ]; then
        $K5START mkdir -p $BASE/$REPO
    fi
    echo_ts "Running rsync for $REPO..."
    $K5START rsync -rltDiz \
        --delete \
        --delete-excluded \
        --exclude="Cloud/x86_64/images/*.box" \
        --exclude="CloudImages/x86_64/images/*.box" \
        --exclude="Container" \
        --exclude="Docker" \
        --exclude="aarch64/" \
        --exclude="armhfp/" \
        --exclude="source/" \
        --exclude="Server" \
        --exclude="Spins" \
        --exclude="Workstation" \
        --exclude="x86_64/debug/" \
        --exclude="x86_64/drpms/" \
        $MIRROR/$REPO/ $BASE/$REPO/
    echo_ts "... done"
done
for REPO in updates/30 updates/31; do
    # The targets are directories, so test with -d rather than -f
    if ! [ -d $BASE/$REPO ]; then
        $K5START mkdir -p $BASE/$REPO
    fi
    echo_ts "Running rsync for $REPO..."
    $K5START rsync -rltDiz \
        --delete \
        --delete-excluded \
        --exclude="aarch64/" \
        --exclude="armhfp/" \
        --exclude="i386/" \
        --exclude="source/" \
        --exclude="SRPMS/" \
        --exclude="x86_64/debug" \
        --exclude="x86_64/drpms" \
        $MIRROR/$REPO/ $BASE/$REPO/
    echo_ts "... done"
done
MIRROR="rsync://pubmirror2.math.uh.edu/fedora-buffet/alt/atomic"
# The target is a directory, so test with -d rather than -f
if ! [ -d $BASE/atomic ]; then
    $K5START mkdir -p $BASE/atomic
fi
echo_ts "Running rsync atomic..."
$K5START rsync -rltDiz \
    --delete \
    --delete-excluded \
    --exclude="testing/" \
    --exclude="Atomic/" \
    --exclude="Fedora-Atomic-25-*/" \
    --exclude="Fedora-Atomic-26-*/" \
    --exclude="Fedora-Atomic-27-*/" \
    --exclude="Fedora-Atomic-28-*/" \
    --exclude="ppc64le/" \
    --exclude="images/*.raw.xz" \
    --exclude="images/*.box" \
    --exclude="images/*.iso" \
    --exclude="images/efiboot.img" \
    --exclude="images/install.img" \
    --exclude="iso/*.iso" \
    --exclude="os/EFI/BOOT/" \
    --exclude="pxeboot/" \
    --exclude="isolinux/" \
    $MIRROR/ $BASE/atomic/
echo_ts "... done"
# TODO(pabelanger): Validate rsync process
date --iso-8601=ns | $K5START tee $BASE/timestamp.txt
# Now sleep for 20 minutes. openafs "pads" its incremental
# replication on "vos release" by -15 minutes to account for clock
# skew between hosts.
#
# We can get into a vicious cycle with this, particularly if
# we have a series of big updates, or run things by hand to avoid
# timeouts.
#
# Consider the case of a large mirror pulse (perhaps a new distro
# release is included, etc.). The "Last Update" time on the volume
# will indicate when this run finished.
#
# The last 15 minutes of that run could have brought in a significant
# amount of data. Now we move onto the next mirror pulse, and the
# "vos release" below will try to sync the remote R/O volume from
# "Last Update - 15 minutes" to now(). If you include the data from
# this pulse, we are now dragging across potentially *a lot* of data;
# enough to make the whole thing time out. Then the volume is locked,
# and we keep putting more data on top with each cron run, making it
# even worse.
#
# By sleeping here for 15+ minutes and doing a trivial write, we can
# ensure that when the *next* release says "sync from Last Update - 15
# minutes" it will *only* include this trivial write, and not
# potentially this entire mirror pulse data too.
sleep $(( 20 * 60 ))
date --iso-8601=ns | $K5START tee $BASE/timestamp.txt
echo_ts "Running vos release."
vos_release $MIRROR_VOLUME | \
    while IFS= read -r line; do echo_ts "$line"; done
echo_ts "... done"
echo_ts "----- END FEDORA MIRROR RSYNC RUN -----"
printf "\n\n"
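
# NOTE: illustrative sketch only, not part of this script's run.
# The comment before the final sleep argues that a 20 minute pause plus
# a trivial write keeps the bulk rsync data out of the *next*
# incremental release window. The arithmetic can be sketched as below.
# The helper names (release_window_start, in_release_window) and the
# example times are hypothetical; the 15 minute pad is the figure the
# comment gives, not something read from OpenAFS here.

```shell
#!/bin/bash
# Assumed padding that "vos release" subtracts from "Last Update".
PAD_SECONDS=$(( 15 * 60 ))

# Print the start of the incremental window for a given Last Update
# time (both in epoch-style seconds).
function release_window_start {
    local last_update=$1
    echo $(( last_update - PAD_SECONDS ))
}

# Succeed if a write at time $1 falls inside the window implied by a
# Last Update at time $2.
function in_release_window {
    local write_time=$1
    local last_update=$2
    (( write_time >= $(release_window_start "$last_update") ))
}

# Example timeline: the bulk rsync finishes at t=10000.
rsync_end=10000
bulk_write=$(( rsync_end - 400 ))           # data from the last minutes of rsync
trivial_write=$(( rsync_end + 20 * 60 ))    # timestamp.txt written after the sleep

# Without the sleep, Last Update is the end of the rsync run.
if in_release_window "$bulk_write" "$rsync_end"; then
    echo "no sleep: late rsync data is re-sent by the next release"
fi

# With the sleep, Last Update is the trivial write 20 minutes later.
if ! in_release_window "$rsync_end" "$trivial_write"; then
    echo "with sleep: rsync data is outside the next release window"
fi
```

# Any sleep longer than the pad gives the same result; 20 minutes
# leaves 5 minutes of margin over the 15 minute pad.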