root/build-tools/git-utils.sh
Scott Little b20ac0164d Build Avoidance
Purpose:
   Reduce build times after a repo sync by pulling in pre-generated
srpms and rpms and other build products created by a local reference build.

Usage:
  repo sync
  generate-cgcs-centos-repo.sh ...
  populate_downloads.sh ...
  build-pkgs --build-avoidance [--build-avoidance-user <user> \
     --build-avoidance-host <addr> --build-avoidance-dir <dir>]

Reference builds:
- A server performs regular (daily?), automated builds using
  existing methods. Call these the reference builds.

- The builds are timestamped, and preserved for some time. (weeks?)
  The MY_WORKSPACE directory for the build shall have a common root
  directory, and a leaf directory that is a UTC time stamp of format
  YYYYMMDDThhmmssZ.
  e.g.
  MY_WORKSPACE=/localdisk/loadbuild/jenkins/StarlingX/20180719T113021Z

  Alternative formats are possible by setting values in ...
  "$MY_REPO/local-build-data/build_avoidance_source"
  e.g.
  BUILD_AVOIDANCE_DATE_FORMAT="%Y-%m-%d"
  BUILD_AVOIDANCE_TIME_FORMAT="%H-%M-%S"
  BUILD_AVOIDANCE_DATE_TIME_DELIM="_"
  BUILD_AVOIDANCE_DATE_TIME_POSTFIX=""
  BUILD_AVOIDANCE_DATE_UTC=0

  This results in the YYYY-MM-DD_hh-mm-ss format using local time.
  The one property the timestamps must have is that they are
  sortable, and the reference build and the consumer of the
  reference builds must agree on the format.

- A build CONTEXT is captured, consisting of the SHA of each and every
  git that contributed to the build.

- For each package built, a file shall capture the md5sums of all the
  source code inputs to the build of that package.

- All these build products are accessible locally (e.g. a regional
  office) via rsync (other protocols can be added later).  ssh
  is also required to run remote query commands on the reference build.

  Initial groundwork has been laid to support a selection variable,
  BUILD_AVOIDANCE_FILE_TRANSFER="my-transfer-protocol"
  in "$MY_REPO/local-build-data/build_avoidance_source",
  but "rsync" is the only valid value at this time.

- Location of the reference build can be specified via command line, or
  defaults can be put in $MY_REPO/local-build-data/build_avoidance_source.
  The local-build-data directory is gitignored by stx-root and so can be
  customized for local needs.
  e.g.
  cat $MY_REPO/local-build-data/build_avoidance_source
  BUILD_AVOIDANCE_USR="jenkins"
  BUILD_AVOIDANCE_HOST="stx-build-server.myco.com"
  BUILD_AVOIDANCE_DIR="/localdisk/loadbuild/jenkins/StarlingX"
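
As a rough illustration of the timestamp and defaults handling described
above, the sketch below loads an optional build_avoidance_source file and
prints a timestamp in the configured format. The function names
load_build_avoidance_source and build_avoidance_timestamp are hypothetical,
and the default values are assumed from the YYYYMMDDThhmmssZ layout; only
the BUILD_AVOIDANCE_* variable names come from this message.

```shell
# Assumed defaults, chosen to reproduce the YYYYMMDDThhmmssZ layout.
BUILD_AVOIDANCE_DATE_FORMAT="%Y%m%d"
BUILD_AVOIDANCE_TIME_FORMAT="%H%M%S"
BUILD_AVOIDANCE_DATE_TIME_DELIM="T"
BUILD_AVOIDANCE_DATE_TIME_POSTFIX="Z"
BUILD_AVOIDANCE_DATE_UTC=1

# load_build_avoidance_source [file] (hypothetical helper):
# Let a local settings file override the defaults above.
load_build_avoidance_source () {
    local cfg="${1:-${MY_REPO}/local-build-data/build_avoidance_source}"
    if [ -f "$cfg" ]; then
        . "$cfg"
    fi
}

# build_avoidance_timestamp (hypothetical helper):
# Print the current time in the configured, sortable format.
build_avoidance_timestamp () {
    local fmt="${BUILD_AVOIDANCE_DATE_FORMAT}${BUILD_AVOIDANCE_DATE_TIME_DELIM}"
    fmt="${fmt}${BUILD_AVOIDANCE_TIME_FORMAT}${BUILD_AVOIDANCE_DATE_TIME_POSTFIX}"
    if [ "${BUILD_AVOIDANCE_DATE_UTC}" -eq 1 ]; then
        date -u "+${fmt}"
    else
        date "+${fmt}"
    fi
}
```

Calling load_build_avoidance_source before build_avoidance_timestamp lets
the reference build and its consumers share one format definition, which
is the agreement the timestamp rule asks for.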

Notes:
- Build avoidance is only used if requested.
- Build avoidance does not necessarily use the latest reference build.
  It compares the git context of all available reference builds vs your
  own git context, and chooses the most recent for which your gits have
  all the content,  i.e. all your gits will be the same as or newer than
  those used by the reference build.  This also means that some packages
  might still need to be rebuilt after the download step.
- Normally build avoidance remembers the last download context and will only
  consider reference builds newer than the last download.   You can reset
  using 'build-pkgs --build-avoidance --clear' to erase the download history.
  When might this matter to me?  If you change to an old branch that
  hasn't been synced recently and want to build in that context.
- The primary assumption of Build Avoidance is that it is faster to
  download packages than to build them.  This is typically true of a
  good LAN, but likely not true of a WAN. This is why we emphasize the
  local nature of your reference build server.
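
The selection rule in the notes above (newest reference build whose
context your gits fully contain, ignoring builds no newer than the last
download) can be sketched as follows. This is a minimal illustration, not
the build-pkgs implementation: select_reference_build is a hypothetical
name, and the predicate passed as the first argument stands in for a
check such as git_test_context.

```shell
# select_reference_build <predicate> <last_download_stamp> <stamp>...
# (hypothetical helper)
# Print the newest stamp that is newer than <last_download_stamp> and for
# which "<predicate> <stamp>" succeeds. Pass "" when there is no history.
select_reference_build () {
    local predicate="$1"
    local last="$2"
    shift 2
    local stamp
    # Newest first; this relies on the timestamps sorting as plain text.
    for stamp in $(printf '%s\n' "$@" | LC_ALL=C sort -r); do
        # Skip anything that is not strictly newer than the last download.
        if [ -n "$last" ] && [ ! "$stamp" \> "$last" ]; then
            continue
        fi
        if "$predicate" "$stamp"; then
            echo "$stamp"
            return 0
        fi
    done
    return 1
}
```

Clearing the download history corresponds to passing an empty
<last_download_stamp>, so that older reference builds become candidates
again.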

Also in this update:
- Reworked context generation to be relative to 'dirname $MY_REPO'.
- Moved md5sum calculation to a common file, and fixed a case where
  symlinks were canonicalized to paths outside of $MY_REPO.
  We'll make an exception to canonicalization to keep paths
  relative to $MY_REPO.
- In future, other functions could be moved to the common file.
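
One possible reading of the canonicalization exception above, sketched
under assumptions: resolve symlinks with readlink -f, but fall back to
the original path whenever the resolved target lands outside $MY_REPO.
The name repo_canonical_path is hypothetical and this is not the code
from the common md5sum file.

```shell
# repo_canonical_path <path> (hypothetical helper)
# Print <path> relative to $MY_REPO. Symlinks are resolved only when the
# target stays inside $MY_REPO; otherwise the original path is kept.
repo_canonical_path () {
    local path="$1"
    local repo_real
    local resolved
    repo_real=$(readlink -f "${MY_REPO}")
    resolved=$(readlink -f "${path}")
    case "${resolved}" in
        "${repo_real}"/*)
            # Target is still inside the repo: use the canonical form.
            echo "${resolved#${repo_real}/}"
            ;;
        *)
            # Target escapes the repo: keep the path as given.
            echo "${path#${MY_REPO}/}"
            ;;
    esac
}
```

Keeping every reported path relative to $MY_REPO means the md5sum
records do not depend on where a particular workspace happens to live.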

Story: 2002835
Task: 22754
Change-Id: I757780190cc6063d0a2d3ad9d0a6020ab5169e99
Signed-off-by: Scott Little <scott.little@windriver.com>
2018-09-17 16:41:31 -04:00


#
# Copyright (c) 2018 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
#
# A place for any functions relating to git, or the git hierarchy created
# by repo manifests.
#
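# git_ctx_root_dir:
# Return the parent directory of $MY_REPO. Git context paths are
# relative to this directory.
#
git_ctx_root_dir () {
    dirname "${MY_REPO}"
}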
#
# git_list <dir>:
# Return a list of git root directories found under <dir>
#
git_list () {
    local DIR="${1}"
    find "${DIR}" -type d -name '.git' -exec dirname {} \; | sort -V
}
# GIT_LIST: A list of root directories for all the gits under $MY_REPO/..
# as absolute paths.
export GIT_LIST=$(git_list "$(git_ctx_root_dir)")
# GIT_LIST_REL: A list of root directories for all the gits under $MY_REPO/..
# as relative paths.
export GIT_LIST_REL=$(for p in $GIT_LIST; do echo .${p#$(git_ctx_root_dir)}; done)
#
# git_list_containing_branch <dir> <branch>:
# Return a list of git root directories found under <dir> and
# having branch <branch>. The branch need not be current branch.
#
git_list_containing_branch () {
    local DIR="${1}"
    local BRANCH="${2}"
    local d
    for d in $(git_list "${DIR}"); do
        (
        cd "$d" || exit 1
        # grep matches substrings, so BRANCH also matches longer branch names.
        if git branch --all | grep -q "$BRANCH"; then
            echo "$d"
        fi
        )
    done
}
#
# git_list_containing_tag <dir> <tag>:
# Return a list of git root directories found under <dir> and
# having tag <tag>.
#
git_list_containing_tag () {
    local DIR="${1}"
    local TAG="${2}"
    local d
    for d in $(git_list "${DIR}"); do
        (
        cd "$d" || exit 1
        # grep matches substrings, so TAG also matches longer tag names.
        if git tag | grep -q "$TAG"; then
            echo "$d"
        fi
        )
    done
}
#
# git_context:
# Returns a bash script that can be used to recreate the current git context.
#
# Note: all paths are relative to $MY_REPO/..
#
git_context () {
    (
    cd "$(git_ctx_root_dir)" || exit 1
    local d
    for d in $GIT_LIST_REL; do
        (
        cd "${d}" || exit 1
        # Emit one line per git: (cd <dir> && git checkout -f <sha>)
        echo -n "(cd ${d} && git checkout -f "
        echo "$(git rev-list HEAD -1))"
        )
    done
    )
}
#
# git_test_context <context>:
#
# Test if all commits referenced in the context are present
# in the history of the gits in their current checkout state.
#
# Returns: 0 = context is present in git history
#          1 = At least one element of context is not present
#          2 = error
#
git_test_context () {
    local context="$1"
    local query=""
    local target_hits=0
    local actual_hits=0

    if [ ! -f "$context" ]; then
        return 2
    fi

    query=$(mktemp "/tmp/git_test_context_XXXXXX")
    if [ "$query" == "" ]; then
        return 2
    fi

    # Transform a checkout context into a query that prints
    # all the commits that are found in the git history.
    #
    # Limit search to last 500 commits in the interest of speed.
    # I don't expect to be using contexts more than a few weeks old.
    sed "s#checkout -f \([a-f0-9]*\)#rev-list --max-count=500 HEAD | grep \1#" "$context" > "$query"

    # Each context line names one commit, so a full match gives one hit per line.
    target_hits=$(wc -l < "$context")
    actual_hits=$(cd "$(git_ctx_root_dir)" && source "$query" | wc -l)
    \rm "$query"

    if [ "$actual_hits" -eq "$target_hits" ]; then
        return 0
    fi
    return 1
}
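
To make the context-to-query transformation in git_test_context easier to
follow, the sketch below applies the same kind of sed substitution to a
single sample line. The wrapper name context_line_to_query, the directory
./some-git and the SHA are made up for illustration.

```shell
# context_line_to_query (hypothetical wrapper):
# Read git_context style lines on stdin and rewrite each checkout into a
# search for that commit in the last 500 commits reachable from HEAD.
context_line_to_query () {
    sed 's#checkout -f \([a-f0-9]*\)#rev-list --max-count=500 HEAD | grep \1#'
}

# Sample input line with an invented directory and SHA.
echo '(cd ./some-git && git checkout -f 0123456789abcdef0123456789abcdef01234567)' \
    | context_line_to_query
```

This prints
(cd ./some-git && git rev-list --max-count=500 HEAD | grep 0123456789abcdef0123456789abcdef01234567)
which, when sourced, emits one line only if that commit is found, so
counting output lines against context lines tells whether every commit
is present.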