Initial testing framework (#2095)
TYPE: enhancement

KEYWORDS: testing, regression, test framework

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
The current regression suite code is complex, requires maintenance of
multiple alternate repositories, and takes considerable effort to add a
new test, which limits community contribution. Likewise, the complexity
of the system reduces the likelihood of independent local testing of
changes, leading to a development cycle of one-off commits pushed only
to re-trigger testing and see whether the meaningful commits fix the
issues.

Solution:
This new proposed regression suite addresses these shortcomings in a
number of discrete ways:
1. Modularize the testing framework into a generalized, independent repo
usable by any repo seeking to set up tests that can run locally, on HPC
systems, and within any CI/CD framework
2. Write WRF-specific test scripts _inside_ the WRF repo and in a manner
that does not rely on specific layouts/hardware/etc. so long as WRF can
compile and run on the intended system (i.e. able to be run locally)
3. Write CI/CD tests in a simple and generally CI/CD framework-agnostic
method where definitions of these also reside _within the WRF repo_
4. Utilize HPC resources in a safe manner to increase breadth of testing
to allow testing of many more compilers and on similar hardware to the
general use case of WRF

As a first pass at demonstrating this solution, this PR implements a
simple set of compilation tests using GNU x86 configurations testing
serial, sm, dm, and sm+dm selections. The CI/CD portion is done via
GitHub workflow actions on a specific trigger event. The values and
trigger methods are configurable, but this initial implementation will
use the `labeled` trigger, which will initiate tests when
`compile-tests` or `all-tests` is added as a label to a pull request.
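
The gating described above can be sketched in plain shell. This is only an illustration, assuming a hypothetical helper named `should_run_compile_tests` that receives the event name and label name as strings; the real check is the `if` expression in `.github/workflows/ci.yml`, which GitHub Actions evaluates itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR) mirroring the job condition:
# run on any push event, or when the label just added is
# compile-tests or all-tests.
should_run_compile_tests()
{
  event_name="$1"
  label_name="$2"

  if [ "$event_name" = "push" ]; then
    return 0
  fi
  case "$label_name" in
    compile-tests|all-tests) return 0 ;;
    *)                       return 1 ;;
  esac
}

if should_run_compile_tests pull_request compile-tests; then
  echo "compile tests would run"
fi
```

Any other label leaves the job skipped, which is why adding an unrelated label to a pull request should not start the compilation tests.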

TESTS CONDUCTED:
1. Testing of this GitHub workflow was done in a separate fork, which
also tested on Derecho. Both positive and negative tests were used to
demonstrate the usefulness of the respective output.

RELEASE NOTE:
Introduce a modularized testing framework, living within the WRF
repository, that allows testing locally and natively on HPC systems
islas authored Sep 19, 2024
1 parent 1d86bcb commit 958ce12
Showing 10 changed files with 513 additions and 0 deletions.
22 changes: 22 additions & 0 deletions .ci/env/derecho.sh
@@ -0,0 +1,22 @@
#!/bin/sh

echo "Setting up derecho environment"
workingDirectory=$PWD
. /etc/profile.d/z00_modules.sh
echo "Loading modules : $*"
cmd="module purge"
echo $cmd && eval "${cmd}"

# We should be handed in the modules to load
while [ $# -gt 0 ]; do
  cmd="module load $1"
  echo $cmd && eval "${cmd}"
  shift
done

# Go back to the working directory if, for some unknown reason, the HPC config changed directories on us
if [ "$workingDirectory" != "$PWD" ]; then
  echo "derecho module loading changed working directory"
  echo " Moving back to $workingDirectory"
  cd "$workingDirectory"
fi
46 changes: 46 additions & 0 deletions .ci/env/helpers.sh
@@ -0,0 +1,46 @@
#!/bin/sh

# Useful string manipulation functions, leaving in for posterity
# https://stackoverflow.com/a/8811800
# contains(string, substring)
#
# Returns 0 if the specified string contains the specified substring,
# otherwise returns 1.
contains()
{
  string="$1"
  substring="$2"

  if [ "${string#*"$substring"}" != "$string" ]; then
    echo 0 # $substring is in $string
  else
    echo 1 # $substring is not in $string
  fi
}

setenvStr()
{
  # Changing IFS produces the most consistent results
  tmpIFS=$IFS
  IFS=","
  string="$1"
  for s in $string; do
    if [ -n "$s" ]; then
      eval "echo export \"$s\""
      eval "export \"$s\""
    fi
  done
  IFS=$tmpIFS
}

banner()
{
  lengthBanner=$1
  shift
  # https://www.shellscript.sh/examples/banner/
  printf "#%${lengthBanner}s#\n" | tr " " "="
  printf "# %-$(( ${lengthBanner} - 2 ))s #\n" "`date`"
  printf "# %-$(( ${lengthBanner} - 2 ))s #\n" " "
  printf "# %-$(( ${lengthBanner} - 2 ))s #\n" "$*"
  printf "#%${lengthBanner}s#\n" | tr " " "="
}
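
A short usage sketch for two of these helpers may be useful, since `contains` prints its result rather than returning it. The function bodies are repeated here (minus the echo of each export) only so the snippet runs on its own; the host name and the `DEMO_FLAG` variable are made up for illustration.

```shell
#!/bin/sh
# Copies of contains() and setenvStr() so this sketch is self-contained
contains()
{
  string="$1"
  substring="$2"
  if [ "${string#*"$substring"}" != "$string" ]; then
    echo 0
  else
    echo 1
  fi
}

setenvStr()
{
  tmpIFS=$IFS
  IFS=","
  string="$1"
  for s in $string; do
    if [ -n "$s" ]; then
      eval "export \"$s\""
    fi
  done
  IFS=$tmpIFS
}

# contains() echoes 0 or 1, so the caller compares the captured output
# (the host name below is hypothetical)
if [ "$( contains "login1.hsn.de.hpc.example" "hsn.de.hpc" )" -eq 0 ]; then
  echo "hostname matched"
fi

# The leading comma mimics how build.sh accumulates repeated -e options;
# the resulting empty field is skipped
setenvStr ",NUM_PROCS=4,DEMO_FLAG=on"
echo "NUM_PROCS=$NUM_PROCS DEMO_FLAG=$DEMO_FLAG"
```

Because the result is printed, a bare `if contains a b; then` would always succeed; the comparison against `0` is what carries the answer.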
16 changes: 16 additions & 0 deletions .ci/env/hostenv.sh
@@ -0,0 +1,16 @@
#!/bin/sh

# Allow selection of hostname, and if none is provided use the current machine
# While this may seem unintuitive at first, it provides the flexibility of using
# "named" configurations without being explicitly tied to fqdn
hostname=$AS_HOST
if [ -z "$hostname" ]; then
  hostname=$( python3 -c "import socket; print( socket.getfqdn() )" )
fi

if [ $( contains ${hostname} hsn.de.hpc ) -eq 0 ]; then
  # Derecho HPC SuSE PBS
  . .ci/env/derecho.sh
else
  echo "No known environment for '${hostname}', using current"
fi
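
The "named configuration" idea can be illustrated with a small sketch. It assumes a hypothetical helper, `env_script_for`, that only reports which environment script would be chosen, and it swaps the `python3` FQDN lookup for `uname -n` so it runs without Python (`uname -n` may return a short name rather than an FQDN). Only the `hsn.de.hpc` pattern comes from the script above.

```shell
#!/bin/sh
# Hypothetical helper: report which environment script would be sourced
# for a given host name, without actually sourcing it.
env_script_for()
{
  case "$1" in
    *hsn.de.hpc*) echo ".ci/env/derecho.sh" ;;
    *)            echo "none" ;;
  esac
}

# AS_HOST overrides detection; fall back to the node name when it is unset
AS_HOST="${AS_HOST:-$( uname -n )}"
echo "Environment script for '$AS_HOST': $( env_script_for "$AS_HOST" )"
```

Setting `AS_HOST=hsn.de.hpc` on any machine would therefore select the Derecho setup by name, which is the flexibility the comment in `hostenv.sh` describes.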
1 change: 1 addition & 0 deletions .ci/hpc-workflows
Submodule hpc-workflows added at ba8393
108 changes: 108 additions & 0 deletions .ci/tests/build.sh
@@ -0,0 +1,108 @@
#!/bin/sh
help()
{
  echo "./build.sh as_host workingdir [options] [-- <hostenv.sh options>]"
  echo " as_host First argument must be the host configuration to use for environment loading"
  echo " workingdir Second argument must be the working dir to immediately cd to"
  echo " -c Configuration build type, piped directly into configure"
  echo " -n Configuration nesting type, piped directly into configure"
  echo " -o Configuration optstring passed into configure"
  echo " -b Build command passed into compile"
  echo " -e environment variables in comma-delimited list, e.g. var=1,foo,bar=0"
  echo " -- <hostenv.sh options> Directly pass options to hostenv.sh, equivalent to hostenv.sh <options>"
  echo " -h Print this message"
  echo ""
  echo "If you wish to use an env var in your arg such as '-c \$SERIAL -e SERIAL=32',"
  echo "you will need to do '-c \\\$SERIAL -e SERIAL=32' to delay shell expansion"
}

echo "Input arguments:"
echo "$*"

AS_HOST=$1
shift
if [ "$AS_HOST" = "-h" ]; then
  help
  exit 0
fi

workingDirectory=$1
shift

cd $workingDirectory

# Get some helper functions
. .ci/env/helpers.sh

while getopts c:n:o:b:e:h opt; do
  case $opt in
    c)
      configuration="$OPTARG"
    ;;
    n)
      nesting="$OPTARG"
    ;;
    o)
      configOpt="$OPTARG"
    ;;
    b)
      buildCommand="$OPTARG"
    ;;
    e)
      envVars="$envVars,$OPTARG"
    ;;
    h)  help; exit 0 ;;
    :)  help; exit 1 ;;
    \?) help; exit 1 ;;
    *)  help; exit 1 ;;
  esac
done

shift "$((OPTIND - 1))"

# Everything else goes to our env setup
. .ci/env/hostenv.sh $*

# Now evaluate env vars in case it pulls from hostenv.sh
if [ ! -z "$envVars" ]; then
  setenvStr "$envVars"
fi

# Re-evaluate input values for delayed expansion
eval "configuration=\"$configuration\""
eval "nesting=\"$nesting\""
eval "configOpt=\"$configOpt\""
eval "buildCommand=\"$buildCommand\""

./clean -a

echo "Compiling with option $configuration nesting=$nesting and additional flags '$configOpt'"
./configure $configOpt << EOF
$configuration
$nesting
EOF

if [ ! -f configure.wrf ]; then
  echo "Failed to configure"
  exit 1
fi

echo "./compile $buildCommand"
./compile $buildCommand

result=$?

if [ $result -ne 0 ]; then
  echo "Failed to compile"
  exit 1
fi

# And a *very* special check because WRF compiles the WRF way and force-ignores all make errors
# putting the onus on US to check for things
if [ ! -x ./main/wrf.exe ]; then # There's a bunch of other execs but this is the most important and
                                 # doing more checks to accommodate just reinforces this bad design
  echo "Failed to compile"
  exit 1
fi

echo "TEST $(basename $0) PASS"
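
The "delayed expansion" step in `build.sh` is easiest to see in isolation. The sketch below is a stand-alone illustration, not an excerpt: it assumes the build command arrived with a literal `$NUM_PROCS` (as in the JSON test definition) and that the variable is only defined later, which `build.sh` does through `setenvStr`.

```shell
#!/bin/sh
# Single quotes keep $NUM_PROCS literal, as if it came in via -b
buildCommand='em_real -j $NUM_PROCS'

# The variable is defined only afterwards (build.sh does this from -e)
export NUM_PROCS=4

# Re-evaluating the stored string expands the variable now
eval "buildCommand=\"$buildCommand\""
echo "./compile $buildCommand"
```

Note that `eval` will also run any command substitution embedded in the argument, so this pattern is only appropriate for trusted test definitions.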
69 changes: 69 additions & 0 deletions .ci/wrf_compilation_tests-make.json
@@ -0,0 +1,69 @@
{
  "submit_options" :
  {
    "timelimit" : "00:20:00",
    "working_directory" : "..",
    "arguments" :
    {
      "base_env_numprocs" : [ "-e", "NUM_PROCS=4" ],

      ".*make.*::args_nesting" : [ "-n", "1" ],
      ".*make.*::args_configopt" : [ "-o", "-d" ],
      ".*make.*::args_build_tgt" : [ "-b", "em_real -j $NUM_PROCS" ]
    },
    "hsn.de.hpc" :
    {
      "submission" : "PBS",
      "queue" : "main",
      "hpc_arguments" :
      {
        "node_select" : { "-l " : { "select" : 1, "ncpus" : 16 } },
        "priority" : { "-l " : { "job_priority" : "economy" } }
      },
      "arguments" :
      {
        "base_env_numprocs" : [ "-e", "NUM_PROCS=16" ],
        "very_last_modules" : [ "netcdf" ],
        ".*gnu.*::test_modules" : [ "gcc" ],
        ".*intel(?!-llvm).*::test_modules" : [ "intel-classic" ],
        ".*intel-llvm.*::test_modules" : [ "intel-oneapi" ],
        ".*pgi.*::test_modules" : [ "nvhpc" ],
        ".*dm.*::test_mpi_module" : [ "cray-mpich" ]
      }
    }
  },
  "make-gnu" :
  {
    "steps" :
    {
      "serial" :
      {
        "command" : ".ci/tests/build.sh",
        "arguments" : [ "-c", "32" ]
      },
      "sm" :
      {
        "command" : ".ci/tests/build.sh",
        "arguments" : [ "-c", "33" ],
        "dependencies" : { "serial" : "afterany" }
      }
    }
  },
  "make-gnu-mpi" :
  {
    "steps" :
    {
      "dm" :
      {
        "command" : ".ci/tests/build.sh",
        "arguments" : [ "-c", "34" ]
      },
      "dm+sm" :
      {
        "command" : ".ci/tests/build.sh",
        "arguments" : [ "-c", "35" ],
        "dependencies" : { "dm" : "afterany" }
      }
    }
  }
}
97 changes: 97 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,97 @@
name: Regression Suite
run-name : ${{ github.event_name == 'push' && 'CI' || github.event.label.name }} (${{ github.event_name }})

on:
  push:
    branches: [ master, develop ]
  # See https://stackoverflow.com/a/78444521 and
  # https://github.com/orgs/community/discussions/26874#discussioncomment-3253755
  # as well as official (but buried) documentation :
  # https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull-request-events-for-forked-repositories-2
  pull_request:
    types: [ labeled ]

# https://docs.github.com/en/actions/sharing-automations/reusing-workflows#supported-keywords-for-jobs-that-call-a-reusable-workflow
# Also https://stackoverflow.com/a/74959635
# TL;DR - For public repositories the safest approach will be to use the default read permissions, but at the cost
# of not being able to modify the labels. That will need to be a separate [trusted] workflow that runs from the base repo
# permissions :
#   contents : read
#   pull-requests : write

# Write our tests out this way for easier legibility
# testsSet :
#   - key : value
#     key : value
#     tests :
#       - value
#       - value
#   - < next test >
# https://stackoverflow.com/a/68940067
jobs:
  buildtests:
    if : ${{ github.event.label.name == 'compile-tests' || github.event.label.name == 'all-tests' || github.event_name == 'push' }}
    strategy:
      max-parallel: 4
      fail-fast: false
      matrix:

        testSet :
          - host : derecho
            hpc-workflows_path : .ci/hpc-workflows
            archive : /glade/work/aislas/github/runners/wrf/derecho/logs/
            account : NMMM0012
            name : "Make Compilation Tests"
            id : make-tests
            fileroot : wrf_compilation_tests-make
            args : -j='{"node_select":{"-l ":{"select":1}}}'
            pool : 8
            tpool : 1
            mkdirs : true
            tests :
              - make-gnu
              - make-gnu-mpi
              # add new compilation tests here

    uses : ./.github/workflows/test_workflow.yml
    with :
      # This should be the only hard-coded value, we don't use ${{ github.event.label.name }}
      # to avoid 'all-tests' being used in this workflow
      label : compile-tests

      # Everything below this should remain the same and comes from the testSet matrix
      hpc-workflows_path : ${{ matrix.testSet.hpc-workflows_path }}
      archive : ${{ matrix.testSet.archive }}
      name : ${{ matrix.testSet.name }}
      id : ${{ matrix.testSet.id }}
      host : ${{ matrix.testSet.host }}
      fileroot : ${{ matrix.testSet.fileroot }}
      account : ${{ matrix.testSet.account }}
      tests : ${{ toJson( matrix.testSet.tests ) }}
      mkdirs : ${{ matrix.testSet.mkdirs }}
      args : ${{ matrix.testSet.args }}
      pool : ${{ matrix.testSet.pool }}
      tpool : ${{ matrix.testSet.tpool }}
    # I am leaving this here for posterity if this is to be replicated in private repositories for testing
    permissions:
      contents: read
      pull-requests: write
    name : Test ${{ matrix.testSet.name }} on ${{ matrix.testSet.host }}

  # In the event that 'all-tests' is used, this final job will be the one to remove
  # the label from the PR
  removeAllLabel :
    if : ${{ !cancelled() && github.event.label.name == 'all-tests' }}
    name : Remove 'all-tests' label
    runs-on: ubuntu-latest
    needs : [ buildtests ] # Put tests here to make this wait for the tests to complete
    steps:
      - name : Remove '${{ github.event.label.name }}' label
        env:
          PR_NUMBER: ${{ github.event.number }}
        run: |
          curl \
            -X DELETE \
            -H "Accept: application/vnd.github.v3+json" \
            -H 'Authorization: token ${{ github.token }}' \
            https://api.github.com/repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/labels/${{ github.event.label.name }}