Initial testing framework #2095
Merged
islas
merged 12 commits into
wrf-model:release-v4.6.1
from
islas:initial-testing-framework
Sep 19, 2024
To run test scripts outside of a testing framework, the handling of environment setup should not depend solely on running within a dedicated test framework. This has the added benefit of separating the duties of environment and dependency solving from running the tests. These environment scripts allow for the selection of a particular environment, with the default being the FQDN of the current host. From there, arguments are routed using standard POSIX sh to a respective script. In the case of Derecho (applicable to any system using Lmod), all subsequent arguments are treated as modules to load into the current session. The hostenv.sh script relies on one "argument", $AS_HOST, being passed in via variable setting to facilitate selection. The helpers.sh script provides convenience features for substring checking in sh, delayed environment variable expansion via eval, and quick banner creation. The derecho.sh script is included as the first supported environment.
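The helper features and host selection described above could look roughly like the following POSIX sh sketch. The function names (contains, expand_later, banner, select_host, route_env) are hypothetical and are not taken from the actual helpers.sh or hostenv.sh; `hostname -f` is assumed to be available for the FQDN lookup, and the module routing only prints what it would load rather than calling Lmod.

```shell
#!/bin/sh
# Hypothetical sketch; the real helpers.sh / hostenv.sh may differ.

# contains STRING SUBSTRING: succeed if SUBSTRING occurs in STRING.
contains() {
  case "$1" in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}

# expand_later TEXT: expand variable references in TEXT at call time.
# eval executes its input, so this is only safe for trusted text.
expand_later() {
  eval "printf '%s\n' \"$1\""
}

# banner WIDTH TEXT: print TEXT between two rules of WIDTH '=' characters.
banner() {
  rule=''
  i=0
  while [ "$i" -lt "$1" ]; do
    rule="${rule}="
    i=$((i + 1))
  done
  printf '%s\n%s\n%s\n' "$rule" "$2" "$rule"
}

# select_host: honor $AS_HOST if set, otherwise fall back to the FQDN.
select_host() {
  if [ -n "${AS_HOST:-}" ]; then
    printf '%s\n' "$AS_HOST"
  else
    hostname -f 2>/dev/null || hostname
  fi
}

# route_env HOST [MODULE...]: dry run of the routing step. For a host whose
# name contains "derecho", every remaining argument is treated as a module.
route_env() {
  host=$1
  shift
  if contains "$host" derecho; then
    for mod in "$@"; do
      printf 'module load %s\n' "$mod"
    done
  else
    printf 'no configuration for %s, using current environment\n' "$host"
  fi
}
```

For example, `AS_HOST=derecho; route_env "$(select_host)" gcc netcdf` prints one `module load` line per module without touching the session.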
This script will facilitate the first tests. The planned testing framework places only three requirements on any given test script. If a different testing framework is used in the future, these requirements can and should be re-evaluated. The test script should:
1. Take the intended host / configuration environment as the first argument
2. Take the working directory to immediately change to as the second argument
3. Output some key phrase at the end of the test to denote success; anything else (non-zero exit code, or a zero exit code without the phrase) is a failure
This particular compilation test script satisfies the above while also providing enough flexibility to select the compile target, stanza configuration, number of parallel jobs, and other command-line options passed into the make build. Additionally, for convenience, environment variables can be passed in as command-line options to the test script to modularize certain inputs.
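The three requirements can be sketched as a minimal POSIX sh skeleton. This is an illustration only, not the compilation script from this PR: the phrase "TEST PASSED", the function names, and the rule that the phrase must be the last output line are all assumptions.

```shell
#!/bin/sh
# Hypothetical skeleton of a conforming test script.

PASS_PHRASE="TEST PASSED"   # assumed key phrase; the real one may differ

# run_test HOST_ENV WORK_DIR [COMMAND...]
run_test() {
  host_env=$1                  # requirement 1: host / configuration environment
  work_dir=$2                  # requirement 2: directory to change to immediately
  shift 2
  cd "$work_dir" || return 1
  echo "Running for host '$host_env' in $(pwd)"
  # Stand-in for the real work (e.g. a compile); any failure aborts
  # before the key phrase is printed.
  if [ "$#" -gt 0 ]; then
    "$@" || return 1
  fi
  echo "$PASS_PHRASE"          # requirement 3: key phrase at the very end
}

# passed STATUS OUTPUT: success only if the exit status is zero AND the
# key phrase is the last line of the captured output.
passed() {
  [ "$1" -eq 0 ] || return 1
  last=$(printf '%s\n' "$2" | tail -n 1)
  [ "$last" = "$PASS_PHRASE" ]
}
```

A caller would capture the output and exit status of `run_test` and hand both to `passed`, so that a zero exit code without the phrase still counts as a failure.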
Following the documentation of the hpc-workflows testing framework and the testing structure found in .ci/, a JSON file for a GNU compilation test was added. This test will compile the em_real core using the GNU Linux x86 stanza configuration. All other options are left as default. If this test is run using the derecho configuration, an attempt will be made to load the appropriate modules. For non-derecho environments, per the testing structure under .ci/, if no configuration exists in .ci/hostenv.sh then the current environment will be used verbatim.
This reusable workflow balances quick setup with GitHub Actions-specific features. It assumes that the tests can be controlled via a label being set in a PR. To coordinate PR vs primary branch testing, a suffix is generated using either the PR number or the branch name. This suffix is then used to relocate log files to an archival location in an organized fashion. GitHub artifacts are still used to capture failed tests, but logs are also moved to the archive location for quicker access by anyone who has access to where these tests execute. To allow for the parallelized testing available from hpc-workflows, the workflow can make duplicate directories of the repository that can each run their own test instance without clobbering files. Once tests are run, results are gathered, relocated to the archival location, reported and printed to the screen, summarized into the actions summary page, and then packaged into an artifact if a failure occurred. Finally, the test label is removed if the named tests and label match.
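The suffix generation and log relocation could be sketched in POSIX sh as below. The suffix format ("PR" followed by the number, or the branch name with slashes replaced), the `*.log` pattern, and the function names are assumptions made for illustration; the actual workflow may derive and use the suffix differently.

```shell
#!/bin/sh
# Hypothetical sketch of suffix generation and log archiving.

# make_suffix EVENT PR_NUMBER BRANCH
# Pull request events yield "PR<number>"; anything else yields the branch
# name, with '/' replaced so the suffix is safe as a directory name.
make_suffix() {
  case "$1" in
    pull_request*) printf 'PR%s\n' "$2" ;;
    *)             printf '%s\n' "$3" | tr '/' '_' ;;
  esac
}

# archive_logs SRC_DIR ARCHIVE_ROOT SUFFIX
# Move every *.log file from SRC_DIR into ARCHIVE_ROOT/SUFFIX.
archive_logs() {
  dest="$2/$3"
  mkdir -p "$dest" || return 1
  for f in "$1"/*.log; do
    [ -e "$f" ] || continue    # the glob matched nothing
    mv "$f" "$dest/" || return 1
  done
}
```

Keying the archive directory on the suffix keeps logs from different PRs and branches from overwriting each other.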
This pipeline is triggered if any pushes occur on master or develop, or if a PR is labeled with an appropriate tag as specified by the tests within this workflow. Additionally, a specific label to trigger all tests can be used; it will be removed from the PR when all tests finish, regardless of exit status. The pipeline makes extensive use of the reusable test_workflow.yml to instantiate tests on runners. This pipeline currently only includes the definition for one test to be run on a GitHub runner with tags that satisfy "derecho". Likewise, other hard-coded values appearing in here assume a particular runner setup and environment.
I'm using the approach we use in MPAS to set up testing, starting with a very minimal setup (simple compilation tests) to get something going. The idea is to then gradually translate the current tests into a format usable by this framework.
The regression test results:
mgduda
reviewed
Sep 16, 2024
mgduda
reviewed
Sep 16, 2024
mgduda
approved these changes
Sep 17, 2024
kkeene44
approved these changes
Sep 19, 2024
TYPE: enhancement
KEYWORDS: testing, regression, test framework
SOURCE: internal
DESCRIPTION OF CHANGES:
Problem:
The current regression suite code is complex, requires maintenance of multiple alternate repositories, and takes considerable effort to add a new test, which limits community contribution. Likewise, the complexity of the system reduces the likelihood of independent local testing of changes, leading to a development cycle of one-off commits made only to re-invoke testing and see whether the meaningful commits fixed the issues.
Solution:
This new proposed regression suite addresses these shortcomings in a number of discrete ways:
As a first pass at demonstrating this solution, this PR implements a simple set of compilation tests using GNU x86 configurations testing serial, sm, dm, and sm+dm selections. The CI/CD portion is done via GitHub workflow actions on a specific trigger event. The values and trigger methods are configurable, but this initial implementation will use the labeled trigger, which will initiate tests when compile-tests or all-tests is added as a label to a pull request.
TESTS CONDUCTED:
RELEASE NOTE:
Introduce a modularized testing framework, housed within the WRF repository, that allows testing locally and natively on HPC systems.