-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: INFENG-382: Release redesign #10002
Conversation
…#9442) INFENG-704: Remove bumpversion * Update determined Python package to fetch its own version dynamically from package metadata using importlib.metadata, which seems to be the canonical way to fetch the version string at runtime. In lieu of something more complicated, I assigned a module-level __version__ variable in __init__.py to avoid touching the rest of the package code that relies on this behavior. * Migrate all of the setup.py options into the existing pyproject.toml, except options that are no longer necessary, leaving the long_description in because pyproject.toml seems to be unable to reference README files above the pyproject.toml root. * Remove harness/determined/__version__.py and model_hub/model_hub/__version__.py, as they are no longer necessary. The __version__ dunder is now provided by __init__.py. See PEP-396 <https://peps.python.org/pep-0396/> for more information about why this is no longer desirable. * Add explicit dependencies on setuptools>=64 and setuptools_scm>=8 in harness/pyproject.toml and model_hub/pyproject.toml to allow tagging versions dynamically from git. * Add setuptools==70.0.0 and setuptools-scm==8.1.0 as dependencies to support setuptools dynamically using git tag information. * Set build-backend to setuptools.build_meta to use setuptools_scm. * Move all setuptools.setup() args into pyproject.toml, because newer version of setuptools, if I understand correctly, have support for them. * Remove bumpversion config for setup.py and __version__.py.
INFENG-709: Remove CloudFormation versions * Remove hardcoded Determined versions from CloudFormation templates. Because we're already passing through an optional command line parameter anyway, we can instead set that default value to the existing determined.__version__ variable pulled from package metadata, instead of setting a default value in the `Version` parameter of CloudFormation templates.
…9515) INFENG-726: Generate versions.json from git tags This changeset introduces a new Python script, docs/gen-versions.py, to generate the existing versions.json file from git tags, while maintaining the previous structure of the file and incorporating new tags as they get created. My initial thought was to use the existing versions.json file as a template, and then append subsequent entries to it from new git tags as they get created and pushed. This felt brittle, though, and very special case-y. Instead, I use the GitPython version of git-rev-list to walk the DAG from all tags back to right before 0.21.0. I then filter out all tags that aren't a major, minor, or patch release, and exclude versions that have tags, but without corresponding entries in the original versions.json file. This gives us all relevant doc tags between now and 0.21.0 (inclusive), which we can use to generate versions.json. It also works for any new tags we use going forward. The resultant file still winds up where it normally would, and should get packaged and released like it would if it were stored in git on disk. The summary of changes is as follows: * Add gen-versions.py, which generates versions.json from git tags. * Add a new docs/Makefile target, gen-versions, to run gen-versions.py as part of the docs build process. * Remove versions.json from .bumpversion.cfg. * Remove versions.json.
INFENG-739: remove version string from Vite config * Replace the process.env.VERSION variable explicitly defined in vite.config.mts with either the actual VERSION environment variable, if it's set, or the most recent git tag, followed by an optional commit counter and partial unique SHA hash. This shells out to git, and accounts for the rare but possible case that the config is being from outside of a git repository. In that situation, the version string is useless anyway, so we return "unknown" from the shell. * During the release, we will set VERSION ourselves based on the tag, so we don't have to go over the top trying to parse it out here. The local version doesn't matter too much, either, as it's only referenced during development, and, as far as I can tell, is really only used to display the current version to the user. During the release, we can set VERSION from CircleCI directly, as it already provides the tag if the run was triggered by one.
INFENG-744: Remove VERSION file * Remove dependency on static VERSION file and replace it with a bash script, version.sh, that either fetches the version from git tags, usually used for local builds, or returns an existing environment variable, mainly used in CI. * The script uses a few different git tools to find the most recent previous tag for a given commit, a combination of git-describe, git-tag, and git-merge-base. Additional implementation details are available in the comments in that file. The resulting version, though, is <prev-tag>+<HEAD-short-SHA>. This ensures that certain Python dependency constraints are able to be satisfied when building Determined locally. Otherwise, if VERSION is already set, as will be the case in CI, the script simply returns that instead. * Replace existing `cat`s of the VERSION file with invocations of this shell script; specifically, in the following files: - .devcontainer/server.sh - .github/workflows/lint-python.yml - agent/Makefile - docs/Makefile - docs/deploy/Makefile - helm/Makefile - master/Makefile - model_hub/Makefile * Invoke version.sh in the Python Determined module Makefile: harness/Makefile. This was necessary to replace my previous attempt at fetching version string information from git using a Pythom setuptools plugin called setuptools_scm. Although that library works and is quite handy, and I sincerely thank Ronny Pfannschmidt and all other contributors for their work, configuring it was non-trivial, at times, and it was becoming difficult to cleanly get it to do what I wanted. I also decided that it makes more sense to have a single, canonical version string (not unlike the VERSION file itself) that we use for as many things as possible, notwithstanding certain potential incongruences between how different systems parse version strings. Now, instead of using setuptools_scm, we read the environment variable in harness/setup.py, and pass that to setuptools as a dynamic version. This also helps ensure that there aren't multiple ways to fetch a canonical version, which could become out of sync if we aren't careful. It does, however, mean that you need to use the Makefile to build the determined module; if you want to invoke `python -m build` directly, you'll have to set VERSION yourself. This is unlikely. If VERSION is unset, setup.py will raise an exception. * Remove the --version-file flag from docs/deploy/upload.py. I don't think it's even used, but it's certainly unnecessary now. * Replace reading the VERSION file in docs/deploy/upload.py with fetching the VERSION environment variable, which should be set by the docs makefile. I've done the same thing in docs/conf.py. * Although I've described what version.sh does, the diff will show some minor tweaks, like editing comments and moving some tag-munging commands around. I've also removed `set -x`, because I forgot that executing the script locally, even in a subshell, will set shell options globally, and apply after the script exits. This is undesirable. There are ways to mitigate this, but it's not a pressing issue. * Remove VERSION file from .bumpversion.cfg.
INFENG-743: Remove Helm chart bumpversion * Update Chart.yaml version strings to dummy values: 0.0.0, and instead pass VERSION in through --version and --app-version flags during helm package in helm/Makefile.
* Remove bump2version ("bumpversion", colloquially) dependency, both literally from requirements.txt and also the last remaining reference in its config file, .bumpversion.cfg, along with the config file itself. The remaining reference was in .circleci/real_config.yml. Removing it, though, was somewhat tricky, and had a few knock-on effects that had to be solved. The version string itself was a default for the pipeline parameter det-version, which is threaded through the entirety of the build process as a source of truth for the Determined version information. Instead of updating the CircleCI config itself, we can dynamically generate it in .circleci/config.yml and set the parameter there. There's still some work left to do to get tag builds working correctly, but for now, I call version.sh to set det-version, which should work correctly for all branch builds, by returning the local version string, e.g. 0.34.0+2e616a3d7 * Update version.sh to fix a small bug that caused git tags with subsequent 'v's in them, e.g. v1.0.3-dev0, to be munged incorrectly into 1.0.3-de0. Now, we use ${parameter#word} shell expansion to remove the first singular 'v' from tags. * Update two direct invocations of setup.py in CircleCI to instead use the build module. Invoking setup.py directly is deprecated. * Update the agent creation code to URL-escape the query parameter that's used to set the agent version. Under the new versioning scheme, local build versions use a short SHA after a plus + for uniqueness, and the plus sign was being interpreted as being part of a URL, where it was then turned into a space. This broke tests that attempt to fetch the agent version (it's also just plain incorrect). Now, the version string is properly escaped, and the proper agent version is returned. * Update how setup.py and the determined (harness/) Python package manage versions. I kept running into test errors that indicated that the correct version wasn't being set somewhere in setuptools. At first, I mistakenly assumed that pytest was, somehow, using the bare source for testing. This is incorrect, though (and makes no sense anyway). pytest still needs to import packages, and it does so after we pip install an editable install of determined. Editable installs still end up calling setuptools, which means the built package has package metadata, and so we can still use importlib.metadata locally to fetch version information. The only bit that really had to be changed was harness/setup.py and model_hub/setup.py. I did tweak harness/determined/__init__.py, but that's just to catch the possible importlib.metadata.PackageNotFoundError exception. I believe this will never happen, but I've been wrong before. And there really is no excuse for __init__.py to ever do anything that isn't successful. So now, and it feels mildly gross, but is not too different from how other large projects like numpy manage versions, setup.py simply shells out to version.sh to get a version number during package build, if VERSION isn't already set. I realize that I also check VERSION in version.sh, but a belt and suspenders won't hurt. If VERSION is set, we use that; otherwise, we get the local (+SHA) version. This enables us to do editable installations, which a bunch of things assume anyway, without having to ensure VERSION is set. Before, setup.py assumed VERSION was set, or else it would error out. But when you pip install -e harness directly, like in CI or even locally (which can happen with pip install -r requirements.txt, as -e harness is the first line), that variable won't necessarily be set. So instead, setup.py checks VERSION, and then, if it's empty, attempts to fetch the version itself. If setup.py and __init__.py can't retrieve versions correctly, they'll return default values. setup.py will return 1!0.0.0+unknown, whereas __init__.py will return 2!0.0.0+unknown. My hope is that using two different versions can help indicate where the error is, and using epochs should make package requirement resolution still pass, as both 1 and 2 are greater than the implicit 0 epoch that our version numbers use now. * Update/remove comments that reference bumpversion. * Add additional VERSION environment variable to CircleCI config. * Update Python file to comport with black. * Add return types to functions in setup.py to satisfy mypy.
* Update package-and-push-system-dev and package-and-push-system-dev-ee CircleCI jobs to build dev Docker images correctly. Specifically, fix a bash if statement to check for if we're running the job from the "main" branch or a branch prefixed with "release-". The previous behavior would match on any git branch regardless.
Some of my changes from months ago included modifications to how we build and release model_hub. Subsequently, model_hub was removed entirely. This changeset removes the last vestiges of model_hub that crept in during my rebase. * Modify .circleci/real_config.yml to remove remaining references to model_hub. * Remove model_hub directory and contents.
Docsite preview being generated for this PR. |
✅ Deploy Preview for determined-ui canceled.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #10002 +/- ##
==========================================
- Coverage 54.59% 54.58% -0.02%
==========================================
Files 1258 1257 -1
Lines 157136 157140 +4
Branches 3617 3615 -2
==========================================
- Hits 85795 85770 -25
- Misses 71208 71237 +29
Partials 133 133
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Docsite preview being generated for this PR. |
docs/README.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -1,118 +0,0 @@ | |||
[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So let's talk about what happens for patch releases, as said in your description: I think anytime we do a release, the version switcher should be updated to include it. And we probably shouldn't retire releases from docs as quickly as we will stop maintaining them. IMO, we should consider populating the list from the list of available versions in S3 plus the one currently being built. This requires checking right before publishing, so that if we do a couple of releases in short-ish succession, we don't overwrite one of them because it's docs weren't actually published yet, and the build process of the other release didn't know about it. And then we would just need to decide when (if ever) we retire a release from docs.
What are your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'm familiar enough with the docs publishing part of the automation to answer reliably as part of this review. What exactly would we be reading from S3? With this current changeset, though, every new minor or patch release will get a corresponding docs version published, which includes appending to the version picker versions.json
file. This all happens without relying on external services, because the git tags should be the source of truth for which versions we've published. But there's extra docs automation happening that I didn't look at as part of this work (mostly because it never came up), and that might require alteration, ideally as a separate pull request. If we want to pick and choose versions for docs, I would argue that we should then decouple the docs entirely from the Determined release process.
I'm also not sure if the existing documentation automation will handle jumping around between versions well (e.g. if we do a patch release on an older, minor release branch), because of the assumptions that code still makes. I'd need to spend more time looking closely; the docs automation is doing a bunch of stuff that I haven't needed to look at yet, and I don't think any of those necessary changes would make sense to go in as part of this PR.
Docsite preview being generated for this PR. |
same archive artifact naming scheme as the production artifacts use.
Docsite preview being generated for this PR. |
set process.env.VERSION.
Docsite preview being generated for this PR. |
Docsite preview being generated for this PR. |
harness/setup.py. This might fix a Windows test.
Docsite preview being generated for this PR. |
linter stops complaining.
Docsite preview being generated for this PR. |
workflow/job filters.
Docsite preview being generated for this PR. |
Docsite preview being generated for this PR. |
is there a reason why we're not deleting the VERSION file? |
Docsite preview being generated for this PR. |
Ha! Four-ish months of work and of course I forgot to delete the |
Ticket
INFENG-382 (epic)
INFENG-807 (epic)
See also: individual tickets from PRs linked in commits
Introduction
This changeset represents a significant overhaul to the Determined release process.
History
Early on in this work, I decided that it would be easier to maintain a long-running feature branch and merge relevant functionality into it over time, instead of attempting to fiddle with build/release functionality on
main
while somehow also gating it so it doesn't trigger before it was ready. I still believe this was prudent. As a result, though, there is now a feature branch with a significant amount of changes (and I apologize for the large diff).I'm hoping that any big issues and questions were resolved in the individual pull requests I made along the way. Other than some last-minute
model_hub
removal due to a messy rebase, this diff should be a union of everything in those individual pull requests. Please consult the closed PRs before commenting to see if your issue was already addressed there.I will still attempt to summarize the big changes here, and provide another opportunity for folks to look things over before merging what is a significant amount of work.
Description
I had the following goals when redesigning the release process:
bump2version
dependency so releases are self-contained and not reliant on a static, on-disk version number. The tag represents the version, and everything downstream uses that instead.I didn't quite achieve that last bullet point, mostly because of how release notes work now, but other than that, I think I've addressed all of these. I'll describe in detail the changes, and how Determined developers should expect to interact with the build and release process going forward.
No more
bump2version
andVERSION
filebump2version
(colloquially: bumpversion, even though bothbumpversion
andbump2version
are deprecated, replaced bybump-my-version
) looks for version strings based on specified files or glob patterns and then modifies those files with new version strings, in a prescribed order. Part of the release process was a complicated state machine ofbumpversion
invocations to nudge different parts of the version string up depending on what we were doing.Although tracking version numbers on disk is an option (the Git project, for example, does this1), it also has downsides. It means that every release needs at least one additional commit each time a version number is incremented, along with an increasing number of files across the repository modified each time. It's also possible for it to become out of sync with the version that's being built, depending on how tagging is performed.
I would rather have git tags be the source of truth for version strings, especially since that's how we trigger release jobs anyway, so that's what I've attempted here.
The
VERSION
file andbumpversion
have been replaced with a shell script,version.sh
, that, by default, returns the next local version number. Setting theVERSION
environment variable will override this behavior. The script itself has extensive comments, but it tries to be sensitive to where you are within the git DAG. If, for example, your current branch was made off of main after0.35.0
and before0.36.0
, then it would return something like0.35.0+5df5142c38
, where5df5142c38
is a short hash ofHEAD
.Instead of reading from
VERSION
, any builds that relied on that file instead runversion.sh
. If CircleCI is building from a feature branch, it will do the right thing with the local version string. If it's a proper release, then the version itself will be set from a tag further back in the process.Going forward, if you need to work with a version string, attempt to get it from
version.sh
or via tag. Please talk to me if you need help designing around this constraint, I'm happy to help.1 The Git project also tags main for their releases, which simplifies a lot of things. They also only have a single version string in one file.
version-switcher.json
has been removedInstead of storing
version-switcher.json
on disk and updating it every time, I wrote a script calledgen-versions.py
during a build to generate the existing static one up until this point, and then it will consult git for all subsequent versions. (We might need to talk about how this works with patch releases.)CircleCI
.circleci/real_config.yml
has several changes.I consolidated some of the tag regular expressions. I also added a
v
prefix to the pattern that will trigger release builds. This is important.I added a
make-component
job to somewhat modularize all of the various ways we were runningmake
across the project. It also made it easier to add dry-run release job functionality, by having a single place to toggle a booleandryrun
variable. (Under the hood, ifdryrun: true
, it just appends-dryrun
to the givenmake
target. It's not perfect, but it works.)I added a
release-dryrun
workflow, which works similarly to therelease
workflow, except it (mostly) pushes artifacts to non-production locations (GitHub, PyPI, Docker Hub, etc.). There's a small clerical issue with thedetermined
package being pushed to test.pypi.org, which means one particular job will technically fail every time, but once we resolve the issue, that should be fixed. It doesn't particularly affect the efficacy of the workflow itself.Dry-run build steps
Some of the more verbose changes are to support the previously-mentioned
release-dryrun
workflow. This necessitated entirely separate files for dry-run GoReleaser configurations, as well as separate-dryrun
suffixedmake
targets for all build steps that were necessary for a full production release.Going forward, assuming we want the
release-dryrun
workflow to continue to work, we'll need to keep in mind that we have to update the dry-run functionality that exists in various makefiles and GoReleaser configs along with actual production updates to the build. Otherwise, the two workflows will begin to drift.determined
Python package build changesI moved the
determined
(harness/
) package to PyProject. This was more relevant before I made some changes, but I believe PyProject is becoming the blessed way to configure Python projects going forward anyway, so I've left it.The
determined
package also returns its own version differently now. Instead of openingVERSION
and outputting it, it usesimportlib.metadata
to grab the version string from its own package metadata, which should always exist as long as the package has been built. (So it's effectively the same thing, it's just using a canonical file thatsetuptools
generates during a build, instead of something we maintain ourselves.)I've also removed all
__version__.py
files. Although the__version__
dunder is discouraged (and PEP-396 to standardize it was rejected), I've kept that in because that would cause some serious downstream breakage for anyone who relies on it, and migrating to something else would be unnecessary work.Things to pay attention to
Some of these pieces are more critical than others. In general, I'd recommend giving a closer look at the following:
version.sh
docs/gen-versions.py
anddocs/gen-versions.sh
harness/setup.py
version.sh
and does something with that string.harness/determined/__init__.py
and, in particular, theimportlib.metadata
version-reading functionality.Test Plan
I hope that the test plans I provided for the individual constituent pull requests were adequate. I've also done extensive testing over the last few months, including fixing a lot of broken tests and build steps along the way. Given the global and production-impacting nature of these changes, I'm not sure I can provide an overarching test plan for everything here. I might recommend consulting the individual linked PRs and following their test plans. Otherwise, I plan to help out with or captain the next release, which will be a good debut for everything here.
Checklist
docs/release-notes/
See Release Note for details.