We use Waf as our build tool. However, before adding the Waf file to the individual projects, we first add some additional tools to Waf.
These help us handle and resolve library dependencies. The goal is to add functionality to Waf so that it can clone and download the needed dependencies automatically.
Table of Contents:
- License
- Building our custom Waf binary
- Tests
- Source code
- Resolver features
- Context helpers
- Future features
This project is under the same BSD license as the Waf project. The license text can be found here: https://gitlab.com/ita1024/waf/blob/master/waf-light#L6-30
Clone the repository:
git clone https://github.com/steinwurf/waf.git
Build waf and include our custom tools:
python waf configure
python waf build
This will produce a waf binary in the build folder, which we may copy into our projects.
To ensure that the tools work as intended, we provide a set of tests. To run the tests, invoke:
python waf --run_tests
Passing --skip_network_tests will skip any unit tests that rely on network connectivity.
To test the freshly built Waf binary, some unit tests use network connectivity to resolve dependencies. This makes those tests slow.
An example of such a test is the self_build test, which will invoke a freshly built waf binary with the wscript used to build it - yes, very meta :)
When working with a failing test or similar, it may be beneficial to run only a selected set of tests. This can be achieved by passing a test filter to pytest using the --test_filter option:
python waf --test_filter="test_git"
The --test_filter string is passed on to the pytest -k option. See the pytest documentation for more information:
https://docs.pytest.org/en/latest/usage.html#specifying-tests-selecting-tests
Running pytest --help produces the following description of the -k option:
-k EXPRESSION only run tests that match the given substring expression. An expression is a python evaluable expression where all names are substring-matched against test names and their parent classes. Example: -k 'test_method or test_other' matches all test functions and classes whose name contains 'test_method' or 'test_other'. Additionally keywords are matched to classes and functions containing extra names in their 'extra_keyword_matches' set, as well as functions that have names assigned directly to them.
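Since the filter is a full -k expression, substrings can be combined with boolean operators. For example (the substrings below are purely illustrative and depend on the actual test names):
python waf --test_filter="git and not http"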
We use pytest to run the unit tests and integration tests. If some unit tests fail, it may be helpful to go to the test folder and invoke the failing waf commands manually.
Using our default configuration, pytest will create a local temporary folder called pytest when running the tests. This can be overridden with the --pytest_basetemp option.
If a test uses the testdirectory fixture, then pytest will create a subfolder matching the test function name. For example, if you have a test function called test_empty_wscript(testdirectory), then the first invocation of that test will happen inside pytest/test_empty_wscript0.
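A minimal sketch of such a test is shown below; the copy_file helper and the file path are assumptions about the pytest-testdirectory plugin and the repository layout, not something taken from this project:

def test_empty_wscript(testdirectory):
    # The first invocation of this test runs inside pytest/test_empty_wscript0.
    # copy_file is assumed to be provided by the testdirectory fixture.
    testdirectory.copy_file("test/empty_wscript/wscript")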
We use the logging system provided by Waf. If you have an issue with the resolve functionality, you can add the -v verbose flag (or -vvv to see all debug information). Alternatively, you can use the --zones filter to see only the resolver debug messages:
python waf configure -v --zones=resolve
The default zone printed by waf when adding the verbose flag -v is runner, so if you want to see that as well, pass:
python waf configure -v --zones=resolve,runner
The modifications and additions to Waf are in the src/wurf folder. The main file included by Waf is src/wurf/waf_entry_point.py. This is a great place to start to understand our additions to Waf.
Waf will load this file automatically when starting up, which is achieved using the --prelude option of Waf, as described in the Waf book:
https://waf.io/book/#_customization_and_redistribution
The location of the source files is a bit tricky, as Waf will move the files in the src/wurf folder to waflib.extras.wurf. In the core files, we use relative imports (from . import xyz). When running the unit tests, we add src to PYTHONPATH, so the tested classes are imported like this:
from wurf.xyz import Xyz
Code that uses/imports code from core Waf is prefixed with waf_. This makes it easy to see which files are pure Python and which provide the integration points with Waf.
The main modification to the standard Waf flow of control is the addition of the ResolveContext. At a high level this looks as follows:
./waf ....
         +
         |  1.
         |
+--------v--------+    2.     +----------------+
|                 | +-------> |                |
| OptionsContext  |           | ResolveContext |
|                 | <-------+ |                |
+-----------------+    3.     +----------------+
         |
         |  4.
         |
+--------v--------+
| ConfigureContext|
| BuildContext    |
| ....            |
+-----------------+
Let's outline the different steps:
- The user invokes the waf binary in the project folder. Internally, Waf creates the OptionsContext, which recurses into the user's wscript files and collects the project options.
- However, before that happens we create the ResolveContext, which is responsible for making sure that the declared dependencies are available. The resolve step has two main modes of operation, "resolve" and "load". In "resolve" mode we try to fetch the needed dependencies, e.g. via git clone or other means. In "load" mode we expect dependencies to have already been resolved and made available on our local file system (we just load the information about where they are located). Roughly speaking, we are in "resolve" mode when the user runs the "configure" command, i.e. python waf configure ..., and otherwise in "load" mode.
- In both cases the ResolveContext makes a dependency available by producing a path to that dependency, which can later be used in other contexts. E.g. if the dependency declares that it is recursable, we will automatically recurse it for options, configure and build.
- After the OptionsContext has executed and collected all options, control is passed to the next Waf / user-defined context. At this point, the paths to the dependencies are still available in the global dependency_cache dictionary in waf_resolve_context.py (see the sketch below).
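As an illustration, a later context can inspect that cache directly. The sketch below only prints the dictionary; how its entries are keyed and what they contain is an implementation detail not documented here, so treat it as an assumption:

def build(bld):
    # Print the global dependency cache populated by the resolve step.
    from waflib.extras.wurf import waf_resolve_context
    print(waf_resolve_context.dependency_cache)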
Sometimes it is useful to skip the resolve step, e.g. if you are doing something other than building the source code.
We've added an option to skip the resolve step:
python waf --no_resolve ...
There are two overall ways of specifying a dependency:
- Using a resolve.json file.
- Defining a resolve(...) function in the project's wscript.
A dependency is described using a number of key-value attributes. The following defines the general dependency attributes:
The name attribute is a string that assigns a human-readable name to the dependency:
{ "name": "my-pet-library", ... }
The name must be unique among all dependencies.
The resolver attribute is a string that specifies the resolver type used to download the dependency:
{
    "name": "my-pet-library",
    "resolver": "git",
    ...
}
Valid resolver types are: {"git" | "http"}.
The optional attribute is a boolean which specifies that a dependency needs to be enabled in the resolve step inside the wscript:
{ "name": "my-pet-library", "resolver": "git", "optional": true, ... }
In the wscript we can then conditionally enable the dependency by adding the following to the resolve(...) function:
def resolve(ctx):
    if some_condition:
        ctx.enable_dependency("my-pet-library")
If optional is not specified, it will default to false.
Note
The resolve step is performed before the options step. This means that if a dependency needs to be enabled based on a user option, one must check for that option using sys.argv or similar, rather than using the ctx.options object.
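A minimal sketch of such a check, using a made-up command-line flag --with-my-pet-library:

import sys

def resolve(ctx):
    # ctx.options is not populated yet, so inspect the raw arguments instead.
    if "--with-my-pet-library" in sys.argv:
        ctx.enable_dependency("my-pet-library")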
The recurse attribute specifies whether Waf should recurse into the dependency folder.
This is useful if the dependency is itself a Waf project. When recursing into a folder, Waf will look for a wscript in the folder and execute its commands.
Currently, if recurse is true, we will automatically recurse into the dependency and execute the following Waf commands: resolve, options, configure and build.
Since we also recurse into resolve, this enables us to recursively resolve the dependencies of our dependencies.
If you have a wscript where you would like to recurse into the dependencies for a custom waf command, say upload, then add the following to your wscript's upload function:
def upload(ctx):
    # ... your code

    # Now let's recurse and execute the upload functions in the
    # dependencies' wscripts.
    from waflib.extras.wurf import waf_resolve_context

    # Call upload in all dependencies (if it exists)
    waf_resolve_context.recurse_dependencies(ctx)
Example of attributes:
{ "name": "my-pet-library", "resolver": "git", "optional": true, "recurse": true, ... }
If recurse is not specified, it will default to true.
The internal attribute is a boolean that specifies whether the dependency is internal to the specific project. Let's make a small example: say we have two libraries, libfoo which depends on libbar. libbar has a dependency on gtest for running unit tests etc. However, when resolving the dependencies of libfoo we only get libbar, because gtest is marked as internal to libbar.
As illustrated by the small figure:
+-------+
|libfoo |
+---+---+
    |
    |
    v
+---+---+  internal   +--------+
|libbar | +---------> | gtest  |
+-------+             +--------+
Example of attributes:
{ "name": "my-pet-library", "resolver": "git", "optional": true, "recurse": true, "internal": true, ... }
If internal is not specified, it will default to false.
Internal dependencies can be skipped in the top-level resolve step by providing the --skip_internal option.
The source attribute contains the URL for the dependency. The URL format depends on the resolver.
Example of attributes:
{ "name": "my-pet-library", "resolver": "git", "optional": true, "recurse": true, "internal": true, "source": "github.com/myorg/mylib.git" }
Note
The previous sources attribute has been deprecated and will be removed in a future version. Please use the source attribute instead.
The post_resolve attribute is a list of steps to be performed after successfully resolving a dependency. The steps will be performed in the order in which they are specified.
Example of attributes:
{ "name": "my-pet-library", "resolver": "git", "optional": true, "recurse": true, "internal": true, "source": "github.com/myorg/mylib.git", "post_resolve": [ { "type": "run", "command": "tar -xvj file.tar" } ] }
The idea is to support different types of post_resolve steps. Currently we support the following:
- run: This type of post resolve step runs a command in the folder where the dependency has been resolved. The command can be either a string or a list of strings, i.e. the following is also valid:
{ "type": "run", "command": ["tar", "-xvj", "file.tar"] }
The method attribute on a resolver of type git allows us to select how the git resolver determines the correct version of the dependency to use.
The simplest method to use is checkout, which, combined with the checkout attribute, will use git to clone a specific tag, branch or SHA1 commit:
{ "name": "somelib" "resolver": "git", "method": "checkout", "checkout": "my-branch" "source": "github.com/myorg/somelib.git" ... }
The semver method will use Semantic Versioning (www.semver.org) to select the correct version (based on the available git tags). Using the major attribute, we specify which major version of a dependency to use. Example:
    On first resolve             Second resolve
+-------------------------+-------------------------+
|            4.0.0        |            4.0.0        |
|            4.0.1        |            4.0.1        |
| Selected +---> 4.1.1    |            4.1.1        |
|                         | Selected +---> 4.2.0    |
|                         |            5.0.0        |
+-------------------------+-------------------------+
On the initial resolve, the newest available tag with major version 4 is 4.1.1. At a later point in time we re-run the resolve; by then new versions of our dependency have been released and the newest matching tag is now 4.2.0.
Attributes:
{ "name": "someotherlib" "resolver": "git", "method": "semver", "major": 4, "source": "github.com/myorg/someotherlib.git" }
Using the pull_submodules attribute, you can control whether submodules in a git dependency should be cloned/pulled. The default is true, which will clone/pull submodules if found. To avoid cloning/pulling submodules, set pull_submodules: false:
{ "name": "somelib" "resolver": "git", "method": "checkout", "checkout": "my-branch" "source": "github.com/myorg/somelib.git", "pull_submodules": false ... }
Using the http resolver, we can download dependencies via HTTP. The filename attribute specifies the filename of the downloaded dependency:
{ "name": "myfile" "resolver": "http", "filename": "somefile.zip", "source": "http://mydomain.com/myfile.zip" }
The attribute is optional. If not specified, the resolver will try to derive the filename from the dependency URL or the returned HTTP headers.
If the dependency is an archive (e.g. zip, tar.gz, etc.), the extract boolean specifies whether the archive should be extracted:
{ "name": "myfile" "resolver": "http", "extract": true, "source": "http://mydomain.com/myfile.zip" }
If the extract attribute is not specified, it defaults to false.
Dependencies are specified using the resolve.json file. A simple example of a resolve.json file specifying a single git semver dependency:
[ { "name": "waf-tools", "resolver": "git", "method": "semver", "major": 4, "source": "github.com/steinwurf/waf-tools.git" } ]
All dependencies need to be specified in this way. In some situations, where the need for a dependency relies on runtime information, it can be specified as "optional" and then enabled or disabled in a user-defined resolve(...) function in the wscript.
To support both these configuration methods, we define the following "rules":
The purpose of this feature is to provide stable locations in the file system for the downloaded dependencies.
By default, several folders will be created during the process of resolving dependencies. Several projects can share the same folder for resolved dependencies (this is controlled using the --resolve_path option). To avoid confusing/error-prone situations, the folders are considered immutable. This results in some overhead, as the dependency paths will change as new versions become available. E.g. if the gtest dependency is currently located under /path/to/gtest-1.6.7-someh4sh, then as soon as version 1.6.8 is released and the user re-runs python waf configure, the path may be updated to /path/to/gtest-1.6.8-someh4sh because the resolver notices the new version.
This is problematic e.g. for IDE configurations where the user needs to manually go and update the path in the IDE to the new location.
Moreover, Waf fails to recognize changes in dependency include files if they are located outside the project root. This is very annoying if you are developing header-only projects side by side, because you need to rebuild the entire project if some header files change. However, if the dependencies are accessed through a symlink within the project, Waf will be able to track the changes in all the included files.
To avoid these problems, we created the resolve_symlinks local folder in the project root, which contains symlinks to the resolved dependencies. The path can be changed with the --symlinks_path option.
For the previous example, we would see the following in the resolve_symlinks folder:
$ ls -la resolve_symlinks/
total 0
lrwxrwxrwx 1 usr usr 29 Feb 20 20:55 gtest -> /path/to/gtest-1.6.7-someh4sh
After re-running python waf configure ...:
$ ls -la resolve_symlinks/
total 0
lrwxrwxrwx 1 usr usr 29 Feb 20 20:57 gtest -> /path/to/gtest-1.6.8-someh4sh
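To take advantage of the stable paths, a wscript can reference a dependency through its symlink. The following is only a sketch; the include path, the target names and the presence of a configured C++ toolchain are assumptions, not something this project prescribes:

def build(bld):
    # Refer to the dependency via its stable symlink so Waf can track
    # changes to the included headers inside the project root.
    bld.program(
        source="main.cpp",
        target="app",
        includes=["resolve_symlinks/gtest/include"],
    )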
The --lock_versions option will write a lock_resolve_versions.json file to the project folder. This file describes the exact version information for the project's dependencies.
The version information can be different for different resolvers:
- git resolvers will store the SHA1 commit id of the dependency.
- http resolvers will store the SHA1 sum of the downloaded dependency.
If the lock_resolve_versions.json is present, it will take precedence over all resolvers, except for user options such as manually specifying a checkout or path.
You can commit the lock_resolve_versions.json file to git, e.g. when creating an LTS (Long Term Support) release or similar, where you want to pin the exact version of each dependency.
As an example:
# Writes / overwrites an existing lock_resolve_versions.json
python waf configure --lock_versions
The --lock_paths option will write a lock_resolve_paths.json file in the project folder. It behaves differently from the --lock_versions option in that it will store the relative paths to the resolved dependencies. The typical use case for this is to download all dependencies into a folder stored within the project (the default behavior) in order to make a standalone archive.
If the lock_resolve_paths.json is present, it will take precedence over both the lock_resolve_versions.json and all other resolvers, except for user options such as manually specifying a checkout or path.
This makes it possible to easily create a standalone archive:
python waf configure --lock_paths
python waf standalone
Using the --resolve_path option whenever doing a resolve or configure can be cumbersome. To combat this, a config file can be used to override the default value for this option.
The config file must be called .wurf_config and must be located in either the project's directory or the user's directory. Note that the former takes priority over the latter.
The following is an example of the content of a config file:
[DEFAULT]
resolve_path = ~/projects/dependencies
This config file will override the default value of resolve_path with ~/projects/dependencies.
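For reference, a file with a [DEFAULT] section like the one above can be read with Python's standard configparser module; this is just an illustration of the file format, not necessarily how the tool itself reads it:

import configparser

config = configparser.ConfigParser()
config.read(".wurf_config")
# Values in [DEFAULT] are exposed via the defaults() dictionary.
print(config.defaults().get("resolve_path"))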
We add various helpers to the Waf context objects. The following is an incomplete list of the helpers that are added.
Compiles a requirements_in file to a requirements_txt file. The requirements_in file is hashed and the hash is stored in the requirements_txt (a usage sketch follows the list below).
The requirements_txt will be re-generated in two cases:
- The hash of the requirements_in file has changed.
- The requirements_txt file does not exist.
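A hypothetical usage sketch; the helper's name (pip_compile) and its keyword arguments are assumptions, since the actual method name is not listed above:

def configure(conf):
    # Re-generates requirements.txt when requirements.in changes or the
    # output file is missing (the two cases listed above).
    conf.pip_compile(
        requirements_in="requirements.in",
        requirements_txt="requirements.txt",
    )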
Creates a virtualenv in a specified folder.
Ensures that the build step has been run before running the current command.
Rewrites content of a file - useful for updating e.g. version numbers when doing a release.
The following list contains the work items that we have identified as "cool" features for the Waf dependency resolve extension.
Certain resolvers utilize "shortcuts", such as using cached information about dependencies, to speed up the resolve step. Providing this option should bypass such optimizations and do a full resolve, not relying on any form of cached data.
To make error messages user-friendly, the default is to redirect full tracebacks (showing where an error originated) to the log files. However, when running on a build system it is convenient to have the full traceback printed to the terminal, as this avoids having to log into the machine and manually retrieve the log file.
To support third-party tooling that works with information about already resolved dependencies, we implement the --dump-resolved-dependencies option. This will write out information about the resolved dependencies, such as the chosen semver tag.