Create built-in matrix transforms to facilitate splitting tasks by arbitrary values #588

Merged 2 commits on Oct 17, 2024
6 changes: 3 additions & 3 deletions docs/reference/transforms/chunking.rst
@@ -1,7 +1,7 @@
-.. _chunking:
+.. _chunking transforms:

-Chunking Tasks
-==============
+Chunking Transforms
+===================

 The :mod:`taskgraph.transforms.chunking` module contains transforms that aid
 in splitting a single entry in a ``kind`` into multiple tasks. This is often
6 changes: 3 additions & 3 deletions docs/reference/transforms/from_deps.rst
@@ -1,7 +1,7 @@
-.. _from_deps:
+.. _from_deps transforms:

-From Dependencies
-=================
+From Dependencies Transforms
+============================

 The :mod:`taskgraph.transforms.from_deps` transforms can be used to create
 tasks based on the kind dependencies, filtering on common attributes like the
7 changes: 4 additions & 3 deletions docs/reference/transforms/index.rst
@@ -8,6 +8,7 @@ provides, read below to learn how to use them.

 .. toctree::

-   from_deps
-   task_context
-   chunking
+   From Dependencies <from_deps>
+   Task Context <task_context>
+   Matrix <matrix>
+   Chunking <chunking>
200 changes: 200 additions & 0 deletions docs/reference/transforms/matrix.rst
@@ -0,0 +1,200 @@
.. _matrix transforms:

Matrix Transforms
=================

The :mod:`taskgraph.transforms.matrix` transforms can be used to split a base
task into many subtasks based on a defined matrix.

These transforms are useful if you need to have many tasks that are very
similar except for some small configuration differences.

Usage
-----

Add the transform to the ``transforms`` key in your ``kind.yml`` file:

.. code-block:: yaml

   transforms:
     - taskgraph.transforms.matrix
     # ...

Then create a ``matrix`` section in your task definition, e.g.:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         os: ["win", "mac", "linux"]

       # rest of task definition

This will split the ``test`` task into three: ``test-win``, ``test-mac`` and
``test-linux``.

Matrix with Multiple Rows
~~~~~~~~~~~~~~~~~~~~~~~~~

You can add as many rows as you like to the matrix, and a task will be
generated for every combination of values. For example, the following matrix:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         os: ["win", "mac", "linux"]
         python: ["py312", "py311"]

       # rest of task definition

will generate these tasks:

- ``test-win-py312``
- ``test-win-py311``
- ``test-mac-py312``
- ``test-mac-py311``
- ``test-linux-py312``
- ``test-linux-py311``

Note that the name of the tasks will be built based on the order of rows in the
matrix.
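
As a rough illustration of this naming rule, the standalone Python sketch below
enumerates the combinations with ``itertools.product``. The
``matrix_task_names`` helper is hypothetical and not part of Taskgraph; it only
mimics the default dash-joined naming described in this section.

```python
from itertools import product


def matrix_task_names(base_name, matrix):
    """Return dash-joined names for every combination of matrix values.

    Hypothetical helper for illustration only. Rows are combined in the
    order they appear in ``matrix`` (dicts preserve insertion order).
    """
    rows = list(matrix.values())
    return ["-".join([base_name, *combo]) for combo in product(*rows)]


names = matrix_task_names(
    "test",
    {"os": ["win", "mac", "linux"], "python": ["py312", "py311"]},
)
print(names)
```

Swapping the two rows in the dictionary would produce names such as
``test-py312-win`` instead, which is why the row order matters.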

Substituting Matrix Context into the Task Definition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Without further changes, these tasks would be identical apart from their names,
so you'll want to change other parts of the task definition based on the matrix
values.

Substituting Values in YAML
```````````````````````````

The simplest way to change a matrix task's definition is to use the built-in
YAML substitution:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         os: ["win", "mac", "linux"]
       description: Run {matrix[os]} tests
       worker-type: "{matrix[os]}-worker"

Limiting Substitution
'''''''''''''''''''''

By default, all keys and values in the task definition are checked for
substitution parameters. In some cases, however, it may be desirable to limit
which keys get substituted, for example when using the ``matrix`` transforms
alongside other transforms that perform substitution, such as the
:mod:`~taskgraph.transforms.task_context` or
:mod:`~taskgraph.transforms.chunking` transforms.

To limit the fields that will be evaluated for substitution, you can pass in the
``substitution-fields`` config:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         substitution-fields: ["worker-type"]
         os: ["win"]
       description: Run {matrix[os]} tests
       worker-type: "{matrix[os]}-worker"

In the example above, ``worker-type`` will evaluate to ``win-worker``, whereas
the description will remain the literal string ``Run {matrix[os]} tests``. Dot
notation can be used in ``substitution-fields`` to limit substitution to a
nested part of the task definition.
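
To make the dot notation more concrete, here is a minimal Python sketch of
field-limited substitution. The ``substitute_fields`` function is an
illustrative stand-in, not Taskgraph's implementation: it only handles string
values and assumes the placeholders follow ``str.format`` syntax. The nested
``worker.env.TARGET_OS`` field is a made-up example.

```python
def substitute_fields(task, fields, **subs):
    """Format only the listed fields of ``task`` in place.

    Illustrative stand-in only: dotted paths are walked one key at a
    time, and the final value is assumed to be a string.
    """
    for field in fields:
        container, subfield = task, field
        while "." in subfield:
            key, subfield = subfield.split(".", 1)
            container = container[key]
        container[subfield] = container[subfield].format(**subs)
    return task


task = {
    "description": "Run {matrix[os]} tests",
    "worker-type": "{matrix[os]}-worker",
    # Hypothetical nested field, used to demonstrate dot notation.
    "worker": {"env": {"TARGET_OS": "{matrix[os]}"}},
}
substitute_fields(
    task,
    ["worker-type", "worker.env.TARGET_OS"],
    matrix={"os": "win"},
)
print(task)
```

Only the two listed fields are formatted; ``description`` is not in the list,
so its placeholder is left untouched.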

Substituting Values in a Later Transform
````````````````````````````````````````

For more advanced cases, you may wish to use a later transform to act on the
result of the matrix evaluation. To accomplish this, the ``matrix`` transforms
will set a ``matrix`` attribute that contains all matrix values applicable to
the task.

For example, let's say you have a ``kind.yml`` like:

.. code-block:: yaml

   transforms:
     - taskgraph.transforms.matrix
     - custom_taskgraph.transforms.custom
     # ...

   tasks:
     test:
       matrix:
         os: ["win", "mac", "linux"]

Then in your ``custom.py`` transform file, you could add:

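.. code-block:: python

   from taskgraph.transforms.base import TransformSequence

   transforms = TransformSequence()


   @transforms.add
   def set_worker_type_and_description(config, tasks):
       for task in tasks:
           # The matrix transforms store each task's values in this attribute.
           matrix = task["attributes"]["matrix"]
           task["description"] = f"Run {matrix['os']} tests"
           task["worker-type"] = f"{matrix['os']}-worker"
           yield task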

This example will yield the exact same result as the YAML example above, but it
allows for more complex logic.

Excluding Matrix Combinations
-----------------------------

Sometimes you might not want to generate *every* possible combination of tasks,
and there may be some you wish to exclude. This can be accomplished using the
``exclude`` config:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         os: ["win", "mac"]
         arch: ["x86", "arm64"]
         python: ["py312", "py311"]
         exclude:
           - os: mac
             arch: x86
           - os: win
             arch: arm64
             python: py311

This will cause all combinations where ``os == mac and arch == x86`` to be
skipped, as well as the specific combination where ``os == win and arch ==
arm64 and python == py311``. This means the following tasks will be generated:

* test-win-x86-py311
* test-win-x86-py312
* test-win-arm64-py312
* test-mac-arm64-py311
* test-mac-arm64-py312
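
The exclusion rule can be checked with a small standalone Python sketch. It
assumes a rule matches when every key/value pair it lists is present in the
combination; the ``is_excluded`` helper is hypothetical and only reproduces
the behaviour described in this section.

```python
from itertools import product

matrix = {
    "os": ["win", "mac"],
    "arch": ["x86", "arm64"],
    "python": ["py312", "py311"],
}
exclude = [
    {"os": "mac", "arch": "x86"},
    {"os": "win", "arch": "arm64", "python": "py311"},
]


def is_excluded(combo, rules):
    """True if every key/value pair of at least one rule matches ``combo``."""
    return any(
        all(combo.get(key) == value for key, value in rule.items())
        for rule in rules
    )


names = []
for values in product(*matrix.values()):
    combo = dict(zip(matrix, values))
    if not is_excluded(combo, exclude):
        names.append("-".join(["test", *values]))

print(sorted(names))
```

Of the eight possible combinations, the first rule removes two and the second
removes one, leaving the five tasks listed here.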

Customizing Task Names
----------------------

By default, the ``matrix`` transforms will append each matrix value to the
task's name, separated by a dash. If some other format is desired, you can specify
the ``set-name`` config:

.. code-block:: yaml

   tasks:
     test:
       matrix:
         set-name: "test-{matrix[os]}/{matrix[python]}"
         os: ["win"]
         python: ["py312"]

Instead of creating a task with the name ``test-win-py312``, the name will be
``test-win/py312``.
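
The ``{matrix[os]}`` placeholders look like Python ``str.format`` fields.
Assuming that is how the name is rendered, the result can be reproduced in
plain Python:

```python
# Assumption: the set-name value is rendered with str.format-style
# substitution, with the task's matrix values passed as ``matrix``.
set_name = "test-{matrix[os]}/{matrix[python]}"
name = set_name.format(matrix={"os": "win", "python": "py312"})
print(name)
```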
6 changes: 3 additions & 3 deletions docs/reference/transforms/task_context.rst
@@ -1,7 +1,7 @@
-.. _task_context:
+.. _task_context transforms:

-Task Context
-============
+Task Context Transforms
+=======================

 The :mod:`taskgraph.transforms.task_context` transform can be used to
 substitute values into any field in a task with data that is not known
112 changes: 112 additions & 0 deletions src/taskgraph/transforms/matrix.py
@@ -0,0 +1,112 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

"""
Transforms used to split one task definition into many tasks, governed by a
matrix defined in the definition.
"""

from copy import deepcopy
from textwrap import dedent

from voluptuous import Extra, Optional, Required

from taskgraph.transforms.base import TransformSequence
from taskgraph.util.schema import Schema
from taskgraph.util.templates import substitute_task_fields

MATRIX_SCHEMA = Schema(
    {
        Required("name"): str,
        Optional("matrix"): {
            Optional(
                "exclude",
                description=dedent(
                    """
                    Exclude the specified combination(s) of matrix values from the
                    final list of tasks.

                    If only a subset of the possible rows are present in the
                    exclusion rule, then *all* combinations including that
                    subset will be excluded.
                    """.lstrip()
                ),
            ): [{str: str}],
            Optional(
                "set-name",
                description=dedent(
                    """
                    Sets the task name to the specified format string.

                    Useful for cases where the default of joining matrix values by
                    a dash is not desired.
                    """.lstrip()
                ),
            ): str,
            Optional(
                "substitution-fields",
                description=dedent(
                    """
                    List of fields in the task definition to substitute matrix values into.

                    If not specified, all fields in the task definition will be
                    substituted.
                    """
                ),
            ): [str],
            Extra: [str],
        },
        Extra: object,
    },
)
"""Schema for matrix transforms."""

transforms = TransformSequence()
transforms.add_validate(MATRIX_SCHEMA)


def _resolve_matrix(tasks, key, values, exclude):
    for task in tasks:
        for value in values:
            new_task = deepcopy(task)
            new_task["name"] = f"{new_task['name']}-{value}"

            matrix = new_task.setdefault("attributes", {}).setdefault("matrix", {})
            matrix[key] = value

            for rule in exclude:
                if all(matrix.get(k) == v for k, v in rule.items()):
                    break
            else:
                yield new_task


@transforms.add
def split_matrix(config, tasks):
    for task in tasks:
        if "matrix" not in task:
            yield task
            continue

        matrix = task.pop("matrix")
        set_name = matrix.pop("set-name", None)
        fields = matrix.pop("substitution-fields", task.keys())
        exclude = matrix.pop("exclude", {})

        new_tasks = [task]
        for key, values in matrix.items():
            new_tasks = _resolve_matrix(new_tasks, key, values, exclude)

        for new_task in new_tasks:
            if set_name:
                if "name" not in fields:
                    fields.append("name")
                new_task["name"] = set_name

            substitute_task_fields(
                new_task,
                fields,
                matrix=new_task["attributes"]["matrix"],
            )
            yield new_task
11 changes: 2 additions & 9 deletions src/taskgraph/transforms/task_context.py
@@ -4,7 +4,7 @@

 from taskgraph.transforms.base import TransformSequence
 from taskgraph.util.schema import Schema
-from taskgraph.util.templates import deep_get, substitute
+from taskgraph.util.templates import deep_get, substitute_task_fields
 from taskgraph.util.yaml import load_yaml

 SCHEMA = Schema(
@@ -113,12 +113,5 @@ def render_task(config, tasks):
         subs.setdefault("name", task["name"])

         # Now that we have our combined context, we can substitute.
-        for field in fields:
-            container, subfield = task, field
-            while "." in subfield:
-                f, subfield = subfield.split(".", 1)
-                container = container[f]
-
-            container[subfield] = substitute(container[subfield], **subs)
-
+        substitute_task_fields(task, fields, **subs)
         yield task