pytest-adf

This is an ALPHA RELEASE

pytest-adf is a pytest plugin for writing Azure Data Factory integration tests. It is a light wrapper around the Azure Data Factory Python SDK.

Requirements

You will need the following:

Python 3+

Installation

To install pytest-adf:

pip install pytest-adf

Usage

Here is a simple example that uses the adf_pipeline_run fixture.

def test_pipeline_succeeded(adf_pipeline_run):
    this_run = adf_pipeline_run("my_pipeline", run_inputs={})
    assert this_run.status == "Succeeded"

The adf_pipeline_run fixture provides a factory function that triggers a pipeline run when called. It then blocks and polls the pipeline run until completion before returning. A pipeline run is considered complete when it reaches one of the following statuses: "Succeeded", "TimedOut", "Failed", "Cancelled".
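The blocking and polling behavior described above can be illustrated with a minimal sketch. This is not the plugin's actual implementation: `wait_for_completion`, `get_status`, and the default interval of 5 seconds are hypothetical names and values chosen for illustration, and the status source is faked so the example runs without a Data Factory.

```python
import time

# Terminal statuses, taken from the description above.
TERMINAL_STATUSES = {"Succeeded", "TimedOut", "Failed", "Cancelled"}


def wait_for_completion(get_status, poll_interval_sec=5.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status, then return it.

    get_status is a hypothetical zero-argument callable standing in for the
    SDK call that fetches the current status of a pipeline run.
    """
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        # Not finished yet: wait before asking again.
        sleep(poll_interval_sec)


# Example with a fake status source instead of a real Data Factory.
# The sleep function is replaced with a no-op so the example returns at once.
fake_statuses = iter(["Queued", "InProgress", "InProgress", "Succeeded"])
final_status = wait_for_completion(lambda: next(fake_statuses), sleep=lambda _: None)
print(final_status)
```

Note that a loop like this returns for any terminal status, not only "Succeeded", which is why the test in the usage example asserts on the status explicitly.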

For an example of how to use this in an overall Modern Data Warehouse solution as part of an automated Azure DevOps Release Pipeline, see here and here. This is part of a larger demo solution showcasing DataOps as applied to the Modern Data Warehouse architecture.

For additional usage information, see caching pipeline runs.

Configuration

You need to provide pytest-adf with the necessary configuration to connect to your Azure Data Factory. You can provide it via environment variables or as pytest command-line options. Command-line options take precedence over environment variables.

Environment Variables

AZ_SERVICE_PRINCIPAL_ID - Azure AD Service Principal with rights to trigger a run in Data Factory (e.g. Data Factory Contributor). If not provided, the tests will use Azure CLI authentication.
AZ_SERVICE_PRINCIPAL_SECRET - Password of Service Principal
AZ_SERVICE_PRINCIPAL_TENANT_ID - Azure AD Tenant ID of Service Principal
AZ_SUBSCRIPTION_ID - Azure Subscription ID where Azure Data Factory is hosted.
AZ_RESOURCE_GROUP_NAME - Azure Resource Group name where Azure Data Factory is hosted.
AZ_DATAFACTORY_NAME - Name of the Azure Data Factory.
AZ_DATAFACTORY_POLL_INTERVAL_SEC - Optional. Seconds between poll intervals to check for status of the triggered run.

For more information on how to create an Azure AD service principal, see here.
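Before running the tests, it can help to check which of the variables listed above are set. The following is a small sketch, not part of pytest-adf: `missing_variables` and `ASSUMED_REQUIRED` are hypothetical names, and treating the three factory-identifying variables as required is an assumption based on the list above (the service principal variables are left out because Azure CLI authentication is the fallback).

```python
import os

# Assumption: these three identify the factory and are always needed. The
# service principal variables are omitted because Azure CLI auth is the fallback.
ASSUMED_REQUIRED = ("AZ_SUBSCRIPTION_ID", "AZ_RESOURCE_GROUP_NAME", "AZ_DATAFACTORY_NAME")


def missing_variables(environ=None, required=ASSUMED_REQUIRED):
    """Return the names in `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Example with a hand-built mapping instead of the real environment.
example_env = {"AZ_SUBSCRIPTION_ID": "0000", "AZ_DATAFACTORY_NAME": "my_adf"}
print(missing_variables(example_env))
```

Calling `missing_variables()` with no arguments checks the real process environment.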

pytest command-line

Alternatively, you can pass these like so:

pytest \
    --sp_id=my_sp_id \
    --sp_password=my_sp_pass \
    --sp_tenant_id=my_tenant_id \
    --sub_id=my_s_id \
    --rg_name=my_rg \
    --adf_name=my_adf \
    --poll_interval=20
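The precedence rule from the Configuration section (command line over environment) can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `resolve_option` is a hypothetical helper, and a plain dictionary stands in for the process environment.

```python
import os


def resolve_option(cli_value, env_name, environ=None, default=None):
    """Return the CLI value if given, else the environment variable, else default."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        # A value passed on the command line always wins.
        return cli_value
    return environ.get(env_name, default)


# A fake environment so the example does not depend on real settings.
fake_env = {"AZ_DATAFACTORY_NAME": "adf_from_env", "AZ_RESOURCE_GROUP_NAME": "rg_from_env"}

# --adf_name was passed, so the environment value is ignored.
adf_name = resolve_option("adf_from_cli", "AZ_DATAFACTORY_NAME", environ=fake_env)
# --rg_name was not passed (None), so the environment value is used.
rg_name = resolve_option(None, "AZ_RESOURCE_GROUP_NAME", environ=fake_env)
print(adf_name, rg_name)
```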

Caching pipeline runs

Because ADF pipelines can be expensive to run, the adf_pipeline_run fixture allows you to cache pipeline runs by specifying the cached_run_name variable. Pipeline runs are identified by a combination of pipeline_name and cached_run_name. This is helpful if you want to create multiple test cases against the same pipeline run without needing to (1) rerun the entire pipeline or (2) mix all assert statements in the same test_ case function.

To force a rerun with the same pipeline_name and cached_run_name, use rerun=True.

For example:

# Call adf_pipeline_run specifying cached_run_name variable.
this_first_run = adf_pipeline_run(pipeline_name="pipeline_foo", run_inputs={}, cached_run_name="run_bar")

# Call adf_pipeline_run again, with same pipeline_name and cached_run_name
# This will NOT trigger an actual ADF pipeline run, and will instead return this_first_run object.
# Note: run_inputs are not checked to determine if cached run was called with the same run_inputs.
this_second_run = adf_pipeline_run(pipeline_name="pipeline_foo", run_inputs={}, cached_run_name="run_bar")
this_first_run == this_second_run  # True

# To force a rerun, set rerun=True.
this_third_run = adf_pipeline_run(pipeline_name="pipeline_foo", run_inputs={}, cached_run_name="run_bar", rerun=True)
this_first_run != this_third_run  # True
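The caching behavior shown above can be modelled with a dictionary keyed by (pipeline_name, cached_run_name). The sketch below is only a model of the described behavior, not the plugin's implementation: `trigger_run` and `cached_pipeline_run` are hypothetical names, runs are faked as dictionaries, and two details are assumptions: that omitting cached_run_name always triggers a fresh run, and that a forced rerun replaces the cached entry.

```python
import itertools

_run_counter = itertools.count(1)
_cache = {}


def trigger_run(pipeline_name, run_inputs):
    """Hypothetical stand-in for triggering a real pipeline run."""
    return {"pipeline_name": pipeline_name, "run_id": next(_run_counter), "status": "Succeeded"}


def cached_pipeline_run(pipeline_name, run_inputs, cached_run_name="", rerun=False):
    """Return the cached run for (pipeline_name, cached_run_name) unless rerun is set."""
    if not cached_run_name:
        # Assumption: no cache name means the run is never cached.
        return trigger_run(pipeline_name, run_inputs)
    key = (pipeline_name, cached_run_name)
    # run_inputs is deliberately not part of the key, matching the note above.
    if rerun or key not in _cache:
        _cache[key] = trigger_run(pipeline_name, run_inputs)
    return _cache[key]


first = cached_pipeline_run("pipeline_foo", {}, cached_run_name="run_bar")
second = cached_pipeline_run("pipeline_foo", {}, cached_run_name="run_bar")
third = cached_pipeline_run("pipeline_foo", {}, cached_run_name="run_bar", rerun=True)
print(first is second, first is third)
```

Because run_inputs is not part of the key, a cached run is returned even when a later call passes different inputs, so pick a distinct cached_run_name for each distinct set of inputs.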

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
