Create a virtual environment:
python -m venv venv
Activate the virtual environment:
source venv/bin/activate
Install requirements:
pip install -r requirements
Choose a DAG and select "Trigger DAG w/ config". In the JSON config, you can set the following parameters to override their default values (shown here):
{
"branch": "master",
"version": "latest"
}
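As a rough illustration of how such overrides behave, the sketch below merges a trigger config over the defaults listed above. The names `DEFAULT_PARAMS` and `resolve_params` are hypothetical and do not come from this repository; how the DAGs actually read the config is an assumption.

```python
from typing import Any, Dict, Optional

# Default values listed above (assumed to apply whenever a key is omitted).
DEFAULT_PARAMS: Dict[str, Any] = {"branch": "master", "version": "latest"}


def resolve_params(conf: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the defaults, with any keys from the trigger config taking precedence."""
    return {**DEFAULT_PARAMS, **(conf or {})}


if __name__ == "__main__":
    # Only "version" is overridden; "branch" keeps its default.
    print(resolve_params({"version": "1.0.0"}))
```

With this approach, a config that only sets "version" leaves "branch" at its default.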
To release a version of an enriched dataset and publish it to researchers, run the enriched DAG with a config specifying the version number:
{
"version": "1.0.0"
}
The version number must follow the format "x.x.x", where each x is a number.
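A version string can be checked before triggering the DAG. The following is a minimal sketch, assuming each "x" may be one or more ASCII digits; the helper name `is_release_version` is hypothetical and not part of this repository.

```python
import re

# Three dot-separated groups of ASCII digits, e.g. "1.0.0" or "12.3.45".
# Allowing multi-digit groups is an assumption about the "x.x.x" format.
VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_release_version(version: str) -> bool:
    """Return True if the whole string matches the x.x.x format."""
    return VERSION_PATTERN.fullmatch(version) is not None


if __name__ == "__main__":
    for candidate in ("1.0.0", "1.0", "v1.0.0", "latest"):
        print(candidate, is_release_version(candidate))
```

Note that the default value "latest" does not match this format, so a check like this only makes sense for release runs.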
Create the .env file:
cp .env.sample .env
Deploy the stack:
docker-compose up
Log in to the Airflow UI:
- URL: http://localhost:50080
- Username: airflow
- Password: airflow
Create Airflow variables (Airflow UI => Admin => Variables):
- dags_path: /opt/airflow/dags
- base_url (optional): http://localhost:50080
For faster variable creation, upload the variables.json file on the Variables page.
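The Variables import in Airflow accepts a JSON file that maps variable keys to values. The sketch below writes such a file for the two variables listed above. The contents of this repository's own variables.json are not shown here, so treat the output as an assumption; the file name `variables.generated.json` and the helper `write_variables_file` are hypothetical.

```python
import json
from pathlib import Path

# Variables from the list above; base_url is optional.
VARIABLES = {
    "dags_path": "/opt/airflow/dags",
    "base_url": "http://localhost:50080",
}


def write_variables_file(path: Path) -> Path:
    """Write the variables as a flat JSON object (key -> value) and return the path."""
    path.write_text(json.dumps(VARIABLES, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    print(write_variables_file(Path("variables.generated.json")))
```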
Test a single task (replace <dag> and <task> with the DAG ID and task ID):
docker-compose exec airflow-scheduler airflow tasks test <dag> <task> 2022-01-01
Log in to the MinIO console:
- URL: http://localhost:59001
- Username: minioadmin
- Password: minioadmin
Create an Airflow variable (Airflow UI => Admin => Variables):
- s3_conn_id: minio
Create an Airflow connection (Airflow UI => Admin => Connections):
- Connection Id: minio
- Connection Type: Amazon S3
- Extra:
{
"host": "http://minio:9000",
"aws_access_key_id": "minioadmin",
"aws_secret_access_key": "minioadmin"
}
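To make the role of each Extra field concrete, the sketch below maps the JSON above onto the keyword arguments that `boto3.client` accepts (`endpoint_url`, `aws_access_key_id`, `aws_secret_access_key`). This is an illustration only: how Airflow's S3 hook interprets "host" internally depends on the provider version, and the helper `extra_to_client_kwargs` is hypothetical. The block itself does not import boto3.

```python
import json
from typing import Dict

# The Extra JSON from the connection above.
EXTRA_JSON = """
{
  "host": "http://minio:9000",
  "aws_access_key_id": "minioadmin",
  "aws_secret_access_key": "minioadmin"
}
"""


def extra_to_client_kwargs(extra_json: str) -> Dict[str, str]:
    """Translate the connection Extra into keyword arguments for an S3 client."""
    extra = json.loads(extra_json)
    return {
        "endpoint_url": extra["host"],  # point the client at MinIO instead of AWS
        "aws_access_key_id": extra["aws_access_key_id"],
        "aws_secret_access_key": extra["aws_secret_access_key"],
    }


if __name__ == "__main__":
    print(extra_to_client_kwargs(EXTRA_JSON))
```

With boto3 installed, `boto3.client("s3", **extra_to_client_kwargs(EXTRA_JSON))` would build a client that targets MinIO. The `minio` host name is presumably only resolvable from inside the Docker network.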
Create an Airflow variable (Airflow UI => Admin => Variables):
- slack_hook_url: https://hooks.slack.com/services/...
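Slack incoming webhooks accept an HTTP POST with a JSON body containing a "text" field. The sketch below builds such a request without sending it, which can help when checking the hook URL by hand. The helper `build_slack_request` is hypothetical, and how the DAGs in this repository actually use slack_hook_url is an assumption.

```python
import json
import urllib.request


def build_slack_request(hook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        hook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # The URL is the truncated placeholder from above; substitute the real hook URL.
    request = build_slack_request("https://hooks.slack.com/services/...", "Test message")
    print(request.get_method(), request.full_url)
```

Passing the returned request to `urllib.request.urlopen` would send the message once a real hook URL is in place.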