This is the boilerplate workspace that you can clone and extend with your own requirements. It should help you understand how to set up a project within HelloDATA.
Goal: create and test your DAGs, dbt models, and whatever else you need within this git repo, and deploy it anywhere with the regular HelloDATA deployment.
We use Airflow DAGs that are copied by an initContainer into the central Airflow orchestrator within the Business Domain.
Find the detailed documentation in the Workspaces Documentation.
- The DAGs you run on HelloDATA live in the `/src/dags/` folder. Check `boiler-example.py` for an example (a minimal sketch is also shown after this list).
- Add your deployment needs and requirements to `hd_env.py`. These settings will be integrated into your working HelloDATA environments.
- Make sure to set the ENV variables (e.g. the local Postgres password) in `deployment/local/env`.
- You can specify all your requirements in the `Dockerfile`.
- Build your image with `docker build -t hellodata-ws-boilerplate:local-build .` or create your own. Make sure your HelloDATA instance has access to the Docker registry you push your image to.
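For orientation, here is a minimal sketch of what a DAG placed in `/src/dags/` could look like. The DAG id, schedule, and task are illustrative assumptions and do not reproduce the actual contents of `boiler-example.py`.

```python
# Minimal illustrative DAG sketch (assumed names; not the actual boiler-example.py).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="ws_boilerplate_example",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule=None,                    # trigger manually while developing
    catchup=False,
    tags=["workspace"],
) as dag:
    # Replace this with your own tasks, e.g. dbt runs or KubernetesPodOperator tasks.
    hello = BashOperator(
        task_id="hello_workspace",
        bash_command="echo 'Hello from the HelloDATA workspace boilerplate!'",
    )
```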
For testing Airflow locally, you can use the above-mentioned Astro CLI, which makes it very easy to run Airflow with a single command: `astro dev start -e ../env`. See the Airflow README for more details.
Requirements:
- Local Docker installed (either native or Docker-Desktop)
- Linux/macOS: add `127.0.0.1 host.docker.internal` to the `/etc/hosts` file.
- Windows: in Docker Desktop under `Settings -> General`, enable `Use WSL 2 based engine` and the setting `Add the *.docker.internal names to the host's etc/hosts file (Requires password)`.
- Make sure Docker Desktop added the entries correctly to `C:\Windows\System32\drivers\etc\hosts`. In some cases the entries were added incorrectly. They should look something like:
  ```
  # Added by Docker Desktop
  <YOUR-IP-HERE> host.docker.internal
  <YOUR-IP-HERE> gateway.docker.internal
  # To allow the same kube context to work on the host and the container:
  127.0.0.1 kubernetes.docker.internal
  # End of section
  ```
- Make sure Kubernetes is enabled.
- Copy your local kube config to Astro: `cp ~/.kube/config src/dags/airflow/include/.kube/`
  - Windows: the file is under `%USERPROFILE%\.kube\config` (`C:\Users\[YourIdHere]\.kube`).
- Install the `KubernetesPodOperator` provider locally with pip (not with astro): `pip install apache-airflow-providers-cncf-kubernetes`.
- If you store data from one task to the next, you need a volume mount (see `pvc.yaml`) and you have to configure `volumes` and `volume_mounts` in Airflow (see the example; a hedged sketch follows this list).
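To illustrate the last point, here is a hedged sketch of a `KubernetesPodOperator` task that mounts a persistent volume claim. The claim name, mount path, and command are assumptions for illustration; take the real values from your `pvc.yaml` and the example in the repo.

```python
# Illustrative sketch only: claim name, mount path, and command are assumptions,
# not values taken from the boilerplate repo. This task would live inside a
# `with DAG(...)` block like the one sketched earlier.
# Note: older provider versions import from ...operators.kubernetes_pod instead.
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from kubernetes.client import models as k8s

# Volume backed by the persistent volume claim defined in pvc.yaml.
data_volume = k8s.V1Volume(
    name="ws-data",
    persistent_volume_claim=k8s.V1PersistentVolumeClaimVolumeSource(
        claim_name="hd-ws-pvc"  # hypothetical claim name; check pvc.yaml
    ),
)
data_volume_mount = k8s.V1VolumeMount(
    name="ws-data",
    mount_path="/opt/airflow/data",  # assumed mount path
)

produce_data = KubernetesPodOperator(
    task_id="produce_data",
    name="produce-data",
    image="hellodata-ws-boilerplate:local-build",  # the image built above
    cmds=["bash", "-c", "echo 'hello' > /opt/airflow/data/hello.txt"],
    volumes=[data_volume],
    volume_mounts=[data_volume_mount],
)
```

A downstream task can mount the same volume to read `/opt/airflow/data/hello.txt`, which is the point of configuring `volumes` and `volume_mounts` when data has to survive across tasks.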