This repository contains a cloud-based pipeline for predicting marine debris from Sentinel-2 L2A images using machine learning. The predictions are made using the PlasticDetectionModel. The output is stored in a PostGIS database hosted on AWS RDS. The predictions are displayed on our mapping application, deployed here: https://map.oceanecowatch.org
This repository is triggered by the OceanEcoMapServer via a GitHub Actions workflow endpoint.
- AWS account with configured access to RDS and S3 services.
- RunPod account with a serverless endpoint.
- Access to the Sentinel-Hub Processing API.
The service can be run manually by triggering the workflow_dispatch event of the GitHub Actions workflow.
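A workflow_dispatch event can also be sent programmatically through the GitHub REST API. The following is a minimal sketch, not the code OceanEcoMapServer uses: the owner, repository, workflow file name, and the `job_id` and `probability_threshold` input names are hypothetical placeholders, and the actual workflow may define different inputs. The function only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, ref, token, inputs):
    """Build (but do not send) a workflow_dispatch request for the GitHub REST API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # "ref" is the branch or tag to run on; "inputs" are the workflow's own inputs.
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# Hypothetical values for illustration only; replace them with the real ones.
request = build_dispatch_request(
    owner="example-org",
    repo="example-repo",
    workflow_file="example-workflow.yml",
    ref="main",
    token="<personal-access-token>",
    inputs={"job_id": "1", "probability_threshold": "0.5"},
)
```

Passing `request` to `urllib.request.urlopen` would send it; the token needs permission to trigger Actions workflows on the repository.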
- Clone the repository
- Install the dependencies
pip install -r requirements.txt
- Create a .env file in the root directory with the following variables:
DB_USER=
DB_PW=
DB_NAME=
DB_HOST=
DB_PORT=
SH_INSTANCE_ID=
SH_CLIENT_ID=
SH_CLIENT_SECRET=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
RUNPOD_API_KEY=
RUNPOD_ENDPOINT_ID=
- Create the database and necessary tables
python -m scripts.reset_db
python -m scripts.add_tables
- Run the service locally. The job ID and model ID will both be 1 if you have run the reset_db and add_job_to_db scripts.
python -m src.main --job-id <job-id> --probability-threshold <prob-threshold>
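Before running the service locally, it can help to confirm that every variable from the .env file is actually present in the environment. The sketch below is a hypothetical helper, not part of the repository: it only inspects `os.environ` (or a mapping you pass in) and assumes the .env file has already been loaded by your shell or a tool such as python-dotenv.

```python
import os

# Variable names copied from the .env template in the setup steps.
REQUIRED_VARS = [
    "DB_USER", "DB_PW", "DB_NAME", "DB_HOST", "DB_PORT",
    "SH_INSTANCE_ID", "SH_CLIENT_ID", "SH_CLIENT_SECRET",
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION",
    "RUNPOD_API_KEY", "RUNPOD_ENDPOINT_ID",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` returns an empty list when everything is set; otherwise it lists the names that still need a value, in the order they appear in the template.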
- Install the development dependencies
pip install -r requirements-dev.txt
- Run quick unit tests
pytest -m 'not integration and not slow'
- Run integration tests
pytest -m 'integration'
- Run slow (real inference) tests
pytest -m 'slow and not integration'