
gcp-mlops-demo

This is a sample project that illustrates how to use Vertex AI on GCP for building and running MLOps workflows.

Steps to run the code on Vertex AI

There are a number of different ways to run this code base on Vertex AI. This repository contains Cloud Build pipelines for automating package generation and continuous training of models, as well as a Vertex AI pipeline (based on Kubeflow v2) for training. However, it's also possible to run these steps individually. The following commands illustrate how to run just the training on Vertex AI.
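
For reference, a Kubeflow v2 pipeline is a Python function decorated with @dsl.pipeline that gets compiled to a job spec and then submitted to Vertex AI Pipelines. The sketch below is not the pipeline shipped in this repository, just a minimal illustration of the shape; the component and parameter names are made up.

from kfp import compiler, dsl

@dsl.component(base_image="python:3.10")
def train(training_data_dir: str, output_dir: str):
    # A real component would import the trainer package and run it here.
    print(f"training on {training_data_dir}, writing to {output_dir}")

@dsl.pipeline(name="training-pipeline")
def pipeline(training_data_dir: str, output_dir: str):
    train(training_data_dir=training_data_dir, output_dir=output_dir)

# Produces a spec that can be submitted as a Vertex AI PipelineJob.
compiler.Compiler().compile(pipeline, "pipeline.json")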

First you need to create a source distribution of this repository. Before building it, make sure that the build package is available; you could create a virtual environment and install it there using the following commands:

python3 -m venv .venv
source .venv/bin/activate
pip install build

Assuming that you've cloned this repository and you're at the root of it, run the following command:

python3 -m build --sdist .

Once the distribution is created, upload it to GCS:

BUCKET=...  # set to a bucket that's accessible by you and Vertex AI
gsutil cp dist/*.tar.gz gs://$BUCKET/code/
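
If you'd rather do this step from Python, the same upload can be scripted with the google-cloud-storage client; the bucket name below is a placeholder for the $BUCKET value above.

from pathlib import Path
from google.cloud import storage

# Pick the most recently built sdist, like `ls -1t dist | head -n1`.
pkg = max(Path("dist").glob("*.tar.gz"), key=lambda p: p.stat().st_mtime)
bucket = storage.Client().bucket("my-bucket")  # placeholder for $BUCKET
bucket.blob(f"code/{pkg.name}").upload_from_filename(str(pkg))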

Now configure your target environment and submit the job. We assume that there's training data on GCS:

LOCATION=...  # which GCP region to use
JOB_NAME=gcp-mlops-demo-job
PKG_NAME=`ls -1t dist | head -n1`  # use the latest generated package
PYTHON_PACKAGE_URIS=gs://$BUCKET/code/$PKG_NAME
ARGUMENTS="--training-data-dir=gs://$BUCKET/data/train,--output-dir=gs://$BUCKET/outputs"
EXECUTOR_IMAGE_URI=europe-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.1-0:latest
gcloud ai custom-jobs create \
  --region=$LOCATION \
  --display-name=$JOB_NAME \
  --python-package-uris=$PYTHON_PACKAGE_URIS \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,executor-image-uri=$EXECUTOR_IMAGE_URI,python-module=trainer.task \
  --args="$ARGUMENTS"
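
As an alternative to the gcloud CLI, the same job can be submitted from Python with the google-cloud-aiplatform SDK. This is a minimal sketch; the project, region, bucket, and package file name are placeholders standing in for the variables above.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="europe-west4",
                staging_bucket="gs://my-bucket")

job = aiplatform.CustomPythonPackageTrainingJob(
    display_name="gcp-mlops-demo-job",
    # The sdist uploaded earlier; the file name here is an example.
    python_package_gcs_uri="gs://my-bucket/code/gcp-mlops-demo-0.1.tar.gz",
    python_module_name="trainer.task",
    container_uri="europe-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.1-0:latest",
)
job.run(
    args=["--training-data-dir=gs://my-bucket/data/train",
          "--output-dir=gs://my-bucket/outputs"],
    machine_type="n1-standard-4",
    replica_count=1,
)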

Running things locally

In order to run this code on your local environment, you need to install this repository as an editable package.

Let's start by setting up a virtual environment (if you haven't got one yet):

python3 -m venv .venv
source .venv/bin/activate

Now you can install the package, which will also install all of its dependencies:

pip install -e ".[gcp,dev]"

The gcp extra provides the capability to build and run Kubeflow v2 pipelines on GCP. The dev extra brings in development-related tools (flake8, pytest, coverage).

Note that the training code has no dependencies on GCP or other Google components, so it can be run on any platform. The GCP libraries are configured as extra dependencies and are only needed when you want to run the training code on GCP.

Once everything is installed, you can run the training task on your local environment by calling the right module:

DATA_DIR=...  # a local folder which contains CSV files
python3 -m trainer.task --training-data-dir=$DATA_DIR
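
To give an idea of what that module does with these flags, here is a hypothetical sketch of a trainer.task entry point. It only mirrors the arguments used in this README and is not the actual implementation.

import argparse

def main():
    parser = argparse.ArgumentParser(description="Train a model from CSV files.")
    parser.add_argument("--training-data-dir", required=True,
                        help="Local folder or gs:// path containing the training CSVs")
    parser.add_argument("--output-dir", default="outputs",
                        help="Where to write the trained model artifacts")
    args = parser.parse_args()
    # ... load the CSVs from args.training_data_dir, fit the model,
    # and persist it under args.output_dir ...

if __name__ == "__main__":
    main()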
