- The central OpenShift cluster containing the tools responsible for creating any artifacts required for the successful deployment of an Inference Application to Near Edge environments.
- No resource or network constraints are expected in the core cluster, since it is assumed to fully support all workflows required for creating and verifying Inference Application container images.
- A non-core, distributed environment for running and serving AI/ML inference workloads on moderate yet constrained compute and network resources.
- For the purpose of this repository, the near edge environment is represented by separate OpenShift clusters that may be disconnected from the core, the internet, or both, but can still be managed from a core OpenShift cluster.
- A Model Server is responsible for hosting models as a service "to return predictions based on data inputs that you provide through API calls."[^1]
- For any workflows under opendatahub-io/ai-edge, we focus on the Model Servers and serving runtimes supported by Open Data Hub (see the sketch below).
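
As a concrete illustration, here is a minimal sketch of a KServe `InferenceService`, one of the serving APIs available through Open Data Hub. The name, namespace, model format, and `storageUri` are hypothetical placeholders, not values used by this repository:

```yaml
# Minimal sketch of a model hosted by a Model Server via KServe.
# All names and locations below are hypothetical.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: bike-rentals            # hypothetical model name
  namespace: example-models     # hypothetical namespace
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn           # format understood by the chosen serving runtime
      storageUri: s3://example-bucket/models/bike-rentals/  # hypothetical S3 location
```

Once the service reports ready, the model server returns predictions through its inference API endpoint.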
- An OCI-compliant container image[^2] with the models included during the build process.
- Support for container images where the model and model serving runtime are stored together, as sketched below.
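
To make the "model baked into the image" idea concrete, a build pipeline like the ones in this repository might be invoked roughly as follows. This is a hedged sketch: the pipeline name, parameter names, and image reference are hypothetical, not the actual definitions in this repository.

```yaml
# Hypothetical Tekton PipelineRun: fetch a model from a referenced source and
# build an OCI image that bundles the model with its serving runtime.
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: build-inference-app-
spec:
  pipelineRef:
    name: build-inference-application   # hypothetical pipeline name
  params:
    - name: model-uri
      value: s3://example-bucket/models/bike-rentals/   # hypothetical model source
    - name: output-image
      value: quay.io/example/bike-rentals:1.0           # hypothetical image reference
```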
- A centralized repository for models and their metadata that manages model lifecycles and versions. Currently our pipelines do not support any Model Registry; only S3 and Git can be used as directly referenced sources where models are stored.
- An Open Container Initiative (OCI) compliant container registry where Inference Application container images and other artifacts are stored and versioned, ready to be deployed to production or staging environments (see the sketch below).
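
For example, a deployment at the near edge would pull a specific, versioned image from the registry. The sketch below assumes a hypothetical image reference and serving port:

```yaml
# Hypothetical Deployment pulling a versioned Inference Application image
# from the container registry.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bike-rentals                              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bike-rentals
  template:
    metadata:
      labels:
        app: bike-rentals
    spec:
      containers:
        - name: inference-app
          image: quay.io/example/bike-rentals:1.0 # versioned image from the registry
          ports:
            - containerPort: 8080                 # hypothetical serving port
```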
- GitOps is an established configuration management pattern that stores the configuration of your infrastructure and workflow automation in a Git repository for reproducibility and version control (see the example below).
- "GitOps uses Git repositories as a single source of truth to deliver infrastructure as code."3