moai - Accelerating modern data-driven workflows

Overview

moai is a PyTorch-based AI Model Development Kit (MDK) that aims to improve data-driven model workflows, design and understanding. Because it builds on established open-source packages, it can be readily integrated into most AI workflows. To explore moai, install the package and follow the examples, keeping in mind that it is an early-development alpha release and new features are still being added.

Features & Design Goals

  • Modularity via Monads: Use moai's existing pool of modular model building blocks.
  • Reproducibility via Configuration: moai manages the hyper-parameter sensitive AI R&D workflows via its built-in configuration-based design.
  • Productivity via Minimizing Coding: moai offers a data-driven domain modelling language (DML) that can facilitate quick & easy model design.
  • Extensibility via Plugins: Easily integrate external code through moai's built-in metaprogramming support.
  • Understanding via Analysis: moai supports inter-model performance and design aggregation actions to consolidate knowledge between models and query differences.

Actions

moai offers a set of data-driven workflow functionalities through integrated actions. Each action consumes moai configuration files that describe its execution context. Because moai is built around these configuration files, which define the context and describe each model's details, its actions support heavy data-driven workflows with inter-model analytics, knowledge extraction and meticulous reproduction.

Details for each action follow:

  • moai play CONFIG_PATH

Using the play action, moai starts the playback of a dataset's train/val/test splits. moai's exporters can be used to extract dataset-specific statistics, and its visualization engine can be used to showcase the dataset. Optionally, monad processing graphs can be defined to transform the data.

  • moai train CONFIG_PATH

The train action consumes a configuration file that defines the model to be trained, the data used to train and validate it, and the engine configuration around the training process. The results include model states saved across training and logs containing validation metrics and losses.

  • moai evaluate CONFIG_PATH

The evaluate action consumes a configuration file that defines the trained model to be tested, the test data, and the engine configuration around the testing process. The results include aggregated and/or detailed model metrics, and inference samples.

  • moai plot PATH_TO_EXPERIMENTS

The plot action consumes various configuration files - usually from different versions of the same model - and generates a visualization consolidating and aggregating inter-model performance, providing the necessary means to analyze the behaviour of different hyper-parameters or model configurations.

  • moai diff lhs=PATH_TO_CONFIG_A rhs=PATH_TO_CONFIG_B

The diff action consumes two different configuration files, usually from different versions of the same model, and reports their differences in hyper-parameterization, processing graph variations, etc.
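To make the idea of a configuration diff concrete, here is a minimal sketch in plain Python. It is not moai's implementation: the helper names `flatten` and `diff_configs`, the `"<missing>"` marker, and the example keys are all hypothetical, and the configurations are assumed to be already loaded as nested dictionaries.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # hypothetical marker for a key present on one side only


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}.

    Empty nested dicts contribute no keys.
    """
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(lhs: Dict[str, Any], rhs: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each dotted key whose value differs to a (lhs_value, rhs_value) pair."""
    flat_lhs, flat_rhs = flatten(lhs), flatten(rhs)
    changes: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(flat_lhs) | set(flat_rhs)):
        left = flat_lhs.get(key, MISSING)
        right = flat_rhs.get(key, MISSING)
        if left != right:
            changes[key] = (left, right)
    return changes


# Illustrative configurations (hypothetical keys, not moai's schema).
config_a = {"model": {"lr": 0.001, "layers": 4}, "seed": 42}
config_b = {"model": {"lr": 0.01, "layers": 4, "dropout": 0.1}, "seed": 42}

for key, (left, right) in diff_configs(config_a, config_b).items():
    print(f"{key}: {left} -> {right}")
```

Flattening to dotted keys keeps the comparison a simple dictionary walk and gives each reported difference a readable path into the configuration.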

  • moai reprod PATH_TO_RESOLVED_CONFIG

The reprod action consumes a previously logged and resolved configuration file, and facilitates its reproduction by re-executing it while adjusting to development environment differences.

Dependencies

moai stands on the shoulders of giants as it relies on various large-scale open-source projects:

  1. PyTorch > 1.7.0 must be installed manually for your system/environment.

  2. Lightning > 1.0.0 is the currently supported training backend.

  3. Hydra > 1.0 drives moai's DML that sets up model configurations, and additionally manages the hyper-parameter complexity of modern AI models.

  4. TorchServe > 0.5.3 is needed to deploy models as services.

  5. ONNX > 1.11.0 is needed to export models in an exchangeable format.

  6. Visdom is the currently supported visualization engine.

  7. HiPlot drives moai's inter-model analytics.

  8. Various PyTorch Open Source Projects:

    • Kornia for a set of computer vision operations integrated as moai monads.
    • Albumentations as the currently supported data augmentation framework.

  9. The Wider Open Source Community that conducts accessible R&D and drives most of moai's capabilities.

  10. A set of awesome Python libraries.
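The version bounds listed above can be checked in an environment with a short script. The following is a sketch under stated assumptions, not part of moai: the PyPI distribution names in `MINIMUM_VERSIONS` are assumptions, the helper names are hypothetical, the bounds are treated as strict (`>`) exactly as written in the list, and pre-release suffixes are ignored.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Lower bounds copied from the list above; distribution names are assumptions.
MINIMUM_VERSIONS: Dict[str, str] = {
    "torch": "1.7.0",
    "pytorch-lightning": "1.0.0",
    "hydra-core": "1.0",
    "torchserve": "0.5.3",
    "onnx": "1.11.0",
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Keep the leading numeric components, e.g. '1.13.1+cu117' -> (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_newer(installed: str, minimum: str) -> bool:
    """Return True if `installed` is strictly greater than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a > b


def installed_version(name: str) -> Optional[str]:
    """Look up an installed distribution's version, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check(lookup: Callable[[str], Optional[str]] = installed_version) -> Dict[str, str]:
    """Report 'ok', 'too old' or 'missing' for each listed dependency."""
    report: Dict[str, str] = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif is_newer(found, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    for name, status in check().items():
        print(f"{name}: {status}")
```

Passing the lookup function as a parameter keeps the comparison logic testable without requiring any of the packages to be installed.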

Installation

Package

To install the latest released moai package run:

pip install moai-mdk

Source

Download the master branch source, open a command line in the source directory and run:

pip install .

or, for an editable install:

pip install -e .

Getting Started

Visit the documentation site to learn about moai's DML and the overall MDK design and usage.

Examples can be found at conf/examples.

Licence

moai is Apache 2.0 licenced, as found in the corresponding LICENCE file.

However, some code integrated from external projects may carry its own licence.

PyTorch Developer's Day 2021

[Image: PTDD21]

Citation

If you use moai in your R&D workflows or find its code useful, please consider citing:

@misc{moai,
    key = {moai: PyTorch Model Development Kit},
    title = {{\textit{moai}: Accelerating modern data-driven workflows}},
    year = {2021},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/moverseai/moai}},
}

Contact

Use the GitHub issue tracker.