Preset Containers 2024.0 (#162)
* classical-ml build working

* updated for 2024.0 release

* tests and compose updated for 2024.0.0

* image names updated in README

* Classical ML samples linked

* classical ml samples descrition added

* Classical ML bullet point fixed

* dl inf opt py3.9 working

* complete DL & Inf optimization build. Updated tests

* fixed test automation failures

* DL samples landing page added

* ccl sample removed

* INC samples added to landing page

* updated readme

* update multi node docs with correct conda env

* ccl test removed

* moving HVD env to right location

* classical ml jupyter landing page fixed

* Updated recommended SHM size

---------

Co-authored-by: sharvil.shah <sharvils@mlp-prod-clx-5669.ra.intel.com>
Co-authored-by: Tyler Titsworth <tyler.titsworth@intel.com>
3 people authored Nov 16, 2023
1 parent 701fe50 commit 6649c08
Showing 21 changed files with 349 additions and 472 deletions.
24 changes: 12 additions & 12 deletions preset/README.md
@@ -7,12 +7,12 @@ You can get the Preset Containers from [Intel® AI Tools Selector](https://www.i

| Preset Container Name | Purpose | Tools | Image Name |
| -----------------------------| ------------- | ------------- | ----------------- |
| Data Analytics | Perform large scale data analysis | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [Modin*](https://github.com/modin-project/modin), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/data-analytics:2023.2-py3.9`<br />`intel/data-analytics:2023.2-py3.10` |
| Classical ML | Train classical-ml models using scikit, modin and xgboost | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [Intel® extension for SciKit Learn](https://github.com/intel/scikit-learn-intelex), [XGBoost*](https://github.com/dmlc/xgboost), [Modin*](https://github.com/modin-project/modin), <br /> [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/classical-ml:2023.2-py3.9`<br />`intel/classical-ml:2023.2-py3.10` |
| Deep Learning | Train large scale Deep Learning models with Tensorflow or PyTorch | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [PyTorch*](https://pytorch.org/), [Tensorflow*](https://www.tensorflow.org/),<br /> [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch), [Intel® Extension for Tensorflow](https://github.com/intel/intel-extension-for-tensorflow),<br /> [Intel® Optimization for Horovod](https://github.com/intel/intel-optimization-for-horovod), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector), [Intel® Extension for DeepSpeed](https://github.com/intel/intel-extension-for-deepspeed) | `intel/deep-learning:2023.2-py3.9`<br />`intel/deep-learning:2023.2-py3.10` |
| Inference Optimization | Optimize Deep Learning models for inference<br /> using Intel® Neural Compressor | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [PyTorch*](https://pytorch.org/), [Tensorflow*](https://www.tensorflow.org/), <br /> [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch), [Intel® Extension for Tensorflow](https://github.com/intel/intel-extension-for-tensorflow),<br /> [Intel® Neural Compressor](https://github.com/intel/neural-compressor), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/inference-optimization:2023.2-py3.9`<br />`intel/inference-optimization:2023.2-py3.10` |
| Data Analytics | Perform large scale data analysis | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [Modin*](https://github.com/modin-project/modin), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/data-analytics:2024.0.0-py3.9`<br />`intel/data-analytics:2024.0.0-py3.10` |
| Classical ML | Train classical-ml models using scikit, modin and xgboost | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [Intel® extension for SciKit Learn](https://github.com/intel/scikit-learn-intelex), [XGBoost*](https://github.com/dmlc/xgboost), [Modin*](https://github.com/modin-project/modin), <br /> [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/classical-ml:2024.0.0-py3.9`<br />`intel/classical-ml:2024.0.0-py3.10` |
| Deep Learning | Train large scale Deep Learning models with Tensorflow or PyTorch | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [PyTorch*](https://pytorch.org/), [Tensorflow*](https://www.tensorflow.org/),<br /> [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch), [Intel® Extension for Tensorflow](https://github.com/intel/intel-extension-for-tensorflow),<br /> [Intel® Optimization for Horovod](https://github.com/intel/intel-optimization-for-horovod), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector), [Intel® Extension for DeepSpeed](https://github.com/intel/intel-extension-for-deepspeed) | `intel/deep-learning:2024.0.0-py3.9`<br />`intel/deep-learning:2024.0.0-py3.10` |
| Inference Optimization | Optimize Deep Learning models for inference<br /> using Intel® Neural Compressor | [Intel® Distribution For Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [PyTorch*](https://pytorch.org/), [Tensorflow*](https://www.tensorflow.org/), <br /> [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch), [Intel® Extension for Tensorflow](https://github.com/intel/intel-extension-for-tensorflow),<br /> [Intel® Neural Compressor](https://github.com/intel/neural-compressor), [Intel® Dataset Librarian](https://github.com/IntelAI/models/tree/master/datasets/dataset_api), [Intel® Data Connector](https://github.com/IntelAI/models/tree/master/datasets/cloud_data_connector) | `intel/inference-optimization:2024.0.0-py3.9`<br />`intel/inference-optimization:2024.0.0-py3.10` |

The Deep Learning and Inference Optimization containers have separate conda environments for each framework: `torch` and `tensorflow`.
The Deep Learning and Inference Optimization containers have separate conda environments for each framework: `pytorch-cpu`, `pytorch-gpu` and `tensorflow-xpu`.

## Prerequisites
Make sure [docker](https://docs.docker.com/engine/) is installed on the machine. Follow the [instruction here](https://docs.docker.com/engine/install/) to install docker engine on a host machine.
@@ -31,7 +31,7 @@ There are 2 modes to run these containers:
* Interactive
* Jupyter

Before starting, pick the name of the container image from the [table](#preset-containers) based on the task to perform. The commands below use `intel/deep-learning:2023.2-py3.9` as an example.
Before starting, pick the name of the container image from the [table](#preset-containers) based on the task to perform. The commands below use `intel/deep-learning:2024.0.0-py3.9` as an example.

### Run in Interactive Mode
This mode allows running the container in an interactive shell. This enables the ability to interact with the container's bash shell. Below is the command to start the container in interactive mode:
@@ -40,10 +40,10 @@ This mode allows running the container in an interactive shell. This enables the
docker run -it --rm \
${RENDER_GROUP} \
${VIDEO_GROUP} \
--shm-size=4G \
--shm-size=12G \
-v ${PWD}:/home/dev/workdir \
-w /home/dev/workdir \
intel/deep-learning:2023.2-py3.9 bash
intel/deep-learning:2024.0.0-py3.9 bash
```

>**Note:** `${RENDER_GROUP}` and `${VIDEO_GROUP}` are required for utilizing Intel® Extension for PyTorch from the `torch` conda environment on both CPU and GPU.
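
The `${RENDER_GROUP}` and `${VIDEO_GROUP}` variables are defined in a part of the README that this diff does not show. One way they could be populated is sketched below. The helper name `group_add_flag` is hypothetical, and the sketch assumes a Linux host where `getent` is available.

```shell
# Hypothetical helper: print "--group-add <gid>" for a host group,
# or print nothing when the group does not exist on this machine.
group_add_flag() {
  gid=$(getent group "$1" | cut -d: -f3) || true
  if [ -n "$gid" ]; then
    printf '%s %s' '--group-add' "$gid"
  fi
}

# Leave these unquoted in `docker run` so an empty value expands to no argument.
RENDER_GROUP=$(group_add_flag render)
VIDEO_GROUP=$(group_add_flag video)
```

On a host without a `render` or `video` group the corresponding variable stays empty, so the `docker run` commands still work for CPU-only use.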
@@ -56,11 +56,11 @@ docker run -it --rm \
${RENDER_GROUP} \
${VIDEO_GROUP} \
--device=/dev/dri \
--shm-size=4G \
--shm-size=12G \
-v ${PWD}:/home/dev/workdir \
-v /dev/dri/by-path:/dev/dri/by-path \
-w /home/dev/workdir \
intel/deep-learning:2023.2-py3.9 bash
intel/deep-learning:2024.0.0-py3.9 bash
```

### Run using Jupyter Notebook
@@ -76,7 +76,7 @@ docker run -it --rm \
${RENDER_GROUP} \
-e PORT=$PORT \
-p $PORT:$PORT \
intel/deep-learning:2023.2-py3.9
intel/deep-learning:2024.0.0-py3.9
```

If you want to enable [Intel® Flex/Max GPU](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu.html) optimization, use the following command. Note that the test for the `render` group sets the group ID for the GPU device, which enables GPU optimization inside the container.
@@ -89,7 +89,7 @@ docker run -it --rm -e PORT=$PORT \
-v ${PWD}:/home/dev/jupyter \
-v /dev/dri/by-path:/dev/dri/by-path \
-p $PORT:$PORT \
intel/deep-learning:2023.2-py3.9
intel/deep-learning:2024.0.0-py3.9
```

After running this command, the terminal should display output similar to the image below: ![image](https://github.com/intel/ai-containers/assets/18349036/0a8a2d05-f7b0-4a9f-994e-bcc4e4b703a0) The server address together with the port can be used to connect to the Jupyter server in a web browser, for example `http://127.0.0.1:8888`. The token displayed after `token=` can be used as a password to log in to the server. For example, in the image above the token is `b66e74a85bc2570bf15782e5124c933c3a4ddabd2cf2d7d3`.
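
To script the login step instead of copying the token by hand, the token can be pulled out of the server's startup output. The following is a minimal sketch, assuming the output contains a URL ending in `token=<hex digits>` as in the example above. The function name `jupyter_token` and the `/lab` path in the sample line are assumptions.

```shell
# Hypothetical helper: read Jupyter startup output on stdin and print the
# first hexadecimal token that follows "token=".
jupyter_token() {
  sed -n 's/.*token=\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# Example with a log line shaped like the one in the screenshot:
echo "http://127.0.0.1:8888/lab?token=b66e74a85bc2570bf15782e5124c933c3a4ddabd2cf2d7d3" | jupyter_token
```

With a running container, the input would typically come from `docker logs` for that container rather than from `echo`.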
45 changes: 28 additions & 17 deletions preset/classical-ml/Dockerfile
@@ -70,6 +70,7 @@ ENV CONDA_ROOT=/home/${USERNAME}/conda
ARG MINICONDA_VERSION
ARG PYTHON_VERSION
ARG IDP_VERSION
ARG INTEL_CHANNEL

RUN wget --no-check-certificate https://repo.anaconda.com/miniconda/Miniconda3-${MINICONDA_VERSION}.sh -O miniconda.sh && \
chmod +x miniconda.sh && \
@@ -78,34 +79,44 @@ RUN wget --no-check-certificate https://repo.anaconda.com/miniconda/Miniconda3-$
ln -s ${CONDA_ROOT} ${CONDA_ROOT}/../miniconda3 && \
export PATH=${CONDA_ROOT}/bin/:${PATH} && \
conda update -y conda && \
conda config --set solver libmamba && \
conda config --show solver && \
conda config --add channels intel && \
conda init --all && \
conda clean -y --all
conda clean -y --all

ENV PATH ${CONDA_ROOT}/condabin:${CONDA_ROOT}/bin/:${PATH}

ARG SCIKIT_VERSION
ARG XGBOOST_VERSION
ARG MODIN_VERSION

RUN conda create -yn classical-ml intelpython3_core==${IDP_VERSION} python=${PYTHON_VERSION} && \
ARG DAAL4PY_VERSION

RUN conda create -yn classical-ml && \
conda install -yn classical-ml -c ${INTEL_CHANNEL} \
matplotlib-base \
numba-dpex=0.21.4 \
numpy=1.24.3 \
ipp=2021.10 \
intelpython=${IDP_VERSION} \
python=${PYTHON_VERSION} \
scikit-learn-intelex==${SCIKIT_VERSION} \
matplotlib \
daal4py=${DAAL4PY_VERSION} \
scipy \
xgboost${XGBOOST_VERSION:+==${XGBOOST_VERSION}} \
threadpoolctl && \
conda install -yn classical-ml ipython ipykernel kernda -c conda-forge && \
conda install -n classical-ml -y -c intel \
matplotlib \
daal4py \
scikit-learn-intelex${SCIKIT_VERSION:+==${SCIKIT_VERSION}} \
threadpoolctl && \
conda clean -y --all

RUN conda install -n classical-ml -y -c intel xgboost${XGBOOST_VERSION:+==${XGBOOST_VERSION}} &&\
conda clean -y --all
RUN conda run -n classical-ml python -m pip install --ignore-installed --no-cache-dir \
modin[ray]==${MODIN_VERSION} \
dataset-librarian \
cloud-data-connector && \
conda run -n classical-ml python -m pip install --no-cache-dir \
scipy==1.11.1 \
certifi=='2023.07.22'
RUN conda install -n classical-ml -y -c ${INTEL_CHANNEL} -c conda-forge \
modin-ray=${MODIN_VERSION} \
python-dotenv \
tqdm

RUN conda run -n classical-ml python -m pip install --no-deps --no-cache-dir \
dataset-librarian \
cloud-data-connector

ENV PYTHONSTARTUP=~/.patch_sklearn.py
COPY base/.patch_sklearn.py ~/.patch_sklearn.py
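
The `xgboost${XGBOOST_VERSION:+==${XGBOOST_VERSION}}` argument relies on the shell's `${VAR:+word}` expansion: the `==<version>` suffix is emitted only when the build argument is set and non-empty, so an empty argument leaves the package unpinned. The behavior can be sketched outside the Dockerfile as follows. The function name `pkg_spec` and the version `1.7.3` are illustrative only and are not taken from the repository.

```shell
# Illustrative helper: build a conda package spec that is pinned only
# when a version string is supplied.
pkg_spec() {
  name=$1
  version=${2:-}
  printf '%s%s\n' "$name" "${version:+==${version}}"
}

pkg_spec xgboost          # prints: xgboost
pkg_spec xgboost 1.7.3    # prints: xgboost==1.7.3
```

The unguarded form `scikit-learn-intelex==${SCIKIT_VERSION}` that also appears in this diff would expand to a dangling `==` if the argument were empty, so that build argument presumably has to be supplied.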
12 changes: 10 additions & 2 deletions preset/classical-ml/notebooks/Classical_ML_Samples_Overview.ipynb
@@ -7,15 +7,23 @@
"# Samples to Get Started with Intel® AI Tools Selector Classical ML Preset Container\n",
"This Container features several samples to get started with understanding how Intel® AI Tools delivers optimized and scalable solutions for Classical ML workflows using Jupyterlab.\n",
"\n",
"Below are a list of samples included that take advantage of Intel Optimizations for a given AI Tool:"
"Below are a list of samples included that take advantage of Intel Optimizations for a given AI Tool:\n",
"1. [Intel® Modin](./modin/IntelModin_Vs_Pandas.ipynb): This notebook illustrates how to use Modin* to replace the Pandas API. The sample compares the performance of Intel® Distribution of Modin* and the performance of Pandas for specific dataframe operations. Intel® Distribution of Modin* accelerates Pandas operations using Ray or Dask execution engine. The distribution provides compatibility and integration with the existing Pandas code. The sample code demonstrates how to perform some basic dataframe operations using Pandas and Intel® Distribution of Modin*. You will be able to compare the performance difference between the two methods.\n",
"2. [Getting Started With Intel SKLearn](./sklearn/Intel_Extension_For_SKLearn_GettingStarted.ipynb): Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application. The acceleration is achieved through the use of the Intel(R) oneAPI Data Analytics Library (oneDAL). In this example we will recognize handwritten digits using a machine learning classification algorithm. The handwritten digits dataset comes from the sklearn toy datasets. It contains 1797 input images, and each image has 64 pixels (8x8 matrix) as features. The output has 10 classes corresponding to the digits 0-9. A Support Vector Machine (SVM) classifier is used as the classification algorithm.\n",
"3. [Intel XGBoost Performance](./xgboost/IntelPython_XGBoost_Performance.ipynb): In this example we will train an XGBoost model and predict the results to show Intel's performance optimizations for XGBoost. Intel optimized XGBoost is shipped as a part of the Intel® oneAPI AI Analytics Toolkit. This example is a Jupyter Notebook version of an XGBoost example seen in this Medium blog using the popular Higgs dataset: \n",
"https://medium.com/intel-analytics-software/new-optimizations-for-cpu-in-xgboost-1-1-81144ea2111515b)\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "",
"name": ""
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
28 changes: 14 additions & 14 deletions preset/classical-ml/tests.yaml
@@ -1,68 +1,68 @@
dataset-librarian-3.10:
cmd: conda run -n classical-ml bash -c 'yes | python -m dataset_librarian.dataset -n msmarco --download -d ~/msmarco'
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
dataset-librarian-3.9:
cmd: conda run -n classical-ml bash -c 'yes | python -m dataset_librarian.dataset -n msmarco --download -d ~/msmarco'
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
modin-3.10:
cmd: /tests/modin/test_modin.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
shm_size: 10.24G
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
modin-3.9:
cmd: /tests/modin/test_modin.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
shm_size: 10.24G
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
modin-notebook-3.10:
cmd: papermill --log-output modin/IntelModin_Vs_Pandas.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
notebook: 'true'
modin-notebook-3.9:
cmd: papermill --log-output modin/IntelModin_Vs_Pandas.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
notebook: 'true'
scikit-3.10:
cmd: /tests/scikit/test_scikit.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
scikit-3.9:
cmd: /tests/scikit/test_scikit.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
scikit-notebook-3.10:
cmd: papermill --log-output sklearn/Intel_Extension_For_SKLearn_GettingStarted.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
notebook: 'true'
scikit-notebook-3.9:
cmd: papermill --log-output sklearn/Intel_Extension_For_SKLearn_GettingStarted.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
notebook: 'true'
xgboost-3.10:
cmd: /tests/xgboost/test_xgboost.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
xgboost-3.9:
cmd: /tests/xgboost/test_xgboost.sh
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
volumes:
- dst: /tests
src: $PWD/preset/classical-ml/tests
xgboost-notebook-3.10:
cmd: papermill --log-output xgboost/IntelPython_XGBoost_Performance.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.10
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.10
notebook: 'true'
xgboost-notebook-3.9:
cmd: papermill --log-output xgboost/IntelPython_XGBoost_Performance.ipynb -k classical-ml
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-2023.2-py3.9
img: ${REGISTRY}/aiops/mlops-ci:b-${GITHUB_RUN_NUMBER:-0}-classical-ml-${PRESET_RELEASE:-2024.0.0}-py3.9
notebook: 'true'
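
Each `img` value is assembled with shell-style parameter expansion, where `${VAR:-default}` falls back to the default when the variable is unset or empty. The sketch below reproduces that composition so the resulting image reference can be inspected locally. The function name `ci_image` and the registry value `registry.example.com` are placeholders, not values from the repository.

```shell
# Placeholder helper: compose the CI image reference the way tests.yaml does.
# $1 = preset name, $2 = Python version
ci_image() {
  printf '%s/aiops/mlops-ci:b-%s-%s-%s-py%s\n' \
    "${REGISTRY}" "${GITHUB_RUN_NUMBER:-0}" "$1" "${PRESET_RELEASE:-2024.0.0}" "$2"
}

REGISTRY=registry.example.com
unset GITHUB_RUN_NUMBER PRESET_RELEASE
ci_image classical-ml 3.9
# prints: registry.example.com/aiops/mlops-ci:b-0-classical-ml-2024.0.0-py3.9
```

Setting `PRESET_RELEASE` before the call overrides the `2024.0.0` default, which is what lets the same test file serve later releases without another round of edits.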