Skip to content

Commit

Permalink
Added isolet logistic conversion
Browse files Browse the repository at this point in the history
  • Loading branch information
johnhansarickWallaroo committed Jun 21, 2022
1 parent cf0cabf commit 9ec6414
Show file tree
Hide file tree
Showing 9 changed files with 560 additions and 2 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,232 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to Convert sk-learn Logistic Model to ONNX\n",
"\n",
"The following tutorial is a brief example of how to convert a [scikit-learn](https://scikit-learn.org/stable/) (aka sk-learn) **logistic ML model** to the [ONNX](https://onnx.ai/ ) format for use with Wallaroo.\n",
"\n",
"This tutorial assumes that you have a Wallaroo instance and are running this Notebook from the Wallaroo Jupyter Hub service."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial provides the following:\n",
"\n",
"* `isolet_logistic_model_numclass.pickle`: a logistic trained sk-learn model.\n",
" This model contains 617 columns.\n",
"* `isolet_test_data.tsv`: A test file that can be used to verify the output of the converted logistic model.\n",
"* `try_isolet.ipynb`: This Jupyter Notebook demonstrates the use of the converted sk-learn logistic ML model in ONNX with Wallaroo."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conversion Process\n",
"\n",
"### Libraries\n",
"\n",
"The first step is to import our libraries we will be using."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# Used to load the sk-learn model\n",
"import pickle\n",
"\n",
"# Used for the conversion process\n",
"import onnx, skl2onnx, onnxmltools\n",
"from skl2onnx.common.data_types import FloatTensorType\n",
"from skl2onnx.common.data_types import DoubleTensorType\n",
"from skl2onnx import convert_sklearn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can determine the correct ONNX Target Opset for our libraries.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"13"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# figure out the correct opset\n",
"\n",
"from onnx.defs import onnx_opset_version\n",
"from onnxconverter_common.onnx_ex import DEFAULT_OPSET_NUMBER\n",
"TARGET_OPSET = min(DEFAULT_OPSET_NUMBER, onnx_opset_version())\n",
"TARGET_OPSET"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the `TARGET_OPSET` determined, we can convert our sklearn logistic model to onnx.\n",
"\n",
"* **IMPORTANT NOTE**: Note that for the conversion process, `zipmap` is **disabled**."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the model that we will be converting:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.8/site-packages/sklearn/base.py:310: UserWarning: Trying to unpickle estimator LogisticRegression from version 0.24.2 when using version 0.24.1. This might lead to breaking code or invalid results. Use at your own risk.\n",
" warnings.warn(\n"
]
}
],
"source": [
"# convert model to ONNX\n",
"\n",
"# load the model\n",
"\n",
"# with open(\"isolet_logistic_model.pickle\", \"rb\") as f:\n",
"with open(\"./isolet_logistic_model_numclass.pickle\", \"rb\") as f:\n",
" logistic_model = pickle.load(f)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We already know the number of columns, so we'll set that variable in the next step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the number of columns\n",
"\n",
"ncols = 617"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next up is to set the options. As a reminder **zipmap must be disabled**."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Set the options\n",
"\n",
"initial_type = [('float_input', FloatTensorType([None, ncols]))]\n",
"options = {id(logistic_model): {'zipmap': False}} # here we turn off the zipmap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With everything ready, we can now convert the sk-learn Logistics model to ONNX, and store it in the variable `onnx_model_converted`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Run the conversion\n",
"\n",
"onnx_model_converted = convert_sklearn(logistic_model, initial_types=initial_type, options=options,\n",
" target_opset=TARGET_OPSET)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can save our model to a `onnx` file. Once complete, we can run it through the `try_isolet.ipynb` notebook to verify it."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# Export the model to a file\n",
"onnx.save_model(onnx_model_converted, \"isolet_logistic_model_numclass.onnx\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Binary file not shown.
10 changes: 10 additions & 0 deletions model_conversion/sklearn-logistic-to-onnx/isolet_test_data.tsv

Large diffs are not rendered by default.

Loading

0 comments on commit 9ec6414

Please sign in to comment.