Update branch (#161)
Signed-off-by: Gaurav Shukla <gashukla@amd.com>
Co-authored-by: zjgarvey <zjgarvey@gmail.com>
Co-authored-by: kumardeepakamd <123522031+kumardeepakamd@users.noreply.github.com>
Co-authored-by: Scott Todd <scott.todd0@gmail.com>
Co-authored-by: zjgarvey <47986913+zjgarvey@users.noreply.github.com>
Co-authored-by: Andreas Falkenberg <149819731+afalkenberg1@users.noreply.github.com>
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>
Co-authored-by: Xida Ren (Cedar) <cedar.ren@gmail.com>
Co-authored-by: Gaurav Shukla <gaurav@nod-labs.com>
Co-authored-by: Kumar Deepak <kumar@xilinx.com>
Co-authored-by: afalkenberg1 <afalkenb@amd.com>
Co-authored-by: Phaneesh Barwaria <b.phaneesh@gmail.com>
Co-authored-by: Chi_Liu <22491986+AmosLewis@users.noreply.github.com>
13 people authored Apr 12, 2024
1 parent 70652b1 commit 5a73d7e
Showing 157 changed files with 3,532 additions and 1,065 deletions.
44 changes: 39 additions & 5 deletions .github/workflows/test_iree.yml
@@ -1,4 +1,4 @@
# Copyright 2024 Advanced Micro Devices
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
@@ -21,8 +21,8 @@ concurrency:
  cancel-in-progress: true

jobs:
  linux_x86_64:
    name: Linux (x86_64)
  linux_x86_64_onnx:
    name: Linux (x86_64) Onnx
    runs-on: ubuntu-latest
    env:
      VENV_DIR: ${{ github.workspace }}/.venv
@@ -67,12 +67,46 @@ jobs:
          source ${VENV_DIR}/bin/activate
          pytest iree_tests/onnx/node/generated -n auto -rpfE --timeout=30 --retries 2 --retry-delay 5 --durations=10
  linux_x86_64_w7900_gpu_models:
    name: Linux (x86_64 w7900) Models GPU
    runs-on: nodai-amdgpu-w7900-x86-64
    env:
      VENV_DIR: ${{ github.workspace }}/.venv
    steps:
      - name: "Checking out repository"
        uses: actions/checkout@v4
        with:
          submodules: false
          lfs: true
      - name: "Setting up Python"
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: "Setup Python venv"
        run: python3 -m venv ${VENV_DIR}

      - name: "Installing IREE nightly release Python packages"
        run: |
          source ${VENV_DIR}/bin/activate
          python3 -m pip install \
            --find-links https://iree.dev/pip-release-links.html \
            --upgrade \
            iree-compiler \
            iree-runtime
      - name: "Installing other Python requirements"
        run: |
          source ${VENV_DIR}/bin/activate
          python3 -m pip install -r iree_tests/requirements.txt
      # TODO(scotttodd): add a local cache for these large files to a persistent runner
      - name: "Downloading remote files for real weight model tests"
        run: |
          source ${VENV_DIR}/bin/activate
          python3 iree_tests/download_remote_files.py
          python3 iree_tests/download_remote_files.py --root-dir pytorch/models
      - name: "Running real weight model tests"
        env:
          IREE_TEST_CONFIG_FILES: iree_tests/configs/config_pytorch_models_cpu_llvm_task.json
        run: |
          source ${VENV_DIR}/bin/activate
          pytest iree_tests -n auto -k real_weights -rpfE --timeout=60 --retries 2 --retry-delay 5 --durations=0
          pytest iree_tests/pytorch/models -s -n 4 -k real_weights -rpfE --timeout=1200 --retries 2 --retry-delay 5 --durations=0
28 changes: 26 additions & 2 deletions e2eshark/README.md
@@ -163,9 +163,14 @@ Note that the --cachedir command line argument is necessary for any run command.

Run the tests in the operators and combinations folders of the default framework (i.e. pytorch).
Use the framework-to-onnx-to-Torch-MLIR path (--mode onnx) and run up to inference (default) using the llvm-cpu backend (default),
use four processor cores (default --jobs 4) on your machine, generate report file after finishing test run
use four processor cores (default --jobs 4) on your machine, generate report file after finishing test run.

If the model you are running requires a Hugging Face token (e.g. llama, gemma), set the HF_TOKEN environment variable as well.
Either set it in your shell environment (`export HF_TOKEN=your_token`) or add it to the command line
as shown below.

```
python ./run.py -c 'path_to_your_torch_mlir_build_dir' -i 'path_to_your_iree_build_dir'
HF_TOKEN=your_token python ./run.py -c 'path_to_your_torch_mlir_build_dir' -i 'path_to_your_iree_build_dir'
--report --cachedir 'path_to_your_cache_dir'
```
You can see the logs of a test run inside test-run/'test sub-directory'. Start with the commands.log file.
@@ -296,6 +301,25 @@ if more than two runs are diffed, then when values differ the comma separated di
```
The -1 under inference indicates that one test regressed in inference.

### Running tests with upload

If you are interested in running tests and want to upload the generated MLIR files to Azure,
either to share with others or for yourself, you will first have to set the AZURE_CONNECTION_STRING environment
variable. You can find the connection string here: https://portal.azure.com/#@amdcloud.onmicrosoft.com/resource/subscriptions/8c190d1b-eb91-48d5-bec5-3e7cb7412e6c/resourceGroups/pdue-nod-ai-rg/providers/Microsoft.Storage/storageAccounts/e2esharkuserartifacts/keys. If you don't have access to the link above, you can ask Sai Enduri for the connection string.
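
For example, in a bash shell you might export it before invoking run.py (the value below is only a placeholder, not a real connection string; use the one from the Azure portal link above):

```
export AZURE_CONNECTION_STRING="<your connection string>"
```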

Then, set up an upload_list.txt file with the names of the models you want to upload. There is already one
at e2eshark/gold/upload_list.txt; you can simply modify that one.

Optional: if you want to change which types of files are uploaded, simply tweak `upload_list = ["mlir"]` in e2eshark/run.py to change or add file types (for example, `upload_list = ["mlir", "log"]`).


With this connection string and upload list file, you can now run a command like this:
```
python ./run.py -c ../../torch-mlir/build -i ../../iree-build --report --cachedir ~/.cache/huggingface --mode direct --tests pytorch/models/beit-base-patch16-224-pt22k-ft22k pytorch/models/bge-base-en-v1.5 pytorch/models/mit-b0 pytorch/models/bert-large-uncased pytorch/models/deit-small-distilled-patch16-224 --uploadtestsfile /home/sai/SHARK-TestSuite-fork/e2eshark/gold/upload_list.txt --cleanup
```

You can then find a JSON file (`upload_urls.json` in the e2eshark directory) with the model names and links to the uploaded files for each model. The links are public, so you can simply wget them to download, which makes it easy to share with others.
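
As a rough sketch of that download step (this assumes `jq` and `wget` are available; the recursive filter just grabs every URL it finds, since the exact layout of `upload_urls.json` is not fixed here), you could fetch everything with:

```
# Pull every http(s) URL out of upload_urls.json and download each one.
jq -r '.. | strings | select(startswith("http"))' upload_urls.json | xargs -n1 wget
```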

### Adding new tests

#### Adding test in framework pytorch
17 changes: 17 additions & 0 deletions e2eshark/gold/onnx-models-passed.txt
@@ -0,0 +1,17 @@
onnx/models/resnet50_vaiq_int8
pytorch/models/opt-125M
pytorch/models/resnet50
pytorch/models/opt-1.3b
pytorch/models/beit-base-patch16-224-pt22k-ft22k
pytorch/models/bert-large-uncased
pytorch/models/bge-base-en-v1.5
pytorch/models/gpt2-xl
pytorch/models/gpt2
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/opt-350m
pytorch/models/t5-base
pytorch/models/t5-large
pytorch/models/vicuna-13b-v1.3
pytorch/models/whisper-base
pytorch/models/whisper-medium
pytorch/models/whisper-small
45 changes: 38 additions & 7 deletions e2eshark/gold/passed.txt
@@ -1,11 +1,42 @@
onnx/combinations/constant_constantofshape
onnx/operators/add
onnx/operators/CumSum
onnx/operators/Shape
onnx/operators/Pow
onnx/operators/Sigmoid
onnx/operators/add
onnx/operators/gemm
onnx/operators/CumSum
onnx/operators/Pow
onnx/operators/Shape
onnx/operators/Sigmoid
onnx/operators/identity
onnx/operators/relu
onnx/operators/reshape
onnx/operators/reduceprod
onnx/operators/expand
onnx/operators/MatMul
onnx/operators/Mul
onnx/operators/Softmax
onnx/operators/concat
onnx/operators/equal
onnx/operators/flatten
onnx/operators/layernorm
onnx/operators/maxpool
onnx/operators/neg
onnx/operators/gemm2
pytorch/combinations/mlp
pytorch/models/resnet50
pytorch/operators/conv2d
pytorch/operators/linear

pytorch/operators/gridsampler
pytorch/models/opt-125M
pytorch/models/resnet50
pytorch/models/opt-1.3b
pytorch/models/beit-base-patch16-224-pt22k-ft22k
pytorch/models/bert-large-uncased
pytorch/models/bge-base-en-v1.5
pytorch/models/gpt2-xl
pytorch/models/gpt2
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/opt-350m
pytorch/models/t5-base
pytorch/models/t5-large
pytorch/models/vicuna-13b-v1.3
pytorch/models/whisper-base
pytorch/models/whisper-medium
pytorch/models/whisper-small
23 changes: 17 additions & 6 deletions e2eshark/gold/turbine-models-passed.txt
@@ -1,10 +1,21 @@
pytorch/models/t5-large
pytorch/models/bert-large-uncased
pytorch/models/llama2-7b-GPTQ
pytorch/models/opt-350m
pytorch/models/mobilebert-uncased
pytorch/models/opt-1.3b
pytorch/models/whisper-base
pytorch/models/bge-base-en-v1.5
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/llama2-7b-hf
pytorch/models/t5-base
pytorch/models/bart-large
pytorch/models/vit-base-patch16-224
pytorch/models/resnet50
pytorch/models/opt-125M
pytorch/models/beit-base-patch16-224-pt22k-ft22k
pytorch/models/bge-base-en-v1.5
pytorch/models/whisper-medium
pytorch/models/opt-125M
pytorch/models/opt-125m-gptq
pytorch/models/deit-small-distilled-patch16-224
pytorch/models/bert-large-uncased
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/opt-1.3b
pytorch/models/opt-350m
pytorch/models/whisper-small
pytorch/models/vicuna-13b-v1.3
2 changes: 2 additions & 0 deletions e2eshark/onnx/.gitignore
@@ -1,2 +1,4 @@
# Model artifacts
operators/*/model.onnx
models/*/model.onnx
models/*/model.onnxsignature.json
82 changes: 82 additions & 0 deletions e2eshark/onnx/combinations/QuantizeToMatMulInteger/model.py
@@ -0,0 +1,82 @@
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

# run.py creates runmodel.py by concatenating this file model.py
# and tools/stubs/onnxmodel.py

# See https://onnx.ai/onnx/intro/python.html for intro on creating
# onnx model using python onnx API
import numpy, torch, sys
import onnxruntime
from onnx import numpy_helper, TensorProto, save_model
from onnx.helper import make_model, make_node, make_graph, make_tensor_value_info
from onnx.checker import check_model

# Import from e2eshark/tools to allow running in the current dir; for runs through
# run.py, commonutils is symbolically linked so that any run dir works.
sys.path.insert(0, "../../../tools/stubs")
from commonutils import E2ESHARK_CHECK_DEF

# Create an instance of it for this test
E2ESHARK_CHECK = dict(E2ESHARK_CHECK_DEF)

# Create an input (ValueInfoProto)
X = make_tensor_value_info("X", TensorProto.FLOAT, [2, 4, 5])
Y = make_tensor_value_info("Y", TensorProto.FLOAT, [5, 3])

# Create an output
Z = make_tensor_value_info("Z", TensorProto.INT32, [2, 4, 3])

# Create a node (NodeProto)
qlxnode = make_node(
    "DynamicQuantizeLinear", ["X"], ["QX", "SX", "ZPX"], "qlxnode"
)
qlynode = make_node(
    "DynamicQuantizeLinear", ["Y"], ["QY", "SY", "ZPY"], "qlynode"
)
mminode = make_node(
    "MatMulInteger",  # op type
    ["QX", "QY", "ZPX", "ZPY"],  # inputs
    ["Z"],  # outputs
    "mminode",  # node name
)

# Create the graph (GraphProto)
graph = make_graph(
    [qlxnode, qlynode, mminode],
    "mmigraph",
    [X, Y],
    [Z],
)

# Create the model (ModelProto)
onnx_model = make_model(graph)
onnx_model.opset_import[0].version = 19

# Save the model
# save_model(onnx_model, "model.onnx")
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

session = onnxruntime.InferenceSession("model.onnx", None)
model_input_X = numpy.random.randn(2, 4, 5).astype(numpy.float32)
model_input_Y = numpy.random.randn(5, 3).astype(numpy.float32)
# gets X in inputs[0] and Y in inputs[1]
inputs = session.get_inputs()
# gets Z in outputs[0]
outputs = session.get_outputs()

model_output = session.run(
    [outputs[0].name],
    {inputs[0].name: model_input_X, inputs[1].name: model_input_Y},
)

# Moving to torch to handle bfloat16 as numpy does not support bfloat16
E2ESHARK_CHECK["input"] = [
    torch.from_numpy(model_input_X),
    torch.from_numpy(model_input_Y),
]
E2ESHARK_CHECK["output"] = [torch.from_numpy(arr) for arr in model_output]

print("Input:", E2ESHARK_CHECK["input"])
print("Output:", E2ESHARK_CHECK["output"])
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright 2024 Advanced Micro Devices
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
2 changes: 1 addition & 1 deletion e2eshark/onnx/combinations/mlp/model.py
@@ -1,4 +1,4 @@
# Copyright 2024 Advanced Micro Devices
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
49 changes: 49 additions & 0 deletions e2eshark/onnx/models/AlexNet_vaiq_int8/model.py
@@ -0,0 +1,49 @@
import numpy, torch, sys
import onnxruntime

# Import from e2eshark/tools to allow running in the current dir; for runs through
# run.py, commonutils is symbolically linked so that any run dir works.
sys.path.insert(0, "../../../tools/stubs")
from commonutils import E2ESHARK_CHECK_DEF

# Create an instance of it for this test
E2ESHARK_CHECK = dict(E2ESHARK_CHECK_DEF)


# The generated or checked-in onnx file must always be called model.onnx.
# The tools/stubs/onnxmodel.py is appended to model.py
# to form runmodel.py in the run directory, which is then taken
# through the flow.


# start an onnxrt session
session = onnxruntime.InferenceSession("model.onnx", None)

# Even if model is quantized, the inputs and outputs are
# not, so apply float32
model_input_X = numpy.random.rand(1, 3, 224, 224).astype(numpy.float32)

# gets X in inputs[0]
inputs = session.get_inputs()
# gets Z in outputs[0]
outputs = session.get_outputs()


model_output = session.run(
    [outputs[0].name],
    {inputs[0].name: model_input_X},
)[0]
E2ESHARK_CHECK["input"] = [torch.from_numpy(model_input_X)]
E2ESHARK_CHECK["output"] = [torch.from_numpy(arr) for arr in model_output]

print("Input:", E2ESHARK_CHECK["input"])
print("Output:", E2ESHARK_CHECK["output"])

# Post process output to do:
# sort(topk(torch.nn.functional.softmax(output, 0), 2)[1])[0]
# Top most probability
# E2ESHARK_CHECK["postprocess"] = [
# (torch.nn.functional.softmax, [0], False, 0),
# (torch.topk, [2], True, 1),
# (torch.sort, [], True, 0),
# ]