add llama + sd models to iree test suite #126
Conversation
---
Nice! Can you also review iree-org/iree#16801 please? Then once this lands we can start testing these programs on IREE presubmit too.
```json
"name": "splats",
"runtime_flagfile": "splat_data_flags.txt",
```
---
On that note, looks like splat test cases aren't running at the moment?
The test code is filtering to just real weight tests (using `-k real_weights`) right now:
SHARK-TestSuite/.github/workflows/test_iree.yml
Lines 75 to 78 in 73dcb42
```yaml
- name: "Running real weight model tests"
  run: |
    source ${VENV_DIR}/bin/activate
    pytest iree_tests -n auto -k real_weights -rpfE --timeout=60 --retries 2 --retry-delay 5 --durations=0
```
I consider the `test_iree.yml` in this repo to be a sanity check and demonstration of the test suite itself, not a source of truth for test coverage. iree-org/iree#16801 adds similar tests in the IREE repo, and that is what "really matters" (it will run on presubmit and should be kept passing). To that end, we can add whatever new filtering and test cases we want here.
---
Got it! Thanks for the info.
---
Speaking of how brittle the tests in this repo are (they use the latest IREE nightlies instead of a pinned version), you may want to sync past 28091b0 to get a clean CI run on this PR.
```json
"name": "real_weights",
"runtime_flagfile": "real_weights_data_flags.txt",
```
---
These tests timed out (and then were retried... which then ignored the timeout... that's an odd interaction, pytest plugins :P)
https://github.com/nod-ai/SHARK-TestSuite/actions/runs/8433834658/job/23095725951?pr=126#step:10:26
How about increasing the timeout here to 600 seconds?
SHARK-TestSuite/.github/workflows/test_iree.yml
Lines 75 to 78 in 73dcb42
```yaml
- name: "Running real weight model tests"
  run: |
    source ${VENV_DIR}/bin/activate
    pytest iree_tests -n auto -k real_weights -rpfE --timeout=60 --retries 2 --retry-delay 5 --durations=0
```
```json
"azure_account_url": "https://sharkpublic.blob.core.windows.net",
"azure_container_name": "sharkpublic",
"azure_base_blob_name": "sai/sd-vae-decode-tank/",
"files": [
  "sd-vae-decode-tank.mlirbc"
]
```
---
FYI, I'm considering collapsing these back down to just URLs like https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd-vae-decode-tank/sd-vae-decode-tank.mlirbc. I was trying to be fancy with Azure downloading and saving a few characters in the .json files, but making the URLs easy to copy/paste and `wget` is going to be nicer for developers.
---
OK, cool. This is also pretty nice, though. Being able to see the file names instead of the cluttered URLs is helpful for anyone taking a look. Maybe we can add a comment to each test noting that https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd-vae-decode-tank is the base URL; anyone interested in downloading a file can simply `wget` the base URL + `/{file_name}`.
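Either way, the full copy/paste-able URL is recoverable from the split JSON fields. A minimal sketch of that reconstruction (the field names match the JSON above; the `blob_urls` helper itself is hypothetical, not part of the test suite):

```python
# Sketch: rebuild full blob URLs from the split Azure fields used in the
# test-suite JSON files. The field names match the JSON above; this helper
# itself is hypothetical, not part of the test suite.
def blob_urls(spec: dict) -> list[str]:
    base = "/".join([
        spec["azure_account_url"].rstrip("/"),
        spec["azure_container_name"].strip("/"),
        spec["azure_base_blob_name"].strip("/"),
    ])
    return [f"{base}/{name}" for name in spec["files"]]

spec = {
    "azure_account_url": "https://sharkpublic.blob.core.windows.net",
    "azure_container_name": "sharkpublic",
    "azure_base_blob_name": "sai/sd-vae-decode-tank/",
    "files": ["sd-vae-decode-tank.mlirbc"],
}
print(blob_urls(spec)[0])
# -> https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd-vae-decode-tank/sd-vae-decode-tank.mlirbc
```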
---
Sure. I at least consider what's in the json files today to be very rough V0 code, so I wouldn't treat it as sacred if you find it tricky to work with.
Another idea was to group the files for each test case using tar/zip, so we wouldn't need to list them all explicitly.
---
Oooh that would be much easier for adding the tests
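The packing side of that idea could be a few lines of Python. A sketch under the assumption that each test case lives in its own directory (`pack_test_case` and the paths are made up for illustration):

```python
# Sketch: bundle all files for one test case into a single archive so the
# JSON only needs to reference one blob. Names/paths here are hypothetical.
import tarfile
from pathlib import Path

def pack_test_case(case_dir: str, out_path: str) -> list[str]:
    """Tar up every file in case_dir; returns the archived member names."""
    members = []
    with tarfile.open(out_path, "w:gz") as tar:
        for f in sorted(Path(case_dir).iterdir()):
            if f.is_file():
                tar.add(f, arcname=f.name)
                members.append(f.name)
    return members
```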
```
--expected_output=1x93x32000xf32=@inference_output.0.bin
--expected_output=1x32x93x128xf32=@inference_output.1.bin
--expected_output=1x32x93x128xf32=@inference_output.2.bin
--expected_output=1x32x93x128xf32=@inference_output.3.bin
--expected_output=1x32x93x128xf32=@inference_output.4.bin
```
---
Woah that's a lot of large outputs. Do we need to be checking all of them?
Is this coming from Turbine, or torchscript / "direct"?
---
This is coming from Turbine. These are just the tensor outputs of a llama forward pass (https://huggingface.co/docs/transformers/v4.39.1/en/main_classes/output#transformers.modeling_outputs.CausalLMOutputWithPast), specifically the `logits` and `past_key_values` tensors. We can ignore the `past_key_values` tensors and just look at `logits`. Let me know if I used the ignored outputs syntax correctly.
---
> Let me know if I used the ignored outputs syntax correctly.

You may need to repeat `--expected_output="(ignored)"` for each output (could try with a smaller test case to confirm).
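Assuming the repeated-flag form does work (which this thread hasn't confirmed), generating the flag list for a case like this one, where only the logits output is checked, might look like:

```python
# Sketch: build iree-run-module --expected_output flags where only some
# outputs are checked and the rest are ignored. NOTE: the per-output
# "(ignored)" syntax is an assumption from this thread, not confirmed
# behavior, and this helper is hypothetical.
def expected_output_flags(checked: dict[int, str], total: int) -> list[str]:
    """checked maps output index -> 'shape=@file' spec; others get (ignored)."""
    flags = []
    for i in range(total):
        value = checked.get(i, "(ignored)")
        flags.append(f"--expected_output={value}")
    return flags

# Only output 0 (logits) is verified; the 4 past_key_values outputs are ignored.
flags = expected_output_flags(
    {0: "1x93x32000xf32=@inference_output.0.bin"}, total=5)
```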
---
Tried a couple of different variations (it complains about no shape being provided, and when I provided one, it just complained about being unable to parse the arg). Maybe come back to it when we actually enable the llama real weight test?
---
¯\_(ツ)_/¯

> Maybe come back to it when we actually enable llama real weight test?

SGTM
---
Cool, are we good to land this?
---
Regarding naming, is "turbine tank" different from https://github.com/nod-ai/SHARK-TestSuite/tree/main/e2eshark/pytorch/models ? Is that https://github.com/nod-ai/SHARK-Turbine/tree/main/models/turbine_models/turbine_tank ?
Eventually we should have scripts like https://github.com/nod-ai/SHARK-TestSuite/blob/main/iree_tests/pytorch/models/import_from_e2eshark.py or guides like https://github.com/nod-ai/SHARK-TestSuite/tree/main/iree_tests#pytorch-models for each source of model files.
Just looking ahead a bit to whenever the MLIR and input/output files need to be regenerated. We want enough of that automated, or at least documented, that anyone can update the files*
* if they have write access to the cloud projects, but some github actions secrets + automation can also help there too
Yeah, it is different (https://github.com/nod-ai/SHARK-TestSuite/blob/main/turbine_tank/flows_util.py), but these models are custom and Turbine-specific (https://github.com/nod-ai/SHARK-Turbine/blob/main/models/turbine_models/custom_models/sd_inference/unet_runner.py). I want to get that process fully automated (pull in the latest passing MLIR nightly from the Turbine Tank Azure storage). Not sure if we can have a guide for adding Turbine Tank models, though, because I had to mess around with the model runners themselves so that they generate the input/output .bin files. The rest of the process (converting MLIR to mlirbc, and safetensors to splats and real_weights irpas) seems to already be covered.
---
Oh, this should have been checked in using git lfs, based on
SHARK-TestSuite/iree_tests/.gitattributes
Lines 1 to 2 in 9464d50
```
*.irpa filter=lfs diff=lfs merge=lfs -text
*.mlirbc filter=lfs diff=lfs merge=lfs -text
```
I see a warning when I pull this:
```
Encountered 6 file(s) that should have been pointers, but weren't:
iree_tests/pytorch/models/llama-tank/splats.irpa
iree_tests/pytorch/models/sd-clip-tank/sd-clip-tank.mlirbc
iree_tests/pytorch/models/sd-clip-tank/splats.irpa
iree_tests/pytorch/models/sd-unet-tank/sd-unet-tank.mlirbc
iree_tests/pytorch/models/sd-unet-tank/splats.irpa
iree_tests/pytorch/models/sd-vae-decode-tank/splats.irpa
```
I'm also learning how to use git lfs, so I guess I'll get back to you when I know how to resolve that and stop it from happening again...
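One way to catch this before pushing is to check for the LFS pointer header directly: a valid Git LFS pointer is a small text file that starts with a fixed version line. A sketch, independent of git itself (`is_lfs_pointer` is a hypothetical helper, not part of this repo):

```python
# Sketch: detect files that were committed as real content instead of
# Git LFS pointers. Per the git-lfs pointer spec, a pointer file starts
# with this exact version line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path: str) -> bool:
    """True if the file at `path` looks like a Git LFS pointer file."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

If I'm reading the git-lfs docs correctly, files already committed as real content can be rewritten into pointers with `git lfs migrate import --include="*.irpa,*.mlirbc"`.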
---
Sent #147 to fix
Follow-up to #126 (comment). All I needed to do to generate this PR was pull changes (with git lfs installed). I think to avoid this in the future, others working with these files should also install git lfs.
This commit plugs in Turbine Tank exported MLIR to the IREE testing framework for llama + sd models. Couldn't add a real weight test for llama: the real weight file is 20GB and crashes the runner when trying. On that note, it looks like splat test cases aren't running at the moment? Maybe we can enable the real weight test after we get the cluster. Also, just in case it's useful for anyone adding models in the future: for the exported model MLIR, I had to update the real weight flag to `--parameters=model=real_weights.irpa` to get it working.