Enable composable benchmark configs for flexible model+device+optimization scheduling #7349
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7349
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit c9f0156 with merge base 6ab4399: one job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I'm still following up with AWS on this. The confusing part is that I'm not entirely sure whether this is something on the AWS side (but let's see). From what I read in https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.developer.kernel.increased-memory-limit, the increased memory limit feels like a maybe rather than a guarantee.
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Core ML ANE job (iOS 17): https://github.com/pytorch/executorch/actions/runs/12403482481
@huydhn FYI, the Core ML ANE and Qualcomm HTP jobs are not going to block this PR. I will merge it tonight to unblock you, and handle adding those paths in separate PRs if they don't work out of the box.
Sounds good! Once this lands, I could start working on bringing the benchmark config to the dashboard |
Core ML ANE and Qualcomm HTP are not ready. Will enable them in separate PRs. For now, we just hide these paths from
@kirklandsign FYI, the results of the newly added llama3 benchmark runs (spinquant, qlora, bf16) are also missing from benchmark_results.json. See the upload job here: https://github.com/pytorch/executorch/actions/runs/12404651330/job/34631545501
It's not true that any combination of model + delegate works, which makes adding a new model to the continuous run difficult, since the workflow would run it across all delegates. Besides, each delegate may run with different configurations. For example, llama3.2 spinquant uses a prequantized checkpoint, so it can't reuse the recipe for the regular fp32 checkpoint. To support the various combinations and optimizations, we are migrating to `benchmark_configs`, a set of predefined configs covering the optimizations that can be applied to a model, e.g. kv cache, embedding/activation quant, dtype, delegation, sdpa, etc.

In this PR, given a model (either a Hugging Face model ID or an in-tree model name) and a target platform ("android" vs "ios"), the workflow retrieves the list of supported benchmark configurations from the script `gather_benchmark_configs.py` and schedules the benchmark jobs accordingly. From the workflow dispatcher, users just need to enter the model names; the script discovers all supported benchmark configs for each model. Furthermore (not included in this PR), we could potentially expose `config_args` (key-value pairs) from the script, if there is a way to store them in the DB and display them in the dashboard. That would help us understand exactly how a model is exported/lowered when discussing/debugging perf metrics.

Apple: https://github.com/pytorch/executorch/actions/runs/12404655922
Android: https://github.com/pytorch/executorch/actions/runs/12404651330
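
For illustration, here is a minimal sketch of how config discovery along these lines could work. The config names, table contents, and function signature are hypothetical, not the actual `gather_benchmark_configs.py` implementation:

```python
# Hypothetical sketch of per-platform benchmark-config discovery; the real
# gather_benchmark_configs.py in pytorch/executorch may differ.
import json
import re

# Predefined benchmark configs: each name bundles a set of optimizations
# (dtype, quantization, delegation, etc.) known to work together.
BENCHMARK_CONFIGS = {
    "xplat": {  # configs that apply on both Android and iOS
        "llama": ["llama3_fb16", "llama3_spinquant", "llama3_qlora"],
        "default": ["xnnpack_q8"],
    },
    "android": {
        "default": ["qnn_q8"],  # Qualcomm HTP path (hidden until ready)
    },
    "ios": {
        "default": ["coreml_fp16"],  # Core ML ANE path (hidden until ready)
    },
}

def gather_benchmark_configs(model_name: str, target_os: str) -> list[str]:
    """Return all benchmark configs supported by `model_name` on `target_os`."""
    configs: list[str] = []
    for os_key in ("xplat", target_os):
        table = BENCHMARK_CONFIGS.get(os_key, {})
        if re.search(r"llama", model_name, re.IGNORECASE):
            configs.extend(table.get("llama", []))
        else:
            configs.extend(table.get("default", []))
    return configs

if __name__ == "__main__":
    # The dispatcher would serialize this list into a job matrix,
    # one benchmark job per (model, device, config) combination.
    print(json.dumps(gather_benchmark_configs("meta-llama/Llama-3.2-1B", "android")))
```

The key design point is that the model-to-config mapping lives in one script rather than in the workflow itself, so adding a new model or optimization only means extending the table, not editing every job definition.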