Enable composable benchmark configs for flexible model+device+optimization scheduling #7349

Merged: 1 commit into main on Dec 19, 2024

Conversation

@guangy10 (Contributor) commented Dec 18, 2024

Not every combination of model + delegate actually works, which makes adding a new model to the continuous run difficult, because the workflow would run it across all delegates. Besides, each delegate may run with different configurations. For example, llama3.2 SpinQuant uses a prequantized checkpoint, so it doesn't follow the recipe for the regular fp32 checkpoint. To support various combinations and optimizations, we are migrating to benchmark_configs, a set of predefined configs covering the optimizations that can be applied to a model, e.g. KV cache, embedding/activation quantization, dtype, delegation, SDPA, etc.

In this PR, given a model (either a Hugging Face model ID or an in-tree model name) and a target platform ("android" vs. "ios"), the workflow retrieves the list of supported benchmark configurations from the script gather_benchmark_configs.py and schedules the benchmark jobs accordingly. From the workflow dispatcher, users only need to enter the model names; the workflow discovers all supported benchmark configs for each model. Furthermore (not included in this PR), we could potentially expose config_args (key-value pairs) from the script if there is a way to store them in the DB and display them on the dashboard. That would help clarify exactly how a model is exported/lowered when discussing or debugging perf metrics.
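
To make the idea concrete, here is a minimal sketch of what the config lookup could look like. This is not the actual contents of gather_benchmark_configs.py; the config names, the model ID, and the table layout below are assumptions for illustration only:

```python
# Hypothetical sketch of the benchmark-config lookup. The real
# gather_benchmark_configs.py in the repo is the source of truth;
# all names below are made up for illustration.

# Each named config bundles the optimizations applied to a model
# (KV cache, quantization, dtype, delegation, SDPA, ...).
BENCHMARK_CONFIGS = {
    "xplat": ["xnnpack_q8"],        # configs valid on both platforms
    "android": ["qnn_q8"],
    "ios": ["coreml_fp16", "mps"],
}

# Models whose checkpoints dictate a specific recipe, e.g. a
# prequantized SpinQuant checkpoint that must skip the fp32 recipe.
MODEL_SPECIFIC_CONFIGS = {
    "meta-llama/Llama-3.2-1B-SpinQuant": ["llama3_spinquant"],  # assumed ID
}

def gather_benchmark_configs(model_name: str, target_os: str) -> list[str]:
    """Return the benchmark configs supported by model_name on target_os."""
    if model_name in MODEL_SPECIFIC_CONFIGS:
        return MODEL_SPECIFIC_CONFIGS[model_name]
    return BENCHMARK_CONFIGS["xplat"] + BENCHMARK_CONFIGS[target_os]

print(gather_benchmark_configs("mv3", "android"))  # ['xnnpack_q8', 'qnn_q8']
```

With a lookup like this, the dispatcher fans out one benchmark job per (model, config, device) combination instead of blindly crossing every model with every delegate.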

Apple: https://github.com/pytorch/executorch/actions/runs/12404655922
Android: https://github.com/pytorch/executorch/actions/runs/12404651330

pytorch-bot (bot) commented Dec 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7349

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit c9f0156 with merge base 6ab4399:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 18, 2024
@guangy10 marked this pull request as draft December 18, 2024 01:10
@guangy10 force-pushed the benchmark_configs branch repeatedly December 18, 2024 (most recently from c72fb73 to d030e94)
@guangy10 had repeated problems deploying to upload-benchmark-results December 18, 2024, 01:11–03:39 — with GitHub Actions Failure
@guangy10 added the module: benchmark and topic: not user facing labels Dec 18, 2024
@huydhn (Contributor) commented Dec 18, 2024

> We will probably not include this path in the continuous run until the RAM limit is lifted.

I'm still following up with AWS on this. The confusing part is that I'm not entirely sure whether this is something on AWS's side (but let's see). From what I read in https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.developer.kernel.increased-memory-limit, this feels like a maybe and not a guarantee:

> If you use this entitlement, make sure your app still behaves correctly if additional memory isn’t available.

and

> An increased memory limit is only available on some device models. Call the os_proc_available_memory function to determine the amount of memory available.

@facebook-github-bot commented: @guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 temporarily deployed to upload-benchmark-results December 18, 2024 22:55 — with GitHub Actions Inactive
@guangy10 (Contributor, Author) commented Dec 18, 2024

Core ML ANE job (iOS 17): https://github.com/pytorch/executorch/actions/runs/12403482481

@guangy10 force-pushed the benchmark_configs branch 2 times, most recently from 3120d9c to 39db916, December 19, 2024 00:03
@facebook-github-bot commented: @guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 (Contributor, Author) commented

@huydhn FYI, the Core ML ANE and Qualcomm HTP jobs are not going to block this PR. I will merge this PR tonight to unblock you, and leave adding those paths to separate PRs if they don't work out of the box.

@huydhn (Contributor) commented Dec 19, 2024

> @huydhn FYI, the Core ML ANE and Qualcomm HTP jobs are not going to block this PR. I will merge this PR tonight to unblock you, and leave adding those paths to separate PRs if they don't work out of the box.

Sounds good! Once this lands, I can start working on bringing the benchmark config to the dashboard.

@guangy10 temporarily deployed to upload-benchmark-results three times December 19, 2024, 00:53–01:20 — with GitHub Actions Inactive
@facebook-github-bot commented: @guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 (Contributor, Author) commented

Core ML ANE and Qualcomm HTP are not ready. They will be enabled in separate PRs. For now, we just hide these paths from benchmark_configs.
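
As a rough illustration only (not the actual change in this PR; all config names below are assumptions), hiding a path amounts to filtering the not-yet-ready configs out of the predefined set:

```python
# Hypothetical sketch: exclude configs whose backends aren't ready yet
# so the scheduler never fans out jobs for them. Config names are assumed.
ALL_CONFIGS = ["xnnpack_q8", "coreml_fp16", "coreml_ane", "qnn_htp"]
DISABLED_CONFIGS = {"coreml_ane", "qnn_htp"}  # hidden until enabled in later PRs

def visible_configs() -> list[str]:
    """Return only the configs whose backends are ready to benchmark."""
    return [c for c in ALL_CONFIGS if c not in DISABLED_CONFIGS]

assert "qnn_htp" not in visible_configs()
```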

@guangy10 temporarily deployed to upload-benchmark-results twice December 19, 2024, 02:33–02:36 — with GitHub Actions Inactive
@guangy10 merged commit 62016d6 into main Dec 19, 2024 (149 of 151 checks passed)
@guangy10 deleted the benchmark_configs branch December 19, 2024 02:58
@guangy10 temporarily deployed to upload-benchmark-results twice December 19, 2024, 03:01–03:06 — with GitHub Actions Inactive
@guangy10 (Contributor, Author) commented

@kirklandsign FYI, the results of the newly added llama3 benchmarks (spinquant, qlora, bf16) are also missing from benchmark_results.json. See the upload job here: https://github.com/pytorch/executorch/actions/runs/12404651330/job/34631545501
