Add Llama3.2 1B HTP to benchmark #7398
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7398
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit 2c5f9dc with merge base b2a680b.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 32a44a8 to e15fd50.
Force-pushed from e15fd50 to fc3ccb4.
The HTP path is expected to take much longer due to calibration, so the timeout threshold needs to be bumped up. There is no harm in bumping it up for ALL configs, since most finish within 30 minutes anyway.
Force-pushed from 3287b18 to 0565c7e.
Force-pushed from 0565c7e to 361ca85.
  docker-image: executorch-ubuntu-22.04-qnn-sdk
  submodules: 'true'
- timeout: 60
+ timeout: 240
Setting it to 120 still timed out.
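For context, here is a minimal sketch of where this timeout sits in the benchmark workflow; only the docker-image, submodules, and timeout lines come from the diff above, while the job name and the reusable-workflow reference are assumptions for illustration, not the actual file.

```yaml
# Hypothetical excerpt of the benchmark workflow.
# Only docker-image, submodules, and timeout are taken from the diff above;
# the job name and the reusable workflow it calls are assumed.
jobs:
  export-models:
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main  # assumed
    with:
      docker-image: executorch-ubuntu-22.04-qnn-sdk
      submodules: 'true'
      timeout: 240  # was 60; 120 still timed out because HTP calibration is slow
```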
@@ -132,10 +132,10 @@ jobs:
  matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
  fail-fast: false
  with:
- runner: linux.2xlarge.memory
+ runner: linux.4xlarge.memory
Locally on a devserver, quantization takes about 1204 seconds ("INFO:root:Time for quantizing: 1203.9422521591187"), but on CI it is roughly 4x slower, so bump up to the 4x runner.
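That works out to roughly 20 minutes locally, so at about 4x slower the CI run is on the order of 80 minutes, which is consistent with both the 60- and 120-minute timeouts being exceeded. Below is a minimal sketch of the runner change in context; the matrix, fail-fast, and runner lines come from the diff, while the job name and reusable-workflow reference are assumptions.

```yaml
# Hypothetical job excerpt. matrix, fail-fast, and runner come from the diff;
# the job name and the reusable workflow reference are placeholders.
jobs:
  benchmark:
    needs: set-parameters
    strategy:
      matrix: ${{ fromJson(needs.set-parameters.outputs.benchmark_configs) }}
      fail-fast: false
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main  # assumed
    with:
      runner: linux.4xlarge.memory  # was linux.2xlarge.memory; quantization is ~4x slower on CI
```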
Force-pushed from 361ca85 to 88a9267.
Force-pushed from 88a9267 to 2c5f9dc.
Llama3.2 QNN HTP: https://github.com/pytorch/executorch/actions/runs/12426136559/job/34693953714