Add opcheck testing for nms #7961
Conversation
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7961
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 00673be with merge base e3fb8c0: FLAKY - the following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This adds testing for the meta registrations, as promised. PyTorch core doesn't have everything that's needed landed yet, so we can't merge this until pytorch/pytorch#108936 goes in.
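For readers unfamiliar with opcheck, here is a minimal sketch of the kind of check the generated tests run, invoking opcheck directly on the nms operator. The import path is an assumption: at the time of this PR opcheck lived under `torch.testing._internal.optests`, and newer PyTorch releases expose it as `torch.library.opcheck`, so adjust for your version.

```python
# Sketch only; the opcheck import location is an assumption (see note above).
import torch
import torchvision  # registers the torchvision::nms custom op
from torch.testing._internal.optests import opcheck

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [1.0, 1.0, 11.0, 11.0]])
scores = torch.tensor([0.9, 0.8])

# opcheck exercises the op's schema, fake/meta tensor registration, and
# autograd registration against a concrete sample input.
opcheck(torch.ops.torchvision.nms, (boxes, scores, 0.5))
```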
Richard, I'm curious to see what you think of this. I'm trying to use optest on the torchvision test suite, and after hacking up pytest support in #108929 I noticed that this was 5x'ing the test time... for no good reason.

* torchvision nms tests before optests: 60 passed, 4 skipped, 1206 deselected in 11.47s
* after optests: 300 passed, 20 skipped, 1206 deselected in 49.85s

It's no good reason because torchvision parametrizes the tests to get a spread of random inputs, but for checking the schema or fake tensor behavior we don't actually need to test different values. This PR hacks up the codegen to rewrite pytest parametrize markers so that, instead of sampling many values, we sample only one value if you mark the test with `opcheck_only_one`. There's a carveout for device parametrization, where we always run all those variants. With this PR:

* reduced optests: 88 passed, 4 skipped, 1206 deselected in 13.89s

Companion torchvision PR which uses this: pytorch/vision#7961

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #108936
Approved by: https://github.com/zou3519
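As a concrete illustration of the mechanism described above, here is a sketch of how a torchvision-style test might opt into the reduced sampling. The `@pytest.mark.opcheck_only_one()` spelling and the test body are assumptions based on the description, not a copy of the actual torchvision test.

```python
import pytest
import torch
from torchvision.ops import nms

# Sketch: the seed parametrization matters for the functional test, but the
# generated schema/fake-tensor variants only need one representative input,
# so the (assumed) opcheck_only_one marker tells the codegen to sample a
# single value. Device parametrization would still be expanded in full.
@pytest.mark.parametrize("iou", (0.2, 0.5, 0.8))
@pytest.mark.parametrize("seed", range(10))
@pytest.mark.opcheck_only_one()
def test_nms_sketch(iou, seed):
    torch.manual_seed(seed)
    boxes = torch.rand(100, 4) * 100
    boxes[:, 2:] += boxes[:, :2]  # ensure x2 >= x1 and y2 >= y1
    scores = torch.rand(100)
    keep = nms(boxes, scores, iou)
    assert keep.ndim == 1
```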
test/test_ops.py
Outdated
optests.generate_opcheck_tests(
    TestNMS,
    ["torchvision"],
    {},
    "test/test_ops.py",
    [],
    data_dependent_torchvision_test_checks,
)
Hmm, after pytorch/pytorch#109110 the failures dict is no longer a Python dict; generate_opcheck_tests now assumes the existence of a JSON file.
The NMS tests work on everything, so there are no expected failures. It seems unfortunate that we would need to have a .json file in the repo to use generate_opcheck_tests. But I don't really have a better idea right now: the reason we require a JSON file is so that we can automatically update it by writing to it; a string is a bit more difficult because we haven't hooked into the expecttest mechanism.
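For illustration, a sketch of what the JSON-backed call might look like in test/test_ops.py. The keyword names, the `optests_failures_dict.json` filename, and the `test_utils` list are assumptions for this sketch rather than a statement of the exact API; `TestNMS` refers to the test class from the hunk above.

```python
import os
from torch.testing._internal import optests

# Sketch (parameter names assumed): point the generator at a failures JSON
# that lives next to the test module. With no expected failures it can start
# out as an empty JSON object ("{}"); keeping it as a file lets the framework
# rewrite it automatically when failures need to be recorded.
optests.generate_opcheck_tests(
    testcase=TestNMS,
    namespaces=["torchvision"],
    failures_dict_path=os.path.join(os.path.dirname(__file__), "optests_failures_dict.json"),
    additional_decorators=[],
    test_utils=["test_schema", "test_autograd_registration", "test_faketensor"],
)
```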
I don't mind having to have an empty JSON, but I am going to have to do this for each test class (of which there are a lot)
I have a similar problem with fbgemm, where there are a lot of test classes. Maybe everything should roll into the same json file
Thanks for the PR @ezyang. Our CI is missing the expecttest dep; I pushed a commit for that. Let's see.
Can't land this yet, because @zou3519 is going to make some more changes to the opcheck API that I want to land before I do this, but the review is appreciated.
@@ -462,9 +471,10 @@ def test_boxes_shape(self):

    @pytest.mark.parametrize("aligned", (True, False))
    @pytest.mark.parametrize("device", cpu_and_cuda_and_mps())
-   @pytest.mark.parametrize("x_dtype", (torch.float16, torch.float32, torch.float64), ids=str)
+   @pytest.mark.parametrize("x_dtype", (torch.float16, torch.float32, torch.float64))  # , ids=str)
Had to comment out the `ids` part because we're hitting this assert 🤔 https://github.com/pytorch/pytorch/blob/2a40b7efcb273d1689b55763fade49263adcc788/torch/testing/_internal/optests/generate_tests.py#L225
@ezyang I updated the PR according to the new opcheck, LGTM. LMK if there's anything more you wanted to add on your side, otherwise I'll merge.
Just FYI, we're hitting this "NYI" failure #7961 (comment)
No, thank you so much for finishing it up! Please merge whenever you're ready.
Hey @NicolasHug! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
Summary: Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Reviewed By: vmoens
Differential Revision: D50789092
fbshipit-source-id: 614de3d6949a84ca576b9e7344de2e8e18152bf3
Co-authored-by: Nicolas Hug <nh.nicolas.hug@gmail.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
cc @pmeier