Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make counting more robust to input datatype #722
make counting more robust to input datatype #722
Changes from 1 commit
3905f6e
8eb5855
84ef887
d82295d
a4505b2
c334d4d
94b3e6c
87074b2
aa21bc9
309586f
3b5e01d
3384f45
6b81b51
18b70b5
7aa2d79
93caf72
62c5967
1238a2d
c1c09d1
bc20432
cbfd92e
60c0aad
c72b1aa
e4ca264
00fe408
a470531
aca38a3
168b3ab
204f89a
3f3e657
657c1ec
2ebf1f3
f469f0c
6cf64d9
dc1f7ae
4fd45d9
d31dcce
0fd1c27
fbdf564
e1fab99
717c795
ba59cc3
2dfba9f
8f446ee
e20e28c
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are worth keeping (here and elsewhere) as they mention the names of the arguments whose dimensions don't match, which
eachindex
cannot do. Also in this particular case you've kept@inbounds
so no check is done AFAICT.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still thinking about this. I think the current error will look like this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone is calling
countmap
directly in their code withaxes(x) ≠ axes(wv)
, then the message "x and wv must have the same axes" is more helpful. If, on the other hand, the code causing the error is more removed (e.g.) then "all inputs to eachindex must have the same indices, got Base.OneTo(2) and Base.OneTo(3)" might be easier to understand because it lists the indices. The question is then if we want to manually lift the error message a level to update the parameter names, to which I lean toward no.
It depends on how
countmap
is used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general it's always better to throw the error as soon as possible rather than letting it be thrown by another function deeper in the call stack. Otherwise it's hard to understand what happened (and you may even think StatsBase is buggy), e.g. here
eachindex
isn't the problem so it shouldn't appear in the error or the stack trace. (This is also why a common pattern when implementing custom array types is to use@boundschecks checkbounds(...); @inbounds ...
so that the error is thrown by the custom array method rather than by the methods it calls internally.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, it's also possible and nice for users to print the indices in the error message you throw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because dimension checking is not in an
@boundscheck
This would result in redundant dimension checking when the dimensions do match (the common case).If we think about dimension checking (and bounds checking) as algorithmic operations that have a performance impact and should be performed only once for both style and performance reasons, then this is inelegant. This is how I currently think about bounds and dimension checking.
If we think about bounds and dimension checking like type assertions, a fun, effective, free, and ultimately optional way of formally commenting on code to help readers, error messages, and occasionally performance, then throwing in an extra dimension check is a good idea. This is how I would like to think about bounds and dimension checking, but I don't know if the compiler can elide redundant checks and the syntax for additional bounds and dimension checking is a lot less clean than
::Integer
or a hypothetical::InBounds
.I've only heard of this hoisting in @vchuravy's comment JuliaLang/julia#42521 (comment), and don't know where to look for details, feasibility & potential timelines, but it sounds enticing. It would be nice to obviate this whole discussion with a type error at the top of the call hierarchy.
This being a real package that people actually use, and dimension checking being cheap in practice (I think), makes me think that your suggestion of keeping the original errors (probably extended to follow
eachindex
's style of showing input values) is the best way to go pragmatically, even if there is theoretically a better way on the horizon or this way is technically suboptimal for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put back eager dimension checking and added error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler is generally good at removing redundant checks, but it needs to be able to prove that they are redundant.
JuliaLang/julia#42573 adds the ability to hoist
@inbounds
in LLVM, and JuliaLang/julia#42692 is about making that happen more often. All this will only come down the pipe in 1.8, so manual hoisting is still the name of the game for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Fun and exciting! I can't wait till the day I can stop writing @inbounds in my code altogether (or at least as much).