move to modern indexing style #790

Moelf · 2022-05-16T19:17:21Z

No description provided.

Moelf · 2022-05-16T19:18:17Z

check #722 too

devmotion

Generally, I think there's no reason to drop @inbounds if we use eachindex and hence ensure that indexing is correct. However, it seems here we could also just use zip without explicit eachindex?

As in #722, it would be good to add tests with OffsetArrays.

As an alternative that would also let us fix the promotion to Float64 and use pairwise summation, we could use e.g.:

function sqL2dist(a::AbstractArray{<:Number}, b::AbstractArray{<:Number})
    length(a) == length(b) || throw(DimensionMismatch("length of inputs incompatible"))
    r = sum(Broadcast.instantiate(Broadcast.broadcasted(vec(a), vec(b)) do ai, bi
        return abs2(ai - bi)
    end))
    return r
end

src/deviation.jl

Moelf · 2022-05-16T21:46:47Z

The compiler should infer @inbounds when using eachindex.

Moelf · 2022-05-16T22:22:50Z

after some testing, zip() is as slow as eachindex(a, b) without @inbounds

I suggest we keep using eachindex(a, b) but also add @inbounds.
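A minimal sketch of the pattern suggested here, assuming a hypothetical helper named `count_equal` (this is not the actual StatsBase source). It relies on `eachindex(a, b)` throwing a `DimensionMismatch` when the two arrays do not share the same indices, which is what makes the `@inbounds` annotation defensible for arrays that implement the indexing interface correctly.

```julia
# Hypothetical helper for illustration; not the StatsBase implementation.
function count_equal(a::AbstractArray, b::AbstractArray)
    c = 0
    # eachindex(a, b) checks that both arrays share the same indices and
    # throws a DimensionMismatch otherwise, so each i is valid for a and b.
    for i in eachindex(a, b)
        @inbounds if a[i] == b[i]
            c += 1
        end
    end
    return c
end

count_equal([1, 2, 3, 4], [1, 0, 3, 0])   # returns 2
```

Calling `count_equal([1, 2, 3], [1, 2])` raises a `DimensionMismatch` before any element is read, so the loop never indexes out of bounds.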

Moelf · 2022-05-16T22:33:03Z

on the master branch of Julia, the @inbounds is inferred from for i in eachindex(a, b) as expected.

and I don't understand the CI error on nightly

src/deviation.jl

Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com>

devmotion · 2022-05-17T11:21:10Z

Seems like the PR is mainly missing tests, e.g., with OffsetArrays? (Of course, there are other possible improvements discussed in some comments above but in my opinion they could go into separate PRs since this PR is already a clear improvement for arrays with non-standard indices.)

bkamins · 2022-05-18T06:48:52Z

src/deviation.jl

-        @inbounds if a[i] == b[i]
-            c += 1
-        end
+    for (ai, bi) = zip(a, b)


Using zip here is incorrect given the contract of counteq (the same comment applies below).

The point is that the contract states "Count the number of indices", whereas zip ignores indices and instead compares values in their iteration order.

devmotion · 2022-05-18

IMO that's consistent with the current behaviour though, regardless of the docstring. The current implementation already only cares about whether the first, second, third, etc elements match but doesn't care about shapes or cartesian indices of the compared arrays.

devmotion · 2022-05-18

Another nice consequence of using zip is that it is consistent with how Distances handles input arrays.

nalimilan · 2022-05-18

I agree with @bkamins that the docstring implies that values at the same indices are compared. So we need to check that the indices are the same (or that the starting indices are equal, in addition to the current check that the lengths are equal). We can keep some tolerance when mixing vectors and matrices by only checking that the linear indices are equal, to avoid breakage.

What do you mean about Distances? Doesn't it check that axes are the same or that at least linear indices are the same? EDIT: just saw the link you posted above to https://github.com/JuliaStats/Distances.jl/blob/91f51b543ea6c54936d3e6183acdf7da50bf1f9e/src/metrics.jl#L251. I guess we could use a similar approach, but I would suggest being stricter for arrays with non-1-based indices, and requiring that they start at the same index. Note that this logic in Distances predates OffsetArray support since it was already present at JuliaStats/Distances.jl#164, so there's no reason to think this behavior makes sense for OffsetArrays.
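To make the two kinds of check concrete, here is a rough sketch using only Base. The helper names `same_axes` and `same_linear_indices` are made up for illustration and are not taken from StatsBase or Distances; the last line shows how zip behaves on the same inputs.

```julia
# Hypothetical helpers for illustration; not part of StatsBase or Distances.

# Strict check: shapes and starting indices must agree in every dimension.
same_axes(a::AbstractArray, b::AbstractArray) = axes(a) == axes(b)

# Tolerant check: only the linear index ranges must agree, so a length-4
# vector and a 2x2 matrix are accepted.
same_linear_indices(a::AbstractArray, b::AbstractArray) =
    eachindex(IndexLinear(), a) == eachindex(IndexLinear(), b)

v = [1, 2, 3, 4]
m = [1 3; 2 4]   # column-major iteration order is 1, 2, 3, 4

same_axes(v, m)             # false
same_linear_indices(v, m)   # true

# zip never consults indices; it pairs elements in iteration order.
count(((x, y),) -> x == y, zip(v, m))   # 4
```

Under the tolerant check, a vector whose indices start at something other than 1 would not match a 1-based vector of the same length, which is the stricter treatment of non-1-based arrays argued for here.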

devmotion · 2022-05-18

Since the package neither supports nor tests arrays with non-standard or non-linear indices, my interpretation is that the docstring just refers to iteration order and was not intended as a more general design decision. Maybe there's some information in the original PR.

nalimilan · 2022-05-18

Yes, we're free to reinterpret and adapt the docstring as we deem appropriate. But when passing arrays with different axes, I would find it dangerous to silently discard indices. If people use OffsetArrays there must be a reason. Requiring matching indices is also the safer approach: if people find it too inconvenient, we can relax this requirement later and allow mismatched indices. On the contrary, if we allow them now we won't be able to change this later.

bkamins · 2022-05-18

I think it is clear from the source code that the original design did not take non-1-based indices into account. Naturally, the docstrings do not cover this case.

I think we need to make a general decision for the whole package:

whether we allow non-1-based indexing (I feel from the discussion that we want to allow it)

if yes, how such arrays should be treated (and this should be made consistent across all methods in the package)

what our position is on mixing arrays of different dimensionalities (this point is related, as higher-dimensional arrays usually support linear indexing, which is 1-based), and when we want to accept only vectors.

After these decisions are made and documented, the respective PRs should be made to reflect them. Otherwise we risk a situation in which different methods in the package make different assumptions.

nalimilan · 2022-05-18

A simple solution could be to adopt the semantics of broadcast, as discussed above at #790 (comment). This would avoid forcing users to understand yet another rule and it would be easier to implement for us (in practice we could use a different implementation under the hood if needed for performance).
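As a rough illustration of what broadcast semantics would mean for a counting function, consider the sketch below. `count_equal_bc` is a hypothetical name, and the eager `a .== b` allocates a temporary array, so this only shows the behaviour and is not a proposed implementation.

```julia
# Hypothetical helper; a sketch of broadcast-style semantics, not StatsBase's counteq.
count_equal_bc(a::AbstractArray, b::AbstractArray) = count(a .== b)

count_equal_bc([1, 2, 3, 4], [1, 0, 3, 0])   # 2

# Same total length but incompatible shapes: broadcasting refuses.
try
    count_equal_bc([1, 2, 3, 4], [1 3; 2 4])
catch err
    @assert err isa DimensionMismatch
end

# Caveat: broadcasting also expands missing trailing dimensions, so a
# length-2 vector is compared against each column of a 2x2 matrix.
count_equal_bc([1, 2], [1 3; 2 4])   # 2
```

The last call shows one consequence worth weighing: inputs of different lengths can be accepted when their shapes are broadcast-compatible, which differs from the current equal-length requirement.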

ParadaCarleton · 2023-08-22T18:49:54Z

@Moelf could we wrap up this PR? It looks almost ready to merge and I want to end all the negative press from StatsBase's frequent @inbounds errors.

(By the way, does eachindex still not automatically elide bounds checking?)

Moelf · 2023-08-22T18:53:02Z

what's the desired action? do we want to use zip or @inbounds?

ParadaCarleton · 2023-08-22T19:08:20Z

I'm happy with @inbounds, as long as eachindex is being used. If we want to fix that later that's fine, but for now we just need to fix the bugs.

Moelf · 2023-08-22T19:57:27Z

oh, but then it's already fixed on master, closing now

move to modern indexing style

e6a902e

devmotion reviewed May 16, 2022

mcabbott reviewed May 16, 2022

add back inbounts

e105e92

Moelf requested a review from devmotion May 16, 2022 22:29

devmotion reviewed May 16, 2022

use zip

f7201b8

KristofferC reviewed May 17, 2022

Update src/deviation.jl

f741872

Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com>

LilithHafner mentioned this pull request May 17, 2022

Allow only 1-based indexed vectors in AbstractWeights #791

Closed

bkamins reviewed May 18, 2022

bkamins mentioned this pull request May 18, 2022

make counting more robust to input datatype #722

Merged

Moelf closed this Aug 22, 2023

Moelf reopened this Aug 22, 2023

Merge branch 'master' into remove_inbounds

c1b3ab1

Moelf closed this Aug 22, 2023
