Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow only 1-based indexed vectors in AbstractWeights #791

Closed
wants to merge 2 commits into from

Conversation

bkamins
Copy link
Contributor

@bkamins bkamins commented May 17, 2022

Currently we have:

julia> Weights(OffsetArray([1,2,3], -2))
3-element Weights{Int64, Int64, OffsetVector{Int64, Vector{Int64}}}:
                 3
        1320009728
 20547999722963018

I propose to require that AbstractWeights take a vector using 1-based indexing. This is a safer option to the alternative, where we would recalculate indices everywhere (this is possible, but I think error prone, so it is better to be defensive)

@bkamins bkamins requested a review from nalimilan May 17, 2022 21:43
src/weights.jl Outdated Show resolved Hide resolved
"""
macro weights(name)
return quote
mutable struct $name{S<:Real, T<:Real, V<:AbstractVector{T}} <: AbstractWeights{S, T, V}
values::V
sum::S
end
$(esc(name))(values::AbstractVector{<:Real}) = $(esc(name))(values, sum(values))
function $(esc(name))(values::AbstractVector{<:Real})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative that would be safer would be to define inner constructor:

macro weights(name)
    return quote
        mutable struct $(esc(name)){S<:Real, T<:Real, V<:AbstractVector{T}} <: AbstractWeights{S, T, V}
            values::V
            sum::S

            function $(esc(name)){S, T, V}(values::AbstractVector{T}, s::S) where {S<:Real, T<:Real, V<:AbstractVector{T}}
                Base.require_one_based_indexing(values)
                return new(values, s)
            end
        end

        $(esc(name))(values::AbstractVector{<:Real}, s=sum(values)) =
            $(esc(name)){typeof(s), eltype(values), typeof(values)}(values, s)
    end
end

but this looses all promotion rules that default inner constructor comes with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we want to require 1-based indexing in all cases, the inner constructor approach sounds safer. How many methods do we need to define to replicate the default inner constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have closed the PR since it seems we want to support non-1-based indexing. I think this issue should be settled first.

"""
macro weights(name)
return quote
mutable struct $name{S<:Real, T<:Real, V<:AbstractVector{T}} <: AbstractWeights{S, T, V}
values::V
sum::S
end
$(esc(name))(values::AbstractVector{<:Real}) = $(esc(name))(values, sum(values))
function $(esc(name))(values::AbstractVector{<:Real})
Base.require_one_based_indexing(values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LilithHafner
Copy link
Contributor

The issue here is also fixed by the single line of code Base.axes(wv::AbstractWeights) = Base.axes(wv.values) which makes Weights{T, S, OffsetVector{U, Vector{V}}} compliment with the the AbstractVector interface which it claims via subtyping.

That fix is included in #722.

Making weights throw an error on OffsetVectors does not eliminate the related segfaults because a user can still call Weights, as you noted above. To fix the exposure to segfaults due to the faulty assumption that AbstractVectors all use one-based indexing in methods that handle weights, we would need to add this check to the inner constructor and also narrow all type signatures from AbstractWeights to Weights, because a user is free to define their own subtype of AbstractWeights that fulfills the AbstractVector interface but does not have one-based indexing. Further, many of these segfaults come from making that assumption about AbstractVectors, not just AbstractWeights, and this pr would not address those errors.

We have to pick one of

  1. Be okay with the possibility of segfaults
  2. Check for and error on non-one-based indexing everywhere
  3. Use modern indexing everywhere (see move to modern indexing style #790, make counting more robust to input datatype #722)

I prefer option 3.

@bkamins
Copy link
Contributor Author

bkamins commented May 18, 2022

I am OK to go for option 3 so I am closing this PR.
Let me just note that going this way requires much more careful design and tests.

@bkamins bkamins closed this May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants