Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Product of distributions #473

Open
zsunberg opened this issue Apr 8, 2023 · 10 comments
Open

Product of distributions #473

zsunberg opened this issue Apr 8, 2023 · 10 comments
Labels
Contribution Opportunity This would be something that would be very useful to the community and a good modular addition.

Comments

@zsunberg
Copy link
Member

zsunberg commented Apr 8, 2023

We should add a way to make a product of two independent distributions to POMDPTools, i.e.

d1 = SparseCat([1,2], [0.5, 0.5])
d2 = SparseCat([:a, :b], [0.9, 0.1])
d3 = IndependentProduct(d1, d2, combine=tuple)
rand(d3) # might return (1, :a)

First we should consult Distributions.jl to see if there is already something like it.

@zsunberg zsunberg added the Contribution Opportunity This would be something that would be very useful to the community and a good modular addition. label Apr 8, 2023
@mykelk
Copy link
Member

mykelk commented Apr 8, 2023

Kind of like the Product distribution in Distributions.jl?
https://juliastats.org/Distributions.jl/stable/multivariate/#Distributions.Product

@zsunberg
Copy link
Member Author

zsunberg commented Apr 8, 2023

Yes... the only problem is:

julia> using POMDPTools, Distributions

julia> d = SparseCat([1,2], [0.5, 0.5]);

julia> Product([d, d])
ERROR: MethodError: no method matching Product(::Vector{SparseCat{Vector{Int64}, Vector{Float64}}})
Closest candidates are:
  Product(::V) where {S<:ValueSupport, T<:UnivariateDistribution{S}, V<:AbstractVector{T}} at ~/.julia/packages/Distributions/Spcmv/src/multivariate/product.jl:25

So, either we need to fit our distributions into the Distributions.jl type hierarchy or make our own product.

This is why I don't like elaborate type hierarchies like the one in Distributions.jl. It scares me every time I see one. (Side note: that is another reason why I am not excited about DomainSets - it has a very elaborate type hierarchy)

@mykelk
Copy link
Member

mykelk commented Apr 9, 2023

Yep, we can make our own. Or, it seems like we could provide a wrapper around general distributions from Distributions.jl?

@johannes-fischer
Copy link
Contributor

This is why I don't like elaborate type hierarchies like the one in Distributions.jl.

Out of curiosity: what is the problem with fitting it into the Distributions.jl type hierarchy?

@zsunberg
Copy link
Member Author

zsunberg commented Apr 14, 2023

@johannes-fischer thanks for asking! It is very useful for me to have to work out how to explain things like this.

Let's say we wanted to put put POMDPTools.SparseCat in the Distributions.jl type hierarchy. We would need something like

struct SparseCat{V, P} <: Distribution{F<:VariateForm, S<:ValueSupport}
      vals::V
      probs::P
end

So then we need to choose a VariateForm and a ValueSupport. For ValueSupport, we choose Discrete. For VariateForm, we have the choices of Univariate, Multivariate, and Matrixvariate, or we can make our own. Depending on the type of objects in SparseCat, it could be Univariate, Multivariate, or something else (like a string). I have no idea what the implications for making a new VariateForm would be.

I suppose I could do something like

struct SparseCat{V, P, F<:VariateForm} <: Distribution{F, Discrete}
      vals::V
      probs::P
end

but I would still have to make a new VariateForm for things like strings.

If VariateForm and ValueSupport were traits instead of parts of the type hierarchy, this would be a much more tractable problem because I could let VariateForm depend on the type of vals.

Note: POMDPs.jl is also guilty of this: we should just have statetype instead of POMDP{S, A, O}.

@johannes-fischer
Copy link
Contributor

@zsunberg I see, thanks for the detailed explanation! What would be the issue with defining

struct SparseCat{V, P} <: Distribution{VariateForm, Discrete}
      vals::V
      probs::P
end

The fact that VariateForm is an abstract type should not be a problem, since Univariate, Multivariate and Matrixvariate are also abstract types with no concrete subtypes. It wouldn't dispatch on methods for UnivariateDistribution etc. which seems reasonable if it is unknown whether SparseCat is univariate. Or something like

struct SparseCat{V, P, F<:VariateForm} <: Distribution{F, Discrete}
      vals::V
      probs::P
end
SparseCat(vals::V, probs::P) where {V,P} = SparseCat{V, P, VariateForm}(vals, probs)
SparseCat(::Type{F}, vals::V, probs::P) where {V, P, F<:VariateForm} = SparseCat{V, P, F}(vals, probs)

to allow for SparseCat distributions that explicitly set VariateForm but also SparseCat for arbitrary objects like strings etc. without defining a new VariateForm.

@zsunberg
Copy link
Member Author

Yeah, actually I guess some form of what you are suggesting wouldn't be too bad. I think the niceness of integrating with Distributions.jl outweighs the bit of ugliness, and it looks like Distributions.jl would move in the direction of traits if someone has time to work on it in the future.

I like the second approach in general better. But I think we should try a little harder to infer the VariateForm in the constructor.

@zsunberg
Copy link
Member Author

If anyone wants to take a stab at making the distributions in POMDPTools compatible with Distributions.jl, it would be a nice contribution.

@zsunberg
Copy link
Member Author

I tried implementing this in #495 but it did not go very well. Distributions.jl is very focused on numerical distributions (see #490), so Product([SparseCat([:a,:b,:c], [0.5, 0.2, 0.3]), BoolDistribution(1.0)]) has no chance of working anytime soon.

@johannes-fischer
Copy link
Contributor

I saw that Distributions.Product is deprecated and will be replaced by Distributions.ProductDistribution (see https://github.com/JuliaStats/Distributions.jl/blob/master/src/multivariate/product.jl and https://github.com/JuliaStats/Distributions.jl/blob/master/src/product.jl). But I think it doesn't change much regarding the issues.

Constructing the product distribution works at least when giving the SparseCat a Univariate form explicitly: Distributions.ProductDistribution([SparseCat{Vector{Symbol}, Vector{Float64}, Univariate}([:a,:b,:c], [0.5, 0.2, 0.3]), BoolDistribution(1.0)]) but pdf or rand still fail since ultimately they expect numeric values, as you mentioned (x::AbstractArray{<:Real,N}).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Contribution Opportunity This would be something that would be very useful to the community and a good modular addition.
Projects
None yet
Development

No branches or pull requests

3 participants