Handling discontinuous density functions #925

Open
richardreeve opened this issue Jul 8, 2019 · 2 comments

@richardreeve
Contributor

richardreeve commented Jul 8, 2019

I'm thinking about problems with non-continuous distributions, and whether it really makes sense to only have Union{Discrete, Continuous} <: ValueSupport. There was a brief discussion about mixed discrete and continuous distributions in #332, then more recently a couple of attempts to deal with other problems with discrete distributions in #887 and #916, which would allow non-integer support. However, it seems like none of them is going anywhere at the moment.

The problem I see is that pdf is the probability density function for continuous distributions but also the probability mass function for discrete distributions. I'd like to be able to define a slab-and-spike distribution (as mentioned in that first issue), but I can't see how to do it. It's not just that the distributions in MixtureModels all need to have the same ValueSupport subtype; it's also not clear what the ValueSupport should be, nor what the pdf function should return. If we treat it as a density function, it's infinite at the point masses, whereas if we treat it as a mass function, it's zero everywhere else.
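
To make the dilemma concrete, here is a rough sketch for a single spike. The notation is assumed for illustration and is not from any of the linked issues: a point mass of weight w (with 0 < w < 1) at x_0, and a continuous slab with density f.

```latex
% Assumed notation: spike weight w in (0, 1) at x_0, slab density f.
% The distribution itself is well defined as a measure:
P(X \in A) = w \,\mathbf{1}[x_0 \in A] + (1 - w) \int_A f(x)\,dx

% Read as a density, the spike needs a Dirac delta, which is unbounded at x_0:
p(x) = w\,\delta(x - x_0) + (1 - w)\, f(x)

% Read as a mass function, only the spike survives, and the total mass is w < 1:
P(X = x) = \begin{cases} w & x = x_0 \\ 0 & \text{otherwise} \end{cases}
```

Neither reading on its own describes the whole distribution, which is why a single pdf function is awkward here.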

My feeling is that there's a problem with reusing pdf in both cases, and a separate pmf seems like a better bet. This can obviously be done in a non-breaking way by aliasing pmf to pdf for the existing discrete distributions, and mixed distributions can then define both. What do people think? What do these distributions actually have to provide?
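
As a minimal sketch of what the pmf idea could look like, here is a stand-alone toy example. Every type in it (ToyDistribution, ToyBernoulli, ToyUniform, ToySpikeSlab) is hypothetical and made up for illustration; none of it is Distributions.jl API, and the choice that pdf on a mixed distribution returns the density of the continuous part only is an assumption, not something settled in this thread.

```julia
# Hypothetical stand-alone sketch; none of these types are Distributions.jl API.
abstract type ToyDistribution end

struct ToyBernoulli <: ToyDistribution
    p::Float64
end

struct ToyUniform <: ToyDistribution
    a::Float64
    b::Float64
end

# Current convention: `pdf` doubles as the mass function for discrete types.
pdf(d::ToyBernoulli, x::Real) = x == 1 ? d.p : (x == 0 ? 1 - d.p : 0.0)
pdf(d::ToyUniform, x::Real) = d.a <= x <= d.b ? 1 / (d.b - d.a) : 0.0

# Non-breaking addition: `pmf` forwards to `pdf` for a purely discrete type...
pmf(d::ToyBernoulli, x::Real) = pdf(d, x)
# ...and is zero everywhere for a purely continuous one.
pmf(d::ToyUniform, x::Real) = 0.0

# A mixed distribution: point mass `w` at `x0`, remaining weight on a slab.
struct ToySpikeSlab <: ToyDistribution
    w::Float64
    x0::Float64
    slab::ToyUniform
end

# Mass function: only the spike carries positive mass.
pmf(d::ToySpikeSlab, x::Real) = x == d.x0 ? d.w : 0.0
# Density of the continuous part only (assumed convention), scaled by 1 - w.
pdf(d::ToySpikeSlab, x::Real) = (1 - d.w) * pdf(d.slab, x)
```

With this split, existing code that calls pdf on a discrete distribution keeps working, while a mixed distribution can report its point masses and its continuous part separately.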

The end goal of this from my perspective is to be able to construct zero-inflated (also mentioned in #390), hurdle and slab-and-spike distributions easily from their constituent parts (a point mass and another discrete or continuous distribution), as they all get used a reasonable amount in the real world (in fact, I don't think there's even a point mass distribution at the moment). But it's also to clean up the handling of discontinuous-but-not-discrete and discrete distributions more generally.

On the latter note, shouldn't we allow distributions with Discrete ValueSupport to have elements of any type, so long as the support is countable (finite or countably infinite), rather than requiring Ints (or things that round or floor to Ints, which is even more bizarre) as now? Perhaps we should have CountableValue{T} <: DiscontinuousValue{T} <: ValueSupport, where T is currently Int and Discrete is aliased to CountableValue{Int}, though #916 suggests extending T in CountableValue{T} to Any. T in DiscontinuousValue{T} could then be Float64 for a slab-and-spike distribution, and you could even have ContinuousValue{T} <: ValueSupport (with Continuous aliased to ContinuousValue{Float64}) for distributions with complex support (or Float32, etc.). And finally, what about more complicated support (say, for a tree) being handled explicitly by either a subtype of DiscontinuousValue{T} or a separate CompoundValue{T}?
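
The hierarchy described above could be sketched roughly as follows. This is a stand-alone illustration, not Distributions.jl code: it redefines ValueSupport, Discrete and Continuous locally, it assumes the aliases would be plain const bindings, and the valuetype helper is a hypothetical name added only to show how T could be recovered.

```julia
# Hypothetical sketch of the proposed hierarchy; not Distributions.jl code.
abstract type ValueSupport end

# Supports that contain point masses, with elements of type T.
abstract type DiscontinuousValue{T} <: ValueSupport end

# Purely countable supports are a special case of discontinuous ones.
abstract type CountableValue{T} <: DiscontinuousValue{T} end

# Supports with no point masses, with elements of type T.
abstract type ContinuousValue{T} <: ValueSupport end

# Aliases intended to keep the existing names meaningful.
const Discrete = CountableValue{Int}
const Continuous = ContinuousValue{Float64}

# Hypothetical helper: recover the element type from a support type.
valuetype(::Type{<:DiscontinuousValue{T}}) where {T} = T
valuetype(::Type{<:ContinuousValue{T}}) where {T} = T
```

One design point to keep in mind is that Julia's type parameters are invariant, so CountableValue{Int} is a subtype of DiscontinuousValue{Int} but not of DiscontinuousValue{Float64}. Code that should accept any element type would need to dispatch on DiscontinuousValue or on DiscontinuousValue{<:Real} rather than on a concrete parameter.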

@richardreeve
Contributor Author

@matbesancon pointed out on Slack that #861 provides the Dirac delta function, though this doesn't solve the underlying problem with discrete / discontinuous distributions. There's also related discussion in #771.

@richardreeve
Contributor Author

Preliminary implementation(?) already done months ago here: MixedDistributions.jl.
