Create an API to extend an existing model without affecting old one #5336

ricardoV94 · 2021-03-11T13:56:02Z

ricardoV94
Mar 11, 2021
Maintainer

This idea emerged in a discussion by @canyon289 on how to do posterior predictive sampling on new random vars.

The current solution is to add a new variable after sampling as in:

with pm.Model() as m:
  x = pm.Normal('x')
  y = pm.Normal('y', x, 1, observed=data)
  trace = pm.sample()

with m:
  new_y = pm.Gamma('new_y', mu=x, sigma=2)
  ppc = pm.sample_posterior_predictive(trace, var_names=['new_y']

The problem with that approach as mentioned by @michaelosthege is that you only have one shot of getting it right as there is no API to delete a variable from a model. So if you decide to try something else you have to extend with yet another variable and a new name.

A nested model does not work because the new variables are added to the original model (and it is untested territory)

with pm.Model() as m:
  x = pm.Normal('x')
  y = pm.Normal('y', x, 1, observed=data)
  trace = pm.sample()

with m:
  with pm.Model() as extended_m:
    new_y = pm.Gamma('new_y', mu=x, sigma=2)
    ppc = pm.sample_posterior_predictive(trace, var_names=['new_y']

It would be great to have something that extends an existing model without affecting it:

with pm.Model() as m:
  x = pm.Normal('x')
  y = pm.Normal('y', x, 1, observed=data)
  trace = pm.sample()

with pm.Extend(m) as extended_m:
    new_y = pm.Gamma('new_y', mu=x, sigma=2)
    ppc = pm.sample_posterior_predictive(trace, var_names=['new_y']

Could also be the nested syntax as long as we ensured the original model is not affected.

brandonwillard · 2021-03-11T19:07:28Z

brandonwillard
Mar 11, 2021

This conversation is about v4, no?

0 replies

ricardoV94 · 2021-03-11T19:13:31Z

ricardoV94
Mar 11, 2021
Maintainer Author

Any new functionalities will come only to V4 at this point. So yes.

0 replies

brandonwillard · 2021-03-11T19:42:03Z

brandonwillard
Mar 11, 2021

OK, so this touches upon a few bigger questions/ideas, and underlying most of them is the question "Do we need the Model object and/or its constraints in order to do X, Y, and/or Z?", so I'll start with that.

In this example, "...X, Y, and/or Z" is posterior predictive sampling, which really only requires a RandomVariable-containing graph, since it's basically just aesara.function-compiling said graphs. In this case, if you focus on just the Aesara graphs, things like the name restrictions imposed by Model are irrelevant: i.e. one could completely duplicate/recreate the underlying RandomVariable-containing graph and sample away.

Also, we need to start asking ourselves if we really need constraints like unique names. From the Aesara perspective, we don't, and we definitely shouldn't use variable names as a means of identifying Aesara variables (the same is/was true for Theano).

0 replies

OriolAbril · 2021-03-11T19:47:08Z

OriolAbril
Mar 11, 2021
Maintainer

Also, we need to start asking ourselves if we really need constraints like unique names. From the Aesara perspective, we don't, and we definitely shouldn't use variable names as a means of identifying Aesara variables (the same is/was true for Theano).

As things are structured now, non unique variable names would break arviz conversion (and so would variables and dimensions having the same name). I am not sure what would be the gain of non-unique variable names but if we wanted to go down that path we'd have to update the converter to take that into account and edit the names so they stop being unique.

0 replies

brandonwillard · 2021-03-11T20:22:18Z

brandonwillard
Mar 11, 2021

non unique variable names would break arviz conversion

I imagine we could "unique-ify" the names before passing anything off to Arviz.

N.B. Aesara already has an auto-unique-naming feature that we could use/extend:

import aesara.tensor as at

a_1 = at.vector("a")
a_2 = at.vector("a")

>>> a_1.name == "a"
True
>>> a_2.name == "a"
True
>>> a_1.auto_name
'auto_3'
>>> a_2.auto_name
'auto_4'

0 replies

brandonwillard · 2021-03-11T20:32:47Z

brandonwillard
Mar 11, 2021

Another thing worth noting: pymc3.model.Model is already very close to aesara.graph.fg.FunctionGraph, in both concept and functionality. Both are containers that "encapsulate" graphs and carry/provide information about said graphs.

More specifically, FunctionGraph is all about specifying inputs and outputs for graph-based calculations. It provides methods for manipulating the encapsulated graphs in a consistent way (i.e. in a way that updates other dependent quantities as well), and is used by another process to perform computations.

Model is exactly the same: it's inputs are the un-observed random variables, and the outputs are the observed ones. Models are likewise used by another process to perform computations: i.e. posterior sampling.

One of the most notable differences is that Model lacks the manipulation features that FunctionGraph provides (e.g. ability to add and remove inputs, replace arbitrary parts of the graph, etc.)

Depending on how well this analogy/correspondence works (in a functional sense), we might want to consider basing Model off of FunctionGraph.

0 replies

ricardoV94 · 2021-03-11T20:53:33Z

ricardoV94
Mar 11, 2021
Maintainer Author

That's quite interesting.

I don't fully grasp the extent of aesara graph abilities but it would really be nice to be able to operate more organically on PyMC3 models / graphs as an "object" in itself and not as a "context"

On the other hand there is always going to be a divide before and after sampling that must stay clear. For instance posterior predictive samples cannot affect posterior sampled values.

This issue was trying to get at this, although probably in a very v3 mindset. How can we facilitate extending models after posterior sampling in a way that is both intuitive but also safe (ie not creating the illusion that you can do incremental inferences)?

0 replies

brandonwillard · 2021-03-11T21:12:15Z

brandonwillard
Mar 11, 2021

it would really be nice to be able to operate more organically on PyMC3 models / graphs as an "object" in itself and not as a "context"

That's exactly what I'm getting at. A distinguishing characteristic of FeatureGraphs (vs. Models) is that they take graphs as inputs. This—I think—is an important part of what's missing from the Model context approach.

It makes more sense for a Model object to work in exactly the same way as FunctionGraph, because the Aesara graph of a model exists independently from a Model object. Also, one can create an Aesara graph from which multiple Models could be derived.

For example, consider the following graphs:

import aesara.tensor as at
import aesara.tensor.random.basic as ar


A_rv = ar.normal(0, 1)
B_rv = ar.gamma(0.5, 0.5)

Y_rv = ar.normal(A_rv, 1.0 / B_rv)
Z_rv = ar.poisson(at.exp(Y_rv))

Using our hypothetical FunctionGraph-like Model type, we could define the following "sample-able" models:

# Model(unobserved: List[Variable], observed: List[Variable], givens: Dict[Variable, np.ndarray])
m1 = Model([A_rv, B_rv], [Y_rv])
m2 = Model([A_rv, B_rv, Y_rv], [Z_rv])
m3 = Model([A_rv, B_rv], [Y_rv, Z_rv])
m4 = Model([A_rv], [Y_rv, Z_rv], givens={B_rv: 1.2})
# etc.

0 replies

ricardoV94 · 2021-03-11T21:34:15Z

ricardoV94
Mar 11, 2021
Maintainer Author

Just a quick clarification, when you say observed in that example do you mean the output of random ancestral sampling or of mcmc following bayesian conditioning (ie the observed argument in PyMC3)? It looks like the first but just want to be sure.

More generally how well can these models broadcast automatically, so that the givens could the entire posterior trace and everything not in the givens (added before or after) would be obtained by ancestral sampling based on these values?

Edit: I think these 2 discourse threads may be useful for this discussion:

https://discourse.pymc.io/t/bug-in-fast-sample-posterior-predictive/6904

https://discourse.pymc.io/t/sample-posterior-predictive-with-a-vengeance/5926

0 replies

brandonwillard · 2021-03-11T23:05:34Z

brandonwillard
Mar 11, 2021

Just a quick clarification, when you say observed in that example do you mean the output of random ancestral sampling or of mcmc following bayesian conditioning (ie the observed argument in PyMC3)? It looks like the first but just want to be sure.

The observed arguments are the variables that would belong in Model.observed_RVs, and the unobserved arguments are, naturally, the ones that would belong in Model.unobserved_RVs.

More generally how well can these models broadcast automatically, so that the givens could the entire posterior trace and everything not in the givens (added before or after) would be obtained by ancestral sampling based on these values?

The answer to that question always depends on how well the Ops involved are implemented, but for all the RandomVariables I've implemented so far, they should be completely broadcast-able. In other words, yes, one should be able to provide this hypothetical givens argument an entire trace of data and everything should work out correctly.

0 replies

ricardoV94 · 2021-03-12T17:25:10Z

ricardoV94
Mar 12, 2021
Maintainer Author

I think that while I understand what you mean for the case of ancestral sampling, I am not sure how it would look like for MCMC sampling.

To help ground the discussion, here is the original example that @canyon289 posted on the slack discussion.

@michaelosthege you mentioned you do a lot of this with GPs, do you have a minimal example?

What would be a good API to achieve this goal? Trying to reason with what @brandonwillard said above, would something like this make sense?

idxs = np.array([0, 0, 1, 1, .., 9, 9])
mu_group = pm.Normal(0, 1)
sigma_group = pm.HalfNormal(1)
mu_individual = pm.Normal(mu_group, sigma_group, size=10)
observation = pm.Normal(mu_individual[idxs], 1)

model = Model([mu_group, sigma_group, mu_individual], [observation])
prior = model.prior_predictive()  # returns ancestral samples for [observation, mu_group, sigma_group, mu_individual]
trace = model.sample(givens={observation: data})  # returns mcmc samples for [mu_group, sigma_group, mu_individual]
ppc = model.posterior_predictive(givens=trace)   # returns ancestral samples for [observation]

new_mu_individual = pm.Normal(mu_group, sigma_group)
extended_model = Model([new_mu_individual], [mu_group_sigma, sigma_group])
extended_model.sample_posterior_predictive(givens=trace)

0 replies

brandonwillard · 2021-03-12T18:16:26Z

brandonwillard
Mar 12, 2021

We should never need to recreate (partially or otherwise) a model in order to perform out-of-sample (OOS) sampling. In every case I've seen, the real problem with performing OOS sampling using PyMC3 models is the lack of flexible shape support, but this is exactly what we're addressing in v4 (and a big part of the reason why we're addressing it).

0 replies

michaelosthege · 2021-03-12T19:37:39Z

michaelosthege
Mar 12, 2021
Maintainer

@michaelosthege you mentioned you do a lot of this with GPs, do you have a minimal example?

@ricardoV94 my code reads pretty much like https://docs.pymc.io/notebooks/GP-Latent.html#Using-.conditional
Note that the with model reuses the model from above - after sampling it.

0 replies

brandonwillard · 2021-03-12T20:06:04Z

brandonwillard
Mar 12, 2021

@michaelosthege, would it have been possible to make the variable(s) underlying X_new into shared variable(s) so that they could simply be updated and the original model object could be reused?

0 replies

michaelosthege · 2021-03-12T20:10:27Z

michaelosthege
Mar 12, 2021
Maintainer

@michaelosthege, would it have been possible to make the variable(s) underlying X_new into shared variable(s) so that they could simply be updated and the original model object could be reused?

Only if changing the shape works, because you might want to evaluate the GP at some 1_000 coordinates, but don't want them in the trace already (memory, performance, matrix inversion errors).

0 replies

ricardoV94 · 2024-11-12T07:49:09Z

ricardoV94
Nov 12, 2024
Maintainer Author

This now exists under model.copy()

0 replies

Create an API to extend an existing model without affecting old one #5336

ricardoV94 Mar 11, 2021 Maintainer

Replies: 16 comments

ricardoV94 Mar 11, 2021 Maintainer Author

OriolAbril Mar 11, 2021 Maintainer

ricardoV94 Mar 11, 2021 Maintainer Author

ricardoV94 Mar 11, 2021 Maintainer Author

ricardoV94 Mar 12, 2021 Maintainer Author

michaelosthege Mar 12, 2021 Maintainer

michaelosthege Mar 12, 2021 Maintainer

ricardoV94 Nov 12, 2024 Maintainer Author

ricardoV94
Mar 11, 2021
Maintainer

ricardoV94
Mar 11, 2021
Maintainer Author

OriolAbril
Mar 11, 2021
Maintainer

ricardoV94
Mar 11, 2021
Maintainer Author

ricardoV94
Mar 11, 2021
Maintainer Author

ricardoV94
Mar 12, 2021
Maintainer Author

michaelosthege
Mar 12, 2021
Maintainer

michaelosthege
Mar 12, 2021
Maintainer

ricardoV94
Nov 12, 2024
Maintainer Author