Create an API to extend an existing model without affecting the old one #5336
Replies: 16 comments
-
This conversation is about v4?
-
Any new functionality will come only to v4 at this point. So yes.
-
OK, so this touches upon a few bigger questions/ideas, and underlying most of them is the question "Do we need the `Model` object in order to do X, Y, and/or Z?" In this example, "X, Y, and/or Z" is posterior predictive sampling, which really only requires a graph of the random variables. Also, we need to start asking ourselves if we really need constraints like unique names. From the Aesara perspective, we don't, and we definitely shouldn't use variable names as a means of identifying Aesara variables (the same is/was true for Theano).
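To make that concrete, here is a minimal sketch of the kind of thing I mean (plain Aesara, no `Model` object anywhere; the variable names are just illustrative):

```python
import aesara
import aesara.tensor as at
import aesara.tensor.random.basic as ar

# Build a small generative graph directly
a_rv = ar.normal(0, 1)
y_rv = ar.normal(a_rv, at.exp(a_rv))

# Ancestral sampling only needs the graph: compile it and evaluate it.
# Posterior predictive sampling is the same mechanism with some inputs
# replaced by posterior draws.
sample_fn = aesara.function([], [a_rv, y_rv])
print(sample_fn())
# (For fresh draws on every call you would also register updates for the
# implicit RNG shared variables, but the point stands: no Model required.)
```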
-
As things are structured now, non-unique variable names would break the ArviZ conversion (and so would variables and dimensions sharing the same name). I am not sure what the gain of non-unique variable names would be, but if we wanted to go down that path we'd have to update the converter to take that into account and edit the names so that they become unique.
-
I imagine we could "unique-ify" the names before passing anything off to ArviZ. N.B. Aesara already has an auto-unique-naming feature that we could use/extend:

```python
>>> import aesara.tensor as at
>>> a_1 = at.vector("a")
>>> a_2 = at.vector("a")
>>> a_1.name == "a"
True
>>> a_2.name == "a"
True
>>> a_1.auto_name
'auto_3'
>>> a_2.auto_name
'auto_4'
```
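The converter-side renaming could then be something as simple as the following sketch (`dedupe_names` is made up, not an existing function):

```python
from collections import Counter

def dedupe_names(variables):
    """Return one unique display name per variable, falling back to
    Aesara's `auto_name` only when a user-supplied name collides."""
    counts = Counter(var.name for var in variables if var.name is not None)
    names = []
    for var in variables:
        if var.name is None or counts[var.name] > 1:
            names.append(f"{var.name or 'var'}_{var.auto_name}")
        else:
            names.append(var.name)
    return names

# With the variables above: dedupe_names([a_1, a_2]) -> ['a_auto_3', 'a_auto_4']
```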
-
Another thing worth noting: a PyMC `Model` is closely analogous to Aesara's own graph/function objects. One of the most notable differences is that the latter can be constructed and manipulated as plain objects, outside of any context manager. Depending on how well this analogy/correspondence works (in a functional sense), we might want to consider basing `Model` on those Aesara objects.
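To give a flavor of what "operating on the graph as an object" already looks like in Aesara (just a sketch, not a proposal for a final API):

```python
import aesara
import aesara.tensor as at
import aesara.tensor.random.basic as ar

# A small generative graph
mu_rv = ar.normal(0, 1)
y_rv = ar.normal(mu_rv, 1)

# The graph is an ordinary object: we can clone it and swap pieces out.
# Here we "condition" on mu = 1.5 by replacing the random variable with a
# constant, obtaining a new, independent graph.
y_given_mu = aesara.clone_replace(y_rv, replace={mu_rv: at.constant(1.5)})

draw = aesara.function([], y_given_mu)
print(draw())
```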
-
That's quite interesting. I don't fully grasp the extent of Aesara's graph abilities, but it would really be nice to be able to operate more organically on PyMC3 models / graphs as an "object" in itself and not as a "context". On the other hand, there is always going to be a divide before and after sampling that must stay clear. For instance, posterior predictive samples cannot affect posterior sampled values. This issue was trying to get at this, although probably in a very v3 mindset. How can we facilitate extending models after posterior sampling in a way that is both intuitive and safe (i.e. not creating the illusion that you can do incremental inference)?
-
That's exactly what I'm getting at. A distinguishing characteristic of a `Model` is which variables it designates as unobserved, observed, or given. It makes more sense for a `Model` to be a light-weight specification of those choices on top of an existing graph than to own the graph itself. For example, consider the following graphs:

```python
import aesara.tensor as at
import aesara.tensor.random.basic as ar

A_rv = ar.normal(0, 1)
B_rv = ar.gamma(0.5, 0.5)
Y_rv = ar.normal(A_rv, 1.0 / B_rv)
Z_rv = ar.poisson(at.exp(Y_rv))
```

Using our hypothetical `Model`:

```python
# Model(unobserved: List[Variable], observed: List[Variable], givens: Dict[Variable, np.ndarray])
m1 = Model([A_rv, B_rv], [Y_rv])
m2 = Model([A_rv, B_rv, Y_rv], [Z_rv])
m3 = Model([A_rv, B_rv], [Y_rv, Z_rv])
m4 = Model([A_rv], [Y_rv, Z_rv], givens={B_rv: 1.2})
# etc.
```
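For concreteness, the container itself could be as small as the following sketch (purely hypothetical, mirroring the signature in the comment above; nothing in Aesara or PyMC provides this class):

```python
from dataclasses import dataclass, field
from typing import Dict, List

import numpy as np
from aesara.graph.basic import Variable


@dataclass
class Model:
    """A thin description of which variables in an existing Aesara graph
    are unobserved, which are observed, and which are fixed to data."""

    unobserved: List[Variable]
    observed: List[Variable]
    givens: Dict[Variable, np.ndarray] = field(default_factory=dict)

    def all_variables(self) -> List[Variable]:
        return [*self.unobserved, *self.observed, *self.givens]
```

Several such `Model`s can then share the same underlying random variables, which is exactly what `m1`–`m4` above do.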
-
Just a quick clarification: when you say "observed" in that example, do you mean the output of random ancestral sampling or of MCMC following Bayesian conditioning (i.e. the `observed` argument in PyMC3)? It looks like the first, but I just want to be sure. More generally, how well can these models broadcast automatically, so that the givens could be the entire posterior trace and everything not in the givens (added before or after) would be obtained by ancestral sampling based on those values? Edit: I think these 2 Discourse threads may be useful for this discussion: https://discourse.pymc.io/t/bug-in-fast-sample-posterior-predictive/6904 https://discourse.pymc.io/t/sample-posterior-predictive-with-a-vengeance/5926
-
The answer to that question always depends on how well the underlying `Op`s support broadcasting.
-
I think that, while I understand what you mean for the case of ancestral sampling, I am not sure what it would look like for MCMC sampling. To help ground the discussion, here is the original example that @canyon289 posted in the Slack discussion. @michaelosthege, you mentioned you do a lot of this with GPs; do you have a minimal example? What would be a good API to achieve this goal? Trying to reason with what @brandonwillard said above, would something like this make sense?

```python
idxs = np.array([0, 0, 1, 1, .., 9, 9])

mu_group = pm.Normal(0, 1)
sigma_group = pm.HalfNormal(1)
mu_individual = pm.Normal(mu_group, sigma_group, size=10)
observation = pm.Normal(mu_individual[idxs], 1)

model = Model([mu_group, sigma_group, mu_individual], [observation])

prior = model.prior_predictive()  # returns ancestral samples for [observation, mu_group, sigma_group, mu_individual]
trace = model.sample(givens={observation: data})  # returns MCMC samples for [mu_group, sigma_group, mu_individual]
ppc = model.posterior_predictive(givens=trace)  # returns ancestral samples for [observation]

new_mu_individual = pm.Normal(mu_group, sigma_group)
extended_model = Model([new_mu_individual], [mu_group, sigma_group])
extended_model.sample_posterior_predictive(givens=trace)
```
-
We should never need to recreate (partially or otherwise) a model in order to perform out-of-sample (OOS) sampling. In every case I've seen, the real problem with performing OOS sampling using PyMC3 models is the lack of flexible shape support, but this is exactly what we're addressing in v4.
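In other words, out-of-sample prediction should be a matter of swapping the data/shapes on the same model, roughly like this (a sketch assuming the v4 `pm.MutableData`/`pm.set_data` API; the names are illustrative):

```python
import numpy as np
import pymc as pm

X_train = np.random.normal(size=50)
y_train = 2.0 * X_train + np.random.normal(size=50)

with pm.Model() as model:
    X = pm.MutableData("X", X_train)
    y = pm.MutableData("y", y_train)
    beta = pm.Normal("beta", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", beta * X, sigma, observed=y, shape=X.shape)

    idata = pm.sample()

    # Out-of-sample prediction: swap the data, keep the model.
    X_new = np.linspace(-3, 3, 100)
    pm.set_data({"X": X_new, "y": np.zeros_like(X_new)})
    ppc = pm.sample_posterior_predictive(idata, var_names=["obs"])
```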
-
@ricardoV94 my code reads pretty much like https://docs.pymc.io/notebooks/GP-Latent.html#Using-.conditional
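That is, roughly this pattern (paraphrased from the linked notebook; the variable names and priors here are just placeholders):

```python
import numpy as np
import pymc3 as pm

X = np.linspace(0, 10, 30)[:, None]        # training inputs
y_obs = np.sin(X).ravel() + 0.1 * np.random.normal(size=30)
X_new = np.linspace(0, 15, 200)[:, None]   # (many) prediction coordinates

with pm.Model() as model:
    ell = pm.Gamma("ell", alpha=2, beta=1)
    eta = pm.HalfNormal("eta", sigma=1)
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=X)

    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("y", mu=f, sigma=sigma, observed=y_obs)

    trace = pm.sample()

with model:
    # The GP is evaluated at the new coordinates only *after* sampling,
    # so `f_pred` never appears in the trace itself.
    f_pred = gp.conditional("f_pred", Xnew=X_new)
    ppc = pm.sample_posterior_predictive(trace, var_names=["f_pred"])
```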
-
@michaelosthege, would it have been possible to make the variable(s) underlying `.conditional` part of the original model from the start?
-
Only if changing the shape works, because you might want to evaluate the GP at some 1,000 coordinates, but you don't want them in the trace already (memory, performance, matrix inversion errors).
-
This now exists under `model.copy()`.
-
This idea emerged in a discussion by @canyon289 on how to do posterior predictive sampling on new random vars.
The current solution is to add a new variable to the model after sampling.
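Roughly like this (reconstructing the usual v3 pattern; the variable names are only illustrative):

```python
import numpy as np
import pymc3 as pm

data = np.random.normal(size=10)

with pm.Model() as model:
    mu_group = pm.Normal("mu_group", 0, 1)
    sigma_group = pm.HalfNormal("sigma_group", 1)
    mu_individual = pm.Normal("mu_individual", mu_group, sigma_group, shape=10)
    pm.Normal("obs", mu_individual, 1, observed=data)
    trace = pm.sample()

# The "one shot" extension: a new variable bolted onto the model after sampling
with model:
    new_mu_individual = pm.Normal("new_mu_individual", mu_group, sigma_group)
    ppc = pm.sample_posterior_predictive(trace, var_names=["new_mu_individual"])
```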
The problem with that approach, as mentioned by @michaelosthege, is that you only have one shot at getting it right, as there is no API to delete a variable from a model. So if you decide to try something else, you have to extend the model with yet another variable and a new name.
A nested model does not work because the new variables are added to the original model (and it is untested territory).
It would be great to have something that extends an existing model without affecting it.
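For example, something along these lines (continuing the snippet above; the `copy()` method is purely hypothetical here):

```python
# Copy-then-extend, leaving the original model untouched
extended_model = model.copy()  # hypothetical

with extended_model:
    new_mu_individual = pm.Normal("new_mu_individual", mu_group, sigma_group)
    pm.sample_posterior_predictive(trace, var_names=["new_mu_individual"])

# `model` itself still has no `new_mu_individual`, so we can try again with
# different choices without accumulating throw-away variables.
```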
It could also be the nested syntax, as long as we ensured the original model is not affected.