Description
Prior predictive sampling draws values from the untransformed variables and then attempts to recreate the transformed values via `transformation.forward_val(untransformed_values)`. However, this logic is not sufficient to recover the correct values when there are stochastic bounds (e.g., `pm.Uniform('x', lower=0, upper=y)`), because we also need to know what the values of the stochastic bounds (i.e., the `y`) were in each sample.
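For a concrete picture of what goes wrong, here is a standalone sketch of the interval transform evaluated with a mismatched bound (the `log(x - a) - log(b - x)` formula from the RuntimeWarning further down); the numbers are made up purely for illustration:

```python
import numpy as np

# Interval transform of x ~ Uniform(a, b): forward(x) = log(x - a) - log(b - x).
# Suppose x = 0.7 was drawn against an upper bound y = 0.9, but the transform
# is later evaluated with a freshly drawn y = 0.4 (hypothetical numbers):
a, x = 0.0, 0.7
b_redrawn = 0.4
print(np.log(x - a) - np.log(b_redrawn - x))  # nan, since b_redrawn - x < 0
```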
People don't usually care about transformed variables, but this is critical, for example, in `sample_smc()`, as it creates its particles from an initial `prior_predictive` call.
Hopefully this will be gone in V4.0, but I thought it best to document it just in case (and in the meantime).
Minimal reproducible example:
```python
import numpy as np
import pymc3 as pm

np.random.seed(1)
with pm.Model() as m:
    y = pm.Uniform('y', 0, 1)
    x = pm.Uniform('x', 0, y)
    prior = pm.sample_prior_predictive()

print(np.mean(np.isnan(prior['x_interval__'])))
```
```
/home/ricardo/Documents/Projects/pymc3/pymc3/distributions/transforms.py:294: RuntimeWarning: invalid value encountered in log
  return floatX(np.log(x - a) - np.log(b - x))
0.314
```
The 31.4% of `NaN`s correspond to draws where the re-drawn `a` and `b` ended up on the wrong side of `x`. The `NaN`s are the most obvious symptom, but all of the recreated transformed values are in fact incorrect: `forward_val` is called at the end of `prior_predictive_sampling` without a `point` or any other contextual information, so the stochastic bounds are simply drawn again at random instead of being taken from the same prior sample as `x`.
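For reference, here is a quick (hypothetical) way to check this from the example above, assuming the returned `prior` dict also contains the `x` and `y` draws and using the same interval formula as in the warning:

```python
# Recompute the interval transform of each drawn x, using the y drawn in the
# same sample as the upper bound (the lower bound is the constant 0):
recomputed = np.log(prior['x']) - np.log(prior['y'] - prior['x'])

# Compare with the transformed values stored by sample_prior_predictive;
# given the behaviour described above, agreement should be poor rather than ~100%.
print(np.mean(np.isclose(recomputed, prior['x_interval__'])))
```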