Description of your problem
When using pm.set_data() with data that has a different number of rows than the original data, pm.sample_posterior_predictive() will fail.
Minimal example
#!/usr/bin/env python3
import os as os
import numpy as np
import pymc as pm
import aesara as ae
import aesara.tensor as at
import matplotlib.pyplot as plt
print (pm.__version__, ae.__version__)
# data generation
n_observations = 99
slope = 3.
predictor = np.random.uniform(0., 1., n_observations)
residual = np.random.normal(0., 0.05, n_observations)
observable = predictor * slope + residual
# plt.scatter(predictor, observable, s=3, marker='o', edgecolor='k', facecolor='w', alpha=0.6)
# plt.show()
# model design
with pm.Model() as model:
    data = pm.Data('predictor', predictor, mutable=True)
    ones = pm.Data('epsilon', np.ones((n_observations,)), mutable=True)
    slope = pm.Normal('slope', mu=np.pi, sigma=1.)
    estimator = at.dot(data, slope)
    # residual = pm.HalfCauchy('residual', 1.)
    residual = at.dot(ones, pm.HalfCauchy('residual', 1.))
    posterior = pm.Normal('posterior', mu=estimator, sigma=residual, observed=observable)
# inference
with model:
    trace = pm.sample(2**10)
# out-of-sample prediction
with model:
    pm.set_data({'predictor': 1.1 * np.ones(7,)})
    prediction = pm.sample_posterior_predictive(trace)
Complete error traceback
Traceback (most recent call last):
File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 971, in __call__
self.vm()
File "/usr/lib/python3.10/site-packages/aesara/graph/op.py", line 543, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 368, in perform
smpl_val = self.rng_fn(rng, *(args + [size]))
File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 166, in rng_fn
return getattr(rng, self.name)(*args, **kwargs)
File "_generator.pyx", line 1136, in numpy.random._generator.Generator.normal
File "_common.pyx", line 594, in numpy.random._common.cont
File "_common.pyx", line 511, in numpy.random._common.cont_broadcast_2
File "__init__.pxd", line 741, in numpy.PyArray_MultiIterNew3
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (99,) and arg 1 with shape (7,).
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "PredictionShapeError.py", line 48, in <module>
prediction = pm.sample_posterior_predictive(trace)
File "/usr/lib/python3.10/site-packages/pymc/sampling.py", line 2022, in sample_posterior_predictive
values = sampler_fn(**param)
File "/usr/lib/python3.10/site-packages/pymc/util.py", line 366, in wrapped
return core_function(**input_point)
File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 984, in __call__
raise_with_op(
File "/usr/lib/python3.10/site-packages/aesara/link/utils.py", line 534, in raise_with_op
raise exc_value.with_traceback(exc_trace)
File "/usr/lib/python3.10/site-packages/aesara/compile/function/types.py", line 971, in __call__
self.vm()
File "/usr/lib/python3.10/site-packages/aesara/graph/op.py", line 543, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 368, in perform
smpl_val = self.rng_fn(rng, *(args + [size]))
File "/usr/lib/python3.10/site-packages/aesara/tensor/random/op.py", line 166, in rng_fn
return getattr(rng, self.name)(*args, **kwargs)
File "_generator.pyx", line 1136, in numpy.random._generator.Generator.normal
File "_common.pyx", line 594, in numpy.random._common.cont
File "_common.pyx", line 511, in numpy.random._common.cont_broadcast_2
File "__init__.pxd", line 741, in numpy.PyArray_MultiIterNew3
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (99,) and arg 1 with shape (7,).
Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F2ABD7743C0>), TensorConstant{(1,) of 99}, TensorConstant{11}, Elemwise{mul,no_inplace}.0, Elemwise{mul,no_inplace}.0)
Toposort index: 4
Inputs types: [RandomGeneratorType, TensorType(int64, (1,)), TensorType(int64, ()), TensorType(float64, (None,)), TensorType(float64, (None,))]
Inputs shapes: ['No shapes', (1,), (), (7,), (99,)]
Inputs strides: ['No strides', (8,), (), (8,), (8,)]
Inputs values: [Generator(PCG64) at 0x7F2ABD7743C0, array([99]), array(11), 'not shown', 'not shown']
Outputs clients: [['output'], ['output']]
HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
Additional information
I am on my yearly return to this problem, trying to do out-of-sample prediction on a complex linear multivariate model with an LKJ prior. I had reported issues before here, and everything temporarily worked in a 4.0 beta due to this and then this PR.
However, returning to my project with the current versions, I encountered the seemingly familiar shape conflicts.
What's worse is that I could condense this down, and it even happens in the minimal example provided above. You will find that I have actually tried to control the shape of all model components using tensor multiplication (at.dot); however, the Aesara traceback still contains a wrong input shape (the last one).
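For reference, the failure at the bottom of the traceback can be reproduced with NumPy alone. The following is a minimal sketch, assuming the `normal_rv` node ends up calling `Generator.normal` with a fixed `size` of 99 plus the two parameter arrays listed under "Inputs shapes"; `try_draw` is a hypothetical helper, and the numeric values are placeholders.

```python
import numpy as np


def try_draw(mu_len, sigma_len, size):
    """Attempt a normal draw; return None on success, else the error text."""
    rng = np.random.default_rng(0)
    mu = np.full(mu_len, 3.3)          # placeholder for predictor * slope
    sigma = np.full(sigma_len, 0.05)   # placeholder for ones * residual
    try:
        rng.normal(loc=mu, scale=sigma, size=size)
    except ValueError as err:
        return str(err)
    return None


# Shapes as during inference: all 99, the draw succeeds.
print(try_draw(99, 99, 99))
# Shapes from the traceback: mu has 7 rows, sigma and size still 99.
print(try_draw(7, 99, 99))
# Resizing sigma alone does not help while size stays fixed at 99.
print(try_draw(7, 7, 99))
# Consistent parameter shapes and no fixed size: the draw succeeds.
print(try_draw(7, 7, None))
```

If this assumption is right, updating 'epsilon' alongside 'predictor' in pm.set_data() would still not be enough, because the size of 99 appears as a constant input to the node (TensorConstant{(1,) of 99}).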
This issue might be related, but that is not clear.
Versions and main components
- PyMC/PyMC3 Version: 4.2.1
- Aesara/Theano Version: 2.8.7
- Python Version: 3.10.8
- Operating system: linux 6.0.2-arch1-1
- How did you install PyMC/PyMC3: pip