Can't sample a model after resizing an RV #4918

Closed
@michaelosthege

Description

First of all, this is not a regression with respect to PyMC3 v3, since model resizing did not work there to begin with.

But with the new RandomVariable-based model graphs, the size of an RV is no longer set in stone, and we're promoting that as a new feature, so users might try it and run into this issue.

I'm opening this so it's documented.

Here's a minimal example:

import pymc3 as pm

def run():
    with pm.Model() as pmodel:
        data = pm.Data("data", [1, 2, 3])
        x = pm.Normal("x", size=data.shape[0])
        pm.Normal("likelihood", mu=x, observed=data)

        sample_kwargs = dict(cores=1, chains=1, tune=2, draws=2)
        pm.sample(**sample_kwargs, step=pm.Metropolis())

        # resize the model
        data.set_value([1, 2, 3, 4])
        # this second call fails with the shape mismatch shown in the traceback below
        pm.sample(**sample_kwargs, step=pm.Metropolis())

if __name__ == "__main__":
    run()
Traceback
ValueError: Input dimension mismatch. (input[1].shape[0] = 4, input[2].shape[0] = 3)
Apply node that caused the error: Elemwise{Composite{(i0 + (-sqr((Cast{float64}(i1) - i2))))}}(TensorConstant{(1,) of -1..0664093453}, data, x)
Toposort index: 0
Inputs types: [TensorType(float64, (True,)), TensorType(int32, vector), TensorType(float64, vector)]
Inputs shapes: [(1,), (4,), (3,)]
Inputs strides: [(8,), (4,), (8,)]
Inputs values: [array([-1.83787707]), array([1, 2, 3, 4]), array([-1.2947119 ,  0.10578338, -0.59358436])]
Outputs clients: [[Sum{acc_dtype=float64}(Elemwise{Composite{(i0 + (-sqr((Cast{float64}(i1) - i2))))}}.0)]]

HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

Why this happens

The initial values are eagerly set/drawn/evaluated into numeric ndarrays and stored in the model.initial_values dictionary.
If the corresponding RV later changes its size, the stored initial value is not updated to match.
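
For illustration, a minimal inspection sketch (assuming model.initial_values holds the eagerly evaluated ndarrays keyed by variable, as described above):

import numpy as np
import pymc3 as pm

with pm.Model() as pmodel:
    data = pm.Data("data", [1, 2, 3])
    x = pm.Normal("x", size=data.shape[0])
    pm.Normal("likelihood", mu=x, observed=data)

    # The initial value for `x` was already evaluated to a length-3 ndarray.
    print([np.shape(v) for v in pmodel.initial_values.values()])

    # Resizing the shared data does not touch the cached initial values ...
    data.set_value([1, 2, 3, 4])

    # ... so the stored array for `x` still has length 3, while the likelihood
    # now expects length 4 -- hence the Elemwise shape mismatch above.
    print([np.shape(v) for v in pmodel.initial_values.values()])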

Potential solutions

We could instead keep track of symbolic initial values (or None) and only evaluate/draw them at the start of sampling.

User-provided numeric initial values would still restrict the RV to its initial size, of course, but this should fix most use cases.
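
A rough sketch of that idea, using a hypothetical resolve_initial_values helper (not part of the actual PyMC3 API) that defers evaluation until sampling starts:

import numpy as np

def resolve_initial_values(model):
    # Hypothetical helper: evaluate initial values only at the start of
    # sampling, so that they reflect the data shapes at that moment.
    resolved = {}
    for rv, initval in model.initial_values.items():
        if initval is None:
            # Nothing stored: draw from the prior at its *current* size.
            resolved[rv] = rv.eval()
        elif isinstance(initval, np.ndarray):
            # A user-provided numeric value stays fixed (and pins the RV size).
            resolved[rv] = initval
        else:
            # A symbolic initial value: evaluate lazily against current shapes.
            resolved[rv] = initval.eval()
    return resolved

With such a lazy scheme, an initial value whose size depends on data.shape[0] would automatically pick up the new length after data.set_value(...).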

Versions and main components

  • PyMC3 Version: main (also reproducible on 725d798, so it is not caused by the latest initval changes)
