Imputations should also work with ndarray data #4437

Closed
@michaelosthege

There are many cases where observations are not just a pandas.Series but instead an ndim>=1 ndarray.
In such cases the automatic imputation of float("nan") values does not work, because pymc3.model.pandas_to_array only looks for NaN if data is a pandas object.

The point of pandas_to_array is to convert data to a numpy ndarray, using a numpy.ma.MaskedArray for the imputation case.
It would make sense to support ndarray input too.
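A minimal sketch of what ndarray support could look like, assuming NaN entries mark the missing values. The helper name `ndarray_to_masked` is hypothetical and is not part of PyMC3; it only illustrates building a numpy.ma.MaskedArray from a plain ndarray.

```python
import numpy as np


def ndarray_to_masked(data):
    """Hypothetical helper (not a PyMC3 function): mask the NaN entries of an ndarray."""
    data = np.asarray(data, dtype=float)
    mask = np.isnan(data)
    if mask.any():
        # Masked entries are the ones that would need to be imputed.
        return np.ma.MaskedArray(data, mask=mask)
    # Nothing is missing, so return the plain ndarray unchanged.
    return data


data = np.array([
    [1, 2, 3],
    [4, 5, float("nan")],
    [7, 8, 9],
])
masked = ndarray_to_masked(data)
print(masked)
```

For the array above, only the entry at row 1, column 2 is masked; arrays without NaN are returned unchanged.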

Description of your problem

Please provide a minimal, self-contained, and reproducible example.

import numpy
import pymc3

# 3x3 observations with one missing value encoded as NaN
data = numpy.array([
    [1,2,3],
    [4,5,float("nan")],
    [7,8,9],
])
print(data)
with pymc3.Model():
    pymc3.Normal(
        "L",
        mu=pymc3.Normal("x", shape=data.shape),
        sd=10,
        observed=data,
        shape=data.shape
    )
    pymc3.sample()

Please provide the full traceback.

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'x': array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])}

Initial evaluation results:
x   -8.27
L     NaN
Name: Log-probability of test_point, dtype: float64
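The numbers in this output can be reproduced by hand. The sketch below evaluates the Normal log-density with plain NumPy rather than PyMC3, assuming "x" uses the default prior Normal(mu=0, sd=1). The function `normal_logpdf` is an illustrative helper, not a library call.

```python
import numpy as np


def normal_logpdf(value, mu, sd):
    """Elementwise log-density of a Normal(mu, sd) distribution."""
    return -0.5 * np.log(2 * np.pi * sd**2) - (value - mu) ** 2 / (2 * sd**2)


data = np.array([
    [1, 2, 3],
    [4, 5, float("nan")],
    [7, 8, 9],
])
x_start = np.zeros(data.shape)  # starting value of "x" from the traceback

# Assumption: "x" has the default Normal(0, 1) prior.
logp_x = normal_logpdf(x_start, mu=0.0, sd=1.0).sum()
# The NaN observation is treated as data, so it propagates into the sum.
logp_L = normal_logpdf(data, mu=x_start, sd=10.0).sum()

print(round(logp_x, 2))  # -8.27
print(np.isnan(logp_L))  # True
```

A single unmasked NaN in the observed array is enough to make the whole log-probability of "L" NaN, which is what the initial evaluation reports.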

Versions and main components

  • PyMC3 Version: 3.11.0
  • Theano Version: 1.1.0
  • Python Version: 3.7
