inferencedata.log_likelihood is summing observations #5236

Closed
@ricardoV94

Description

When talking with @lucianopaz I realized we completely broke log_likelihood computation in V4.

import pymc as pm
with pm.Model() as m:
    y = pm.Normal("y")
    x = pm.Normal("x", y, 1, observed=[5, 2])    
    idata = pm.sample(tune=5, draws=5, chains=2)
print(idata.log_likelihood['x'].values.shape)
# (2, 5, 1)

Whereas in V3:

import pymc3 as pm
with pm.Model() as m:
    y = pm.Normal("y")
    x = pm.Normal("x", y, 1, observed=[5, 2])    
    idata = pm.sample(tune=5, draws=5, chains=2, return_inferencedata=True)
print(idata.log_likelihood['x'].values.shape)
# (2, 5, 2)

This happened because model.logpt now returns the summed logp by default, whereas before it returned the vectorized (elementwise) logp. The change was made in 0a172c8
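A minimal NumPy sketch (illustrative only, not PyMC code) of the difference between the elementwise and summed logp for the two observations above; this is what produces the trailing dimension of 2 in V3 versus 1 in V4:

```python
import numpy as np

def normal_logp(value, mu=0.0, sigma=1.0):
    # elementwise log-density of a Normal distribution
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((value - mu) / sigma) ** 2

obs = np.array([5.0, 2.0])

elemwise = normal_logp(obs)            # shape (2,): one logp term per observation
summed = elemwise.sum(keepdims=True)   # shape (1,): the summed logp per draw
```

Stacked over chains and draws, the elementwise terms give the (2, 5, 2) shape arviz expects, while the summed version collapses the observation dimension.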

Although that is a saner default, we need to reintroduce an easy helper, logp_elemwiset (which I think is also pretty much broken right now), that calls logpt with sum=False.

Also, in this case we might want to just return the logprob terms as the dictionary items returned by aeppl.factorized_joint_logprob and let the end user decide how they want to combine them. These keys contain {value variable: logp term}. The default of calling at.add on all variables when sum=False is seldom useful (that's why we switched the default), due to potential unwanted broadcasting across variables with different dimensions.
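A small NumPy sketch of that broadcasting pitfall, with made-up shapes for two hypothetical variables:

```python
import numpy as np

# Hypothetical elementwise logp terms for two variables of different shapes
logp_x = np.zeros(3)       # e.g. a variable with 3 observations
logp_y = np.zeros((2, 1))  # e.g. a (2, 1) latent variable

# Naively adding them (what at.add would do) silently broadcasts
combined = logp_x + logp_y  # shape (2, 3): almost never what the user wants
```

Returning the per-variable terms in a dictionary sidesteps this entirely.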

One extra advantage of returning the dictionary items is that we don't need to create nearly duplicated graphs for each observed variable when computing the log-likelihood here:

cached = [(var, self.model.fn(logpt(var))) for var in self.model.observed_RVs]

We can request it for any number of observed variables at the same time, and then compile a single function that has each variable's logp term as an output but otherwise shares the common nodes, saving on compilation, computation, and memory footprint when a model has more than one observed variable.
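A rough sketch of the idea in plain NumPy, with hypothetical names (normal_logp and pointwise_log_likelihood are illustrations, not PyMC API): one call evaluates the shared nodes once and returns a dict with one elementwise logp entry per observed variable, analogous to a compiled function with multiple outputs.

```python
import numpy as np

def normal_logp(value, mu, sigma):
    # elementwise log-density of a Normal distribution
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((value - mu) / sigma) ** 2

def pointwise_log_likelihood(point, observed):
    # Returns {var_name: elemwise logp}, computing shared quantities once
    # rather than rebuilding a near-duplicate graph per observed variable.
    mu = point["y"]  # shared node: evaluated a single time for all outputs
    return {name: normal_logp(vals, mu, 1.0) for name, vals in observed.items()}

log_like = pointwise_log_likelihood({"y": 0.0}, {"x": np.array([5.0, 2.0])})
```

The caller can then stack these per-draw dicts into the (chain, draw, obs) arrays that arviz expects.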

For instance, this nested loop would no longer be needed (pymc/pymc/backends/arviz.py, lines 276 to 282 at fe2d101):

for var, log_like_fun in cached:
    for k, chain in enumerate(trace.chains):
        log_like_chain = [
            self.log_likelihood_vals_point(point, var, log_like_fun)
            for point in trace.points([chain])
        ]
        log_likelihood_dict.insert(var.name, np.stack(log_like_chain), k)

CC @OriolAbril
