Description of your problem
Hi all, with some frequency I have multiple groups of data that I want to model at the same time.
Here is a small example:
import numpy as np
import pymc3 as pm
from scipy.stats import norm
num_groups = 5
group_size = 200
sigma = 1
data = np.concatenate([norm.rvs(loc=mu, scale=sigma, size=group_size) for mu in range(num_groups)])
data_labels = np.concatenate([np.ones(group_size) * group_id for group_id in range(num_groups)]).astype(int)
with pm.Model() as model:
    mu = pm.HalfNormal("mu", sigma=10, shape=num_groups)
    sigma = pm.HalfNormal("sigma", sigma=20, shape=num_groups)
    likelihood = pm.Normal("data", mu=mu[data_labels], sigma=sigma[data_labels], observed=data)
Every time I do this (assuming it is the right thing to do), I forget how to set up the observed data and how to index the RVs by group. I eventually find an old example of mine, but I can't readily find PyMC3 documentation that covers it. Shaped (vector) RVs are mentioned in the Getting Started tutorial, but not to the extent I've shown here. In the past, I think I picked up some of this from the Rugby example, Discourse, etc.
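To make the indexing step concrete, here is a minimal sketch in plain NumPy of what `mu[data_labels]` does in the model above. The array `mu_values` is a hypothetical stand-in for the `mu` RV (made-up numbers, not sampled from anything), and the group counts are shrunk so the result is easy to read. The assumption is that indexing a shaped RV with an integer label array mirrors NumPy integer-array indexing, as the model above relies on.

```python
import numpy as np

num_groups = 3
group_size = 2

# One integer label per observation: [0, 0, 1, 1, 2, 2].
# This yields the same labels as concatenating np.ones(group_size) * group_id
# for each group and casting to int.
data_labels = np.repeat(np.arange(num_groups), group_size)

# Hypothetical per-group values standing in for the `mu` RV (shape = num_groups).
mu_values = np.array([0.0, 1.0, 2.0])

# Indexing with the label array expands the per-group values
# to one value per observation, matching the length of the data.
mu_per_obs = mu_values[data_labels]

print(mu_per_obs)        # [0. 0. 1. 1. 2. 2.]
print(mu_per_obs.shape)  # (6,)
```

The key point is that the group-level parameter has length `num_groups`, while the indexed result has one entry per observation, which is what the likelihood needs alongside `observed=data`.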
If it makes sense to others, I'd like to see a tutorial or documentation page that explicitly details this usage. It may already exist, but I haven't been able to search for it effectively, so it could be made easier to find. Part of the problem may be that I'm not sure whether these would be called vector variables, shaped variables, or something else.