Bayesian Data Production Operations

Hello,

Has anyone put a PyMC model into production and saved the posterior samples to a database for later analysis? If so, what database design works best for storing them?

CC @michaelosthege

Hi @jordan.howell2,

At the moment most people fall back to InferenceData.to_netcdf and manage traces at the filesystem level (e.g. on S3), but you can also use a real database with mcbackend.ClickHouseBackend, as shown here.
This enables live access to the draws while the sampler is still running.
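
For reference, the file-based route can be sketched roughly as below. This is a minimal example, not a recommended production setup: it assumes the ArviZ 0.x API (`az.from_dict(posterior=...)`, `az.from_netcdf`) with a netCDF engine installed, and the variable name `mu`, the synthetic draws, and the file name `trace.nc` are made up for illustration. In practice the `InferenceData` would come from `pm.sample()` and the file would be uploaded to something like S3 afterwards.

```python
import os
import tempfile

import arviz as az
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the InferenceData returned by pm.sample():
# 4 chains x 250 draws of a scalar variable called "mu" (hypothetical).
idata = az.from_dict(posterior={"mu": rng.normal(size=(4, 250))})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "trace.nc")  # hypothetical file name

    # Write every group of the InferenceData to a single netCDF file.
    idata.to_netcdf(path)

    # Later (possibly in another process): read the trace back.
    restored = az.from_netcdf(path)

    # Pull the values into memory while the file still exists.
    mu_restored = restored.posterior["mu"].values

print(mu_restored.shape)
```

The resulting file is self-describing (chains, draws, and variable names are stored as dimensions and coordinates), which is a large part of why this is the usual fallback.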

Long term, I’m working towards switching the PyMC internals to use mcbackend, so any contributions are welcome!
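
On the original question about table layout: if you do want draws in a relational database, one possible starting point is a "long" format with one row per (run, variable, chain, draw). The sketch below is only an illustration and is not how mcbackend stores data. It uses SQLite from the Python standard library, and the table names `run` and `draw`, the column names, and the model name `demo_model` are all hypothetical. It also only covers scalar variables; vector or matrix variables would need extra index columns or a different encoding.

```python
import random
import sqlite3

# Hypothetical posterior: 2 chains x 5 draws of a scalar variable "mu".
rng = random.Random(1)
draws = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(2)]

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE run (
        run_id     INTEGER PRIMARY KEY,
        model_name TEXT,
        created_at TEXT
    );
    CREATE TABLE draw (
        run_id   INTEGER REFERENCES run(run_id),
        variable TEXT,
        chain    INTEGER,
        draw     INTEGER,
        value    REAL,
        PRIMARY KEY (run_id, variable, chain, draw)
    );
    """
)

# Register one sampling run, then insert its draws in long format.
cur = con.execute(
    "INSERT INTO run (model_name, created_at) VALUES (?, datetime('now'))",
    ("demo_model",),
)
run_id = cur.lastrowid
rows = [
    (run_id, "mu", chain, i, value)
    for chain, chain_draws in enumerate(draws)
    for i, value in enumerate(chain_draws)
]
con.executemany("INSERT INTO draw VALUES (?, ?, ?, ?, ?)", rows)
con.commit()

# Example analysis query: number of draws and posterior mean of "mu" for this run.
n_rows, mean_mu = con.execute(
    "SELECT COUNT(*), AVG(value) FROM draw WHERE run_id = ? AND variable = 'mu'",
    (run_id,),
).fetchone()
print(n_rows, mean_mu)
```

The long format keeps queries simple (filter by run and variable, aggregate over chains), at the cost of many rows for large models.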