Bayesian Data Production Operations

jordan.howell2 · October 27, 2022, 5:08pm

Hello,

Has anyone put a PyMC model into production and actually saved the samples to a database for later analysis? If so, what is the best database design for storing posterior samples?

ricardoV94 · October 29, 2022, 12:00pm

CC @michaelosthege

michaelosthege · October 30, 2022, 12:48pm

Hi @jordan.howell2,

At the moment most people fall back to InferenceData.to_netcdf and managing traces at the filesystem level (e.g. on S3), but you can also use a real database with mcbackend.ClickHouseBackend, as shown here.
The database route enables live access to the draws while the sampler is still running.

Long term I’m working towards switching the PyMC internals to use mcbackend, so any contributions are welcome!
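
To make the database design question a bit more concrete, here is a minimal sketch of one possible layout. It is not the schema that mcbackend or ClickHouse uses; it just assumes a long-format table with one row per (run, chain, draw, variable), written with Python's built-in sqlite3. The table name, the column names, and the helpers save_draws and posterior_mean are all hypothetical, and the draws are random placeholders rather than output from a PyMC model.

```python
import random
import sqlite3

# Hypothetical long-format schema: one row per scalar draw.
SCHEMA = """
CREATE TABLE IF NOT EXISTS draws (
    run_id   TEXT    NOT NULL,
    chain    INTEGER NOT NULL,
    draw     INTEGER NOT NULL,
    variable TEXT    NOT NULL,
    value    REAL    NOT NULL,
    PRIMARY KEY (run_id, chain, draw, variable)
)
"""


def save_draws(conn, run_id, variable, samples):
    """Insert samples given as a list of chains, each a list of floats."""
    rows = [
        (run_id, chain, draw, variable, float(value))
        for chain, chain_samples in enumerate(samples)
        for draw, value in enumerate(chain_samples)
    ]
    with conn:  # commit all rows in a single transaction
        conn.executemany("INSERT INTO draws VALUES (?, ?, ?, ?, ?)", rows)


def posterior_mean(conn, run_id, variable):
    """Average a variable over all chains and draws of one run."""
    row = conn.execute(
        "SELECT AVG(value) FROM draws WHERE run_id = ? AND variable = ?",
        (run_id, variable),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")  # swap in a file path for persistence
conn.execute(SCHEMA)

# Placeholder draws (4 chains x 100 draws); in practice these would come
# from the sampler, e.g. the posterior group of an InferenceData object.
rng = random.Random(0)
fake_samples = [[rng.gauss(1.0, 0.5) for _ in range(100)] for _ in range(4)]

save_draws(conn, "run-001", "mu", fake_samples)
n_rows = conn.execute("SELECT COUNT(*) FROM draws").fetchone()[0]
mean_mu = posterior_mean(conn, "run-001", "mu")
```

This sketch only covers scalar variables. Vector- or matrix-valued parameters would need an extra index column (or one row per element with a flattened index), and a columnar store is likely a better fit than SQLite once the number of draws gets large.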
