A request for a use case where PyMC3 makes a system call to an external code for its generative model #1925
You have to write a Theano wrapping operator around the program you want to call. You can find one example of how to do that in my beat repository. I am sorry for the complexity of the example, but unfortunately I don't think it can be made easier. You don't need to know what goes on in the functions, just the principle. I also recommend reading the Theano tutorial section on extending Theano. I hope this helps to some extent.
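Stripped of the Theano-specific machinery, the core job of such a wrapper is file-based communication with the external program. A minimal sketch of that pattern (the JSON input format, file names, and the throwaway demo program are illustrative, not taken from the beat code):

```python
import json
import subprocess
import sys

import numpy as np

def call_external(params, cmd):
    """Write params to a JSON input file, run the external program
    on it, and read the result back from output.txt."""
    with open("input.txt", "w") as f:
        json.dump(params, f)
    subprocess.run(cmd + ["input.txt"], check=True)  # blocking system call
    return np.loadtxt("output.txt")

# Demo with a stand-in "external program" that doubles its input:
prog = ("import json, sys; d = json.load(open(sys.argv[1])); "
        "open('output.txt', 'w').write(str(2 * d['x']))")
y = call_external({"x": 21.0}, [sys.executable, "-c", prog])
```

A Theano Op then only needs to call something like `call_external` inside its `perform` method (or via `as_op`, as in the example below).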
I have exactly the same issue: proprietary software which I can only run through a command call. Here is my attempt:
from pymc3 import *
import numpy as np
import matplotlib.pyplot as plt
import theano.tensor as t
import theano
import json
import os
# X1, X2, and Y were generated from the first example on the PyMC3 tutorial website
X1 = [-1.0856306, 0.99734545, 0.2829785, -1.50629471, -0.57860025, 1.65143654, -2.42667924, -0.42891263, 1.26593626, -0.8667404, -0.67888615, -0.09470897, 1.49138963, -0.638902, -0.44398196, -0.43435128, 2.20593008, 2.18678609, 1.0040539, 0.3861864, 0.73736858, 1.49073203, -0.93583387, 1.17582904, -1.25388067, -0.6377515, 0.9071052, -1.4286807, -0.14006872, -0.8617549, -0.25561937, -2.79858911, -1.7715331, -0.69987723, 0.92746243, -0.17363568, 0.00284592, 0.68822271, -0.87953634, 0.28362732, -0.80536652, -1.72766949, -0.39089979, 0.57380586, 0.33858905, -0.01183049, 2.39236527, 0.41291216, 0.97873601, 2.23814334, -1.29408532, -1.03878821, 1.74371223, -0.79806274, 0.02968323, 1.06931597, 0.89070639, 1.75488618, 1.49564414, 1.06939267, -0.77270871, 0.79486267, 0.31427199, -1.32626546, 1.41729905, 0.80723653, 0.04549008, -0.23309206, -1.19830114, 0.19952407, 0.46843912, -0.83115498, 1.16220405, -1.09720305, -2.12310035, 1.03972709, -0.40336604, -0.12602959, -0.83751672, -1.60596276, 1.25523737, -0.68886898, 1.66095249, 0.80730819, -0.31475815, -1.0859024, -0.73246199, -1.21252313, 2.08711336, 0.16444123, 1.15020554, -1.26735205, 0.18103513, 1.17786194, -0.33501076, 1.03111446, -1.08456791, -1.36347154, 0.37940061, -0.37917643];
X2 = [0.12841094, -0.39557759, 0.14245293, 0.51966079, -0.0049252, 0.00682843, 0.0359099, -0.37239514, 0.08522933, -0.32108195, -0.08553592, 0.24857391, -0.14704339, 0.1002498, 0.20254781, 0.05574817, -0.27418969, -0.06649506, 0.39188227, -0.40500915, -0.0551572, -0.11042161, 0.02414947, 0.14964312, 0.32173819, -0.05404648, 0.16246827, 0.09994803, 0.09486946, -0.11278479, -0.19946429, -0.22000862, -0.15128744, 0.06433732, 0.15218988, 0.06469377, -0.10979102, 0.36119402, 0.30377312, -0.07080002, -0.16468628, 0.02604299, 0.25345973, 0.066553, 0.11130974, -0.04241602, 0.09125418, 0.30890889, -0.04793376, 0.02866155, 0.0507633, 0.05674507, -0.28237778, -0.37537373, -0.20393101, 0.03358846, 0.11077123, -0.10613491, 0.2754515, -0.02863519, 0.0040632, -0.03879277, 0.02680536, 0.14089481, 0.13313069, -0.17968459, 0.30473276, -0.21900529, 0.0158454, -0.05487931, -0.20979834, -0.01502412, -0.14816275, 0.01458145, 0.08061719, 0.29438587, 0.06147684, -0.12224507, -0.07832396, 0.02799562, 0.01869217, 0.29191785, 0.27907059, -0.07178719, -0.10972843, -0.51141092, -0.10978408, -0.19561154, -0.07096489, 0.07831685, 0.03543847, -0.0059936, 0.03991642, -0.02522355, 0.03940379, -0.646211, -0.0538587, -0.02217014, -0.06825234, -0.04358925];
Y = np.array([9.38706859e-01, 4.10296149e-01,3.83981292e+00, 1.48115418e+00, 4.02779506e-01, 2.46184530e+00, -1.42342679e+00, -1.27520755e+00, 2.38380704e+00, -3.90761758e-01, 6.86815665e-01, 2.10641559e+00, 1.84890360e+00, -8.04359754e-01, 3.93284941e-01, 2.31721220e+00, 3.41651416e+00, 3.39016804e+00, 2.22246532e+00, 3.77308673e-01, 3.43806883e-01, 1.66274112e+00, -1.20663529e-01, 2.18829692e+00, 1.50706675e+00, -1.19159361e+00, 1.44784359e+00, -1.55349860e+00, -1.40248284e-01, -1.96609652e-02, -1.35472064e+00, -1.59474188e+00, -1.39656749e+00, 5.29754386e-01, 2.63051387e+00, 5.53932221e-01, 1.76084808e+00, 2.39686504e+00, 1.47396672e+00, 9.07514885e-01, 7.37921664e-02, -3.82899347e-01, 1.49271947e+00, 7.65880501e-01, 2.05273917e+00, 5.63172455e-01, 4.25098874e+00, 3.26909416e-02, 3.93785393e-01, 3.67324277e+00, 1.69575050e+00, 9.38133214e-01, 1.35531685e+00, -2.42854948e+00, 1.26254192e+00, 2.07270390e+00, 2.75833869e+00, 2.60484762e+00, 3.21391580e+00, 4.95643013e+00, 2.31319324e-01, 1.53863552e+00, 1.25983672e+00, -5.57565140e-01, 3.74025866e+00, 1.00427073e+00, 2.44326467e+00, 5.03997740e-01, 1.06029822e+00, 1.48250538e+00, -2.69441500e-01, -1.19520306e+00, 3.20016631e+00, -6.69460228e-01, -2.24215995e+00, 2.10607318e+00, 2.01495136e+00, -8.51855254e-01, -8.99821832e-01, -1.20278122e+00, 1.05077792e+00, -1.43401688e-01, 1.84052098e+00, 1.16665281e+00, 5.60119574e-02, -2.04696786e+00, -1.66062003e+00, 5.51783963e-01, 1.58062230e+00, 1.63826706e+00, 1.16403511e+00, 3.85980814e-01, 2.23665854e+00, 1.23718947e+00, -1.16021703e+00, 1.11137427e+00, 1.65658589e+00, -3.20236525e-03, 1.36931418e+00, 1.33161104e+00]);
i = 0

@theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def proc_test(alpha, beta1, beta2):
    # Write an input file with the current parameter values.
    global X1, X2, i
    params = {}
    params['X1'] = X1
    params['X2'] = X2
    params['alpha'] = float(alpha)
    params['beta1'] = float(beta1)
    params['beta2'] = float(beta2)
    with open("input.txt", 'w') as outfile:
        json.dump(params, outfile)
    # Make a system call to the external code.
    os.system("python dummy.py input.txt")
    # Report the number of evaluations every 100 calls (not necessary, just checking).
    i += 1
    if i % 100 == 0:
        print("number of evaluations {}".format(i))
    # Read the output file and return the value.
    mu = np.loadtxt("output.txt")
    return np.array(mu)
# The model
with Model() as model:  # model specifications in PyMC3 are wrapped in a with-statement
    # Priors for unknown model parameters
    alpha = Normal('alpha', mu=0, sd=10)
    beta1 = Normal('beta1', mu=0, sd=10)
    beta2 = Normal('beta2', mu=0, sd=10)
    sigma = HalfNormal('sigma', sd=1)
    # Expected value of outcome, computed by the external call
    mu = proc_test(alpha, beta1, beta2)
    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
    # Inference
    # start = find_MAP()  # finding a starting value by optimization needs gradients
    step = Metropolis()
    trace = sample(100, step)

# Plot results
plt.figure(figsize=(12, 12))
summary(trace)
traceplot(trace)
This implementation has the drawback of not providing a gradient, so you cannot use MAP estimation or samplers that require gradient calculation (e.g. NUTS); I use Metropolis here. Perhaps someone with deeper knowledge of pymc3 could help: if you have a function called as an external command plus reading outputs from a file, is there a way to implement a gradient calculation for the Theano op? I wonder why this is not a recurring issue in pymc3 (I have seen practically no questions or answers on this). I imagine some people would want to do inference on external models they don't own, which wasn't an issue in pymc2, but migrating to pymc3 is not trivial.
If the generation procedure is a complete black box, I guess the only option may be a stochastic gradient approximation, but that would still be very inefficient.
I think any large model (with lots of parameters) is unlikely to work in PyMC2 either, as Metropolis will not be able to explore the full parameter space. One approach worth trying is the newly implemented SMC (thanks to @hvasbath); otherwise you can try an Approximate Bayesian Computation approach such as elfi.
I can just repeat what @junpenglao said. I had the same problem as you, @Argantonio65, which is why I started implementing that code. However, if your external program is fast and the forward model not expensive, you can still implement the grad method of your Op and get the gradients: repeatedly call your external program, keeping all parameters but one fixed and perturbing the current one just a little. The typical numerical gradient calculation.
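The finite-difference scheme described above can be sketched like this (pure NumPy; `forward_model` is a stand-in for the write-input/call-program/read-output round trip, and the data are made up):

```python
import numpy as np

x1 = np.array([0.0, 1.0, 2.0])
x2 = np.array([1.0, 0.5, 0.25])

def forward_model(params):
    # Stand-in for writing input.txt, calling the external code,
    # and reading output.txt back.
    alpha, beta1, beta2 = params
    return alpha + beta1 * x1 + beta2 * x2

def numerical_grad(f, params, eps=1e-6):
    """Central differences: perturb one parameter at a time while
    keeping the others fixed; returns the Jacobian (outputs x params)."""
    params = np.asarray(params, dtype=float)
    out = np.asarray(f(params))
    jac = np.empty((out.size, params.size))
    for j in range(params.size):
        up, down = params.copy(), params.copy()
        up[j] += eps
        down[j] -= eps
        jac[:, j] = (np.asarray(f(up)) - np.asarray(f(down))) / (2 * eps)
    return jac

J = numerical_grad(forward_model, [1.0, 2.0, 3.0])
# For this linear model the Jacobian columns are 1, x1 and x2.
```

Note that this costs two external calls per parameter per gradient evaluation, so it only pays off when the forward model is cheap.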
Thanks @hvasbath @junpenglao |
Hey @Argantonio65, any updates on this? We are curious how SMC performs on many different models, as it is still a little experimental. Although we now have some feedback that it seems to work well, we would like to know how it works in your case.
Hello @hvasbath, my likelihood was quite slow to evaluate (a simple version of the model takes ~2-3 min), so I decided to emulate the dynamic model (purely data-based) and then infer on the surrogate model. My dynamics are quite smooth in the parameters, so for dimension < 10 the emulated structure gave very good results. And since the surrogate evaluates in 1-2 seconds, the problem with sampling efficiency was practically gone, so I stopped trying more efficient samplers. It is nice if SMC performs well. If I manage to get some free time with the model (I cannot access it for at least 2 weeks), I might try SMC. I see that there is a usage example at SMC2_gaussians. Do you have an example in which the likelihood is based on observations? I understood that the likelihood should be defined as a random variable (as in llk = pm.Potential('llk', two_gaussians(X))).
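The emulator itself isn't shown in the thread; as a hypothetical illustration of the surrogate idea, here is a least-squares polynomial emulator for an invented "slow" forward model (the model, the feature set, and the design are all made up for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_model(theta):
    # Stand-in for a 2-3 minute simulation run.
    a, b = theta
    return a * np.sin(b)

# 1) Design of experiments: evaluate the slow model on sampled points.
design = rng.uniform(-1.0, 1.0, size=(200, 2))
responses = np.array([expensive_model(t) for t in design])

# 2) Fit a cheap polynomial surrogate by least squares.
def features(theta):
    a, b = theta[..., 0], theta[..., 1]
    return np.stack([np.ones_like(a), a, b, a * b, a * b**3], axis=-1)

coef, *_ = np.linalg.lstsq(features(design), responses, rcond=None)

def surrogate(theta):
    # Evaluates in microseconds instead of minutes.
    return features(np.asarray(theta)) @ coef

err = abs(surrogate([0.5, 0.7]) - expensive_model([0.5, 0.7]))
```

Inference then runs against `surrogate` instead of the slow code, which is why the sampler's efficiency stops mattering so much.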
I have an example here fitting a mixed-effect model with a different sampler:
What's 'SMC'?
Sequential Monte Carlo, a sampler in the step-methods that you may try for your model.
In my research, my generative model is encapsulated in a software code (I do not have access to the source code). This code reads an input file and writes an output file. I have two choices: I can either run a design of experiments with this code, fit a surrogate model, and then encode that surrogate model inside pymc3's model definition; or, preferably, I would like to be able to make a system call to this software code from within pymc3.
To test whether this mechanism is possible, I have attempted to rework the simplest example from the PyMC3 tutorial so that the calculation of mu (as a linear regression fit) is conducted in an external function call.
The following code block is a modified version of the first example from the pymc3 website. Instead of having mu = alpha + beta[0]*X1 + beta[1]*X2 inside the model block, mu is the return value of a subroutine. This subroutine takes the currently sampled values of alpha and beta[i], builds an input file, makes a system call to dummy.py, and finally reads the output file created by dummy.py and returns the value of mu.
The next code block is the file dummy.py. This code reads its input file, calculates mu = alpha + beta[0]*X1 + beta[1]*X2, and writes mu to an output file.
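dummy.py itself is not reproduced in the thread; a sketch of what it presumably looks like, given the input/output conventions of proc_test above (the exact file formats are an assumption):

```python
# dummy.py (sketch): read the JSON input written by proc_test, compute
# mu = alpha + beta1*X1 + beta2*X2, and write mu to output.txt.
import json
import sys

import numpy as np

def run(input_path, output_path="output.txt"):
    with open(input_path) as f:
        params = json.load(f)
    x1 = np.asarray(params["X1"])
    x2 = np.asarray(params["X2"])
    mu = params["alpha"] + params["beta1"] * x1 + params["beta2"] * x2
    np.savetxt(output_path, mu)

if __name__ == "__main__" and len(sys.argv) > 1:
    run(sys.argv[1])
```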
My issue is that I don't know enough about the core data structures that pymc3 uses, so I don't know how to extract the current iteration's sampled values of alpha and beta[i] so that these values can be written to an input file.
thanks much,
-ian