
A request for a use case: PyMC3 making a system call to external code for its generative model #1925


Closed
iantmiller opened this issue Mar 20, 2017 · 11 comments


@iantmiller commented Mar 20, 2017

In my research, my generative model is encapsulated in a software code (I do not have access to the source code). This code reads an input file and writes an output file. I have two choices: I can run a design of experiments with this code, fit a surrogate model, and then encode that surrogate model inside pymc3's model definition; alternatively, and preferably, I would like to make a system call to this software code from within pymc3.

In an effort to test whether this mechanism is possible, I have attempted to rework the simplest example from the PyMC3 tutorial so that the calculation of mu (a linear regression fit) is conducted in an external function call.

The following code block is a modified version of the first example from the PyMC3 website. Instead of computing mu = alpha + beta[0]*X1 + beta[1]*X2 inside the model block, mu is the return value of a subroutine. This subroutine takes the current sampled values for alpha and beta[i], builds an input file, makes a system call to dummy.py, and finally reads the output file created by dummy.py and returns the value of mu.

from pymc3 import  *
import numpy as np
import matplotlib.pyplot as plt
import theano
import json
from subprocess import call

# X1, X2, and Y were generated from the first example on the PyMC3 tutorial website
X1 = [-1.0856306,   0.99734545,  0.2829785,  -1.50629471, -0.57860025,  1.65143654, -2.42667924, -0.42891263,  1.26593626, -0.8667404,  -0.67888615, -0.09470897,   1.49138963, -0.638902,   -0.44398196, -0.43435128,  2.20593008,  2.18678609,   1.0040539,   0.3861864,   0.73736858,  1.49073203, -0.93583387,  1.17582904,  -1.25388067, -0.6377515,   0.9071052,  -1.4286807,  -0.14006872, -0.8617549, -0.25561937, -2.79858911, -1.7715331,  -0.69987723,  0.92746243, -0.17363568,  0.00284592,  0.68822271, -0.87953634,  0.28362732, -0.80536652, -1.72766949,  -0.39089979,  0.57380586,  0.33858905, -0.01183049,  2.39236527,  0.41291216,  0.97873601,  2.23814334, -1.29408532, -1.03878821,  1.74371223, -0.79806274,  0.02968323,  1.06931597,  0.89070639,  1.75488618,  1.49564414,  1.06939267, -0.77270871,  0.79486267,  0.31427199, -1.32626546,  1.41729905,  0.80723653,  0.04549008, -0.23309206, -1.19830114,  0.19952407,  0.46843912, -0.83115498,  1.16220405, -1.09720305, -2.12310035,  1.03972709, -0.40336604, -0.12602959, -0.83751672, -1.60596276,  1.25523737, -0.68886898,  1.66095249,  0.80730819, -0.31475815, -1.0859024,  -0.73246199, -1.21252313,  2.08711336,  0.16444123,  1.15020554, -1.26735205,  0.18103513,  1.17786194, -0.33501076,  1.03111446, -1.08456791, -1.36347154,  0.37940061, -0.37917643];
X2 = [0.12841094, -0.39557759,  0.14245293,  0.51966079, -0.0049252, 0.00682843,  0.0359099,  -0.37239514,  0.08522933, -0.32108195, -0.08553592,  0.24857391, -0.14704339,  0.1002498,   0.20254781,  0.05574817, -0.27418969, -0.06649506,  0.39188227, -0.40500915, -0.0551572,  -0.11042161,  0.02414947,  0.14964312,  0.32173819, -0.05404648,  0.16246827,  0.09994803,  0.09486946, -0.11278479, -0.19946429, -0.22000862, -0.15128744,  0.06433732,  0.15218988,  0.06469377, -0.10979102,  0.36119402,  0.30377312, -0.07080002, -0.16468628,  0.02604299,  0.25345973,  0.066553,    0.11130974, -0.04241602,  0.09125418,  0.30890889, -0.04793376,  0.02866155,  0.0507633,   0.05674507, -0.28237778, -0.37537373, -0.20393101,  0.03358846,  0.11077123, -0.10613491,  0.2754515,  -0.02863519,  0.0040632, -0.03879277,  0.02680536,  0.14089481,  0.13313069, -0.17968459,  0.30473276, -0.21900529,  0.0158454,  -0.05487931, -0.20979834, -0.01502412, -0.14816275,  0.01458145,  0.08061719,  0.29438587,  0.06147684, -0.12224507, -0.07832396,  0.02799562,  0.01869217,  0.29191785,  0.27907059, -0.07178719, -0.10972843, -0.51141092, -0.10978408, -0.19561154, -0.07096489,  0.07831685,  0.03543847, -0.0059936,   0.03991642, -0.02522355,  0.03940379, -0.646211, -0.0538587,  -0.02217014, -0.06825234, -0.04358925];
Y  = np.array([9.38706859e-01,   4.10296149e-01,3.83981292e+00,   1.48115418e+00, 4.02779506e-01,   2.46184530e+00,  -1.42342679e+00,  -1.27520755e+00,   2.38380704e+00,  -3.90761758e-01,   6.86815665e-01,   2.10641559e+00,   1.84890360e+00,  -8.04359754e-01,   3.93284941e-01,   2.31721220e+00,   3.41651416e+00,   3.39016804e+00,   2.22246532e+00,   3.77308673e-01,  3.43806883e-01,   1.66274112e+00,  -1.20663529e-01,   2.18829692e+00,   1.50706675e+00,  -1.19159361e+00,   1.44784359e+00,  -1.55349860e+00,  -1.40248284e-01,  -1.96609652e-02,  -1.35472064e+00,  -1.59474188e+00, -1.39656749e+00,   5.29754386e-01,   2.63051387e+00,   5.53932221e-01,   1.76084808e+00,   2.39686504e+00,   1.47396672e+00,   9.07514885e-01,   7.37921664e-02,  -3.82899347e-01,   1.49271947e+00,   7.65880501e-01,   2.05273917e+00,   5.63172455e-01,   4.25098874e+00,   3.26909416e-02,   3.93785393e-01,   3.67324277e+00,   1.69575050e+00,   9.38133214e-01,   1.35531685e+00,  -2.42854948e+00,   1.26254192e+00,   2.07270390e+00,   2.75833869e+00,   2.60484762e+00,   3.21391580e+00,   4.95643013e+00,   2.31319324e-01,   1.53863552e+00,   1.25983672e+00,  -5.57565140e-01,   3.74025866e+00,   1.00427073e+00,   2.44326467e+00,   5.03997740e-01,   1.06029822e+00,   1.48250538e+00,  -2.69441500e-01,  -1.19520306e+00, 3.20016631e+00, -6.69460228e-01, -2.24215995e+00,   2.10607318e+00,   2.01495136e+00,  -8.51855254e-01,  -8.99821832e-01,  -1.20278122e+00,   1.05077792e+00,  -1.43401688e-01,   1.84052098e+00,   1.16665281e+00,   5.60119574e-02,  -2.04696786e+00,  -1.66062003e+00,   5.51783963e-01,   1.58062230e+00,   1.63826706e+00,   1.16403511e+00,   3.85980814e-01,   2.23665854e+00,   1.23718947e+00,  -1.16021703e+00,   1.11137427e+00,   1.65658589e+00,  -3.20236525e-03,   1.36931418e+00,   1.33161104e+00]);

def proc_test(alpha, beta1, beta2, X1, X2):

    #### write an input file
    params = {}
    params['X1'] = X1
    params['X2'] = X2

    # These are the lines of code that I need help with.
    # As written below, objects of the Normal class (symbolic random
    # variables) are stored in the dictionary, but I need to store the
    # current sampled value of each distribution instead.
    # Do I need to pull these values from a Backend or Trace object?
    params['alpha'] = alpha
    params['beta1'] = beta1
    params['beta2'] = beta2

    with open("input.txt", 'w') as outfile:
        json.dump(params, outfile)

    #### make a system call (argument-list form, so no shell is needed)
    call(["python", "dummy.py", "input.txt"])

    #### read the output file and return the value
    mu = np.loadtxt("output.txt")
    return mu

# the model
with Model() as model: # model specifications in PyMC3 are wrapped in a with-statement

    # Priors for unknown model parameters
    alpha = Normal('alpha', mu=0, sd=10)
    beta1 = Normal('beta1', mu=0, sd=10)
    beta2 = Normal('beta2', mu=0, sd=10)
    sigma = HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = proc_test(alpha, beta1, beta2, X1, X2)

    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

    # Inference!
    start = find_MAP()                       # find starting values by optimization
    step = NUTS(scaling=start)               # instantiate the MCMC sampling algorithm
    trace = sample(2000, step, start=start)  # draw 2000 posterior samples using NUTS

# plot results
plt.figure(figsize=(12, 12))
summary(trace)    
traceplot(trace)

The next code block is the file dummy.py. It reads the input file, calculates mu = alpha + beta1*X1 + beta2*X2, and writes mu to an output file.

# file: dummy.py
import sys
import json
import numpy as np

def main(argv):

    # the input file is provided on the command line
    input_file = argv[0]

    # read in the input file
    with open(input_file) as json_data:
        params = json.load(json_data)

    X1 = np.asarray(params["X1"], dtype=float)
    X2 = np.asarray(params["X2"], dtype=float)

    # the generative model
    mu = float(params["alpha"]) + float(params["beta1"])*X1 + float(params["beta2"])*X2

    # write the results to an output file
    np.savetxt("output.txt", mu)

if __name__ == "__main__":
    main(sys.argv[1:])

My issue is that I do not know enough about the core data structures that pymc3 uses, so I do not know how to extract the current iteration's sampled values of alpha, beta1, and beta2 so that these values can be written to an input file.

thanks much,
-ian

@twiecki (Member) commented Mar 21, 2017

@hvasbath (Contributor) commented Apr 8, 2017

You have to write a Theano wrapping operator around the program you want to call. Then you can simply use the random variables as is.

You can find one example of how to do that in my beat repository https://github.com/hvasbath/beat/tree/master/src in theanof.py l.79-166. This operator is then called in models.py l.343 and yields a function that takes the random variables from pymc3 as inputs (models.py l.414).

I am sorry for the complexity of the example, but unfortunately I think it is not possible to make it simpler. You don't need to know what goes on inside the functions, just the principle. I also recommend reading the Theano tutorial section "How to expand Theano?"

I hope this helps up to some point ...
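For illustration, here is a minimal sketch of such a wrapping operator, written against this thread's example rather than beat: the ExternalModelOp name and the input.txt/output.txt protocol are taken from the code above, and the Op itself is only an assumed bare-bones shape of what beat's theanof.py operators do, not a copy of them.

import json
import numpy as np
import theano
import theano.tensor as tt
from subprocess import call

class ExternalModelOp(theano.Op):
    """Wrap an external command-line program as a Theano Op (no gradient)."""

    itypes = [tt.dscalar, tt.dscalar, tt.dscalar]   # alpha, beta1, beta2
    otypes = [tt.dvector]                           # mu

    def __init__(self, X1, X2):
        self.X1 = list(X1)
        self.X2 = list(X2)

    def call_external(self, alpha, beta1, beta2):
        # plain-Python forward model: write the input file, run the
        # external program, read back its result
        params = {'X1': self.X1, 'X2': self.X2,
                  'alpha': float(alpha), 'beta1': float(beta1),
                  'beta2': float(beta2)}
        with open("input.txt", "w") as f:
            json.dump(params, f)
        call(["python", "dummy.py", "input.txt"])
        return np.loadtxt("output.txt")

    def perform(self, node, inputs, output_storage):
        # inside perform the inputs are concrete numbers, not symbolic
        # variables, so they can be serialized directly
        alpha, beta1, beta2 = inputs
        output_storage[0][0] = self.call_external(alpha, beta1, beta2)

Inside the model block this would be used as mu = ExternalModelOp(X1, X2)(alpha, beta1, beta2).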

@Argantonio65

I have exactly the same issue: a proprietary software code that I can only run through a command call.
I have been trying to work around it, and here is a working version of your code with a Theano op:

from pymc3 import  *
import numpy as np
import matplotlib.pyplot as plt
import theano.tensor as t
import theano
import json
import os
# X1, X2, and Y were generated from the first example on the PyMC3 tutorial website
X1 = [-1.0856306,   0.99734545,  0.2829785,  -1.50629471, -0.57860025,  1.65143654, -2.42667924, -0.42891263,  1.26593626, -0.8667404,  -0.67888615, -0.09470897,   1.49138963, -0.638902,   -0.44398196, -0.43435128,  2.20593008,  2.18678609,   1.0040539,   0.3861864,   0.73736858,  1.49073203, -0.93583387,  1.17582904,  -1.25388067, -0.6377515,   0.9071052,  -1.4286807,  -0.14006872, -0.8617549, -0.25561937, -2.79858911, -1.7715331,  -0.69987723,  0.92746243, -0.17363568,  0.00284592,  0.68822271, -0.87953634,  0.28362732, -0.80536652, -1.72766949,  -0.39089979,  0.57380586,  0.33858905, -0.01183049,  2.39236527,  0.41291216,  0.97873601,  2.23814334, -1.29408532, -1.03878821,  1.74371223, -0.79806274,  0.02968323,  1.06931597,  0.89070639,  1.75488618,  1.49564414,  1.06939267, -0.77270871,  0.79486267,  0.31427199, -1.32626546,  1.41729905,  0.80723653,  0.04549008, -0.23309206, -1.19830114,  0.19952407,  0.46843912, -0.83115498,  1.16220405, -1.09720305, -2.12310035,  1.03972709, -0.40336604, -0.12602959, -0.83751672, -1.60596276,  1.25523737, -0.68886898,  1.66095249,  0.80730819, -0.31475815, -1.0859024,  -0.73246199, -1.21252313,  2.08711336,  0.16444123,  1.15020554, -1.26735205,  0.18103513,  1.17786194, -0.33501076,  1.03111446, -1.08456791, -1.36347154,  0.37940061, -0.37917643];
X2 = [0.12841094, -0.39557759,  0.14245293,  0.51966079, -0.0049252, 0.00682843,  0.0359099,  -0.37239514,  0.08522933, -0.32108195, -0.08553592,  0.24857391, -0.14704339,  0.1002498,   0.20254781,  0.05574817, -0.27418969, -0.06649506,  0.39188227, -0.40500915, -0.0551572,  -0.11042161,  0.02414947,  0.14964312,  0.32173819, -0.05404648,  0.16246827,  0.09994803,  0.09486946, -0.11278479, -0.19946429, -0.22000862, -0.15128744,  0.06433732,  0.15218988,  0.06469377, -0.10979102,  0.36119402,  0.30377312, -0.07080002, -0.16468628,  0.02604299,  0.25345973,  0.066553,    0.11130974, -0.04241602,  0.09125418,  0.30890889, -0.04793376,  0.02866155,  0.0507633,   0.05674507, -0.28237778, -0.37537373, -0.20393101,  0.03358846,  0.11077123, -0.10613491,  0.2754515,  -0.02863519,  0.0040632, -0.03879277,  0.02680536,  0.14089481,  0.13313069, -0.17968459,  0.30473276, -0.21900529,  0.0158454,  -0.05487931, -0.20979834, -0.01502412, -0.14816275,  0.01458145,  0.08061719,  0.29438587,  0.06147684, -0.12224507, -0.07832396,  0.02799562,  0.01869217,  0.29191785,  0.27907059, -0.07178719, -0.10972843, -0.51141092, -0.10978408, -0.19561154, -0.07096489,  0.07831685,  0.03543847, -0.0059936,   0.03991642, -0.02522355,  0.03940379, -0.646211, -0.0538587,  -0.02217014, -0.06825234, -0.04358925];
Y  = np.array([9.38706859e-01,   4.10296149e-01,3.83981292e+00,   1.48115418e+00, 4.02779506e-01,   2.46184530e+00,  -1.42342679e+00,  -1.27520755e+00,   2.38380704e+00,  -3.90761758e-01,   6.86815665e-01,   2.10641559e+00,   1.84890360e+00,  -8.04359754e-01,   3.93284941e-01,   2.31721220e+00,   3.41651416e+00,   3.39016804e+00,   2.22246532e+00,   3.77308673e-01,  3.43806883e-01,   1.66274112e+00,  -1.20663529e-01,   2.18829692e+00,   1.50706675e+00,  -1.19159361e+00,   1.44784359e+00,  -1.55349860e+00,  -1.40248284e-01,  -1.96609652e-02,  -1.35472064e+00,  -1.59474188e+00, -1.39656749e+00,   5.29754386e-01,   2.63051387e+00,   5.53932221e-01,   1.76084808e+00,   2.39686504e+00,   1.47396672e+00,   9.07514885e-01,   7.37921664e-02,  -3.82899347e-01,   1.49271947e+00,   7.65880501e-01,   2.05273917e+00,   5.63172455e-01,   4.25098874e+00,   3.26909416e-02,   3.93785393e-01,   3.67324277e+00,   1.69575050e+00,   9.38133214e-01,   1.35531685e+00,  -2.42854948e+00,   1.26254192e+00,   2.07270390e+00,   2.75833869e+00,   2.60484762e+00,   3.21391580e+00,   4.95643013e+00,   2.31319324e-01,   1.53863552e+00,   1.25983672e+00,  -5.57565140e-01,   3.74025866e+00,   1.00427073e+00,   2.44326467e+00,   5.03997740e-01,   1.06029822e+00,   1.48250538e+00,  -2.69441500e-01,  -1.19520306e+00, 3.20016631e+00, -6.69460228e-01, -2.24215995e+00,   2.10607318e+00,   2.01495136e+00,  -8.51855254e-01,  -8.99821832e-01,  -1.20278122e+00,   1.05077792e+00,  -1.43401688e-01,   1.84052098e+00,   1.16665281e+00,   5.60119574e-02,  -2.04696786e+00,  -1.66062003e+00,   5.51783963e-01,   1.58062230e+00,   1.63826706e+00,   1.16403511e+00,   3.85980814e-01,   2.23665854e+00,   1.23718947e+00,  -1.16021703e+00,   1.11137427e+00,   1.65658589e+00,  -3.20236525e-03,   1.36931418e+00,   1.33161104e+00]);


i = 0
@theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def proc_test(alpha, beta1, beta2):
    #### write an input file
    global X1, X2, i

    params = {}
    params['X1'] = X1
    params['X2'] = X2

    # inside the op the arguments are concrete sampled values, so they
    # can be cast to plain floats and serialized
    params['alpha'] = float(alpha)
    params['beta1'] = float(beta1)
    params['beta2'] = float(beta2)

    with open("input.txt", 'w') as outfile:
        json.dump(params, outfile)

    #### make a system call
    os.system("python dummy.py input.txt")

    # report the number of function runs every 100 evaluations;
    # NOT necessary, just checking
    i += 1
    if i % 100 == 0:
        print("number of evaluations {}".format(i))

    #### read the output file and return the value
    mu = np.loadtxt("output.txt")

    return np.array(mu)

# the model
with Model() as model: # model specifications in PyMC3 are wrapped in a with-statement

    # Priors for unknown model parameters
    alpha = Normal('alpha', mu=0, sd=10)
    beta1 = Normal('beta1', mu=0, sd=10)
    beta2 = Normal('beta2', mu=0, sd=10)
    sigma = HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = proc_test(alpha, beta1, beta2)

    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

    # Inference!
    # start = find_MAP()  # not usable: the op provides no gradient
    step = Metropolis()
    trace = sample(100, step)

# plot results
plt.figure(figsize=(12, 12))
summary(trace)    
traceplot(trace)

This implementation has the drawback of not providing a gradient. Thus you cannot use MAP estimation or samplers that require gradient calculation (e.g. NUTS); I use Metropolis here.

Perhaps someone with deeper knowledge of pymc3 could help: if you have a function called as an external command, with outputs read back from a file, is there a way to implement a gradient calculation for the Theano op?

I wonder why this is not a recurring issue for pymc3 (I have seen practically no questions or answers on this). I imagine some people would want to do inference on external models they do not own, which was not an issue in pymc2. But migrating to pymc3 is not trivial.

@junpenglao (Member) commented Jun 16, 2017

Perhaps someone with deeper knowledge of pymc3 could help: if you have a function called as an external command, with outputs read back from a file, is there a way to implement a gradient calculation for the Theano op?

If the generation procedure is a complete black box, I guess the only way may be a stochastic gradient, but that would still be very inefficient.

I wonder why this is not a recurring issue for pymc3 (I have seen practically no questions or answers on this). I imagine some people would want to do inference on external models they do not own, which was not an issue in pymc2. But migrating to pymc3 is not trivial.

I think any large model (with lots of parameters) is unlikely to work in PyMC2 either, as Metropolis will not be able to explore the full parameter space. One approach worth trying is the newly implemented SMC (thanks to @hvasbath); otherwise you can try an Approximate Bayesian Computation approach such as elfi.

@hvasbath (Contributor) commented Jun 16, 2017

I can just repeat what @junpenglao said. I had the same problem as you, @Argantonio65, which is why I started implementing that code. However, if your external program is fast and the forward model is not expensive, you can still implement the grad method of your Op and get gradients: repeatedly call your external program, and for each parameter keep the others fixed and perturb the current one slightly. The typical numerical gradient calculation.
But you would need to use the proper Op class instead of the as_op above, as as_op does not support the gradient...
However, if your external program takes a long time, it is far more efficient to simply use SMC, especially if you have many model parameters.
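A rough sketch of that finite-difference idea, extending the ExternalModelOp sketched after the April comment: the central-difference scheme, the step size eps, and the ExternalModelGradOp name are illustrative choices, and each gradient evaluation costs two external-program runs per parameter.

import numpy as np
import theano
import theano.tensor as tt

class ExternalModelGradOp(theano.Op):
    """Numerical vector-Jacobian product for the black-box forward model."""

    itypes = [tt.dscalar, tt.dscalar, tt.dscalar, tt.dvector]  # params + dcost/dmu
    otypes = [tt.dscalar, tt.dscalar, tt.dscalar]              # dcost/dparams

    def __init__(self, forward, eps=1e-4):
        self.forward = forward   # plain-Python function (alpha, beta1, beta2) -> mu
        self.eps = eps

    def perform(self, node, inputs, output_storage):
        theta = [float(v) for v in inputs[:-1]]
        g = inputs[-1]   # gradient of the cost with respect to mu
        for k in range(len(theta)):
            up = list(theta); up[k] += self.eps
            dn = list(theta); dn[k] -= self.eps
            # central difference: two external runs per parameter
            dmu_dk = (self.forward(*up) - self.forward(*dn)) / (2.0 * self.eps)
            output_storage[k][0] = np.asarray(np.dot(g, dmu_dk))

class ExternalModelOpWithGrad(ExternalModelOp):
    """The earlier forward Op, now exposing a numerical gradient to Theano."""

    def grad(self, inputs, output_grads):
        grad_op = ExternalModelGradOp(forward=self.call_external)
        # one symbolic gradient per input variable
        return grad_op(inputs[0], inputs[1], inputs[2], output_grads[0])

With this, gradient-based machinery (find_MAP, NUTS) can at least run, though as noted it is only worthwhile when the external program is cheap.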

@Argantonio65

Thanks @hvasbath @junpenglao
I will try your implementation of SMC (my likelihood evaluates quite slowly, so there is no point in trying the gradient version).

@hvasbath (Contributor)

Hey @Argantonio65, any updates on this? We are curious how SMC performs on many different models, as it is still a little experimental. Although we now have some feedback that it seems to work well, we would like to know how it works for your case.

@Argantonio65

Hello @hvasbath, my likelihood was quite slow to evaluate (a simple version of the model takes ~2-3 min), so I decided to emulate the dynamic model (purely data-based) and then infer on the surrogate model. My dynamics are quite smooth with respect to the parameters, so for dimension < 10 the emulated structure gave very good results. And since the surrogate evaluates in 1-2 seconds, the sampling-efficiency problem was practically gone, so I stopped trying more efficient samplers.

It is nice if SMC performs well. If I manage to get some free time with the model (I cannot access it for at least two weeks), I might try SMC.

I see that there is a usage example at SMC2_gaussians. Do you have an example in which the likelihood is based on observations? I understand that the likelihood should be defined as a random variable (as in llk = pm.Potential('llk', two_gaussians(X))).
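For what it is worth, a sketch of what an observation-based likelihood can look like when written as a Potential, using this thread's example model: the black-box mu is assumed to come from an Op like the ones sketched earlier, and pm.Normal.dist(...).logp(...) is the standard PyMC3 way to evaluate a log-density without registering an observed variable.

import pymc3 as pm

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta1 = pm.Normal('beta1', mu=0, sd=10)
    beta2 = pm.Normal('beta2', mu=0, sd=10)
    sigma = pm.HalfNormal('sigma', sd=1)

    # black-box forward model, e.g. the ExternalModelOp sketched above
    mu = ExternalModelOp(X1, X2)(alpha, beta1, beta2)

    # the same Normal likelihood over the observations Y, expressed as a
    # log-probability Potential instead of an observed random variable
    llk = pm.Potential('llk', pm.Normal.dist(mu=mu, sd=sigma).logp(Y).sum())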

@junpenglao (Member)

@dreamqin68

What is 'SMC'?

@hvasbath (Contributor)

Sequential Monte Carlo, a sampler in the step methods that you can try for your model.
