NUTS Sampler Fails with pygpu.gpuarray.GpuArrayException Error #3087


Closed
MisterRedactus opened this issue Jul 10, 2018 · 22 comments

@MisterRedactus

I am unable to run the NUTS sampler in PyMC3. The case I am running is a straightforward transcription of the basic Normal-distribution case in the regression example at http://docs.pymc.io/notebooks/getting_started#Installation:

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
#print('Running on PyMC3 v{}'.format(pm.__version__))
plt.style.use('seaborn-darkgrid')

# Initialize random number generator
np.random.seed(123)

# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]

# Size of dataset
size = 100

# Predictor variable
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2

# Simulate outcome variable
Y = alpha + beta[0]*X1 + beta[1]*X2 + np.random.randn(size)*sigma

fig, axes = plt.subplots(1, 2, sharex=True, figsize=(10,4))
axes[0].scatter(X1, Y)
axes[1].scatter(X2, Y)
axes[0].set_ylabel('Y'); axes[0].set_xlabel('X1'); axes[1].set_xlabel('X2');
plt.show()

basic_model = pm.Model()
with basic_model:

    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = alpha + beta[0]*X1 + beta[1]*X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

map_estimate = pm.find_MAP(model=basic_model, method='powell')

print(map_estimate)

with basic_model:
    # draw 500 posterior samples
    trace = pm.sample(500)

This code appears to work properly until the pm.sample line is reached. At that point I receive a pygpu.gpuarray.GpuArrayException "invalid value" error. The following is the complete output from the run, including the traceback:

WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Using cuDNN version 5105 on context None
Mapped name None to device cuda: GeForce GTX 950 (0000:03:00.0)
  0%|          | 0/5000 [00:00<?, ?it/s]
D:\Programs\Anaconda3\Lib\site-packages\scipy\optimize\_minimize.py:502: RuntimeWarning: Method powell does not use gradient information (jac).
  RuntimeWarning)
logp = -148.98, ||grad|| = 0.73744: 100%|███████████████████████████████████████████| 183/183 [00:00<00:00, 185.38it/s]
{'alpha': array(0.9090931, dtype=float32), 'beta': array([0.9514547, 2.6145666], dtype=float32), 'sigma_log__': array(-0.03494539, dtype=float32), 'sigma': array(0.9656581, dtype=float32)}
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma_log__, beta, alpha]
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky\backend\queues.py", line 151, in _feed
obj, reducers=reducers)
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky\backend\reduction.py", line 145, in dumps
p.dump(obj)
File "D:\Programs\Anaconda3\Lib\site-packages\theano\gpuarray\type.py", line 909, in GpuArray_pickler
return (GpuArray_unpickler, (np.asarray(cnda), ctx_name))
File "D:\Programs\Anaconda3\Lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "pygpu\gpuarray.pyx", line 1735, in pygpu.gpuarray.GpuArray.array
File "pygpu\gpuarray.pyx", line 1405, in pygpu.gpuarray._pygpu_as_ndarray
File "pygpu\gpuarray.pyx", line 394, in pygpu.gpuarray.array_read
pygpu.gpuarray.GpuArrayException: b'cuMemcpyDtoHAsync(dst, src->ptr + srcoff, sz, ctx->mem_s): CUDA_ERROR_INVALID_VALUE:
invalid argument'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File ".\Regression_Case.py", line 50, in
trace = pm.sample(500)
File "D:\Programs\Anaconda3\Lib\site-packages\pymc3\sampling.py", line 442, in sample
trace = _mp_sample(**sample_args)
File "D:\Programs\Anaconda3\Lib\site-packages\pymc3\sampling.py", line 982, in _mp_sample
traces = Parallel(n_jobs=cores, mmap_mode=None)(jobs)
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\parallel.py", line 962, in call
self.retrieve()
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\parallel.py", line 865, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "D:\Programs\Anaconda3\Lib\site-packages\joblib_parallel_backends.py", line 515, in wrap_future_result
return future.result(timeout=timeout)
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky_base.py", line 431, in result
return self.__get_result()
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky_base.py", line 382, in __get_result
raise self._exception
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky\backend\queues.py", line 151, in _feed
obj, reducers=reducers)
File "D:\Programs\Anaconda3\Lib\site-packages\joblib\externals\loky\backend\reduction.py", line 145, in dumps
p.dump(obj)
File "D:\Programs\Anaconda3\Lib\site-packages\theano\gpuarray\type.py", line 909, in GpuArray_pickler
return (GpuArray_unpickler, (np.asarray(cnda), ctx_name))
File "D:\Programs\Anaconda3\Lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "pygpu\gpuarray.pyx", line 1735, in pygpu.gpuarray.GpuArray.array
File "pygpu\gpuarray.pyx", line 1405, in pygpu.gpuarray._pygpu_as_ndarray
File "pygpu\gpuarray.pyx", line 394, in pygpu.gpuarray.array_read
pygpu.gpuarray.GpuArrayException: b'cuMemcpyDtoHAsync(dst, src->ptr + srcoff, sz, ctx->mem_s): CUDA_ERROR_INVALID_VALUE:
invalid argument'

The following are my versions and main components:

  • PyMC3 Version: 3.4.1
  • Theano Version: 1.0.2
  • Python Version: 3.6.5
  • Operating system: Windows 10
  • How did you install PyMC3: (conda/pip): conda

I was hoping to use PyMC3 in an upcoming project, so any assistance you might provide would be much appreciated.

@junpenglao
Member

We usually don't see a big advantage to using the GPU in our use cases, so my suggestion is to set Theano to CPU only and try again.
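A minimal sketch of forcing CPU-only execution without editing config files (assuming Theano has not been imported yet; the flag has to be set before the first theano/pymc3 import):

import os
# Tell Theano to ignore the GPU for this process; equivalent to
# 'device = cpu' in .theanorc(.txt) or THEANO_FLAGS on the command line.
os.environ['THEANO_FLAGS'] = 'device=cpu,floatX=float64'

import pymc3 as pm  # imports theano under the hood, picking up the flags above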

@MisterRedactus
Author

I tried that by changing the setting in my .theanorc.txt file from 'device = cuda' to 'device = cpu'. This resulted in a new error:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma_log__, beta, alpha]
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFB9A6994C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFBD06F717D Unknown Unknown Unknown
KERNEL32.DLL 00007FFBD1432784 Unknown Unknown Unknown
ntdll.dll 00007FFBD3450C31 Unknown Unknown Unknown
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFB9A6994C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFBD06F717D Unknown Unknown Unknown
KERNEL32.DLL 00007FFBD1432784 Unknown Unknown Unknown
ntdll.dll 00007FFBD3450C31 Unknown Unknown Unknown
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFB9A6994C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFBD06F717D Unknown Unknown Unknown
KERNEL32.DLL 00007FFBD1432784 Unknown Unknown Unknown
ntdll.dll 00007FFBD3450C31 Unknown Unknown Unknown
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFB9A6994C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFBD06F717D Unknown Unknown Unknown
KERNEL32.DLL 00007FFBD1432784 Unknown Unknown Unknown
ntdll.dll 00007FFBD3450C31 Unknown Unknown Unknown
ERROR: The process "18224" not found.
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFB9A6994C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFBD06F717D Unknown Unknown Unknown
KERNEL32.DLL 00007FFBD1432784 Unknown Unknown Unknown
ntdll.dll 00007FFBD3450C31 Unknown Unknown Unknown
QObject::~QObject: Timers cannot be stopped from another thread
ERROR: The process "6244" not found.

Needless to say, I did not initiate any control-C event, so I am just as puzzled by this new error as by the last one, unless there is another way to set Theano to use the CPU only.

@twiecki
Member

twiecki commented Jul 12, 2018

Did you install mkl-service in Anaconda?

@MisterRedactus
Author

I don't believe so, unless it was part of the baseline Anaconda installation. Unfortunately, I am traveling over the next week and don't have access to my desktop to test this. Frankly, I have had enough problems getting a stable PyMC3 installation that I am considering starting over: reinstalling Anaconda, then reinstalling PyMC3 from conda, taking careful notes as I go to document my steps.

In the meantime, please see my related issue #3093 for the problems I encountered installing PyMC3 on a different Windows 10 laptop.

@MisterRedactus
Author

Now that I am back at my home office, I have returned to the problem of installing PyMC3 on Windows 10 with Python 3.7/Anaconda 5.2 on my dual-boot, GPU-enabled desktop. I have tried a couple of approaches: one following the procedure at http://datahans.blogspot.com/2016/04/installing-pymc3.html (but using Python 3.7 rather than the 2.7 recommended in the link), and the other creating a dedicated conda environment in line with one of the suggestions at #2988. In the second case, the yml file was:

name: pymc3_env_3_7
dependencies:
  - python 
  - cloudpickle
  - ipykernel
  - mingw
  - libpython
  - m2w64-toolchain
  - mkl=2017
  - pygpu
  - theano
  - pymc3
  - parameterized
  - seaborn

In both cases, I got through the data-simulation and MAP-estimate portions of the regression case code, and in both cases I then received this traceback:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, beta, alpha]
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\theano\gpuarray\__init__.py", line 227, in <module>
    use(config.device)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\theano\gpuarray\__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\theano\gpuarray\__init__.py", line 65, in init_dev
    raise RuntimeError("You can't initialize the GPU in a subprocess if the parent process already did it")
RuntimeError: You can't initialize the GPU in a subprocess if the parent process already did it
  0%|                                                                                         | 0/5000 [00:00<?, ?it/s]D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\scipy\optimize\_minimize.py:381: RuntimeWarning: Method powell does not use gradient information (jac).
  RuntimeWarning)
logp = -149.47, ||grad|| = 13.25: 100%|████████████████████████████████████████████| 177/177 [00:00<00:00, 3326.66it/s]
Map estimate =  {'alpha': array(0.9090678691864014, dtype=float32), 'beta': array([ 0.9514268 ,  2.61449409], dtype=float32), 'sigma_log__': array(-0.03490985184907913, dtype=float32), 'sigma': array(0.9656924605369568, dtype=float32)}
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, beta, alpha]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 114, in _main
    Traceback (most recent call last):
prepare(preparation_data)
  File "Regression_Case.py", line 51, in <module>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 225, in prepare
    trace = pm.sample(1000)
_fixup_main_from_path(data['init_main_from_path'])
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 449, in sample
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    trace = _mp_sample(**sample_args)
run_name="__mp_main__")
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 996, in _mp_sample
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 263, in run_path
    chain, progressbar)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in __init__
    pkg_name=pkg_name, script_name=fname)
for chain, seed, start in zip(range(chains), seeds, start_points)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 96, in _run_module_code
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in <listcomp>
    mod_name, mod_spec, pkg_name, script_name)
for chain, seed, start in zip(range(chains), seeds, start_points)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 85, in _run_code
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 182, in __init__
    exec(code, run_globals)
self._process.start()
  File "D:\Projects\Leak Spill Analysis\Regression_Case.py", line 51, in <module>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\process.py", line 105, in start
    trace = pm.sample(1000)
self._popen = self._Popen(self)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 449, in sample
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 223, in _Popen
    trace = _mp_sample(**sample_args)
return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 996, in _mp_sample
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 322, in _Popen
    chain, progressbar)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in __init__
        return Popen(process_obj)for chain, seed, start in zip(range(chains), seeds, start_points)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in <listcomp>
    reduction.dump(process_obj, to_child)
for chain, seed, start in zip(range(chains), seeds, start_points)  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\reduction.py", line 60, in dump

    ForkingPickler(file, protocol).dump(obj)  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 182, in __init__

    self._process.start()BrokenPipeError
: [Errno 32] Broken pipe  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\process.py", line 105, in start

    self._popen = self._Popen(self)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

For what it is worth, I have run the sample code at http://deeplearning.net/software/theano/tutorial/using_gpu.html, confirming that the GPU can be run successfully on my PC. So I am trying other things, but as before, any suggestions would be appreciated.
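The RuntimeError about the "bootstrapping phase" is the standard complaint from Windows' spawn-based multiprocessing, which re-imports the running script in every worker process. A minimal sketch of the usual workaround, reusing the toy regression model from the example above, is to guard the sampling call with a __main__ check:

import numpy as np
import pymc3 as pm

def build_model():
    # Toy regression data and model, same shape as the example above
    np.random.seed(123)
    X1 = np.random.randn(100)
    X2 = np.random.randn(100) * 0.2
    Y = 1 + 1.0 * X1 + 2.5 * X2 + np.random.randn(100)
    with pm.Model() as model:
        alpha = pm.Normal('alpha', mu=0, sd=10)
        beta = pm.Normal('beta', mu=0, sd=10, shape=2)
        sigma = pm.HalfNormal('sigma', sd=1)
        pm.Normal('Y_obs', mu=alpha + beta[0]*X1 + beta[1]*X2, sd=sigma, observed=Y)
    return model

if __name__ == '__main__':
    # Only the parent process reaches this block; spawned workers
    # re-import the module and stop at the guard, so no worker tries
    # to start sampling (or initialize the GPU) on its own.
    with build_model():
        trace = pm.sample(500)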

@MisterRedactus
Author

I have continued trying to install PyMC3 on my desktop, and have now succeeded by modifying my yml file as follows:

name: pymc3_env_2_7
dependencies:
  - python=2.7
  - cloudpickle
  - ipykernel
  - mingw
  - libpython
  - m2w64-toolchain
  - mkl=2017
  - pygpu
  - theano
  - pymc3
  - parameterized
  - seaborn

This Python 2.7 installation appears to run successfully on both my CPU and GPU, although I do get an odd "Could not pickle model, sampling singlethreaded." message when I run on the GPU. So in a sense this installation issue appears to be resolved, since I can now use PyMC3 for my work. That said, it is worth pointing out that I have never, on any machine, been able to get PyMC3 to install successfully on a Windows 10 platform using the current version of Python 3.

@fonnesbeck
Member

Glad you got something working. It is worrying that you can't get Py3 to work, however. To clarify, are you talking about working with a GPU, or working at all? Does it work in a CPU environment?

@MisterRedactus
Author

The traceback I posted a couple of days ago was with device = cuda. Changing this to device = cpu in my .theanorc.txt file results in:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, beta, alpha]
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
  0%|                                                                                         | 0/5000 [00:00<?, ?it/s]D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\scipy\optimize\_minimize.py:381: RuntimeWarning: Method powell does not use gradient information (jac).
  RuntimeWarning)
logp = -149.47, ||grad|| = 13.25: 100%|████████████████████████████████████████████| 177/177 [00:00<00:00, 4748.09it/s]
Map estimate =  {'alpha': array(0.9090678691864014, dtype=float32), 'beta': array([ 0.9514268 ,  2.61449409], dtype=float32), 'sigma_log__': array(-0.03490985184907913, dtype=float32), 'sigma': array(0.9656924605369568, dtype=float32)}
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, beta, alpha]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
Traceback (most recent call last):
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 96, in _run_module_code
  File "Regression_Case.py", line 57, in <module>
        mod_name, mod_spec, pkg_name, script_name)trace = pm.sample(1000)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\runpy.py", line 85, in _run_code
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 449, in sample
        exec(code, run_globals)trace = _mp_sample(**sample_args)

  File "D:\Projects\Leak Spill Analysis\Regression_Case.py", line 57, in <module>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 996, in _mp_sample
        trace = pm.sample(1000)chain, progressbar)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 449, in sample
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in __init__
    trace = _mp_sample(**sample_args)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\sampling.py", line 996, in _mp_sample
    chain, progressbar)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in __init__
        for chain, seed, start in zip(range(chains), seeds, start_points)for chain, seed, start in zip(range(chains), seeds, start_points)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in <listcomp>
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 275, in <listcomp>
        for chain, seed, start in zip(range(chains), seeds, start_points)for chain, seed, start in zip(range(chains), seeds, start_points)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 182, in __init__
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\site-packages\pymc3\parallel_sampling.py", line 182, in __init__
        self._process.start()self._process.start()

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\process.py", line 105, in start
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\process.py", line 105, in start
        self._popen = self._Popen(self)self._popen = self._Popen(self)

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 223, in _Popen
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
return _default_context.get_context().Process._Popen(process_obj)  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 322, in _Popen

      File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
      File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
return Popen(process_obj)
  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
      File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
reduction.dump(process_obj, to_child)
_check_not_importing_main()  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\reduction.py", line 60, in dump

  File "D:\Anaconda3\envs\pymc3_env_3_7\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

@twiecki
Copy link
Member

twiecki commented Jul 23, 2018

I think I know why. Py3 properly supports parallel sampling, which would only work if you had 4 GPUs. So try setting jobs=1.
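Concretely, that looks like the sketch below (the keyword name depends on the PyMC3 version: njobs up to 3.4, cores from 3.5 on, as noted further down in this thread):

with basic_model:
    # One worker process, so the already-initialized GPU context is
    # never pickled and shipped to a second process.
    trace = pm.sample(500, njobs=1)   # use cores=1 on PyMC3 >= 3.5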

@MisterRedactus
Author

Awesome! That did it. The sample case runs in the Python 3 PyMC3 environment with both device = cuda and device = cpu in .theanorc.txt. I note that the CPU run takes about a quarter of the time that the GPU run takes, so it would be hard to justify doing this on my GPU.

My only other question: Why did the Python 2 installation work properly? Beyond that, it would be useful to include more bulletproof instructions for installing PyMC3 under Windows.

Other than that, I think my PyMC3 installation looks good. Thanks so much for the assist.

@twiecki
Member

twiecki commented Jul 24, 2018

Glad it's working. Yes, the GPU only speeds up very few models. There's probably more optimization that could be done, but it's not a priority currently.

Support for parallel sampling is a bit broken in Python 2, so I assume you just didn't get parallelization there, and thus Theano didn't try to run 4 GPU jobs in parallel.

Happy sampling!

@twiecki twiecki closed this as completed Jul 24, 2018
@JIXING123

Hi, I have run into the same problem. This is the first time I have run PyMC3 and I still do not understand how to solve it. Can you write it down in more detail? Thanks.

@twiecki
Member

twiecki commented Aug 5, 2018

@JIXING123 Did you try sampling with jobs=1? I.e. pm.sample(jobs=1).

@JIXING123

Hi, thank you for your reply. I have not set jobs=1 yet because I do not know where to put it. For example, should it be trace = pm.sample(2000, jobs=1), or should pm.sample(jobs=1) be a separate line?

Attached is the code from http://people.duke.edu/~ccc14/sta-663-2016/16C_PyMC3.html, which I am just using to learn PyMC3. I also tried the example that MisterRedactus posted; it gives the same error.

import pymc3 as pm
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

n = 100
heads = 61
a, b = 10, 10
prior = stats.beta(a, b)
post = stats.beta(heads + a, n - heads + b)
ci = post.interval(0.95)

xs = np.linspace(0, 1, 100)
plt.plot(prior.pdf(xs), label='Prior')
plt.plot(post.pdf(xs), label='Posterior')
plt.axvline(100*heads/n, c='red', alpha=0.4, label='MLE')
plt.xlim([0, 100])
plt.axhline(0.3, ci[0], ci[1], c='black', linewidth=2, label='95% CI')
plt.legend()
pass

# Introduction to PyMC3
niter = 2000
with pm.Model() as coin_context:
    p = pm.Beta('p', alpha=2, beta=2)
    y = pm.Binomial('y', n=n, p=p, observed=heads)
    trace = pm.sample(niter, jobs=1)

@MisterRedactus
Author

Try the attached procedure, environment config and test PyMC3 Python files, which worked well on my Windows PC:

PyMC3 Windows Installation Instructions.docx

Python3_Regression_Case.txt

pymc3_env_3_7.txt

Make sure you change pymc3_env_3_7.txt to pymc3_env_3_7.yml and Python3_Regression_Case.txt to Python3_Regression_Case.py before you start. Good luck.

@junpenglao
Member

Wow thank you so much for writing down your experience!

@JWarmenhoven
Contributor

@JIXING123
For me (PyMC 3.5, single GPU) your code works with both CPU and GPU. It is just that with a single GPU you have to make sure Theano does not try to initiate parallel sampling as indicated by @twiecki above.

In case you are using a single GPU: did you try setting cores=1 when sampling?
Your example code:
trace=pm.sample(niter, cores=1).

It looks like the commit below replaced the keyword njobs with cores in pm.sample:
f74bf07#diff-7eb6c4a83cfe45b9fc0eac76b57e2175
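For anyone unsure which keyword their installed version expects, here is a small sketch (assuming the rename landed in 3.5, per the commit above, and that model is an already-defined pm.Model):

import pymc3 as pm

# Pick the single-process keyword based on the installed PyMC3 version.
major, minor = (int(x) for x in pm.__version__.split('.')[:2])
single_job_kw = 'cores' if (major, minor) >= (3, 5) else 'njobs'

with model:
    trace = pm.sample(1000, **{single_job_kw: 1})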

@benmbrennan

Hey, I'm also having trouble with this and none of the suggested solutions work. I've already changed the device to 'cpu' as well.

model = pymc3.Model()

with model:
    alpha = pymc3.MvNormal('alpha', mu=np.r_[np.ones(5), .9*np.eye(5).flatten('F')], cov=np.eye(30), shape=(30,))
    mu = theano.tensor.dot(theano.tensor.slinalg.kron(np.eye(5), X), alpha.T)
    Y_obs = pymc3.MvNormal('Y_obs', mu=mu.T, cov=np.eye(1260), observed=Y.flatten('F').T)
    map_estimate = pymc3.find_MAP(model=model)
    trace = pymc3.sample(500, jobs=1)

I had been trying to draw the covariance matrix for the parameters from a distribution as well, but was having a lot of trouble getting that to work and wanted to just make sure I could get something simpler to work first.

X and Y are time series data matrices. X includes the lags of Y (and ones).

@junpenglao
Member

Pretty sure you will run out of memory with cov=np.eye(1260). If you are using an identity matrix as cov then it is just a set of independent univariate Gaussians, so you should replace the MvNormal with a Normal.
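A sketch of what that replacement could look like for the observation likelihood above (assuming the same mu and data; sd=1 matches an identity covariance):

with model:
    # An identity covariance means i.i.d. unit-variance noise, so a plain
    # Normal is equivalent to the MvNormal but avoids building and
    # factorizing a 1260x1260 covariance matrix.
    Y_obs = pymc3.Normal('Y_obs', mu=mu.T, sd=1.0, observed=Y.flatten('F').T)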

@benmbrennan

I'm using identity matrices because I was being bombarded with errors when I was drawing from another distribution. Also, isn't the cov for the observations meant to be the covariance matrix of the error terms?

@junpenglao
Member

If you also get the error on the CPU, this is likely a different issue and you should open a new issue or discussion on https://discourse.pymc.io. Did you search our Discourse? I remember that using sparse matrices is not trivial and there are a few discussions about it there.

@cpoptic

cpoptic commented Sep 2, 2018

I was able to solve this issue in my environment (a single GPU laptop) by adding the parameter "cores=1" to my trace call.

So for example:
trace = pm.sample(2000, step=step)
would be modified to
trace = pm.sample(2000, step=step, cores=1)
