Theano (ver 0.8.2) から gpu を使用する際、以下のようなエラーが出たので対応した際のログ。
ERROR (theano.sandbox.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
File "/home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/theano/sandbox/gpuarray/__init__.py", line 95, in <module>
init_dev(config.device)
File "/home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/theano/sandbox/gpuarray/__init__.py", line 46, in init_dev
"Make sure Theano and libgpuarray/pygpu "
RuntimeError: ('Wrong major API version for gpuarray:', 1, 'Make sure Theano and libgpuarray/pygpu are in sync.')
これは Theano の issue を参照するに libgpuarray のバージョンと Theano のバージョンのミスマッチが原因ぽく、libgpuarray の tag v-9998 をインストールすることで解決できるぽい。
そこでここを参考に pip 経由で以下のように libgpuarray をインストールしたが、以下のエラーを確認。
ちなみに、#egg 以降はここを参考に、libgpuarray の setup.py を参照した。
$ pip install -e git://github.com/Theano/libgpuarray.git@v-9998#egg=pygpu
Obtaining pygpu from git+git://github.com/Theano/libgpuarray.git@v-9998#egg=pygpu
Cloning git://github.com/Theano/libgpuarray.git (to v-9998) to ./src/pygpu
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/guchio/src/pygpu/setup.py", line 7, in <module>
import Cython
ImportError: No module named Cython
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /home/guchio/src/pygpu/
そこで cython をインストールして再度挑戦。
しかし以下のエラーを確認。
$ pip install -e git://github.com/Theano/libgpuarray.git@v-9998#egg=pygpu
Obtaining pygpu from git+git://github.com/Theano/libgpuarray.git@v-9998#egg=pygpu
Updating ./src/pygpu clone (to v-9998)
Requirement already satisfied: mako>=0.7 in ./miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/Mako-1.0.7-py2.7.egg (from pygpu)
Requirement already satisfied: MarkupSafe>=0.9.2 in ./miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages (from mako>=0.7->pygpu)
Installing collected packages: pygpu
Found existing installation: pygpu 0.6.9
Uninstalling pygpu-0.6.9:
Successfully uninstalled pygpu-0.6.9
Running setup.py develop for pygpu
Complete output from command /home/guchio/miniconda2/envs/ntmenv-owl/bin/python -c "import setuptools, tokenize;__file__='/home/guchio/src/pygpu/setup.py';f=getattr(tokenize, 'o
pen', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps:
running develop
running egg_info
writing requirements to pygpu.egg-info/requires.txt
writing pygpu.egg-info/PKG-INFO
writing top-level names to pygpu.egg-info/top_level.txt
writing dependency_links to pygpu.egg-info/dependency_links.txt
reading manifest file 'pygpu.egg-info/SOURCES.txt'
writing manifest file 'pygpu.egg-info/SOURCES.txt'
running build_ext
building 'pygpu.gpuarray' extension
gcc -pthread -B /home/guchio/miniconda2/envs/ntmenv-owl/compiler_compat -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/cuda/include/ -fPIC -DGPUARRAY_SHARED -I/h
ome/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/numpy/core/include -I/home/guchio/miniconda2/envs/ntmenv-owl/include/python2.7 -c pygpu/gpuarray.c -o build/temp.li
nux-x86_64-2.7/pygpu/gpuarray.o
In file included from /home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1809:0,
from /home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from pygpu/gpuarray.c:514:
/home/guchio/miniconda2/envs/ntmenv-owl/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disabl
e it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by " \
^
pygpu/gpuarray.c:516:28: fatal error: gpuarray/types.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Rolling back uninstall of pygpu
Command "/home/guchio/miniconda2/envs/ntmenv-owl/bin/python -c "import setuptools, tokenize;__file__='/home/guchio/src/pygpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);co
de=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps" failed with error code 1 in /home/guchio/src/pygpu/
その後いろいろ試したがうまく行かず、結局 Theano のバージョンを 0.8.2 から 0.9.0 に変更して解決した。
試しに gpu の速度を次のコードで図った。
from theano import function, config, shared, tensor
import numpy
import time
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
('Gpu' not in type(x.op).__name__)
for x in f.maker.fgraph.toposort()]):
print('Used the cpu')
else:
print('Used the gpu')
結果は以下。(cpu は 1 core のみ使用。)
# GPU ver
Using cuDNN version 5110 on context None
Mapped name None to device cuda0: Tesla K80 (0000:83:00.0)
[GpuElemwise{exp,no_inplace}(<GpuArrayType<None>(float32, (False,))>), HostFromGpu(gpuarray)(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.505921 seconds
Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
Used the gpu
# CPU ver
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 32.001292 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
Used the cpu
一方、以前の記事で紹介した amdlibm を使用すると cpu でも以下のように良い結果となった。
今回の elementwise は amdlibm の効果が非常に出やすいものっぽい。
# CPU w/ amdlibm ver
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 1.445175 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
Used the cpu
また、Theano 0.9.0 は今使っている Lasagne というライブラリの 0.1 との相性が悪く、lasagne/layers/pool.py に於いて
from theano.tensor.signal import downsample
という行があるが、theano 0.9.0 は downsample をサポートから外しているため、ここで下のようなエラーが出る。
ImportError: cannot import name downsample
一方 Lasagne 全体を bleeeding edge version にしてしまうと今書いているプログラムが動かなくなるため、あまり良い手ではないがこの pool.py だけ最新のものに手で書き換え解決。
Comments
comments powered by Disqus