Theano: Initialisation of device gpu failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY

Question

Mar 19, 2016, 10:11 AM


I am running the kaggle_otto_nn.py example from Keras with the Theano backend. When I set cnmem=1, the following error occurs:

cliu@cliu-ubuntu:keras-examples$ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=1 python kaggle_otto_nn.py
Using Theano backend.
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1

/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
Traceback (most recent call last):
  File "kaggle_otto_nn.py", line 28, in <module>
    from keras.models import Sequential
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 15, in <module>
  File "build/bdist.linux-x86_64/egg/keras/backend/__init__.py", line 46, in <module>
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/__init__.py", line 111, in <module>
    theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/sandbox/cuda/tests/test_driver.py", line 38, in test_nvidia_driver1
    if not numpy.allclose(f(), a.sum()):
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.0rc1-py2.7.egg/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
RuntimeError: Cuda error: kernel_reduce_ccontig_node_97496c4d3cf9a06dc4082cc141f918d2_0: out of memory. (grid: 1 x 1; block: 256 x 1 x 1)

Apply node that caused the error: GpuCAReduce{add}{1}(<CudaNdarrayType(float32, vector)>)
Toposort index: 0
Inputs types: [CudaNdarrayType(float32, vector)]
Inputs shapes: [(10000,)]
Inputs strides: [(1,)]
Inputs values: ['not shown']
Outputs clients: [[HostFromGpu(GpuCAReduce{add}{1}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

It seems that I cannot set cnmem to a very large value (roughly > 0.9), since that can make the GPU run out of memory. When I set cnmem=0.9, it works correctly. According to the documentation, it

Represents the start size (in MB or % of total GPU memory) of the memory pool.

and

This could cause memory fragmentation. So if you have a memory error while using cnmem, try to allocate more memory at the start or disable it. If you try this, report your result on theano-dev.

But if I get a memory error, why should I allocate more memory at the start? In my case, allocating more memory at the start seems to be what causes the error.
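To make the arithmetic behind my observation concrete, here is a minimal sketch. It assumes that a cnmem value in (0, 1] is read as a fraction of total GPU memory and a value above 1 as a size in MB, which is how I read the documentation quoted above. The function names and the memory figures are hypothetical and are not part of Theano; the sketch only checks whether the requested start size would fit into whatever memory is still free.

```python
def cnmem_request_mb(cnmem, total_mb):
    """Start size of the pool in MB for a given cnmem value.

    Assumed semantics (my reading of the docs, not taken from Theano's code):
    0 disables the pool, (0, 1] is a fraction of total memory, > 1 is MB.
    """
    if cnmem < 0:
        raise ValueError("cnmem must be non-negative")
    if cnmem == 0:
        return 0.0
    if cnmem <= 1:
        return cnmem * total_mb
    return float(cnmem)


def pool_fits(cnmem, total_mb, free_mb):
    """True if the requested start size does not exceed the free memory."""
    return cnmem_request_mb(cnmem, total_mb) <= free_mb


if __name__ == "__main__":
    # Hypothetical card: 4096 MB in total, 3800 MB free because the
    # driver and the display already hold part of the memory.
    total_mb, free_mb = 4096.0, 3800.0
    for value in (1, 0.9, 2048):
        request = cnmem_request_mb(value, total_mb)
        print(value, request, pool_fits(value, total_mb, free_mb))
```

With these made-up numbers, cnmem=1 asks for the whole card and cannot fit, while cnmem=0.9 does fit. That would match what I see, but it is only a guess at the cause.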
