TensorFlow Memory Error

Hi,

We flashed the board with this one:
JetPack-L4T-3.1-linux-x64.run

And here is the information you need:

nvidia@tegra-ubuntu:~$ ll /home/nvidia/.cache/pip/
total 20
drwx------  4 nvidia nvidia 4096 Oct 17 03:29 ./
drwx------ 18 nvidia nvidia 4096 Oct  5 02:33 ../
drwx------ 18 nvidia nvidia 4096 Oct 17 03:21 http/
-rw-rw-r--  1 nvidia nvidia   69 Oct 17 03:29 selfcheck.json
drwxrwxr-x 14 nvidia nvidia 4096 Oct 17 09:54 wheels/
nvidia@tegra-ubuntu:~$ ll /home/nvidia/.cache/pip/wheels/
total 56
drwxrwxr-x 14 nvidia nvidia 4096 Oct 17 09:54 ./
drwx------  4 nvidia nvidia 4096 Oct 17 03:29 ../
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 02/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 08/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 1c/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 3b/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:10 47/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 7b/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 86/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 88/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 b3/
drwxrwxr-x  3 nvidia nvidia 4096 Oct 17 09:54 bc/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 02:08 c8/
drwxrwxr-x  3 nvidia nvidia 4096 Aug 29 01:51 dd/

Hi,

Could you try this command:

sudo -H pip install tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl

Hi AastaLLL,

I tried your command:

sudo -H pip install tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl

And this is the result:

nvidia@tegra-ubuntu:~$ sudo -H pip install tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl
Requirement already satisfied (use --upgrade to upgrade): tensorflow==1.3.0 from file:///home/nvidia/tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl in /usr/local/lib/python2.7/dist-packages
Requirement already satisfied (use --upgrade to upgrade): protobuf>=3.3.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): six>=1.10.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): tensorflow-tensorboard<0.2.0,>=0.1.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): wheel in /usr/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): backports.weakref>=1.0rc1 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.11.0 in /usr/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): mock>=2.0.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/local/lib/python2.7/dist-packages (from protobuf>=3.3.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): werkzeug>=0.11.10 in /usr/local/lib/python2.7/dist-packages (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): html5lib==0.9999999 in /usr/local/lib/python2.7/dist-packages (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): markdown>=2.6.8 in /usr/local/lib/python2.7/dist-packages (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): bleach==1.5.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): pbr>=0.11 in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.3.0)
Requirement already satisfied (use --upgrade to upgrade): funcsigs>=1; python_version < "3.3" in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.3.0)
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
nvidia@tegra-ubuntu:~$
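
pip reports every requirement as already satisfied, so the wheel appears to be installed. To double-check what the Python interpreter itself sees, a small sketch like the one below might help. `module_version` is a hypothetical helper name used only for illustration; it is not part of pip or TensorFlow.

```python
import importlib


def module_version(name):
    """Return the module's __version__, "unknown" if it has none, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The package is not importable from this interpreter.
        return None
    return getattr(module, "__version__", "unknown")


# Example (run with the same `python` that is used for gan_mnist.py):
#   print(module_version("tensorflow"))
```

If this returns None, the interpreter being used is probably not the one pip installed the wheel into.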

Hi,

There is no error message in comment #23.
We can run gan_mnist.py without error on the TX2 with the wheel shared in comment #20.

nvidia@tegra-ubuntu:/media/nvidia/NVIDIA/improved_wgan_training$ python gan_mnist.py 
Uppercase local vars:
	BATCH_SIZE: 50
	CRITIC_ITERS: 5
	DIM: 64
	ITERS: 200000
	LAMBDA: 10
	MODE: wgan-gp
	OUTPUT_DIM: 784
Couldn't find MNIST dataset in /tmp, downloading...
2017-10-23 06:42:45.539490: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:879] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2017-10-23 06:42:45.539658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: NVIDIA Tegra X2
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 4.66GiB
2017-10-23 06:42:45.539723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-10-23 06:42:45.539758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-10-23 06:42:45.539790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0)
2017-10-23 06:42:45.539832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:657] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0.  Your kernel may not have been built with NUMA support.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/util/tf_should_use.py:175: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
iter 0	train disc cost	-1.38231694698	time	3.83967995644
iter 1	train disc cost	-5.65252685547	time	0.869536876678
iter 2	train disc cost	-9.17012786865	time	0.764479160309
iter 3	train disc cost	-10.7923326492	time	0.764439105988
iter 4	train disc cost	-12.5527458191	time	0.767117977142
iter 99	train disc cost	-10.1466503143	dev disc cost	-7.58510112762	time	0.766215409731
iter 199	train disc cost	-5.83883905411	dev disc cost	-4.839802742	time	0.765890324116
iter 299	train disc cost	-4.21712446213	dev disc cost	-3.83672761917	time	0.767970068455
iter 399	train disc cost	-3.5645532608	dev disc cost	-3.45508217812	time	0.766714634895
...

Thanks.

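You can try limiting how much GPU memory TensorFlow claims:
import tensorflow as tf
# Start at 0.2 and keep lowering it (e.g. 0.15, 0.1, ...) until the run succeeds.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))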
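
To make it easier to rerun with a lower value without editing the script each time, the fraction could be read from the command line. This is only a sketch: the `--gpu-fraction` flag and the `parse_fraction` and `make_session` helpers are hypothetical names, not something gan_mnist.py or TensorFlow provides, and the TensorFlow calls assume the 1.x API used above.

```python
import argparse


def parse_fraction(argv=None):
    """Parse the hypothetical --gpu-fraction flag and check it lies in (0, 1]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu-fraction", type=float, default=0.2)
    args = parser.parse_args(argv)
    if not 0.0 < args.gpu_fraction <= 1.0:
        parser.error("--gpu-fraction must be in (0, 1]")
    return args.gpu_fraction


def make_session(fraction):
    """Create a TF 1.x session limited to the given share of GPU memory."""
    # Imported here so the argument parsing can be tried without TensorFlow.
    import tensorflow as tf
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
    return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))


# Example use inside a training script:
#   sess = make_session(parse_fraction())
```

With something like this in place, each attempt is a fresh process started with a smaller value, e.g. 0.2, then 0.15, then 0.1.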