Install Pytorch on Jetson TK1

Dear Sir

Can I install PyTorch on the Jetson TK1? Does the Jetson’s GPU support it?

Thx

Hi,

PyTorch works well on the TX1, but we don’t have experience with the TK1.
You can give it a try. This is a good sample:

Hi,

Did you manage to get PyTorch working on the TK1? If so, some help would be much appreciated!

Thanks!

I’m also wondering whether either of you was able to get PyTorch running on the TK1.

I was able to build an older version of PyTorch (0.3.1) without CUDA support. Newer versions of PyTorch rely on libnvrtc, which ships with CUDA 7 and later (the TK1 only supports up to CUDA 6.5). Attempting to build with CUDA support resulted in the following error:

-- Build files have been written to: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/build/THNN
Scanning dependencies of target THNN
[ 50%] Building C object CMakeFiles/THNN.dir/init.c.o
[100%] Linking C shared library libTHNN.so
[100%] Built target THNN
Install the project...
-- Install configuration: "Release"
-- Installing: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/lib/libTHNN.so.1
-- Installing: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/lib/libTHNN.so
-- Set runtime path of "/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/lib/libTHNN.so.1" to ""
-- Installing: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THNN/THNN.h
-- Installing: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THNN/generic/THNN.h
-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Removing -DNDEBUG from compile flags
-- TH_LIBRARIES: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/lib/libTH.so.1
-- Found CUDA: /usr/local/cuda (found suitable version "6.5", minimum required is "5.5") 
-- MAGMA not found. Compiling without MAGMA support
-- got cuda version 6.5
-- Could not find CUDA with FP16 support, compiling without torch.CudaHalfTensor
-- CUDA_NVCC_FLAGS:  -DTH_INDEX_BASE=0 -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include   -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/TH -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THC   -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THS -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THCS   -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THNN -I/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/tmp_install/include/THCUNN -DOMPI_SKIP_MPICXX=1;-gencode;arch=compute_35,code=sm_35
-- THC_SO_VERSION: 1
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    ATEN_LIBRARIES
    CMAKE_DEBUG_POSTFIX
    CMAKE_INSTALL_LIBDIR
    NCCL_EXTERNAL
    NO_CUDA
    THCS_LIBRARIES
    THCUNN_LIBRARIES
    THCUNN_SO_VERSION
    THC_LIBRARIES
    THD_SO_VERSION
    THNN_LIBRARIES
    THNN_SO_VERSION
    THS_LIBRARIES
    TH_SO_VERSION
    cwrap_files
    nanopb_BUILD_GENERATOR

-- Build files have been written to: /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/build/THC
[  1%] Building NVCC (Device) object CMakeFiles/THC.dir/THC_generated_THCReduceApplyUtils.cu.o
[  2%] Building NVCC (Device) object CMakeFiles/THC.dir/THC_generated_THCBlas.cu.o
[  3%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMaskedDouble.cu.o
[  4%] Building NVCC (Device) object CMakeFiles/THC.dir/THC_generated_THCSleep.cu.o
/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/THC/THCBlas.cu(495): error: identifier "cublasSgetrsBatched" is undefined

/tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/THC/THCBlas.cu(512): error: identifier "cublasDgetrsBatched" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_000074ab_00000000-6_THCBlas.cpp1.ii".
CMake Error at THC_generated_THCBlas.cu.o.cmake:267 (message):
  Error generating file
  /tmp/tmp.i6b4rGpeXt/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir//./THC_generated_THCBlas.cu.o

make[2]: *** [CMakeFiles/THC.dir/THC_generated_THCBlas.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....

Unfortunately, it seems the functions referenced in the error (cublasDgetrsBatched and cublasSgetrsBatched) weren’t added until CUDA 7, based on the CUDA 7 release notes. I also tried installing PyTorch using the script in the repo posted by AastaLLL but was met with the same result. So, I performed the following steps to see if it would work without CUDA.

Important: I ran into compiler errors presumably due to running out of memory, so I added 4GB swap space with a USB flash drive to actually complete the build (though more may be better). These steps also assume that you’ve installed gfortran, CMake, and the various development headers/libraries for Python, libblas, etc., and that you’re using Python 3.
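As a rough sketch of the swap setup (assuming the flash drive is formatted ext4 and mounted at /media/usb, which is a hypothetical path; swap files don’t work on FAT filesystems):

```shell
# Create a 4 GB swap file on the USB drive (path is an example).
# fallocate may be unsupported on some filesystems; the dd fallback is:
#   sudo dd if=/dev/zero of=/media/usb/swapfile bs=1M count=4096
sudo fallocate -l 4G /media/usb/swapfile
sudo chmod 600 /media/usb/swapfile
sudo mkswap /media/usb/swapfile
sudo swapon /media/usb/swapfile
free -m   # verify the extra swap shows up
```

Note that swap on a flash drive is slow; it only needs to get you through the peak memory use of the compile.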

First, clone the repo and check out the appropriate commit:

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git checkout v0.3.1

Before proceeding, apply the fix from this commit (pull #4473) (referenced in issue #4472) by adding the following include to pytorch/tools/autograd/templates/VariableType.cpp:

#include <cstddef>

and replacing line 40:

template<unsigned long N>

with this:

template<std::size_t N>
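For illustration only, the two edits amount to a simple textual patch; this hypothetical helper (not part of the PyTorch tree) shows the transformation on a stand-in for the relevant line:

```python
def patch_variable_type(source: str) -> str:
    """Apply the v0.3.1 build fix from pull #4473: prepend the
    <cstddef> include and swap the non-standard 'unsigned long'
    template parameter for 'std::size_t'."""
    if "#include <cstddef>" not in source:
        source = "#include <cstddef>\n" + source
    return source.replace("template<unsigned long N>",
                          "template<std::size_t N>")

# Stand-in for line 40 of VariableType.cpp:
print(patch_variable_type("template<unsigned long N>"))
```

In practice you can just make the two edits by hand in an editor; the helper only spells out exactly what changes.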

Now, we can proceed with building the wheel as follows. Note that we disable Intel MKL-DNN because it’s incompatible with 32-bit architectures, and we set NO_DISTRIBUTED=1 and USE_NCCL=0 for similar reasons. Ensure you’ve returned to the pytorch repo root directory, then:

pip install -r requirements.txt
pip install numpy
git submodule update --init

export NO_CUDA=1
export NO_MKLDNN=1
export USE_NCCL=0
export NO_DISTRIBUTED=1

python setup.py bdist_wheel

This should successfully produce a PyTorch wheel in the dist/ directory. If anybody can offer suggestions for building with CUDA, that would be great.
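Assuming the build completes, the wheel can be installed and smoke-tested like this (the exact filename depends on your Python version and platform tag):

```shell
# Install the freshly built wheel and confirm it imports.
pip install dist/torch-0.3.1*.whl
python -c "import torch; print(torch.__version__)"
```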

In my previous post, I mentioned that the cuBLAS functions cublasDgetrsBatched and cublasSgetrsBatched first appear in CUDA 7. The PyTorch file that references them is pytorch/torch/lib/THC/THCBlas.cu (the references are gone in newer versions of PyTorch). The last version that doesn’t reference these functions is v0.1.10, so I tried to build v0.1.10. The build proceeded further than before but eventually failed after building THCUNN:

[100%] Built target THCUNN
Install the project...
-- Install configuration: "Release"
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so.1
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so
-- Set runtime path of "/tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/lib/libTHCUNN.so.1" to ""
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/include/THCUNN/THCUNN.h
-- Installing: /tmp/tmp.CmqAOpt6SC/pytorch/torch/lib/tmp_install/include/THCUNN/generic/THCUNN.h
-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at /usr/local/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
  Could NOT find CUDA: Found unsuitable version "6.5", but required is at
  least "7.0" (found /usr/local/cuda)
Call Stack (most recent call first):
  /usr/local/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:386 (_FPHSA_FAILURE_MESSAGE)
  /tmp/tmp.CmqAOpt6SC/pytorch/cmake/FindCUDA/FindCUDA.cmake:1013 (find_package_handle_standard_args)
  CMakeLists.txt:5 (FIND_PACKAGE)

So, according to the CMake file at pytorch/cmake/FindCUDA/FindCUDA.cmake, the minimum required CUDA version is 7.0. In fact, every version of this file going back to pytorch v0.1.1 (the first tagged release) mentions CUDA 7.0, which isn’t compatible with the TK1.

Unless there’s some other workaround or something I’m missing, it doesn’t seem possible to install PyTorch with CUDA support on the TK1.

I managed to build the latest version of PyTorch (1.3.0, specifically this commit) without CUDA as follows.

Requirements:

  • gcc/g++ >= 4.9.2
  • Python >= 3.5 (or Python 2.7) and dev headers
  • ~4GB (?) swap space

export USE_NCCL=0
export USE_DISTRIBUTED=0
export USE_CUDA=0
export NO_MKLDNN=1

# pip install numpy may be unsuccessful depending on gcc version;
# if unsuccessful, run `CFLAGS="--std=c99" python -m pip install numpy`
# per instructions at https://github.com/numpy/numpy/issues/14147
pip install numpy

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git submodule sync
git submodule update --init --recursive
pip install -r requirements.txt

python setup.py bdist_wheel
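As before, the resulting wheel lands in dist/. After installing it, a quick check should confirm the CPU-only build (the wheel filename here is illustrative):

```shell
python -m pip install dist/torch-*.whl
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

torch.cuda.is_available() should report False, since this build was made with USE_CUDA=0.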