Goal:
This article records each step for installing the CUDA Toolkit and the NVIDIA driver on Ubuntu, following the CUDA installation guide.
Env:
Ubuntu 18.04
CUDA 11.0.3
NVIDIA Driver 450.51.06
Quadro RTX 6000
Solution:
1. Verify that the GPU is CUDA-capable.
# update-pciids
Downloaded daily snapshot dated 2021-03-06 03:15:02
# lspci -vnn | grep NVIDIA
17:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102GL [Quadro RTX 6000/8000] [10de:1e30] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Quadro RTX 6000 [10de:12ba]
Note: update-pciids fetches the current version of the pci.ids file from the primary distribution site and installs it.
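The lspci check above can also be scripted by matching the PCI vendor ID 10de, which is the ID shown in brackets in the output. Below is a minimal sketch; the helper name has_nvidia_gpu is hypothetical, and it only confirms that an NVIDIA device is present, not that the device supports CUDA.

```shell
# Hypothetical helper: reads an "lspci -nn" listing on stdin and succeeds
# if any device carries the NVIDIA PCI vendor ID (10de).
has_nvidia_gpu() {
    grep -qiE '\[10de:[0-9a-f]{4}\]'
}

# Example use on a real machine:
#   lspci -nn | has_nvidia_gpu && echo "NVIDIA device found"
```

Reading from stdin keeps the helper easy to try against saved lspci output from another machine.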
2. Verify OS version
uname -m && cat /etc/*release
Note: Refer to the OS support matrix, "Table 1. Native Linux Distribution Support in CUDA 11.2", in the CUDA Installation Guide.
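If you only need the distribution ID and version to compare against the support matrix, /etc/os-release (one of the files matched by /etc/*release) can be read directly. A small sketch, assuming the hypothetical helper name os_id_version:

```shell
# Hypothetical helper: prints "<ID> <VERSION_ID>" from an os-release file.
# The file is sourced in a subshell so its variables do not leak into
# the calling shell.
os_id_version() {
    ( . "${1:-/etc/os-release}" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Example use:
#   os_id_version        # on the machine in this article: ubuntu 18.04
```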
3. Verify gcc is installed.
gcc --version
Note: gcc is required for development with the CUDA Toolkit, but not for running CUDA applications.
4. Verify the System has the Correct Kernel Headers and Development Packages Installed
First, get the kernel version:
# uname -r
5.0.0-23-generic
Then install the linux-headers package for that kernel version:
sudo apt-get install linux-headers-$(uname -r)
Confirm:
# apt list --installed|grep linux-headers
linux-headers-5.0.0-23/bionic-updates,bionic-updates,bionic-security,bionic-security,now 5.0.0-23.24~18.04.1 all [installed,automatic]
linux-headers-5.0.0-23-generic/bionic-updates,bionic-security,now 5.0.0-23.24~18.04.1 amd64 [installed]
linux-headers-generic-hwe-18.04/now 5.0.0.23.80 amd64 [installed,upgradable to: 5.4.0.66.74~18.04.61]
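Another quick way to confirm the headers for the running kernel is to look for their directory under /usr/src, which is where the Ubuntu linux-headers packages install them. A minimal sketch, with the hypothetical helper name headers_present:

```shell
# Hypothetical helper: succeeds if a headers directory exists for the
# given kernel release (default: the running kernel).
# $1 = kernel release, $2 = source root (default /usr/src)
headers_present() {
    local release="${1:-$(uname -r)}"
    local src_root="${2:-/usr/src}"
    [ -d "$src_root/linux-headers-$release" ]
}

# Example use:
#   headers_present || echo "headers missing for $(uname -r)"
```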
5. Download and install CUDA and Driver
The download link for the latest version:
https://developer.nvidia.com/cuda-downloads
Archived versions are listed here:
https://developer.nvidia.com/cuda-toolkit-archive
Here I chose CUDA 11.0.3 for Ubuntu 18.04 with the Debian installer method, so I used the link below to download:
https://developer.nvidia.com/cuda-11.0-update1-download-archive
Choose "Linux"->"X86_64"->"Ubuntu"->"18.04"->"deb(local)".
5.1 Download the APT preferences fragment file, which controls which package versions are selected for installation.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
The content of APT preferences fragment file is:
# cat /etc/apt/preferences.d/cuda-repository-pin-600
Package: nsight-compute
Pin: origin *ubuntu.com*
Pin-Priority: -1
Package: nsight-systems
Pin: origin *ubuntu.com*
Pin-Priority: -1
Package: *
Pin: release l=NVIDIA CUDA
Pin-Priority: 600
Basically it means: do not install nsight-compute and nsight-systems from "*ubuntu.com*" origins (priority -1 blocks installation), and give a high priority (600) to packages whose release label is "NVIDIA CUDA".
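To read a preferences file at a glance, its stanzas can be flattened into one line per pin. The sketch below uses awk; the helper name list_pins is hypothetical, and it assumes each stanza uses the Package / Pin / Pin-Priority fields shown above.

```shell
# Hypothetical helper: prints "package<TAB>pin<TAB>priority" for each
# stanza of an APT preferences file.
list_pins() {
    awk -F': ' '
        $1 == "Package"      { pkg = $2 }
        $1 == "Pin"          { pin = $2 }
        $1 == "Pin-Priority" { printf "%s\t%s\t%s\n", pkg, pin, $2 }
    ' "$@"
}

# Example use:
#   list_pins /etc/apt/preferences.d/cuda-repository-pin-600
```

Once the repository is configured, "apt-cache policy cuda" shows the priority APT actually applies to a package.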
5.2 Download and install CUDA repository meta-data
wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb
sudo dpkg --install cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb
After that, confirm that cuda-repo is installed:
# apt list|grep cuda-repo
cuda-repo-ubuntu1804-11-0-local/now 11.0.3-450.51.06-1 amd64 [installed,local]
5.3 Install CUDA public GPG key
The exact command is printed in the output of the dpkg command above.
sudo apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub
5.4 Update the APT repository cache
sudo apt-get update
5.5 Install CUDA
sudo apt-get install cuda
Confirm:
$ apt list |grep -i cuda |grep 11.0.3
cuda/unknown,now 11.0.3-1 amd64 [installed]
cuda-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-command-line-tools-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-compiler-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-libraries-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-libraries-dev-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-minimal-build-11-0/unknown 11.0.3-1 amd64
cuda-nsight-compute-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-nsight-systems-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-repo-ubuntu1804-11-0-local/now 11.0.3-450.51.06-1 amd64 [installed,local]
cuda-runtime-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-toolkit-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-tools-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
cuda-visual-tools-11-0/unknown,now 11.0.3-1 amd64 [installed,automatic]
6. Post-installation Actions
Add the environment variables below to .bashrc:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Note: /usr/local/cuda is a symbolic link. We use it here instead of a versioned path so that the same settings keep working if multiple versions of CUDA are installed in the future.
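The ${PATH:+:${PATH}} form in the export lines appends a colon plus the old value only when the variable is already set and non-empty. This avoids a stray empty entry, which both the shell and the dynamic linker treat as the current directory. The sketch below isolates that expansion; the helper name prepend_dir is hypothetical.

```shell
# Hypothetical helper showing the ${var:+...} expansion on its own.
# $1 = directory to put first, $2 = current value (may be empty)
prepend_dir() {
    local dir="$1" current="$2"
    printf '%s\n' "${dir}${current:+:${current}}"
}

# Example use:
#   prepend_dir /usr/local/cuda/lib64 ""           # no trailing colon
#   prepend_dir /usr/local/cuda/bin "/usr/bin:/bin"
```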
7. Recommended Actions
7.1 Start NVIDIA Persistence Daemon as root user
/usr/bin/nvidia-persistenced --verbose
7.2 Install Writable Samples
cuda-install-samples-11.0.sh ~/cudasample
Confirm:
$ ls cudasample/NVIDIA_CUDA-11.0_Samples/
0_Simple 1_Utilities 2_Graphics 3_Imaging 4_Finance 5_Simulations 6_Advanced 7_CUDALibraries common EULA.txt Makefile
8. Verify the installation
8.1 Verify NVIDIA Driver Version
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.51.06 Sun Jul 19 20:02:54 UTC 2020
GCC version: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
8.2 Verify CUDA Toolkit version
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
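If a script needs the toolkit release number, it can be pulled out of the "release" line of the nvcc output above. A minimal sketch, assuming the output keeps the "release X.Y," layout shown here; the helper name nvcc_release is hypothetical.

```shell
# Hypothetical helper: reads "nvcc -V" output on stdin and prints the
# release number from the "Cuda compilation tools, release X.Y," line.
nvcc_release() {
    sed -n 's/^.*release \([0-9][0-9.]*\),.*$/\1/p'
}

# Example use:
#   nvcc -V | nvcc_release
```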
8.3 Compile the Examples
cd ~/cudasample/NVIDIA_CUDA-11.0_Samples
make
Note: This may take more than 10 minutes, so grab a coffee.
The resulting binaries are under the "./bin" directory:
$ ls -altr ./bin/x86_64/linux/release/
total 1151944
drwxrwxr-x 3 xxxx xxxx 4096 Mar 6 14:32 ..
-rwxrwxr-x 1 xxxx xxxx 702112 Mar 6 14:32 inlinePTX
-rwxrwxr-x 1 xxxx xxxx 739624 Mar 6 14:33 immaTensorCoreGemm
...
8.4 Run "deviceQuery" to make sure CUDA-compatible GPUs can be found.
$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 2 CUDA Capable device(s)
Device 0: "Quadro RTX 6000"
CUDA Driver Version / Runtime Version 11.0 / 11.0
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 11.0, NumDevs = 2
Result = PASS
...
8.5 Run "bandwidthTest" to ensure that the system and the CUDA-capable device are able to communicate correctly.
$ ./bin/x86_64/linux/release/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: Quadro RTX 6000
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 12.2
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 13.2
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 539.5
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
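Both samples end with a "Result = PASS" line on success, so the check can be automated by looking for that line. A small sketch, with the hypothetical helper name sample_passed:

```shell
# Hypothetical helper: reads a sample's output on stdin and succeeds
# only if it contains a line starting with "Result = PASS".
sample_passed() {
    grep -q '^Result = PASS'
}

# Example use, run from the samples directory:
#   ./bin/x86_64/linux/release/deviceQuery   | sample_passed && echo "deviceQuery OK"
#   ./bin/x86_64/linux/release/bandwidthTest | sample_passed && echo "bandwidthTest OK"
```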
9. Optional Actions
9.1 Install third-party libraries for the sample code above
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
9.2 Install cuda-gdb-src
sudo apt install cuda-gdb-src-11-0
The source tarball is then placed under /usr/local/cuda-11.0/extras/:
$ ls -altr /usr/local/cuda-11.0/extras/cuda-gdb-11.0.221.src.tar.gz
-rw-r--r-- 1 root root 38263962 Jul 23 2020 /usr/local/cuda-11.0/extras/cuda-gdb-11.0.221.src.tar.gz
9.3 Manage the active version of CUDA
sudo update-alternatives --install /usr/local/cuda cuda /usr/local/cuda-11.0 50
sudo update-alternatives --display cuda
sudo update-alternatives --config cuda
After that:
$ ls -altr /usr/local/cuda
lrwxrwxrwx 1 root root 22 Mar 6 15:32 /usr/local/cuda -> /etc/alternatives/cuda
$ ls -altr /etc/alternatives/cuda
lrwxrwxrwx 1 root root 20 Mar 6 15:32 /etc/alternatives/cuda -> /usr/local/cuda-11.0
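Instead of following the two links by hand, readlink -f resolves the whole chain in one step. A minimal sketch, assuming GNU readlink (as shipped with Ubuntu); the helper name active_cuda is hypothetical.

```shell
# Hypothetical helper: prints the directory name that the CUDA symlink
# finally resolves to, following every intermediate link.
# $1 = symlink to resolve (default /usr/local/cuda)
active_cuda() {
    basename "$(readlink -f "${1:-/usr/local/cuda}")"
}

# Example use:
#   active_cuda        # with the links shown above: cuda-11.0
```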
10. Remove CUDA Toolkit and driver
# To remove CUDA Toolkit:
sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*"
# To remove NVIDIA Drivers:
sudo apt-get --purge remove "*nvidia*"
# To clean up the uninstall:
sudo apt-get autoremove
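Because the removal patterns are broad, it can help to preview which packages they would touch before purging. The sketch below filters a list of package names with a regular expression that only approximates the globs used above; the helper name cuda_related_packages is hypothetical.

```shell
# Hypothetical helper: reads package names on stdin and prints those that
# roughly match the removal patterns above (an approximation of the
# globs, not the exact matching that apt-get performs).
cuda_related_packages() {
    grep -E '(cublas|cufft|curand|cusolver|cusparse|npp|nvjpeg|nvidia)|^(cuda|nsight)'
}

# Example use:
#   dpkg-query -W -f='${Package}\n' | cuda_related_packages
```

apt-get also accepts -s (--simulate), which prints what a remove command would do without changing the system.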
References:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html