"The supreme art of GPU war is to install without errors; a true warrior's path is paved with drivers and dependencies."
~ Moon Tzu
CUDA (Compute Unified Device Architecture) and cuDNN (CUDA Deep Neural Network library) are integral tools in the arsenal of any developer diving into GPU-accelerated computing, especially in the realm of deep learning and parallel computing. Developed by NVIDIA, CUDA provides a parallel computing platform and application programming interface (API) model that enables dramatic increases in computing performance by utilizing the power of NVIDIA GPUs.
What is CUDA?
CUDA enables developers to accelerate applications by offloading compute-intensive tasks to the GPU, rather than relying solely on the CPU. This parallel computing platform includes libraries, tools, and APIs that allow for high-performance computing (HPC) and general-purpose GPU computing. It’s widely used in fields like scientific computing, finance, oil and gas exploration, and more prominently, in training and deploying deep neural networks.
What is cuDNN?
cuDNN, on the other hand, is a GPU-accelerated library for deep neural networks. It provides highly optimized implementations of standard routines used in deep learning, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and more. cuDNN is essential for maximizing the performance of deep learning frameworks like TensorFlow, PyTorch, and others that leverage CUDA for GPU computing.
Where can CUDA and cuDNN be used? CUDA and cuDNN find applications across various domains:
- Deep Learning: Speeds up training and inference of deep neural networks.
- Scientific Computing: Accelerates simulations, data analysis, and complex computations.
- Computer Vision: Enhances performance in image and video processing tasks.
- Finance: Enables faster risk analysis, portfolio optimization, and algorithmic trading.
By harnessing CUDA and cuDNN, developers can leverage the parallel processing capabilities of NVIDIA GPUs to significantly accelerate computations and unlock new possibilities in performance-intensive applications.
Follow the guide below to install CUDA and cuDNN on your own computer and create your first program!
Step 1: Verify Your GPU is CUDA Enabled
First things first, let’s see if your hardware is worthy of the CUDA treatment. Run this command and hope for the best:
lspci | grep -i nvidia
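If you'd like a slightly friendlier version of that check, here is a small sketch wrapping the same lspci call (the messages are mine, not NVIDIA's):

```shell
# Wraps the lspci check above; lspci comes from the pciutils package.
# Harmless to run on a machine without an NVIDIA GPU.
if lspci 2>/dev/null | grep -qi nvidia; then
    echo "CUDA-capable NVIDIA device found"
else
    echo "No NVIDIA device listed - double-check your hardware"
fi
```

nvidia-smi is another quick check, but it only works once the driver is installed.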
Step 2: Resurrect NVIDIA Drivers and Exorcise Previous CUDA Installations
Time to cleanse your machine of any previous CUDA sins:
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*
Refer to this guide to install those sneaky drivers.
Step 3: System Update
Keeping your system up-to-date is always a good idea. Plus, it’s a great way to feel productive without actually doing much:
sudo apt-get update
sudo apt-get upgrade
Step 4: Install Necessary Packages
Time to grab some essential tools.
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
Step 5: Reboot the System
The moment of truth. Reboot your system and hope it comes back with a newfound appreciation for GPU computing :') If not, you know the drill: go back to Step 2 and cleanse again.
Step 6: Set Up the CUDA Repository
We’re getting closer. Time to set up the CUDA repository and prepare for the meaty part:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda-repo-ubuntu2004-12-5-local_12.5.0-555.42.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-12-5-local_12.5.0-555.42.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-12-5-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5
Note that I haven't installed CUDA 11 because my driver supports CUDA 12. TensorFlow might break with CUDA 12, but PyTorch (we don't do TensorFlow here!) will work perfectly. As a workaround, you can use a Docker container that bundles cuDNN 8 with CUDA 11.8, even if your main system has CUDA 12 and cuDNN 9 installed. I have tested the above configurations as of writing this post.
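For reference, that Docker workaround can be sketched like this. The image tag nvcr.io/nvidia/pytorch:22.12-py3 is an example of an NGC image that shipped CUDA 11.8 with cuDNN 8 — check NVIDIA's NGC catalog for current tags — and it requires Docker plus the NVIDIA Container Toolkit on the host:

```shell
# Run PyTorch inside a container that bundles CUDA 11.8 + cuDNN 8,
# independent of the CUDA version on the host. The image tag is an
# assumption - verify it on NGC before relying on it.
if command -v docker >/dev/null 2>&1; then
    docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.12-py3 \
        python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
else
    echo "docker not found - install Docker and the NVIDIA Container Toolkit first"
fi
```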
Step 7: Set Up CUDA Paths
Because what’s the point of installing something if your system doesn’t know where to find it?
echo 'export PATH=/usr/local/cuda-12.5/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.5/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
The above paths may differ slightly for you, depending on the CUDA version you download. Usually, changing the version number (e.g., 12.5 to 12.1) works.
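A quick way to confirm the paths took effect (open a new terminal or source ~/.bashrc first): if nvcc resolves, the toolkit is on your PATH. The guard below just avoids an ugly error on machines where it isn't:

```shell
# nvcc should resolve from the PATH exported above; for this particular
# install, the version banner should mention release 12.5.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not on PATH - re-check the exports in ~/.bashrc"
fi
```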
Step 8: Install cuDNN
As of writing this blog, cuDNN 9.2.0 is the latest. Follow the instructions below:
wget https://developer.download.nvidia.com/compute/cudnn/9.2.0/local_installers/cudnn-local-repo-ubuntu2004-9.2.0_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2004-9.2.0_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2004-9.2.0/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn
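You can confirm which cuDNN version actually landed by reading its version header. The search paths below are common locations, not guarantees — packages have moved this header around between releases:

```shell
# cuDNN defines CUDNN_MAJOR/MINOR/PATCHLEVEL in cudnn_version.h; the
# header's location varies by package, so search the usual prefixes.
header=$(find /usr/include /usr/local/cuda*/include \
         -name 'cudnn_version*.h' 2>/dev/null | head -n 1)
if [ -n "$header" ]; then
    grep -m1 -A2 '#define CUDNN_MAJOR' "$header"
else
    echo "cudnn_version.h not found - is the cuDNN package installed?"
fi
```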
Step 9: Verify cuDNN Installation with Sample Code
Alright, you’ve made it this far! Now let’s make sure everything is working correctly by running some sample code. NVIDIA provides some handy sample programs to verify your cuDNN installation.
1. Navigate to the cuDNN samples directory: the cuDNN samples are usually installed under /usr/src/cudnn_samples_v9/. Open a terminal and move into the mnistCUDNN sample:
cd /usr/src/cudnn_samples_v9/mnistCUDNN
2. Compile the sample code: use the make command to build the sample, which demonstrates using cuDNN to classify handwritten digits from the MNIST dataset.
sudo make
3. Run the sample code: After compiling, you can run the sample program to verify that cuDNN is correctly installed and functioning. Execute the following command:
./mnistCUDNN
If everything is set up correctly, the program will classify a handful of sample digit images and finish with a "Test passed!" message.
Step 10: Troubleshooting Common Issues
If you encounter any issues during the verification step, here are a few troubleshooting tips:
- Check environment variables: ensure that your PATH and LD_LIBRARY_PATH variables are correctly set.
- Check CUDA installation: make sure that CUDA is correctly installed and that your system recognizes the GPU.
- Recompile with correct paths: if the sample code fails to compile, double-check that the paths in your Makefile point to the correct locations for the CUDA and cuDNN libraries.
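The first two tips can be checked mechanically. This snippet (my own quick checks, assuming the 12.5 layout from the earlier steps) prints what is — or isn't — on your paths:

```shell
# Print any CUDA entries on the two path variables, with a fallback
# message when nothing matches; paths assume the CUDA 12.5 install above.
echo "$PATH"            | tr ':' '\n' | grep cuda || echo "no CUDA dir on PATH"
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep cuda || echo "no CUDA dir on LD_LIBRARY_PATH"
ls /usr/local/cuda-12.5/bin/nvcc 2>/dev/null || echo "nvcc missing from /usr/local/cuda-12.5/bin"
```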
Step 11: Celebrate Your Success!
Congratulations! You’ve successfully installed CUDA and cuDNN, and verified their installation by running sample code. Now you’re ready to dive into deep learning and take advantage of GPU acceleration in your projects.
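And since the intro promised a first program, here is a minimal one: a CUDA vector-add kernel. The snippet writes vector_add.cu (the filename is arbitrary), then compiles and runs it when nvcc is available. It's a sketch of the classic pattern — allocate on the device, copy in, launch the kernel, copy out:

```shell
cat > vector_add.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 256;
    float ha[n], hb[n], hc[n];
    for (int i = 0; i < n; ++i) { ha[i] = i; hb[i] = 2.0f * i; }

    float *da, *db, *dc;
    cudaMalloc(&da, n * sizeof(float));
    cudaMalloc(&db, n * sizeof(float));
    cudaMalloc(&dc, n * sizeof(float));
    cudaMemcpy(da, ha, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, n * sizeof(float), cudaMemcpyHostToDevice);

    add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, n * sizeof(float), cudaMemcpyDeviceToHost);

    // c[10] = 10 + 20 = 30
    printf("c[10] = %.1f (expected 30.0)\n", hc[10]);
    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
EOF

# Compile and run only if the toolkit is actually on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc vector_add.cu -o vector_add && ./vector_add
else
    echo "nvcc not found - revisit the CUDA paths step"
fi
```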
If you have any questions or run into any issues, feel free to drop a comment below. Happy coding!
At sudo make, it didn't work 🙁
ubuntu@rubickai-gpu-1:/usr/src/cudnn_samples_v9/mnistCUDNN$ sudo make
cat: /usr/local/cuda/include/cuda.h: No such file or directory
CUDA_VERSION is
/bin/sh: 1: test: -ge: unexpected operator
Linking agains cublasLt = false
expr: syntax error: unexpected argument ‘11000’
expr: syntax error: unexpected argument ‘11010’
expr: syntax error: unexpected argument ‘11042’
expr: syntax error: unexpected argument ‘11080’
expr: syntax error: unexpected argument ‘12000’
CUDA VERSION:
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70
/bin/sh: 1: /usr/local/cuda/bin/nvcc: not found
>>> WARNING – FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -ccbin g++ -m64 -std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o fp16_dev.o -c fp16_dev.cu
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -std=c++11 -o fp16_emu.o -c fp16_emu.cpp
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -std=c++11 -o mnistCUDNN.o -c mnistCUDNN.cpp
[@] /usr/local/cuda/bin/nvcc -ccbin g++ -m64 -std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
Hello fellow Geek,
I see that your machine has a FreeImage dependency error. To fix this, just use:
"sudo apt-get install libfreeimage3 libfreeimage-dev"
Your log also shows that cuda.h and /usr/local/cuda/bin/nvcc were not found, so make sure the CUDA toolkit is actually installed and your CUDA paths are set before re-running make.
If you get stuck at any point, visit us at "https://t.me/+OL2wD2Ep_c0yZjc1"