Installing GPU Driver, CUDA, cuDNN, and Compiling Yolov3 on Ubuntu 18.04

Recently, I came across Yolov3, an open-source object detection library based on deep learning. However, when I ran it on my notebook with integrated graphics, it managed only about 0.1 fps. As a beginner, I had only just learned that GPU-accelerated deep learning requires an NVIDIA graphics card and a CUDA installation.

Luckily, I remembered that I had a desktop computer with a graphics card that I could use. (English version translated by GPT-3.5.)

Preparation

  1. An Nvidia graphics card that supports CUDA and cuDNN; GTX 900-series, GTX 10-series, and RTX 20-series cards are generally supported.
  2. Ubuntu 18.04 Desktop version
  3. CUDA + cuDNN installation guide: Installing GPU + CUDA + cuDNN on Ubuntu 18.04 (Proven and Practical) (note: this guide is written in Chinese)
  4. OpenCV installation guide: Build OpenCV 2.4.9 & Caffe with CUDA 9.0 on Ubuntu 16.04
  5. Yolo, an object detection framework based on deep learning, using version 3: GitHub - AlexeyAB/darknet

Explanation

  1. A fresh system installation with apt already updated is assumed (not covered in this article).
  2. It is recommended to switch apt to a mirror close to you; for users in China, Aliyun's mirror is a common choice.
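
As a rough sketch of the mirror switch mentioned above, the helper below rewrites the Ubuntu archive hosts in a sources.list-style file. The function name switch_mirror and the default host mirrors.aliyun.com are assumptions made for illustration, and the script relies on GNU sed as shipped with Ubuntu. It keeps a .bak copy so the change is easy to undo.

```shell
# Rewrite Ubuntu archive/security hosts in FILE to point at a mirror.
# Usage: switch_mirror FILE [MIRROR_HOST]
# switch_mirror is a hypothetical helper; the default host is an assumption.
switch_mirror() {
    file=$1
    mirror=${2:-mirrors.aliyun.com}
    # Keep a backup next to the original file
    cp "$file" "$file.bak"
    # Matches e.g. archive.ubuntu.com, us.archive.ubuntu.com, security.ubuntu.com
    sed -E -i "s#https?://([a-z]+\.)?(archive|security)\.ubuntu\.com#http://$mirror#g" "$file"
}
```

One cautious way to use it is to run it on a copy of /etc/apt/sources.list, inspect the result, copy it back with sudo, and then run sudo apt update.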

GPU Driver, CUDA, OpenCV Installation

Installing GPU Driver

I encountered some problems here. After going through many resources, most of them mentioned uninstalling the existing driver and disabling the driver. However, on a fresh installation of Ubuntu, it seems that these steps are not necessary. Ubuntu has a convenient command for installing the driver in one line.

  1. First, let Ubuntu detect the machine's graphics card and suggest drivers to install:

    ubuntu-drivers devices

    The command will return the suggested driver to install.

    Example Output:

    == /sys/devices/pci0000:00/0000:00:1f.0 ==
    modalias : pci:v00008086d000092F0sv00001350sd00003300bc05sc80i00
    vendor : Intel Corporation
    model : Kaby Lake-H GT3 [HD Graphics 630]
    driver : i915 - distro non-free recommended
    driver : i915 - kernel modules: i915

    On my machine, the suggested driver (the entry marked "recommended") was nvidia-driver-430; note that the sample output above does not include that entry. To install it, use the following command (replace nvidia-driver-430 with the driver suggested for your card):

    sudo apt install nvidia-driver-430
  2. Alternatively, let Ubuntu pick and install the driver automatically:

    sudo ubuntu-drivers autoinstall

    This installs the recommended driver for each detected device. You only need one of the two approaches: either this command or the apt command from step 1.

  3. After installation, it is recommended to restart the system:

    sudo reboot
  4. After rebooting, verify the installation by using the following command:

    nvidia-smi

    If the command displays information about the graphics card, then the installation was successful. You can also check the system settings in “Details” to see if the graphics card is recognized.

  5. Optionally, run the Unigine Heaven benchmark to exercise the graphics card. You can download it here: Heaven - UNIGINE Benchmarks

    Check the benchmark score to confirm that the driver is working properly.
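
The verification in step 4 can also be scripted. The sketch below wraps nvidia-smi in a hypothetical helper, check_nvidia_driver (the name is an assumption), which reports a clear message when the tool is missing instead of a bare "command not found". The --query-gpu and --format options are standard nvidia-smi flags.

```shell
# Print GPU name and driver version, or explain why that is not possible.
# check_nvidia_driver is a hypothetical helper name.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver missing or not on PATH"
        return 1
    fi
    # One CSV line per GPU, without the header row
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

Calling check_nvidia_driver after the reboot should print one line per GPU if the driver loaded correctly.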

Installing CUDA

CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for its own GPUs; programs built on it can offload computation to the GPU. NVIDIA describes it as follows:

CUDA is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

  1. Download the latest CUDA Toolkit package from the CUDA official website. The latest version at the moment is “CUDA Toolkit 10.1 update1”. It is recommended to install using the network option, as it will download only the required packages instead of a 2.5GB file.

  2. Click “Download” (2.9KB) and save the file in a directory on Ubuntu. Then, execute the following four commands as specified on the official website. If the network speed is slow, consider using a proxy or follow the Installing GPU +CUDA+cuDNN on Ubuntu 18.04 (Proven and Practical) guide to install using the run file.

    sudo dpkg -i cuda-repo-ubuntu1804-<version>.deb
    sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/<key>.asc
    sudo apt-get update
    sudo apt-get install cuda
  3. Wait for the installation to complete.

  4. Add the CUDA library to the environment variable. Edit the ~/.bashrc file and add the following lines:

    export PATH=$PATH:/usr/local/cuda/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64

    Save the file and apply the changes:

    source ~/.bashrc
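
If you rerun the setup, the same export lines can pile up in ~/.bashrc. The sketch below shows one way to add a line only when it is not already present; append_once is a hypothetical helper name, not part of any tool.

```shell
# Append LINE to FILE unless an identical line already exists.
# append_once is a hypothetical helper name.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x: match whole lines, -F: treat the text literally, -q: no output
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls (commented out so that loading this snippet changes nothing):
# append_once "$HOME/.bashrc" 'export PATH=$PATH:/usr/local/cuda/bin'
# append_once "$HOME/.bashrc" 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64'
```

After sourcing ~/.bashrc, running nvcc --version is a quick way to confirm that the CUDA compiler is on the PATH.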

Adding cuDNN

cuDNN (CUDA Deep Neural Network Library) is a GPU-accelerated deep neural network library specifically designed for deep learning frameworks using CUDA.

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.

  1. Download cuDNN from the official website (login required).

  2. Since CUDA 10.1 is installed, download the cuDNN 7.6.2 build for CUDA 10.1.

  3. After downloading, extract the files:

    tar -xzvf cudnn-10.1-linux-x64-v7.6.2.24.tgz

    This extracts a cuda directory containing the cuDNN header (include/) and the .so libraries (lib64/).

  4. Run the following commands from the directory where the archive was extracted, that is, the directory that contains the new cuda directory:

    sudo cp -P cuda/lib64/* /usr/local/cuda/lib64/
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
    sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

    Note: Adjust the paths if your files were extracted elsewhere or CUDA is installed in a different location.

  5. CUDA and cuDNN installation is complete.
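
To confirm which cuDNN version ended up in the CUDA directory, you can read the version macros from the copied header. The sketch below assumes the cuDNN 7.x layout, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL; cudnn_version is a hypothetical helper name.

```shell
# Print the cuDNN version found in a cudnn.h header as MAJOR.MINOR.PATCH.
# cudnn_version is a hypothetical helper; the default path assumes the copy step above.
cudnn_version() {
    header=${1:-/usr/local/cuda/include/cudnn.h}
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$header"
}
```

With the archive used in this article, calling cudnn_version without arguments should report a 7.6.x version.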

Compiling OpenCV

  1. I chose OpenCV version 2.4.13, as I encountered errors with version 2.4.9 (mentioned in the guide above).

  2. Download OpenCV 2.4.13.6 from the OpenCV official website as “OpenCV – 2.4.13.6 - Sources 94.5MB”. (Link: OpenCV - 2.4.13.6 - Sources)

  3. Follow the instructions in Build OpenCV 2.4.9 & Caffe with CUDA 9.0 on Ubuntu 16.04: download the patched NCVPixelOperations.hpp and, after extracting the source in the next step, use it to replace opencv-2.4.13.6/modules/gpu/src/nvidia/core/NCVPixelOperations.hpp.

  4. Extract the OpenCV source code:

    unzip opencv-2.4.13.6.zip

    The console will print the following:

    Archive:  opencv-2.4.13.6.zip
    inflating: opencv-2.4.13.6/modules/gpu/src/nvidia/core/NCV.hpp
    inflating: opencv-2.4.13.6/modules/gpu/src/nvidia/core/NCVPixelOperations.hpp
  5. Install the necessary libraries:

    sudo apt-get install build-essential cmake libopencv-dev libeigen3-dev libglew-dev libgtk2.0-dev
  6. Go to the opencv-2.4.13.6 directory, create a build directory, and enter it:

    cd opencv-2.4.13.6
    mkdir build
    cd build
  7. Edit and modify the following files:

    • Edit ../cmake/FindCUDA.cmake and replace the following (4 occurrences in total):

      Find:

      if((NOT DEFINED CUDA_CUDA_TARGET_FLAGS OR CUDA_CALC_ARCH_FROM_VER) AND NOT CUDA_VERSION_STRING VERSION_LESS "8.0")

      Replace with:

      if((NOT DEFINED CUDA_CUDA_TARGET_FLAGS OR CUDA_CALC_ARCH_FROM_VER) AND NOT CUDA_VERSION_STRING VERSION_LESS "10.0")
    • Edit ../cmake/OpenCVDetectCUDA.cmake and replace the following (2 occurrences in total):

      Find:

      if((NOT DEFINED CUDA_CUDA_TARGET_FLAGS OR CUDA_CALC_ARCH_FROM_VER) AND NOT CUDA_VERSION_STRING VERSION_LESS "8.0")

      Replace with:

      if((NOT DEFINED CUDA_CUDA_TARGET_FLAGS OR CUDA_CALC_ARCH_FROM_VER) AND NOT CUDA_VERSION_STRING VERSION_LESS "10.0")
    • If you have not already done so in step 3, replace ../modules/gpu/src/nvidia/core/NCVPixelOperations.hpp with the downloaded NCVPixelOperations.hpp file.

  8. Configure the build with CMake:

    cmake ..

    The console will print the following:

    -- General configuration for OpenCV 2.4.13.6 =====================================
    -- Version control: unknown
    --
    -- Extra modules:
    -- Location (extra): /home/user/opencv_contrib-2.4.13.6/modules
    -- Version control (extra): unknown
    --
    -- Platform:
    -- Timestamp: 2019-08-02T00:00:00Z
    -- Host: Linux 5.0.0-20-generic x86_64
    -- CMake: 3.13.4
    -- CMake generator: Unix Makefiles
    -- CMake build tool: /usr/bin/make
    -- Configuration: Release
    --
    -- C/C++:
    -- Built as dynamic libs?: YES
    -- C++ Compiler: /usr/bin/c++ (ver 7.4.0)
    -- C++ flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Winit-self -Wpointer-arith -Wuninitialized -Wmaybe-uninitialized -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wmissing-parameter-type -Wunreachable-code
    -- C++ flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Winit-self -Wpointer-arith -Wuninitialized -Wmaybe-uninitialized -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wmissing-parameter-type -Wunreachable-code -g
    -- C Compiler: /usr/bin/gcc
    -- C flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Winit-self -Wpointer-arith -Wuninitialized -Wmaybe-uninitialized -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wmissing-parameter-type -Wunreachable-code
    -- C flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Winit-self -Wpointer-arith -Wuninitialized -Wmaybe-uninitialized -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wmissing-parameter-type -Wunreachable-code -g
    -- Linker flags (Release):
    -- Linker flags (Debug):
    -- Precompiled headers: YES
    --
    -- OpenCV modules:
    -- To be built: core imgproc highgui flann features2d calib3d ml video legacy objdetect photo gpu ocl nonfree contrib python stitching ts videostab
    -- Disabled: -
    -- Disabled by dependency: -
    -- Unavailable: androidcamera dynamicuda java world
    --
    -- GUI:
    -- QT: NO
    -- GTK+ 2.x: YES (ver 2.24.32)
    -- GThread : YES (ver 2.56.4)
    -- GtkGlExt: YES (ver 1.2.0)
    -- OpenGL support: NO
    --
    -- Media I/O:
    -- ZLib: /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.11)
    -- JPEG: /usr/lib/x86_64-linux-gnu/libjpeg.so (ver 80)
    -- PNG: /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.6.34)
    -- TIFF: /usr/lib/x86_64-linux-gnu/libtiff.so (ver 42 / 4.0.9)
    -- JPEG 2000: /usr/lib/x86_64-linux-gnu/libjasper.so (ver 1.900.1)
    -- OpenEXR: /usr/lib/x86_64-linux-gnu/libImath.so /usr/lib/x86_64-linux-gnu/libIlmImf.so /usr/lib/x86_64-linux-gnu/libIex.so /usr/lib/x86_64-linux-gnu/libHalf.so /usr/lib/x86_64-linux-gnu/libIlmThread.so (ver 2.2.0)
    --
    -- Video I/O:
    -- DC1394 1.x: NO
    -- DC1394 2.x: YES (ver 2.2.5)
    -- FFMPEG: YES
    -- codec: YES (ver 56.60.100)
    -- format: YES (ver 56.40.101)
    -- util: YES (ver 54.34.100)
    -- swscale: YES (ver 3.1.101)
    -- resample: YES (ver 3.1.101)
    -- gentoo-style: YES
    -- GStreamer: NO
    --
    -- Parallel framework: pthreads
    --
    -- Other third-party libraries:
    -- Use IPP: NO
    -- Use Eigen: YES (ver 3.3.4)
    -- Use TBB: YES (ver 2017.0)
    -- Use OpenMP: NO
    -- Use GCD NO
    -- Use Concurrency NO
    -- Use C=: NO
    -- Use Cuda: YES (ver 10.1)
    -- Use OpenCL: NO
    --
    -- NVIDIA CUDA: YES (ver 10.1, CUDA libraries: /usr/local/cuda/lib64, CUBLAS: /usr/local/cuda/lib64/libcublas.so.10.0)
    --
    -- Python:
    -- Interpreter: /usr/bin/python2.7 (ver 2.7.17)
    --
    -- Java:
    -- ant: /usr/bin/ant (ver 1.10.5)
    -- JNI: /usr/lib/jvm/java-8-openjdk-amd64/include /usr/lib/jvm/java-8-openjdk-amd64/include/linux /usr/lib/jvm/java-8-openjdk-amd64/include
    -- Java wrappers: YES
    -- Java tests: YES
    --
    -- Documentation:
    -- Doxygen: NO
    -- PlantUML: NO
    --
    -- Install path: /usr/local

    --
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/user/opencv-2.4.13.6/build
  9. Compile the source code. The -j option sets the number of parallel jobs; note that a bare -j places no limit on the number of jobs, which can exhaust memory on a large build, so it is safer to pass the number of CPU cores:

    make -j"$(nproc)"

    The progress will be displayed for each file:

    [  0%] Built target opencv_core_pch_dephelp
    [  0%] Built target opencv_core
    [  1%] Built target opencv_ts_pch_dephelp
    [  1%] Built target opencv_ts
    ...

    Wait patiently for the compilation to complete. This may take some time.

  10. Install OpenCV:

    sudo make install
  11. Verify the installation:

    pkg-config --modversion opencv

    If you want to import cv2 in Python, you can install the opencv-python package from PyPI. Note that this is a separate prebuilt package, independent of the OpenCV build above:

    pip install opencv-python
  12. At this point, all software installations are complete.
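
The find-and-replace edits from step 7 can be scripted rather than done by hand. The sketch below applies exactly the substitution described there ("8.0" to "10.0" in the CUDA version check) to one file and prints how many lines now carry the new string, so you can compare the count with the expected number of occurrences. bump_version_check is a hypothetical helper name, and GNU sed is assumed.

```shell
# Replace the CUDA version threshold "8.0" with "10.0" in FILE.
# bump_version_check is a hypothetical helper; a .orig backup is kept.
bump_version_check() {
    file=$1
    cp "$file" "$file.orig"
    sed -i 's/CUDA_VERSION_STRING VERSION_LESS "8\.0"/CUDA_VERSION_STRING VERSION_LESS "10.0"/g' "$file"
    # Print the number of lines that now contain the new version string
    grep -c 'CUDA_VERSION_STRING VERSION_LESS "10\.0"' "$file"
}

# Example calls from inside the build directory (commented out on purpose):
# bump_version_check ../cmake/FindCUDA.cmake
# bump_version_check ../cmake/OpenCVDetectCUDA.cmake
```

If the printed count differs from what step 7 lists, restore the .orig file and make the edit manually.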

Compiling Yolov3

Yolov3 is an object detection framework based on deep learning. The official website is YOLO: Real-Time Object Detection.

Please note that I only briefly explored Yolov3, so my knowledge is limited.

  1. Clone the Yolov3 repository from GitHub:

    git clone https://github.com/pjreddie/darknet.git
  2. Go to the darknet directory and edit the Makefile, setting the following options at the top:

    GPU=1
    CUDNN=1
    OPENCV=1
  3. Compile Yolov3:

    make
  4. After compiling, you need to download the pretrained model weights file yolov3.weights from here. This model can recognize objects such as tables, chairs, and some animals. You can also train your own model to recognize specific objects. Place the downloaded yolov3.weights file in the darknet directory.

  5. Try running object detection:

    ./darknet detect cfg/yolov3.cfg yolov3.weights <image_path>

    Example Command:

    ./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg

  6. For video detection, GPU acceleration is strongly recommended; without it, the frame rate is very low (image detection is also much faster with the GPU):

    ./darknet detector demo cfg/coco.data cfg/yolov3.cfg yolov3.weights <video_path>
  7. Feel free to explore further and deepen your knowledge. That’s all for now. Congratulations, you have completed the entire process.
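
The Makefile change in step 2 can also be made from the command line. This is a minimal sketch that assumes the Makefile currently has GPU=0, CUDNN=0, and OPENCV=0 each on its own line; enable_darknet_flags is a hypothetical helper name, and GNU sed is assumed. Other options, such as OPENMP, are left untouched.

```shell
# Switch GPU, CUDNN and OPENCV from 0 to 1 in a darknet Makefile.
# enable_darknet_flags is a hypothetical helper name.
enable_darknet_flags() {
    makefile=$1
    # Only exact lines such as "GPU=0" are rewritten
    sed -i -E 's/^(GPU|CUDNN|OPENCV)=0$/\1=1/' "$makefile"
}

# Example call from inside the darknet directory (commented out on purpose):
# enable_darknet_flags Makefile
```

Run make again after changing the flags so that darknet is rebuilt with GPU support.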