Ubuntu AI Environment Setup


Switch the apt sources to Aliyun or another mirror in China; downloads become much faster.
vi /etc/apt/sources.list
Replace the entire contents with:

deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse


After the switch, packages download and install in seconds.
sudo apt-get update && sudo apt-get upgrade
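
The ten source lines above follow one pattern: two entry types (deb, deb-src) crossed with five suites. The following is a minimal Python sketch that regenerates them for another mirror or release. The function name build_sources and its parameters are illustrative assumptions, not part of any apt tooling.

```python
def build_sources(mirror="http://mirrors.aliyun.com/ubuntu/", codename="bionic"):
    """Return sources.list text: deb and deb-src entries for each suite."""
    # Suite suffixes in the same order as the listing in this article.
    suites = ["", "-security", "-updates", "-proposed", "-backports"]
    components = "main restricted universe multiverse"
    lines = []
    for kind in ("deb", "deb-src"):
        for suffix in suites:
            lines.append(f"{kind} {mirror} {codename}{suffix} {components}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print to stdout; review the result before copying it into
    # /etc/apt/sources.list.
    print(build_sources(), end="")
```

Printing rather than writing the file directly keeps the sketch harmless: the output can be inspected first and the original sources.list backed up before it is replaced.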

1. Installing CUDA 10.0

The NVIDIA driver was installed beforehand; the GPU is a GTX 1660.

https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal

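1.1 Download the installer

Find the matching version and download it. (Downloading with Xunlei is considerably faster; transfer the file to the Ubuntu machine once it finishes.)

TensorFlow 1.13.1 supports CUDA only up to 10.0.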

1.2 Installation

List the downloaded files:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# ls
cuda_10.0.130_410.48_linux.run                  libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
cudnn-10.0-linux-x64-v7.5.1.10.solitairetheme8  libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb

Start the installation:
sudo sh cuda_10.0.130_410.48_linux.run


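The license agreement is very long and takes a lot of Enter presses to page through. (The CUDA 10.1 installer improves on this considerably.)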

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n

Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]:  

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /root ]: 

Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in /root, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-10.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_16565.log


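The toolkit installed successfully. The "Incomplete installation" warning only reflects that the driver was skipped, which is expected here because the driver was already installed.

1.3 Verification

vi ~/.bashrc
Append the following to the end of the file: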

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Run source ~/.bashrc to apply the change, then check nvcc -V:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# source ~/.bashrc
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

CUDA 10.0 is installed correctly.
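
The same check can be automated. The sketch below extracts the release number from nvcc -V output; the function name parse_nvcc_release is an illustrative assumption, and the sample text is the output shown above.

```python
import re


def parse_nvcc_release(output):
    """Return the toolkit release (e.g. '10.0') from `nvcc -V` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


# Output copied from the session in this article.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130"""

if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE))
```

To run it against the real tool, pass in the text captured from nvcc -V, for example with subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE, universal_newlines=True).stdout, which also works on the Python 3.6 used here.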

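2. Installing cuDNN 7.5.1

2.1 Download the packages

https://developer.nvidia.com/rdp/cudnn-download
Downloading cuDNN requires registering and logging in.

Download the items marked with a red box in the original screenshot. Judging by the file listing in section 1.2, these are the cuDNN 7.5.1 for CUDA 10.0 archive and the three libcudnn7 .deb packages (runtime, dev, doc).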

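2.2 Installation

Note that the listing in section 1.2 shows the archive as cudnn-10.0-linux-x64-v7.5.1.10.solitairetheme8, while the tar command below expects a .tgz name; rename the file first if yours carries the other extension.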

tar -zxvf cudnn-10.0-linux-x64-v7.5.1.10.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

sudo dpkg -i libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb

Output:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# tar -zxvf cudnn-10.0-linux-x64-v7.5.1.10.tgz
cuda/include/cudnn.h
cuda/NVIDIA_SLA_cuDNN_Support.txt
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.7
cuda/lib64/libcudnn.so.7.5.1
cuda/lib64/libcudnn_static.a
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo cp cuda/include/cudnn.h /usr/local/cuda/include
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libc
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# 
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7.
(Reading database ... 168675 files and directories currently installed.)
Preparing to unpack libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7 (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7 (7.5.1.10-1+cuda10.0) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 168681 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-dev (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7-dev (7.5.1.10-1+cuda10.0) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 168687 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-doc (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7-doc (7.5.1.10-1+cuda10.0) ...


2.3 Verification

Command to check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"

The header reports cuDNN 7.5.1, as expected.
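
The three #define lines can also be read programmatically. This is a minimal sketch, assuming the header keeps the same #define layout as the cuDNN 7.5.1 cudnn.h shown above; parse_cudnn_version is a hypothetical helper name.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h-style #define lines, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match only lines of the form "#define NAME <integer>".
        pattern = r"^\s*#define\s+" + name + r"\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# grep output copied from the session in this article.
SAMPLE = """#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

if __name__ == "__main__":
    print(parse_cudnn_version(SAMPLE))
```

On a real system the input would be the contents of /usr/local/cuda/include/cudnn.h, the path used in this article.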

3. Installing tensorflow-gpu

3.1 Check the Python environment

root@doyen-ai:/home# python

Command 'python' not found, but can be installed with:

apt install python3       
apt install python        
apt install python-minimal

You also have python3 installed, you can run 'python3' instead.

Ubuntu 18.04 ships with Python 3.6.8 by default:

root@doyen-ai:/home# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.

3.2 Install pip

apt-get install python3-pip python3-dev

root@doyen-ai:/home# pip3 -V
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)

Then upgrade setuptools:
pip3 install setuptools --upgrade

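3.3 Install tensorflow-gpu

pip3 install tensorflow-gpu

Without a version pin, pip selected tensorflow-gpu 1.13.1 at the time (see the log below). To reproduce this CUDA 10.0 setup later, pin the version as tensorflow-gpu==1.13.1.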

root@doyen-ai:/home# pip3 install tensorflow-gpu
Collecting tensorflow-gpu
  Downloading https://files.pythonhosted.org/packages/7b/b1/0ad4ae02e17ddd62109cd54c291e311c4b5fd09b4d0678d3d6ce4159b0f0/tensorflow_gpu-1.13.1-cp36-cp36m-manylinux1_x86_64.whl (345.2MB)

Successfully installed absl-py-0.7.1 astor-0.7.1 gast-0.2.2 grpcio-1.20.1 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.1 mock-3.0.5 numpy-1.16.3 protobuf-3.7.1 tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1 termcolor-1.1.0 werkzeug-0.15.4

pip reports that the installation completed.
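
To confirm which versions pip actually pulled in (the point that matters for CUDA compatibility), the "Successfully installed" line can be split into package/version pairs. A minimal sketch; parse_installed is a hypothetical helper, and it assumes version strings contain no hyphen, which holds for the line above.

```python
def parse_installed(line):
    """Map package name -> version from pip's 'Successfully installed ...' line."""
    prefix = "Successfully installed "
    if not line.startswith(prefix):
        return {}
    result = {}
    for item in line[len(prefix):].split():
        # The version follows the last hyphen: "tensorflow-gpu-1.13.1".
        name, _, version = item.rpartition("-")
        result[name] = version
    return result


# Shortened copy of the line from the log in this article.
SAMPLE = ("Successfully installed numpy-1.16.3 protobuf-3.7.1 "
          "tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1")

if __name__ == "__main__":
    print(parse_installed(SAMPLE)["tensorflow-gpu"])
```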

3.4 Verify the installation

root@doyen-ai:/home# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> a = tf.random_normal((100, 100))
>>> b = tf.random_normal((100, 500))
>>> c = tf.matmul(a, b)
>>> sess = tf.InteractiveSession()
2019-05-16 15:57:26.741765: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-16 15:57:27.372247: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-16 15:57:27.373652: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1619150 executing computations on platform CUDA. Devices:
2019-05-16 15:57:27.373734: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1660, Compute Capability 7.5
2019-05-16 15:57:27.400388: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2904000000 Hz
2019-05-16 15:57:27.401583: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1cdd240 executing computations on platform Host. Devices:
2019-05-16 15:57:27.401651: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-05-16 15:57:27.402011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1660 major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
totalMemory: 5.80GiB freeMemory: 5.73GiB
2019-05-16 15:57:27.402066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-05-16 15:57:27.405302: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-16 15:57:27.405366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-05-16 15:57:27.405390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-05-16 15:57:27.405581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5567 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, pci bus id: 0000:01:00.0, compute capability: 7.5)
>>> sess.run(c)
2019-05-16 15:57:45.835122: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
array([[  6.262812 ,  -1.9345528,  10.1873865, ...,   9.533573 ,
         -7.4053297,  -4.2541947],
       [ 10.201033 ,   3.6828916,  -2.0874305, ...,  11.704482 ,
          2.2292233, -12.751171 ],
       [ -4.9506807,  -7.9405203,  11.641254 , ...,  10.210195 ,
         -3.6261683,  -1.245208 ],
       ...,
       [  6.1733346, -11.296464 ,  -6.5138006, ...,  -8.0698185,
         -4.31228  ,   6.034325 ],
       [  8.435815 ,  -6.479247 ,  -1.6091456, ...,   5.5824223,
          5.4707727,  11.140205 ],
       [ -8.973054 , -10.001549 , -15.808032 , ...,  20.240196 ,
          7.126047 ,   9.673972 ]], dtype=float32)
>>> 


Subjectively, the GTX 1660 on Ubuntu 18.04 feels much faster than a GTX 1060 on Windows 10.
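
The log above already states which GPU TensorFlow picked up and its compute capability, a value that is needed again when building OpenCV. The sketch below pulls both out of the "Created TensorFlow device" line; parse_gpu_devices is an illustrative name, and the pattern assumes the TensorFlow 1.13 log format shown in this article.

```python
import re


def parse_gpu_devices(log_text):
    """Return [(name, compute_capability)] from 'Created TensorFlow device' lines."""
    pattern = re.compile(
        r"Created TensorFlow device .*?name: (?P<name>[^,]+), "
        r".*?compute capability: (?P<cc>\d+\.\d+)\)"
    )
    return [(m.group("name"), m.group("cc")) for m in pattern.finditer(log_text)]


# Log line copied from the session in this article.
SAMPLE = (
    "2019-05-16 15:57:27.405581: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1115] Created TensorFlow device "
    "(/job:localhost/replica:0/task:0/device:GPU:0 with 5567 MB memory) -> "
    "physical GPU (device: 0, name: GeForce GTX 1660, "
    "pci bus id: 0000:01:00.0, compute capability: 7.5)"
)

if __name__ == "__main__":
    print(parse_gpu_devices(SAMPLE))
```

An empty list means no GPU device line was found in the log, which usually indicates that TensorFlow fell back to the CPU.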

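4. Installing OpenCV 4.1 with CUDA support

4.1 Installation script

There are many installation tutorials; the simplest approach is to let a single script run everything automatically. The script below targets Ubuntu 18.04.
Look up the compute capability of your GPU at:
https://developer.nvidia.com/cuda-gpus
The script sets CUDA_ARCH_BIN=6.1, which is the value for a GTX 1060. The TensorFlow log in section 3.4 reports the GTX 1660 as compute capability 7.5, so adjust CUDA_ARCH_BIN to match your card.
By default the script downloads the dev version of the OpenCV sources. For the stable release, use these commands instead: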

curl -L https://github.com/opencv/opencv/archive/4.1.0.zip -o opencv.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.1.0.zip -o opencv_contrib.zip
unzip opencv.zip 
unzip opencv_contrib.zip 
cd opencv-4.1.0/

installOpenCV-4-on-Ubuntu-18-04.sh

#!/bin/bash
#
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <Install Folder>"
    exit 1
fi
folder="$1"

echo "** Install requirement"
sudo apt-get update
sudo apt-get install -y build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev pkg-config
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt-get install -y python2.7-dev python3.6-dev python-dev python-numpy python3-numpy
sudo apt-get install -y libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
sudo apt-get install -y libv4l-dev v4l-utils qv4l2 v4l2ucp 
sudo apt-get install -y curl
sudo apt-get update

echo "** Download opencv-4.1.0"
cd "$folder" || exit 1
curl -L https://github.com/opencv/opencv/archive/4.1.0.zip -o opencv-4.1.0.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.1.0.zip -o opencv_contrib-4.1.0.zip
unzip opencv-4.1.0.zip 
unzip opencv_contrib-4.1.0.zip 
cd opencv-4.1.0/

echo "** Building..."
mkdir release
cd release/
cmake \
  -D CMAKE_BUILD_TYPE=RELEASE \
  -D OPENCV_GENERATE_PKGCONFIG=YES \
  -D CMAKE_INSTALL_PREFIX=/usr/local \
  -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.1.0/modules  \
  -D CUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
  -D CUDA_ARCH_BIN=6.1 \
  -D CUDA_ARCH_PTX="" \
  -D WITH_CUDA=ON \
  -D WITH_TBB=ON \
  -D BUILD_opencv_python3=ON \
  -D BUILD_TESTS=OFF \
  -D BUILD_PERF_TESTS=OFF \
  -D WITH_V4L=ON \
  -D INSTALL_C_EXAMPLES=ON \
  -D INSTALL_PYTHON_EXAMPLES=ON \
  -D BUILD_EXAMPLES=ON \
  -D WITH_OPENGL=ON \
  -D ENABLE_FAST_MATH=1 \
  -D CUDA_FAST_MATH=1 \
  -D WITH_CUBLAS=1 \
  -D WITH_NVCUVID=ON \
  -D WITH_GSTREAMER=ON \
  -D WITH_OPENCL=YES \
  -D WITH_QT=ON \
  -D BUILD_opencv_cudacodec=OFF ..

make -j8
sudo make install
echo "** Install opencv-4.1.0 successfully"
echo "** Bye :)"

If some files fail to download during the build, download them manually, adjust the corresponding paths, and rerun cmake.
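
Because CUDA_ARCH_BIN has to match the GPU, it can help to generate the cmake invocation from a parameter instead of editing the script by hand. The following is a sketch only: cmake_args is a hypothetical helper, and it covers just a subset of the flags from the script above.

```python
def cmake_args(arch_bin, contrib_modules="../../opencv_contrib-4.1.0/modules",
               extra=None):
    """Build a cmake argument list for an OpenCV CUDA build (subset of flags)."""
    options = {
        "CMAKE_BUILD_TYPE": "RELEASE",
        "CMAKE_INSTALL_PREFIX": "/usr/local",
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,
        "WITH_CUDA": "ON",
        "CUDA_ARCH_BIN": arch_bin,  # compute capability of the target GPU
        "CUDA_ARCH_PTX": "",
    }
    # Caller-supplied options override or extend the defaults.
    options.update(extra or {})
    args = ["cmake"]
    for key, value in options.items():
        args.extend(["-D", f"{key}={value}"])
    args.append("..")  # source tree is the parent of the build directory
    return args


if __name__ == "__main__":
    print(" ".join(cmake_args("7.5", extra={"WITH_TBB": "ON"})))
```

Run from inside the release directory, the list could be passed to subprocess.run(cmake_args("7.5"), check=True); here 7.5 is the compute capability the TensorFlow log reported for the GTX 1660.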


4.2 Test OpenCV 4 from Python

root@doyen-ai:/home/software/opencv/build# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.__version__)
4.1.0-dev


5. Installing PyTorch 1.1 with CUDA 10.0

https://pytorch.org/get-started/locally/

pip3 install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp36-cp36m-linux_x86_64.whl
pip3 install torchvision
root@doyen-ai:/home/software# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> 


The module loads without errors.
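
A successful import shows only that the package loads; it does not show that the CUDA build can see the GPU. A slightly stronger check, sketched below, uses torch.cuda.is_available(). The wrapper name torch_cuda_status is an illustrative assumption.

```python
def torch_cuda_status():
    """Return None if torch cannot be imported, else its version and CUDA status."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": str(torch.__version__),
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_status())
```

On the machine described here, the expectation is a version beginning with 1.1.0 and cuda_available set to True; False would suggest a driver or CUDA mismatch rather than a broken PyTorch install.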

6. Installing MXNet

http://mxnet.incubator.apache.org

For CUDA 10.0, install the cu100 build:
pip3 install mxnet-cu100

root@doyen-ai:/home/software# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.__version__
'1.4.1'
>>> a = mx.nd.ones((2, 3), mx.gpu())
>>> b  = a*2+1
>>> b
[[3. 3. 3.]
 [3. 3. 3.]]
<NDArray 2x3 @gpu(0)>
>>> 

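End of article.
(The whole process took about 10 hours. TensorFlow 1.13 does not work with CUDA 10.1, so the system had to be reinstalled, which took a long time, as did compiling OpenCV 4.)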
