# tkDNN
tkDNN is a Deep Neural Network library built with cuDNN and TensorRT primitives, specifically designed to work on NVIDIA embedded boards. It has been tested on TK1, TX1, TX2, AGX Xavier and several discrete GPUs.
The main goal of this project is to exploit NVIDIA boards as much as possible to obtain the best inference performance. It does not allow training.

## Index
* [Dependencies](#dependencies)
* [How to compile this repo](#how-to-compile-this-repo)
* [Workflow](#workflow)
* [How to export weights](#how-to-export-weights)
* [Run the demo](#run-the-demo)
* [mAP demo](#map-demo)
* [Existing tests and supported networks](#existing-tests-and-supported-networks)
* [References](#references)




## Dependencies
This branch works on every NVIDIA GPU that supports the dependencies:
* CUDA 10.0
* cuDNN 7.6.3
* TensorRT 6.0.1
* OpenCV 4.1
* yaml-cpp 0.5.2 (`sudo apt install libyaml-cpp-dev`)

## How to compile this repo
Build with cmake. If using Ubuntu 18.04, a newer version of cmake is needed (3.15 or above).
```
git clone https://github.com/ceccocats/tkDNN
cd tkDNN
git checkout cnet
mkdir build
cd build
cmake .. # use -DTEST_DATA=False to skip dataset download
make
```
If TEST_DATA is not set to False, weights needed to run some tests will be automatically downloaded.

## Workflow
These are the steps needed to run inference on tkDNN with a custom neural network (a sketch of the export step is shown below).
* Build and train an NN model with your favourite framework.
* Export the weights and biases of each layer and save them in a binary file (one per layer).
* Export the outputs of each layer and save them in a binary file (one per layer).
* Create a new test and define the network layer by layer, using the exported weights and checking the results against the exported outputs.
* Do inference.
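
As a rough illustration of the export step, here is a hypothetical Python sketch that dumps the weights and bias of each convolutional and fully connected layer of a PyTorch model to one raw binary file per layer. The function name `export_layers`, the output folder and the float32 weights-then-bias layout are assumptions for illustration only; check an existing test for the exact format a given network needs.
```
# Hypothetical sketch: export each layer's weights and bias of a PyTorch model
# to one raw binary file per layer (float32 layout assumed; verify against an
# existing tkDNN test before relying on it).
import os
import numpy as np
import torch

def export_layers(model, out_dir="layers"):
    os.makedirs(out_dir, exist_ok=True)
    for name, module in model.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            blobs = [module.weight.detach().cpu().numpy().astype(np.float32)]
            if module.bias is not None:
                blobs.append(module.bias.detach().cpu().numpy().astype(np.float32))
            # weights first, then bias, flattened into a single binary file
            np.concatenate([b.ravel() for b in blobs]).tofile(
                os.path.join(out_dir, name.replace(".", "_") + ".bin"))
```
The debug outputs can be produced in the same way, e.g. by registering forward hooks and saving each layer's output tensor during a single forward pass.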

## How to export weights

Weights are essential for any network to run inference. For each test, a folder organized as follows is needed:
```
    test_nn
        |---- test_nn.cpp (nn definition in tkDNN)
        |---- layers/ (folder containing a binary file for each layer with the corresponding weights and bias)
        |---- debug/  (folder containing a binary file for each layer with the corresponding outputs)
```
Therefore, once the weights have been exported, the layers and debug folders should be placed inside the corresponding test folder.

### 1) Export weights from darknet
To export weights for NNs defined in the darknet framework, use [this](https://github.com/ceccocats/darknet) fork of darknet and follow these steps to obtain correct debug and layers folders, ready for tkDNN.

```
git clone https://github.com/ceccocats/darknet
cd darknet
make
mkdir layers debug
./darknet export <path-to-cfg-file> <path-to-weights> layers
```
N.b. Compile for CPU (leave GPU=0 in the Makefile) if you also want the debug outputs.

### 2) Export weights for DLA34 and ResNet101
To get the weights and outputs needed to run the dla34 and resnet101 tests, use the Python script and the Anaconda environment included in the repository.

Create the Anaconda environment and activate it:
```
conda env create -f file_name.yml
source activate env_name
python <script name>
```
### 3) Export weights for CenterNet
To get the weights needed to run the CenterNet tests, use [this](https://github.com/sapienzadavide/CenterNet.git) fork of the original CenterNet.
```
git clone https://github.com/sapienzadavide/CenterNet.git
```
* Follow the instructions in the README.md and INSTALL.md of that repository, then export the weights and outputs:

```
python demo.py --input_res 512 --arch resdcn_101 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_resdcn101.pth --exp_wo --exp_wo_dim 512
python demo.py --input_res 512 --arch dla_34 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_dla_2x.pth --exp_wo --exp_wo_dim 512
```
### 4) Export weights for MobileNetSSD

To get the weights needed to run the MobileNet tests, use [this](https://github.com/mive93/pytorch-ssd) fork of a PyTorch implementation of the SSD network.

```
git clone https://github.com/mive93/pytorch-ssd
cd pytorch-ssd
conda env create -f env_mobv2ssd.yml
python run_ssd_live_demo.py mb2-ssd-lite <pth-model-file> <labels-file>
```
## Run the demo

To run an object detection demo, follow these steps (example with yolov3):
```
export TKDNN_MODE=FP16  # set the half floating point optimization
rm yolo3.rt             # be sure to delete (or move) old tensorRT files
./test_yolo3            # run the yolo3 test (it is slow)
./demo yolo3.rt ../demo/yolo_test.mp4 y
```
In general the demo program takes 3 parameters:
```
./demo <network-rt-file> <path-to-video> <kind-of-network>
```
where
*  ```<network-rt-file>``` is the rt file generated by a test
*  ```<path-to-video>``` is the path to a video file or a camera input
*  ```<kind-of-network>``` is the type of network. Three types are currently supported: ```y``` (YOLO family), ```c``` (CenterNet family) and ```m``` (MobileNet-SSD family)

N.b. Using FP16 inference will introduce small errors in the results (in the first or second decimal place).

![demo](https://user-images.githubusercontent.com/11562617/72547657-540e7800-388d-11ea-83c6-49dfea2a0607.gif)

## mAP demo
To compute mAP, precision, recall and F1 score, run the map_demo.

A validation set is needed. To download COCO_val2017, run (from the root folder):
```
bash download_validation.sh 
```
To compute the mAP, the following parameters are needed:
```
./map_demo <network rt> <network type [y|c|m]> <labels file path> <config file path>
```
where 
* ```<network rt>```: rt file of the chosen network on which to compute the mAP.
* ```<network type [y|c|m]>```: type of network. Right now only y (YOLO), c (CenterNet) and m (MobileNet) are allowed.
* ```<labels file path>```: path to a text file containing the paths of all the ground-truth label files (a sketch for generating such a file is shown after this list). All ground-truth labels must be in a folder called 'labels'. The folder containing 'labels' must also contain a folder 'images' with the ground-truth images, each having the same name as its label: for a label path/to/labels/000001.txt there should be a corresponding image path/to/images/000001.jpg.
* ```<config file path>```: path to a yaml file with the parameters needed for the mAP computation, similar to demo/config.yaml
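
As an illustration, the labels file can be generated with a short script like the one below. This is a minimal sketch: the folder layout under demo/COCO_val2017 and the output name all_labels.txt are only examples matching the structure described above.
```
# Minimal sketch: list the absolute path of every ground-truth label file,
# one per line, in a single text file (paths and file names are examples only).
from pathlib import Path

labels_dir = Path("demo/COCO_val2017/labels")        # folder with the *.txt labels
out_file = Path("demo/COCO_val2017/all_labels.txt")  # file passed to map_demo

with out_file.open("w") as f:
    for label in sorted(labels_dir.glob("*.txt")):
        f.write(str(label.resolve()) + "\n")
```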

Example:

```
cd build
./map_demo dla34_cnet.rt c ../demo/COCO_val2017/all_labels.txt ../demo/config.yaml
```

## Existing tests and supported networks

| Test Name         | Network                                       | Dataset                                                       | N Classes | Input size    | Weights                                                                   |
| :---------------- | :-------------------------------------------- | :-----------------------------------------------------------: | :-------: | :-----------: | :------------------------------------------------------------------------ |
| yolo              | YOLO v2<sup>1</sup>                           | [COCO 2014](http://cocodataset.org/)                          | 80        | 608x608       | [weights](https://cloud.hipert.unimore.it/s/nf4PJ3k8bxBETwL/download)                                                                   |
| yolo_224          | YOLO v2<sup>1</sup>                           | [COCO 2014](http://cocodataset.org/)                          | 80        | 224x224       | weights                                                                   |
| yolo_berkeley     | YOLO v2<sup>1</sup>                           | [BDD100K  ](https://bair.berkeley.edu/blog/2018/05/30/bdd/)   | 10        | 416x736       | weights                                                                   |
| yolo_relu         | YOLO v2 (with ReLU, not Leaky)<sup>1</sup>    | [COCO 2014](http://cocodataset.org/)                          | 80        | 416x416       | weights                                                                   |
| yolo_tiny         | YOLO v2 tiny<sup>1</sup>                      | [COCO 2014](http://cocodataset.org/)                          | 80        | 416x416       | [weights](https://cloud.hipert.unimore.it/s/m3orfJr8pGrN5mQ/download)                                                                   |
| yolo_voc          | YOLO v2<sup>1</sup>                           | [VOC      ](http://host.robots.ox.ac.uk/pascal/VOC/)          | 21        | 416x416       | [weights](https://cloud.hipert.unimore.it/s/DJC5Fi2pEjfNDP9/download)                                                                   |
| yolo3             | YOLO v3<sup>2</sup>                           | [COCO 2014](http://cocodataset.org/)                          | 80        | 416x416       | [weights](https://cloud.hipert.unimore.it/s/jPXmHyptpLoNdNR/download)     |
| yolo3_berkeley    | YOLO v3<sup>2</sup>                           | [BDD100K  ](https://bair.berkeley.edu/blog/2018/05/30/bdd/)   | 10        | 320x544       | [weights](https://cloud.hipert.unimore.it/s/o5cHa4AjTKS64oD/download)                                                                   |
| yolo3_coco4       | YOLO v3<sup>2</sup>                           | [COCO 2014](http://cocodataset.org/)                          | 4         | 416x416       | [weights](https://cloud.hipert.unimore.it/s/o27NDzSAartbyc4/download)                                                                   |
| yolo3_flir        | YOLO v3<sup>2</sup>                           | [FREE FLIR](https://www.flir.com/oem/adas/adas-dataset-form/) | 3         | 320x544       | [weights](https://cloud.hipert.unimore.it/s/62DECncmF6bMMiH/download)                                                                   |
| yolo3_tiny        | YOLO v3 tiny<sup>2</sup>                      | [COCO 2014](http://cocodataset.org/)                          | 80        | 416x416       | [weights](https://cloud.hipert.unimore.it/s/LMcSHtWaLeps8yN/download)     |
| yolo3_tiny512     | YOLO v3 tiny<sup>2</sup>                      | [COCO 2017](http://cocodataset.org/)                          | 80        | 512x512       | [weights](https://cloud.hipert.unimore.it/s/njnYACnQfWQFKrn/download)     |
| dla34             | Deep Layer Aggregation (DLA) 34<sup>3</sup>   | [COCO 2014](http://cocodataset.org/)                          | 80        | 224x224       | weights                                                                   |
| dla34_cnet        | CenterNet (DLA34 backend)<sup>4</sup>         | [COCO 2017](http://cocodataset.org/)                          | 80        | 512x512       | [weights](https://cloud.hipert.unimore.it/s/8AjXdgCeRzCa5AF/download)     |
| mobilenetv2ssd    | MobileNet v2 SSD Lite<sup>5</sup>             | [VOC      ](http://host.robots.ox.ac.uk/pascal/VOC/)          | 21        | 300x300       | [weights](https://cloud.hipert.unimore.it/s/x4ZfxBKN23zAJQp/download)     |
| resnet101         | ResNet 101<sup>6</sup>                        | [COCO 2014](http://cocodataset.org/)                          | 80        | 224x224       | weights                                                                   |
| resnet101_cnet    | CenterNet (ResNet101 backend)<sup>4</sup>     | [COCO 2017](http://cocodataset.org/)                          | 80        | 512x512       | [weights](https://cloud.hipert.unimore.it/s/B6mj33k7beECXsY/download)     |
| csresnext50-panet-spp    | Cross Stage Partial Network <sup>7</sup>     | [COCO 2014](http://cocodataset.org/)                          | 80        | 416x416       | [weights](https://cloud.hipert.unimore.it/s/Kcs4xBozwY4wFx8/download)     |



## References

1. Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
2. Redmon, Joseph, and Ali Farhadi. "Yolov3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
3. Yu, Fisher, et al. "Deep layer aggregation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
4. Zhou, Xingyi, Dequan Wang, and Philipp Krähenbühl. "Objects as points." arXiv preprint arXiv:1904.07850 (2019).
5. Sandler, Mark, et al. "Mobilenetv2: Inverted residuals and linear bottlenecks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
6. He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
7. Wang, Chien-Yao, et al. "CSPNet: A New Backbone that can Enhance Learning Capability of CNN." arXiv preprint arXiv:1911.11929 (2019).