# tkDNN
tkDNN is a Deep Neural Network library built with cuDNN and tensorRT primitives, specifically thought to work on NVIDIA embedded boards. It has been tested on TK1, TX1, TX2, AGX Xavier and several discrete GPUs.
The main goal of this project is to exploit NVIDIA boards as much as possible to obtain the best inference performance. It does not allow training.
## Index
* [Dependencies](#dependencies)
* [How to compile this repo](#how-to-compile-this-repo)
* [Workflow](#workflow)
* [How to export weights](#how-to-export-weights)
* [Run the demo](#run-the-demo)
* [mAP demo](#map-demo)
* [Existing tests and supported networks](#existing-tests-and-supported-networks)
* [References](#references)
## Dependencies
This branch works on every NVIDIA GPU that supports the dependencies:
* CUDA 10.0
* CUDNN 7.603
* TENSORRT 6.01
* OPENCV 4.1
* yaml-cpp 0.5.2 (sudo apt install libyaml-cpp-dev)
## How to compile this repo
Build with cmake. If using Ubuntu 18.04 a newer version of cmake is needed (3.15 or above).
```
git clone https://github.com/ceccocats/tkDNN
cd tkDNN
git checkout cnet
mkdir build
cd build
cmake .. # use -DTEST_DATA=False to skip dataset download
make
```
If TEST_DATA is not set to False, weights needed to run some tests will be automatically downloaded.
## Workflow
Steps needed to do inference on tkDNN with a custom neural network.
* Build and train an NN model with your favourite framework.
* Export the weights and bias for each layer and save them in a binary file (one file per layer); a minimal export sketch is shown after this list.
* Export the outputs for each layer and save them in a binary file (one file per layer).
* Create a new test and define the network layer by layer, using the exported weights; use the exported outputs to check the results.
* Do inference.
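
As a purely illustrative sketch of the export step (not tkDNN's official exporter), a layer's parameters can be dumped as a raw float32 binary, one file per layer. The file name and the weights-then-bias layout below are assumptions and should be matched against the export scripts already in the repository:

```python
# Hypothetical export sketch (NOT tkDNN's official exporter): dump one layer's
# parameters as a raw float32 blob, one file per layer. Check the existing
# export scripts for the exact layout each tkDNN layer expects.
import os
import numpy as np

def export_layer(name, weights, bias, out_dir="layers"):
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, f"{name}.bin"), "wb") as f:
        # weights first, then bias, both flattened to float32
        np.asarray(weights, dtype=np.float32).tofile(f)
        np.asarray(bias, dtype=np.float32).tofile(f)

# stand-in random data instead of trained parameters, just to show the call
export_layer("conv1", np.random.rand(16, 3, 3, 3), np.random.rand(16))
```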
## How to export weights
Weights are essential for any network to run inference. For each test, a folder organized as follows is needed:
```
test_nn
|---- test_nn.cpp (nn definition in tkDNN)
|---- layers/ (folder containing a binary file for each layer with the corresponding weights and bias)
|---- debug/  (folder containing a binary file for each layer with the corresponding outputs)
```
Therefore, once the weights have been exported, the folders layers and debug should be placed in the corresponding test folder.
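
Before copying the files, a quick sanity check can catch export mistakes early. Below is a minimal sketch (the expected element count depends on the layer's shape and on how weights and bias are packed, which are assumptions here):

```python
# Minimal sanity check: read an exported binary back as float32 and compare
# the element count with what the layer's shape implies.
import sys
import numpy as np

def check_blob(path, expected_elements=None):
    blob = np.fromfile(path, dtype=np.float32)
    print(f"{path}: {blob.size} float32 values, "
          f"min={blob.min():.4f}, max={blob.max():.4f}")
    if expected_elements is not None and blob.size != expected_elements:
        sys.exit(f"unexpected size: got {blob.size}, expected {expected_elements}")

# e.g. a 16x3x3x3 convolution plus its 16 bias values -> 448 floats
check_blob("layers/conv1.bin", expected_elements=16 * 3 * 3 * 3 + 16)
```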
### 1) Export weights from darknet
To export weights for NNs defined in the darknet framework, use [this](https://github.com/ceccocats/darknet) fork of darknet and follow these steps to obtain a correct debug and layers folder, ready for tkDNN.
```
git clone https://github.com/ceccocats/darknet
cd darknet
make
mkdir layers debug
./darknet export <path-to-cfg-file> <path-to-weights> layers
```
N.b. Compile darknet for CPU (leave GPU=0 in the Makefile) if you also want the debug outputs.
### 2) Export weights for DLA34 and ResNet101
To get the weights and outputs needed to run the dla34 and resnet101 tests, use the Python script and the Anaconda environment included in the repository.
Create Anaconda environment and activate it:
```
conda env create -f file_name.yml
source activate env_name
python <script name>
```
### 3) Export weights for CenterNet
To get the weights needed to run the CenterNet tests, use [this](https://github.com/sapienzadavide/CenterNet.git) fork of the original CenterNet.
* clone the forked repository:
```
git clone https://github.com/sapienzadavide/CenterNet.git
```
* follow the instructions in README.md and INSTALL.md
* run the following commands to export the weights and outputs needed by tkDNN:
```
python demo.py --input_res 512 --arch resdcn_101 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_resdcn101.pth --exp_wo --exp_wo_dim 512
python demo.py --input_res 512 --arch dla_34 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_dla_2x.pth --exp_wo --exp_wo_dim 512
```
* copy the weights and outputs from /path/to/CenterNet/src/ into the corresponding ./test/ folder. For example:
```
cp /path/to/CenterNet/src/layers_dla/* ./test/dla34_cnet/layers/
cp /path/to/CenterNet/src/debug_dla/* ./test/dla34_cnet/debug/
```
or
```
cp /path/to/CenterNet/src/layers_resdcn/* ./test/resnet101_cnet/layers/
cp /path/to/CenterNet/src/debug_resdcn/* ./test/resnet101_cnet/debug/
```
### 4) Export weights for MobileNetSSD
To get the weights needed to run the MobileNet tests, use [this](https://github.com/mive93/pytorch-ssd) fork of a PyTorch implementation of the SSD network.
```
git clone https://github.com/mive93/pytorch-ssd
cd pytorch-ssd
conda env create -f env_mobv2ssd.yml
python run_ssd_live_demo.py mb2-ssd-lite <pth-model-file> <labels-file>
```
## Run the demo
To run an object detection demo follow these steps (example with yolov3):
```
export TKDNN_MODE=FP16 # set the half floating point optimization
rm yolo3.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # run the yolo test (is slow)
./demo yolo3.rt ../demo/yolo_test.mp4 y
```
In general the demo program takes 3 parameters:
```
./demo <network-rt-file> <path-to-video> <kind-of-network>
```
where
* ```<network-rt-file>``` is the rt file generated by a test
* ```<path-to-video>``` is the path to a video file or a camera input
* ```<kind-of-network>``` is the type of network. Three types are currently supported: ```y``` (YOLO family), ```c``` (CenterNet family) and ```m``` (MobileNet-SSD family)
N.b. Using FP16 inference will lead to some small errors in the results (in the first or second decimal place).
![demo](https://user-images.githubusercontent.com/11562617/72547657-540e7800-388d-11ea-83c6-49dfea2a0607.gif)
## mAP demo
To compute mAP, precision, recall and f1score, run the map_demo.
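
For reference, the reported metrics follow the usual definitions in terms of true positives, false positives and false negatives; the sketch below shows the textbook formulas, not necessarily the exact implementation inside map_demo:

```python
# Textbook precision/recall/F1 from detection counts (illustrative only,
# not necessarily map_demo's exact implementation).
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# e.g. 80 correct detections, 20 spurious detections, 40 missed objects
print(precision_recall_f1(80, 20, 40))  # -> (0.8, 0.666..., 0.727...)
```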
A validation set is needed. To download COCO_val2017 run (from the root folder):
```
bash download_validation.sh
```
To compute the mAP, the following parameters are needed:
```
./map_demo <network rt> <network type [y|c|m]> <labels file path> <config file path>
```
where
* ```<network rt>```: rt file of a chosen network on which to compute the mAP.
* ```<network type [y|c|m]>```: type of network. Right now only y(yolo), c(centernet) and m(mobilenet) are allowed
* ```<labels file path>```: path to a text file containing all the paths of the ground-truth labels (see the sketch below for one way to generate it). All the ground-truth labels must be in a folder called 'labels'; next to it there must be a folder 'images' containing the ground-truth images with the same names as the labels. For example, for a label path/to/labels/000001.txt there should be a corresponding image path/to/images/000001.jpg.
* ```<config file path>```: path to a yaml file with the parameters needed for the mAP computation, similar to demo/config.yaml
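
One possible way to generate such a labels file, assuming the labels/ and images/ layout described above and one label path per line (the helper below and the output name all_labels.txt are only an illustration, matching the file used in the example that follows):

```python
# Hypothetical helper: collect the path of every ground-truth label file
# into a single text file, one path per line.
import glob
import os

def write_label_list(labels_dir, out_file="all_labels.txt"):
    paths = sorted(glob.glob(os.path.join(labels_dir, "*.txt")))
    with open(out_file, "w") as f:
        f.write("\n".join(os.path.abspath(p) for p in paths) + "\n")
    print(f"wrote {len(paths)} label paths to {out_file}")

# images are expected to mirror the labels, e.g.
# path/to/labels/000001.txt  <->  path/to/images/000001.jpg
write_label_list("path/to/labels")
```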
Example:
```
cd build
./map_demo dla34_cnet.rt c ../demo/COCO_val2017/all_labels.txt ../demo/config.yaml
```
## Existing tests and supported networks
| Test Name | Network | Dataset | N Classes | Input size | Weights |
| :---------------- | :-------------------------------------------- | :-----------------------------------------------------------: | :-------: | :-----------: | :------------------------------------------------------------------------ |
......