Commit fd56e649 authored by Davide Sapienza's avatar Davide Sapienza

Update the README and split it into several files.



Signed-off-by: default avatarDavide Sapienza <sapienza.dav@gmail.com>
parent 34c1c3d5
@@ -70,26 +70,11 @@ Results for COCO val 2017 (5k images), on RTX 2080Ti, with conf threshold=0.001
- [How to compile this repo](#how-to-compile-this-repo)
- [Workflow](#workflow)
- [How to export weights](#how-to-export-weights)
- [1)Export weights from darknet](#1export-weights-from-darknet)
- [2)Export weights for DLA34 and ResNet101](#2export-weights-for-dla34-and-resnet101)
- [3)Export weights for CenterNet](#3export-weights-for-centernet)
- [4)Export weights for MobileNetSSD](#4export-weights-for-mobilenetssd)
- [Run the demo](#run-the-demo)
- [FP16 inference](#fp16-inference)
- [INT8 inference](#int8-inference)
- [mAP demo](#map-demo)
- [Existing tests and supported networks](#existing-tests-and-supported-networks)
- [References](#references)
- [tkDNN on Windows 10 (experimental)](#tkdnn-on-windows-10-experimental)
- [Dependencies-Windows](#dependencies-windows)
- [Compiling tkDNN on Windows](#compiling-tkdnn-on-windows)
- [Run the demo on Windows](#run-the-demo-on-windows)
- [FP16 inference windows](#fp16-inference-windows)
- [INT8 inference windows](#int8-inference-windows)
- [Known issues with tkDNN on Windows](#known-issues-with-tkdnn-on-windows)
## Dependencies
@@ -126,246 +111,17 @@ Steps needed to do inference on tkDNN with a custom neural network.
* Create a new test and define the network layer by layer, using the extracted weights and the outputs to check the results.
* Do inference.
## How to export weights
Weights are essential for any network to run inference. For each test, a folder organized as follows is needed (in the build folder):
```
test_nn
|---- layers/ (folder containing a binary file for each layer with the corresponding weights and biases)
|---- debug/ (folder containing a binary file for each layer with the corresponding outputs)
```
Therefore, once the weights have been exported, the layers and debug folders should be placed in the corresponding test folder.
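For example, assuming the weights for a test called test_nn have just been exported into local layers/ and debug/ folders, a minimal sketch of placing them would be (paths are illustrative):
```
# copy the exported folders into the matching test folder inside build/
mkdir -p build/test_nn
cp -r layers debug build/test_nn/
```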
### 1)Export weights from darknet
To export weights for NNs defined in the darknet framework, use [this](https://git.hipert.unimore.it/fgatti/darknet.git) fork of darknet and follow these steps to obtain correct debug and layers folders, ready for tkDNN.
```
git clone https://git.hipert.unimore.it/fgatti/darknet.git
cd darknet
make
mkdir layers debug
./darknet export <path-to-cfg-file> <path-to-weights> layers
```
N.b. Compile for CPU (leave GPU=0 in the Makefile) if you also want the debug outputs.
### 2)Export weights for DLA34 and ResNet101
To get the weights and outputs needed to run the dla34 and resnet101 tests, use the Python script and the Anaconda environment included in the repository.
Create the Anaconda environment, activate it and run the export script:
```
conda env create -f file_name.yml
source activate env_name
python <script name>
```
### 3)Export weights for CenterNet
To get the weights needed to run the CenterNet tests, use [this](https://github.com/sapienzadavide/CenterNet.git) fork of the original CenterNet.
```
git clone https://github.com/sapienzadavide/CenterNet.git
```
* follow the instructions in README.md and INSTALL.md, then export the weights with:
```
python demo.py --input_res 512 --arch resdcn_101 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_resdcn101.pth --exp_wo --exp_wo_dim 512
python demo.py --input_res 512 --arch dla_34 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_dla_2x.pth --exp_wo --exp_wo_dim 512
```
### 4)Export weights for MobileNetSSD
To get the weights needed to run the MobileNet tests, use [this](https://github.com/mive93/pytorch-ssd) fork of a PyTorch implementation of the SSD network.
```
git clone https://github.com/mive93/pytorch-ssd
cd pytorch-ssd
conda env create -f env_mobv2ssd.yml
python run_ssd_live_demo.py mb2-ssd-lite <pth-model-file> <labels-file>
```
### 5)Export weights for CenterTrack
To get the weights needed to run the CenterTrack tests, use [this](https://github.com/sapienzadavide/CenterTrack.git) fork of the original CenterTrack.
```
git clone https://github.com/sapienzadavide/CenterTrack.git
```
* follow the instructions in README.md and INSTALL.md, then export the weights with:
```
python demo.py tracking,ddd --load_model ../models/nuScenes_3Dtracking.pth --dataset nuscenes --pre_hm --track_thresh 0.1 --demo /path/to/image/or/folder/or/video/or/webcam --test_focal_length 633 --exp_wo --exp_wo_dim 512 --input_h 512 --input_w 512
```
## Darknet Parser
tkDNN implements an easy parser for darknet cfg files; a network can be converted with *tk::dnn::darknetParser*:
```
// example of parsing yolo4
tk::dnn::Network *net = tk::dnn::darknetParser("yolov4.cfg", "yolov4/layers", "coco.names");
net->print();
```
All darknet models are now parsed directly from the cfg file; you still need to export the weights with the tools described in the previous section.
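As a rough sketch of the layout assumed by the snippet above (the paths are only illustrative), the cfg file, the names file and the exported layers must be reachable from the directory where the parser is invoked:
```
# illustrative layout, relative to the build directory
yolov4.cfg       # darknet cfg describing the network
coco.names       # one class name per line
yolov4/layers/   # binary weight files exported with the darknet fork
```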
<details>
<summary>Supported layers</summary>
convolutional
maxpool
avgpool
shortcut
upsample
route
reorg
region
yolo
</details>
<details>
<summary>Supported activations</summary>
relu
leaky
mish
logistic
</details>
For specific details on how to export weights see ./docs/exporting_weights.md
## Run the demo
This is an example using yolov4.
To run an object detection demo, first create the .rt file by running:
```
rm yolo4_fp32.rt # be sure to delete(or move) old tensorRT files
./test_yolo4 # run the yolo test (is slow)
```
If you run into problems during creation, check the error by enabling TensorRT debugging:
```
cmake .. -DDEBUG=True
make
```
Once you have successfully created your rt file, run the demo:
```
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y
```
In general the demo program takes 7 parameters:
```
./demo <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes> <n-batches> <show-flag> <conf-thresh>
```
where
* ```<network-rt-file>``` is the rt file generated by a test
* ```<path-to-video>``` is the path to a video file or a camera input
* ```<kind-of-network>``` is the type of network. Three types are currently supported: ```y``` (YOLO family), ```c``` (CenterNet family) and ```m``` (MobileNet-SSD family)
* ```<number-of-classes>``` is the number of classes the network is trained on
* ```<n-batches>``` is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE to the required n_batches and create the rt file for the network again)
* ```<show-flag>``` if set to 0 the demo will not show the visualization but will save the video into result.mp4 (only when n-batches == 1)
* ```<conf-thresh>``` is the confidence threshold for the detector. Only bounding boxes with confidence greater than conf-thresh will be displayed.
N.b. By default FP32 inference is used.
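For instance, a full invocation with all seven parameters might look like the following sketch (the class count and confidence threshold are only illustrative values for a COCO-trained YOLOv4):
```
# illustrative example: 80 classes, batch size 1, no visualization (video saved to result.mp4), conf threshold 0.3
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y 80 1 0 0.3
```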
![demo](https://user-images.githubusercontent.com/11562617/72547657-540e7800-388d-11ea-83c6-49dfea2a0607.gif)
### Run the 3D demo
To run the 3D object detection demo follow these steps (example with CenterNet based on DLA34):
```
rm dla34_cnet3d_fp32.rt # be sure to delete(or move) old tensorRT files
./test_dla34_cnet3d # run the test (is slow)
./demo3D dla34_cnet3d_fp32.rt ../demo/yolo_test.mp4 c
```
The demo3D program takes the same parameters as the demo program:
```
./demo3D <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes>
```
#### Run the 3D OD-tracking demo
To run the 3D object detection & tracking demo follow these steps (example with CenterTrack based on DLA34):
```
rm dla34_cnet3d_track_fp32.rt # be sure to delete(or move) old tensorRT files
./test_dla34_cnet3d_track # run the test (is slow)
./demo3D dla34_cnet3d_track_fp32.rt ../demo/yolo_test.mp4 t
```
### FP16 inference
To run an object detection demo with FP16 inference, follow these steps (example with yolov3):
```
export TKDNN_MODE=FP16 # set the half floating point optimization
rm yolo3_fp16.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # run the yolo test (is slow)
./demo yolo3_fp16.rt ../demo/yolo_test.mp4 y
```
N.b. Using FP16 inference will lead to small errors in the results (in the first or second decimal place).
### INT8 inference
To run an object detection demo with INT8 inference, three environment variables need to be set:
* ```export TKDNN_MODE=INT8```: set the 8-bit integer optimization
* ```export TKDNN_CALIB_IMG_PATH=/path/to/calibration/image_list.txt``` : image_list.txt has in each line the absolute path to a calibration image
* ```export TKDNN_CALIB_LABEL_PATH=/path/to/calibration/label_list.txt```: label_list.txt has in each line the absolute path to a calibration label
You should provide image_list.txt and label_list.txt using training images. However, if you want to quickly test INT8 inference you can run (from the repo root folder)
```
bash scripts/download_validation.sh COCO
```
to automatically download the COCO2017 validation set (inside the demo folder) and create the needed files. Use BDD instead of COCO to download the BDD validation set.
Then a complete example using yolo3 and COCO dataset would be:
```
export TKDNN_MODE=INT8
export TKDNN_CALIB_LABEL_PATH=../demo/COCO_val2017/all_labels.txt
export TKDNN_CALIB_IMG_PATH=../demo/COCO_val2017/all_images.txt
rm yolo3_int8.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # run the yolo test (is slow)
./demo yolo3_int8.rt ../demo/yolo_test.mp4 y
```
N.B.
* Using INT8 inference will lead to some errors in the results.
* The test will be slower: this is due to the INT8 calibration, which may take some time to complete.
* INT8 calibration requires TensorRT version greater than or equal to 6.0
* Only 100 images are used to create the calibration table by default (set in the code).
### BatchSize bigger than 1
```
export TKDNN_BATCHSIZE=2
# build tensorRT files
```
This will create a TensorRT file with the desired **max** batch size.
The test will still run with a batch of 1, but the created tensorRT can manage the desired batch size.
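Putting this together with the demo's ```<n-batches>``` parameter, an illustrative end-to-end sketch (network name and parameter values are only examples) could be:
```
export TKDNN_BATCHSIZE=2      # max batch size baked into the rt file
rm yolo4_fp32.rt              # be sure to delete (or move) old tensorRT files
./test_yolo4                  # rebuild the rt file with the new max batch size
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y 80 2 0   # run the demo with 2 batches, no visualization
```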
### Test batch Inference
This will test the network with random input and check if the output of each batch is the same.
```
./test_rtinference <network-rt-file> <number-of-batches>
# <number-of-batches> should be less than or equal to the max batch size of the <network-rt-file>
# example
export TKDNN_BATCHSIZE=4 # set max batch size
rm yolo3_fp32.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # build RT file
./test_rtinference yolo3_fp32.rt 4 # test with a batch size of 4
```
For specific details on how to run the demos see ./docs/demo.md
## mAP demo
To compute mAP, precision, recall and f1score, run the map_demo.
A validation set is needed.
To download COCO_val2017 (80 classes) run (from the root folder):
```
bash scripts/download_validation.sh COCO
```
To download Berkeley_val (10 classes) run (from the root folder):
```
bash scripts/download_validation.sh BDD
```
To compute the mAP, the following parameters are needed:
```
./map_demo <network rt> <network type [y|c|m]> <labels file path> <config file path>
```
where
* ```<network rt>```: rt file of the chosen network on which to compute the mAP.
* ```<network type [y|c|m]>```: type of network. Right now only y (YOLO), c (CenterNet) and m (MobileNet) are allowed.
* ```<labels file path>```: path to a text file containing the paths of all the ground-truth labels. It is important that all the ground-truth labels are in a folder called 'labels'. The folder containing the 'labels' folder should also contain an 'images' folder, with all the ground-truth images named the same as the labels. For example, for a label path/to/labels/000001.txt there should be a corresponding image path/to/images/000001.jpg.
* ```<config file path>```: path to a yaml file with the parameters needed for the mAP computation, similar to demo/config.yaml.
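As a small sketch (paths are illustrative only), the expected ground-truth layout and labels file would look like:
```
/path/to/dataset
|---- images/ (000001.jpg, 000002.jpg, ...)
|---- labels/ (000001.txt, 000002.txt, ...)

all_labels.txt then contains one label path per line, e.g. /path/to/dataset/labels/000001.txt
```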
Example:
```
cd build
./map_demo dla34_cnet_FP32.rt c ../demo/COCO_val2017/all_labels.txt ../demo/config.yaml
```
This demo also creates a json file named ```net_name_COCO_res.json``` containing all the detections computed. The detections are in COCO format, the correct format to submit the results to [CodaLab COCO detection challenge](https://competitions.codalab.org/competitions/20794#participate).
For specific details on how to run the mAP demo see ./docs/mAP_demo.md
## Existing tests and supported networks
@@ -399,92 +155,7 @@ This demo also creates a json file named ```net_name_COCO_res.json``` containing
### tkDNN on Windows 10 (experimental)
### Dependencies-Windows
This branch should work on every NVIDIA GPU supported on Windows with the following dependencies:
* WINDOWS 10 1803 or HIGHER
* CUDA 10.0 (recommended: CUDA 11.2)
* CUDNN 7.6 (recommended: CUDNN 8.1.1)
* TENSORRT 6.0.1 (recommended: TENSORRT 7.2.3.4)
* OPENCV 3.4 (recommended: OPENCV 4.2.0)
* MSVC 16.7
* YAML-CPP
* EIGEN3
* 7ZIP (ADD TO PATH)
* NINJA 1.10
All the above mentioned dependencies except 7ZIP can be installed using Microsoft's [VCPKG](https://github.com/microsoft/vcpkg.git).
After bootstrapping VCPKG, the dependencies can be built and installed using the following commands:
```
opencv4(normal) - vcpkg.exe install opencv4[tbb,jpeg,tiff,opengl,openmp,png,ffmpeg,eigen]:x64-windows yaml-cpp:x64-windows eigen3:x64-windows --x-install-root=C:\opt --x-buildtrees-root=C:\temp_vcpkg_build
opencv4(cuda) - vcpkg.exe install opencv4[cuda,nonfree,contrib,eigen,tbb,jpeg,tiff,opengl,openmp,png,ffmpeg]:x64-windows yaml-cpp:x64-windows eigen3:x64-windows --x-install-root=C:\opt --x-buildtrees-root=C:\temp_vcpkg_build
```
To build opencv4 with CUDA and the cuDNN version corresponding to your CUDA version, vcpkg's cudnn portfile needs to be modified by adding ```$ENV{CUDA_PATH}``` at lines 16 and 17 of portfile.cmake.
After VCPKG finishes building and installing all the packages, delete C:\temp_vcpkg_build and add C:\opt\x64-windows\bin and C:\opt\x64-windows\debug\bin to the PATH.
### Compiling tkDNN on Windows
tkDNN is built with CMake (3.15+) and Ninja on Windows. MSBuild and NMake Makefiles are drastically slower than Ninja when compiling the library.
```
git clone https://github.com/ceccocats/tkDNN.git
cd tkDNN
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -G"Ninja" ..
ninja -j4
```
### Run the demo on Windows
This example uses yolo4_tiny.\
To run the object detection demo, first create the .rt file by running:
```
.\test_yolo4tiny.exe
```
Once the rt file has been successfully created, run the demo using the following command:
```
.\demo.exe yolo4tiny_fp32.rt ..\demo\yolo_test.mp4 y
```
For general info on the other demo parameters, check the Run the demo section above.
To run test_all_tests.sh on Windows, use Git Bash or MSYS2.
### FP16 inference windows
This is an untested feature on Windows. To run the object detection demo with FP16 inference follow these steps (example with yolo4tiny):
```
set TKDNN_MODE=FP16
del /f yolo4tiny_fp16.rt
.\test_yolo4tiny.exe
.\demo.exe yolo4tiny_fp16.rt ..\demo\yolo_test.mp4
```
### INT8 inference windows
To run the object detection demo with INT8 inference (example with yolo4tiny):
```
set TKDNN_MODE=INT8
set TKDNN_CALIB_LABEL_PATH=..\demo\COCO_val2017\all_labels.txt
set TKDNN_CALIB_IMG_PATH=..\demo\COCO_val2017\all_images.txt
del /f yolo4tiny_int8.rt # be sure to delete(or move) old tensorRT files
.\test_yolo4tiny.exe # run the yolo test (is slow)
.\demo.exe yolo4tiny_int8.rt ..\demo\yolo_test.mp4 y
```
### Known issues with tkDNN on Windows
MobileNet and CenterNet demos work properly only when built with MSVC 16.7 in Release mode; when built in Debug mode, one might encounter OpenCV assert errors for these networks.
All darknet models work properly with the demo using MSVC versions 16.7-16.9.
It is recommended to use NVIDIA driver 465+; CUDA unknown errors have been observed when using older drivers on Pascal (SM 61) devices.
For specific details on how to run the demos on Windows 10 see ./docs/windows.md
## References
......
# tkDNN Demo
## Supported Networks
2D Object Detection:
* Yolo4, Yolo4-csp, Yolo4x, Yolo4_berkeley, Yolo4tiny
* Yolo3, Yolo3_berkeley, Yolo3_coco4, Yolo3_flir, Yolo3_512, Yolo3tiny, Yolo3tiny_512
* Yolo2, Yolo2_voc, Yolo2tiny
* Csresnext50-panet-spp, Csresnext50-panet-spp_berkeley
* Resnet101_cnet, Dla34_cnet
* Mobilenetv2ssd, Mobilenetv2ssd512, Bdd-mobilenetv2ssd
3D Object Detection:
* Dla34_cnet3d
2D/3D Object Detection and Tracking:
* Dla34_cnet3d_track
## Index
- [Run the demo](#run-the-demo)
- [2D Object Detection](#object-detection-2d)
- [3D Object Detection](#object-detection-3d)
- [Object Detection & Tracking](#object-detection-tracking)
- [FP16 inference](#fp16-inference)
- [INT8 inference](#int8-inference)
- [Batching](#batching)
- [Run the demo on Windows](#demo-windows)
## Run the demo
N.b. By default FP32 inference is used.
### 2D Object Detection
This is an example using yolov4.
To run an object detection demo, first create the .rt file by running:
```
rm yolo4_fp32.rt # be sure to delete(or move) old tensorRT files
./test_yolo4 # run the yolo test (is slow)
```
If you run into problems during creation, check the error by enabling TensorRT debugging:
```
cmake .. -DDEBUG=True
make
```
Once you have successfully created your rt file, run the demo:
```
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y
```
In general the demo program takes 7 parameters:
```
./demo <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes> <n-batches> <show-flag> <conf-thresh>
```
where
* ```<network-rt-file>``` is the rt file generated by a test
* ```<path-to-video>``` is the path to a video file or a camera input
* ```<kind-of-network>``` is the type of network. Three types are currently supported: ```y``` (YOLO family), ```c``` (CenterNet family) and ```m``` (MobileNet-SSD family)
* ```<number-of-classes>``` is the number of classes the network is trained on
* ```<n-batches>``` is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE to the required n_batches and create the rt file for the network again)
* ```<show-flag>``` if set to 0 the demo will not show the visualization but will save the video into result.mp4 (only when n-batches == 1)
* ```<conf-thresh>``` is the confidence threshold for the detector. Only bounding boxes with confidence greater than conf-thresh will be displayed.
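For instance, a full invocation with all seven parameters might look like the following sketch (the class count and confidence threshold are only illustrative values for a COCO-trained YOLOv4):
```
# illustrative example: 80 classes, batch size 1, no visualization (video saved to result.mp4), conf threshold 0.3
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y 80 1 0 0.3
```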
![demo](https://user-images.githubusercontent.com/11562617/72547657-540e7800-388d-11ea-83c6-49dfea2a0607.gif)
### 3D Object Detection
To run the 3D object detection demo follow these steps (example with CenterNet based on DLA34):
```
rm dla34_cnet3d_fp32.rt # be sure to delete(or move) old tensorRT files
./test_dla34_cnet3d # run the test (is slow)
./demo3D dla34_cnet3d_fp32.rt ../demo/yolo_test.mp4 c
```
The demo3D program takes the same parameters as the demo program:
```
./demo3D <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes> <n-batches> <show-flag> <conf-thresh>
```
### Object Detection & Tracking
To run the 3D object detection & tracking demo follow these steps (example with CenterTrack based on DLA34):
```
rm dla34_ctrack_fp32.rt # be sure to delete(or move) old tensorRT files
./test_dla34_ctrack # run the test (is slow)
./demoTracker dla34_ctrack_fp32.rt ../demo/yolo_test.mp4 c
```
The demoTracker program takes the same parameters as the demo program:
```
./demoTracker <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes> <n-batches> <show-flag> <conf-thresh> <2D/3D-flag>
```
where
* ```<2D/3D-flag>``` if set to 0 the demo runs in 2D mode, while if set to 1 it runs in 3D mode (default is 1, i.e. 3D mode).
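An illustrative invocation (the class count, threshold and mode flag below are only example values) could be:
```
# illustrative example: 80 classes, batch size 1, show the visualization, conf threshold 0.3, 2D mode
./demoTracker dla34_ctrack_fp32.rt ../demo/yolo_test.mp4 c 80 1 1 0.3 0
```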
### FP16 inference
To run the demo with FP16 inference follow these steps (example with yolov3):
```
export TKDNN_MODE=FP16 # set the half floating point optimization
rm yolo3_fp16.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # run the yolo test (is slow)
./demo yolo3_fp16.rt ../demo/yolo_test.mp4 y
```
N.b. Using FP16 inference will lead to small errors in the results (in the first or second decimal place).
### INT8 inference
To run the demo with INT8 inference, three environment variables need to be set:
* ```export TKDNN_MODE=INT8```: set the 8-bit integer optimization
* ```export TKDNN_CALIB_IMG_PATH=/path/to/calibration/image_list.txt``` : image_list.txt has in each line the absolute path to a calibration image
* ```export TKDNN_CALIB_LABEL_PATH=/path/to/calibration/label_list.txt```: label_list.txt has in each line the absolute path to a calibration label
You should provide image_list.txt and label_list.txt using training images. However, if you want to quickly test INT8 inference you can run (from the repo root folder)
```
bash scripts/download_validation.sh COCO
```
to automatically download the COCO2017 validation set (inside the demo folder) and create the needed files. Use BDD instead of COCO to download the BDD validation set.
Then a complete example using yolo3 and COCO dataset would be:
```
export TKDNN_MODE=INT8
export TKDNN_CALIB_LABEL_PATH=../demo/COCO_val2017/all_labels.txt
export TKDNN_CALIB_IMG_PATH=../demo/COCO_val2017/all_images.txt
rm yolo3_int8.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # run the yolo test (is slow)
./demo yolo3_int8.rt ../demo/yolo_test.mp4 y
```
N.B.
* Using INT8 inference will lead to some errors in the results.
* The test will be slower: this is due to the INT8 calibration, which may take some time to complete.
* INT8 calibration requires TensorRT version greater than or equal to 6.0
* Only 100 images are used to create the calibration table by default (set in the code).
### Batching
#### BatchSize bigger than 1
```
export TKDNN_BATCHSIZE=2
# build tensorRT files
```
This will create a TensorRT file with the desired **max** batch size.
The test will still run with a batch of 1, but the created tensorRT can manage the desired batch size.
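Putting this together with the demo's ```<n-batches>``` parameter, an illustrative end-to-end sketch (network name and parameter values are only examples) could be:
```
export TKDNN_BATCHSIZE=2      # max batch size baked into the rt file
rm yolo4_fp32.rt              # be sure to delete (or move) old tensorRT files
./test_yolo4                  # rebuild the rt file with the new max batch size
./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y 80 2 0   # run the demo with 2 batches, no visualization
```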
#### Test batch Inference
This will test the network with random input and check if the output of each batch is the same.
```
./test_rtinference <network-rt-file> <number-of-batches>
# <number-of-batches> should be less than or equal to the max batch size of the <network-rt-file>
# example
export TKDNN_BATCHSIZE=4 # set max batch size
rm yolo3_fp32.rt # be sure to delete(or move) old tensorRT files
./test_yolo3 # build RT file
./test_rtinference yolo3_fp32.rt 4 # test with a batch size of 4
```
### Run the demo on Windows
This example uses yolo4_tiny.\
To run the object detection demo, first create the .rt file by running:
```
.\test_yolo4tiny.exe
```
Once the rt file has been successfully created, run the demo using the following command:
```
.\demo.exe yolo4tiny_fp32.rt ..\demo\yolo_test.mp4 y
```
For general info on the other demo parameters, check the Run the demo section above.
To run test_all_tests.sh on Windows, use Git Bash or MSYS2.
### FP16 inference windows
This is an untested feature on Windows. To run the object detection demo with FP16 inference follow these steps (example with yolo4tiny):
```
set TKDNN_MODE=FP16
del /f yolo4tiny_fp16.rt
.\test_yolo4tiny.exe
.\demo.exe yolo4tiny_fp16.rt ..\demo\yolo_test.mp4
```
### INT8 inference windows
To run the object detection demo with INT8 inference (example with yolo4tiny):
```
set TKDNN_MODE=INT8
set TKDNN_CALIB_LABEL_PATH=..\demo\COCO_val2017\all_labels.txt
set TKDNN_CALIB_IMG_PATH=..\demo\COCO_val2017\all_images.txt
del /f yolo4tiny_int8.rt # be sure to delete(or move) old tensorRT files
.\test_yolo4tiny.exe # run the yolo test (is slow)
.\demo.exe yolo4tiny_int8.rt ..\demo\yolo_test.mp4 y
```
### Known issues with tkDNN on Windows
MobileNet and CenterNet demos work properly only when built with MSVC 16.7 in Release mode; when built in Debug mode, one might encounter OpenCV assert errors for these networks.
All darknet models work properly with the demo using MSVC versions 16.7-16.9.
It is recommended to use NVIDIA driver 465+; CUDA unknown errors have been observed when using older drivers on Pascal (SM 61) devices.
# tkDNN export weights
## Index
- [How to export weights](#how-to-export-weights)
- [1)Export weights from darknet](#1export-weights-from-darknet)
- [2)Export weights for DLA34 and ResNet101](#2export-weights-for-dla34-and-resnet101)
- [3)Export weights for CenterNet](#3export-weights-for-centernet)
- [4)Export weights for MobileNetSSD](#4export-weights-for-mobilenetssd)
- [Darknet Parser](#darknet-parser)
## How to export weights
Weights are essential for any network to run inference. For each test, a folder organized as follows is needed (in the build folder):
```
test_nn
|---- layers/ (folder containing a binary file for each layer with the corresponding weights and biases)
|---- debug/ (folder containing a binary file for each layer with the corresponding outputs)
```
Therefore, once the weights have been exported, the layers and debug folders should be placed in the corresponding test folder.
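For example, assuming the weights for a test called test_nn have just been exported into local layers/ and debug/ folders, a minimal sketch of placing them would be (paths are illustrative):
```
# copy the exported folders into the matching test folder inside build/
mkdir -p build/test_nn
cp -r layers debug build/test_nn/
```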
### 1)Export weights from darknet
To export weights for NNs defined in the darknet framework, use [this](https://git.hipert.unimore.it/fgatti/darknet.git) fork of darknet and follow these steps to obtain correct debug and layers folders, ready for tkDNN.
```
git clone https://git.hipert.unimore.it/fgatti/darknet.git
cd darknet
make
mkdir layers debug
./darknet export <path-to-cfg-file> <path-to-weights> layers
```
N.b. Compile for CPU (leave GPU=0 in the Makefile) if you also want the debug outputs.
### 2)Export weights for DLA34 and ResNet101
To get the weights and outputs needed to run the dla34 and resnet101 tests, use the Python script and the Anaconda environment included in the repository.
Create the Anaconda environment, activate it and run the export script:
```
conda env create -f file_name.yml
source activate env_name
python <script name>
```
### 3)Export weights for CenterNet
To get the weights needed to run the CenterNet tests, use [this](https://github.com/sapienzadavide/CenterNet.git) fork of the original CenterNet.
```
git clone https://github.com/sapienzadavide/CenterNet.git
```
* follow the instructions in README.md and INSTALL.md, then export the weights with:
```
python demo.py --input_res 512 --arch resdcn_101 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_resdcn101.pth --exp_wo --exp_wo_dim 512
python demo.py --input_res 512 --arch dla_34 ctdet --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/ctdet_coco_dla_2x.pth --exp_wo --exp_wo_dim 512
```
### 4)Export weights for MobileNetSSD
To get the weights needed to run the MobileNet tests, use [this](https://github.com/mive93/pytorch-ssd) fork of a PyTorch implementation of the SSD network.
```
git clone https://github.com/mive93/pytorch-ssd
cd pytorch-ssd
conda env create -f env_mobv2ssd.yml
python run_ssd_live_demo.py mb2-ssd-lite <pth-model-file> <labels-file>
```
### 5)Export weights for CenterTrack