# Semantic Segmentation with tkDNN

Currently, tkDNN supports only ShelfNet as a semantic segmentation network.


## Run the demo

To run the semantic segmentation demo, follow these steps (example with ShelfNet):
```
rm shelfnet_fp32.rt        # delete (or move) old TensorRT files
export TKDNN_BATCHSIZE=4   # use a batch size greater than 1 to run inference on images larger than 1024x1024
./test_shelfnet            # run the ShelfNet test, which creates the rt file (slow)
./seg_demo shelfnet_fp32.rt ../demo/yolo_test.mp4 1 19
```
In general the demo program takes the following parameters:
```
./seg_demo <network-rt-file> <path-to-video> <n-batches> <number-of-classes> <resize-flag> <baseline-resize> <show-flag> <write-pred>
```
where
*  ```<network-rt-file>``` is the rt file generated by a test.
*  ```<path-to-video>``` is the path to a video file or a camera input.
*  ```<n-batches>``` is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE set to the required n_batches and re-create the rt file for the network).
*  ```<number-of-classes>``` is the number of classes the network was trained on.
*  ```<resize-flag>``` if set to 0, the demo will not resize the input frames and will use them as they are; otherwise it will resize them.
*  ```<baseline-resize>``` if ```<resize-flag>``` is set to 1, the input frames will be proportionally resized using ```<baseline-resize>``` as the width baseline.
*  ```<show-flag>``` if set to 0, the demo will not show the visualization but will save the video to result.mp4 (if n-batches == 1).
*  ```<write-pred>``` if set to 0 (default), the demo will run; otherwise the evaluation of a dataset will run and the segmentation output will be saved. Attention: this is under development and the paths are hard-coded, so change them in the code in advance.

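As an illustration of the parameter order listed above, a full invocation could be assembled as in the sketch below. All values are assumptions chosen for the example (19 classes and the video path come from the demo above; the 1024-pixel width baseline and the flag values are hypothetical), and the variable names are not part of tkDNN.

```shell
# Hypothetical values, one per positional parameter, in the documented order.
RT_FILE=shelfnet_fp32.rt       # <network-rt-file>
VIDEO=../demo/yolo_test.mp4    # <path-to-video>
N_BATCHES=1                    # <n-batches>, must match the TKDNN_BATCHSIZE used to build the rt file
N_CLASSES=19                   # <number-of-classes>
RESIZE=1                       # <resize-flag>: 1 = resize the input frames
BASELINE=1024                  # <baseline-resize>: width baseline (assumed value)
SHOW=1                         # <show-flag>: 1 = show the visualization
WRITE_PRED=0                   # <write-pred>: 0 = plain demo run

# Assemble the command line and print it; run it by executing $SEG_DEMO_CMD.
SEG_DEMO_CMD="./seg_demo $RT_FILE $VIDEO $N_BATCHES $N_CLASSES $RESIZE $BASELINE $SHOW $WRITE_PRED"
echo "$SEG_DEMO_CMD"
```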
NB) FP32 inference is used by default.  
NB) Batching is not used to process multiple streams, but to process multiple tiles of the same image. ShelfNet never resizes the input image; therefore, images larger than 1024x1024 are split into 1024x1024 tiles, which are fed to the network as a batch.

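The tiling described above suggests how to pick the batch size for a given frame size. The following is a minimal sketch, assuming that partial tiles at the image borders still occupy a full tile (ceiling division); how tkDNN actually handles borders is not stated here, and `tiles_needed` is a hypothetical helper, not part of tkDNN.

```shell
# Count the 1024x1024 tiles needed to cover a WIDTH x HEIGHT frame.
# Assumption: a partially covered tile at the border counts as a full tile.
tiles_needed() {
    tile=1024
    cols=$(( ($1 + tile - 1) / tile ))   # ceiling of WIDTH / 1024
    rows=$(( ($2 + tile - 1) / tile ))   # ceiling of HEIGHT / 1024
    echo $(( cols * rows ))
}

tiles_needed 1024 1024   # prints 1
tiles_needed 2048 2048   # prints 4

# Use the result as the batch size before re-creating the rt file.
export TKDNN_BATCHSIZE=$(tiles_needed 2048 2048)
```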
![gif](output.gif "Results on yolo_test.mp4")  

For other demo videos refer to [this playlist](https://www.youtube.com/playlist?list=PLv0nEQYDD45y5EdSiywwCGPBmJVUzIWwe).

NB) The gif and the videos were obtained with Mapillary Vistas weights, which we cannot publicly share due to license restrictions. However, you can train ShelfNet on Mapillary using [this](https://git.hipert.unimore.it/mverucchi/shelfnet) fork of the original repo.

## FPS Results

Inference FPS of ShelfNet with tkDNN, averaged over 1200 images, on:
  * RTX 2080Ti (CUDA 10.2, TensorRT 7.0.0, cuDNN 7.6.5);
  * AGX Xavier, Jetpack 4.3 (CUDA 10.0, cuDNN 7.6.3, TensorRT 6.0.1).

| Platform   | Test                     | Phase   | FP32, ms  | FP32, FPS | FP16, ms  |	FP16, FPS  | INT8, ms |	INT8, FPS | 
| :------:   | :-----:                  | :-----: | :-----:   | :-----:   | :-----:   |	:-----:    | :-----:  |	:-----:   | 
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | pre     | 6.11863   |  163.435  |   5.81465 |  171.979   |  5.88699 |   169.866 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | inf     | 11.5464   |  86.6074  |   7.35396 |  135.981   |  6.37623 |   156.832 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | post    | 4.09058   |  244.464  |   3.91961 |  255.128   |  4.07343 |   245.493 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | tot     | 21.7556   |  45.9652  |   17.0882 |  58.5199   |  16.3366 |   61.2121 |
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | pre     | 25.435    |  39.3158  |   25.2953 |  39.5331   |  25.9303 |   38.565  | 
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | inf     | 36.5015   |  27.3961  |   17.0534 |  58.6395   |  15.6061 |   64.0773 |  
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | post    | 17.3917   |  57.4985  |   17.1649 |  58.2583   |  17.5539 |   56.9675 |  
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | tot     | 79.3283   |  12.6058  |   59.5136 |  16.8029   |  59.0903 |   16.9233 |  
| AGX Xavier | shelfnet 1024x1024 (B=1) | pre     | 8.0174    |  124.729  |   7.5117  |  133.126   |  7.47333 |   133.809 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | inf     | 72.4173   |  13.8089  |   37.505  |  26.6631   |  31.3286 |   31.9197 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | post    | 8.89958   |  112.365  |   8.83576 |  113.176   |  9.42655 |   106.083 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | tot     | 89.3342   |  11.1939  |   53.8525 |  18.5692   |  48.2285 |   20.7346 |
| AGX Xavier | shelfnet 2048x2048 (B=4) | pre     | 47.1454   |  21.211   |   21.6475 |  46.1947   |  21.4201 |   46.6851 | 
| AGX Xavier | shelfnet 2048x2048 (B=4) | inf     | 266.537   |  3.75183  |   128.321 |  7.79293   |  107.621 |   9.29185 |  
| AGX Xavier | shelfnet 2048x2048 (B=4) | post    | 44.0711   |  22.6906  |   40.1732 |  24.8922   |  39.873  |   25.0796 |  
| AGX Xavier | shelfnet 2048x2048 (B=4) | tot     | 357.753   |  2.79522  |   190.142 |  5.25922   |  168.914 |   5.92016 |  

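In the table above, each FPS value is, up to rounding, the reciprocal of the corresponding latency (FPS ≈ 1000 / ms). The sketch below shows that conversion and a speedup ratio between two precisions; `ms_to_fps` and `speedup` are hypothetical helper names used only for this example.

```shell
# Convert a latency in milliseconds to frames per second.
ms_to_fps() { LC_ALL=C awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'; }

# Ratio between a baseline latency and a faster latency.
speedup() { LC_ALL=C awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", base / fast }'; }

ms_to_fps 11.5464          # RTX 2080Ti, 1024x1024, FP32 inference: prints 86.61
speedup 11.5464 7.35396    # FP32 vs FP16 inference on the same row: prints 1.57
```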
## Known issues

When creating the rt file, all the checks return errors. This is due to a different resize function and a different handling of the original ShelfNet outputs. However, the network is still expected to work.