TensorRT Optimizations for Embedded Facial Recognition - Alexey Kadeishvili, CTO, Vocord - GTC On-Demand

 
TensorRT Optimizations for
Embedded Facial Recognition

                Alexey Kadeishvili, CTO, Vocord
Vocord Company: Main Facts
 ■ Developer of video surveillance and video analytics systems since 1999
 ■ Deep expertise in facial recognition
 ■ Top-rated in NIST and Megaface face recognition tests
 ■ NVIDIA Metropolis program member

 Our customers and partners

                                                                  www.vocord.com   2
Notable figures
    250+ projects for public and private sectors

    140 million faces in the enrollment database of a single project

    200,000 cameras are managed by VOCORD video analysis software

    350,000 API requests per month to the VOCORD FaceMatica cloud

    Geography: Europe, Middle East, SE Asia, East Asia, Latin America,
    Oceania
Face recognition products

 ■ VOCORD FaceControl: "Faces in the crowd" FR system
 ■ VOCORD FaceMatica: Face recognition engine in a Cloud
 ■ Face Recognition SDK: Face recognition engine SDK
 ■ VOCORD NanoFace: NVIDIA Jetson-based embedded face recognition solution
 ■ VOCORD NetCam: New generation face recognition camera
 ■ VOCORD FaceControl 3D: Free flow 3D facial recognition

 All products support NVIDIA GPU
Main Factors Impacting Facial Recognition

 ■ Inbound image quality
 ■ Enrollment DB: quality is something beyond control
 ■ Recognition engine: already works as in the Marvel movies
VOCORD Facial Recognition Engine

             Top-rated in the Megaface Face Scrub Open Challenge, 2015-2018
             Accuracy: 91.76%

             Top-rated in the NIST Face Recognition Vendor Test, 2016-2018
             TPR at FPR = 10^-4: 98.7%; TPR at FPR = 10^-6: 96.6%
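
The NIST figures above are true positive rates (TPR) at a fixed false positive rate (FPR). A minimal sketch of how such an operating point can be computed from comparison scores follows. It assumes that a higher score means greater similarity; the function name and the toy scores are illustrative and are not taken from the talk or from NIST tooling.

```python
def tpr_at_fpr(genuine, impostor, target_fpr):
    """Return (threshold, tpr) for a threshold whose FPR does not exceed target_fpr.

    genuine  -- similarity scores of same-person comparisons
    impostor -- similarity scores of different-person comparisons
    A pair is accepted when its score is strictly above the threshold
    (assumption: higher score means more similar).
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    # How many impostor comparisons may be accepted at the target FPR.
    allowed = int(target_fpr * len(impostor))
    ranked = sorted(impostor, reverse=True)
    # Setting the threshold at the (allowed + 1)-th highest impostor score
    # leaves at most `allowed` impostor scores strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    tpr = sum(score > threshold for score in genuine) / len(genuine)
    return threshold, tpr


# Toy data for illustration only, not measurements from the talk.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
threshold, tpr = tpr_at_fpr(genuine_scores, impostor_scores, target_fpr=0.125)
print(threshold, tpr)  # prints: 0.5 0.75
```

With n impostor comparisons the smallest non-zero FPR that can be resolved is 1/n, so an operating point of 10^-6 needs at least a million impostor comparisons.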

Cross Nation Invariance

Source: NIST Face recognition vendor test, 2018
Pose Invariance
[Plot: curves by pose group against FAR (horizontal axis, log scale, 1.E-07 to 1.E00); vertical axis from 0 to 0.25. Surviving legend entries: Group 3 (30 to 45˚), Group 4 (45 to 60˚), Group 5 (> 60˚). The legend text describing the enrollment DB pose is only partly preserved.]
Image Resolution Impact
[Plot: True Identification Rate at FAR = 10^-4 (vertical axis, 0.7 to 1.0) versus pixels between eyes, L (horizontal axis: 12, 24, 36, 48, 60, 72). The curve is labeled "Face identification probability"; the points "Recommended minimum" and "Optimal resolution" are marked, and sample faces are shown at L = 48 pix and L = 24 pix.]

L is the distance between the eyes, in pixels.
How to improve recognition?

 ■ Inbound image quality: the quality of acquired face images is the point of growth
 ■ Enrollment DB quality: something beyond control
 ■ Recognition engine: already works as in the Marvel movies
Different types of test datasets

NIST FRVT Report 2017 10 03
“Controlled” dataset
                              Algorithm A

                              Algorithm B

NIST FRVT Report 2017 10 03
“Uncontrolled” dataset

                              Algorithm A

                              Algorithm B

NIST FRVT Report 2017 10 03
Controlled vs. Uncontrolled (FRR log scale)

[Plot: FRR (vertical axis, 0.1 to 0.7, log scale) versus FAR (horizontal axis, 1.E-07 to 1.E-02). Four curves: Algorithm A and Algorithm B, each in an uncontrolled and a controlled environment.]
Controlled vs. Uncontrolled (linear scale)

[Plot: FRR (vertical axis, 0.1 to 0.7, linear scale) versus FAR (horizontal axis, 1.E-07 to 1.E-02). Four curves: Algorithm A and Algorithm B, each in an uncontrolled and a controlled environment.]
Hit the bottom: Images from IP camera
The Advantages of Edge Video Analysis
■   Face recognition onboard
■   No compression artifacts: the image is taken directly from the sensor
■   Dynamic Region of Interest for every intelligent algorithm
■   Algorithm adjustment for the particular camera setup

Pictured: VOCORD NetCam.AI edge video analytics camera
Video Enhancement Onboard
Dynamic ROI enhances the quality of image in the face area

Pictured, left to right: backlight with no enhancement; 12-bit image with static ROI; 12-bit image with dynamic ROI
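
One way to picture the effect: when a 12-bit sensor range is compressed to 8 bits, taking the black and white points from the face region rather than from the whole frame keeps more tonal detail in the face. The sketch below illustrates only that general idea. The camera's actual enhancement algorithm is not described in the talk, and the function name, the ROI format, and the toy frame are all hypothetical.

```python
def stretch_to_8bit(frame, roi):
    """Map a 12-bit grayscale frame to 8 bits using the min/max found inside roi.

    frame -- list of rows, each a list of ints in 0..4095
    roi   -- (top, left, bottom, right), with bottom and right exclusive
    """
    top, left, bottom, right = roi
    pixels = [v for row in frame[top:bottom] for v in row[left:right]]
    if not pixels:
        raise ValueError("ROI is empty")
    lo, hi = min(pixels), max(pixels)
    span = max(hi - lo, 1)  # avoid division by zero on a flat region
    # Linear stretch; values outside the ROI range are clipped to 0..255.
    return [[min(255, max(0, (v - lo) * 255 // span)) for v in row]
            for row in frame]


# Toy backlit frame: a dark face (left two columns) against a bright window.
frame = [[100, 120, 4000],
         [110, 130, 4095]]

static = stretch_to_8bit(frame, (0, 0, 2, 3))   # ROI = whole frame
dynamic = stretch_to_8bit(frame, (0, 0, 2, 2))  # ROI = face area only
print(static)   # prints: [[0, 1, 248], [0, 1, 255]]
print(dynamic)  # prints: [[0, 170, 255], [85, 255, 255]]
```

With the whole-frame range, the four face pixels collapse to the values 0 and 1; with the face-area range, they spread across the full 8-bit scale while the bright background clips.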
VOCORD NetCam.AI HW Features
     High quality sensor    Automated lens control

    NVIDIA Jetson TX1 GPU

VOCORD NetCam.AI Tech Specs
      Camera specs
      Resolution                                 3 to 5 Mpix
      Temperature range                          -25 °C to +50 °C
      Ingress protection                         IP67
      Dimensions                                 20 x 71 x 150 mm
      Power consumption                          15 W

      Built-in facial recognition engine specs
      Min face resolution for face recognition   12 pixels between the eyes
      Number of faces detected in one frame      Up to 25
      Latency of biometric template extraction   Up to 150 ms per face
      Face recognition performance               Up to 32 faces/s
      Inference framework                        TensorRT
Performance on Different Platforms
[Bar chart, vertical axis 0 to 35; values as plotted]

                            "Shallow" CNN   "Medium" CNN   "Deep" CNN
NVIDIA Jetson TX1           32              19             12
Intel Movidius              9               6              4
Qualcomm Snapdragon 820     2.2             1.4            0.9
Higher FPS Improves Accuracy
[Plot: FRR (vertical axis, 0 to 0.15) versus FAR (horizontal axis, log scale, 1.E-07 to 1.E-02). Two families of curves, each with "Deep", "Medium", and "Shallow" CNN variants: single face, and track (multiple faces).]
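
The plot on this slide contrasts decisions made from a single face with decisions made from a track of several faces of the same person; a higher frame rate yields more faces per track. The talk does not say how the per-frame results are combined, so the sketch below assumes one common option: averaging L2-normalized templates across the track before matching. All function names and vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity of two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def fuse_track(templates):
    """Average the normalized templates of one track and renormalize."""
    if not templates:
        raise ValueError("track is empty")
    normalized = [l2_normalize(t) for t in templates]
    dim = len(normalized[0])
    mean = [sum(t[i] for t in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


# Toy 2-D templates: two noisy views of the enrolled identity [1, 0].
enrolled = [1.0, 0.0]
track = [[0.8, 0.6], [0.8, -0.6]]

single = cosine(track[0], enrolled)
fused = cosine(fuse_track(track), enrolled)
print(f"{single:.2f} {fused:.2f}")  # prints: 0.80 1.00
```

In this toy case the noise in the two views points in opposite directions and cancels in the average, so the fused template matches the enrolled one better than either single view does.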
TensorRT vs. MXNet Performance

      Platform: NVIDIA Jetson TX1

      FPS         "Shallow" CNN   "Medium" CNN   "Deep" CNN
      MXNet       18              10             6
      TensorRT    32              19             12
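
The chart values on this slide pair up as 18 and 32, 10 and 19, and 6 and 12 FPS. The short calculation below expresses them as speedup factors. It assumes that the larger value in each pair is TensorRT, which matches the Jetson TX1 figures on the platform comparison slide; the dictionary layout is only for illustration.

```python
# FPS on NVIDIA Jetson TX1, read from the chart.
# Assumption: the higher value in each pair belongs to TensorRT.
fps = {
    "Shallow": {"MXNet": 18, "TensorRT": 32},
    "Medium": {"MXNet": 10, "TensorRT": 19},
    "Deep": {"MXNet": 6, "TensorRT": 12},
}


def speedup(entry):
    """Ratio of TensorRT throughput to MXNet throughput."""
    return entry["TensorRT"] / entry["MXNet"]


for name, entry in fps.items():
    print(f"{name}: {speedup(entry):.2f}x")
# prints:
# Shallow: 1.78x
# Medium: 1.90x
# Deep: 2.00x
```

Under that assumption the relative gain from TensorRT grows slightly with network depth, from about 1.8x to 2x.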
WHAT’S THE PROFIT?

Face recognition systems architectures
   Edge analytics system with VOCORD NetCam.AI cameras
    ■ One archive server
    ■ Connection: LAN, Wi-Fi
    ■ Diagram label: 95% of processing is here

   VS

   "Traditional" server architecture approach with regular IP cameras
    ■ Data center with many expensive rack servers
    ■ Connection: LAN
    ■ Diagram label: 95% of processing is here
Cost-Efficiency: 100 High Loaded Cameras
Edge computing with VOCORD NetCam.AI
  Cameras: USD 2,000 x 100 = USD 200,000
  Server for matching and archive: USD 10,000
  CAPEX: USD 210,000
  Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
  OPEX: USD 2,000 per year

VS

"Traditional" server architecture with IP cameras
  Cameras: USD 500 x 100 = USD 50,000
  Servers
    Detection: 2 servers, 4xCPU 32 cores each: USD 60,000
    Template extraction: 4 servers, 2 GPU Tesla P40 each: USD 120,000
    Server for matching and archive: USD 10,000
  CAPEX: USD 240,000
  Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
  OPEX: USD 30,000 per year
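
The CAPEX and OPEX figures on this slide can be combined into a cumulative cost over time. The sketch below does that arithmetic with the slide's numbers; the 1, 3, and 5 year horizons and the simple undiscounted sum are assumptions made here, not something stated in the talk.

```python
def total_cost(capex, opex_per_year, years):
    """Undiscounted cumulative cost in USD: CAPEX plus OPEX for each year."""
    return capex + opex_per_year * years


# Figures from the slide, for 100 high-load cameras.
edge_capex = 2_000 * 100 + 10_000                      # cameras + one server
server_capex = 500 * 100 + 60_000 + 120_000 + 10_000   # cameras + all servers
edge_opex = 2_000
server_opex = 30_000

for years in (1, 3, 5):  # horizons chosen for illustration
    edge = total_cost(edge_capex, edge_opex, years)
    server = total_cost(server_capex, server_opex, years)
    print(years, edge, server, server - edge)
# prints:
# 1 212000 270000 58000
# 3 216000 330000 114000
# 5 220000 390000 170000
```

The CAPEX gap is USD 30,000 at purchase; the OPEX difference of USD 28,000 per year is what widens it over time.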
WHAT’S NEXT?
•   Uploading various video analytics algorithms
•   Highly customized algorithms
•   Interacting cameras as a part of IoT
•   3D vision

Open Platform: Easy Algorithm Uploading
 ■ Facial recognition
 ■ License plate recognition
 ■ Vehicle types
 ■ Lost and found objects
 ■ Emergency cases
 ■ Behavioral analysis
Camera-Dependent Algorithm Customization

       Step 1. The camera collects images and uploads them to the server
       Step 2. The neural network is retrained on the server using the new images
       Step 3. The customized, lightweight neural network is uploaded back to the camera
Customization to restricted data
[Two plots, "Unrestricted data" and "Restricted data": FRR (vertical axis, 0 to 0.04) versus FAR (horizontal axis, log scale, 1.E-07 to 1.E-02 on the left plot and 1.E-07 to 1.E-01 on the right plot), each comparing a "Deep" neural network with a "Shallow" neural network.]

 ■ Unrestricted data: deeper DNNs provide better performance
 ■ Restricted data: the difference between deep and shallow networks is negligible
Intercamera Tracking

[Diagram: NetCam.AI #1 and NetCam.AI #2 linked by the attributes matched between them: Face, Bag, Jeans]
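
The diagram suggests that a person is re-identified across cameras by several attributes at once (face, bag, jeans). The talk gives no detail on the matching, so the following is only a sketch under stated assumptions: each camera produces a feature vector per visible attribute, and the per-attribute cosine similarities are combined as a weighted average over the attributes both cameras observed. The weights, vectors, and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_score(obs_a, obs_b, weights):
    """Weighted average similarity over attributes present in both observations."""
    total = 0.0
    weight_sum = 0.0
    for name, weight in weights.items():
        if name in obs_a and name in obs_b:
            total += weight * cosine(obs_a[name], obs_b[name])
            weight_sum += weight
    # No shared attributes means there is no evidence of a match.
    return total / weight_sum if weight_sum else 0.0


# Hypothetical weights and toy feature vectors.
weights = {"face": 0.6, "bag": 0.2, "jeans": 0.2}
camera_1 = {"face": [1.0, 0.0], "bag": [0.0, 1.0], "jeans": [1.0, 1.0]}
same_person = {"face": [1.0, 0.0], "bag": [0.0, 1.0]}   # jeans not visible
other_person = {"face": [0.0, 1.0], "bag": [0.0, 1.0]}  # same bag, other face

print(f"{match_score(camera_1, same_person, weights):.2f}")   # prints: 1.00
print(f"{match_score(camera_1, other_person, weights):.2f}")  # prints: 0.25
```

Renormalizing by the weights of the shared attributes keeps the score comparable when one camera cannot see an attribute, as with the jeans in the example.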
Obtaining 3D Models
■ Building a 3D object from synchronous snapshots from multiple cameras
■ Feature preprocessing for conjugate points search

Thank you for your attention! Questions?

                              E-mail: sales@vocord.com
                              Website: www.vocord.com