TensorRT Optimizations for Embedded Facial Recognition - Alexey Kadeishvili, CTO, Vocord - GTC On-Demand
Vocord Company: Main Facts
■ Developer of video surveillance and video analytics systems since 1999
■ Deep expertise in facial recognition
■ Top-rated in NIST and Megaface face recognition tests
■ NVIDIA Metropolis program member
Our customers and partners
www.vocord.com
Notable figures
250+ projects for public and private sectors
140 million faces in enrollment database in a single project
200,000 cameras are managed by VOCORD video analysis software
350,000 API requests per month to the VOCORD FaceMatica cloud
Geography: Europe, Middle East, SE Asia, East Asia, Latin America, Oceania
Face recognition products
■ VOCORD FaceControl: “Faces in the crowd” FR system
■ VOCORD FaceMatica: Face recognition engine in a Cloud
■ Face Recognition SDK: Face recognition engine SDK
■ VOCORD NanoFace: NVIDIA Jetson-based embedded face recognition solution
■ VOCORD NetCam: New generation face recognition camera
■ VOCORD FaceControl 3D: Free flow 3D facial recognition
All products support NVIDIA GPU
Main Factors Impacting Facial Recognition
■ Inbound image quality
■ Enrollment DB quality: something beyond control
■ Recognition engine: already works as in the Marvel movies
VOCORD Facial Recognition Engine
Top-ranked in the Megaface Face Scrub Open Challenge 2015-2018, with accuracy 91.76%
Top-ranked in the NIST Face Recognition Vendor Test 2016-2018:
TPR at FPR 10^-4 = 98.7%, TPR at FPR 10^-6 = 96.6%
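The NIST figures above are true positive rates (TPR) at a fixed false positive rate (FPR). As a minimal sketch of how such an operating point can be read off a set of comparison scores: the function name `tpr_at_fpr` and the toy scores are illustrative assumptions, not Vocord's or NIST's evaluation code.

```python
def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Return (threshold, TPR) for a threshold whose FPR does not exceed target_fpr.

    A comparison is accepted when its score is strictly above the threshold.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # How many impostor comparisons may be accepted without exceeding the target FPR.
    allowed = int(target_fpr * len(impostors))
    # Strictly above the (allowed + 1)-th highest impostor score there are
    # at most `allowed` impostor scores.
    threshold = impostors[allowed] if allowed < len(impostors) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return threshold, accepted / len(genuine_scores)


# Synthetic similarity scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]    # same-person comparisons
impostor = [0.6, 0.3, 0.2, 0.1]   # different-person comparisons

strict = tpr_at_fpr(genuine, impostor, 0.0)    # no false accepts allowed
loose = tpr_at_fpr(genuine, impostor, 0.25)    # one false accept in four allowed
```

Measuring a rate as low as 10^-6 this way needs millions of impostor comparisons, because `allowed` only reaches 1 once there are at least 10^6 impostor scores.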
Pose Invariance
[Figure: curves vs. FAR (1.E-07 to 1.E00) by head rotation group: Group 3, 30˚ to 45˚; Group 4, 45˚ to 60˚; Group 5, > 60˚. The legend also refers to enrollment DB angles of 60˚ and > 60˚.]
Image Resolution Impact
[Figure: True Identification Rate** (face identification probability, 0.7 to 1.0) vs. pixels between eyes L* (12 to 72). Annotations: “Recommended minimum”, “Optimal resolution”; sample faces at L = 48 pix and L = 24 pix.]
* L is the distance between the eyes, in pixels
** At FAR = 10^-4
How to improve recognition?
■ Inbound image quality (the quality of acquired face images): point of growth
■ Enrollment DB quality: something beyond control
■ Recognition engine: already works as in the Marvel movies
Different types of test datasets
[Figure. Source: NIST FRVT Report 2017 10 03]
“Controlled” dataset
[Figure: Algorithm A and Algorithm B. Source: NIST FRVT Report 2017 10 03]
“Uncontrolled” dataset
[Figure: Algorithm A and Algorithm B. Source: NIST FRVT Report 2017 10 03]
Controlled vs. Uncontrolled (FRR log scale)
[Figure: FRR vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment]
Controlled vs. Uncontrolled (linear scale)
[Figure: FRR vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment]
Hit the bottom: Images from IP camera
The Advantages of Edge Video Analysis
■ Face recognition onboard
■ No compression artifacts: the image is taken directly from the sensor
■ Dynamic Region of Interest for every intelligent algorithm
■ Algorithm adjustment for a particular camera setup
VOCORD NetCam.AI edge video analytics camera
Video Enhancement Onboard
Dynamic ROI enhances the quality of the image in the face area
[Figure: three images of the same scene: backlight, no enhancement; 12 bit image with static ROI; 12 bit image with dynamic ROI]
VOCORD NetCam.AI HW Features
■ High quality sensor
■ Automated lens control
■ NVIDIA Jetson TX1 GPU
VOCORD NetCam.AI Tech Specs
Camera specs
■ Resolution: 3 to 5 Mpix
■ Temperature range: -25 °C to +50 °C
■ Ingress protection: IP67
■ Dimensions: 20 x 71 x 150 mm
■ Power consumption: 15 W
Built-in facial recognition engine specs
■ Min face resolution for face recognition: 12 pixels between the eyes
■ Number of faces detected in one frame: up to 25
■ Latency of biometric template extraction: up to 150 ms per face
■ Face recognition performance: up to 32 faces/s
■ Inference framework: TensorRT
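The minimum face resolution in the specs is stated as a distance between the eyes. A minimal sketch of that check follows, assuming eye-center coordinates in pixels are already available from some landmark detector; the function names and sample coordinates are hypothetical, and only the 12-pixel limit comes from the specs.

```python
import math

MIN_EYE_DISTANCE_PIX = 12  # from the tech specs: 12 pixels between the eyes


def eye_distance(left_eye, right_eye):
    """Euclidean distance in pixels between two (x, y) eye centers."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def large_enough_for_recognition(left_eye, right_eye, min_distance=MIN_EYE_DISTANCE_PIX):
    """True if the face meets the minimum eye-distance requirement."""
    return eye_distance(left_eye, right_eye) >= min_distance


# Hypothetical landmark coordinates, for illustration only.
far_face = ((100, 50), (109, 50))    # eyes 9 pixels apart
near_face = ((100, 50), (130, 90))   # eyes 50 pixels apart
```

Measuring the distance between eye centers rather than the face box width keeps the check independent of how a particular detector crops the face.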
Performance on Different Platforms
[Bar chart: performance of “Shallow”, “Medium”, and “Deep” CNNs on NVIDIA Jetson TX1, Intel Movidius, and Qualcomm Snapdragon 820. Bar values shown: 32, 19, 12, 9, 6, 4, 2.2, 1.4, 0.9.]
Higher FPS Improves Accuracy
[Figure: FRR vs. FAR (1.E-07 to 1.E-02). Two groups of curves, “Single face” and “Track (multiple faces)”, each for “Deep”, “Medium”, and “Shallow” CNNs.]
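One possible reading of why tracks do better than single faces: a faster network sees more faces of the same person while they are in view, and the track is missed only if every one of those faces is missed. The sketch below illustrates this under a strong assumption that per-frame false rejections are independent. The slides do not state this model, and the function name and numbers are hypothetical.

```python
def track_frr(single_frr, fps, seconds_in_view):
    """FRR for a whole track, ASSUMING independent per-frame false rejections.

    The track is rejected only if all of its frames are rejected.
    """
    frames = max(1, int(fps * seconds_in_view))  # at least one face per track
    return single_frr ** frames


# Hypothetical numbers: 10% single-face FRR, person in view for 1 second.
slow_network = track_frr(0.10, fps=1, seconds_in_view=1.0)   # 1 frame in the track
fast_network = track_frr(0.10, fps=3, seconds_in_view=1.0)   # 3 frames in the track
```

In practice consecutive frames are correlated, so the real gain is smaller than this model suggests, and each extra comparison is also another chance for a false accept.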
TensorRT vs. MXNet Performance
[Bar chart: FPS with MXNet and with TensorRT for “Shallow”, “Medium”, and “Very” CNNs. Bar values shown: 32, 19, 18, 12, 10, 6.]
Platform: NVIDIA Jetson TX1
WHAT’S THE PROFIT?
Face recognition systems architectures
Edge analytics system with VOCORD NetCam.AI cameras
■ One archive server
■ Network: LAN, Wi-Fi
■ 95% of processing is done on the cameras
“Traditional” server architecture approach with regular IP cameras
■ Data center with many expensive rack servers
■ Network: LAN
■ 95% of processing is done in the data center
Cost-Efficiency: 100 High Loaded Cameras
Edge computing with VOCORD NetCam.AI
■ Cameras: USD 2,000 x 100 = USD 200,000
■ Server for matching and archive: USD 10,000
■ CAPEX: USD 210,000
■ Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
■ OPEX: USD 2,000 per year
“Traditional” server architecture with IP cameras
■ Cameras: USD 500 x 100 = USD 50,000
■ Detection: 2 servers, 4 x CPU, 32 cores each: USD 60,000
■ Template extraction: 4 servers, 2 GPU Tesla P40 each: USD 120,000
■ Server for matching and archive: USD 10,000
■ CAPEX: USD 240,000
■ Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
■ OPEX: USD 30,000 per year
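The CAPEX totals above follow directly from the listed items. A minimal sketch that recomputes them and adds OPEX over a chosen number of years is given below. The multi-year horizon and the names in the code are illustrative assumptions; the slide itself gives only CAPEX and yearly OPEX.

```python
# Cost items in USD, taken from the comparison above.
EDGE_CAPEX = {
    "cameras": 2_000 * 100,
    "matching_and_archive_server": 10_000,
}
TRADITIONAL_CAPEX = {
    "cameras": 500 * 100,
    "detection_servers": 60_000,
    "template_extraction_servers": 120_000,
    "matching_and_archive_server": 10_000,
}
EDGE_OPEX_PER_YEAR = 2_000
TRADITIONAL_OPEX_PER_YEAR = 30_000


def total_cost(capex_items, opex_per_year, years):
    """CAPEX plus OPEX accumulated over `years` (no discounting, an assumption)."""
    return sum(capex_items.values()) + opex_per_year * years


# Illustrative 5-year horizon (not from the slides).
edge_5y = total_cost(EDGE_CAPEX, EDGE_OPEX_PER_YEAR, 5)
traditional_5y = total_cost(TRADITIONAL_CAPEX, TRADITIONAL_OPEX_PER_YEAR, 5)
```

With these figures the edge setup is cheaper from the start, and the gap widens by USD 28,000 each year because of the OPEX difference.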
WHAT’S NEXT?
• Uploading various video analytics algorithms
• Highly customized algorithms
• Interacting cameras as a part of IoT
• 3D vision
Open Platform: Easy Algorithm Uploading
■ Facial recognition
■ Behavioral analysis
■ License plate recognition
■ Vehicle types
■ Emergency cases
■ Lost and found objects
Camera-Dependent Algorithm Customization
Step 1. The camera collects images and uploads them to the server
Step 2. The neural network is retrained on the server using the new images
Step 3. The customized, light-weight neural network is uploaded back to the camera
Customization to restricted data
[Figure: two panels, “Unrestricted data” and “Restricted data”, each showing FRR vs. FAR for a “Deep” and a “Shallow” neural network]
■ Deeper DNNs provide better performance on unrestricted data
■ On restricted data, the difference between deep and shallow networks is negligible
Intercamera Tracking
[Figure: tracking between NetCam.AI #1 and NetCam.AI #2; labeled attributes: Face, Bag, Jeans]
Obtaining 3D Models
■ Building a 3D object from synchronous snapshots from multiple cameras
■ Feature preprocessing for conjugate points search
Thank you for your attention! Questions?
E-mail: sales@vocord.com
Website: www.vocord.com