Investigation on how presentation attack detection can be used to increase security for face recognition as biometric identification ...

Investigation on how presentation attack detection can be used to increase
security for face recognition as biometric identification
Improvements on traditional locking system

Fredrik Öberg

Independent degree project – second cycle — Master thesis
Main field of study: Department of Information Systems and Technology
Credits: 30 hp
Semester, year: 10, 2021
Supervisors: Sebastian Försth (Dewire), Luca Beltramelli (Mid Sweden University)
Examiner: Mikael Gidlund, mikael.gidlund@miun.se
Degree programme: civil engineering computer science, 300 credits
Investigation on how presentation attack detection can be used to increase
security for face recognition as biometric identification
Fredrik Öberg                                                 2021–06–15

Abstract
Biometric identification is already used in society today: mobile phones
use fingerprints as well as other methods such as the iris and the face
itself. With the growth of technologies like computer vision, the Internet
of Things, and artificial intelligence, the use of face recognition as
biometric identification on ordinary doors has become increasingly common.
This thesis looks into the possibility of replacing regular door locks with
face recognition, or supplementing the locks to increase security, by using
a pre-trained state-of-the-art face recognition method based on a
convolutional neural network. A subsequent investigation concluded that
neural-network-based face recognition is highly vulnerable to presentation
attacks. This study investigates protection mechanisms against these
attacks by developing a presentation attack detection (PAD) and analyzing
its performance. The results from the proof of concept showed that local
binary pattern histograms (LBPH), used as presentation attack detection,
could help the state-of-the-art face recognition block up to 88% of the
attacks that the convolutional neural network approved without the
presentation attack detection. However, to replace traditional locks, more
work must be done, both to raise the percentage of attacks blocked by the
system and to cover more types of attack. Nevertheless, face recognition is
a promising technology for supplementing traditional door locks, enhancing
their security by complementing the authorization with biometric
authentication. The main contribution is therefore to show that, according
to the test results, a simple, older method such as LBPH can help modern
state-of-the-art face recognition detect presentation attacks. This study
also worked to make the PAD suitable for low-end edge devices, so that it
fits into environments where modern solutions are used, a requirement that
LBPH meets.
Keywords: Face Recognition, Presentation Attacks, Convolutional Neural
Network

Acknowledgements
First, I want to thank Dewire by Knightec, who gave me the opportunity to
do this thesis with them, and my supervisor there, Sebastian Försth.
Secondly, this thesis could never have been completed without the help of
my supervisor Luca Beltramelli at Mid Sweden University, who helped me
when I needed it and gave excellent feedback that improved the thesis.

Table of Contents
Abstract
Acknowledgements
List of Figures
List of Tables
Terminology / Notation

1 Introduction
  1.1 Background and problem motivation
  1.2 Overall aim
  1.3 Scope
  1.4 Research question
  1.5 Concrete and verifiable goals
  1.6 Outline
  1.7 Contributions

2 Theory
  2.1 Face Detection
      2.1.1 Haar-like cascade
      2.1.2 Histogram of Oriented Gradients
  2.2 Face Recognition
  2.3 Spoofing and Presentation Attack
  2.4 Face classification
  2.5 Methods
      2.5.1 Convolutional Neural Network
      2.5.2 Local Binary Pattern
      2.5.3 Principal component analysis
  2.6 Databases
      2.6.1 Face recognition
      2.6.2 Spoofing attacks databases
  2.7 Related work

3 Methodology
  3.1 Research area and strategy
  3.2 Proposed solution
  3.3 Dataset structure
  3.4 Choice of algorithms
      3.4.1 Face detection
      3.4.2 Face recognition
      3.4.3 Image classification
      3.4.4 Presentation attack detection
  3.5 Evaluation

4 Implementation
  4.1 Testing framework
      4.1.1 Presentation attacks
  4.2 Face recognition system
      4.2.1 Face detection
      4.2.2 Face recognition with CNN
      4.2.3 Image classification
  4.3 Presentation attack detection with LBPH
      4.3.1 Face detection
      4.3.2 LBPH training
      4.3.3 Image classification

5 Result
  5.1 Investigation of methods
      5.1.1 Face recognition
      5.1.2 Presentation attacks
  5.2 Implementation of systems
  5.3 Evaluation against the database
      5.3.1 Case one FR
      5.3.2 Case two PAD
      5.3.3 Case three PAD + FR

6 Discussion
  6.1 Development of system
      6.1.1 CNN face recognition
      6.1.2 Presentation attack detection
  6.2 Framework discussion
  6.3 Evaluation of results
  6.4 Ethical aspects
  6.5 Future work

7 Conclusions
  7.1 Concrete and verifiable goals
  7.2 Conclusion research question
  7.3 Overall conclusion and lessons learned
  7.4 Main contributions

References

List of Figures
  1    Cloud based face recognition system
  2    Illustration Haar-like features
  3    Face recognition process
  4    Standardization of weak point in ISO/IEC DIS 30107-1, 2016
  5    Convolutional neural network
  6    Max-pooling
  7    Local Binary Pattern
  8    Framework
  9    Folder structure
  10   CNN detection of a face
  11   Attack and real histogram distribution

List of Tables
  1   Database protocols
  2   CNN baseline confusion matrix
  3   CNN baseline results obtained from matrix
  4   PAD confusion matrix
  5   PAD baseline results obtained from matrix

Terminology

    CNN      Convolutional Neural Network
    CRISP-DM Cross-Industry Standard Process for Data Mining
    DNN      Deep Neural Network
    FR       Face Recognition
    IoT      Internet of Things
    KNN      k-Nearest Neighbors
    LBP      Local Binary Pattern
    PA       Presentation Attack
    PAD      Presentation Attack Detection
    SVM      Support Vector Machine

1     Introduction
This chapter explains the background of this study, which focuses on
examining the security of facial recognition as biometric identification,
as a replacement for or supplement to key systems. It then presents
concrete and verifiable goals for achieving the overall aim of the project.
Finally, it describes how this report is structured, what limitations the
study had, and the author's contribution.

1.1     Background and problem motivation
In recent years, biometric identification has become the preferred solution
to a wide range of problems involving identity checking, because it
provides more secure identification and verification [1]. One very common
method of biometric identification today is face recognition. Biometric
identification as an alternative to traditional locks has already been
applied in society: today's mobile phones use fingerprints, and other
methods such as the iris and the face itself exist as well. Of these three,
face recognition is the least intrusive in daily use because of how easy it
is to analyze images of faces. A recent survey [2], published in 2019,
identified and categorized over 330 contributions to deep learning-based
face recognition, a testament to the significant interest surrounding this
area in academia. A large part of that survey concerns the identification
process, which is simply the process of someone claiming to be a specific
person. After this process, an authentication process is needed to verify
or prove the claimed identity. Today this happens in the form of a
traditional locking system that uses a key or a password. Traditional locks
lead to many accounts, passwords, and more, and keeping track of these is
becoming increasingly complex, especially for systems that require high
security. To solve these problems with traditional systems, biometric
identification can be used, which is a process where parts of a person's
body are analyzed to identify the person. The use of this type of
identification has started to increase. It is used more and more in devices
like smartphones, laptops, and tablets to secure data and other sensitive
information, because of the uniqueness of biometric characteristics [3].
One of the identification methods mentioned earlier is face recognition.
Some challenges that need to be addressed for this method are low
resolution, pose variation, complex illumination, and motion blur. Face
recognition methods based on more traditional algorithms like support
vector machine (SVM), Eigenfaces, Fisherfaces, Metaface, and Bayesian faces
do not handle the problems mentioned above well. Furthermore, none of these
methods can handle unconstrained face matching, such as different lighting
and background every time. All of the above problems are described in [2].
One interesting focus of the survey was the Convolutional Neural Network
(CNN): out of the 330 contributions in the survey, 61% were based on CNNs
to solve different face recognition problems. These methods showed good
results on face verification, with up to 96% accuracy [2].

By combining these biometric identification methods with technologies such
as computer vision, the Internet of Things, artificial intelligence, and
cloud solutions, an ideal system has been designed for this study. It is
shown in figure 1 as a reference picture and is explained in detail later.
Based on that picture, and given that phones and computers already use face
recognition to identify users, a question arises: is it possible to use FR
to solve the access problem of traditional key systems?

This question has already been tackled, as there is research on how
ordinary doors with face recognition work. For example, intelligent doors
with face recognition are realized in [4] [5] [6]. However, face
recognition also introduces other problems, such as presentation attacks,
where the face recognition wrongfully grants access when attacked. The
article [7] states that a deep neural network (DNN) face recognition method
is highly vulnerable to presentation attacks if the model has higher than
90% accuracy. Since security is always a hot topic, this study focuses on
the security of facial recognition.
To explain why, consider that most people today use keys or codes to access
places. However, a key or a code does not confirm that it is the rightful
owner who accesses the site: there is no guarantee that people without
authorization will not enter using someone else's key or code.
One way to solve this problem is to develop a face recognition system that
can unlock doors with people's faces. Nevertheless, many open questions
remain. Would this system be more secure than regular locks, and would it
be safe for everyday use? How resilient is it to presentation attacks like
replay attacks? What pros and cons does such a system have? Alternatively,
can face recognition be used as a supplement to the already existing key
system? This study tries to answer all of the above questions.

              Figure 1: Cloud based face recognition system

1.2    Overall aim
This study aims to examine the security of facial recognition as biometric
identification, as a replacement for or supplement to key systems in
restricted areas or similar settings where not everyone is authorized to
have access. This is done by building a proof of concept that uses LBPH
from the Python library OpenCV, which was originally made for face
recognition but in this case acts as a PAD. The PAD is compared with a CNN
FR system to see how that system handles PAs. Furthermore, the aim assumes
that the developed system will be used in the reference architecture in
figure 1, which reflects popular technologies such as cloud computing and
IoT that industries are trying to adopt. The figure can be described as
follows. It is a full-fledged door locking system with an edge device that
captures and detects faces and checks whether the picture is a presentation
attack. Because this step runs on an edge device, the developed system must
be able to run on lower-end edge devices. The edge device then sends the
picture to the cloud. Then, in the cloud,
a face recognition process begins, and a decision is made on whether the
person is allowed to pass through the door. To do this, a face recognition
model with high accuracy is used. With the architecture in figure 1 as a
reference when developing the FR and the presentation attack detection
(PAD), this study investigates how facial recognition as biometric
identification can replace or supplement traditional systems. More
practically, the aim is to develop a PAD to protect against PAs based on
related work, and to use a pre-trained CNN model and evaluate its
vulnerability to PAs. This results in two systems, one for the PAD and one
for the FR, which work together to classify whether an input is a PA and
whether the person is allowed access. All of this is evaluated based on
accuracy and on how well PAs are blocked, using the Replay-Attack database
[8].
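
The interplay between the two systems can be sketched as a two-stage
decision. This is only an illustration under stated assumptions: the
function access_decision, its parameter names, and the three result labels
are hypothetical and not taken from the thesis implementation. The PAD and
the FR are passed in as plain callables so that the sketch stays independent
of any particular library.

```python
from typing import Callable, Optional, Set


def access_decision(
    image: object,
    is_presentation_attack: Callable[[object], bool],
    recognize: Callable[[object], Optional[str]],
    authorized: Set[str],
) -> str:
    """Return 'attack', 'denied', or 'granted' for one captured image.

    All names here are hypothetical placeholders for illustration.
    """
    # Stage 1 (edge device): stop immediately if the PAD flags the image.
    if is_presentation_attack(image):
        return "attack"
    # Stage 2 (cloud): identify the face, then check the authorization list.
    identity = recognize(image)
    if identity is None or identity not in authorized:
        return "denied"
    return "granted"
```

Running the PAD first means that an image flagged as an attack never
reaches the recognition stage, which matches the order of the steps
described for figure 1.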

1.3     Scope
This thesis has several limitations in scope. One of them stems from the
many ways of implementing a face recognition system. This thesis focuses on
convolutional neural networks as the face recognition method because they
are considered the state-of-the-art approach. To investigate and evaluate
whether a face recognition system can replace a traditional lock or
supplement it, this thesis focuses on how to protect against presentation
attacks and on how that protection affects the baseline of the
state-of-the-art face recognition. Furthermore, the study is based on the
architecture in figure 1 mentioned earlier, which means that the PAD must
be suitable for running on a low-end edge device. Due to the time span of
the thesis, this study only looks at high-resolution replay attacks,
because this type of attack is simple for a regular user to carry out.

1.4     Research question
The main research questions in this thesis are as follows:
   • How can a PAD be used to increase the security of a state-of-the-art
     face recognition model, such as a high-accuracy CNN model, in a
     locking system?
   • Can traditional locking systems be replaced with face recognition or
     be used as a supplement to increase the security of an existing locking
     system?

1.5     Concrete and verifiable goals
The concrete goals of the project are as follows:
   • Investigate what methods there are in face recognition and what meth-
     ods to protect against presentation attacks against these face recogni-
     tion methods.
   • Implement a face recognition system using a CNN model and a PAD
     suitable for running on a low-end edge device.
   • Implement a test environment to evaluate the PAD and the FR.
   • Evaluate the CNN model and the PAD against the presentation attack
     database.
   • Evaluate the PAD and the CNN model together against the presenta-
     tion attack database.
   • From the results, evaluate the possibility of increased security by
     replacing or supplementing a traditional locking system with face
     recognition, to see the strengths and weaknesses of facial
     recognition.

1.6     Outline
Chapter 1 describes the general background and what the purpose of this
project is. Chapter 2 explains the necessary theory and presents the related
works. Chapter 3 explains the method used to carry out this project to test
and validate the created systems. Chapter 4 describes the implementation
of the system. Chapter 5 presents the results. Finally, Chapters 6 and 7
present the discussion and conclusions.

1.7     Contributions
The thesis has been performed by Fredrik Öberg, under the supervision of
Sebastian Försth (Dewire) and Luca Beltramelli (Mid Sweden University).
Dewire by Knightec

2       Theory
This chapter describes the theoretical background for this thesis. It gives
the reader the theory needed to understand the coming chapters. Significant
parts are deep learning and facial recognition. The chapter also covers
previous work in the areas of this study.

2.1     Face Detection
Face detection is a technology in computer science that aims to detect and
locate faces in an image or a video stream. There are different methods to
accomplish this task. One of these is the CNN. Developing such networks
from scratch requires vast amounts of data and can be complex. If this is a
problem, a pre-trained model trained on millions of faces can be used
instead. There are also other commonly used methods, such as Haar-like
cascades, Histogram of Oriented Gradients (HOG) with SVM, and LBP cascades.
A comparison of these methods was made in [9], which showed that the
HOG+SVM approach is more robust and accurate than the LBP and Haar
approaches, with an average detection rate of 92.68%.

2.1.1    Haar-like cascade
The Viola-Jones algorithm, developed in 2001 by Paul Viola and Michael
Jones [10], is the basis of the Haar-like cascade. It is an object
detection framework that allows image features to be detected in real time.
Despite being an older framework, Viola-Jones is quite powerful, and it has
proven particularly effective for real-time face detection.

                  Figure 2: Illustration Haar-like features

Detection works by outlining a box on the image and then sliding this box
across the image. While the box moves over the image, it searches for
Haar-like features. These are features that the system can see in pictures
based on the distribution of black and white regions. By combining these
features, a face can be described. What these features look like is shown
in figure 2.
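
The black-and-white features can be illustrated with a small sketch in
plain Python. It evaluates one two-rectangle feature as the difference
between the pixel sums of two adjacent regions, using a summed-area table
(integral image) as in the Viola-Jones framework. The function names and the
sign convention (left half minus right half) are assumptions made for this
illustration.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2-D list of pixels."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half minus right half."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right
```

With the integral image, each rectangle sum needs only four lookups
regardless of the rectangle size, which is what makes evaluating many
features per window position fast.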

2.1.2    Histogram of Oriented Gradients
A feature descriptor is an algorithm that takes an image and outputs
feature descriptors, also called feature vectors. It encodes the
information into a series of numbers that act as a numerical "fingerprint",
which can differentiate one feature from another. This is the basis of how
the Histogram of Oriented Gradients (HOG) works. The method aims to
represent an image with as little data as possible while still capturing
what the picture shows.
HOG focuses on the structure or shape of an object. It provides the edge
directions by extracting the gradient and orientation of the edges. The
image is divided into small regions, and for each region the gradients and
orientations are calculated from the pixel values. Finally, HOG generates a
histogram for each of these regions separately, using the gradients and
orientations.
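
The per-region histogram step can be sketched in plain Python as follows.
This is a simplified illustration, not a full HOG implementation: the 9
orientation bins, the unsigned 0 to 180 degree range, the central-difference
gradient, and the name cell_histogram are assumptions, and the block
normalization used by complete HOG descriptors is omitted.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of gradient orientations for one region.

    `cell` is a 2-D list of grayscale values. Border pixels are skipped so
    that every gradient can use both neighbours.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    h, w = len(cell), len(cell[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation: opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

A region containing a single vertical edge puts all of its weight in the
first bin, while a horizontal edge lands in the bin that covers 90 degrees.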

2.2     Face Recognition
Face recognition is a digital technology that began to be developed in the
1970s [11] and has since advanced at a tremendous rate, essentially because
computers have become more powerful. Face recognition identifies or
verifies a person based on a digital image or video frame by comparing a
given image with the images in a database. The images in the database are
used to generate a model that knows all of them. Figure 3 illustrates the
process of a typical system.

                     Figure 3: Face recognition process
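
The comparison against a database can be sketched as a nearest-match
search. The sketch assumes that every face has already been converted into a
numeric feature vector; the function identify, the Euclidean distance, and
the threshold parameter are hypothetical choices for illustration and do not
describe the method used in this thesis.

```python
import math


def identify(probe, gallery, threshold):
    """Return the enrolled name closest to `probe`, or None if none is close.

    probe: feature vector of the captured face.
    gallery: dict mapping a name to that person's enrolled feature vector.
    threshold: largest distance that still counts as a match.
    """
    best_name, best_dist = None, float("inf")
    for name, enrolled in gallery.items():
        dist = math.dist(probe, enrolled)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None
```

Returning None when even the best match is too far away is what allows such
a system to reject people who are not in the database.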

   • Occlusion and partial occlusion, where part of the face is hidden,
     are some of the significant challenges of face recognition. It is
     difficult to recognize a face if some part of it is missing.
   • Low resolution: for example, pictures taken from surveillance video
     cameras contain tiny faces.
   • Digital noise: images are prone to several types of noise. This
     noise leads to poor detection and recognition accuracy.
   • Illumination: variations in illumination can drastically degrade the
     performance of the face recognition system. These variations can be
     caused by background light, shadow, brightness, and contrast.
   • Pose variation: frontal face reconstruction is required to match the
     face in the image with the face in the database.
   • Expressions: the facial expressions we use to express our feelings
     can affect the FR.
   • Aging: a natural factor that changes the face over time.
   • Plastic surgery: after plastic surgery, a person's face may be
     unknown to the existing face recognition framework.
Based on these factors, there are several approaches to facial
recognition. They can be grouped into geometry-based methods, holistic
methods, feature-based methods, hybrid methods, and deep learning
methods.
   • Geometry-based Methods: This is one of the first proposed
     methods for face recognition. It works by finding a set of
     facial landmarks and measuring the positions and distances between them.
   • Holistic Methods: These represent faces using the entire face region.
     Many of these methods work by projecting face images onto a
     low-dimensional space.
   • Feature-based Methods: These leverage local features
     extracted at different locations in a face image.
   • Hybrid Methods: These combine techniques from holistic and
     feature-based methods. Such combined solutions were state
     of the art before deep learning became widespread.
   • Deep Learning Methods: CNNs are the most common type of deep
     learning method for face recognition because of their capability to
     handle unconstrained environments. One drawback of CNNs is
     the amount of training data they need and how long training takes.

2.3     Spoofing and Presentation Attack
Face recognition as a method for biometric identification has been used
in publicly available devices since 2009 and earlier by large companies
such as Lenovo, Asus, and Toshiba. However, [12] concludes that it is
possible to bypass the face recognition of all three companies.
As mentioned in Chapter 1, state-of-the-art face recognition such as CNN
tends to have very high accuracy. Despite this, it is not always ideal to
use, because the system should also protect itself against attacks, and
there are many types of attacks. [13] tackles this problem by developing a
secure framework that protects the privacy of the data by offloading the
data from the edge to the cloud.

  Figure 4: Standardization of weak point in ISO/IEC DIS 30107-1, 2016

A more general view is shown in figure 4 (ISO/IEC DIS 30107-1, 2016), which
marks the weak points of a biometric system. Attacks on the biometric
sensor (point 1) are called direct attacks or presentation attacks (PAs).
Attacks at points 2 to 9 are called indirect attacks. A presentation attack
thus uses biometric data to attack the system: the attacker presents
biometric data, obtained directly from the person, online, or from existing
databases, so that the system wrongfully accepts it. These types of attacks
are therefore feasible to create.
Protecting against PAs means developing countermeasures that identify
whether the presented biometric sample is a false presentation. Such a
system is called PAD (presentation attack detection). Some variations of
PADs are frame-based, which use only a single image to classify face
samples and can therefore output a decision quickly, and video-based,
which require a video recording of a certain length to classify the
samples. Other methods require human interaction, like challenge-response.
When it comes to PAs, there are multiple ways to carry them out. Morphed
face attacks are one way of attacking a system. [14] investigated the
vulnerability of biometric systems to such morphed face attacks. The work
resulted in two new databases, created by printing and scanning digitally
morphed images using two different scanners, and in an evaluation of the
techniques proposed to detect morphed face images. Other databases have
also been created to test and train PAD and FR systems to handle this
type of attack. One of them, REPLAY-ATTACK, is used in this thesis.
Another paper (6313548) studied the effectiveness of Local Binary Patterns
in face anti-spoofing and used REPLAY-ATTACK for evaluation. [15] also
used it for image-based object spoofing detection, trying to improve
spoofing detection by concatenating multiple color schemes to train the
model, which shows promising results compared with other PADs.
The state-of-the-art method of developing PADs is to build CNN models.
A problem with CNN-based PADs is that they need large amounts of data to
train correctly. As [16] mentions, the numerous parameters in these deep
learning-based detection methods cannot be learned as well as they could
be because of limited data.

2.4     Face classification
Face classification means classifying the facial features extracted from a
person by the recognition step and comparing them to the database. More
precisely, person A has certain features and person B has other features,
and these must be classified to decide which person it is. To achieve
this, a classification algorithm can be applied. Some classification
algorithms are SVM, k-NN and Gaussian Naïve Bayes.
   • SVM: The objective of the support vector machine algorithm is to find
     a hyperplane in N-dimensional space that distinctly classifies the data
     points.
   • k-NN: The k-nearest neighbors (k-NN) algorithm is a simple,
     easy-to-implement supervised machine learning algorithm that can solve
     classification and regression problems.
   • Gaussian Naïve Bayes: Based on Bayesian classification methods, Naive
     Bayes classifiers rely on Bayes's theorem, an equation describing the
     relationship of conditional probabilities of statistical quantities. In
     Bayesian classification, we are interested in finding the probability of
     a label given some observed features.
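As an illustration of how such a classifier assigns a person to a feature vector, the following is a minimal k-NN sketch in plain Python. The two-dimensional feature vectors, the labels `person_a` and `person_b`, and the function name `knn_predict` are hypothetical and chosen only for this example; real face features have many more dimensions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors for two people.
train = [
    ([0.10, 0.20], "person_a"),
    ([0.20, 0.10], "person_a"),
    ([0.15, 0.15], "person_a"),
    ([0.90, 0.80], "person_b"),
    ([0.80, 0.90], "person_b"),
    ([0.85, 0.85], "person_b"),
]
prediction = knn_predict(train, [0.2, 0.2])
```

The query is labeled with whichever person owns most of its three nearest neighbors.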

2.5     Methods
There are many ways to perform face recognition, so this chapter explains
a couple of these methods.

2.5.1   Convolutional Neural Network
A convolutional neural network (CNN) is a method in the field of deep
learning. It is a common and well-known method for image classification,
object classification, and face classification. A CNN takes an input image
and runs it through a number of different layers. An example of this setup
is shown in figure 5. What the layers do is explained below.

                  Figure 5: Convolutional neural network

   • Input Layer: This layer takes an image. A grayscale image has a basic
     two-dimensional structure, but when color is included the image
     becomes three-dimensional: it is encoded into color channels,
     typically RGB, so each pixel position (width and height) holds one
     intensity per channel. To use the image in the CNN, it needs to be
     reshaped into a single column. As an example, a 28x28 image has 784
     values and is converted into a 784x1 column. So, if there are n
     training images, the input has shape (784, n).
   • Convolution Layer: The main focus of this layer is to extract features.
     The input image is connected to the convolution layer, which performs
     a convolution operation: a filter of a set size is slid over the image.
     As an example, if the image is 4x4 and the filter is 3x3, the filter
     visits four positions in the image and produces a 2x2 matrix.
     Equation (1) is the general formula for the output size, where N is
     the image size and F is the filter size. If the size of the output
     needs to be controlled, padding can be added. Equation (2) shows the
     version that adds p as padding.

     $$(N \times N) * (F \times F) = (N - F + 1) \times (N - F + 1) \tag{1}$$
     $$(N + 2p - F + 1) \times (N + 2p - F + 1) \tag{2}$$

   • Pooling Layer: This layer reduces the spatial volume of the image and
     is usually placed between two convolution layers. One of the more
     popular pooling layers is max pooling, where the maximum value in each
     window is kept. Figure 6 shows a 2x2 max-pooling process. Without
     pooling, the following layers would be computationally expensive.

                           Figure 6: Max-pooling

   • Fully Connected Layer: A fully connected layer involves weights,
     biases, and neurons. It connects neurons in one layer to neurons in
     another layer. It is used to classify images into different categories
     by training. In place of fully connected layers, conventional
     classifiers like SVM can be used as well. However, a fully connected
     layer is generally added to make the model end-to-end trainable.
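The layer arithmetic described above can be checked with a small plain-Python sketch. It assumes stride 1 and square inputs, and the function names and the diagonal example filter are illustrative choices, not part of the system in this thesis. As in most CNN libraries, the filter is applied without flipping.

```python
def conv_output_size(n, f, p=0):
    """Output side length for an n x n input, f x f filter and padding p."""
    return n + 2 * p - f + 1

def convolve_valid(image, kernel):
    """Slide the kernel over the image without padding, stride 1."""
    n, f = len(image), len(kernel)
    out = conv_output_size(n, f)
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(f) for b in range(f))
            for j in range(out)
        ]
        for i in range(out)
    ]

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # illustrative 3x3 filter
features = convolve_valid(image, kernel)     # 4x4 input, 3x3 filter -> 2x2
pooled = max_pool_2x2(image)                 # 4x4 input -> 2x2
flattened = [pixel for row in image for pixel in row]  # single column of 16 values
```

With padding p = 1 the same 3x3 filter would keep the 4x4 size, which is the effect of padding on the output size.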

2.5.2    Local Binary Pattern

Local Binary Pattern (LBP) is a simple yet very efficient texture operator
which labels the pixels of an image by thresholding the neighborhood of
each pixel and considers the result as a binary number, as shown in
figure 7. The general way to describe this process is equation (3), where
S is defined as in equation (4). The obtained values can then be used to
create a histogram of the features, which is then combined with other
feature histograms. The resulting histogram is used as input for different
classification methods.

Due to its discriminative power and computational simplicity, the LBP
texture operator has become a popular approach in various applications. It
can be seen as a unifying approach to the traditionally divergent
statistical and structural models of texture analysis. Perhaps the most
important property of the LBP operator in real-world applications is its
robustness to monotonic gray-scale changes caused, for example, by
illumination variations. Another important property is its computational
simplicity, which makes it possible to analyze images in challenging
real-time settings.

$$LBP(g_{px}, g_{py}) = \sum_{p=0}^{P-1} S(g_p - g_c) \times 2^p \tag{3}$$

$$S(x) = \begin{cases} 1 & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases} \tag{4}$$
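The LBP operator can be sketched in plain Python as follows. The sketch assumes the common convention that a neighbor contributes a 1 bit when its gray value is greater than or equal to the center value, a 3x3 neighborhood (P = 8), and a clockwise neighbor order starting at the top-left pixel. The names `OFFSETS`, `lbp_code`, and `lbp_histogram` are illustrative.

```python
# Neighbour offsets (dy, dx), clockwise from the top-left pixel (an assumed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """LBP label of pixel (y, x): threshold the 8 neighbours against the centre."""
    center = image[y][x]
    code = 0
    for p, (dy, dx) in enumerate(OFFSETS):
        if image[y + dy][x + dx] >= center:  # S(g_p - g_c) = 1 when g_p >= g_c
            code += 2 ** p
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP labels over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[lbp_code(image, y, x)] += 1
    return hist

patch = [[5, 9, 1], [4, 6, 7], [2, 3, 8]]
code = lbp_code(patch, 1, 1)

# A monotonic gray-scale change (here: brighter and higher contrast)
# leaves the label unchanged.
brighter = [[2 * value + 10 for value in row] for row in patch]
same_code = lbp_code(brighter, 1, 1)
```

The second call illustrates the robustness to monotonic gray-scale changes: only the ordering of neighbor and center values matters, not their absolute levels.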

Step 5 Compute the covariance matrix.
Step 6 Calculate the eigenvectors with their related eigenvalues.
Step 7 Keep the K eigenvectors with the largest eigenvalues.
For face detection and recognition, the Eigenface approach is considered
by many to be the first working facial recognition technology. It served as
the basis for one of the top commercial face recognition technology prod-
ucts. Since its initial development and publication, there have been many
extensions to the original method and many new automatic face recognition
systems. Eigenfaces are a baseline comparison method to demonstrate the
system’s minimum expected performance.

2.6     Databases
A face recognition model is trained on a database of pictures. This chapter
discusses the different types of databases that can be used to create FR
models such as CNNs. To get the best result, the database needs to contain
the kinds of data that fit the situation. For example, age, lighting, and
the environment can significantly impact the result. Another essential
part is the problem of presentation attacks, where the required training
data depends on the PAD type. The chapter therefore also addresses some of
the databases for different kinds of attacks on systems.

2.6.1    Face recognition
To develop a successful FR system, one must consider what kind of problems
the system has to deal with, and this requires a database to train the
model. The choice of database must fit the purpose of the model. For
example, if the purpose is to make an FR system for children, then the
database must contain images, and variations of those images, that
represent children. Other factors like occlusion, low resolution, noise,
plastic surgery, aging, illumination, pose, and expressions can affect
the result.

2.6.2    Spoofing attacks databases
To handle presentation attacks, databases must be available to test
whether the system can handle several types of attacks. This chapter
explains what types of databases exist and their different purposes. Some
are for latex masks, some are for print attacks, and others are for replay
attacks. These databases can later be used to test the created system
against metrics for face recognition.
   • MOBIO: This database consists of bi-modal (audio and video) data
     from 152 people, 100 males and 52 females. The data was collected
     from 2008 to 2010 at six different sites in five different
     countries. [17]
   • Replay-Attack: A 2012 database made up of 1300 video clips
     of photo and video attack attempts on 50 clients. It is divided into
     four groups: training data ("train"), to be used for training
     the anti-spoof classifier; development data ("devel"), to be used for
     threshold estimation; test data ("test"), with which to report error
     figures; and enrollment data ("enroll"), which can be used to verify
     spoofing sensitivity of face detection algorithms. [8]
   • Replay-Mobile: A database similar to Replay-Attack. It consists of
     1190 video clips of photo and video attack attempts on 40 clients
     under different lighting conditions, and has the same groups as
     Replay-Attack [18].
   • SWAN: The SWAN-Idiap dataset comprises 150 subjects captured in
     six different sessions reflecting real-life scenarios of
     smartphone-assisted authentication. One of the unique features of this
     dataset is that it is collected in four different geographic locations
     representing a diverse population and ethnicity. Additionally, it
     contains a multi-modal Presentation Attack (PA) or spoofing dataset
     using low-cost Presentation Attack Instruments (PAI) such as print and
     electronic display attacks. [19]
   • WMCA: The Wide Multi-Channel Presentation Attack (WMCA) database
     consists of 1941 short video recordings of both bona fide presentations
     and presentation attacks from 72 different identities. The data is
     recorded from several channels, including color, depth, infra-red, and
     thermal [20].

2.7     Related work
As mentioned in the introduction of this thesis, much work has been done on
face recognition in recent years. Of about 330 contributions analyzed in a
2019 survey, 61% were based on CNNs to solve different face recognition
problems, showing good results on verification with up to 96% accuracy.
A large part of this survey focused on the problems FR must overcome to
achieve good face recognition, some of which play a big role depending on
the purpose of the model. One of these problems was still-image-based face
recognition, where considerable progress has been made in recent years in
constrained environments. More recently, researchers have focused on
unconstrained face recognition, where various poses, illuminations,
expressions, blur, ages, and occlusions are problems. [2] However, FR
models with high accuracy bring other problems to light, as researched in
the article "Deeply vulnerable: a study of the robustness of face
recognition to presentation attacks". That article investigates the
vulnerability of DNN-based FR models to PAs. As said earlier, DNN-based FR,
like CNN, has recently outperformed other methods by a significant margin.
Nevertheless, maximizing recognition performance alone is not sufficient.
The system should also be capable of resisting various kinds of attacks,
including PAs. The study shows that high accuracy in DNN-based FR is highly
correlated with vulnerability to PAs once the accuracy reaches 90% or
more [7]. This is also shown in [21], which summarizes the lessons learned
about spoofing and anti-spoofing in face biometrics and highlights open
issues and future directions. As they put it: "Without spoofing
counter-measures, most of the state-of-the-art facial biometric systems
are indeed vulnerable to attacks since they try to maximize the
discriminability between identities without regards to whether the
presented trait originates from a legitimate living client or not."
As for system development specifically for door access, some articles have
focused on developing a low-cost embedded facial recognition system for
door access control using deep learning, running on an edge device.
However, one vulnerability found there is, as mentioned earlier, that a
phone showing the face can be used to open the door. [4] Other papers have
also developed smart door systems, such as [5] and [6].

3     Methodology
This chapter describes the methods used to fulfill the concrete and
verifiable objectives described in Chapter 1.5. It first explains the work
strategy used to achieve the goal of this study, which is followed during
the thesis period until the project has been completed. The last part
describes the testing and validation of the system.

3.1     Research area and strategy
During the work, conclusive research with experimental data has been
conducted, using a mixture of the two agile work strategies Scrum and XP.
Scrum was chosen as it is well suited for development projects where the
requirements often change during the work. With Scrum, it is in these
cases easy to change the requirements set at the beginning of the project.
The method is also suitable when there is uncertainty about which parts
the project will have. XP was used together with Scrum to enable backlog
changes during an ongoing sprint, as rapid changes and varying
requirements could occur. Scrum has a sprint length of at least two weeks,
while XP has a length of one to two weeks.

Furthermore, the mix between Scrum and XP has meant that the work has
been focused on a product backlog. This backlog has constantly been
changing based on need. These changes could also take place during an
ongoing sprint, something that Scrum as the only strategy would not have
allowed.

In the initial stage of the work, a feasibility study was carried out. This
helped to produce information to create a solid foundation to work from. In
meetings with Knightec Dewire and Mid Sweden University, the scope and area
of the project were clarified. This information has since been of great
importance for the collection of requirements on which the work is based.

During the feasibility study, information was obtained from similar
existing theses, in order to investigate how these solutions work and
whether and how they could affect the project's direction. The feasibility
study showed that similar systems and software exist today, but with some
differences.

3.2     Proposed solution
The proposed approach is to build a proof of concept that uses LBPH from
the OpenCV library (cv2). LBPH was originally made for face recognition
but will in this case act as a PAD. The proof of concept examines whether
it is possible to replace regular locking systems with state-of-the-art
systems such as CNN-based face recognition.
The study follows the CRoss Industry Standard Process for Data Mining
(CRISP-DM). The first focus is on finding the purpose of the project
through business understanding. The second step is to understand what type
of data is needed, in this case which database to use and what types of
attacks to cover. This leads to preparing the data to be used. Finally,
modeling and testing are carried out so that the result can be evaluated.
To make this possible, an investigation was first carried out to complete
one of the concrete and verifiable goals. This investigation showed that
CNN face recognition has very high accuracy, which makes it a good
candidate for the study. Further research into related topics introduced
the concept of presentation attacks, which are attacks on the FR system,
and a couple of articles show a high correlation between high-accuracy FR
systems and vulnerability to PAs. Based on that information, the study
looks into the special case presented in Chapter 1: an IoT and cloud-based
solution for a locking system.

With this in mind, the proof of concept includes an FR and a PAD together
with a testing framework, to see whether the PAD can efficiently protect
the high-accuracy CNN model against PAs. The PAD must be suitable to run
on low-end edge devices. The proof of concept also covers what type of data
is tested on the system. The choice of algorithms is explained further in
this chapter: first the dataset, then the choice of algorithms for face
detection, face recognition, image classification, and presentation attack
detection.

3.3     Dataset structure
The database chosen for PAs was the REPLAY-ATTACK database, because the
related article [15] uses LBP, which is similar to this study, and the PAD
chosen in that article gave a good result. Another reason is how easy it
is to perform a replay attack on a system. The database is divided into
four types of data, train, dev, test, and enrollment, which gives the user
a comprehensive ability to construct a PAD. Furthermore, the dataset comes
with protocols for different types of attacks, listed in Table 1. The PAD
and FR will train on these protocols. As seen in the table, a good
variation of training and testing data is available. The PAD and the CNN
model will each have their own collection of training data, but both will
use the same database.

                        Table 1: Database protocols

                Hand-Attack          Fixed-Support        All Supports
 Protocol       train  dev    test   train  dev    test   train  dev    test
 Print          30     30     40     30     30     40     60     60     80
 Mobile         60     60     80     60     60     80     120    120    160
 Highdef        60     60     80     60     60     80     120    120    160
 Digitalphoto   60     60     80     60     60     80     120    120    160
 Photo          90     90     120    90     90     120    180    180    240
 Video          60     60     80     60     60     80     120    120    160
 Grandtest      150    150    200    150    150    200    300    300    400

3.4     Choice of algorithms
In this study, a number of choices had to be made because of how broad the
options are. This chapter addresses the choices for the critical parts of
the system: which face detection, face recognition, and classification
methods are used, and which methods are used to detect presentation
attacks.

3.4.1   Face detection
This study focuses on face recognition and the detection of PAs, not on
the detection of the face itself. The choice of face detection method is
therefore based on the architecture presented in Chapter 1, which means it
must be able to run on a low-end edge device. To keep it accessible, the
system uses OpenCV (cv2) and Python's face recognition library. The CNN
pipeline uses Histogram of Oriented Gradients (HOG) to detect the face,
and the PAD uses the cv2 CascadeClassifier.

3.4.2   Face recognition
For face recognition, two choices are needed: one for the CNN face
recognition and one for the PAD. For CNN models, many different
pre-trained models can be used since, as mentioned in Section 2.5.1,
researchers have developed different kinds of CNN architectures. In this
study, Dlib's face recognition model is used from Python. Dlib's model is
a version of ResNet-34, developed by [22], but with fewer layers and the
number of filters reduced by half. This version was made by Davis King and
was trained on several different datasets, including images he scraped
from the internet, the FaceScrub dataset [23], and the VGG dataset [24].
On the Labeled Faces in the Wild (LFW) [25] dataset the network compares
well to other state-of-the-art methods, reaching 99.38% accuracy. [26]

3.4.3    Image classification
The last part is image classification. The CNN model uses regular
Euclidean distance with a specific confidence threshold. The results of
the Euclidean distance comparisons then go into a voting system that
decides which of the known faces has the most support. The LBPH uses the
histogram generated for each face to compare it with the input and, with a
calculated confidence, decides how close the face is to the real one.
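The Euclidean distance comparison followed by voting can be sketched as below. This is a minimal illustration, not the implementation used in this thesis: the three-dimensional encodings, the names `alice` and `bob`, the function name `identify`, and the threshold value 0.6 are all assumptions made for the example.

```python
import math
from collections import Counter

def identify(known, query, threshold=0.6):
    """Vote among all known encodings that lie within the distance threshold.

    known is a list of (encoding, name) pairs. Returns the name with the
    most votes, or None when no known encoding is close enough.
    """
    votes = Counter(
        name for encoding, name in known
        if math.dist(encoding, query) <= threshold
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical, heavily shortened face encodings.
known = [
    ([0.10, 0.20, 0.30], "alice"),
    ([0.12, 0.18, 0.33], "alice"),
    ([0.90, 0.80, 0.70], "bob"),
]
match = identify(known, [0.11, 0.20, 0.31])
no_match = identify(known, [5.0, 5.0, 5.0])
```

A lower threshold makes the system stricter (fewer false matches, more rejections), so the value would need to be tuned on development data.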

3.4.4    Presentation attack detection
The PAD uses LBPH because of the promising result in [15], which
worked with LBP with different color schemes. Another reason for using an
LBP-based method is that it is not computationally demanding, which makes
it well suited for edge devices. More state-of-the-art PADs that use a CNN
are problematic because of the amount of training data they need. As [16]
mentions, the available PA databases are too small for what a CNN needs.
The article also states that CNN and LBP have a similar structure, which
can make LBP a good choice. What the authors did was to use LBP to reduce
the CNN so that it did not need as much data, which, as mentioned earlier,
is a problem.

3.5     Evaluation
To understand how good or bad the created system is, it can be evaluated
against performance metrics. In this chapter, some evaluation metrics for
biometric recognition systems are explained.
The generic way of evaluating this kind of system is with metrics for
binary classification systems. The idea is to decide whether a sample is
positive or negative. Equation (6) assigns the label positive or negative
depending on the function M(x), which returns the score of the face model
and is compared against a certain threshold r.

$$label = \begin{cases} \text{positive} & \text{if } M(x) \geq r \\ \text{negative} & \text{if } M(x) < r \end{cases} \tag{6}$$

These metrics for binary classification systems have four possible outcomes,
listed below.
   • true positive (TP) when x is a positive sample and is labeled as a
     positive sample.
   • true negative (TN) when x is a negative sample and is labeled as a
     negative sample.
   • false positive (FP) when x is a negative sample and is labeled as a
     positive sample.
   • false negative (FN) when x is a positive sample and is labeled as a
     negative sample.
Furthermore, based on these values, the following scores can be computed.
   • sensitivity, recall, hit rate, or true positive rate (TPR):
     $TPR = TP / (TP + FN)$
   • specificity, selectivity or true negative rate (TNR):
     $TNR = TN / (TN + FP)$
   • precision or positive predictive value (PPV):
     $PPV = TP / (TP + FP)$
   • negative predictive value (NPV):
     $NPV = TN / (TN + FN)$
   • False Rejection Rate (FRR): see equation (9)
   • False Acceptance Rate (FAR): see equation (8)
   • half total error rate (HTER): see equation (7)
To test a spoofing detection system, two types of errors must be handled:
either a genuine access is rejected (false rejection), or an attack is
accepted (false acceptance). The performance of a spoofing detection
system is measured with the Half Total Error Rate (HTER), which combines
the False Rejection Rate (FRR) and the False Acceptance Rate (FAR) and is
defined in equation (7).

$$HTER(\%) = \frac{FAR + FRR}{2} \times 100 \tag{7}$$

$$FAR = \frac{FP}{FP + TN} \tag{8}$$

$$FRR = \frac{FN}{TP + FN} \tag{9}$$

In an ideal spoofing detection system, both FAR and FRR should be 0.
Another metric commonly used to evaluate a biometric system is the Equal
Error Rate (EER). This error rate is obtained at the threshold that gives
the same FAR and FRR.
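The metrics can be computed directly from scores and ground truth, as in the following sketch. It takes FAR as FP / (FP + TN), the usual false positive rate, and FRR as FN / (TP + FN). The scores, the threshold r = 0.5, and the function names are made up for illustration.

```python
def label(score, r):
    """Equation (6): positive when the model score reaches the threshold r."""
    return "positive" if score >= r else "negative"

def confusion_counts(scores, truths, r):
    """Count TP, TN, FP, FN. truths holds True for genuinely positive samples."""
    tp = tn = fp = fn = 0
    for score, truth in zip(scores, truths):
        predicted_positive = label(score, r) == "positive"
        if predicted_positive and truth:
            tp += 1
        elif predicted_positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def error_rates(tp, tn, fp, fn):
    """Return FAR, FRR (as fractions) and HTER (in percent)."""
    far = fp / (fp + tn)          # accepted negatives among all negatives
    frr = fn / (tp + fn)          # rejected positives among all positives
    hter = (far + frr) / 2 * 100
    return far, frr, hter

# Made-up model scores: three positive samples followed by three negative ones.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
truths = [True, True, True, False, False, False]
tp, tn, fp, fn = confusion_counts(scores, truths, r=0.5)
far, frr, hter = error_rates(tp, tn, fp, fn)
```

Sweeping r over the range of scores and picking the value where FAR and FRR are closest would give an estimate of the EER threshold.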

4     Implementation
This chapter covers the CNN face recognition implementation in Python,
which uses Dlib, a C++ toolkit containing machine learning algorithms. As
mentioned earlier, a pre-trained and modified version of ResNet-34 is used
for the FR. The developed PAD uses LBPH to train on the faces in the
database. The chapter also covers how these two models, PAD and FR, are
used together to evaluate whether a face was real or an attack.
Furthermore, the system is evaluated by developing a framework that can
attack the system with the specific protocols the database provides.

4.1     Testing framework
The framework created to test the CNN and the PAD is based on testing
three different cases in the system, illustrated in figure 8. The first
case represents the FR result without the PAD, and the second is the
result of the PAD alone. This evaluates the two systems separately. The
third case is when the PAD is applied to the FR system, which shows how
the PAD affects the FR when faces labeled PA are removed before
recognition.

                           Figure 8: Framework

Furthermore, the created framework is a terminal application that works
with arguments, which are listed below. A trained CNN model and a PAD
have been created with the corresponding data, which is the replay
database. The CNN is trained on the real-access videos in the dataset.
The PAD can be trained in several ways depending on which system is being
tested, because several types of attack can be carried out on the system,
as explained in more detail later.
   • Testing-specific arguments

        – test-method: which test to run (CNN only, PAD only, or both
          together)
        – protocal: which protocol to run
   • CNN-specific arguments
        – detection-method: face detection model to use, either 'hog' or
          'cnn'
        – encodings: path to the serialized db of facial encodings for the
          CNN
        – dataset: path to the input directory of faces + images
        – Image: path to the input image
        – Inputvideo: path to the input video
        – display: whether or not to display the output frame on screen
        – outputvideo: path to the output video
   • LBPH-specific arguments
        – lbphcascade: path to the face detection cascade for the LBPH
        – lbphyml: path to the yml file which contains the trained data for
          the PAD
        – lbphlabels: path to the pickle file which contains the associated
          names
        – lbphsavecapture: save path for the PAD model
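
The argument list above could be wired up with Python's argparse module
roughly as follows. This is only a sketch: the argument names are taken
from the list, but the "--" prefix, the allowed values for test-method,
the defaults, and the choice to treat display as an on/off flag are all
assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a command-line parser mirroring the listed arguments."""
    parser = argparse.ArgumentParser(
        description="Sketch of the FR/PAD testing framework arguments")

    testing = parser.add_argument_group("Testing-specific arguments")
    # The three values below are assumed names for the three test cases.
    testing.add_argument("--test-method", choices=["cnn", "pad", "both"],
                         default="both", help="which test to run")
    testing.add_argument("--protocal", help="which protocol to run")

    cnn = parser.add_argument_group("CNN-specific arguments")
    cnn.add_argument("--detection-method", choices=["hog", "cnn"],
                     default="hog", help="face detection model to use")
    cnn.add_argument("--encodings",
                     help="path to serialized db of facial encodings")
    cnn.add_argument("--dataset",
                     help="path to input directory of faces + images")
    cnn.add_argument("--Image", help="path to input image")
    cnn.add_argument("--Inputvideo", help="path to input video")
    # Assumption: display is modelled as a boolean switch.
    cnn.add_argument("--display", action="store_true",
                     help="display the output frame on screen")
    cnn.add_argument("--outputvideo", help="path to output video")

    lbph = parser.add_argument_group("LBPH-specific arguments")
    lbph.add_argument("--lbphcascade",
                      help="path to the face detection cascade")
    lbph.add_argument("--lbphyml",
                      help="path to the yml file with the trained PAD data")
    lbph.add_argument("--lbphlabels",
                      help="path to the pickle file with associated names")
    lbph.add_argument("--lbphsavecapture",
                      help="save path for the PAD model")
    return parser
```

Calling build_parser().parse_args() in the application's entry point
would then expose the values as attributes such as args.test_method and
args.detection_method.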

4.1.1   Presentation attacks
As mentioned earlier in this chapter, the system will be evaluated against
a set of protocols. These are created by the developers of the database
explained in section 3.2 and define what types of attacks can be carried
out on the system. The protocols act as attacks on the system in the three
cases explained earlier. First, FR is run without the PAD to establish the
baseline protection of the CNN FR. The same protocol is then run through
the PAD to detect as many PAs as possible. Finally, case three is run,
which combines cases 1 and 2. In these three cases, three attacks will
thus have been carried out, each evaluated with the HTER value explained
in section 3.4. Once the system has obtained results from the CNN and the
PAD, both separately and together, conclusions and discussion are
presented in the later chapters.
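
The three cases can be sketched as a small decision rule applied to every
sample of a protocol. The code below is an illustration, not the thesis
implementation: the Sample class, its field names and the counting helper
are hypothetical, and it assumes that in case 3 a face is accepted only
when the PAD labels it as real and the FR recognizes it.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    is_attack: bool  # ground truth: presentation attack or real access
    fr_match: bool   # the FR recognized an enrolled identity
    pad_real: bool   # the PAD labeled the face as real (not a PA)


def accepted(sample, case):
    """Return True if the sample is let through under the given case."""
    if case == 1:  # FR without the PAD
        return sample.fr_match
    if case == 2:  # PAD alone
        return sample.pad_real
    if case == 3:  # faces labeled PA are removed before the FR
        return sample.pad_real and sample.fr_match
    raise ValueError("case must be 1, 2 or 3")


def count_errors(samples, case):
    """Count (false acceptances, false rejections) for one case."""
    false_accepts = sum(
        1 for s in samples if s.is_attack and accepted(s, case))
    false_rejects = sum(
        1 for s in samples if not s.is_attack and not accepted(s, case))
    return false_accepts, false_rejects
```

Running count_errors once per case on the same protocol gives the error
counts needed to compare the baseline FR, the PAD alone, and the combined
system.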


4.2     Face recognition system
One of the concrete and verifiable goals was to implement a face
recognition system using a CNN model. The Python face_recognition library
was used to accomplish this goal. This library uses the modified version
of the resnet 32 CNN model, trained on over 3 million faces. The rest of
section 4.2 explains the implementation steps of the FR system created
with this CNN model.

4.2.1    Face detection
The first step of training the CNN model is to detect faces in the
pictures. The face detection part uses the Histogram of Oriented
Gradients (HOG), explained in detail in chapter 2.1.2, to speed up the
process. This method was chosen because testing showed a noticeable
increase in execution time when, for example, a CNN was used instead of
HOG.
Furthermore, when developing the CNN FR, three use cases were developed:
one where the system recognizes faces in a single image, one where it
recognizes faces in a live video stream from the webcam and outputs a
video, and one where it recognizes faces in a video file residing on disk
and writes the processed video to disk. The step-by-step process below
explains how the detection works.
   • Step 1 Regardless of what type of media the user is using, the
     detection of the face is the same. The idea is to store the known
     encodings and the known names in two lists. These two lists will
     contain the face encodings and corresponding names for each person
     in the dataset.
   • Step 2 The system iterates through all the people the user wants to
     train on and detects their faces. This depends on the structure
     illustrated in figure 9; the process iterates N times if N people
     are in the dataset. In each iteration, the system extracts the name
     of the person from the image path. An important step is converting
     the images to RGB, because DLIB expects it, so the color channels
     need to be swapped before we proceed.
   • Step 3 In this step, for each iteration we use the Python library
     face_recognition, whose face_locations method takes the RGB image
     and the detection method to use, which, as mentioned earlier, is
     HOG.
   • Step 4 In this last step, we use the face_encodings function in the
     library to convert the face into a numerical encoding, and append
     the name and the encoding to the two lists (known encodings and
     known names).
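
The four steps above can be sketched in Python as follows. This is a
hedged illustration rather than the thesis code: the helper names are
hypothetical, the dataset layout <dataset>/<person name>/<image file> is
an assumption standing in for figure 9, and loading images with OpenCV
(which yields BGR and therefore needs the swap to RGB) is likewise
assumed. The library calls are imported lazily inside
face_recognition_backend so that the loop itself can be exercised without
the libraries installed.

```python
import os


def name_from_path(image_path):
    # Assumed layout: <dataset>/<person name>/<image file>
    return os.path.basename(os.path.dirname(image_path))


def build_known_faces(image_paths, load_rgb, locate_faces, encode_faces):
    """Steps 1-4: collect one encoding and one name per detected face."""
    known_encodings = []  # Step 1: two parallel lists
    known_names = []
    for image_path in image_paths:           # Step 2: iterate the dataset
        name = name_from_path(image_path)
        rgb = load_rgb(image_path)           # image in RGB channel order
        boxes = locate_faces(rgb)            # Step 3: detect the faces
        for encoding in encode_faces(rgb, boxes):  # Step 4: encode, store
            known_encodings.append(encoding)
            known_names.append(name)
    return known_encodings, known_names


def face_recognition_backend(detection_method="hog"):
    """Return the three callables, backed by OpenCV and face_recognition."""
    import cv2
    import face_recognition

    def load_rgb(path):
        bgr = cv2.imread(path)  # OpenCV loads images as BGR
        return cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)

    def locate_faces(rgb):
        return face_recognition.face_locations(rgb, model=detection_method)

    def encode_faces(rgb, boxes):
        return face_recognition.face_encodings(rgb, boxes)

    return load_rgb, locate_faces, encode_faces
```

With both libraries installed, the lists would be obtained with
build_known_faces(paths, *face_recognition_backend("hog")), where paths is
the list of image files in the dataset.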
