SIDE-CHANNEL ANALYSIS OF THE XILINX ZYNQ ULTRASCALE+ ENCRYPTION ENGINE

Benjamin Hettwer (1), Sebastien Leger (1), Daniel Fennes (2), Stefan Gehrer (3), and Tim Güneysu (2)
(1) Robert Bosch GmbH, Corporate Sector Research, Stuttgart, Germany
(2) Horst Görtz Institute for IT-Security, Ruhr University Bochum, Germany
(3) Robert Bosch LLC, Corporate Research, Pittsburgh, USA

CHES 2021
Motivation
FPGA Security
 Mostly based on Static Random-Access Memory (SRAM)
  Configuration file (i.e. bitstream) must be loaded at each power-up

 Bitstream has to be protected against duplication, manipulation and reverse engineering

 Bitstream encryption (and authentication) supported by many devices

 [Figure: the EDA software encrypts (Enc) the bitstream, which is stored in NVM; the FPGA decrypts (Dec) it into its configuration memory (SRAM)]

2 Hettwer et al. | Side-Channel Analysis of the Xilinx Zynq UltraScale+ Encryption Engine | CHES 2021
 © Robert Bosch GmbH 2021. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.
Motivation
SCA on Bitstream Encryption
 Examples of successful extraction of bitstream decryption key
  Moradi et al., 2012 [1] => Xilinx Virtex-4, Virtex-5
  Moradi et al., 2016 [2] => Xilinx Spartan-6, Kintex-7, Artix-7
  Moradi et al., 2013 [3] => Altera Stratix II
  Swierczynski et al., 2015 [4] => Altera Stratix II, Stratix III

 General problem: No dedicated SCA countermeasures!

Motivation
Xilinx ZYNQ UltraScale+
 On-chip encryption engine
  AES-256 in GCM mode (full hardware)
  Used for bitstream encryption and authentication

 Protocol-based countermeasure: key rolling
  Limits data collection for the adversary
  Xilinx gives no recommendation about a suitable block size
 ‒ i.e. how many AES blocks are encrypted under the same key

 Goal: Find appropriate value for key rolling parameter
 Image Source:
 Xilinx Zynq UltraScale+ Device, Technical Reference Manual, v1.9

ZYNQ UltraScale+ Encryption Engine
Assumptions

 Hardware root of trust authentication enabled (RSA)
  Only decryption of authenticated items (bitstream, software) can be measured
  No chosen ciphertext

 Attacker can record one specific bitstream decryption multiple times
  Averaging of traces => increases the signal-to-noise ratio (SNR)

 No masking countermeasure in AES core
  Focused on 1st order leakage

 Access to open copy of device => Profiling attacks possible
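The effect of trace averaging on the SNR can be illustrated with a small simulation. This is a minimal sketch, not the measurement code behind the slides: the sine-shaped signal, the Gaussian noise level and the trace length are assumptions chosen for illustration; only the averaging factor of 250 is taken from the setup slide.

```python
import numpy as np

def snr(trace: np.ndarray, signal: np.ndarray) -> float:
    """Ratio of signal variance to the variance of the residual noise."""
    noise = trace - signal
    return float(np.var(signal) / np.var(noise))

rng = np.random.default_rng(0)
n_samples = 2000      # assumed trace length for this toy example
n_repeats = 250       # averaging factor used in the slides
noise_std = 2.0       # assumed noise level

# Hypothetical deterministic waveform of one fixed bitstream decryption.
signal = np.sin(np.linspace(0.0, 20.0 * np.pi, n_samples))

# The same decryption recorded n_repeats times with independent noise.
raw = signal + rng.normal(0.0, noise_std, size=(n_repeats, n_samples))
averaged = raw.mean(axis=0)

snr_single = snr(raw[0], signal)
snr_averaged = snr(averaged, signal)
gain = snr_averaged / snr_single
print(f"SNR single trace: {snr_single:.4f}")
print(f"SNR after averaging: {snr_averaged:.4f}")
print(f"SNR gain: {gain:.1f}")
```

For independent noise the gain should come out close to the averaging factor, which is why recording the same decryption repeatedly helps the attacker.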
ZYNQ UltraScale+ Encryption Engine
Setup
 Target: ZYNQ UltraScale+ evaluation board ZCU102 (16 nm technology)
  Flip-chip packaging => removed the metal cap with a small drill

 Leakage vector: EM signal induced by on-chip decoupling capacitors
  Placed the EM probe directly on the decoupling capacitor related to the AES power rail (VCCPSINTLP)

ZYNQ UltraScale+ Encryption Engine
Setup (cont.)
 Measurement setup
  Target board clock frequency: 48 MHz
  Langer EMV probe ICR HV 500-75 placed on 4-axis positioning system
  Picoscope 6404 with sampling rate of 625 MS/s and 500 MHz bandwidth
  Averaging factor: 250
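The numbers above fix how finely each clock cycle is resolved. The arithmetic below is a quick sketch; the 14-cycle figure is the per-encryption latency reported on the AES Hardware Architecture slide, and treating it as one contiguous window is a simplification.

```python
SAMPLING_RATE_HZ = 625e6   # oscilloscope sampling rate from the slide
CLOCK_HZ = 48e6            # target board clock frequency from the slide
CYCLES_PER_ENCRYPTION = 14 # latency reported for the AES core

samples_per_cycle = SAMPLING_RATE_HZ / CLOCK_HZ
clock_period_ns = 1e9 / CLOCK_HZ
samples_per_encryption = CYCLES_PER_ENCRYPTION * samples_per_cycle

print(f"{samples_per_cycle:.2f} samples per clock cycle")
print(f"{clock_period_ns:.2f} ns clock period")
print(f"{samples_per_encryption:.1f} samples across one 14-cycle encryption")
```

Roughly 13 samples per cycle means each register update is covered by several sample points.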

ZYNQ UltraScale+ Encryption Engine
AES Hardware Architecture
 Black-box reverse engineering of AES-256 architecture
  Correlation with known key

 Steps
  Timing behavior of the core (14 clock cycles per encryption) => round-based implementation
  Tried different leakage models with varying numbers of pipeline stages and registers
  Assumed architecture:

ZYNQ UltraScale+ Encryption Engine
AES Hardware Architecture (cont.)
 Power leakage model:
  $L_{r,i} = \mathrm{HW}\left[ R_r(IV \,\|\, CTR_i + 1) \oplus R_r(IV \,\|\, CTR_{i+1} + 1) \right]$
 ‒ $\mathrm{HW}$: Hamming weight
 ‒ $R_r$: State of register in AES round $r$
 ‒ $IV$: Initialization vector
 ‒ $CTR_i$: Counter value after $i$ iterations

 Hamming Distance (HD) between the output of encryption $i$ in round $r$ and the output of encryption $i + 1$ in the same round
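The HD model itself is a one-liner once the two register states are known. A minimal sketch, assuming 128-bit states passed as `bytes`; the function names are illustrative and not taken from the slides.

```python
def hamming_weight(value: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")

def hd_leakage(state_i: bytes, state_next: bytes) -> int:
    """Hamming distance between the register contents of two consecutive encryptions."""
    if len(state_i) != len(state_next):
        raise ValueError("states must have the same length")
    a = int.from_bytes(state_i, "big")
    b = int.from_bytes(state_next, "big")
    return hamming_weight(a ^ b)

# Two 16-byte states that differ only in the low nibble of the last byte.
state_a = bytes(15) + b"\x00"
state_b = bytes(15) + b"\x0f"
example_hd = hd_leakage(state_a, state_b)
print(example_hd)
```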

Attack Procedure
CTR Mode and Hardware Specific Attributes

 Red: changes in the AES state when only the least significant byte (LSB) of the counter (CTR) changes
  Only one of the 16 state bytes changes between consecutive encryptions and leaks information
  HD leakage in round 1 is related to only one key byte (key byte 15)
  Several rounds have to be considered to extract the full 256-bit AES key
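How a single changing counter byte spreads through the AES state can be checked with a small diffusion experiment. This is a sketch under stated assumptions: the round keys are random stand-ins (no real key schedule), the IV is arbitrary, and the difference is counted after SubBytes in each round, which is an assumed register position rather than the reverse-engineered one.

```python
import random

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox() -> list:
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    def rotl8(x, n):
        return ((x << n) | (x >> (8 - n))) & 0xFF
    sbox = []
    for value in range(256):
        inv = 0
        if value:
            inv = 1
            for _ in range(254):  # value^254 equals the inverse of value
                inv = gf_mul(inv, value)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = build_sbox()

def sub_bytes(s):
    return [SBOX[x] for x in s]

def shift_rows(s):
    # Column-major state: byte index = row + 4 * column.
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]

def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [
            gf_mul(a[0], 2) ^ gf_mul(a[1], 3) ^ a[2] ^ a[3],
            a[0] ^ gf_mul(a[1], 2) ^ gf_mul(a[2], 3) ^ a[3],
            a[0] ^ a[1] ^ gf_mul(a[2], 2) ^ gf_mul(a[3], 3),
            gf_mul(a[0], 3) ^ a[1] ^ a[2] ^ gf_mul(a[3], 2),
        ]
    return out

def add_key(s, k):
    return [x ^ y for x, y in zip(s, k)]

def active_bytes_per_round(block_a, block_b, round_keys):
    """Count the state bytes that differ after SubBytes in each round."""
    a, b = add_key(block_a, round_keys[0]), add_key(block_b, round_keys[0])
    counts = []
    for rk in round_keys[1:]:
        a, b = sub_bytes(a), sub_bytes(b)
        counts.append(sum(x != y for x, y in zip(a, b)))
        a = add_key(mix_columns(shift_rows(a)), rk)
        b = add_key(mix_columns(shift_rows(b)), rk)
    return counts

rng = random.Random(1)
round_keys = [[rng.randrange(256) for _ in range(16)] for _ in range(4)]  # stand-in keys
iv = [rng.randrange(256) for _ in range(12)]                              # arbitrary IV
block_a = iv + [0, 0, 0, 2]  # IV || counter
block_b = iv + [0, 0, 0, 3]  # same IV, counter incremented by one
counts = active_bytes_per_round(block_a, block_b, round_keys)
print(counts)
```

The counts of differing bytes grow from one to four to sixteen over the first three rounds, which matches the one key byte, four constants and 16 constants targeted in rounds 1 to 3.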

Attack Procedure
… over first five AES rounds
 Round 1: key byte 15 with an 8-bit hypothesis

 Round 2: four constants (0 to 3) related to subkeys, each with an 8-bit hypothesis
  Constants are only valid for 256 consecutive encryptions => upper bound

 Round 3: 16 constants (0 to 15), recovered in four attacks with a 32-bit hypothesis each
  Allows the input vector of round 4 to be calculated for 256 consecutive encryptions

 Round 4: 16 subkey bytes, recovered in four attacks with a 32-bit hypothesis each

 Round 5: 16 subkey bytes, recovered in four attacks with a 32-bit hypothesis each
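The round 1 step can be sketched as a classical CPA with an 8-bit hypothesis on simulated data. Everything here is an idealized assumption: the leakage is the HD of the S-box output of the changing byte plus Gaussian noise, one noisy point per counter transition is available, and the key value 0xA5 is arbitrary. It is not the authors' attack code.

```python
import numpy as np

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox() -> list:
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    def rotl8(x, n):
        return ((x << n) | (x >> (8 - n))) & 0xFF
    sbox = []
    for value in range(256):
        inv = 0
        if value:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, value)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = np.array(build_sbox(), dtype=np.uint8)
HW = np.array([bin(v).count("1") for v in range(256)])

def hd_hypothesis(key_guess: int, ctr: np.ndarray) -> np.ndarray:
    """Predicted HD between consecutive encryptions for one key-byte guess."""
    out = SBOX[ctr ^ np.uint8(key_guess)]
    return HW[out[:-1] ^ out[1:]]

rng = np.random.default_rng(7)
true_key = 0xA5                          # arbitrary key byte for the simulation
ctr = np.arange(256, dtype=np.uint8)     # low counter byte of 256 consecutive encryptions

# Simulated, already compressed leakage: one noisy point per counter transition.
leakage = hd_hypothesis(true_key, ctr) + rng.normal(0.0, 0.5, size=255)

# Correlate every 8-bit hypothesis with the measured leakage.
scores = np.array([np.corrcoef(hd_hypothesis(k, ctr), leakage)[0, 1] for k in range(256)])
recovered = int(np.argmax(scores))
print(hex(recovered))
```

Rounds 3 to 5 follow the same pattern but enumerate 32-bit hypotheses, which is what makes them so much more expensive.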

Experiments
Data Set
 200,000 traces (after averaging)
  75% profiling, 24% validation (to monitor training progress), 1% attack traces
  Random keys and IVs
  Each trace includes EM emanations for 256 successive AES-GCM encryptions (0 to 255)
  15,625 samples per trace
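The split sizes and the time span of one trace follow directly from these figures. A short sketch; it assumes the 15,625 samples were taken at the 625 MS/s and 48 MHz stated on the setup slide, without resampling.

```python
TOTAL_TRACES = 200_000
SPLIT = {"profiling": 0.75, "validation": 0.24, "attack": 0.01}
counts = {name: round(TOTAL_TRACES * share) for name, share in SPLIT.items()}

SAMPLES_PER_TRACE = 15_625
SAMPLING_RATE_HZ = 625e6   # assumed unchanged from the measurement setup
CLOCK_HZ = 48e6
ENCRYPTIONS_PER_TRACE = 256

trace_duration_us = SAMPLES_PER_TRACE / SAMPLING_RATE_HZ * 1e6
clock_cycles = trace_duration_us * 1e-6 * CLOCK_HZ
cycles_per_encryption = clock_cycles / ENCRYPTIONS_PER_TRACE

print(counts)
print(f"{trace_duration_us:.1f} us per trace, about {clock_cycles:.0f} clock cycles")
print(f"about {cycles_per_encryption:.2f} clock cycles per encryption")
```

Under these assumptions a trace spans far fewer than 256 x 14 clock cycles, which is consistent with consecutive encryptions overlapping in a pipeline.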

Experiments
Baseline Attacks
 Correlation Power Analysis (CPA) with Linear Discriminant Analysis (LDA) pre-processing
  LDA projects traces into a lower-dimensional space that maximizes inter-class variance relative to intra-class variance

 Calculated LDA coefficients for each HD leakage per round attack using 50 sample points
  Location of the samples was determined by correlation with a known key
  Attack traces are compressed into a single data point per leakage

 Challenge
  In theory 256 – 1 = 255 leakages
  First leakage occurs at encryption index 3
  AES pipeline registers are filled with values from r + 3 every 4th clock cycle
  Only 190 encryptions generate exploitable leakage
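The LDA compression step described above (50 sample points reduced to a single data point per leakage) can be sketched with plain NumPy. The simulated traces, the leakage pattern and the noise level are assumptions; the implementation is a textbook Fisher LDA and may differ from the one used for the slides.

```python
import numpy as np

def lda_direction(x: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Leading Fisher LDA direction for labelled traces of shape (n, d)."""
    d = x.shape[1]
    mean_all = x.mean(axis=0)
    s_w = np.zeros((d, d))  # within-class (intra-class) scatter
    s_b = np.zeros((d, d))  # between-class (inter-class) scatter
    for c in np.unique(labels):
        xc = x[labels == c]
        mu = xc.mean(axis=0)
        s_w += (xc - mu).T @ (xc - mu)
        diff = (mu - mean_all)[:, None]
        s_b += len(xc) * (diff @ diff.T)
    # Whiten with the Cholesky factor of S_w so a symmetric eigenproblem remains.
    chol_inv = np.linalg.inv(np.linalg.cholesky(s_w))
    _, eigvecs = np.linalg.eigh(chol_inv @ s_b @ chol_inv.T)
    w = chol_inv.T @ eigvecs[:, -1]  # eigenvector of the largest eigenvalue
    return w / np.linalg.norm(w)

def fisher_ratio(values: np.ndarray, labels: np.ndarray) -> float:
    """Between-class over within-class scatter of a 1-D feature."""
    mean_all = values.mean()
    between = within = 0.0
    for c in np.unique(labels):
        vc = values[labels == c]
        between += len(vc) * (vc.mean() - mean_all) ** 2
        within += np.sum((vc - vc.mean()) ** 2)
    return float(between / within)

rng = np.random.default_rng(1)
n_traces, n_points = 3000, 50              # 50 sample points as in the slides
labels = rng.binomial(8, 0.5, size=n_traces)  # simulated HD classes 0..8
pattern = np.zeros(n_points)
pattern[[10, 11, 30]] = [0.3, 0.2, 0.25]   # assumed leaking sample points
traces = rng.normal(0.0, 1.0, size=(n_traces, n_points)) + labels[:, None] * pattern

w = lda_direction(traces, labels)
compressed = traces @ w                     # one data point per trace
ratio_lda = fisher_ratio(compressed, labels)
best_single = max(fisher_ratio(traces[:, j], labels) for j in range(n_points))
print(f"best single sample point: {best_single:.4f}, LDA projection: {ratio_lda:.4f}")
```

The projection combines several weakly leaking sample points, so its class separation is at least as good as that of the best individual point.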

Experiments
Baseline Attacks (cont.)
 Two power leakage models
  Regular HD
  Linear Regression (LR) – aka stochastic approach by Schindler et al. [5]

 Attack               | Round 1 | Round 2 | Round 3 | Round 4 | Round 5
 CPA-LDA (HD model)   | 170*    | -       | -       | -       | -
 CPA-LDA (LR model)   | 130*    | -       | -       | -       | -
 *Number of traces to reach a key rank of one
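The difference between the two leakage models can be sketched as follows: the HD model weights every bit flip equally, while the LR (stochastic) model fits one coefficient per bit from profiling data. The weights, offset and noise level below are invented for the simulation and are not measured values.

```python
import numpy as np

def bits_of(values: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Bit decomposition, least significant bit first, shape (n, n_bits)."""
    return ((values[:, None] >> np.arange(n_bits)) & 1).astype(float)

rng = np.random.default_rng(3)
n_profiling = 5000

# Hypothetical per-bit leakage weights; the HD model would assume all ones.
true_weights = np.array([1.3, 0.8, 1.0, 1.1, 0.6, 0.9, 1.2, 1.0])
true_offset = 0.5

# XOR of two consecutive register bytes: a set bit marks a bit flip.
transitions = rng.integers(0, 256, size=n_profiling)
basis = np.column_stack([np.ones(n_profiling), bits_of(transitions)])
leakage = basis @ np.concatenate([[true_offset], true_weights])
leakage = leakage + rng.normal(0.0, 0.1, size=n_profiling)

# Least-squares fit of offset and bit weights on the profiling data.
coeffs, *_ = np.linalg.lstsq(basis, leakage, rcond=None)
fitted_offset, fitted_weights = coeffs[0], coeffs[1:]
print(np.round(fitted_weights, 2))
```

In an attack, the fitted weights replace the plain bit count when computing the hypothetical leakage for each key guess.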

Experiments
Deep Learning Attacks - Concept
 Correlation Optimization (CO) scheme by Robyns et al. [6]

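 Idea: Train a Deep Neural Network (DNN) to produce an encoding of the input data (i.e. the traces) that maximizes the Pearson correlation with a hypothetical power leakage
  Extension of classical CPA with an additional profiling/learning phase
  Specialized loss function: $\mathcal{L}(\hat{y}, y) = 1 - \rho(\hat{y}, y)$, with $\hat{y}$ the DNN encoding, $y$ the hypothetical leakage and $\rho$ the Pearson correlation
 [Figure: the DNN output is fed into a correlation (Corr.) block]
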
 Why CO?
  Only 190 encryptions available => profiled attack
  CO automatically encodes traces into single value => faster attack phase (good for 32-bit hypothesis)
  Imbalanced data set due to the binomial distribution produced by the HW function
 ‒ Problem for Gaussian templates and standard deep learning-based attacks with a cross-entropy loss function
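The correlation loss can be written down in a few lines. The sketch below uses NumPy for readability; a training loop in Keras/TensorFlow would need the same expression built from differentiable tensor operations. The function name and the handling of constant inputs are assumptions.

```python
import numpy as np

def correlation_loss(encoding, leakage) -> float:
    """One minus the Pearson correlation between DNN encoding and hypothetical leakage."""
    enc = np.asarray(encoding, dtype=float)
    leak = np.asarray(leakage, dtype=float)
    enc = enc - enc.mean()
    leak = leak - leak.mean()
    denom = np.sqrt(np.sum(enc ** 2) * np.sum(leak ** 2))
    if denom == 0.0:
        return 1.0  # assumption: a constant input is treated as uncorrelated
    return 1.0 - float(np.sum(enc * leak) / denom)

loss_perfect = correlation_loss([1, 2, 3, 4], [2, 4, 6, 8])
loss_inverted = correlation_loss([1, 2, 3, 4], [8, 6, 4, 2])
loss_constant = correlation_loss([1, 1, 1, 1], [1, 2, 3, 4])
print(loss_perfect, loss_inverted, loss_constant)
```

The loss is 0 for a perfectly correlated encoding and 2 for a perfectly anti-correlated one, so minimizing it pushes the encoding towards the leakage model regardless of scale or offset.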

Experiments
Deep Learning Attacks – Concept (cont.)
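 CO extensions
  Bitwise Correlation Loss (CO-BIT)
 ‒ Bit flips in registers are used as leakage labels (no HW)
 ‒ Sum of the correlations for each bit is used to calculate the total loss: $\mathcal{L}_{CO\text{-}BIT} = 1 - \sum_{b} \rho_b$, where $\rho_b$ is the correlation for bit $b$
 [Figure: DNN with one output per bit, each fed into its own correlation (Corr.) block]

  Weighted-Bit Correlation (CO-W-BIT)
 ‒ Adoption of the stochastic approach
 ‒ Weight coefficient for each bit => approximated by a neuron
 ‒ Creates an optimized encoding of the leakage that is used in the loss function: one minus the correlation between the weighted-bit leakage encoding and the DNN output
 [Figure: bit labels weighted and summed by a neuron, then correlated (Corr.) with the DNN output]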
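A literal reading of the CO-BIT idea (one network output per bit, correlations summed into the loss) can be sketched as below. The exact normalization on the slide is not recoverable, so the form "one minus the sum of per-bit correlations" is an assumption, as are the function names.

```python
import numpy as np

def pearson(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation of two 1-D arrays (both must be non-constant)."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))

def co_bit_loss(outputs: np.ndarray, bit_labels: np.ndarray) -> float:
    """Assumed CO-BIT form: 1 minus the sum of the per-bit correlations.

    outputs, bit_labels: arrays of shape (n_traces, n_bits).
    """
    n_bits = outputs.shape[1]
    total = sum(pearson(outputs[:, b], bit_labels[:, b]) for b in range(n_bits))
    return 1.0 - total

# Bit-flip labels for all 256 possible byte transitions (no Hamming weight involved).
bit_labels = ((np.arange(256)[:, None] >> np.arange(8)) & 1).astype(float)

# An idealized network whose outputs follow the labels up to scale and offset.
ideal_outputs = 2.0 * bit_labels + 1.0
loss_ideal = co_bit_loss(ideal_outputs, bit_labels)
print(loss_ideal)
```

With eight perfectly correlated outputs the sum of correlations is 8, so this form of the loss reaches its minimum of -7; dividing the sum by the number of bits would keep the loss in the range of the plain CO loss.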
 
Experiments
Deep Learning Attacks - Models

 [Figure: three model schemes: (1) one DNN per leakage (DNN 1 to DNN 190), (2) a single DNN with one output per leakage (190 outputs), (3) a DNN with three outputs]

 Individual model per leakage
  Shorter traces, but long training time
 One model output per leakage
  Faster training, more complex
 Triple output model
  Trade-off between training time and complexity
Experiments
Deep Learning Attacks – Models (cont.)
 Implemented all three schemes
  Model per leakage: CO, CO-BIT, CO-W-BIT
  Output per counter leakage (All-CTR): CO
  Triple output model (3-CTR): CO

 Neural Network Types
  Multi-Layer Perceptron (MLP)
 ‒ 2 hidden layers (10/15 neurons)
  Convolutional Neural Network (CNN)
 ‒ 3 blocks of batch_norm + conv + max_pooling

 Attacks implemented in Python with the Keras and TensorFlow frameworks
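The size of the MLP can be illustrated with a plain NumPy forward pass. This is only a structural sketch: the slides' models were built with Keras and TensorFlow, and the input width of 50, the ReLU activation, the single output and the reading of "10/15" as 10 neurons in the first and 15 in the second hidden layer are all assumptions.

```python
import numpy as np

def init_mlp(n_inputs: int, hidden=(10, 15), seed: int = 0):
    """Random weights for an MLP with two hidden layers and one output."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, *hidden, 1]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, traces: np.ndarray) -> np.ndarray:
    """Encode each trace into a single value (the input to the correlation loss)."""
    x = traces
    for weights, bias in params[:-1]:
        x = np.maximum(x @ weights + bias, 0.0)  # ReLU, assumed activation
    weights, bias = params[-1]
    return (x @ weights + bias)[:, 0]

params = init_mlp(n_inputs=50)                    # hypothetical input width
n_params = sum(w.size + b.size for w, b in params)
encodings = forward(params, np.zeros((4, 50)))
print(n_params, encodings.shape)
```

Under these assumptions the network has well under a thousand trainable parameters, so training one model per leakage is costly mainly because of the number of models rather than their size.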

Experiments
Results – Round 1
 Target: key byte 15

Experiments
Results – Round 2
 Target: four 8-bit constants (0 to 3)

Experiments
Results – Round 3
 Target: 16 constants (0 to 15), recovered in four attacks with a 32-bit hypothesis each
  Allows the input vector of round 4 to be calculated for 256 consecutive encryptions

 Implemented attack phase on GPU using Nvidia CUDA
  2^32 key guesses = 4,294,967,296
  Reduced recovery phase from weeks to approx. 6 h
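The size of the hypothesis space explains the need for a GPU. The arithmetic below is a sketch; the throughput figure assumes that the roughly six hours cover a single 2^32 search, which the slide does not state explicitly.

```python
CONSTANTS_PER_ATTACK = 4   # 16 constants split over four attacks
BITS_PER_CONSTANT = 8
ATTACKS_PER_ROUND = 4

guesses_8_bit = 2 ** BITS_PER_CONSTANT
guesses_32_bit = 2 ** (BITS_PER_CONSTANT * CONSTANTS_PER_ATTACK)
total_guesses_round3 = ATTACKS_PER_ROUND * guesses_32_bit
growth_factor = guesses_32_bit // guesses_8_bit

def guesses_per_second(n_guesses: int, hours: float) -> float:
    """Average hypothesis throughput needed to finish within the given time."""
    return n_guesses / (hours * 3600.0)

# Assumption: about 6 h for one 32-bit search.
throughput = guesses_per_second(guesses_32_bit, hours=6.0)

print(f"32-bit hypothesis space: {guesses_32_bit:,} guesses")
print(f"all four round 3 attacks: {total_guesses_round3:,} guesses")
print(f"growth over an 8-bit attack: x{growth_factor:,}")
print(f"required throughput: about {throughput:,.0f} guesses per second")
```

Each guess also has to be correlated against every available leakage, so the real workload per guess scales with the number of usable encryptions.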

 None of the attacks was able to extract the constants within 190 encryptions
  Key rank between 10,000 and 10,000,000
  Complexity too high and SNR too low for the 32-bit attack
  More than 1000 encryptions needed in our setup

Experiments
Results – Round 4/5
 Target: 16 subkey bytes with four attacks with 32-bit hypothesis

 Key recovery not successful
  Round 3 constants not available
  ~ 3000 encryptions needed

Recommendation for Key Rolling Parameter
Boot Time Effects

 Key rolling parameter impacts the size of the configuration image and the boot time
  A value below 5 increases boot time by more than 400%
  A value above 40 increases boot time by less than 10%

Recommendation for Key Rolling Parameter
Conclusion
 Complete extraction of the key has not been successful within 256 encryptions
  Why not just use a value of ~200?

 Parts of the key could be extracted with fewer than 50 encryptions
  Worst case scenario: Profiling attack on single device
  “Attacks always become better, never get worse” – Old NSA saying
  Lifetime of products can be 20 years or longer (e.g. in automotive)

 Recommendation: key rolling parameter of 20 to 30
  Security margin against future attacks
  Reasonable boot time overhead (15–25%)

THANK YOU!
References

 [1] Moradi, A., Kasper, M., Paar, C.: Black-box side-channel attacks highlight the importance of countermeasures. In: Dunkelman, O. (ed.) CT-RSA 2012. LNCS, vol. 7178, pp. 1–18. Springer, Heidelberg (2012)
 [2] Moradi, A., Schneider, T.: Improved Side-Channel Analysis Attacks on Xilinx Bitstream Encryption of 5, 6, and 7 Series. In: Standaert, F.X., Oswald, E. (eds.) Constructive Side-Channel Analysis and Secure Design. pp. 71–87. Springer International Publishing, Cham (2016)
 [3] Moradi, A., Oswald, D., Paar, C., Swierczynski, P.: Side-channel attacks on the bitstream encryption mechanism of Altera Stratix II: facilitating black-box analysis using software reverse-engineering. In: FPGA, pp. 91–100. ACM (2013)
 [4] Swierczynski, P., Moradi, A., Oswald, D., Paar, C.: Physical security evaluation of the bitstream encryption mechanism of Altera Stratix II and Stratix III FPGAs. TRETS 7(4), 34:1–34:23 (2015)
 [5] Schindler, W., Lemke, K., Paar, C.: A stochastic model for differential side channel cryptanalysis. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 30–46. Springer, Heidelberg (2005)
 [6] Robyns, P., Quax, P., Lamotte, W.: Improving CEMA using Correlation Optimization. IACR Transactions on Cryptographic Hardware and Embedded Systems 2019(1), 1–24 (Nov 2018). https://doi.org/10.13154/tches.v2019.i1.1-24, https://tches.iacr.org/index.php/TCHES/article/view/7332