From Theory to Practice: Private Circuit and Its Ambush—Debapriya ...
Transcript of From Theory to Practice: Private Circuit and Its Ambush—Debapriya ...
Indian Institute of Technology Kharagpur
Telecom ParisTech
From Theory to Practice: Private Circuit and Its Ambush
Debapriya Basu Roy, Shivam Bhasin, Sylvain Guilley, Jean-LucDanger and Debdeep Mukhopadhyay
20/01/2015 Debapriya Basu Roy, Weekly Presentation 1/20
Introduction
Side Channel: Information leakage from the implementation
Probing Attack
Power attack and EM radiation attack
t-Private Circuit: Countermeasure design with sound theoretical proof
Overhead: O(nt2), n is the number of gates in the circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20
Introduction
Side Channel: Information leakage from the implementation
Probing Attack
Power attack and EM radiation attack
t-Private Circuit: Countermeasure design with sound theoretical proof
Overhead: O(nt2), n is the number of gates in the circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20
Introduction
Side Channel: Information leakage from the implementation
Probing Attack
Power attack and EM radiation attack
t-Private Circuit: Countermeasure design with sound theoretical proof
Overhead: O(nt2), n is the number of gates in the circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20
Introduction
Side Channel: Information leakage from the implementation
Probing Attack
Power attack and EM radiation attack
t-Private Circuit: Countermeasure design with sound theoretical proof
Overhead: O(nt2), n is the number of gates in the circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20
Introduction
Side Channel: Information leakage from the implementation
Probing Attack
Power attack and EM radiation attack
t-Private Circuit: Countermeasure design with sound theoretical proof
Overhead: O(nt2), n is the number of gates in the circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20
Related Works
Masking: by product of private circuit to protect against first orderdifferential attacks.
Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.
Modifying private circuit for efficient FPGA implementation.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20
Related Works
Masking: by product of private circuit to protect against first orderdifferential attacks.
Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.
Designing block ciphers with reduced number of AND operations, forexample: PICARO.
Modifying private circuit for efficient FPGA implementation.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20
Related Works
Masking: by product of private circuit to protect against first orderdifferential attacks.
Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.
Modifying private circuit for efficient FPGA implementation.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20
Related Works
Masking: by product of private circuit to protect against first orderdifferential attacks.
Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.
Modifying private circuit for efficient FPGA implementation.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20
Motivation
Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.
Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.
However, no practical evaluation of private circuit is present in theliterature
20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20
Motivation
Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.
Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.
However, no practical evaluation of private circuit is present in theliterature
20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20
Motivation
Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.
Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.
However, no practical evaluation of private circuit is present in theliterature
20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20
Contribution
We identify the practical scenarios in which private circuit may fail toprovide us the desired security.
We actually try to identify the lazy engineering practices which canrattle the security of private circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 5/20
Contribution
We identify the practical scenarios in which private circuit may fail toprovide us the desired security.
We actually try to identify the lazy engineering practices which canrattle the security of private circuit.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 5/20
Contribution
We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.
The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.
Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20
Contribution
We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.
The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.
Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20
Contribution
We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.
The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.
Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20
Contribution
Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,
2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.
Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20
Contribution
Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,
2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.
Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20
Contribution
Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,
2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.
Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20
Preliminaries
SIMON
In 2013, NSA had introduced two ultra-lightweight block cipher SIMONand SPECK with a Feistel construction. Out of the two block ciphers,SIMON is more suited for hardware implementations. SIMON can encrypta block of 2k bits, with a key of m · k bits.
TVLA
TVLA consists in operating the device under test with a fixed and chosenkey. Then, a T-test is applied on both sets of measurements. Similardifference testing can be performed on intermediate values of the blockcipher and also on each bit of that intermediate value.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 8/20
Preliminaries
SIMON
In 2013, NSA had introduced two ultra-lightweight block cipher SIMONand SPECK with a Feistel construction. Out of the two block ciphers,SIMON is more suited for hardware implementations. SIMON can encrypta block of 2k bits, with a key of m · k bits.
TVLA
TVLA consists in operating the device under test with a fixed and chosenkey. Then, a T-test is applied on both sets of measurements. Similardifference testing can be performed on intermediate values of the blockcipher and also on each bit of that intermediate value.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 8/20
t-Private Circuit
Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)
a2t+1 = a⊕2t⊕i=1
ai .
NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).
Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20
t-Private Circuit
Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)
a2t+1 = a⊕2t⊕i=1
ai .
NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).
Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20
t-Private Circuit
Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)
a2t+1 = a⊕2t⊕i=1
ai .
NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).
Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .
20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20
t-Private Circuit
AND gate: Inputs a = (a1, a2, ..., a2t+1) and b = (b1, b2, ..., b2t+1),output c = (c1, c2, ..., c2t+1), which is calculated by following steps:
1 Generate random bits ri,j , where i 6= j and 1 ≤ i ≤ j ≤ 2t + 1.2 Compute rj,i = (ri,j ⊕ aibj)⊕ ajbi , where i 6= j and 1 ≤ i ≤ j ≤ 2t + 1.3 Compute ci = aibi ⊕
⊕j 6=i ri,j , where 1 ≤ i ≤ 2t and 1 ≤ j ≤ 2t.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 10/20
AND Gate Example
Inputs of the AND gate are two vectors a = (a1, a2, a3) andb = (b1, b2, b3), Output c = (c1, c2, c3) is calculated as follows:
c1 = a1b1 ⊕ r1,2 ⊕ r1,3 (1)
c2 = a2b2 ⊕ (r1,2 ⊕ a1b2)⊕ a2b1 ⊕ r2,3 (2)
c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (3)
20/01/2015 Debapriya Basu Roy, Weekly Presentation 11/20
CAD Optimization
Figure: t = 1 private circuit for AND third coordinate on 4-input LUTs
4 input LUT
4 input LUT
4 input LUT
unconnected
a3
b3
a2
a1
r1,3
a3b3 ⊕ a3b1 ⊕ a3b2 = a3(b) => Leakage
r2,3
a1b3 ⊕ a2b3 ⊕ r1,3
c3
b3
b2
b1
{p(b = 0|x = 0) = 2/3,
p(b = 1|x = 0) = 1/3,and
{p(b = 0|x = 1) = 0,
p(b = 1|x = 1) = 1.(4)
20/01/2015 Debapriya Basu Roy, Weekly Presentation 12/20
CAD Optimization
Figure: t = 1 private circuit for AND third coordinate on 4-input LUTs
4 input LUT
4 input LUT
4 input LUT
unconnected
a3
b3
a2
a1
r1,3
a3b3 ⊕ a3b1 ⊕ a3b2 = a3(b) => Leakage
r2,3
a1b3 ⊕ a2b3 ⊕ r1,3
c3
b3
b2
b1
{p(b = 0|x = 0) = 2/3,
p(b = 1|x = 0) = 1/3,and
{p(b = 0|x = 1) = 0,
p(b = 1|x = 1) = 1.(4)
20/01/2015 Debapriya Basu Roy, Weekly Presentation 12/20
Delay in Random Variables
There are two ways in which random variables can be provided to theprivate circuit: as external input or from a Random Number generator(RNG). Generally, random numbers are provided to the circuit froman RNG.
c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (5)
Delay in the arrival of random bits r1,3, r2,3, a1 and a2 lead toinformation leakage.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 13/20
Delay in Random Variables
There are two ways in which random variables can be provided to theprivate circuit: as external input or from a Random Number generator(RNG). Generally, random numbers are provided to the circuit froman RNG.
c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (5)
Delay in the arrival of random bits r1,3, r2,3, a1 and a2 lead toinformation leakage.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 13/20
Experimental Setup
A parallel implementation SIMON32/64 crypto-core, running at clockfrequency of 24-MHz, along with a simple UART interface is used totest our design on the Xilinx Virtex XC5-VLX30 FPGA of theSASEBO-GII platform.
For t = 1, total number of random bits required by SIMON is 272,whereas for t = 2 and t = 3, number of required random bits become608 and 1008. Random numbers are generated by a maximal lengthLFSR.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 14/20
Experimental Setup
A parallel implementation SIMON32/64 crypto-core, running at clockfrequency of 24-MHz, along with a simple UART interface is used totest our design on the Xilinx Virtex XC5-VLX30 FPGA of theSASEBO-GII platform.
For t = 1, total number of random bits required by SIMON is 272,whereas for t = 2 and t = 3, number of required random bits become608 and 1008. Random numbers are generated by a maximal lengthLFSR.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 14/20
Result: Optimized Simon
(a) TVLA Plot
0 100 200 300 400 500 600 700 800 900 1000−20
−15
−10
−5
0
5
10
15
20
Sample Points
TV
LA
Valu
e
TVLA Value of Optimized Simon
Safe Value of TVLA
(b) Average KeyRanking
0 200 400 600 800 10001
2
3
4
5
6
7
Number of Traces/1000
Av
era
ge
Ke
y R
an
kin
g
(c) Correlation Value
0 10 20 30 40 50 60−0.015
−0.01
−0.005
0
0.005
0.01
0.015
Sample Points
Co
rre
lati
on
Va
lue
Wrong Key Guess
Correct Key Guess
20/01/2015 Debapriya Basu Roy, Weekly Presentation 15/20
Result: 2 input LUT based Simon
(a) TVLA Plot
0 200 400 600 800 1000−15
−10
−5
0
5
10
15
Sample Points
TV
LA
Valu
e
TVLA Value of 2 i/p LUT Simon
Safe Value of TVLA
(b) Average KeyRanking
0 200 400 600 800 10001
2
3
4
5
6
7
8
Number of Traces/1000
Av
era
ge
Ke
y R
an
kin
g
(c) Correlation Value
0 5 10 15 20 25 30 35 40 45 50−0.01
−0.008
−0.006
−0.004
−0.002
0
0.002
0.004
0.006
0.008
0.01
Sample Points
Co
rre
lati
on
Va
lue
Wrong Key Guess
Correct Key Guess
20/01/2015 Debapriya Basu Roy, Weekly Presentation 16/20
Synchronized 2-input LUT based SIMON
(a) TVLA Plot
0 200 400 600 800 1000 1200 1400 1600 1800 2000−15
−10
−5
0
5
10
15
Sample Points
TV
LA
Valu
e
TVLA Value of Synchronized Simon
Safe Value of TVLA
First Round Output
(b) Avg. KeyRank P)
0 200 400 600 800 10002
3
4
5
6
Number of Traces/1000
Avera
ge K
ey R
an
kin
g
(c) CorrelationValue P
0 5 10 15 20 25 30 35 40 45 50−0.015
−0.01
−0.005
0
0.005
0.01
0.015
Sample Points
Co
rrela
tio
n V
alu
e
Wrong Key Guess
Correct Key Guess
(d) Avg. KeyRank 1
0 200 400 600 800 10001
2
3
4
5
6
7
Number of Traces/1000
Avera
ge K
ey R
an
kin
g
(e) CorrelationValue 1
0 10 20 30 40 50 60−0.015
−0.01
−0.005
0
0.005
0.01
0.015
Sample Points
Co
rrela
tio
n V
alu
e
Wrong Key Guess
Correct Key Guess
Figure: Side Channel Analysis of Synchronized 2 input LUT SIMON
20/01/2015 Debapriya Basu Roy, Weekly Presentation 17/20
Attack Summary
Table: Summary of Side Channel Analysis
Design TVLA Avg. Key RemarksName Test Ranking
Optimized Fails, significant Key ranking is low, NotSIMON information leakage successful attack secure
2 input LUT Fails, but less Key ranking is high, Secure againstbased SIMON information leakage attack is not CPA, could be
compared to optimized successful broken bySIMON better model
Synchronized Passes: no leakage Key ranking is high, Secure2 input LUT at first round. Initial attack is notbased Simon peaks are caused by successful
plain-text loading
20/01/2015 Debapriya Basu Roy, Weekly Presentation 18/20
Resource Comparison
Name LUTs Registers Slices Freq. Clock(MHz) Cycles
Optimized 761 805 595 147 32SIMON (1×) (1×) (1×) (1×) (1×)
2 i/p LUT 1305 805 1241 88 32based SIMON (1.71×) (1×) (2.08×) (0.59×) (1×)Synchronized
2 i/p LUT 1309 2920 4090 104 288based SIMON (1.71×) (3.62×) (6.87×) (0.70×) (9×)
20/01/2015 Debapriya Basu Roy, Weekly Presentation 19/20
Conclusion
We analyzed private circuits at an implementation level on a SIMONcrypto-processor. Our results show that it is very easy for a CAD toolto override the basic requirements of private circuits.
Practical evaluations indicate that with proper constraints the leakagecan be reduced. Moreover, by synchronizing each gate, we removeglitches and delay and approach much closer to theoretical evaluationof private circuits, but at a huge overhead.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 20/20
Conclusion
We analyzed private circuits at an implementation level on a SIMONcrypto-processor. Our results show that it is very easy for a CAD toolto override the basic requirements of private circuits.
Practical evaluations indicate that with proper constraints the leakagecan be reduced. Moreover, by synchronizing each gate, we removeglitches and delay and approach much closer to theoretical evaluationof private circuits, but at a huge overhead.
20/01/2015 Debapriya Basu Roy, Weekly Presentation 20/20