Software implementations of ECC: security and e ciency · Software implementations of ECC: security...
Transcript of Software implementations of ECC: security and e ciency · Software implementations of ECC: security...
![Page 1: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/1.jpg)
Software implementations of ECC:security and efficiency
Tung Chou
Osaka University
17 November 2018
Summer School of the ECC workshop
![Page 2: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/2.jpg)
Scope
constructive software
pre-quantum
• : small exercises, homework
1
![Page 3: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/3.jpg)
Scope
constructive software
pre-quantum
• : small exercises, homework
1
![Page 4: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/4.jpg)
Diffie-Hellman key-exchange (agreement)
Parties agreed on: group G = 〈g〉
(e.g., G ⊂ Z∗p)
random a
random b
ga
gb
s = (gb)a s = (ga)b
• “Assumed” to be “hard”
• recovering c from gc
• recovering gab given g, ga, gb
• Security depends on G
2
![Page 5: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/5.jpg)
Diffie-Hellman key-exchange (agreement)
Parties agreed on: group G = 〈g〉
(e.g., G ⊂ Z∗p)
random a
random b
ga
gb
s = (gb)a s = (ga)b
• “Assumed” to be “hard”
• recovering c from gc
• recovering gab given g, ga, gb
• Security depends on G
2
![Page 6: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/6.jpg)
Diffie-Hellman key-exchange (agreement)
Parties agreed on: group G = 〈g〉
(e.g., G ⊂ Z∗p)
random a
random b
ga
gb
s = (gb)a s = (ga)b
• “Assumed” to be “hard”
• recovering c from gc
• recovering gab given g, ga, gb
• Security depends on G
2
![Page 7: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/7.jpg)
Diffie-Hellman key-exchange (agreement)
Parties agreed on: group G = 〈g〉
(e.g., G ⊂ Z∗p)
random a
random b
ga
gb
s = (gb)a s = (ga)b
• “Assumed” to be “hard”
• recovering c from gc
• recovering gab given g, ga, gb
• Security depends on G
2
![Page 8: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/8.jpg)
Elliptic curves
E : y2 = x3 +Ax+B,
A,B ∈ K ⊂ L
E(L) = {∞} ∪ {(x, y) ∈ L2 | y2 = x3 +Ax+B}
x
y
P
Q
P +Q
3
![Page 9: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/9.jpg)
Elliptic-curve Diffie-Hellman
Parties agreed on: curve E, 〈B〉 ⊆ E(K)
random a
random b
aB
bB
s = a(bB) s = b(aB)
“Assumed” to be “hard”
• recovering k from kP
• recovering abP given P, aP, bP
4
![Page 10: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/10.jpg)
Elliptic-curve Diffie-Hellman
Parties agreed on: curve E, 〈B〉 ⊆ E(K)
random a
random b
aB
bB
s = a(bB) s = b(aB)
“Assumed” to be “hard”
• recovering k from kP
• recovering abP given P, aP, bP
4
![Page 11: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/11.jpg)
Elliptic-curve Diffie-Hellman
Parties agreed on: curve E, 〈B〉 ⊆ E(K)
random a
random b
aB
bB
s = a(bB) s = b(aB)
“Assumed” to be “hard”
• recovering k from kP
• recovering abP given P, aP, bP
4
![Page 12: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/12.jpg)
Scalar multiplications
k · P
base point
(fixed? variable?)
scalar
(secret? public?)
5
![Page 13: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/13.jpg)
Scalar multiplications
k · P
base point
(fixed? variable?)
scalar
(secret? public?)
5
![Page 14: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/14.jpg)
safecurves.cr.yp.to/
• Different curves forms
• Large field sizes: typically > 2200
6
![Page 15: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/15.jpg)
safecurves.cr.yp.to/
• Different curves forms
• Large field sizes: typically > 2200
6
![Page 16: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/16.jpg)
ECC implementation pyramid
Scalar multiplications kP
Point adds/doubles
Field arithmetic in Fp = Zp
Big-integer arithmetic
7
![Page 19: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/19.jpg)
Naive scalar multiplication
• Input: scalar k, point P
• Output: kP
Q = P
For i from 1 to k-1 do
Q += P
• Way too slow... (recall the group order)
• Timing depends on k (secret)!
10
![Page 20: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/20.jpg)
Naive scalar multiplication
• Input: scalar k, point P
• Output: kP
Q = P
For i from 1 to k-1 do
Q += P
• Way too slow... (recall the group order)
• Timing depends on k (secret)!
10
![Page 21: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/21.jpg)
Naive scalar multiplication
• Input: scalar k, point P
• Output: kP
Q = P
For i from 1 to k-1 do
Q += P
• Way too slow... (recall the group order)
• Timing depends on k (secret)!
10
![Page 22: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/22.jpg)
Naive scalar multiplication
• Input: scalar k, point P
• Output: kP
Q = P
For i from 1 to k-1 do
Q += P
• Way too slow... (recall the group order)
• Timing depends on k (secret)!
10
![Page 23: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/23.jpg)
Theory versus reality
(picture from xkcd.com/538/ [CC BY-NC 2.5])
• How about the attack scenarios between the 2 cases?
11
![Page 24: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/24.jpg)
Theory versus reality
(picture from xkcd.com/538/ [CC BY-NC 2.5])
• How about the attack scenarios between the 2 cases?
11
![Page 25: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/25.jpg)
Side channels
secret
timing
power
Attacker
• Side channel information might be correlated to the secret(s)
• We need constant-time implementations• timing should be independent of the secret(s)
12
![Page 26: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/26.jpg)
Side channels
secret
timing
power
Attacker
• Side channel information might be correlated to the secret(s)
• We need constant-time implementations• timing should be independent of the secret(s)
12
![Page 27: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/27.jpg)
Side channels
secret
timing
power
Attacker
• Side channel information might be correlated to the secret(s)
• We need constant-time implementations• timing should be independent of the secret(s)
12
![Page 28: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/28.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1kl−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
if k[i] == 1
Q += P
• θ(`) instead of k − 1 point operations
• Similarly, we can do “square-and-multiply”
• From LSB to MSB?
← timing difference!
13
![Page 29: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/29.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1kl−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
if k[i] == 1
Q += P
• θ(`) instead of k − 1 point operations
• Similarly, we can do “square-and-multiply”
• From LSB to MSB?
← timing difference!
13
![Page 30: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/30.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1kl−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
if k[i] == 1
Q += P
• θ(`) instead of k − 1 point operations
• Similarly, we can do “square-and-multiply”
• From LSB to MSB?
← timing difference!
13
![Page 31: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/31.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1kl−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
if k[i] == 1
Q += P
• θ(`) instead of k − 1 point operations
• Similarly, we can do “square-and-multiply”
• From LSB to MSB?
← timing difference!
13
![Page 32: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/32.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1k`−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
R0 = 2Q
R1 = Q + P
if k[i] == 1
Q = R1
else
Q = R0
← branch prediction
(...and also the problem with instruction cache)
14
![Page 33: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/33.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1k`−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
R0 = 2Q
R1 = Q + P
if k[i] == 1
Q = R1
else
Q = R0
← branch prediction
(...and also the problem with instruction cache)
14
![Page 34: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/34.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1k`−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
R0 = 2Q
R1 = Q + P
if k[i] == 1
Q = R1
else
Q = R0
← branch prediction
(...and also the problem with instruction cache)
14
![Page 35: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/35.jpg)
Scalar multiplications with double-and-add
• Input: k = (k`−1k`−2...k0)2, point P
• Output: kP
Q = 0
For i from l-1 downto 0 do
R0 = 2Q
R1 = Q + P
if k[i] == 1
Q = R1
else
Q = R0
← branch prediction
(...and also the problem with instruction cache)
14
![Page 36: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/36.jpg)
Branch prediction
1
0
: predict taken
: predict NOT taken
1
0
taken
!taken
!taken
taken
1 1
0 0
!taken
taken
!taken
taken
take
n
!taken
taken
!taken
• Good for loop structures. Why?
15
![Page 37: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/37.jpg)
Branch prediction
1
0
: predict taken
: predict NOT taken
1
0
taken
!taken
!taken
taken
1 1
0 0
!taken
taken
!taken
takenta
ken
!taken
taken
!taken
• Good for loop structures. Why?
15
![Page 38: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/38.jpg)
Branch prediction
1
0
: predict taken
: predict NOT taken
1
0
taken
!taken
!taken
taken
1 1
0 0
!taken
taken
!taken
takenta
ken
!taken
taken
!taken
• Good for loop structures. Why?
15
![Page 39: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/39.jpg)
Branch prediction (cont.)
(picture from en.wikipedia.org/wiki/Branch_predictor [CC BY-SA 3.0])
16
![Page 40: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/40.jpg)
Scalar multiplications with double-and-add• Input: k = (k`−1k`−2...k0)2, point P• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
R = Q + P
Q = SEL(R, Q, k[i]) ← constant-time
SEL(R, Q, b):
mask_R = -b
mask_Q = ~mask_R
return (mask_R & R) | (mask_Q & Q)
• Still many dummy operations
17
![Page 41: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/41.jpg)
Scalar multiplications with double-and-add• Input: k = (k`−1k`−2...k0)2, point P• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
R = Q + P
Q = SEL(R, Q, k[i]) ← constant-time
SEL(R, Q, b):
mask_R = -b
mask_Q = ~mask_R
return (mask_R & R) | (mask_Q & Q)
• Still many dummy operations
17
![Page 42: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/42.jpg)
Scalar multiplications with double-and-add• Input: k = (k`−1k`−2...k0)2, point P• Output: kP
Q = 0
For i from l-1 downto 0 do
Q = 2Q
R = Q + P
Q = SEL(R, Q, k[i]) ← constant-time
SEL(R, Q, b):
mask_R = -b
mask_Q = ~mask_R
return (mask_R & R) | (mask_Q & Q)
• Still many dummy operations17
![Page 43: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/43.jpg)
Scalar multiplications with table lookups (fixed window)• Group w bits. Keep a table of 0P, . . . , (2w − 1)P
• Window width w = 4, ` = 256.
For i from 0 to 15 do
T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
j = (k[i+3] k[i+2] k[i+1] k[i])_2
Q = 2Q
Q = 2Q
Q = 2Q
Q = 2Q
Q += T[j]
• ≈ `/w additions
← non-constant-time!
18
![Page 44: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/44.jpg)
Scalar multiplications with table lookups (fixed window)• Group w bits. Keep a table of 0P, . . . , (2w − 1)P
• Window width w = 4, ` = 256.
For i from 0 to 15 do
T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
j = (k[i+3] k[i+2] k[i+1] k[i])_2
Q = 2Q
Q = 2Q
Q = 2Q
Q = 2Q
Q += T[j]
• ≈ `/w additions
← non-constant-time!
18
![Page 45: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/45.jpg)
Scalar multiplications with table lookups (fixed window)• Group w bits. Keep a table of 0P, . . . , (2w − 1)P
• Window width w = 4, ` = 256.
For i from 0 to 15 do
T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
j = (k[i+3] k[i+2] k[i+1] k[i])_2
Q = 2Q
Q = 2Q
Q = 2Q
Q = 2Q
Q += T[j]
• ≈ `/w additions
← non-constant-time!
18
![Page 46: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/46.jpg)
Scalar multiplications with table lookups (fixed window)• Group w bits. Keep a table of 0P, . . . , (2w − 1)P
• Window width w = 4, ` = 256.
For i from 0 to 15 do
T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
j = (k[i+3] k[i+2] k[i+1] k[i])_2
Q = 2Q
Q = 2Q
Q = 2Q
Q = 2Q
Q += T[j]
• ≈ `/w additions
← non-constant-time!
18
![Page 47: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/47.jpg)
Scalar multiplications with table lookups (fixed window)• Group w bits. Keep a table of 0P, . . . , (2w − 1)P
• Window width w = 4, ` = 256.
For i from 0 to 15 do
T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
j = (k[i+3] k[i+2] k[i+1] k[i])_2
Q = 2Q
Q = 2Q
Q = 2Q
Q = 2Q
Q += T[j]
• ≈ `/w additions
← non-constant-time!
18
![Page 48: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/48.jpg)
Caches
ALU
data
Cache
CPU
Memory
line
data
• Real architectures are more complex (hierarchy, associativity, etc)• Instruction caches
19
![Page 49: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/49.jpg)
Caches
ALU
data
Cache
CPU
Memory
line
data
• Real architectures are more complex (hierarchy, associativity, etc)• Instruction caches
19
![Page 50: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/50.jpg)
Caches
ALU
data
Cache
CPU
Memory
line
data
• Real architectures are more complex (hierarchy, associativity, etc)• Instruction caches
19
![Page 51: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/51.jpg)
Cache-timing attacks
cache lines
Eve
CryptoEve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 52: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/52.jpg)
Cache-timing attacks
cache lines
Eve
CryptoEve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 53: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/53.jpg)
Cache-timing attacks
cache lines
Eve
Crypto
Eve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 54: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/54.jpg)
Cache-timing attacks
cache lines
EveCrypto
Eve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 55: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/55.jpg)
Cache-timing attacks
cache lines
EveCrypto
Eve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 56: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/56.jpg)
Cache-timing attacks
cache lines
EveCrypto
Eve
fast fast fast fastslow slow
T [0] and T [4] have been accessed... Muahahaha
• Similar for instruction cache
• To avoid timing attacks, you should avoid
• secret-dependent conditions• secret-dependent memory indices
20
![Page 57: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/57.jpg)
Constant-time table lookups• Consider w = 3
// compute mask_j from i s.t.
// mask_j = 0xFF..F if j == i
// mask_j = 0x00..0 if j != i
v = mask_0 & T[0];
v |= mask_1 & T[1];
v |= mask_2 & T[2];
v |= mask_3 & T[3];
v |= mask_4 & T[4];
v |= mask_5 & T[5];
v |= mask_6 & T[6];
v |= mask_7 & T[7];
// now v = T[i]
• There is a limit on w
21
![Page 58: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/58.jpg)
Constant-time table lookups• Consider w = 3
// compute mask_j from i s.t.
// mask_j = 0xFF..F if j == i
// mask_j = 0x00..0 if j != i
v = mask_0 & T[0];
v |= mask_1 & T[1];
v |= mask_2 & T[2];
v |= mask_3 & T[3];
v |= mask_4 & T[4];
v |= mask_5 & T[5];
v |= mask_6 & T[6];
v |= mask_7 & T[7];
// now v = T[i]
• There is a limit on w21
![Page 59: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/59.jpg)
Signed window• Each window is consider to be in [−2w−1, 2w−1 − 1]
• 1 denotes −1
(01 10 11 01 11)2
(01 10 11 10 01)2
(01 11 01 10 01)2
(10 01 01 10 01)2
• Rewriting is cheap
22
![Page 60: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/60.jpg)
Signed window• Each window is consider to be in [−2w−1, 2w−1 − 1]
• 1 denotes −1
(01 10 11 01 11)2
(01 10 11 10 01)2
(01 11 01 10 01)2
(10 01 01 10 01)2
• Rewriting is cheap
22
![Page 61: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/61.jpg)
Signed window• Each window is consider to be in [−2w−1, 2w−1 − 1]
• 1 denotes −1
(01 10 11 01 11)2
(01 10 11 10 01)2
(01 11 01 10 01)2
(10 01 01 10 01)2
• Rewriting is cheap
22
![Page 62: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/62.jpg)
Signed window• Each window is consider to be in [−2w−1, 2w−1 − 1]
• 1 denotes −1
(01 10 11 01 11)2
(01 10 11 10 01)2
(01 11 01 10 01)2
(10 01 01 10 01)2
• Rewriting is cheap
22
![Page 63: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/63.jpg)
Signed window• Each window is consider to be in [−2w−1, 2w−1 − 1]
• 1 denotes −1
(01 10 11 01 11)2
(01 10 11 10 01)2
(01 11 01 10 01)2
(10 01 01 10 01)2
• Rewriting is cheap
22
![Page 64: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/64.jpg)
On-demand negation
// compute mask_j from i s.t.
// mask_j = 0xFF..F if |j| == i
// mask_j = 0x00..0 if |j| != i
v = mask_0 & T[0];
v |= mask_1 & T[1];
v |= mask_2 & T[2];
v |= mask_3 & T[3];
v |= mask_4 & T[4];
v = SEL(-v, v, sign(j));
// now v = T[i]
• More memory-efficient
• More time-efficient (negation is cheap)
23
![Page 65: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/65.jpg)
Fixed-base scalar multiplication
• Precomputation is free...
• can use large window sizes (still many squarings).• can just precompute all 2iP and do additions (no doublings).
• Combining with windowing:
• just precompute all m2iwP for m ∈ [0, 2w − 1].• ≈ `/w additions, no doublings
• Suppose each table entry takes 64 bytes, ` = 256 and w = 4.
Memory requirement?
24
![Page 66: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/66.jpg)
Fixed-base scalar multiplication
• Precomputation is free...
• can use large window sizes (still many squarings).• can just precompute all 2iP and do additions (no doublings).
• Combining with windowing:
• just precompute all m2iwP for m ∈ [0, 2w − 1].• ≈ `/w additions, no doublings
• Suppose each table entry takes 64 bytes, ` = 256 and w = 4.
Memory requirement?
24
![Page 67: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/67.jpg)
Fixed-base scalar multiplication
• Precomputation is free...
• can use large window sizes (still many squarings).• can just precompute all 2iP and do additions (no doublings).
• Combining with windowing:
• just precompute all m2iwP for m ∈ [0, 2w − 1].• ≈ `/w additions, no doublings
• Suppose each table entry takes 64 bytes, ` = 256 and w = 4.
Memory requirement?
24
![Page 68: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/68.jpg)
Variable-time instructions
http://www.agner.org/optimize/instruction_tables.pdf
latency
25
![Page 69: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/69.jpg)
EdDSA signature scheme
B: based point, ` = |B|, H: hash function with 2b-bit outputs
Key generation:k : b-bit secret key
a← H(k)A← aB : public key
Signing:r ← H(a,M)
R← rB
s← (r +H(R,A,M)a) mod `
(R, s) : signature of M
Verification:
sB = R+H(R,A,M)A
• Convince yourself that the scheme works.
26
![Page 70: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/70.jpg)
EdDSA signature scheme
B: based point, ` = |B|, H: hash function with 2b-bit outputs
Key generation:k : b-bit secret key
a← H(k)A← aB : public key
Signing:r ← H(a,M)
R← rB
s← (r +H(R,A,M)a) mod `
(R, s) : signature of M
Verification:
sB = R+H(R,A,M)A
• Convince yourself that the scheme works.
26
![Page 71: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/71.jpg)
EdDSA signature scheme
B: based point, ` = |B|, H: hash function with 2b-bit outputs
Key generation:k : b-bit secret key
a← H(k)A← aB : public key
Signing:r ← H(a,M)
R← rB
s← (r +H(R,A,M)a) mod `
(R, s) : signature of M
Verification:
sB = R+H(R,A,M)A
• Convince yourself that the scheme works.
26
![Page 72: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/72.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window
27
![Page 73: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/73.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window
27
![Page 74: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/74.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window
27
![Page 75: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/75.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window
27
![Page 76: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/76.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window
27
![Page 77: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/77.jpg)
Sliding window• Keep mP with odd m ∈ [0, 2w − 1]
• Shift the window if necessary
(01 10 11 01 11)2
(01 10 11 01 03)2
(01 10 11 01 03)2
(01 10 03 01 03)2
(00 30 03 01 03)2
• 4⇒ 3 additions
• Can use signed sliding window27
![Page 78: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/78.jpg)
Double-scalar multiplications
• Input: public a, b, point P,Q
• Output: aP + bQ
R = 0
For i from l-1 downto 0 do
R = 2R
if a[i] == 1
R += P
if b[i] == 1
R += Q
• Sliding window can be applied (even different window sizes)
• Multi-scalar multiplication
28
![Page 79: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/79.jpg)
Double-scalar multiplications
• Input: public a, b, point P,Q
• Output: aP + bQ
R = 0
For i from l-1 downto 0 do
R = 2R
if a[i] == 1
R += P
if b[i] == 1
R += Q
• Sliding window can be applied (even different window sizes)
• Multi-scalar multiplication
28
![Page 80: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/80.jpg)
Double-scalar multiplications
• Input: public a, b, point P,Q
• Output: aP + bQ
R = 0
For i from l-1 downto 0 do
R = 2R
if a[i] == 1
R += P
if b[i] == 1
R += Q
• Sliding window can be applied (even different window sizes)
• Multi-scalar multiplication
28
![Page 81: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/81.jpg)
Montgomery curves1987 Montgomery:• Curve
by2 = x3 + ax2 + x
• Doubling
2(x2, y2) = (x4, y4) ⇒ x4 =(x22 − 1)2
4x2(x22 + ax2 + 1)
• Differential addition
(x3, y3)− (x2, y2) = (x1, y1)
(x3, y3) + (x2, y2) = (x5, y5)
⇒ x5 =(x2x3 − 1)2
x1(x2 − x3)2
• We can compute the x coordinate of kP !
• Represent (x, y) as(X : Z), x = X/Z
29
![Page 82: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/82.jpg)
Montgomery curves1987 Montgomery:• Curve
by2 = x3 + ax2 + x
• Doubling
2(x2, y2) = (x4, y4) ⇒ x4 =(x22 − 1)2
4x2(x22 + ax2 + 1)
• Differential addition
(x3, y3)− (x2, y2) = (x1, y1)
(x3, y3) + (x2, y2) = (x5, y5)
⇒ x5 =(x2x3 − 1)2
x1(x2 − x3)2
• We can compute the x coordinate of kP !
• Represent (x, y) as(X : Z), x = X/Z
29
![Page 83: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/83.jpg)
Montgomery curves1987 Montgomery:• Curve
by2 = x3 + ax2 + x
• Doubling
2(x2, y2) = (x4, y4) ⇒ x4 =(x22 − 1)2
4x2(x22 + ax2 + 1)
• Differential addition
(x3, y3)− (x2, y2) = (x1, y1)
(x3, y3) + (x2, y2) = (x5, y5)
⇒ x5 =(x2x3 − 1)2
x1(x2 − x3)2
• We can compute the x coordinate of kP !
• Represent (x, y) as(X : Z), x = X/Z
29
![Page 84: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/84.jpg)
Montgomery curves1987 Montgomery:• Curve
by2 = x3 + ax2 + x
• Doubling
2(x2, y2) = (x4, y4) ⇒ x4 =(x22 − 1)2
4x2(x22 + ax2 + 1)
• Differential addition
(x3, y3)− (x2, y2) = (x1, y1)
(x3, y3) + (x2, y2) = (x5, y5)
⇒ x5 =(x2x3 − 1)2
x1(x2 − x3)2
• We can compute the x coordinate of kP !
• Represent (x, y) as(X : Z), x = X/Z
29
![Page 85: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/85.jpg)
Montgomery curves1987 Montgomery:• Curve
by2 = x3 + ax2 + x
• Doubling
2(x2, y2) = (x4, y4) ⇒ x4 =(x22 − 1)2
4x2(x22 + ax2 + 1)
• Differential addition
(x3, y3)− (x2, y2) = (x1, y1)
(x3, y3) + (x2, y2) = (x5, y5)
⇒ x5 =(x2x3 − 1)2
x1(x2 − x3)2
• We can compute the x coordinate of kP !
• Represent (x, y) as(X : Z), x = X/Z
29
![Page 86: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/86.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 87: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/87.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 88: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/88.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 89: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/89.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 90: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/90.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 91: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/91.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 92: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/92.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 93: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/93.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 94: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/94.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 95: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/95.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 96: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/96.jpg)
Montgomery ladder
Goal: compute nP, n = (n`−1n`−2 · · ·n0)2
Start from (Q`, R`) = (0, P )
Maintain (Qi, Ri) s.t. Ri = Qi + P ,
Qi = Σ`i=jnj2
j−iP (i decreasing)
Output Q0 = nP
if ni = 0,
(Qi, Ri) = (2Qi+1, Qi+1+Ri+1)
if ni = 1,
(Qi, Ri) = (Qi+1+Ri+1, 2Ri+1)
n = (n3n2n1n0)2 = (1101)2 = 13
(Q4, R4) = (0, P )
n3 = 1
(Q3, R3) = (P, 2P )
n2 = 1
(Q2, R2) = (3P, 4P )
n1 = 0
(Q1, R1) = (6P, 7P )
n0 = 1
(Q0, R0) = (13P , 14P )
30
![Page 97: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/97.jpg)
Implementing the Montgomery ladder
dummy
P P ′
2P P + P ′
Ladder Step
ki cswap
• How to implement cswap?• hint: 4 bit operations to conditionally swap 2 bits
31
![Page 98: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/98.jpg)
Implementing the Montgomery ladder
dummy
P P ′
2P P + P ′
Ladder Stepki cswap
• How to implement cswap?• hint: 4 bit operations to conditionally swap 2 bits
31
![Page 99: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/99.jpg)
Implementing the Montgomery ladder
dummy
P P ′
2P P + P ′
Ladder Stepki cswap
• How to implement cswap?• hint: 4 bit operations to conditionally swap 2 bits
31
![Page 100: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/100.jpg)
Vector units
• Advanced Vector Extensions (AVX)
• Enables vector instructions
32
![Page 101: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/101.jpg)
2-way vectorization
x
a1 a0
b1 b0
a1b1 a0b0
32 bits32 bits
64 bits64 bits
• Useful for accelerating cryptographic computation
33
![Page 102: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/102.jpg)
Exploiting parallelism: naive way
P1 P2
k1P1 k2P2
• Might be good for busy servers. Other applications?
34
![Page 103: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/103.jpg)
Exploiting parallelism: naive way
P1 P2
k1P1 k2P2
• Might be good for busy servers. Other applications?
34
![Page 104: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/104.jpg)
Exploiting parallelism: key generation
P = s0B + s116B + s2162B + · · ·+ s6316
63B
P0 = s0B + s2162B + · · ·+ s6216
62B
P1 = s1B + s3162B + · · ·+ s6316
62B
P = 16P1 + P0
• Extra benefit of reducing table size
• Tradeoffs between time and space
35
![Page 105: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/105.jpg)
Exploiting parallelism: key generation
P = s0B + s116B + s2162B + · · ·+ s6316
63B
P0 = s0B + s2162B + · · ·+ s6216
62B
P1 = s1B + s3162B + · · ·+ s6316
62B
P = 16P1 + P0
• Extra benefit of reducing table size
• Tradeoffs between time and space
35
![Page 106: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/106.jpg)
Exploiting parallelism: key generation
P = s0B + s116B + s2162B + · · ·+ s6316
63B
P0 = s0B + s2162B + · · ·+ s6216
62B
P1 = s1B + s3162B + · · ·+ s6316
62B
P = 16P1 + P0
• Extra benefit of reducing table size
• Tradeoffs between time and space
35
![Page 107: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/107.jpg)
Exploiting parallelism: key generation
P = s0B + s116B + s2162B + · · ·+ s6316
63B
P0 = s0B + s2162B + · · ·+ s6216
62B
P1 = s1B + s3162B + · · ·+ s6316
62B
P = 16P1 + P0
• Extra benefit of reducing table size
• Tradeoffs between time and space
35
![Page 108: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/108.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 109: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/109.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 110: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/110.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 111: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/111.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 112: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/112.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 113: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/113.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 114: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/114.jpg)
Exploiting parallelism: ladder step
x z x′ z′
+ − + −
× × × ×
× − + −
× × ×
+ ×
×
x2 z2 x3 z3
(A− 2)/4
x1
P P ′
2P P + P ′
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
![Page 115: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/115.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 116: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/116.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 117: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/117.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 118: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/118.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 119: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/119.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 120: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/120.jpg)
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X ′ : Y ′ : Z ′) = 2(X : Y : Z)
X Y Z
+ +
X′ Y ′ Z′
× × × ×
− + − +
× × ×
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
![Page 121: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/121.jpg)
Field representation: case of F2255−19• Intuition: radix-264 representation
• each field elements in represented by 4 limbs
f = f0+ f1264+ f22128+ f32192
g = g0+ g1264+ g22128+ g32192
• Multiplication (mod 2256 − 38):
h0 = f0g0+ 38f1g3+ 38f2g2+ 38f3g1
h1 = f0g1+ f1g0+ 38f2g3+ 38f3g2
h2 = f0g2+ f1g1+ f2g0+ 38f3g3
h3 = f0g3+ f1g2+ f2g1+ f3g0
• Making use MULX: “64× 64→ 64 64”
• Lots of carries!
38
![Page 122: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/122.jpg)
Field representation: case of F2255−19• Intuition: radix-264 representation
• each field elements in represented by 4 limbs
f = f0+ f1264+ f22128+ f32192
g = g0+ g1264+ g22128+ g32192
• Multiplication (mod 2256 − 38):
h0 = f0g0+ 38f1g3+ 38f2g2+ 38f3g1
h1 = f0g1+ f1g0+ 38f2g3+ 38f3g2
h2 = f0g2+ f1g1+ f2g0+ 38f3g3
h3 = f0g3+ f1g2+ f2g1+ f3g0
• Making use MULX: “64× 64→ 64 64”
• Lots of carries!
38
![Page 123: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/123.jpg)
Field representation: case of F2255−19• Intuition: radix-264 representation
• each field elements in represented by 4 limbs
f = f0+ f1264+ f22128+ f32192
g = g0+ g1264+ g22128+ g32192
• Multiplication (mod 2256 − 38):
h0 = f0g0+ 38f1g3+ 38f2g2+ 38f3g1
h1 = f0g1+ f1g0+ 38f2g3+ 38f3g2
h2 = f0g2+ f1g1+ f2g0+ 38f3g3
h3 = f0g3+ f1g2+ f2g1+ f3g0
• Making use MULX: “64× 64→ 64 64”
• Lots of carries!
38
![Page 124: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/124.jpg)
Field representation: case of F2255−19• Intuition: radix-264 representation
• each field elements in represented by 4 limbs
f = f0+ f1264+ f22128+ f32192
g = g0+ g1264+ g22128+ g32192
• Multiplication (mod 2256 − 38):
h0 = f0g0+ 38f1g3+ 38f2g2+ 38f3g1
h1 = f0g1+ f1g0+ 38f2g3+ 38f3g2
h2 = f0g2+ f1g1+ f2g0+ 38f3g3
h3 = f0g3+ f1g2+ f2g1+ f3g0
• Making use MULX: “64× 64→ 64 64”
• Lots of carries!
38
![Page 125: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/125.jpg)
Field representation: case of F2255−19• Intuition: radix-264 representation
• each field elements in represented by 4 limbs
f = f0+ f1264+ f22128+ f32192
g = g0+ g1264+ g22128+ g32192
• Multiplication (mod 2256 − 38):
h0 = f0g0+ 38f1g3+ 38f2g2+ 38f3g1
h1 = f0g1+ f1g0+ 38f2g3+ 38f3g2
h2 = f0g2+ f1g1+ f2g0+ 38f3g3
h3 = f0g3+ f1g2+ f2g1+ f3g0
• Making use MULX: “64× 64→ 64 64”
• Lots of carries!
38
![Page 126: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/126.jpg)
Adding products of the limbs
low high
hi
prod
+ +
c c
• adding 128-bit products produces 2 carry bits:
→ require 2 “addtion with carry” (adc) instructions
• adc can be expensive:
6x slower than add in terms of througput on some Intel CPUs
39
![Page 127: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/127.jpg)
Adding products of the limbs
low high
hi
prod
+ +
c c
• adding 128-bit products produces 2 carry bits:
→ require 2 “addtion with carry” (adc) instructions
• adc can be expensive:
6x slower than add in terms of througput on some Intel CPUs
39
![Page 128: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/128.jpg)
Adding products of the limbs
low high
hi
prod
+ +
c c
• adding 128-bit products produces 2 carry bits:
→ require 2 “addtion with carry” (adc) instructions
• adc can be expensive:
6x slower than add in terms of througput on some Intel CPUs
39
![Page 129: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/129.jpg)
Redundant representation
• Radix-51 representation
f = f0+ f1251+ f22102+ f32153+ f42204
g = g0+ g1251+ g22102+ g32153+ g42204
• Multiplication
h0 = f0g0+ 19f1g4+ 19f2g3+ 19f3g2+ 19f4g1
h1 = f0g1+ f1g0+ 19f2g4+ 19f3g3+ 19f4g2
h2 = f0g2+ f1g1+ f2g0+ 19f3g4+ 19f4g3
h3 = f0g3+ f1g2+ f2g1+ f3g0+ 19f4g4
h4 = f0g4+ f1g3+ f2g2+ f3g1+ f4g0
• Adding upper 64 bits does not generate carries!
⇒ reducing (roughly) half of the adc’s
40
![Page 130: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/130.jpg)
Redundant representation
• Radix-51 representation
f = f0+ f1251+ f22102+ f32153+ f42204
g = g0+ g1251+ g22102+ g32153+ g42204
• Multiplication
h0 = f0g0+ 19f1g4+ 19f2g3+ 19f3g2+ 19f4g1
h1 = f0g1+ f1g0+ 19f2g4+ 19f3g3+ 19f4g2
h2 = f0g2+ f1g1+ f2g0+ 19f3g4+ 19f4g3
h3 = f0g3+ f1g2+ f2g1+ f3g0+ 19f4g4
h4 = f0g4+ f1g3+ f2g2+ f3g1+ f4g0
• Adding upper 64 bits does not generate carries!
⇒ reducing (roughly) half of the adc’s
40
![Page 131: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/131.jpg)
Redundant representation
• Radix-51 representation
f = f0+ f1251+ f22102+ f32153+ f42204
g = g0+ g1251+ g22102+ g32153+ g42204
• Multiplication
h0 = f0g0+ 19f1g4+ 19f2g3+ 19f3g2+ 19f4g1
h1 = f0g1+ f1g0+ 19f2g4+ 19f3g3+ 19f4g2
h2 = f0g2+ f1g1+ f2g0+ 19f3g4+ 19f4g3
h3 = f0g3+ f1g2+ f2g1+ f3g0+ 19f4g4
h4 = f0g4+ f1g3+ f2g2+ f3g1+ f4g0
• Adding upper 64 bits does not generate carries!
⇒ reducing (roughly) half of the adc’s
40
![Page 132: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/132.jpg)
Redundant representation
• Radix-51 representation
f = f0+ f1251+ f22102+ f32153+ f42204
g = g0+ g1251+ g22102+ g32153+ g42204
• Multiplication
h0 = f0g0+ 19f1g4+ 19f2g3+ 19f3g2+ 19f4g1
h1 = f0g1+ f1g0+ 19f2g4+ 19f3g3+ 19f4g2
h2 = f0g2+ f1g1+ f2g0+ 19f3g4+ 19f4g3
h3 = f0g3+ f1g2+ f2g1+ f3g0+ 19f4g4
h4 = f0g4+ f1g3+ f2g2+ f3g1+ f4g0
• Adding upper 64 bits does not generate carries!
⇒ reducing (roughly) half of the adc’s
40
![Page 133: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/133.jpg)
Reduction (carries)
• Carry hi → hi+1 (consider radix 2b):
c = hi � b, hi+1 = hi+1 + c, hi = hi − c
• Long carry chain:
h0 → h1 → h2 → h3 → h4 → h5 → h6 → h7 → h8 → h9 → h0 → h1
(Drawback: suffering from long latency)
• Reducing latency by interleaving carries:
h0 → h1 → h2 → h3 → h4 → h5 → h6
h5 → h6 → h7 → h8 → h9 → h0 → h1
41
![Page 134: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/134.jpg)
Reduction (carries)
• Carry hi → hi+1 (consider radix 2b):
c = hi � b, hi+1 = hi+1 + c, hi = hi − c
• Long carry chain:
h0 → h1 → h2 → h3 → h4 → h5 → h6 → h7 → h8 → h9 → h0 → h1
(Drawback: suffering from long latency)
• Reducing latency by interleaving carries:
h0 → h1 → h2 → h3 → h4 → h5 → h6
h5 → h6 → h7 → h8 → h9 → h0 → h1
41
![Page 135: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/135.jpg)
Reduction (carries)
• Carry hi → hi+1 (consider radix 2b):
c = hi � b, hi+1 = hi+1 + c, hi = hi − c
• Long carry chain:
h0 → h1 → h2 → h3 → h4 → h5 → h6 → h7 → h8 → h9 → h0 → h1
(Drawback: suffering from long latency)
• Reducing latency by interleaving carries:
h0 → h1 → h2 → h3 → h4 → h5 → h6
h5 → h6 → h7 → h8 → h9 → h0 → h1
41
![Page 136: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/136.jpg)
Instruction-level optimization
instructions ports latency reciprocal throughput
PADD/SUB(S,US)B/W/D/Q p15 1 0.5
PMULDQ p0 5 1
PSLLDQ, PSRLDQ p5 1 1
PAND PANDN POR PXOR p015 1 0.33
www.agner.org/optimize/instruction_tables.pdf
• Each execution ports can handle 1 (micro-) instruction per cycle
• Useful for estimating the actual performance
• PADD or PSLLDQ to multiply by 2?
• More on www.agner.org
42
![Page 137: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/137.jpg)
Instruction-level optimization
instructions ports latency reciprocal throughput
PADD/SUB(S,US)B/W/D/Q p15 1 0.5
PMULDQ p0 5 1
PSLLDQ, PSRLDQ p5 1 1
PAND PANDN POR PXOR p015 1 0.33
www.agner.org/optimize/instruction_tables.pdf
• Each execution ports can handle 1 (micro-) instruction per cycle
• Useful for estimating the actual performance
• PADD or PSLLDQ to multiply by 2?
• More on www.agner.org
42
![Page 138: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/138.jpg)
Field inversion
• Used for converting from (X : Y : Z) to (X/Z, Y/Z), for example
• typically only once
• Extended Euclidean algorithm
• a, b 7−→ x, y s.t. xa + yb = gcd(a, b)
• a, p 7−→ x, y s.t. xa + yp = 1, i.e., x ≡ a−1 (mod p)
• not (immediately) constant-time
• Square-and-multiply
• ap−1 ≡ 1 (mod p)⇒ ap−2 is the inverse!
• p− 2 is public
43
![Page 139: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/139.jpg)
Field inversion
• Used for converting from (X : Y : Z) to (X/Z, Y/Z), for example
• typically only once
• Extended Euclidean algorithm
• a, b 7−→ x, y s.t. xa + yb = gcd(a, b)
• a, p 7−→ x, y s.t. xa + yp = 1, i.e., x ≡ a−1 (mod p)
• not (immediately) constant-time
• Square-and-multiply
• ap−1 ≡ 1 (mod p)⇒ ap−2 is the inverse!
• p− 2 is public
43
![Page 140: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/140.jpg)
Field inversion
• Used for converting from (X : Y : Z) to (X/Z, Y/Z), for example
• typically only once
• Extended Euclidean algorithm
• a, b 7−→ x, y s.t. xa + yb = gcd(a, b)
• a, p 7−→ x, y s.t. xa + yp = 1, i.e., x ≡ a−1 (mod p)
• not (immediately) constant-time
• Square-and-multiply
• ap−1 ≡ 1 (mod p)⇒ ap−2 is the inverse!
• p− 2 is public
43
![Page 141: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/141.jpg)
Field inversion
• Used for converting from (X : Y : Z) to (X/Z, Y/Z), for example
• typically only once
• Extended Euclidean algorithm
• a, b 7−→ x, y s.t. xa + yb = gcd(a, b)
• a, p 7−→ x, y s.t. xa + yp = 1, i.e., x ≡ a−1 (mod p)
• not (immediately) constant-time
• Square-and-multiply
• ap−1 ≡ 1 (mod p)⇒ ap−2 is the inverse!
• p− 2 is public
43
![Page 142: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/142.jpg)
SageMath
44
![Page 143: Software implementations of ECC: security and e ciency · Software implementations of ECC: security and e ciency Tung Chou Osaka University 17 November 2018 Summer School of the ECC](https://reader035.fdocuments.us/reader035/viewer/2022071021/5fd5c4ee7542a017d45d763a/html5/thumbnails/143.jpg)
doc.sagemath.org/html/en/constructions/
elliptic_curves.html
45