Software implementations of ECC: security and efficiency
Tung Chou
Osaka University
17 November 2018
Summer School of the ECC workshop
Scope
constructive software
pre-quantum
• Highlighted items: small exercises, homework
1
Diffie-Hellman key exchange (agreement)
Parties agreed on: group G = <g> (e.g., G ⊂ Z_p^*)

One party: random a, sends g^a
Other party: random b, sends g^b
s = (g^b)^a = (g^a)^b

• "Assumed" to be "hard":
  • recovering c from g^c
  • recovering g^(ab) given g, g^a, g^b
• Security depends on G
2
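The shared-secret computation above can be sketched with a toy modular exponentiation in C. The function name `pow_mod` and the tiny parameters (p = 1019, g = 2) are illustrative choices, not from the slides, and this is deliberately the naive, variable-time square-and-multiply loop whose timing leaks are discussed later.

```c
#include <stdint.h>

/* Toy Diffie-Hellman exponentiation in Z_p^*, p a tiny prime.
   Illustrative only: real parameters are hundreds of bits, and
   this loop branches on secret exponent bits (not constant-time). */
static uint64_t pow_mod(uint64_t g, uint64_t e, uint64_t p)
{
    uint64_t r = 1;
    g %= p;
    while (e > 0) {            /* square-and-multiply, LSB first */
        if (e & 1)
            r = r * g % p;     /* multiply when the bit is set */
        g = g * g % p;         /* square every iteration */
        e >>= 1;
    }
    return r;
}
```

With p = 1019 and g = 2, both parties derive the same s: `pow_mod(pow_mod(2, b, p), a, p) == pow_mod(pow_mod(2, a, p), b, p)`.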
Elliptic curves
E : y^2 = x^3 + Ax + B,  A, B ∈ K ⊂ L
E(L) = {∞} ∪ {(x, y) ∈ L^2 | y^2 = x^3 + Ax + B}
[Figure: chord-and-tangent picture of the group law: the line through P and Q meets the curve in a third point, whose reflection over the x-axis is P + Q]
3
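The chord-and-tangent law can be sketched in C over a tiny field. The curve y^2 = x^3 + 2x + 2 over F_17 and the names `pt`, `pt_add`, `inv_mod` are illustrative choices, not from the slides; the code is affine and variable-time, for exposition only.

```c
#include <stdint.h>

/* Toy affine point addition on y^2 = x^3 + Ax + B over F_p, p a tiny
   prime.  Infinity is encoded as inf != 0.  Real implementations use
   projective coordinates and constant-time field arithmetic instead. */
typedef struct { int64_t x, y; int inf; } pt;

static int64_t md(int64_t a, int64_t p) { a %= p; return a < 0 ? a + p : a; }

static int64_t inv_mod(int64_t a, int64_t p)   /* a^(p-2) mod p, p prime */
{
    int64_t r = 1, e = p - 2;
    a = md(a, p);
    while (e > 0) {
        if (e & 1) r = r * a % p;
        a = a * a % p;
        e >>= 1;
    }
    return r;
}

static pt pt_add(pt P, pt Q, int64_t A, int64_t p)
{
    if (P.inf) return Q;
    if (Q.inf) return P;
    if (P.x == Q.x && md(P.y + Q.y, p) == 0)
        return (pt){0, 0, 1};                  /* P + (-P) = infinity */
    int64_t lam;
    if (P.x == Q.x)   /* tangent: lambda = (3x^2 + A) / (2y) */
        lam = md((3 * P.x * P.x + A) % p * inv_mod(2 * P.y, p), p);
    else              /* chord:   lambda = (y2 - y1) / (x2 - x1) */
        lam = md(md(Q.y - P.y, p) * inv_mod(md(Q.x - P.x, p), p), p);
    int64_t x3 = md(lam * lam - P.x - Q.x, p);
    int64_t y3 = md(lam * (P.x - x3) - P.y, p);
    return (pt){x3, y3, 0};
}
```

On this curve, P = (5, 1) gives 2P = (6, 3) and 3P = (10, 6), a standard textbook check.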
Elliptic-curve Diffie-Hellman
Parties agreed on: curve E, <B> ⊆ E(K)

One party: random a, sends aB
Other party: random b, sends bB
s = a(bB) = b(aB)

"Assumed" to be "hard":
• recovering k from kP
• recovering abP given P, aP, bP
4
Scalar multiplications
k · P
• base point P (fixed? variable?)
• scalar k (secret? public?)
5
safecurves.cr.yp.to/
• Different curve forms
• Large field sizes: typically > 2^200
6
ECC implementation pyramid
Scalar multiplications kP
Point adds/doubles
Field arithmetic in F_p = Z_p
Big-integer arithmetic
7
Naive scalar multiplication
• Input: scalar k, point P
• Output: kP
Q = P
For i from 1 to k-1 do
Q += P
• Way too slow... (recall the group order)
• Timing depends on k (secret)!
10
Theory versus reality
(picture from xkcd.com/538/ [CC BY-NC 2.5])
• How do the attack scenarios differ between the two cases?
11
Side channels
[Figure: an attacker observes side channels (timing, power) of a device processing a secret]
• Side-channel information might be correlated to the secret(s)
• We need constant-time implementations:
  • timing should be independent of the secret(s)
12
Scalar multiplication with double-and-add
• Input: scalar k = (k_{l-1} k_{l-2} ... k_0)_2, point P
• Output: kP

Q = 0
For i from l-1 downto 0 do
    Q = 2Q
    if k[i] == 1        ← timing difference!
        Q += P

• Θ(l) instead of k − 1 point operations
• Similarly, we can do "square-and-multiply"
• From LSB to MSB?
13
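The loop above can be sketched in C with integers under addition standing in for the group (so the answer is simply k·p), which keeps the control flow testable; in ECC, `2Q` and `Q += P` become point doubling and point addition. The function name and the integer model are illustrative, not from the slides.

```c
#include <stdint.h>

/* Double-and-add skeleton.  The "group" here is just integers under
   addition, so the result equals k * p; in ECC the two inner lines
   become point doubling and point addition.
   NOTE: the branch on a secret scalar bit is exactly the timing leak
   discussed on the slide. */
static uint64_t double_and_add(uint64_t k, uint64_t p, int l)
{
    uint64_t q = 0;
    for (int i = l - 1; i >= 0; i--) {
        q = 2 * q;               /* Q = 2Q */
        if ((k >> i) & 1)        /* secret-dependent branch! */
            q += p;              /* Q += P */
    }
    return q;
}
```

For example, `double_and_add(13, 7, 4)` walks the bits 1101 and returns 91 = 13·7.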
Scalar multiplication with double-and-add
• Input: scalar k = (k_{l-1} k_{l-2} ... k_0)_2, point P
• Output: kP

Q = 0
For i from l-1 downto 0 do
    R0 = 2Q
    R1 = Q + P
    if k[i] == 1   ← branch prediction
        Q = R1
    else
        Q = R0

(...and also the problem with the instruction cache)
14
Branch prediction
[Figure: 2-bit saturating-counter predictor; four states move between "predict taken" and "predict NOT taken" on taken/!taken outcomes]
• Good for loop structures. Why?
15
Branch prediction (cont.)
(picture from en.wikipedia.org/wiki/Branch_predictor [CC BY-SA 3.0])
16
Scalar multiplication with double-and-add
• Input: scalar k = (k_{l-1} k_{l-2} ... k_0)_2, point P
• Output: kP

Q = 0
For i from l-1 downto 0 do
    Q = 2Q
    R = Q + P
    Q = SEL(R, Q, k[i])   ← constant-time

SEL(R, Q, b):
    mask_R = -b
    mask_Q = ~mask_R
    return (mask_R & R) | (mask_Q & Q)

• Still many dummy operations
17
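One way to realize SEL at the word level in C, assuming b is exactly 0 or 1 and points are stored as arrays of `uint64_t` limbs (the limb-array helper `sel_limbs` is an illustrative addition, not from the slides):

```c
#include <stdint.h>

/* Constant-time word select: returns r if b == 1, q if b == 0.
   b must be exactly 0 or 1.  No branches, no secret-dependent timing. */
static uint64_t sel_word(uint64_t r, uint64_t q, uint64_t b)
{
    uint64_t mask_r = (uint64_t)0 - b;  /* b==1 -> all ones, b==0 -> zero */
    uint64_t mask_q = ~mask_r;
    return (mask_r & r) | (mask_q & q);
}

/* Apply the select limb-by-limb to two n-limb values. */
static void sel_limbs(uint64_t *out, const uint64_t *r, const uint64_t *q,
                      uint64_t b, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = sel_word(r[i], q[i], b);
}
```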
Scalar multiplication with table lookups (fixed window)
• Group w bits. Keep a table of 0P, ..., (2^w − 1)P
• Window width w = 4, l = 256.

For i from 0 to 15 do
    T[i] = iP
Q = 0
For i in [252, 248, ..., 0] do
    j = (k[i+3] k[i+2] k[i+1] k[i])_2
    Q = 2Q
    Q = 2Q
    Q = 2Q
    Q = 2Q
    Q += T[j]   ← non-constant-time!

• ≈ l/w additions
18
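The fixed-window loop can again be sketched with integers standing in for points (result k·p). The scalar length l = 8 in the test and the function name are illustrative; l is assumed to be a multiple of w = 4 here.

```c
#include <stdint.h>

/* Fixed-window (w = 4) scalar multiplication skeleton over integers
   (stand-in for point operations, so the result is k * p).
   Assumes l is a multiple of 4.  The table index is derived from
   secret scalar bits -- the non-constant-time lookup flagged above. */
static uint64_t fixed_window(uint64_t k, uint64_t p, int l)
{
    uint64_t T[16], q = 0;
    for (int i = 0; i < 16; i++)
        T[i] = (uint64_t)i * p;          /* T[i] = iP */
    for (int i = l - 4; i >= 0; i -= 4) {
        q = 16 * q;                      /* four doublings */
        q += T[(k >> i) & 0xF];          /* secret-indexed lookup! */
    }
    return q;
}
```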
Caches
[Figure: the CPU's ALU fetches data from memory through a cache organized in lines]
• Real architectures are more complex (hierarchy, associativity, etc.)
• Instruction caches
19
Cache-timing attacks
[Figure: Eve measures which cache lines are fast or slow to access after the crypto code runs, learning which table entries it touched]
"T[0] and T[4] have been accessed... Muahahaha"
• Similar for the instruction cache
• To avoid timing attacks, you should avoid:
  • secret-dependent conditions
  • secret-dependent memory indices
20
Constant-time table lookups
• Consider w = 3
// compute mask_j from i s.t.
// mask_j = 0xFF..F if j == i
// mask_j = 0x00..0 if j != i
v = mask_0 & T[0];
v |= mask_1 & T[1];
v |= mask_2 & T[2];
v |= mask_3 & T[3];
v |= mask_4 & T[4];
v |= mask_5 & T[5];
v |= mask_6 & T[6];
v |= mask_7 & T[7];
// now v = T[i]
• There is a limit on w
21
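The slide leaves the mask computation implicit. One standard branch-free way to build `mask_j` and scan the whole table in C (the helper names `eq_mask` and `ct_lookup`, and word-sized entries, are illustrative assumptions):

```c
#include <stdint.h>

/* All-ones if a == b, all-zeros otherwise, without branching:
   x == 0 exactly when a == b, and (x | -x) has its top bit set
   exactly when x != 0. */
static uint64_t eq_mask(uint64_t a, uint64_t b)
{
    uint64_t x = a ^ b;
    return ((x | (0 - x)) >> 63) - 1;
}

/* Constant-time lookup of T[i]: every entry is read and masked,
   so the memory-access pattern is independent of the secret i. */
static uint64_t ct_lookup(const uint64_t *T, int len, uint64_t i)
{
    uint64_t v = 0;
    for (int j = 0; j < len; j++)
        v |= eq_mask((uint64_t)j, i) & T[j];
    return v;
}
```

The cost is linear in the table size, which is one reason the slide notes a limit on w.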
Signed window
• Each window is considered to be in [−2^{w−1}, 2^{w−1} − 1]
• 1̄ denotes −1

(01 10 11 01 11)_2
(01 10 11 10 01̄)_2
(01 11 01̄ 10 01̄)_2
(10 01̄ 01̄ 10 01̄)_2

• Rewriting is cheap
22
On-demand negation
// compute mask_j from i s.t.
// mask_j = 0xFF..F if |j| == i
// mask_j = 0x00..0 if |j| != i
v = mask_0 & T[0];
v |= mask_1 & T[1];
v |= mask_2 & T[2];
v |= mask_3 & T[3];
v |= mask_4 & T[4];
v = SEL(-v, v, sign(j));
// now v = T[i]
• More memory-efficient
• More time-efficient (negation is cheap)
23
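The on-demand negation relies on a cheap constant-time conditional negate. For a two's-complement machine word it can be done with the identity -v = ~v + 1 (for a field element one would instead select between v and p − v with SEL, as the slide's pseudocode does); the function name `cneg` is an illustrative choice:

```c
#include <stdint.h>

/* Conditionally negate a two's-complement word when b == 1,
   in constant time: (v ^ 0) + 0 = v, (v ^ ~0) + 1 = -v. */
static uint64_t cneg(uint64_t v, uint64_t b)
{
    uint64_t m = (uint64_t)0 - b;   /* all-ones iff b == 1 */
    return (v ^ m) + b;
}
```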
Fixed-base scalar multiplication
• Precomputation is free...
  • can use large window sizes (still many squarings).
  • can just precompute all 2^i P and do additions (no doublings).
• Combining with windowing:
  • just precompute all m·2^{iw} P for m ∈ [0, 2^w − 1].
  • ≈ l/w additions, no doublings
• Suppose each table entry takes 64 bytes, l = 256 and w = 4.
  Memory requirement?
24
Variable-time instructions
http://www.agner.org/optimize/instruction_tables.pdf
latency
25
EdDSA signature scheme
B: base point, l = |⟨B⟩| (the order of B), H: hash function with 2b-bit outputs

Key generation:
  k : b-bit secret key
  a ← H(k)
  A ← aB : public key

Signing:
  r ← H(a, M)
  R ← rB
  s ← (r + H(R, A, M)·a) mod l
  (R, s) : signature of M

Verification:
  sB = R + H(R, A, M)·A

• Convince yourself that the scheme works.
26
Sliding window
• Keep mP with odd m ∈ [0, 2^w − 1]
• Shift the window if necessary

(01 10 11 01 11)_2
(01 10 11 01 03)_2
(01 10 11 01 03)_2
(01 10 03 01 03)_2
(00 30 03 01 03)_2

• 4 ⇒ 3 additions
• Can use signed sliding window
27
Double-scalar multiplications
• Input: public a, b, points P, Q
• Output: aP + bQ
R = 0
For i from l-1 downto 0 do
R = 2R
if a[i] == 1
R += P
if b[i] == 1
R += Q
• Sliding window can be applied (even different window sizes)
• Multi-scalar multiplication
28
Montgomery curves
1987 Montgomery:
• Curve
  by^2 = x^3 + ax^2 + x
• Doubling
  2(x2, y2) = (x4, y4)  ⇒  x4 = (x2^2 − 1)^2 / (4x2(x2^2 + ax2 + 1))
• Differential addition
  (x3, y3) − (x2, y2) = (x1, y1)
  (x3, y3) + (x2, y2) = (x5, y5)
  ⇒  x5 = (x2·x3 − 1)^2 / (x1(x2 − x3)^2)
• We can compute the x-coordinate of kP!
• Represent (x, y) as (X : Z), x = X/Z
29
Montgomery ladder
Goal: compute nP, n = (n_{l−1} n_{l−2} · · · n_0)_2

Start from (Q_l, R_l) = (0, P)
Maintain (Q_i, R_i) s.t. R_i = Q_i + P,
  Q_i = Σ_{j=i}^{l−1} n_j 2^{j−i} P   (i decreasing)
Output Q_0 = nP

if n_i = 0, (Q_i, R_i) = (2Q_{i+1}, Q_{i+1} + R_{i+1})
if n_i = 1, (Q_i, R_i) = (Q_{i+1} + R_{i+1}, 2R_{i+1})

Example: n = (n_3 n_2 n_1 n_0)_2 = (1101)_2 = 13
(Q_4, R_4) = (0, P)
n_3 = 1: (Q_3, R_3) = (P, 2P)
n_2 = 1: (Q_2, R_2) = (3P, 4P)
n_1 = 0: (Q_1, R_1) = (6P, 7P)
n_0 = 1: (Q_0, R_0) = (13P, 14P)
30
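The ladder recurrence can be sketched with integers standing in for points (result n·p), which makes the invariant R = Q + P easy to check; the branch is written out for clarity, whereas real x-only code replaces it with a conditional swap as shown on the next slide. The function name and integer model are illustrative, not from the slides.

```c
#include <stdint.h>

/* Montgomery-ladder skeleton over integers (stand-in for x-only point
   operations, so the result equals n * p).  The invariant r == q + p
   holds after every step, mirroring R_i = Q_i + P on the curve. */
static uint64_t ladder(uint64_t n, uint64_t p, int l)
{
    uint64_t q = 0, r = p;                     /* (Q_l, R_l) = (0, P) */
    for (int i = l - 1; i >= 0; i--) {
        if ((n >> i) & 1) {                    /* bit 1: (Q+R, 2R) */
            q = q + r;
            r = 2 * r;
        } else {                               /* bit 0: (2Q, Q+R) */
            r = q + r;
            q = 2 * q;
        }
    }
    return q;                                  /* q == n * p */
}
```

Tracing n = 13 with p = 3 reproduces the slide's example: (q, r) goes (0,3) → (3,6) → (9,12) → (18,21) → (39,42), i.e. (13P, 14P).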
Implementing the Montgomery ladder
[Figure: each ladder step takes (P, P′), conditionally swapped according to bit k_i (cswap), and produces (2P, P + P′); the unused output is a dummy]
• How to implement cswap?
  • hint: 4 bit operations to conditionally swap 2 bits
31
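One common word-level realization of cswap (essentially the answer to the hint, lifted from 2 bits to 64): XOR the two words, mask the difference by the condition, and XOR it back into both. The function name is an illustrative choice; `bit` is assumed to be exactly 0 or 1.

```c
#include <stdint.h>

/* Constant-time conditional swap: exchanges *x and *y iff bit == 1.
   t = (x ^ y) & mask is the difference (or zero), so XORing it into
   both words either swaps them or leaves them unchanged -- with the
   same instructions either way. */
static void cswap(uint64_t *x, uint64_t *y, uint64_t bit)
{
    uint64_t mask = (uint64_t)0 - bit;
    uint64_t t = (*x ^ *y) & mask;
    *x ^= t;
    *y ^= t;
}
```

In a real ladder this is applied limb-by-limb to the (X : Z) coordinates of both points before and after each step.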
Vector units
• Advanced Vector Extensions (AVX)
• Enables vector instructions
32
2-way vectorization
[Figure: one vector multiply instruction computes a1·b1 and a0·b0 in parallel: two 32-bit × 32-bit → 64-bit products per instruction]
• Useful for accelerating cryptographic computation
33
Exploiting parallelism: naive way
[Figure: two independent scalar multiplications k1·P1 and k2·P2 run in parallel vector lanes]
• Might be good for busy servers. Other applications?
34
Exploiting parallelism: key generation
P = s0·B + s1·16B + s2·16^2·B + · · · + s63·16^63·B

P0 = s0·B + s2·16^2·B + · · · + s62·16^62·B
P1 = s1·B + s3·16^2·B + · · · + s63·16^62·B
P = 16·P1 + P0

• Extra benefit of reducing table size
• Tradeoffs between time and space
35
Exploiting parallelism: ladder step
[Figure: dependency graph of the Montgomery ladder step on inputs (x, z), (x′, z′) and constants (A − 2)/4, x1, producing (x2, z2), (x3, z3); the additions/subtractions and multiplications line up in layers of independent operations, mapping (P, P′) to (2P, P + P′)]
www.hyperelliptic.org/EFD/g1p/auto-montgom-xz.html#ladder-mladd-1987-m
36
Exploiting parallelism: doubling (twisted Edwards curves)
• Let (X : Y : Z) := (X/Z, Y/Z)
• Goal: compute (X′ : Y′ : Z′) = 2(X : Y : Z)
[Figure: dependency graph of the dbl-2008-hwcd doubling formula: layers of parallel additions and multiplications from (X, Y, Z) to (X′, Y′, Z′)]
www.hyperelliptic.org/EFD/g1p/auto-twisted-extended-1.html#doubling-dbl-2008-hwcd
37
Field representation: the case of F_{2^255−19}
• Intuition: radix-2^64 representation
  • each field element is represented by 4 limbs
    f = f0 + f1·2^64 + f2·2^128 + f3·2^192
    g = g0 + g1·2^64 + g2·2^128 + g3·2^192
• Multiplication (mod 2^256 − 38):
    h0 = f0g0 + 38f1g3 + 38f2g2 + 38f3g1
    h1 = f0g1 + f1g0 + 38f2g3 + 38f3g2
    h2 = f0g2 + f1g1 + f2g0 + 38f3g3
    h3 = f0g3 + f1g2 + f2g1 + f3g0
• Making use of MULX: 64 × 64 → 64, 64 multiplication
• Lots of carries!
38
Adding products of the limbs
[Figure: a 128-bit product (low half, high half) is added into two 64-bit accumulator words, producing a carry out of each]
• adding 128-bit products produces 2 carry bits:
  → requires 2 "addition with carry" (adc) instructions
• adc can be expensive:
  6x slower than add in terms of throughput on some Intel CPUs
39
Redundant representation
• Radix-2^51 representation
    f = f0 + f1·2^51 + f2·2^102 + f3·2^153 + f4·2^204
    g = g0 + g1·2^51 + g2·2^102 + g3·2^153 + g4·2^204
• Multiplication
    h0 = f0g0 + 19f1g4 + 19f2g3 + 19f3g2 + 19f4g1
    h1 = f0g1 + f1g0 + 19f2g4 + 19f3g3 + 19f4g2
    h2 = f0g2 + f1g1 + f2g0 + 19f3g4 + 19f4g3
    h3 = f0g3 + f1g2 + f2g1 + f3g0 + 19f4g4
    h4 = f0g4 + f1g3 + f2g2 + f3g1 + f4g0
• Adding the upper 64 bits does not generate carries!
  ⇒ reducing (roughly) half of the adc's
40
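The h0..h4 formulas above can be sketched directly in C, assuming a compiler with `unsigned __int128` (GCC/Clang); the function name `fe_mul` and the simple sequential carry pass are illustrative choices, not the slides' exact code, and the output is left slightly unreduced, as is usual for this representation.

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* assumes GCC or Clang */

/* Radix-2^51 schoolbook multiplication mod 2^255 - 19, following the
   h0..h4 formulas above.  Input limbs are assumed < 2^51; output limbs
   end up < 2^52 (not fully reduced). */
static void fe_mul(uint64_t h[5], const uint64_t f[5], const uint64_t g[5])
{
    const uint64_t mask = ((uint64_t)1 << 51) - 1;
    u128 t[5];

    t[0] = (u128)f[0]*g[0] + 19*((u128)f[1]*g[4] + (u128)f[2]*g[3]
                               + (u128)f[3]*g[2] + (u128)f[4]*g[1]);
    t[1] = (u128)f[0]*g[1] + (u128)f[1]*g[0]
         + 19*((u128)f[2]*g[4] + (u128)f[3]*g[3] + (u128)f[4]*g[2]);
    t[2] = (u128)f[0]*g[2] + (u128)f[1]*g[1] + (u128)f[2]*g[0]
         + 19*((u128)f[3]*g[4] + (u128)f[4]*g[3]);
    t[3] = (u128)f[0]*g[3] + (u128)f[1]*g[2] + (u128)f[2]*g[1]
         + (u128)f[3]*g[0] + 19*(u128)f[4]*g[4];
    t[4] = (u128)f[0]*g[4] + (u128)f[1]*g[3] + (u128)f[2]*g[2]
         + (u128)f[3]*g[1] + (u128)f[4]*g[0];

    /* carry chain h0 -> h1 -> ... -> h4, then wrap: 2^255 = 19 mod p */
    uint64_t c = 0;
    for (int i = 0; i < 5; i++) {
        t[i] += c;
        c = (uint64_t)(t[i] >> 51);
        h[i] = (uint64_t)t[i] & mask;
    }
    h[0] += 19 * c;          /* top carry wraps around times 19 */
    h[1] += h[0] >> 51;      /* one more small carry: limbs < 2^52 */
    h[0] &= mask;
}
```

As a sanity check, multiplying 2^204 (limb f4 = 1) by 2·2^51 (limb g1 = 2) gives 2^256 ≡ 38 (mod 2^255 − 19).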
Reduction (carries)
• Carry hi → hi+1 (consider radix 2b):
c = hi ≫ b,  hi+1 = hi+1 + c,  hi = hi − c·2^b
• Long carry chain:
h0 → h1 → h2 → h3 → h4 → h5 → h6 → h7 → h8 → h9 → h0 → h1
(Drawback: each carry depends on the previous one ⇒ long latency)
• Reducing latency by interleaving carries:
h0 → h1 → h2 → h3 → h4 → h5 → h6
h5 → h6 → h7 → h8 → h9 → h0 → h1
41
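A sketch of one serial carry pass, using the 5-limb radix-51 representation from before (the slide's ten-limb chain belongs to the radix-25.5 representation, but the mechanics are identical; the carry out of the top limb wraps around multiplied by 19):

```python
def carry(h, b=51):
    """One serial carry pass over 5 limbs, radix 2^b, mod 2^255 - 19.

    Each step: c = h[i] >> b; h[i+1] += c; h[i] -= c << b.
    The chain is serial -- each carry depends on the previous limb --
    which is exactly the latency problem interleaving addresses.
    """
    for i in range(4):
        c = h[i] >> b
        h[i + 1] += c
        h[i] -= c << b
    c = h[4] >> b        # wrap-around: 2^255 ≡ 19 (mod p)
    h[0] += 19 * c
    h[0:1] = [h[0]]      # h[0] may now slightly exceed 2^b
    h[4] -= c << b
    return h
```

Because the wrap can push h[0] slightly above 2^51 again, implementations run the chain a bit past the wrap (into h1), which is why the chains on the slide end at h0 → h1.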
Instruction-level optimization
instructions              ports   latency   reciprocal throughput
PADD/SUB(S,US)B/W/D/Q     p15     1         0.5
PMULDQ                    p0      5         1
PSLLDQ, PSRLDQ            p5      1         1
PAND, PANDN, POR, PXOR    p015    1         0.33
www.agner.org/optimize/instruction_tables.pdf
• Each execution port can handle 1 micro-instruction per cycle
• Useful for estimating the actual performance
• PADD or PSLLDQ to multiply by 2?
• More on www.agner.org
42
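As a rough illustration of how such tables are used (the instruction mix below is hypothetical, not taken from any real implementation): a lower bound on cycles follows from spreading the micro-ops over the ports they can use and taking the busiest port.

```python
# Hypothetical vector-code instruction mix; port assignments as in the
# table above (p15 = ports 1 and 5, p015 = ports 0, 1 and 5).
mix = [
    ("PMULDQ", 25, ("p0",)),             # multiplies: port 0 only
    ("PADDQ",  30, ("p1", "p5")),        # additions: p15
    ("PSRLDQ", 10, ("p5",)),             # byte shifts: port 5 only
    ("PXOR",    8, ("p0", "p1", "p5")),  # logic: p015
]

# Greedy even split per instruction class (a simplification: a real
# scheduler can rebalance across classes and do slightly better).
load = {"p0": 0.0, "p1": 0.0, "p5": 0.0}
for _name, count, ports in mix:
    for p in ports:
        load[p] += count / len(ports)

# here: max(25 + 8/3, 15 + 8/3, 15 + 10 + 8/3) ≈ 27.7 cycles
cycles_lower_bound = max(load.values())
```

Estimates like this explain the slide's question about multiplying by 2: PADD runs on two ports with reciprocal throughput 0.5, while the shift is stuck on a single port.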
Field inversion
• Used for converting from (X : Y : Z) to (X/Z, Y/Z), for example
• typically only once
• Extended Euclidean algorithm
• a, b ↦ x, y s.t. xa + yb = gcd(a, b)
• a, p ↦ x, y s.t. xa + yp = 1, i.e., x ≡ a⁻¹ (mod p)
• not (immediately) constant-time
• Square-and-multiply
• a^(p−1) ≡ 1 (mod p) ⇒ a^(p−2) is the inverse!
• p − 2 is public
43
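A sketch of the square-and-multiply inversion for p = 2^255 − 19 (production code instead uses a fixed addition chain of squarings plus a handful of multiplications; the exponent scan below is just the textbook version):

```python
p = 2**255 - 19

def inv(a):
    """Compute a^(p-2) mod p by left-to-right square-and-multiply.

    By Fermat's little theorem a^(p-1) ≡ 1 (mod p), so a^(p-2) is the
    inverse of a. Branching on the exponent bits is fine here because
    p - 2 is a public constant; the field operations themselves must
    still be constant-time, since a is typically secret.
    """
    r = 1
    for bit in bin(p - 2)[2:]:
        r = r * r % p
        if bit == "1":
            r = r * a % p
    return r
```

This is why the square-and-multiply route is preferred over the extended Euclidean algorithm when constant-time behavior matters: its operation sequence depends only on the public exponent.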
SageMath
44
doc.sagemath.org/html/en/constructions/elliptic_curves.html
45