Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier...

22
Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center, KACST Riyadh, Saudi Arabia [email protected] Daler Rakhmatov ECE, University of Victoria Victoria, Canada [email protected] IEEE ASAP 2014 Conference IBM Research - Zurich, Switzerland

Transcript of Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier...

Page 1: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields

Hamad Alrimeih

Cyber Security Center, KACST

Riyadh, Saudi Arabia

[email protected]

Daler Rakhmatov

ECE, University of Victoria

Victoria, Canada

[email protected]

IEEE ASAP 2014 Conference IBM Research - Zurich, Switzerland

Page 2: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Outline

• Introduction

• Our implementation

• Fast reduction (NIST prime fields)

• Pipeline structure and timing

• Using proposed multiplier

Page 3: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Introduction

• Modular multiplications: performance + flexibility

• Performance: Use wide datapath, pipelining, and NIST-recommended prime fields (fast reduction)

– P192 = 2192 – 264 – 1

– P224 = 2224 – 296 + 1

– P256 = 2256 – 2224 + 2192 + 296 – 1

– P384 = 2384 – 2128 – 296 + 232 – 1

– P521 = 2521 – 1

• Flexibility: Support all five NIST primes

Page 4: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Our Implementation

• Xilinx Virtex-6 FPGA

– 272-bit datapath

– 100-MHz clock frequency

– 8.4K slices + 289 DSP48 blocks

• Throughput:

– 108 mod. multiplications/s for P192, P224, P256

– 125×105 mod. multiplications/s for P384, P521

• Latency:

– 80 ns for P192, P224, P256

– 200 ns for P384, P521

Page 5: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Fast Reduction Modulo P192

Page 6: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Fast Reduction Modulo P224

Page 7: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Fast Reduction Modulo P256

Page 8: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Fast Reduction Modulo P384

Page 9: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Fast Reduction Modulo P521

Page 10: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Sum-Carry Pairs (P192, P224, P256)

• Non-modular product: Z = S[Z] + C[Z]

– Incomplete sum of partial products (N 32-bit words): S[Z] = (sj

[Z]), j = 0, 1, 2, …, N – 1

– Saved multibit carries: C[Z] = (cj[Z]), j = 1, 2, …, N

• Fast reduction: Z mod p = (U – V) mod p

Page 11: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Sum-Carry Pairs (P384)

• Non-modular product: Zr = S[Zr] + C[Zr]

– Subscript r represents LL (low-low), LH (low-high), HL (high-low), or HH (high-high), referring to half-operand multiplications

• Fast reduction: Zr mod p = (Ur – Vr) mod p

Page 12: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Index Mapping (1)

Page 13: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Index Mapping (2)

Page 14: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Pipeline Structure (1)

Page 15: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Pipeline Structure (2)

Page 16: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Pipeline Timing: P192, P224, P256

Stage I

Cycle 1 Cycle 2

Stage A Stage B

Cycle 3 Cycle 4

Stage C Stage D

Cycle 5 Cycle 6

Stage E Stage F

Cycle 7 Cycle 8

Stage H

Cycle 9

Stage I Stage A Stage B Stage C Stage D Stage E Stage F Stage H

Z0

Z1

Clock

Cycle

Op

era

tio

n

Page 17: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Pipeline Timing: P384, P521

Clock

Cycle

Op

era

tio

n

I

1Cycle:

X0LY0L

X0LY0H

X0HY0L

X0HY0H

A

2

B

3

C

4

DL

5

DH

6

EL

7

EH

8

FL

9

FH

10

GL

11

GH

12 13 14 15 16 17 18 19 20 21 22 23 24 25

I A B C DL DH EL EH FL FH GL GH

I A B C DL DH EL EH FL FH GL GH

I A B C DL DH EL EH FL FH GL GH

Z0

X1LY1L

X1LY1H

X1HY1L

X1HY1H

Z1

I A B C DL DH EL EH FL FH GL GH

I A B C DL DH EL EH FL FH GL GH

I A B C DL DH EL EH FL FH GL GH

I A B C DL DH EL EH FL FH GL GH

26

HL HH

HL HH

27 28

Page 18: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Summary

• Our modular multiplier supports all five NIST prime fields (192/224/256/384/521-bit ECC )

• Its 272-bit datapath has 9 pipeline stages

• It takes 8 clock cycles to produce a multiplication result when using P192, P224, or P256

• It takes 20 clock cycles to produce a multiplication result when using P384 or P521

Page 19: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

ECC Application

• (a) Jacobian point doubling:

QJ ← 2QJ

• (b) Affine-Jacobian point addition or subtraction:

QJ ← ± PA + QJ

Page 20: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Overall Architecture (1)

Page 21: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Overall Architecture (2)

Our Multiplier

Page 22: Pipelined Modular Multiplier Supporting Multiple Standard ...€¦ · Pipelined Modular Multiplier Supporting Multiple Standard Prime Fields Hamad Alrimeih Cyber Security Center,

Thank You!

Questions?