DESIGN AND VALIDATION OF NTRU PUBLIC-KEY CRYPTOSYSTEM
Preeti Kamat
B.S., Visveswaraiah Technological University, India, 2005
Jaykumar Patel B.S., Visveswaraiah Technological University, India, 2007
PROJECT
Submitted in partial satisfaction of the requirements for the degrees of
MASTER OF SCIENCE
in
ELECTRICAL AND ELECTRONIC ENGINEERING
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
SPRING 2010
DESIGN AND VALIDATION OF NTRU PUBLIC-KEY CRYPTOSYSTEM
A Project
by
Preeti Kamat
Jaykumar Patel
Approved by:
__________________________________, Committee Chair John Balachandra, Ph.D.
__________________________________, Second Reader Preetham Kumar, Ph.D.
____________________________ Date
Students:
Preeti Kamat and Jaykumar Patel
I certify that these students have met the requirements for format contained in the
University format manual, and that this project is suitable for shelving in the Library and
credit is to be awarded for the project.
___________________________, Graduate Coordinator ___________________ Preetham Kumar, Ph.D. Date
Department of Electrical and Electronic Engineering
Abstract
of
DESIGN AND VALIDATION OF NTRU PUBLIC-KEY CRYPTOSYSTEM
NTRU is a relatively new public-key cryptosystem. Public-key cryptography, also called
asymmetric cryptography, is used for digital signatures and key exchange. RSA is an
acclaimed public-key cryptosystem that has been in use since 1977. However, it is very
slow compared with symmetric cryptosystems when encrypting and decrypting bulk data.
In contrast, NTRU runs much faster than RSA on large data sets and has become a popular
algorithm for data encryption and decryption. Key generation, one of the most important
processes in public-key cryptography, is also much faster in NTRU than in RSA.
FPGAs are among the best hardware platforms for implementing reconfigurable
computing. Reconfigurable computing is popular because it can run many different
applications at high speed. An important feature of reconfigurable computing is that
computations are performed in hardware while the flexibility of a software solution is
retained.
The purpose of this project is first to explain the NTRU algorithm, a proprietary
algorithm patented by NTRU Cryptosystems. NTRU Cryptosystems has recently become a
part of Security Innovation, a leading provider of security solutions. This project
presents a hardware implementation of the NTRU public-key cryptosystem, which consists
of three phases: Key Creation, Encryption and Decryption. The system has been
implemented in Verilog HDL, simulated using VCS from Synopsys, and synthesized using
the Xilinx ISE Design Suite.
________________________________________________, Committee Chair John Balachandra, Ph.D. ________________________ Date
ACKNOWLEDGEMENTS
We would like to acknowledge and extend our heartfelt gratitude to the following persons
who have made the completion of this project a reality.
Our Project advisor, Dr. John Balachandra, for his valuable advice on the many
algorithms needed to understand and implement NTRU and his constant guidance and
encouragement.
A very sincere thank you to our Graduate advisor, Dr. Preetham Kumar, for his
continued guidance and support throughout the course of this project.
A note of gratitude to our friends, for helping us the many times we needed a point of
view different from our own.
Finally, we would like to extend our gratitude to our families, for supporting us all
through, and most of all to God, for giving us the strength and opportunities to be what
we are today.
TABLE OF CONTENTS
Acknowledgements
List of Tables
List of Figures
Chapters
1. INTRODUCTION
1.1 Overview
1.2 Private Key Cryptosystem
1.3 Public Key Cryptosystem
1.4 NTRU Public Key Cryptosystem
2. POLYNOMIAL ALGEBRA AND NUMBER THEORY
3. DESIGN OF NTRU PKCS
3.1 NTRU Multiplier Design
3.2 Processing Unit
3.3 NTRU Multiplier or PM (Polynomial Multiplier)
3.3.1 COEFF
3.3.2 SHIFTER AND COUNTER
3.4 Key Creator
3.5 NTRU Encryptor
3.6 NTRU Decryptor
3.7 NTRU PKCS
4. VALIDATION OF NTRU PKCS
4.1 Design Verification
4.2 NTRU PKCS – Testbench
5. SIMULATION RESULTS AND WAVEFORMS
5.1 Low level of security, parameters N=107, q=64, p=3
5.2 Small example parameters N=11, q=32, p=3
6. SYNTHESIS FIGURES
6.1 NTRU_Decryptor_Blk
6.2 NTRU_Decryptor
6.3 NTRU_Encryptor_Blk
6.4 NTRU_Encryptor
6.5 NTRU_Key
6.6 Mult_Mod
6.7 Polynomial_Mult
6.8 Barrel_shift
6.9 Coeff
6.10 Bit4_Cnt
6.11 Proc_Unit
6.12 Const_Mult
7. CONCLUSIONS AND FUTURE WORK
Appendix A. RTL Code
A.1 Parameters N=107, q=64, p=3
A.2 Parameters N=11, q=32, p=3
Appendix B. Synthesis Reports
B.1 NTRU_Key
B.2 NTRU_Encryptor
B.3 NTRU_Decryptor
Appendix C. The NTRU Public Key Cryptosystem (PKCS)
C.1 NTRU PKCS Parameters
C.2 Key Creation
C.3 Encryption
C.4 Decryption
References
LIST OF TABLES
Table 1: PU Truth Table
Table 2: PU Integer Value
Table 3: PU K-Map
Table 4: NTRU Security Parameters
Table 5: Small Security Parameters
LIST OF FIGURES
Figure 1: Private Key Cryptosystem
Figure 2: Public Key Cryptosystem
Figure 3: Polynomial Multiplication
Figure 4: Partial Product Array
Figure 5: Processing Unit
Figure 6: 8-Bit Full Adder
Figure 7: Coefficient Multiplier
Figure 8: NTRU Multiplier Design
Figure 9: Key Creator
Figure 10: NTRU Encryption
Figure 11: Mult_Mod
Figure 12: NTRU Decryptor
Figure 13: NTRU PKCS
Figure 14: NTRU_Decryptor_Blk Top Level
Figure 15: NTRU_Decryptor_Blk Logic Block
Figure 16: NTRU_Decryptor Top Level
Figure 17: NTRU_Decryptor Logic Block
Figure 18: NTRU_Encryptor_Blk Top Level
Figure 19: NTRU_Encryptor_Blk Logic Block
Figure 20: NTRU_Encryptor Top Level
Figure 21: NTRU_Encryptor Logic Block
Figure 22: NTRU_Key Top Level
Figure 23: NTRU_Key Logic Block
Figure 24: Mult_Mod Logic Block
Figure 25: Polynomial_Mult Logic Block
Figure 26: Barrel_Shift Logic Block
Figure 27: Coeff Logic Block
Figure 28: Bit4_Cnt Logic Block
Figure 29: Proc_Unit Logic Block
Figure 30: Const_Mult Logic Block
Chapter 1
INTRODUCTION
1.1 Overview
Today's world is growing in technology and power, and that growth depends on
communication, which is itself expanding rapidly. As communication grows, the need to
secure it grows with it. In the last century, telephone communication was limited
largely to short distances. Today, a person can run an entire business in another
country or continent from the comfort of an office elsewhere. This is where data
security comes in. When it comes to the confidentiality of data, the technique most
often relied upon is cryptography.
Cryptography protects the integrity and the confidentiality of a message stream or data
transmitted over a communication channel or network [4]. Put simply, cryptography
provides algorithms that enable secure communication between the transmitter and the
receiver. It consists of mathematical operations designed to guard data communication
[4] by transforming the data. If the original message bits are converted into seemingly
random bits before being transmitted over a communication channel, a person who
intercepts or overhears the message cannot work out the exact message that the
transmitter sent. This apparent randomness is produced by applying mathematical
operations such as multiplication, addition, and transformation. Encryption and
decryption are the backbone of the entire scheme.
Encryption: Encryption is the process of converting the message to be sent into another
message stream. The new message stream contains the original data bits together with
some pseudo bits that hide the original message from unwanted entities.
Decryption: Decryption is the process of converting the encrypted message stream back
to the original message bits. The original message bits are retrieved by discarding the
pseudo bits that were merged in while encrypting the original message.
A conventional cryptosystem involves the following six major terms:
Plaintext – The plaintext is the original message bits that are to be transferred securely
between two parties.
Encryption Algorithm – The encryption algorithm applies mathematical operations and
transformations to the plaintext.
Key(s) – A key is a piece of critical data used by the user at the transmitter side to
encrypt the data and by the user at the receiver side to decrypt the data.
Cipher – A cipher is an algorithm that converts the plaintext into the coded message
stream by performing mathematical steps such as multiplication, addition, and substitution.
Ciphertext – The ciphertext is the plaintext after it has been processed by the cipher
using the key. In short, ciphertext is the secured message that is transmitted over the
communication channel.
Decryption Algorithm – The decryption algorithm reverses the encryption to retrieve the
original message bits from the ciphertext and the key.
There are two requirements for the secure use of conventional encryption:
1. Strong Encryption Algorithm
2. Security of Key
Encryption Algorithm: We require a strong encryption algorithm, that is, one for which
a person who knows the algorithm and has access to one or more ciphertexts still cannot
decipher other ciphertexts that are unknown to him. The strength of an encryption
algorithm is defined on the basis of the level of access to the algorithm and the cipher
techniques.
Security of Key: The key used to encrypt the data requires the best possible security.
If someone learns the key and also knows the algorithm, he can read as well as modify
the whole communication, and the communication is no longer private or confidential.
Therefore, the users must keep the key secret [1].
Keeping the key secret is vital because the confidentiality of the key matters more to
private communication than the secrecy of the algorithm. Someone who knows the
algorithm but not the key cannot access the communication. Someone who knows the key
used to encrypt the message stream, however, can easily read the message, and may also
alter it and send a counterfeit message to mislead the user at the other end of the
communication channel. Thus, the privacy of the key is more important than the secrecy
of the encryption algorithm [1].
Today, private key cryptosystem and public key cryptosystem are the most
commonly used cryptosystems.
1.2 Private Key Cryptosystem
Figure 1: Private Key Cryptosystem (Original Data -> Encryption with Symmetric Key -> Scrambled Data -> Decryption with Symmetric Key -> Original Data)
Private Key: This kind of key is also known as a conventional key, single key or
symmetric key. In such an algorithm, symmetric ciphers are used to encrypt the message
bits. The message bits are encrypted using a key, and the same key is used to decrypt
the message, so knowledge of just one key allows both encryption and decryption.
Therefore, in such a system the key must be kept secret between the end users.
Private-key encryption can also provide a good level of authentication. Data encrypted
using one key cannot be decrypted using another key; one needs exactly the same key to
decrypt the transferred data. Thus, if the symmetric key is kept secret between the end
users, they can be sure that their communication is secure. They receive confidential
and correct data as long as the communication channel works properly and the data
travels unaffected.
Private-key encryption is effective only when the symmetric key is kept secret by the
end users involved in the communication. If a third party somehow obtains the key, both
the confidentiality and the integrity of the data are affected: with knowledge of the
symmetric key, the third party can not only decrypt the message but also send false data
while posing as one of the end users.
The Data Encryption Standard (DES) and the Advanced Encryption Standard (AES) are
examples of this type of encryption system.
1.3 Public Key Cryptosystem
Figure 2: Public Key Cryptosystem (Original Data -> Encryption with Public Key -> Scrambled Data -> Decryption with Private Key -> Original Data)
A public-key cryptosystem is a system in which knowledge of the encryption key gives
no indication of the decryption key. The public-key cryptosystem uses asymmetric
ciphers. Since different keys are used to encrypt and decrypt the data, it gives more
secure communication. In a public-key cryptosystem, each end user has his own private
key and public key. The public key can be known by anyone. Both the private key and the
public key are used in encrypting and decrypting the message bits.
The methodology depicted in the above figure lets you freely distribute a public key,
and only you will be able to read data encrypted using this key. To transfer data to
another person, one encrypts the message stream using that person's public key, and the
person receives the encrypted message and decrypts it with the corresponding private
key. This is how the whole algorithm works: data encrypted with the public key "B" can
only be decrypted with the corresponding private key "A" (private key "A" corresponds
to public key "B").
Compared with private-key encryption, public-key encryption requires more
computation and is therefore not always appropriate for large amounts of data.
NTRU, RSA, and ECC are examples of this type of encryption system.
1.4 NTRU Public Key Cryptosystem
The public-key cryptosystem NTRU, whose name stands for Number Theorist Research
Unit, is a ring-based cryptosystem. NTRU was founded in 1996 and became a fully
operating company in 2000. It was recently acquired by Security Innovation, an
application security company. NTRU is a comparatively new cryptographic technique that
is reported to be more efficient than existing, more widely used public-key
cryptosystems such as RSA. NTRU requires approximately O(N^2) process steps and a key
length of O(N), whereas RSA needs O(N^3) process steps and a key length of O( ). For
this reason, NTRU has lower complexity and its key size scales at a slower rate. NTRU
also needs fewer multiplications for encryption and decryption, so it can be
implemented more efficiently than RSA. As a result, this cryptosystem is emerging as a
promising alternative to the more established public-key cryptosystems.
The core of NTRU is defined over an integer ring. Key creation, encryption, and
decryption are the most time-consuming processes, and each relies on the multiplication
of two polynomials defined over this ring (described in more detail in the following
chapter). The time-consuming operation is therefore the polynomial multiplication:
speeding it up improves the performance of the whole NTRU system. This calls for either
a faster software algorithm or a hardware speed-up mechanism such as pipelining. At
this point, only a few software and hardware implementations have been published, and
it is a growing area within communication security.
To understand the steps of encryption and decryption in NTRU, we first need to
understand the algorithm that controls the flow of the process and how the information
is processed to secure the communication between two parties. The reader should be
familiar with the polynomial algebra and number theory used in this project to create a
key, encrypt the data and decrypt the data. These concepts are described in depth in
the next chapter.
Chapter 2
POLYNOMIAL ALGEBRA AND NUMBER THEORY
• Modular Arithmetic [3]
Modular arithmetic is simply division with remainder, where you keep the remainder and
throw everything else away. For example,
a = b (modulo m)
simply means that a when divided by m leaves the remainder b. This is the same as
saying that the difference a-b is a multiple of m. The integer m is called the modulus of
the congruence [3].
• Truncated Polynomial Rings [3]
The principal objects used by NTRU are polynomials of degree N-1 having integer
coefficients:
a = a_0 + a_1 X + a_2 X^2 + a_3 X^3 + . . . + a_{N-2} X^{N-2} + a_{N-1} X^{N-1}.
The coefficients a_0, ..., a_{N-1} are integers. Some of the coefficients are allowed to be 0.
The set of all such polynomials is denoted by R.
The polynomials in R are added together in the usual way by simply adding their
coefficients:
a + b = (a_0+b_0) + (a_1+b_1)X + . . . + (a_{N-1}+b_{N-1})X^{N-1}.
They are also multiplied in almost the usual manner, with one change. After doing the
multiplication, the power X^N should be replaced by 1, the power X^{N+1} should be replaced
by X, the power X^{N+2} should be replaced by X^2, and so on [3].
Example: Suppose N=3, and take the two polynomials a = 2 - X + 3X^2 and b = 1 + 2X - X^2.
Then
a + b = (2 - X + 3X^2) + (1 + 2X - X^2) = 3 + X + 2X^2 and
a*b = (2 - X + 3X^2)*(1 + 2X - X^2) = 2 + 3X - X^2 + 7X^3 - 3X^4 = 2 + 3X - X^2 + 7 - 3X = 9 - X^2.
The following is the general formula for multiplying polynomials in R:
a*b = c_0 + c_1 X + c_2 X^2 + c_3 X^3 + . . . + c_{N-2} X^{N-2} + c_{N-1} X^{N-1},
where the kth coefficient c_k is given by the formula
c_k = a_0 b_k + a_1 b_{k-1} + . . . + a_k b_0 + a_{k+1} b_{N-1} + a_{k+2} b_{N-2} + . . . + a_{N-1} b_{k+1}.
In modern terminology, R is called the Ring of Truncated Polynomials Z[X]/(X^N - 1) [3].
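The addition and multiplication rules above can be sketched in a few lines of Python. This is only an illustrative software model, not the project's Verilog; the function name cyclic_mul and the list-of-coefficients representation [a_0, ..., a_{N-1}] are assumptions made for the example.

```python
def cyclic_mul(a, b):
    """Multiply two polynomials in Z[X]/(X^N - 1).

    a and b are coefficient lists [a_0, ..., a_{N-1}] of equal length N.
    (Hypothetical helper, not taken from the project's RTL.)
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("both polynomials must have N coefficients")
    c = [0] * n
    for i in range(n):
        for j in range(n):
            # X^i * X^j = X^(i+j); since X^N is replaced by 1,
            # the exponent wraps around modulo N.
            c[(i + j) % n] += a[i] * b[j]
    return c


# Worked example from the text, N = 3
a = [2, -1, 3]   # 2 - X + 3X^2
b = [1, 2, -1]   # 1 + 2X - X^2
poly_sum = [x + y for x, y in zip(a, b)]
poly_product = cyclic_mul(a, b)
print(poly_sum)      # [3, 1, 2]   i.e. 3 + X + 2X^2
print(poly_product)  # [9, 0, -1]  i.e. 9 - X^2
```

The double loop mirrors the general formula for c_k: every pair (i, j) with i + j congruent to k modulo N contributes to coefficient k.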
The NTRU PKCS uses the ring of truncated polynomials R combined with the modular
arithmetic described earlier. These are combined by reducing the coefficients of a
polynomial a modulo an integer q. Thus the expression
a (modulo q)
means to reduce the coefficients of a modulo q. That is, divide each coefficient by q and
take the remainder [3].
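As a small illustration of coefficient reduction, the sketch below reduces every coefficient of a polynomial modulo q. The helper name reduce_mod is an assumption; it relies on the fact that Python's % operator returns a non-negative remainder when q is positive.

```python
def reduce_mod(a, q):
    """Reduce each coefficient of the list a modulo q (result in 0..q-1)."""
    return [coeff % q for coeff in a]


# 9 - X^2 from the N = 3 example, reduced modulo two different values of q
print(reduce_mod([9, 0, -1], 4))    # [1, 0, 3]
print(reduce_mod([9, 0, -1], 256))  # [9, 0, 255]
```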
To make storage and computation easier, it is convenient to just list the coefficients of a
polynomial without explicitly writing the powers of X. For example, the polynomial
a = a_0 + a_1 X + a_2 X^2 + a_3 X^3 + . . . + a_{N-2} X^{N-2} + a_{N-1} X^{N-1}
is conveniently written as the list of N numbers: a = (a_0, a_1, a_2, . . . , a_{N-2}, a_{N-1}).
Note that zeros should be included in the list if some of the powers of X are missing. For
example, when N = 7 the polynomial a = 3 + 2X^2 - 3X^4 + X^6 is stored as the list
(3,0,2,0,-3,0,1). But if N = 9, then a would be stored as the list (3,0,2,0,-3,0,1,0,0) [3].
• Inverses in Truncated Polynomial Rings
The inverse modulo q of a polynomial a is a polynomial A with the property that
a*A = 1 (modulo q)
Not every polynomial has an inverse modulo q, but it is easy to determine if a has an
inverse, and to compute the inverse if it exists [3].
Example: Take N=7, q=11, a = 3 + 2X^2 - 3X^4 + X^6.
The inverse of a modulo 11 is A = -2 + 4X + 2X^2 + 4X^3 - 4X^4 + 2X^5 - 2X^6, since
(3 + 2X^2 - 3X^4 + X^6)*(-2 + 4X + 2X^2 + 4X^3 - 4X^4 + 2X^5 - 2X^6) = -10 + 22X + 22X^3 - 22X^6 = 1 (modulo 11).
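The inverse in this example can be checked numerically. The following sketch, with the assumed helper name cyclic_mul_mod, multiplies a by A in the truncated ring and reduces the result modulo q; it verifies a given inverse but does not compute one.

```python
def cyclic_mul_mod(a, b, q):
    """Product of a and b in Z[X]/(X^N - 1), coefficients reduced mod q."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c


a = [3, 0, 2, 0, -3, 0, 1]       # 3 + 2X^2 - 3X^4 + X^6
A = [-2, 4, 2, 4, -4, 2, -2]     # claimed inverse of a modulo 11
identity_check = cyclic_mul_mod(a, A, 11)
print(identity_check)            # [1, 0, 0, 0, 0, 0, 0]
```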
The next chapter will explain the complete process involved in NTRU [3].
Chapter 3
DESIGN OF NTRU PKCS
This section explains the architecture of the NTRU system built in this project,
starting from the smallest unit in the design and moving up to the bigger blocks.
The NTRU system consists of three blocks: Key Creator, Encryptor and Decryptor. All
three blocks use polynomial multiplication, so it is important to choose a fast
multiplication algorithm that multiplies the polynomials quickly and yields an
effective design.
3.1 NTRU Multiplier Design [2]
NTRU is based on polynomial additions and multiplications in the ring R =
Z[X]/(X^N - 1), as explained earlier. Polynomial multiplication is the cyclic convolution
of two polynomials, denoted by '*'. The NTRU multiplier designed here has a scalable
architecture; to explain it, we consider the following parameter values: p=3, q=256,
N=5.
The partial product array shown below is parallel in nature. Because the polynomials
in the multiplication are all reduced modulo X^N - 1, every partial product term whose
degree exceeds N-1 is, after reduction modulo X^N - 1, added back into the lower portion
of the partial product array, as the illustration below shows.
Since each partial product term is reduced modulo q, the carry propagation is
confined within each column but not across the columns. This eliminates the need to
propagate the carry across columns.
Consider
a = a_0 + a_1 X + a_2 X^2 + a_3 X^3 + a_4 X^4 {each a_i is an 8-bit coefficient, since q=256}
b = b_0 + b_1 X + b_2 X^2 + b_3 X^3 + b_4 X^4 {each b_i is a 2-bit coefficient, since p=3}
Now consider the multiplication a*b.
Figure 3: Polynomial Multiplication [2]
Thus, the above partial product array gets simplified to the array shown below:
Figure 4: Partial Product Array
For the partial product column k, a single processing unit (PU), which will be
explained later, performs the following operation:
c[k] = c[k] + a[i]*b[j] (mod q), where j = 0, 1, ..., N-1 and i = (k - j) mod N
Each PU performs one coefficient multiplication, one coefficient addition, and a
reduction modulo q. Each of the partial product terms that have been “boxed” needs one
PU for its computation, and from the figure above we see that a given column needs
5 PUs to compute its product coefficient.
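The column-wise computation can be modelled in software as follows. This is a behavioural sketch under stated assumptions: the names pu and multiply_by_columns are hypothetical, the index relation i = (k - j) mod N is taken from the cyclic convolution formula, and the N = 5 input values are arbitrary test data rather than values from the project.

```python
def pu(c_k, a_i, b_j, q):
    """One processing-unit step: c[k] = c[k] + a[i]*b[j] (mod q)."""
    return (c_k + a_i * b_j) % q


def multiply_by_columns(a, b, q):
    """Compute a*b in Z_q[X]/(X^N - 1), one partial-product column at a time."""
    n = len(a)
    c = [0] * n
    for k in range(n):              # each column k is independent of the others
        for j in range(n):          # N chained PU steps per column
            i = (k - j) % n         # index of the a coefficient paired with b[j]
            c[k] = pu(c[k], a[i], b[j], q)
    return c


# N = 5, q = 256: 8-bit a coefficients and b coefficients in {-1, 0, 1}
a = [17, 200, 3, 45, 255]
b = [1, 0, -1, 1, 0]
product = multiply_by_columns(a, b, 256)
print(product)   # [231, 246, 241, 118, 196]
```

Because each column only ever accumulates modulo q, no value needs to be carried from one column to the next, which is what allows the columns to be processed in parallel.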
The next section will explain the design of the Processing Unit (PU)
3.2 Processing Unit [2]
The Processing Unit (PU) is the heart of the NTRU multiplier. It performs the
following coefficient operation: c[k] = c[k] + a[i]*b[j] (mod q) [2].
We know that b[j], the multiplicand, is a 2-bit coefficient that takes a value in
{-1, 0, 1}, and a[i], the multiplier, is any 8-bit coefficient since it is reduced
mod q (=256). The product coefficient c[k] is also 8 bits wide since it is reduced
mod q (=256) as well [2].
Figure 5: Processing Unit [8]
The PU consists of a coefficient multiplier and an adder, both of which
incorporate the reduction modulo q. The components of the processing unit consist solely
of combinational logic and do not depend on a clock edge. The coefficient multiplier
computes the M = a[i]*b[j] (mod q) portion of the operation [2]. The main hardware
consists of eight 2-by-1-bit multipliers, each of which was designed to behave
according to the truth table shown in Table 1.
Note that:
a[i]_0 = Bit 0 of the 8-bit a[i] coefficient
b[j]_0 = Bit 0 of the 2-bit b[j] coefficient
b[j]_1 = Bit 1 of the 2-bit b[j] coefficient
I_i = Result of multiplication of a[i]_0 and b[j]
Table 1: PU Truth Table [2]
X denotes a don't-care condition.
The values marked by * in the above truth table may seem odd; they are explained
below.
This design interprets the 2-bit representation of b[j] differently from its plain
binary value, according to the table shown below [2]:
Table 2: PU Integer Value [2]
So for the case b[j] = (11)_2 = -1, a[i]_0 must be converted into its two's
complement representation in order for the adder to subtract. However, to keep the design
simple, the multiplier only inverts the value of a[i]_0. The two's complement conversion is
completed by setting the carry-in of the adder to '1'. This is accomplished by the AND
gate shown in figure 3.
For the case b[j] = (10)_2 = 2, it should be noted that this operation is not
performed by the main hardware of the multiplier, as indicated by the don't care (X)
condition in the table. Hence, a multiplexer is needed to pass the left-shifted value of
a[i]_0 when b[j] = 2; for the other 3 combinations of b[j], I is passed to M [2].
The reduction modulo q portion of the equation is handled by ignoring any carries
that exceed the 8-bit boundary, which would only occur for the case b[j] = 2 [2].
The Karnaugh-map for the truth table of I[i] is as follows:
Table 3: PU K-map [2]
This gives us the expression for all the I[i]s.
As shown in figure 4 on the next page, if b[j] = (10)_2, the output of the AND
gate is 1, which sets the selector input of the multiplexer to 1, and the value of a[i]
left-shifted by one bit (bits a[i]_{6..0} followed by a 0, i.e. a[i] multiplied by 2)
appears at the output of the multiplexer; otherwise the value of I appears at the output
of the multiplexer.
Finally, the output of the multiplexer, M, is passed on to the 8-bit adder for
accumulation, which computes c[k] = M + c[k]. Again, the reduction modulo q portion of
the equation is handled by ignoring any carries that exceed the 8-bit boundary.
The 8-bit adder is a full adder with the final carry-out ignored.
Figure 6: 8-bit Full Adder
Figure 7: Coefficient Multiplier [2]
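The behaviour of the PU described in this section can be approximated by the bit-level model below. It is a sketch under assumptions, not the project's RTL: the function names are hypothetical, and the encoding of b[j] is assumed to be 00 = 0, 01 = 1, 10 = 2 and 11 = -1, of which only the last two are stated explicitly in the text.

```python
MASK = 0xFF   # 8-bit datapath, i.e. q = 256


def coeff_multiplier(a_i, b_bits):
    """Return (M, carry_in) for an 8-bit a[i] and a 2-bit encoded b[j]."""
    if b_bits == 0b00:                 # b[j] = 0 (assumed encoding)
        return 0, 0
    if b_bits == 0b01:                 # b[j] = 1 (assumed encoding)
        return a_i & MASK, 0
    if b_bits == 0b10:                 # b[j] = 2: multiplexer selects a[i] << 1
        return (a_i << 1) & MASK, 0    # bits beyond bit 7 are dropped (mod 256)
    # b[j] = -1: the multiplier only inverts a[i]; the adder carry-in of 1
    # completes the two's complement negation.
    return (~a_i) & MASK, 1


def processing_unit(c_k, a_i, b_bits):
    """c[k] = c[k] + a[i]*b[j] (mod 256), ignoring the final carry-out."""
    m, carry_in = coeff_multiplier(a_i, b_bits)
    return (c_k + m + carry_in) & MASK


print(processing_unit(10, 3, 0b11))    # 10 - 3 = 7
print(processing_unit(250, 200, 0b10)) # (250 + 400) mod 256 = 138
```

Masking with 0xFF plays the role of ignoring carries beyond the 8-bit boundary, so no separate modular reduction step is needed when q is a power of two.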
3.3 NTRU Multiplier or PM (Polynomial Multiplier)
Now that the operation of the PU has been described, the next block to be studied is
the NTRU Multiplier (also referred to as the Polynomial Multiplier, or PM).
This block consists of the COEFF, SHIFTER and COUNTER blocks.
Figure 8: NTRU Multiplier Design
3.3.1 COEFF:
From the above figure, we see that each column can be processed independently.
In this case, each column of partial products consists of 5 PUs, i.e. the number of PUs
needed for a given column is N. The output of each PU is fed as the c[k] input of the PU
below it, and the last PU in the chain delivers the coefficient of the corresponding
column. Since the product is also reduced mod q, this coefficient c[k] is 8 bits wide
too.
3.3.2 SHIFTER AND COUNTER
From the above figure, we see that the order in which b_0 b_1 b_2 b_3 b_4 are listed
is the same in each column; it is the values a_0 a_1 a_2 a_3 a_4 that change their
sequence (look at the direction of the arrows). This capability has been implemented
using a barrel shifter, which shifts the multiplier coefficients that are input to the
PM so that the partial products of the next column are computed in the correct order.
The size of this shifter is 8*5 = 40 bits, i.e. log2(q)*N.
To determine how many times the shift needs to be performed, a counter has been
designed. This counter increments with each shift and returns to zero when the shifting
has to stop. In this case, the coefficients of the multiplier have to be shifted 5
times, which means the counter needs to be a 3-bit counter, i.e. a ceil(log2 N)-bit
counter. Since each coefficient of the multiplier is 8 bits long, each shift rotates
the multiplier by 8 bits.
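A software sketch of the shifter and counter sizing is given below. The packing order, the rotation direction and all function names are assumptions made for illustration; the actual Barrel_shift and Bit4_Cnt modules may differ.

```python
def pack(coeffs, width):
    """Pack coefficients into one word, coefficient 0 in the lowest bits."""
    word = 0
    for idx, c in enumerate(coeffs):
        word |= (c & ((1 << width) - 1)) << (idx * width)
    return word


def unpack(word, width, n):
    """Split a packed word back into n coefficients of the given width."""
    mask = (1 << width) - 1
    return [(word >> (idx * width)) & mask for idx in range(n)]


def rotate_one_coeff(word, width, n):
    """Rotate the (width*n)-bit word by one coefficient (width bits)."""
    total = width * n
    full_mask = (1 << total) - 1
    return ((word << width) | (word >> (total - width))) & full_mask


def counter_bits(n):
    """Bits needed to count n shifts, i.e. ceil(log2(n)) for n > 1."""
    return (n - 1).bit_length()


a = [10, 20, 30, 40, 50]                 # N = 5 coefficients, 8 bits each
word = pack(a, 8)                        # 40-bit shifter contents
once = unpack(rotate_one_coeff(word, 8, 5), 8, 5)
print(once)                              # [50, 10, 20, 30, 40]
print(counter_bits(5), counter_bits(11)) # 3 4
```

After N rotations the word returns to its starting value, which is the point at which the counter wraps back to zero.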
3.4 Key Creator:
This block creates the public key 'h' for each NTRU user. The public key is given as
h = p.fq*g (mod q), where p=3
This computation is done with two blocks, the Constant Multiplier and the Polynomial
Multiplier (PM):
Constant Multiplier (CM) - multiplies each coefficient of the polynomial fq by the
constant p.
Polynomial Multiplier (PM) - multiplies two polynomials as described above (the output
of the CM with g), reduces the output modulo q, and generates the public key h.
Figure 9: Key Creator
3.5 NTRU Encryptor:
This block computes the encrypted message for secure data exchange. Encrypted
message is given as,
e = r*h + m (mod q)
Where, r is the random polynomial selected
h is the public key
m is the message to be encrypted.
PM - Multiplies the polynomials r and h
Coefficient adder- adds the output of PM to message m, generates encrypted message e.
Figure 10: NTRU Encryptor
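The encryption equation can be exercised with the short sketch below. The public key h is the one quoted for the N = 11, q = 32, p = 3 demonstration in this report; the random polynomial r and the message m are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
def cyclic_mul(a, b, q):
    """Product in Z[X]/(X^N - 1) with coefficients reduced mod q."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c


def encrypt(r, h, m, q):
    """e = r*h + m (mod q): PM output followed by the coefficient adder."""
    rh = cyclic_mul(r, h, q)
    return [(x + y) % q for x, y in zip(rh, m)]


Q = 32
h = [8, 25, 22, 20, 12, 24, 15, 19, 12, 19, 16]  # public key from the report
r = [-1, 0, 1, 1, 1, -1, 0, -1, 0, 0, 0]         # hypothetical random polynomial
m = [-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1]         # hypothetical message
e = encrypt(r, h, m, Q)
print(e)   # [14, 11, 26, 24, 14, 16, 30, 7, 25, 6, 19]
```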
3.6 NTRU Decryptor:
This block performs the final and most important operation of the system,
decrypting the received encrypted message. This process is explained as below-
Step 1: a = f*e ( mod q )
Step 2: Shift the coefficients of a from (0,q-1) to (-q/2, q/2)
Step 3: b = a ( mod p )
Step 4: c = fp*b (mod p)
Decryption basically involves polynomial multiplication and reduction of the product
mod p. Polynomial multiplication only performs mod q on the product and hence, we
design another block that performs mod p on the output of the PM.
Figure 11: Mult_Mod
The mult_mod block is the basic block of the decryption process.
Steps 1-3 are performed by one mult_mod block, i.e. polynomials f and e are multiplied
and reduced mod p.
For example, if the result of the multiplication f*e mod q is
a = 3 - 7X - 10X^2 - 11X^3 + 10X^4 + 7X^5 + 6X^6 + 7X^7 + 5X^8 - 3X^9 - 7X^10 (mod 256)
the mult_mod block returns
b = - X - X^2 + X^3 + X^4 + X^5 + X^7 - X^8 - X^10 (mod 3)
[3 mod 3 = 0, -7 mod 3 = -1, -10 mod 3 = -1, -11 mod 3 = 1, 10 mod 3 = 1, 7 mod 3 = 1, 6 mod 3 = 0, etc.]
Next, Step 4: c = fp*b (mod p) is performed by another mult_mod block. The output of
the previous mult_mod block and fp are given as inputs to this block, which again
performs polynomial multiplication followed by mod p on the product. This result is the
final decrypted message, and it should be equal to the message initially sent.
Figure 12: NTRU Decryptor
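The four decryption steps can be traced in software with the sketch below. It assumes the N = 11, q = 32, p = 3 parameter set with the f and fp values quoted in this report; the ciphertext e is a hypothetical input chosen for illustration, and the function names are not taken from the RTL.

```python
def cyclic_mul(a, b, modulus):
    """Product in Z[X]/(X^N - 1) with coefficients reduced mod modulus."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % modulus
    return c


def center_lift(a, modulus):
    """Map coefficients from 0..modulus-1 into (-modulus/2, modulus/2]."""
    return [x - modulus if x > modulus // 2 else x for x in a]


def decrypt(e, f, fp, q, p):
    a = center_lift(cyclic_mul(f, e, q), q)    # Steps 1 and 2
    b = center_lift([x % p for x in a], p)     # Step 3
    c = center_lift(cyclic_mul(fp, b, p), p)   # Step 4
    return a, b, c


Q, P = 32, 3
f = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
fp = [1, 2, 0, 2, 2, 1, 0, 2, 1, 2, 0]
e = [14, 11, 26, 24, 14, 16, 30, 7, 25, 6, 19]   # hypothetical ciphertext
a, b, c = decrypt(e, f, fp, Q, P)
print(a)  # [3, -7, -10, -11, 10, 7, 6, 7, 5, -3, -7]
print(b)  # [0, -1, -1, 1, 1, 1, 0, 1, -1, 0, -1]
print(c)  # [-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1]
```

The intermediate values a and b printed here coincide with the polynomials a and b used in the worked example above.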
3.7 NTRU PKCS
A block diagram of the entire cryptosystem with inputs and outputs from the three main
blocks discussed above can be drawn as:
Figure 13: NTRU PKCS
Chapter 4
VALIDATION OF NTRU PKCS
4.1 Design Verification
The previous chapter showed the different steps involved in the design of the NTRU
PKCS. This section shows the implementation and verification of the design with
some examples. Each block of the design has been written in Verilog HDL and compiled
and simulated using Synopsys VCS. A bottom-up approach has been used in this design:
each block/module has been individually written and verified and then
merged/instantiated into another block.
The NTRU PKCS takes the following inputs:
Maximum degree of polynomials, N
Small modulo, p
Big modulo, q
Polynomials f, g
Inverse of f mod p, fp
Inverse of f mod q, fq
Random polynomial, r chosen by message sender
Message polynomial, m sent by sender
To demonstrate the system designed, we use the inputs below:
N = 11, q = 2^5 = 32, p = 3.
Let, f = -1 + X + X^2 - X^4 + X^6 + X^9 - X^10
g = -1 + X^2 + X^3 + X^5 - X^8 - X^10
Let us represent the above polynomials as:
f = {-1,1,1,0,-1,0,1,0,0,1,-1}
g = {-1,0,1,1,0,1,0,0,-1,0,-1}
Each coefficient of a given degree has a fixed position in the array and hence if any
degree is missing, that corresponding coefficient should be represented by a 0.
Since N=11, f and g have 11 coefficients, each 2 bits wide. Hence, f, g are 11*2=22 bits
wide. The inverses of these polynomials are
• fp = {1,2,0,2,2,1,0,2,1,2,0}
• fq = {5,9,6,16,4,15,16,22,20,18,30 }
• fp has 11 coefficients, each 2 bits long, hence fp = 22 bits
• fq has 11 coefficients, each 5 bits long, hence fq=55 bits
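The bit widths quoted above follow from packing one fixed-width field per coefficient. The Python sketch below illustrates this; the helper names pack and unpack are hypothetical, and placing coefficient 0 in the least significant field is an assumption made for illustration, not a statement about the bit order used in the RTL.

```python
def pack(coeffs, bits):
    """Pack coefficients into one integer, one 'bits'-wide field per coefficient.

    Negative values are stored in two's complement, so -1 becomes 3 when bits=2.
    Assumption: coefficient 0 occupies the least significant field.
    """
    mask = (1 << bits) - 1
    word = 0
    for i, c in enumerate(coeffs):
        word |= (c & mask) << (i * bits)
    return word


def unpack(word, n, bits):
    """Recover the n unsigned field values from a packed word."""
    mask = (1 << bits) - 1
    return [(word >> (i * bits)) & mask for i in range(n)]


f = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
fq = [5, 9, 6, 16, 4, 15, 16, 22, 20, 18, 30]

f_word = pack(f, 2)     # 11 coefficients x 2 bits = 22 bits
fq_word = pack(fq, 5)   # 11 coefficients x 5 bits = 55 bits
print(unpack(f_word, 11, 2))   # [3, 1, 1, 0, 3, 0, 1, 0, 0, 1, 3]
```

The unpacked fields of f match the way f is printed in the simulation logs, where -1 appears as 3.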
Using h = p*fq*g (mod q), the public key is calculated.
• The key created h is {8,25,22,20,12,24,15,19,12,19,16}.
• h has 11 coefficients, each 5 bits long, hence h=55 bits
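The key value can be cross-checked with a short software model. This is a sketch under stated assumptions: the function names are hypothetical, and the constant multiplication by p is done before the polynomial multiplication only to mirror the order of the p.Fq multiplier and PM blocks; mathematically the order does not matter mod q.

```python
def cyclic_mul(a, b, n, modulus):
    """Multiply a and b in Z[X]/(X^n - 1), reducing coefficients mod modulus."""
    out = [0] * n
    for i in range(n):
        for j in range(n):
            k = (i + j) % n
            out[k] = (out[k] + a[i] * b[j]) % modulus
    return out


def make_public_key(fq, g, n, p, q):
    """h = p*fq*g (mod q)."""
    p_fq = [(p * c) % q for c in fq]      # constant multiplier
    return cyclic_mul(p_fq, g, n, q)      # polynomial multiplier


N, P, Q = 11, 3, 32
fq = [5, 9, 6, 16, 4, 15, 16, 22, 20, 18, 30]
g = [-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1]
h = make_public_key(fq, g, N, P, Q)
print(h)   # [8, 25, 22, 20, 12, 24, 15, 19, 12, 19, 16]
```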
Now that the public key is generated, we are ready to encrypt and send a message. To
create this message, we use an LFSR. The LFSR is a Linear Feedback Shift Register that is
used to generate pseudo-random numbers.
• The message generated is, m = {3,0,3,1,0,1,0,0,3,0,3}.
• m has 11 coefficients, each 2 bits long, hence m = 22 bits
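For readers unfamiliar with LFSRs, the sketch below shows one way such a generator could produce 2-bit message coefficients. Everything specific in it is an assumption made for illustration: this report does not state the register width, the feedback taps, the seed, or how LFSR bits are mapped to coefficients. The example uses a common 16-bit Fibonacci LFSR (feedback polynomial x^16 + x^14 + x^13 + x^11 + 1) and maps the unused code 2 to 0.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (assumed taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def random_message(seed, n):
    """Draw n coefficients in the 2-bit coding {0, 1, 3}, where 3 stands for -1."""
    state = seed
    coeffs = []
    for _ in range(n):
        code = 0
        for _ in range(2):                 # two LFSR steps per coefficient
            state = lfsr16_step(state)
            code = (code << 1) | (state & 1)
        coeffs.append(0 if code == 2 else code)   # 2 is not a valid code here
    return coeffs


message = random_message(0xACE1, 11)       # hypothetical non-zero seed
print(message)
```

The seed must be non-zero, since an all-zero LFSR state never changes.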
Using e = r*h + m (mod q), the encrypted message is calculated and transmitted to the receiver.
• The encrypted message generated is, e = {14,11,25,24,15,17,30,7,25,5,17}
• e has 11 coefficients, each 5 bits long, hence e=55 bits
Using b = (f*e (mod q)) (mod p) and c = fp*b (mod p), the received encrypted message is
decrypted.
• The first mult_mod block generates b = {0,1,3,3,0,3,0,1,3,3,1}
• b has 11 coefficients, each 2 bits long, hence b = 22 bits
• The decrypted message generated by the second mult_mod block is c =
{3,0,3,1,0,1,0,0,3,0,3}
• c has 11 coefficients, each 2 bits long, hence c = 22 bits
• This decrypted message, c is the same as the message, m sent by the sender and
hence the NTRU PKCS design is verified!
• The only case in which this system may not function properly is when all the
coefficients of the message to be sent are the same
4.2 NTRU PKCS- Testbench
After designing the Key Creator, Encryptor and Decryptor blocks, a unified
testbench module was written, which instantiates the above blocks in it. This way, we can
consider the 3 blocks as part of one system, versus separately providing inputs to each of
them. The values of the input polynomials- f, g, r, the inverses fp, fq are accepted in this
block and passed on to the appropriate sub-module. The testbench also has a clock
generator block that generates the clock signals of a given period.
The testbench contains 3 signals that signify the completion of each stage of the
cryptosystem- key_complete, encrypt_complete, decrypt_complete. These signals
become high after their corresponding stage is complete. Once all the 3 signals are high,
it means that the decryption process is complete. At this point, a comparator block in the
testbench checks the sent message and the decrypted message. If they are the same, it
gives out a message showing a successful output, while if they do not match, an
unsuccessful message is sent out.
The output for the N=11, q=32, p=3 case is shown below; the same log is reproduced in Section 5.2.
##############################################################################
#################### NTRU PKCS for N=11 q=5 p=3 #########################
Input polynomials, f= 3 1 1 0 3 0 1 0 0 1 3
g= 3 0 1 1 0 1 0 0 3 0 3
Inverse of f modp, Fp= 1 2 0 2 2 1 0 2 1 2 0
Inverse of f modq, Fq= 5 9 6 16 4 15 16 22 20 18 30
Random polynomials, r = 3 0 1 1 1 3 0 3 0 0 0
Message m= 3 0 0 1 3 0 0 0 3 1 1
##############################################################################
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 e= 0 0 0 0 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 x x x x x x x x x 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 0 0 0 0 0 0 0 0 0 0
time = 15: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 0 0 0 0 0 0 0 0 0 0
time = 25: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 0 0 0 0 0 0 0 0 0
time = 35: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 0 0 0 0 0 0 0 0
time = 45: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 0 0 0 0 0 0 0
time = 55: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 0 0 0 0 0 0
time = 65: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 0 0 0 0 0
time = 75: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 0 0 0 0
time = 85: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 0 0 0
time = 95: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 0 0
time = 105: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 19 0
time = 115: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 19 16
time = 120: key_complete=1 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0 0 0 0 0
time = 230: key_complete=1 encrypt_complete=1 decrypt_complete= 0 e= 14 11 26 24 14 16 30 7 25 6 19
time = 345: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 0 0 0 0 0 0 0 0
time = 375: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 0 0 0 0 0 0 0
time = 385: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 0 0 0
time = 425: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 0 0
time = 435: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 1 0
time = 445: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 1 1
time = 450: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 0 0 1 3 0 0 0 3 1 0
time = 450: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 0 0 0 0 0 0 0 0 0 0
##############################################################################
###### Success!!! The decrypted message is the same as the original message sent ######
Public key h= 8 25 22 20 12 24 15 19 12 19 16
Message sent m= 3 0 0 1 3 0 0 0 3 1 1
Encrypted message e= 14 11 26 24 14 16 30 7 25 6 19
Decrypted message c = 3 0 0 1 3 0 0 0 3 1 1
#############################################################################
The NTRU system was also designed for a more realistic example, corresponding to a low level of
security. The parameters for this are N=107, p=3, q=64. To keep the hardware design
simple and to reuse smaller blocks instead of building one big block for N=107, we used a
different approach to design the system for this set of inputs.
We first designed a system for N=7, q=64, p=3. The sizes of the different polynomials required
for this set are listed below.
f, g, fp, m, r = 7*2 = 14 bits
fq = 7*6 = 42 bits
Key, h = pfq*g (mod 64) = 7*6 = 42 bits
Encrypted message, ei = ri*h + mi (mod 64) = 7*6 = 42 bits
Decrypted message, c = 7*2 = 14 bits
Once this was verified, we applied the same design to calculate the results
for the NTRU low-security parameters. Sixteen instances of the N=7 block provide
16*7 = 112 coefficients, i.e. 224 bits for a 2-bit polynomial, whereas the actual message
to be sent is 107*2 = 214 bits wide; the upper 10 bits of the message were therefore set to
zero. The message was split into smaller chunks, each of which was encrypted and decrypted
individually to give the final result. The polynomials in this design are explained below.
N = 16*7 = 112
f, g, fp, m, r = 16*7*2 = 224 bits (214 bits used, top 10 bits set to 0)
Key, h = pfq*g (mod 64) = 16*7*6 = 672 bits
Encrypted message, ei = ri*h + mi (mod 64) = 16*7*6 = 672 bits
Decrypted message, c = 16*7*2 = 224 bits (214 bits used, top 10 bits set to 0)
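The chunking arithmetic above can be illustrated with a small sketch. It assumes, for illustration only, that the message is held as one integer with chunk 0 in the least significant bits; the names split_message and join_message are hypothetical and are not taken from the RTL.

```python
CHUNK_BITS = 14     # 7 coefficients x 2 bits per N=7 block
NUM_CHUNKS = 16     # 16 blocks cover 16*7 = 112 coefficients
MESSAGE_BITS = 214  # 107 coefficients x 2 bits


def split_message(msg):
    """Split a 214-bit message into 16 chunks of 14 bits (top 10 bits are zero)."""
    if not 0 <= msg < (1 << MESSAGE_BITS):
        raise ValueError("message does not fit in 214 bits")
    mask = (1 << CHUNK_BITS) - 1
    return [(msg >> (i * CHUNK_BITS)) & mask for i in range(NUM_CHUNKS)]


def join_message(chunks):
    """Reassemble the message from its chunks (inverse of split_message)."""
    msg = 0
    for i, chunk in enumerate(chunks):
        msg |= chunk << (i * CHUNK_BITS)
    return msg


full = (1 << MESSAGE_BITS) - 1            # all 214 message bits set
chunks = split_message(full)
print(len(chunks), bin(chunks[-1]))       # 16 0b1111
```

Only 4 of the 14 bits in the last chunk carry message data (15*14 + 4 = 214); the remaining 10 bits are the zero padding mentioned above.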
Hence, the NTRU cryptosystem was verified for different sets of input values.
This system was synthesized using Xilinx ISE Design Suite. The Key creator,
Encryptor and Decryptor blocks were individually synthesized and all the blocks of the
system were found to be perfectly synthesizable, without any issues. The synthesis results
of each of these blocks are included in Appendix C.
Chapter 5
SIMULATION RESULTS AND WAVEFORMS
5.1 Low level of security, parameters N=107, q=64, p=3
Chronologic VCS simulator copyright 1991-2008
Contains Synopsys proprietary information.
Compiler version B-2008.12-9; Runtime version B-2008.12-9; Apr 3 20:11 2010
VCD+ Writer B-2008.12-9 Copyright 2005 Synopsys Inc.
####################################################################################################################################
############################################ NTRU PKCS for N=11 q=5 p=3 ############################################
Input polynomials, f= 0 0 1 1 3 0 0
g= 3 1 0 1 3 0 0
Inverse of f modp, Fp= 1 1 0 1 63 0 63
Inverse of f modq, Fq= 9 11 62 13 49 29 20
Random polynomials, r = 0 3 1 3 3 3 0
Message m= 9177447591388863703
####################################################################################################################################
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 e= 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 x x x x x 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 0 0 0 0 0 0
time = 15: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 0 0 0 0 0 0
time = 25: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 0 0 0 0 0
time = 35: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 12 0 0 0 0
time = 45: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 12 50 0 0 0
time = 55: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 12 50 26 0 0
time = 65: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 12 50 26 21 0
time = 75: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 13 62 12 50 26 21 8
time = 85: key_complete=1 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0
time = 150: key_complete=1 encrypt_complete=1 decrypt_complete= 0 e= 52 27 25 7 7 0 10
time = 225: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 0 0 0 0
time = 235: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 0 0 0 0 0
time = 245: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 1 0 0 0 0
time = 255: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 1 3 0 0 0
time = 265: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 1 3 0 0 0
time = 275: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 1 3 0 3 0
time = 285: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 1 1 3 0 3 1
time = 295: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 1 1 3 0 3 0
time = 295: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 0 0 0 0 0 0
####################################################################################################################################
###################### Success!!! The decrypted message is the same as the original message sent ######################
Public key h= 13 62 12 50 26 21 8
Message sent m= 9177447591388863703
Encrypted message e= 52 27 25 7 7 0 10
Decrypted message c= 9177447591388863703
####################################################################################################################################
$finish called from file "ntru_pkcs.v", line 131.
$finish at simulation time 320
V C S   S i m u l a t i o n   R e p o r t
Time: 320
CPU Time: 0.070 seconds; Data structure size: 0.3Mb
Sat Apr 3 20:11:34 2010
Waveform: 1
Waveform: 2
Waveform: 3
5.2 Small example parameters N=11, q=32, p=3
Chronologic VCS simulator copyright 1991-2008
Contains Synopsys proprietary information.
Compiler version B-2008.12-9; Runtime version B-2008.12-9; Apr 3 20:04 2010
VCD+ Writer B-2008.12-9 Copyright 2005 Synopsys Inc.
####################################################################################################################################
############################################ NTRU PKCS for N=11 q=5 p=3 ############################################
Input polynomials, f= 3 1 1 0 3 0 1 0 0 1 3
g= 3 0 1 1 0 1 0 0 3 0 3
Inverse of f modp, Fp= 1 2 0 2 2 1 0 2 1 2 0
Inverse of f modq, Fq= 5 9 6 16 4 15 16 22 20 18 30
Random polynomials, r = 3 0 1 1 1 3 0 3 0 0 0
Message m= 3 0 0 1 3 0 0 0 3 1 1
####################################################################################################################################
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 e= 0 0 0 0 0 0 0 0 0 0 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 x x x x x x x x x 0
time = 0: key_complete=0 encrypt_complete=0 decrypt_complete= 0 c= 0 0 0 0 0 0 0 0 0 0 0
time = 15: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 0 0 0 0 0 0 0 0 0 0
time = 25: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 0 0 0 0 0 0 0 0 0
time = 35: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 0 0 0 0 0 0 0 0
time = 45: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 0 0 0 0 0 0 0
time = 55: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 0 0 0 0 0 0
time = 65: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 0 0 0 0 0
time = 75: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 0 0 0 0
time = 85: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 0 0 0
time = 95: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 0 0
time = 105: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 19 0
time = 115: key_complete=0 encrypt_complete=0 decrypt_complete= 0 h= 8 25 22 20 12 24 15 19 12 19 16
time = 120: key_complete=1 encrypt_complete=0 decrypt_complete= 0 h= 0 0 0 0 0 0 0 0 0 0 0
time = 230: key_complete=1 encrypt_complete=1 decrypt_complete= 0 e= 14 11 26 24 14 16 30 7 25 6 19
time = 345: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 0 0 0 0 0 0 0 0
time = 375: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 0 0 0 0 0 0 0
time = 385: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 0 0 0
time = 425: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 0 0
time = 435: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 1 0
time = 445: key_complete=1 encrypt_complete=1 decrypt_complete= 0 c= 3 0 0 1 3 0 0 0 3 1 1
time = 450: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 0 0 1 3 0 0 0 3 1 0
time = 450: key_complete=1 encrypt_complete=1 decrypt_complete= 1 c= 0 0 0 0 0 0 0 0 0 0 0
####################################################################################################################################
###################### Success!!! The decrypted message is the same as the original message sent ######################
Public key h= 8 25 22 20 12 24 15 19 12 19 16
Message sent m= 3 0 0 1 3 0 0 0 3 1 1
Encrypted message e= 14 11 26 24 14 16 30 7 25 6 19
Decrypted message c = 3 0 0 1 3 0 0 0 3 1 1
####################################################################################################################################
$finish called from file "ntru_pkcs.v", line 142.
$finish at simulation time 480
V C S   S i m u l a t i o n   R e p o r t
Time: 480
CPU Time: 0.040 seconds; Data structure size: 0.0Mb
Sat Apr 3 20:04:47 2010
Waveform: 4
Waveform: 5
Waveform: 6
Waveform: 7
Chapter 6
SYNTHESIS FIGURES
6.1 NTRU_Decryptor_Blk
Figure 14: NTRU_Decryptor_Blk Top Level
Figure 15: NTRU_Decryptor_Blk Logic Block
6.2 NTRU_Decryptor
Figure 16: NTRU_Decryptor Top Level
Figure 17: NTRU_Decryptor Logic Block
6.3 NTRU_Encryptor_Blk
Figure 18: NTRU_Encryptor_Blk Top Level
Figure 19: NTRU_Encryptor_Blk Logic Block
6.4 NTRU_Encryptor
Figure 20: NTRU_Encryptor Top Level
Figure 21: NTRU_Encryptor Logic Block
6.5 NTRU_Key
Figure 22: NTRU_Key Top Level
Figure 23: NTRU_Key Logic Block
6.6 Mult_Mod
Figure 24: Mult_Mod Logic Block
6.7 Polynomial_Mult
Figure 25: Polynomial_Mult Logic Block
6.8 Barrel_Shift
Figure 26: Barrel_Shift Logic Block
6.9 Coeff
Figure 27: Coeff Logic Block
6.10 Bit4_Cnt
Figure 28: Bit4_Cnt Logic Block
6.11 Proc_Unit
Figure 29: Proc_Unit Logic Block
6.12 Const_Mult
Figure 30: Const_Mult Logic Block
Chapter 7
CONCLUSIONS AND FUTURE WORK
The NTRU public-key cryptosystem was studied and a hardware implementation
of this system was designed using Verilog HDL. The system has been verified for
different input values: N=7, N=11, N=107, q=32 and q=64. The design is flexible,
since different input polynomials can be tested with only minor
changes to the design for each set of inputs. The size of the polynomials in the
Processing Unit needs to be modified for different values of q, and the size of the barrel
shifter needs to be modified for different values of N. Once these values are changed, the
system can encrypt and decrypt a given message.
The next step would be to implement this system for higher levels of security, such as
moderate (N=164) and high (N=503). This can be done by starting with a smaller
set of inputs and then building the bigger design by instantiating the smaller blocks in
it. In order to do this, we need to find the input polynomial values for the
design. The NTRU company has not made this data public for all levels of security.
The testbench designed in this project is user-friendly and accepts
any valid set of input values. The output generated by the decryptor goes to a
comparator, which checks this output against the message originally sent and, depending on
the outcome, prints a success or failure message. This ensures that the check is
fully automated and is not prone to manual calculation errors.
An implementation based on Montgomery Multiplication was also studied as part
of the project research. A hardware implementation of NTRU using this multiplication
algorithm can be done to increase the multiplication speed [5].
Data security is a very important field and will continue to be
one. Hence, accurate and reliable encryption and decryption algorithms are essential. A
hardware implementation allows this algorithm to be run on
FPGAs, which can execute it much faster and with more reliability. Thus the
NTRU Public Key Cryptosystem designed in this project is of significant importance in
the field of data security today.
APPENDICES
APPENDIX A
RTL Code
A.1 PARAMETERS N=107, q=64, p=3
// Code for 4-bit counter
`ifdef _bit4_cnt_
`else
`define _bit4_cnt_
module bit4_cnt(cnt4,rst,clk);
  output [3:0] cnt4;
  input rst,clk;
  reg [3:0] cnt4=0;
  always @ (posedge clk or posedge rst)
  begin
    if (rst)
      cnt4 <= 0;
    else
      cnt4 <= cnt4+1;
  end
endmodule // bit4_cnt
`endif
___________________________________________________________
// Code for Barrel Shifter
// Used to shift the coefficients of the inputs of the polynomial multiplication process to the correct position
`ifdef _BARREL_SHIFT_
`else
`define _BARREL_SHIFT_
module barrel_shift(shift_out,poly_in,shift_num);
  parameter N = 7;
  parameter q = 6;
  parameter big_size = N*q; //Size of poly_in in bits
  input [3:0] shift_num; //Number of positions to be shifted
  input [big_size-1:0] poly_in; //Unshifted value of input polynomial
  output reg[big_size-1:0] shift_out=0; //Shifted value of polynomial
  always @ (poly_in or shift_num)
  begin
    case (shift_num)
      4'd0 : shift_out <= {poly_in[11:6],poly_in[17:12],poly_in[23:18],poly_in[29:24],poly_in[35:30],poly_in[41:36],poly_in[5:0]};
      4'd1 : shift_out <= {poly_in[17:12],poly_in[23:18],poly_in[29:24],poly_in[35:30],poly_in[41:36],poly_in[5:0],poly_in[11:6]};
      4'd2 : shift_out <= {poly_in[23:18],poly_in[29:24],poly_in[35:30],poly_in[41:36],poly_in[5:0],poly_in[11:6],poly_in[17:12]};
      4'd3 : shift_out <= {poly_in[29:24],poly_in[35:30],poly_in[41:36],poly_in[5:0],poly_in[11:6],poly_in[17:12],poly_in[23:18]};
      4'd4 : shift_out <= {poly_in[35:30],poly_in[41:36],poly_in[5:0],poly_in[11:6],poly_in[17:12],poly_in[23:18],poly_in[29:24]};
      4'd5 : shift_out <= {poly_in[41:36],poly_in[5:0],poly_in[11:6],poly_in[17:12],poly_in[23:18],poly_in[29:24],poly_in[35:30]};
      4'd6 : shift_out <= {poly_in[5:0],poly_in[11:6],poly_in[17:12],poly_in[23:18],poly_in[29:24],poly_in[35:30],poly_in[41:36]};
      4'd7 : shift_out <=0;
      4'd8 : shift_out <=0;
      4'd9 : shift_out <=0;
      4'd10 : shift_out <=0;
      4'd11 : shift_out <=0;
      4'd12 : shift_out <=0;
      4'd13 : shift_out <=0;
      4'd14 : shift_out <=0;
      4'd15 : shift_out <=0;
      default : ;
    endcase
  end
endmodule // barrel_shift
`endif
___________________________________________________________
// Code for Processing Unit - heart of the polynomial multiplier
`ifdef _PROC_UNIT_
`else
`define _PROC_UNIT_
module proc_unit (out,in1,in2,carry_in);
  parameter q = 6;
  input [q-1:0] in1,carry_in;
  input [1:0] in2;
  output [q-1:0] out;
  wire [q-1:0] in_10;
  wire [q-1:0] temp_out;
  wire [q-1:0] out;
  wire [q-1:0] shift_2;
  assign temp_out[5]=(((~in1[5]) & in2[1]) | (in1[5]&(~in2[1])&in2[0]));
  assign temp_out[4]=(((~in1[4]) & in2[1]) | (in1[4]&(~in2[1])&in2[0]));
  assign temp_out[3]=(((~in1[3]) & in2[1]) | (in1[3]&(~in2[1])&in2[0]));
  assign temp_out[2]=(((~in1[2]) & in2[1]) | (in1[2]&(~in2[1])&in2[0]));
  assign temp_out[1]=(((~in1[1]) & in2[1]) | (in1[1]&(~in2[1])&in2[0]));
  assign temp_out[0]=(((~in1[0]) & in2[1]) | (in1[0]&(~in2[1])&in2[0]));
  assign shift_2= in1[q-1:0]<<1;
  assign in_10 = (in2[1]&(~in2[0])) ? shift_2 : temp_out ;
  assign out = in_10 + carry_in + (in2[1]&in2[0]);
endmodule // proc_unit
`endif
___________________________________________________________
// Code for coeff block
// Compute final polynomial coefficient by adding each column element got by polynomial multiplication process
`ifdef _COEFF_
`else
`define _COEFF_
module coeff(C_final,A,B);
  parameter N = 7 ;
  parameter q = 6 ;
  parameter big_size= N*q; //Size of A in bits
  parameter small_size= N*2; //Size of B in bits
  input [big_size-1:0] A;
  input [small_size-1:0] B;
  output wire[q-1:0] C_final;
  wire [big_size-1:0] C;
  assign C_final = C[41:36]; //Final value of the polynomial coefficient
  proc_unit proc0(.out(C[5:0]),.in1(A[5:0]),.in2(B[1:0]),.carry_in(6'b0));
  proc_unit proc1(.out(C[11:6]),.in1(A[11:6]),.in2(B[3:2]),.carry_in(C[5:0]));
  proc_unit proc2(.out(C[17:12]),.in1(A[17:12]),.in2(B[5:4]),.carry_in(C[11:6]));
  proc_unit proc3(.out(C[23:18]),.in1(A[23:18]),.in2(B[7:6]),.carry_in(C[17:12]));
  proc_unit proc4(.out(C[29:24]),.in1(A[29:24]),.in2(B[9:8]),.carry_in(C[23:18]));
  proc_unit proc5(.out(C[35:30]),.in1(A[35:30]),.in2(B[11:10]),.carry_in(C[29:24]));
  proc_unit proc6(.out(C[41:36]),.in1(A[41:36]),.in2(B[13:12]),.carry_in(C[35:30]));
endmodule
`endif
___________________________________________________________
// Code for NTRU Multiplier or Polynomial Multiplication Engine (PME)
// Performs polynomial multiplication on two input polynomials, by shifting one of the inputs using the barrel shifter, to compute product
`ifdef _POLYNOMIAL_MULT_
`else
`define _POLYNOMIAL_MULT_
module polynomial_mult(poly_prod,poly1,poly2,poly_done,clk,rst);
  parameter N= 7;
  parameter q= 6;
  parameter big_size= N*q; //Size of poly1 in bits
  parameter small_size= N*2; //Size of poly2 in bits
  input clk,rst;
  input [big_size-1:0] poly1;
  input [small_size-1:0] poly2;
  output reg[big_size-1:0] poly_prod=0;
  output wire poly_done;
  wire [big_size-1:0] poly1_shift;
  wire [q-1:0] coeff;
  wire [big_size-1:0] prod_temp;
  wire [3:0] shift_cnt;
  reg poly_done_temp=0;
  assign poly_done=poly_done_temp;
  assign prod_temp = coeff<<(6*shift_cnt); //Intermediate shifted product Coefficients
  bit4_cnt count(.cnt4(shift_cnt),.rst(rst),.clk(clk));
  barrel_shift shift(.shift_out(poly1_shift),.poly_in(poly1),.shift_num(shift_cnt));
  coeff cf(.C_final(coeff),.A(poly1_shift),.B(poly2));
  always @ (posedge clk or posedge rst)//or posedge poly_done
  begin
    if (rst)
      poly_prod <=0 ;
    else if (poly_done)
      poly_prod <=0 ;
    else
      poly_prod <= poly_prod + prod_temp;
  end
  always @ (negedge clk)
  begin
    if(shift_cnt == 4'b0111)
    begin
      poly_done_temp <=1;
    end
  end
endmodule
`endif
___________________________________________________________
// Code for the p.Fq multiplier block
// Multiplies the constant integer value of 'const' with each and every coefficient of the 'poly' polynomial
`ifdef _const_mult_
`else
`define _const_mult_
module const_mult(prod,poly,const);
  parameter N= 7;
  parameter q= 6;
  parameter big_size= N*q;
  output [big_size-1:0] prod;
  input [big_size-1:0] poly;
  input [1:0] const;
  assign prod[5:0] = poly[5:0]*const;
  assign prod[11:6] = poly[11:6]*const;
  assign prod[17:12] = poly[17:12]*const;
  assign prod[23:18] = poly[23:18]*const;
  assign prod[29:24] = poly[29:24]*const;
  assign prod[35:30] = poly[35:30]*const;
  assign prod[41:36] = poly[41:36]*const;
endmodule // const_mult
`endif
___________________________________________________________
// Code for Key Creator block
// Key 'h' is calculated using Polynomial multiplication : (p*Fq*g)mod64
module ntru_key(key,f_invq,g_poly,key_done,clk,rst);
  parameter N= 7;
  parameter q= 6;
  parameter p= 2'd3;
  parameter big_size= N*q; // Size of Fq, h in bits
  parameter small_size= N*2; // Size of g in bits
  input [big_size-1:0] f_invq; // Inverse of random polynomial F mod q
  input [small_size-1:0] g_poly; // Random input polynomial
  input clk,rst;
  output [big_size-1:0] key; // Output value of key generated
  output key_done; // Set when key creation process is complete
  wire [big_size-1:0] pFq;
  const_mult const(.prod(pFq),.poly(f_invq),.const(p));
  polynomial_mult poly1(.poly_prod(key),.poly1(pFq),.poly2(g_poly),.poly_done(key_done),.clk(clk),.rst(rst));
endmodule
___________________________________________________________
// Code for Encryptor block
// Encrypted message is calculated as: e = r*h + m (modulo 64)
module ntru_encryptor(enc_msg,key,r_poly,msg,encrypt_done,clk,rst);
  parameter N= 7;
  parameter q= 6;
  parameter big_size= N*q; // Size of h,e,rxh in bits
  parameter small_size= N*2; // Size of r,m in bits
  input [big_size-1:0] key; // Key created
  input [small_size-1:0] r_poly; // Random polynomial
  input [small_size-1:0] msg; // Message to be sent
  input clk,rst;
  output reg [big_size-1:0] enc_msg=0; // Encrypted message generated
  output reg encrypt_done=0; // Set when encryption process is complete
  wire mult_done; // Set when polynomial multiplication: r*h is complete
  wire [big_size-1:0] rxh; // Product of r*h
  // Multiply key, h and random polynomial, r
  polynomial_mult poly2(.poly_prod(rxh),.poly1(key),.poly2(r_poly),.poly_done(mult_done),.clk(clk),.rst(rst));
  always @ (posedge mult_done)
  begin
    // Check if the message coefficient is -1 (code 2'b11): if yes, subtract 1,
    // else add the message coefficient to the r*h product
    if ((msg[1]&msg[0]) ==1)
      enc_msg[5:0] = rxh[5:0] - 1;
    else
      enc_msg[5:0] = rxh[5:0] + msg[1:0];
    if ((msg[3]&msg[2]) ==1)
      enc_msg[11:6] = rxh[11:6] - 1;
    else
      enc_msg[11:6] = rxh[11:6] + msg[3:2];
    if ((msg[5]&msg[4]) ==1)
      enc_msg[17:12] = rxh[17:12] - 1;
    else
      enc_msg[17:12] = rxh[17:12] + msg[5:4];
    if ((msg[7]&msg[6]) ==1)
      enc_msg[23:18] = rxh[23:18] - 1;
    else
      enc_msg[23:18] = rxh[23:18] + msg[7:6];
    if ((msg[9]&msg[8]) ==1)
      enc_msg[29:24] = rxh[29:24] - 1;
    else
      enc_msg[29:24] = rxh[29:24] + msg[9:8];
    if ((msg[11]&msg[10]) ==1)
      enc_msg[35:30] = rxh[35:30] - 1;
    else
      enc_msg[35:30] = rxh[35:30] + msg[11:10];
    if ((msg[13]&msg[12]) ==1)
      enc_msg[41:36] = rxh[41:36] - 1;
    else
      enc_msg[41:36] = rxh[41:36] + msg[13:12];
    encrypt_done=1; // done signal goes high only after the encrypted message is generated
  end
endmodule // ntru_encryptor
___________________________________________________________
// Code for different TOP level modules for q=64, N= 107(Moderate Level of NTRU Security)
// Code for ntru_encryptor_blk
module ntru_encryptor_blk(enc_msg,key,r_poly,msg,encrypt_top_done,clk,rst);
  parameter N= 112;
  parameter q= 6;
  parameter big_size= N*q;
  parameter small_size= N*2;
  input [41:0] key;
  input [small_size-1:0] r_poly;
  input [small_size-1:0] msg;
  input clk,rst;
  output wire [big_size-1:0] enc_msg;
  output encrypt_top_done;
  wire d_1,d_2,d_3,d_4,d_5,d_6,d_7,d_8,d_9,d_10,d_11,d_12,d_13,d_14,d_15;
  ntru_encryptor enc1(.enc_msg(enc_msg[41:0]),.key(key),.r_poly(r_poly[13:0]),.msg(msg[13:0]),.encrypt_done(d_1),.clk(clk),.rst(rst));
  ntru_encryptor enc2(.enc_msg(enc_msg[83:42]),.key(key),.r_poly(r_poly[27:14]),.msg(msg[27:14]),.encrypt_done(d_2),.clk(clk),.rst(rst));
  ntru_encryptor enc3(.enc_msg(enc_msg[125:84]),.key(key),.r_poly(r_poly[41:28]),.msg(msg[41:28]),.encrypt_done(d_3),.clk(clk),.rst(rst));
  ntru_encryptor enc4(.enc_msg(enc_msg[167:126]),.key(key),.r_poly(r_poly[55:42]),.msg(msg[55:42]),.encrypt_done(d_4),.clk(clk),.rst(rst));
  ntru_encryptor enc5(.enc_msg(enc_msg[209:168]),.key(key),.r_poly(r_poly[69:56]),.msg(msg[69:56]),.encrypt_done(d_5),.clk(clk),.rst(rst));
  ntru_encryptor enc6(.enc_msg(enc_msg[251:210]),.key(key),.r_poly(r_poly[83:70]),.msg(msg[83:70]),.encrypt_done(d_6),.clk(clk),.rst(rst));
  ntru_encryptor enc7(.enc_msg(enc_msg[293:252]),.key(key),.r_poly(r_poly[97:84]),.msg(msg[97:84]),.encrypt_done(d_7),.clk(clk),.rst(rst));
  ntru_encryptor enc8(.enc_msg(enc_msg[335:294]),.key(key),.r_poly(r_poly[111:98]),.msg(msg[111:98]),.encrypt_done(d_8),.clk(clk),.rst(rst));
  ntru_encryptor enc9(.enc_msg(enc_msg[377:336]),.key(key),.r_poly(r_poly[125:112]),.msg(msg[125:112]),.encrypt_done(d_9),.clk(clk),.rst(rst));
  ntru_encryptor enc10(.enc_msg(enc_msg[419:378]),.key(key),.r_poly(r_poly[139:126]),.msg(msg[139:126]),.encrypt_done(d_10),.clk(clk),.rst(rst));
  ntru_encryptor enc11(.enc_msg(enc_msg[461:420]),.key(key),.r_poly(r_poly[153:140]),.msg(msg[153:140]),.encrypt_done(d_11),.clk(clk),.rst(rst));
  ntru_encryptor enc12(.enc_msg(enc_msg[503:462]),.key(key),.r_poly(r_poly[167:154]),.msg(msg[167:154]),.encrypt_done(d_12),.clk(clk),.rst(rst));
  ntru_encryptor enc13(.enc_msg(enc_msg[545:504]),.key(key),.r_poly(r_poly[181:168]),.msg(msg[181:168]),.encrypt_done(d_13),.clk(clk),.rst(rst));
  ntru_encryptor enc14(.enc_msg(enc_msg[587:546]),.key(key),.r_poly(r_poly[195:182]),.msg(msg[195:182]),.encrypt_done(d_14),.clk(clk),.rst(rst));
  ntru_encryptor enc15(.enc_msg(enc_msg[629:588]),.key(key),.r_poly(r_poly[209:196]),.msg(msg[209:196]),.encrypt_done(d_15),.clk(clk),.rst(rst));
  ntru_encryptor enc16(.enc_msg(enc_msg[671:630]),.key(key),.r_poly({10'b0,r_poly[213:210]}),.msg({10'b0,msg[213:210]}),.encrypt_done(encrypt_top_done),.clk(clk),.rst(rst));
endmodule // ntru_encryptor_blk ___________________________________________________________ // Code for single stage of Decryptor block // This block computes polynomial multiplcation product and then does a mod3 operation on each coefficient element module mult_mod(poly_mod,mult1,mult2,mod_done,clk,rst); parameter N= 7; parameter q= 6; parameter big_size= N*q; parameter small_size= N*2; input [big_size-1:0] mult1; // input polynomial input [small_size-1:0] mult2; // input polynomial input clk,rst; output wire [small_size-1:0] poly_mod; // Product of polynomial multiplication, mod3 output wire mod_done; // mod_done wire [big_size-1:0] mult_out; // Product of polynomial multiplication //Function mod3 computes the different mod3 values for a 6-bit input ranging from 1-63 function[q-1:0] mod3; input [q-1:0] mod_in; begin case(mod_in) 6'd1,4,7,10,13,16,19,22,25,28,31,35,38,41,44,47,50,53,56,59,62 : mod3= 1; 6'd2,5,8,11,14,17,20,23,26,29,32,33,36,39,42,45,48,51,54,57,60,63 : mod3= -1; 6'd3,6,9,12,15,18,21,24,27,30,34,37,40,43,46,49,52,55,58,61 : mod3= 0; default : mod3 =0;
endcase end endfunction // mod3 // Multiply 2 polynomials mod64 polynomial_mult poly3(.poly_prod(mult_out),.poly1(mult1),.poly2(mult2),.poly_done(mod_done),.clk(clk),.rst(rst)); // Compute the mod3 value of each coefficient for the product of the polynomial multiplication assign poly_mod[1:0] = mod3(mult_out[5:0]); assign poly_mod[3:2] = mod3(mult_out[11:6]); assign poly_mod[5:4] = mod3(mult_out[17:12]); assign poly_mod[7:6] = mod3(mult_out[23:18]); assign poly_mod[9:8] = mod3(mult_out[29:24]); assign poly_mod[11:10] = mod3(mult_out[35:30]); assign poly_mod[13:12] = mod3(mult_out[41:36]); endmodule // mult_mod ___________________________________________________________ // Code for combined Decryptor block module ntru_decryptor(dec_msg,enc_msg,f_poly,f_invp,decrypt_done,clk,rst); parameter N= 7; parameter q= 6; parameter big_size= N*q; parameter small_size= N*2; input [big_size-1:0] enc_msg; // Encrypted message input [small_size-1:0] f_poly; // Random polynomial input [big_size-1:0] f_invp; // Inverse polynomial f mod p input clk,rst; output [small_size-1:0] dec_msg; // Decrypted message output wire decrypt_done; // Decryption done
reg [small_size-1:0] dec_b_reg; wire [small_size-1:0] dec_b; wire done_fe,done_Fpb,rst_fe,rst_Fpb; assign rst_fe = rst; assign rst_Fpb = ~done_fe | rst; assign decrypt_done = done_Fpb; // Compute b = (f*e (modulo q))modulo_3 mult_mod fe(dec_b,enc_msg,f_poly,done_fe,clk,rst_fe); always @ (done_fe) begin dec_b_reg <= dec_b; end // Compute c = Fp*b_out (modulo_3) mult_mod Fpb(dec_msg,f_invp,dec_b_reg,done_Fpb,clk,rst_Fpb); endmodule ___________________________________________________________ // Code for ntru_decryptor_blk module ntru_decryptor_blk(dec_msg,enc_msg,f_poly,f_invp,decrypt_top_done,clk,rst); parameter N= 112; parameter q= 6; parameter big_size= N*q; parameter small_size= N*2; input [41:0] f_invp; // Inverse polynomial f mod 3 input [13:0] f_poly; // Random polynomial f input [big_size-1:0] enc_msg; // Encrypted message input clk,rst; output wire decrypt_top_done; // Decryption complete output [small_size-1:0] dec_msg; // Final decrypted message
wire d_1,d_2,d_3,d_4,d_5,d_6,d_7,d_8,d_9,d_10,d_11,d_12,d_13,d_14,d_15; ntru_decryptor dec1(.dec_msg(dec_msg[13:0]),.enc_msg(enc_msg[41:0]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_1),.clk(clk),.rst(rst)); ntru_decryptor dec2(.dec_msg(dec_msg[27:14]),.enc_msg(enc_msg[83:42]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_2),.clk(clk),.rst(rst)); ntru_decryptor dec3(.dec_msg(dec_msg[41:28]),.enc_msg(enc_msg[125:84]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_3),.clk(clk),.rst(rst)); ntru_decryptor dec4(.dec_msg(dec_msg[55:42]),.enc_msg(enc_msg[167:126]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_4),.clk(clk),.rst(rst)); ntru_decryptor dec5(.dec_msg(dec_msg[69:56]),.enc_msg(enc_msg[209:168]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_5),.clk(clk),.rst(rst)); ntru_decryptor dec6(.dec_msg(dec_msg[83:70]),.enc_msg(enc_msg[251:210]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_6),.clk(clk),.rst(rst)); ntru_decryptor dec7(.dec_msg(dec_msg[97:84]),.enc_msg(enc_msg[293:252]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_7),.clk(clk),.rst(rst)); ntru_decryptor dec8(.dec_msg(dec_msg[111:98]),.enc_msg(enc_msg[335:294]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_8),.clk(clk),.rst(rst)); ntru_decryptor dec9(.dec_msg(dec_msg[125:112]),.enc_msg(enc_msg[377:336]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_9),.clk(clk),.rst(rst)); ntru_decryptor dec10(.dec_msg(dec_msg[139:126]),.enc_msg(enc_msg[419:378]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_10),.clk(clk),.rst(rst));
ntru_decryptor dec11(.dec_msg(dec_msg[153:140]),.enc_msg(enc_msg[461:420]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_11),.clk(clk),.rst(rst)); ntru_decryptor dec12(.dec_msg(dec_msg[167:154]),.enc_msg(enc_msg[503:462]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_12),.clk(clk),.rst(rst)); ntru_decryptor dec13(.dec_msg(dec_msg[181:168]),.enc_msg(enc_msg[545:504]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_13),.clk(clk),.rst(rst)); ntru_decryptor dec14(.dec_msg(dec_msg[195:182]),.enc_msg(enc_msg[587:546]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_14),.clk(clk),.rst(rst)); ntru_decryptor dec15(.dec_msg(dec_msg[209:196]),.enc_msg(enc_msg[629:588]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(d_15),.clk(clk),.rst(rst)); ntru_decryptor dec16(.dec_msg(dec_msg[223:210]),.enc_msg(enc_msg[671:630]),.f_poly(f_poly),.f_invp(f_invp),.decrypt_done(decrypt_top_done),.clk(clk),.rst(rst)); endmodule // ntru_decryptor_blk ___________________________________________________________ // Code for Unified Test Bench for Low Level of NTRU Security `include "ntru_key.v" `include "ntru_encryptor_blk.v" `include "ntru_decryptor_blk.v" module ntru_pkcs; parameter N= 112; parameter N_ind= 7; parameter q= 6; parameter p= 3; parameter big_size= N*q; parameter small_size= N*2;
parameter big_ind= N_ind*q; parameter small_ind= N_ind*2; // NTRU Key Creator reg clk,rst_key; reg [big_ind-1:0] f_invq; reg [big_ind-1:0] pFq; reg [small_ind-1:0] g_poly; wire [big_ind-1:0] key; wire key_complete; // NTRU Encryptor Block reg [small_size-1:0] r_poly=0; reg [small_size-1:0] msg=0; reg [big_ind-1:0] key_reg=0; wire rst_enc; wire [big_size-1:0] enc_msg; wire encrypt_complete; // NTRU Decryptor Block reg [small_ind-1:0] f_poly; wire rst_dec; wire decrypt_complete; reg [big_ind-1:0] f_invp; wire [small_size-1:0] dec_msg; reg [small_size-1:0] dec_msg_reg; assign rst_enc = ~key_complete; assign rst_dec = ~encrypt_complete | rst_key; //Instantiate all the three blocks for key creation, encryption, decryption ntru_key key_create(key,f_invq,g_poly,key_complete,clk,rst_key); ntru_encryptor_blk encrypt(enc_msg,key_reg,r_poly,msg,encrypt_complete,clk,rst_enc); ntru_decryptor_blk decrypt(dec_msg,enc_msg,f_poly,f_invp,decrypt_complete,clk,rst_dec); wire [1:0] g6,g5,g4,g3,g2,g1,g0,f6,f5,f4,f3,f2,f1,f0,r6,r5,r4,r3,r2,r1,r0,m6,m5,m4,m3,m2,m1,m0,c6,c5,c4,c3,c2,c1,c0,ct6,ct5,ct4,ct3,ct2,ct1,ct0;
wire [q-1:0] ht6,ht5,ht4,ht3,ht2,ht1,ht0,h6,h5,h4,h3,h2,h1,h0,fp6,fp5,fp4,fp3,fp2,fp1,fp0,fq6,fq5,fq4,fq3,fq2,fq1,fq0,e6,e5,e4,e3,e2,e1,e0; assign {g6,g5,g4,g3,g2,g1,g0}=g_poly; assign {r6,r5,r4,r3,r2,r1,r0}=r_poly; assign {f6,f5,f4,f3,f2,f1,f0}=f_poly; assign {m6,m5,m4,m3,m2,m1,m0}=msg; assign {c6,c5,c4,c3,c2,c1,c0}=dec_msg; assign {ct6,ct5,ct4,ct3,ct2,ct1,ct0}=dec_msg_reg; assign {fp6,fp5,fp4,fp3,fp2,fp1,fp0}=f_invp; assign {fq6,fq5,fq4,fq3,fq2,fq1,fq0}=f_invq; assign {h6,h5,h4,h3,h2,h1,h0}=key; assign {ht6,ht5,ht4,ht3,ht2,ht1,ht0}=key_reg; assign {e6,e5,e4,e3,e2,e1,e0}=enc_msg; initial $vcdpluson; // Clock generator initial begin clk = 0; #10 forever #5 clk = ~clk; end // Input polynomials and display commands initial begin rst_key = 1; f_invq = {6'd20,6'd29,-6'd15,6'd13,-6'd2,6'd11,6'd9}; g_poly = {2'd0,2'd0,-2'd1,2'd1,2'd0,2'd1,-2'd1}; f_invp = {-6'd1,6'd0,-6'd1,6'd1,6'd0,6'd1,6'd1}; f_poly = {2'd0,2'd0,-2'd1,2'd1,2'd1,2'd0,2'd0}; r_poly = {2'd0,-2'd1,-2'd1,-2'd1,2'd1,-2'd1,2'd0}; msg = 224'h7F5Cd7F5Cd7F5Cd7; //{4'hf,4'h5,4'hc,4'h1,4'h3,4'h7,4'hd,4'hf,4'h7}; //5192296858534827628530496329220097; //224'h8F5C28F5C28F5C28 $display("#################################################
###################################################################################\n");
$display("############################################ NTRU PKCS for N=112 q=6 p=3 ############################################\n") ;
$display(" Input polynomials, f= %d %d %d %d %d %d %d g= %d %d %d %d %d %d %d\n Inverse of f modp, Fp= %d %d %d %d %d %d %d Inverse of f modq, Fq= %d %d %d %d %d %d %d",f0,f1,f2,f3,f4,f5,f6,g0,g1,g2,g3,g4,g5,g6,fp0,fp1,fp2,fp3,fp4,fp5,fp6,fq0,fq1,fq2,fq3,fq4,fq5,fq6);
$display(" Random polynomials, r = %d %d %d %d %d %d %d Message m= %d \n",r0,r1,r2,r3,r4,r5,r6,msg);
$display("####################################################################################################################################\n");
#10;
rst_key =0;
#600 $finish;
end
always @ (key) begin
$display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d h= %d %d %d %d %d %d %d \n",$time,key_complete,encrypt_complete,decrypt_complete,h0,h1,h2,h3,h4,h5,h6);
end
always @ (posedge key_complete) begin
key_reg <= key;
end
always @ (enc_msg) begin
$display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d e= %d %d %d %d %d %d %d \n",$time,key_complete,encrypt_complete,decrypt_complete,e0,e1,e2,e3,e4,e5,e6);
end always @ (dec_msg) begin $display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d c= %d %d %d %d %d %d %d\n ",$time,key_complete,encrypt_complete,decrypt_complete,c0,c1,c2,c3,c4,c5,c6); end always @ (posedge decrypt_complete) begin dec_msg_reg <= dec_msg; if(msg==dec_msg) begin #30 $display("####################################################################################################################################\n"); $display("###################### Success!!! The decrypted message is the same as the original message sent ###################### \n"); $display("Public key h= %d %d %d %d %d %d %d\nMessage sent m= %20d \nEncrypted message e= %d %d %d %d %d %d %d\nDecrypted message c= %20d \n",ht0,ht1,ht2,ht3,ht4,ht5,ht6,msg,e0,e1,e2,e3,e4,e5,e6,dec_msg_reg); $display("####################################################################################################################################"); $finish; end else begin #30 $display("####################################################################################################################################\n"); $display("###################### Fail!!! The decrypted message and the original message sent are different ###################### \n"); $display("Public key h= %d %d %d %d %d %d %d\nMessage sent m= %20d \nEncrypted message, e= %d %d %d %d %d %d %d\nDecrypted message c= %20d
\n",ht0,ht1,ht2,ht3,ht4,ht5,ht6,msg,e0,e1,e2,e3,e4,e5,e6,dec_msg_reg); $display("####################################################################################################################################"); $finish; end end endmodule ___________________________________________________________ ___________________________________________________________
A.2 PARAMETERS N=11, q=32, p=3
// Code for 4-bit counter
`ifdef _bit4_cnt_ `else `define _bit4_cnt_ module bit4_cnt(cnt4,rst,clk); output reg [3:0] cnt4=0; input rst,clk; always @ (posedge clk or posedge rst) begin if (rst) cnt4 <= 0; else cnt4 <= cnt4+1; end endmodule // bit4_cnt `endif ___________________________________________________________ // Code for Barrel Shifter // Used to shift the coefficients of the inputs of the polynomial multiplication process to the correct position
`ifdef _BARREL_SHIFT_ `else `define _BARREL_SHIFT_ module barrel_shift(shift_out,poly_in,shift_num); parameter N =11; parameter q =5; parameter big_size = N*q; //Size of poly_in in bits input [3:0] shift_num; //Number of positions to be shifted input [big_size-1:0] poly_in; //Unshifted value of input polynomial output reg [big_size-1:0] shift_out=0; //Shifted value of polynomial always @ (poly_in or shift_num) begin case (shift_num) 4'd0 : shift_out <= {poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0]}; 4'd1 : shift_out <= {poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5]}; 4'd2 : shift_out <= {poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10]}; 4'd3 : shift_out <= {poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15]}; 4'd4 : shift_out <= {poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20]};
4'd5 : shift_out <= {poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25]}; 4'd6 : shift_out <= {poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30]}; 4'd7 : shift_out <= {poly_in[44:40],poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35]}; 4'd8 : shift_out <= {poly_in[49:45],poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40]}; 4'd9 : shift_out <= {poly_in[54:50],poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45]}; 4'd10 : shift_out <={poly_in[4:0],poly_in[9:5],poly_in[14:10],poly_in[19:15],poly_in[24:20],poly_in[29:25],poly_in[34:30],poly_in[39:35],poly_in[44:40],poly_in[49:45],poly_in[54:50]}; 4'd11 : shift_out <=0; 4'd12 : shift_out <=0; 4'd13 : shift_out <=0; 4'd14 : shift_out <=0; 4'd15 : shift_out <=0; default : ; endcase end endmodule // barrel_shift `endif ___________________________________________________________
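The case table in barrel_shift can be read as a reversed rotation: for shift_num = k, output slot j carries input coefficient (k - j) mod N, so a slot-by-slot dot product with the second operand yields coefficient k of the cyclic product. The Python sketch below is a software model of that reading and is not part of the original design; the function names and the test vectors a and b are illustrative assumptions.

```python
N = 11  # number of coefficients, as in the N=11 listing

def barrel_shift(poly, k):
    """Model of the barrel_shift case table.

    poly[i] is coefficient i (bits [5*i+4 : 5*i] in the Verilog).
    Output slot j holds poly[(k - j) mod N]; shift values of N or
    more drive zero, as in the 4'd11..4'd15 branches.
    """
    if k >= N:
        return [0] * N
    return [poly[(k - j) % N] for j in range(N)]

def conv_coeff(a, b, k):
    """Coefficient k of a*b in Z[X]/(X^N - 1), computed directly."""
    return sum(a[i] * b[j]
               for i in range(N) for j in range(N)
               if (i + j) % N == k)

# Arbitrary illustrative operands (not taken from the test bench).
a = list(range(1, N + 1))
b = [1, 0, -1, 1, 0, 0, 1, -1, 0, 1, 0]

# Dot product of the shifted operand with b, one coefficient per shift.
dots = [sum(s * t for s, t in zip(barrel_shift(a, k), b)) for k in range(N)]
matches = [dots[k] == conv_coeff(a, b, k) for k in range(N)]
print(all(matches))
```

Under this model, the counter value that selects the shift is also the index of the product coefficient being computed, which is consistent with polynomial_mult placing each result at offset 5*shift_cnt.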
// Code for Processing Unit - heart of the polynomial multiplier
`ifdef _PROC_UNIT_
`else
`define _PROC_UNIT_
module proc_unit (out,in1,in2,carry_in);
parameter q = 5;
input [q-1:0] in1,carry_in;
input [1:0] in2;
output [q-1:0] out;
wire [q-1:0] in_10;
wire [q-1:0] temp_out;
wire [q-1:0] out;
assign temp_out[4]=(((~in1[4]) & in2[1]) | (in1[4]&(~in2[1])&in2[0]));
assign temp_out[3]=(((~in1[3]) & in2[1]) | (in1[3]&(~in2[1])&in2[0]));
assign temp_out[2]=(((~in1[2]) & in2[1]) | (in1[2]&(~in2[1])&in2[0]));
assign temp_out[1]=(((~in1[1]) & in2[1]) | (in1[1]&(~in2[1])&in2[0]));
assign temp_out[0]=(((~in1[0]) & in2[1]) | (in1[0]&(~in2[1])&in2[0]));
assign in_10 = (in2[1]&(~in2[0])) ? {in1[q-1:0]<<1} : temp_out ;
assign out = in_10 + carry_in + (in2[1]&in2[0]);
endmodule // proc_unit
`endif
___________________________________________________________
// Code for coeff block
// Compute the final polynomial coefficient by adding the column elements produced by the polynomial multiplication process
`ifdef _COEFF_
`else
`define _COEFF_
module coeff(C_final,A,B); parameter N = 11 ; parameter q = 5 ; parameter big_size= N*q; //Size of A in bits parameter small_size= N*2; //Size of B in bits input [big_size-1:0] A; input [small_size-1:0] B; output wire [q-1:0] C_final; wire [big_size-1:0] C; assign C_final = C[54:50]; //Final value of the polynomial coefficient proc_unit proc0 (.out(C[4:0]),.in1(A[4:0]),.in2(B[1:0]),.carry_in(5'b0)); proc_unit proc1 (.out(C[9:5]),.in1(A[9:5]),.in2(B[3:2]),.carry_in(C[4:0])); proc_unit proc2 (.out(C[14:10]),.in1(A[14:10]),.in2(B[5:4]),.carry_in(C[9:5])); proc_unit proc3 (.out(C[19:15]),.in1(A[19:15]),.in2(B[7:6]),.carry_in(C[14:10])); proc_unit proc4 (.out(C[24:20]),.in1(A[24:20]),.in2(B[9:8]),.carry_in(C[19:15])); proc_unit proc5 (.out(C[29:25]),.in1(A[29:25]),.in2(B[11:10]),.carry_in(C[24:20])); proc_unit proc6 (.out(C[34:30]),.in1(A[34:30]),.in2(B[13:12]),.carry_in(C[29:25])); proc_unit proc7 (.out(C[39:35]),.in1(A[39:35]),.in2(B[15:14]),.carry_in(C[34:30])); proc_unit proc8 (.out(C[44:40]),.in1(A[44:40]),.in2(B[17:16]),.carry_in(C[39:35])); proc_unit proc9 (.out(C[49:45]),.in1(A[49:45]),.in2(B[19:18]),.carry_in(C[44:40]));
proc_unit proc10(.out(C[54:50]),.in1(A[54:50]),.in2(B[21:20]),.carry_in(C[49:45])); endmodule `endif ___________________________________________________________ // Code for the p.Fq multiplier block // Multiplies the constant integer value of p with each and every coefficient of the Fq polynomial `ifdef _const_mult_ `else `define _const_mult_ module const_mult(prod,poly,const); parameter N= 11; parameter q= 5; parameter big_size= N*q; output [big_size-1:0] prod; input [big_size-1:0] poly; input [1:0] const; assign prod[4:0] = poly[4:0]*const; assign prod[9:5] = poly[9:5]*const; assign prod[14:10] = poly[14:10]*const; assign prod[19:15] = poly[19:15]*const; assign prod[24:20] = poly[24:20]*const; assign prod[29:25] = poly[29:25]*const; assign prod[34:30] = poly[34:30]*const; assign prod[39:35] = poly[39:35]*const; assign prod[44:40] = poly[44:40]*const; assign prod[49:45] = poly[49:45]*const; assign prod[54:50] = poly[54:50]*const; endmodule // const_mult `endif ___________________________________________________________
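The gate-level expressions in proc_unit are easier to check against an arithmetic description: the 2-bit input in2 acts as a small signed multiplier (2'b00 = 0, 2'b01 = +1, 2'b10 = +2, 2'b11 = -1), and the output is in1*in2 + carry_in reduced modulo 2^q. The Python sketch below transcribes the assign statements bit by bit and compares them with that arithmetic reading for every input combination at q = 5. It is a cross-check written for this appendix, not part of the original design, and the function names are assumptions.

```python
Q = 5                 # coefficient width in bits, as in the N=11 listing
MASK = (1 << Q) - 1   # keeps results within q bits (modulo 2**Q)

def proc_unit_bits(in1, in2, carry_in):
    """Bit-level transcription of the proc_unit assign statements."""
    b1, b0 = (in2 >> 1) & 1, in2 & 1
    temp_out = 0
    for i in range(Q):
        a = (in1 >> i) & 1
        # temp_out[i] = (~in1[i] & in2[1]) | (in1[i] & ~in2[1] & in2[0])
        bit = ((1 - a) & b1) | (a & (1 - b1) & b0)
        temp_out |= bit << i
    # in2 == 2'b10 selects the left-shifted operand (multiply by 2).
    in_10 = ((in1 << 1) & MASK) if (b1 == 1 and b0 == 0) else temp_out
    # The (in2[1] & in2[0]) term completes the two's complement for -1.
    return (in_10 + carry_in + (b1 & b0)) & MASK

SIGNED = {0b00: 0, 0b01: 1, 0b10: 2, 0b11: -1}

def proc_unit_arith(in1, in2, carry_in):
    """Arithmetic reading: in1 * signed(in2) + carry_in, modulo 2**Q."""
    return (in1 * SIGNED[in2] + carry_in) % (1 << Q)

mismatches = [(x, s, c)
              for x in range(1 << Q)
              for s in range(4)
              for c in range(1 << Q)
              if proc_unit_bits(x, s, c) != proc_unit_arith(x, s, c)]
print(len(mismatches))
```

Chaining eleven such units through carry_in, as coeff does, therefore accumulates the sum of A[j]*B[j] modulo 2^q in the last unit's output.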
// Code for NTRU Multiplier or Polynomial Multiplication Engine (PME) // Performs polynomial multiplication on two input polynomials, by shifting one of the inputs using the barrel shifter, to compute product `ifdef _POLYNOMIAL_MULT_ `else `define _POLYNOMIAL_MULT_ module polynomial_mult(poly_prod,poly1,poly2,poly_done,clk,rst); parameter N= 11; parameter q= 5; parameter big_size= N*q; //Size of poly1 in bits parameter small_size= N*2; //Size of poly2 in bits input clk,rst; input [big_size-1:0] poly1; input [small_size-1:0] poly2; output reg[big_size-1:0] poly_prod=0; output reg poly_done=0; wire [big_size-1:0] poly1_shift; wire [q-1:0] coeff; wire [big_size-1:0] prod_temp; wire [3:0] shift_cnt; assign prod_temp = coeff<<(5*shift_cnt); //Intermediate shifted product Coefficients bit4_cnt count(.cnt4(shift_cnt),.rst(rst),.clk(clk)); barrel_shift shift(.shift_out(poly1_shift),.poly_in(poly1),.shift_num(shift_cnt)); coeff cf(.C_final(coeff),.A(poly1_shift),.B(poly2)); always @ (posedge clk or posedge rst) begin if (rst) begin poly_prod <=0 ; end
else if (poly_done) poly_prod <=0 ;
else begin
poly_prod <= poly_prod + prod_temp;
end
end
always @ (negedge clk) begin
if(shift_cnt == 4'b1011) begin
poly_done <= 1;
end
end
endmodule
`endif
___________________________________________________________
// Code for Key Creator block
// Key 'h' is calculated using polynomial multiplication: (p*Fq*g) mod 32
module ntru_key(key,f_invq,g_poly,key_done,clk,rst);
parameter N= 11;
parameter q= 5;
parameter p= 2'd3;
parameter big_size= N*q; // Size of Fq, h in bits
parameter small_size= N*2; // Size of g in bits
input [big_size-1:0] f_invq; // Inverse of random polynomial F mod q
input [small_size-1:0] g_poly; // Random input polynomial
input clk,rst;
output [big_size-1:0] key; // Output value of key generated
output key_done; // Set when key creation process is complete
wire [big_size-1:0] pFq;
const_mult const(.prod(pFq),.poly(f_invq),.const(p));
polynomial_mult poly1(.poly_prod(key),.poly1(pFq),.poly2(g_poly),.poly_done(key_done),.clk(clk),.rst(rst));
endmodule
___________________________________________________________
// Code for Encryptor block
// Encrypted message is calculated as: e = r*h + m (modulo 32)
module ntru_encryptor(enc_msg,key,r_poly,msg,encrypt_done,clk,rst);
parameter N= 11;
parameter q= 5;
parameter big_size= N*q; // Size of h,e,rxh in bits
parameter small_size= N*2; // Size of r,m in bits
input [big_size-1:0] key; // Key created
input [small_size-1:0] r_poly; // Random polynomial
input [small_size-1:0] msg; // Message to be sent
input clk,rst;
output reg [big_size-1:0] enc_msg=0; // Encrypted message generated
output reg encrypt_done=0; // Set when encryption process is complete
wire mult_done; // Set when polynomial multiplication: r*h is complete
wire [big_size-1:0] rxh; // Product of r*h
polynomial_mult poly2(.poly_prod(rxh),.poly1(key),.poly2(r_poly),.poly_done(mult_done),.clk(clk),.rst(rst));
always @ (posedge mult_done) begin
// Check if the message coefficient is -1 (2'b11): if yes, subtract 1, else add the message coefficient to the r*h product
if ((msg[1]&msg[0]) ==1) enc_msg[4:0] = rxh[4:0] - 1;
else enc_msg[4:0] = rxh[4:0] + msg[1:0];
if ((msg[3]&msg[2]) ==1) enc_msg[9:5] = rxh[9:5] - 1;
else enc_msg[9:5] = rxh[9:5] + msg[3:2];
if ((msg[5]&msg[4]) ==1) enc_msg[14:10] = rxh[14:10] - 1;
else enc_msg[14:10] = rxh[14:10] + msg[5:4];
if ((msg[7]&msg[6]) ==1) enc_msg[19:15] = rxh[19:15] - 1;
else enc_msg[19:15] = rxh[19:15] + msg[7:6];
if ((msg[9]&msg[8]) ==1) enc_msg[24:20] = rxh[24:20] - 1;
else enc_msg[24:20] = rxh[24:20] + msg[9:8];
if ((msg[11]&msg[10]) ==1) enc_msg[29:25] = rxh[29:25] - 1;
else enc_msg[29:25] = rxh[29:25] + msg[11:10];
if ((msg[13]&msg[12]) ==1) enc_msg[34:30] = rxh[34:30] - 1;
else enc_msg[34:30] = rxh[34:30] + msg[13:12];
if ((msg[15]&msg[14]) ==1) enc_msg[39:35] = rxh[39:35] - 1;
else enc_msg[39:35] = rxh[39:35] + msg[15:14];
if ((msg[17]&msg[16]) ==1) enc_msg[44:40] = rxh[44:40] - 1;
else
enc_msg[44:40] = rxh[44:40] + msg[17:16];
if ((msg[19]&msg[18]) ==1) enc_msg[49:45] = rxh[49:45] - 1;
else enc_msg[49:45] = rxh[49:45] + msg[19:18];
if ((msg[21]&msg[20]) ==1) enc_msg[54:50] = rxh[54:50] - 1;
else enc_msg[54:50] = rxh[54:50] + msg[21:20];
encrypt_done=1; // done signal goes high only after the encrypted message is generated
end
endmodule // ntru_encryptor
___________________________________________________________
// Code for single stage of Decryptor block
// This block computes the polynomial multiplication product and then performs a mod3 operation on each coefficient
module mult_mod(poly_mod,mult1,mult2,mod_done,clk,rst);
parameter N= 11;
parameter q= 5;
parameter big_size= N*q;
parameter small_size= N*2;
input [big_size-1:0] mult1; // input polynomial
input [small_size-1:0] mult2; // input polynomial
input clk,rst;
output wire [small_size-1:0] poly_mod; // Product of polynomial multiplication, mod3
output wire mod_done; // mod_done
wire [big_size-1:0] mult_out; // Product of polynomial multiplication
//Function mod3 computes the different mod3 values for a 5-bit input ranging from 1-31
function[q-1:0] mod3; input [q-1:0] mod_in; begin case(mod_in) 5'd2,5'd5,5'd8,5'd11,5'd14,5'd16,5'd19,5'd22,5'd25,5'd28,5'd31 : mod3= -1; 5'd3,5'd6,5'd9,5'd12,5'd15,5'd17,5'd20,5'd23,5'd26,5'd29 : mod3= 0; 5'd1,5'd4,5'd7,5'd10,5'd13,5'd18,5'd21,5'd24,5'd27,5'd30 : mod3= 1; default : mod3 =0; endcase end endfunction // mod3 // Multiply 2 polynomials mod32 polynomial_mult poly3(.poly_prod(mult_out),.poly1(mult1),.poly2(mult2),.poly_done(mod_done),.clk(clk),.rst(rst)); // Compute the mod3 value of each coefficient for the product of the polynomial multiplication assign poly_mod[1:0] = mod3(mult_out[4:0]); assign poly_mod[3:2] = mod3(mult_out[9:5]); assign poly_mod[5:4] = mod3(mult_out[14:10]); assign poly_mod[7:6] = mod3(mult_out[19:15]); assign poly_mod[9:8] = mod3(mult_out[24:20]); assign poly_mod[11:10] = mod3(mult_out[29:25]); assign poly_mod[13:12] = mod3(mult_out[34:30]); assign poly_mod[15:14] = mod3(mult_out[39:35]); assign poly_mod[17:16] = mod3(mult_out[44:40]); assign poly_mod[19:18] = mod3(mult_out[49:45]); assign poly_mod[21:20] = mod3(mult_out[54:50]); endmodule // mult_mod ___________________________________________________________ // Code for combined Decryptor block
module ntru_decryptor(dec_msg,enc_msg,f_poly,f_invp,decrypt_done,clk,rst); parameter N= 11; parameter q= 5; parameter big_size= N*q; parameter small_size= N*2; input [big_size-1:0] enc_msg; // Encrypted message input [small_size-1:0] f_poly; // Random polynomial input [big_size-1:0] f_invp; // Inverse polynomial f mod p input clk,rst; output [small_size-1:0] dec_msg; // Decrypted message output wire decrypt_done; // Decryption done reg [small_size-1:0] dec_b_reg; wire [small_size-1:0] dec_b; wire done_fe,done_Fpb,rst_fe,rst_Fpb; assign rst_fe = rst; assign rst_Fpb = ~done_fe | rst; assign decrypt_done = done_Fpb; // Compute b = (f*e (modulo q))modulo_3 mult_mod fe(dec_b,enc_msg,f_poly,done_fe,clk,rst_fe); always @ (done_fe) begin dec_b_reg <= dec_b; end // Compute c = Fp*dec_b_reg (modulo_3) mult_mod Fpb(dec_msg,f_invp,dec_b_reg,done_Fpb,clk,rst_Fpb); endmodule ___________________________________________________________
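The mod3 lookup in the q=5 version of mult_mod is not a plain remainder of the unsigned 5-bit value. Its entries are consistent with first lifting the coefficient to the centered range [-16, 15] (values of 16 and above are read as negative) and then reducing modulo 3, with a remainder of 2 written as -1. The Python sketch below checks that reading against the case table for all 32 inputs. The table is copied from the listing; the function names are assumptions and the check is not part of the original design.

```python
# Case table copied from function mod3 in the q=5 mult_mod listing.
TABLE = {}
for v in (2, 5, 8, 11, 14, 16, 19, 22, 25, 28, 31):
    TABLE[v] = -1
for v in (3, 6, 9, 12, 15, 17, 20, 23, 26, 29):
    TABLE[v] = 0
for v in (1, 4, 7, 10, 13, 18, 21, 24, 27, 30):
    TABLE[v] = 1

def mod3_table(x):
    """Lookup as written in the Verilog; unlisted inputs take the default 0."""
    return TABLE.get(x, 0)

def mod3_centered(x):
    """Center-lift a 5-bit value to [-16, 15], then reduce modulo 3."""
    lifted = x - 32 if x >= 16 else x
    r = lifted % 3          # Python's % returns 0, 1 or 2 here
    return -1 if r == 2 else r

agree = all(mod3_table(x) == mod3_centered(x) for x in range(32))
print(agree)
```

The centered lift matters because decryption has to recover the small signed coefficients of f*e before reducing modulo p; reducing the unsigned value directly would give a different remainder for the upper half of the range.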
// Code for the entire NTRU PKCS for input parameters: N=11,q=5,p=3 module ntru_pkcs; parameter N= 11; parameter q= 5; parameter p= 3; parameter big_size= N*q; parameter small_size= N*2; // NTRU Key Creator reg clk,rst_key; reg [big_size-1:0] f_invq; reg [big_size-1:0] pFq; reg [small_size-1:0] g_poly; wire [big_size-1:0] key; wire key_complete; // NTRU Encryptor reg [small_size-1:0] r_poly=0; reg [small_size-1:0] msg=0; reg [big_size-1:0] key_reg=0; wire rst_encrypt; wire [big_size-1:0] enc_msg; wire encrypt_complete; // NTRU Decryptor reg [small_size-1:0] f_poly; wire rst_decrypt; wire decrypt_complete; reg [big_size-1:0] f_invp; wire [small_size-1:0] dec_msg; reg [small_size-1:0] dec_msg_reg; assign rst_encrypt = ~key_complete; assign rst_decrypt = ~encrypt_complete | rst_key; //Instantiate all the three blocks for key creation, encryption, decryption ntru_key key_create(key,f_invq,g_poly,key_complete,clk,rst_key); ntru_encryptor encrypt(enc_msg,key_reg,r_poly,msg,encrypt_complete,clk,rst_encrypt);
ntru_decryptor decrypt(dec_msg,enc_msg,f_poly,f_invp,decrypt_complete,clk,rst_decrypt); wire [1:0] g10,g9,g8,g7,g6,g5,g4,g3,g2,g1,g0,f10,f9,f8,f7,f6,f5,f4,f3,f2,f1,f0,r10,r9,r8,r7,r6,r5,r4,r3,r2,r1,r0,m10,m9,m8,m7,m6,m5,m4,m3,m2,m1,m0,c10,c9,c8,c7,c6,c5,c4,c3,c2,c1,c0,ct10,ct9,ct8,ct7,ct6,ct5,ct4,ct3,ct2,ct1,ct0; wire [q-1:0] ht10,ht9,ht8,ht7,ht6,ht5,ht4,ht3,ht2,ht1,ht0,h10,h9,h8,h7,h6,h5,h4,h3,h2,h1,h0,fp10,fp9,fp8,fp7,fp6,fp5,fp4,fp3,fp2,fp1,fp0,fq10,fq9,fq8,fq7,fq6,fq5,fq4,fq3,fq2,fq1,fq0,e10,e9,e8,e7,e6,e5,e4,e3,e2,e1,e0; assign {g10,g9,g8,g7,g6,g5,g4,g3,g2,g1,g0}=g_poly; assign {r10,r9,r8,r7,r6,r5,r4,r3,r2,r1,r0}=r_poly; assign {f10,f9,f8,f7,f6,f5,f4,f3,f2,f1,f0}=f_poly; assign {m10,m9,m8,m7,m6,m5,m4,m3,m2,m1,m0}=msg; assign {c10,c9,c8,c7,c6,c5,c4,c3,c2,c1,c0}=dec_msg; assign {ct10,ct9,ct8,ct7,ct6,ct5,ct4,ct3,ct2,ct1,ct0}=dec_msg_reg; assign {fp10,fp9,fp8,fp7,fp6,fp5,fp4,fp3,fp2,fp1,fp0}=f_invp; assign {fq10,fq9,fq8,fq7,fq6,fq5,fq4,fq3,fq2,fq1,fq0}=f_invq; assign {h10,h9,h8,h7,h6,h5,h4,h3,h2,h1,h0}=key; assign {ht10,ht9,ht8,ht7,ht6,ht5,ht4,ht3,ht2,ht1,ht0}=key_reg; assign {e10,e9,e8,e7,e6,e5,e4,e3,e2,e1,e0}=enc_msg; initial $vcdpluson; // Clock generator initial begin clk = 0; #10 forever #5 clk = ~clk; end // Input polynomials and display commands
initial begin //g = -1 + X2 + X3 + X5 - X8 - X10 g_poly = {-2'd1,2'd0,-2'd1,2'd0,2'd0,2'd1,2'd0,2'd1,2'd1,2'd0,-2'd1}; //fq = 5 + 9X + 6X2 + 16X3 + 4X4 + 15X5 + 16X6 + 22X7 + 20X8 + 18X9 + 30X10 f_invq = {5'd30,5'd18,5'd20,5'd22,5'd16,5'd15,5'd4,5'd16,5'd6,5'd9,5'd5 } ; //fp = 1 + 2X + 2X3 + 2X4 + X5 + 2X7 + X8 + 2X9 f_invp = {5'd0,5'd2,5'd1,5'd2,5'd0,5'd1,5'd2,5'd2,5'd0,5'd2,5'd1}; //f = -1 + X + X2 - X4 + X6 + X9 - X10 f_poly = {-2'd1,2'd1,2'd0,2'd0,2'd1,2'd0,-2'd1,2'd0,2'd1,2'd1,-2'd1}; //r = -1 + X2 + X3 + X4 - X5 - X7. r_poly = {2'd0,2'd0,2'd0,-2'd1,2'd0,-2'd1,2'd1,2'd1,2'd1,2'd0,-2'd1}; //m = -1 + X3 - X4 - X8 + X9 + X10 msg = {2'd1,2'd1,-2'd1,2'd0,2'd0,2'd0,-2'd1,2'd1,2'd0,2'd0,-2'd1}; $display("####################################################################################################################################\n"); $display("############################################ NTRU PKCS for N=11 q=5 p=3 ############################################\n") ; $display(" Input polynomials, f= %d %d %d %d %d %d %d %d %d %d %d g= %d %d %d %d %d %d %d %d %d %d %d\n Inverse of f modp, Fp= %d %d %d %d %d %d %d %d %d %d %d Inverse of f modq, Fq= %d %d %d %d %d %d %d %d %d %d %d",f0,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,g0,g1,g2,g3,g4,g5,g6,g7,g8,g9,g10,fp0,fp1,fp2,fp3,fp4,fp5,fp6,fp7,fp8,fp9,fp10,fq0,fq1,fq2,fq3,fq4,fq5,fq6,fq7,fq8,fq9,fq10); $display(" Random polynomials, r = %d %d %d %d %d %d %d %d %d %d %d Message m= %d %d %d %d %d %d %d %d %d %d
%d\n",r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,m0,m1,m2,m3,m4,m5,m6,m7,m8,m9,m10); $display("####################################################################################################################################\n"); rst_key =1; #10 rst_key =0; #600 $finish; end // initial begin always @ (key) begin $display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d h= %d %d %d %d %d %d %d %d %d %d %d\n",$time,key_complete,encrypt_complete,decrypt_complete,h0,h1,h2,h3,h4,h5,h6,h7,h8,h9,h10); end always @ (posedge key_complete) begin key_reg <= key; end always @ (enc_msg) begin $display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d e= %d %d %d %d %d %d %d %d %d %d %d\n",$time,key_complete,encrypt_complete,decrypt_complete,e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,e10); end always @ (dec_msg) begin $display("time = %3d: key_complete=%d encrypt_complete=%d decrypt_complete= %d c= %d %d %d %d %d %d %d %d %d %d %d\n",$time,key_complete,encrypt_complete,decrypt_complete,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,c10); end
always @ (posedge decrypt_complete) begin dec_msg_reg <= dec_msg; if(msg==dec_msg) begin #30 $display("####################################################################################################################################\n"); $display("###################### Success!!! The decrypted message is the same as the original message sent ###################### \n"); $display("Public key h= %d %d %d %d %d %d %d %d %d %d %d\nMessage sent m= %d %d %d %d %d %d %d %d %d %d %d\nEncrypted message e= %d %d %d %d %d %d %d %d %d %d %d\nDecrypted message c = %d %d %d %d %d %d %d %d %d %d %d\n",ht0,ht1,ht2,ht3,ht4,ht5,ht6,ht7,ht8,ht9,ht10,m0,m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,ct0,ct1,ct2,ct3,ct4,ct5,ct6,ct7,ct8,ct9,ct10); $display("####################################################################################################################################"); $finish; end else begin #30 $display("####################################################################################################################################\n"); $display("###################### Fail!!! The decrypted message and the original message sent are different ###################### \n"); $display("Public key h= %d %d %d %d %d %d %d %d %d %d %d\nMessage sent m= %d %d %d %d %d %d %d %d %d %d %d\nEncrypted message, e=%d %d %d %d %d %d %d %d %d %d %d\nDecrypted message, c=%d %d %d %d %d %d %d %d %d %d %d\n",ht0,ht1,ht2,ht3,ht4,ht5,ht6,ht7,ht8,ht9,ht10,m0,m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,ct0,ct1,ct2,ct3,ct4,ct5,ct6,ct7,ct8,ct9,ct10); $display("####################################################################################################################################"); $finish;
end end endmodule ___________________________________________________________
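As a software cross-check of the N=11, q=32, p=3 test bench above, the same key generation, encryption and decryption steps can be run on plain integer lists. The Python sketch below uses the polynomial values given in the test bench comments (f, g, Fq, Fp, r, m, lowest degree first). It is a reference model under those inputs, not a description of the hardware timing or of the original authors' verification flow, and the helper names cyc_mul and center are assumptions.

```python
N, Q, P = 11, 32, 3

def cyc_mul(a, b, mod):
    """Cyclic product of a and b in Z[X]/(X^N - 1), coefficients mod `mod`."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = (i + j) % N
            out[k] = (out[k] + ai * bj) % mod
    return out

def center(poly, mod):
    """Map each coefficient in [0, mod) to its centered representative."""
    half = (mod + 1) // 2          # 16 for mod 32, 2 for mod 3
    return [c - mod if c >= half else c for c in poly]

# Polynomials from the test bench comments, index i = coefficient of X^i.
f  = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
g  = [-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1]
Fq = [5, 9, 6, 16, 4, 15, 16, 22, 20, 18, 30]
Fp = [1, 2, 0, 2, 2, 1, 0, 2, 1, 2, 0]
r  = [-1, 0, 1, 1, 1, -1, 0, -1, 0, 0, 0]
m  = [-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1]

# Key creation: h = p*Fq*g (mod q), matching const_mult then polynomial_mult.
h = cyc_mul([P * c for c in Fq], g, Q)

# Encryption: e = r*h + m (mod q).
e = [(x + y) % Q for x, y in zip(cyc_mul(r, h, Q), m)]

# Decryption: a = f*e (mod q), centered; b = a (mod p); c = Fp*b (mod p), centered.
a = center(cyc_mul(f, e, Q), Q)
b = [x % P for x in a]
c = center(cyc_mul(Fp, b, P), P)

print("round trip ok:", c == m)
```

A model like this can also be used to generate expected values of h and e for comparison with the $display output of the simulation.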
APPENDIX B
Synthesis Results
The model for N=107 was synthesized. The synthesis reports for the Key, Encrypt, and Decrypt blocks are given below.
B.1 NTRU_Key
Release 10.1 - xst K.31 (nt)
Copyright (c) 1995-2008 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to C:/Documents and Settings/kamatp/N112_synth_key/xst/projnav.tmp
Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.08 secs
--> Parameter xsthdpdir set to C:/Documents and Settings/kamatp/N112_synth_key/xst
Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.08 secs
--> Reading design: ntru_key.prj
TABLE OF CONTENTS
1) Synthesis Options Summary
2) HDL Compilation
3) Design Hierarchy Analysis
4) HDL Analysis
5) HDL Synthesis
5.1) HDL Synthesis Report
6) Advanced HDL Synthesis
6.1) Advanced HDL Synthesis Report
7) Low Level Synthesis
8) Partition Report
9) Final Report
=========================================================================
* Synthesis Options Summary *
=========================================================================
---- Source Parameters
Input File Name : "ntru_key.prj"
Input Format : mixed
Ignore Synthesis Constraint File : NO
---- Target Parameters
Output File Name : "ntru_key"
Output Format : NGC
Target Device : Automotive 9500XL
---- Source Options
Top Module Name : ntru_key
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
Mux Extraction : YES
Resource Sharing : YES
---- Target Options
Add IO Buffers : YES
MACRO Preserve : YES
XOR Preserve : YES
Equivalent register Removal : YES
---- General Options
Optimization Goal : Speed
Optimization Effort : 1
Library Search Order : ntru_key.lso
Keep Hierarchy : YES
Netlist Hierarchy : as_optimized
RTL Output : Yes
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Verilog 2001 : YES
---- Other Options
Clock Enable : YES
wysiwyg : NO
========================================================================= ========================================================================= * HDL Compilation * ========================================================================= Compiling verilog file "proc_unit.v" in library work Compiling verilog file "coeff.v" in library work Module <proc_unit> compiled Compiling verilog file "bit4_cnt.v" in library work Module <coeff> compiled Compiling verilog file "barrel_shift.v" in library work Module <bit4_cnt> compiled Compiling verilog file "polynomial_mult.v" in library work Module <barrel_shift> compiled Compiling verilog file "const_mult.v" in library work Module <polynomial_mult> compiled Compiling verilog file "ntru_key.v" in library work Module <const_mult> compiled Module <ntru_key> compiled No errors in compilation Analysis of file <"ntru_key.prj"> succeeded. ========================================================================= * Design Hierarchy Analysis * ========================================================================= Analyzing hierarchy for module <ntru_key> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" p = "11" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <const_mult> in library <work> with parameters.
N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" Analyzing hierarchy for module <polynomial_mult> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <bit4_cnt> in library <work>. Analyzing hierarchy for module <barrel_shift> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" Analyzing hierarchy for module <coeff> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <proc_unit> in library <work> with parameters. q = "00000000000000000000000000000110" ========================================================================= * HDL Analysis * ========================================================================= Analyzing top module <ntru_key>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 p = 2'b11 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <ntru_key> is correct for synthesis.
Analyzing module <const_mult> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 Module <const_mult> is correct for synthesis. Analyzing module <polynomial_mult> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <polynomial_mult> is correct for synthesis. Analyzing module <bit4_cnt> in library <work>. Module <bit4_cnt> is correct for synthesis. Analyzing module <barrel_shift> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 Module <barrel_shift> is correct for synthesis. Analyzing module <coeff> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <coeff> is correct for synthesis. Analyzing module <proc_unit> in library <work>. q = 32'sb00000000000000000000000000000110 Module <proc_unit> is correct for synthesis. ========================================================================= * HDL Synthesis * ========================================================================= Performing bidirectional port resolution... Synthesizing Unit <const_mult>.
Related source file is "const_mult.v". WARNING:Xst:643 - "const_mult.v" line 18: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 19: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 20: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 21: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 22: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 23: The result of a 6x2-bit multiplication is partially used. Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. WARNING:Xst:643 - "const_mult.v" line 24: The result of a 6x2-bit multiplication is partially used. 
Only the 6 least significant bits are used. If you are doing this on purpose, you may safely ignore this warning. Otherwise, make sure you are not losing information, leading to unexpected circuit behavior. Found 6x2-bit multiplier for signal <prod_11_6$mult0002> created at line 19.
Found 6x2-bit multiplier for signal <prod_17_12$mult0002> created at line 20. Found 6x2-bit multiplier for signal <prod_23_18$mult0002> created at line 21. Found 6x2-bit multiplier for signal <prod_29_24$mult0002> created at line 22. Found 6x2-bit multiplier for signal <prod_35_30$mult0002> created at line 23. Found 6x2-bit multiplier for signal <prod_41_36$mult0002> created at line 24. Found 6x2-bit multiplier for signal <prod_5_0$mult0002> created at line 18. Summary: inferred 7 Multiplier(s). Unit <const_mult> synthesized. Synthesizing Unit <bit4_cnt>. Related source file is "bit4_cnt.v". Found 4-bit up counter for signal <cnt4>. Summary: inferred 1 Counter(s). Unit <bit4_cnt> synthesized. Synthesizing Unit <barrel_shift>. Related source file is "barrel_shift.v". Found 42-bit 16-to-1 multiplexer for signal <shift_out>. Unit <barrel_shift> synthesized. Synthesizing Unit <proc_unit>. Related source file is "proc_unit.v". Found 6-bit adder for signal <out>. Found 6-bit adder for signal <out$addsub0000> created at line 28. Summary: inferred 2 Adder/Subtractor(s). Unit <proc_unit> synthesized. Synthesizing Unit <coeff>. Related source file is "coeff.v". Unit <coeff> synthesized.
Synthesizing Unit <polynomial_mult>.
    Related source file is "polynomial_mult.v".
    Found 42-bit register for signal <poly_prod>.
    Found 1-bit register for signal <poly_done_temp>.
    Found 42-bit adder for signal <poly_prod$addsub0000> created at line 42.
    Found 4x3-bit multiplier for signal <prod_temp$mult0000> created at line 29.
    Found 42-bit shifter logical left for signal <prod_temp$shift0000>.
    Summary:
        inferred 1 D-type flip-flop(s).
        inferred 1 Adder/Subtractor(s).
        inferred 1 Multiplier(s).
        inferred 1 Combinational logic shifter(s).
Unit <polynomial_mult> synthesized.

Synthesizing Unit <ntru_key>.
    Related source file is "ntru_key.v".
Unit <ntru_key> synthesized.

=========================================================================
HDL Synthesis Report

Macro Statistics
# Multipliers                      : 8
 4x3-bit multiplier                : 1
 6x2-bit multiplier                : 7
# Adders/Subtractors               : 15
 42-bit adder                      : 1
 6-bit adder                       : 14
# Counters                         : 1
 4-bit up counter                  : 1
# Registers                        : 2
 1-bit register                    : 1
 42-bit register                   : 1
# Multiplexers                     : 1
 42-bit 16-to-1 multiplexer        : 1
# Logic shifters                   : 1
 42-bit shifter logical left       : 1
=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================
WARNING:Xst:1426 - The value init of the FF/Latch poly_done_temp hinder the constant cleaning in the block polynomial_mult. You should achieve better results by setting this init to 1.

=========================================================================
Advanced HDL Synthesis Report

Macro Statistics
# Multipliers                      : 8
 4x3-bit multiplier                : 1
 6x2-bit multiplier                : 7
# Adders/Subtractors               : 15
 42-bit adder                      : 1
 6-bit adder                       : 14
# Counters                         : 1
 4-bit up counter                  : 1
# Registers                        : 43
 Flip-Flops                        : 43
# Multiplexers                     : 1
 42-bit 16-to-1 multiplexer        : 1
=========================================================================

=========================================================================
*                         Low Level Synthesis                           *
=========================================================================
Optimizing unit <ntru_key> ... Optimizing unit <barrel_shift> ... Optimizing unit <const_mult> ... Optimizing unit <bit4_cnt> ... implementation constraint: INIT=r : cnt4_3 implementation constraint: INIT=r : cnt4_2 implementation constraint: INIT=r : cnt4_1 implementation constraint: INIT=r : cnt4_0 Optimizing unit <proc_unit> ... Optimizing unit <coeff> ... Optimizing unit <polynomial_mult> ... implementation constraint: INIT=r : poly_prod_41 implementation constraint: INIT=r : poly_done_temp implementation constraint: INIT=r : poly_prod_0 implementation constraint: INIT=r : poly_prod_1 implementation constraint: INIT=r : poly_prod_2 implementation constraint: INIT=r : poly_prod_3 implementation constraint: INIT=r : poly_prod_4 implementation constraint: INIT=r : poly_prod_5 implementation constraint: INIT=r : poly_prod_6 implementation constraint: INIT=r : poly_prod_7 implementation constraint: INIT=r : poly_prod_8 implementation constraint: INIT=r : poly_prod_9 implementation constraint: INIT=r : poly_prod_10 implementation constraint: INIT=r : poly_prod_11 implementation constraint: INIT=r : poly_prod_12 implementation constraint: INIT=r : poly_prod_13 implementation constraint: INIT=r : poly_prod_14 implementation constraint: INIT=r : poly_prod_15 implementation constraint: INIT=r : poly_prod_16 implementation constraint: INIT=r : poly_prod_17 implementation constraint: INIT=r : poly_prod_18 implementation constraint: INIT=r : poly_prod_19 implementation constraint: INIT=r : poly_prod_20 implementation constraint: INIT=r : poly_prod_21 implementation constraint: INIT=r : poly_prod_22 implementation constraint: INIT=r : poly_prod_23 implementation constraint: INIT=r : poly_prod_24 implementation constraint: INIT=r : poly_prod_25
implementation constraint: INIT=r : poly_prod_26 implementation constraint: INIT=r : poly_prod_27 implementation constraint: INIT=r : poly_prod_28 implementation constraint: INIT=r : poly_prod_29 implementation constraint: INIT=r : poly_prod_30 implementation constraint: INIT=r : poly_prod_31 implementation constraint: INIT=r : poly_prod_32 implementation constraint: INIT=r : poly_prod_33 implementation constraint: INIT=r : poly_prod_34 implementation constraint: INIT=r : poly_prod_35 implementation constraint: INIT=r : poly_prod_36 implementation constraint: INIT=r : poly_prod_37 implementation constraint: INIT=r : poly_prod_38 implementation constraint: INIT=r : poly_prod_39 implementation constraint: INIT=r : poly_prod_40 ========================================================================= * Partition Report * ========================================================================= Partition Implementation Status ------------------------------- No Partitions were found in this design. ------------------------------- ========================================================================= * Final Report * ========================================================================= Final Results RTL Top Level Output File Name : ntru_key.ngr Top Level Output File Name : ntru_key Output Format : NGC Optimization Goal : Speed Keep Hierarchy : YES Target Technology : Automotive 9500XL Macro Preserve : YES XOR Preserve : YES
Clock Enable                       : YES
wysiwyg                            : NO

Design Statistics
# IOs                              : 101

Cell Usage :
# BELS                             : 1751
#      AND2                        : 870
#      AND3                        : 10
#      AND4                        : 22
#      AND6                        : 2
#      AND7                        : 2
#      GND                         : 1
#      INV                         : 267
#      OR2                         : 203
#      OR3                         : 15
#      OR7                         : 42
#      VCC                         : 1
#      XOR2                        : 316
# FlipFlops/Latches                : 47
#      FDC                         : 46
#      FDCE                        : 1
# IO Buffers                       : 101
#      IBUF                        : 58
#      OBUF                        : 43
=========================================================================

Total REAL time to Xst completion: 6.00 secs
Total CPU time to Xst completion: 6.22 secs

-->

Total memory usage is 144528 kilobytes

Number of errors   : 0 ( 0 filtered)
Number of warnings : 8 ( 0 filtered)
Number of infos    : 0 ( 0 filtered)
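Seven of the eight warnings in the NTRU_Key report are Xst:643, raised for lines 18 to 24 of const_mult.v: a 6x2-bit product is formed and only its 6 least significant bits are kept. Keeping the low 6 bits of a product is the same as reducing it modulo 2^6 = 64, so the truncation is harmless if that reduction is what the design intends. The sketch below shows one way to make such a truncation explicit in the source. It is not the project's const_mult.v: the module names, port names and test value are hypothetical, and the sketch makes no claim about whether XST would still report the warning for this coding style.

```verilog
// Hypothetical sketch: explicit truncation of a coefficient-by-p product.
module trunc_mult_sketch #(
    parameter Q = 6                 // coefficient width in bits (q = 6 in the report)
) (
    input  wire [Q-1:0] coeff,      // one polynomial coefficient
    input  wire [1:0]   p,          // small modulus (p = 2'b11 in the report)
    output wire [Q-1:0] prod        // product reduced modulo 2^Q
);
    // A Q-bit by 2-bit multiply needs Q+2 bits to hold every possible result.
    wire [Q+1:0] full = coeff * p;

    // Selecting the low Q bits discards multiples of 2^Q, i.e. reduces modulo 2^Q.
    assign prod = full[Q-1:0];
endmodule

// Small self-checking testbench for the sketch above.
module trunc_mult_sketch_tb;
    reg  [5:0] coeff;
    wire [5:0] prod;

    trunc_mult_sketch #(.Q(6)) dut (.coeff(coeff), .p(2'b11), .prod(prod));

    initial begin
        coeff = 6'd30;              // 30 * 3 = 90, and 90 mod 64 = 26
        #1;
        if (prod !== 6'd26) $display("mismatch: prod = %0d", prod);
        else                $display("ok: prod = %0d", prod);
        $finish;
    end
endmodule
```

Declaring the full-width intermediate and then slicing it documents in the code that the upper two bits are dropped on purpose, which is the case the warning text describes as safe to ignore.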
B.2 NTRU_Encryptor
Release 10.1 - xst K.31 (nt)
Copyright (c) 1995-2008 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to C:/Documents and Settings/kamatp/N112_synth_enc/xst/projnav.tmp

Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.08 secs

--> Parameter xsthdpdir set to C:/Documents and Settings/kamatp/N112_synth_enc/xst

Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.08 secs

--> Reading design: ntru_encryptor_blk.prj

TABLE OF CONTENTS
  1) Synthesis Options Summary
  2) HDL Compilation
  3) Design Hierarchy Analysis
  4) HDL Analysis
  5) HDL Synthesis
     5.1) HDL Synthesis Report
  6) Advanced HDL Synthesis
     6.1) Advanced HDL Synthesis Report
  7) Low Level Synthesis
  8) Partition Report
  9) Final Report

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : "ntru_encryptor_blk.prj"
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO

---- Target Parameters
Output File Name                   : "ntru_encryptor_blk"
Output Format                      : NGC
Target Device                      : Automotive 9500XL

---- Source Options
Top Module Name                    : ntru_encryptor_blk
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
Safe Implementation                : No
Mux Extraction                     : YES
Resource Sharing                   : YES

---- Target Options
Add IO Buffers                     : YES
MACRO Preserve                     : YES
XOR Preserve                       : YES
Equivalent register Removal        : YES

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Library Search Order               : ntru_encryptor_blk.lso
Keep Hierarchy                     : YES
Netlist Hierarchy                  : as_optimized
RTL Output                         : Yes
Hierarchy Separator                : /
Bus Delimiter                      : <>
Case Specifier                     : maintain
Verilog 2001                       : YES

---- Other Options
Clock Enable                       : YES
wysiwyg                            : NO
=========================================================================

=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling verilog file "proc_unit.v" in library work Compiling verilog file "coeff.v" in library work Module <proc_unit> compiled Compiling verilog file "bit4_cnt.v" in library work Module <coeff> compiled Compiling verilog file "barrel_shift.v" in library work Module <bit4_cnt> compiled Compiling verilog file "polynomial_mult.v" in library work Module <barrel_shift> compiled Compiling verilog file "ntru_encryptor.v" in library work Module <polynomial_mult> compiled Compiling verilog file "ntru_encryptor_blk.v" in library work Module <ntru_encryptor> compiled Module <ntru_encryptor_blk> compiled No errors in compilation Analysis of file <"ntru_encryptor_blk.prj"> succeeded. ========================================================================= * Design Hierarchy Analysis * ========================================================================= Analyzing hierarchy for module <ntru_encryptor_blk> in library <work> with parameters. N = "00000000000000000000000001110000" big_size = "00000000000000000000001010100000" q = "00000000000000000000000000000110" small_size = "00000000000000000000000011100000" Analyzing hierarchy for module <ntru_encryptor> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <polynomial_mult> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110"
Analyzing hierarchy for module <bit4_cnt> in library <work>. Analyzing hierarchy for module <barrel_shift> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" Analyzing hierarchy for module <coeff> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <proc_unit> in library <work> with parameters. q = "00000000000000000000000000000110" ========================================================================= * HDL Analysis * ========================================================================= Analyzing top module <ntru_encryptor_blk>. N = 32'sb00000000000000000000000001110000 big_size = 32'sb00000000000000000000001010100000 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000011100000 Module <ntru_encryptor_blk> is correct for synthesis. Analyzing module <ntru_encryptor> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <ntru_encryptor> is correct for synthesis. Analyzing module <polynomial_mult> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010
q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <polynomial_mult> is correct for synthesis. Analyzing module <bit4_cnt> in library <work>. Module <bit4_cnt> is correct for synthesis. Analyzing module <barrel_shift> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 Module <barrel_shift> is correct for synthesis. Analyzing module <coeff> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <coeff> is correct for synthesis. Analyzing module <proc_unit> in library <work>. q = 32'sb00000000000000000000000000000110 Module <proc_unit> is correct for synthesis. ========================================================================= * HDL Synthesis * ========================================================================= Performing bidirectional port resolution... Synthesizing Unit <bit4_cnt>. Related source file is "bit4_cnt.v". Found 4-bit up counter for signal <cnt4>. Summary: inferred 1 Counter(s). Unit <bit4_cnt> synthesized. Synthesizing Unit <barrel_shift>. Related source file is "barrel_shift.v".
Found 42-bit 16-to-1 multiplexer for signal <shift_out>. Unit <barrel_shift> synthesized. Synthesizing Unit <proc_unit>. Related source file is "proc_unit.v". Found 6-bit adder for signal <out>. Found 6-bit adder for signal <out$addsub0000> created at line 28. Summary: inferred 2 Adder/Subtractor(s). Unit <proc_unit> synthesized. Synthesizing Unit <coeff>. Related source file is "coeff.v". Unit <coeff> synthesized. Synthesizing Unit <polynomial_mult>. Related source file is "polynomial_mult.v". Found 42-bit register for signal <poly_prod>. Found 1-bit register for signal <poly_done_temp>. Found 42-bit adder for signal <poly_prod$addsub0000> created at line 42. Found 4x3-bit multiplier for signal <prod_temp$mult0000> created at line 29. Found 42-bit shifter logical left for signal <prod_temp$shift0000>. Summary: inferred 1 D-type flip-flop(s). inferred 1 Adder/Subtractor(s). inferred 1 Multiplier(s). inferred 1 Combinational logic shifter(s). Unit <polynomial_mult> synthesized. Synthesizing Unit <ntru_encryptor>. Related source file is "ntru_encryptor.v". Found 42-bit register for signal <enc_msg>. Found 1-bit register for signal <encrypt_done>. Found 6-bit adder for signal <$add0000> created at line 31.
Found 6-bit adder for signal <$add0001> created at line 36. Found 6-bit adder for signal <$add0002> created at line 41. Found 6-bit adder for signal <$add0003> created at line 46. Found 6-bit adder for signal <$add0004> created at line 51. Found 6-bit adder for signal <$add0005> created at line 56. Found 6-bit adder for signal <$add0006> created at line 61. Found 6-bit subtractor for signal <$sub0000> created at line 29. Found 6-bit subtractor for signal <$sub0001> created at line 34. Found 6-bit subtractor for signal <$sub0002> created at line 39. Found 6-bit subtractor for signal <$sub0003> created at line 44. Found 6-bit subtractor for signal <$sub0004> created at line 49. Found 6-bit subtractor for signal <$sub0005> created at line 54. Found 6-bit subtractor for signal <$sub0006> created at line 59. Summary: inferred 43 D-type flip-flop(s). inferred 14 Adder/Subtractor(s). Unit <ntru_encryptor> synthesized. Synthesizing Unit <ntru_encryptor_blk>. Related source file is "ntru_encryptor_blk.v". WARNING:Xst:647 - Input <msg<223:214>> is never used. This port will be preserved and left unconnected if it belongs to a top-level block or it belongs to a sub-block and the hierarchy of this sub-block is preserved. WARNING:Xst:647 - Input <r_poly<223:214>> is never used. This port will be preserved and left unconnected if it belongs to a top-level block or it belongs to a sub-block and the hierarchy of this sub-block is preserved. WARNING:Xst:646 - Signal <d_9> is assigned but never used. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:646 - Signal <d_8> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_7> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_6> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_5> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_4> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_3> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_2> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_15> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_14> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_13> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_12> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_11> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_10> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_1> is assigned but never used. This unconnected signal will be trimmed during the optimization process. Unit <ntru_encryptor_blk> synthesized.
=========================================================================
HDL Synthesis Report

Macro Statistics
# Multipliers                      : 16
 4x3-bit multiplier                : 16
# Adders/Subtractors               : 464
 42-bit adder                      : 16
 6-bit adder                       : 336
 6-bit subtractor                  : 112
# Counters                         : 16
 4-bit up counter                  : 16
# Registers                        : 720
 1-bit register                    : 704
 42-bit register                   : 16
# Multiplexers                     : 16
 42-bit 16-to-1 multiplexer        : 16
# Logic shifters                   : 16
 42-bit shifter logical left       : 16
=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================
WARNING:Xst:1426 - The value init of the FF/Latch poly_done_temp hinder the constant cleaning in the block polynomial_mult. You should achieve better results by setting this init to 1.
WARNING:Xst:1426 - The value init of the FF/Latch encrypt_done hinder the constant cleaning in the block ntru_encryptor.
You should achieve better results by setting this init to 1. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc1>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc2>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc3>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc4>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc5>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc6>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc7>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc8>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc9>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc10>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc11>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc12>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc13>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc14>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc15>. ========================================================================= Advanced HDL Synthesis Report Macro Statistics # Multipliers : 16 4x3-bit multiplier : 16 # Adders/Subtractors : 464 42-bit adder : 16 6-bit adder : 336
6-bit subtractor : 112 # Counters : 16 4-bit up counter : 16 # Registers : 1376 Flip-Flops : 1376 # Multiplexers : 16 42-bit 16-to-1 multiplexer : 16 ========================================================================= ========================================================================= * Low Level Synthesis * ========================================================================= Optimizing unit <ntru_encryptor_blk> ... Optimizing unit <barrel_shift> ... Optimizing unit <bit4_cnt> ... implementation constraint: INIT=r : cnt4_3 implementation constraint: INIT=r : cnt4_2 implementation constraint: INIT=r : cnt4_1 implementation constraint: INIT=r : cnt4_0 Optimizing unit <proc_unit> ... Optimizing unit <coeff> ... Optimizing unit <polynomial_mult> ... implementation constraint: INIT=r : poly_prod_41 implementation constraint: INIT=r : poly_done_temp implementation constraint: INIT=r : poly_prod_0 implementation constraint: INIT=r : poly_prod_1 implementation constraint: INIT=r : poly_prod_2 implementation constraint: INIT=r : poly_prod_3 implementation constraint: INIT=r : poly_prod_4 implementation constraint: INIT=r : poly_prod_5 implementation constraint: INIT=r : poly_prod_6
implementation constraint: INIT=r : poly_prod_7 implementation constraint: INIT=r : poly_prod_8 implementation constraint: INIT=r : poly_prod_9 implementation constraint: INIT=r : poly_prod_10 implementation constraint: INIT=r : poly_prod_11 implementation constraint: INIT=r : poly_prod_12 implementation constraint: INIT=r : poly_prod_13 implementation constraint: INIT=r : poly_prod_14 implementation constraint: INIT=r : poly_prod_15 implementation constraint: INIT=r : poly_prod_16 implementation constraint: INIT=r : poly_prod_17 implementation constraint: INIT=r : poly_prod_18 implementation constraint: INIT=r : poly_prod_19 implementation constraint: INIT=r : poly_prod_20 implementation constraint: INIT=r : poly_prod_21 implementation constraint: INIT=r : poly_prod_22 implementation constraint: INIT=r : poly_prod_23 implementation constraint: INIT=r : poly_prod_24 implementation constraint: INIT=r : poly_prod_25 implementation constraint: INIT=r : poly_prod_26 implementation constraint: INIT=r : poly_prod_27 implementation constraint: INIT=r : poly_prod_28 implementation constraint: INIT=r : poly_prod_29 implementation constraint: INIT=r : poly_prod_30 implementation constraint: INIT=r : poly_prod_31 implementation constraint: INIT=r : poly_prod_32 implementation constraint: INIT=r : poly_prod_33 implementation constraint: INIT=r : poly_prod_34 implementation constraint: INIT=r : poly_prod_35 implementation constraint: INIT=r : poly_prod_36 implementation constraint: INIT=r : poly_prod_37 implementation constraint: INIT=r : poly_prod_38 implementation constraint: INIT=r : poly_prod_39 implementation constraint: INIT=r : poly_prod_40 Optimizing unit <ntru_encryptor> ... 
implementation constraint: INIT=r : enc_msg_5 implementation constraint: INIT=r : enc_msg_4 implementation constraint: INIT=r : enc_msg_3 implementation constraint: INIT=r : enc_msg_2 implementation constraint: INIT=r : enc_msg_1 implementation constraint: INIT=r : enc_msg_0 implementation constraint: INIT=r : enc_msg_11 implementation constraint: INIT=r : enc_msg_10 implementation constraint: INIT=r : enc_msg_9
implementation constraint: INIT=r : enc_msg_8 implementation constraint: INIT=r : enc_msg_7 implementation constraint: INIT=r : enc_msg_6 implementation constraint: INIT=r : enc_msg_17 implementation constraint: INIT=r : enc_msg_16 implementation constraint: INIT=r : enc_msg_15 implementation constraint: INIT=r : enc_msg_14 implementation constraint: INIT=r : enc_msg_13 implementation constraint: INIT=r : enc_msg_12 implementation constraint: INIT=r : enc_msg_23 implementation constraint: INIT=r : enc_msg_22 implementation constraint: INIT=r : enc_msg_21 implementation constraint: INIT=r : enc_msg_20 implementation constraint: INIT=r : enc_msg_19 implementation constraint: INIT=r : enc_msg_18 implementation constraint: INIT=r : enc_msg_29 implementation constraint: INIT=r : enc_msg_28 implementation constraint: INIT=r : enc_msg_27 implementation constraint: INIT=r : enc_msg_26 implementation constraint: INIT=r : enc_msg_25 implementation constraint: INIT=r : enc_msg_24 implementation constraint: INIT=r : enc_msg_35 implementation constraint: INIT=r : enc_msg_34 implementation constraint: INIT=r : enc_msg_33 implementation constraint: INIT=r : enc_msg_32 implementation constraint: INIT=r : enc_msg_31 implementation constraint: INIT=r : enc_msg_30 implementation constraint: INIT=r : enc_msg_41 implementation constraint: INIT=r : enc_msg_40 implementation constraint: INIT=r : enc_msg_39 implementation constraint: INIT=r : enc_msg_38 implementation constraint: INIT=r : enc_msg_37 implementation constraint: INIT=r : enc_msg_36 implementation constraint: INIT=r : encrypt_done WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc15>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc14>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc13>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc12>. 
WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc11>.
WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc10>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc9>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc8>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc7>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc6>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc5>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc4>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc3>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc2>. WARNING:Xst:2677 - Node <encrypt_done> of sequential type is unconnected in block <enc1>. ========================================================================= * Partition Report * ========================================================================= Partition Implementation Status ------------------------------- No Partitions were found in this design. ------------------------------- ========================================================================= * Final Report * ========================================================================= Final Results RTL Top Level Output File Name : ntru_encryptor_blk.ngr Top Level Output File Name : ntru_encryptor_blk Output Format : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : YES
Target Technology                  : Automotive 9500XL
Macro Preserve                     : YES
XOR Preserve                       : YES
Clock Enable                       : YES
wysiwyg                            : NO

Design Statistics
# IOs                              : 1165

Cell Usage :
# BELS                             : 18000
#      AND2                        : 6688
#      AND3                        : 160
#      AND4                        : 352
#      AND6                        : 32
#      AND7                        : 32
#      GND                         : 16
#      INV                         : 4096
#      OR2                         : 2064
#      OR3                         : 16
#      OR7                         : 192
#      VCC                         : 32
#      XOR2                        : 4320
# FlipFlops/Latches                : 1440
#      FD                          : 688
#      FDC                         : 736
#      FDCE                        : 16
# IO Buffers                       : 1145
#      IBUF                        : 472
#      OBUF                        : 673
=========================================================================

Total REAL time to Xst completion: 39.00 secs
Total CPU time to Xst completion: 38.78 secs
-->
Total memory usage is 245584 kilobytes

Number of errors   : 0 ( 0 filtered)
Number of warnings : 49 ( 0 filtered)
Number of infos    : 0 ( 0 filtered)
B.3 NTRU_DECRYPTOR:
Release 10.1 - xst K.31 (nt)
Copyright (c) 1995-2008 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to C:/Documents and Settings/kamatp/N112_synth_dec/xst/projnav.tmp

Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.06 secs

--> Parameter xsthdpdir set to C:/Documents and Settings/kamatp/N112_synth_dec/xst

Total REAL time to Xst completion: 0.00 secs
Total CPU time to Xst completion: 0.06 secs

--> Reading design: ntru_decryptor_blk.prj

TABLE OF CONTENTS
  1) Synthesis Options Summary
  2) HDL Compilation
  3) Design Hierarchy Analysis
  4) HDL Analysis
  5) HDL Synthesis
     5.1) HDL Synthesis Report
  6) Advanced HDL Synthesis
     6.1) Advanced HDL Synthesis Report
  7) Low Level Synthesis
  8) Partition Report
  9) Final Report

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : "ntru_decryptor_blk.prj"
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO

---- Target Parameters
Output File Name                   : "ntru_decryptor_blk"
Output Format                      : NGC
Target Device                      : Automotive 9500XL

---- Source Options
Top Module Name                    : ntru_decryptor_blk
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
Safe Implementation                : No
Mux Extraction                     : YES
Resource Sharing                   : YES

---- Target Options
Add IO Buffers                     : YES
MACRO Preserve                     : YES
XOR Preserve                       : YES
Equivalent register Removal        : YES

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Library Search Order               : ntru_decryptor_blk.lso
Keep Hierarchy                     : YES
Netlist Hierarchy                  : as_optimized
RTL Output                         : Yes
Hierarchy Separator                : /
Bus Delimiter                      : <>
Case Specifier                     : maintain
Verilog 2001                       : YES

---- Other Options
Clock Enable                       : YES
wysiwyg                            : NO

=========================================================================

=========================================================================
* HDL Compilation * ========================================================================= Compiling verilog file "proc_unit.v" in library work Compiling verilog file "coeff.v" in library work Compiling verilog include file "proc_unit.v" Module <proc_unit> compiled Compiling verilog file "bit4_cnt.v" in library work Module <coeff> compiled Compiling verilog file "barrel_shift.v" in library work Module <bit4_cnt> compiled Compiling verilog file "polynomial_mult.v" in library work Module <barrel_shift> compiled Compiling verilog file "mult_mod.v" in library work Module <polynomial_mult> compiled Compiling verilog file "ntru_decryptor.v" in library work Module <mult_mod> compiled Compiling verilog file "ntru_decryptor_blk.v" in library work Module <ntru_decryptor> compiled Module <ntru_decryptor_blk> compiled No errors in compilation Analysis of file <"ntru_decryptor_blk.prj"> succeeded. ========================================================================= * Design Hierarchy Analysis * ========================================================================= Analyzing hierarchy for module <ntru_decryptor_blk> in library <work> with parameters. N = "00000000000000000000000001110000" big_size = "00000000000000000000001010100000" q = "00000000000000000000000000000110" small_size = "00000000000000000000000011100000" Analyzing hierarchy for module <ntru_decryptor> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110"
Analyzing hierarchy for module <mult_mod> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <polynomial_mult> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <bit4_cnt> in library <work>. Analyzing hierarchy for module <barrel_shift> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" Analyzing hierarchy for module <coeff> in library <work> with parameters. N = "00000000000000000000000000000111" big_size = "00000000000000000000000000101010" q = "00000000000000000000000000000110" small_size = "00000000000000000000000000001110" Analyzing hierarchy for module <proc_unit> in library <work> with parameters. q = "00000000000000000000000000000110" ========================================================================= * HDL Analysis * ========================================================================= Analyzing top module <ntru_decryptor_blk>. N = 32'sb00000000000000000000000001110000 big_size = 32'sb00000000000000000000001010100000
q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000011100000 Module <ntru_decryptor_blk> is correct for synthesis. Analyzing module <ntru_decryptor> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 WARNING:Xst:905 - "ntru_decryptor.v" line 28: One or more signals are missing in the sensitivity list of always block. To enable synthesis of FPGA/CPLD hardware, XST will assume that all necessary signals are present in the sensitivity list. Please note that the result of the synthesis may differ from the initial design specification. The missing signals are: <dec_b> Module <ntru_decryptor> is correct for synthesis. Analyzing module <mult_mod> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Calling function <mod3>. Calling function <mod3>. Calling function <mod3>. Calling function <mod3>. Calling function <mod3>. Calling function <mod3>. Calling function <mod3>. Module <mult_mod> is correct for synthesis. Analyzing module <polynomial_mult> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <polynomial_mult> is correct for synthesis. Analyzing module <bit4_cnt> in library <work>. Module <bit4_cnt> is correct for synthesis. Analyzing module <barrel_shift> in library <work>. N = 32'sb00000000000000000000000000000111
big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 Module <barrel_shift> is correct for synthesis. Analyzing module <coeff> in library <work>. N = 32'sb00000000000000000000000000000111 big_size = 32'sb00000000000000000000000000101010 q = 32'sb00000000000000000000000000000110 small_size = 32'sb00000000000000000000000000001110 Module <coeff> is correct for synthesis. Analyzing module <proc_unit> in library <work>. q = 32'sb00000000000000000000000000000110 Module <proc_unit> is correct for synthesis. ========================================================================= * HDL Synthesis * ========================================================================= Performing bidirectional port resolution... Synthesizing Unit <bit4_cnt>. Related source file is "bit4_cnt.v". Found 4-bit up counter for signal <cnt4>. Summary: inferred 1 Counter(s). Unit <bit4_cnt> synthesized. Synthesizing Unit <barrel_shift>. Related source file is "barrel_shift.v". Found 42-bit 16-to-1 multiplexer for signal <shift_out>. Unit <barrel_shift> synthesized. Synthesizing Unit <proc_unit>. Related source file is "proc_unit.v". Found 6-bit adder for signal <out>. Found 6-bit adder for signal <out$addsub0000> created at line 28.
Summary: inferred 2 Adder/Subtractor(s). Unit <proc_unit> synthesized. Synthesizing Unit <coeff>. Related source file is "coeff.v". Unit <coeff> synthesized. Synthesizing Unit <polynomial_mult>. Related source file is "polynomial_mult.v". Found 42-bit register for signal <poly_prod>. Found 1-bit register for signal <poly_done_temp>. Found 42-bit adder for signal <poly_prod$addsub0000> created at line 42. Found 4x3-bit multiplier for signal <prod_temp$mult0000> created at line 29. Found 42-bit shifter logical left for signal <prod_temp$shift0000>. Summary: inferred 1 D-type flip-flop(s). inferred 1 Adder/Subtractor(s). inferred 1 Multiplier(s). inferred 1 Combinational logic shifter(s). Unit <polynomial_mult> synthesized. Synthesizing Unit <mult_mod>. Related source file is "mult_mod.v". WARNING:Xst:646 - Signal <mod3/7/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <mod3/6/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <mod3/5/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <mod3/4/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <mod3/3/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:646 - Signal <mod3/2/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <mod3/1/mod3<5:2>> is assigned but never used. This unconnected signal will be trimmed during the optimization process. Found 64x6-bit ROM for signal <mod3/1/mod3>. Found 64x6-bit ROM for signal <mod3/2/mod3>. Found 64x6-bit ROM for signal <mod3/3/mod3>. Found 64x6-bit ROM for signal <mod3/4/mod3>. Found 64x6-bit ROM for signal <mod3/5/mod3>. Found 64x6-bit ROM for signal <mod3/6/mod3>. Found 64x6-bit ROM for signal <mod3/7/mod3>. Summary: inferred 7 ROM(s). Unit <mult_mod> synthesized. Synthesizing Unit <ntru_decryptor>. Related source file is "ntru_decryptor.v". Unit <ntru_decryptor> synthesized. Synthesizing Unit <ntru_decryptor_blk>. Related source file is "ntru_decryptor_blk.v". WARNING:Xst:646 - Signal <d_9> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_8> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_7> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_6> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_5> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_4> is assigned but never used. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:646 - Signal <d_3> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_2> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_15> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_14> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_13> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_12> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_11> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_10> is assigned but never used. This unconnected signal will be trimmed during the optimization process. WARNING:Xst:646 - Signal <d_1> is assigned but never used. This unconnected signal will be trimmed during the optimization process. Unit <ntru_decryptor_blk> synthesized. ========================================================================= HDL Synthesis Report Macro Statistics # ROMs : 224 64x6-bit ROM : 224 # Multipliers : 32 4x3-bit multiplier : 32 # Adders/Subtractors : 480 42-bit adder : 32
6-bit adder : 448 # Counters : 32 4-bit up counter : 32 # Registers : 64 1-bit register : 32 42-bit register : 32 # Multiplexers : 32 42-bit 16-to-1 multiplexer : 32 # Logic shifters : 32 42-bit shifter logical left : 32 ========================================================================= ========================================================================= * Advanced HDL Synthesis * ========================================================================= WARNING:Xst:1426 - The value init of the FF/Latch poly_done_temp hinder the constant cleaning in the block polynomial_mult. You should achieve better results by setting this init to 1. ========================================================================= Advanced HDL Synthesis Report Macro Statistics # ROMs : 224 64x6-bit ROM : 224 # Multipliers : 32 4x3-bit multiplier : 32 # Adders/Subtractors : 480 42-bit adder : 32 6-bit adder : 448 # Counters : 32
4-bit up counter : 32 # Registers : 1376 Flip-Flops : 1376 # Multiplexers : 32 42-bit 16-to-1 multiplexer : 32 ========================================================================= ========================================================================= * Low Level Synthesis * ========================================================================= Optimizing unit <ntru_decryptor_blk> ... Optimizing unit <barrel_shift> ... Optimizing unit <bit4_cnt> ... implementation constraint: INIT=r : cnt4_3 implementation constraint: INIT=r : cnt4_2 implementation constraint: INIT=r : cnt4_1 implementation constraint: INIT=r : cnt4_0 Optimizing unit <proc_unit> ... Optimizing unit <coeff> ... Optimizing unit <polynomial_mult> ... implementation constraint: INIT=r : poly_prod_41 implementation constraint: INIT=r : poly_done_temp implementation constraint: INIT=r : poly_prod_0 implementation constraint: INIT=r : poly_prod_1 implementation constraint: INIT=r : poly_prod_2 implementation constraint: INIT=r : poly_prod_3 implementation constraint: INIT=r : poly_prod_4 implementation constraint: INIT=r : poly_prod_5 implementation constraint: INIT=r : poly_prod_6 implementation constraint: INIT=r : poly_prod_7 implementation constraint: INIT=r : poly_prod_8 implementation constraint: INIT=r : poly_prod_9
implementation constraint: INIT=r : poly_prod_10 implementation constraint: INIT=r : poly_prod_11 implementation constraint: INIT=r : poly_prod_12 implementation constraint: INIT=r : poly_prod_13 implementation constraint: INIT=r : poly_prod_14 implementation constraint: INIT=r : poly_prod_15 implementation constraint: INIT=r : poly_prod_16 implementation constraint: INIT=r : poly_prod_17 implementation constraint: INIT=r : poly_prod_18 implementation constraint: INIT=r : poly_prod_19 implementation constraint: INIT=r : poly_prod_20 implementation constraint: INIT=r : poly_prod_21 implementation constraint: INIT=r : poly_prod_22 implementation constraint: INIT=r : poly_prod_23 implementation constraint: INIT=r : poly_prod_24 implementation constraint: INIT=r : poly_prod_25 implementation constraint: INIT=r : poly_prod_26 implementation constraint: INIT=r : poly_prod_27 implementation constraint: INIT=r : poly_prod_28 implementation constraint: INIT=r : poly_prod_29 implementation constraint: INIT=r : poly_prod_30 implementation constraint: INIT=r : poly_prod_31 implementation constraint: INIT=r : poly_prod_32 implementation constraint: INIT=r : poly_prod_33 implementation constraint: INIT=r : poly_prod_34 implementation constraint: INIT=r : poly_prod_35 implementation constraint: INIT=r : poly_prod_36 implementation constraint: INIT=r : poly_prod_37 implementation constraint: INIT=r : poly_prod_38 implementation constraint: INIT=r : poly_prod_39 implementation constraint: INIT=r : poly_prod_40 Optimizing unit <mult_mod> ... Optimizing unit <ntru_decryptor> ... ========================================================================= * Partition Report * ========================================================================= Partition Implementation Status -------------------------------
No Partitions were found in this design.

-------------------------------

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : ntru_decryptor_blk.ngr
Top Level Output File Name         : ntru_decryptor_blk
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : YES
Target Technology                  : Automotive 9500XL
Macro Preserve                     : YES
XOR Preserve                       : YES
Clock Enable                       : YES
wysiwyg                            : NO

Design Statistics
# IOs                              : 955

Cell Usage :
# BELS                             : 82560
#      AND2                        : 36256
#      AND3                        : 3264
#      AND4                        : 736
#      AND6                        : 64
#      AND7                        : 64
#      GND                         : 32
#      INV                         : 19632
#      OR2                         : 10352
#      OR3                         : 1152
#      OR7                         : 1344
#      VCC                         : 32
#      XOR2                        : 9632
# FlipFlops/Latches                : 1504
#      FDC                         : 1472
#      FDCE                        : 32
# IO Buffers                       : 955
#      IBUF                        : 730
#      OBUF                        : 225
=========================================================================

Total REAL time to Xst completion: 76.00 secs
Total CPU time to Xst completion: 76.10 secs
-->
Total memory usage is 381072 kilobytes

Number of errors   : 0 ( 0 filtered)
Number of warnings : 24 ( 0 filtered)
Number of infos    : 0 ( 0 filtered)
APPENDIX C
The NTRU Public Key Cryptosystem (PKCS)
C.1 NTRU PKCS Parameters [3, 6]
The basic collection of objects used by the NTRU PKCS is the ring R that
consists of all truncated polynomials of degree at most N-1 having integer coefficients:
a = a_0 + a_1 X + a_2 X^2 + a_3 X^3 + ... + a_{N-2} X^{N-2} + a_{N-1} X^{N-1}.
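The arithmetic used in the rest of this appendix can be sketched in a few lines of Python. The sketch assumes the standard NTRU convention, which this section does not spell out, that the product a*b in R is the cyclic convolution of the coefficient lists (X^N is replaced by 1). The name poly_mul and the list-of-coefficients representation are illustrative choices and are not taken from the project's Verilog design.

```python
def poly_mul(a, b, N, modulus=None):
    """Multiply two truncated polynomials in R = Z[X]/(X^N - 1).

    a and b are coefficient lists [a_0, a_1, ..., a_{N-1}].
    Assumption: X^N wraps around to 1 (cyclic convolution).
    """
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] += ai * bj
    if modulus is not None:
        # Reduce every coefficient into the range 0 .. modulus-1.
        c = [x % modulus for x in c]
    return c


# Example with N = 3: (1 + X) * (1 + X^2) = 1 + X + X^2 + X^3 = 2 + X + X^2
print(poly_mul([1, 1, 0], [1, 0, 1], N=3))
```

Passing modulus=q or modulus=p gives the "modulo q" and "modulo p" products used in key creation, encryption, and decryption.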
An implementation of the NTRU Public Key Cryptosystem is specified by the following
parameters [3].
N - The polynomials in the truncated polynomial ring have degree at most N-1.
q - Large modulus: usually, the coefficients of the truncated polynomials will be reduced
mod q.
p - Small modulus: as the final step in decryption, the coefficients of the message are
reduced mod p.
In order to ensure security, it is essential that p and q have no common factors. The
following table gives some possible values for NTRU parameters at various security
levels [3].
Table 4: NTRU Security Parameters [3,6]
Next, the processes of key creation, encryption, and decryption in NTRU are explained with the
classic “Bob-Alice” example, using smaller parameter values [3]:
Table 5: Small Security Parameters [3]
C.2 Key Creation [3, 6]
• Overview :
Bob wants to create a public/private key pair for the NTRU PKCS. He first randomly
chooses two “small” polynomials f and g in the ring of truncated polynomials R. Bob
must keep the values of the polynomials f and g private, since anyone who knows the
value of either one of them will be able to decrypt messages sent to Bob. [3]
Bob's next step is to compute the inverse of f modulo q and the inverse of f
modulo p. Thus he computes polynomials fq and fp with the property that
f*fq = 1 (modulo q) and f*fp = 1 (modulo p).
(If by some chance these inverses do not exist, Bob will need to go back and choose
another f.) Now Bob computes the product [3]
h = p*fq*g (modulo q)
Bob's private key is the pair of polynomials f and fp. Bob's public key is the polynomial h
[3].
• Example :
For this example, the parameters are N = 11, q = 32, and p = 3.
We also need to define a "small" polynomial more precisely. For the purposes of this
example, we do this using the quantities df and dg. [3]
The polynomial f has df coefficients equal to +1, (df - 1) coefficients equal to -1, and the
rest equal to 0. The polynomial g has dg coefficients equal to +1, dg coefficients equal to
-1, and the rest equal to 0 [3].
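(The reason for the slight difference in form between f and g is that f has to be invertible,
while g does not: a polynomial with equally many +1 and -1 coefficients evaluates to 0 at
X = 1 and so can never be invertible in R.)
For the purposes of this example, we take df = 4 and dg = 3.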
So Bob needs to choose a polynomial f of degree 10 with four 1's and three -1's,
and he needs to choose a polynomial g of degree 10 with three 1's and three -1's. Suppose
he chooses:
f = -1 + X + X^2 - X^4 + X^6 + X^9 - X^10
g = -1 + X^2 + X^3 + X^5 - X^8 - X^10
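A random choice of polynomials with this shape can be sketched as follows. The helper name sample_small is hypothetical, and shuffling a fixed multiset of coefficients is only one simple way to obtain the required counts; it is not the method used in the project or prescribed by [3]. A candidate f drawn this way still has to be checked for invertibility modulo p and modulo q.

```python
import random


def sample_small(N, num_plus, num_minus, rng=None):
    """Return N coefficients with num_plus +1's, num_minus -1's, and the rest 0."""
    # Assumption: random.SystemRandom (OS entropy) is an acceptable source here.
    rng = rng or random.SystemRandom()
    coeffs = [1] * num_plus + [-1] * num_minus + [0] * (N - num_plus - num_minus)
    rng.shuffle(coeffs)  # random positions for the nonzero coefficients
    return coeffs


N, df, dg = 11, 4, 3
f = sample_small(N, df, df - 1)  # f: df ones and (df - 1) minus ones
g = sample_small(N, dg, dg)      # g: dg ones and dg minus ones
print(f)
print(g)
```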
Next Bob computes the inverse fp of f modulo p and the inverse fq of f modulo q. He
finds that:
fp = 1 + 2X + 2X^3 + 2X^4 + X^5 + 2X^7 + X^8 + 2X^9
fq = 5 + 9X + 6X^2 + 16X^3 + 4X^4 + 15X^5 + 16X^6 + 22X^7 + 20X^8 + 18X^9 + 30X^10
The final step in key creation is to compute the product
h = p*fq*g = 8 + 25X + 22X^2 + 20X^3 + 12X^4 + 24X^5 + 15X^6 + 19X^7 + 12X^8 + 19X^9 +
16X^10 (modulo 32) [3].
Bob's private key is the pair of polynomials f and fp, and his public key is the polynomial
h [3].
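The key-creation example can be checked numerically with a short Python sketch. It assumes that * is cyclic convolution in the ring R (X^11 = 1), and it stores each polynomial as a list of its coefficients from the constant term up to X^10. The helper poly_mul is illustrative and is not part of the project's hardware design.

```python
N, q, p = 11, 32, 3


def poly_mul(a, b, modulus):
    """Cyclic convolution in Z[X]/(X^N - 1), coefficients reduced to 0..modulus-1."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] += ai * bj
    return [x % modulus for x in c]


# Coefficient lists [c_0, ..., c_10] copied from the example above.
f = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
g = [-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1]
fp = [1, 2, 0, 2, 2, 1, 0, 2, 1, 2, 0]
fq = [5, 9, 6, 16, 4, 15, 16, 22, 20, 18, 30]

one = [1] + [0] * (N - 1)
assert poly_mul(f, fq, q) == one  # f*fq = 1 (modulo q)
assert poly_mul(f, fp, p) == one  # f*fp = 1 (modulo p)

# Public key h = p*fq*g (modulo q)
h = [(p * x) % q for x in poly_mul(fq, g, q)]
print(h)  # [8, 25, 22, 20, 12, 24, 15, 19, 12, 19, 16]
```

The printed list matches the coefficients of h given in the example.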
C.3 Encryption [3, 6]
• Overview :
Alice wants to send a message to Bob using Bob's public key h. She first puts her
message in the form of a polynomial m whose coefficients are chosen modulo p (in other
words, m is a “small” polynomial mod p). Next she randomly chooses another “small”
polynomial r, which is used to obscure the message [3].
Alice uses the message m, her randomly chosen polynomial r, and Bob's public key h to
compute the polynomial
e = r*h + m (modulo q)
The polynomial e is the encrypted message which Alice sends to Bob [3].
• Example :
As before, we need to specify what we mean by saying that r is a "small" polynomial.
We do this using the quantity dr: r has dr coefficients equal to +1, dr coefficients equal
to -1, and the rest equal to 0 [3].
Here, we take dr = 3.
Now, suppose Alice wants to send the message
m = -1 + X^3 - X^4 - X^8 + X^9 + X^10
to Bob using Bob's public key
h = 8 + 25X + 22X^2 + 20X^3 + 12X^4 + 24X^5 + 15X^6 + 19X^7 + 12X^8 + 19X^9 + 16X^10.
She first chooses a random polynomial r of degree 10 with three 1's and three -1's. Say
she chooses r = -1 + X^2 + X^3 + X^4 - X^5 - X^7.
Then her encrypted message e is
e = r*h + m = 14 + 11X + 26X^2 + 24X^3 + 14X^4 + 16X^5 + 30X^6 + 7X^7 + 25X^8 + 6X^9 +
19X^10 (modulo 32) [3].
Alice sends this encrypted message e to Bob [3].
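The encryption step can be reproduced with the following Python sketch. As an assumption, * is taken to be cyclic convolution in R (X^11 = 1), and polynomials are stored as coefficient lists from the constant term up to X^10. The helper name poly_mul is illustrative only.

```python
N, q = 11, 32


def poly_mul(a, b, modulus):
    """Cyclic convolution in Z[X]/(X^N - 1), coefficients reduced to 0..modulus-1."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] += ai * bj
    return [x % modulus for x in c]


# Values copied from the example: Bob's public key h, Alice's message m and her random r.
h = [8, 25, 22, 20, 12, 24, 15, 19, 12, 19, 16]
m = [-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1]
r = [-1, 0, 1, 1, 1, -1, 0, -1, 0, 0, 0]

# e = r*h + m (modulo q)
rh = poly_mul(r, h, q)
e = [(x + y) % q for x, y in zip(rh, m)]
print(e)  # [14, 11, 26, 24, 14, 16, 30, 7, 25, 6, 19]
```

The result agrees coefficient by coefficient with the encrypted message e in the example.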
C.4 Decryption [3, 6]
• Overview :
Now Bob has received Alice's encrypted message e and he wants to decrypt it. He begins
by using his private polynomial f to compute the polynomial [3].
a = f*e (modulo q)
Since Bob is computing a modulo q, he can choose the coefficients of a to lie between
-q/2 and q/2. It is very important that Bob does this before performing the next step. Bob
next computes the polynomial
b = a (modulo p)
That is, he reduces each of the coefficients of a modulo p. Finally Bob uses his other
private polynomial fp to compute
c = fp*b (modulo p)
The polynomial c will be Alice's original message m [3].
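The reason these steps recover m is not derived in this appendix. The following short derivation is a sketch that uses only the definitions h = p*fq*g and e = r*h + m together with the inverse properties of fq and fp (written f_q and f_p below). It holds only if the stated coefficient bound is met, which is an assumption about the parameter choice.

```latex
% Assumes: h = p f_q * g (mod q), e = r * h + m (mod q),
%          f * f_q = 1 (mod q),  f * f_p = 1 (mod p).
\begin{align*}
a &\equiv f * e \equiv f * (r * h + m) \pmod{q} \\
  &\equiv p\,(f * f_q) * r * g + f * m \pmod{q} \\
  &\equiv p\, r * g + f * m \pmod{q}.
\end{align*}
% If every coefficient of p r*g + f*m lies between -q/2 and q/2, the centered
% representative of a equals p r*g + f*m exactly (over the integers), so
\begin{align*}
b &\equiv a \equiv f * m \pmod{p}, \\
c &\equiv f_p * b \equiv (f_p * f) * m \equiv m \pmod{p}.
\end{align*}
```

This is why the coefficients of a must be taken between -q/2 and q/2 before reducing modulo p: with any other choice of representatives, a need not equal p r*g + f*m as an integer polynomial, and the term p r*g would no longer vanish modulo p.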
• Example :
Bob has received the encrypted message
e = 14 + 11X + 26X^2 + 24X^3 + 14X^4 + 16X^5 + 30X^6 + 7X^7 + 25X^8 + 6X^9 + 19X^10
from Alice [3]. He uses his private key f to compute
a = f*e = 3 - 7X - 10X^2 - 11X^3 + 10X^4 + 7X^5 + 6X^6 + 7X^7 + 5X^8 - 3X^9 - 7X^10 (modulo
32).
Note that when Bob reduces the coefficients of f*e modulo 32, he chooses values lying
between -15 and 16, not between 0 and 31. It is very important that he choose the
coefficients in this way. Next Bob reduces the coefficients of a modulo 3 to get
b = a = -X - X^2 + X^3 + X^4 + X^5 + X^7 - X^8 - X^10 (modulo 3).
Finally Bob uses fp, the other part of his private key, to compute
c = fp*b = -1 + X^3 - X^4 - X^8 + X^9 + X^10 (modulo 3).
The polynomial c is Alice's message m, so Bob has successfully decrypted Alice's
message [3].
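The three decryption steps can be traced with the Python sketch below. It assumes that * is cyclic convolution in R (X^11 = 1) and stores polynomials as coefficient lists from the constant term up to X^10. The helpers poly_mul and center are illustrative names; center implements the choice of coefficients between -15 and 16 for q = 32, and between -1 and 1 for p = 3.

```python
N, q, p = 11, 32, 3


def poly_mul(a, b, modulus):
    """Cyclic convolution in Z[X]/(X^N - 1), coefficients reduced to 0..modulus-1."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] += ai * bj
    return [x % modulus for x in c]


def center(coeffs, modulus):
    """Pick the representative of each coefficient in (-modulus/2, modulus/2]."""
    half = modulus // 2
    reduced = [v % modulus for v in coeffs]
    return [v - modulus if v > half else v for v in reduced]


# Bob's private key (f, fp) and the received ciphertext e, copied from the example.
f = [-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1]
fp = [1, 2, 0, 2, 2, 1, 0, 2, 1, 2, 0]
e = [14, 11, 26, 24, 14, 16, 30, 7, 25, 6, 19]

a = center(poly_mul(f, e, q), q)   # a = f*e (modulo q), coefficients in -15..16
b = center(a, p)                   # b = a (modulo p)
c = center(poly_mul(fp, b, p), p)  # c = fp*b (modulo p)

print(a)  # [3, -7, -10, -11, 10, 7, 6, 7, 5, -3, -7]
print(b)  # [0, -1, -1, 1, 1, 1, 0, 1, -1, 0, -1]
print(c)  # [-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1]
```

The last list is the coefficient vector of m = -1 + X^3 - X^4 - X^8 + X^9 + X^10, Alice's original message.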
REFERENCES
[1] William Stallings, Network Security Essentials: Applications and Standards, Prentice Hall,
Inc., Upper Saddle River, New Jersey 07458, 2000.
[2] Colleen Marie O’Rourke, Efficient NTRU Implementations, Worcester Polytechnic
Institute, 2002.
[3] http://www.ntru.com/cryptolab/pdf/ntrututorials.pdf
[4] Mohan Atreya, Ben Hammond, Stephen Paine, Paul Starrett, Stephen Wu,
Digital Signature, The McGraw-Hill Companies, 2002.
[5] Gunar Gaubatz, Versatile Montgomery Multiplier Architectures, Worcester Polytechnic
Institute, 2002.
[6] Jeffrey Hoffstein, Jill Pipher, Joseph H. Silverman, NTRU: A Ring-based Public-Key
Cryptosystem, 1998 (United States Patent 6,298,137).
[7] www.securityinnovation.com
[8] Katherine Compton, Scott Hauck, An Introduction to Reconfigurable Computing,
2000.
[9] Rodney D’Souza, The NTRU Cryptosystem: Implementation and Comparative Analysis, 2001.