REDUCED VECTOR TECHNIQUE HOMOMORPHIC ENCRYPTION WITH VERSORS

A SURVEY AND A PROPOSED APPROACH

by

SUNEETHA TEDLA

M.C.A, Osmania University, India 1998

A dissertation submitted to the Graduate Faculty of the

University of Colorado at Colorado Springs

in partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

Department of Computer Science

2019

© COPYRIGHT BY SUNEETHA TEDLA 2019

ALL RIGHTS RESERVED

This dissertation for the Doctor of Philosophy degree by

Suneetha Tedla has

been approved for the

Department of Computer Science

By

Dr. Carlos Araujo, co-Chair

Dr. C. Edward Chow, co-Chair

Dr. T.S. Kalkur

Dr. Jonathan Ventura

Dr. Yanyan Zhuang

Date: 20 September 2023

Tedla, Suneetha (Ph.D., Security)

Reduced Vector Technique Homomorphic Encryption with Versors

A Survey and a Proposed Approach

Dissertation directed by Professors Carlos Araujo and C. Edward Chow

ABSTRACT

In this research, a new type of homomorphic encryption technique, based on

geometric algebra and versors, called Reduced Vector Technique Homomorphic

Encryption (RVTHE) is designed, developed and analyzed. This new cipher method

is optimized to be faster and compact in cipher length while preserving the security

strength.

Performance criteria are proposed to generate benchmarks to evaluate the

homomorphic encryption for a fair comparison to benchmarks used for non-

homomorphic encryption. The basic premise behind these performance criteria is to

establish the understanding of the baseline to measure the variations of performance

between different encryption methods for Cloud Storage type Solid State Drives

(SSDs). Significant differences in throughput performance, up to

20-50%, are observed between our proposed encryption method and the AES

software methods on Cloud storage SSDs or encrypted SSDs.

The central thesis of the research is to verify that homomorphic encryption is

better accomplished with the use of versors instead of multi-vectors. Using properties

of versors, it is possible to design a homomorphic cipher that has a simple structure

and versatile key assignments while achieving a speed that rivals existing

non-homomorphic ciphers. In the thesis, I demonstrate that versor-based

homomorphic encryption is faster than an existing non-homomorphic encryption. It is

shown that RVTHE is a symmetric somewhat homomorphic encryption performing

addition, subtraction, scalar multiplication, and scalar division. The evaluation of the

implementation shows that a file can be edited or appended in 0.001 seconds. It also shows that, in the

case of full file encryption, RVTHE is 75% faster on encryption and 25% slower on

decryption than the AES-Crypt encryption software, which implements the

AES standard. The ciphertext sizes of RVTHE are reduced by 25% on average

compared with those of previous approaches using multi-vectors and Clifford Geometric

Algebra. RVTHE has the potential for use as an encryption method on real workloads.

Keywords: Encryption, Homomorphic, AES, SSD, AES-Crypt, Vectors, Versors.


DEDICATION

I wish to dedicate this body of research to my husband and my best friend

Shravan Tedla; with him everything is possible for me.

ACKNOWLEDGEMENTS

I am blessed with beautiful people in my life. I am very thankful to all who

supported me on my journey of schooling. I really appreciate all the support,

encouragement, love, and understanding provided by my family, friends, colleagues,

and Advisory Committee.

A special thank you to Dr. Carlos Araujo and Dr. C. Edward Chow for their

support, for sharing their knowledge, and for guiding me over the last several years. Dr.

Xiaobo Charles Zhou advised me prior to Dr. Carlos Araujo, and I am very thankful

to Dr. Zhou for providing me the skills and insight needed to pursue my Ph.D. I very

much enjoyed and admired Dr. Carlos Araujo’s knowledge and the way he shapes

his thoughts to create a new way of doing security, and that helped me

tremendously in my research. I really appreciate Dr. Chow’s support and

knowledge while discussing the ideas and analyzing how to put my thoughts and

ideas into action. I am very thankful to both of you. I appreciate my Advisory

Committee members, Dr. Jonathan Ventura, Dr. Yanyan Zhuang, and Dr. T.S. Kalkur,

for providing me their feedback and support. Many thanks to Ali Langfels, who helps all

the students with a great smile while managing all the administrative work.

I am very thankful to my parents and my in-laws; one gave me a beautiful life

and the other gave me a beautiful life partner, with their unconditional love and

support. I am blessed with a beautiful friend, my husband Shravan Tedla, and my kids

SaiKiran and Siddhartha, and I am grateful to them for supporting me in all aspects of

my life, including my Ph.D. I am very thankful to my friend Tim Murphy for spending

so many hours helping me write this thesis.

TABLE OF CONTENTS

CHAPTER 1

INTRODUCTION

1.1 Security Terminology

1.2 Security systems

1.3 Cloud Storage Security

1.4 Design Criteria for Cryptographic Algorithm

1.5 Encryption

1.6 Homomorphic Encryption

1.7 Possible fully homomorphic encryption method

1.8 Vector product spaces with Clifford Geometric Algebra

1.9 Reduced Vector Technique for Homomorphic Encryption

CHAPTER 2

BACKGROUND

2.1 Cloud Storage SSD

2.1.1 Data Reliability and Integrity

2.1.2 Sanitization and Secure Deletion of SSD

2.2 Survey of Various Encryption Approaches

2.2.1 Block Ciphers

2.2.2 Block Cipher Modes

2.2.3 Encryption Methods for SSD

2.2.4 Comparable Encryption for Evaluations

2.2.5 Homomorphic Encryption

2.3 Mathematical Foundation

2.3.1 Geometric Algebra Overview

2.3.2 Inner Product

2.3.3 Outer Product

2.3.4 Geometric Product

2.3.5 Inverse of Vector

2.3.6 Versors

CHAPTER 3

PROBLEMS AND LIMITATIONS

3.1 Defining the Problem

3.1.1 Encryption Security Limitations and Problem

3.1.2 Encryption Limitations:

3.2 Other problems contributed for research motivation

3.2.1 Cyber Attacks

3.2.2 Real Randomness

3.2.3 Storage Security Limitations

3.2.4 SSD System Level Induced Limitations

3.2.5 Existing research to mitigate the software limitations

CHAPTER 4

STORAGE ENCRYPTION ANALYSIS

4.1 Measurement Environment

4.1.1 Selection of Encryption methods

4.1.2 Experimental Tools and Workloads

4.2 SSD performance without Encryption

4.2.1 Performance differences between Amazon EC2 VMs

4.2.2 Did various block sizes significantly affect I/O throughput?

4.2.3 Did various levels of parallelism affect I/O throughput?

4.2.4 Did random and sequential jobs have a different IOPS?

4.2.5 SSD Random Workload Analysis on t2.micro VM

4.3 SSD performance with Encryption

4.3.1 Did various block sizes significantly affect IOPS

4.3.2 Did various block sizes affect Performance Throughput

4.3.3 Did various Encryptions Versus Performance Throughput

4.3.4 Reads, Writes and Mixed workloads Versus Block Sizes

4.4 Fully Homomorphic Encryption Limitations

4.4.1 FHE with Vector Space

4.4.2 Previous homomorphic encryption using multivector technique

CHAPTER 5

RVTHE

5.1 Design of RVTHE

5.1.1 RVTHE Encryption and Decryption

5.1.2 Encryption of RVTHE

5.1.3 Decryption of RVTHE

5.2 Mathematical Implementation of RVTHE Using Versors

5.3 Homomorphism of RVTHE

5.3.1 Addition

5.3.2 Subtraction

5.3.3 Multiplication

5.3.4 Division

5.4 Security of RVTHE

CHAPTER 6

IMPLEMENTATION AND EVALUATION OF RVTHE

6.1 Implementation of RVTHE

6.2 Experimental Systems

6.3 Experimental Evaluations

6.3.1 Time measurements on various key sizes

6.3.2 Time measurements on various file sizes

6.3.3 Size measurements on Encrypted Files

6.4 Security Evaluation of RVTHE

CHAPTER 7

LESSONS LEARNED AND FUTURE WORK

7.1 Challenges and Lessons Learned

7.2 Contributions

7.3 Success of work

7.4 Future Work

CHAPTER 8

CONCLUSION

REFERENCES

Appendix A – Cloud Storage SSD

Appendix B – Cloud Storage and Encryptions

Appendix C – Multi-Vector Based Encryption

Appendix D – RVTHE

Appendix E – Acronym List

LIST OF FIGURES

Figure 1 - Data Encryption Standard [27]

Figure 2 - TDEA [27]

Figure 3 - AES encryption process

Figure 4 - Blowfish Algorithm

Figure 5 - Twofish process [41]

Figure 6 - Serpent Algorithm [45]

Figure 7 - CBC Encryption and Decryption

Figure 8 - CFB mode with 8 bits

Figure 9 - XTS mode

Figure 10 - GCM mode

Figure 11 - Outer Product

Figure 12 - Address Mapping between physical to logical

Figure 13 - Flashes and their parallel architecture

Figure 14 - Consumer Vs Enterprise SSD

LIST OF GRAPHS

Graph 1 - IOPS Vs Block Size

Graph 2 - Parallelism Vs Throughput

Graph 3 - Random Versus Sequential Operations

Graph 4 - t2.micro Block Size Versus IOPS

Graph 5 - t2.micro Block Size Versus KB/Sec

Graph 6 - Encrypted SSD Block Size Versus IOPS

Graph 7 - Best Crypt Block Size Vs IOPS

Graph 8 - Dm-crypt Block Size Vs IOPS

Graph 9 - Encrypted EBS SSD Volume Block Size Versus throughput

Graph 10 - BestCrypt Block Size Versus Throughput

Graph 11 - Dm-Crypt Block Size Versus Throughput

Graph 12 - Encryption Methods versus IOPS

Graph 13 - Encryption Methods versus Throughput

Graph 14 - Read workloads for various Block Sizes

Graph 15 - Write workloads IOPS for various Block Sizes

Graph 16 - Mixed Workloads IOPS for Various Block Sizes

Graph 17 - Multivector Based Homomorphic Encryption

Graph 18 - Multivector based encrypted file sizes

Graph 19 - Key size Vs Encryption/Decryption time in Sec

Graph 20 - File size and Encryption/Decryption times

Graph 21 - Key Size and Time on Regular SSD

Graph 22 - Encrypted file sizes in MB

LIST OF TABLES

Table 1 - AES Key Size and Number of Rounds

Table 2 - Key and data location in versors

CHAPTER 1

INTRODUCTION

Rapid changes in information technology, specifically the need to use data from

anywhere, are leading users to Cloud environments with the expectations of

availability (providing data access as needed), reliability, solid integrity

(maintaining data accuracy throughout its life cycle), and full security

(assuring the data is accessed only by authorized parties with an authorized level of

access). In this digital age, protecting PII (Personally Identifiable Information) is

imperative. Tax IDs, medical information, credit information, and other extremely

sensitive data need to be secured at the highest level, because they can be used for

identity theft and other information crimes [1]. Various methods or processes are

implemented to secure the data; among these methods, encryption techniques are the

most commonly used. Scholars have been implementing different cryptographic

algorithms and methods such as the following: Secure Channel, Public-Key

encryption, Digital Signatures, and PKI. Cryptographic algorithms consist of Block

Ciphers (DES, AES, Serpent, and Twofish), Block Cipher Modes (Padding, ECB,

CBC, Fixed IV, Counter IV, Random IV, Nonce-Generated IV, OFB, CTR, and

Combined Encryption and Authentication), and Hash functions (MD5, SHA-1, and SHA-2,

including SHA-256 and SHA-512). Even so, each of these encryption methods requires

full decryption of the data, including the sensitive data, before it can be processed. Also, I observed

significant throughput penalties, up to 20-50%, when using encryption

software methods on Cloud storage SSDs or encrypted SSDs, as described in the abstract.


FHE (Fully Homomorphic Encryption) allows computing on encrypted data without

decrypting it, keeping sensitive data encrypted and thus not exposed [2] [3].

This thesis is organized as follows: Chapter 1 introduces the research. Chapter

2 presents background on cloud storage SSD, surveys encryption approaches, and lays the

mathematical foundation. Chapter 3 defines the problems and limitations that motivate

the research. Chapter 4 analyzes storage encryption and shows the performance penalties of

encryption software methods on cloud storage SSD. Chapter 5 introduces RVTHE. Chapter 6

presents the implementation and evaluation of RVTHE. Chapter 7 discusses lessons learned

and future work. Chapter 8 concludes the thesis.

This chapter introduces the research through a general survey of

overall security and storage. It discusses the terminology of storage, cloud, and security

systems, as well as various encryption methods and ciphers.

1.1 Security Terminology

The word “security” originated in “Late Middle English: from Old French

securite or Latin securitas, from securus ‘free from care’”, and its usage is illustrated by the phrase “check to

ensure that all nuts and bolts are secure” [4]. The following are some of the most used

terms in the field of cyber security. The definitions help to clarify the role of each term in

Information Technology System Security [5].

Assurance: Confidence that a specific security method implementation has adequately met

these four security goals: integrity, availability, confidentiality, and

accountability.

Integrity: Ensuring the data is intact, with all modifications made only under

proper authorization.

Availability: Able to provide timely, reliable access to an entity.

Confidentiality: Ensuring that information is disclosed only to authorized

parties.

Accountability: Principle that an authorized individual is responsible for

following the safeguard controls of the system.

Asymmetric Encryption: An encryption method that uses two unique

keys, a public key for encryption and a private key for decryption. It is

computationally infeasible to derive the private key from the public key.

Authentication: Able to verify the identity of an individual or system

accessing an entity.

Block Cipher: A cipher that operates on blocks, which are arrays of bytes in the form of binary bits used as input,

output, state, and round key in the encryption process.

o State: Intermediate cipher result of the encryption process.

o Round Key: Values derived from the Cipher Key.

Cipher and Ciphertext: A cipher is a procedure containing a series of operations that

convert plaintext to ciphertext. Ciphertext is the output generated by a cipher method from

plaintext.

Classified Information: Information requiring the highest level of security

and mandating authorized access.

Cloud Computing: A way to provide network access to shared resources that

can be rapidly provisioned with minimal effort.

Cryptography: Study that incorporates the foundations, mechanisms, or

methods used to hide data and protect it from unauthorized access.

Cyber Attack: Intentionally disrupting the assurance of a system or the data.

Decryption: A technique of converting ciphertext to plaintext.

Encryption: A technique of converting plaintext to ciphertext.

Key: A secret code needed to perform encryption and decryption.

Private Key: A key needed for the decrypting process of asymmetric

encryption.

Public Key: A key needed for the encrypting process of asymmetric

encryption.

Reliability: The ability of a system to perform consistently with quality.

Symmetric Encryption: A form of encryption that uses the same key for

the encryption and decryption processes.

User: An individual who has the proper level of authorization to access the

system.

1.2 Security systems

“A security system is only as strong as its weakest link.” [6]

We can guarantee the security of a system only to the degree that

we have secured its weakest links. If we create an attack tree for any real system,

it will provide insight into possible lines of attack [7]. If we leave one single weak

link, the whole system is just as vulnerable, even with the strongest

security elsewhere. Secrecy systems fall into three categories:

Concealment Systems hide the existence of a message, for example with a fake covering

cryptogram.

Privacy Systems need special equipment to recover the original message.

“True” Secrecy Systems conceal the meaning of the message with a cipher.

To build a “True” secrecy system, one must follow the design criteria for a

cryptographic algorithm [8].
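
The weakest-link idea can be made concrete with a toy attack tree. The sketch below is illustrative only: the node names and cost figures are hypothetical and are not taken from [7], and the evaluation rule (minimum over OR children, sum over AND children) is one common way to find the cheapest line of attack.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One goal in an attack tree (all names and costs here are hypothetical)."""
    name: str
    kind: str = "leaf"          # "leaf", "or", or "and"
    cost: float = 0.0           # attacker effort, used only for leaves
    children: List["Node"] = field(default_factory=list)


def cheapest_attack(node: Node) -> float:
    """Return the minimum attacker cost needed to achieve this goal."""
    if node.kind == "leaf":
        return node.cost
    child_costs = [cheapest_attack(child) for child in node.children]
    if node.kind == "or":       # any single child suffices: the weakest link wins
        return min(child_costs)
    if node.kind == "and":      # every child is required
        return sum(child_costs)
    raise ValueError(f"unknown node kind: {node.kind}")


tree = Node("read stored data", "or", children=[
    Node("break the cipher", cost=1e12),
    Node("steal the key", "and", children=[
        Node("phish an administrator", cost=500.0),
        Node("copy the key file", cost=100.0),
    ]),
    Node("read an unencrypted backup", cost=50.0),
])

print(cheapest_attack(tree))    # 50.0
```

In this hypothetical tree the strong cipher is irrelevant to the attacker, because the unencrypted backup is the cheapest path to the data.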

1.3 Cloud Storage Security

Cloud storage uses SSD, and there has been a lot of research on SSD

characteristics, internal design, and performance for different types of workloads [9]

[10]. Previous studies have shown that SSD outperforms HDD in data access

speed [11], but that research did not consider encryption on an

SSD. There has also been a lot of research on different types of encryption

methods, attacks, and secure methods [6] [12]. These existing algorithms

were suitable for regular HDD, but they may not be optimal for SSD because

the physical structure of SSD is different, so encryption algorithms designed for HDD

might not be ideal or even compatible for SSD.

Research is needed to determine whether these encryption methods are good

enough; this can be measured by calculating their impact on SSD in terms of

performance and security. It may require rethinking whether existing encryption algorithms suit

SSD, or coming up with new algorithms that accommodate new environments like

the Cloud. The best encryption method could be found through an assessment of

already-existing encryptions and new encryption methods. For this, I first study

SSD’s physical and logical limitations.
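
The performance impact described above can be expressed as a throughput penalty relative to an unencrypted baseline. The following sketch assumes this simple definition; the function name and the sample figures are hypothetical and are not measurements from this research.

```python
def throughput_penalty(baseline_mbps: float, encrypted_mbps: float) -> float:
    """Percentage of baseline throughput lost when encryption is enabled."""
    if baseline_mbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline_mbps - encrypted_mbps) / baseline_mbps * 100.0


# Hypothetical numbers for illustration only (not measured results).
print(throughput_penalty(200.0, 150.0))  # 25.0
```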

Prior research consistently showed that workload performance improved when SSD was added to the storage system or

used as the only storage. SSD is faster than HDD [39] [13], so adding it to the storage

system is expected to improve performance. Very little research has

shown the impact on performance of the different types of workloads with the different

encryption methodologies. When exploring what type of encryption is better for the

cloud, we need to consider data at all stages, which means data in transit and data at

rest in the cloud [14]. This can be accomplished by using fully data-centric security [15].

It can also be accomplished by using homomorphic encryption methods.

1.4 Design Criteria for Cryptographic Algorithm

Encryption is a small component of the system but provides a higher level of

security during cyber-attacks [6]. Encryption is the original goal of cryptography.

Encryption converts plaintext into unreadable data, which is called ciphertext. A

good encryption makes it practically impossible to find the plaintext from the ciphertext without

knowing the key. With good encryption, the only information that will be accessible

is the plaintext length and the time stamp [16]. The following are some of the design

principles that will help to generate a stronger cipher [6].

Algorithm should provide effective security, be easy to use, and

be completely stated.

Security should depend on the key secrecy, not on the algorithm secrecy.

Algorithm should be available to users, adaptable to applications and systems.

Algorithm must be implementable on a targeted system.

Algorithm should be efficient, verifiable, and portable between systems.

The cipher must depend on the key, and modifications to the message should not

mask the key. Randomness of the key is critical for the security of the system, and a truly random key is

hard to generate and hard to guess [8]. In 1999, during the AES selection process, NIST evaluated the candidate

encryption methods against published criteria. The evaluation criteria were divided into three major categories [17]:

Security:

Resistance of the algorithm to cryptanalysis.

Soundness of its mathematical basis.

Randomness of the algorithm output.

Cost:

Licensing requirements.

Computational efficiency on various platforms.

Algorithm and Implementation Characteristics:

Flexibility.

Hardware and software suitability.

Algorithm simplicity.

Key and block size agility.

1.5 Encryption

Encryption is an imaginative technical derivative of cryptography. Building the

optimal encryption technique is still very important. In the encryption process, the

“key” plays an important role in encrypting and decrypting data; without the key, the

data cannot be interpreted. The strength of the key depends on its secrecy, randomness,

length (size), and complexity. Over the years, encryption processes have become more

complex in each iteration of ciphertext generation. Various encryption methods

use a unique key generated for each iteration. However, Kerckhoffs’

principle states that the “security of the encryption depends on the secrecy of the key not the

algorithm.” This means that everybody may know how the key is applied in the algorithm,

and therefore the secrecy and complexity of that key are all that matter. Most of the common

cryptographic methods follow this principle.
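
Kerckhoffs’ principle can be illustrated with a cipher whose algorithm is fully public while only the key is secret. The sketch below is a toy construction assumed for illustration (a SHA-256 based keystream XORed with the data); it is not AES, not RVTHE, and not suitable for real use, for example because it has no nonce and reusing a key leaks information. The helper names `keystream` and `xor_cipher` are hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (same operation both ways)."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(32)           # 256-bit key from the OS random source
message = b"sensitive record"
ciphertext = xor_cipher(key, message)

assert xor_cipher(key, ciphertext) == message          # the right key recovers the data
wrong_key = secrets.token_bytes(32)
assert xor_cipher(wrong_key, ciphertext) != message    # a different key does not
```

Everything above except `key` could be published without weakening the scheme, which is the sense in which security rests on the key rather than on the algorithm.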

In 1997, NIST (National Institute of Standards and Technology) received

fifteen new security algorithms from twelve countries. Out of these encryption

methods, MARS, RC6, Rijndael, Serpent, and Twofish were selected as finalists [18].

Out of these finalists, Rijndael, Serpent, and Twofish took the top three places, respectively.

The winning algorithm, Rijndael, also called AES (Advanced Encryption Standard), is

still in use by different encryption methods [19]. All these methods are symmetric

encryption ciphers.

1.6 Homomorphic Encryption

The idea of homomorphism for cryptography was first theorized in 1978

by Rivest, Adleman, and Dertouzos [20]. Homomorphism is defined in

abstract algebra in terms of functions and algebraic structures. When a function (map)

is applied to an algebraic structure, the result preserves the same algebraic structure

from the domain to the range. In group theory, the homomorphism theorems

are developed in terms of subgroups and quotient groups. Ideals, introduced in the 19th century,

played a parallel role in defining quotient rings and in the comparable homomorphism

theorems in ring theory [21]. In algebra, if A and B are the same type of algebraic

structure, a map $f$ from A to B is a homomorphism from A to B when it preserves every operation:

for each operation $\mu$ of arity $k$ and all elements $a_1, a_2, \ldots, a_k$ in A,

$f(\mu_A(a_1, a_2, \ldots, a_k)) = \mu_B(f(a_1), f(a_2), \ldots, f(a_k))$ (1.1)

where $\mu_A$ and $\mu_B$ denote the operation $\mu$ in A and in B. When the map from A to B is onto (an epimorphism), B is a homomorphic image of A.

A homomorphism from a structure to itself, that is, with A = B, is called an endomorphism.

The same notion of homomorphism can also be defined for lattices, groups, modules, and

monoids [22]. In groups, a homomorphism that is also a bijection is an

isomorphism. If A and B are two rings and $f$ is a function

from A to B, where A is the domain of $f$ and B is the range of $f$, then each element $a$

belongs to A and $f(a)$ belongs to B. This homomorphism preserves the addition,

subtraction, and multiplication operations ($*$), as shown below.

$f(a * a') = f(a) * f(a')$ (1.2)

If $f$ satisfies the above, the following are true:

$f(0) = 0$ (1.3)

$f(1) = 1$ (1.4)

$f(-a) = -f(a)$ (1.5)

If, in addition, $f$ is one-to-one, as an encryption function must be for decryption to be unique, then

$f(a) = f(b)$ implies $a = b$ (1.6)
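
As a quick sanity check of properties (1.2) to (1.5), the sketch below tests a familiar ring homomorphism: reduction of integers modulo N, mapping the ring of integers onto the ring of residues. The modulus 12 and the sample range are arbitrary choices made for illustration. The example also shows that (1.6) is a separate requirement, since this particular map is not one-to-one.

```python
from itertools import product

N = 12  # hypothetical modulus chosen for illustration


def f(a: int) -> int:
    """Map an integer to its residue class in Z_N."""
    return a % N


samples = range(-20, 21)
for a, b in product(samples, repeat=2):
    assert f(a + b) == (f(a) + f(b)) % N      # (1.2) with * as addition
    assert f(a - b) == (f(a) - f(b)) % N      # (1.2) with * as subtraction
    assert f(a * b) == (f(a) * f(b)) % N      # (1.2) with * as multiplication

assert f(0) == 0                                      # (1.3)
assert f(1) == 1                                      # (1.4)
assert all(f(-a) == (-f(a)) % N for a in samples)     # (1.5)

# (1.6) fails here: f is not one-to-one, so distinct inputs can collide.
assert f(5) == f(17) and 5 != 17
```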

If the properties of homomorphism are incorporated in an encryption method or

cipher, then it is a homomorphic encryption method. Homomorphic encryption can

be organized into three categories: partial, somewhat, and full. Partial

Homomorphic Encryption allows only one type of operation with unlimited iterations.

Somewhat Homomorphic Encryption allows more than one but not all types of

operations and limits the iterations. Fully Homomorphic Encryption allows all types

of operations with unlimited iterations [23].

FHE (Fully Homomorphic Encryption) can be defined as follows: let E be an encryption method applied to data 1 (D1) and data 2 (D2), and let ‘⨳’ represent any operation (addition, subtraction, multiplication, or division).

The mathematical representation is: E(D1 ⨳ D2) = E(D1) ⨳ E(D2)
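For a single operation, this equation can be checked with textbook RSA, which is multiplicatively homomorphic. The sketch below is a toy illustration under stated assumptions: the primes 61 and 53, the exponent 17, and the function names are chosen for this example only; the scheme has no padding and is insecure; and it is not the RVTHE cipher.

```python
# Toy textbook RSA (no padding); parameters are deliberately tiny and insecure.
p, q = 61, 53                 # illustrative primes (assumption)
n = p * q                     # modulus, 3233
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

d1, d2 = 7, 11

# Multiply the two ciphertexts without ever decrypting them.
product_of_ciphertexts = (encrypt(d1) * encrypt(d2)) % n

# E(D1 * D2) = E(D1) * E(D2) (mod n)
homomorphic = product_of_ciphertexts == encrypt(d1 * d2)
recovered = decrypt(product_of_ciphertexts)
print(homomorphic, recovered)
```

Only multiplication is preserved here, which makes textbook RSA a partially homomorphic scheme in the classification above.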


The first feasible form of FHE was proposed by Craig Gentry in 2009 using ideal

lattices with “bootstrappable” encryption methods [2].

1.7 Possible fully homomorphic encryption method

In 2009, Craig Gentry introduced the first plausible fully homomorphic encryption method, which evaluates circuits of arbitrary depth (composed of additions and multiplications) on encrypted data. This research provided the blueprint for FHE. Its starting point is a SwHE (Somewhat Homomorphic Encryption) scheme, which evaluates only circuits of limited depth built from additions and multiplications [2]. This research led to further encryption methods that are lattice-based, integer-based, LWE-based (learning with errors), and RLWE-based (ring learning with errors). Further research on SwHE and FHE showed promise for use in cloud computing environments and in MPC (multi-party computation) [24] [25]. In the Gentry method, the lattice-based scheme takes too long to generate keys (ranging from 2.5 seconds to 2.2 hours), the implementation is complex, noise growth can exceed the decryption threshold, and the large key sizes (17 MB to 2.25 GB) require substantial memory; all of this makes it impractical in real systems [3]. Fully Homomorphic Encryption (FHE) is on the “bleeding edge” of encryption technology, but currently there is no FHE available for real-time applications [26]. A lot of work remains before a “production ready” version of FHE is available.

Gentry defines a public-key encryption scheme ε as consisting of three algorithms: KeyGenε, Encryptε, and Decryptε. KeyGenε is a randomized algorithm that takes a security parameter λ as input and outputs a secret key sk and a public key pk. The plaintext space P and the ciphertext space C are defined by pk. Encryptε is also a randomized algorithm; it takes pk and a plaintext π ∈ P as input and outputs a ciphertext ψ ∈ C. Decryptε takes sk and ψ as input and outputs the plaintext π. The computational complexity of all three algorithms must be polynomial in λ. The correctness condition is:

if (sk,pk) R← KeyGenε, π ∈ P, and ψ R← Encryptε(pk,π), then Decryptε(sk,ψ) → π.

A homomorphic encryption scheme additionally has a (possibly randomized) efficient algorithm Evaluateε, which takes as input the public key pk, a circuit C from a permitted set Cε of circuits, and a tuple of ciphertexts Ψ = ⟨ψ1, ..., ψt⟩ for the input wires of C; it outputs a ciphertext ψ in the ciphertext space. Informally, the functionality wanted from Evaluateε is that, if ψi “encrypts πi” under pk, then ψ ← Evaluateε(pk, C, Ψ) “encrypts C(π1, ..., πt)” under pk, where C(π1, ..., πt) is the output of C on inputs π1, ..., πt. The minimal requirement for the scheme is correctness. There are several different ways to formalize Gentry’s homomorphic encryption; Gentry defined them as follows.

“Definition 1: (Correctness of Homomorphic Encryption). We say that a homomorphic encryption scheme ε is correct for circuits in Cε if, for any key-pair (sk, pk) output by KeyGenε(λ), any circuit C ∈ Cε, any plaintexts π1, ..., πt, and any ciphertexts Ψ = ⟨ψ1, ..., ψt⟩ with ψi ← Encryptε(pk, πi), it is the case that: if ψ ← Evaluateε(pk, C, Ψ), then Decryptε(sk, ψ) → C(π1, ..., πt) except with negligible probability over the random coins in Evaluateε.

By itself, mere correctness fails to exclude trivial schemes. Suppose I define Evaluateε(pk, C, Ψ) to just output (C, Ψ) without “processing” the circuit or ciphertexts at all, and Decryptε to decrypt the component ciphertexts and apply C to results.

Definition 2: (Compact Homomorphic Encryption). We say that a homomorphic

encryption scheme E is compact if there is a polynomial f such that, for every value of


the security parameter λ, E’s decryption algorithm can be expressed as a circuit DE of

size at most f(λ).

Definition 3: (“Compactly Evaluates”). We say that a homomorphic encryption

scheme E “compactly evaluates” circuits in CE if E is compact and correct for circuits

in CE.

Definition 4: (Fully Homomorphic Encryption). We say that a homomorphic

encryption scheme E is fully homomorphic if it compactly evaluates all circuits.

Definition 5: (Leveled Fully Homomorphic Encryption). We say that a family of

homomorphic encryption schemes {E(d) : d ∈ Z+} is leveled fully homomorphic if,

for all d ∈ Z+, they all use the same decryption circuit, E(d) compactly evaluates all

circuits of depth at most d (that use some specified set of gates), and the

computational complexity of E(d)’s algorithms is polynomial in λ, d, and (in the case of EvaluateE(d)) the size of the circuit C.

Definition 6: ((Statistical) Circuit Private Homomorphic Encryption). We say that a

homomorphic encryption scheme E is circuit-private for circuits in CE if, for any

keypair (sk, pk) output by KeyGenE(λ), any circuit C ∈ CE, and any fixed ciphertexts

Ψ = ⟨ψ1,...,ψt⟩ that are in the image of EncryptE for plaintexts π1,...,πt, the following distributions (over the random coins in EncryptE, EvaluateE) are (statistically) indistinguishable:

EncryptE(pk, C(π1,...,πt)) ≈ EvaluateE(pk, C, Ψ)

The obvious correctness condition must still hold.


Definition 7: (Leveled Circuit Private Homomorphic Encryption). Like circuit private

homomorphic encryption, except that there can be a different distribution associated

to each level, and the distributions only need to be equivalent if they are associated to

the same level (in the circuit) [3].”

The definitions above give a very high-level view of Gentry’s work and of the reasoning behind homomorphic encryption. Gentry’s scheme is an asymmetric encryption scheme, and his work was revolutionary in bringing the idea of homomorphic encryption back to the world’s attention. For that reason all the definitions are quoted in this thesis, but the details of his work are outside the scope of this research. The mathematics used in his scheme has some shortcomings, because the primitive itself is not homomorphic; rather, his circuit computation algorithm provides the homomorphic properties. His algorithm organizes the data and manipulates the circuits to achieve computations on encrypted data.
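The four-algorithm interface described in this section (KeyGen, Encrypt, Decrypt, Evaluate) can be sketched with a much simpler scheme. The example below uses the Paillier cryptosystem as a stand-in; it is not Gentry's lattice-based construction. Several things are assumptions made for illustration: the function names, the fixed small primes that replace the security parameter λ, and the restriction of the permitted circuits to sums of the inputs. The parameters are far too small to be secure.

```python
import math
import random

def keygen(p: int = 1789, q: int = 1861):
    """Toy KeyGen: fixed small primes stand in for the security parameter."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)          # phi^-1 mod n (Python 3.8+)
    sk = (phi, mu, n)
    pk = n
    return sk, pk

def encrypt(pk: int, m: int) -> int:
    """Randomized Encrypt: c = (1 + n)^m * r^n mod n^2."""
    n = pk
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c: int) -> int:
    """Decrypt: m = L(c^phi mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    phi, mu, n = sk
    u = pow(c, phi, n * n)
    return ((u - 1) // n * mu) % n

def evaluate(pk: int, ciphertexts) -> int:
    """Evaluate for the only permitted circuit here, a sum of all inputs.
    Multiplying ciphertexts modulo n^2 adds the underlying plaintexts."""
    n2 = pk * pk
    out = 1
    for c in ciphertexts:
        out = (out * c) % n2
    return out

sk, pk = keygen()
plaintexts = (120, 45, 7)
ciphertexts = [encrypt(pk, m) for m in plaintexts]
total = decrypt(sk, evaluate(pk, ciphertexts))
print(total)
```

The output of `evaluate` is a single ciphertext of the same size as any input ciphertext, which is the compactness idea in Definition 2: decryption does not grow with the circuit that was evaluated.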

1.8 Vector product spaces with Clifford Geometric Algebra

Mathematics has been used as the basis for various encryption methods. For example, RSA (Rivest–Shamir–Adleman) uses number theory: its keys are built from the product of large prime numbers, and its security rests on the difficulty of factoring that product [27]. AES uses mathematics in the form of bit manipulations to increase the “diffusion” of the ciphertext, and register-based operations to increase the “confusion” of the shared key.

Applying Clifford Geometric Algebra on vector product spaces gives results that are intractable, because the output is a vector in a different direction, space, or volume. The geometric product, which is a Clifford Geometric Algebra operation, is an extension of the inner product of vectors, and it can represent geometric objects of all dimensions in the vector space. A versor is the geometric product of multiple vectors, and it holds the properties of vectors in the vector space. Selecting multiple vectors with smaller dimensions and performing a geometric product on them results in an intractable vector in the vector product space.
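The geometric product and the simple inverse of a versor can be made concrete in the 3D Euclidean geometric algebra. The sketch below is a generic textbook construction, not the RVTHE cipher: the 8-coefficient list representation, the function names, and the three sample vectors are all assumptions chosen for illustration. A multivector is stored as coefficients indexed by a basis-blade bitmap (bit 0 for e1, bit 1 for e2, bit 2 for e3).

```python
from functools import reduce

def blade_sign(a: int, b: int) -> float:
    """Sign produced by reordering basis blades a and b (bitmaps) into canonical order."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(x, y):
    """Geometric product of two multivectors (Euclidean metric, e_i * e_i = 1).
    Index to blade: 0:1, 1:e1, 2:e2, 3:e12, 4:e3, 5:e13, 6:e23, 7:e123."""
    out = [0.0] * 8
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i ^ j] += blade_sign(i, j) * xi * yj
    return out

def vector(x: float, y: float, z: float):
    mv = [0.0] * 8
    mv[1], mv[2], mv[4] = x, y, z
    return mv

def reverse(mv):
    """Reverse of a multivector: grade-2 and grade-3 blades change sign."""
    return [-c if bin(i).count("1") in (2, 3) else c for i, c in enumerate(mv)]

def versor_inverse(v):
    """For a versor V, V * reverse(V) is a scalar, so V^-1 = reverse(V) / (V * reverse(V))."""
    norm2 = gp(v, reverse(v))[0]
    return [c / norm2 for c in reverse(v)]

a, b, c = vector(1, 2, 3), vector(0, 1, -1), vector(2, 0, 1)

ab = gp(a, b)                     # scalar part is a.b, bivector part is the outer product
V = reduce(gp, [a, b, c])         # a versor: geometric product of three vectors
identity = gp(V, versor_inverse(V))
print(ab[0], gp(V, reverse(V))[0], identity)
```

The inverse needs only a sign flip on some coefficients and one division by a scalar, which is the "simpler inverse characteristic" that makes versors attractive as a building block.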

1.9 Dissertation Contributions

This dissertation contributes to the state of the art with the following <number>

contributions:

1.

2.

3.

1.10 How this dissertation is organized

Chapter 2 is . Chapter 3….

1.10 Reduce Vector Technique for Homomorphic Encryption

<merge this section to the 1.9 and 1.10.> The use of multi-vectors for

homomorphic encryption was demonstrated by David Williams Honorio Araujo Da Silva in his master’s thesis; the algorithm was designed using a concept invented by Dr. Carlos Paz de Araujo in 2017. However, there is another way to use vectors with the geometric product in a vector space. Versors are vectors in the geometric product space which have simpler inverse characteristics. RVTHE (Reduced Vector Technique for Homomorphic Encryption) is a cryptographic cipher powered by Clifford Geometric Algebra and versors. This approach is an efficient method for encryption, decryption, and real-time usage [28].


Securing data involves two stages: data at rest and data in transit. “Data at rest” describes the data before or after it is sent to a server, storage, or cloud. “Data in transit” describes data being sent between a client and a server, storage, or cloud. This dissertation refers to the protection of both stages as ESD security (Every Stage of Data). Enterprises have been using secured networks, servers, and storage, but the data itself has not always been stored in a secure state; therefore there is a need for fully data-centric security. Data-centric encryption is a way to achieve data-centric security. RVTHE is a data-centric encryption cipher which is simple to implement and provides ESD security for the entire system. This method requires fewer resources to encrypt and decrypt, and it offers real-time data updates. It is also scalable and adaptable from small devices to large enterprise storage.


CHAPTER 2

BACKGROUND

This chapter discusses the background research related to storage and security methods. It mainly covers SSD storage device characteristics, various encryption approaches, and the mathematical foundation of the new cipher presented in this research.

2.1 Cloud Storage SSD

Most cloud environments use SSDs as data storage or in the form of a flash cache to increase performance. An SSD retains data regardless of power availability. It does not contain an actual disk (platter) as in a traditional HDD (Hard Disk Drive). SSD technology uses electronic interfaces such as SATA (Serial ATA) Express to remain compatible with any host. It also uses the traditional block input/output (I/O) provided by any host, thus permitting simple replacement of traditional hard disk drive technology in common applications. SSDs are used as the primary data storage for communication devices, storage systems, modern computers, etc. [11].

From a security perspective, an SSD strives to achieve the best data reliability, integrity, secure deletion, and encryption, given the unique physical nature of the device. These aspects depend on ECCs (error-correcting codes), on reliably erasing data from the storage media (leaving no digital footprint), and on proper encryption methods. An SSD’s built-in commands are effective for ECC and deletion, but manufacturers sometimes implement them incorrectly. Previous research has addressed some of these issues by implementing a variety of approaches for better ECCs and encryption methods. However, that research did not consider the performance sacrificed to encryption when the methods were implemented. This thesis considers those factors in the form of SSD performance, measured in IOPS for different workloads, while the encryption process is running.

2.1.1 Data Reliability and Integrity

ECC is one of the functions of the FTL (Flash Translation Layer). ECC schemes are implemented to ensure the raw reliability of data. ECC usually adds resource overhead and thus impacts performance. The reliability of conventional ECCs, such as the commonly used BCH (Bose-Chaudhuri-Hocquenghem) code, degrades as SSD capacity grows. It is important to implement a powerful ECC engine with LDPC (low-density parity-check) code to improve the reliability of SSDs [29]. Previous research proposed different ECC approaches to increase data reliability. One of these approaches is a lightweight EDC (Error Detection Code) for the block to achieve better cache performance [30].

2.1.2 Sanitization and Secure Deletion of SSD

The physical architecture of a non-encrypted SSD limits the ability to sanitize the disk or securely delete a file from it. If the vendor did not implement the built-in host interface commands correctly, sanitization of the SSD will not be achieved. No full-disk sanitizing technique that works for an HDD is guaranteed to work for an SSD. Usually, sanitization of an SSD can be approximated by writing to the visible address space twice using FTL procedures. But this is a time-consuming process and it is not a true sanitization, because it does not take care of the invisible address space (files marked as deleted whose physical data still exists). The data in an SSD can be erased with erasure-based sanitization techniques (overwriting the disk with multiple I/O operations), but these techniques have shortcomings and fail to achieve a real sanitization [31].

Completely deleting and securely erasing data in an SSD is challenging. For that reason, storing unencrypted data creates a risk of exposing that data to unauthorized access. Although erasing files via sanitization methods makes the SSD more secure, it also creates a lot of wear and tear on the device, which shortens its lifespan. To avoid these problems, the best option is to encrypt the data on the SSD. Previous research created a couple of methods for encrypting files on SSDs: node-level encryption and password-based file-level encryption. In node-level encryption, the nodes are encrypted and the keys are stored in a dedicated KSA (Key Storage Area). This is a concern, because KSA blocks can turn into bad blocks, at which point they can still be read [32]. In password-based file-level encryption, files are encrypted using passwords, but encryption and deletion of the files are slow and accessing the files each time is tedious [33].

Even with all the challenges of encryption, it is still the best option for securing the data on an SSD.

2.2 Survey of Various Encryption Approaches

This section describes encryption methods and algorithms and examines their strengths and weaknesses in detail. These methods are used in encryption software for SSDs. The section also discusses real randomness for creating keys and the most common types of existing encryption methods.


2.2.1 Block Ciphers

An encryption function on fixed-size blocks of data is called a block cipher. “A secure block cipher is one for which no attack exists” [6]. A block cipher takes a fixed-size block of plaintext and generates a ciphertext block of the same size using a secret key (the key size can be different from the block size). The same secret key is used for decryption. Without the secret key, no plaintext can be produced from the ciphertext. An attack on a block cipher is defined as a non-generic method of distinguishing the block cipher from an ‘ideal block cipher’ [6]. “Block cipher written in terms of E(K,p) or Ek(p) for encryption of plaintext p with key K and D(K,c) or Dk(c) for decryption of ciphertext c with key K” [6]. In block cipher encryption, the key is a critical component and its integrity must be absolute: changing a single bit of the key results in a different ciphertext [6].

For each key value, a block cipher with a k-bit block is a permutation on the 2^k possible k-bit values [6]. For example, a single permutation on 128-bit values is a lookup table of 2^128 entries (each entry is 128 bits). The ideal cipher has an independently chosen random permutation for each key value, which amounts to choosing each lookup table at random. A “distinguisher” is an algorithm that is given a black-box function computing either the block cipher or an ideal block cipher, and that tries to tell which of the two it is. The distinguisher has no knowledge of the internal process of the black-box function. There is a limit on the amount of computation that a distinguisher can perform; beyond that limit, an attack is no longer efficient enough to be practical. A practical block cipher should be designed such that, for each key, the encryption function appears to be a randomly chosen invertible function [6].


A block cipher approaches an ‘ideal block cipher’ if it can withstand known-plaintext, ciphertext-only, related-key, chosen-plaintext, and other types of attacks. SSD encryption software uses one or more of the following block ciphers; the following sections address them and some of the attacks on them.

2.2.1.1 DES (Data Encryption Standard)

The DES standard specifies both enciphering and deciphering operations, which are based on a binary number called a key. DES uses a Feistel cipher design with 16 rounds.

Figure 1 - Data Encryption Standard [27]


In Figure 1, DES starts with a 64-bit input block (binary digits). The DES key is 64 bits long. Of these 64 bits, 56 bits are used directly by the algorithm, and the other 8 bits are used for error detection. These 8 bits set the parity of each 8-bit byte of the key, so that every byte has an odd number of "1"s. In each of the 16 rounds, a 48-bit round key is derived from the 56 key bits. XOR operations between the round keys and the data, along with permutations and substitutions, produce the final ciphertext [18].
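The Feistel design that DES is built on can be sketched in a few lines. This is not DES: the round function below is a truncated SHA-256 hash used as a hypothetical stand-in for the DES f-function, the round keys are made-up byte strings instead of the DES key schedule, and the initial and final permutations are omitted. The sketch only shows why a Feistel network is invertible even when the round function is not, and why decryption is the same procedure with the round keys in reverse order.

```python
import hashlib

def round_fn(half: int, round_key: bytes) -> int:
    """Hypothetical round function: hash of key and half-block, truncated to 32 bits."""
    digest = hashlib.sha256(round_key + half.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")

def feistel(block: int, round_keys) -> int:
    """Run a Feistel network over a 64-bit block, one round per round key."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for rk in round_keys:
        # Each round: new left = old right, new right = old left XOR F(old right, key)
        left, right = right, left ^ round_fn(right, rk)
    # Final swap of the halves, so that decryption can reuse this same function
    return (right << 32) | left

# 16 illustrative round keys (assumption; DES derives 48-bit round keys from the 56 key bits)
round_keys = [bytes([i]) + b"toy-key" for i in range(16)]

plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, round_keys)
recovered = feistel(ciphertext, round_keys[::-1])   # same network, reversed key order
print(hex(ciphertext), hex(recovered))
```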

Figure 2 - TDEA [27]

In Figure 2, a 3DES (TDEA) key is made of three DES keys, which together are referred to as a key bundle. The keys inside the key bundle are different from each other. This key bundle is used for encryption and decryption. The encryption process encrypts with the first key, decrypts with the second key, and then encrypts with the third key. The decryption process follows the reverse order: it decrypts with the third key, encrypts with the second key, and decrypts with the first key. The encryption algorithms specified in this standard are commonly known among those using standard encryption [34].


3DES was heavily used by organizations until researchers discovered active collision attacks on different modes (e.g., CBC, CTR, GCM, OCB, etc.) [35]. The small key size, the small data block size (64 bits), and the use of the same key for a large amount of data became a vulnerability, because a repeated ciphertext block is expected after about two to the power of half the data block size (2^32) blocks. These colliding ciphertext blocks expose the cipher to attacks such as birthday attacks. Because of the XOR operation in the mode, a collision between two ciphertext blocks reveals the XOR of the corresponding plaintext blocks. A ciphertext collision alone is not enough to discover the plaintext, but combined with the same secret key and some fraction of known plaintext, it makes a successful attack easier. With ever-increasing computing power, these attacks are more easily carried out by methods such as the man-in-the-browser attack [35].
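The scale of the birthday bound can be worked out directly. The sketch below uses the standard birthday approximation; the function names are chosen for this example. It shows that a 64-bit block cipher reaches the collision region after about 32 GiB of data under one key, while a 128-bit block cipher is nowhere near it for the same amount of traffic.

```python
import math

def birthday_blocks(block_bits: int) -> int:
    """Approximate number of blocks after which a collision is expected: 2^(n/2)."""
    return 2 ** (block_bits // 2)

def collision_probability(num_blocks: int, block_bits: int) -> float:
    """Birthday approximation: 1 - exp(-q(q-1) / 2^(n+1)) for q blocks of n bits."""
    exponent = num_blocks * (num_blocks - 1) / 2 ** (block_bits + 1)
    return -math.expm1(-exponent)      # accurate even when the probability is tiny

blocks_64 = birthday_blocks(64)        # 2^32 blocks for a 64-bit block (DES, 3DES)
bytes_64 = blocks_64 * 8               # 8 bytes per block
gib_64 = bytes_64 // 2 ** 30           # 32 GiB of data under one key

p_64 = collision_probability(2 ** 32, 64)      # 64-bit block, 2^32 blocks
p_128 = collision_probability(2 ** 32, 128)    # 128-bit block, same number of blocks
print(gib_64, p_64, p_128)
```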

2.2.1.2 AES (Advanced Encryption Standard)

Figure 3 – AES encryption process


AES is a symmetric encryption algorithm that was created to replace DES and 3DES encryption. Joan Daemen and Vincent Rijmen developed AES. It was adopted as the NIST encryption standard in 2001 [36]. Figure 3 shows the AES encryption process.

AES supports key lengths of 128, 192, and 256 bits. It encrypts a 128-bit block of plaintext and generates a 128-bit block of ciphertext. It uses only one key for encryption and decryption. AES encryption consists of repeated rounds of the following steps: sub bytes (replacing bytes using an S-box table), shift rows, mix columns, and add round key. The last round performs all steps except mix columns. AES decryption reverses the encryption process.

Key Size (bits)    Total Rounds

128                10

192                12

256                14

Table 1 - AES Key Size and Number of Rounds.

Table 1 shows the total rounds performed based on key size [36] [37]. In recent years most AES implementations use the 256-bit key length instead of the 192-bit key. Even though AES 256 has a longer key, the design of its key schedule makes it more vulnerable to sub-key attacks. AES is subject to a theoretical brute-force attack, but even with current technology it would take on the order of a quintillion years to break the encryption key. Some additional theoretical attacks are documented: cryptanalytic attacks, related-key attacks on AES 192 and 256, middle attacks on AES 128, and a first key attack on all of AES. Exploits of AES 256 have received more focus from the security community than those of AES 128 and AES 192. Despite this, all AES versions are considered unbreakable with today’s technology [37] [38].

2.2.1.3 Blowfish

Blowfish is a symmetric block cipher that was designed by Bruce Schneier to

replace DES in 1993.

Figure 4 - Blowfish Algorithm

In Figure 4, the original design of Blowfish manipulates data in 64-bit blocks, processed as two 32-bit halves, with a variable key size that scales from 32 bits to 256 bits. The algorithm uses the XOR operation, table lookup (S-box), and modular addition. It has the same Feistel structure as the DES algorithm. This algorithm uses precomputable sub-keys to speed up encryption. After a year, to increase the security, the key size was increased from 256 bits to 448 bits and published in Dr. Dobb's Journal [39].

If a short key length is chosen, the keys behave like weak keys in Blowfish, which makes it vulnerable to chosen-key and related-key attacks. Due to its Feistel structure and key-dependent S-box substitution, it is also prone to slide and simple power attacks. Because Blowfish is a block cipher, it is also vulnerable to the attacks that block ciphers in general are prone to, such as side-channel, exhaustive search, and birthday attacks, to name a few [40].

2.2.1.4 Twofish

The Twofish symmetric encryption algorithm is similar to AES. It uses key lengths of 128, 192, and 256 bits and a 128-bit block. The National Institute of Standards and Technology selected it as one of the top 5 finalists, but it was not selected for standardization in the end. Still, recently developed encryption software for storage and file systems incorporates this algorithm (e.g., TrueCrypt, BestCrypt, Dm-crypt, and DiskCryptor). The Twofish algorithm is one of the ciphers included in the OpenPGP standard, and it is free to use with no restrictions.


Figure 5 – Twofish process [41]

Figure 5 shows that the Twofish algorithm uses pre-computed key-dependent S-boxes and, like AES, a key schedule. The first half of the key is used for encryption and the second half is used for an S-box lookup and for modifying the encryption algorithm. Twofish’s design looks like a mix of DES and AES: one half is like DES in that it uses a Feistel structure, and the other half is like AES in that it uses S-boxes and a Maximum Distance Separable matrix. Twofish’s 128-bit key encryption is slower than its AES counterpart, but the 256-bit key encryption is faster [41].

Researchers claimed that, when weak key pairs were present, the Twofish cipher might be vulnerable to partial chosen-key and related-key attacks. But it was determined that the existence of these key pairs was not realistic, so the proposed attacks would not work [42]. Over time, scholars found that the Twofish cipher has vulnerabilities after all. One attack, SPA (Simple Power Analysis), revealed the secret key of the cipher. Because Twofish uses S-boxes with 8-bit predefined permutations and round operations, it is prone to a side-channel attack that discovers the encryption key in one iteration [43].

2.2.1.5 Serpent

Serpent is also a block cipher, and it was published in 1998 by Ross Anderson, Eli

Biham and Lars Knudsen. This algorithm was selected as one of the finalists by the

US National Institute of Standards and Technology [44].

Figure 6 – Serpent Algorithm - [45]

Figure 6 shows the process of Serpent. Serpent supports 128-, 192-, and 256-bit key lengths and a 128-bit block processed as four 32-bit words. It is a substitution-permutation network with 4-bit S-boxes that runs 32 rounds, each including a key mixing operation [44].

In 2011, a cryptographic analysis using a multidimensional linear method was performed to find vulnerabilities in Serpent. Researchers showed that Serpent reduced to 11 rounds, with a 128-bit key length and key mixing operations, can be broken to find the encryption key [46].

2.2.2 Block Cipher Modes

A block cipher mode repeatedly applies a cipher's single-block operation to data that is longer than one block, in order to achieve confidentiality and authenticity. Different modes suit different operating environments and requirements. The following are some of the most commonly used block cipher modes.

2.2.2.1 CBC (CIPHER BLOCK CHAINING) MODE:

CBC mode uses the IV (Initialization Vector) to encrypt the first plaintext block: the IV is XORed with the first plaintext block before that block is encrypted. Each subsequent plaintext block is XORed with the previous ciphertext block before encryption. The ciphertext is stored in a feedback register and used as input to the XOR function with the next plaintext block. This process repeats until all the plaintext has been processed. From the second block onwards, every block depends on the previous blocks. Decryption applies the same steps in reverse order. To recover a plaintext block, decrypt the ciphertext block with the key and then XOR the result with the previous ciphertext block (the IV for the first block). After each decryption cycle the ciphertext block is stored in the feedback register.

Encryption: Ci = Ek(Pi ⊕ Ci−1) and Decryption: Pi = Ci−1 ⊕ Dk(Ci), where C0 = IV
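The chaining rule, Ci = Ek(Pi XOR Ci-1) for encryption and Pi = Ci-1 XOR Dk(Ci) for decryption, can be sketched as follows. The block cipher here is a toy 64-bit Feistel construction built from SHA-256, a hypothetical stand-in for DES or AES that must not be used for real data; the function names and the sample key, IV, and message are assumptions, and the input is assumed to be already padded to a multiple of the block size.

```python
import hashlib

BLOCK = 8  # bytes; the toy cipher below has a 64-bit block

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(half: bytes, key: bytes, rnd: int) -> bytes:
    """Toy round function (assumption): keyed hash truncated to 4 bytes."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in range(8):
        left, right = right, _xor(left, _f(right, key, rnd))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for rnd in reversed(range(8)):
        left, right = _xor(right, _f(left, key, rnd)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0, "sketch assumes pre-padded input"
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))  # Ci = Ek(Pi ^ Ci-1)
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(key, block), prev))              # Pi = Ci-1 ^ Dk(Ci)
        prev = block
    return b"".join(out)

key, iv = b"toy key", bytes(BLOCK)
message = b"ABCDEFGH" * 3
ciphertext = cbc_encrypt(key, iv, message)
roundtrip = cbc_decrypt(key, iv, ciphertext)

# Flip one bit in the first ciphertext block and decrypt again.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
damaged = cbc_decrypt(key, iv, tampered)
print(roundtrip == message, damaged[16:] == message[16:])
```

Although the three plaintext blocks are identical, chaining makes the ciphertext blocks differ from one another. The tampering experiment shows the error behaviour of CBC decryption: the modified block decrypts to garbage, the same bit is flipped in the following plaintext block, and later blocks are unaffected.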


Figure 7 - CBC Encryption and Decryption

Figure 7 shows the CBC encryption and decryption process. The CBC structure is exposed to some vulnerabilities. For example, in CBC mode the encryption process cannot start until there is enough plaintext data to fill the entire block being processed. In secured network communications, terminals need to send each character or string of bytes to the destination host immediately and cannot wait until the block is full. When the string of bytes is smaller than a block, CBC mode cannot encrypt it without padding. Another weakness is that, because of chaining and the birthday paradox, identical patterns that leak information about the plaintext are expected about every 2^(m/2) blocks (m = block size in bits). These issues can be mitigated, for example by taking care of the message starting point and endpoint, and by including controlled redundancy and authentication [27].

If an attacker alters some bits of a ciphertext block and the change goes undetected, that block decrypts to gibberish. Sometimes this is not an issue, but other times it can cause problems. Altering the ciphertext by even one bit also gives the subsequent block the wrong input, which affects the decryption of that block. The combination of SSL v3 and TLS v1 with CBC is not recommended, because a single set of initialization vectors is used for the entire traffic of the communication. This exposes the targeted block to a padding oracle attack, in which an attacker learns the padding information and then determines the plaintext bytes from the ciphertext by running multiple queries [47]. This was addressed in TLS 1.2, which checks for multiple queries and stops the connection to prevent that type of query; upgrading all secure communications to implement this change was recommended.

2.2.2.2 CFB (CIPHER-FEEDBACK) MODE

Usually block encryption cannot start until a full block of data is received. As mentioned in the CBC section, CBC cannot handle a string of bytes that is smaller than a block. CFB mode, on the other hand, can handle such a smaller string of bytes. It derives the next keystream block by encrypting the previous ciphertext. This keystream is used in the next iteration to encrypt the next plaintext bytes.

Figure 8 - CFB mode with 8 bits

Figure 8 shows the encryption and decryption for an n-bit block. Encryption and decryption operate on the block size with shifting and XOR operations.

Encryption:

Ci = Pi ⊕ Ek(Ci−1)

Decryption:

Pi = Ci ⊕ Ek(Ci−1)
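Both CFB equations use only the encryption direction Ek, and the XOR is applied to however many bytes are available, so a final short block needs no padding. The sketch below illustrates full-block CFB under stated assumptions: `toy_E` is a keyed SHA-256 hash truncated to 16 bytes, a hypothetical stand-in for the block cipher's encryption function, and the key, IV, and message are sample values.

```python
import hashlib

BLOCK = 16  # bytes

def toy_E(key: bytes, block: bytes) -> bytes:
    """Stand-in for Ek (assumption): keyed hash truncated to one block."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def _xor(a: bytes, b: bytes) -> bytes:
    # zip stops at the shorter input, so a short final block is handled naturally
    return bytes(x ^ y for x, y in zip(a, b))

def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        c = _xor(plaintext[i:i + BLOCK], toy_E(key, prev))   # Ci = Pi ^ Ek(Ci-1)
        out.append(c)
        prev = c
    return b"".join(out)

def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        out.append(_xor(c, toy_E(key, prev)))                # Pi = Ci ^ Ek(Ci-1)
        prev = c
    return b"".join(out)

key, iv = b"toy key", bytes(BLOCK)
message = b"twenty-one byte input"      # 21 bytes: one full block plus 5 bytes
ciphertext = cfb_encrypt(key, iv, message)
roundtrip = cfb_decrypt(key, iv, ciphertext)
print(len(message), len(ciphertext), roundtrip == message)
```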

CFB operates as a stream cipher on both the encryption side and the decryption side. The encryption and decryption keystream generators need to derive exactly the same keystream on corresponding iterations. If either of them misses a cycle, the result is wrong ciphertext or plaintext. CFB mode is like CBC mode in that one incorrect bit can propagate to the subsequent processing [27].

2.2.2.3 LRW (Liskov, Rivest, and Wagner) mode

The LRW mode was introduced to prevent the attacks on CBC mode. It is a tweakable narrow-block encryption mode: a keyed permutation is applied to the plaintext P together with a known tweak I, and the result is the ciphertext block C. This method uses two keys. The first key K is used by the block cipher to encrypt the plaintext, and the second key F is used for a finite field multiplication. The key F is the same size as a block; it is multiplied by the tweak in the finite field, and the outcome X is XORed with the block before and after encryption [48].

Encryption:

C = Ek(P ⊕ X) ⊕ X

where X = F ⊗ I

Decryption:

P = Dk(C ⊕ X) ⊕ X

Here ⊕ is XOR and ⊗ is multiplication in the finite field (GF(2^128) for AES). The multiplication can be sped up with a precomputation on the tweak:

F ⊗ I = F ⊗ (I0 ⊕ δ) = F ⊗ I0 ⊕ F ⊗ δ

where δ represents all possible values in the binary finite field GF(2^128).

This method protects against the CBC mode attacks, but it still has its own leak: if the attacker changes a single block, the change affects only that cipher block and not the subsequent cipher blocks.
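The LRW construction, in which a tweak value X = F ⊗ I is XORed with the block before and after encryption, can be sketched with a plain implementation of multiplication in GF(2^128). Several assumptions apply: blocks are Python integers, the field uses the reduction polynomial x^128 + x^7 + x^2 + x + 1 with the ordinary polynomial bit order (real implementations fix a specific byte and bit order), and the 128-bit block cipher is a toy Feistel network built from SHA-256 standing in for AES. All function names and sample values are hypothetical.

```python
import hashlib

MASK64 = (1 << 64) - 1
REDUCTION = (1 << 128) | 0x87      # x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^128) (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:               # degree reached 128: reduce
            a ^= REDUCTION
    return result

def _f(half: int, key: bytes, rnd: int) -> int:
    data = key + bytes([rnd]) + half.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def toy_encrypt(key: bytes, block: int) -> int:
    """Toy 128-bit block cipher (assumption), a stand-in for Ek."""
    left, right = block >> 64, block & MASK64
    for rnd in range(8):
        left, right = right, left ^ _f(right, key, rnd)
    return (left << 64) | right

def toy_decrypt(key: bytes, block: int) -> int:
    left, right = block >> 64, block & MASK64
    for rnd in reversed(range(8)):
        left, right = right ^ _f(left, key, rnd), left
    return (left << 64) | right

def lrw_encrypt(key: bytes, f_key: int, tweak: int, p: int) -> int:
    x = gf_mul(f_key, tweak)               # X = F (x) I
    return toy_encrypt(key, p ^ x) ^ x     # C = Ek(P ^ X) ^ X

def lrw_decrypt(key: bytes, f_key: int, tweak: int, c: int) -> int:
    x = gf_mul(f_key, tweak)
    return toy_decrypt(key, c ^ x) ^ x     # P = Dk(C ^ X) ^ X

key = b"toy key"
f_key = 0x0123456789ABCDEF0F1E2D3C4B5A6978
plain = 0x00112233445566778899AABBCCDDEEFF
c_block5 = lrw_encrypt(key, f_key, 5, plain)
c_block6 = lrw_encrypt(key, f_key, 6, plain)
recovered = lrw_decrypt(key, f_key, 5, c_block5)
print(recovered == plain, c_block5 != c_block6)
```

The same plaintext stored at two different block positions is encrypted under two different tweak values. Because multiplication distributes over XOR, F ⊗ (I0 ⊕ δ) equals (F ⊗ I0) ⊕ (F ⊗ δ), which is what makes the precomputation on the tweak possible.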

2.2.2.4 XTS Mode

Figure 9 - XTS mode

Figure 9 shows XTS mode, which is AES in XEX (XOR-Encrypt-XOR) tweakable codebook mode with ciphertext stealing. In simplified form, the XEX method first encrypts the sector number to derive a tweak value X, XORs X with the plaintext, applies AES encryption to the result, and XORs X with the output again to generate the final ciphertext. Ciphertext stealing is a technique that allows the encryption of messages whose length is not divisible by the block size without padding them; the ciphertext is the same size as the plaintext, but the process is more complex.

X = Ek(I) ⊗ α^j

C = Ek(P ⊕ X) ⊕ X

P - The plaintext.

I - The number of the sector.

α - The primitive element of GF(2^128) defined by the polynomial x (i.e., the number 2).

j - The number of the block within the sector.

XTS mode has vulnerabilities similar to those of CBC mode. For example, tampering with data can go unrecognized and, when decryption occurs, generate gibberish. The system must be built to recognize this potential threat and be able to protect the data using checksums and authentication tags. This mode is prone to other vulnerabilities such as replay attacks and randomization attacks. If the attacker has access to ciphertext blocks, they can analyze them and use them for replay attacks and randomization attacks [49].

2.2.2.5 GCM (Galois/Counter Mode)

Figure 10 - GCM mode

Figure source: https://en.wikipedia.org/wiki/File:GCM-Galois_Counter_Mode.svg

Figure 10 shows GCM, a mode of operation for symmetric-key block ciphers. It is closely related to GMAC (Galois Message Authentication Code), the authentication-only variant used for authenticated incremental message communication. All blocks are numbered and then encrypted using the XOR operation (similar to a stream cipher, with the operation order given by counters). GCM uses a hash key H, which is a string of 128 zero bits encrypted using the block cipher. For encryption, along with the hash key, it uses a unique arbitrary-length initialization vector for each stream [50].

GCM mode does not have the vulnerabilities of CBC. For example, in CBC mode tampering can go unnoticed, whereas GCM performs authenticated encryption, which keeps data and communication confidential. It also maintains integrity by using an authentication tag to verify the data. It uses reasonable hardware resources (memory, CPU, etc.), performs very efficiently because blocks can be processed in parallel, and provides high-speed communication [50].

The key in GCM mode is used in a way similar to LRW mode: a Galois field multiplication is performed for each 128-bit cipher block ($GF(2^{128})$ for AES). The GF polynomial is defined as:

$$x^{128} + x^7 + x^2 + x + 1$$

Feeding the blocks of data into the GHASH function and encrypting the output

will generate the authentication tag.

$$\mathrm{GHASH}(H, A, C) = X_{m+n+1}$$

H - The hash key.

A - The authenticated data (data that is only authenticated, not encrypted).

C - The ciphertext.

m - The number of 128-bit blocks in A.

n - The number of 128-bit blocks in C.
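The GHASH computation described above can be sketched as follows. This is a minimal illustration, not a complete GCM implementation: the hash key `h` is passed in as a 128-bit integer (in GCM it would come from encrypting 128 zero bits with the block cipher), the function names are my own, and the final encryption step that turns the GHASH output into the authentication tag is omitted.

```python
R = 0xE1 << 120   # x^128 + x^7 + x^2 + x + 1 in GCM's bit ordering
BLOCK = 16        # block size in bytes

def gf_mult(x: int, y: int) -> int:
    """Multiply two elements of GF(2^128) using GCM's bit convention,
    in which the most significant bit is the coefficient of x^0."""
    z, v = 0, y
    for i in range(127, -1, -1):
        if (x >> i) & 1:
            z ^= v
        # Multiply v by the polynomial x, reducing when a term overflows.
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def to_blocks(data: bytes) -> list:
    """Zero-pad to a multiple of 16 bytes and split into 128-bit integers."""
    padded = data + b"\x00" * (-len(data) % BLOCK)
    return [int.from_bytes(padded[i:i + BLOCK], "big")
            for i in range(0, len(padded), BLOCK)]

def ghash(h: int, a: bytes, c: bytes) -> int:
    """Return X_{m+n+1}: fold the m blocks of A, the n blocks of C, and one
    length block into a running value, multiplying by H after each block."""
    length_block = ((len(a) * 8) << 64) | (len(c) * 8)
    x = 0
    for block in to_blocks(a) + to_blocks(c) + [length_block]:
        x = gf_mult(x ^ block, h)
    return x

if __name__ == "__main__":
    h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E   # example value for the hash key
    print(hex(ghash(h, b"header", b"ciphertext bytes")))
```

Changing any bit of A or C changes a block that is folded into the running value, so the final value $X_{m+n+1}$ changes as well, which is what lets the tag detect tampering.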


This encryption method has been shown to be secure and efficient. Currently, Google uses it as the mode for its website certificate.

2.2.3 Encryption Methods for SSD

SSDs serve as a typical alternative to HDDs. In fact, an SSD largely emulates HDD technology, such as the communication protocol and hardware interfaces, so HDD technology can quickly be adapted to SSDs. However, the methods that SSDs employ to store, manage, access, and secure data are different from those of HDDs. Because of the differences between the two technologies, it is possible that processing the same commands will produce different results on an SSD than on an HDD [11]. These differences need to be considered when it comes to encryption. A couple of encryption techniques have been used for SSDs, and this section discusses those methods.

2.2.3.1 Dm-crypt

Dm-crypt is a disk encryption method compatible with Linux kernel version 2.6 or later. It uses API routines, and devices are mapped to encrypted containers using a device mapper [51]. This API uses the AES-256 cryptographic method along with other methods. Dm-crypt uses Linux Unified Key Setup (LUKS) to create encrypted containers that are independent of outside platforms. LUKS was developed by Clemens Fruhwirth in 2004 [52]. Using this method, a user can even encrypt the root device. A passphrase is required to create encrypted containers.

There has been some research into the drawbacks of dm-crypt. For example, it has been discovered that attackers can sidestep the passphrase and access encrypted containers by hitting the ‘Enter’ key a couple of times. They can also delete the containers, because deleting a container does not require the passphrase. Using disk commands on the system, an intruder can determine critical components of the hidden containers relatively easily [53] [54].

2.2.3.1.1 Process Method

Dm-crypt uses the device mapper and the Linux kernel’s Crypto API routines. This API is built with a cryptographic method using the AES-256 algorithm. Dm-crypt supports XTS, LRW, and other modes for encryption. The encrypted containers are stored as files inside a folder. Users can create these containers (volumes) with the LUKS (Linux Unified Key Setup) encryption specification, protected by a passphrase. The system device mapper mounts the encrypted containers on top of existing devices. Clemens Fruhwirth created LUKS in 2004. Dm-crypt uses it to create encrypted containers that are independent of the existing platform and compatible from system to system.

2.2.3.1.2 Weaknesses

Using this method, a user can encrypt the root device, but may need a smart device attached to the system in order to boot the primary system. A passphrase is required when creating a container, but it is not even requested when deleting one. This method is mainly used on Linux-like systems. Some research has shown that the passphrase can be bypassed to access the containers by just pressing the Enter key a couple of times. The file system information displays the sizes of volumes, which may allow someone to guess information about the hidden containers.


2.2.3.2 BestCrypt

BestCrypt is encryption software that was implemented in 1995 and is still in use. It creates, mounts, and manages encrypted volumes called containers. Because this encryption software is still in use, it is selected for evaluation.

2.2.3.2.1 Process Method

This encryption method stores files in encrypted containers and keeps them safe from unauthorized access. The benefit of BestCrypt is that system disk volumes can be mounted when needed and stored as encrypted files when not in use. This method can be applied to removable media, network shares, archived storage, and email attachments on Windows or Linux. It uses the following cryptographic methods: AES, Blowfish, DES, Triple DES, Twofish, Serpent, and GOST 28147-89. All of these cryptographic methods use LRW and CBC modes. AES, Twofish, and Serpent also use XTS mode [55].

2.2.3.2.2 Weaknesses

BestCrypt seems a viable option, but like any software it can have bugs, and these errors can be severe enough to damage entire partitions.

2.2.3.3 FDE (Full Disk Encryption)

FDE is a hardware encryption method whose implementation started in 2009 and which is still in use. It encrypts all the partitions, system files, and the operating system using a hardware component. This technique is used by Samsung SSDs, which are commonly used. An SSD to which FDE is applied is called an SED (Self-Encrypting Drive). Self-encrypting SSDs provide better performance than SSDs on which encryption software is installed [12]. This encryption implementation method is selected for evaluation.

https://en.wikipedia.org/wiki/GOST_28147-89


When Full Disk Encryption (FDE) is applied on an SSD, the drive is called a Self-Encrypting Drive (SED). FDE was developed in 2009. It is a literal encryption of the entire system, which includes all the partitions, system files, and the operating system. This encryption method assigns the encryption process to a hardware component of the drive. This helps to enhance security by utilizing the Opal Storage Specification (a set of specifications for the features of SEDs) [12]. An SED needs a master password for the SED and a user password for each user. They are stored in the BIOS and handled by the hard disk controller. SEDs use AES-128 and AES-256.

Researchers have found the following vulnerabilities in this method: Hot Plug Attack, Hot Unplug Attack, Forced Restart Attack, and Key Capture Attack. They have also shown that attackers can bypass the encryption and access data, which undermines the purpose of securing the data [56].

2.2.3.3.1 Process Method

This encryption method delegates the process logic to a dedicated hardware component of the drive, using the Opal Storage Specification (a set of specifications for the features of SEDs) to enhance security. The hard disk controller handles key management, which enhances security and protects the data from unauthorized access. An SED has two passwords, a User password and a Master password, and both are stored in the BIOS. The Master password is generated by the SED, and the User password is generated by users for system access. If the User password is lost or forgotten, the Master password can be used to unlock the system. The SED uses the following cryptographic methods: AES-128 and AES-256. A BIOS password is used for pre-boot authentication of the system.


2.2.3.3.2 Weaknesses

Several attacks are related to this method: Hot Plug Attack, Hot Unplug Attack, Forced Restart Attack, and Key Capture Attack. Research has shown that an attacker can bypass the encryption and access data, which undermines the purpose of securing the data [56].

2.2.4 Comparable Encryption for Evaluations

“AES Crypt is a file encryption software available on several operating systems

that uses the industry standard Advanced Encryption Standard (AES) to easily and

securely encrypt files.” It is referred to in this dissertation as AES-Crypt [57].

2.2.5 Homomorphic Encryption

The first feasible version of fully homomorphic encryption was introduced by Craig Gentry in 2009, applying addition and multiplication to encrypted data over circuits [58] [2]. Research has shown that there are advantages to leveraging homomorphic encryption in the Cloud and in Multi-Party Computing environments [24] [25]. Most of the previous implementations were asymmetric homomorphic methods, but researchers observed some behaviors that were not practical for real-world usage:

Key Sizes: Ranged from 17 MB to 2.25 GB

Key Generation Time: Ranged from 2.5 seconds to 2.2 hours

Ciphertext Size: Much larger ciphertexts

Noise: Noise creation exceeding thresholds

Time: Very long execution times


These weaknesses made homomorphic encryption impractical to use in the cloud or in real-time systems [3] [59]. Currently there is no encryption method in production that can take advantage of homomorphic features for any system [26].
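To make the idea of computing on encrypted data concrete, the sketch below uses textbook (unpadded) RSA with deliberately tiny parameters. It is not Gentry's scheme and not RVTHE. It only illustrates a homomorphic property for one operation: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. The parameter values and function names are chosen for illustration only and are far too small to be secure.

```python
# Toy textbook RSA (insecure parameters, for illustration only).
P, Q = 61, 53
N = P * Q                 # modulus, 3233
E = 17                    # public exponent
D = 2753                  # private exponent: E * D = 1 mod (P - 1)(Q - 1)

def encrypt(m: int) -> int:
    """c = m^E mod N."""
    return pow(m, E, N)

def decrypt(c: int) -> int:
    """m = c^D mod N."""
    return pow(c, D, N)

if __name__ == "__main__":
    c1, c2 = encrypt(7), encrypt(3)
    # Multiply the ciphertexts without ever decrypting them...
    c_product = (c1 * c2) % N
    # ...and the result decrypts to the product of the plaintexts.
    print(decrypt(c_product))
```

A fully homomorphic scheme supports both addition and multiplication on ciphertexts, which is what makes arbitrary computation on encrypted data possible and also what drives the key-size and noise problems listed above.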

There must be a way to create an encryption methodology that derives great value from the unique features of homomorphic encryption. Using versors from Clifford Algebra, I developed a symmetric homomorphic encryption scheme. The next section discusses the mathematical foundation of the new encryption method.

2.3 Mathematical Foundation

This section discusses the mathematical foundation which was used to architect

RVTHE.

Algebra is the base for most homomorphic encryption methods. It uses positive numbers, real numbers, complex numbers, linear algebra, geometric algebra, and function spaces (e.g., Hilbert Spaces and Clifford Algebra) for number fields. If a Geometric Algebra uses vector spaces with a quadratic form and is associative, then it is called a Clifford Algebra. I chose to use Clifford Algebra for RVTHE because it calculates a geometric product of vectors and the generated results are not traceable, which is ideal for the level of security that we want to achieve. It is therefore important to understand these Clifford Geometric Algebra terms [60]:


Vector: “a quantity having direction as well as magnitude, especially as

determining the position of one point in space relative to another.”

Vector Dimension: “Let V be a finite dimensional vector space over the field

𝔽. The Dimension of V denoted dim𝔽 V is the number of vectors in any

basis of V. If V is an infinite dimensional vector space over 𝔽 then we

write dim𝔽 V =∞”.

We can represent an n-dimensional vector as “nD”.

For example, if n = 2, then “2D” is used to represent a 2-dimensional vector.

Vector Space or Bivectors: “a space consisting of vectors, together with the

associative and commutative operation of addition of vectors, and the

associative and distributive operation of multiplication of vectors by scalars.”

Multivector: “a mathematical structure comprising a linear combination of

elements of different grade, such as scalars, vectors, bivectors, tri-vector, etc.”

Geometric Algebra Axioms: To understand combinations of scalars, vectors, and bivectors, we first need to know the axioms behind geometric algebra. These are the established axioms of geometric algebra. Vectors are represented by $(a, b, c)$, scalars by $(\lambda, \varepsilon)$, and bivectors by $(ab, ba, ac, \text{etc.})$.

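Axiom 1: associative rule

$$a(bc) = (ab)c \qquad (4.1.1.1)$$

Axiom 2: distributive rules

$$a(b + c) = ab + ac, \qquad (b + c)a = ba + ca \qquad (4.1.1.2)$$

Axiom 3: $(\lambda a)b = \lambda(ab) = \lambda ab \quad [\lambda \in \mathbb{R}]$ (4.1.1.3)

Axiom 4: $\lambda(\varepsilon a) = (\lambda\varepsilon)a \quad [\lambda, \varepsilon \in \mathbb{R}]$ (4.1.1.4)

Axiom 5: $\lambda(a + b) = \lambda a + \lambda b \quad [\lambda \in \mathbb{R}]$ (4.1.1.5)

Axiom 6: $(\lambda + \varepsilon)a = \lambda a + \varepsilon a \quad [\lambda, \varepsilon \in \mathbb{R}]$ (4.1.1.6)

Axiom 7: $a^2 = |a|^2$ (4.1.1.7)

Axiom 8: $|a \cdot b| = |a||b| \cos\theta$ (4.1.1.8)

Axiom 9: $|a \wedge b| = |a||b| \sin\theta$ (4.1.1.9)

Axiom 10: $ab = a \cdot b + a \wedge b$ (4.1.1.10)

Axiom 11: $a \wedge b = -b \wedge a$ (4.1.1.11)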

Product of Vectors: The results of multiplying vectors using the scalar and cross products. These two products are the foundation for geometric algebra’s inner, outer, and geometric products of vectors.

o Scalar Product: (Also known as dot product) A scalar equal to the product of the magnitudes of two vectors and the cosine of the angle between them.

o Cross Product: (Also known as vector product) A binary operation on two vectors in three-dimensional space.

o Outer Product: (Also known as wedge product) An antisymmetric product of two vectors that yields a bivector.

o Inner Product: The dot product of the Cartesian coordinates of two vectors.

o Geometric Product: The sum of the inner and outer products.

Vector Inverse: When the geometric product of a vector A and another vector B is “1”, vector B is called the inverse of vector A, and vice versa.

Blade: The outer product of k vectors is called a k-blade, where k indicates the grade of the blade. For example, a 1-blade is a vector, a 2-blade is a bivector, a 3-blade is a trivector, and so on.

Versors: A versor is the geometric product of multiple vectors, following Clifford Geometric Algebra.


To show how Clifford Geometric Algebra is represented in math, I will use two

dimensional (2D) vectors for inner product, outer product, and geometric product

representations [21] [60].
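Before the two-dimensional expansions, one consequence of Axioms 10 and 11 is worth writing out. Assuming the inner product is symmetric ($a \cdot b = b \cdot a$), the inner and outer products of two vectors can both be recovered from the geometric product:

```latex
\begin{align*}
ab &= a \cdot b + a \wedge b \\
ba &= b \cdot a + b \wedge a = a \cdot b - a \wedge b \\
ab + ba &= 2\,(a \cdot b) \\
ab - ba &= 2\,(a \wedge b)
\end{align*}
```

In other words, the inner product is the symmetric part of the geometric product and the outer product is its antisymmetric part.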

2.3.1 Geometric Algebra Overview

Geometric Algebra combines the work of Hamilton (quaternions) and Grassmann (non-commutative algebra) into a field that generalizes the product of two vectors, including the 3-dimensionally restricted “cross product”, to an n-dimensional subspace of the vector space $V$ over number fields ($\mathbb{Z}, \mathbb{R}, \mathbb{C}, \mathbb{N}$, etc.), such that the subspace is a product space that allows two vectors to have a “geometric product” [60]:

$$V_1 V_2 = V_1 \cdot V_2 + V_1 \wedge V_2$$

where $V_1$ and $V_2$ are vectors or multivectors (i.e., a collection of “blades”). The operation $V_1 \wedge V_2$ is known as the “wedge product” or “exterior product.” The operation $V_1 \cdot V_2$ is the “dot product” or “interior product” (also known as the “inner product”).

For a simple pair of two-dimensional vectors:

$$V_1 = a_1 e_1 + a_2 e_2$$

$$V_2 = b_1 e_1 + b_2 e_2$$

where $\{e_1, e_2\}$ are unit vectors and $\{a_i\}, \{b_i\}, i = 1, 2$ are scalars, the geometric product follows the rules of Geometric Algebra, as described below (for $i \neq j$):

$$e_i \wedge e_i = 0$$

$$e_i \wedge e_j = -e_j \wedge e_i$$

$$e_i \wedge e_j = e_{ij} \quad \text{(compact notation)}$$

$$e_i \cdot e_i = 1$$

$$e_i \cdot e_j = 0$$

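Thus, by performing the geometric product of $V_1$ and $V_2$ we have

$$V_1 V_2 = \underbrace{\Big[ (a_1 b_1)\overbrace{e_1 \cdot e_1}^{e_i \cdot e_i = 1} + (a_1 b_2)\overbrace{e_1 \cdot e_2}^{e_i \cdot e_j = 0} + (a_2 b_1)\overbrace{e_2 \cdot e_1}^{e_j \cdot e_i = 0} + (a_2 b_2)\overbrace{e_2 \cdot e_2}^{e_j \cdot e_j = 1} \Big]}_{\text{dot product}} + \underbrace{\Big[ (a_1 b_1)\overbrace{e_1 \wedge e_1}^{e_i \wedge e_i = 0} + (a_1 b_2)\, e_1 \wedge e_2 + (a_2 b_1)\overbrace{e_2 \wedge e_1}^{e_j \wedge e_i = -e_i \wedge e_j} + (a_2 b_2)\overbrace{e_2 \wedge e_2}^{e_j \wedge e_j = 0} \Big]}_{\text{wedge product}}$$

Resulting in

$$V_1 V_2 = (a_1 b_1 + a_2 b_2) + (a_1 b_2 - b_1 a_2)\, e_1 \wedge e_2$$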

The product $V_1 V_2$ produces a scalar and an object $e_1 \wedge e_2$, which in compact notation is written as $e_{12}$ and represents an oriented area created by the $e_1 \wedge e_2$ rotation (clockwise). The reversed order, $e_2 \wedge e_1 = -e_1 \wedge e_2$, represents the same area in the anti-clockwise sense. The orientation is given by the sign of the term in front of the $e_1 \wedge e_2$ component.

A versor is a product of vectors in the geometric product space, which has simpler inverse characteristics:

$$V = V_1 V_2 V_3 \ldots V_n$$

2.3.2 Inner Product

The inner product (also called the dot product or scalar product) transforms a pair of vectors into a scalar. The inner product of vectors ‘a’ and ‘b’ is represented by “$a \cdot b$”.

If ‘a’ and ‘b’ are vectors defined as $a = (a_1 e_1 + a_2 e_2)$ and $b = (b_1 e_1 + b_2 e_2)$, then:

$$a \cdot b = (a_1 e_1 + a_2 e_2) \cdot (b_1 e_1 + b_2 e_2)$$

$$a \cdot b = (a_1 b_1\, e_1 \cdot e_1 + a_1 b_2\, e_1 \cdot e_2 + a_2 b_1\, e_2 \cdot e_1 + a_2 b_2\, e_2 \cdot e_2)$$

$$a \cdot b = a_1 b_1 + a_2 b_2$$

The inner product is a scalar built from the products of the corresponding vector components. If we reverse the order of the vectors in the inner product, the resulting value will always be the same:

$$a \cdot b = b \cdot a$$

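Example:

When $a = (2e_1 + 3e_2)$ and $b = (4e_1 + 5e_2)$, the inner product $a \cdot b$ is:

$$a \cdot b = (2e_1 + 3e_2) \cdot (4e_1 + 5e_2)$$

$$a \cdot b = (8\, e_1 \cdot e_1 + 10\, e_1 \cdot e_2 + 12\, e_2 \cdot e_1 + 15\, e_2 \cdot e_2)$$

$$a \cdot b = 8 + 15$$

$$a \cdot b = 23$$

Reversing the order of the vectors, the inner product $b \cdot a$ is:

$$b \cdot a = (4e_1 + 5e_2) \cdot (2e_1 + 3e_2)$$

$$b \cdot a = (8\, e_1 \cdot e_1 + 12\, e_1 \cdot e_2 + 10\, e_2 \cdot e_1 + 15\, e_2 \cdot e_2)$$

$$b \cdot a = 8 + 15$$

$$b \cdot a = 23 = a \cdot b$$

2.3.3 Outer Product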

The outer product of vectors ‘a’ and ‘b’ (also called the wedge product) is represented by “$a \wedge b$”. If ‘a’ and ‘b’ are vectors defined as $a = (a_1 e_1 + a_2 e_2)$ and $b = (b_1 e_1 + b_2 e_2)$, then:

$$a \wedge b = (a_1 e_1 + a_2 e_2) \wedge (b_1 e_1 + b_2 e_2)$$

$$a \wedge b = (a_1 b_1\, e_1 \wedge e_1 + a_1 b_2\, e_1 \wedge e_2 + a_2 b_1\, e_2 \wedge e_1 + a_2 b_2\, e_2 \wedge e_2)$$

$$a \wedge b = (a_1 b_2\, e_1 \wedge e_2 - a_2 b_1\, e_1 \wedge e_2)$$

$$a \wedge b = (a_1 b_2 - a_2 b_1)\, e_1 \wedge e_2$$

$$a \wedge b = (a_1 b_2 - a_2 b_1)\, e_{12}$$

In the above formula, the scalar coefficient “$(a_1 b_2 - a_2 b_1)$” represents the area of a parallelogram in the plane containing the two basis vectors $e_1$ and $e_2$.

Figure 11 - Outer Product

Figure 11 shows the outer product. The outer product of two vectors is antisymmetric, such that $a \wedge b = -b \wedge a$.

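Example:

‘a’ and ‘b’ are vectors with $a = (2e_1 + 3e_2)$ and $b = (4e_1 + 5e_2)$:

$$a \wedge b = (2e_1 + 3e_2) \wedge (4e_1 + 5e_2)$$

$$a \wedge b = (8\, e_1 \wedge e_1 + 10\, e_1 \wedge e_2 + 12\, e_2 \wedge e_1 + 15\, e_2 \wedge e_2)$$

$$a \wedge b = 10\, e_1 \wedge e_2 - 12\, e_1 \wedge e_2$$

$$a \wedge b = -2\, e_1 \wedge e_2 = -2e_{12}$$

If we reverse the order of the vectors, then the outer product $b \wedge a$ is:

$$b \wedge a = (4e_1 + 5e_2) \wedge (2e_1 + 3e_2)$$

$$b \wedge a = (8\, e_1 \wedge e_1 + 12\, e_1 \wedge e_2 + 10\, e_2 \wedge e_1 + 15\, e_2 \wedge e_2)$$

$$b \wedge a = 12\, e_1 \wedge e_2 - 10\, e_1 \wedge e_2$$

$$b \wedge a = 2\, e_1 \wedge e_2 = 2e_{12}, \qquad -b \wedge a = -2e_{12}$$

The math confirms that the outer product is antisymmetric: $a \wedge b = -b \wedge a$.

2.3.4 Geometric Product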

The geometric product of vectors ‘a’ and ‘b’ is represented by “$ab$”. If ‘a’ and ‘b’ are vectors defined as $a = (a_1 e_1 + a_2 e_2)$ and $b = (b_1 e_1 + b_2 e_2)$, then, as per the expansion of $V_1 V_2$ in Section 2.3.1 [60]:

$$ab = (a_1 e_1 + a_2 e_2)(b_1 e_1 + b_2 e_2)$$

$$ab = (a_1 e_1 + a_2 e_2) \cdot (b_1 e_1 + b_2 e_2) + (a_1 e_1 + a_2 e_2) \wedge (b_1 e_1 + b_2 e_2)$$

$$ab = (a_1 b_1\, e_1 \cdot e_1 + a_1 b_2\, e_1 \cdot e_2 + a_2 b_1\, e_2 \cdot e_1 + a_2 b_2\, e_2 \cdot e_2) + (a_1 b_1\, e_1 \wedge e_1 + a_1 b_2\, e_1 \wedge e_2 + a_2 b_1\, e_2 \wedge e_1 + a_2 b_2\, e_2 \wedge e_2)$$

$$ab = (a_1 b_1 + a_2 b_2) + (a_1 b_2\, e_1 \wedge e_2 - a_2 b_1\, e_1 \wedge e_2)$$

$$ab = (a_1 b_1 + a_2 b_2) + (a_1 b_2 - a_2 b_1)\, e_1 \wedge e_2$$

$$ab = (a_1 b_1 + a_2 b_2) + (a_1 b_2 - a_2 b_1)\, e_{12}$$

The output of the geometric product contains two terms. The first term, “$(a_1 b_1 + a_2 b_2)$”, is a scalar. The second term, “$e_{12}$”, is a bivector with a coefficient of “$(a_1 b_2 - a_2 b_1)$”.

The geometric product of two vectors is generally not the same when the order of the vectors is changed, such that $ab \neq ba$. The exception is when the vectors are parallel, in which case $ab = ba$.

Example:

When $a = (2e_1 + 3e_2)$ and $b = (4e_1 + 5e_2)$, as per the expansion of $V_1 V_2$ in Section 2.3.1:

$$ab = (2e_1 + 3e_2) \cdot (4e_1 + 5e_2) + (2e_1 + 3e_2) \wedge (4e_1 + 5e_2)$$

$$ab = (8\, e_1 \cdot e_1 + 10\, e_1 \cdot e_2 + 12\, e_2 \cdot e_1 + 15\, e_2 \cdot e_2) + (8\, e_1 \wedge e_1 + 10\, e_1 \wedge e_2 + 12\, e_2 \wedge e_1 + 15\, e_2 \wedge e_2)$$

$$ab = (8 + 15) + (10\, e_1 \wedge e_2 - 12\, e_1 \wedge e_2)$$

$$ab = 23 - 2\, e_1 \wedge e_2 = 23 - 2e_{12}$$

Reversing the order of the vectors, the geometric product $ba$ is:

$$ba = (4e_1 + 5e_2) \cdot (2e_1 + 3e_2) + (4e_1 + 5e_2) \wedge (2e_1 + 3e_2)$$

$$ba = (8\, e_1 \cdot e_1 + 12\, e_1 \cdot e_2 + 10\, e_2 \cdot e_1 + 15\, e_2 \cdot e_2) + (8\, e_1 \wedge e_1 + 12\, e_1 \wedge e_2 + 10\, e_2 \wedge e_1 + 15\, e_2 \wedge e_2)$$

$$ba = 8 + 15 + 2\, e_1 \wedge e_2$$

$$ba = 23 + 2e_{12}$$

The math confirms that $ab \neq ba$.
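The worked examples for the inner, outer, and geometric products can be checked with a few lines of Python. This is a minimal sketch for 2D vectors only: a vector is a pair of coefficients on $e_1$ and $e_2$, and the function names are my own rather than part of RVTHE.

```python
from typing import Tuple

Vec2 = Tuple[float, float]   # (coefficient of e1, coefficient of e2)

def inner(a: Vec2, b: Vec2) -> float:
    """a . b = a1*b1 + a2*b2 (a scalar)."""
    return a[0] * b[0] + a[1] * b[1]

def outer(a: Vec2, b: Vec2) -> float:
    """Coefficient of e12 in a ^ b, namely a1*b2 - a2*b1."""
    return a[0] * b[1] - a[1] * b[0]

def geometric(a: Vec2, b: Vec2) -> Tuple[float, float]:
    """ab = a . b + a ^ b, returned as (scalar part, e12 coefficient)."""
    return (inner(a, b), outer(a, b))

if __name__ == "__main__":
    a, b = (2, 3), (4, 5)
    print("a.b =", inner(a, b), " b.a =", inner(b, a))
    print("a^b =", outer(a, b), "e12", " b^a =", outer(b, a), "e12")
    print("ab  =", geometric(a, b), " ba =", geometric(b, a))
```

The scalar part is unchanged when the operands are swapped, while the $e_{12}$ coefficient changes sign, which is exactly why $ab \neq ba$ unless the vectors are parallel.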

2.3.5 Inverse of Vector

If the geometric product $A^{-1}_{L} A = 1$, then $A^{-1}_{L}$ is called the left inverse of vector $A$, and if $A A^{-1}_{R} = 1$, then $A^{-1}_{R}$ is called the right inverse of vector $A$. The geometric product is not commutative. Therefore, the left inverse and the right inverse may or may not be equal.

2.3.6 Versors

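“One type of multivector that lends itself for inversion has the form $A = a_1 a_2 a_3 \ldots a_n$, where $a_1, a_2, a_3, \ldots, a_n$ are vectors, and versor $A$ is their collective geometric product. Such multivectors are called versors.”

Versor $A = a_1 a_2 a_3 \ldots a_n$ is a geometric product of vectors.

The reverse of versor $A$ is $A^\dagger = a_n \ldots a_3 a_2 a_1$.

Multiplying $A^\dagger$ with $A$:

$$A^\dagger A = (a_n \ldots a_3 a_2 a_1)(a_1 a_2 a_3 \ldots a_n)$$

$$A^\dagger A = |a_1|^2 |a_2|^2 |a_3|^2 \ldots |a_n|^2$$

Furthermore, multiplying $A$ with $A^\dagger$:

$$A A^\dagger = (a_1 a_2 a_3 \ldots a_n)(a_n \ldots a_3 a_2 a_1)$$

$$A A^\dagger = |a_1|^2 |a_2|^2 |a_3|^2 \ldots |a_n|^2$$

So $A^\dagger A = A A^\dagger$, and it is a scalar. Since $A A^{-1} = 1$, we can say

$$A^\dagger A A^{-1} = A^\dagger$$

$$A^{-1} = \frac{A^\dagger}{A^\dagger A} = \frac{A^\dagger}{A A^\dagger}$$

$$A^{-1} A = \frac{A^\dagger A}{A A^\dagger} = \frac{A^\dagger A}{A^\dagger A} = 1$$

For versors, this implies that $A^{-1}_{L}$ and $A^{-1}_{R}$ are the same.

Suppose $A = a$ is a multivector consisting of a single vector. Writing $A$ in reverse order gives $A^\dagger = a$, so

$$A A^\dagger = |a|^2, \qquad A^{-1} = a^{-1} = \frac{a}{|a|^2}$$

Therefore, given $ab$, we can derive $b$ by multiplying on the left with $a^{-1}$:

$$a^{-1} a = 1, \qquad a^{-1}(ab) = b, \qquad b = \frac{a}{|a|^2}\, ab$$

Similarly, we can obtain $a$ by multiplying on the right with $b^{-1}$:

$$a = ab\, \frac{b}{|b|^2}$$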

Example: using versors and the inverse, we derive a component of the geometric product.

Assume:

secret key $s_1 = 5$ is defined as the vector $a = (2e_1 + 3e_2)$

data value $d_1 = 9$ is defined as the vector $b = (4e_1 + 5e_2)$

secret key $s_2 = 7$ is defined as the vector $c = (3e_1 + 4e_2)$

When $a = (2e_1 + 3e_2)$, $b = (4e_1 + 5e_2)$, and $c = (3e_1 + 4e_2)$:

$$ab = (2e_1 + 3e_2) \cdot (4e_1 + 5e_2) + (2e_1 + 3e_2) \wedge (4e_1 + 5e_2)$$

$$ab = 23 - 2e_{12}$$

$$abc = (23 - 2e_{12})(3e_1 + 4e_2)$$

$$abc = 61e_1 + 98e_2$$

To derive the value of $b = a^{-1}(abc)\,c^{-1}$, first multiply on the right by $c^{-1} = c / |c|^2$ to recover $ab$:

$$ab = (61e_1 + 98e_2)\left(\frac{3e_1 + 4e_2}{25}\right) = 23 - 2e_{12}$$

Then multiply on the left by $a^{-1} = a / |a|^2$:

$$b = \frac{2e_1 + 3e_2}{13}\,(23 - 2e_{12})$$

$$b = \frac{1}{13}\big((46 + 6)e_1 + (69 - 4)e_2\big)$$

$$b = 4e_1 + 5e_2$$

This is the foundation for the new encryption cipher: the geometric product and the inverse play a big role in the development of the new cipher using versors.

Versors make it possible to have multiple vectors in the geometric product, which results in two types of output. The intermediate result (here, $ab$) contains a scalar and a bivector. The result of the geometric product of all three vectors (here, $abc$) is a vector.
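The example above can be reproduced numerically. The sketch below is an illustration only and not the RVTHE implementation: it represents a 2D multivector as a 4-tuple of coefficients on $(1, e_1, e_2, e_{12})$, uses exact fractions so the inverses divide cleanly, and the helper names `gp`, `vector`, and `vector_inverse` are my own.

```python
from fractions import Fraction

def gp(x, y):
    """Geometric product of two 2D multivectors (scalar, e1, e2, e12),
    using e1*e1 = e2*e2 = 1 and e1*e2 = -e2*e1 = e12."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,   # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,   # e1 part
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,   # e2 part
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,   # e12 part
    )

def vector(x, y):
    """Build the vector x*e1 + y*e2 as a multivector."""
    return (Fraction(0), Fraction(x), Fraction(y), Fraction(0))

def vector_inverse(v):
    """Inverse of a vector: v / |v|^2."""
    _, x, y, _ = v
    norm_sq = x * x + y * y
    return (Fraction(0), x / norm_sq, y / norm_sq, Fraction(0))

if __name__ == "__main__":
    a, b, c = vector(2, 3), vector(4, 5), vector(3, 4)
    ab = gp(a, b)            # scalar 23, e12 coefficient -2
    abc = gp(ab, c)          # 61 e1 + 98 e2
    # Undo the two keys: multiply by a^-1 on the left and c^-1 on the right.
    recovered = gp(gp(vector_inverse(a), abc), vector_inverse(c))
    print(ab, abc, recovered)
```

Only a holder of both $a$ and $c$ can strip them from $abc$ in this way, which is the role the two secret keys play in the example.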


CHAPTER 3

PROBLEMS AND LIMITATIONS

In this chapter, I present various security problems with Cloud and SSD storage. I describe various types of cyberattacks and discuss the importance of randomness in encryption methods and its limitations. I evaluate existing encryption methods, their performance on SSDs in the Cloud, and the performance penalties in terms of IOPS. This chapter shows that encryption methods and techniques affect workload performance. I used Amazon Web Services (AWS) for this performance benchmarking. First, I studied the performance impact of the various storage (SSD) options provided by AWS without encryption. Next, I benchmarked workloads with various block sizes, read/write ratios, and encryption methods on VMs with regular SSDs, encrypted SSDs, and software-encrypted containers. This chapter also discusses existing encryption methods, including homomorphic encryption methods.


3.1 Defining the Problem

In the cloud computing environment, there are several security threats. Cloud storage SSDs bring their own strengths and weaknesses. Here I consider the causes, conditions, and limitations of enterprise cloud storage that can generate security concerns, to see whether there are practical solutions for all stages of ESD security. I also explain how these weaknesses are exploited in cyberattacks. Currently, various encryption methods are used to handle this problem, but each has its limitations. I discuss the limitations and problems of existing and proposed encryption methods, including FHE.

3.1.1 Encryption Security Limitations and Problem

Practicality of Homomorphic Encryption: The Practical Homomorphic Encryption Survey [26] says, “A significant amount of research on homomorphic cryptography appeared in the literature over the last few years; yet the performance of existing implementations of encryption schemes remains unsuitable for real time applications”. The speed of homomorphic encryption is one of the main reasons for this conclusion: key generation takes from 2.5 seconds to 2.2 hours, the implementation is complex, noise creation can exceed thresholds, and large key sizes (17 MB to 2.25 GB) require substantial memory resources. All of this makes it impractical in real systems [3]. Fully Homomorphic Encryption (FHE) is on the “bleeding edge” of encryption technology, but currently there is no FHE available for real-time applications [26]. A lot of work still needs to be done to have a “production ready” version of FHE.

Execution of Encryption Methods in the Cloud: Conventional encryption methods have a couple of issues:

A large amount of data needs to be transferred between the client and the cloud.

If the client agrees to keep the encryption key in the cloud, the very item used to decrypt the file is readily available to any attacker who gets into the cloud system, which is clearly a security concern.

If the client chooses not to store the key in the cloud, then to update a file they must download the entire encrypted file, decrypt it, modify it, encrypt it again, and upload the encrypted file back to the cloud. As the file grows, the overhead on resources increases.

This research focuses on deriving a production-ready, secure, efficient, scalable, and portable homomorphic encryption method that addresses the limitations described in the following section, Encryption Limitations.

3.1.2 Encryption Limitations:

Key Strength: If the data is encrypted, customers must use a key to manage the data storage process. If the key was generated with low randomness, the resulting security will be weaker.

Encryption Algorithm: The degree of the system’s security depends on the strength

of the cryptography method and its implementation. Increased computing power

allows hackers to break encryption algorithms that were once considered state of the

art.

Encryption vs Performance: There is very little research on how various encryption software methodologies impact the performance of various workloads on SSDs in the cloud. The problem with these methods is that enterprises use the same encryption software for all types of workloads and different storage systems. Encrypting while simultaneously running regular application workload functions will adversely impact the read/write performance of SSDs.

3.2 Other Problems Contributing to the Research Motivation

All of the following problems also motivated this work, although its main goal is to solve the problem described in Section 3.1.1.

SSD Physics: Some SSD vendors implemented their FTL (Flash Translation Layer) with errors. Those errors may prevent full sanitization, or may delete all the data by overwriting the entire visible address space. Overwriting the SSD address space is not always sufficient to sanitize the drive, because the data persists, and it is a time-consuming process [31]. When a file is deleted, it is deleted from the OS’s perspective, but on the SSD it may remain until garbage collection happens with the TRIM process [11].

Persistence of Data: When an SSD write occurs, data is written to new cells, but the data still exists in the old cells until a TRIM is executed [31]. If the key and the encrypted file are stored on the same system, it is possible to read the encryption key from the SSD key storage area [32]. The SSD’s internal design and the way I/O (Input/Output) operations happen are different from an HDD’s. Yet most encryption software for SSDs was developed using the same cryptographic algorithms that were used for HDDs, which does not account for the SSD’s ghost data.

Data Exposure: If the data is not encrypted, there is a risk of exposing personal data. This can pose a security threat while data is at rest or in transit. The data can be accessed from different devices such as PCs and phones, and over public networks, each of which can pose a security threat due to malware, adware, and non-secured public networks if a hacker can access them. The public cloud poses its own security issues due to other cloud security threats such as account hijacking and human error.

Account Hijacking: One of the major security issues for the cloud is account

hijacking, where someone gains access to account credentials and uses them for

nefarious purposes.

Human Error: Human error and negligence can pose a security threat, for example, not removing the key or plaintext file from the cloud system. In cloud computing, users must move the key between their system and the cloud. Security issues can arise if users do not follow proper security procedures and practices, for example by writing passwords on sticky notes, forgetting passwords, sharing passwords, or sharing keys in a non-secure way.

3.2.1 Cyber Attacks

There are various attacks that attackers can perform. When designing an encryption cipher, one must remember that it should be able to protect the data from these attacks.

Ciphertext-Only: When an attacker has access to the ciphertext and nothing else, such as the key or plaintext, they can use statistical methods to guess the distribution of characters and use it to reveal the plaintext or secret key. This is called a Ciphertext-Only attack. It is the most difficult type of attack for the attacker, since the attacker has the smallest amount of information [27].

Known-Plaintext: In this case, an attacker has some plaintext/ciphertext pairs and uses them to derive the key. This is called a Known-Plaintext attack. I will use statistical methods and manipulation of mathematical operations to see whether I am able to derive the keys.

Chosen-Plaintext: This is similar to a Known-Plaintext attack, but the attacker can choose and manipulate the plaintext input to the encryption algorithm and then evaluate the resulting ciphertext to obtain the key.

Distinguishing-Attack: The goal of a distinguishing attack is to distinguish the keystream of the cipher from a truly random sequence. The attack succeeds if a way is found to distinguish the cipher output from random data faster than a brute-force search. This sort of information can be very valuable to an attacker in revealing the plaintext.

Birthday Attack: A Birthday Attack is based on the statistical concept of the Birthday Paradox, where the probability of a match between two random items increases as the number of items increases. For example, if there are 23 people in a room, the probability of two people having the same birthday rises to 50.7%. The Birthday Attack applies this concept to determining the encryption key: while the numbers are larger, the probability of matching the encryption key is statistically much higher than the true randomness of the key would suggest.

Meet-in-the-Middle Attack: In this method, the attacker builds a table of keys and MACs (Message Authentication Codes). A MAC is computed on the same plaintext using 50% of the possible keys of the given key length. The attacker then eavesdrops on each transaction, compares the ciphertext with the MAC table, and reveals the key.
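The 50.7% figure quoted for the Birthday Attack can be reproduced with a short calculation. The sketch below assumes 365 equally likely birthdays and computes the probability that at least two of n people share one. The function name is my own.

```python
def birthday_collision_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday,
    assuming all `days` birthdays are equally likely."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

if __name__ == "__main__":
    for n in (10, 23, 50):
        print(n, f"{birthday_collision_probability(n):.3f}")
```

The same calculation with `days` replaced by the number of possible values of a key or hash shows why collisions appear after far fewer trials than the size of the space alone would suggest.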

There are several more methods of attack and cyber threats, such as Spectre and

Meltdown. The impact of an attacker finding a key could be devastating; it would

give attackers access to personal, financial, and medical information and prevent

authorized users from accessing this information. All of these justify constantly

increasing the strength and complexity of ciphers, which are an important part of

security [6].

3.2.2 Real Randomness

To generate an encryption key, real randomness is critical but extremely hard to

achieve on a computer system. Pseudorandom numbers can be generated from the

system’s entropy resources: timing of keystrokes, exact movements of a mouse, and

fluctuations of hard-disk access time [61]. A key generated from the randomness of

these sources may become suspect if an attacker is able to measure those sources and

apply them to simulate the same random number generation, but this is difficult due

to the amount of entropy generated from these resources.

Timing of a single keystroke generates 1 to 2 bytes of random data, and

cryptographers consider that too little entropy to thwart an attacker trying to

determine the key. Better typists have a consistent typing pace, where the timing

between keystrokes varies only within milliseconds, limiting the frequency at which

keystroke timing can be sampled, so typing-timing data may not be random. In this

example, the attacker may have access to resources such as the computer’s

microphone to hear the keystrokes and determine the timings (pace). Even

randomness generated using quantum physics can be forced into specific patterns that may be prone to

attacks, because an attacker can use an RF (Radio Frequency) field to

influence these patterns [55]. Suppose I have a key with 128 bits of random data; it

can still be vulnerable because an attacker can try 2^128 computations. This brute-force

attack is of growing concern as computation speeds increase.
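
To put the 2^128 figure in perspective, the following Python sketch estimates the average time a brute-force search would take. The guessing rate of 10^12 keys per second is a purely hypothetical assumption for illustration, not a measured value, and the function name is likewise illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def brute_force_years(key_bits: int, guesses_per_second: float) -> float:
    """Expected years to find a key by exhaustive search.

    On average the key is found after trying half of the key space.
    """
    keyspace = 2 ** key_bits
    expected_guesses = keyspace / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker testing 10^12 keys per second (an assumption).
years_128 = brute_force_years(128, 1e12)
print(f"128-bit key: about {years_128:.1e} years on average")
```

Raising `guesses_per_second` by a factor of one thousand only removes three orders of magnitude from the result, which is why faster hardware erodes the margin gradually rather than all at once.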

3.2.3 Storage Security Limitations

This thesis first evaluates SSD storage security and modern encryption

software for securing the SSD. First, I discuss the importance of reliability and

integration of the SSD, and then I address security. Cloud storage primarily uses

SSDs as storage to achieve performance guarantees. Second, I studied SSD

characteristics to understand SSD strengths and performance metrics when

various storage-specific encryption methods are used. Through performance benchmarking, I

aim to show that encryption affects the performance of read and write operations

on storage.

3.2.4 SSD System Level Induced Limitations

The physical structure of an SSD poses reliability and scalability limitations. These can result in

system-level limitations in wear leveling (endurance), bad block management, and

performance. Understanding the SSD limitations can help to determine or derive

better security techniques for the device.

3.2.4.1 Physical Limitations Contribute to Logical (Software) Limitations

This section describes the SSD physical limitations and how they impact

logical SSD functions. The following four major components of SSD functions

detail the physical and logical limitations.

3.2.4.2 Physical Level Address Map

In an SSD, the address map is applied in the same way as in traditional hard disk drives. The

SSD FTL maintains all the address table information. In Figure 12, the top row is the

logical address space and the bottom row is the physical address space. From the

host’s perspective the writes and edits happen in plain sight.

Figure 12 - Address Mapping between physical to logical

Due to the limitations of flash, an SSD cannot overwrite a page in place within a

block; it instead writes to a new page in a new block, which is assigned in the physical

block (in the physical world it is the string). The old pages are not erased, but they are

marked as invalid pages. Writing and rewriting a cell exposes it to

repeated voltage stress, which deteriorates the cell walls and reduces its life span.

To avoid deterioration of an individual SSD block or set of blocks, each rewrite follows a

wear leveling algorithm to make sure all the cells deteriorate evenly. Also, when

the current physical block is full, another free one is assigned to the logical block.

These changes add mapping addresses to the translation table (address mapping table).

Because the data for this table may be stored on the SSD

itself, it can decrease the usable storage capacity of the device [11] [62].
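
The remapping behavior described above can be sketched as a toy model. The following Python class is a minimal illustration only: the class name `TinyFTL`, its fields, and the page-level granularity are assumptions, and a real FTL also handles blocks, erases, garbage collection, and wear leveling.

```python
class TinyFTL:
    """Toy page-level translation layer with out-of-place updates."""

    def __init__(self, num_physical_pages: int):
        self.mapping = {}    # logical page number -> physical page number
        self.invalid = set() # physical pages that hold stale (invalid) data
        self.free = list(range(num_physical_pages))  # erased, writable pages
        self.flash = {}      # physical page number -> stored data

    def write(self, lpn: int, data: bytes) -> int:
        if not self.free:
            raise RuntimeError("no erased pages left; garbage collection needed")
        old_ppn = self.mapping.get(lpn)
        if old_ppn is not None:
            # The old copy is only marked invalid; it is not erased.
            self.invalid.add(old_ppn)
        new_ppn = self.free.pop(0)
        self.flash[new_ppn] = data
        self.mapping[lpn] = new_ppn  # the translation table is updated
        return new_ppn

    def read(self, lpn: int) -> bytes:
        return self.flash[self.mapping[lpn]]


ftl = TinyFTL(num_physical_pages=4)
first = ftl.write(0, b"version 1")
second = ftl.write(0, b"version 2")
print(ftl.read(0), first != second, ftl.flash[first])
```

In this model the host only ever sees logical page 0, while the stale copy of the data remains in the first physical page until it is erased.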

Even with the best wear leveling algorithm, bad blocks will be created due to the

inherent limitations of SSD writes and erases. When blocks are not reliable, they

are called bad blocks; information about their addresses is maintained in the BBM

(Bad Block Management) map. The limitation is keeping the BBM up to date, which

is important for reliability. If the BBM is not maintained with correct information

about bad blocks, then the system will try to write to those blocks. Data

written to bad blocks will not be reliable. Monitoring the BER (Bit Error

Rate) is also important to achieve a reliable system. ECC (Error Correction Code) is

used to keep the BER in check, but the ECC engine may cause performance issues if it is

not designed to operate in parallel across multiple channels. Correcting too many errors,

though, will negatively impact the efficiency of the drive [63].

3.2.4.3 Physical Wear Leveling Limitation

TOX (ZrO2) is a dielectric material and its thickness is a limiting factor in SSD.

Floating gate cells will lose their charge over time through TOX, due to the thinness

of the TOX layer. Floating gate cells also experience wear and tear due to additional

stresses caused by voltage fluctuations. Electric charge for “program” (write)

operations is transferred through the TOX in the form of oxide traps. The

concentration of the traps increases with each write and erase operation; this is

called oxide stress. When electrons leak from a floating gate, these traps are used as a

path for the electrons to travel toward the cell channel region [64]. The number of

electrons leaking through the border of the TOX is lower than the number traveling

through SILC (stress-induced leakage current). A shorter distance

between tunneling steps increases the SILC leakage. The TOX thickness scalability

limitation is defined by three important factors: the number of traps, SILC, and the oxide

voltage of the floating gate cell during retention. It has been determined that the TOX

thickness must be 7.5-8.0 nm [65].

The floating gate cells should be able to hold a charge for a minimum of 10 years.

This was determined based on how much leakage is acceptable in a 10-year time span.

The TOX thickness requirement plays an important role in defining the acceptable

leakage. The number of program/erase cycles that can be applied to a cell also

depends on TOX thickness. After about 10 thousand program/erase cycles, the cell

threshold voltage shifts upward, so more voltage is required to

operate the cell. Physically neighboring cells share the same sensing amplifier.

Because of this, the higher voltage needed by one cell is also applied to neighboring cells, which

could damage cells that do not require more voltage. Cells going bad

reduce the number of over-provisioned cells (each SSD is manufactured with more

storage, at least 25% more than the stated amount). Over-provisioned cells play a major

role in endurance: as their number decreases, the SSD life span also decreases [65].

3.2.4.4 Physical Limitation of Parallelism

When I discuss parallelism in terms of SSD, I am referring to parallelism of the

read, write, and erase operations. Performing these operations in parallel is

faster because multiple operations are processed at the same time. There are a

couple of ways to increase parallelism: one is to increase the dies per

channel; another is to increase the number of channels. Increasing the dies

per channel may cause channel overloading and may not help

write performance. Increasing the number of channels can require separate

Error Correction Codes for each channel, which in turn needs dedicated SRAM (Static

RAM). This option is scalable and can increase performance for the read and write

operations. Hence, memory components must be coordinated to operate in parallel.

The serial interface over the flash packages can become a bottleneck for

performance.

Other techniques to consider that may improve performance with parallelism: page

size, page spanning process, queueing methods, ganging multiple flash, interleaving

between flash, and the background cleaning process. With the page size technique, a

smaller page size makes lookup times faster and takes less space than

when the page size table is larger. But this may not be good for performance if the data

blocks are not consistently accessed. With the page spanning technique,

information can be distributed to a single flash package or across multiple packages.

If the data stays on the same package, performance is faster;

otherwise access goes through different packages, which lowers performance. With

the separate queue technique, each package handles parallel requests simultaneously,

which means all the flash packages can be accessed at the same time. This process is

scalable and flexible, and wear-leveling is maintained evenly. The drawback is that

each queue needs to maintain its own ECC and SRAM, which also complicates the FTL.

Handling too many ECCs may decrease performance. In the ganging

technique, the SSD algorithm combines multiple flash packages

into a group and maintains the same queues, ECCs, and FTL for that group. It

handles multi-page requests with fewer queues than the separate queue

technique uses. This reduces ECC overhead, but having too few

queues can cause a bottleneck on a busy system. With interleaving in

flash packages, all processes occur within a single die to speed up the read and write

operations. To avoid latency, this technique accesses all related blocks in one

place, which is faster than crossing between flash packages through a serial

connection. The drawback is that it may write to the same blocks over

and over, so when the focus is on interleaving, the benefits of wear-leveling are lost.

The background cleaning process of the SSD runs on packages when the system is not

busy. When the cleaning process occurs, crossing between different packages means

moving the erase blocks from one package to another through the serial connection.

This is generally slower than cleaning within the same die, but it maintains wear-leveling.

Each technique has its own pros and cons, so we need to carefully analyze which

technique is better for each workload situation [11] [66].

There is another form of parallelism which may improve performance: placing

contiguously allocated data from one domain over a set of N domains, like a stripe, using a

mapping policy. (A domain is a set of flash memories that share a specific set of

resources like channels, queues, and ECCs; it can be divided into sub-domains such as packages.)

Most flash memory packages support two-plane operations to read multiple pages

from two planes in parallel, and operations across dies can be interleaved. Since

logical pages are normally striped over the flash memory array, reading multiple

logically contiguous pages in parallel for read-ahead can be performed efficiently [11].
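
A striping policy of this kind can be sketched as a simple arithmetic mapping from a logical page number to a flash location. The following Python example is illustrative only: the geometry (4 channels, 2 dies per channel, 2 planes per die), the channel-first ordering, and all names are assumptions rather than the layout of any particular device.

```python
from typing import NamedTuple


class FlashLocation(NamedTuple):
    channel: int
    die: int
    plane: int
    page_in_plane: int


def stripe(lpn: int, channels: int = 4, dies_per_channel: int = 2,
           planes_per_die: int = 2) -> FlashLocation:
    """Map a logical page number to a location, striping channel-first."""
    channel = lpn % channels          # consecutive pages hit different channels
    rest = lpn // channels
    die = rest % dies_per_channel     # then different dies on each channel
    rest //= dies_per_channel
    plane = rest % planes_per_die     # then different planes on each die
    page_in_plane = rest // planes_per_die
    return FlashLocation(channel, die, plane, page_in_plane)


for lpn in range(4):
    print(lpn, stripe(lpn))
```

With this ordering, four logically consecutive pages land on four different channels, so a read-ahead of those pages can proceed in parallel.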

Figure 13 - Flashes and their parallel architecture

Most SSDs store two bits per MLC cell. It was theorized that

storing more (3 to 4) bits in each cell would increase the performance. But research

showed that reaching the threshold voltage (Vth) required for the read, write, and erase operations

took longer for 3 and 4 bits than it took for 2 bits per cell. Strategy-wise, running

NAND chips in parallel (Figure 13) would give the best performance, but it has its own

limitations. More chips require more current, and that may not be possible due to

the limit on the maximum allowed current. Also, these

strings must be read using thousands of read circuits with many sensors, which can make the

process too complex and more error prone [11].

3.2.4.5 Physical Limitation of Workload Management

In the current market, consumer and enterprise versions of SSDs are different.

Vendors build them according to the anticipated workloads. Depending on the workload

requirements, they are built and programmed with different designs. The consumer

version does not need algorithms as complicated as those of the enterprise version. In the

real world, the consumer version of an SSD falls short of the needs of the enterprise

version (Figure 14), in that it does not have algorithms for zero tolerance of data loss,

uptime reliability, endurance, performance, and error correction code

handling; in addition, it does not need to handle multiple concurrent I/O operations. Usually

enterprise SSD systems come as pure flash (SSD) storage or hybrid (combined HDD

and SSD) storage. Enterprise SSDs must be able to simultaneously handle workloads

like file, database, email, etc.; that are generated by multiple users with various traffic

patterns. These different traffic patterns are multi-threaded random workloads that

are handled independently using multiple initiators. Additionally, for enterprise usage

it must maintain consistent I/O throughput (IOPS), integrity, and availability. The

SSD controller needs to be tested thoroughly before it can be placed into enterprise

usage to handle workloads 24/7/365.

Figure 14 - Consumer Vs Enterprise SSD

In the case of power failures or other disruptions in a data center, the workloads

must be protected, so enterprise SSD systems are designed to handle those situations

with the help of ECCs and CRCs (Cyclic Redundancy Checks). Reliability of the

workloads is very important, and SSD systems are built using redundancy techniques

(RAID) to cover any hardware failures. If an enterprise wants higher

performance, it can replace HDD storage with SSD, but this can become expensive.

The details will be discussed in the existing research section [11].

3.2.5 Existing research to mitigate the software limitations

Some of the main limitations in SSD are address mapping, parallelism

(performance), wear leveling, and workload management. The user will not have the

option to change the physical structure of the SSD. They will be limited to software

approaches to mitigate the physical limitations. This section explains the research that

has been done to mitigate these limitations. Most approaches have been focused on

improving processes within the FTL. The FTL is a core part of the SSD controller

that maintains sophisticated address mappings (indirect address mappings between

‘physical block address’ and ‘logical block address’), a log-like write mechanism, GC

(Garbage Collection), wear leveling, ECC, and over-provisioning [67].

3.2.5.1 Address Mapping

One of the FTL main functions is to maintain a mapping table of virtual addresses

to physical addresses. Write operations can only happen when the block is in a special

state called “Erased”. The erase operations happen at a much coarser spatial

granularity than write operations, since page-level erases are extremely time

consuming [68]. Page-level FTL mapping can provide compact and efficient

utilization of each block, but the issue is that it takes a large amount of

page-table space (a 32MB page table in SRAM for 16GB of flash), and in some

situations the lookup time will also be higher than calculating the offset in block-level

mapping. Block-level FTL mapping uses an offset to calculate the page number, so

maintaining page information requires just a fraction of the page-table space.

However, looking up page information in this mapping is more time-consuming

than it is in page-level mapping. It also forces each logical page to be mapped to a

physical page within each block. As a result, garbage collection overhead grows. Still,

block-level address mapping is the better option because it uses much less

space [69]. The two schemes sit at opposite extremes in their weaknesses: page

level mapping uses more space for the mapping table while block level mapping

generates more garbage collection [11].
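
The space trade-off between the two schemes can be made concrete with a short calculation. In the Python sketch below, the 2 KiB page size, the 4-byte table entry, and the 64 pages per block are assumptions chosen for illustration; with them, the page-level table for 16 GiB of flash comes out to the 32 MB quoted above.

```python
KiB = 1024
MiB = 1024 ** 2
GiB = 1024 ** 3


def mapping_table_bytes(flash_bytes: int, unit_bytes: int,
                        entry_bytes: int = 4) -> int:
    """Size of a table holding one entry per mapping unit (page or block)."""
    return (flash_bytes // unit_bytes) * entry_bytes


PAGE_BYTES = 2 * KiB    # assumed page size
PAGES_PER_BLOCK = 64    # assumed block geometry

page_level = mapping_table_bytes(16 * GiB, PAGE_BYTES)
block_level = mapping_table_bytes(16 * GiB, PAGE_BYTES * PAGES_PER_BLOCK)

print(page_level // MiB, "MiB for page-level mapping")    # 32 MiB
print(block_level // KiB, "KiB for block-level mapping")  # 512 KiB
```

Under these assumptions the block-level table is smaller by a factor equal to the number of pages per block, which is the "fraction of the page-table space" referred to above.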

To address this issue, researchers implemented hybrid FTL, which combines

page-level and block-level address mapping in the SRAM. In this method, some of the

address table is stored on SRAM while the rest is stored on flash. This leads to a

problem with the hybrid FTL approach, because random writes (which need to look in both

areas for addresses) induce costly garbage collection, which impacts the performance

of subsequent operations. The demand-based page-mapped FTL, DFTL (Demand-based

Flash Translation Layer), addresses this problem. DFTL stores only

the most recently used address translations on SRAM, while the rest are stored on

flash [69]. The reason for this storage strategy is that most enterprise-scale workloads

exhibit significant temporal locality. However, the DFTL does not support spatial

locality of workloads, which means frequent “evict out” operations will cause extra

erase operations and page mapping lookup overhead for workloads with less temporal

locality. DFTL limits the space to store the page table and it suffers from frequent

updates to the page mapping table in the SSD flash for write intensive workloads and

garbage collection [69]. The CFTL (Convertible Flash Translation Layer) approach

tries not to depend on the space of SRAM. CFTL is a hybrid FTL with efficient

caching strategies and can dynamically change according to data access patterns.

CFTL’s concept is to use read-intensive data managed by block level mapping and

write-intensive data managed by page level mapping. CFTL uses a hot data (data that

is accessed the most by users) identification method to change the page mapping table.

The CFTL uses a bloom-filters-based scheme which can capture recent and frequently

accessed information at a fine-grained level. CFTL considers temporal and spatial

locality of workloads for page level cache. If the page size is large, this means the

chance that a file is spanning to multiple pages is lower; hence, the consecutive field

of CFTL will be less effective [70]. SCFTL (Strategy Caching Flash Translation

Layer) deals with the large page size and the spanning issue of pages. SCFTL stores a

page-mapping table in several TPs (translation pages) containing thousands of

physical page numbers and mapped to consecutive logical addresses. SCFTL’s PMT

(page-mapping table) contains TPD (translation page directory) and CMT (cache

mapping table). TPD is in RAM and indexes CMT by the most significant bits of

logical addresses. The performance degradation from offloading the mapping table is

reduced by caching several mapping entries in the CMT. The CMT integrates two spatial

locality exploitation techniques and a customized cache replacement policy to enhance

the efficiency of SCFTL. SCFTL performs multilevel page table lookups for address

maps. If there is a cache miss, the request goes to the TPs; if a miss occurs

there too, the request must be served from flash [71]. CA-SSD (Content

Aware SSD) is a modified FTL that adds minimal support in the form of additional

hardware for hash functions. It uses hashes as values in the mapping table instead of

page information. It also requires battery-backed RAM to store hashes. The drawback

of the approach of CA-SSD is that it depends on battery power and extra hardware

[72].
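
The idea shared by the demand-based schemes above, keeping only recently used translations in SRAM and fetching the rest from flash, can be sketched with a small least-recently-used cache. The Python class below is an illustration under simplifying assumptions: the class and field names are hypothetical, the flash-resident table is modeled as a plain dictionary, and write-back of modified entries, which a real DFTL-style design must handle, is omitted.

```python
from collections import OrderedDict


class CachedMappingTable:
    """LRU cache of logical-to-physical translations backed by a full table."""

    def __init__(self, cache_entries: int, flash_table: dict):
        self.capacity = cache_entries
        self.cache = OrderedDict()      # models the small SRAM-resident part
        self.flash_table = flash_table  # models the full table kept on flash
        self.flash_reads = 0            # number of slow lookups that went to flash

    def lookup(self, lpn: int) -> int:
        if lpn in self.cache:
            self.cache.move_to_end(lpn)  # mark as most recently used
            return self.cache[lpn]
        # Cache miss: fetch the translation from flash and cache it.
        self.flash_reads += 1
        ppn = self.flash_table[lpn]
        self.cache[lpn] = ppn
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return ppn


table = CachedMappingTable(cache_entries=2, flash_table={0: 10, 1: 11, 2: 12})
for lpn in (0, 1, 0, 0, 1):   # strong temporal locality: only two flash reads
    table.lookup(lpn)
print(table.flash_reads)
```

A workload that cycles through more logical pages than the cache holds would miss on nearly every lookup, which mirrors the overhead described above for workloads with little temporal locality.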

Implementing encryption on top of the above approaches can become cumbersome.

When scholars studied address mapping enhancements, they may not have

considered encryption. The existing research results may not hold with

encryption, and that needs to be studied further.

3.2.5.2 Wear Leveling

Due to the locality in most workloads, writes are often performed over a subset of

blocks (e.g. file system metadata blocks). Some flash memory blocks may be

frequently overwritten and tend to wear out earlier than other blocks [11] [66]. FTLs

usually employ some wear-leveling mechanism to ‘shuffle’ cold blocks with hot

blocks to even out writes over flash memory blocks. There has been some research

with some variations on how to approach wear-leveling in the form of managing

workloads. Researchers implemented CAFTL (content-aware FTL) to

remove unnecessary duplicate writes, improving the efficiency of garbage

collection and wear-leveling and reducing the write traffic to flash [73]. One

previous approach addresses the wear-leveling issue by

reusing flash blocks that have been cycled to the specified worn-out level, using the

SR-FTL (Smart Retirement FTL) algorithm [74]. Another approach uses a dual-pool

algorithm to store cold data in the blocks that have been identified as more worn and

smartly leave them alone until wear leveling takes effect [75].
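
The intuition behind placing cold data on worn blocks can be sketched as a block-selection rule. The Python function below is a deliberately simplified illustration and not the dual-pool algorithm of [75]: the function name, the per-block erase counters, and the hot/cold flag are all assumptions.

```python
def pick_block(erase_counts: dict, free_blocks: list, data_is_cold: bool) -> int:
    """Choose a free block for a write.

    Cold (rarely rewritten) data goes to the most worn free block, so that
    block gets a rest; hot data goes to the least worn free block.
    """
    if not free_blocks:
        raise ValueError("no free blocks available")
    if data_is_cold:
        return max(free_blocks, key=lambda block: erase_counts[block])
    return min(free_blocks, key=lambda block: erase_counts[block])


erase_counts = {0: 120, 1: 15, 2: 900}   # hypothetical erase counters
print(pick_block(erase_counts, [0, 1, 2], data_is_cold=False))  # block 1
print(pick_block(erase_counts, [0, 1, 2], data_is_cold=True))   # block 2
```

Over many writes, a rule of this kind tends to narrow the gap between the most and least worn blocks, at the cost of tracking an erase counter per block.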

Despite the body of research on wear leveling approaches, it remains a complex (full

of unknown variables) process, and there may never be a perfect solution. That is

because there are neither consistent workflows nor predictable usage of storage. So,

researchers weigh the pros and cons of various approaches to evaluate

performance versus endurance versus reliability with different workloads. But the

inherent nature of an SSD is to move data around to maintain wear leveling. In doing so,

it leaves valuable data in the invisible address space; even though that data is not retrievable

by normal operations, it is still there. Ideally, the address space would be purged or overwritten,

but that may create a lot of wear on an SSD. Encrypting the data allows

us to retain existing wear-leveling algorithms without exposing this valuable data.

3.2.5.3 Parallelism

The bandwidth and operation rate of any given flash chip is not enough to achieve

optimal performance. SSD has multiple flash arrays so we can run multiple I/O jobs

concurrently and this will improve the performance of the SSD. A single flash

memory package can only provide limited bandwidth (e.g. 32-40MB/sec). Writes are

slower than reads, and other necessary background jobs like garbage collection and

wear-leveling can incur latencies as high as milliseconds [66]. These limitations can be

addressed by SSD’s clever structure that is built with an array of flash memory

packages connected through multiple channels to flash memory controllers to provide

internal parallelism. Logical block addresses serve as the logical interface to the host

system and can be striped over multiple flash memory packages. This way data

accesses can be conducted independently in parallel, providing high aggregate bandwidth

and hiding high-latency operations; that combination can result in high

performance [73]. One way to improve sequential writes is to divide the flash

array into banks; each bank will be able to read/write/erase independently. The

performance gains from internal parallelism are highly dependent on how the SSD

internal-mapping and resource management compete for critical hardware resources.

The workloads are in the form of mixing reads and writes, but they interfere with each

other, so proper address mapping management and design of applications is critical.

Most applications are designed for HDD storage. When they are run on an

SSD, they may not perform optimally. The critical issues in SSD parallelism include: thin

interface between the storage device and the host, workload access patterns,

asynchronous background operations generated by reads and writes, effect on read

ahead, ill-mapped data layout, and application designs [76]. There are different levels

of parallelisms in SSD: Channel, Package, Die, and Plane. The previous research [76]

concluded that read ahead is not affected by access patterns in MLC-SSD, writes

though are strongly correlated to access patterns. Small size random writes suffer from

high latencies and high interference between reads and writes [76]. Adding a disk

cache helped improve the performance for read and write operations. But background

operations like the erase operation can cause interference with reads and writes and

internal fragmentation is too high for excessive random writes. Studies on the four

levels of parallelism (channel, chip, die, and plane) have shown a direct impact

on SSD performance, but they provided limited information, considering that the SSD

structure is a black box. The advanced commands utilize only die and plane levels of

parallelism; they explore how allocation schemes can determine priority order for

multiple levels of parallelism for different types of application loads. The channel-

level parallelism should be given the highest priority order among the four levels and

it was observed that chip level parallelism keeps chips very busy. The service request

can only be handled when chips are idle [76].

Parallelism has the biggest impact on SSD performance. The advantages of

existing parallelism can still be viable even with the addition of encryption

methodologies for storage.

3.2.5.4 Workload Management Integrated with SSD

Performance is highly workload-dependent. Well-designed systems, databases,

and applications improve performance. The following are some of the classic

examples of integrating SSD to systems to achieve better performance [11].

Integrating the SSD into an existing system is a complex process. Scalability (replacing

1GB of HDD with 1GB of SSD) is limited by cost effectiveness, because the gains in

performance do not justify the added expense. HybridDyn (Integration of HDD and

SSD storage) is an innovative storage design that is cost-effective and improves

performance and endurance. It handles incoming workloads by dynamically

partitioning and distributing them between SSD and HDD. This design showed better

performance than HDD alone [77]. Another research approach is an LSM-tree-based

store with an open-channel SSD to utilize channel-level parallelism. LevelDB (a fast

key-value storage library built on an LSM-tree) is extended to be multi-threaded to

fully utilize channel-level parallelism while evaluating optimal I/O request

scheduling and dispatching. An evaluation of the impact of channel-level parallelism

on I/O performance showed that this design outperforms conventional SSDs [13].

Another system, Libra, tracks the I/O consumption of each tenant; it recognizes the

application’s dynamic I/O usage profiles and provides I/O resources accordingly.

Libra-based VOP (virtual I/O operations) captures the non-linear relationship between

SSD I/O bandwidth and I/O operations throughput; it does this while considering the

disk-IO (disk Input Output) cost model [78]. Hadoop workloads showed a

performance increase over HDD alone when an SSD was integrated into the

underlying storage system.

The research showed that workload performance always improved when SSD was added

or used as the sole storage. SSD is faster than HDD, so adding it to the storage system

was expected to improve performance. But in some cases, applications are not

able to fully utilize SSD performance due to the nature of write guarantees. This

research studies the impact on performance of the different types of workloads with

the different encryption methodologies.

CHAPTER 4

STORAGE ENCRYPTION ANALYSIS

In this chapter, I show how SSD storage performance is affected by storage

type (t2.micro versus i2.xlarge) and by software encryption methods. I demonstrate that in both

respects there are performance penalties for workloads.

4.1 Measurement Environment

Each Amazon EC2 (Elastic Compute Cloud) instance can access disk storage from

disks that are physically attached to the host computer. This disk storage is referred to

as an instance store or EBS (Elastic Block Store) volumes. An instance store provides

temporary block-level storage for use with an instance. The size of an Amazon

instance store ranges from 8GB to 48TB, and varies by instance type (i.e., larger

instance types have larger instance stores) for HDD. Using regular SATA SSD, the

storage ranges from 8GB to 6.4TB. If the storage type is NVMe (Non-Volatile

Memory express) SSD, then the storage ranges from 8GB to 16TB.

Amazon EBS provides two volume types: Standard volumes and Provisioned

IOPS volumes, which differ in performance characteristics and price. Standard

volumes offer storage for applications with moderate or burst I/O requirements.

These volumes deliver approximately 100 IOPS on average but can burst up to

hundreds of IOPS. Provisioned IOPS volumes offer storage with consistent and low-

latency performance, which allows users to predictably scale to thousands of I/O

operations per second per Amazon EC2 instance. These volume-types are designed

for applications with I/O-intensive workloads. Backed by SSDs, Provisioned IOPS

volumes support up to 30 IOPS per GB, which enables a system to be provisioned up

to a maximum of 4,000 IOPS per volume. It is possible to stripe multiple

volumes together to achieve up to 48,000 IOPS when attached to larger EC2

instances, but in theory such a configuration may appear as regular SSD disk volumes, so I did not

evaluate this type of VM. When attached to an EBS-optimized instance, Provisioned

IOPS volumes are designed to deliver consistent performance within 10 percent of

the guaranteed rate throughput (Provisioned IOPS) 99.9% of the time. In addition, the

delivered IOPS rate depends on the block size of the various reads and writes.

Amazon Provisioned IOPS volumes process reads and writes in I/O block sizes of

16KB or less; every increase in I/O size above 16KB linearly increases the resources needed to achieve the same IOPS rate. A

significant amount of data was produced during the experiments and it was used to

analyze the main concepts about SSD performance variations with different variables

including encryption methods.
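
The Provisioned IOPS figures quoted above (30 IOPS per GB, a 4,000 IOPS cap per volume, striping up to 48,000 IOPS, and delivery within 10 percent of the provisioned rate) can be combined in a short worked calculation. The Python helpers below are illustrative only; the function names are hypothetical and the numbers are the ones stated in this section, not current service limits.

```python
import math


def provisioned_iops_limit(volume_gb: int, iops_per_gb: int = 30,
                           per_volume_cap: int = 4000) -> int:
    """Maximum IOPS that can be provisioned for a volume of the given size."""
    return min(volume_gb * iops_per_gb, per_volume_cap)


def volumes_to_stripe(target_iops: int, per_volume_iops: int = 4000) -> int:
    """Number of striped volumes needed to reach a target aggregate IOPS."""
    return math.ceil(target_iops / per_volume_iops)


def consistency_band(provisioned_iops: int, tolerance_percent: int = 10):
    """Range of delivered IOPS that counts as 'within 10 percent'."""
    low = provisioned_iops * (100 - tolerance_percent) / 100
    high = provisioned_iops * (100 + tolerance_percent) / 100
    return low, high


print(provisioned_iops_limit(100))   # 3000: limited by the size of the volume
print(provisioned_iops_limit(200))   # 4000: limited by the per-volume cap
print(volumes_to_stripe(48000))      # 12 volumes at 4,000 IOPS each
print(consistency_band(4000))        # (3600.0, 4400.0)
```

For the 8GB volumes used in this study, the same 30 IOPS per GB ratio would allow only 240 provisioned IOPS, which is one reason the volume type matters when comparing results.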

The experiments in this study have been conducted on three different 64-bit VM

(Virtual Machine) instances in Amazon EC2; the first one was an Amazon Linux

AMI (HVM) 2014.03.1 and the remaining two VMs were Amazon Ubuntu Server

16.04 LTS (HVM). The first VM is an instance store (i2.xlarge) of an 800GB SSD,

which can provide up to 36,000 IOPS. The second VM (standard t2.micro) is an 8GB

instance store with 3,000 IOPS. And the third VM (standard t2.micro) is an 8GB

encrypted EBS General Purpose (SSD) Volume Type with 3,000 IOPS.

The first VM is drastically different from the other two (in memory, vCPUs, and

processor model); I chose these VMs to analyze their unique SSD characteristics. The

second and third VMs are similar (having the same: ECUs, 1GB memory, vCPUs (1),

and processor (2.5 GHz, Intel Xeon Family)); the only difference between the two

VMs is that one is a standard instance store SSD without encryption and the other

has an attached EBS SSD volume with encryption.

4.1.1 Selection of Encryption methods

I selected the following software encryption methods, along with an encrypted SSD and a

regular SSD. The following explains each of them at a very high level and what type of

algorithm I used in these evaluations.

Dm-crypt:

Dm-crypt is a disk encryption method compatible with Linux kernel version 2.6 or

later. It uses the kernel's cryptographic API routines. Devices are mapped to encrypted containers using the device

mapper [51]. This API supports the AES-256 cryptographic method along with other

methods. Dm-crypt uses Linux Unified Key Setup (LUKS) to create encrypted

containers which are independent from outside platforms. LUKS was developed by

Clemens Fruhwirth in 2004 [52]. Using this method, the user can even encrypt the

root device. A passphrase is required to create encrypted containers.

There has been some research around the drawbacks of dm-crypt. For example, it has

been discovered that hackers can sidestep the passphrase to access encrypted

containers by hitting the ‘Enter’ key a couple of times. They can also delete the

containers, because deleting containers does not require a passphrase. An intruder can

determine critical components of the hidden containers relatively easily by utilizing

disk commands on the system [53] [54].

BestCrypt:


BestCrypt is encryption software installed at the OS level that can create encrypted containers or volumes. I downloaded it and created encrypted volumes, which store data protected by an encryption password. These volumes are mounted as file systems to store data. I selected the AES encryption algorithm option to gather performance statistics [55].

Self-Encrypting Drive (SED):

When Full Drive Encryption (FDE) is applied on an SSD, it is called a Self-Encrypting Drive (SED). FDE was developed in 2009. It encrypts the entire system, including all partitions, system files, and the operating system. This method delegates the encryption process to a hardware component of the drive, which helps to enhance security by utilizing the Opal Storage Specification (a set of specifications for SED features). An SED needs a master password for the drive and a user password for each user. They are stored in the BIOS and handled by the hard disk controller. SEDs use AES-128 or AES-256.

Researchers have found the following vulnerabilities of this method: Hot Plug Attack,

Hot Unplug Attack, Forced Restart Attack, and Key Capture Attack. They have also

shown that attackers can bypass the encryption and access data; this undermines the

purpose of securing the data [56].

4.1.2 Experimental Tools and Workloads

To evaluate the internal parallelism of SSDs by producing the necessary workloads in this research, FIO (Flexible I/O) synthetic benchmarks were used¹. FIO is a tool that generates multi-threaded workloads with different configuration variables, such as the read/write ratio, the block size, and the number of concurrent jobs, to fully utilize the hardware. Each run produces a report that contains the bandwidth, the IOPS, the latency, and many other measurements. I used various SSD storage devices with different I/O workloads to calculate their performance metrics; each workload was run for 60 seconds using FIO.

¹ http://freecode.com/projects/fio

A sample FIO command is provided below:

fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw --size=1024m --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite60j8 --output=/home/output/4kdmcryptrandreadwrite60j8

Sample FIO Command

In the Sample FIO Command, the file size to be written is 1024MB (size=1024m) in block sizes of 4K (bs=4k). The workload is split between 60 percent random reads (rwmixread=60) and 40 percent random writes (the remaining 100 - 60 percent), with 8 jobs (numjobs=8) running in parallel at a queue depth of 8 (iodepth=8) for 60 seconds (runtime=60).

Experiments were executed independently on each virtual machine to fully utilize the SSD parallelism capability while introducing variations in the block size (4k, 8k, 16k, 32k, 64k, and 128k), the number of parallel jobs (8), and the random read/write ratio (100 percent reads, 100 percent writes, and 60/40 read/write workloads). These factors were tested on an unencrypted SSD, on SSDs with two software-based encryption methods, and on one fully Amazon-encrypted SSD. Each experiment was executed for a total of 60 seconds utilizing the FIO benchmark, version 2.1.7.
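The experiment matrix described above (three workloads by six block sizes) can be sketched as a small command generator. This is a minimal illustration, not the script used in this research: the function names, the workload labels, and the directory arguments are hypothetical, and the only fio options used are those in the sample command plus the standard `--rw=randread` and `--rw=randwrite` modes.

```python
# Sketch: build one fio command line per (workload, block size) pair.
# All names here are illustrative assumptions, not the thesis tooling.

BLOCK_SIZES = ["4k", "8k", "16k", "32k", "64k", "128k"]

# Workload label -> fio flags selecting the I/O pattern.
# Insertion order follows the run order given in the text.
WORKLOADS = {
    "randwrite": ["--rw=randwrite"],                 # 100 percent writes
    "randrw60": ["--rw=randrw", "--rwmixread=60"],   # 60/40 read/write
    "randread": ["--rw=randread"],                   # 100 percent reads
}


def fio_command(target_dir, output_dir, workload, block_size,
                jobs=8, queue_depth=8, runtime=60):
    """Return the argument list for a single fio run."""
    name = f"{block_size}{workload}j{jobs}"
    return [
        "fio",
        f"--filename={target_dir}/{name}",
        "--direct=1",
        *WORKLOADS[workload],
        "--size=1024m",
        "--refill_buffers",
        "--norandommap",
        "--randrepeat=0",
        "--ioengine=libaio",
        f"--bs={block_size}",
        f"--iodepth={queue_depth}",
        f"--numjobs={jobs}",
        f"--runtime={runtime}",
        "--group_reporting",
        f"--name={name}",
        f"--output={output_dir}/{name}",
    ]


def build_matrix(target_dir, output_dir):
    """Return all 18 commands: each workload over every block size."""
    return [fio_command(target_dir, output_dir, workload, block_size)
            for workload in WORKLOADS
            for block_size in BLOCK_SIZES]


if __name__ == "__main__":
    for cmd in build_matrix("/dmcrypt", "/home/output"):
        print(" ".join(cmd))
```

Each generated list can be passed to a process launcher or printed for manual execution; the target directory would point at the mounted volume under test.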

The FIO commands were run in the following order: 100% write, 60/40 read/write, and 100% read. Each one was executed for six different block sizes with 8 jobs. To emulate an enterprise workload environment, I used random read/write workloads. The research examines how these workloads are affected by the encryption method and its implementation. A queue depth of eight was selected as sufficient, because only a handful of earlier trials utilized a depth past eight.

4.2 SSD performance without Encryption

Having completed lengthy experiments and presented the internal structure of SSDs and background information regarding the storage options within Amazon EC2, I am now positioned to evaluate the experimental results and answer several related questions.

I created different types of VM instances using SSDs with different IOPS ranges. This research considered all of them to understand the internal characteristics of SSDs. Baseline metrics were created from those experiments for performance comparisons with various encryption implementations.

4.2.1 Performance differences between Amazon EC2 VMs

There were significant differences in the performance between the two Amazon

EC2 instances. While this was expected, it was interesting to validate the actual

performance characteristics of the two different instances versus the specs that

Amazon provided about their VMs.

In Graph 1 and Graph 4, the i2.xlarge instance consistently outperformed the t2.micro instance in all experimental runs with all block sizes. In addition, this difference typically increases as the read/write ratio moves closer to 100 percent reads, regardless of whether the workload is sequential or random. This is likely because the instance store volume is physically attached to the host computer on which the EC2 instance is running. The experiments focused on random reads and writes. One of the limitations in this comparison is that the total random reads and writes were limited to 35,000 IOPS on the i2.xlarge instance and only 3,000 IOPS for the t2.micro instance. This prompted me to perform a more in-depth comparison of the two storage mechanisms, the t2.micro instance store and the EBS storage volume. Section 4.3.1 discusses the results.

4.2.2 Did various block sizes significantly affect I/O throughput?

In both Amazon EC2 instances I observed that as the block size increases, the number of IOPS decreases along with the execution time to complete the required reading and writing of data by FIO. This is most likely because, as the block size increases, the overhead required to manage the writing of larger blocks is incurred less frequently. In addition, as expected with increased block sizes, the reading or writing of data is completed in increasingly larger chunks. Graph 1 plots the ratio of reads and writes versus the number of IOPS completed for various block sizes. IOPS decreased as block size increased; the only exception was that the 16K 100% read outperformed the 8K 100% read.

[Graph 1: "i2.xlarge Block size can affect number of IOPS". x-axis: Read Percentage (0 to 100); y-axis: IOPS (0 to 70,000); series: 4k, 8k, and 16k random read and random write.]

Graph 1 - IOPS Vs Block Size

4.2.3 Did various levels of parallelism affect I/O throughput?

Experiments were performed with 8, 16, and 32 threads (jobs) operating in parallel on all block sizes. As seen in Graph 2 (using a block size of 8K), I did not see any significant improvement from 8 threads to 16 or 32 threads; instead, IOPS dropped for the 16-thread and 32-thread simulations. This may indicate that the SSD is saturated at 8 threads and that increasing the number of jobs cannot provide any further performance increase through parallelism.

[Graph 2: "i2.xlarge Number of jobs VS IOPS". x-axis: Read Percentage (0 to 100); y-axis: KB/Sec (0 to 45,000); series: 8, 16, and 32 jobs random read; 8, 16, and 32 jobs random write; 8 jobs random read/write.]

Graph 2 - Parallelism Vs Throughput

4.2.4 Did random and sequential jobs have a different IOPS?

In Graph 3, I observed no significant difference between the behavior of sequential reads and writes and that of random reads and writes. The i2.xlarge instance has been optimized by Amazon for random reads and writes, to the point that random operations even performed better than the corresponding sequential operations. This occurs from around 55 percent reads and 45 percent writes and continues until about 90 percent reads, where sequential outperforms random reads/writes again. The results also showed that a 100 percent sequential write workload was significantly slower than the equivalent random write workload. I hypothesize this is related to garbage collection or to the handling of the change in write mode at the FTL level. However, there is no such gain for random reads/writes on the t2.micro machine. As can be seen in Graph 3, the total random reads/writes are capped around 3,000 IOPS for 4k or 8k block sizes. This performance is expected per the performance metrics provisioned by Amazon for the EBS volume attached to this instance. Additionally, at no time do random read/write operations outperform sequential read/write operations. This type of performance is more in line with what is expected from a traditional SSD.

[Graph 3: "i2.xlarge Random vs Sequential". x-axis: Read Percentage (0 to 100); y-axis: KB/Sec (0 to 50,000); series: 8 jobs random read, 8 jobs sequential read, 8 jobs random write, 8 jobs sequential write, 8 jobs sequential read and write.]

Graph 3 - Random Versus Sequential Operations

4.2.5 SSD Random Workload Analysis on t2.micro VM

From Sections 4.2.1 to 4.2.4, I observed that random and sequential operations are very close in IOPS. Amazon provides different numbers of IOPS for various types of SSD VMs. I chose an Amazon t2.micro (Ubuntu Server 16.04 LTS HVM, SSD volume type, instance store in EC2) as the VM. I used the block sizes 4k, 8k, 16k, 32k, 64k, and 128k on random reads, writes, and read/writes to establish baseline metrics. These metrics are used for comparison with the workloads of the different encryption methods. These experiments were done using random workloads for 100 percent reads, 100 percent writes, and 60/40 read/writes (Mixed).

[Graph 4: "IOPS and Random Workloads Without Encryptions". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 3,500); series: Read IOPS, Write IOPS, Mixed IOPS.]

Graph 4 - t2.micro Block Size Versus IOPS

[Graph 5: "Block Size and No Encrypted SSD Performance". x-axis: Block Size (4k to 128k); y-axis: KB/Sec (0 to 7,000,000); series: Read IO, Write IO, Mixed IO.]

Graph 5 - t2.micro Block Size Versus KB/Sec

In Graph 4 and Graph 5, workloads of 100 percent reads, 100 percent writes, and 60/40 read/writes showed similar IOPS (the maximum IOPS Amazon provisioned) for the 4k, 8k, and 16k block sizes. At 32k the IOPS decreased 40%, at 64k the IOPS decreased 60%, and at 128k the IOPS decreased to 85% of the 4k block size IOPS; however, as seen in Graph 5, the overall volume of data read from and written to the disk increased because of the increased block size. I hypothesize that this is related to block size overhead, but the increase is not proportional to the block size. Another important SSD characteristic I observed was that reads were faster than writes, as shown in Graph 5 for the 32k, 64k, and 128k block sizes (which were less impacted by Amazon's maximum provisioned IOPS). This type of performance is more in line with what is expected from a traditional SSD. Going back to Graph 4, the evidence of Amazon's capping is clear at 4k, 8k, and 16k, plus 32k mixed, because the IOPS hovers around 3,110. I used these metrics as the baseline for future comparisons.
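The reason throughput can rise while IOPS falls follows from the relation throughput = IOPS x block size. The sketch below illustrates that arithmetic only; the function names are hypothetical and the 3,000 IOPS figure is the provisioned value quoted in the text, not a measured result.

```python
# Sketch: relate IOPS, block size, and throughput.
# Inputs are illustrative; they are not measurements from the graphs.

def throughput_kb_per_sec(iops, block_size_kb):
    """Data moved per second when each I/O transfers one block."""
    return iops * block_size_kb


def iops_for_throughput(kb_per_sec, block_size_kb):
    """IOPS needed to sustain a given throughput at a given block size."""
    return kb_per_sec / block_size_kb


if __name__ == "__main__":
    cap = 3000  # provisioned IOPS quoted for the t2.micro volume
    for block_kb in (4, 8, 16):
        print(block_kb, "KB blocks:", throughput_kb_per_sec(cap, block_kb), "KB/s")
    # A 128 KB workload needs far fewer IOPS to match the 4 KB throughput.
    print(iops_for_throughput(throughput_kb_per_sec(cap, 4), 128))
```

Under a fixed IOPS cap, doubling the block size doubles the data moved per second, so a large-block workload can deliver more KB/Sec even when its IOPS count is much lower.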

4.3 SSD performance with Encryption

In Section 4.2, I established a set of baseline metrics. I then ran the same experiments with various encryption methods, block sizes, and workloads. I chose not to vary the number of jobs, based on the data described in Section 4.2.3, which showed little difference between 8 jobs and 16 or 32 jobs. So, I set the number of jobs/threads to 8 for all block sizes and all workloads. These experiments were conducted on two different software encryption methods (BestCrypt and Dm-crypt) and one SSD encrypted by Amazon. Amazon EBS volumes are encrypted with a unique 256-bit key using the AES-256 algorithm. Also, snapshots (a way of cloning storage volumes) of these volumes share the same key². Customers maintain these keys using their own key management infrastructure.

To execute the experiments, I created a working environment by creating a VM in Amazon EC2 and installing the encryption software and the FIO benchmarking software. I used the same process for both software-based encryption methods. For the encrypted SSD, I created a VM and attached an encrypted EBS SSD volume to it. The following graphs (Graph 6 to Graph 16) show the different encryption methods and their performance patterns for the different block sizes versus IOPS and KB/Sec.

² http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html

4.3.1 Did various block sizes significantly affect IOPS?

I observed that as the block size increases, the number of IOPS decreases along with the execution time to complete the required reading and writing of data by FIO, for all of the encryption methods. The performance metrics in Graphs 6, 7, and 8 show a similar decrease in IOPS for all encryption methods.

[Graph 6: "Encrypted SSD Block Size Versus IOPS". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 3,500); series: Read IOPS, Write IOPS, Mixed IOPS.]

Graph 6 - Encrypted SSD Block Size Versus IOPS

[Graph 7: "BestCrypt Block Size Versus IOPS". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 500); series: Read IOPS, Write IOPS, Mixed IOPS.]

Graph 7 - BestCrypt Block Size Vs IOPS

[Graph 8: "dm-crypt Block Size Versus IOPS". x-axis: Block Size (4k to 128k); y-axis: labeled KB/Sec in the source (0 to 1,400); series: Read IOPS, Write IOPS, Mixed IOPS.]

Graph 8 - Dm-crypt Block Size Vs IOPS

One of the main characteristics of SSDs is that reads outperform writes, but with encryption software the results were the opposite: writes performed better than reads (Graph 6 versus Graph 7 and Graph 8). This is a very significant finding about encryption on SSDs. It may indicate that when using software-based encryption on an SSD, the decryption (read) process takes more time than the encryption (write) process.

4.3.2 Did various block sizes affect performance throughput?

In the previous section, I observed that IOPS decreased as block size increased in

all encryption methods.

[Graph 9: "Encrypted EBS SSD Volume Block Size Versus throughput". x-axis: Block Size (4k to 128k); y-axis: KB/Sec (0 to 7,000,000); series: Read IO, Write IO, Mixed IO.]

Graph 9 - Encrypted EBS SSD Volume Block Size Versus throughput

[Graph 10: "BestCrypt Block Size Versus Throughput". x-axis: Block Size (4k to 128k); y-axis: KB/Sec (0 to 300,000); series: Read IO, Write IO, Mixed IO.]

Graph 10 - BestCrypt Block Size Versus Throughput

[Graph 11: "Dm-Crypt Block Size Versus Throughput". x-axis: Block Size (4k to 128k); y-axis: KB/Sec (0 to 1,800,000); series: Read IO, Write IO, Mixed IO.]

Graph 11 - Dm-Crypt Block Size Versus Throughput

In Graph 9, I observed no significant difference in read and write throughput compared with the unencrypted SSD. I also did not see any significant throughput increase at 32k and higher. In Graph 10, using the software encryption method BestCrypt, I observed that the 32k block size had the lowest performance of all block sizes. In Graph 11, using the software encryption method Dm-crypt, I observed that the throughput increases linearly as the block size increases.

4.3.3 Did various encryption methods affect performance throughput?

The experiments showed a significant difference between SSDs that use software encryption methods and the encrypted SSD.

[Graph 12: "Encryption Methods Versus IOPS". x-axis: Encryption Methods (Without Encryption, Encrypted EBS SSD, Dm-crypt, BestCrypt); y-axis: IOPS (0 to 3,500); series: Read IOPS, Write IOPS, Mixed IOPS.]

Graph 12 - Encryption Methods versus IOPS

[Graph 13: "Encryption Methods Versus Throughput". x-axis: Encryption Methods (Without Encryption, Encrypted EBS SSD, Dm-crypt, BestCrypt); y-axis: KB/Sec (0 to 1,600,000); series: Read IO, Write IO, Mixed IO.]

Graph 13 - Encryption Methods versus Throughput

I ran workload performance experiments for all block sizes (4k, 8k, 16k, 32k, 64k, and 128k). In Graph 12 and Graph 13, I observed that the encrypted SSD outperformed the software-based encryption methods (the graphs show only the 4k block size). In Graph 12, the encrypted volume showed performance very similar to the regular unencrypted SSD.

4.3.4 Reads, Writes and Mixed workloads Versus Block Sizes.

[Graph 14: "Read workloads - Block Sizes Versus Throughput". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 3,500); series: Without Encryption, Encrypted EBS SSD, Dm-crypt, BestCrypt.]

Graph 14 - Read workloads for various Block Sizes

[Graph 15: "Write workloads - Block Sizes Versus Throughput". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 3,500); series: Without Encryption, Encrypted EBS SSD, Dm-crypt, BestCrypt.]

Graph 15 - Write workloads IOPS for various Block Sizes

[Graph 16: "Mixed Workloads - Block Sizes Versus Throughput". x-axis: Block Size (4k to 128k); y-axis: IOPS (0 to 3,500); series: Without Encryption, Encrypted EBS SSD, Dm-crypt, BestCrypt.]

Graph 16 - Mixed Workloads IOPS for Various Block Sizes

Graph 14, Graph 15, and Graph 16 indicate that as the block size increased, the IOPS decreased. For the software encryption methods, block sizes of 64k and 128k had lower performance than 4k, 8k, 16k, and 32k. For the 128k block size with the BestCrypt encryption method, 100% reads showed such low performance that it was measured in a single digit (only 9 IOPS). This was by far the lowest performance of all the encryption methods.

4.4 Fully Homomorphic Encryption Limitations

4.4.1 FHE with Vector Space

The first simple FHE cipher using multivectors, called EDCHE (Enhanced Data-Centric Homomorphic Encryption), was presented by DaSilva. It uses geometric algebra and multivector spaces $\mathbb{R}^n$, where n is 2 or 3, corresponding to 2D and 3D vector spaces respectively. When using a 3D vector space, it generates an encrypted file that is 8 to 10 times the size of the original plaintext file [28]. This makes it hard for users to justify this method for their applications. When creating robust, secure algorithms, the cryptographer needs to keep in mind that the algorithms should be simple, efficient, secure, practical, and within the limits of available computer resources. This gives an opportunity to develop a new FHE cipher that fulfills these requirements.

4.4.2 Previous homomorphic encryption using multivector technique.

[Graph 17: "Key Size and Time in sec on Regular SSD". x-axis: key size in bits (64, 128, 256, 512, 1024); y-axis: time in seconds (0 to 140); series: AES-Crypt Encryption, xlg-Crypt Encryption, AES-Crypt Decryption, xlg-Crypt Decryption.]

Graph 17 - Multivector Based Homomorphic Encryption

Graph 17 shows encryption and decryption times for key sizes ranging from 64 bits to 1024 bits. In terms of time, xlg-crypt underperformed AES-Crypt for full-file encryption and decryption. Xlg-crypt separates itself from AES-Crypt because it is a fully homomorphic encryption method and does not need to encrypt or decrypt an entire file on every update. Due to this characteristic, xlg-crypt will outperform AES-Crypt on smaller updates.

Even though xlg-crypt takes more time to encrypt than AES-Crypt, it offers additional security features. For example, xlg-crypt allows a client to work with all, some, or even none of the encrypted files from the server. This allows the system to expose only the necessary parts of encrypted files to the client, keeping the rest of the files encrypted and secured on the server. With symmetric encryption methods, the decrypted (plaintext) version of the file exists during any update process until it is deleted. Because xlg-crypt is homomorphic, no plaintext file needs to exist on the VM.

[Graph 18: "Encrypted file size in MB on Regular SSD VM". x-axis ticks: 32, 64, 128, 256, 512; y-axis: encrypted file size in MB (0 to 900); series: AES-Crypt, xlg-Crypt.]

Graph 18 - Multivector based encrypted file sizes

I also observed that when encrypting a 100MB file, AES-Crypt created a 101MB encrypted file while xlg-crypt created an 801MB encrypted file. In general, xlg-crypt generated an encrypted file about 8 times the size of the original plaintext file. This growth is caused by xlg-crypt's mathematical operations, which use a three-dimensional, infinite field space. Because the output file is 8 times larger, it also takes longer to decrypt. Even though it takes more space, each update does not require a rewrite of cells in the SSD, which is better aligned with endurance concerns on SSD storage devices.

CHAPTER 5

RVTHE

In this chapter I present a new symmetric homomorphic encryption method, Reduced Vector Technique Homomorphic Encryption (RVTHE). The chapter discusses its design, mathematical implementation, and homomorphic properties.

5.1 Design of RVTHE

The design of RVTHE depends on versors. Mathematically, a versor can be represented as a geometric product of vectors:

$A = a_1 a_2 a_3 \cdots a_n$

In the design of RVTHE, n-1 of the vectors are keys and one vector is the data. For example, if n=5, there are 4 key vectors and 1 data vector.

Vectors   Example 1   Example 2
a1        Key3        Data
a2        Key1        Key1
a3        Data        Key4
a4        Key2        Key2
a5        Key4        Key3

Table 2 - Key and data location in versors

The locations of the keys and the data are flexible and are determined by the designer.

To reduce the size of the generated ciphertext, we choose two-term vectors.

5.1.1 RVTHE Encryption and Decryption

Each key is a randomly generated number that is converted into base 10. We divide each key and the data into two parts and use them as the two terms (coefficients) of each vector.

5.1.2 Encryption of RVTHE

Once the key and data locations are designed, we perform a geometric product operation on the first two vectors, which generates an intermediate ciphertext. Next, we perform a geometric product operation between the intermediate ciphertext and the next vector, and repeat this calculation for each remaining vector. This generates the ciphertext.

$E(d_1) = s_1 s_2 s_3 d_1 \cdots s_n$

Here $s_1, s_2, s_3, \ldots, s_n$ are the keys, $d_1$ is the data, and E is the encryption.

5.1.3 Decryption of RVTHE

For the decryption process, finding the inverse of each key vector is critical. First, we perform a geometric product operation between the ciphertext and the inverse of a key, which generates an intermediate ciphertext. Next, we perform a geometric product operation between the intermediate ciphertext and the next key inverse, and repeat this calculation for each key. This generates the plaintext.

$D(c_1) = s_3^{-1} s_2^{-1} s_1^{-1} \, (s_1 s_2 s_3 d_1 \cdots s_n) \, s_n^{-1}$

Here $s_1^{-1}, s_2^{-1}, s_3^{-1}, \ldots, s_n^{-1}$ are the inverse vectors of the keys $s_1, s_2, s_3, \ldots, s_n$, $c_1$ is the ciphertext, and D is the decryption.

In our implementation we chose to use three vectors: two vectors for keys ($s_1$, $s_2$) and one vector for data ($d_1$).

5.2 Mathematical Implementation of RVTHE Using Versors

In the versors example from Section 3.3, we derived the value of the vector b using vector inverses. If we present the same example in terms of encryption, then a, b, and c from the math become $s_1$, $d_1$, and $s_2$ in the RVTHE scheme, and they represent the first secret key, the data value, and the second secret key, respectively.

RVTHE's mathematical representation with two secret keys and one data vector takes the form $s_1 d_1 s_2$. In other words, we chose only three vectors $a_1, a_2, a_3$ in our implementation.

The encryption method is represented as E and the decryption method as D.

Assume

$a_1 = s_1 = a$
$a_2 = d_1 = b$
$a_3 = s_2 = c$

and assign these values, where $e_1$ and $e_2$ are the basis vectors:

$a = 2e_1 + 3e_2$
$b = 4e_1 + 5e_2$
$c = 3e_1 + 4e_2$

Then, for encryption of $d_1$, $E(d_1) = abc$:

$ab = (2e_1 + 3e_2) \cdot (4e_1 + 5e_2) + (2e_1 + 3e_2) \wedge (4e_1 + 5e_2)$
$ab = 23 - 2e_{12}$
$abc = (23 - 2e_{12})(3e_1 + 4e_2)$
$abc = 61e_1 + 98e_2$

For decryption, to derive $d_1$:

$D(E(d_1)) = D(abc) = a^{-1}(abc)\,c^{-1} = b = d_1$
$b = a^{-1}\,(61e_1 + 98e_2)\left(\frac{3e_1 + 4e_2}{25}\right)$
$b = \frac{2e_1 + 3e_2}{13}\,(23 - 2e_{12})$
$b = \frac{1}{13}\big((46 + 6)e_1 + (69 - 4)e_2\big)$
$b = 4e_1 + 5e_2$

This encryption implementation is based on versors, providing a new way to utilize the geometric product of geometric algebra.
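The three-vector example in this section can be checked with a short program. This is a minimal sketch under stated assumptions, not the C implementation described in Chapter 6: it models a two-dimensional geometric algebra with basis e1, e2 and bivector e12, stores each multivector as a tuple (scalar, e1, e2, e12), and uses hypothetical function names (`gp`, `vec`, `inverse`, `encrypt`, `decrypt`). Exact fractions are used so that the inverses introduce no rounding error.

```python
from fractions import Fraction

# Sketch of the s1 * d1 * s2 scheme in a 2D geometric algebra.
# Multivector layout (an assumption of this sketch): (scalar, e1, e2, e12),
# with e1*e1 = e2*e2 = 1 and e12*e12 = -1.


def gp(a, b):
    """Geometric product of two multivectors."""
    s1, x1, y1, b1 = a
    s2, x2, y2, b2 = b
    return (
        s1 * s2 + x1 * x2 + y1 * y2 - b1 * b2,   # scalar part
        s1 * x2 + x1 * s2 - y1 * b2 + b1 * y2,   # e1 part
        s1 * y2 + y1 * s2 + x1 * b2 - b1 * x2,   # e2 part
        s1 * b2 + b1 * s2 + x1 * y2 - y1 * x2,   # e12 part
    )


def vec(x, y):
    """Two-term vector x*e1 + y*e2."""
    return (Fraction(0), Fraction(x), Fraction(y), Fraction(0))


def inverse(v):
    """Inverse of a nonzero vector: v / (v . v)."""
    _, x, y, _ = v
    norm_sq = x * x + y * y
    return (Fraction(0), x / norm_sq, y / norm_sq, Fraction(0))


def encrypt(s1, d, s2):
    """E(d) = s1 d s2."""
    return gp(gp(s1, d), s2)


def decrypt(s1, c, s2):
    """D(c) = s1^-1 c s2^-1."""
    return gp(gp(inverse(s1), c), inverse(s2))


if __name__ == "__main__":
    s1, d1, s2 = vec(2, 3), vec(4, 5), vec(3, 4)
    c = encrypt(s1, d1, s2)
    print(c)                    # components of 61 e1 + 98 e2
    print(decrypt(s1, c, s2))   # components of 4 e1 + 5 e2
```

Because the product of three vectors in two dimensions has only e1 and e2 components, the ciphertext in this sketch has the same number of terms as the data vector.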

5.3 Homomorphism of RVTHE

In this section I show the homomorphic properties of RVTHE. The homomorphism covers addition, subtraction, scalar multiplication, and scalar division [79].

5.3.1 Addition

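We represent data 1 as $d_1$, data 2 as $d_2$, secret key 1 as $s_1$, and secret key 2 as $s_2$, and prove the following:

$E(d_1 + d_2) = E(d_1) + E(d_2)$

Example:

When $d_1 = 8$ and $d_2 = 6$, then $d_1 + d_2 = 14$.

$s_1 = 2e_1 + 3e_2$, $d_1 = 4e_1 + 4e_2$, $d_2 = 3e_1 + 3e_2$, and $s_2 = 3e_1 + 4e_2$

Applying the regular geometric product of two vectors $V_1 = a_1 e_1 + a_2 e_2$ and $V_2 = b_1 e_1 + b_2 e_2$:

\[
V_1 V_2 =
\underbrace{\Big[ (a_1 b_1)\overbrace{e_1 \cdot e_1}^{e_i \cdot e_i = 1}
+ (a_1 b_2)\overbrace{e_1 \cdot e_2}^{e_i \cdot e_j = 0}
+ (a_2 b_1)\overbrace{e_2 \cdot e_1}^{e_j \cdot e_i = 0}
+ (a_2 b_2)\overbrace{e_2 \cdot e_2}^{e_j \cdot e_j = 1} \Big]}_{\text{dot product}}
+
\underbrace{\Big[ (a_1 b_1)\overbrace{e_1 \wedge e_1}^{e_i \wedge e_i = 0}
+ (a_1 b_2)\, e_1 \wedge e_2
+ (a_2 b_1)\overbrace{e_2 \wedge e_1}^{e_j \wedge e_i = -e_i \wedge e_j}
+ (a_2 b_2)\overbrace{e_2 \wedge e_2}^{e_j \wedge e_j = 0} \Big]}_{\text{wedge product}}
\]

Then the encryption of $d_1$ is

$E(d_1) = s_1 d_1 s_2$
$E(d_1) = (2e_1 + 3e_2)(4e_1 + 4e_2)(3e_1 + 4e_2)$
$E(d_1) = 44e_1 + 92e_2$

and the encryption of $d_2$ is

$E(d_2) = s_1 d_2 s_2$
$E(d_2) = (2e_1 + 3e_2)(3e_1 + 3e_2)(3e_1 + 4e_2)$
$E(d_2) = 33e_1 + 69e_2$

$E(d_1 + d_2) = (2e_1 + 3e_2)(7e_1 + 7e_2)(3e_1 + 4e_2)$
$E(d_1 + d_2) = 77e_1 + 161e_2$
$E(d_1) + E(d_2) = 77e_1 + 161e_2$

It proves $E(d_1 + d_2) = E(d_1) + E(d_2)$.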

5.3.2 Subtraction

$E(d_1 - d_2) = E(d_1) - E(d_2)$

Example:

When $d_1 = 8$ and $d_2 = 6$, then $d_1 - d_2 = 2$.

$s_1 = 2e_1 + 3e_2$
$d_1 = 4e_1 + 4e_2$
$s_2 = 3e_1 + 4e_2$

Then the encryptions are

$E(d_1) = 44e_1 + 92e_2$
$E(d_2) = 33e_1 + 69e_2$
$E(d_1 - d_2) = (2e_1 + 3e_2)(e_1 + e_2)(3e_1 + 4e_2)$
$E(d_1 - d_2) = 11e_1 + 23e_2$
$E(d_1) - E(d_2) = 11e_1 + 23e_2$

It proves $E(d_1 - d_2) = E(d_1) - E(d_2)$.

5.3.3 Multiplication

In vectors we have scalar multiplication.

When $d_1 = 8$ and the scalar $r_1 = 2$, then $E(r_1 d_1) = r_1 E(d_1)$.

$s_1 = 2e_1 + 3e_2$, $d_1 = 4e_1 + 4e_2$, and $s_2 = 3e_1 + 4e_2$

Then the encryptions are

$E(r_1 d_1) = (2e_1 + 3e_2)(8e_1 + 8e_2)(3e_1 + 4e_2)$
$E(r_1 d_1) = 88e_1 + 184e_2$
$E(d_1) = 44e_1 + 92e_2$
$r_1 E(d_1) = 88e_1 + 184e_2$

This proves $E(r_1 d_1) = r_1 E(d_1)$ for scalar multiplication.

5.3.4 Division

In vectors we have scalar division.

When $d_1 = 8$ and the scalar $r_1 = 1/2$, then $E(r_1 d_1) = r_1 E(d_1)$.

$s_1 = 2e_1 + 3e_2$, $d_1 = 4e_1 + 4e_2$, and $s_2 = 3e_1 + 4e_2$

Then the encryptions are

$E(r_1 d_1) = (2e_1 + 3e_2)(2e_1 + 2e_2)(3e_1 + 4e_2)$
$E(r_1 d_1) = 22e_1 + 46e_2$
$E(d_1) = 44e_1 + 92e_2$
$r_1 E(d_1) = 22e_1 + 46e_2$

This proves $E(r_1 d_1) = r_1 E(d_1)$ for scalar division.
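The four numerical examples in this section are instances of one general argument. The derivation below is a sketch that assumes the three-vector form $E(d) = s_1 d s_2$ used in this chapter and relies only on two standard facts: the geometric product distributes over addition, and scalars commute with every element.

```latex
\begin{align*}
E(d_1 + d_2) &= s_1 (d_1 + d_2)\, s_2
              = s_1 d_1 s_2 + s_1 d_2 s_2
              = E(d_1) + E(d_2), \\
E(r\, d_1)   &= s_1 (r\, d_1)\, s_2
              = r\, (s_1 d_1 s_2)
              = r\, E(d_1), \qquad r \in \mathbb{R}.
\end{align*}
```

Subtraction follows from the two lines together with $r = -1$, and scalar division is the second line with $r$ replaced by $1/r$ for $r \neq 0$.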


5.4 Security of RVTHE

This encryption method must be shown to be sufficiently secure. The design of RVTHE depends on versors (built from two-dimensional vectors), the geometric product, and vector inverses. The dimensions of the vectors contribute an extra layer of security. The security of RVTHE is assessed by applying simple mathematical manipulations to known information, namely known plaintext and known ciphertext, in an attempt to derive the keys.

CHAPTER 6

IMPLEMENTATION AND EVALUATION OF

RVTHE

In this chapter I discuss how I converted RVTHE into an executable program. The chapter covers the implementation of RVTHE in various applications and compares it with the AES-Crypt encryption method in terms of encryption and decryption speed. I evaluated RVTHE security at a high level by analyzing mathematical operations and performing statistical evaluations on ciphertexts.

I evaluated encryption, decryption, and the ability to update or append to real-time files without decrypting and re-encrypting under the RVTHE scheme. I ran these evaluations on a cloud system provided by one of the leading cloud providers, Amazon (AWS EC2).

6.1 Implementation of RVTHE

AES-Crypt is one of the most widely known methods for encrypting individual real-time files. It offers high speed and security. Following the AES-Crypt program as a model, I created an executable crypto program in the C language, based on the RVTHE scheme, as one package for encryption, decryption, and append. I executed it on real-time files for encryption, decryption, and appending new data to the end of the encrypted file without decrypting the original ciphertext.

I used the following commands to run encryption and decryption.

AES-Crypt:

time aescrypt -e -p key plaintext_file_name

time aescrypt -d -p key plaintext_file_name.aes

RVTHE:

time xlg -e -x key1 -y key2 plaintext_file_name

time xlg -d -x key1 -y key2 plaintext_file_name.xlg

time xlg -a -x key1 -y key2 "data" plaintext_file_name.xlg

In the above commands, '-e' indicates encryption, '-d' indicates decryption, and '-a' indicates append.

When I used a 512-bit key for the AES-Crypt program, I split that key into two 256-bit keys for the RVTHE (xlg) program. I did this for all key sizes from 64 to 1024 bits. For the evaluation, I chose the 256-bit key size.
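One simple way to derive two RVTHE keys from a single key of the AES-Crypt size is to cut it in half. The text does not specify how the split was performed, so the sketch below is an assumption for illustration only, and the function name `split_key` is hypothetical.

```python
# Sketch: split one key into two equal halves (assumed splitting rule).

def split_key(key):
    """Return (first_half, second_half) of a key given as bytes."""
    if len(key) == 0 or len(key) % 2 != 0:
        raise ValueError("key must have a nonzero, even number of bytes")
    half = len(key) // 2
    return key[:half], key[half:]


if __name__ == "__main__":
    key_512 = bytes(range(64))           # placeholder 512-bit key, not random
    key1, key2 = split_key(key_512)
    print(len(key1) * 8, len(key2) * 8)  # bit lengths of the two halves
```

In practice the input would come from a cryptographically secure random source rather than the fixed placeholder used here.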

6.2 Experimental Systems

The evaluations were conducted on a 64-bit Amazon EC2 virtual machine SSD instance. I chose the t2.micro instance type, which has 1 vCPU, 1GB of memory, and 8GB of maximum storage. I specifically selected a VM with SSD storage because SSDs offer high performance and have become an industry standard for cloud computing.

I chose AES-Crypt to develop baseline statistics on speed and on output file size after encryption. It uses the AES algorithm and is a symmetric method, like RVTHE. I compared the two in terms of encryption speed, decryption speed, and encrypted file size (disk storage used by the ciphertext).

In addition, I explain the additional security and efficiency benefits that are unique to a homomorphic encryption method.

6.3 Experimental Evaluations

I ran both executables (AES-Crypt and RVTHE) on various key sizes and file sizes. The key sizes were 64, 128, 256, 512, and 1024 bits, and the file sizes were 1MB, 10MB, and 100MB. From these runs I measured the speed of encryption and decryption, plus the storage size of the encrypted file on the cloud server.

6.3.1 Time measurements on various key sizes

[Graph 19. Title: Key Size and Time in sec on Regular SSD for 100MB file. Series: AES-Crypt Encryption, RVTHE Encryption, AES-Crypt Decryption, RVTHE Decryption. X-axis: Key Sizes in Bits (64, 128, 256, 512, 1024). Y-axis: Time in Sec (0 to 12).]

Graph 19 - Key size Vs Encryption/Decryption time in Sec

Graph 19 shows key size versus encryption and decryption time for RVTHE and AES-Crypt on a regular SSD. For encryption, I did not observe a sizeable increase in time at any key size for either method. Across the board, RVTHE required less time for encryption than AES-Crypt. For decryption, the fastest method was RVTHE at 64 bits. However, the decryption process for RVTHE took longer as the ciphertext size got bigger. The most commonly used key size is 256 bits; at that size, the speeds of the two encryption methods are almost the same.

Note: Having homomorphic features, as in RVTHE, means that full-file decryption should be rare. In other words, file updates do not require the full file to be decrypted.

6.3.2 Time measurements on various file sizes

[Graph 20. Title: Key Size and Time in sec on Regular SSD. Series: AES-Crypt Encryption, RVTHE Encryption, AES-Crypt Decryption, RVTHE Decryption. X-axis: File size in MB (1, 10, 100). Y-axis: Time in Sec (0 to 6).]

Graph 20 - File size and Encryption/Decryption times

I chose a 256-bit key size for all tests. Encryption and decryption took more time with larger file sizes for both RVTHE and AES-Crypt.

[Graph 21. Title: Key Size and Time in sec on Regular SSD. Series: AES-Crypt Encryption Rate, RVTHE Encryption Rate, AES-Crypt Decryption Rate, RVTHE Decryption Rate. X-axis: File size in MB (1, 10, 100). Y-axis: Time in Sec (0 to 0.08).]

Graph 21 - Key Size and Time on Regular SSD

Graph 20 and Graph 21 show that, across the board, RVTHE required less time for encryption than AES-Crypt. RVTHE encryption also performed at about the same rate regardless of file size. As with encryption, the RVTHE decryption process performed at about the same rate regardless of file size.

6.3.3 Size measurements on Encrypted Files

[Graph 22. Title: Encrypted file size in MB on Encrypted Volume in SSD. Series: AES-Crypt, xlg-Crypt. X-axis: Key Size (64, 128, 256, 512, 1024). Y-axis: Size of the file in MB (0 to 250).]

Graph 22 - Encrypted file sizes in MB

Graph 22 shows that the output file generated by the encryption process is always double the size of the original file for RVTHE, whereas AES-Crypt incurs only a 10% penalty. When you decrypt a 1GB file using AES-Crypt, you need an extra 1.2GB of space. For full-file decryption with RVTHE, you need 2.2GB of extra space, but this will be rare because RVTHE allows computation on ciphertext.
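The ciphertext sizes implied by these overheads can be worked out with a short calculation. The sketch below is only an arithmetic illustration: the function name is hypothetical, and the overhead figures (100% for RVTHE, 10% for AES-Crypt) are taken from the text rather than measured here.

```python
def ciphertext_size_mb(plaintext_mb: int, overhead_percent: int) -> float:
    """Expected ciphertext size for a given plaintext size and size overhead."""
    return plaintext_mb * (100 + overhead_percent) / 100


# Overheads as stated in the text: RVTHE doubles the file, AES-Crypt adds 10%.
for size_mb in (1, 10, 100):
    rvthe = ciphertext_size_mb(size_mb, 100)
    aes = ciphertext_size_mb(size_mb, 10)
    print(f"{size_mb} MB plaintext: RVTHE {rvthe} MB, AES-Crypt {aes} MB")
```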

6.4 Security Evaluation of RVTHE

Attackers can mount various kinds of attacks. I evaluate RVTHE against two major types of attack to show that the design of the RVTHE cipher is secure.

Ciphertext-Only:

Assume that an attacker has access to the ciphertext produced by RVTHE and nothing else. In such cases, it is not possible to find the plaintext or the secret key by using mathematical and statistical operations. A high-level evaluation of this follows.

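I represent data 1 as $d_1$, data 2 as $d_2$, secret key 1 as $s_1$, and secret key 2 as $s_2$.

Example:

When $d_1 = 8$ and $d_2 = 6$, then $d_1 + d_2 = 14$, with

\[
s_1 = 2e_1 + 3e_2, \quad d_1 = 4e_1 + 4e_2, \quad d_2 = 3e_1 + 3e_2, \quad s_2 = 3e_1 + 4e_2 .
\]

The geometric product of two vectors is

\[
V_1 V_2 =
\underbrace{\Big[ (a_1 b_1)\,\overbrace{e_1 \cdot e_1}^{e_i \cdot e_i = 1}
+ (a_1 b_2)\,\overbrace{e_1 \cdot e_2}^{e_i \cdot e_j = 0}
+ (a_2 b_1)\,\overbrace{e_2 \cdot e_1}^{e_j \cdot e_i = 0}
+ (a_2 b_2)\,\overbrace{e_2 \cdot e_2}^{e_j \cdot e_j = 1} \Big]}_{\text{dot product}}
+
\underbrace{\Big[ (a_1 b_1)\,\overbrace{e_1 \wedge e_1}^{e_i \wedge e_i = 0}
+ (a_1 b_2)\, e_1 \wedge e_2
+ (a_2 b_1)\, e_2 \wedge e_1
+ (a_2 b_2)\,\overbrace{e_2 \wedge e_2}^{e_j \wedge e_j = 0} \Big]}_{\text{wedge product}}
\]

Then the encryption of $d_1$ is

\[
E(d_1) = s_1 d_1 s_2 = (2e_1 + 3e_2)(4e_1 + 4e_2)(3e_1 + 4e_2) = 44e_1 + 92e_2 = C_1 .
\]

Then the encryption of $d_2$ is

\[
E(d_2) = s_1 d_2 s_2 = (2e_1 + 3e_2)(3e_1 + 3e_2)(3e_1 + 4e_2) = 33e_1 + 69e_2 = C_2 .
\]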
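As a check on the value of $C_1$, the product $s_1 d_1 s_2$ can be expanded step by step. The shorthand $e_{12} = e_1 \wedge e_2 = e_1 e_2$ is notation introduced here for convenience; the identities $e_{12} e_1 = -e_2$ and $e_{12} e_2 = e_1$ follow from $e_1 e_1 = e_2 e_2 = 1$ and $e_1 e_2 = -e_2 e_1$.

```latex
\begin{align*}
s_1 d_1 &= (2e_1 + 3e_2)(4e_1 + 4e_2) \\
        &= (2 \cdot 4 + 3 \cdot 4) + (2 \cdot 4 - 3 \cdot 4)\, e_{12}
         = 20 - 4 e_{12}, \\
(s_1 d_1)\, s_2 &= (20 - 4 e_{12})(3e_1 + 4e_2) \\
        &= 60 e_1 + 80 e_2 - 12\, e_{12} e_1 - 16\, e_{12} e_2 \\
        &= 60 e_1 + 80 e_2 + 12 e_2 - 16 e_1 \\
        &= 44 e_1 + 92 e_2 .
\end{align*}
```

The same steps with $d_2 = 3e_1 + 3e_2$ give $s_1 d_2 = 15 - 3e_{12}$ and then $33e_1 + 69e_2$.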

Ciphertexts $C_1$ and $C_2$ are produced from data 1 ($d_1$) and data 2 ($d_2$). Applying statistical methods to them is very hard because the ciphertexts are stored as two-dimensional vectors. Even after applying mathematical operations such as addition and subtraction, I do not see a way to derive the keys:

\[
C_1 + C_2 = 77e_1 + 161e_2
\]

\[
C_1 - C_2 = 11e_1 + 23e_2
\]
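The worked example can be reproduced with a few lines of code. The following is a minimal sketch of the two-dimensional geometric product, assuming the usual rules e1*e1 = e2*e2 = 1 and e1*e2 = -e2*e1; the type `MV` and the helper names are illustrative and are not taken from the xlg implementation.

```python
from typing import NamedTuple


class MV(NamedTuple):
    """Multivector s + e1*e1_coeff + e2*e2_coeff + e12*e12_coeff in 2D."""
    s: int
    e1: int
    e2: int
    e12: int


def gp(u: MV, v: MV) -> MV:
    """Geometric product with e1*e1 = e2*e2 = 1 and e1*e2 = -e2*e1 = e12."""
    return MV(
        u.s * v.s + u.e1 * v.e1 + u.e2 * v.e2 - u.e12 * v.e12,
        u.s * v.e1 + u.e1 * v.s - u.e2 * v.e12 + u.e12 * v.e2,
        u.s * v.e2 + u.e2 * v.s + u.e1 * v.e12 - u.e12 * v.e1,
        u.s * v.e12 + u.e12 * v.s + u.e1 * v.e2 - u.e2 * v.e1,
    )


def vec(x: int, y: int) -> MV:
    """Build the pure vector x*e1 + y*e2."""
    return MV(0, x, y, 0)


def encrypt(s1: MV, d: MV, s2: MV) -> MV:
    """E(d) = s1 d s2, as in the example."""
    return gp(gp(s1, d), s2)


def add(u: MV, v: MV) -> MV:
    return MV(*(p + q for p, q in zip(u, v)))


def sub(u: MV, v: MV) -> MV:
    return MV(*(p - q for p, q in zip(u, v)))


s1, s2 = vec(2, 3), vec(3, 4)
c1 = encrypt(s1, vec(4, 4), s2)  # 44*e1 + 92*e2
c2 = encrypt(s1, vec(3, 3), s2)  # 33*e1 + 69*e2
print(c1, c2, add(c1, c2), sub(c1, c2))
```

The scalar and e12 components of both ciphertexts come out as zero, which is why each ciphertext can be stored as a two-dimensional vector.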

Known-Plaintext:

In this case, an attacker has some plaintext/ciphertext pairs and uses them to try to derive the key. This is called a known-plaintext attack. I use statistical methods and mathematical manipulation to examine whether the keys can be derived.

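I represent data 1 as $d_1$, data 2 as $d_2$, secret key 1 as $s_1$, and secret key 2 as $s_2$.

Example:

When $d_1 = 8$ and $d_2 = 6$, then $d_1 + d_2 = 14$, with

\[
s_1 = 2e_1 + 3e_2, \quad d_1 = 4e_1 + 4e_2, \quad d_2 = 3e_1 + 3e_2, \quad s_2 = 3e_1 + 4e_2 .
\]

The geometric product of two vectors is

\[
V_1 V_2 =
\underbrace{\Big[ (a_1 b_1)\,\overbrace{e_1 \cdot e_1}^{e_i \cdot e_i = 1}
+ (a_1 b_2)\,\overbrace{e_1 \cdot e_2}^{e_i \cdot e_j = 0}
+ (a_2 b_1)\,\overbrace{e_2 \cdot e_1}^{e_j \cdot e_i = 0}
+ (a_2 b_2)\,\overbrace{e_2 \cdot e_2}^{e_j \cdot e_j = 1} \Big]}_{\text{dot product}}
+
\underbrace{\Big[ (a_1 b_1)\,\overbrace{e_1 \wedge e_1}^{e_i \wedge e_i = 0}
+ (a_1 b_2)\, e_1 \wedge e_2
+ (a_2 b_1)\, e_2 \wedge e_1
+ (a_2 b_2)\,\overbrace{e_2 \wedge e_2}^{e_j \wedge e_j = 0} \Big]}_{\text{wedge product}}
\]

Then the encryption of $d_1$ is

\[
E(d_1) = s_1 d_1 s_2 = (2e_1 + 3e_2)(4e_1 + 4e_2)(3e_1 + 4e_2) = 44e_1 + 92e_2 = C_1 .
\]

Then the encryption of $d_2$ is

\[
E(d_2) = s_1 d_2 s_2 = (2e_1 + 3e_2)(3e_1 + 3e_2)(3e_1 + 4e_2) = 33e_1 + 69e_2 = C_2 .
\]

Ciphertext $C_1$ corresponds to data 1 ($d_1$), and $C_2$ corresponds to data 2 ($d_2$).

Performing the following operations on the ciphertexts:

\[
C_1 + C_2 = 77e_1 + 161e_2
\]

\[
C_1 - C_2 = 11e_1 + 23e_2
\]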

Applying statistical methods is very hard because RVTHE is designed with two keys. Even after applying statistical methods and mathematical operations such as addition and subtraction, I was not able to derive the secret keys, because the math is implemented with two keys and two-dimensional vectors. There is a visible pattern in the e1 coefficients (11, 33, 44, and 77), but I found no way to guess the plaintext or the keys from it.

CHAPTER 7

LESSONS LEARNED AND FUTURE WORK

7.1 Challenges and Lessons Learned

This section will present the research flow, some of the challenges and issues that

were faced during the study of encryption methods and how I overcame those

challenges.

1. New research builds on previous work and improves it by eliminating shortcomings and risks in the same domain. When I began my research, I first educated myself about SSDs through surveys and previous literature. To understand SSDs further, I ran various benchmarks to measure their performance on AWS. After these tests, I was able to show that modern SSDs perform better under random workloads than previous SSDs. Because I found no deficiencies to investigate, it was difficult to find a problem for my research. I felt lost, so I took a step back, thought about what area of research I wanted to pursue, and started reading the existing literature on SSD security.

2. I deleted all the data from a local SSD and was able to recreate the files using freely available recovery software. That gave me the first step toward investigating SSD sanitization. I demonstrated that SSDs have limitations with respect to sanitization. This gave me some hope of finding a problem case, which is critical for any research. I read previous research papers on Cloud security and Storage security. Most of the previous work on Storage and Cloud security used encryption as the primary method to protect data. All of this showed that we cannot leave plaintext on the SSD. Still, I was not sure whether my findings were enough to pursue this as a research problem for further study.

3. I decided to survey the encryption methods available for Cloud Storage. I researched and read about various encryption methods and their weaknesses. While reading about these methods, I learned that encrypting data in the Cloud carries some overhead. I conveyed this to my advisor, and he provided the confidence and motivation I needed to do research in this area. With that support, I pursued my research vigorously.

4. I started investigating encryption methods and running them in the Cloud to see how much performance overhead they incurred and what security issues existed. It was hard to choose which encryption software to use for performance benchmarking. Over a few months, I selected some encryption methods to benchmark. I found that performance decreases by 20-50% when encryption software is used. I also found that hidden folders exist for encrypted containers. Now I knew that there were issues with these encryption methods. My next step was to expand my knowledge of encryption methods. I met with my professor, and he suggested that I consider homomorphic encryption. I started reading about homomorphic encryption in the existing literature. Homomorphic encryption allows computation on encrypted files without decrypting them, and my thought was that this method would incur less overhead. At this point, I felt one step closer and confident about my direction. I had a conversation with my advisor, and he explained that multi-vectors can be used to achieve encryption. With that, I knew my next step, which was to learn the mathematics of geometric algebra.

5. Mathematical knowledge is needed to create a secure and efficient cipher. My last work in mathematics was about 25 years ago. Trying to understand what my advisor was telling me about this new primitive of multi-vectors was not easy, to say the least. I spent many months studying geometric algebra to gain knowledge in that area. Finding an implementation of this math as an encryption cipher was challenging. By this time, another researcher had completed an implementation of a cipher using multi-vectors for his master’s thesis. I read his thesis and converted his design into an executable program. This multi-vector approach created a ciphertext file eight times the size of the original plaintext file. The large ciphertext file increased encryption and decryption time, so the method was going to be hard to use on bigger files. I discussed the results with my advisor, and he asked me to consider compression techniques. I ran the program on storage with deduplication enabled, thinking that this might reduce the output file size, but encryption and decryption speeds were very low. I learned that compression or any type of deduplication adds overhead on top of the encryption. My main challenge was to improve the multi-vector-based encryption, and I thought about it all the time. It took many months and I was getting discouraged, but I decided to keep learning. One night around 3:00 AM, I woke up with an idea for decreasing the size of the encrypted file using versors. I worked through the math by hand to make sure it was feasible. Immediately, I put these thoughts on paper with all the homomorphic properties and sent an email to my advisor. I learned to keep pushing myself.

6. Once I found that I could decrease the encrypted file size, designing RVTHE was a challenge because it produces a scalar and a multi-vector as intermediate products before generating the final output vector. I developed RVTHE into an executable program in the C language. It was also challenging to find a comparable encryption method. I found AES-Crypt, which is symmetric and uses AES encryption, making it a good match for comparison with RVTHE.

I enjoyed the process, but at times it was very stressful. I experienced how joyful it is to find a solution to a problem. This process made me grow into a better person.

7.2 Contributions

This chapter described the challenges of coming up with a new encryption method. I also discussed how I designed and implemented the new encryption cipher, RVTHE, and how I evaluated its performance and security strength.

The goal and purpose of this study is to explain what encryption can do for devices in terms of tradeoffs between performance and security. The main thought behind this thesis is that there is a way to have both high security and high performance without compromising either. This challenge prompted me to study how different types of encryption, such as BestCrypt, Dm-crypt, and SED, affect SSD security and performance. Each of these encryption products has its own drawbacks. The study showed that these encryption methods have performance differences for sequential and random reads and writes. Most enterprise workloads are random. However, little research has been done on workloads such as random reads and writes. With my experiments, I showed that random workloads perform better on newer SSD storage systems. I then evaluated how modern SSDs handle random workloads when encryption is used. Evaluating different workloads with many types of functions for different encryption methods produces various performance metrics. I went through selected methods (BestCrypt, Dm-crypt, and an encrypted Elastic Block Store volume from Amazon) to analyze their strengths and weaknesses on SSDs, evaluating both security and performance. This helped me conclude that homomorphic encryption is best suited for each workload in the Cloud.

Fully homomorphic encryption makes it possible to achieve cyber security because it allows a series of new computations on encrypted data. Technically, I can start with a zero-byte file, encrypt it, apply homomorphic operations to that file, and never expose or leave a data footprint on the disk. Rivest et al. [20] first mentioned this idea, and Gentry [2] first proposed fully homomorphic encryption, using binary circuits on encrypted data to perform basic mathematical operations. Other scholars inspired by Gentry’s approach improved his scheme or contributed new approaches. His theoretical approach to homomorphic encryption provided a new way to solve security problems, but his solution could not be applied easily and was thus impractical.

RVTHE is a new homomorphic symmetric encryption scheme based on Clifford Geometric Algebra. The foundation of this encryption is the mathematics of versors, the geometric product, and the inverse. Geometric Algebra is a critical part of its design and framework. The design of this new cipher is simple, but combining versors, the geometric product, and the inverse generates a strong cipher. It is a powerful and substantial cipher that fulfils the requirements for building a new cipher. These requirements include the security of the system in defending against various attacks, with smaller updates to the ciphertext. In this work, I showed how to design a cipher that follows these design principles, and I showed its application in the real world, covering its performance and a mathematical approach to security defense against attacks. I created a measurable benchmark to calculate encryption speeds. First, I ran experiments to understand encryption performance penalties on SSDs in the Cloud, and then I created an experimental environment to measure RVTHE performance.

7.3 Success of work

In this work, I developed Reduced Vector Technique Homomorphic Encryption (RVTHE), a symmetric and somewhat homomorphic encryption scheme. RVTHE was developed using versors and properties of Clifford Geometric Algebra. The evaluation of the implementation shows that I can edit or append to a file in 0.001 sec. For full-file encryption, RVTHE is 75% faster on encryption and 25% slower on decryption than the encryption software ‘AES-Crypt’. RVTHE reduced the ciphertext size to 25% of that produced by previous approaches, which used multi-vectors and Clifford Geometric Algebra; RVTHE therefore has the potential to be used on real workloads. It is a great success, as it is faster and more efficient, and its ciphertext takes only twice the size of the plaintext.

7.4 Future Work

In cloud computing, homomorphic encryption provides secure computing on encrypted data. It is an encryption method that allows users to compute in the Cloud without converting the ciphertext into plaintext. In recent years, there has been a lot of research and interest in homomorphic encryption, most of it focused on asymmetric homomorphic encryption. Very little research has been done on symmetric homomorphic encryption, even though some applications can use it very well. I proposed a very simple cryptographic primitive with low encryption and decryption times.

The RVTHE encryption method is a symmetric homomorphic encryption that supports addition, subtraction, scalar multiplication, and scalar division. RVTHE is developed using Clifford Geometric Algebra as a foundation. It uses vectors, versors, and the inverse of a vector. RVTHE showed promising preliminary results, indicating that it is feasible to apply to real files. The RVTHE algorithm was implemented as a program and executed on various file types and sizes. I also added an append feature to the program. A comparison was conducted between AES-Crypt and RVTHE in terms of encryption time, decryption time, and generated output file size.
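The four supported operations can be illustrated on the small example from Section 6.4 (keys 2e1 + 3e2 and 3e1 + 4e2, data 4e1 + 4e2 and 3e1 + 3e2). The sketch below rests on two assumptions that are not spelled out in the text: decryption is taken to be multiplication by the key inverses on each side, and a vector's inverse is taken to be the vector divided by its squared length. The function names are illustrative and do not come from the xlg program.

```python
from fractions import Fraction


def gp(u, v):
    """2D geometric product on (scalar, e1, e2, e12) tuples."""
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )


def vec(x, y):
    """Pure vector x*e1 + y*e2 with exact rational coefficients."""
    return (Fraction(0), Fraction(x), Fraction(y), Fraction(0))


def inverse(v):
    """Assumed vector inverse: v / (v . v)."""
    _, x, y, _ = v
    norm_sq = x * x + y * y
    return vec(x / norm_sq, y / norm_sq)


def encrypt(s1, d, s2):
    return gp(gp(s1, d), s2)


def decrypt(s1, c, s2):
    """Assumed decryption: s1^-1 C s2^-1."""
    return gp(gp(inverse(s1), c), inverse(s2))


def add(u, v):
    return tuple(p + q for p, q in zip(u, v))


def scale(u, k):
    """Multiply every component by a scalar; division is scaling by 1/k."""
    return tuple(p * Fraction(k) for p in u)


S1, S2 = vec(2, 3), vec(3, 4)
c1 = encrypt(S1, vec(4, 4), S2)
c2 = encrypt(S1, vec(3, 3), S2)

total = decrypt(S1, add(c1, c2), S2)             # addition on ciphertexts
diff = decrypt(S1, add(c1, scale(c2, -1)), S2)   # subtraction
tripled = decrypt(S1, scale(c1, 3), S2)          # scalar multiplication
halved = decrypt(S1, scale(c1, Fraction(1, 2)), S2)  # scalar division
print(total, diff, tripled, halved)
```

Under these assumptions the decrypted sum is 7e1 + 7e2, whose coefficients add up to 14, matching d1 + d2 = 14 in the example. Exact fractions are used so that the inverses introduce no rounding error.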

RVTHE was designed with three vectors in the current work. In future research, it can be extended to other designs, including more vectors as secret keys in the algorithm, and the performance of those designs can then be measured. Experimenting with new designs for encryption and decryption of various file sizes and file types is a good way to explore RVTHE. So far, I have added the code for addition, but it can be expanded to deletion, scalar multiplication, and scalar division. These additions to the program can be tested to see how they enhance the user experience of computing on encrypted data. RVTHE can be extended to various applications, operating systems, and hardware. We can also explore using RVTHE in applications and databases such as password stores. In addition, leveraging multithreading for encryption and decryption should improve performance.

There is also room to explore the scalability and availability of the algorithm. RVTHE has been implemented and applied to various types of files, such as .txt, .doc, .pdf, .xlsx, and .jpeg. It worked as expected, without any noise and with integrity preserved, but it would be helpful to extend the program to databases and to other application operations such as add, update, and delete. It would also be worthwhile to see whether this approach can be used in device-level encryption systems and whether it can be extended to all types of devices, including mobile and IoT.

When using Cloud computing, we can segregate the application and the computation. The RVTHE encryption method can be used to perform heavy computation by outsourcing it to the Cloud while upholding security. I performed a high-level security analysis of RVTHE, but an in-depth security analysis would strengthen confidence in its security. Any encryption adds overhead to the performance of reads and writes on an SSD.

The theory of Geometric Algebra is vast, and there is always room to find new encryption methods through deeper study of the area. Geometric algebra can be studied in more depth along with new encryption algorithms and new ciphers like RVTHE.

Overall, future work can apply the RVTHE encryption method to different types of technologies and systems, because RVTHE is a symmetric cipher like AES. This can include hardware systems and other features. In cloud computing, homomorphic encryption provides secure computing by allowing users to compute in the Cloud without converting the ciphertext into plaintext. RVTHE, an implementation of homomorphic encryption, satisfies that requirement efficiently. We can explore using RVTHE in various applications and databases such as password stores. Geometric algebra can be studied in more depth, and new encryption algorithms and ciphers can still be created.

CHAPTER 8

CONCLUSION

An SSD supports the following data functions: read, write, erase, purge, and secure. These functions are processed differently from the same functions on an HDD. This study began by showing that SSD storage has performance differences for sequential reads, sequential writes, and random reads and writes. Most enterprise workloads are a mix of random reads and random writes. This research showed that SSDs have changed over the years to handle random workloads better. On an SSD, the writing, purging, and securing functions are drastically different. Previous research has shown that some data is nearly impossible to delete completely on some SSDs. Research showed that data deleted from an SSD can be restored with block-level recovery software. This ultimately prompted me to study different types of encryption, including TrueCrypt, DiskCryptor, BestCrypt, Dm-crypt, VeraCrypt, tomb, BitLocker, and SED. I studied each of them to identify its weaknesses and strengths. I selected two software-based encryption methods (BestCrypt and Dm-crypt) and encrypted volumes (Elastic Block Store volumes from Amazon) for further study, to analyze their strengths and weaknesses for SSD security and performance. I chose sequential read, sequential write, sequential mixed, random read, random write, and random mixed workloads with different block sizes (4k, 8k, 16k, 32k, 64k, 128k). Evaluating these workloads with different block sizes and read/write ratios for BestCrypt, Dm-crypt, and the encrypted Elastic Block Store volume from Amazon produced various performance metrics. This research presented IOPS (Input/Output Operations Per Second) metrics to show how each encryption method affected different workloads. The results showed that encryption can cause a 20-50% performance decrease on SSD for TrueCrypt and BestCrypt software. The results also showed how modern software encryption methods affect storage devices such as SSDs in the Cloud, and that traditional symmetric encryption imposes high performance penalties on workloads.

Existing symmetric encryption software uses system resources such as memory and CPU when it encrypts and decrypts, which causes delays in operations. Securing data involves two stages: data at rest and data in transit. “Data at rest” is the stage before or after sending data to the Cloud. “Data in transit” is the stage during which data is sent between the Client and the Cloud. Further study of encryption methods leads to homomorphic encryption approaches, so we cannot ignore the possibilities and potential of homomorphic cryptography in cloud computing environments. Simple homomorphic encryption methods can be made feasible in cloud computing without sacrificing security, while enhancing the user experience of performing operations as needed on encrypted data. I studied the existing homomorphic encryption literature and found that most schemes are asymmetric and very slow. I found no homomorphic encryption scheme that is fast and can easily be implemented on real systems. RVTHE is the solution to this problem.

Using properties from Clifford Geometric Algebra, including versors, vectors, and the inverse of a vector, it is possible to design a homomorphic cipher that has a simple structure, versatility, flexibility of key assignment, and a speed that rivals previous approaches.

In conclusion, homomorphic encryption provides secure cloud computing by allowing users to compute in the cloud without converting the ciphertext into plaintext. RVTHE, an implementation of homomorphic encryption, satisfies this requirement efficiently.

REFERENCES

[1] E. Aïmeur and D. Schönfeld, "The ultimate invasion of privacy: Identity

theft," in 2011 Ninth Annual International Conference on Privacy, Security

and Trust, 2011.

[2] C. Gentry, "Fully Homomorphic Encryption Using Ideal Lattices," in

Proceedings of the Forty-first Annual ACM Symposium on Theory of

Computing, New York, NY, USA, 2009.

[3] C. Gentry and S. Halevi, "Implementing Gentry's Fully-homomorphic

Encryption Scheme," in Proceedings of the 30th Annual International

Conference on Theory and Applications of Cryptographic Techniques:

Advances in Cryptology, Berlin, 2011.

[4] O. Dictionaries, "Definition of security," [Online]. Available:

https://en.oxforddictionaries.com/definition/security.

[5] R. Kissel, Ed., "Glossary of key

information security terms," in NIST Interagency Reports NIST IR 7298

Revision 1, National Institute of Standards and Technology, 2011.

[6] N. Ferguson, B. Schneier and T. Kohno, Cryptography Engineering: Design Principles and Practical Applications, Wiley Publishing, 2010.

[7] S. Mauw and M. Oostdijk, "Foundations of Attack Trees," in

Proceedings of the 8th International Conference on Information Security

and Cryptology, Berlin, 2006.

[8] C. E. Shannon, "Communication theory of secrecy systems," The Bell

System Technical Journal, vol. 28, pp. 656-715, Oct 1949.

[9] "Intro-Samsung Elec. Datasheet (K9LBG08U0M).," 2007.

[10] J.-U. Kang, J.-S. Kim, C. Park, H. Park and J. Lee, "A Multi-channel

Architecture for High-performance NAND Flash-based Storage System," J.

Syst. Archit., vol. 53, pp. 644-658, sep 2007.

[11] R. Micheloni, A. Marelli and K. Eshghi, Inside Solid State Drives

(SSDs), Springer Publishing Company, Incorporated, 2012.

[12] B. Bosen, "Full Drive Encryption with Samsung Solid State Drives,"

nov 2010.

[13] P. Wang, G. Sun, S. Jiang, J. Ouyang, S. Lin, C. Zhang and J. Cong,

"An Efficient Design and Implementation of LSM-tree Based Key-value

Store on Open-channel SSD," in Proceedings of the Ninth European

Conference on Computer Systems, New York, NY, USA, 2014.

[14] D. E. Denning and P. J. Denning, "Data Security," ACM Comput. Surv.,

vol. 11, pp. 227-249, 9 1979.

[15] M. Tebaa, S. E. Hajji and A. E. Ghazi, "Homomorphic encryption

method applied to Cloud Computing," in 2012 National Days of Network

Security and Systems, 2012.

[16] S. Singh, The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography, New York: Anchor Books, 2000.

[17] J. Nechvatal, E. Barker, L. Bassham, W. Burr and M. Dworkin, "Report

on the development of the Advanced Encryption Standard (AES)," 2000.

[18] National Institute of Standards and Technology (NIST), "FIPS Publication 46-2: Data Encryption Standard," 1993.

[19] J. Nechvatal, E. Barker, D. Dodson, M. Dworkin, J. Foti and E. Roback,

"Status report on the first round of the development of the Advanced

Encryption Standard," Journal of Research of the National Institute of

Standards and Technology, vol. 104, 1999.

[20] R. L. Rivest, L. Adleman and M. L. Dertouzos, "On Data Banks and

Privacy Homomorphisms," Foundations of Secure Computation, Academic

Press, pp. 169-179, 1978.

[21] L. N. Childs, A Concrete Introduction to Higher Algebra, Volume1,

Springer, 1979.

[22] S. Burris and H. P. Sankappanavar, A Course in Universal Algebra-

With 36 Illustrations, 2006.

[23] A. Acar, H. Aksu, A. S. Uluagac and M. Conti, "A Survey on

Homomorphic Encryption Schemes: Theory and Implementation," CoRR,

vol. abs/1704.03578, 2017.

[24] A. López-Alt, E. Tromer and V. Vaikuntanathan, "On-the-fly

Multiparty Computation on the Cloud via Multikey Fully Homomorphic

Encryption," in Proceedings of the Forty-fourth Annual ACM Symposium

on Theory of Computing, New York, NY, USA, 2012.

[25] M. Tebaa and S. E. Hajji, "Secure Cloud Computing through

Homomorphic Encryption," CoRR, vol. abs/1409.0829, 2014.

[26] C. Moore, M. O'Neill, E. O'Sullivan, Y. Doröz and B. Sunar, "Practical

homomorphic encryption: A survey," in 2014 IEEE International

Symposium on Circuits and Systems (ISCAS), 2014.

[27] B. Schneier, Applied Cryptography (2nd ed.): Protocols, Algorithms, and Source Code in C, New York, NY, USA: John Wiley & Sons, Inc., 1995.

[28] D. W. H. A. D. A. Silva, "Fully Homomorphic Encryption over exterior

product spaces," 2017.

[29] K. Zhao, W. Zhao, H. Sun, X. Zhang, N. Zheng and T. Zhang, "LDPC-

in-SSD: Making Advanced Error Correction Codes Work Effectively in

Solid State Drives," in Presented as part of the 11th USENIX Conference

on File and Storage Technologies (FAST 13), San, 2013.

[30] P. Huang, P. Subedi, X. He, S. He and K. Zhou, "FlexECC: Partially

Relaxing ECC of MLC SSD for Better Cache Performance," in

Proceedings of the 2014 USENIX Conference on USENIX Annual

Technical Conference, Berkeley, 2014.

[31] M. Wei, L. M. Grupp, F. E. Spada and S. Swanson, "Reliably Erasing

Data from Flash-based Solid State Drives," in Proceedings of the 9th

USENIX Conference on File and Storage Technologies, Berkeley, 2011.

[32] J. Reardon, S. Capkun and D. Basin, "Data Node Encrypted File

System: Efficient Secure Deletion for Flash Memory," in Proceedings of

the 21st USENIX Conference on Security Symposium, Berkeley, 2012.

[33] Y. Choi, D. Lee, W. Jeon and D. Won, "Password-based Single-file

Encryption and Secure Data Deletion for Solid-state Drive," in Proceedings

of the 8th International Conference on Ubiquitous Information

Management and Communication, New York, NY, USA, 2014.

[34] National Institute of Standards and Technology, FIPS PUB 46-3: Data Encryption Standard (DES), NIST, 1999.

[35] K. Bhargavan and G. Leurent, "On the Practical (In-)Security of 64-bit

Block Ciphers: Collision Attacks on HTTP over TLS and OpenVPN," in

Proceedings of the 2016 ACM SIGSAC Conference on Computer and

Communications Security, New York, NY, USA, 2016.

[36] M. A. Wright, "Feature: The Advanced Encryption Standard," Netw.

Secur., vol. 2001, pp. 11-13, oct 2001.

[37] N. Ferguson, J. Kelsey, S. Lucks, B. Schneier, M. Stay, D. Wagner and

D. Whiting, "Improved Cryptanalysis of Rijndael," in Proceedings of the

7th International Workshop on Fast Software Encryption, London, 2001.

[38] A. Biryukov, O. Dunkelman, N. Keller, D. Khovratovich and A.

Shamir, Key Recovery Attacks of Practical Complexity on AES Variants

With Up To 10 Rounds, 2009.

[39] B. Schneier, "Description of a New Variable-Length Key, 64-bit Block

Cipher (Blowfish)," in Fast Software Encryption, Cambridge Security

Workshop, London, 1994.

[40] A. Biryukov and D. Wagner, Slide Attacks, L. Knudsen, Ed., Berlin,

Heidelberg: Springer Berlin Heidelberg, 1999, pp. 245-259.

[41] B. Schneier, J. Kelsey, D. Whiting, D. Wagner and C. Hall, "On the

Twofish Key Schedule," in Proceedings of the Selected Areas in

Cryptography, London, 1999.

[42] N. Ferguson, J. Kelsey, B. Schneier and D. Whiting, "A Twofish

Retreat: Related-Key Attacks Against Reduced-Round Twofish," 2000.

[43] J. J. G. Ortiz and K. J. Compton, "A Simple Power Analysis Attack on

the Twofish Key Schedule," CoRR, vol. abs/1611.07109, 2016.

[44] R. Anderson, E. Biham and L. Knudsen, Serpent: A Proposal for the

Advanced Encryption Standard, 1998.

[45] User:Dake commonswiki, "File:Serpent-linearfunction.png," 2005.

[Online]. Available: https://commons.wikimedia.org/wiki/File:Serpent-

linearfunction.png.

[46] M. Hermelin, J. Y. Cho and K. Nyberg, "Multidimensional Linear

Cryptanalysis of Reduced Round Serpent," in Proceedings of the 13th

Australasian Conference on Information Security and Privacy, Berlin,

2008.

[47] J. Rizzo and T. Duong, "Practical Padding Oracle Attacks," in

Proceedings of the 4th USENIX Conference on Offensive Technologies,

Berkeley, 2010.

[48] M. Liskov, R. L. Rivest and D. Wagner, "Tweakable Block Ciphers," in

Proceedings of the 22nd Annual International Cryptology Conference on

Advances in Cryptology, London, 2002.

[49] L. Martin, "XTS: A Mode of AES for Encrypting Hard Disks," IEEE

Security and Privacy, vol. 8, pp. 68-69, may 2010.

[50] D. A. McGrew and J. Viega, "The Security and Performance of the

Galois/Counter Mode (GCM) of Operation," in Proceedings of the 5th

International Conference on Cryptology in India, Berlin, 2004.


[51] Dm-crypt, "Dm-crypt," [Online]. Available:

https://wiki.archlinux.org/index.php/dm-crypt/Device_encryption.

[Accessed 10 12 2016].

[52] C. Fruhwirth, "LUKS- Wikipedia," [Online]. Available:

https://en.wikipedia.org/wiki/Linux_Unified_Key_Setup. [Accessed 2018].

[53] L. s. weakness, [Online]. Available:

https://thehackernews.com/2016/11/hacking-linux-system.html.

[54] d.-c. plausible-deniability, [Online]. Available:

https://blog.linuxbrujo.net/posts/plausible-deniability-with-luks/.

[55] M. Bauer, "Paranoid Penguin: BestCrypt: Cross-platform Filesystem

Encryption," Linux J., vol. 2002, pp. 9--, Jun. 2002.

[56] B. Daniel and K. Fowler, "Bypassing Self-Encrypting Drives (SED) in

Enterprise Environments," Europe, 2015.

[57] packetizer, "AES Crypt or AES-Crypt," 2018. [Online]. Available:

https://www.aescrypt.com.

[58] C. Gentry, "A fully homomorphic encryption scheme," 2009.

[59] W. Wang, Y. Hu, L. Chen, X. Huang and B. Sunar, "Accelerating fully

homomorphic encryption using GPU," in 2012 IEEE Conference on High

Performance Extreme Computing, 2012.


[60] J. Vince, Geometric Algebra: An Algebraic System for Computer

Games and Animation, 1st ed., Springer Publishing Company,

Incorporated, 2009.

[61] D. Davis, R. Ihaka and P. Fenstermacher, Cryptographic Randomness

from Air Turbulence in Disk Drives, Y. G. Desmedt, Ed., Berlin, Heidelberg:

Springer Berlin Heidelberg, 1994, pp. 114-120.

[62] J. Kim, J. M. Kim, S. H. Noh, S. L. Min and Y. Cho, "A Space-efficient

Flash Translation Layer for CompactFlash Systems," IEEE Trans. on

Consum. Electron., vol. 48, pp. 366-375, May 2002.

[63] R. Micheloni, A. Marelli and R. Ravasio, Error Correction Codes for

Non-Volatile Memories, 1st ed., Springer Publishing Company,

Incorporated, 2010.

[64] J. H. Stathis, "Reliability Limits for the Gate Insulator in CMOS

Technology," IBM J. Res. Dev., vol. 46, pp. 265-286, Mar. 2002.

[65] P. Olivo, T. N. Nguyen and B. Ricco, "High-field-induced degradation

in ultra-thin SiO2 films," IEEE Transactions on Electron Devices, vol. 35,

pp. 2259-2267, Dec. 1988.

[66] N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse and

R. Panigrahy, "Design Tradeoffs for SSD Performance," in USENIX 2008

Annual Technical Conference, Berkeley, 2008.


[67] A. Birrell, M. Isard, C. Thacker and T. Wobber, "A Design for

High-performance Flash Disks," New York, NY, USA, 2007.

[68] F. Chen, D. A. Koufaty and X. Zhang, "Understanding Intrinsic

Characteristics and System Implications of Flash Memory Based Solid

State Drives," in Proceedings of the Eleventh International Joint

Conference on Measurement and Modeling of Computer Systems, New

York, NY, USA, 2009.

[69] A. Gupta, Y. Kim and B. Urgaonkar, "DFTL: A Flash Translation Layer

Employing Demand-based Selective Caching of Page-level Address

Mappings," in Proceedings of the 14th International Conference on

Architectural Support for Programming Languages and Operating Systems,

New York, NY, USA, 2009.

[70] D. Park, B. Debnath and D. H. C. Du, "A Workload-Aware Adaptive

Hybrid Flash Translation Layer with an Efficient Caching Strategy," in

2011 IEEE 19th Annual International Symposium on Modelling, Analysis,

and Simulation of Computer and Telecommunication Systems, 2011.

[71] P. Thontirawong, M. Ekpanyapong and P. Chongstitvatana, "SCFTL:

An efficient caching strategy for page-level flash translation layer," in 2014

International Computer Science and Engineering Conference (ICSEC),

2014.


[72] A. Gupta, R. Pisolkar, B. Urgaonkar and A. Sivasubramaniam,

"Leveraging Value Locality in Optimizing NAND Flash-based SSDs," in

Proceedings of the 9th USENIX Conference on File and Storage

Technologies, Berkeley, 2011.

[73] F. Chen, T. Luo and X. Zhang, "CAFTL: A Content-aware Flash

Translation Layer Enhancing the Lifespan of Flash Memory Based Solid

State Drives," in Proceedings of the 9th USENIX Conference on File and

Storage Technologies, Berkeley, 2011.

[74] P. Huang, G. Wu, X. He and W. Xiao, "An Aggressive Worn-out Flash

Block Management Scheme to Alleviate SSD Performance Degradation," in

Proceedings of the Ninth European Conference on Computer Systems, New York, NY, USA, 2014.


[75] L.-P. Chang, "On Efficient Wear Leveling for Large-scale Flash-

memory Storage Systems," in Proceedings of the 2007 ACM Symposium on

Applied Computing, New York, NY, USA, 2007.

[76] Y. Hu, H. Jiang, D. Feng, L. Tian, H. Luo and C. Ren, "Exploring and

Exploiting the Multilevel Parallelism Inside SSDs for Improved

Performance and Endurance," IEEE Transactions on Computers, vol. 62,

pp. 1141-1155, Jun. 2013.


[77] Y. Kim, A. Gupta, B. Urgaonkar, P. Berman and A. Sivasubramaniam,

"HybridStore: A Cost-Efficient, High-Performance Storage System

Combining SSDs and HDDs," in 2011 IEEE 19th Annual International

Symposium on Modelling, Analysis, and Simulation of Computer and

Telecommunication Systems, 2011.

[78] D. Shue and M. J. Freedman, "From Application Requests to Virtual

IOPs: Provisioned Key-value Storage with Libra," in Proceedings of the

Ninth European Conference on Computer Systems, New York, NY, USA,

2014.

[79] F. Armknecht, C. Boyd, C. Carr, K. Gjøsteen, A. Jäschke, C. A. Reuter

and M. Strand, A Guide to Fully Homomorphic Encryption, 2015.

[80] S. S. Wagstaff Jr., Cryptanalysis of Number Theoretic Ciphers, CRC Press,

2002.

[81] C. Swenson, Modern cryptanalysis: techniques for advanced code

breaking, John Wiley & Sons, 2008.

[82] J. Yi-ming and L. Sheng-li, "The Analysis of Security Weakness in

BitLocker Technology," in Proceedings of the 2010 Second International

Conference on Networks Security, Wireless Communications and Trusted

Computing - Volume 01, Washington, 2010.

[83] J. Suter, Geometric Algebra Primer, 2013.


[84] K.-D. Suh, B.-H. Suh, Y.-H. Lim, J.-K. Kim, Y.-J. Choi, Y.-N. Koh, S.-

S. Lee, S.-C. Kwon, B.-S. Choi, J.-S. Yum and others, "A 3.3 V 32 Mb

NAND flash memory with incremental step pulse programming scheme,"

IEEE Journal of Solid-State Circuits, vol. 30, pp. 1149-1156, 1995.

[85] D. Stehlé, Floating-Point LLL: Theoretical and Practical Aspects,

Springer, 2010, pp. 179-213.

[86] D. Stehlé and R. Steinfeld, "Faster Fully Homomorphic Encryption,"

IACR Cryptology ePrint Archive, vol. 2010, p. 299, 2010.

[87] D. Stehlé and R. Steinfeld, "Faster Fully Homomorphic Encryption," in

ASIACRYPT, 2010.

[88] R. Snyder, "Some Security Alternatives for Encrypting Information on

Storage Devices," in Proceedings of the 3rd Annual Conference on

Information Security Curriculum Development, New York, NY, USA,

2006.

[89] B. Schneier, Secrets & Lies: Digital Security in a Networked World, 1st

ed., New York, NY, USA: John Wiley & Sons, Inc., 2000.

[90] V. Rijmen and B. Preneel, "Improved Characteristics for Differential

Cryptanalysis of Hash Functions Based on Block Ciphers," in Fast

Software Encryption: Second International Workshop. Leuven, Belgium,

14-16 December 1994, Proceedings, 1994.


[91] N. Palaniswamy, D. M. Dipesh, J. N. D. Kumar and S. G. Raaja,

"Notice of Violation of IEEE Publication Principles Enhanced Blowfish

algorithm using bitmap image pixel plotting for security improvisation," in

2010 2nd International Conference on Education Technology and

Computer, 2010.

[92] E. O'Sullivan and F. Regazzoni, "Efficient Arithmetic for Lattice-based

Cryptography: Special Session Paper," in Proceedings of the Twelfth

IEEE/ACM/IFIP International Conference on Hardware/Software

Codesign and System Synthesis Companion, New York, NY, USA, 2017.

[93] R. Olsson, Performance differences in encryption software versus

storage devices, 2012, p. 36.

[94] D. Mittal, D. Kaur and A. Aggarwal, "Secure Data Mining in Cloud

Using Homomorphic Encryption," in 2014 IEEE International Conference

on Cloud Computing in Emerging Markets (CCEM), 2014.

[95] K. Minematsu, "Improved Security Analysis of XEX and LRW Modes,"

in Proceedings of the 13th International Conference on Selected Areas in

Cryptography, Berlin, 2007.

[96] G. Campardo, R. Micheloni and D. Novosel, VLSI-Design of Non-Volatile Memories,

New York: Springer, 2005.


[97] D. Micciancio, The Geometry of Lattice Cryptography, A. Aldini and

R. Gorrieri, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp.

185-210.

[98] L. Martin, "XTS: A Mode of AES for Encrypting Hard Disks," IEEE

Security and Privacy, vol. 8, pp. 68-69, May 2010.

[99] J.-D. Lee, S.-H. Hur and J.-D. Choi, "Effects of floating-gate

interference on NAND flash memory cell operation," IEEE Electron Device

Letters, vol. 23, pp. 264-266, May 2002.

[100] S. K. Lai, J. Lee and V. K. Dham, "Electrical properties of nitrided-

oxide systems for use in gate dielectrics and EEPROM," in 1983

International Electron Devices Meeting, 1983.

[101] D. Kahng and S. M. Sze, "A floating gate and its application to memory

devices," The Bell System Technical Journal, vol. 46, pp. 1288-1295, Jul.

1967.

[102] C. Gentry, S. Halevi and N. P. Smart, "Homomorphic Evaluation of the

AES Circuit," in Proceedings of the 32nd Annual Cryptology Conference

on Advances in Cryptology --- CRYPTO 2012 - Volume 7417, New York,

NY, USA, 2012.

[103] N. Ferguson, J. Kelsey, S. Lucks, B. Schneier, M. Stay, D. Wagner and

D. Whiting, "Improved Cryptanalysis of Rijndael," in Proceedings of the

7th International Workshop on Fast Software Encryption, London, 2001.


[104] Y. Doröz, J. Hoffstein, J. Pipher, J. H. Silverman, B. Sunar, W. Whyte

and Z. Zhang, "Fully Homomorphic Encryption from the Finite Field

Isomorphism Problem," IACR Cryptology ePrint Archive, vol. 2017, p.

548, 2017.

[105] Diskcryptor, "Diskcryptor," 2011. [Online]. Available:

https://diskcryptor.net/wiki/Main_Page.

[106] W. Dai, Y. Doröz, Y. Polyakov, K. Rohloff, H. Sajjadpour, E. Savas

and B. Sunar, "Implementation and Evaluation of a Lattice-Based Key-

Policy ABE Scheme," IEEE Trans. Information Forensics and Security,

vol. 13, pp. 1169-1184, 2018.

[107] A. Czeskis, D. J. S. Hilaire, K. Koscher, S. D. Gribble, T. Kohno and B.

Schneier, "Defeating Encrypted and Deniable File Systems: TrueCrypt

V5.1a and the Case of the Tattling OS and Applications," Berkeley, 2008.

[108] J. H. Cheon and D. Stehlé, "Fully Homomorphic Encryption over the

Integers Revisited," IACR Cryptology ePrint Archive, vol. 2016, p. 837,

2016.

[109] J. H. Cheon and D. Stehlé, "Fully Homomorphic Encryption over the

Integers Revisited," in EUROCRYPT (1), 2015.

[110] K. K. Chauhan, A. K. S. Sanger and A. Verma, "Homomorphic

Encryption for Data Security in Cloud Computing," in 2015 International

Conference on Information Technology (ICIT), 2015.


[111] N. Chan, M. F. Beug, R. Knoefler, T. Mueller, T. Melde, M.

Ackermann, S. Riedel, M. Specht, C. Ludwig and A. T. Tilke, "Metal

control gate for sub-30nm floating gate NAND memory," in 2008 9th

Annual Non-Volatile Memory Technology Symposium (NVMTS), 2008.

[112] A. Chakraborti, C. Chen and R. Sion, "POSTER: DataLair: A Storage

Block Device with Plausible Deniability," in Proceedings of the 2016 ACM

SIGSAC Conference on Computer and Communications Security, New York, NY, USA, 2016.


[113] Z. Brakerski, C. Gentry and V. Vaikuntanathan, "(Leveled) Fully

Homomorphic Encryption Without Bootstrapping," in Proceedings of the

3rd Innovations in Theoretical Computer Science Conference, New York,

NY, USA, 2012.

[114] E. Biham, O. Dunkelman and N. Keller, "The Rectangle Attack -

Rectangling the Serpent," in Advances in Cryptology – Proceedings of

EUROCRYPT 2001, LNCS 2045, 2001.

[115] E. Biham, "New Types of Cryptanalytic Attacks Using Related Keys,"

in Advances in Cryptology --- Eurocrypt'93, Berlin, 1994.

[116] D. Benarroch, Z. Brakerski and T. Lepoint, "FHE over the Integers:

Decomposed and Batched in the Post-Quantum Regime," in Proceedings,

Part II, of the 20th IACR International Conference on Public-Key

Cryptography --- PKC 2017 - Volume 10175, New York, NY, USA, 2017.


Appendix A – Cloud Storage SSD

This appendix walks through the steps used to create VMs with various types of SSD storage in the Amazon cloud. Log in to the Amazon cloud and select EC2. Launch an instance and select the instance type: i2.xlarge or t2.micro. Both are SSD storage VMs.

1. Visit the Amazon Cloud Services EC2 website at

https://us-west-2.console.aws.amazon.com/console.

2. Create a VM following the instructions from Amazon.

In the first evaluation, I compared these two types of VMs and found that the storage-optimized VM performs better. First, I created the two types of VMs in AWS.

Instance type i2.xlarge follows:

Instance type t2.micro follows:



3. Installed the FIO benchmark tool.

root@ip-172-31-17-80:/home/ubuntu/Desktop# wget http://brick.kernel.dk/snaps/fio-2.1.10.tar.gz
root@ip-172-31-17-80:/home/ubuntu/Desktop# gunzip fio-2.1.10.tar.gz
root@ip-172-31-17-80:/home/ubuntu/Desktop# tar -xf fio-2.1.10.tar

Run the following command to collect the performance benchmarks:

fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw --size=1024m --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite60j8 --output=/home/output/4kdmcryptrandreadwrite60j8

Sample Generated output:

From this output, I gathered IOPS information for block sizes from 4 KB to 1024 KB on 1 GB files. I also measured the time taken by sequential and random reads and writes. I used these performance metrics to understand the performance characteristics of SSDs in the cloud.
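The IOPS figures gathered this way are tied to throughput through the block size, which is why they are reported per block size. A minimal sketch of that arithmetic in C follows; the function names and the sample numbers in the usage note are illustrative assumptions, not fio output or measurements from this study.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in KiB/s implied by an IOPS figure at a given block size in KiB.
 * Hypothetical helper for illustration; it is not part of fio. */
uint64_t iops_to_kib_per_sec(uint64_t iops, uint64_t block_kib)
{
    return iops * block_kib;
}

/* Average IOPS for a run that completed total_ios operations
 * in runtime_sec seconds (for example, --runtime=60). */
double average_iops(uint64_t total_ios, double runtime_sec)
{
    assert(runtime_sec > 0.0);
    return (double)total_ios / runtime_sec;
}
```

For instance, 1000 IOPS at a 4 KiB block size corresponds to 4000 KiB/s, whereas the same IOPS at a 1024 KiB block size corresponds to 1024000 KiB/s.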


Appendix B – Cloud Storage and Encryptions

The evaluation of the storage-optimized SSD VM against the regular SSD VM showed that the storage-optimized SSD outperformed the regular SSD. I then evaluated a regular SSD, a hardware-encrypted SSD, and software encryption containers by running the FIO benchmarks, to understand the performance penalties of encryption software in the cloud environment.

1. Visit Amazon Cloud Services EC2 website at

https://us-west-2.console.aws.amazon.com/console.

All the VM types are t2.micro (Variable ECUs, 1 vCPU, 2.5 GHz, Intel Xeon

Family, 1 GiB memory, EBS only). Ubuntu Server 16.04 LTS (HVM), SSD Volume

Type - ami-efd0428f

2. Create two VMs following the instructions from Amazon.

Instance type t2.micro with regular SSD


3. Created instance type t2.micro with encrypted SSD.

4. Installed the BestCrypt encryption software on a 3 GB volume and the dm-crypt software on a 3 GB volume on one of the t2.micro VMs with regular SSD.

root@ip-172-31-17-80: yum install gcc kernel-devel kernel-headers dkms
root@ip-172-31-17-80: wget -O /etc/yum.repos.d/bestcrypt.repo https://www.jetico.com/packages/el/bestcrypt.repo
root@ip-172-31-17-80: yum install bestcrypt bestcrypt-panel
root@ip-172-31-17-80: bctool new /root/BestCrypt -a Rijndael -s 3gb -d password
root@ip-172-31-17-80: bctool format /root/BestCrypt -t ext3
root@ip-172-31-17-80: Enter password:


root@ip-172-31-17-80:/sys/block/xvda/queue# apt-get install cryptsetup
Reading package lists... Done
Building dependency tree
Reading state information... Done
cryptsetup is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 50 not upgraded.
root@ip-172-31-17-80:/sys/block/xvda/queue# fallocate -l 2048M /root/dmcrypt
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

WARNING!
========
This will overwrite data on /root/dmcrypt irrevocably.

Are you sure? (Type uppercase yes): y
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

Are you sure? (Type uppercase yes): yes
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
Passphrases do not match.
root@ip-172-31-17-80:/sys/block/xvda/queue# cryptsetup -y luksFormat /root/dmcrypt

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
root@ip-172-31-17-80:/sys/block/xvda/queue# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            492M   12K  492M   1% /dev
tmpfs           100M  384K   99M   1% /run
/dev/xvda1      7.8G  3.7G  3.7G  50% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            497M   68K  497M   1% /run/shm
none            100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:/sys/block/xvda/queue# cd /root
root@ip-172-31-17-80:~# ls -las
total 2097208
      4 drwx------  8 root   root         4096 Apr  9 20:40 .
      4 drwxr-xr-x 22 root   root         4096 Apr  9 09:06 ..
      8 -rw-------  1 root   root         6914 Apr  9 13:17 .bash_history
      4 -rw-r--r--  1 root   root         3106 Feb 20  2014 .bashrc
      4 drwxr-xr-x  3 root   root         4096 Apr  9 10:11 BestCrypt
      4 drwx------  2 root   root         4096 Apr  9 09:07 .cache
      4 drwxr-xr-x  3 root   root         4096 Apr  9 10:02 .config
      4 drwx------  3 root   root         4096 Apr  9 10:02 .dbus
2097156 -rw-r--r--  1 root   root   2147483648 Apr  9 20:41 dmcrypt
      4 drwxr-xr-x  2 ubuntu ubuntu       4096 Apr  9 13:09 plain
      4 -rw-r--r--  1 root   root          140 Feb 20  2014 .profile
      4 drwx------  2 root   root         4096 Apr  9 09:06 .ssh
      4 -rw-------  1 root   root          648 Apr  9 10:41 .viminfo
root@ip-172-31-17-80:~# file /root/dmcrypt
/root/dmcrypt: LUKS encrypted file, ver 1 [aes, xts-plain64, sha1] UUID: 5302390d-a47a-47cc-99a7-a846d164197c
root@ip-172-31-17-80:~# cryptsetup luksOpen /root/dmcrypt dmcrypt
Enter passphrase for /root/dmcrypt:
root@ip-172-31-17-80:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            492M   12K  492M   1% /dev
tmpfs           100M  388K   99M   1% /run
/dev/xvda1      7.8G  3.7G  3.7G  50% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            497M   68K  497M   1% /run/shm
none            100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:~# mkfs.ext4 -j /dev/mapper/dmcrypt
mke2fs 1.42.9 (4-Feb-2014)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
131072 inodes, 523776 blocks
26188 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=536870912
16 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

root@ip-172-31-17-80:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            492M   12K  492M   1% /dev
tmpfs           100M  388K   99M   1% /run
/dev/xvda1      7.8G  3.7G  3.7G  50% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            497M   68K  497M   1% /run/shm
none            100M  8.0K  100M   1% /run/user
root@ip-172-31-17-80:~# mkdir dmcrypt
mkdir: cannot create directory ‘dmcrypt’: File exists
root@ip-172-31-17-80:~# pwd
/root
root@ip-172-31-17-80:~# cd /
root@ip-172-31-17-80:/# mkdir dmcrypt
root@ip-172-31-17-80:/# mount /dev/mapper/dmcrypt /dmcrypt
root@ip-172-31-17-80:/# df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 492M   12K  492M   1% /dev
tmpfs                100M  388K   99M   1% /run
/dev/xvda1           7.8G  3.7G  3.7G  50% /
none                 4.0K     0  4.0K   0% /sys/fs/cgroup
none                 5.0M     0  5.0M   0% /run/lock
none                 497M   68K  497M   1% /run/shm
none                 100M  8.0K  100M   1% /run/user
/dev/mapper/dmcrypt  2.0G  3.0M  1.9G   1% /dmcrypt
root@ip-172-31-17-80:/#

5. Run the FIO benchmarks on these four configurations: SSD, encrypted SSD, Dm-Crypt container, and BestCrypt container.

fio --filename=/dmcrypt/4krandreadwrite6040j8 --direct=1 --rw=randrw --size=1024m --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=60 --iodepth=8 --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite60j8 --output=/home/output/4kdmcryptrandreadwrite60j8


6. Sample generated output:

From output similar to this, I gathered IOPS information for block sizes from 4 KB to 1024 KB on 1 GB files, and measured the time taken by sequential and random reads and writes on all VMs. I used these performance metrics to compare SSD characteristics against the performance penalties of encryption software in the cloud. The results showed a performance overhead for software-based encryption compared with regular or encrypted SSDs.

7. Hidden encrypted container information.

The df -h command shown above can be run by any user and displays the encrypted container information, which may be a security concern.
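The performance overhead noted in step 6 can be expressed as a percentage of the unencrypted baseline. A minimal sketch in C, assuming each configuration is summarized by a single IOPS figure; the function name and the numbers in the usage note are hypothetical and are not results from this study.

```c
#include <assert.h>

/* Relative performance penalty, in percent, of an encrypted configuration
 * against an unencrypted baseline, both given in IOPS.
 * A positive result means the encrypted configuration is slower. */
double overhead_percent(double baseline_iops, double encrypted_iops)
{
    assert(baseline_iops > 0.0);
    return (baseline_iops - encrypted_iops) / baseline_iops * 100.0;
}
```

With a hypothetical baseline of 1000 IOPS and an encrypted container reaching 750 IOPS, the function returns a 25% penalty.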


Appendix C – Multi-Vector Based Encryption

After that, I performed a survey of homomorphic encryption techniques. I converted the multi-vector based homomorphic encryption proposed by David Williams Honorio Araujo Da Silva in his master's thesis into an executable program and ran it on an AWS VM for file-level encryption. Because this is a symmetric encryption scheme, I chose the AES-Crypt symmetric encryption software for comparison. I wrote a program of a similar type to AES-Crypt, performed file-level encryption with both, and compared the results.

//
// main.c
// XLogos with MPQ
//

#include <stdio.h>
#include "test_xlg.h"
#include "test_xlm.h"
#include "test_xlg_massive_encryption.h"

int main(int argc, const char * argv[]) {

    //====================== TEST XLM ======================
    //test_xlm_set_xlz();
    //test_xlm_set_int();
    //test_xlm_import();
    //test_xlm_encryption_decryption();
    //test_xlm_pack_unpack();

    //xlg_test_pair_unpair();
    //xlg_test_compression();

    //====================== TEST XLG ======================
    //test_xlg_encrypt_decrypt_str();
    //test_xlg_encrypt_decrypt_int();
    //test_xlg_encrypt_decrypt_file();
    //test_xlm_encode();
    //test_xlm_decode();

    encrypt_decrypt_file(argc,argv);

    return 0;
}

//
// xlg.h

#ifndef xlg_h
#define xlg_h

#include <stdio.h>
#include "xlm.h"
#include <time.h>

struct xlg_t{
    xlm_t key1;
    xlm_t key2;
    xlm_t key1_inverse;
    xlm_t key2_inverse;
};
typedef struct xlg_t xlg_t;

//============================== INIT and SET ==============================

void xlg_init(xlg_t *xlg);
void xlg_generate_keys(xlg_t *xlg, int key_size);
void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2);

//============================== OPERATIONS ==============================

void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg);

void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg);

//============================== UTILS ==============================

void xlg_clear(xlg_t *xlg);
void xlg_print(xlg_t xlg);

#endif /* xlg_h */

//
// xlg.c

#include "xlg.h"

//============================== INIT and SET ==============================

void xlg_init(xlg_t *xlg){
    xlm_init(&xlg->key1);
    xlm_init(&xlg->key2);
    xlm_init(&xlg->key1_inverse);
    xlm_init(&xlg->key2_inverse);
}

void xlg_generate_keys(xlg_t *xlg, int key_size){
    gmp_randstate_t state;
    gmp_randinit_default(state);
    time_t t;
    //Seed the generator from the current time (predictable; test use only)
    gmp_randseed_ui(state, time(&t));

    mpz_t key1_z;
    mpz_init(key1_z);
    mpz_urandomb(key1_z, state, key_size);
    xlm_t key1_m;
    xlm_init(&key1_m);
    xlm_set_z(&key1_m, key1_z);

    mpz_t key2_z;
    mpz_init(key2_z);
    mpz_urandomb(key2_z, state, key_size);
    xlm_t key2_m;
    xlm_init(&key2_m);
    xlm_set_z(&key2_m, key2_z);

    xlg_set_keys(xlg, key1_m, key2_m);

    //Clean up
    mpz_clear(key1_z);
    mpz_clear(key2_z);
    xlm_clear(&key1_m);
    xlm_clear(&key2_m);
    gmp_randclear(state);
}

void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2){
    //Set keys
    xlm_set(&xlg->key1,k1);
    xlm_set(&xlg->key2,k2);

    //Set inverse keys
    xlm_t key1_inverse;
    xlm_init(&key1_inverse);
    xlm_set(&key1_inverse,k1);
    xlm_inverse(&key1_inverse);
    xlm_set(&xlg->key1_inverse, key1_inverse);
    xlm_clear(&key1_inverse);

    xlm_t key2_inverse;
    xlm_init(&key2_inverse);
    xlm_set(&key2_inverse,k2);
    xlm_inverse(&key2_inverse);
    xlm_set(&xlg->key2_inverse, key2_inverse);
    xlm_clear(&key2_inverse);

}

//============================== OPERATIONS ==============================

void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg){
    //Encrypt: cypher = (key1 * message) * key2
    xlm_t gp1_encryption;
    xlm_init(&gp1_encryption);
    gmp_printf("Key: %Qd + %Qd\n", xlg.key1.m0, xlg.key1.m1);
    gmp_printf("Message: %Qd + %Qd\n", message.m0, message.m1);
    xlm_geometric_product(&gp1_encryption,&xlg.key1,&message);

    xlm_t cypher;
    xlm_init(&cypher);
    xlm_geometric_product_bivector(&cypher,&gp1_encryption,&xlg.key2);
    gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher.m0, cypher.m1);
    xlm_set(dest_cypher,cypher);

    //Clean up
    xlm_clear(&gp1_encryption);
    xlm_clear(&cypher);
}

void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg){
    //Decrypt: message = key1_inverse * (cypher * key2_inverse)
    xlm_t gp1_decryption;
    xlm_init(&gp1_decryption);
    //%Qd expects an mpq_t argument, so the members are passed without '&'
    gmp_printf("\nKey1 Inverse: %Qd + %Qd\n", xlg.key1_inverse.m0, xlg.key1_inverse.m1);
    gmp_printf("Key2 Inverse: %Qd + %Qd\n", xlg.key2_inverse.m0, xlg.key2_inverse.m1);
    xlm_geometric_product(&gp1_decryption,&cypher,&xlg.key2_inverse);
    gmp_printf("Decrypted Values: %Qd + %Qd\n", gp1_decryption.m0, gp1_decryption.m1);
    xlm_t decrypt;
    xlm_init(&decrypt);
    xlm_geometric_product_bivector_vector(&decrypt,&xlg.key1_inverse, &gp1_decryption);
    gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0, decrypt.m1);
    xlm_set(dest_decrypt,decrypt);

    //Clean up
    xlm_clear(&gp1_decryption);
    xlm_clear(&decrypt);

}

//============================== UTILS ==============================

void xlg_clear(xlg_t *xlg){
    xlm_clear(&xlg->key1);
    xlm_clear(&xlg->key2);
    xlm_clear(&xlg->key1_inverse);
    xlm_clear(&xlg->key2_inverse);
}

void xlg_print(xlg_t xlg){
    mpz_t key1;
    mpz_init(key1);
    xlm_get_z(&key1, xlg.key1);

    mpz_t key2;
    mpz_init(key2);
    xlm_get_z(&key2, xlg.key2);

    gmp_printf("key 1 => %Zd\n", key1);
    gmp_printf("key 2 => %Zd\n", key2);

    mpz_clear(key1);
    mpz_clear(key2);
}
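The structure of xlg_encrypt and xlg_decrypt above is a two-sided product: the message is multiplied by key1 on the left and key2 on the right, and decryption applies the two inverse keys. The following C sketch illustrates only that structure. It swaps in ordinary multiplication modulo a prime for the geometric product of multivectors with rational coefficients used in the listing, so it is commutative and says nothing about the security of the actual cipher. All names (TOY_P, toy_encrypt, toy_decrypt, and the helpers) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Prime modulus 2^31 - 1, chosen only for illustration (an assumption). */
#define TOY_P 2147483647ULL

/* Product modulo TOY_P; both reduced operands are below 2^31, so no overflow. */
uint64_t toy_mul(uint64_t a, uint64_t b)
{
    return (a % TOY_P) * (b % TOY_P) % TOY_P;
}

/* Modular exponentiation by repeated squaring. */
uint64_t toy_pow(uint64_t base, uint64_t exp)
{
    uint64_t result = 1;
    base %= TOY_P;
    while (exp > 0) {
        if (exp & 1)
            result = toy_mul(result, base);
        base = toy_mul(base, base);
        exp >>= 1;
    }
    return result;
}

/* Multiplicative inverse via Fermat's little theorem; k must be nonzero mod TOY_P. */
uint64_t toy_inverse(uint64_t k)
{
    assert(k % TOY_P != 0);
    return toy_pow(k, TOY_P - 2);
}

/* cipher = (k1 * m) * k2, the same order of products as xlg_encrypt. */
uint64_t toy_encrypt(uint64_t m, uint64_t k1, uint64_t k2)
{
    return toy_mul(toy_mul(k1, m), k2);
}

/* m = k1^-1 * (cipher * k2^-1), the same order of products as xlg_decrypt. */
uint64_t toy_decrypt(uint64_t c, uint64_t k1, uint64_t k2)
{
    return toy_mul(toy_inverse(k1), toy_mul(c, toy_inverse(k2)));
}
```

In this stand-in, decrypting an encrypted value returns the original message, and the sum of two ciphertexts (reduced modulo TOY_P) equals the encryption of the sum of the two messages, which is the additive homomorphic behavior that a two-sided key product gives when the product distributes over addition.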

//
// xlm.h

#ifndef xlm_h
#define xlm_h

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gmp.h>
#include <math.h>
#include "xlg_compression.h"

struct xlm_t {
    mpq_t m0;
    mpq_t m1;

};
typedef struct xlm_t xlm_t;

//============================== INIT and SET ==============================

void xlm_init(xlm_t * dest);
void xlm_set(xlm_t *dest, xlm_t src);
void xlm_set_z(xlm_t * dest, mpz_t z);
void xlm_set_si(xlm_t * dest, signed long int si);
void xlm_import_str(xlm_t * dest,char* str);
void xlm_import_str_w_size(xlm_t * dest,char* str, long size);

//============================== XLM EXPORT ==============================

void xlm_get_z(mpz_t *dest, xlm_t xlm);
signed long int xlm_get_si(xlm_t xlm);
char* xlm_export_str(xlm_t xlm, long *buffer_size);

//============================== UTILS ==============================

void xlm_print(xlm_t m);
void xlm_clear(xlm_t * m);

void xlm_pack(mpz_t dst, xlm_t src);
void xlm_unpack(xlm_t *dst, mpz_t src);
size_t xlm_out_raw(FILE* stream, xlm_t src);
size_t xlm_inp_raw(xlm_t *dst,FILE* stream);

//============================== OPERATIONS ==============================

void xlm_geometric_product(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_geometric_product_bivector(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_clifford_conjugation(xlm_t *m);
void xlm_reverse(xlm_t *m);
void xlm_amplitude_squared(xlm_t * m);
void xlm_amplitude_squared_reversed(xlm_t * m);
void xlm_rationalize(xlm_t *m);
void xlm_scalar_div(xlm_t * m, mpq_t scalar);
void xlm_inverse(xlm_t *m);
void xlm_lambda_0(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_0_bivector(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1_bivector(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_0_bivector_vector(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1_bivector_vector(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);

#endif /* xlm_h */
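The xlm_pack and xlm_unpack functions declared above store the signs of the two coefficients in a single byte next to the paired absolute values. For the round trip to work, packing and unpacking must test the same bit positions. A minimal sketch of that sign byte in C, using plain long values instead of GMP types; the macro and function names are hypothetical, and the bit assignment (bit 7 for m0, bit 6 for m1) is taken from the pow(2,7) and pow(2,6) terms in the xlm_pack implementation in xlm.c.

```c
#include <assert.h>
#include <stdint.h>

#define SIGN_M0 0x80u  /* bit 7 set: coefficient m0 is negative */
#define SIGN_M1 0x40u  /* bit 6 set: coefficient m1 is negative */

/* Build the sign byte for two coefficients. */
uint8_t pack_signs(long m0, long m1)
{
    uint8_t signs = 0;
    if (m0 < 0)
        signs |= SIGN_M0;
    if (m1 < 0)
        signs |= SIGN_M1;
    return signs;
}

/* Reapply the stored signs to the absolute values held in *m0 and *m1. */
void unpack_signs(uint8_t signs, long *m0, long *m1)
{
    assert(m0 != 0 && m1 != 0);
    if (signs & SIGN_M0)
        *m0 = -*m0;
    if (signs & SIGN_M1)
        *m1 = -*m1;
}
```

Packing the pair (-5, 7) yields the byte 0x80, and unpacking that byte onto the absolute values (5, 7) restores (-5, 7). Testing different bits on the two sides, for example bits 0 and 1 when unpacking, would silently drop the signs.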

//
// xlm.c

#include "xlm.h"

//============================== INIT and SET ==============================

void xlm_init(xlm_t * dest){
    mpq_init(dest->m0);
    mpq_init(dest->m1);

}

void xlm_set(xlm_t *dest, xlm_t src){
    mpq_set(dest->m0, src.m0);
    mpq_set(dest->m1, src.m1);
}

void xlm_set_z(xlm_t * dest, mpz_t z){
    //Init base and remainder
    mpz_t base;
    mpz_init(base);
    mpz_t remainder;
    mpz_init(remainder);

    //Compute values
    mpz_div_ui(base,z,2);
    mpz_mod_ui(remainder,z,2);

    //Get remainder in mpq
    mpq_t remainder_mpq;
    mpq_init(remainder_mpq);
    mpq_set_z(remainder_mpq,remainder);

    mpq_set_z(dest->m0,base);
    mpq_set_z(dest->m1, base);

    mpq_add(dest->m1,dest->m1,remainder_mpq);

    //Adjust coefficients
    if(mpz_cmp_ui(remainder,0) == 0){
        mpq_t mpq_1;
        mpq_init(mpq_1);
        mpq_set_ui(mpq_1,1,1);
        mpq_add(dest->m1, dest->m1, mpq_1);
        mpq_clear(mpq_1);
    }

    mpz_clear(base);
    mpz_clear(remainder);
    mpq_clear(remainder_mpq);
}

void xlm_set_si(xlm_t * dest, signed long int si){
    mpz_t z;
    mpz_init_set_si(z,si);
    xlm_set_z(dest,z);
    mpz_clear(z);
}

void xlm_import_str(xlm_t * dest,char* str){
    mpz_t z;
    mpz_init(z);
    //Use the string length; sizeof(str) would give the size of the pointer
    mpz_import(z,strlen(str),1,sizeof(str[0]), 0, 0,str);
    xlm_set_z(dest,z);
    mpz_clear(z);
}

void xlm_import_str_w_size(xlm_t * dest,char* str, long size){
    mpz_t z;
    mpz_init(z);
    mpz_import(z,size,1,sizeof(str[0]),0, 0,str);
    xlm_set_z(dest,z);
    mpz_clear(z);
}

//============================== XLM GET ==============================

void xlm_get_z(mpz_t *dest, xlm_t xlm){
    mpz_t mpz_m0;mpz_init(mpz_m0);
    mpz_t mpz_m1;mpz_init(mpz_m1);

    mpz_set_q(mpz_m0,xlm.m0);
    mpz_set_q(mpz_m1,xlm.m1);

    mpz_add(*dest, mpz_m0, mpz_m1);

    mpz_clear(mpz_m0);
    mpz_clear(mpz_m1);
}

signed long int xlm_get_si(xlm_t xlm){
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z,xlm);
    signed long int si = mpz_get_si(z);
    mpz_clear(z);
    return si;
}

char* xlm_export_str(xlm_t xlm, long *buffer_size){
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z,xlm);

    //Alloc memory to destination buffer
    long size =sizeof(char);
    long nail = 0;
    long numb = 8*size - nail;
    long count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
    char* buffer;
    buffer = malloc(count * size);
    if(buffer == NULL){
        mpz_clear(z);
        return NULL;
    }

    //Check the pointer itself, not the value it points to
    if(buffer_size != NULL){
        *buffer_size =count * size;
    }

    //Export to buffer
    mpz_export(buffer, NULL, 1, size, 0, nail, z);
    mpz_clear(z);

    return buffer;
}

306.//============================== UTILS ==============================

307.void xlm_clear(xlm_t * m){308.mpq_clear(m->m0);309.mpq_clear(m->m1);

310.}

311.void xlm_print(xlm_t m){312.gmp_printf("%+Qd e0 ", m.m0);313.gmp_printf("%+Qd e1 \n", m.m1);

314.}

void xlm_pack(mpz_t dst, xlm_t src){
    mpz_t m0_m1;
    mpz_t m0,m1;
    mpz_inits(m0_m1,m0,m1,NULL);

    //Get mpz values of coefficients
    mpz_set_q(m0,src.m0);
    mpz_set_q(m1,src.m1);

    //Set absolute values
    mpz_abs(m0,m0);
    mpz_abs(m1,m1);

    //Pair coefficients
    xlg_pair(dst, m0, m1);

    //Pack signs of coefficients: bit 7 flags a negative m0, bit 6 a negative m1
    //(mpq_sgn replaces mpq_cmp_si(q,0,0), whose zero denominator is not a valid rational)
    unsigned int signs = 0;
    signs = signs + ((mpq_sgn(src.m0)<0)? (1u<<7):0);
    signs = signs + ((mpq_sgn(src.m1)<0)? (1u<<6):0);

    //Append the sign byte below the paired value
    mpz_mul_ui(dst,dst,256);
    mpz_add_ui(dst,dst,signs);

    mpz_clears(m0_m1,m0,m1,NULL);
}

void xlm_unpack(xlm_t *dst, mpz_t src){
    mpz_t m0_m1;
    mpz_t m0,m1;
    mpz_inits(m0_m1,m0,m1,NULL);

    //Get signs: the low byte holds the sign flags written by xlm_pack
    mpz_t signs_z;
    mpz_init(signs_z);
    mpz_fdiv_r_ui(signs_z,src,256);
    mpz_fdiv_q_ui(src,src,256);
    unsigned long signs = mpz_get_ui(signs_z);

    //Unpair coefficients
    xlg_unpair(m0, m1, src);

    //Adjust sign: xlm_pack sets bit 7 for a negative m0 and bit 6 for a negative m1
    if((signs & 128) > 0)
        mpz_neg(m0,m0);
    if((signs & 64) > 0)
        mpz_neg(m1,m1);

    //Set coefficients
    mpq_set_z(dst->m0,m0);
    mpq_set_z(dst->m1,m1);

    mpz_clear(signs_z);
    mpz_clears(m0_m1,m0,m1,NULL);
}

358.size_t xlm_out_raw(FILE* stream, xlm_t src){

359.mpz_t blades[2];360.for (int i = 0; i < 2; i++) {361.mpz_init(blades[i]);362.}

363.mpz_set_q(blades[0],src.m0);364.mpz_set_q(blades[1],src.m1);

365.size_t size = 0;366.for (int i = 0; i < 2; i++) {367.size += mpz_out_raw(stream,blades[i]);368.}

369.for (int i = 0; i < 2; i++) {370.mpz_clear(blades[i]);371.}372.return size;373.}

374.size_t xlm_inp_raw(xlm_t *dst,FILE* stream){

375.mpz_t blades[2];376.for (int i = 0; i < 2; i++) {377.mpz_init(blades[i]);378.}379.size_t rsize = 0;380.for (int i = 0; i < 2; i++) {381.rsize += mpz_inp_raw(blades[i],stream);382.}

383.mpq_set_z(dst->m0,blades[0]);384.mpq_set_z(dst->m1,blades[1]);

385.for (int i = 0; i < 2; i++) {386.mpz_clear(blades[i]);387.}

388.return rsize;389.}

390.//============================== OPERATIONS ==============================

391.void xlm_geometric_product(xlm_t * dest, xlm_t * m0, xlm_t * m1){392.xlm_lambda_0(&dest->m0,m0,m1);393.xlm_lambda_1(&dest->m1,m0,m1);

394.}


395.void xlm_geometric_product_bivector(xlm_t * dest, xlm_t * m0, xlm_t * m1){

396.xlm_lambda_0_bivector(&dest->m0,m0,m1);397.xlm_lambda_1_bivector(&dest->m1,m0,m1);

398.}

399.void xlm_geometric_product_bivector_vector(xlm_t * dest, xlm_t * m0, xlm_t * m1){

400.xlm_lambda_0_bivector_vector(&dest->m0,m0,m1);401.xlm_lambda_1_bivector_vector(&dest->m1,m0,m1);

402.}

403.void xlm_lambda_0(mpq_t *m, xlm_t * mv1, xlm_t *mv2){404.mpq_t ma;405.mpq_t mb;

406.mpq_init(ma);407.mpq_init(mb);408.mpq_mul(ma,mv1->m0,mv2->m0);409.mpq_mul(mb,mv1->m1,mv2->m1);410.mpq_add(*m,*m,ma);411.mpq_add(*m,*m,mb);412.mpq_clear(ma);413.mpq_clear(mb);

414.}


418.mpq_init(ma);419.mpq_init(mb);420.mpq_mul(ma,mv1->m0,mv2->m1);421.mpq_mul(mb,mv1->m1,mv2->m0);422.mpq_add(*m,*m,ma);423.mpq_sub(*m,*m,mb);424.mpq_clear(ma);425.mpq_clear(mb);

426.}

427.void xlm_lambda_0_bivector(mpq_t *m, xlm_t * mv1, xlm_t *mv2){428.mpq_t ma;429.mpq_t mb;

430.mpq_init(ma);431.mpq_init(mb);


432.mpq_mul(ma,mv1->m0,mv2->m0);433.mpq_mul(mb,mv1->m1,mv2->m1);434.mpq_add(*m,*m,ma);435.mpq_add(*m,*m,mb);436.mpq_clear(ma);437.mpq_clear(mb);

438.}



450.}

451.void xlm_lambda_0_bivector_vector(mpq_t *m, xlm_t * mv1, xlm_t *mv2){

452.mpq_t ma;453.mpq_t mb;


462.}





474.}

void xlm_clifford_conjugation(xlm_t *m) {
    //All sign changes are disabled in this version, so the conjugate equals m
    //mpq_neg(m->m0,m->m0);
    //mpq_neg(m->m1,m->m1);
    //mpq_neg(m->m1,m->m1);
    (void)m;
}

void xlm_reverse(xlm_t *m) {
    //Negate the m1 coefficient. mpq_neg is used because mpq_set_si(minus_one,1,-1)
    //does not produce -1: the denominator argument is unsigned.
    //mpq_neg(m->m0,m->m0);
    //mpq_neg(m->m0,m->m0);
    mpq_neg(m->m1,m->m1);
}


void xlm_amplitude_squared(xlm_t * m) {
    gmp_printf("Input1: %Qd\n", m->m0);
    gmp_printf("Input2: %Qd\n", m->m1);

    //Compute clifford conjugation of m and store it on clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);
    gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m0);
    gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m1);

    //Compute geometric product of m and clifford_conj and store it on amplitude_squared
    xlm_t amplitude_squared;
    xlm_init(&amplitude_squared);
    xlm_geometric_product(&amplitude_squared, m, &clifford_conj);

    gmp_printf("Amplitude squared: %Qd\n", amplitude_squared.m0);

    //Make the pointer content equal amplitude_squared
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&amplitude_squared);
}

516.void xlm_amplitude_squared_reversed(xlm_t * m) {517.//Compute amplitude squared of m518.xlm_t amplitude_squared_reversed;519.xlm_init(&amplitude_squared_reversed);520.xlm_set(&amplitude_squared_reversed, *m);521.xlm_amplitude_squared(&amplitude_squared_reversed);

522.//Compute reverse of mv_amplitude_squared and store it on mv_amplitude_squared

523.xlm_reverse(&amplitude_squared_reversed);

524.//Make the pointer content equal amplitude_squared_reversed525.xlm_clear(m);526.xlm_init(m);527.xlm_set(m, amplitude_squared_reversed);

528.//Clean up529.xlm_clear(&amplitude_squared_reversed);530.}

void xlm_rationalize(xlm_t *m) {
    //Compute amplitude squared of m and store it on mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);

    //Compute amplitude squared reversed of m and store it on mv_amplitude_squared_reversed
    xlm_t mv_amplitude_squared_reversed;
    xlm_init(&mv_amplitude_squared_reversed);
    xlm_set(&mv_amplitude_squared_reversed, *m);
    xlm_amplitude_squared_reversed(&mv_amplitude_squared_reversed);

    //Compute geometric product of mv_amplitude_squared and mv_amplitude_squared_reversed
    //and store it on mv_geometric_product
    xlm_t mv_geometric_product;
    xlm_init(&mv_geometric_product);
    xlm_geometric_product(&mv_geometric_product,&mv_amplitude_squared,&mv_amplitude_squared_reversed);

    //Make the pointer content equal mv_geometric_product
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m,mv_geometric_product);

    //Clean up
    xlm_clear(&mv_amplitude_squared);
    xlm_clear(&mv_amplitude_squared_reversed);
    xlm_clear(&mv_geometric_product);
}

555.void xlm_scalar_div(xlm_t *m, mpq_t scalar) {


556.mpq_div(m->m0, m->m0, scalar);557.mpq_div(m->m1, m->m1, scalar);

558.}

559.void xlm_inverse(xlm_t *m){560.//Compute clifford congugation of m and store it on clifford_conj561.xlm_t clifford_conj;562.xlm_init(&clifford_conj);563.xlm_set(&clifford_conj, *m);564.xlm_clifford_conjugation(&clifford_conj);


566.xlm_t mv_amplitude_squared;567.xlm_init(&mv_amplitude_squared);568.xlm_set(&mv_amplitude_squared, *m);569.xlm_amplitude_squared(&mv_amplitude_squared);

570.//Rationalize571.xlm_t mv_rationalize;572.xlm_init(&mv_rationalize);573.xlm_set(&mv_rationalize, *m);574.xlm_rationalize(&mv_rationalize);

575.xlm_t mv_geometric_product;576.xlm_init(&mv_geometric_product);577.xlm_set(&mv_geometric_product, *m);578.//Perform scalar div on geometric product579.xlm_scalar_div(&mv_geometric_product, mv_amplitude_squared.m0);

580.//Make the pointer content equal mv_geometric_product581.xlm_clear(m);582.xlm_init(m);583.xlm_set(m, mv_geometric_product);

584.//Clean up585.xlm_clear(&clifford_conj);586.//xlm_clear(&mv_amplitude_squared_reversed);587.//xlm_clear(&mv_geometric_product);588.xlm_clear(&mv_rationalize);589.}

//
// xlg_massive_encryption.h

#ifndef xlg_massive_encryption_h
#define xlg_massive_encryption_h

#include <stdio.h>
#include "xlg.h"

void xlg_encrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_decrypt_file(char* src, char* dst, xlg_t xlg);
void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg);
void xlg_encode(xlg_t xlg);
void xlg_decode(xlg_t xlg);

#endif /* xlg_massive_encryption_h */

602.//603.// xlg_massive_encryption.c

604.#include "xlg_massive_encryption.h"605.int BUFFER_SIZE = 1024*10;

606.void xlg_encrypt_file(char* src, char* dst, xlg_t xlg){607.FILE *src_file = fopen(src, "rb");608.FILE *dst_file= fopen(dst, "wb");

609.while (!feof(src_file)) {610.//Read file611.long nread = 1;612.char buffer[BUFFER_SIZE];613.buffer[0] = 1;614.while(nread<BUFFER_SIZE-1 && !feof(src_file)){615.int c = getc(src_file);616.if(c!= EOF){617.buffer[nread]=c;618.nread++;619.}620.}621.//Import622.xlm_t message;623.xlm_init(&message);624.xlm_import_str_w_size(&message,buffer,nread);

625.gmp_printf("Message Values: %Qd + %Qd\n", message.m0, message.m1);

626.//Encrypt627.xlm_t cypher_xlm;628.xlm_init(&cypher_xlm);629.xlg_encrypt(&cypher_xlm, message, xlg);630.gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher_xlm.m0,

cypher_xlm.m1);631.//Write to file632.xlm_out_raw(dst_file,cypher_xlm);

633.//Clean up634.xlm_clear(&message);635.xlm_clear(&cypher_xlm);636.}637.fclose(src_file);638.fclose(dst_file);639.}


640.void xlg_decrypt_file(char* src, char* dst, xlg_t xlg){641.FILE *src_file= fopen(src, "rb");642.FILE *dst_file= fopen(dst, "wb");

643.while(!feof(src_file)){644.//Read File645.xlm_t cypher_xlm;646.xlm_init(&cypher_xlm);647.size_t nread = xlm_inp_raw(&cypher_xlm, src_file);648.if(nread <=0){649.xlm_clear(&cypher_xlm);650.break;651.}

652.//Decrypt653.xlm_t decrypt;654.xlm_init(&decrypt);655.xlg_decrypt(&decrypt,cypher_xlm,xlg);656.gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0,

decrypt.m1);

657.//Export658.long size;659.char* buffer = xlm_export_str(decrypt,&size);

660.//Write file661.long nwrite = 1;662.while(nwrite<size){663.putc(buffer[nwrite++], dst_file);664.}

665.//Clean up666.free(buffer);667.xlm_clear(&cypher_xlm);668.xlm_clear(&decrypt);669.}

670.fclose(src_file);671.fclose(dst_file);672.}

673.void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg){

674.FILE *dst_file= fopen(dst_path, "ab");

675.char * buffer = malloc(strlen(data_buffer)+2);676.memset(buffer,0,strlen(data_buffer)+2);677.buffer[0]=1;678.buffer = strcat(buffer, data_buffer);

679.//Import680.xlm_t data;681.xlm_init(&data);682.xlm_import_str_w_size(&data,buffer,strlen(data_buffer)+2);

683.//Encrypt684.xlm_t cypher_xlm;685.xlm_init(&cypher_xlm);686.xlg_encrypt(&cypher_xlm, data, xlg);


687.//Write to file688.xlm_out_raw(dst_file,cypher_xlm);

689.//Clean up690.xlm_clear(&data);691.xlm_clear(&cypher_xlm);692.free(buffer);

693.fclose(dst_file);694.}

void xlg_encode(xlg_t xlg){
    while (!feof(stdin)) {
        long nread = 1;
        char buffer[BUFFER_SIZE];
        buffer[0] = 1; // THANKS HANES!!!
        while(nread<BUFFER_SIZE-1 && !feof(stdin)){
            int c = getchar();
            if(c!= EOF){
                buffer[nread]=c;
                nread++;
            }
        }

        //Import
        xlm_t message;
        xlm_init(&message);
        xlm_import_str_w_size(&message,buffer,nread);

        //Encrypt
        xlm_t cypher_xlm;
        xlm_init(&cypher_xlm);
        xlg_encrypt(&cypher_xlm, message, xlg);

        gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m0);
        gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m1);

        xlm_clear(&message);
        xlm_clear(&cypher_xlm);
    }
}

721.void xlg_decode(xlg_t xlg){722.FILE *stream;723.char *line = NULL;724.size_t len = 0;725.size_t read;

726.stream = stdin;727.if (stream == NULL)728.exit(0);


729.int count = 0;730.mpq_t m0;731.mpq_t m1;

732.xlm_t cypher;733.while ((read = getline(&line, &len, stdin)) != -1) {734.if(count%2 == 0){735.mpq_init(m0);736.mpq_set_str(m0,line,10);737.count++;738.}739.else if(count%2 == 1){740.mpq_init(m1);741.mpq_set_str(m1,line,10);742.count++;

743.xlm_init(&cypher);744.mpq_set(cypher.m0,m0);745.mpq_set(cypher.m1,m1);

746.//Decrypt747.xlm_t decrypt;748.xlm_init(&decrypt);749.xlg_decrypt(&decrypt,cypher,xlg);


753.long nwrite = 1; //Thanks Hanes754.while(nwrite<size){755.putchar(buffer[nwrite++]);756.}

757.count =0;758.xlm_clear(&cypher);759.xlm_clear(&decrypt);760.free(buffer);761.mpq_clear(m0);762.mpq_clear(m1);

763.}764.}

765.free(line);766.fclose(stream);767.}
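
The arithmetic behind xlm_geometric_product can be easier to follow on plain integers. The sketch below mirrors the two coefficient formulas that xlm_lambda_0 and what appears to be xlm_lambda_1 compute in the listing above, using a hypothetical mv2_t struct with int64_t fields in place of xlm_t and mpq_t. The names mv2_t, mv2_product and mv2_amplitude_squared are illustrative assumptions and are not part of the RVTHE source; integer overflow is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for xlm_t: two integer coefficients instead of mpq_t. */
typedef struct {
    int64_t m0;
    int64_t m1;
} mv2_t;

/* Mirrors the coefficient formulas of the listing:
 *   m0 = a.m0*b.m0 + a.m1*b.m1
 *   m1 = a.m0*b.m1 - a.m1*b.m0
 */
static mv2_t mv2_product(mv2_t a, mv2_t b)
{
    mv2_t r;
    r.m0 = a.m0 * b.m0 + a.m1 * b.m1;
    r.m1 = a.m0 * b.m1 - a.m1 * b.m0;
    return r;
}

/* The product of a with itself has a zero m1 part, so its m0 part
 * is a plain scalar: a.m0^2 + a.m1^2. */
static int64_t mv2_amplitude_squared(mv2_t a)
{
    mv2_t r = mv2_product(a, a);
    assert(r.m1 == 0);
    return r.m0;
}
```

With a = (3, 4) and b = (1, 2), mv2_product gives m0 = 3*1 + 4*2 = 11 and m1 = 3*2 - 4*1 = 2, and mv2_amplitude_squared(a) gives 25. Since the conjugation routine in the listing leaves its argument unchanged, this scalar seems to be the value that xlm_inverse divides by.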


Compilation:

8. Download the math libraries and install them in the VM.

sudo apt-get install libmath-mpfr-perl

9. Install the AES-Crypt executable.

wget https://www.aescrypt.com/download/v3/linux/AESCrypt-GUI-3.11-Linux-x86_64-Install.gz
gunzip AESCrypt-GUI-3.11-Linux-x86_64-Install.gz
chmod +x AESCrypt-GUI-3.11-Linux-x86_64-Install
./AESCrypt-GUI-3.11-Linux-x86_64-Install

10. Compile the code with the following command.

gcc main.c xlg_compression.c xlm.c xlg.c xlg_massive_encryption.c -o xlg -lgmp -w

11. Use the following commands to compare RVTHE against AES-Crypt.

AES-Crypt:




AES-Crypt:



RVTHE:





Appendix D – Demonstrate RVTHE performance and improvement on cipher size

In this Appendix, we demon….

Once RVTHE was designed, I converted it into an executable program similar to AES-Crypt and ran it on AWS VMs for file-level encryption. RVTHE and AES-Crypt are both symmetric encryption schemes, so the two are directly comparable. I ran file-level encryption with each program and compared the results.

//
// main.c

#include <stdio.h>
#include "test_xlg.h"
#include "test_xlm.h"
#include "test_xlg_massive_encryption.h"

int main(int argc, const char * argv[]) {
    encrypt_decrypt_file(argc,argv);
    return 0;
}

//
// xlg.h
// XLogos with MPQ

12. #ifndef xlg_h13. #define xlg_h

14. #include <stdio.h>15. #include "xlm.h"16. #include <time.h>

17. struct xlg_t{18. xlm_t key1;19. xlm_t key2;20. xlm_t key1_inverse;21. xlm_t key2_inverse;22. };23. typedef struct xlg_t xlg_t;

24. //============================== INIT and SET ==============================

25. void xlg_init(xlg_t *xlg);26. void xlg_generate_keys(xlg_t *xlg, int key_size);27. void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2);

28. //============================== OPERATIONS ==============================

29. void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg);30. void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg);

31. //============================== UTILS ==============================

32. void xlg_clear(xlg_t *xlg);33. void xlg_print(xlg_t xlg);

34. #endif /* xlg_h */


35. //36. // xlg.c

37. #include "xlg.h"

38. //============================== INIT and SET ==============================

39. void xlg_init(xlg_t *xlg){40. xlm_init(&xlg->key1);41. xlm_init(&xlg->key2);42. xlm_init(&xlg->key1_inverse);43. xlm_init(&xlg->key2_inverse);44. }

45. void xlg_generate_keys(xlg_t *xlg, int key_size){46. gmp_randstate_t state;47. gmp_randinit_default(state);48. time_t t;49. gmp_randseed_ui(state, time(&t));

50. mpz_t key1_z;51. mpz_init(key1_z);52. mpz_urandomb(key1_z, state, key_size);53. xlm_t key1_m;54. xlm_init(&key1_m);55. xlm_set_z(&key1_m, key1_z);

56. mpz_t key2_z;57. mpz_init(key2_z);58. mpz_urandomb(key2_z, state, key_size);59. xlm_t key2_m;60. xlm_init(&key2_m);61. xlm_set_z(&key2_m, key2_z);

62. xlg_set_keys(xlg, key1_m, key2_m);

63. //Clean up64. mpz_clear(key1_z);65. mpz_clear(key2_z);66. xlm_clear(&key1_m);67. xlm_clear(&key2_m);68. gmp_randclear(state);69. }

70. void xlg_set_keys(xlg_t *xlg, xlm_t k1, xlm_t k2){71. //Set keys72. xlm_set(&xlg->key1,k1);73. xlm_set(&xlg->key2,k2);

74. //Set inverse keys75. xlm_t key1_inverse;76. xlm_init(&key1_inverse);77. xlm_set(&key1_inverse,k1);78. xlm_inverse(&key1_inverse);79. xlm_set(&xlg->key1_inverse, key1_inverse);80. xlm_clear(&key1_inverse);

81. xlm_t key2_inverse;


82. xlm_init(&key2_inverse);83. xlm_set(&key2_inverse,k2);84. xlm_inverse(&key2_inverse);85. xlm_set(&xlg->key2_inverse, key2_inverse);86. xlm_clear(&key2_inverse);

87. }

88. //============================== OPERATIONS ==============================

89. void xlg_encrypt(xlm_t *dest_cypher, xlm_t message, xlg_t xlg){90. //Encrypt91. xlm_t gp1_encryption;92. xlm_init(&gp1_encryption);93. xlm_geometric_product(&gp1_encryption,&xlg.key1,&message);

94. xlm_t cypher;95. xlm_init(&cypher);96. xlm_geometric_product_bivector(&cypher,&gp1_encryption,&xlg.key2);97. xlm_set(dest_cypher,cypher);

98. //Clean up99. xlm_clear(&gp1_encryption);100.xlm_clear(&cypher);101.}

102.void xlg_decrypt(xlm_t *dest_decrypt, xlm_t cypher, xlg_t xlg){103.//Decrypt104.xlm_t gp1_decryption;105.xlm_init(&gp1_decryption);106.xlm_geometric_product(&gp1_decryption,&cypher,&xlg.key2_inverse);107.xlm_t decrypt;108.xlm_init(&decrypt);109.xlm_geometric_product_bivector_vector(&decrypt,&xlg.key1_inverse,&

gp1_decryption);110.xlm_set(dest_decrypt,decrypt);

111.//Clean up112.xlm_clear(&gp1_decryption);113.xlm_clear(&decrypt);

114.}

115.//============================== UTILS ==============================

116.void xlg_clear(xlg_t *xlg){117.xlm_clear(&xlg->key1);118.xlm_clear(&xlg->key2);119.xlm_clear(&xlg->key1_inverse);120.xlm_clear(&xlg->key2_inverse);121.}

122.void xlg_print(xlg_t xlg){123.mpz_t key1;124.mpz_init(key1);125.xlm_get_z(&key1, xlg.key1);

126.mpz_t key2;127.mpz_init(key2);128.xlm_get_z(&key2, xlg.key2);


129.gmp_printf("key 1 => %Zd\n", key1);130.gmp_printf("key 2 => %Zd\n", key2);

131.mpz_clear(key1);132.mpz_clear(key2);133.}

134.//135.// xlm.h

136.#ifndef xlm_h137.#define xlm_h

138.#include <stdio.h>139.#include <stdlib.h>140.#include <string.h>141.#include <gmp.h>142.#include <math.h>143.#include "xlg_compression.h"

144.struct xlm_t {145.mpq_t m0;146.mpq_t m1;

147.};148.typedef struct xlm_t xlm_t;

149.//============================== INIT and SET ==============================

150.void xlm_init(xlm_t * dest);151.void xlm_set(xlm_t *dest, xlm_t src);152.void xlm_set_z(xlm_t * dest, mpz_t z);153.void xlm_set_si(xlm_t * dest, signed long int si);154.void xlm_import_str(xlm_t * dest,char* str);155.void xlm_import_str_w_size(xlm_t * dest,char* str, long size);

156.//============================== XLM EXPORT ==============================

157.void xlm_get_z(mpz_t *dest, xlm_t xlm);158.signed long int xlm_get_si(xlm_t xlm);159.char* xlm_export_str(xlm_t xlm, long *buffer_size);

160.//============================== UTILS ==============================

161.void xlm_print(xlm_t m);162.void xlm_clear(xlm_t * m);

163.void xlm_pack(mpz_t dst, xlm_t src);164.void xlm_unpack(xlm_t *dst, mpz_t src);165.size_t xlm_out_raw(FILE* stream, xlm_t src);166.size_t xlm_inp_raw(xlm_t *dst,FILE* stream);

//============================== OPERATIONS ==============================
void xlm_geometric_product(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_geometric_product_bivector(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_geometric_product_bivector_vector(xlm_t *dest, xlm_t * m0, xlm_t * m1);
void xlm_clifford_conjugation(xlm_t *m);
void xlm_reverse(xlm_t *m);
void xlm_amplitude_squared(xlm_t * m);
void xlm_amplitude_squared_reversed(xlm_t * m);
void xlm_rationalize(xlm_t *m);
void xlm_scalar_div(xlm_t * m, mpq_t scalar);
void xlm_inverse(xlm_t *m);
void xlm_lambda_0(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_0_bivector(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1_bivector(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_0_bivector_vector(mpq_t *m0, xlm_t * mv1, xlm_t * mv2);
void xlm_lambda_1_bivector_vector(mpq_t *m1, xlm_t * mv1, xlm_t * mv2);

184.#endif /* xlm_h */

185.//186.// xlm.c

187.#include "xlm.h"

188.//============================== INIT and SET ==============================

189.void xlm_init(xlm_t * dest){190.mpq_init(dest->m0);191.mpq_init(dest->m1);

192.}

193.void xlm_set(xlm_t *dest, xlm_t src){194.mpq_set(dest->m0, src.m0);195.mpq_set(dest->m1, src.m1);196.}

197.void xlm_set_z(xlm_t * dest, mpz_t z){198.//Init base and reminder199.mpz_t base;200.mpz_init(base);201.mpz_t reminder;202.mpz_init(reminder);

203.//Compute values204.mpz_div_ui(base,z,2);205.mpz_mod_ui(reminder,z,2);


206.//Get reminder in mpq207.mpq_t reminder_mpq;208.mpq_init(reminder_mpq);209.mpq_set_z(reminder_mpq,reminder);

210.mpq_set_z(dest->m0,base);211.mpq_set_z(dest->m1, base);

212.mpq_add(dest->m1,dest->m1,reminder_mpq);

213.//Adjust coefficients214.if(mpz_cmp_ui(reminder,0) == 0){215.mpq_t mpq_1;216.mpq_init(mpq_1);217.mpq_set_ui(mpq_1,0,1);218.mpq_add(dest->m1, dest->m1, mpq_1);219.mpq_clear(mpq_1);220.}

221.mpz_clear(base);222.mpz_clear(reminder);223.mpq_clear(reminder_mpq);224.}

225.void xlm_set_si(xlm_t * dest, signed long int si){226.mpz_t z;227.mpz_init_set_si(z,si);228.xlm_set_z(dest,z);229.mpz_clear(z);230.}

void xlm_import_str(xlm_t * dest,char* str){
    mpz_t z;
    mpz_init(z);
    //Import every byte of the string; sizeof(str) would give the pointer size, not the length
    mpz_import(z,strlen(str),1,sizeof(str[0]), 0, 0,str);
    xlm_set_z(dest,z);
    mpz_clear(z);
}

238.void xlm_import_str_w_size(xlm_t * dest,char* str, long size){239.mpz_t z;240.mpz_init(z);241.mpz_import(z,size,1,sizeof(str[0]),0, 0,str);242.xlm_set_z(dest,z);243.mpz_clear(z);244.}

245.//============================== XLM GET ==============================

246.void xlm_get_z(mpz_t *dest, xlm_t xlm){247.mpz_t mpz_m0;mpz_init(mpz_m0);248.mpz_t mpz_m1;mpz_init(mpz_m1);

249.mpz_set_q(mpz_m0,xlm.m0);250.mpz_set_q(mpz_m1,xlm.m1);

251.mpz_add(*dest, mpz_m0, mpz_m1);


252.mpz_clear(mpz_m0);253.mpz_clear(mpz_m1);254.}

255.signed long int xlm_get_si(xlm_t xlm){256.mpz_t z;257.mpz_init(z);258.xlm_get_z(&z,xlm);259.signed long int si = mpz_get_si(z);260.mpz_clear(z);261.return si;262.}

char* xlm_export_str(xlm_t xlm, long *buffer_size){
    mpz_t z;
    mpz_init(z);
    xlm_get_z(&z,xlm);

    //Alloc memory to destination buffer
    long size = sizeof(char);
    long nail = 0;
    long numb = 8*size - nail;
    long count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
    char* buffer;
    buffer = malloc(count * size);

    //Report the buffer size when the caller supplied a place to store it
    //(test the pointer itself, not the value it points to)
    if(buffer_size != NULL){
        *buffer_size = count * size;
    }

    //Export to buffer
    mpz_export(buffer, NULL, 1, size, 0, nail, z);
    mpz_clear(z);

    return buffer;
}

282.//============================== UTILS ==============================

283.void xlm_clear(xlm_t * m){284.mpq_clear(m->m0);285.mpq_clear(m->m1);

286.}

287.void xlm_print(xlm_t m){288.gmp_printf("%+Qd e0 ", m.m0);289.gmp_printf("%+Qd e1 \n", m.m1);

290.}

void xlm_pack(mpz_t dst, xlm_t src){
    mpz_t m0_m1;
    mpz_t m0,m1;
    mpz_inits(m0_m1,m0,m1,NULL);

    //Get mpz values of coefficients
    mpz_set_q(m0,src.m0);
    mpz_set_q(m1,src.m1);

    //Set absolute values
    mpz_abs(m0,m0);
    mpz_abs(m1,m1);

    //Pair coefficients
    xlg_pair(dst, m0, m1);

    //Pack signs of coefficients: bit 7 flags a negative m0, bit 6 a negative m1
    //(mpq_sgn replaces mpq_cmp_si(q,0,0), whose zero denominator is not a valid rational)
    unsigned int signs = 0;
    signs = signs + ((mpq_sgn(src.m0)<0)? (1u<<7):0);
    signs = signs + ((mpq_sgn(src.m1)<0)? (1u<<6):0);

    //Append the sign byte below the paired value
    mpz_mul_ui(dst,dst,256);
    mpz_add_ui(dst,dst,signs);

    mpz_clears(m0_m1,m0,m1,NULL);
}

void xlm_unpack(xlm_t *dst, mpz_t src){
    mpz_t m0_m1;
    mpz_t m0,m1;
    mpz_inits(m0_m1,m0,m1,NULL);

    //Get signs: the low byte holds the sign flags written by xlm_pack
    mpz_t signs_z;
    mpz_init(signs_z);
    mpz_fdiv_r_ui(signs_z,src,256);
    mpz_fdiv_q_ui(src,src,256);
    unsigned long signs = mpz_get_ui(signs_z);

    //Unpair coefficients
    xlg_unpair(m0, m1, src);

    //Adjust sign: xlm_pack sets bit 7 for a negative m0 and bit 6 for a negative m1
    if((signs & 128) > 0)
        mpz_neg(m0,m0);
    if((signs & 64) > 0)
        mpz_neg(m1,m1);

    //Set coefficients
    mpq_set_z(dst->m0,m0);
    mpq_set_z(dst->m1,m1);

    mpz_clear(signs_z);
    mpz_clears(m0_m1,m0,m1,NULL);
}

334.size_t xlm_out_raw(FILE* stream, xlm_t src){

335.mpz_t blades[2];336.for (int i = 0; i < 2; i++) {337.mpz_init(blades[i]);338.}

339.mpz_set_q(blades[0],src.m0);340.mpz_set_q(blades[1],src.m1);

341.size_t size = 0;342.for (int i = 0; i < 2; i++) {343.size += mpz_out_raw(stream,blades[i]);344.}

345.for (int i = 0; i < 2; i++) {346.mpz_clear(blades[i]);347.}348.return size;349.}

350.size_t xlm_inp_raw(xlm_t *dst,FILE* stream){

351.mpz_t blades[2];352.for (int i = 0; i < 2; i++) {353.mpz_init(blades[i]);354.}355.size_t rsize = 0;356.for (int i = 0; i < 2; i++) {357.rsize += mpz_inp_raw(blades[i],stream);358.}

359.mpq_set_z(dst->m0,blades[0]);360.mpq_set_z(dst->m1,blades[1]);

361.for (int i = 0; i < 2; i++) {362.mpz_clear(blades[i]);363.}

364.return rsize;365.}

366.//============================== OPERATIONS ==============================

367.void xlm_geometric_product(xlm_t * dest, xlm_t * m0, xlm_t * m1){368.xlm_lambda_0(&dest->m0,m0,m1);


369.xlm_lambda_1(&dest->m1,m0,m1);

370.}

371.void xlm_geometric_product_bivector(xlm_t * dest, xlm_t * m0, xlm_t * m1){

372.xlm_lambda_0_bivector(&dest->m0,m0,m1);373.xlm_lambda_1_bivector(&dest->m1,m0,m1);

374.}

375.void xlm_geometric_product_bivector_vector(xlm_t * dest, xlm_t * m0, xlm_t * m1){

376.xlm_lambda_0_bivector_vector(&dest->m0,m0,m1);377.xlm_lambda_1_bivector_vector(&dest->m1,m0,m1);

378.}



390.}



402.}




414.}



426.}




438.}



442.mpq_init(ma);443.mpq_init(mb);444.mpq_mul(ma,mv1->m1,mv2->m0);445.mpq_mul(mb,mv1->m0,mv2->m1);446.mpq_add(*m,*m,ma);447.mpq_add(*m,*m,mb);448.mpq_clear(ma);


449.mpq_clear(mb);

450.}

void xlm_clifford_conjugation(xlm_t *m) {
    //All sign changes are disabled in this version, so the conjugate equals m
    //mpq_neg(m->m0,m->m0);
    //mpq_neg(m->m1,m->m1);
    //mpq_neg(m->m1,m->m1);
    (void)m;
}

void xlm_reverse(xlm_t *m) {
    //Negate the m1 coefficient. mpq_neg is used because mpq_set_si(minus_one,1,-1)
    //does not produce -1: the denominator argument is unsigned.
    //mpq_neg(m->m0,m->m0);
    //mpq_neg(m->m0,m->m0);
    mpq_neg(m->m1,m->m1);
}


void xlm_amplitude_squared(xlm_t * m) {
    //suni gmp_printf("Input1: %Qd\n", m->m0);
    //suni gmp_printf("Input2: %Qd\n", m->m1);

    //Compute clifford conjugation of m and store it on clifford_conj
    xlm_t clifford_conj;
    xlm_init(&clifford_conj);
    xlm_set(&clifford_conj, *m);
    xlm_clifford_conjugation(&clifford_conj);
    //suni gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m0);
    //suni gmp_printf("Clifford Conjugate: %Qd\n", clifford_conj.m1);

    //Compute geometric product of m and clifford_conj and store it on amplitude_squared
    xlm_t amplitude_squared;
    xlm_init(&amplitude_squared);
    xlm_geometric_product(&amplitude_squared, m, &clifford_conj);

    //suni gmp_printf("Amplitude squared: %Qd\n", amplitude_squared.m0);

    //Make the pointer content equal amplitude_squared
    xlm_clear(m);
    xlm_init(m);
    xlm_set(m, amplitude_squared);

    //Clean up
    xlm_clear(&clifford_conj);
    xlm_clear(&amplitude_squared);
}

492.void xlm_amplitude_squared_reversed(xlm_t * m) {493.//Compute amplitude squared of m494.xlm_t amplitude_squared_reversed;495.xlm_init(&amplitude_squared_reversed);496.xlm_set(&amplitude_squared_reversed, *m);497.xlm_amplitude_squared(&amplitude_squared_reversed);

498.//Compute reverse of mv_amplitude_squared and store it on mv_amplitude_squared

499.xlm_reverse(&amplitude_squared_reversed);

500.//Make the pointer content equal amplitude_squared_reversed501.xlm_clear(m);502.xlm_init(m);503.xlm_set(m, amplitude_squared_reversed);

504.//Clean up505.xlm_clear(&amplitude_squared_reversed);506.}

void xlm_rationalize(xlm_t *m) {
    //Compute amplitude squared of m and store it on mv_amplitude_squared
    xlm_t mv_amplitude_squared;
    xlm_init(&mv_amplitude_squared);
    xlm_set(&mv_amplitude_squared, *m);
    xlm_amplitude_squared(&mv_amplitude_squared);


514.xlm_t mv_amplitude_squared_reversed;515.xlm_init(&mv_amplitude_squared_reversed);516.xlm_set(&mv_amplitude_squared_reversed, *m);517.xlm_amplitude_squared_reversed(&mv_amplitude_squared_reversed);

518.//Compute geometric product of mv_amplitude_squared and mv_amplitude_squared_reversed and store it on mv_geometric_product

519.xlm_t mv_geometric_product;520.xlm_init(&mv_geometric_product);521.xlm_geometric_product(&mv_geometric_product,&mv_amplitude_squared,

&mv_amplitude_squared_reversed);

522.//Make the pointer content equal mv_geometric_product523.xlm_clear(m);524.xlm_init(m);525.xlm_set(m,mv_geometric_product);

526.//Clean up527.xlm_clear(&mv_amplitude_squared);528.xlm_clear(&mv_amplitude_squared_reversed);


529.xlm_clear(&mv_geometric_product);530.}

531.void xlm_scalar_div(xlm_t *m, mpq_t scalar) {532.mpq_div(m->m0, m->m0, scalar);533.mpq_div(m->m1, m->m1, scalar);

534.}

535.void xlm_inverse(xlm_t *m){536.//Compute clifford congugation of m and store it on clifford_conj537.xlm_t clifford_conj;538.xlm_init(&clifford_conj);539.xlm_set(&clifford_conj, *m);540.xlm_clifford_conjugation(&clifford_conj);


542.xlm_t mv_amplitude_squared;543.xlm_init(&mv_amplitude_squared);544.xlm_set(&mv_amplitude_squared, *m);545.xlm_amplitude_squared(&mv_amplitude_squared);

546.//Rationalize547.xlm_t mv_rationalize;548.xlm_init(&mv_rationalize);549.xlm_set(&mv_rationalize, *m);550.xlm_rationalize(&mv_rationalize);

551.xlm_t mv_geometric_product;552.xlm_init(&mv_geometric_product);553.xlm_set(&mv_geometric_product, *m);554.//Perform scalar div on geometric product555.xlm_scalar_div(&mv_geometric_product, mv_amplitude_squared.m0);

556.//Make the pointer content equal mv_geometric_product557.xlm_clear(m);558.xlm_init(m);559.xlm_set(m, mv_geometric_product);

560.//Clean up561.xlm_clear(&clifford_conj);562.//xlm_clear(&mv_amplitude_squared_reversed);563.//xlm_clear(&mv_geometric_product);564.xlm_clear(&mv_rationalize);565.}

566.//567.// xlg_massive_encryption.h

568.#ifndef xlg_massive_encryption_h569.#define xlg_massive_encryption_h


570.#include <stdio.h>571.#include "xlg.h"

572.void xlg_encrypt_file(char* src, char* dst, xlg_t xlg);573.void xlg_decrypt_file(char* src, char* dst, xlg_t xlg);574.void xlg_append_encypted_data(char* dst_path, char* data_buffer,

xlg_t xlg);575.void xlg_encode(xlg_t xlg);576.void xlg_decode(xlg_t xlg);

577.#endif /* xlg_massive_encryption_h */

578.//579.// xlg_massive_encryption.c

580.#include "xlg_massive_encryption.h"581.int BUFFER_SIZE = 1024*10;

582.void xlg_encrypt_file(char* src, char* dst, xlg_t xlg){583.FILE *src_file = fopen(src, "rb");584.FILE *dst_file= fopen(dst, "wb");

585.while (!feof(src_file)) {586.//Read file587.long nread = 1;588.char buffer[BUFFER_SIZE];589.buffer[0] = 1;590.while(nread<BUFFER_SIZE-1 && !feof(src_file)){591.int c = getc(src_file);592.if(c!= EOF){593.buffer[nread]=c;594.nread++;595.}596.}597.//Import598.xlm_t message;599.xlm_init(&message);600.xlm_import_str_w_size(&message,buffer,nread);

601.//suni gmp_printf("Message Values: %Qd + %Qd\n", message.m0, message.m1);

602.//Encrypt603.xlm_t cypher_xlm;604.xlm_init(&cypher_xlm);605.xlg_encrypt(&cypher_xlm, message, xlg);606.//suni gmp_printf("Encrypted Values: %Qd + %Qd\n", cypher_xlm.m0,

cypher_xlm.m1);607.//Write to file608.xlm_out_raw(dst_file,cypher_xlm);

609.//Clean up610.xlm_clear(&message);611.xlm_clear(&cypher_xlm);612.}613.fclose(src_file);614.fclose(dst_file);


615.}

616.void xlg_decrypt_file(char* src, char* dst, xlg_t xlg){617.FILE *src_file= fopen(src, "rb");618.FILE *dst_file= fopen(dst, "wb");

619.while(!feof(src_file)){620.//Read File621.xlm_t cypher_xlm;622.xlm_init(&cypher_xlm);623.size_t nread = xlm_inp_raw(&cypher_xlm, src_file);624.if(nread <=0){625.xlm_clear(&cypher_xlm);626.break;627.}

628.//Decrypt629.xlm_t decrypt;630.xlm_init(&decrypt);631.xlg_decrypt(&decrypt,cypher_xlm,xlg);632.//suni gmp_printf("Decrypted Values: %Qd + %Qd\n", decrypt.m0,

decrypt.m1);


636.//Write file637.long nwrite = 1;638.while(nwrite<size){639.putc(buffer[nwrite++], dst_file);640.}

641.//Clean up642.free(buffer);643.xlm_clear(&cypher_xlm);644.xlm_clear(&decrypt);645.}

646.fclose(src_file);647.fclose(dst_file);648.}

649.void xlg_append_encypted_data(char* dst_path, char* data_buffer, xlg_t xlg){

650.FILE *dst_file= fopen(dst_path, "wb");

651.char * buffer = malloc(strlen(data_buffer)+2);652.memset(buffer,0,strlen(data_buffer)+2);653.buffer[0]=1;654.buffer = strcat(buffer, data_buffer);

655.//Import656.xlm_t data;657.xlm_init(&data);658.xlm_import_str_w_size(&data,buffer,strlen(data_buffer)+2);

659.//Encrypt


660.xlm_t cypher_xlm;661.xlm_init(&cypher_xlm);662.xlg_encrypt(&cypher_xlm, data, xlg);

663.//Write to file664.xlm_out_raw(dst_file,cypher_xlm);

665.//Clean up666.xlm_clear(&data);667.xlm_clear(&cypher_xlm);668.free(buffer);

669.fclose(dst_file);670.}

void xlg_encode(xlg_t xlg){

    while (!feof(stdin)) {

        long nread = 1;
        char buffer[BUFFER_SIZE];
        buffer[0] = 1; // THANKS HANES!!!
        while(nread<BUFFER_SIZE-1 && !feof(stdin)){
            int c = getchar();
            if(c != EOF){
                buffer[nread]=c;
                nread++;
            }
        }

        //Import
        xlm_t message;
        xlm_init(&message);
        xlm_import_str_w_size(&message,buffer,nread);

        //Encrypt
        xlm_t cypher_xlm;
        xlm_init(&cypher_xlm);
        xlg_encrypt(&cypher_xlm, message, xlg);

        gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m0);
        gmp_fprintf(stdout,"%Qd\n", cypher_xlm.m1);

        xlm_clear(&message);
        xlm_clear(&cypher_xlm);
    }
}
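The listing sets buffer[0] = 1 before importing a message and starts writing at index 1 after decryption. A plausible reason, stated here as an assumption rather than something the listing confirms, is that a byte string imported as one large integer loses any leading zero bytes, so a nonzero guard byte in front preserves the payload length. The sketch below illustrates the effect with a plain 64-bit integer in place of the GMP types. The helpers `pack_bytes`, `unpack_bytes`, and `round_trip`, and the 8-byte limit, are hypothetical and are not part of the RVTHE code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack up to 8 bytes, big-endian, into one integer.
 * This stands in for importing a buffer into a big number. */
static uint64_t pack_bytes(const unsigned char *buf, size_t n)
{
    assert(n <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | buf[i];
    return v;
}

/* Hypothetical helper: write the minimal big-endian encoding of v into out
 * (capacity 8) and return the number of bytes written. Leading zero bytes
 * cannot be recovered, because the integer does not store them. */
static size_t unpack_bytes(uint64_t v, unsigned char *out)
{
    unsigned char tmp[8];
    size_t n = 0;
    while (v != 0) {
        tmp[n++] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    return n;
}

/* Round trip through the integer; returns the number of bytes recovered. */
static size_t round_trip(const unsigned char *in, size_t n, unsigned char *out)
{
    return unpack_bytes(pack_bytes(in, n), out);
}
```

For the input {0x00, 0x00, 0x41} the round trip recovers a single byte. With a 0x01 guard in front, all four bytes come back and the caller skips index 0, as the listing does when it starts at nwrite = 1.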

void xlg_decode(xlg_t xlg){
    FILE *stream;
    char *line = NULL;
    size_t len = 0;
    ssize_t read; //getline returns ssize_t, -1 at end of input

    stream = stdin;
    if (stream == NULL)
        exit(0);

    int count = 0;
    mpq_t m0;
    mpq_t m1;

    xlm_t cypher;
    while ((read = getline(&line, &len, stdin)) != -1) {
        if(count%2 == 0){
            //First line of a pair: component m0
            mpq_init(m0);
            mpq_set_str(m0,line,10);
            count++;
        }
        else if(count%2 == 1){
            //Second line of a pair: component m1
            mpq_init(m1);
            mpq_set_str(m1,line,10);
            count++;

            xlm_init(&cypher);
            mpq_set(cypher.m0,m0);
            mpq_set(cypher.m1,m1);

            //Decrypt
            xlm_t decrypt;
            xlm_init(&decrypt);
            xlg_decrypt(&decrypt,cypher,xlg);

            //(the lines that export decrypt into buffer and set size are missing from this listing)

            long nwrite = 1; //Thanks Hanes
            while(nwrite<size){
                putchar(buffer[nwrite++]);
            }
            count = 0;
            xlm_clear(&cypher);
            xlm_clear(&decrypt);
            free(buffer);
            mpq_clear(m0);
            mpq_clear(m1);

        }
    }

    free(line);
    fclose(stream);
}
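xlg_decode consumes its input two lines at a time, one line for each component of a ciphertext. The following is a minimal sketch of that pairing logic, assuming POSIX `getline` is available. `read_line` and `read_pair` are hypothetical helpers that return the two lines as strings instead of parsing them into `mpq_t` values.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Read one line with getline, strip the trailing newline, and return it as a
 * heap-allocated string, or NULL at end of input. The caller frees it. */
static char *read_line(FILE *in)
{
    char *line = NULL;
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, in);
    if (len < 0) {              /* end of input or read error */
        free(line);
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
    return line;
}

/* Hypothetical helper: read the next two lines as one pair.
 * Returns 1 when both lines were read, 0 otherwise. A trailing unpaired
 * line is discarded. */
static int read_pair(FILE *in, char **first, char **second)
{
    assert(in != NULL && first != NULL && second != NULL);
    *first = read_line(in);
    if (*first == NULL)
        return 0;
    *second = read_line(in);
    if (*second == NULL) {
        free(*first);
        *first = NULL;
        return 0;
    }
    return 1;
}
```

Returning the pair as a unit avoids the even/odd counter of the listing. A caller would loop while `read_pair` returns 1 and free both strings after each iteration.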

Compilation:


13. Download math libraries and install them in the VM.

sudo apt-get install libmath-mpfr-perl

14. Install AES-Crypt executable

wget https://www.aescrypt.com/download/v3/linux/AESCrypt-GUI-3.11-Linux-x86_64-Install.gz
gunzip AESCrypt-GUI-3.11-Linux-x86_64-Install.gz
chmod +x AESCrypt-GUI-3.11-Linux-x86_64-Install
./AESCrypt-GUI-3.11-Linux-x86_64-Install

15. Compile the code with the following command.

gcc main.c xlg_compression.c xlm.c xlg.c xlg_massive_encryption.c -o xlg -lgmp -w


AES-Crypt:



RVTHE:




17. Sample output of the created encrypted file:

18. Cipher text size and contents:

19. Sample performance metrics:

20. Various file types and their encryption and decryption outputs:

.TXT: The screenshot below includes the append option.


.JPEG:

.PDF:


.DOCX:


.XLSX:

.PPTX:

All of the above file types worked with the RVTHE encryption method.


Appendix E – Acronym List

Abbreviation    Term
HDD             Hard Disk Drive
SATA            Serial AT Attachment
SSD             Solid State Drive
FDE             Full-Disk Encryption
AES             Advanced Encryption Standard
DES             Data Encryption Standard
TDEA            Triple Data Encryption Algorithm
RSA             Rivest–Shamir–Adleman
MD5             Message-Digest Algorithm
SHA             Secure Hash Algorithm
CBC             Cipher Block Chaining
CTR             Counter
GCM             Galois/Counter Mode
OCB             Offset Codebook Mode
ECB             Electronic Codebook
OFB             Output Feedback
AWS             Amazon Web Services
NIST            National Institute of Standards and Technology
ESD             Every Stage of Data
FHE             Fully Homomorphic Encryption
RVTHE           Reduced Vector Technique Homomorphic Encryption
SSL             Secure Sockets Layer
UCCS            University of Colorado, Colorado Springs
