Search for Quadratic Residue Detector
using Artificial Neural Networks
by Ashish Reddy
A Project Report Submitted in
Partial Fulfillment of the Requirements for the Degree of
Master of Science in
Computer Science
Supervised by Dr. Leon Reznik
Department of Computer Science
B. Thomas Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, New York
August 2017
ABSTRACT
The quadratic residue problem is one of the unsolved NP problems in the field of
mathematics. Various data classification and data clustering techniques have been tried
in the past, but they all failed to find a pattern in this problem. Because conventional
classification techniques failed to generate promising results, a neural network model
was called for. The artificial neural network is a generic technique that has provided
solutions to many open problems. It eliminates the need for human interaction and makes
decisions by selectively choosing weights for the connections between the inputs. In this
way, the network is trained internally as data is fed to the system. The network
architecture acts as a black box for the problem, and it is one of the best approaches one
could try for the quadratic residue problem. The Jacobi algorithm is one of the best known
algorithms for determining whether one number is a quadratic residue of another number.
However, this algorithm has limitations: when the modular base number consists of only
2 safe prime factors, the algorithm fails. Using deductions from the Jacobi algorithm, we
generated the features that act as the input to the network model. The model has multiple
hidden layers, and a detailed analysis was performed by varying the number of hidden
layers and increasing the modulus base number. A variety of numbers, from large numbers
on the order of 10^5 to very large numbers on the order of 10^9, were tested to analyze
the network model. For very large numbers, at least 3 trials were made before studying
the results. After each trial, the weights and other variables are stored, and the values are
restored before the next trial. As the number of iterations increases, an increase in the
accuracy level suggests pattern detection for this problem. The paper presents 3D plots
that analyze the results based on the number of hidden layers and the number of iterations
run up to that point. The report suggests the optimum model architecture based on the
run time and the accuracies obtained at different levels.
Table of Contents
ABSTRACT
1. INTRODUCTION
1.1 EULER'S CRITERION
1.2 LEGENDRE SYMBOL
1.3 JACOBI SYMBOL
2. ALGORITHMS AND ANN
2.1 JACOBI ALGORITHM
2.2 ARTIFICIAL NEURAL NETWORKS
2.3 TENSORFLOW
2.4 LITERATURE REVIEW
3. DESIGN MODEL
3.1 GENERATING DATA
4. RESULTS
5. CONCLUSION
REFERENCES
1. INTRODUCTION
In number theory, an integer q is a quadratic residue modulo n if and only if the
following two conditions hold:
• The greatest common divisor of q and n is 1.
• q is congruent to a perfect square modulo n:
$X^2 \equiv q \pmod{n}$
where X is an integer in the range 0 < X < n.
If either of the above conditions fails, q is a quadratic nonresidue of n.
For instance, to compute all the quadratic residues of 13:
Brute force: for X in the range 0 < X < 13,
• computing the residues of all squares modulo 13 gives
1, 4, 9, 3, 12, 10, 10, 12, 3, 9, 4, 1.
The quadratic residues are: 1, 3, 4, 9, 10, 12.
If n is an odd prime, each quadratic residue occurs exactly twice, so for an odd prime n
the number of quadratic residues is (n-1)/2.
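This brute-force computation is easy to express directly; a minimal Python sketch (the function name is ours, not the report's code):

    from math import gcd

    def quadratic_residues(n):
        """Return the sorted set of quadratic residues modulo n."""
        # square every X coprime to n and keep the distinct remainders
        return sorted({(x * x) % n for x in range(1, n) if gcd(x, n) == 1})

    print(quadratic_residues(13))  # [1, 3, 4, 9, 10, 12]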
1.1 Euler’s Criterion: [1]
For an odd prime p with p not dividing a, a is a quadratic residue (mod p) if and
only if:

$a^{(p-1)/2} \equiv 1 \pmod{p}$

The residue status is defined as:

$r_i = a^{(p-1)/2} \bmod p$

if $r_i = 1$, a is a quadratic residue of p.
if $r_i = -1$, a is a quadratic nonresidue of p.
if $r_i = 0$, a and p are not relatively prime.
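Euler's criterion maps directly onto Python's three-argument pow; a small sketch (the function name and the mapping of p-1 to -1 are ours):

    def euler_residue_status(a, p):
        """Residue status of a modulo an odd prime p: 1, -1 or 0."""
        r = pow(a, (p - 1) // 2, p)     # modular exponentiation
        return -1 if r == p - 1 else r  # p-1 represents -1 (mod p)

    print(euler_residue_status(3, 13))  #  1: 3 is a residue mod 13
    print(euler_residue_status(8, 13))  # -1: 8 is a nonresidue mod 13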
1.2 Legendre Symbol: [1]
For an odd prime p and a ∈ ℤ, we define the Legendre symbol $\left(\frac{a}{p}\right)$ as:

$\left(\frac{a}{p}\right) = 1$ if a is a quadratic residue of p, and $-1$ if a is a quadratic nonresidue of p.

Thus the Legendre symbol can be calculated by:

$\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$

Properties of the Legendre symbol: [1]
• If $a \equiv b \pmod{p}$, then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.
• $\left(\frac{1}{p}\right) = 1$ and $\left(\frac{0}{p}\right) = 0$.
• $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$.
• $\left(\frac{a}{p}\right)\left(\frac{a}{q}\right) = \left(\frac{a}{pq}\right)$.
• $\left(\frac{2a}{p}\right) = \left(\frac{a}{p}\right)$ if $p \equiv \pm 1 \pmod{8}$; otherwise $\left(\frac{2a}{p}\right) = -\left(\frac{a}{p}\right)$.
• If both a and p are odd, then:
$\left(\frac{a}{p}\right) = \left(\frac{p}{a}\right)$ if $a \equiv 1 \pmod{4}$ or $p \equiv 1 \pmod{4}$;
$\left(\frac{a}{p}\right) = -\left(\frac{p}{a}\right)$ if $a \equiv 3 \pmod{4}$ and $p \equiv 3 \pmod{4}$.
1.3 Jacobi Symbol:
Let the factorization of n be $n = \prod_{i=1}^{k} p_i^{e_i}$. Then:

$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)^{e_1}\left(\frac{a}{p_2}\right)^{e_2} \cdots \left(\frac{a}{p_k}\right)^{e_k}$

The residue status for each of the primes is calculated and the results are multiplied.
But if the prime factorization of n is p·q (only 2 primes), then the Jacobi symbol
cannot be used to decide whether a number a is a quadratic residue of n: it is not always
true that a is a quadratic residue of n even though $\left(\frac{a}{n}\right)$ equals 1. [2]
For instance, to check whether 8 is a quadratic residue of 15:

$\left(\frac{8}{15}\right) = \left(\frac{8}{3}\right)\left(\frac{8}{5}\right)$
$= -1 \cdot \left(\frac{3}{5}\right)$
$= -1 \cdot \left(\frac{2}{3}\right)$, since $5 \equiv 1 \pmod{4}$
$= (-1)(-1)$
$= 1.$
The residue status calculated via the Jacobi symbol is 1, but there exists no integer X
such that $X^2 \equiv 8 \pmod{15}$. So 8 is a quadratic nonresidue of 15.
If n has exactly 2 prime factors u and v, the residue statuses $r_u$ and $r_v$ calculated
for the two primes can both be -1, in which case the overall residue status would be 1,
indicating a quadratic residue. But if both $r_u$ and $r_v$ are -1, then a is not a
quadratic residue of n; a is a quadratic residue of n only when both $r_u$ and $r_v$ are 1.
Both cases are equally probable when the overall residue status is 1. Hence, if a is a
quadratic residue mod n, then a is a quadratic residue mod p for all primes p dividing n [3].
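This counterexample is quick to confirm by brute force; a small Python check (ours, not the report's code):

    # the Jacobi symbol (8/15) is 1, yet no X satisfies X^2 ≡ 8 (mod 15)
    squares_mod_15 = {(x * x) % 15 for x in range(15)}
    print(sorted(squares_mod_15))   # [0, 1, 4, 6, 9, 10]
    print(8 in squares_mod_15)      # False: 8 is a quadratic nonresidue of 15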
2. ALGORITHMS AND ANN
According to the definition of a quadratic residue, the GCD condition for the two
numbers is checked with the following function, which returns the greatest common
divisor of the two numbers a and b.
Figure 1: GCD Algorithm
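The function described is the standard Euclidean algorithm; a minimal Python sketch:

    def gcd(a, b):
        """Return the greatest common divisor of a and b."""
        # repeatedly replace (a, b) with (b, a mod b) until b reaches 0
        while b != 0:
            a, b = b, a % b
        return a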
2.1 Jacobi Algorithm:
One of the best algorithms to determine whether one number is a quadratic residue of
another is the Jacobi algorithm [4]. The Jacobi residue is computed by the function below,
which returns 0, 1 or -1 for different values of a and n. If the function returns -1, a is a
quadratic nonresidue of n; if it returns 1, a may be a quadratic residue of n.
As explained in the section above, the algorithm fails when the prime factorization of the
modular base n contains only 2 safe primes: when the function returns 1, there is only a
50% probability of determining the quadratic residue status correctly. Only when the
function below returns -1 can we be sure that a is a quadratic nonresidue of n.
Figure 2: Jacobi Algorithm
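The computation in Figure 2 is the standard Jacobi-symbol algorithm; a minimal Python sketch, assuming odd n > 0 and using the (2/n) and reciprocity rules from Section 1.2:

    def jacobi(a, n):
        """Return the Jacobi symbol (a/n) for odd n > 0: 0, 1 or -1."""
        assert n > 0 and n % 2 == 1
        a %= n
        result = 1
        while a != 0:
            while a % 2 == 0:            # pull out factors of 2
                a //= 2
                if n % 8 in (3, 5):      # (2/n) = -1 when n ≡ 3, 5 (mod 8)
                    result = -result
            a, n = n, a                  # quadratic reciprocity
            if a % 4 == 3 and n % 4 == 3:
                result = -result
            a %= n
        return result if n == 1 else 0   # 0 means gcd(a, n) > 1

    print(jacobi(8, 15))  # 1, even though 8 is not a residue of 15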
2.2 Artificial Neural Networks:
Artificial neural networks are computing systems whose mechanism is similar
to that of a biological neural network. They comprise interconnected artificial nodes
that act as processing units. Each neuron applies an activation function to a weighted
sum of its inputs; weights and biases are added to give some inputs priority over others.
These nodes and their calibrated behavior help tackle problems such as voice recognition,
object tracking and pattern recognition.
Figure 3: Neural Network Architecture
The working of a neural network can be described as:
• Assigning initial weights to all inputs for a given perceptron.
• Using the summation function, finding the activated neurons in each layer.
• Using each activated neuron as input for its succeeding layer.
• Finding the activation rate of the output layer for the given network.
• Finding the error rate in the output layer and updating the weights of the nodes accordingly.
• Continuing until convergence is achieved.
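As a concrete illustration of these steps (not the report's model; the data, layer sizes and learning rate below are invented), a toy numpy sketch of the loop:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((64, 6))                   # 64 samples, 6 features
    y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

    W1 = rng.standard_normal((6, 10)) * 0.1            # initial weights
    b1 = np.zeros(10)
    W2 = rng.standard_normal((10, 1)) * 0.1
    b2 = np.zeros(1)
    lr = 0.5

    for step in range(2000):
        h = np.tanh(X @ W1 + b1)                       # hidden activations
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid output
        grad_out = out - y                             # cross-entropy gradient at output
        dW2 = h.T @ grad_out / len(X)                  # backpropagate the error
        dh = (grad_out @ W2.T) * (1.0 - h ** 2)        # through the tanh layer
        dW1 = X.T @ dh / len(X)
        W2 -= lr * dW2; b2 -= lr * grad_out.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * dh.mean(axis=0)     # update the weights

    print(((out > 0.5) == (y > 0.5)).mean())           # training accuracy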
2.3 TensorFlow:
TensorFlow is an open-source machine learning library developed by Google. Google uses
TensorFlow in many of its applications, such as image search, translation and pattern
recognition. It was developed to build and train the neural networks used to detect
patterns. The library was built to scale so that it could run on multiple GPUs
and CPUs [5]. It has wrappers in several languages; Python is one of the most
commonly used languages for building applications with it. The library is also customized
to run on mobile operating systems. In TensorFlow, the model is represented as a dataflow
graph containing a set of nodes. Each node has a computational operation associated with
it; every node takes tensors as input and outputs a tensor. The data is represented in the
form of tensors: multidimensional arrays of numbers that flow through the nodes [5].
The version of TensorFlow used here is 1.2.0.
Figure 4: TensorFlow Architecture
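A minimal sketch of this dataflow-graph idea, written against the TensorFlow 1.x API used here (the constants stand in for real inputs):

    import tensorflow as tf  # TF 1.x API (the report uses 1.2.0)

    a = tf.constant([[1.0, 2.0]])     # a 1x2 tensor
    w = tf.constant([[3.0], [4.0]])   # a 2x1 tensor
    y = tf.matmul(a, w)               # a graph node whose op is matrix multiply

    with tf.Session() as sess:        # a session executes the graph
        print(sess.run(y))            # [[11.]]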
2.4 Literature Review:
Many researchers have tried classification and clustering techniques to solve the
quadratic residue problem. Decision tree and random forest models were also tested to
find any patterns; of all the clustering and classification mechanisms, random forest
seems to come closest to a solution for this problem. The theory of the quadratic residue
problem has also been applied to cloud systems [6]. Recently, a paper was published that
uses attribute-based encryption based on quadratic residues for big data in cloud
environments. The encryption must be tight enough that unauthorized users cannot get
into the cloud; attribute-based encryption is an efficient access-control mechanism that
guarantees the security of large amounts of information in the cloud.
Another case where ANNs are useful is measuring sensitivity, prediction uncertainty and
other figures of merit [7]. Convolutional Neural Networks have been very successful in
image processing. Network models are now efficiently designed to self-tune according to
the problem statement and to be trained accordingly. In a Convolutional Neural Network
(CNN), the mechanism is changed in one or two layers according to the problem [8]; the
remaining layers adjust to the training data, so a network model can easily be transferred
from one problem to another. This is one of the major advantages, and much current
research aims at building models that eliminate the need for human interaction [7]. CNNs
are mainly used in image recognition, and a lot of past research used CNNs to read images
or translate handwritten text [8]. Neural networks are also widely used in bioinformatics
[9]; with the help of 2D recursive neural networks, accurate predictions can be made of
the inter-residue distances in proteins [10]. Last summer, Michael Potter started this
project under the guidance of Dr. Leon Reznik [11]. We have continued the project to test
the model on bigger numbers and to analyze it by changing its parameters. The number of
hidden layers was varied to study the model comprehensively.
3. DESIGN MODEL
The network model was designed using the TensorFlow library. The network is a
multilayer perceptron, which belongs to the class of feed-forward artificial neural
networks. In a feed-forward neural network no cycles are formed: each layer passes its
output only to the layers ahead of it. The model in general has one input layer, multiple
hidden layers, one dropout layer and one softmax output layer.
Each hidden layer consists of 100 neurons, all with the hyperbolic tangent activation.
The dropout layer employs the technique of randomly ignoring some neurons in each
iteration, establishing a direct relation between the neurons in the last hidden layer
and the output layer. This technique is used so that the network model avoids overfitting.
Each layer in the network has weights associated with all the connections between its
neurons. A TensorFlow session object is used to initialize all the model variables.
Before creating the different hidden layers, the 'weightvariable' and 'biasvariable'
helper methods are initialized [12]. The last hidden layer is connected to the dropout
layer, and the dropout layer is connected to the softmax output layer. The cross-entropy
and true-class training functions have to be initialized to train the network.
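A hedged TensorFlow 1.x sketch of this construction (the layer sizes, helper bodies and optimizer are illustrative; the report does not publish its exact code):

    import tensorflow as tf  # TF 1.x API

    def weightvariable(shape):
        # small random initial weights for a layer's connections
        return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

    def biasvariable(shape):
        # small constant initial biases
        return tf.Variable(tf.constant(0.1, shape=shape))

    n_features, n_hidden, n_classes = 6, 100, 2
    x = tf.placeholder(tf.float32, [None, n_features])      # input features
    y_true = tf.placeholder(tf.float32, [None, n_classes])  # true-class labels
    keep_prob = tf.placeholder(tf.float32)                  # dropout keep probability

    h1 = tf.nn.tanh(tf.matmul(x, weightvariable([n_features, n_hidden]))
                    + biasvariable([n_hidden]))
    h2 = tf.nn.tanh(tf.matmul(h1, weightvariable([n_hidden, n_hidden]))
                    + biasvariable([n_hidden]))
    h_drop = tf.nn.dropout(h2, keep_prob)                   # dropout layer
    logits = (tf.matmul(h_drop, weightvariable([n_hidden, n_classes]))
              + biasvariable([n_classes]))

    # softmax output trained with cross-entropy against the true class
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits))
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)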
A session is created before training the model, and the progress is saved after every
1000 iterations: a checkpoint file is written to the local directory, stored in binary
form together with the graph metadata. When the program is re-run, it checks for the last
saved checkpoint in the local directory and restores the previous session; the stored
values of the weight and bias variables are picked up from the last iteration, essentially
resuming the network program. The performance and error values are computed after every
1000 iterations, so the testing and training accuracies, along with the confusion matrix,
can be analyzed at different points in the process. If the training or testing accuracy
stays near 99.99% 10 times in a row, we stop the process, taking it as a sign of
overfitting.
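A sketch of this save/restore cycle using the standard tf.train.Saver API (the paths and the placeholder training call are illustrative):

    saver = tf.train.Saver()
    with tf.Session() as sess:
        # resume from the last checkpoint if one exists, else start fresh
        ckpt = tf.train.latest_checkpoint('./checkpoints')
        if ckpt:
            saver.restore(sess, ckpt)
        else:
            sess.run(tf.global_variables_initializer())

        for i in range(1, 100001):
            # sess.run(train_step, feed_dict={...}) would go here
            if i % 1000 == 0:
                # save weights, biases and step count every 1000 iterations
                saver.save(sess, './checkpoints/model', global_step=i)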
3.1 Generating Data:
The complexity of the problem depends on the size of the modular base n and the number of
layers in the network. For a base n, we consider all the integers from 1 to n. Each number
is checked for the quadratic residue condition using the GCD and Jacobi algorithms. Once
the residues and nonresidues are separated, we generate features for each number and
assign a class (residue or nonresidue) accordingly. Data from both lists are combined into
a single list and shuffled, then divided randomly into 80% training and 20% testing sets.
As the value of n increases, the number of instances in the data set grows by a huge
margin; the higher number of instances helps the network detect any patterns in the data.
The features generated were deduced from the Jacobi algorithm, and this set of features
gave promising results when used with the ANN. The features used were q (mod 2), n (mod 4),
n (mod 8), n (mod 7), a (mod atemp) and atemp (mod 4). When additional features were
added, the run time for the network with 3 or 4 hidden layers was very high. Hence the
above features generated by the Jacobi algorithm were used to compare and analyze the
network [13].
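A hedged sketch of this pipeline (make_dataset is our name; the a (mod atemp) and atemp (mod 4) features are omitted here because atemp is an intermediate of the Jacobi computation not spelled out in the text):

    import random
    from math import gcd

    def make_dataset(n, train_frac=0.8):
        """Label 1..n-1 as residue/nonresidue and split 80/20. Illustrative only."""
        squares = {(x * x) % n for x in range(n)}      # all squares modulo n
        data = []
        for q in range(1, n):
            label = 1 if (gcd(q, n) == 1 and q in squares) else 0
            # a subset of the Jacobi-derived features listed above
            features = [q % 2, n % 4, n % 8, n % 7]
            data.append((features, label))
        random.shuffle(data)
        cut = int(train_frac * len(data))
        return data[:cut], data[cut:]                  # training set, testing set

    train, test = make_dataset(13)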
For very large numbers, generating all the integers from 1 to n as training and testing
data produces a very large number of instances, and running all of them would take a
significant amount of time. To reduce the run time, we consider only 70% of those data
instances. This makes it feasible to analyze the accuracies for very large numbers.
4. RESULTS
The neural network was trained with different sets of numbers. A wide range of numbers
was used so that the model could be analyzed at various levels. The same set of numbers
was also tested on models with 2, 3 and 4 hidden layers. Initially the model was designed
with 1 input layer, 2 hidden layers, 1 dropout layer and 1 output layer. When the value
of n is 108733, the training and testing accuracies after 100000 iterations were 55.46%
and 53.10%. The accuracies were just above 50%, which is slightly better than the accuracy
provided by the Jacobi algorithm. The network was tested with different numbers, and the
accuracies achieved after 100000 iterations are shown in the table below.
For 2 hidden layers:

Input number    Training Accuracy    Testing Accuracy    Run time
108733          55.46%               53.10%              ~1 hour
124573          46.48%               53.24%              ~1 hour
125321          50.21%               52.96%              ~1 hour
401882          52.16%               53.49%              ~2 hours
706777          49.03%               51.44%              ~2 hours
848329          52.07%               51.72%              ~2 hours
1307377         51.29%               52.91%              ~3 hours
1551937         50.18%               51.25%              ~3 hours
When the number n was increased to 10480741, the run time increased steeply: every 1000
iterations took about a couple of hours. So I decided to store the checkpoints after the
1st trial, which took around 8 hours. The training and testing accuracies after 4000
iterations were 49.79% and 51.03%. For the next trial, I initialized the network with the
weights stored after the 1st trial, which effectively means the iteration count resumes
from the last iteration. The 2nd and 3rd trials similarly took approximately 8 hours each,
and an increase in the testing accuracy can be noted. Even though the increase in testing
accuracy is very small, it suggests that the network is training in the positive
direction. If more resources were allocated to the program, we could hope that the time
taken for every 1000 iterations would decrease and the accuracies could be compared after
a large number of iterations. A slight dip in the training accuracy during the 2nd and
3rd trials can be neglected, since the number of instances considered is very high.
Overall, the training accuracy increased as the number of trials increased.
For 2 hidden layers:

Trial    Input number    Training Accuracy    Testing Accuracy    Run time
1        10480741        49.79%               51.03%              ~8 hours
2        10480741        51.13%               51.25%              ~8 hours
3        10480741        51.09%               52.89%              ~8 hours
Since the testing and training accuracies were not high with 2 hidden layers, one more
hidden layer was added to the model between the second hidden layer and the dropout
layer. For most problems, 3 hidden layers seem to be a good choice. The model was
initialized again to train the network, and the same set of numbers was tested with 3
hidden layers so as to compare the performance with 2 hidden layers. There was a
significant increase in training and testing accuracies: when the value of n is 124573,
the training and testing accuracies after 100000 iterations are 92.57% and 91.73%. For
the other numbers the training and testing accuracies were also noted after 100000
iterations. When the value of n was 1076437 or 1551937, the training and testing
accuracies were around 60%. As these prime numbers are large, there are many instances in
the training and testing sets; since the model has 3 hidden layers, the time taken for
each iteration increases, and a large number of iterations is needed to analyze the
accuracy values. For large prime numbers there was no significant increase in accuracy
compared to the model with two hidden layers. The run time of the three-hidden-layer
model does not differ much from that of the two-hidden-layer model. The following table
presents the accuracy results achieved with 3 hidden layers.
For 3 hidden layers (after 100000 iterations):

Input number    Training Accuracy    Testing Accuracy    Run time
124573          92.57%               91.73%              ~1 hour
125321          97.65%               96.70%              ~1 hour
401882          96.91%               96.99%              ~2 hours
848329          95.84%               95.14%              ~2 hours
706777          82.78%               76.20%              ~2 hours
1076437         62.81%               61.82%              ~4 hours
1551937         53.61%               52.18%              ~4 hours
For these large numbers (1076437 and 1551937), the accuracies increased steadily as the
number of iterations increased. The table below presents the accuracy values after
200000 iterations.
For 3 hidden layers (after 200000 iterations):

Input number    Training Accuracy    Testing Accuracy    Run time
1076437         72.63%               70.28%              ~9 hours
1551937         70.91%               68.01%              ~9 hours
With 3 hidden layers, the testing and training accuracies for n = 10480741 did not
increase much. The accuracies were noted after 4000 iterations in each trial, and each
trial took around 10 hours. Both accuracies increased steadily with 3 hidden layers, as
in the case with 2 hidden layers.
Trial    Input number    Training Accuracy    Testing Accuracy    Run time
1        10480741        48.97%               49.73%              ~10 hours
2        10480741        51.34%               50.53%              ~10 hours
3        10480741        50.91%               51.48%              ~10 hours
For a better analysis of the model with 3 hidden layers, it was also tested on various
large numbers of the form p*q (where p and q are both safe primes). The accuracies were
consistent with the results obtained above. The table below gives the testing and
training accuracies after 3 trials, where each trial again had 4000 iterations. Each
trial took around 16 hours.
Trials                       Input number    Training Accuracy    Testing Accuracy
3 (12000 iterations total)   11300137        49.89%               51.21%
3 (12000 iterations total)   12044101        48.70%               50.19%
3 (12000 iterations total)   11423917        50.78%               52.71%
Both the testing and training accuracies exceeded 90% with 3 hidden layers for numbers
on the order of 10^5. However, for larger prime numbers the accuracies were nearly 60%,
and adding another hidden layer would increase the run time significantly for those
numbers. The fourth hidden layer was added between the third hidden layer and the
dropout layer. The accuracies were almost in the same range as with 3 hidden layers.
There was a dip in both accuracies when the value of n was 706777, possibly because it
is not a strong prime number. The run time for smaller numbers on the order of 10^6 did
not change much, but as the number increased to 1076437 the run time rose steeply. The
testing accuracy for n = 1076437 was barely above 50%.
For 4 hidden layers:

Input number    Training Accuracy    Testing Accuracy    Run time
108733          93.75%               93.55%              ~1 hour
124573          96.10%               95.35%              ~1 hour
125321          99.61%               98.89%              ~1 hour
401882          98.15%               98.34%              ~3 hours
706777          78.67%               78.26%              ~3 hours
848329          96.02%               94.28%              ~3 hours
1076437         56.08%               50.22%              ~10 hours
When the value of n was increased to very large prime numbers, the run time for around
3000 iterations approached 16 hours. It took a considerable amount of time because the
network was trained and tested on a dual-core system; the run time is also driven up by
the four hidden layers and one dropout layer between the input and output layers. Both
the testing and training accuracies were again barely above 50%.
For these large numbers (1076437 and 1551937), as the number of iterations increased,
the network trained better and provided better results. The table below presents both
accuracy values after 200000 iterations.
For 4 hidden layers (after 200000 iterations):

Input number    Training Accuracy    Testing Accuracy    Run time
1076437         80.32%               78.81%              ~18 hours
1551937         74.47%               76.19%              ~18 hours
With 4 hidden layers, the model trained better, but for n = 10480741 the accuracies did
not increase to a great extent. The accuracies were noted after 3000 iterations in each
trial, and each trial took around 16 hours. Both accuracies increased steadily in this
case, and as the model is trained longer with different numbers of the same order, we
can expect better results.
Trial    Input number    Training Accuracy    Testing Accuracy    Run time
1        10480741        50.71%               50.03%              ~16 hours
2        10480741        51.53%               52.35%              ~16 hours
3        10480741        50.09%               52.48%              ~16 hours
When the model was tested on other very large numbers, the accuracies were similar to
the results obtained above. The accuracies were noted after 3 trials, where each trial
had 3000 iterations, and the run time for each trial was around 16 hours. The checkpoints
were saved after each trial, and the next trial restored the previously saved variables
to resume the process.
Trials                      Input number    Training Accuracy    Testing Accuracy
3 (9000 iterations total)   11300137        50.35%               51.28%
3 (9000 iterations total)   12044101        51.90%               49.91%
For very large numbers on the order of 10^9, the model with 2 or more hidden layers was
not able to generate any results even after a run time of 20 hours. This is due to the
limited resources of a dual-core system; the program could be ported to a parallelized
distributed system to run faster. When the model was designed with only one hidden layer,
very large numbers on the order of 10^9 gave initial iteration results. The number of
instances was reduced to 70% of the total instances originally used. The accuracies for
these very large numbers with a one-hidden-layer network were in the range of 40-42%,
less than what could be obtained by random guessing on top of the Jacobi algorithm. The
network needs at least two hidden layers to outperform random guessing and the random
forest implementation. The table below presents the training and testing accuracies for
some numbers, noted after 200000 iterations.
For 1 hidden layer:

Input number    Training Accuracy    Testing Accuracy    Run time
1307377         45.62%               47.49%              ~3 hours
1551937         46.70%               46.32%              ~3 hours
For larger numbers like 10480741 or 11300137, the testing and training accuracies were
noted after 100000 iterations; the program was then stopped, although it was showing
better results as the number of iterations increased. For very large numbers like
101100721, both accuracies were noted after 10000 iterations. Again the number of
instances was reduced to 70% in order to analyze the confusion matrix.
For 1 hidden layer:

Input number    Training Accuracy    Testing Accuracy    Run time
10480741        40.78%               39.51%              ~12 hours
11300137        40.29%               41.21%              ~12 hours
101100721       40.12%               40.49%              ~20 hours
The training and testing accuracies were plotted against the number of iterations run on
the network model. It is evident that the network is training in a positive direction and
that 3 hidden layers are appropriate for numbers on the order of 10^7 and 10^8. The
accuracies are plotted after 0, 25000, 50000 and 100000 iterations.
Figure 5
Figure 6
The plots in the above graphs show a dip in accuracy for some numbers in the initial set
of iterations. This may be because the number of instances seen in training and testing
is still small at those iterations; eventually the model shows a drastic improvement in
accuracy. The model was also tested on other, bigger numbers with fewer iterations to
confirm that the network model is appropriate. The accuracies for smaller numbers reach
as high as 97%, but for larger numbers many iterations are needed to obtain meaningful
results. As the modulus base number n grows to the order of 10^8 or 10^9, the time taken
for each iteration also increases, and it would take a long time to analyze the accuracies
after around 100000 iterations. A distributed system that synchronizes the threads of
each process would therefore be needed; this would make the program run faster for very
large numbers.
Figure 7
3-dimensional graphs were plotted so that the network model could be studied along 3
parameters. The X axis shows the different numbers used as modular bases, the Y axis
gives the number of hidden layers in the model, and the Z axis plots the training and
testing accuracies individually. These plots suggest the appropriate number of layers to
use for a given input base number.
Figure 8
In the above three-dimensional graph, the base number is plotted along the X-axis. We
also plotted the number of iterations on the X axis to analyze how the model performed
with different numbers using different numbers of layers. The training and testing
accuracies were noted after 50000 and 100000 iterations. The same set of numbers is
considered throughout so that the different performance levels are easy to compare.
Figure 9
The above plot shows that for some numbers there is a dip in accuracy in the initial set
of iterations. Even though the number of iterations is increasing, the accuracies go down
at first while the network adjusts the correlations between neurons in different layers.
The weight and bias variables get adjusted with more and more training, and eventually
the accuracies increase consistently. The 3D plot for the testing accuracies is similar
to that for the training accuracies.
5. CONCLUSION
In the past, artificial neural networks have been very successful in detecting patterns
in images and other complex problems. The quadratic residue problem has been unsolved
for many years; many researchers tried classification techniques such as random forest.
The need for an artificial neural network was recognized, and a number of attempts were
made to solve this problem.
The network architecture designed in this paper was varied by changing the number of
hidden layers in the network. The order of the modulus base matters a great deal in
selecting the number of hidden layers. Numbers on the order of 10^8 or 10^9 were
impossible to train with a four-hidden-layer network running on a single system. When
the accuracies were analyzed using more than one hidden layer, the results showed signs
of improvement, and more training time would result in better performance of the model.
Smaller numbers, on the order of 10^5 or 10^6, worked very well on the three-hidden-layer
model. The accuracies were noted and analyzed at regular intervals. Although for smaller
numbers the run times with two and three hidden layers did not differ much, the
accuracies were considerably higher with three hidden layers; two hidden layers are not
a good choice for smaller numbers on the order of 10^6. Tests were also conducted on the
4-hidden-layer model to compare the testing and training accuracies. For very large
numbers on the order of 10^9, the model was able to generate some results only with one
hidden layer; with more than one hidden layer, the iterations did not complete because
of the high run time and limited resource allocation.
To test the model with very large numbers, the resource allocation has to be increased.
The program needs a parallelized distributed system, in which case the run time for each
iteration could decrease drastically. This would give a better chance to analyze the
results after a significant number of iterations.
REFERENCES
1) Wolfram MathWorld — "Quadratic Residue."
2) Fjellstedt, Lars. "A theorem concerning the least quadratic residue and non-residue."
Ark. Mat. 3 (1956), no. 3, 287-291. doi:10.1007/BF02589415.
http://projecteuclid.org/euclid.afm/1485893276.
3) Vasiga, Troy Michael John (2008). Error Detection in Number-Theoretic and Algebraic
Algorithms. UWSpace. http://hdl.handle.net/10012/3895.
4) Schneier, Bruce. Applied Cryptography: Protocols, Algorithms and Source Code in C,
2nd edition. New York: Wiley, 1996.
5) Monga, Rajat (engineering director, TensorFlow). "Deep Learning with TensorFlow."
6) Chandrasekaran, Balaji and Balakrishnan, Ramadoss. "Attribute Based Encryption Using
Quadratic Residue for the Big Data in Cloud Environment." Proceedings of the
International Conference on Informatics and Analytics, Article No. 19, Pondicherry,
India, August 25-26, 2016.
7) Allegrini, Franco and Olivieri, Alejandro C. "Sensitivity, Prediction Uncertainty,
and Detection Limit for Artificial Neural Network Calibrations." Anal. Chem. 88 (15),
pp. 7807-7812, July 1, 2016. doi:10.1021/acs.analchem.6b01857.
8) Hijazi, Samer, Kumar, Rishi, and Rowen, Chris (IP Group, Cadence). "Using
Convolutional Neural Networks for Image Recognition."
9) Chen, Ke and Kurgan, Lukasz A. "Neural Networks in Bioinformatics." Springer Berlin
Heidelberg. doi:10.1007/978-3-540-92910-9_18.
10) Kukic, Predrag, Mirabello, Claudio, Tradigo, Giuseppe, Walsh, Ian, Veltri,
Pierangelo and Pollastri, Gianluca. "Toward an accurate prediction of inter-residue
distances in proteins using 2D recursive neural networks." BMC Bioinformatics,
January 10, 2014. doi:10.1186/1471-2105-15-6.
11) Potter, Michael, Reznik, Leon, and Radziszowski, Stanisław. "Neural Networks and
the Search for a Quadratic Residue Detector." IJCNN, IEEE, Anchorage, AK, USA,
May 14-19, 2017. doi:10.1109/IJCNN.2017.7966080.
12) MacWilliams, F.J. and Sloane, N.J.A. The Theory of Error-Correcting Codes.
Amsterdam; New York: North-Holland, 2000.
13) Krupka, Demeter. Introduction to Global Variational Geometry, 1st edition.
Atlantis Press, 2015. doi:10.2991/978-94-6239-073-7.