
http://www.iaeme.com/IJARET/index.asp 846 [email protected]

International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 11, Issue 6, June 2020, pp. 846-859, Article ID: IJARET_11_06_076

Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=6

ISSN Print: 0976-6480 and ISSN Online: 0976-6499

DOI: 10.34218/IJARET.11.6.2020.076

© IAEME Publication Scopus Indexed

DETECTING NEAR DUPLICATE IMAGES

USING HELLINGER CORRELATION

COEFFICIENT

G. Kalaiarasi

Department of Computer Science and Engineering,

Sathyabama Institute of Science and Technology,

Chennai, India

M. Maheswari

Department of Computer Science and Engineering,

Sathyabama Institute of Science and Technology,

Chennai, India

Prathima Devadas

Department of Computer Science and Engineering,

Sathyabama Institute of Science and Technology,

Chennai, India

M. Selvi

Department of Computer Science and Engineering,

Sathyabama Institute of Science and Technology,

Chennai, India

ABSTRACT

Near duplicate images are images that differ from an original image only by minor changes. Such similar or transformed images are widely available on the internet and are displayed together in the results of a user search. Detecting these near duplicates spares users from seeing repeated copies of the same picture in the result set. In this work, a detection methodology is presented to identify the near duplicates of a user's query image within an image set. First, features are extracted from the images using a Pulse-Coupled Neural Network (PCNN). Once the features are extracted, a similarity measure, the Hellinger coefficient, is computed. Based on the correlation between the query image and the images in the image set, the near duplicate images are identified. This results in effective detection of near duplicate images. The detected near duplicates help users avoid redundant images, detect copyright infringement and illegal copies of images, and browse more easily, since image search engines return multiple copies of an image for a particular query.


Key words: Near Duplicates, Feature Extraction, PCNN, Hellinger Correlation

Coefficient.

Cite this Article: G. Kalaiarasi, M. Maheswari, Prathima Devadas and M. Selvi,

Detecting Near Duplicate Images using Hellinger Correlation Coefficient,

International Journal of Advanced Research in Engineering and Technology, 11(6),

2020, pp. 846-859.

http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=6

1. INTRODUCTION

The presence of near duplicate images in web image search results is a source of redundancy and can indicate copyright infringement. Such images are derived from an original image through simple editing operations, performed either with image editing software or manually by users. For instance, if a user submits the query "TajMahal" to an image search engine, it returns images related to the query. It is evident from Fig. 1 that the vast majority of the returned images are variants of, and related to, one another.

Figure 1 Image Search Results for the Query "TajMahal"

To improve the quality of the user's search, these near duplicate images should be detected rather than presenting variants of the same image to the user, which avoids the overhead of viewing the near duplicates. The ability to detect such variants reliably and accurately also enables the detection of copyright infringement, detection of image plagiarism [1], copy-move forgery detection [2], forgery identification in paintings [3], localization in paintings (i.e., locating where modifications have been made to the original), limiting the spread of illegal content over the web, and reducing the appearance of similar images in displayed image search results.

In this paper, the PCNN algorithm is used for feature extraction. There are various algorithms for extracting image features, including PCA, the Haar transform, the Hough transform, edge detection, and so on. However, these methods are effective only on simple images such as small shapes and characters. They are also sensitive to noise and may fail to recognize images related by geometric transformations as similar. Therefore, the PCNN model is used as the feature extraction method, generating a feature vector through an iterative process.

The main advantage of the Pulse-Coupled Neural Network (PCNN) model is that it works without any training. It is also an efficient technique in digital image processing, with applications such as feature extraction, face detection, edge detection, image enhancement, image segmentation, image thinning, motion detection, pattern recognition, noise removal, and image classification. Compared with other image processing algorithms, the PCNN algorithm is noise resistant and robust to translation, scaling, and rotation of the input patterns. The PCNN is known for the small number of generated features, the minimal set of etalons (reference patterns) used, and the fixed structure of the input object matrices. The drawbacks of PCNN are the difficulty of estimating optimal parameters and the high computational complexity of feature generation.

The rest of this paper is organized as follows: related work is discussed in Section 2, the proposed system is described in Section 3, Section 4 presents the experimental results, and Section 5 concludes the paper.

2. RELATED WORKS

2.1. Near Duplicate Images

The problem of detecting near-duplicate images has attracted researchers in computer vision and content-based image retrieval because satisfactory results are still lacking. In [4], near-duplicate keyframe (NDK) identification is based on local interest points (LIPs) with PCA-SIFT descriptors. Matching is done by one-to-one symmetric (OOS) mapping, a LIP index structure (LIP-IS) is used for filtering, and patterns in NDKs are learned by Support Vector Machines (SVMs). The approach is applied to multilingual news stories and to news video clustering and summarization. Scale-Rotation invariant Pattern Entropy (SR-PE) is a pattern evaluation technique used to detect near duplicates in large-scale video corpora [5]. It measures the spatial regularity of the matching patterns formed by local keypoints. However, separating duplicate objects from background scenes for semantic recognition is difficult. Near-duplicate detection is also used for effective photo management and for facilitating user browsing [6]. In [7], graphics processing units (GPUs) are used to overcome the drawbacks of bag-of-local-features (BOF).

Lei et al. proposed a geometric invariant feature based on the Radon transform [8] for near-duplicate detection. It is robust not only to rotation and scaling but also to operations such as compression, noise contamination, blurring, illumination, modification, and cropping. However, the computational complexity and storage load are high, and the descriptor is robust to image editing operations but not to perspective transformation of the image. A variable-length signature identified the near duplicate images in an image set [9]. In that work, a visual descriptor, the probabilistic center-symmetric local binary pattern, represents an image patch, and the earth mover's distance is used to compute the similarity between two images. This approach applies mainly to near-duplicate document image retrieval and near-duplicate natural image detection. Near-duplicate image discovery on one billion images has been implemented efficiently using the MapReduce framework. Jun Jie Foo et al. [10] removed near duplicate images from answer lists using Dynamic Partial Functions and PCA-SIFT (Principal Component Analysis-Scale Invariant Feature Transform).

Removing near duplicate images from the query is impractical, but removing them from the answer lists is feasible [11]. Near duplicate images can also be clustered, so that all near duplicates of a given query image are grouped together. Clustering approaches include unsupervised image-set clustering [12], nearest-neighbor-based clustering [13], fingerprint clustering [14], clustering using local discriminant models and global integration [15], k-means clustering [16], sequential block clustering [17], hierarchical and flat clustering [18], k-means combined with a multilayer perceptron artificial neural network [19], interactive semi-supervised clustering [20], predictive clustering [21], and visually divisible clustering [22], [23]. Reviews and challenges of image clustering are discussed in [24].

2.2. Feature Extraction

The selection of features depends on the application and on the database used for retrieval. Commonly used features include colour, texture, shape, Scale Invariant Feature Transform (SIFT) [25][26], SVD-SIFT [27], Edge-SIFT [28], Scale Rotation Invariant Pattern Entropy [29], Principal Component Analysis-Scale Invariant Feature Transform (PCA-SIFT), Bag of Visual Words [30][31], Speeded Up Robust Features (SURF), and Haar wavelets. In [32], discriminative local and semi-global shape features are used for shape matching and object recognition; this method is mainly suited to real-time hand gesture recognition. SIFT features are used for near duplicate image detection at high confidence and large scale [33]. In addition, entropy-based filtering finds near-duplicate relationships from a single match, and a query expansion strategy based on graph cut is developed to improve the quality of image search results. The authors of [34] use Bags of Visual Phrases for near duplicate image detection. This feature is used when images are retrieved from large databases such as social networks, i.e., when images are indexed using hashing techniques.

Distinctive local descriptors [35] are used to obtain high-quality matches even under severe transformations, and locality-sensitive hashing is used to index these image features. This technique performs efficient near-duplicate detection and sub-image retrieval. Its main application is in finding copyrighted images and detecting forged images. In [36], a forgery detection algorithm is proposed for recognizing tampered inpainting images. The suspicious region is identified first, and the forged region is then located by searching for similar blocks in the image and using a similarity vector field. In [37], a unit-linking PCNN transforms a 2-dimensional image into a 1-dimensional time series that captures features of the original image and its variations. To improve classification accuracy in texture classification, scale- and rotation-invariant features are computed from the local binary pattern [38]. The local spiking pattern, an image descriptor, is proposed in [39] to classify texture images acquired under various rotation and illumination conditions. PCNN together with Tsallis entropy is used as a feature extraction method for face recognition [40].

Tsallis entropy as a feature is invariant to translation, scaling and rotation. A PCNN using discrete Fourier transforms analyzes the pulse of the network to achieve scale- and translation-independent recognition of isolated objects [41]. Pixel-based blind image forgery detection is essential because many powerful image editing tools make manipulating images and changing their content a trivial task as we move towards paperless workplaces [42]. PCNN-based feature extraction is also applied to noisy images [43]. Feature extraction techniques are reviewed in [44], and computer-vision-based approaches using PCNN for image detection are discussed in [45]. Images are also clustered using bundled features [46].

2.3. Pulse-Coupled Neural Network

The PCNN is an unsupervised network that is well suited to digital image processing. Pulse-coupled neural network models and their applications are discussed in [47]. Wang et al. presented a review of research on pulse-coupled neural networks and some modified models [48]. PCNN is known for its simple implementation, high recognition accuracy, and resistance to noise disturbance, and it has been applied to noise removal [49]. A dual-channel pulse-coupled neural network algorithm is used for the fusion of multimodal brain images [50]. An optimized PCNN together with K-Nearest Neighbor (K-NN) is used for content-based image classification and retrieval by extracting visual features in the form of a numeric vector called the image signature [51].

In [52], PCNN is used for image recognition and for reducing the number of features. The number of generated features depends on the number of PCNN iteration steps. The time signal G(n) is simply the sum of the outputs Yij of the activated neurons in each iteration step n. This time signal forms the generated features that represent an image, and G(n) is used as the PCNN feature in all of these applications. The conventional PCNN method for image feature extraction operates on gray images and discards the colour information of the image, so HSI-PCNN is used for face recognition [53]. However, face recognition against a complex background remains difficult. Image matching using PCNN is also applied in Content Based Image Retrieval (CBIR) systems [54].

Work related to the Hellinger Correlation Coefficient includes the following. Movie recommendation based on similar users is performed using Hellinger distance and KNN [55]. Supervised maximal-similarity-based region classification separates sky from non-sky regions using a Hellinger kernel-based distance [56]. The Hellinger distance performs efficiently for feature selection on high-dimensional class-imbalanced data in various disciplines [57]. The distance between two nodes on the same side of a bipartite network is measured using the Hellinger distance [58]. Pearson's Correlation Coefficient is used in an image compression scheme with block classification; there, the affine similarity between two blocks is equivalent to the absolute value of Pearson's Correlation Coefficient (APCC) between them, which increases the matching probability [59].

3. PROPOSED SYSTEM

In this paper, an effective method is proposed to detect the near duplicates of a user's query image within an image set. First, features are extracted from the images using a Pulse-Coupled Neural Network. Once the features are extracted, a similarity measure is computed; here, the Hellinger Correlation Coefficient is used. Based on the correlation between the query image and the images in the image set, near duplicate images are identified.

3.1. The PCNN Model

The basic structure of the PCNN model is shown in Fig. 2, and its operation is described here. The number of neurons in the network depends on the input image: each pixel in the image is associated with a unique neuron. All the neurons are connected through the linking field. A neuron consists of three parts: a feeding field (receptive field or dendritic tree), a linking modulation and a pulse generator. The receptive field is the primary part that receives input signals from the neighbouring neurons and from external sources. It has two channels, known as the feeding compartment F and the linking compartment L. The total internal activity U is formed by multiplying the biased linking input with the feeding input; this constitutes the linking (modulation) part. The pulse generator consists of a step function generator and a threshold signal generator.


Figure 2 Basic structure of PCNN neuron

The response of a neuron in the network is known as firing. A neuron fires when its internal activity exceeds a certain threshold, and its output Y is then set to 1. The output is reset to zero when the threshold is larger than the internal activity U. Over n iterations, the neurons produce a series of pulse outputs. The pulse output carries information about the input image, and decisions about the content of the image are made by examining the pulse output of the network.

The feeding input and linking input use the synaptic weights M and W to communicate with the surrounding neurons. The input stimulus is supplied only to the feeding compartment. The outputs of the compartments are determined by the following equations:

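$$F_{ij}[n] = e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{kl} M_{ijkl}\, Y_{kl}[n-1] + S_{ij} \qquad (1)$$

$$L_{ij}[n] = e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{kl} W_{ijkl}\, Y_{kl}[n-1] \qquad (2)$$

$$U_{ij}[n] = F_{ij}[n] \left( 1 + \beta L_{ij}[n] \right) \qquad (3)$$

$$T_{ij}[n] = e^{-\alpha_T} T_{ij}[n-1] + V_T\, Y_{ij}[n-1] \qquad (4)$$

$$Y_{ij}[n] = \begin{cases} 1, & U_{ij}[n] > T_{ij}[n] \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$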

In the above equations, (i,j) is the position of a neuron in the network and of the corresponding pixel in the image. If the input image is 128 x 128 pixels, (i,j) ranges from (1,1) to (128,128). (k,l) denotes the positions of the surrounding neurons; n is the iteration number; and Sij is the gray level of the input image pixel.

Fij[n], Lij[n], Uij[n] and Tij[n] are the feeding input, linking input, internal activity and dynamic threshold, respectively. The output is given by the value of Y, which indicates the neuron status: if the value equals 1, the neuron is considered activated. The values of M and W depend on the surrounding neurons and are constant. β is the linking coefficient constant; αF, αL and αT are the attenuation time constants; and VF, VL and VT are the inherent voltage potentials of the feeding signal, linking signal and dynamic threshold, respectively. The values of Y carry detailed information, such as regional information and edge information, that constitutes the features of the input image.
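As an illustration of how equations (1) to (5) can be iterated, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes the usual exponential-decay form of the PCNN update, zero padding at the image border, a threshold that starts at zero, gray levels scaled to [0, 1], and the 3 x 3 weight matrix and parameter values listed in Section 4.1 as defaults. The function names `neighbour_sum` and `pcnn_signature` are hypothetical. The function returns the count of firing neurons per iteration, which is the time signal G(n) used as the image feature.

```python
import math

# 3x3 weight matrix from Section 4.1, used here for both M and W.
KERNEL = [[0.5, 1.0, 0.5],
          [1.0, 0.0, 1.0],
          [0.5, 1.0, 0.5]]


def neighbour_sum(prev, i, j):
    """Weighted sum of previous outputs around (i, j); cells outside the image count as 0."""
    rows, cols = len(prev), len(prev[0])
    total = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            r, c = i + di, j + dj
            if 0 <= r < rows and 0 <= c < cols:
                total += KERNEL[di + 1][dj + 1] * prev[r][c]
    return total


def pcnn_signature(image, steps, a_f=0.1, a_l=1.0, a_t=1.0,
                   v_f=0.5, v_l=0.2, v_t=20.0, beta=0.1):
    """Run `steps` PCNN iterations on a gray image (list of rows) and return [G(1), ..., G(steps)]."""
    rows, cols = len(image), len(image[0])

    def zeros():
        return [[0.0] * cols for _ in range(rows)]

    # Assumption: F, L, T and Y all start at zero.
    feed, link, thr, out = zeros(), zeros(), zeros(), zeros()
    signature = []
    for _ in range(steps):
        prev, out = out, zeros()
        for i in range(rows):
            for j in range(cols):
                k = neighbour_sum(prev, i, j)
                feed[i][j] = math.exp(-a_f) * feed[i][j] + v_f * k + image[i][j]  # eq. (1)
                link[i][j] = math.exp(-a_l) * link[i][j] + v_l * k                # eq. (2)
                u = feed[i][j] * (1.0 + beta * link[i][j])                        # eq. (3)
                thr[i][j] = math.exp(-a_t) * thr[i][j] + v_t * prev[i][j]         # eq. (4)
                out[i][j] = 1.0 if u > thr[i][j] else 0.0                         # eq. (5)
        signature.append(sum(map(sum, out)))  # G(n): number of neurons firing at step n
    return signature
```

Plain lists keep the sketch dependency-free; an array library would be the usual choice for 128 x 128 images. Because the weight matrix is symmetric, rotating the input by 90 degrees leaves the returned signature unchanged in this sketch.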

3.2. Feature Extraction

The image to be processed is converted into a gray image and fed as input to the PCNN model. The PCNN transforms it into a sequence of binary images that contains information about the shape, edge and texture features of the original image. However, this sequence cannot be used directly as the feature since the amount of data is too large. To handle this, the sequence of images is transformed to extract a compact representation as the image feature. Johnson defined the time series $G(n) = \sum_{ij} Y_{ij}[n]$ as the image feature, where n runs from 1 to N. The time series is invariant to rotation, translation, scaling and distortion. The length of the feature vector equals the total number of PCNN iterations performed. The feature vector is invariant to geometrical changes. Existing work indicates that PCNN feature extraction with geometrically invariant features is also noise resistant. After feature extraction, the Hellinger Correlation Coefficient is computed as the similarity metric for identifying near duplicate images.

Hellinger Correlation Coefficient

The Hellinger coefficient is a measure of similarity between two probability distributions. Correlation coefficients are expressed as values between 0 and 1, and a coefficient of +1 indicates perfect positive correlation. Here the image features are treated as a two-dimensional vector. Pseudo code of the Hellinger coefficient for images:

Step 1: The extracted features of each image are treated as a matrix.

Step 2: Calculate the similarity between the query image and every image in

the image dataset using the equation given below

From the above equation,

1. Calculate the row average of the query image and of the image in the image set

2. Dot and sum product is equal to 1 (ie) =1
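The equation used by the authors does not survive in this text, so the following sketch falls back on the standard Hellinger affinity (also called the Bhattacharyya coefficient) between two discrete distributions, $\sum_i \sqrt{p_i q_i}$. This is an assumption about the general form of the measure and does not reproduce the row-averaging step listed above. The function name `hellinger_coefficient` is hypothetical. Each feature vector is first normalised to sum to 1 so that it can be treated as a probability distribution.

```python
import math


def hellinger_coefficient(p, q):
    """Standard Hellinger affinity between two non-negative feature vectors.

    Each vector is normalised to sum to 1. The result lies in [0, 1]:
    1 when the normalised vectors are identical, 0 when they never overlap.
    """
    if len(p) != len(q):
        raise ValueError("feature vectors must have the same length")
    if any(x < 0 for x in p) or any(x < 0 for x in q):
        raise ValueError("feature vectors must be non-negative")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p == 0 or sum_q == 0:
        raise ValueError("feature vectors must not be all zero")
    return sum(math.sqrt((a / sum_p) * (b / sum_q)) for a, b in zip(p, q))
```

For example, `hellinger_coefficient([1, 0], [0, 1])` is 0 because the two vectors share no non-zero position, while any vector compared with a scaled copy of itself gives a value of 1 up to rounding.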

3.3. Algorithm

The overall steps of the proposed system are as follows:

Step 1: Read the query image from the user as input.

Step 2: Read all images in the image set one by one.

Step 3: Extract the features of the query image and of all images in the set using PCNN.

Step 4: Initialize the values of the PCNN parameters.

Step 5: The feeding field of the neuron is computed as follows:

i) The previous output values Ykl[n-1] (initially zero) of the surrounding neurons are weighted by the constant Gaussian weight matrix Mijkl and multiplied by the inherent voltage potential VF. From this, $V_F \sum_{kl} M_{ijkl}\, Y_{kl}[n-1]$ is obtained.

ii) In addition, the previous feeding input (initially set to zero) is attenuated using the attenuation time constant of the feeding input αF, i.e., $e^{-\alpha_F} F_{ij}[n-1]$.

iii) These two values are summed with the input image pixel Sij, i.e., $F_{ij}[n] = e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{kl} M_{ijkl}\, Y_{kl}[n-1] + S_{ij}$, to obtain the feeding input field value.


Step 6: The linking field of the neuron is computed as follows:

i) The previous output values Ykl[n-1] (initially zero) of the surrounding neurons are weighted by the constant Gaussian weight matrix Wijkl and multiplied by the inherent voltage potential VL. From this, $V_L \sum_{kl} W_{ijkl}\, Y_{kl}[n-1]$ is obtained.

ii) In addition, the previous linking input (initially set to zero) is attenuated using the attenuation time constant of the linking input αL, i.e., $e^{-\alpha_L} L_{ij}[n-1]$.

iii) These two values are summed, i.e., $L_{ij}[n] = e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{kl} W_{ijkl}\, Y_{kl}[n-1]$, to obtain the linking input field value.

Step 7: Both the feeding field and the linking field communicate with the neighbouring neurons through the synaptic weights M and W.

Step 8: The internal activity of the neuron is calculated as follows:

i) The linking coefficient β is multiplied by the linking field Lij, i.e., $\beta L_{ij}[n]$ is obtained.

ii) The constant 1 is added to the result of the above step, and the sum is multiplied by the feeding field, i.e., $U_{ij}[n] = F_{ij}[n] \left( 1 + \beta L_{ij}[n] \right)$ is obtained.

Step 9: The dynamic threshold is computed using $T_{ij}[n] = e^{-\alpha_T} T_{ij}[n-1] + V_T\, Y_{ij}[n-1]$.

Step 10: If the internal activity exceeds the threshold, the pulse output is 1; otherwise the output is 0.

Step 11: The image feature time series is calculated using $G(n) = \sum_{ij} Y_{ij}[n]$.

Step 12: Repeat Steps 5 to 11 for all images in the image set separately.

Step 13: Calculate the Hellinger Correlation Coefficient between the query image and every image in the image set.

Step 14: A higher coefficient value implies higher similarity between the images, indicating near duplicate images.

Step 15: A lower coefficient value indicates non-near-duplicate images, i.e., the remaining images other than near duplicates.

Step 16: The Hellinger Coefficient is calculated for all the images in the set.

Step 17: Finally, the near duplicate images are detected.
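Steps 13 to 17 can be sketched as a ranking over feature vectors that have already been extracted. This is a minimal illustration under stated assumptions and not the authors' code. It uses the standard Hellinger affinity as the coefficient, and the names `find_near_duplicates` and `image_signatures` and the cut-off value 0.95 are all hypothetical, since the paper does not state a decision threshold.

```python
import math


def hellinger_coefficient(p, q):
    """Standard Hellinger affinity of two non-negative vectors after normalising each to sum 1."""
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        return 0.0  # an all-zero signature is treated as matching nothing
    return sum(math.sqrt((a / sum_p) * (b / sum_q)) for a, b in zip(p, q))


def find_near_duplicates(query_signature, image_signatures, threshold=0.95):
    """Return (name, coefficient) pairs at or above `threshold`, most similar first.

    `image_signatures` maps an image name to its PCNN time series G(1..N).
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    scored = [(name, hellinger_coefficient(query_signature, signature))
              for name, signature in image_signatures.items()]
    matches = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)


# Usage with made-up signatures: "scaled" is proportional to the query,
# "other" fires in a different iteration and shares no non-zero position.
signatures = {"scaled": [8, 0, 4, 2], "other": [0, 5, 0, 0]}
result = find_near_duplicates([4, 0, 2, 1], signatures)
```

In this made-up example only "scaled" is returned. Note that normalising the signatures discards their overall magnitude, so two images whose firing counts differ only by a constant factor are treated as identical.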

4. EXPERIMENT RESULTS

In this paper, the image set is created by generating variants of original images: original images are taken and transformations (rotation, scaling, cropping, etc.) are applied to produce near duplicates. Fig. 3 shows the near duplicate images for eight original images.


Figure 3 Images with near duplicates

4.1. Parameters of PCNN

Following references [43], [60], [61], [62], [63], [64], the same PCNN parameters must be used for processing all the images. Table 1 shows the parameters that are set in advance. The L, U and Y matrices are initially set to zero, and W and M are given by the matrix [0.5,1,0.5;1,0,1;0.5,1,0.5].

Table 1 Parameters of PCNN

Parameter   αF    αL    αT    VF    VL    VT    β

Value       0.1   1.0   1.0   0.5   0.2   20    0.1
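To give a feel for what these values imply, the short sketch below converts the attenuation constants of Table 1 into per-iteration decay factors, assuming the usual exponential form $e^{-\alpha}$. It also estimates how many decay steps a threshold recharged to VT needs before a neuron with constant internal activity u can fire again. The names `PCNN_PARAMS`, `DECAY` and `steps_until_refire` are hypothetical, and the refire estimate ignores both changes in u and any residual threshold from earlier firings.

```python
import math

# Parameter values from Table 1.
PCNN_PARAMS = {
    "alpha_F": 0.1, "alpha_L": 1.0, "alpha_T": 1.0,
    "V_F": 0.5, "V_L": 0.2, "V_T": 20.0, "beta": 0.1,
}

# Per-iteration decay factors, assuming each field decays as exp(-alpha).
DECAY = {name: math.exp(-PCNN_PARAMS[name])
         for name in ("alpha_F", "alpha_L", "alpha_T")}


def steps_until_refire(u, v_t=PCNN_PARAMS["V_T"], alpha_t=PCNN_PARAMS["alpha_T"]):
    """Smallest n with v_t * exp(-alpha_t * n) < u, for constant activity 0 < u < v_t."""
    if not 0 < u < v_t:
        raise ValueError("u must lie strictly between 0 and v_t")
    return math.floor(math.log(v_t / u) / alpha_t) + 1
```

With these values the feeding field keeps about 90% of its value per iteration, while the linking field and threshold keep about 37%. A neuron with internal activity 1 needs three decay steps before a threshold of 20 falls below it.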

4.2. Results

Once feature extraction is complete, the Hellinger Coefficient is computed. The Hellinger Coefficient values of each image with respect to all the other images are plotted in the fig.

Based on the correlation between images, detection of near duplicate images is performed. For the image set used in this paper, the results are shown below:

Figure 4 Detection of Near Duplicate images (panels: Query Image 1 with its near duplicate images; Query Image 2 with its near duplicate images)

5. CONCLUSION

Near duplicate images are detected by extracting features with a Pulse-Coupled Neural Network and computing similarity with the Hellinger Correlation Coefficient. The results show that the near duplicate images are identified correctly.

REFERENCES

[1] S. Srivastava, P. Mukherjee, B. Lall, (2015) "imPlag: Detecting image plagiarism using hierarchical near duplicate retrieval," In Annual IEEE India Conference (INDICON) Dec 17 (pp. 1-6).

[2] B. Wen, Y. Zhu, R. Subramanian, T.T. Ng, X. Shen, S. Winkler, (2016) "COVERAGE - A novel database for copy-move forgery detection," In IEEE International Conference on Image Processing (ICIP) Sep 25 (pp. 161-165).

[3] P. Buchana, I. Cazan, M. Diaz-Granados, F. Juefei-Xu, M. Savvides, (2016) "Simultaneous forgery identification and localization in paintings using advanced correlation filters," In IEEE International Conference on Image Processing (ICIP) Sep 25 (pp. 146-150).

[4] W.L. Zhao, C.W. Ngo, H.K. Tan, X. Wu, (2007) "Near-duplicate keyframe identification with interest point matching and pattern learning," IEEE Transactions on Multimedia. Jul 23;9(5):1037-48.

[5] W.L. Zhao, C.W. Ngo, (2009) "Scale-rotation invariant pattern entropy for keypoint-based near-duplicate detection," IEEE Transactions on Image Processing. Jan 9;18(2):412-23.


[6] W.T. Chu, C.H. Lin, (2010) ―Consumer photo management and browsing facilitated by

near-duplicate detection with feature filtering,‖ Journal of Visual Communication and

Image Representation. Apr 1;21(3):256-68.

[7] H. Xie, K. Gao, Y. Zhang, S. Tang, J. Li, Y. Liu, (2011) ―Efficient feature detection and

effective post-verification for large scale near-duplicate image search,‖ IEEE Transactions

on multimedia. Sep 5; 13(6):1319-32.

[8] Y. Lei, L. Zheng, J. Huang, (2014) ―Geometric invariant features in the Radon transform

domain for near-duplicate image detection,‖ Pattern recognition. Nov 1;47(11):3630-40.

[9] L. Liu, Y. Lu, C.Y. Suen, (2015) ―Variable-length signature for near-duplicate image

matching,‖ IEEE Transactions on Image Processing. Feb 4; 24(4):1282-96.

[10] J. J. Foo, J. Zobel, R. Sinha, (2007) "Clustering near-duplicate images in large collections," In Proceedings of the international workshop on multimedia information retrieval, Sep 24 (pp. 21-30).

[11] J. J. Foo, J. Zobel, R. Sinha, S. M. Tahaghoghi, (2007) "Detection of near-duplicate images for web search," In Proceedings of the 6th ACM international conference on Image and video retrieval, Jul 9 (pp. 557-564).

[12] J. Goldberger, S. Gordon, H. Greenspan, (2006) "Unsupervised image-set clustering using an information theoretic framework," IEEE transactions on image processing. Jan 16;15(2):449-58.

[13] T. Liu, C. Rosenberg, H. A. Rowley, (2007) "Clustering billions of images with large scale nearest neighbor search," In 2007 IEEE workshop on applications of computer vision (WACV'07) Feb 21 (pp. 28-28).

[14] G. J. Bloy, (2008) "Blind camera fingerprinting and image clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence. Jan 21;30(3):532-4.

[15] Y. Yang, D. Xu, F. Nie, S. Yan, Y. Zhuang, (2010) "Image clustering using local discriminant models and global integration," IEEE Transactions on Image Processing. Apr 26;19(10):2761-73.

[16] M. Karthikeyan and P. Aruna, (2013) "Probability based document clustering and image clustering using content-based image retrieval," Applied Soft Computing. Feb 1; 13(2):959-66.

[17] M. A. Sekeh, M. A. Maarof, M. F. Rohani, B. Mahdian, (2013) "Efficient image duplicated region detection model using sequential block clustering," Digital Investigation. Jun 1;10(1):73-84.

[18] L. J. Villalba, A. L. Orozco, J. R. Corripio, (2015) "Smartphone image clustering," Expert Systems with Applications. Mar 1; 42(4):1927-40.

[19] H. K. Al-Mohair, J. M. Saleh, S. A. Suandi, (2015) "Hybrid human skin detection using neural network and K-means clustering technique," Applied Soft Computing. Aug 1;33:337-47.

[20] H. P. Lai, M. Visani, A. Boucher, J. M. Ogier, (2014) "A new interactive semi-supervised clustering model for large image database indexing," Pattern Recognition Letters. Feb 1;37:94-106.

[21] I. Dimitrovski, D. Kocev, S. Loskovska, S. Džeroski, (2016) "Improving bag-of-visual-words image retrieval with predictive clustering trees," Information Sciences. Feb 1;329:851-65.

[22] S. Pandey and P. Khanna, (2016) "Content-based image retrieval embedded with agglomerative clustering built on information loss," Computers & Electrical Engineering. Aug 1;54:506-21.

[23] S. Pandey, P. Khanna, H. Yokota, (2016) "Clustering of hierarchical image database to reduce inter-and intra-semantic gaps in visual space for finding specific image semantics," Journal of Visual Communication and Image Representation. Jul 1; 38:704-20.

[24] N. Ahmed, (2015) "Recent review on image clustering," IET Image Processing. Aug 7; 9(11):1020-32.

[25] Z. Li and X. Feng, (2013) "Near Duplicate Image Detecting Algorithm based on Bag of Visual Word Model," Journal of Multimedia. Oct 1;8(5).

[26] L. W. Kang, C. Y. Hsu, H. W. Chen, C. S. Lu, (2010) "Secure SIFT-based sparse representation for image copy detection and recognition," In IEEE International Conference on Multimedia and Expo, Jul 19 (pp. 1248-1253).

[27] H. Liu, H. Lu, X. Xue, (2010) "SVD-SIFT for web near-duplicate image detection," In IEEE International Conference on Image Processing, Sep 26 (pp. 1445-1448).

[28] S. Zhang, Q. Tian, K. Lu, Q. Huang, W. Gao, (2013) "Edge-SIFT: Discriminative binary descriptor for scalable partial-duplicate mobile search," IEEE Transactions on Image Processing. Mar 7; 22(7):2889-902.

[29] W. L. Zhao, C. W. Ngo, (2009) "Scale-rotation invariant pattern entropy for keypoint-based near-duplicate detection," IEEE Transactions on Image Processing. Jan 9; 18(2):412-23.

[30] L. Xie, J. Wang, B. Zhang, Q. Tian, (2015) "Fine-grained image search," IEEE Transactions on Multimedia. Mar 4; 17(5):636-47.

[31] Z. Zhou, Y. Wang, Q. J. Wu, C. N. Yang, X. Sun, (2016) "Effective and efficient global context verification for image copy detection," IEEE Transactions on Information Forensics and Security. Aug 17; 12(1):48-63.

[32] H. Xu, J. Yang, J. Yuan, (2016) "Invariant multi-scale shape descriptor for object matching and recognition," In IEEE International Conference on Image Processing (ICIP) Sep 25 (pp. 644-648).

[33] W. Dong, Z. Wang, M. Charikar, K. Li, (2012) "High-confidence near-duplicate image detection," In Proceedings of the 2nd ACM international conference on multimedia retrieval, Jun 5 (pp. 1-8).

[34] S. Battiato, G. M. Farinella, G. C. Guarnera, T. Meccio, G. Puglisi, D. Ravì, R. Rizzo, (2010) "Bags of phrases with codebooks alignment for near duplicate image detection," In Proceedings of the 2nd ACM workshop on Multimedia in forensics, security and intelligence, Oct 29 (pp. 65-70).

[35] Y. Ke, R. Sukthankar, L. Huston, (2004) "Efficient near-duplicate detection and sub-image retrieval," In ACM multimedia, Oct 10 (Vol. 4, No. 1, p. 5).

[36] I. C. Chang, J. C. Yu, C. C. Chang, (2013) "A forgery detection algorithm for exemplar-based inpainting images using multi-region relation," Image and Vision Computing. Jan 1; 31(1):57-71.

[37] X. Gu, (2008) "Feature extraction using unit-linking pulse coupled neural network and its applications," Neural processing letters. Feb 1; 27(1):25-41.

[38] S. Hegenbart, A. Uhl, (2015) "A scale-and orientation-adaptive extension of local binary patterns for texture classification," Pattern recognition. Aug 1; 48(8):2633-44.

[39] S. Du, Y. Yan, Y. Ma, (2016) "Local spiking pattern and its application to rotation-and illumination-invariant texture classification," Optik. Aug 1; 127(16):6583-9.

[40] Y. Zhang and L. Wu, (2008) "Pattern recognition via PCNN and Tsallis entropy," Sensors. Nov; 8(11):7518-29.

[41] R. C. Mureşan, (2003) "Pattern recognition using pulse-coupled neural networks and discrete Fourier transforms," Neurocomputing. Apr 1; 51:487-93.

[42] M. A. Qureshi and M. Deriche, (2015) "A bibliography of pixel-based blind image forgery detection techniques," Signal Processing: Image Communication. Nov 1; 39:46-74.

[43] Y. Ma, Z. Wang and C. Wu, (2006) "Feature Extraction from Noisy Image Using PCNN,"

2006 IEEE International Conference on Information Acquisition, Weihai, pp. 808-813.

[44] K. K. Thyagharajan and G. Kalaiarasi, (2020) "A Review on Near-Duplicate Detection of Images using Computer Vision Techniques," Archives of Computational Methods in Engineering. Jan 6:1-20.

[45] K.K. Thyagharajan and G. Kalaiarasi, (2018) "Pulse Coupled Neural Network based Near-

Duplicate Detection of Images (PCNN - NDD)," Advances in Electrical and Computer

Engineering, vol.18, no.3, pp.87-96.

[46] G. Kalaiarasi and K. K. Thyagharajan, (2019) "Clustering of near duplicate images using bundled features," Cluster Computing. Sep 1; 22(5):11997-2007.

[47] J. L. Johnson and M. L. Padgett, (1999) "PCNN models and applications," IEEE transactions on neural networks. May; 10(3):480-98.

[48] Z. Wang, Y. Ma, F. Cheng, L. Yang, (2010) "Review of pulse-coupled neural networks," Image and Vision Computing. Jan 1; 28(1):5-13.

[49] M. M. Subashini, S. K. Sahoo, (2014) "Pulse coupled neural networks and its applications," Expert systems with Applications. Jun 15; 41(8):3965-74.

[50] S. Kavitha and K. K. Thyagharajan, (2014) "Dual channel pulse coupled neural network algorithm for fusion of multimodality brain images with quality analysis," Applied Medical Informatics, Sep 30; 35(3):31-9.

[51] M. M. Mohammed, A. Badr, M. B. Abdelhalim, (2015) "Image classification and retrieval using optimized pulse-coupled neural network," Expert systems with applications. Jul 1; 42(11):4927-36.

[52] R. Forgac and I. Mokris, (2007) "Foundations of Image Recognition by Pulse Coupled Neural Networks," Science & Military Journal; 2(1):24.

[53] X. Li, H. Zheng, C. Liu, (2013) "Face recognition scheme based on HSI-PCNN," Journal of Multimedia. Oct 1;8(5):573.

[54] M. Yonekawa and H. Kurokawa, (2012) "The content-based image retrieval using the pulse coupled neural network," In IEEE International Joint Conference on Neural Networks (IJCNN) Jun 10 (pp. 1-8).

[55] M. Maheswari, S. Geetha, (2019) "Adaptable and proficient Hellinger Coefficient Based Collaborative Filtering for recommendation system," Cluster Computing. Sep 1; 22(5):12325-38.

[56] Y. El Merabet, Y. Ruichek, S. Ghaffarian, Z. Samir, T. Boujiha, R. Touahni, R. Messoussi, A. Sbihi, (2017) "Hellinger Kernel-based Distance and Local Image Region Descriptors for Sky Region Detection from Fisheye Images," In VISIGRAPP (4: VISAPP) Feb 27 (pp. 419-427).

[57] G. H. Fu, Y. J. Wu, M. J. Zong, J. Pan, (2020) "Hellinger distance-based stable sparse feature selection for high-dimensional class-imbalanced data," BMC bioinformatics. Dec; 21:1-4.

[58] S. M. Taheri, H. Mahyar, M. Firouzi, E. Ghalebi, R. Grosu, A. Movaghar, (2017) "HellRank: a Hellinger-based centrality measure for bipartite social networks," Social Network Analysis and Mining. Dec 1; 7(1):22.

[59] J. Wang and N. Zheng, (2013) "A novel fractal image compression scheme with block classification and sorting based on Pearson's correlation coefficient," IEEE Transactions on Image processing. Jun 17; 22(9):3690-702.

[60] H. R. Ma and X. W. Cheng, (2014) "Automatic image segmentation with PCNN algorithm based on grayscale correlation," International Journal of Signal Processing, Image Processing and Pattern Recognition; 7(5):249-58.

[61] S. Wei, Q. Hong, M. Hou, (2011) "Automatic image segmentation based on PCNN with adaptive threshold time constant," Neurocomputing. Apr 1; 74(9):1485-91.

[62] H. Yanli, L. Bonian, L. Yingjie, W. Xiaofei, (2010) "Image Retrieval Using Pulse-Coupled Neural Networks and Correlation Coefficient," In 2010 2nd International Conference on Information Engineering and Computer Science.

[63] H. Yanli, L. Bonian, L. Yingjie, W. Xiaofei, (2010) "Image Retrieval Using Pulse-Coupled Neural Networks and Correlation Coefficient," In 2010 2nd International Conference on Information Engineering and Computer Science.

[64] S. He, D. Wang, Y. Nin, (2010) "Application of pulse coupled neural network in image recognition," In IEEE International Conference on Computing, Control and Industrial Engineering, Jun 5 (Vol. 2, pp. 415-419).