Paper ID: 1570186597
imPlag: Detecting Image Plagiarism Using Hierarchical Near Duplicate Retrieval
Siddharth Srivastava, Prerana Mukherjee, Brejesh Lall
Indian Institute of Technology, Delhi
IEEE India Council
Department of Electrical Engineering
Faculty of Engineering & Technology
Jamia Millia Islamia, New Delhi, India
Introduction
The key characteristic of image plagiarism is that it may involve the reproduction of the original image using an entirely different mode, such as hand-made sketches. Image plagiarism can be posed as a superset of image copy detection problems.
Fig. 1. (a) Original Image (b) Plagiarised image (reproduction of the source image) (c) Copied image (considered as strong attack by copy detection algorithms but an expected case for Image Plagiarism)
Problems?
• Detection of similar images – Huge Databases, Interactive Time
• Plagiarism brings in innovation
• Hence involves both Research and Engineering Challenges
- Stitched from 3888 images
- One column/row pixel from each image
So knowing your limits is necessary
Image courtesy: Eirik Solheim (image has been used for demonstrating the extent of deformation possible in images)
KEY CONTRIBUTIONS
Development of a hierarchical feature extraction and feature indexing technique.
Evaluation of recent feature extraction techniques against simple, moderate and extreme deformations.
Dataset construction for testing image plagiarism algorithms.
Dataset
• Natural Images – mountains, rivers, animals, birds etc.
• Actual scenario – too many images can be similar but might not be plagiarized (synthetically transformed)
• So for evaluation, a dataset was created, since detecting image plagiarism is not just Content Based Image Retrieval
• Search for images on Flickr and in the ukbench dataset
• Find similar images using Google Reverse Image Search (Google doesn't index Flickr!)
• Transformed Images – Affine, Grayscale, Color channel separation etc. (30 transformations)
Methodology
Pipeline stages: Hierarchical Feature Extraction → Feature Indexing → Search Query → NN Match/Exact Matching → Ranking or Verification
• Fingerprint the image: Perceptual Hash; SIFT > SURF, ORB, FREAK, PCA-SIFT
• Store for retrieval: Database, Apache Lucene, Locality Sensitive Hashing
• Search the index: search the LSH index
• Relevant results ranked at the top: Bag of Visual Words histogram matching
Hierarchical Indexing
[Diagram: hierarchical indexing, with components Images, Perceptual Hash, Locality Sensitive Hashing, SIFT Features, Bag of Visual Words, Lucene Index, and Traditional Database]
Layered Retrieval
Input image → Calculate perceptual hash of input image → Search LSH index for nearest neighbours → Rank images based on BoVW histogram matching
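The layered retrieval flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: a linear scan over stored hashes stands in for the LSH lookup, histogram intersection is assumed as the BoVW matching measure (the slides do not name one), and `layered_retrieval`, `max_distance`, and the in-memory `index` layout are hypothetical names, not the authors' implementation.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def histogram_intersection(h1, h2):
    """Similarity of two normalised histograms (1.0 means identical)."""
    return sum(min(x, y) for x, y in zip(h1, h2))


def layered_retrieval(query_hash, query_hist, index, max_distance=10, top_n=60):
    """Return up to top_n image ids, best match first.

    index maps image_id -> (perceptual_hash, bovw_histogram).
    max_distance is an assumed Hamming threshold, not a value from the slides.
    """
    # Layer 1: keep only images whose perceptual hash is close to the query.
    # A linear scan stands in for the LSH index lookup here.
    candidates = [
        (image_id, hist)
        for image_id, (phash, hist) in index.items()
        if hamming(query_hash, phash) <= max_distance
    ]
    # Layer 2: rank the surviving candidates by BoVW histogram similarity.
    candidates.sort(
        key=lambda item: histogram_intersection(query_hist, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in candidates[:top_n]]
```

The cheap hash comparison prunes most of the collection, so the more expensive histogram comparison only runs on a short candidate list.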
Perceptual Hash
• Can be used for multimedia content (audio, video, images)
• Similar images have similar hash values
Steps, in order:
• Scale image to 32x32
• Convert to grayscale
• Compute DCT
• Keep first 8x8 coefficients
• Take average (no DC term)
• Coeff > Avg => 1, Coeff <= Avg => 0
• Flatten to 64-bit vector
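A minimal pure-Python sketch of these steps is given below. It assumes the image arrives as rows of `(r, g, b)` tuples, uses nearest-neighbour sampling as a stand-in for a proper resampler, and uses BT.601 luminance weights for the grayscale conversion; all of these choices, and the function names, are illustrative assumptions rather than details from the slides.

```python
import math


def resize_nearest(rows, size=32):
    """Nearest-neighbour resize to size x size (stand-in for a real resampler)."""
    h, w = len(rows), len(rows[0])
    return [[rows[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def to_grayscale(rgb_rows):
    """Luminance from (r, g, b) tuples using BT.601 weights (an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]


def dct_1d(values):
    """Orthonormal DCT-II of a sequence."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(s * scale)
    return out


def dct_2d(block):
    """Separable 2D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def perceptual_hash(rgb_rows, size=32, keep=8):
    """Return the 64-bit hash as a Python int."""
    gray = to_grayscale(resize_nearest(rgb_rows, size))
    coeffs = dct_2d(gray)
    # Keep the low-frequency keep x keep corner; index 0 is the DC term.
    low = [coeffs[v][u] for v in range(keep) for u in range(keep)]
    avg = sum(low[1:]) / (len(low) - 1)  # average without the DC term
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > avg else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two hashes are then compared with `hamming`; a small distance suggests perceptually similar images, since a uniform brightness shift, for example, mostly affects the DC term, which is left out of the average.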
Bag of Visual Words
• SIFT features converted to Bag of Visual Words
• More efficient than direct keypoint matching
• Observations:
• Large vocabulary size may increase false negatives
• Small vocabulary size may increase false positives
• There is no definite pattern, though, for what the vocabulary size should be
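To make the conversion concrete, the sketch below quantises descriptors against a visual vocabulary and compares the resulting histograms. It assumes the vocabulary (cluster centres, typically obtained by k-means over training descriptors) already exists, and it uses histogram intersection as the matching measure, which the slides do not specify; the function names are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for i, word in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bovw_histogram(descriptors, vocabulary):
    """Normalised histogram of visual-word counts for one image."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalised histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_candidates(query_hist, candidates):
    """Sort candidate names by similarity to the query, best first.

    candidates maps a name to its BoVW histogram.
    """
    return sorted(
        candidates,
        key=lambda name: histogram_intersection(query_hist, candidates[name]),
        reverse=True,
    )
```

The vocabulary size is simply `len(vocabulary)`: with very few words, unrelated descriptors fall into the same bin (more false positives), while with very many words, slightly perturbed descriptors land in different bins (more false negatives), which matches the observations above.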
Results
• Accuracy: 81%
• Scalability
[Plot: time (sec) vs. number of images in the dataset, for top-60 results; time axis 0 to 1 sec, dataset axis 0 to 7000 images]
Conclusion
• We perform evaluations to choose best criteria and techniques for detecting image plagiarism.
• A method is proposed that combines perceptual hashing and SIFT with a hierarchical approximate matching scheme.
• This scheme was able to balance the tradeoff between time and accuracy.
THANK YOU
Appendix: Dataset Images
Perceptually similar?
Appendix: Nature is not always greenish
Appendix: Accuracy
[Plot: accuracy (%) vs. number of results (top-N); accuracy axis 0 to 100, top-N values 60, 100, 150, 200, 250]
Appendix: Results
Fig 2. Comparison of feature matching techniques
Fig 3. Average time taken by SIFT, SURF and Perceptual Hash (PH)
Appendix: Results
Fig 4. Comparison of ranked retrieval
Fig 5. Ranked vs. non-ranked retrieval
Appendix: Results
Fig 6. Time vs. number of results
Fig 7. Time vs. number of images in the dataset
Appendix: Results
Fig 8. Lucene vs. database retrieval time
Locality Sensitive Hashing
• Similar features are hashed to the same hash values with high probability
• Parameters:
• Number of bits (k)
• Number of tables (l)
• Maximum bucket capacity (usually unlimited)
• Empirical analysis is needed to determine the parameters for a given dataset
• Varying the number of bits varies the bucket size (a small hash gives more collisions, and vice versa)
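The role of k and l can be illustrated with a small bit-sampling LSH over 64-bit hashes, in the spirit of hashing schemes for Hamming space. This is a sketch under assumptions: the class name `BitSamplingLSH`, the default values of `k` and `l`, and the choice of bit sampling as the hash family are illustrative and not taken from the slides. Buckets are plain lists, so bucket capacity is unlimited.

```python
import random
from collections import defaultdict


class BitSamplingLSH:
    """LSH for Hamming space: each table keys items on k sampled bit positions."""

    def __init__(self, k=16, l=8, n_bits=64, seed=0):
        rng = random.Random(seed)
        # One random set of k distinct bit positions per table.
        self.projections = [rng.sample(range(n_bits), k) for _ in range(l)]
        self.tables = [defaultdict(list) for _ in range(l)]

    @staticmethod
    def _key(value, positions):
        return tuple((value >> p) & 1 for p in positions)

    def add(self, item_id, value):
        """Insert an item into every table."""
        for positions, table in zip(self.projections, self.tables):
            table[self._key(value, positions)].append((item_id, value))

    def query(self, value, max_results=10):
        """Return (item_id, hamming_distance) pairs, nearest first."""
        found = {}
        for positions, table in zip(self.projections, self.tables):
            for item_id, stored in table.get(self._key(value, positions), []):
                found[item_id] = bin(stored ^ value).count("1")
        return sorted(found.items(), key=lambda kv: kv[1])[:max_results]
```

A smaller k means each key covers fewer bits, so more items share a bucket (more collisions, larger candidate lists); a larger l adds tables and raises the chance that a true near neighbour collides with the query in at least one of them.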
Lucene
• Very efficient in document indexing and retrieval
• Bag of Visual words histograms are indexed
• Allows for random access of documents
• Histograms are fetched from Lucene index and ranked (Filtering)