Image Classification and Retrieval logic
Uploaded by gianvito-siciliano
Category: Data & Analytics
IMAGE CLASSIFICATION PIPELINE
CF x IIF classification logic strategy
DONE
• Information Retrieval (query the dataset)
• TF-IDF
• Machine Learning (classify new instances)
• CF-IIF
• Technologies: OpenIMAJ / Spark
• Image Pipeline
• Classification logic
• Implementation in Java (with tests)
TO REVIEW
• KNN with a KD-tree
• Cosine similarity and other distance metrics
• Improve the CF-IIF extractor (move the logic to Spark)
• Hyperparameter tuning
• Reduce the feature space with SVD (choosing 0.01% of the clusters gives 6000+ features)
TODO
• Re-engineer the pipeline in Spark
• Test different SIFT features (pyramid?)
• Replace KNN with CNeuralNetworks (GraphLab / Deep4J)
IMAGE PIPELINE
Build a classifier based on a vocabulary of salient features created from the dataset:
1. Load an image dataset covering 3 distinct classes
2. Extract local features from each image
3. Create the codebook for classification
4. Train and test the model
[Diagram stages: LOADER, FEATURE EXTRACTOR, MODEL TEST, PREDICTION]
FEATURE EXTRACTION
• Feature extraction is where the interesting image processing happens, and it is the key part of the pipeline.
• A feature extractor produces a featureVector that represents an image in a vector space.
• Visual applications need robust features for classification and search.
WHY FEATURE EXTRACTION
• Image features are typically numerical vectors that can be used with ML techniques.
• FeatureVectors can be compared by measuring a distance.
• Grouping similar features is useful for reducing the dimensionality of the feature space.
• Indexing for Information Retrieval
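Comparing featureVectors by distance can be sketched as follows. This is a minimal illustration in plain Python, not the deck's Java code; the function names are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen here.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Distance form of cosine similarity: 0.0 for identical directions."""
    return 1.0 - cosine_similarity(a, b)
```

Euclidean distance is sensitive to vector length, while cosine similarity only looks at direction, which is often convenient when images contain different numbers of interest points.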
IMAGE FEATURES
Image features can be:
• GLOBAL: a single featureVector is extracted from the whole image
• GRID BASED: multiple featureVectors, one from each image block
• LOCAL: multiple featureVectors from interest points (which differ from image to image)
• SEGMENTED: multiple featureVectors from regions (which differ from image to image)
SIFT FEATURES
Scale-Invariant Feature Transform (SIFT) is an advanced technique for extracting local features from the interest points of an image; the features are invariant to rotation, lighting changes…
It builds on the idea of a local gradient histogram by adding spatial binning: in essence, it computes multiple gradient histograms around each interest point and concatenates them.
The standard SIFT geometry uses a 4x4 spatial grid of histograms with 8 orientation bins each.
This yields a 4 x 4 x 8 = 128-dimensional feature vector that is highly discriminative and robust.
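The 4x4 grid of 8-bin histograms can be illustrated with a deliberately simplified sketch. It is not real SIFT: it assumes a fixed 16x16 grayscale patch already centred on an interest point, and it skips scale selection, orientation normalisation, Gaussian weighting and bin interpolation. The function name and constants are hypothetical.

```python
import math

GRID = 4   # 4x4 spatial cells
BINS = 8   # 8 orientation bins per cell
CELL = 4   # each cell covers 4x4 pixels, so the patch is 16x16


def sift_like_descriptor(patch):
    """Concatenate 16 gradient-orientation histograms into a 128-d vector."""
    size = GRID * CELL
    if len(patch) != size or any(len(row) != size for row in patch):
        raise ValueError("expected a 16x16 patch")
    descriptor = [0.0] * (GRID * GRID * BINS)
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the patch border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * BINS) % BINS
            cell_index = (y // CELL) * GRID + (x // CELL)
            # Each pixel votes for its orientation bin, weighted by magnitude.
            descriptor[cell_index * BINS + bin_index] += magnitude
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor
```

Whatever the patch content, the result always has 4 x 4 x 8 = 128 entries, which is where the 128 dimensions come from.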
OPENIMAJ
• A set of Java image-processing libraries that includes many feature extractors and other utilities for visual applications:
• DoGSIFT
• DenseSIFT
• PyramidSIFT
• …
SPARK
• Spark is used to scale out the application: large datasets, high-dimensional features…
• MLlib
IMAGE PIPELINE
[Diagram: DATASET -> SIFT extractor -> k-means quantizer -> CF-IIF extractor -> KNN -> PREDICTIONS]
1) SIFT feature extraction
[Diagram: the SIFT extractor turns the dataset table (PATH, LABEL) into a table with an added siftFEAT column; the dimensions shown for siftFEAT are 128 and 6000+]
PATH    LABEL
img1    car
img3    bicycle
img17   car
2) Feature quantisation and reduction
[Diagram: the siftFEAT descriptors (dimensions shown: 128 and 6000+, across #images) are grouped by k-means into CLUSTERs of 128 dimensions, with a fixed K = 300]
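The quantisation step can be sketched with a small k-means (Lloyd's algorithm) in plain Python. This is an illustration only, not the deck's implementation: the function names, the seeded random initialisation and the iteration count are assumptions, and a real run would use 128-dimensional SIFT descriptors with K = 300 rather than the toy 2-d points shown in the usage lines.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the closest centroid, i.e. the cluster the point maps to."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, iterations=20, seed=0):
    """Return k centroids learned from the points (assumed initialisation:
    k distinct points drawn with a seeded random generator)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids


# Toy usage: six 2-d "descriptors" forming two obvious groups.
descriptors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook = kmeans(descriptors, k=2)
words = [nearest_centroid(d, codebook) for d in descriptors]
```

After this step every descriptor is replaced by the index of its cluster, so an image becomes a bag of cluster indices instead of a variable-size set of 128-d vectors.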
At this point we associate each image with a "cluster vector" (the analogue of the keyword vector for text):
Cw = <w1, w2, ..., wn>, n = |C|
For each image and each descriptor cluster j, the corresponding weight wj is the product of two factors:
• Cluster Frequency (CF): the fraction of that image's points that were mapped to cluster j
• Inverse Image Frequency (IIF): the logarithm of the ratio between the size of the database and the number of images that contain descriptors mapped to that cluster
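The weighting above, wj = CF x IIF, can be sketched as follows. This is a minimal illustration under stated assumptions: each image is given as the list of cluster indices its descriptors were mapped to, the logarithm is natural (the slides do not state a base), and the function name is hypothetical.

```python
import math
from collections import Counter


def cf_iif_vectors(image_words, num_clusters):
    """image_words holds one list of cluster indices per image.
    Returns one CF-IIF weight vector (length num_clusters) per image."""
    num_images = len(image_words)
    # Number of images in which each cluster appears at least once.
    image_frequency = Counter()
    for words in image_words:
        image_frequency.update(set(words))
    vectors = []
    for words in image_words:
        counts = Counter(words)
        total = len(words)
        vector = [0.0] * num_clusters
        for j, count in counts.items():
            cf = count / total                                # cluster frequency
            iif = math.log(num_images / image_frequency[j])   # inverse image frequency
            vector[j] = cf * iif
        vectors.append(vector)
    return vectors


# Toy usage: 3 images, 3 clusters; cluster 1 appears in every image.
vectors = cf_iif_vectors([[0, 0, 1], [1, 2], [1, 2, 2, 2]], num_clusters=3)
```

In the toy data cluster 1 occurs in all three images, so its IIF is log(3/3) = 0 and it contributes nothing to any vector, while cluster 0 occurs in a single image and gets the largest IIF.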
The logic is borrowed from TF-IDF, which is very common in text retrieval.
A cluster becomes discriminative if it contains few images!
Feature space: we move from SIFT descriptors to cluster vectors.
[Diagram: the custom extractor (myEXTR) combines the (PATH, LABEL, siftFEAT) table, whose descriptors have 128 dimensions, with the CLUSTER table and outputs a (PATH, LABEL, cfiifFEAT) table]
3) Feature transformation
[Diagram: the custom extractor outputs a (PATH, LABEL, myFEAT) table, e.g. img1 bicycle, img3 car, where myFEAT is a matrix of size #images x #clusters]
Cell wj of the cluster vector of image A represents the weight of image A in cluster j.
All images are now represented by their CF-IIF vectors!
This is our own vocabulary!
4) Train a KNN classifier, learn the model, and test
[Diagram: the cfiifFEAT (myFEAT) vectors are split into TRAIN and TEST sets; the KNN model is learned on TRAIN, whose labels are car, bicycle and motorbike]
For a TEST image, KNN finds the most similar TRAIN vector ("This is the most similar!") and returns its label as the prediction, for example car.
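The KNN step can be sketched as a brute-force search over the training vectors. This is an illustration, not the deck's OpenIMAJ or Spark code: cosine similarity is assumed as the similarity measure, ties in the vote go to the label seen first among the neighbours, and the function names and toy 3-d vectors are hypothetical stand-ins for real CF-IIF vectors.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train_vectors, train_labels, query, k=1):
    """Label voted by the k training vectors most similar to the query."""
    ranked = sorted(range(len(train_vectors)),
                    key=lambda i: cosine_similarity(query, train_vectors[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy usage with made-up 3-d vectors and the labels from the slides.
train_vectors = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0, 1]]
train_labels = ["car", "car", "bicycle", "motorbike"]
prediction = knn_predict(train_vectors, train_labels, [0.8, 0.2, 0], k=3)
```

With k = 1 this reduces to returning the label of the single most similar training image, which is the case the slides depict.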
IMAGE PIPELINES: PAIN POINTS
• Image feature extraction: find interest points and extract discriminative and robust features
• OPENIMAJ: multiple algorithms
• Learn large codebooks from features
• Train the model (KNN)
• SPARK: scalable models (3 days to train a KNN model on 15 images with OpenIMAJ)
• SPARK: multiple models (Bayes models, neural networks)
• Efficient nearest-neighbour search (test)
• KDTree
• Hyperparameter tuning (used in our pipeline for k-means, CF-IIF and KNN)
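The KD-tree idea for faster nearest-neighbour search can be sketched as below. This is a generic textbook-style version with Euclidean distance and hypothetical function names, not the deck's code; note that KD-trees tend to lose their advantage over brute force as the number of dimensions grows, so the benefit on vectors with hundreds of dimensions should be measured rather than assumed.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Recursively split the points on one axis per level, at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return the stored point closest to the query."""
    if node is None:
        return best
    if best is None or (squared_distance(query, node["point"])
                        < squared_distance(query, best)):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that subtree cannot hold a winner.
    if diff ** 2 < squared_distance(query, best):
        best = nearest(far, query, best)
    return best


# Toy usage on 2-d points.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest(tree, (9, 2))
```

The pruning test in `nearest` is what saves work compared with scanning every training vector.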
OPENIMAJ++
• Image features can be used to match music!
• Extractors can be used to find objects (face detection)!