Image Classification and Retrieval logic

IMAGE CLASSIFICATION PIPELINE

CF x IIF classification logic strategy

DONE

• Information Retrieval (query the dataset)

• TF-IDF

• Machine Learning (classify new instances)

• CF-IIF

• Technologies: OpenIMAJ / Spark

• Image Pipeline

• Classification logic

• Implementation in Java (and tests)

TO REVIEW

• KNN with KD-tree

• Cosine similarities and distance metrics

• Improve the cf-iif extractor (logic in Spark)

• Hyperparameter tuning

• Reduce the feature space: SVD (choosing 0.01% of clusters gives 6000+ features)

TODO

• Re-engineering in spark

• Test different SIFT features (pyramid?)

• Replace KNN with CNeuralNetworks

• (GraphLab/Deep4J)

[Diagram: pipeline stages LOADER, FEATURE EXTRACTOR, MODEL TEST, PREDICTION]

Build a classifier based on a salient-feature vocabulary created from the dataset:

1. Load an image dataset covering 3 distinct classes

2. Extract local features from each image

3. Create the codebook for classification

4. Train and test the model

IMAGE PIPELINE

FEATURES EXTRACTION

• Feature extraction is where the interesting image processing happens, and it is the key part of the pipeline.

• Feature extractors produce a featureVector that represents an image in a vector space.

• In visual application systems, you need robust features for classification and search.

[Diagram: pipeline stages IMAGES, FEATURE EXTRACTOR, MODEL TEST, PREDICTION]

WHY FEATURE EXTRACTION

• Typically, image features are numerical vectors that can be used with ML techniques.

• FeatureVectors can be compared by measuring a distance

• It is useful to group similar features and reduce the dimensionality of the space

• Indexing for Information Retrieval
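Comparing featureVectors by distance can be sketched as follows. This is a minimal illustration in plain Python, not the deck's Java/OpenIMAJ code; the function names and the toy vectors are assumptions made for the example.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy retrieval: find the stored vector closest to the query (hypothetical data).
query = [1.0, 0.0, 2.0]
candidates = {"img1": [1.0, 0.0, 2.0], "img3": [0.0, 3.0, 0.0]}
nearest = min(candidates, key=lambda name: euclidean_distance(query, candidates[name]))
```

Euclidean distance is sensitive to vector length, while cosine similarity only looks at direction; which one suits the pipeline better is a tuning question.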


IMAGE FEATURES

Image features can be:

• GLOBAL: a single featureVector per image

• GRID BASED: multiple featureVectors, one for each image block

• LOCAL: multiple featureVectors from interest points (different for each image)

• SEGMENTED: multiple featureVectors from regions (different for each image)

SIFT FEATURES

Scale-Invariant Feature Transform (SIFT) is an advanced technique for extracting local features from the interest points of an image. The features are invariant to rotation, lighting changes…

It builds on the idea of a local gradient histogram by adding spatial binning: in essence, it computes multiple gradient histograms around each interest point and concatenates them.

Standard SIFT geometry uses a 4x4 spatial grid of histograms with 8 orientation bins each.

This yields a 128-dimensional feature vector (4 x 4 x 8 = 128) that is highly discriminative and robust.
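The 4 x 4 x 8 layout can be illustrated with a deliberately simplified sketch. It only shows how spatial binning and orientation binning combine into 128 values; it omits the Gaussian weighting, interpolation and rotation normalisation of real SIFT, and the 16x16 patch size and all names are assumptions of the example.

```python
import math

GRID = 4    # 4x4 spatial cells
BINS = 8    # 8 orientation bins per cell
PATCH = 16  # assumed patch side in pixels, so each cell covers 4x4 pixels

def sift_like_descriptor(patch):
    """patch: PATCH x PATCH grid of grey levels. Returns a 128-value list."""
    cell = PATCH // GRID
    hist = [0.0] * (GRID * GRID * BINS)
    # Border pixels are skipped because central differences need both neighbours.
    for y in range(1, PATCH - 1):
        for x in range(1, PATCH - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            orientation_bin = int(angle / (2 * math.pi) * BINS) % BINS
            cell_index = (y // cell) * GRID + (x // cell)
            hist[cell_index * BINS + orientation_bin] += magnitude
    # L2-normalise so the descriptor is less sensitive to contrast changes.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# A horizontal intensity ramp: every gradient points along +x (orientation bin 0).
ramp = [[x for x in range(PATCH)] for _ in range(PATCH)]
descriptor = sift_like_descriptor(ramp)
```

For the ramp, all the energy ends up in the first orientation bin of each of the 16 cells, which makes the block structure of the vector easy to inspect.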

OPENIMAJ

• A set of Java image-processing libraries that includes many feature extractors and other utilities for visual applications:

• DoGSIFT

• DenseSIFT

• PyramidSIFT

• …

SPARK

• Spark is used to scale out the application: big datasets, high-dimensional features…

• MLlib

IMAGE PIPELINE

[Diagram: pipeline stages DATASET, sift EXTRACTOR, KMEANS QUANTIZER, cf-iif EXTRACTOR, KNN, PREDICTIONS]

1) SIFT features extraction

[Table: columns PATH, LABEL; rows img1 (car), img3 (bicycle), img17 (car)]

[Table after extraction: columns PATH, LABEL, siftFEAT; rows img1 (car), img3 (bicycle); siftFEAT dimensions labelled 128 and 6000+]

2) Features quantisation

[Diagram: the siftFEAT matrix (dimensions labelled 128, 6000+ and #images) is quantised into a CLUSTER matrix of 128-dimensional centroids]

2) Features reduction

[Diagram: the siftFEAT matrix (dimensions labelled 128, 6000+ and #images) is reduced to a CLUSTER matrix of 128-dimensional centroids, with fixed K = 300]
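The quantisation step, clustering the SIFT descriptors with k-means and mapping each descriptor to its nearest centroid, can be sketched as follows. This is a plain-Python illustration on 2-dimensional toy points with k = 2, whereas the slides use 128-dimensional descriptors and K = 300; the function names, the iteration count and the seed are assumptions of the example.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(point, centroids):
    """Index of the centroid closest to the point (the point's 'visual word')."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_index(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

# Toy descriptors forming two obvious groups (hypothetical data).
descriptors = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
codebook = kmeans(descriptors, k=2)
words = [nearest_index(d, codebook) for d in descriptors]
```

The `codebook` plays the role of the CLUSTER matrix, and `words` is the per-descriptor cluster assignment that the cf-iif extractor consumes.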

At this point we associate each image with a "cluster vector" (the analogue of the keyword vector for text)

Cw = <w1, w2, ..., wn>, n = |C|

For each image and each descriptor cluster j, the corresponding weight wj is the product of two factors:

• Cluster Frequency: the percentage of that image's points that were mapped to cluster j

• Inverse Image Frequency: the logarithm of the ratio between the cardinality of the database and the number of images that contain descriptors mapped to that cluster
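The two factors can be combined as in the sketch below. It is an illustration in plain Python rather than the author's Java or Spark extractor; it assumes the natural logarithm (the slides do not state the base), expresses the Cluster Frequency as a fraction rather than a percentage, and uses hypothetical names and toy data.

```python
import math
from collections import Counter

def cf_iif_vectors(image_words, n_clusters):
    """image_words: {image path: list of cluster indices, one per descriptor}.

    Returns {image path: cluster vector <w1, ..., wn>} with wj = CF * IIF.
    """
    n_images = len(image_words)
    # For each cluster, count the images that contain at least one of its descriptors.
    image_freq = Counter()
    for words in image_words.values():
        image_freq.update(set(words))
    vectors = {}
    for path, words in image_words.items():
        counts = Counter(words)
        vector = []
        for j in range(n_clusters):
            cf = counts[j] / len(words) if words else 0.0
            iif = math.log(n_images / image_freq[j]) if image_freq[j] else 0.0
            vector.append(cf * iif)
        vectors[path] = vector
    return vectors

# Toy input: cluster indices of the descriptors found in each image.
image_words = {
    "img1": [0, 0, 1, 2],
    "img3": [0, 1, 1, 1],
    "img17": [0, 2, 2, 2],
}
vectors = cf_iif_vectors(image_words, n_clusters=3)
```

Cluster 0 occurs in every image, so its Inverse Image Frequency is log(3/3) = 0 and it contributes nothing to any vector, just as a word that appears in every document carries no weight in TF-IDF.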


The logic is borrowed from TF-IDF, which is very common in text retrieval.

A cluster becomes discriminative if it contains few images!

Feature space: we move from the SIFT descriptors to the cluster vectors

[Diagram: the table (PATH, LABEL, siftFEAT; rows "img can", "img gatt"; 128-dimensional descriptors) and the CLUSTER matrix (labelled "V images") go through myEXTR, which outputs the table (PATH, LABEL, cfiifFEAT; rows "img can", "img gatt")]

3) Features transformation

[Table: columns PATH, LABEL, myFEAT; rows img1 (bicycle), img3 (car); myFEAT dimensions labelled #clusters and #images]

The cell wj of image A's cluster vector represents its weight in cluster j

All images are now represented by their cf-iif vectors!

This is our own vocabulary!

4) Train a KNN classifier

[Diagram: the cfiifFEAT table is split into TRAIN and TEST; the KNN model is learned from TRAIN]

4) Test

[Diagram: the myFEAT vector of a TEST image is passed to KNN and compared with the TRAIN vectors, whose LABEL values are car, bicycle and motorbike]

This is the most similar!

[Diagram: predicted LABEL: car]
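The train and test steps with KNN can be sketched as follows. This is a brute-force illustration in plain Python on made-up 3-dimensional vectors, not the OpenIMAJ or Spark implementation; the value k = 3, the names and the data are assumptions of the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """train: list of (feature vector, label). Majority label among the k nearest."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# "Training" a KNN model only means storing the labelled vectors (hypothetical data).
train = [
    ([0.9, 0.1, 0.0], "car"),
    ([0.8, 0.2, 0.1], "car"),
    ([0.1, 0.9, 0.0], "bicycle"),
    ([0.2, 0.8, 0.1], "bicycle"),
    ([0.0, 0.1, 0.9], "motorbike"),
    ([0.1, 0.0, 0.8], "motorbike"),
]
test_set = [([0.85, 0.15, 0.05], "car"), ([0.05, 0.05, 0.95], "motorbike")]

predictions = [knn_predict(train, vector, k=3) for vector, _ in test_set]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test_set)) / len(test_set)
```

Each prediction scans the whole training set, which is why efficient nearest-neighbour search becomes a concern as the dataset grows.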

IMAGE PIPELINES

• Image feature extraction: find interest points and extract discriminative and robust features

• OPENIMAJ: multiple algorithms

• Learn large codebooks from features

• Train the model (KNN)

• SPARK: scalable models (3 days to train a KNN model on 15 images with OpenIMAJ)

• SPARK: multiple models (Bayes models, neural networks)

PAIN POINTS

• Efficient Nearest Neighbour Search (test)

• KDTree

• Hyperparameter tuning (in our pipeline, used for k-means, CF-IIF and KNN)
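A KD-tree for nearest-neighbour search can be sketched as follows. This is a minimal plain-Python illustration on 2-dimensional toy points, not the OpenIMAJ implementation; the dictionary-based node layout and all names are assumptions of the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, depth=0):
    """points: list of (vector, label). Returns a nested dict node, or None if empty."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2            # the median point splits the space
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Returns the (vector, label) pair closest to the query."""
    if node is None:
        return best
    vector, _ = node["point"]
    if best is None or squared_distance(vector, query) < squared_distance(best[0], query):
        best = node["point"]
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if diff ** 2 < squared_distance(best[0], query):
        best = nearest(far, query, best)
    return best

points = [([2.0, 3.0], "car"), ([5.0, 4.0], "bicycle"), ([9.0, 6.0], "car"),
          ([4.0, 7.0], "motorbike"), ([8.0, 1.0], "bicycle"), ([7.0, 2.0], "motorbike")]
tree = build_kdtree(points)
match = nearest(tree, [9.0, 2.0])
```

The pruning test is what saves work compared with a full scan. In very high-dimensional spaces the test rarely prunes anything, so the benefit on cf-iif vectors with hundreds of dimensions would need to be measured.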

OPENIMAJ++

• Image features can be used to match music!

• Extractors can be used to find objects! (Face Detection)

https://github.com/gianvi

Thanks!
