DeviantART Analysis using Image Features


Page 1: DeviantART Analysis using Image Features

Image Analysis using Image Features

1

DeviantART Analysis using Image Features

Bart Buter, Davide Modolo, Sander van Noort, Nick Dijkshoorn, Quang Nguyen, Bart van de Poel

Page 2: DeviantART Analysis using Image Features

Profile Project

• Our project was exploratory research into the artists and images of deviantART, a very large online art community

• The research touched several fields:
– Visualization (implementation of a toolkit)
– Data collection
– Feature extraction (statistical and cognitively-inspired)
– Classification
– Network analysis

Page 3: DeviantART Analysis using Image Features

Overview

• Introduction
• Toolkit
• Experiments & Results
• Future work
• Conclusion

Page 4: DeviantART Analysis using Image Features

Introduction - deviantART

• deviantART (dA) is the largest online community showcasing various forms of user-made artwork
• 13 million registered members (called Deviants)
• Allows emerging and established artists to exhibit, promote, and share their works
• All artwork is well organized (comprehensive category structure)
– From traditional media (painting and sculpture) to digital art, pixel art, films and anime

Page 5: DeviantART Analysis using Image Features

Research questions

• Can we visualize important aspects of deviantART?
• Can artists and/or styles be distinguished?
• Are artists influencing each other?
• Do art styles change over time?
• Are there non-artists who are interesting for deviantART?

Page 6: DeviantART Analysis using Image Features

Toolkit

• General tool to answer research questions about social art communities (deviantART)

• 4 Components

Page 7: DeviantART Analysis using Image Features

Data collection from deviantART

• Network of “professional” artists
– Download artists’ names and their watchers
– Output for Pajek and the Matlab graph toolbox

• Artists’ images and information about these images
– Download users’ galleries as the dataset
– No web API; instead follow backend links
– Parse RSS XML files and download images

Page 8: DeviantART Analysis using Image Features

Data collection

• For each image, store an XML file

Example:
<?xml version="1.0"?>
<root xml_tb_version="3.1">
  <guid>http://catluvr2.deviantart.com/art/42-Journals-73664427</guid>
  <title>-42 Journals</title>
  <category>customization/screenshots/other</category>
  <filename>_42_Journals_by_catluvr2.jpg</filename>
</root>
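A minimal sketch of how such a per-image XML file could be read back, using Python's standard `xml.etree.ElementTree`. The function name `read_image_record` and the derived `top_category` key are illustrative assumptions, not part of the toolkit.

```python
import xml.etree.ElementTree as ET

# The per-image record from the slide, as one string.
EXAMPLE = """<?xml version="1.0"?>
<root xml_tb_version="3.1">
  <guid>http://catluvr2.deviantart.com/art/42-Journals-73664427</guid>
  <title>-42 Journals</title>
  <category>customization/screenshots/other</category>
  <filename>_42_Journals_by_catluvr2.jpg</filename>
</root>"""


def read_image_record(xml_text):
    """Return the child elements of <root> as a flat dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    record = {child.tag: (child.text or "").strip() for child in root}
    # The first path component of the category is the top-level category.
    record["top_category"] = record.get("category", "").split("/")[0]
    return record


record = read_image_record(EXAMPLE)
print(record["filename"], record["top_category"])
```

Counting `top_category` over all records is one way a table of top categories could be produced.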

Page 9: DeviantART Analysis using Image Features

Dataset information

• Galleries of 31 users downloaded
• About 5000 images
• Daily Deviations of a random day

• Top categories:
– photography: 2244
– customization: 906
– traditional: 842
– digitalart: 587
– fanart: 239

Page 10: DeviantART Analysis using Image Features

Feature extraction

• Why we need features
– Can’t visualize sets of images in high-dimensional space
– Features can be intuitive for toolkit users
– Easier to work with than raw data (classification)

• Kinds of features:
– Statistical features
– Cognitively-inspired features

Page 11: DeviantART Analysis using Image Features

Feature format

• Store features in XML files
– One XML file per image describing all features
– Easy to add new features to existing images
– Easy to add images

• Only calculate features that are not already present in the XML file

• Add those features to the XML file of the image
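The "only calculate what is missing" rule can be sketched as below. This is an illustration under assumptions: the helper name `add_missing_features`, the tag name `avgIntensity`, and the idea of passing extractors as a dict of functions are hypothetical; `intVariance` is a feature name that appears later in the results.

```python
import xml.etree.ElementTree as ET


def add_missing_features(root, extractors, image):
    """Compute only features whose tag is absent from the image's XML root.

    `extractors` maps a feature tag to a function of the image (assumed layout).
    Returns the list of tags that were newly added.
    """
    added = []
    for name, extractor in extractors.items():
        if root.find(name) is None:          # feature not stored yet
            node = ET.SubElement(root, name)
            node.text = str(extractor(image))
            added.append(name)
    return added


# Usage: avgIntensity is already stored, so only intVariance is computed.
root = ET.fromstring("<root><avgIntensity>10.0</avgIntensity></root>")
image = [[8, 12], [9, 11]]
extractors = {
    "avgIntensity": lambda img: sum(map(sum, img)) / 4,
    "intVariance": lambda img: 2.5,
}
print(add_missing_features(root, extractors, image))
```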

Page 12: DeviantART Analysis using Image Features

Statistical features

• Low level & understandable features
– RGB values (average, median)
– Hue, Saturation & Intensity values (average, median)
– Edge-pixel ratio
– Corner-pixel ratio
– Entropy of the intensity
– Variance of the intensity
– Compositional features

Page 13: DeviantART Analysis using Image Features

Edge-pixel ratio

[Figure: two example images]
Ratio: 0.0094
Ratio: 0.0998
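As an illustration of an edge-pixel ratio, the sketch below marks a pixel as an edge when a simple forward-difference gradient exceeds a threshold, then divides by the pixel count. The slides do not say which edge detector was used, so the gradient rule, the `threshold` value and the function name are assumptions.

```python
def edge_pixel_ratio(gray, threshold=50):
    """Fraction of pixels whose forward-difference gradient exceeds `threshold`.

    `gray` is a list of rows of intensities (0-255). The detector here is a
    stand-in; a real toolkit might use Canny or Sobel instead.
    """
    h, w = len(gray), len(gray[0])
    edges = 0
    for y in range(h):
        for x in range(w):
            gx = gray[y][min(x + 1, w - 1)] - gray[y][x]   # horizontal change
            gy = gray[min(y + 1, h - 1)][x] - gray[y][x]   # vertical change
            if abs(gx) + abs(gy) > threshold:
                edges += 1
    return edges / (h * w)


# A 4x4 image with one vertical step: one edge pixel per row.
step = [[0, 0, 255, 255] for _ in range(4)]
print(edge_pixel_ratio(step))
```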

Page 14: DeviantART Analysis using Image Features

Average of the intensity

[Figure: three example images]
AvgIntensity: 21.90
AvgIntensity: 243.67
AvgIntensity: 123.96

Page 15: DeviantART Analysis using Image Features

Entropy of the intensity

[Figure: two example images]
Intensity entropy: 1.5408
Intensity entropy: 7.8799
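A sketch of the intensity entropy as Shannon entropy of the grey-level histogram. Base-2 logarithms are an assumption; they would give a maximum of 8 bits for 256 grey levels, which is consistent with the example values on the slide staying below 8.

```python
import math
from collections import Counter


def intensity_entropy(gray):
    """Shannon entropy (in bits) of the intensity histogram of `gray`."""
    pixels = [p for row in gray for p in row]
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A flat image carries no information; two equally frequent levels give 1 bit.
flat = [[128, 128], [128, 128]]
two_level = [[0, 255], [255, 0]]
print(intensity_entropy(flat), intensity_entropy(two_level))
```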

Page 16: DeviantART Analysis using Image Features

Variance of the intensity

[Figure: two example images]
Intensity variance: 506
Intensity variance: 14676

Page 17: DeviantART Analysis using Image Features

Compositional edge-pixel ratio
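Compositional features compute a statistic per image region rather than over the whole image. The sketch below splits an image into a grid and returns the mean intensity per cell. The 3x3 grid and the row-major cell order are assumptions, suggested only by region names such as "Center" and "lower right corner" in the results; the same loop could wrap an edge-pixel ratio instead of a mean.

```python
def grid_cell_means(gray, rows=3, cols=3):
    """Mean intensity of each grid cell, in row-major order.

    Assumes the image has at least `rows` x `cols` pixels.
    """
    h, w = len(gray), len(gray[0])
    means = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    return means


# A 6x6 image whose centre 2x2 block is bright: only the centre cell is non-zero.
img = [[0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        img[y][x] = 200
print(grid_cell_means(img))
```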

Page 18: DeviantART Analysis using Image Features

Hue and Saturation

Page 19: DeviantART Analysis using Image Features

Weibull-Distribution Image Contrast

• Why

Page 20: DeviantART Analysis using Image Features

Cognitively-inspired features

Model of Saliency-Based Visual Attention

• Attention appears to influence visual processing even in the earliest areas of the primate visual cortex

• This influence seems to shape an integrated saliency map

• This map is a representation of the environment that weighs every input by its local feature contrast and its current behavioral relevance

• It enables the visual system to integrate a large amount of information

Page 21: DeviantART Analysis using Image Features

Itti, Koch and Niebur’s Model

Page 22: DeviantART Analysis using Image Features

Example of saliency map

[Figure: original image; conspicuity maps for color, orientation, intensity and the extra skin channel; resulting saliency map]

Page 23: DeviantART Analysis using Image Features

What do we have

• Important visual features about the style of this photo:
– The subject of the portrait is not exactly in the middle
– The subject is a human
– The subject is standing still
– Colors are quite uniform, and there are not many of them

But how can all the different maps be used to represent this information?

[Figure: saliency map, skin map, color map, orientation map, intensity map]

Page 24: DeviantART Analysis using Image Features

Cognitively-inspired features (1)

• Shannon entropy of the 5 different maps (the saliency and the conspicuity ones)

• Standard deviation of the saliency distribution in the saliency map

• Location of the three most salient points

• Skin intensity

Page 25: DeviantART Analysis using Image Features

Cognitively-inspired features (2)

• Location has been computed using the Inhibition Of Return (IOR) procedure:

[Figure: original saliency map; after the first inhibition; after the second inhibition; the 3 most salient locations]
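The inhibition-of-return idea can be sketched as: pick the maximum of the saliency map, suppress its neighbourhood, and repeat. The square neighbourhood, the `radius` parameter and the function name are simplifying assumptions; the original model's suppression region may be shaped differently.

```python
def most_salient_locations(saliency, count=3, radius=1):
    """Return `count` (row, col) peaks, suppressing a square around each pick."""
    sal = [row[:] for row in saliency]          # work on a copy
    h, w = len(sal), len(sal[0])
    found = []
    for _ in range(count):
        y, x = max(
            ((yy, xx) for yy in range(h) for xx in range(w)),
            key=lambda p: sal[p[0]][p[1]],
        )
        found.append((y, x))
        # Inhibition of return: the attended region cannot win again.
        for yy in range(max(0, y - radius), min(h, y + radius + 1)):
            for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                sal[yy][xx] = float("-inf")
    return found


demo = [
    [9, 8, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 5],
    [0, 0, 0, 0, 0],
    [6, 0, 0, 0, 7],
]
print(most_salient_locations(demo))
```

In the demo, the value 8 next to the first peak is never selected because it lies inside the suppressed region.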

Page 26: DeviantART Analysis using Image Features

Cognitively-inspired features (3)

• Skin is an extra channel (not standard in Itti’s model), but it has proved very informative
• It can easily be used to detect nude images (which are quite popular among deviantART’s professional photographers)

[Figure: two original images, each with its skin map]

Page 27: DeviantART Analysis using Image Features

OpenCV face detector

Page 28: DeviantART Analysis using Image Features

Classification

• Given a set of features, classification is used to:
– Determine whether two artists/categories are distinguishable
– Determine which features are useful for distinguishing them

• Different classifiers are available in the Toolkit:
– k-Nearest Neighbour (kNN)
– Naive Bayes (NB)
– Nearest Mean (NM)
– Support Vector Machine (libSVM)
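Of the classifiers listed, Nearest Mean is the simplest to show in full: each class is represented by the mean of its feature vectors, and a new image gets the label of the closest mean. The sketch below is a generic version with Euclidean distance; the function names and the toy feature values are illustrative and not taken from the toolkit.

```python
import math


def fit_nearest_mean(samples, labels):
    """Return {label: mean feature vector} for the training samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_nearest_mean(means, x):
    """Label of the class mean closest to `x` (Euclidean distance)."""
    return min(means, key=lambda y: math.dist(means[y], x))


# Two hypothetical artists described by (avgHue, intVariance)-like features.
train = [[0.1, 500.0], [0.2, 600.0], [0.8, 14000.0], [0.9, 15000.0]]
who = ["artistA", "artistA", "artistB", "artistB"]
means = fit_nearest_mean(train, who)
print(predict_nearest_mean(means, [0.15, 700.0]))
```

Because the two features live on very different scales, the distance here is dominated by the second one, which is one reason a normalization step before classification matters.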

Page 29: DeviantART Analysis using Image Features

Classification

• Pre-processing functions:
– Reading in XML files and creating a dataset
– Normalization
– Dataset filtering on classes and features
– Parameter optimization using cross-validation

• Current classification capabilities:
– 1 class against another class
– 1 class against all other classes

Page 30: DeviantART Analysis using Image Features

Classification

• Feature selection is needed when dealing with a lot of features
– Reduces the dimensionality of the data representation
– Gives the feature combination that best separates a class

• Sequential forward feature selection
– First select the most informative feature, then iteratively add the next most informative feature to it
– Criterion is based on the inter-intra distance
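Sequential forward selection can be sketched as a greedy loop over candidate features. The criterion below is a simplified inter-intra ratio (scatter of class means around the overall mean, divided by scatter of samples around their class mean); the exact criterion used in the toolkit is not given on the slide, so this formula and the function names are assumptions.

```python
def inter_intra(samples, labels, features):
    """Between-class scatter divided by within-class scatter on `features`."""
    pts = [[x[f] for f in features] for x in samples]
    dim = len(features)

    def mean(rows):
        return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    overall = mean(pts)
    inter = intra = 0.0
    for cls in set(labels):
        members = [p for p, y in zip(pts, labels) if y == cls]
        centre = mean(members)
        inter += len(members) * sqdist(centre, overall)
        intra += sum(sqdist(p, centre) for p in members)
    return inter / (intra + 1e-12)   # small constant avoids division by zero


def forward_select(samples, labels, k):
    """Greedily add the feature index that most improves the criterion."""
    selected, remaining = [], list(range(len(samples[0])))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda f: inter_intra(samples, labels, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Feature 1 separates the classes well, feature 2 weakly, feature 0 not at all.
X = [[0.0, 1.0, 5.0], [1.0, 1.1, 5.5], [0.0, 9.0, 6.0], [1.0, 9.1, 6.5]]
y = ["a", "a", "b", "b"]
print(forward_select(X, y, 2))
```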

Page 31: DeviantART Analysis using Image Features

Classification

• Evaluation measures:
– Precision
• The percentage of images classified as positive that were indeed positive
– Recall
• The percentage of all positive images that were classified as positive
– F1-Measure
• The harmonic mean of precision and recall
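The three measures follow directly from the counts of true positives, false positives and false negatives. A small sketch, with an illustrative function name and toy labels:

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall and F1 for the class `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


truth = ["a", "a", "a", "b", "b"]
guess = ["a", "a", "b", "a", "b"]
print(precision_recall_f1(truth, guess, positive="a"))
```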

Page 32: DeviantART Analysis using Image Features

Visualization

• Purpose of the visualization:
1. Visualize the dataset
• Find patterns
• Analyse classification results
• Filtering (relevant information)
• Input: dataset images (thumbs + full) & XML feature files
– Converted to a single TAB-separated file
2. Express the classification performance
• Capture the performance in one graph
• Input: performance output of the classifier
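Converting the per-image XML files into a single TAB-separated file could look like the sketch below. It assumes each XML root holds one child element per field, as in the data-collection example; the function name, the sorted column order and the empty string for missing values are illustrative choices.

```python
import csv
import io
import xml.etree.ElementTree as ET


def features_to_tsv(xml_documents):
    """Merge per-image XML strings into one TAB-separated table (as text)."""
    records = [
        {child.tag: (child.text or "").strip() for child in ET.fromstring(doc)}
        for doc in xml_documents
    ]
    # Union of all tags, so images with fewer features still fit the table.
    columns = sorted({tag for rec in records for tag in rec})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


docs = [
    "<root><filename>a.jpg</filename><avgHue>0.5</avgHue></root>",
    "<root><filename>b.jpg</filename></root>",
]
print(features_to_tsv(docs))
```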

Page 33: DeviantART Analysis using Image Features

Visualization

• Use an existing visualization application?
– Mondrian, a general-purpose statistical data-visualization system

http://rosuda.org/mondrian/

Page 34: DeviantART Analysis using Image Features

Visualization

• Use an existing visualization application?
– XmdvTool, interactive visual exploration of multivariate data sets
– Flat version of the data set

http://davis.wpi.edu/~xmdv/

Page 35: DeviantART Analysis using Image Features

Visualization

• Use an existing visualization application?
• Tools with generic uses produce only generic displays
• Data can take many interesting forms
– These require unique types of display and interaction
– Not captured by general applications
• UI not intuitive (lacks an easy way to filter data)
• (These tools also look outdated)

Page 36: DeviantART Analysis using Image Features

Visualization

• Which language/framework for our visualization?
• There are many…
• Prefuse visualization toolkit (generic displays)
• Adobe Flash/Flex (expensive, slow for large datasets)

Page 37: DeviantART Analysis using Image Features

Visualization

• (Partially) implemented in “Processing”
– Open source programming language to create images, animations, and interactions
– Built on top of Java (collection of Java classes)
– Consists of:
• Processing Development Environment (PDE) (very minimalistic)
• A collection of commands (API)
• Several libraries that support more advanced features (OpenGL, XML)
– Easy to integrate into Java (Eclipse)

Page 38: DeviantART Analysis using Image Features

Visualization: Processing

• Provides functions to make life easier
– image(img, x, y, [width, height])
– line(x1, y1, x2, y2)
– stroke(color)
– Not meant for drawing complete graphs/plots

• Right combination of cost, ease of use and speed

• Export the application as a Java Applet
– Run it on a website
– Use URLs instead of images to avoid legal issues

Page 39: DeviantART Analysis using Image Features


Experiments & Results

Page 40: DeviantART Analysis using Image Features

Experiment #1 – Classification

• Goal:
– Use the toolkit to find which kinds of features best separate two artists

• Details of the experiment
– The experiment was performed for all artists in the dataset
– The feature selection algorithm was used to output the 1-5 most informative features
– Evaluation was done using the F-measure

Page 41: DeviantART Analysis using Image Features

Selecting the classifier

• Select the classifier for the experiment
– Train all the classifiers on a subset of the training data, using cross-validation to optimize parameters
– Selection criterion: F-measure
– SVM gives the highest F-measure

Average F-measure, 1vs1 classification over all artists:

KNN      Naive Bayes   Nearest Mean   Linear SVM
0.7644   0.8157        0.7383         0.8278

Page 42: DeviantART Analysis using Image Features


Result Matrix using the top 1 feature

Page 43: DeviantART Analysis using Image Features


Result Matrix using top 2 features

Page 44: DeviantART Analysis using Image Features


Result Matrix using top 3 features

Page 45: DeviantART Analysis using Image Features


Result Matrix using the top 4 features

Page 46: DeviantART Analysis using Image Features


Result Matrix using the top 5 features

Page 47: DeviantART Analysis using Image Features


Result Matrix using all features

Page 48: DeviantART Analysis using Image Features

Visualization Case (1)

• Artist Pair: Kitsunebaka91 and LALAax
– F-measure Pair: 0.952941 and 0.884615
– medIntCells_2
– gridEdgeRatio_4

• Artist Pair: fediaFedia and gsphoto
– F-measure Pair: 0.867347 and 0.938095
– avgHue
– intVariance

Page 49: DeviantART Analysis using Image Features

Visualization Case (2)

• Artist Pair: K1lgore and sekcyjny
– F-measure Pair: 0.692308 and 0.640000
– avgBCells_3
– salMapCEntropy

• Artist Pair: stereoflow and zihnisinir
– F-measure Pair: 0.649007 and 0.683871
– avgHueCells_4
– avgR

Page 50: DeviantART Analysis using Image Features

Results

Number of occurrences in the top 5 features (F-measure > 0.9):

Feature                               Occurrences
EdgeRatio in the Center               13
EdgeRatio over the entire image       8
Average Hue in the Center             5
Saliency Map Skin Entropy             7
Entropy of the Intensity              5
Average R in the lower right corner   4
EdgeRatio in the Center Right         3

Page 51: DeviantART Analysis using Image Features

Results

Number of occurrences in the top 5 features (F-measure > 0.9):

Name            Occurrences
Kitsunebake91   12
Pierrebfoto     10
One_Vox         2
gsphoto         2
sekcyjny        2

Page 52: DeviantART Analysis using Image Features

Experiment #2a - Global network results

• Goal:
– Describe the professional network of watcher connections.

• Results:
– 103’663 unique artists
– 4’483’023 connections
– Average degree: 43.25
– Fraction of reciprocal links: 17.65%
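These summary numbers can be computed from a list of directed watcher links. The sketch assumes that average degree means links divided by nodes (which reproduces 43.25 from the two counts above) and that a link counts as reciprocal when the reverse link also exists; both definitions, the direction convention and the function name are assumptions.

```python
def network_summary(edges):
    """Basic statistics of a directed graph given as (watcher, watched) pairs."""
    edge_set = set(edges)
    nodes = {n for e in edge_set for n in e}
    reciprocal = sum(1 for (a, b) in edge_set if (b, a) in edge_set)
    return {
        "nodes": len(nodes),
        "links": len(edge_set),
        "average_degree": len(edge_set) / len(nodes),
        "reciprocal_fraction": reciprocal / len(edge_set),
    }


toy = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
print(network_summary(toy))

# Check against the reported totals: links / artists.
print(round(4483023 / 103663, 2))
```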

Page 53: DeviantART Analysis using Image Features

Experiment #2b - Core network results

• Goal:
– Find a core of highly connected users.

• Algorithm
– Recursively remove all nodes with degree < N

• Results:
– Out-degree: N = 44, 1471 nodes
– In-degree: N = 43, 1701 nodes
– In+Out-degree: N = 185, 1099 nodes

• Co-occurrence matrix (row and column order inferred from the diagonal, which matches the node counts above):

           In     Out    In+Out
In         1701   541    54
Out        541    1471   286
In+Out     54     286    1099

• Triple occurrence: 14
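The pruning rule can be sketched as a loop that keeps removing low-degree nodes until none remain, since each removal can push neighbours below the threshold. The `mode` switch for out-, in- and combined degree mirrors the three result rows; the function name and the edge-list representation are assumptions.

```python
def prune_core(edges, n, mode="out"):
    """Repeatedly drop nodes with degree < n; return the surviving node set.

    `edges` are directed (source, target) pairs; `mode` is "out", "in" or "both".
    """
    edges = set(edges)
    while True:
        out_deg, in_deg = {}, {}
        for a, b in edges:
            out_deg[a] = out_deg.get(a, 0) + 1
            in_deg[b] = in_deg.get(b, 0) + 1
        nodes = set(out_deg) | set(in_deg)

        def degree(v):
            o, i = out_deg.get(v, 0), in_deg.get(v, 0)
            return {"out": o, "in": i, "both": o + i}[mode]

        weak = {v for v in nodes if degree(v) < n}
        if not weak:
            return nodes
        # Removing weak nodes also removes their links, so degrees change.
        edges = {(a, b) for a, b in edges if a not in weak and b not in weak}


# A directed triangle a->b->c->a plus an outsider d->a.
toy = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
print(sorted(prune_core(toy, 2, mode="both")))
```

Overlaps such as those in the co-occurrence matrix would then be intersections of the node sets returned for the different modes.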

Page 54: DeviantART Analysis using Image Features


Core network

Page 55: DeviantART Analysis using Image Features

Future work

• More features
– Including emotional features (color and texture)

• More network information
– Using the core network as a basis for a new dataset (ongoing)
– More links, not only watchers (hierarchy)

• Incorporating time

• Using classifiers to make recommendations

Page 56: DeviantART Analysis using Image Features


Questions