15 - Phylogeny
-
Upload
thalison-araujo -
Category
Documents
-
view
228 -
download
0
Transcript of 15 - Phylogeny
-
8/16/2019 15 - Phylogeny
1/171
Prof. Dr. Anderson [email protected]
http://www.ic.unicamp.br/~rocha
Reasoning for Complex Data (RECOD) Lab.Institute of Computing, Unicamp
Av. Albert Einstein, 1251 - Cidade Universitária
CEP 13083-970 • Campinas/SP - Brasil
http://www.ic.unicamp.br/~rochahttp://www.ic.unicamp.br/~rochamailto:[email protected]:[email protected]
-
8/16/2019 15 - Phylogeny
2/171
A. Rocha, 2012 – Análise Forense de Documentos Digitais 4
Summary
! Phylogeny Problem
! Image Phylogeny
! Video Phylogeny
! Current Challenges
-
8/16/2019 15 - Phylogeny
3/171
Phylogeny of Media
Parts of this class were presented in talk in Feb. 2nd, 2012 at Google, US,
by S. Goldenstein
Joint work – S. Goldenstein, Z. Dias and A. Rocha
-
8/16/2019 15 - Phylogeny
4/171
What is this talk about?
4
-
8/16/2019 15 - Phylogeny
5/171
What is this talk about?
• An important problem that has beenmostly overlooked by the community.
4
-
8/16/2019 15 - Phylogeny
6/171
What is this talk about?
• An important problem that has beenmostly overlooked by the community.
• Has immediate applications in many areas.
4
-
8/16/2019 15 - Phylogeny
7/171
What is this talk about?
• An important problem that has beenmostly overlooked by the community.
• Has immediate applications in many areas.• It is hard to solve.
4
-
8/16/2019 15 - Phylogeny
8/171
What is this talk about?
• An important problem that has beenmostly overlooked by the community.
• Has immediate applications in many areas.• It is hard to solve.
• There is room for elegant math anddifferent solutions.
4
-
8/16/2019 15 - Phylogeny
9/171
What is this talk about?
• An important problem that has beenmostly overlooked by the community.
• Has immediate applications in many areas.• It is hard to solve.
• There is room for elegant math anddifferent solutions.
Video Phylogeny: Recovering Near-Duplicate Video RelationshipsZ Dias, A Rocha, and S Goldenstein
IEEE Workshop on Information Forensics and
Security (WIFS) 2011
Image Phylogeny by MinimalSpanning Trees Z Dias, and A Rocha and S Goldenstein
IEEE Transactions of Information Forensics and
Security, April 2012.4
-
8/16/2019 15 - Phylogeny
10/171
How it started
In 2009, the current Brazilian president was the
president’s chief of staff, and the government pre-
candidate for the 2010 presidential election.Folha de SP , a major Brazilian newspaper (think New
York Times) ran an interview and article about her.
They printed a “scan of her criminal records” as
political activist in the military dictatorship period
(1964-1985), suggesting it as a record she engaged in
violent armed activities (which she denies to this day).
5
-
8/16/2019 15 - Phylogeny
11/171
How it started
6
-
8/16/2019 15 - Phylogeny
12/171
How it started
6
In 2009, the current Brazilian president was just apre-candidate (for the 2010 presidential election).
-
8/16/2019 15 - Phylogeny
13/171
How it started
6
In 2009, the current Brazilian president was just apre-candidate (for the 2010 presidential election).
The newspaper A Folha de SP (think New York Times)ran article about her.
-
8/16/2019 15 - Phylogeny
14/171
How it started
6
In 2009, the current Brazilian president was just apre-candidate (for the 2010 presidential election).
The newspaper A Folha de SP (think New York Times)ran article about her.
It had a “scan of her criminal files” as political activistin the military dictatorship period (1964-1985).
-
8/16/2019 15 - Phylogeny
15/171
7
-
8/16/2019 15 - Phylogeny
16/171
Criminal
Records?
A “scan” of her personal
files maintained by the
military internal security
during the Brazilian
military regime.
The Public Archive of SP
actually hosts such a
collection. 8
-
8/16/2019 15 - Phylogeny
17/171
Searching the Web...
9
-
8/16/2019 15 - Phylogeny
18/171
Searching the Web...
10
-
8/16/2019 15 - Phylogeny
19/171
Criminal Records?
11
-
8/16/2019 15 - Phylogeny
20/171
Criminal Records?
• This image was already going around thenet for " six months ! it is a clear fake.
11
-
8/16/2019 15 - Phylogeny
21/171
Criminal Records?
• This image was already going around thenet for " six months ! it is a clear fake.
• She hired us, as consultants, to provide aforensic analysis of its authenticity thatcould hold on court.
11
-
8/16/2019 15 - Phylogeny
22/171
Criminal Records?
• This image was already going around thenet for " six months ! it is a clear fake.
• She hired us, as consultants, to provide aforensic analysis of its authenticity thatcould hold on court.
•There were several versions of the image
(near duplicates)
# Which one was the original?
# Where should we perform the analysis?
11
-
8/16/2019 15 - Phylogeny
23/171
How to find the Original?
• The images are “copied” around...- resized;
- cropped;
- color corrected;
- recompressed;
- and possibly other transformations.
12
-
8/16/2019 15 - Phylogeny
24/171
Media Phylogeny
• Identify, among a set of near duplications,which element is the original, and thestructure of generation of each near
duplication.
• Tells the history of the transformationscreated the duplications.
13
-
8/16/2019 15 - Phylogeny
25/171
Media Phylogeny
D
D1 D2
DDB2
D3
D4 D5
D6
Watermarking & Fingerprinting Enabled
Content-Based Copy
Detection & Recognition
Referenced copy
DDB
DDB1
T i ( D
) T j ( D
)
T k ( D
1 ) T
l ( D 2 )
T r ( D
3 ) T
t ( D 3 )
T i ( D
5 )
T a ( D
D B ) T
b
( D D B )
14
-
8/16/2019 15 - Phylogeny
26/171
Image Phylogeny TreesIPT
-
8/16/2019 15 - Phylogeny
27/171
Image Phylogeny Trees:
IPT• Security .• Forensics.
• Copyright enforcement.• News tracking services.
• Indexing.
16
-
8/16/2019 15 - Phylogeny
28/171
Image Phylogeny Trees:
IPT• Security : the modification graph providesinformation of suspects’ behavior, and points outflow of content distribution.
• Forensics.
• Copyright enforcement.
•News tracking services.
• Indexing.
17
-
8/16/2019 15 - Phylogeny
29/171
Image Phylogeny Trees:
IPT• Security .• Forensics: analysis in the original document (root
of the tree) instead of in a near duplicate.
• Copyright enforcement.
• News tracking services.
• Indexing.
18
-
8/16/2019 15 - Phylogeny
30/171
Image Phylogeny Trees:
IPT• Security .• Forensics.
• Copyright enforcement: traitor tracing withoutthe need of source control techniques(watermarking or fingerprinting).
•News tracking services.
• Indexing.
19
-
8/16/2019 15 - Phylogeny
31/171
Image Phylogeny Trees:
IPT• Security .• Forensics.
• Copyright enforcement.• News tracking services: the ND relationships
can feed news tracking services with key elementsfor determining the opinion forming process acrosstime and space.
• Indexing.
20
-
8/16/2019 15 - Phylogeny
32/171
Image Phylogeny Trees:
IPT• Security .• Forensics.
• Copyright enforcement.• News tracking services.
• Indexing: tree root can give us an image from anND set as a representative to index, store, or evenfurther refine the ND search.Tree structure might help indexing and retrieving.
21
-
8/16/2019 15 - Phylogeny
33/171
A CB
D E
F G
H
Image Near
A
BC D
G
H
E
F
Phylogeny
T ~ β(A)
T ~ β(A) ~ β(A)
T ~ β(A)
T ~ β(B)
T ~ β(C )
T ~ β(D)
Our Objective
-
8/16/2019 15 - Phylogeny
34/171
Two Subproblems
1. Define good dissimilarity functions
between images.
2. Develop algorithms that construct the
Image Phylogeny Tree given a dissimilaritymatrix of the images.
23
d(i, j)
-
8/16/2019 15 - Phylogeny
35/171
Dissimilarity
-
8/16/2019 15 - Phylogeny
36/171
DissimilarityThe dissimilarity is not a metric - we want to
estimate how likely A#B and B#A.
Think about cropping, or resizing an image -
these are not two-way operations.
Jeffrey Pine, Sentinel Dome, Yosemite National Park, Ansel Adams. 25
-
8/16/2019 15 - Phylogeny
37/171
Dissimilarity
• Define a family of image transformations parameterized by
• Letfind that minimizes
dβ(i, j) = |I j − T β(I i)|2
min dβ(i, j)
d(i, j) = dβmin(i, j)
T β(I ) β .
26
-
8/16/2019 15 - Phylogeny
38/171
Dissimilarity
• We use a composition of three simple steps.
• In the general case, finding the optimumparameters of a general transformationmight be a complicated optimization.
27
T β(I ) = T jpeg(T color(T spatial(I ))
-
8/16/2019 15 - Phylogeny
39/171
Dissimilarity
28
-
8/16/2019 15 - Phylogeny
40/171
Dissimilarity
• Spatial- Affine Transformation
- Cropping
28
-
8/16/2019 15 - Phylogeny
41/171
Dissimilarity
• Spatial- Affine Transformation
- Cropping
• Color- Channel Brightness and Contrast.
28
-
8/16/2019 15 - Phylogeny
42/171
Dissimilarity
• Spatial- Affine Transformation
- Cropping
• Color- Channel Brightness and Contrast.
• JPEG Compression- Quantization tables.
28
-
8/16/2019 15 - Phylogeny
43/171
Spatial Transformation
for the Dissimilarit
Afghanistan-Pakistan border, Steve McCurry. 29
-
8/16/2019 15 - Phylogeny
44/171
Spatial Transformation
for the DissimilaritKey Points.
Afghanistan-Pakistan border, Steve McCurry. 29
-
8/16/2019 15 - Phylogeny
45/171
Spatial Transformation
for the DissimilaritKey Points.
Afghanistan-Pakistan border, Steve McCurry. 29
-
8/16/2019 15 - Phylogeny
46/171
Spatial Transformation
for the Dissimilarit
Correspondences.
Key Points.
Afghanistan-Pakistan border, Steve McCurry. 29
-
8/16/2019 15 - Phylogeny
47/171
Spatial Transformation
for the Dissimilarit
Correspondences.
Robust Estimationof Affine Transf.
Key Points.
x0
y0
=
a b
c d
x
y
+
txty
Afghanistan-Pakistan border, Steve McCurry. 29
C l T f
-
8/16/2019 15 - Phylogeny
48/171
Color Transformation
for the Dissimilarit
V-J Day, Times Square, Alfred Eisenstaedt 30
C l T f
-
8/16/2019 15 - Phylogeny
49/171
Color Transformation
for the Dissimilarit
31
C l T f
-
8/16/2019 15 - Phylogeny
50/171
Color Transformation
for the Dissimilarit
Brightness / Contrast ~ Same mean and stddev.
32
D l
-
8/16/2019 15 - Phylogeny
51/171
Dissimilarity:
Com ression
• Use the quantization table of the jpeg of Bto compress T(A).
33
-
8/16/2019 15 - Phylogeny
52/171
-
8/16/2019 15 - Phylogeny
53/171
Tree Construction
35
-
8/16/2019 15 - Phylogeny
54/171
Tree Construction
• Local decisions of direction on pairs ofimages is not a good idea...
35
-
8/16/2019 15 - Phylogeny
55/171
Tree Construction
• Local decisions of direction on pairs ofimages is not a good idea...
• Proposition: we want a MST.
35
-
8/16/2019 15 - Phylogeny
56/171
Tree Construction
• Local decisions of direction on pairs ofimages is not a good idea...
• Proposition: we want a MST....but we have a complete directed graph.
35
MST f di d h
-
8/16/2019 15 - Phylogeny
57/171
MST of directed graphs
in the LiteratureThe Optimum Branching problem finds the
MST of a directed graph for a given root.
In our context, it would have to be applied
to each vertex as a root, and the final
complexity in our scenario would be O(n^3).
It also uses a Fibonacci Heap.
36
-
8/16/2019 15 - Phylogeny
58/171
Oriented Kruskal
37
-
8/16/2019 15 - Phylogeny
59/171
Oriented Kruskal
Our method runs once, and finds both the
root and structure simultaneously.
It has an complexity ! we need
to sort all edges of the complete graph.
It requires the Union-Find data structure.
38
O(n2 log n)n2
-
8/16/2019 15 - Phylogeny
60/171
! 1 M[6,1] = 12 Select Edge ( 1 6 )
! 2 M[4,5] = 15 Select Edge ( 5 4 )
" 3 M[4,1] = 16 Test II: Root(1) = 6
! 4 M[5,2] = 18 Select Edge ( 2 5 )
" 5 M[6,5] = 19 Test II: Root(5) = 4
! 6 M[6,3] = 22 Select Edge ( 3 6 )
" 7 M[2,4] = 23 Test I: Root(2) = Root(4)
! 8 M[4,6] = 27 Select Edge ( 6 4 )
Algorithm Steps
[ 6 , 5 , 6 , 4 , 4 , 4 ]Reconstructed Tree
4
5
2
6
1 3
1
2
4
8
6
M 1 2 3 4 5 6
1 - 31 57 37 45 49
2 31 - 33 23 29 32
3 51 41 - 42 37 38
4 16 36 28 - 15 27
5 35 18 54 30 - 54
6 12 40 22 60 19 -
Dissimilarity Matrix
Construction Example
E l ti
-
8/16/2019 15 - Phylogeny
61/171
Evaluation:
Comparing Trees
40
E l ti
-
8/16/2019 15 - Phylogeny
62/171
Evaluation:
Comparing Trees
40
E l ti
-
8/16/2019 15 - Phylogeny
63/171
Evaluation:
Comparing Trees
40
E l ti
-
8/16/2019 15 - Phylogeny
64/171
Evaluation:
Comparing Trees
40
-
8/16/2019 15 - Phylogeny
65/171
IPT Experiments
41
-
8/16/2019 15 - Phylogeny
66/171
IPT Experiments
• Experimental Setup.
41
-
8/16/2019 15 - Phylogeny
67/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
41
-
8/16/2019 15 - Phylogeny
68/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
• Missing Nodes.
41
-
8/16/2019 15 - Phylogeny
69/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
• Missing Nodes.• Missing Root.
41
-
8/16/2019 15 - Phylogeny
70/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
• Missing Nodes.• Missing Root.
• Missing Internal Nodes.
41
-
8/16/2019 15 - Phylogeny
71/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
• Missing Nodes.• Missing Root.
• Missing Internal Nodes.• Real ND sets from the Web.
41
-
8/16/2019 15 - Phylogeny
72/171
IPT Experiments
• Experimental Setup.
•Complete Trees.
• Missing Nodes.• Missing Root.
• Missing Internal Nodes.• Real ND sets from the Web.
• A first look into Forests.41
Experimental Setup
-
8/16/2019 15 - Phylogeny
73/171
Experimental Setup
42
Experimental Setup
-
8/16/2019 15 - Phylogeny
74/171
Experimental Setup
• 50 raw images from UCID.
42
Experimental Setup
-
8/16/2019 15 - Phylogeny
75/171
Experimental Setup
• 50 raw images from UCID.• Trees with 10, 20, 30, 40, and 50 nodes.
42
Experimental Setup
-
8/16/2019 15 - Phylogeny
76/171
Experimental Setup
• 50 raw images from UCID.• Trees with 10, 20, 30, 40, and 50 nodes.
• For every size, 50 random tree topologies, eachwith 10 different random parameters.
42
Experimental Setup
-
8/16/2019 15 - Phylogeny
77/171
Experimental Setup
• 50 raw images from UCID.• Trees with 10, 20, 30, 40, and 50 nodes.
• For every size, 50 random tree topologies, eachwith 10 different random parameters.
• ND set created with affine transformation, crop,brightness-contrast-gamma on each channel and
compression. We use ImageMagick.
42
Experimental Setup
-
8/16/2019 15 - Phylogeny
78/171
Experimental Setup
• 50 raw images from UCID.• Trees with 10, 20, 30, 40, and 50 nodes.
• For every size, 50 random tree topologies, eachwith 10 different random parameters.
• ND set created with affine transformation, crop,brightness-contrast-gamma on each channel and
compression. We use ImageMagick.
• Dissimilarity construction with OpenCV andlibjpg: affine transformation, brightness-contrast by
channel, and compression.42
Complete Trees
-
8/16/2019 15 - Phylogeny
79/171
Complete Trees
!"
$!"
%!"
&!"
'!"
(!"
)!"
*!"
+!"
,!"
$!!"
$! %! &! '! (!
- . / 0 1
-231
!""# %&'() *(+,() -./()#01
43
Complete Trees
-
8/16/2019 15 - Phylogeny
80/171
Complete Trees
!"
$!"
%!"
&!"
'!"
(!"
)!"
*!"
+!"
,!"
$!!"
$! %! &! '! (!
- . / 0 1
-231
!""# %&'() *(+,() -./()#01
If the correct root is at depth zero, we identified the root of thetree. Here, regardless of the tree size, the average depth at
which our solution finds the correct root is lower than 0.03. 43
-
8/16/2019 15 - Phylogeny
81/171
Missing Links
• On the wild, it is unrealistic to expect tohave all the nodes of the tree.
• How to handle missing links?How do we evaluate the algorithm?
44
-
8/16/2019 15 - Phylogeny
82/171
Missing Nodes
12
65
1
7 8 9
43
1110
Using Ancestry Information
2
12
5
1
7
11
2
45
Missing only
-
8/16/2019 15 - Phylogeny
83/171
Missing only
Internal Nodes
!"
$!"
%!"
&!"
'!"
(!"
)!"
*!"
+!"
,!"
$!!"
( $! $( %! %( &! &( '! '( (!
- . / 0 1
-231
!""# %&'() *(+,() -./()#01
46
Missing Root and
-
8/16/2019 15 - Phylogeny
84/171
Missing Root and
Internal Nodes
!"
$!"
%!"
&!"
'!"
(!"
)!"
*!"
+!"
,!"
$!!"
( $! $( %! %( &! &( '! '( (!
- . / 0 1
-231
!""# %&'() *(+,() -./()#01
47
Real ND sets from Web
-
8/16/2019 15 - Phylogeny
85/171
Real ND sets from Web
48
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
86/171
How to Evaluate results?• Since we do not know the ground truths, we
evaluate the stability of reconstruction.
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
87/171
How to Evaluate results?• Since we do not know the ground truths, we
evaluate the stability of reconstruction.
Er1: one if the new node IB is not a child of its
generating node IA.
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
88/171
How to Evaluate results?• Since we do not know the ground truths, we
evaluate the stability of reconstruction.
Er1: one if the new node IB is not a child of its
generating node IA.
Er2: one if the structure of the tree relating the
nodes in the original set changes with the insertionof the new node IB in the set.
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
89/171
How to Evaluate results?• Since we do not know the ground truths, we
evaluate the stability of reconstruction.
Er1: one if the new node IB is not a child of its
generating node IA.
Er2: one if the structure of the tree relating the
nodes in the original set changes with the insertionof the new node IB in the set.
Er3: one if the new node IB appears as a father ofanother node on the original tree.
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
90/171
• Since we do not know the ground truths, weevaluate the stability of reconstruction.
Er1: one if the new node IB is not a child of its
generating node IA.
Er2: one if the structure of the tree relating the
nodes in the original set changes with the insertionof the new node IB in the set.
Er3: one if the new node IB appears as a father ofanother node on the original tree.
Er4: is one if the root of the reconstructed tree ofthe original set is different from the root of the
reconstructed tree of the set augmented with IB.
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
91/171
• Since we do not know the ground truths, weevaluate the stability of reconstruction.
Er1: one if the new node IB is not a child of its
generating node IA.
Er2: one if the structure of the tree relating the
nodes in the original set changes with the insertionof the new node IB in the set.
Er3: one if the new node IB appears as a father ofanother node on the original tree.
Er4: is one if the root of the reconstructed tree ofthe original set is different from the root of the
reconstructed tree of the set augmented with IB.
P: one if the reconstructed tree is perfectcompared to the original tree (Er1 = Er2 = 0).
49
How to Evaluate results?
-
8/16/2019 15 - Phylogeny
92/171
• Since we do not know the ground truths, weevaluate the stability of reconstruction.
[ 6 , 5 , 6 , 4 , 4 , 4 ]Initial Tree
4
5 6
1 3
4
5
2
6
1
3
5
7
I 7 = T ~ β( I 5)
Select one Node and Artificially Generate a Direct Descendant
[ 7 , 5 , 6 , 4 , 4 , 4 , 6 ]
Tree After Inserting Node 7
2
7
Reconstruction Errors
Er1 1
Er2 1
Er3 1
Er4 0
Suc ess
P 0
Er1: one if the new node IB is not a child of its
generating node IA.
Er2: one if the structure of the tree relating the
nodes in the original set changes with the insertionof the new node IB in the set.
Er3: one if the new node IB appears as a father ofanother node on the original tree.
Er4: is one if the root of the reconstructed tree ofthe original set is different from the root of the
reconstructed tree of the set augmented with IB.
P: one if the reconstructed tree is perfectcompared to the original tree (Er1 = Er2 = 0).
49
-
8/16/2019 15 - Phylogeny
93/171
Real ND setsORIENTED K RUSKAL I PT ALGORITHM R ESULTS FOR THE U NCONSTRAINED S CENARIO.
Description # of Cases %Er1 %Er2 %Er3 %Er4 %P
TG1 Iranian Missiles 90 40.0% 11.1% 11.1% 0.0% 55.6%
TG2 Bush Reading 95 17.9% 3.2% 3.2% 0.0% 81.1%
TG3 WTC Tourist 95 25.3% 6.3% 6.3% 1.1% 71.6%
TG4 BP Oil Spill 100 25.0% 0.0% 0.0% 0.0% 75.0%
TG5 Israeli-Palestinian Peace Talks 95 21.1% 7.4% 7.4% 0.0% 75.8%
TG6 Criminal Record 90 41.1% 13.3% 13.3% 0.0% 54.4%
TG7 Palin and Rifle 100 17.0% 2.0% 2.0% 0.0% 81.0%
TG8 Beatles Rubber 100 8.0% 9.0% 9.0% 1.0% 85.0%
TG9 Kerry and Fonda 80 21.3% 13.8% 13.8% 0.0% 68.8%
TG10 OJ Simpson 90 18.9% 2.2% 2.2% 0.0% 78.9%
Average 93.5 23.5% 6.8% 6.8% 0.2% 72.7%
50
What’s up with this
-
8/16/2019 15 - Phylogeny
94/171
What s up with this
near du licate set?
51
What’s up with this
-
8/16/2019 15 - Phylogeny
95/171
What s up with this
near du licate set?
52
What’s up with this
-
8/16/2019 15 - Phylogeny
96/171
What s up with this
near du licate set?
52
What’s up with this
-
8/16/2019 15 - Phylogeny
97/171
What s up with this
near du licate set?
52
Cl
-
8/16/2019 15 - Phylogeny
98/171
Close-up
53
Cl
-
8/16/2019 15 - Phylogeny
99/171
Close-up
54
Fi k F
-
8/16/2019 15 - Phylogeny
100/171
First peek at Forests
• Forests (multiple co-existing trees) are areal case in real applications.
• Can our method be modified to findmultiple trees?
55
-
8/16/2019 15 - Phylogeny
101/171
Video Phylogeny Tree:VPT
Vid Ph l T
-
8/16/2019 15 - Phylogeny
102/171
Video Phylogeny Tree
• We ignore the sound track.
• We use only static image content.
57
Vid Ph l T
-
8/16/2019 15 - Phylogeny
103/171
Video Phylogeny Tree
• We ignore the sound track.
• We use only static image content.
Why not get one frame, and
use the IPT as the VPT?
57
Vid Ph l T
-
8/16/2019 15 - Phylogeny
104/171
Video Phylogeny Tree
• We ignore the sound track.
• We use only static image content.
Why not get one frame, and
use the IPT as the VPT?
The IPTs of frames aredifferent along the video!!!
57
Vid Ph l T
-
8/16/2019 15 - Phylogeny
105/171
Video Phylogeny Tree
• We ignore the sound track.
• We use only static image content.
Why not get one frame, and
use the IPT as the VPT?
The IPTs of frames aredifferent along the video!!!
But why?57
-
8/16/2019 15 - Phylogeny
106/171
58
-
8/16/2019 15 - Phylogeny
107/171
58
-
8/16/2019 15 - Phylogeny
108/171
58
-
8/16/2019 15 - Phylogeny
109/171
58
-
8/16/2019 15 - Phylogeny
110/171
58
-
8/16/2019 15 - Phylogeny
111/171
58
-
8/16/2019 15 - Phylogeny
112/171
58
-
8/16/2019 15 - Phylogeny
113/171
-
8/16/2019 15 - Phylogeny
114/171
58
-
8/16/2019 15 - Phylogeny
115/171
58
-
8/16/2019 15 - Phylogeny
116/171
58
-
8/16/2019 15 - Phylogeny
117/171
-
8/16/2019 15 - Phylogeny
118/171
58
-
8/16/2019 15 - Phylogeny
119/171
58
-
8/16/2019 15 - Phylogeny
120/171
58
-
8/16/2019 15 - Phylogeny
121/171
58
Different IPTs
-
8/16/2019 15 - Phylogeny
122/171
Different IPTs
• Different quality over time, for example:
- black frames,
- blur,- compression artifacts,
-dynamic range.
• Which (if any) is the right one?59
A Few Approaches
-
8/16/2019 15 - Phylogeny
123/171
A Few Approaches
60
A Few Approaches
-
8/16/2019 15 - Phylogeny
124/171
A Few Approaches
• Expected result from a single frame IPT (baseline).
60
A Few Approaches
-
8/16/2019 15 - Phylogeny
125/171
A Few Approaches
• Expected result from a single frame IPT (baseline).
• Minimum dissimilarity matrix followed by IPT.
60
A Few Approaches
-
8/16/2019 15 - Phylogeny
126/171
A Few Approaches
• Expected result from a single frame IPT (baseline).
• Minimum dissimilarity matrix followed by IPT.
• Average dissimilarity matrix followed by IPT.
60
A Few Approaches
-
8/16/2019 15 - Phylogeny
127/171
A Few Approaches
• Expected result from a single frame IPT (baseline).
• Minimum dissimilarity matrix followed by IPT.
• Average dissimilarity matrix followed by IPT.
• Reconciliation Tree.
60
Single-Frame
-
8/16/2019 15 - Phylogeny
128/171
Ex ectation
61
Single-Frame
-
8/16/2019 15 - Phylogeny
129/171
Ex ectation
• Calculate IPT on each frame.
61
Single-Frame
-
8/16/2019 15 - Phylogeny
130/171
Ex ectation
• Calculate IPT on each frame.
61
Single-Frame
-
8/16/2019 15 - Phylogeny
131/171
Ex ectation
• Calculate IPT on each frame.
• Calculate Expectation of metrics, but does notreconstruct a VPT (Video Phylogeny Tree).
61
Min / Avera e
-
8/16/2019 15 - Phylogeny
132/171
62
Min / Avera e
-
8/16/2019 15 - Phylogeny
133/171
•Sample frames.
62
Min / Avera e
-
8/16/2019 15 - Phylogeny
134/171
•Sample frames.
• Calculate Dissimilarity Matrix on eachsynchronized frames.
62
Min / Avera e
-
8/16/2019 15 - Phylogeny
135/171
•Sample frames.
• Calculate Dissimilarity Matrix on eachsynchronized frames.
• Create a new Dissimilarity Matrix using theframe’s Dissimilarities
- min,- average,
-normalized min,
- normalized average.
62
Min / Avera e
-
8/16/2019 15 - Phylogeny
136/171
•Sample frames.
• Calculate Dissimilarity Matrix on eachsynchronized frames.
• Create a new Dissimilarity Matrix using theframe’s Dissimilarities
- min,- average,
-normalized min,
- normalized average.
• Construct VPT using oriented Kruskal onthis new Matrix.
62
Reconciliation
-
8/16/2019 15 - Phylogeny
137/171
A roach
63
Reconciliation
-
8/16/2019 15 - Phylogeny
138/171
A roach• Sample frames.
63
Reconciliation
-
8/16/2019 15 - Phylogeny
139/171
A roach• Sample frames.
• Calculate IPT on each frame.
63
Reconciliation
-
8/16/2019 15 - Phylogeny
140/171
A roach• Sample frames.
• Calculate IPT on each frame.• Reconcile the frame’s IPTs into the VPT.
- Build Reconciliation Matrix.
- Apply Tree Reconciliation Algorithm
63
Reconciliation Matrix
-
8/16/2019 15 - Phylogeny
141/171
Reconciliation Matrix
Algorithm 1 Reconciliation Matrix.
Require: number of near-duplicate videos, nRequire: number of selected frames, f Require: 2-d vector, t, with the f phylogeny trees previously calculated
1: for i ∈ [1..n] do Initialization2: for j ∈ [1..n] do3: P [i, j] ← 04: end for5: end for6: for i ∈ [1..f ] do Creating the matrix P 7: for j ∈ [1..n] do8: P [ j, t[i][ j]] = P [ j, t[i][ j]] + 19: end for
10: end for11: return P Returning the parenthood matrix P
64
Tree Reconciliation Alg.
Al i h 2 T R ili i
-
8/16/2019 15 - Phylogeny
142/171
Algorithm 2 Tree Reconciliation.
Require: number of near-duplicate videos, nRequire: matrix, P , from Algorithm 1
1: for i ∈ [1..n] do Tree initialization2: tree[i] ← i3: end for4: sorted ← sort positions (i, j) of P into nonincreasing order
List of edges sorted from the most to the least common5: r ← 0 Initially, the final root r is not defined6: nedges ← 0
7: for each position (i, j) ∈ sorted do
Testing each edge in order8: if r = 0 and i = j then Defining the root of the tree9: r ← i
10: end if 11: if i 6= r then If i is not the root of the tree12: if Root(i) 6= Root( j) then13: if Root( j) = j then14: tree[ j] ← i
15: nedges ← nedges + 116: if nedges = n − 1 then If the tree is complete17: return tree Returning the final VPT18: end if 19: end if 20: end if 21: end if 22: end for
65
-
8/16/2019 15 - Phylogeny
143/171
Example
-
8/16/2019 15 - Phylogeny
144/171
[ 1 , 1 , 1 , 1 , 3 , 4 ]
1
2 43
5 6
Frame 1
[ 4 , 2 , 2 , 2 , 2 , 5 ]
2
3 54
1 6
Frame 2
[ 1 , 1 , 1 , 2 , 2 , 3 ]
1
2 3
6
Frame 3
4 5
[ 4 , 4 , 2 , 4 , 1 , 2 ]
4
1
Frame 4
5
2
3 6
[ 1 , 1 , 6 , 2 , 2 , 1 ]
1
2 6
3
Frame 5
4 5
[ 1 , 4 , 1 , 1 , 3 , 1 ]
1
3 64
25
Frame 6
P 1 2 3 4 5 6
1 4 2
2 3 1 2
3 3 2 1
4 2 3 1
5 1 3 2
6 2 1 1 1 1
C h i l d
Parent
4 (1,1) !
3 (2,1) !
3 (3,1) !
3 (4,2) !
3 (5,2) !
2 (1,4) "
2 (2,4) "
2 (3,2) "
2 (4,1) "2 (5,3) "
2 (6,1) !
Edges
[ 1 , 1 , 1 , 2 , 2 , 1 ]
1
2 63
Final Video Phylogeny Tree
54
Example
Experimental results
-
8/16/2019 15 - Phylogeny
145/171
68
Experimental results
-
8/16/2019 15 - Phylogeny
146/171
•Limited experiments in this paper:
- Ignored temporal cropping,- Ignored video compression on dissimilarities,- 16 Videos (Super Bowl Commercials 2011),
- 16 trees,- 10 near-duplicates per tree.
68
Experimental results
-
8/16/2019 15 - Phylogeny
147/171
• Limited experiments in this paper:- Ignored temporal cropping,- Ignored video compression on dissimilarities,- 16 Videos (Super Bowl Commercials 2011),
- 16 trees,- 10 near-duplicates per tree.
• Transformations using mencoder.
68
Experimental results
-
8/16/2019 15 - Phylogeny
148/171
• Limited experiments in this paper:- Ignored temporal cropping,- Ignored video compression on dissimilarities,- 16 Videos (Super Bowl Commercials 2011),
- 16 trees,- 10 near-duplicates per tree.
• Transformations using mencoder.
• Sampling frames and sync by ffmpeg.
68
-
8/16/2019 15 - Phylogeny
149/171
Transformations
-
8/16/2019 15 - Phylogeny
150/171
Transformations
Table ITRANSFORMATIONS AND THEIR OPERATIONAL RANGES FOR CREATINGTHE CONTROLLED DATA SET.
Transformation Oper. Range
(1) Global Resampling/Scaling (Up/Down) [90%, 110%](2) Scaling by axis [90%, 110%](3) Cropping [0%, 5%](4) Brightness Adjustment [−10%, 10%](5) Contrast Adjustment [−10%, 10%](6) Gamma Correction [0.9, 1.1]
We used mencoder to generate theNearDuplicates, with these transformations and
ranges:
69
Comparing Trees
-
8/16/2019 15 - Phylogeny
151/171
Comparing Trees
70
Results of Approaches
-
8/16/2019 15 - Phylogeny
152/171
Results of Approaches
Table IIAVERAGE RESULTS FOR THE 16× 16 TEST CASES UNDER CONSIDERATION
FOR THE PROPOSED VPT METHODS.
Method Root Depth Edges Leaves Ancestry(E) Single Frame 76.5% 0.382 54.2% 67.7% 58.6%
(1) Min 59.0% 0.926 49.6% 64.1% 50.8%(2) Min-Norm 68.0% 0.605 51.3% 66.4% 54.2%
(3) Avg 85.6% 0.215 56.6% 70.3% 62.0%(4) Avg-Norm 85.9% 0.203 58.0% 72.4% 64.5%(5) Reconc. Tree 91.0% 0.098 65.8% 77.7% 70.4%
(5)/(E) Boost 18.9% 74.3% 21.4% 14.7% 20.1%
71
Experimental Results
-
8/16/2019 15 - Phylogeny
153/171
Table IIIRESULTS FOR THE TRE E RECONCILIATION APPROACH USING 1 6
DIFFERENT TREES.
Video Root Depth Edges Leaves Ancestry
V 01 100.0% 0.000 68.1% 77.9% 73.4%V 02 87.5% 0.125 66.9% 76.0% 68.8%V 03 75.0% 0.312 56.9% 73.7% 57.8%V 04 81.2% 0.188 57.5% 68.2% 60.7%V 05 93.8% 0.062 69.4% 81.3% 73.9%V 06 93.8% 0.125 66.2% 77.7% 72.7%
V 07 100.0% 0.000 73.1% 83.2% 79.5%V 08 93.8% 0.062 59.4% 75.0% 66.1%V 09 100.0% 0.000 70.6% 80.2% 73.1%V 10 100.0% 0.000 65.6% 75.9% 72.3%V 11 81.2% 0.188 64.4% 80.0% 69.8%V 12 100.0% 0.000 68.7% 80.2% 76.4%V 13 87.5% 0.125 75.0% 82.5% 77.7%V 14 100.0% 0.000 69.4% 78.1% 72.5%V 15 81.2% 0.188 56.9% 72.8% 64.2%V 16 81.2% 0.188 65.0% 80.2% 67.2%
Average 91.0% 0.098 65.8% 77.7% 70.4%Std Dev 8.8% 0.097 5.6% 4.0% 6.0%
72
Limitations of
-
8/16/2019 15 - Phylogeny
154/171
frame-based VPT
73
Limitations of
-
8/16/2019 15 - Phylogeny
155/171
frame-based VPT• Does not use sound.
73
Limitations of
-
8/16/2019 15 - Phylogeny
156/171
frame-based VPT• Does not use sound.• Ignores temporal information of the visual content.
73
Limitations of
-
8/16/2019 15 - Phylogeny
157/171
frame-based VPT• Does not use sound.• Ignores temporal information of the visual content.
• Requires sync frames!- This is actually a very complicated issue in Video, and
some codecs are finickier than others.
- If we allow change in FPS + temporal crop, it might beimpossible to fulfill this requisite.
73
-
8/16/2019 15 - Phylogeny
158/171
Discussion
What’s next?
-
8/16/2019 15 - Phylogeny
159/171
75
What’s next?
-
8/16/2019 15 - Phylogeny
160/171
• Better study of the effect of the family oftransformations on images and video.
75
What’s next?
-
8/16/2019 15 - Phylogeny
161/171
• Better study of the effect of the family oftransformations on images and video.
• Use Reencoding for better video Dissimilarities.
75
What’s next?
-
8/16/2019 15 - Phylogeny
162/171
• Better study of the effect of the family oftransformations on images and video.
• Use Reencoding for better video Dissimilarities.
• Global dissimilarity for video (instead of framebased).
75
What’s next?
-
8/16/2019 15 - Phylogeny
163/171
• Better study of the effect of the family oftransformations on images and video.
• Use Reencoding for better video Dissimilarities.
• Global dissimilarity for video (instead of framebased).
• Reconstructing Forests.
75
What’s next?
-
8/16/2019 15 - Phylogeny
164/171
• Better study of the effect of the family oftransformations on images and video.
• Use Reencoding for better video Dissimilarities.
• Global dissimilarity for video (instead of framebased).
• Reconstructing Forests.
• Large Scale Scenario.75
Other Domains
-
8/16/2019 15 - Phylogeny
165/171
76
Other Domains
-
8/16/2019 15 - Phylogeny
166/171
• Software Development.• Phylogeny of code.
76
Other Domains
-
8/16/2019 15 - Phylogeny
167/171
• Software Development.• Phylogeny of code.
• Computational Linguistics.• Phylogeny of Historical Documents.
76
Other Domains
-
8/16/2019 15 - Phylogeny
168/171
• Software Development.• Phylogeny of code.
• Computational Linguistics.• Phylogeny of Historical Documents.
• File Carving through Phylogeny Algorithms.
76
Other Domains
-
8/16/2019 15 - Phylogeny
169/171
• Software Development.• Phylogeny of code.
• Computational Linguistics.• Phylogeny of Historical Documents.
• File Carving through Phylogeny Algorithms.
• These cases are closer to the traditionalBiological application, with metrics.
76
Other Domains
-
8/16/2019 15 - Phylogeny
170/171
• Software Development.• Phylogeny of code.
• Computational Linguistics.• Phylogeny of Historical Documents.
• File Carving through Phylogeny Algorithms.
• These cases are closer to the traditionalBiological application, with metrics.
Beyond the Naïve Plagiarism Detection.76
-
8/16/2019 15 - Phylogeny
171/171
Thank you.