Story Compression: Aggregating News...

Post on 12-May-2020


Story Compression: Aggregating News Feeds
Joseph W. Barker
Advisor: James W. Davis
Ohio State University

What is Story Compression?
• News broadcasts from multiple sources tend to cover the same stories
• Stories have content overlap
  – General content: covered by multiple sources
  – Specific content: covered by one source
• Information gathering
  – Waste time if viewing all broadcasts (general content → redundancy)
  – Miss information if viewing only one broadcast (specific content)
• Answer: Story Compression
  – Detect general vs. specific content and create a single story from all broadcasts with no redundancy

Overview
• Divide story into content segments (i.e., single idea)
  – Video shot (continuous scene) detection
• Compare segments
  – Speech/text contains most of the informational content
  – Word similarity → segment similarity
• Detect specific vs. general segments

Word Similarity

• Focus on concepts rather than specific word matching

• Graph-based hierarchy of word-concept relationships

– E.g., WordNet

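• Malik et al. 2007
  – $\mathrm{sim}(w_1, w_2) = \dfrac{2 \cdot \mathrm{dist}(\mathrm{root}, \mathrm{LCS}(w_1, w_2))}{\mathrm{dist}(\mathrm{root}, w_1) + \mathrm{dist}(\mathrm{root}, w_2)}$

• Li et al. 2003
  – $\mathrm{sim}(w_1, w_2) = e^{-\alpha\, \mathrm{dist}(w_1, w_2)} \tanh\big(\beta\, \mathrm{dist}(\mathrm{root}, \mathrm{LCS}(w_1, w_2))\big)$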

[Figure: example concept hierarchy with nodes Object, Mammal, Feline, Canine, Cat, Poodle]
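The two word-similarity measures above can be sketched on a toy hierarchy. This is a minimal illustration, not the author's implementation: the node names come from the slide's example, but the parent links are an assumption, LCS is read as the deepest ancestor shared by both words, dist(w1, w2) is taken as the path length through the LCS, and the alpha and beta defaults are placeholder values. All function names are hypothetical.

```python
import math

# Toy concept hierarchy (child -> parent). Node names follow the slide's
# example; the edges themselves are an assumption.
PARENT = {
    "mammal": "object",
    "feline": "mammal",
    "canine": "mammal",
    "cat": "feline",
    "poodle": "canine",
}

def path_to_root(word):
    """Return [word, parent, ..., root]."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(word):
    """dist(root, word): number of edges between the root and the word."""
    return len(path_to_root(word)) - 1

def lcs(w1, w2):
    """Deepest ancestor shared by both words (a word is its own ancestor)."""
    ancestors = set(path_to_root(w1))
    for node in path_to_root(w2):
        if node in ancestors:
            return node
    raise ValueError("words share no ancestor")

def sim_depth_ratio(w1, w2):
    """First measure: 2 * depth(LCS) / (depth(w1) + depth(w2))."""
    total = depth(w1) + depth(w2)
    return 1.0 if total == 0 else 2.0 * depth(lcs(w1, w2)) / total

def sim_li(w1, w2, alpha=0.2, beta=0.45):
    """Second measure: exp(-alpha * path length) * tanh(beta * depth(LCS)).
    alpha and beta are placeholder values, not taken from the slides."""
    common = lcs(w1, w2)
    path_length = depth(w1) + depth(w2) - 2 * depth(common)
    return math.exp(-alpha * path_length) * math.tanh(beta * depth(common))

print(sim_depth_ratio("cat", "poodle"), sim_li("cat", "poodle"))
```

Under these assumed edges, "cat" and "poodle" meet at "mammal", so both measures rate them as less similar than "cat" and "feline".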

Segment Similarity
• Sentence similarity?
  – Segments range from sub-sentence to multiple sentences
  – Also, sentence boundaries (when multiple) poorly defined
  – Sentence similarity emphasizes grammar/word order; won’t work
• If ordering is problematic, use unordered groups instead
• Solution: Graph collapsing
  – Group of nodes collapsed to single node by summing edge weights
  – Inspired by spectral clustering and notion of random walk on graphs
  – Random walk between groups equivalent to random walk between collapsed nodes
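A minimal sketch of graph collapsing, assuming the word graph is stored as a symmetric similarity matrix and each segment is a list of word indices. The function name `collapse` and the `normalize` option are hypothetical; dividing by group sizes is an added convenience for comparing segments of different lengths and is not stated in the slides.

```python
import numpy as np

def collapse(word_sim, groups, normalize=False):
    """Collapse a word-level similarity matrix into a group-level one.

    word_sim : (n_words, n_words) matrix of edge weights (word similarities).
    groups   : list of non-empty lists; groups[g] holds the word indices of segment g.
    Entry (a, b) of the result is the sum of word_sim[i, j] over all
    i in group a and j in group b.
    """
    word_sim = np.asarray(word_sim, dtype=float)
    membership = np.zeros((word_sim.shape[0], len(groups)))
    for g, indices in enumerate(groups):
        for i in indices:
            membership[i, g] += 1.0  # a repeated word counts once per occurrence
    collapsed = membership.T @ word_sim @ membership
    if normalize:
        # Assumption: divide by |a| * |b| so that group size does not dominate.
        sizes = membership.sum(axis=0)
        collapsed = collapsed / np.outer(sizes, sizes)
    return collapsed

word_sim = np.array([[1.0, 0.5, 0.0],
                     [0.5, 1.0, 0.2],
                     [0.0, 0.2, 1.0]])
segment_sim = collapse(word_sim, [[0, 1], [2]])
print(segment_sim)
```

Because the groups are plain index lists, word order inside a segment has no effect on the result, which matches the unordered-group idea above.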

[Figure: Word Similarity and Segment Similarity]

Most Unique Segments
• Manual segmentation employed
• Specific content
• Uniqueness → overall dissimilarity
• Perfect dissimilarity → similarity matrix rows/columns zero except for diagonal
• Thus, sum of row/column should approach zero for most dissimilar segments
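The row-sum criterion above might be scored as follows. This is a sketch under one assumption: the diagonal (self-similarity) is excluded from each row sum so that a perfectly dissimilar segment scores exactly zero. The function names are hypothetical.

```python
import numpy as np

def uniqueness_scores(segment_sim):
    """Row sums of the segment similarity matrix with the diagonal removed.
    Lower scores mean a segment is less similar to every other segment."""
    sim = np.asarray(segment_sim, dtype=float)
    off_diagonal = sim - np.diag(np.diag(sim))  # assumption: ignore self-similarity
    return off_diagonal.sum(axis=1)

def most_unique(segment_sim, k=3):
    """Indices of the k segments with the lowest off-diagonal row sum."""
    return np.argsort(uniqueness_scores(segment_sim))[:k]

segment_sim = np.array([[1.0, 0.9, 0.1],
                        [0.9, 1.0, 0.2],
                        [0.1, 0.2, 1.0]])
print(uniqueness_scores(segment_sim), most_unique(segment_sim, k=1))
```

In this toy matrix the third segment has the smallest row sum, so it is ranked as the most unique.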

Most Related Segments
• General content
• Related → group self-similar
• Perfect self-similarity → similarity matrix elements for group all one
• Thus, sum of elements should approach $n^2$ ($n$ = number in group)
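A small sketch of the group-sum criterion above, assuming similarities lie in [0, 1] so that the sum over a group of n segments is bounded by n^2 (for a pair, n = 2 and the bound is 4). The names `group_sum` and `rank_pairs` are hypothetical.

```python
from itertools import combinations

import numpy as np

def group_sum(segment_sim, group):
    """Sum of all similarity-matrix elements whose row and column are in `group`."""
    sim = np.asarray(segment_sim, dtype=float)
    idx = np.asarray(group)
    return sim[np.ix_(idx, idx)].sum()

def rank_pairs(segment_sim):
    """All segment pairs, sorted from most related to least related."""
    n = np.asarray(segment_sim).shape[0]
    pairs = combinations(range(n), 2)
    return sorted(pairs, key=lambda pair: group_sum(segment_sim, pair), reverse=True)

segment_sim = np.array([[1.0, 0.9, 0.1],
                        [0.9, 1.0, 0.2],
                        [0.1, 0.2, 1.0]])
print(rank_pairs(segment_sim))
```

Dividing `group_sum` by n^2 would give a size-independent score, which may be useful when comparing groups of different sizes.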

[Figure: Segment Pair Similarity (higher is better). x-axis: segment pairs (sorted); y-axis: similarity (approx. 3.3 to 3.8). Annotations: "Perfect similarity", "Somewhat similar".]

[Figure: Segment Uniqueness (lower is better). x-axis: segments (sorted); y-axis: uniqueness (approx. 0.014 to 0.032). Annotations: "Perfect dissimilarity", "Somewhat dissimilar".]

Automatic Segment Detection
• How to decide boundaries between segments?
  – No sentence boundaries, so text not a strong indicator
• Shot detection: detect visual change from one scene to another
• Common techniques:
  – Temporal extent
    • Consecutive: compare sequential pairs of frames
    • Key frame: compare to “key” frame of previous segment
  – Distance measures
    • Pixel-based: Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), Normalized Cross-Correlation (NCC)
    • Color-based (histograms): χ², Bhattacharyya
    • Texture-based: Scale Invariant Feature Transform (SIFT)
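The two temporal extents above can be sketched with a pixel-based measure. This is a minimal illustration that assumes 8-bit grayscale frames held as NumPy arrays, a SAD-based similarity rescaled so that 1 means a perfect match, and the first frame of each shot as the key frame. The function names and the default threshold are hypothetical.

```python
import numpy as np

def sad_similarity(a, b):
    """1 - normalized Sum of Absolute Differences; 1.0 means identical frames.
    Assumes 8-bit frames (pixel values in 0..255)."""
    diff = np.abs(a.astype(float) - b.astype(float))
    return 1.0 - diff.mean() / 255.0

def detect_shots(frames, threshold=0.8, mode="consecutive"):
    """Return the indices of frames that start a new shot.

    mode="consecutive": compare each frame with the frame before it.
    mode="key": compare each frame with the first frame of the current shot.
    """
    if mode not in ("consecutive", "key"):
        raise ValueError("mode must be 'consecutive' or 'key'")
    boundaries = []
    reference = frames[0]
    for i in range(1, len(frames)):
        if sad_similarity(reference, frames[i]) < threshold:
            boundaries.append(i)
            reference = frames[i]  # the new shot's first frame becomes the reference
        elif mode == "consecutive":
            reference = frames[i]  # slide the reference forward one frame
    return boundaries

# A hard cut: three dark frames followed by three bright frames.
cut = [np.zeros((4, 4), dtype=np.uint8)] * 3 + [np.full((4, 4), 255, dtype=np.uint8)] * 3
# A slow drift: brightness rises by 30 per frame.
drift = [np.full((4, 4), v, dtype=np.uint8) for v in (0, 30, 60, 90, 120)]
print(detect_shots(cut), detect_shots(drift, mode="key"))
```

The drift sequence shows how the two extents can disagree: no consecutive pair differs enough to cross the threshold, while the key-frame comparison accumulates the change and reports boundaries.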

Towards Improving Segment Detection
• Common methods give mediocre performance
• May be due to only examining single temporal extent
• Possible solution: Use graph collapsing to examine all temporal extents simultaneously
• Sum of blocks on diagonal approaches $n^2$ if members in segment
• Sum of block anti-diagonal approaches zero if corner is segment boundary
• Current problem: Scale of valleys (boundaries) varies quadratically with segment size; simple peak finding not good enough
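The block sums above can be illustrated on a frame-by-frame similarity matrix. This sketch is a simplification: it uses one fixed window width rather than all temporal extents at once, and it divides the anti-diagonal sum by the block area as one simple way to offset the quadratic scaling. That normalization is an assumption, not the author's stated solution, and the function names are hypothetical.

```python
import numpy as np

def block_sums(frame_sim, t, width):
    """Block sums around candidate boundary t.

    Returns (diagonal_sum, anti_diagonal_sum), where the diagonal blocks cover
    frames [t - width, t) and [t, t + width) compared with themselves, and the
    anti-diagonal block compares the frames before t with the frames after t.
    """
    before = slice(t - width, t)
    after = slice(t, t + width)
    diagonal = frame_sim[before, before].sum() + frame_sim[after, after].sum()
    anti_diagonal = frame_sim[before, after].sum()
    return diagonal, anti_diagonal

def boundary_profile(frame_sim, width):
    """Normalized anti-diagonal sum for every frame that has a full window on
    both sides; other positions are NaN. Valleys suggest segment boundaries."""
    n = frame_sim.shape[0]
    profile = np.full(n, np.nan)
    for t in range(width, n - width + 1):
        _, anti_diagonal = block_sums(frame_sim, t, width)
        profile[t] = anti_diagonal / width**2  # assumption: divide by block area
    return profile

# Two ideal shots of four frames each: similarity 1 within a shot, 0 across.
frame_sim = np.zeros((8, 8))
frame_sim[:4, :4] = 1.0
frame_sim[4:, 4:] = 1.0
profile = boundary_profile(frame_sim, width=2)
print(profile, int(np.nanargmin(profile)))
```

On this idealized matrix the profile reaches zero exactly at the true boundary (frame 4); real similarity matrices would be noisier and would likely need a more careful valley detector.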

[Figure: Shot Detection: Key Frame (First). x-axis: normalized threshold (1 = perfect match); y-axis: F score. Curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16.]

[Figure: Shot Detection: Consecutive. x-axis: normalized threshold (1 = perfect match); y-axis: F score. Curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16.]

Method      F      TP     FP     FN
SAD         0.747  0.596  0.081  0.322
SSD         0.746  0.595  0.044  0.362
NCC         0.770  0.626  0.009  0.365
BATTA-H16   0.779  0.638  0.125  0.237
CHI2-H16    0.210  0.117  0.005  0.878
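The slides do not state how F is computed, but the table's F column agrees, to within rounding, with the standard F1 score written in terms of TP, FP, and FN. The sketch below checks that, treating the table entries as rates (each row sums to roughly 1). The function name `f_score` is hypothetical.

```python
def f_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); works with counts or rates."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator

# (TP, FP, FN) copied from the table above.
rows = {
    "SAD": (0.596, 0.081, 0.322),
    "SSD": (0.595, 0.044, 0.362),
    "NCC": (0.626, 0.009, 0.365),
    "BATTA-H16": (0.638, 0.125, 0.237),
    "CHI2-H16": (0.117, 0.005, 0.878),
}
for method, (tp, fp, fn) in rows.items():
    print(f"{method:10s} F = {f_score(tp, fp, fn):.3f}")
```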

[Figure: Anti-diagonal sum per frame. x-axis: frame (0 to 12000); y-axis: anti-diagonal sum (approx. 0.85 to 1.25).]

Conclusion and Future Work
• Graph collapsing can be used to derive group similarity from similarity of group members
• Additionally, can be used to evaluate uniqueness of objects, relatedness of groups
  – Tested with text, working on video
• Future work
  – Finalize graph collapsing video segmentation
  – Expand word similarity to include multiple languages
  – Investigate sub-image feature extraction/matching
  – Examine other sources (e.g., YouTube)

[Figure: example segment pairs with their source networks]
#1) ABC / NBC: “…declaring a public health emergency….” / “…declaring a public health emergency….”
#2) NBC / CBS: “…after the virus killed….” / “…sadly had claimed 18 lives….”
#3) ABC / NBC: “…declaring a public health emergency….” / “…to repeat, declared a public health emergency….”
#4) ABC / CBS: “…they’ve set up a special tent….” / “…a tent has been setup….”

[Figure: example single segments with their source network]
#1) ABC: “In Boston today, the mayor sounded the alarm”
#2) ABC: “…moved onto the upper respiratory, which is a lot of coughing…”
#3) ABC: “…stay home when you are sick…”
#4) ABC: “…I’ve never been hit by a Mack truck…”
#5) CBS: “…is on the panel that decides what goes in the vaccine…”
#6) CBS: “…after confirmed cases of flu reach 700…”

[Figure: Consecutive Shot Detection Across All Stories]

[Figure: Shot Detection on story FLU. Panels: video similarity and sum of diagonal blocks; axes: Frame Block Start, Block End; sources: ABC, CBS, NBC.]