![Page 1: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/1.jpg)
Hierarchical Segmentation: Finding Changes in a Text Signal
Malcolm Slaney and Dulce Ponceleon
IBM Almaden Research Center
![Page 2: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/2.jpg)
Problem Statement
- Problem: How do we browse video?
- Goal: Create a table of contents
- Solution: Look for topic changes in text
![Page 3: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/3.jpg)
TOC Example
Chapter 1
Chapter 2
![Page 4: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/4.jpg)
Overview of This Talk
- Goal and approach
- Latent semantic indexing (LSI)
- Scale space
- Combination
- Results
[Diagram labels: LSI, Scale Space Filter, Segment]
![Page 5: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/5.jpg)
Approach
- Sentences -> Semantic Space
- Filter at multiple scales
- Look for large jumps
- Three subjects (loops) shown
  - Loop 1: Polychromaticity Artifacts
  - Loop 2: Emission Tomography
  - Loop 3: Ultrasound Tomography
![Page 6: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/6.jpg)
Building on Previous Work
- LSI and clustering
- Text tiling
- Change point analysis
- Segmentation
- Scale space
[Image courtesy of Jianbo Shi (CMU)]
![Page 7: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/7.jpg)
Latent Semantic Indexing
- Collect histogram of word frequencies
- Use SVD to capture frequent combinations
- Orthogonal decomposition
- Represent in low-dimensional space
[Figure labels: Words, Docs, Docs, 10D]
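A minimal sketch of the SVD step listed above, assuming NumPy. The function name `lsi_embed`, the toy 4-word by 4-document count matrix, and the word labels are invented for illustration and are not from the talk. It keeps the k strongest singular directions and returns one low-dimensional column per document.

```python
import numpy as np

def lsi_embed(counts, k):
    """Project documents (columns of a word-by-document matrix) into k dimensions."""
    # Thin SVD: counts = U @ diag(s) @ Vt, with orthonormal columns in U.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    # Keep the k largest singular directions; column j describes document j.
    return s[:k, None] * Vt[:k, :]

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy data: rows are words, columns are documents.
counts = np.array([
    [2, 1, 0, 0],   # "tomography"
    [1, 2, 0, 0],   # "projection"
    [0, 0, 1, 2],   # "election"
    [0, 0, 2, 1],   # "senate"
], dtype=float)

docs_2d = lsi_embed(counts, k=2)
same_topic = cosine(docs_2d[:, 0], docs_2d[:, 1])   # documents 0 and 1 share words
cross_topic = cosine(docs_2d[:, 0], docs_2d[:, 2])  # documents 0 and 2 share none
```

In this toy case the two documents that share vocabulary end up pointing in the same direction, while documents with disjoint vocabulary end up orthogonal.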
![Page 8: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/8.jpg)
LSI Within a Document
- Split into chunks
  - Fixed size
  - Sentences
- Compute histograms
- Perform SVD
- Look at results
- Sources
  - “Principles of Computerized Tomographic Imaging”
  - PBS News Hour
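The chunking and histogram steps above could be sketched as follows. The talk does not say how sentences were detected or how words were tokenized, so the regular expressions here are assumptions, and `sentence_histograms` and the sample text are hypothetical.

```python
import re
from collections import Counter

def sentence_histograms(text):
    """Split text into sentences and count words per sentence.

    Returns (sentences, vocab, rows), where rows[i][j] is how often
    vocab[j] occurs in sentences[i].
    """
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Assumption: words are lowercase runs of ASCII letters.
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for words in tokenized for w in words})
    rows = []
    for words in tokenized:
        tally = Counter(words)
        rows.append([tally[w] for w in vocab])
    return sentences, vocab, rows

text = "The scanner rotates. The detector counts photons. Voters counted ballots."
sentences, vocab, rows = sentence_histograms(text)
```

Each row is the word histogram of one sentence; transposing the rows gives a word-by-chunk matrix that can be passed to an SVD.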
![Page 9: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/9.jpg)
LSI – 2D Projection
Chapter 4 of Principles of Computerized Tomographic Imaging
![Page 10: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/10.jpg)
LSI – Self-similarity
- Measure similarity
  - Cosine of angle between “documents”
- Plot all pairs of chunks/sentences
- Look for block diagonal
Chapter 4 of Principles of Computerized Tomographic Imaging
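A small sketch of the self-similarity measure described above, assuming NumPy and that each chunk has already been mapped to a vector. The function name `self_similarity` and the four sample vectors are made up for illustration.

```python
import numpy as np

def self_similarity(vectors):
    """Cosine similarity between every pair of rows in a (n_chunks, n_dims) array."""
    X = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.maximum(norms, 1e-12)   # guard against all-zero rows
    return unit @ unit.T

# Hypothetical data: two chunks on one topic followed by two on another.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
S = self_similarity(vectors)
```

Plotting `S` as an image shows high values in two 2x2 blocks along the diagonal, which is the block-diagonal pattern the slide says to look for.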
![Page 11: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/11.jpg)
Scale-space Filtering
- What size are the features?
- Look at different scales!
- Continuous scale
- Used for
  - Object Recognition
  - Feature Detection
![Page 12: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/12.jpg)
Scale-space Movie
- Green line marks best high-level segmentation
- 10-D semantic space
- Scale varies from 1 to 400 sentences
![Page 13: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/13.jpg)
Scale-space Segmentation
- Low pass filter signal
- Form image of scale vs. time
- Look for changes
- Track peaks of vector derivative across scale
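The filtering and derivative steps above can be sketched as below, assuming NumPy. The slide does not name the low-pass filter; a Gaussian kernel is the usual scale-space choice and is an assumption here, as are the function names and the synthetic two-topic signal.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(np.ceil(3 * sigma)))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def smooth(signal, sigma):
    """Low-pass filter each column of a (n_sentences, n_dims) signal."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(signal, ((r, r), (0, 0)), mode="edge")  # repeat end values
    cols = [np.convolve(padded[:, d], k, mode="valid") for d in range(signal.shape[1])]
    return np.stack(cols, axis=1)

def scale_space_derivative(signal, sigmas):
    """Rows: scales. Columns: length of the step between consecutive sentences."""
    rows = []
    for sigma in sigmas:
        step = np.diff(smooth(signal, sigma), axis=0)
        rows.append(np.linalg.norm(step, axis=1))
    return np.array(rows)

# Synthetic signal: 20 sentences on one topic, then 20 on another.
signal = np.vstack([np.tile([1.0, 0.0], (20, 1)), np.tile([0.0, 1.0], (20, 1))])
image = scale_space_derivative(signal, sigmas=[1, 2, 4])
boundary = image.argmax(axis=1)   # strongest change at each scale
```

At every scale the largest derivative falls between sentences 19 and 20, where the synthetic topic change was placed; larger scales only make the peak wider and lower.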
![Page 14: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/14.jpg)
Scale-space Example
Derivative as function of scale and sentence
![Page 15: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/15.jpg)
LSI and Scale Space
- Putting it all together
  - Split document/transcript
  - Perform LSI analysis
  - Look at change in angle
  - Perform scale-space segmentation
  - Show tree
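One way to read the "change in angle" step is as the angle between consecutive sentence vectors in the semantic space. The sketch below follows that reading, which is an assumption; `angle_changes` and the sample vectors are hypothetical.

```python
import numpy as np

def angle_changes(vectors):
    """Angle in degrees between each sentence vector and the next one."""
    X = np.asarray(vectors, dtype=float)
    unit = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    # Clip so rounding error cannot push the cosine outside [-1, 1].
    cos = np.clip(np.sum(unit[:-1] * unit[1:], axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# Hypothetical sentence vectors: the topic changes between the 2nd and 3rd.
vectors = [[1, 0], [2, 0], [0, 1], [0, 3]]
angles = angle_changes(vectors)
```

Vector length does not matter here, only direction, so a longer sentence on the same topic produces no change in angle.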
![Page 16: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/16.jpg)
Scale-Space Image
Peaks in scale-space derivative
Peaks traced to their origin
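A sketch of coarse-to-fine peak tracing, assuming NumPy. The talk does not give the tracing rule, so following each coarse peak to the nearest peak at the next finer scale within `max_shift` positions is an assumption, and the function names and the small test image are hypothetical.

```python
import numpy as np

def local_maxima(row):
    """Indices whose value is strictly greater than both neighbors."""
    return [i for i in range(1, len(row) - 1)
            if row[i] > row[i - 1] and row[i] > row[i + 1]]

def trace_peaks(image, max_shift=3):
    """Follow each coarse-scale peak down to the finest scale.

    image[0] is the finest scale and image[-1] the coarsest.
    Returns (coarse_position, fine_position) pairs for peaks that
    can be followed through every scale.
    """
    peaks = [local_maxima(row) for row in image]
    traces = []
    for start in peaks[-1]:
        pos = start
        for level in range(len(image) - 2, -1, -1):
            near = [p for p in peaks[level] if abs(p - pos) <= max_shift]
            if not near:
                break                      # trail lost, drop this peak
            pos = min(near, key=lambda p: abs(p - pos))
        else:
            traces.append((start, pos))    # reached the finest scale
    return traces

# Hypothetical derivative image with three scales (fine, medium, coarse).
image = np.array([
    [0, 1, 0, 5, 0, 0, 2, 0],
    [0, 0, 1, 4, 1, 0, 1, 0],
    [0, 0, 1, 2, 3, 2, 1, 0],
], dtype=float)
traces = trace_peaks(image)
```

The single coarse peak at position 4 has drifted because of smoothing; tracing it back places the boundary at position 3 in the unsmoothed signal.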
![Page 17: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/17.jpg)
Results – CT Comparison
[Figure: Scale-Space segmentation compared with Book Headings]
![Page 18: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/18.jpg)
Results – News Comparison
[Figure: Scale-Space segmentation compared with Ground Truth]
![Page 19: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/19.jpg)
Results – Autocorrelation
- Block sentences
- Measure correlation
- Positive Peak
- Anti-correlation
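The slide does not define the correlation measure. One plausible reading, sketched here with NumPy, is the average cosine between mean-centered sentence vectors as a function of the distance (lag) between them. The function name `similarity_by_lag` and the synthetic data are assumptions.

```python
import numpy as np

def similarity_by_lag(vectors, max_lag):
    """Mean cosine between centered vectors that are `lag` sentences apart."""
    X = np.asarray(vectors, dtype=float)
    X = X - X.mean(axis=0)                 # remove the document-wide average
    unit = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    return np.array([
        np.mean(np.sum(unit[:-lag] * unit[lag:], axis=1))
        for lag in range(1, max_lag + 1)
    ])

# Synthetic document: 10 sentences on one topic, then 10 on another.
vectors = np.vstack([np.tile([1.0, 0.0], (10, 1)), np.tile([0.0, 1.0], (10, 1))])
corr = similarity_by_lag(vectors, max_lag=10)
```

On this synthetic input nearby sentences are positively correlated, and sentences a full topic length apart are anti-correlated, which matches the two labels on the slide.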
![Page 20: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/20.jpg)
Discussion
- Issues
- Evaluation (and ground truth)
  - Lafferty’s measure
- Temporal properties
  - Histogram/SVD chunking size
  - Autocorrelation
![Page 21: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/21.jpg)
Computational Effort
- Histogram: O(N)
- SVD: O(N^3)
- Scale space: O(N^2)
- N < 1000
  - Number of sentences in a video or document is not large
![Page 22: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/22.jpg)
LSI Document Lookup
- Histogram documents
- Entropy term weighting
- Compute SVD
- Use first 10-100 vectors to model space
- Encode query as histogram
- Look for documents in similar direction
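The lookup steps above can be sketched as follows, assuming NumPy. The slide only says "entropy term weighting"; the log-entropy scheme used here, the fold-in of the query through the truncated SVD, and all function names and toy data are assumptions for illustration.

```python
import numpy as np

def log_entropy_weights(counts):
    """Global weight per word: 1 + sum_j p_ij * log(p_ij) / log(n_docs)."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]
    p = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1e-12)
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    return 1.0 + plogp.sum(axis=1) / np.log(n_docs)

def build_index(counts, k):
    """Weight the word-by-document counts and keep the first k SVD vectors."""
    g = log_entropy_weights(counts)
    weighted = g[:, None] * np.log1p(counts)
    U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
    return g, U[:, :k], s[:k], Vt[:k, :].T   # last item: one row per document

def rank_documents(query_counts, index):
    """Return document indices sorted by cosine similarity to the query."""
    g, Uk, sk, doc_vecs = index
    q = (g * np.log1p(query_counts)) @ Uk / sk   # fold query into the same space
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (doc_vecs @ q) / norms
    return np.argsort(-sims), sims

# Hypothetical toy collection: rows are words, columns are documents.
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
], dtype=float)
index = build_index(counts, k=2)
order, sims = rank_documents(np.array([1.0, 0.0, 0.0, 0.0]), index)
```

A query containing only the first word ranks documents 0 and 1 first, since they lie in the same direction as the query, while documents 2 and 3 are orthogonal to it.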
![Page 23: Hierarchical Segmentation: Finding Changes in a Text Signal Malcolm Slaney and Dulce Ponceleon IBM Almaden Research Center.](https://reader035.fdocuments.us/reader035/viewer/2022081513/5697bff81a28abf838cbf58b/html5/thumbnails/23.jpg)
LSI Example
- Collection of book titles
- Differential equations vs. algorithms and applications