Mor Naaman, Yee Jiun Song, Andreas Paepcke, Hector Garcia-Molina
Digital Library Project, Database Group
Stanford University
Automatic Organization for Digital Photographs with Geographic Coordinates
JCDL 2004 2
Geo-Referenced Photos
April 8th, 2004 1:20:02pm
Latitude: N34.3121
Longitude: W122.234
Geo-Photography Technology
Personal Photo Libraries
• Searching/browsing very difficult
• Little discernible structure to photo collections
Managing Personal Photos
• Content-based retrieval
  – Basic, primitive (far from semantic)
• Manual labeling
  – Improved, yet cumbersome
• Visual methods for fast scanning (Zoom)
  – Don’t scale well
Our Approach
• Absolutely no human effort required
• Utilize time and location
  – Automatically captured
  – Easy to get
Automatic Organization
Outline
• Requirements and challenges
• The algorithms
• Sample output
• Experiment results
Browsing by Location/Time
• Use a map/calendar
  – wwmx.org from MSR
• Map issues
  – Lots of screen space
  – Sparse
  – Limited interaction?
  – Not intuitive for some
Using Hierarchies
Location:
• United States
  – Around: San Francisco, Berkeley, Sonoma, CA
    · San Francisco, Golden Gate Park, CA
    · Berkeley, Oakland, CA
    · …
  – Yosemite N.P., Yosemite Valley, CA
  – Seattle, WA
  – …
Time:
• 2003-01-01: Yosemite N.P. (2 days)
• 2003-01-18: San Francisco (1 hour)
• …
Challenges
• Locations should be intuitive
• Events are tricky
  – 3-day trip to NYC
  – The kid’s soccer game, followed by a birthday party
• Good names are important.
Outline
• Requirements and challenges
• The algorithms
• Sample output
• Experiment results
Process Diagram
Discovering Structure
[Diagram: Automatic Organization. Stages: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Outputs: Location Hierarchy, Event Hierarchy. Highlighted stage: Initial Event Segmentation]
Initial Event Segmentation
• Photos occur in bursts
• Identify bursts: semantically “connected”
Stream of photos
More details:
• Graham et al., JCDL 2002
• Tomorrow
• Proceedings
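The burst idea can be sketched with a single fixed time gap. This is an illustrative stand-in only: the slides defer the actual method to Graham et al., and the names `split_into_bursts` and `max_gap`, as well as the one-hour default, are assumptions made for this example.

```python
from datetime import datetime, timedelta

def split_into_bursts(timestamps, max_gap=timedelta(hours=1)):
    """Group timestamps into bursts; a gap larger than max_gap starts a new burst."""
    bursts = []
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)   # close enough to the previous photo: same burst
        else:
            bursts.append([t])     # first photo, or a long pause: new burst
    return bursts

# Hypothetical photo stream: three photos in the morning, two in the afternoon.
photos = [
    datetime(2003, 1, 18, 10, 0),
    datetime(2003, 1, 18, 10, 5),
    datetime(2003, 1, 18, 10, 20),
    datetime(2003, 1, 18, 15, 0),
    datetime(2003, 1, 18, 15, 10),
]
bursts = split_into_bursts(photos)
print([len(b) for b in bursts])  # [3, 2]
```

A fixed gap is the simplest possible choice; it ignores how a photographer's pace varies between collections.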
Discovering Structure
[Diagram: Automatic Organization. Stages: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Outputs: Location Hierarchy, Event Hierarchy. Highlighted stage: Location Clustering]
Location Clusters
• Cluster the bursts into locations
• A. Gionis and H. Mannila. Finding recurrent sources in sequences. In Proceedings, Computational molecular biology 2003.
  – Minimize: number of clusters
  – Minimize: error (distance to cluster centers)
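The two competing goals (few clusters, small error) can be illustrated with a penalized cost. The sketch below is not the Gionis and Mannila algorithm: it swaps in a tiny k-means and picks the k that minimizes `error + penalty * k`. The `penalty` value, the function names, and the treatment of coordinates as planar points are all assumptions for illustration.

```python
import math

def kmeans(points, k, iters=20):
    """Tiny k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from every center chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    error = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return centers, error

def choose_clustering(points, penalty, max_k=5):
    """Return (k, centers) minimizing squared error plus a per-cluster penalty."""
    best = None
    for k in range(1, min(max_k, len(points)) + 1):
        centers, error = kmeans(points, k)
        cost = error + penalty * k
        if best is None or cost < best[0]:
            best = (cost, k, centers)
    return best[1], best[2]

# Hypothetical burst locations forming two well-separated groups.
burst_locations = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
k, centers = choose_clustering(burst_locations, penalty=1.0)
print(k)  # 2
```

Raising `penalty` favors fewer, coarser locations; lowering it favors more, tighter ones.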
Location Clusters: 2-D View
[Figure: 2-D plot; legend: Photo location]
2-D View: with Bursts
Location Clusters
[Figure: chart with rows labeled Location1 through Location4]
Location Clusters (breakdown)
• Some clusters may be overloaded:
  – Many bursts / picture-taking days in one location
[Figure: chart with rows labeled Location1 through Location4, with the label San Francisco]
Discovering Structure
[Diagram: Automatic Organization. Stages: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Outputs: Location Hierarchy, Event Hierarchy. Highlighted stage: Final Event Segmentation]
Final Event Segmentation
• Scan the sequence again; a new event is detected:
  – Whenever the location context changes
  – In the same location, using an adaptive time threshold
Overnight trip to Yosemite
Soccer game and dinner
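The two rules for final segmentation can be sketched as one pass over the photo sequence. The slides do not say how the adaptive threshold is computed, so this sketch takes it as a caller-supplied function; `segment_events`, `threshold_for`, and the threshold values are hypothetical.

```python
from datetime import datetime, timedelta

def segment_events(photos, threshold_for):
    """Split time-sorted (timestamp, location) pairs into events.

    A new event starts when the location changes, or when the gap to the
    previous photo exceeds threshold_for(location) within the same location.
    """
    events = []
    for ts, loc in photos:
        if events:
            last_ts, last_loc = events[-1][-1]
            if loc == last_loc and ts - last_ts <= threshold_for(loc):
                events[-1].append((ts, loc))
                continue
        events.append([(ts, loc)])
    return events

# Assumed thresholds: long for a rarely visited place, short for a home city.
thresholds = {"Yosemite": timedelta(hours=24), "San Francisco": timedelta(hours=2)}
photos = [
    (datetime(2003, 1, 1, 14, 0), "Yosemite"),
    (datetime(2003, 1, 2, 9, 0), "Yosemite"),        # overnight gap, same event
    (datetime(2003, 1, 18, 10, 0), "San Francisco"),  # location change
    (datetime(2003, 1, 18, 11, 0), "San Francisco"),
    (datetime(2003, 1, 18, 18, 0), "San Francisco"),  # 7-hour gap, new event
]
events = segment_events(photos, lambda loc: thresholds[loc])
print([len(e) for e in events])  # [2, 2, 1]
```

With these assumed thresholds an overnight trip stays one event, while two outings on the same day in a familiar city are split.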
Next - names
• Detected location and event structure
• Need to choose names for each node
Assigning Names
[Figure: map with legend “Photo location” and area labels Stanford, Palo Alto City Park, Palo Alto, Butano State Park]
Stanford 42
Palo Alto 30
Butano 10
P.A. park 8
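One way to read the table above is as a tally of candidate names per cluster. The slide does not say how the numbers are computed, so the following is only a sketch that assumes each photo contributes one vote to every named area containing it; `rank_names` and the sample data are hypothetical.

```python
from collections import Counter

def rank_names(containing_names_per_photo):
    """Count, per candidate name, how many photos fall inside that named area."""
    counts = Counter()
    for names in containing_names_per_photo:
        counts.update(set(names))  # a photo votes at most once per name
    return counts.most_common()

# Hypothetical cluster: each inner list holds the named areas containing one photo.
cluster = [
    ["Stanford"],
    ["Stanford", "Palo Alto"],
    ["Palo Alto"],
    ["Stanford"],
    ["Butano"],
]
print(rank_names(cluster))  # [('Stanford', 3), ('Palo Alto', 2), ('Butano', 1)]
```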
Assigning Names – Nearby?
What if photos occur sparsely within cities or parks?
San Jose, 20 miles
San Francisco, 30 miles
Assigning Names - Nearby
Which city has stronger “gravity”?
San Jose is Closer
San Jose is bigger*
*larger population
But San Fran is more important!*
*greater Google count
Final name for location cluster:
“Stanford, 30 miles South of SF”
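The “gravity” comparison can be sketched as a score that grows with a city's importance and shrinks with distance. The slides give no formula, so the inverse-square score, the weights (standing in for a Google count), and every function name here are assumptions; coordinates are approximate.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def compass_direction(from_lat, from_lon, to_lat, to_lon):
    """8-point compass direction of the target as seen from the reference point."""
    p1, p2 = math.radians(from_lat), math.radians(to_lat)
    dlam = math.radians(to_lon - from_lon)
    y = math.sin(dlam) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlam)
    bearing = math.degrees(math.atan2(y, x)) % 360
    return ["N", "NE", "E", "SE", "S", "SW", "W", "NW"][int((bearing + 22.5) // 45) % 8]

def gravity(weight, distance_miles):
    # Assumed score: importance weight falling off with squared distance.
    return weight / distance_miles ** 2

def nearby_name(place, lat, lon, cities):
    """cities: list of (name, lat, lon, weight). Returns '<place>, N miles DIR of <city>'."""
    def score(city):
        return gravity(city[3], haversine_miles(lat, lon, city[1], city[2]))
    name, clat, clon, _ = max(cities, key=score)
    miles = round(haversine_miles(lat, lon, clat, clon))
    return f"{place}, {miles} miles {compass_direction(clat, clon, lat, lon)} of {name}"

# Illustrative weights only: San Francisco is given a larger importance weight.
cities = [
    ("San Jose", 37.3382, -121.8863, 1.0),
    ("San Francisco", 37.7749, -122.4194, 5.0),
]
label = nearby_name("Stanford", 37.4275, -122.1697, cities)
print(label)
```

With these weights San Francisco wins despite being farther away. The printed distance and direction come from the approximate coordinates and an 8-point compass, so they will not match the rounded wording on the slide exactly.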
Assigning Names - Alexandria
• Using polygon-based dataset of administrative areas
• Alexandria gazetteer can be used for other prominent geographic features
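Looking up which administrative areas contain a photo comes down to a point-in-polygon test. A minimal sketch using standard ray casting follows; the polygons are made-up rectangles rather than real boundaries, the dataset format is an assumption, and `containing_areas` is a hypothetical helper.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: cast a ray east from the point and count edge crossings."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def containing_areas(lon, lat, areas):
    """Return the names of all polygons that contain the point."""
    return [name for name, poly in areas.items() if point_in_polygon(lon, lat, poly)]

# Made-up areas as (lon, lat) vertex lists; a small park nested inside a city.
areas = {
    "City": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Park": [(1, 1), (2, 1), (2, 2), (1, 2)],
}
print(containing_areas(1.5, 1.5, areas))  # ['City', 'Park']
print(containing_areas(3.0, 3.0, areas))  # ['City']
```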
Outline
• Requirements and challenges of automatic organization
• The algorithms
• Sample output
• Experiment results
Location Hierarchy
Photoshop Album (at least 4 man-hours)
Our system (about 0 man-seconds)
Location Hierarchy (US)
+ San Francisco, Berkeley, Sonoma, CA
- Stanford, Mountain View, Monterey, CA
  • Monterey (58 miles S of San Jose)
  • Mountain View (4 miles NW of San Jose)
  • Stanford
- Colorado (219 miles W of Denver)
- Long Beach (35 miles S of Los Angeles, CA)
- Philadelphia, PA
- Seattle, WA
- Sequoia N.P. (153 miles E of Fresno, CA)
- South Lake Tahoe; Bear Valley, CA
- Yosemite N.P.; Yosemite Valley, CA
Events
Our system (about 0 man-seconds):
...
2003-06-28: Long Beach, CA (3 days)
2003-07-04: San Francisco, CA (3 hours)
2003-07-10: Colorado (3 days)
2003-07-15: San Francisco, CA (1 hour)
2003-07-18: Mountain View, CA (5 hours)
2003-07-27: San Francisco, CA (1 hour)
2003-09-28: Philadelphia, PA (1 hour)
2003-10-03: Sequoia NP (3 days)
...
Photoshop Album (at least 4 man-hours)
Event Names
• LOCALE: share automatically
• Check personal calendar
• Event Gazetteer
• Easy interface
Experiment
• Tested on 3 real-world geo-referenced photo collections
• Our system automatically generated the structure and names
• Tested with the owners
Experiment - Locations
• Accepted the automatic hierarchy
• Only minor edits requested
  – Merge/split a few of the locations
Experiment - Events
• Compared to events as annotated by users
• 80-85% in both recall and precision
• Other metrics proposed (see paper)
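Recall and precision for event detection can be illustrated on event boundaries. The slides do not spell out the exact matching procedure (the paper is the reference for that), so the sketch below assumes a boundary counts as correct only when the system and the user place it at the same photo index; the function name and sample data are hypothetical.

```python
def boundary_precision_recall(detected, annotated):
    """Compare detected event boundaries against user-annotated ones.

    Both arguments are collections of photo indices at which a new event starts.
    """
    detected, annotated = set(detected), set(annotated)
    hits = len(detected & annotated)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall

# Hypothetical boundaries: the system finds 5, the user marked 4, 3 agree.
precision, recall = boundary_precision_recall([3, 7, 12, 20, 25], [3, 7, 13, 20])
print(precision, recall)  # 0.6 0.75
```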
Experiment - Naming
• Naming location clusters
  – For 76% of clusters, system and users pick at least one name in common
  – For the rest, “automatic” name was useful
Not yet published:
• Paid 13 participants to “geo-reference” their photos
• Loaded to WWMX and our browser
  – Most liked the map better, but…
  – Performed the same for search/browse tasks
  – Event notion helps overcome location handicap
  – Organization “made sense”
P.S. Some didn’t touch the map, yet used our location hierarchy.
P.S.2 This was on a BIG screen!
Thank You!
More details:
Proceedings
Google: Mor Naaman
http://www-db.stanford.edu/~mor/
Future Work
• User interface
• PDA
• Integrate with map
• Global photo libraries
Remember The Bursts?