Business Identification: Spatial Detection

Business Identification: Spatial Detection Alexander Darino Week 7

Transcript of Business Identification: Spatial Detection

Page 1: Business Identification: Spatial Detection

Business Identification: Spatial Detection

Alexander Darino
Week 7

Page 2: Business Identification: Spatial Detection

Weaknesses to Current Approach

[Diagram of the current approach. Labels: Latitude/Longitude; Geocoding / Reverse Geocoding; Nearby Businesses; Image; OCR; Detected Text; Business Name Matching; Business Spatial Detection; Business Identification]

Page 3: Business Identification: Spatial Detection

Alternative: Image Matching

Page 4: Business Identification: Spatial Detection

Alternative: Image Matching

• Weaknesses:
  – Low Availability of Storefront Images (< 50% Avg)
    • George Aiken area businesses with photos: 18/35
    • Brueggers area businesses with photos: 22/40
    • Tambellini area businesses with photos: 8/22
  – Available Images too small (100 x 100)
• Not a viable solution
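The "< 50% Avg" figure can be checked against the three counts above. This is a minimal sketch: the slide does not say whether the average is the mean of the per-area rates or a pooled rate over all businesses, so both are computed here.

```python
# Counts copied from the slide: (businesses with photos, businesses in the area).
areas = {
    "George Aiken": (18, 35),
    "Brueggers": (22, 40),
    "Tambellini": (8, 22),
}

# Per-area fraction of businesses that have a storefront photo.
per_area = {name: with_photos / total for name, (with_photos, total) in areas.items()}

# Two plausible readings of "Avg" (which one the slide means is an assumption).
mean_of_areas = sum(per_area.values()) / len(per_area)
pooled = sum(w for w, _ in areas.values()) / sum(t for _, t in areas.values())

for name, frac in per_area.items():
    print(f"{name}: {frac:.1%}")
print(f"mean of per-area rates: {mean_of_areas:.1%}")
print(f"pooled rate: {pooled:.1%}")
```

Both readings come out just under 50%, although two of the three areas are individually above it.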

Page 5: Business Identification: Spatial Detection

Alternative: Template Matching

[Slide shows the name “Tambellini” eight times, presumably one rendering per template]

Page 6: Business Identification: Spatial Detection

Alternative: Template Matching

• Progress
  – Able to generate templates
  – Able to extract SIFT features using Lowe’s implementation
  – Able to match SIFT features using Lowe’s implementation
• Problems
  – Features are being matched to garbage
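One standard filter for unreliable descriptor matches is the nearest-neighbour distance-ratio test described in Lowe's SIFT paper: a match is kept only if its best candidate is clearly closer than the second best. The sketch below is illustrative only. It is plain Python on toy 3-D vectors rather than real 128-D SIFT descriptors, the function names are hypothetical, and the slides do not say whether this test was already applied or whether it would fix the garbage matches reported here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_matches(query_desc, train_desc, ratio=0.8):
    """Return (query_index, train_index) pairs that pass the distance-ratio test."""
    matches = []
    for qi, q in enumerate(query_desc):
        # Distances from this query descriptor to every candidate, nearest first.
        dists = sorted((euclidean(q, t), ti) for ti, t in enumerate(train_desc))
        if len(dists) < 2:
            continue  # the test needs a second-nearest neighbour
        (best, best_ti), (second, _) = dists[0], dists[1]
        # Keep the match only if the nearest neighbour is distinctly closer.
        if best < ratio * second:
            matches.append((qi, best_ti))
    return matches

# Toy descriptors (assumed values, for illustration only).
template = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
scene = [[0.0, 0.1, 1.0], [5.0, 5.0, 5.0], [0.6, 0.5, 0.0], [0.6, -0.5, 0.0]]

print(ratio_test_matches(template, scene))  # [(0, 0)]
```

The second template descriptor is equally close to two scene descriptors, so its match is ambiguous and gets dropped.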

Page 7: Business Identification: Spatial Detection

Alternative: Template Matching

Page 8: Business Identification: Spatial Detection

Alternative: Template Matching

Page 9: Business Identification: Spatial Detection

Alternative: Template Matching

Page 10: Business Identification: Spatial Detection

Alternative: Template Matching

Page 11: Business Identification: Spatial Detection

Alternative: Template Matching

• Currently on hold
  – Need to discuss solution with Amir
  – Currently looking into another alternative…

Page 12: Business Identification: Spatial Detection

Alternative: Scene Text Recognition

• State of the Art:
  – STR ≠ OCR
  – Far superior to our ‘naïve’ approaches to STR (i.e., OCR, image matching, SIFT)
• OCR only works for highly controlled environments. CEDAR, ICDAR, etc. not helpful
• STR works for unconditioned environments
  – Scale invariant
  – Color/intensity invariant
  – Font invariant
  – Lexicon-Assisted
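To give a rough feel for what a lexicon buys, the sketch below snaps noisy recognised text to the closest entry in a small word list using Python's `difflib`. This is a crude post-hoc stand-in, not how the STR systems referred to above use a lexicon. Using the names of nearby businesses as the lexicon, the function name, and the cutoff value are all assumptions made for illustration.

```python
import difflib

def lexicon_correct(raw_text, lexicon, cutoff=0.6):
    """Return the lexicon entry most similar to raw_text, or None if none is close enough."""
    by_lower = {name.lower(): name for name in lexicon}
    hits = difflib.get_close_matches(raw_text.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None

# Hypothetical lexicon: businesses returned for the photo's location.
nearby = ["Tambellini", "Bruegger's", "George Aiken"]

print(lexicon_correct("Tambe11ini", nearby))  # OCR confused "l" with "1"
print(lexicon_correct("Starbucks", nearby))   # nothing similar in the lexicon
```

The first call recovers "Tambellini"; the second returns None because no lexicon entry is similar enough.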

Page 13: Business Identification: Spatial Detection

Alternative: Scene Text Recognition

• No STR implementations readily available
• University of Massachusetts specializes in STR
  – Papers describe enhancements and unification of previous work, but not algorithms
  – Will email for blackbox implementation
• Currently looking into ‘previous work’
  – More models
  – Some algorithms

Page 14: Business Identification: Spatial Detection

Alternative: Scene Text Recognition

• Options
  – Email authors for implementation
  – Try to implement STR as per described models
    • Blackboxes whenever possible (email!)
    • Code when blackboxes are not available
  – Try to implement crude STR via blackboxes

Text Detection → Orthorectification → Increase Contrast → OCR → Detected Text
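As one example of how simple a stage in the crude pipeline can be, the "Increase Contrast" step could be a linear contrast stretch. This is a minimal sketch that assumes the image is a grayscale list of rows of integers; the slides do not specify which contrast method is intended, and the function name is hypothetical.

```python
def stretch_contrast(gray, out_min=0, out_max=255):
    """Linearly rescale pixels so the darkest maps to out_min and the brightest to out_max."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in gray]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in gray]

# Toy low-contrast patch (assumed values).
dull = [[100, 110], [120, 150]]
print(stretch_contrast(dull))  # [[0, 51], [102, 255]]
```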

Page 15: Business Identification: Spatial Detection

STR Implementation

• STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”

[Diagram stages: Multiresolution-based potential characters detection; Character/layout geometry and color properties analysis; Local affine rectification; Refined Detection]

Page 16: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

• Laplacian-of-Gaussian Edge Detection
• Dice image/edges into Patches
  – Combine patches with similar properties into regions
  – Obtain bounding box of region as candidate text
  – Properties include:
    • Mean
    • Variance
    • Intensity(?)
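The dicing step and the per-patch properties can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a grayscale list of rows, patches are non-overlapping squares of a hypothetical fixed size, and the Laplacian-of-Gaussian step is omitted, so the statistics are taken on raw intensities rather than on an edge map.

```python
from statistics import mean, pvariance

def dice_into_patches(gray, size):
    """Yield ((patch_row, patch_col), pixels) for each full size x size patch."""
    rows, cols = len(gray), len(gray[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            pixels = [gray[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            yield (r // size, c // size), pixels

def patch_properties(gray, size):
    """Map each patch position to its (mean, variance) of pixel values."""
    return {pos: (mean(px), pvariance(px)) for pos, px in dice_into_patches(gray, size)}

# Toy 4x4 image (assumed values): one busy patch on an otherwise flat background.
image = [
    [10, 10, 200, 0],
    [10, 10, 0, 200],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
props = patch_properties(image, 2)
for pos, (m, v) in sorted(props.items()):
    print(pos, m, v)
```

Patches whose properties are close to each other could then be merged into regions; the similarity criterion is not given in the transcript, so it is left out here.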

Page 17: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

Page 18: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

Patches qualify if:

Page 19: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

Page 20: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

Page 21: Business Identification: Spatial Detection

Multiresolution-based potential characters detection

Page 22: Business Identification: Spatial Detection

STR Implementation

• Possible Solutions:
  – Don’t grow bounding box. Grow non-rectangular region, then obtain bounding box
  – Or replace with off-the-shelf Text Detector blackbox (?)
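The first option can be sketched as a connected-component pass over the grid of qualifying patches, taking a bounding box only after each region is complete. This is a sketch under assumptions not stated on the slide: patches are marked qualifying or not in a boolean grid, neighbours are 4-connected, and the function names are hypothetical.

```python
from collections import deque

def grow_regions(mask):
    """Group 4-connected True cells of a 2D grid into regions (lists of (row, col))."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited qualifying patch.
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(region)
    return regions

def bounding_box(region):
    """Return (top, left, bottom, right), inclusive, in patch-grid coordinates."""
    rs = [r for r, _ in region]
    cs = [c for _, c in region]
    return min(rs), min(cs), max(rs), max(cs)

# Toy grid of qualifying patches (assumed values): an L-shaped region and a small one.
qualifying = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1],
]
mask = [[bool(v) for v in row] for row in qualifying]
boxes = [bounding_box(region) for region in grow_regions(mask)]
print(boxes)  # [(0, 0, 2, 2), (1, 4, 2, 4)]
```

Because the region itself is not rectangular, non-qualifying patches that happen to fall inside the final box never influence which patches join the region.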

Page 23: Business Identification: Spatial Detection

Next Steps

• Email for STR implementations
• Backtrack: Implement ‘crude’ STR
• Continue with current STR implementation

Text Detection → Orthorectification → Increase Contrast/Binarization → OCR → Detected Text
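For the binarization stage, one common choice is Otsu's method, which picks the threshold that maximises the between-class variance of dark and bright pixels. The slides do not name a binarization method, so Otsu is swapped in here purely as an illustration. The sketch assumes a grayscale list of rows with integer values in 0..255.

```python
def otsu_threshold(gray, levels=256):
    """Return the threshold t that maximises between-class variance (pixels <= t vs > t)."""
    hist = [0] * levels
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]            # pixels at or below t
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue               # no dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                  # no bright class left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(gray, threshold):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]

# Toy image (assumed values): dark background with a few bright pixels.
sign = [[20, 22, 200], [21, 210, 205]]
t = otsu_threshold(sign)
print(t, binarize(sign, t))  # 22 [[0, 0, 1], [0, 1, 1]]
```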

Page 24: Business Identification: Spatial Detection

Thank You