[RakutenTechConf2013] [C4-1] Text detection in product images
Text detection in product images
10/26/2013
Naoki Chiba, Lead Scientist
Rakuten Institute of Technology
Rakuten Inc.
http://rit.rakuten.co.jp/
2
Product images
Sales pitches in images
Applications:
• Content retrieval/filtering
• Recognition
• Translation
3
RIT Text Detector
Far more accurate
Works like magic
4
Outline
1 Text detection overview
2 Current methods
3 RIT’s approach
6
Academic Research
Natural scene OCR ≠ traditional scanned OCR
• Camera captured
• Illumination variations
• Perspective distortion
• Short text
Source: ICDAR Text locating competition
Digital-born text Natural-scene text
7
Product Images - Two Purposes
1. Sales pitches
2. Product list
Text’s role is different
8
Product list
Sales pitch (Merchant’s names, Price, Shipping)
9
“Now Printing” images
Shows that the image is unavailable, but...
Not updated
10
Text detection for product images
More accurate
Much Faster
11
Outline
1 Text detection overview
2 Current methods
3 RIT’s approach
12
Current methods
1. Texture based (Classifier-based)
2. Region based (Connected components)
3. Hybrids
13
1. Texture-based method
Special texture
Scan
Classifier (SVM, AdaBoost or neural network)
Problems:
• Scale/Rotation variant
• High computation
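To make the "high computation" point concrete, here is a minimal sketch of a multi-scale sliding-window scan. The function names, window sizes, and stride ratio are illustrative assumptions, not values from the talk, and `classify` stands in for whatever trained classifier (SVM, AdaBoost, neural network) would score each window.

```python
def sliding_windows(width, height, window, stride):
    """Yield (left, top, size) for every square window that fits in the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield left, top, window


def scan(width, height, classify, window_sizes=(16, 32, 64), stride_ratio=0.25):
    """Run a classifier over every window at every scale.

    classify(left, top, size) is a hypothetical callback returning True
    when the window looks like text. Returns the accepted windows and
    the number of classifier evaluations performed.
    """
    hits = []
    evaluated = 0
    for size in window_sizes:
        stride = max(1, int(size * stride_ratio))
        for left, top, s in sliding_windows(width, height, size, stride):
            evaluated += 1
            if classify(left, top, s):
                hits.append((left, top, s))
    return hits, evaluated
```

Even a 64 x 64 image needs 195 classifier evaluations with these settings, and every extra scale or rotation to be covered adds another full pass.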
14
2. Region-based method
Local features (edges or color clustering)
Connected component analysis
Text lines and word separation
Problem:
• False candidates
Output of Stroke width transform
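The connected-component step can be sketched as follows, assuming the local-feature stage has already produced a binary mask (1 = candidate text pixel). The geometric filter and its thresholds are illustrative assumptions showing one simple way to discard false candidates; they are not the rules used in the talk.

```python
from collections import deque


def connected_components(mask):
    """Find 4-connected foreground regions in a binary mask (list of rows).

    Returns one (top, left, bottom, right, pixel_count) tuple per region.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right, count = y, x, y, x, 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right, count))
    return boxes


def filter_candidates(boxes, min_pixels=3, max_aspect=5.0):
    """Drop regions that are too small or too elongated (assumed thresholds)."""
    kept = []
    for top, left, bottom, right, count in boxes:
        height = bottom - top + 1
        width = right - left + 1
        aspect = max(height, width) / min(height, width)
        if count >= min_pixels and aspect <= max_aspect:
            kept.append((top, left, bottom, right, count))
    return kept
```

Simple size and aspect-ratio rules like these remove isolated specks and long lines, but many non-text regions still pass them, which is the false-candidate problem noted on the slide.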
15
3. Hybrid method
Region based: Edge (Stroke Width Transform), Color clustering
Classifier: SVM, Random Forest, AdaBoost
16
Problems
1. Character/word annotation: time-consuming task
2. Transparent text: hard to detect
17
Problem 1: Character/word annotation
Time consuming for many images
18
Problem 2: Transparent text
• Weak edges (difficult to detect)
19
Outline
1 Text detection overview
2 Current methods
3 RIT’s approach
20
RIT’s Approach
1. Character/word annotation: time-consuming task
Text image classifier using image-wise annotation
2. Transparent text: hard to detect
Transparent text detection and background recovery
21
1. Text image classifier using image-wise annotation
• Text image detection (not char/word)
– Image-wise annotation (less time)
– Clustering detected regions (measure text likeliness)
22
Image-wise Annotation
Character-wise: draw rectangles around the text (e.g. 送料無料, "free shipping")
Image-wise: classify each image as text or non-text
23
Clustering detected regions
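[Figure: detected regions plotted in feature space (axes f1, f2) and grouped into clusters C1 to C5. Legend: region in text images; region in non-text images; x = cluster center. Example text likelihoods: P(C1) = 3/4, P(C4) = 0/3.]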
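The per-cluster likelihoods shown here (P(C1) = 3/4, P(C4) = 0/3) read as "regions from text images in the cluster, divided by all regions in the cluster". A minimal sketch of that count follows, assuming each region is a feature tuple such as (f1, f2) tagged with its image-wise label, and that the cluster centers were already produced by some clustering step (k-means would be one option; the talk does not name the algorithm). All function names are hypothetical.

```python
from collections import defaultdict


def nearest_center(feature, centers):
    """Index of the cluster center closest to the feature (squared Euclidean)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(feature, centers[i])),
    )


def cluster_text_likelihood(regions, centers):
    """Estimate P(text | cluster) from image-wise labels.

    regions: list of (feature, from_text_image) pairs, where
    from_text_image is True when the region was detected in an image
    annotated as containing text.
    Returns {cluster_index: fraction of its regions from text images}.
    """
    totals = defaultdict(int)
    text_counts = defaultdict(int)
    for feature, from_text_image in regions:
        c = nearest_center(feature, centers)
        totals[c] += 1
        if from_text_image:
            text_counts[c] += 1
    return {c: text_counts[c] / totals[c] for c in totals}
```

With four regions near one center (three of them from text images) and three regions near another (none from text images), the function returns 0.75 and 0.0, matching the 3/4 and 0/3 in the figure.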
24
Comparison
• 500 Rakuten images
• Compared with a traditional region-based method
[Bar chart: accuracy of the current vs. proposed method; vertical axis from 0% to 90%]
Better than a typical method
25
RIT’s Approach
1. Character/word annotation: time-consuming task
Text image classifier using image-wise annotation
2. Transparent text: hard to detect
Transparent text detection and background recovery
26
2. Transparent text detection and background recovery
• Edge detection with adaptive threshold
– Image content analysis
• Background recovery
– Text color/opacity estimation
27
Edge detection with adaptive thresholds
Less noise
Weak edges are better preserved
28
Texture strength
Measuring image complexity
Direction and energy: eigenvectors and eigenvalues[1]
Image patches:
Texture strength:
[1] Xiang Zhu and Peyman Milanfar, “Automatic parameter selection for denoising algorithms using a no-reference measure of image content,” IEEE Transactions on Image Processing, pp. 3116–3132, 2010.
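The formulas for the image patches and the texture strength did not survive in this transcript. As a rough illustration of "direction and energy from eigenvectors and eigenvalues", the sketch below builds the 2 x 2 gradient covariance of a patch and takes its eigenvalues in closed form. Using the square root of the larger eigenvalue as the strength is an assumption made for this example, not the measure defined in the talk or in [1].

```python
import math


def gradient_covariance(patch):
    """Sum of gradient outer products over a grayscale patch (list of rows).

    Gradients are simple forward differences. Returns (sxx, sxy, syy),
    the entries of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    """
    sxx = sxy = syy = 0.0
    for y in range(len(patch) - 1):
        for x in range(len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x]
            gy = patch[y + 1][x] - patch[y][x]
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    return sxx, sxy, syy


def eigenvalues_2x2(sxx, sxy, syy):
    """Eigenvalues (largest first) of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = (sxx + syy) / 2.0
    diff = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return mean + diff, mean - diff


def texture_strength(patch):
    """Assumed strength measure: energy along the dominant gradient direction."""
    largest, _ = eigenvalues_2x2(*gradient_covariance(patch))
    return math.sqrt(max(largest, 0.0))
```

A flat patch scores 0 and a patch with a single strong edge scores high, so a value like this could be used to lower the edge threshold in smooth areas and raise it in busy ones. The actual mapping from strength to threshold is not given in the slides.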
29
Proposed text detection
1. Texture based (Classifier based): SVM / Random Forest / AdaBoost
2. Region based (Connected components): Edge / Color clustering
3. Hybrids: Region (Edge, Stroke Width) + Texture (AdaBoost)
30
System flow
Input image → Adaptive edge detection → Stroke width transform and connected components → Components analysis → Detected text
31
Detection result
(a) constant threshold (b) proposed
32
System flow
Input image → Adaptive edge detection → Stroke width transform and connected components → Components analysis → Detected text → Background recovery
33
Transparent Text
I: observed pixel value
O: original pixel value
• 2 unknowns: text color and opacity
• 2 or more equations
• Least-squares solution
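The slide's equation is not in the transcript, so the sketch below assumes the standard alpha-blending model I = αT + (1 − α)O for one color channel, where T (text color) and α (opacity) are symbols introduced here for illustration. Rewritten as I = aO + b with a = 1 − α and b = αT, the model is a straight line, so two or more (O, I) pairs give a least-squares fit for the two unknowns. Where the O samples come from (for example, background pixels just outside the text strokes) is also an assumption.

```python
def estimate_text_color_and_opacity(original, observed):
    """Least-squares fit of I = (1 - alpha) * O + alpha * T for one channel.

    original: assumed background values O_k
    observed: observed values I_k at the corresponding text pixels
    Returns (text_color, alpha).
    """
    n = len(original)
    if n < 2 or n != len(observed):
        raise ValueError("need two or more matching (O, I) pairs")
    mean_o = sum(original) / n
    mean_i = sum(observed) / n
    var_o = sum((o - mean_o) ** 2 for o in original)
    if var_o == 0:
        raise ValueError("background samples must not all be identical")
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(original, observed))
    slope = cov / var_o                    # a = 1 - alpha
    intercept = mean_i - slope * mean_o    # b = alpha * T
    alpha = 1.0 - slope
    if alpha == 0:
        raise ValueError("fit found no overlay (alpha = 0)")
    return intercept / alpha, alpha


def recover_background(observed_value, text_color, alpha):
    """Invert the blend to get the original pixel value (requires alpha < 1)."""
    return (observed_value - alpha * text_color) / (1.0 - alpha)
```

For white text (T = 255) at 50% opacity over backgrounds 0, 100, and 200, the observed values are 127.5, 177.5, and 227.5. The fit returns T = 255 and α = 0.5, and inverting the blend maps 177.5 back to 100.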
34
Extraction result
(a) original (b) recovered
35
Comparison with InPainting
Original
InPainting Rakuten
Magic
Patented!
36
Thank you!
Details: ACPR 2013