[RakutenTechConf2013] [C4-1] Text detection in product images

Text detection in product images

10/26/2013

Naoki Chiba, Lead Scientist

Rakuten Institute of Technology
Rakuten Inc.
http://rit.rakuten.co.jp/

Product images

Sales pitches in images

Applications:
• Content retrieval/filtering
• Recognition
• Translation

RIT Text Detector

Far more accurate

Works like magic

Outline

1 Text detection overview

2 Current methods

3 RIT’s approach

Outline

1 Text detection overview

2 Current methods

3 RIT’s approach

Academic Research

Natural scene OCR ≠ traditional scanned OCR
• Camera captured
• Illumination variations
• Perspective distortion
• Short text

Source: ICDAR Text locating competition

Digital-born text Natural-scene text

Product Images - Two Purposes

1. Sales pitches

2. Product list

Text’s role is different

Product list

Sales pitch (merchant names, price, shipping)

“Now Printing” images

Showing image unavailability, but...

Not updated

Text detection for product images

More accurate

Much Faster

Outline

1 Text detection overview

2 Current methods

3 RIT’s approach

Current methods

1. Texture based (Classifier-based)
2. Region based (Connected components)
3. Hybrids

1. Texture-based method

Special texture
Scan
Classifier (SVM, AdaBoost or neural network)

Problems:

• Scale/Rotation variant

• High computation
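
A minimal sketch of the scan-and-classify idea above, to show where the computation cost comes from: every window position at every scale is passed to the classifier. The function names, window sizes, stride and the toy `high_contrast` rule are illustrative assumptions, not taken from the talk; a real system would plug in a trained SVM, AdaBoost or neural network.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_window_detect(
    image: Sequence[Sequence[float]],
    is_text: Callable[[List[List[float]]], bool],
    window_sizes: Sequence[int] = (8, 16),
    stride: int = 4,
) -> List[Box]:
    """Scan square windows of several sizes and keep those the classifier accepts."""
    height, width = len(image), len(image[0])
    hits: List[Box] = []
    for size in window_sizes:  # one full pass over the image per scale
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                patch = [list(row[x:x + size]) for row in image[y:y + size]]
                if is_text(patch):
                    hits.append((x, y, size, size))
    return hits


def high_contrast(patch: List[List[float]]) -> bool:
    """Toy stand-in for a trained classifier (assumption): wide value range."""
    values = [v for row in patch for v in row]
    return max(values) - min(values) > 100


# 16x16 dark image with a bright 4x4 block whose top-left corner is (4, 4).
image = [
    [255.0 if 4 <= x < 8 and 4 <= y < 8 else 0.0 for x in range(16)]
    for y in range(16)
]
boxes = sliding_window_detect(image, high_contrast)
```

Because the windows have fixed sizes and orientation, text at other scales or angles needs extra passes, which is the scale/rotation problem listed above.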

2. Region-based method

Local features (edges or color clustering)

Connected component analysis
Text lines and word separation

Problem:

• False candidates

Output of Stroke width transform
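
The connected component step can be sketched as follows. This is a simplified illustration on a binary mask, not the stroke width transform itself; the size and aspect-ratio thresholds in `plausible_character` are hypothetical values chosen only to show how false candidates are filtered.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (x, y)


def connected_components(mask: List[List[int]]) -> List[List[Pixel]]:
    """Group 4-connected foreground pixels (value 1) into components."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components: List[List[Pixel]] = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels: List[Pixel] = []
            while queue:  # breadth-first flood fill
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append(pixels)
    return components


def plausible_character(pixels: List[Pixel], min_pixels: int = 3,
                        max_aspect: float = 5.0) -> bool:
    """Reject specks and long thin lines (thresholds are assumptions)."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    w = max(xs) - min(xs) + 1
    h = max(ys) - min(ys) + 1
    return len(pixels) >= min_pixels and max(w, h) / min(w, h) <= max_aspect


mask = [
    [1, 1, 0, 0, 0, 0, 0, 0],  # 2x2 blob: kept
    [1, 1, 0, 0, 1, 0, 0, 0],  # single pixel at x=4: rejected as a speck
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],  # long line: rejected by aspect ratio
]
components = connected_components(mask)
candidates = [c for c in components if plausible_character(c)]
```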

3. Hybrid method

Region based: Edge (Stroke Width Transform), Color clustering

Classifier: SVM, Random Forest, AdaBoost

Problems

1. Character/word annotation: time-consuming task

2. Transparent text: hard to detect

Problem 1: Character/word annotation

Time consuming for many images

Problem 2: Transparent text

• Weak edges (difficult to detect)

Outline

1 Text detection overview

2 Current methods

3 RIT’s approach

RIT’s Approach

1. Text image classifier using image-wise annotation

2. Transparent text detection and background recovery

1. Text image classifier using image-wise annotation

• Text image detection (not char/word)
  – Image-wise annotation (less time)
  – Clustering detected regions (measure text likeliness)

Image-wise Annotation

Character-wise: draw rectangles (example text: 送料無料, "free shipping")

Image-wise: classify text/non-text

Clustering detected regions

[Figure: detected regions plotted in a feature space (f1, f2) and grouped into clusters C1 to C5. Legend: region in text images; region in non-text images; x = cluster center.]

P(C1) = 3/4

P(C4) = 0/3
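
The two values above are consistent with reading P(C) as the fraction of a cluster's regions that come from images annotated as text. That reading is an assumption drawn from the figure, and the function below is a sketch with hypothetical names, not the talk's implementation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def cluster_text_likeliness(regions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each cluster id to (# regions from text images) / (# regions in cluster).

    Each region is (cluster_id, came_from_text_image); the boolean comes from
    the image-wise annotation, so no character-level labels are needed.
    """
    totals: Counter = Counter()
    from_text: Counter = Counter()
    for cluster_id, in_text_image in regions:
        totals[cluster_id] += 1
        if in_text_image:
            from_text[cluster_id] += 1
    return {cid: from_text[cid] / totals[cid] for cid in totals}


# Counts matching the slide: C1 has 3 of 4 regions from text images, C4 has 0 of 3.
regions = [("C1", True)] * 3 + [("C1", False)] + [("C4", False)] * 3
likeliness = cluster_text_likeliness(regions)
```

Under this reading, a region that falls into a high-P cluster is more likely to be text, and an image can be scored from the clusters its regions fall into.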

Comparison

• Rakuten 500 images
• Compared with a traditional region-based method

[Bar chart: accuracy of the current and proposed methods; vertical axis from 0% to 90%.]

Better than a typical method

RIT’s Approach

1. Text image classifier using image-wise annotation

2. Transparent text detection and background recovery

2. Transparent text detection and background recovery

• Edge detection with adaptive threshold
  – Image content analysis

• Background recovery
  – Text color/opacity estimation

Edge detection with adaptive thresholds

Less noise

Weak edges are better preserved

Texture strength

Measuring image complexity

Direction and energy: eigenvectors and eigenvalues[1]

Image patches:

Texture strength:

[1] Xiang Zhu and Peyman Milanfar, “Automatic parameter selection for denoising algorithms using a no-reference measure of image content,” IEEE Transactions on Image Processing, pp. 3116–32, 2010.
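
The formulas on this slide did not survive in the transcript. As an illustration only, the sketch below computes the content measure from the cited reference [1]: with s1 >= s2 the singular values of a patch's gradient matrix, Q = s1 (s1 - s2) / (s1 + s2). Whether the talk's texture strength is exactly this quantity is an assumption, and the function name is hypothetical.

```python
import math
from typing import Sequence


def texture_strength(patch: Sequence[Sequence[float]]) -> float:
    """Return s1 * (s1 - s2) / (s1 + s2) for the patch's gradient matrix.

    s1 >= s2 are the singular values of the N x 2 matrix of (gx, gy)
    gradients, obtained here as square roots of the eigenvalues of the
    2 x 2 matrix [[sum gx*gx, sum gx*gy], [sum gx*gy, sum gy*gy]].
    """
    height, width = len(patch), len(patch[0])
    gxx = gxy = gyy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # central differences
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
    trace = gxx + gyy
    root = math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy)
    s1 = math.sqrt((trace + root) / 2.0)
    s2 = math.sqrt(max((trace - root) / 2.0, 0.0))  # clamp rounding noise
    if s1 + s2 == 0.0:
        return 0.0  # flat patch: no gradient energy
    return s1 * (s1 - s2) / (s1 + s2)


flat = [[50.0] * 5 for _ in range(5)]
edge = [[100.0 if x >= 3 else 0.0 for x in range(5)] for _ in range(5)]
flat_strength = texture_strength(flat)
edge_strength = texture_strength(edge)
```

A flat patch scores zero and a patch with one dominant edge direction scores high, so a measure of this kind can drive the choice of edge threshold per image.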

Proposed text detection

1. Texture based (Classifier based)
   SVM/Random Forest/AdaBoost

2. Region based (Connected components)
   Edge/Color Clustering

3. Hybrids
   Region (Edge Stroke Width) + Texture (AdaBoost)

System flow

Input image → Adaptive edge detection → Stroke width transform and connected component → Components analysis → Detected text

Detection result

(a) constant threshold (b) proposed

System flow

Input image → Adaptive edge detection → Stroke width transform and connected component → Components analysis → Detected text → Background recovery

Transparent Text

I: observed pixel value

O: original pixel value

[Figure labels: T, I, O]

• 2 unknowns: text color and opacity
• 2 or more equations
• Least squares solution
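
A minimal sketch of the estimation, assuming the standard alpha-blending model I = (1 - alpha) * O + alpha * T, where alpha is the opacity and T the text color. The slide's own equation is not in the transcript, so the model, the single-channel simplification and all names below are assumptions. The sketch also assumes that pairs of original and observed values are available, for example from pixels on either side of a text boundary.

```python
from typing import Sequence, Tuple


def estimate_text_overlay(original: Sequence[float],
                          observed: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of I = (1 - alpha) * O + alpha * T.

    Returns (alpha, text_color). Needs at least two pairs with distinct
    original values, matching the "2 or more equations, 2 unknowns" setup.
    """
    n = len(original)
    if n < 2 or n != len(observed):
        raise ValueError("need two or more (original, observed) pairs")
    mean_o = sum(original) / n
    mean_i = sum(observed) / n
    var_o = sum((o - mean_o) ** 2 for o in original)
    if var_o == 0.0:
        raise ValueError("original values must not all be equal")
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(original, observed))
    slope = cov / var_o                  # equals 1 - alpha
    intercept = mean_i - slope * mean_o  # equals alpha * T
    alpha = 1.0 - slope
    if alpha == 0.0:
        raise ValueError("no overlay detected (alpha is zero)")
    return alpha, intercept / alpha


def recover_background(observed_value: float, alpha: float,
                       text_color: float) -> float:
    """Invert the blend to get the original pixel value (requires alpha < 1)."""
    return (observed_value - alpha * text_color) / (1.0 - alpha)


# Synthetic check: overlay with opacity 0.4 and text color 200.
original = [10.0, 50.0, 120.0, 240.0]
observed = [0.6 * o + 0.4 * 200.0 for o in original]
alpha, text_color = estimate_text_overlay(original, observed)
restored = recover_background(observed[2], alpha, text_color)
```

With more than two pairs the fit averages out noise, which is the reason for a least-squares solution rather than solving two equations exactly.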

Extraction result

(a) original (b) recovered

Comparison with InPainting

Original

InPainting Rakuten

Magic

Patented!

Thank you!

Details: ACPR 2013
