Dr. Anshuman Razdan Director (razdan@asu)
3D Handwriting Analysis
A. Razdan, J. Femiani, J. Rowe
Partnership for Research in Spatial Modeling (PRISM)
Dr. Anshuman Razdan
Director
04/19/23 2
Parsing the OCR Problem
• Preprocessing and Image enhancement
• Pen Stroke Creation
• Character recognition
• Word recognition
Image Enhancement
• Preprocessing includes enhancing and refining the raw image.
• Identifying and extracting blurred, stained, faded, bled-through, or transferred characters, etc.
• New PRISM method specifically identifies and analyzes linear structures (line strokes).
• This technique works in both 3D (CT, MRI) and 2D (images) domains.
Image Refinement
• 1D and 2D function models based on the 3 observed shape characteristics have been developed, and enhanced images are derived from their second derivatives.
• A two-stage algorithm is developed to extract line and net patterns. Line and net patterns are first enhanced and then extracted by applying a threshold value.
• Line and net patterns in a noisy environment exist in many imaging technologies.
• Examples: roads and rivers in satellite photos, curves in fingerprints, blood vessels in CT angiography
Flatland: A Romance of Many Dimensions
• You have to view the problem in at least one dimension higher than the data to get a sense of it (Flatland: A Romance of Many Dimensions, by Edwin A. Abbott writing as "A Square", circa 1884).
[Figure: King of 1D Land; observer in 2D Land; woman; High Priest. You are in 3D looking down at 2D space.]
Now I See, Now I Don’t
[Embedded demo: PRISM KGL Mesh Viewer Control, C:\RazdanData\Prism\KDI\Presentations/tub_mesh_connected.kgl]
Flatland Conclusion
• 1D (lines) embedded in 2D space (paper surface)
• 2D (images) embedded in 3D space (like this room)
• 3D (objects) embedded in 4D or 5D space …
• Given this argument, using 3D space to understand 2D images makes sense …
3D Pen Trace Recreation
• Concept of raising or embedding a 2D image in 3D space, a.k.a. Flatland.
• Understanding ink flow and information embedded in the pen strokes
• Theory of Volume Modeling and Iso-surface Extraction
Chain Codes or Pen Traces
• For any character matching/recognition algorithm to work efficiently, it needs to unravel the stroking of the pen.
• This means figuring out the chain code. Since the chain code is not available in a 2D bitmap, we recover it using 3D.
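To make the term concrete, here is a minimal sketch of a Freeman-style 8-direction chain code. It assumes the pixel order along the stroke is already known, which is exactly the information the slides say a 2D bitmap lacks; the direction numbering and the function name `chain_code` are illustrative choices, not taken from the presentation.

```python
# Freeman-style 8-direction chain code for an ordered pixel path.
# Assumed numbering: 0 = east, then counter-clockwise, with y increasing upward.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(path):
    """Return the list of direction codes for a path of ordered (x, y) pixels."""
    codes = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes


# A short hypothetical stroke that bends from east to north-west.
stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 3)]
print(chain_code(stroke))  # [0, 1, 2, 3]
```

The code sequence records the pen's direction of travel, which is why it is only defined once the stroke order has been recovered.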
Pen Stroking
• Pressure is applied via the pen; it differs between upstrokes and downstrokes and with the angle of writing.
• Ink flows from the pen to the paper. Crossovers result in darker images.
How 2D is raised to 3D
[Figure: 2D image, transformed into 3D]
• A transfer function is applied which converts intensity at each pixel into a height function and also a density function
• Results in volumetric data, the same as CT or MRI
H(i,j) = F(x,y, I(x,y))
D(i,j,k) = I(x,y)
Vol Func(x,y,H(i,j)) = D(I(x,y))
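The slides do not give the transfer function itself, so the following is only a sketch under an assumed one: darker pixels (more ink) get taller columns, and every voxel below the column height stores the pixel's darkness as its density. The name `lift_to_volume` and the parameters `depth` and `max_intensity` are hypothetical.

```python
def lift_to_volume(image, depth=8, max_intensity=255):
    """Turn a 2D grayscale image (rows of ints) into a 3D density volume.

    Assumed transfer function (not from the slides):
      height  H = round(depth * darkness / max_intensity)
      density D = darkness for voxels below H, 0 above
    where darkness = max_intensity - intensity.
    The result is indexed as volume[y][x][k].
    """
    volume = []
    for row in image:
        vol_row = []
        for intensity in row:
            darkness = max_intensity - intensity
            height = round(depth * darkness / max_intensity)
            column = [darkness if k < height else 0 for k in range(depth)]
            vol_row.append(column)
        volume.append(vol_row)
    return volume


# One row: white paper, mid-gray ink, black ink (for example at a crossover).
volume = lift_to_volume([[255, 128, 0]], depth=4)
```

With this choice, a crossover, which the slides note appears darker, becomes a taller and denser column than a single stroke.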
Marching Cubes
• Marching cubes is used for making 3D surfaces from volumetric data such as MRI, CAT scan, etc.
MC: Thresholding
• Explanation of how Marching Cubes uses predefined triangulations for each cube to form a whole mesh.
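As an illustration of the thresholding step only, the sketch below classifies the eight corners of one cube against an iso-value and packs the result into a case index from 0 to 255. A full Marching Cubes implementation would use that index to look up a predefined triangulation; the table is omitted here, and the corner ordering is an assumption.

```python
# Corner offsets (dx, dy, dz) of a unit cube in a fixed, assumed order.
# Bit b of the case index is set when corner b is at or above the iso-value.
CORNERS = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]


def cube_case(volume, x, y, z, iso):
    """Return the 0..255 case index of the cube whose lowest corner is (x, y, z).

    The volume is indexed as volume[z][y][x].
    """
    index = 0
    for bit, (dx, dy, dz) in enumerate(CORNERS):
        if volume[z + dz][y + dy][x + dx] >= iso:
            index |= 1 << bit
    return index


# A 2x2x2 volume with a single dense voxel at the origin corner.
volume = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]
case = cube_case(volume, 0, 0, 0, iso=0.5)
```

Cases 0 and 255 (all corners below or all at or above the iso-value) produce no triangles, since the surface does not pass through such a cube.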
Volume Blurring
• Start with Volume Function (V) on raw image (left image)
• Apply Marching Cubes on V (middle image)
• Create V’ = G^n V (blurring filter applied n times, and then MC to create right image). G^n is the secret sauce.
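The actual filter G is not disclosed in the slides. As a stand-in, the sketch below applies a simple 7-point average (each voxel with its six face neighbours) n times, which shows the shape of the V’ = G^n V step without claiming to be the PRISM filter. The names `blur_once` and `blur` are hypothetical.

```python
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def blur_once(vol):
    """One pass of a 7-point average over a volume indexed vol[z][y][x].

    Neighbours outside the volume are skipped, so border voxels
    average over fewer samples.
    """
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                total, count = vol[z][y][x], 1
                for dz, dy, dx in NEIGHBOURS:
                    zz, yy, xx = z + dz, y + dy, x + dx
                    if 0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx:
                        total += vol[zz][yy][xx]
                        count += 1
                out[z][y][x] = total / count
    return out


def blur(vol, n):
    """Apply the stand-in filter n times (the G^n of the slide)."""
    for _ in range(n):
        vol = blur_once(vol)
    return vol
```

Running an iso-surface extraction on `blur(V, n)` instead of `V` would then give the smoothed surface the slide describes.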
The Problem
• Given two curves X1 and X2, one can ask two distinct questions:
– Curve matching, i.e.
• Is X1 = X2?
• Or is one a subset of the other curve?
• Or how similar are the two curves?
– Curve alignment, i.e.
• What is the rotation and translation required to align one curve with the other?
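For the alignment question, a minimal sketch of a least-squares rigid fit in 2D is given below. It assumes both curves are sampled with the same number of points and that point i on one curve corresponds to point i on the other; the slides do not say how PRISM establishes correspondence, and the function name `align` is illustrative.

```python
import cmath


def align(src, dst):
    """Return (theta, (tx, ty)): the rotation angle in radians and the
    translation that best map src onto dst in the least-squares sense.

    src and dst are equal-length lists of (x, y) tuples with src[i]
    corresponding to dst[i]. Points are treated as complex numbers, so a
    rotation is multiplication by exp(i * theta).
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    # The optimal angle is the phase of the summed products of centred points.
    cross = sum((b - q_mean) * (a - p_mean).conjugate() for a, b in zip(p, q))
    theta = cmath.phase(cross)
    t = q_mean - cmath.rect(1.0, theta) * p_mean
    return theta, (t.real, t.imag)


# A hypothetical curve, and the same curve rotated 90 degrees then shifted by (3, -1).
curve = [(0, 0), (1, 0), (2, 1)]
moved = [(3, -1), (3, 0), (2, 1)]
theta, shift = align(curve, moved)
```

The residual left after applying the recovered rotation and translation is one possible answer to "how similar are the two curves?".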
Conclusions
• Novel method to unravel strokes, characters and letterforms in complex handwritten documents.
• Segments by Region/Row irrespective of scale, orientation, or position.
• Geometry-based curve matching technique for character recognition (dictionary generation, text recognition, and translation)
• Language independence
• Doesn’t need expensive scanning equipment (we paid $24.99).
• Can be combined with existing technologies.
• Provisional Patent filed in April 2003. Full patent filing spring 2004.
Weaknesses
• Requires a continuous-tone original source (cannot handle single-bit images, e.g. fax).
• Can be computationally expensive for certain applications such as forgery detection, but the technology is built to take advantage of parallelization.
Opportunities
• Extend concept of volumes to other applications
– Forensics (offline comparisons)
– Biometrics (online authentication – Wacom demo)
– Forgery detection
– Number extraction from noisy background (currencies)
• Opportunities for derivative patents
Gaps
• Need to combine the power of stroke extraction and curve matching with traditional HMM and other statistical methods or commercial engines.
• Manpower/expertise required
– AI/statistics/traditional character recognition expert to create a powerful hybrid engine
– Language-specific expert/paleographer
• Requires productization and field testing.
Threats
• Competition by 2D solutions and existing technologies.
• Lack of awareness of the capabilities of 3D analytical tools in the OCR world.
– Geometry solution in a world steeped in statistical methods.
• Establishing validity of the 2D - 3D conversion algorithm
PRISM Infrastructure
• Two labs on campus – one moving to a bigger space in BY – downtown Tempe.
– Additional 8000 sq ft slated for a new project (Decision Theatre) in downtown Tempe.
• 24-processor SGI, 20+ workstations (Unix, PC and Linux)
• Four 3D laser scanners for inanimate objects
• 3D face scanner (recent acquisition)
• 2 rapid prototyping machines
Image Refinement
• Biomedical Examples: White matter in brain MRI scans, cell spindle fibers, membranes in laser confocal microscopic data.
[Figures: brain MRI scan; mouse egg; fungus membrane]
Image Refinement
• Blood Vessel
3 characteristics (Chaudhuri et al.)
1. Piecewise linear segments
2. Cross section as a Gaussian function
3. Relatively constant width
2D Case: 2nd Derivatives
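With u = x\cos\theta + y\sin\theta and \sigma the width of the Gaussian cross section:

\[ F_{xx}(x,y) = -\frac{C\cos^2\theta}{\sigma^2}\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + \frac{C\cos^2\theta}{\sigma^4}\,u^2\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + N_{xx}(x,y) \]

\[ F_{xy}(x,y) = F_{yx}(x,y) = -\frac{C\cos\theta\sin\theta}{\sigma^2}\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + \frac{C\cos\theta\sin\theta}{\sigma^4}\,u^2\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + N_{xy}(x,y) \]

\[ F_{yy}(x,y) = -\frac{C\sin^2\theta}{\sigma^2}\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + \frac{C\sin^2\theta}{\sigma^4}\,u^2\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + N_{yy}(x,y) \]

where the line model is

\[ F(x,y) = C\exp\!\left(-\frac{(x\cos\theta + y\sin\theta)^2}{2\sigma^2}\right) + N(x,y) \]

C: constant, N: noise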
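Enhancement
• Maximal eigenvalue as an enhanced image

\[ F_{\mathrm{enh}}(x,y) = \begin{cases} \lambda(x,y) & \text{if } \lambda(x,y) > 0 \\ 0 & \text{if } \lambda(x,y) \le 0 \end{cases} \]

where \lambda is the eigenvalue in Hv = \lambda v. Ignoring the noise term, with u = x\cos\theta + y\sin\theta:

\[ H = \begin{pmatrix} F_{xx} & F_{xy} \\ F_{yx} & F_{yy} \end{pmatrix} = -\frac{C}{\sigma^2}\left(1 - \frac{u^2}{\sigma^2}\right)\exp\!\left(-\frac{u^2}{2\sigma^2}\right) \begin{pmatrix} \cos^2\theta & \cos\theta\sin\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{pmatrix} \]

\[ H \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} = \lambda(x,y) \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \qquad \lambda(x,y) = -\frac{C}{\sigma^2}\left(1 - \frac{u^2}{\sigma^2}\right)\exp\!\left(-\frac{u^2}{2\sigma^2}\right) \]

The other eigenvalue is 0, so \lambda(x,y) is the maximal eigenvalue when it is positive (for C < 0, when (x\cos\theta + y\sin\theta)^2 < \sigma^2).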
Enhanced Image
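The two stages described in this talk, enhancement by the maximal eigenvalue of the second-derivative matrix followed by thresholding, can be sketched as follows. This is an illustration only: it assumes dark strokes on a lighter background, uses plain finite differences with no smoothing, and the names `enhance_lines` and `extract` are hypothetical.

```python
import math


def enhance_lines(img):
    """Stage 1: largest Hessian eigenvalue at each interior pixel, clamped at 0.

    img is a list of rows of numbers. Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central finite-difference second derivatives.
            fxx = img[y][x - 1] - 2 * img[y][x] + img[y][x + 1]
            fyy = img[y - 1][x] - 2 * img[y][x] + img[y + 1][x]
            fxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                   - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            # Closed-form largest eigenvalue of the symmetric 2x2 matrix.
            mean = (fxx + fyy) / 2.0
            root = math.hypot((fxx - fyy) / 2.0, fxy)
            out[y][x] = max(mean + root, 0.0)
    return out


def extract(enhanced, threshold):
    """Stage 2: binary line mask from the enhanced image."""
    return [[1 if v > threshold else 0 for v in row] for row in enhanced]


# A light 5x7 image with a one-pixel-wide dark vertical stroke in column 3.
image = [[0 if x == 3 else 10 for x in range(7)] for _ in range(5)]
mask = extract(enhance_lines(image), threshold=5)
```

On real scans the derivatives would normally be taken after smoothing at a scale matched to the stroke width, which this sketch leaves out.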
Distance Between Two Functions
Penalty function
Case 1: f and g continuous over [0,1]
Case 2: f over [0,1] and g over [0,d], d <= 1
Curve Shape Measures
• Shape Measures or Properties
– Curvature (planar)
– Torsion (space curves)
– Total or absolute curvature (space)
• Classical differential geometry says that if the curvatures are identical then so are the curves, up to position and rotation.
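The invariance claim can be checked numerically on a sampled planar curve. The sketch below estimates curvature from consecutive point triples using the circle through three points (Menger curvature); this discretisation is an assumption for illustration, not necessarily the measure used by PRISM.

```python
import math


def menger_curvature(a, b, c):
    """Signed curvature of the circle through three 2D points.

    Positive for a counter-clockwise turn, 0 for collinear or repeated points.
    """
    # Twice the signed area of triangle abc.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    sides = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
    return 2.0 * cross / sides if sides else 0.0


def curvature_profile(points):
    """Curvature estimate at each interior sample of a polyline."""
    return [menger_curvature(p, q, r)
            for p, q, r in zip(points, points[1:], points[2:])]


# Samples on a circle of radius 2: every estimate is 1/2.
arc = [(2 * math.cos(0.3 * i), 2 * math.sin(0.3 * i)) for i in range(6)]
# The same arc rotated by 90 degrees and translated by (5, -3).
moved = [(-y + 5, x - 3) for x, y in arc]
same = all(abs(k1 - k2) < 1e-9
           for k1, k2 in zip(curvature_profile(arc), curvature_profile(moved)))
```

Because the profile does not change under rotation and translation, two curves can be compared through their curvature profiles without aligning them first.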
Curve Matching
• Remember
• Writing in terms of curvatures
• What about partial match?
• Or the general case