KI2 - 1
Kunstmatige Intelligentie / RuG
Structural Pattern Recognition
Marius Bulacu & prof. dr. Lambert Schomaker
Classification – First Step to Intelligence
The intelligent agent confronts an overwhelming confusion of sensory data, and pattern classification is the first crucial step in making sense of the world.
The nature of classification and decision has long been a central theme in philosophical epistemology, the study of the nature of knowledge.
The foundations of pattern recognition can be traced back to Plato and later Aristotle, who distinguished between:
- an “essential property” – shared by all members in a class or “natural kind”
- an “accidental property” – which would differ among members in the class
Pattern recognition can be cast as the problem of finding such essential properties of a category.
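General Structure of a Pattern Recognition System
[Figure: processing chain input → sensing → segmentation → feature extraction → classification → post-processing → decision, with costs, context and missing features as additional influences. The slide flags a "DEEP PROBLEM!" in this chain.]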
Statistical vs. Structural Pattern Recognition
Statistical
Patterns are represented by an n-dimensional feature vector.
Each feature is a numerical measure of a characteristic of the pattern.
The statistical variations of the features within each class are described and evaluated.
The feature space is partitioned into mutually exclusive regions with each region belonging to a specific class.
Recognition of an unknown sample is performed by determining in which region it falls and therefore to which pattern class it belongs.
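As a rough illustration of partitioning a feature space into class regions, the sketch below uses a nearest-centroid rule: each class is summarised by the mean of its training vectors, and an unknown sample is assigned to the class whose mean is closest. The slides do not prescribe a particular classifier; the rule, the function names and the sample data are assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(samples):
    """Compute one mean feature vector per class.

    samples: dict mapping a class label to a list of equal-length feature vectors.
    """
    centroids = {}
    for label, vectors in samples.items():
        n = len(vectors)
        centroids[label] = tuple(sum(column) / n for column in zip(*vectors))
    return centroids


def classify(x, centroids):
    """Return the label whose centroid is nearest to feature vector x."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))


# Hypothetical two-dimensional training data for two classes.
training = {
    "A": [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)],
    "B": [(6.0, 5.0), (7.0, 6.0), (6.0, 7.0)],
}
centroids = fit_centroids(training)
print(classify((2.0, 2.0), centroids))
print(classify((6.0, 6.0), centroids))
```

With this rule the regions are separated by the set of points equidistant from the centroids, so every point of the feature space belongs to exactly one class (apart from ties on the boundary).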
Structural
Patterns are represented by knowledge about how sub-pattern primitives relate to each other and must be combined to make up the entire pattern.
The primitives are simple and invariant sub-pattern formations that have no direct relation to the structure of the entire pattern.
Patterns are modeled in terms of primitives and their relations and are usually represented as strings, trees or graphs.
Recognition of an unknown pattern is performed by finding the most similar sample from a database of known objects using error-tolerant string or graph matching.
Knowledge-based Symbolic Methods
Assumption: the Turing / Von Neumann computer is a universal computation engine…
…therefore it can be used at all levels of information processing:
provided an appropriate algorithm can be designed which operates on appropriate representations
…provided an appropriate algorithm can be designed…
mechanisms: recursion, hierarchic procedures, search algorithms, parsers, matching algorithms, string manipulation, numerical computing, signal processing, image processing, statistical processing
…which operates on appropriate representations…
representations: stacks, linear strings and arrays, matrices, linked lists, trees
…is indeed successful in many information processing problems.
Example: double spiral problem
Is the point in the inner or the outer spiral?
Difficult for, e.g., neural nets.
How?
- flood-fill algorithm
- other?
- Find the right representation! Count the intersections: the odd/even count is not sensitive to shape variations of the spiral, so it is a general solution.
Answer: outside
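The odd/even idea can be sketched as a crossing count. The code below assumes the boundary is available as a closed polygon (a list of vertices) and that the intersections are counted along a horizontal ray from the query point; both are assumptions of this sketch, since the slide only says to count the intersections. A concave U-shaped polygon stands in for the spiral.

```python
def crossings(point, polygon):
    """Count how often a horizontal ray from point towards +x crosses the polygon boundary."""
    px, py = point
    count = 0
    n = len(polygon)
    for k in range(n):
        (x1, y1), (x2, y2) = polygon[k], polygon[(k + 1) % n]
        # The edge crosses the ray's line only if its endpoints lie on opposite sides of y = py.
        if (y1 > py) != (y2 > py):
            x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at > px:
                count += 1
    return count


def is_inside(point, polygon):
    """Odd number of crossings means inside, even means outside."""
    return crossings(point, polygon) % 2 == 1


# Hypothetical U-shaped region; the notch between the two arms is outside.
u_shape = [(0, 0), (5, 0), (5, 5), (4, 5), (4, 1), (1, 1), (1, 5), (0, 5)]
print(is_inside((2.5, 3), u_shape))  # point in the notch
print(is_inside((0.5, 3), u_shape))  # point in the left arm
```

The count depends only on how many times the boundary is crossed, not on how the boundary winds, which is why the same test carries over to strongly curved shapes.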
Culture
- If it doesn’t work, you didn’t think hard enough.
- You have to know what you do.
- You have to prove that & why it works.
- Even neural networks work on top of the Turing/von Neumann engine (it will always win).
- If you’re smart, you can often avoid NP-completeness.
- Use of probabilities is a sign of weakness.
Strong Points
- Scalability is often possible
- Convenience: little context dependence, no training
- Reusability
- Transformability (compilation)
- Algorithmic refinement once it is known how to do a trick (e.g., graphics cards and DSPs in mobile phones: ugly code but highly efficient)
Challenges
- Knowledge dependence is expensive
  – not a problem in “IT” application design
  – a challenge to AI
- Uncertainty
- Noise
- Brittleness
Solutions
- More and more representational weight (UML, Semantic Web, “XML solves everything”)
- Symbolic learning mechanisms:
  – induction: version spaces, grammar inference
  – decision tree learning
  – rewriting formalisms
- Active hypothesis testing (what if…, assume X…)
Example 1
Primitives:
horizontal stick - H
vertical stick - V
loop west - W
loop east - E
loop north - N
loop south - S
closed loop - C
Patterns:
a - WC
b - VC
c - E
d - CV
e - CE
h - VS
s - EW
z - WE
x - NWES
T - HV
6 - EC
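A structural recogniser for this example can be sketched as a lookup from primitive strings to characters. The table below pairs the letters and codes of the slide in order, which is a reading of the flattened slide layout rather than a confirmed mapping. For strings that are not in the table, the sketch falls back on `difflib.get_close_matches` from the Python standard library, which ranks candidates by a similarity ratio; this is a stand-in for error-tolerant matching, not the edit distance itself.

```python
from difflib import get_close_matches

# Primitive string -> character, as read from the pattern list (assumed pairing).
PATTERNS = {
    "WC": "a", "VC": "b", "E": "c", "CV": "d", "CE": "e", "VS": "h",
    "EW": "s", "WE": "z", "NWES": "x", "HV": "T", "EC": "6",
}


def recognize(primitives, cutoff=0.6):
    """Return the character for a primitive string, or None if nothing is close enough."""
    if primitives in PATTERNS:
        return PATTERNS[primitives]
    # Error-tolerant fallback: take the most similar known primitive string.
    close = get_close_matches(primitives, list(PATTERNS), n=1, cutoff=cutoff)
    return PATTERNS[close[0]] if close else None


print(recognize("VC"))   # exact match
print(recognize("VCC"))  # one spurious closed loop
print(recognize("QQ"))   # unknown primitives
```

Note that the order of primitives matters: "VC" and "CV" contain the same primitives but denote different characters.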
Example 2
In Reading Systems (optical character recognition), only a small part of the algorithm concerns problems of image processing and character classification.
Most of the code is concerned with the structure of the text image:
– where are the blobs?
– are these blobs text, photo or graphics?
– how to segment into meaningful chunks: characters, words?
– what is the logical organization (reading order) in the physical organization of pixels?
Knowledge-based approaches are a necessity!
Example of layout analysis
[Figure: a page segmented into labelled blocks: name of conference, programme committee, brief description of conference, submission details.]
Knowing the type of a text block strongly reduces the number of possible interpretations
Example: “address block”
Address:
– name of person
– street, number
– postal code, city

Sample text:
prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland
Amsterdam 7/7/2003

Interpretation of the block at increasing levels of detail:
1) whole block: address
2) per line: person name / street / codes+city / country
3) per token: titles, initials, surname / street, street, …, digits / 4 digits, 2 upper case, city name / country name

Content:
<address>
  <person>
    <title></title>
    <initials or first name></initials or first name>
    <surname></surname>
  </person>
  <home>
    <street name></street name>
    <number></number>
  </home>
  <city>
    <postal code>
      <four digits></four digits>
      <white space></white space>
      <two upper-case letters> ….
    </postal code>
  </city>
  <country></country>
</address>
etc.

Layout:
(address (title is-left-of initials is-left-of surname) is-above (street name is-left-of number) is-above (city) is-above (country))
etc.

Knowledge of content and layout HELPS TEXT CLASSIFICATION and HELPS TEXT SEGMENTATION.
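The address template can be sketched as a small set of regular expressions, one per line of the block. The patterns follow the token descriptions on the slides (titles, initials, surname; street words and digits; four digits, two upper-case letters, city name), but the exact expressions, the function name `parse_address` and the fixed four-line layout are assumptions of this sketch.

```python
import re

# One pattern per line of the assumed four-line address template.
PERSON = re.compile(
    r"^(?P<titles>(?:[a-z]+\.?\s+)*)(?P<initials>(?:[A-Z]\.)+)\s+(?P<surname>.+)$"
)
STREET = re.compile(r"^(?P<street>.+?)\s+(?P<number>\d+)$")
CITY = re.compile(r"^(?P<digits>\d{4})\s(?P<letters>[A-Z]{2})\s+(?P<city>.+)$")


def parse_address(block):
    """Parse a four-line address block into its fields; raise ValueError if it does not fit."""
    lines = [line.strip() for line in block.strip().splitlines()]
    if len(lines) != 4:
        raise ValueError("expected person, street, postal code + city, country")
    person = PERSON.match(lines[0])
    street = STREET.match(lines[1])
    city = CITY.match(lines[2])
    if not (person and street and city):
        raise ValueError("block does not fit the address template")
    return {
        "titles": person["titles"].split(),
        "initials": person["initials"],
        "surname": person["surname"],
        "street": street["street"],
        "number": street["number"],
        "postal_code": f'{city["digits"]} {city["letters"]}',
        "city": city["city"],
        "country": lines[3],
    }


sample = """prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland"""
print(parse_address(sample))
```

Because the block type is known in advance, each line only has to be tested against one pattern, which illustrates how knowing the type of a text block reduces the number of possible interpretations.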
Spatial Relations in the XY Plane Between Rectangles
Position of a second rectangle relative to rectangle A on the X axis (similar on the Y axis):
1) left-disjoint
2) left-touch
3) left-overlap
4) included-touch_left
5) included / includes
6) included-touch_right
7) right-overlap
8) right-touch
9) right-disjoint
[Figure: rectangle A with the nine positions 1 to 9 drawn along the X axis.]
Constructing a Graph Representation
[Figure: three rectangles A, B and C on a page and the corresponding graph with nodes A, B and C. Each edge carries a pair of relation codes from the list of nine relations, here (4, 9), (7, 9) and (9, 4); the pair presumably gives the X-axis relation followed by the Y-axis relation.]
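The nine relations and the graph construction can be sketched as follows. Several points are assumptions of this sketch and not stated on the slides: intervals are given as (start, end) pairs with start < end, the code describes where the second interval lies relative to the first, identical intervals are mapped to relation 5, and "left" on the Y axis means the smaller coordinate. The rectangles are hypothetical and do not reproduce the figure.

```python
from itertools import permutations


def interval_relation(a, b):
    """Return the code 1..9 describing where interval b lies relative to interval a."""
    (a1, a2), (b1, b2) = a, b
    if b2 < a1:
        return 1  # left-disjoint
    if b2 == a1:
        return 2  # left-touch
    if b1 > a2:
        return 9  # right-disjoint
    if b1 == a2:
        return 8  # right-touch
    if b1 == a1 and b2 == a2:
        return 5  # identical intervals: treated as included / includes (assumption)
    if b1 == a1:
        return 4  # included-touch_left
    if b2 == a2:
        return 6  # included-touch_right
    if b1 < a1 and b2 < a2:
        return 3  # left-overlap
    if b1 > a1 and b2 > a2:
        return 7  # right-overlap
    return 5      # strict containment in either direction


def rect_relation(ra, rb):
    """Rectangles are (x1, y1, x2, y2); return (X-axis code, Y-axis code)."""
    return (
        interval_relation((ra[0], ra[2]), (rb[0], rb[2])),
        interval_relation((ra[1], ra[3]), (rb[1], rb[3])),
    )


def build_relation_graph(rects):
    """Return a dict mapping each ordered pair of rectangle names to its edge label."""
    return {
        (name_a, name_b): rect_relation(ra, rb)
        for (name_a, ra), (name_b, rb) in permutations(rects.items(), 2)
    }


# Hypothetical page layout with three blocks.
rects = {"A": (0, 0, 4, 2), "B": (6, 0, 9, 2), "C": (0, 3, 2, 5)}
graph = build_relation_graph(rects)
for edge, label in sorted(graph.items()):
    print(edge, label)
```

The resulting labelled graph no longer depends on absolute coordinates, so two layouts with the same arrangement of blocks yield the same graph even if the blocks differ in size.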
Example 3
Primitives:
Head – H
Arm – A
Leg – L
Tail – T
Patterns:
HAALL
HTTLLLL
How similar are these two patterns?
What is a proper similarity / distance measure between strings?
String Matching with Errors: Edit Distance
Edit distance = how many fundamental operations are required to transform one string x into another string y
The fundamental operations are:
Substitution: a character in x is replaced by the corresponding character in y
Insertion: a character of y is inserted into x
Deletion: a character in x is deleted
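Edit Distance – The Cost Matrix
[Figure: cost matrix for the strings “vriendelijk” (columns) and “friendly” (rows). The cheapest sequence of operations is a path from the source (top-left cell) to the sink (bottom-right cell).]

        v  r  i  e  n  d  e  l  i  j  k
     0  1  2  3  4  5  6  7  8  9 10 11
  f  1  1  2  3  4  5  6  7  8  9 10 11
  r  2  2  1  2  3  4  5  6  7  8  9 10
  i  3  3  2  1  2  3  4  5  6  7  8  9
  e  4  4  3  2  1  2  3  4  5  6  7  8
  n  5  5  4  3  2  1  2  3  4  5  6  7
  d  6  6  5  4  3  2  1  2  3  4  5  6
  l  7  7  6  5  4  3  2  2  2  3  4  5
  y  8  8  7  6  5  4  3  3  3  3  4  5

The bottom-right cell gives the result: d(x, y) = d(y, x) = 5.

C[i, j] = min( C[i-1, j] + 1, C[i, j-1] + 1, C[i-1, j-1] + 1 - δ(x[i], y[j]) )
where δ(a, b) = 1 if a = b and 0 otherwise, with C[i, 0] = i and C[0, j] = j.
The three terms are, in order:
- deletion: remove letter of x
- insertion: insert letter of y into x
- substitution: replace letter of x by letter of y; if the two letters are equal this is "no change" and costs nothing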
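The recurrence can be written out as a short dynamic-programming routine. This is a minimal sketch with unit costs for all three operations, as in the formula; the function name `edit_distance` is an assumption, and Python's 0-based indexing means that letter i of x is `x[i - 1]`.

```python
def edit_distance(x, y):
    """Return the minimum number of substitutions, insertions and deletions turning x into y."""
    rows, cols = len(x) + 1, len(y) + 1
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        C[i][0] = i  # delete the first i letters of x
    for j in range(cols):
        C[0][j] = j  # insert the first j letters of y
    for i in range(1, rows):
        for j in range(1, cols):
            same = 1 if x[i - 1] == y[j - 1] else 0
            C[i][j] = min(
                C[i - 1][j] + 1,             # deletion
                C[i][j - 1] + 1,             # insertion
                C[i - 1][j - 1] + 1 - same,  # substitution, or no change if equal
            )
    return C[-1][-1]


print(edit_distance("vriendelijk", "friendly"))
print(edit_distance("HAALL", "HTTLLLL"))
```

The second call applies the measure to the two primitive strings of Example 3. The full matrix costs time and memory proportional to the product of the two string lengths; keeping only the previous row would reduce the memory use if only the final distance is needed.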
Statistical vs. Structural Pattern Recognition
Statistical
vectors of real numbers
probability distributions
metrics
Structural
lists of nominal attributes
strings, trees, graphs
rules