Lecture #1 - Pattern Recognition


Transcript of Lecture # 1- Pattern Recognition

  • 8/3/2019 Lecture # 1- Pattern Recognition


Course Syllabus: Course Content

    Introduction to Pattern Recognition

    Statistical Pattern Recognition

    Bayesian Decision Theory

    Quadratic Classifier

    Parameter and Density Estimation

    Nearest Neighbors

Linear Discriminants

Validation

    Dimensionality Reduction

    Principal Component Analysis

    Feature Subset Selection


Course Syllabus (cont.)

Clustering

    Mixture Models

    Self Organizing Map

Neural Network Approaches

Single-Layer Perceptron

    Multi-Layer Perceptron

    Radial Basis Functions

Advanced Topics

Decision Trees

    Hidden Markov Models

    Support Vector Machines


Course Syllabus: Course Prerequisites

Basic probability theory, statistics, and basic linear algebra.

Basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program.

    Relation to Other Courses

    Machine Learning

    Data Mining


Course Syllabus: Reference Book

Pattern Classification, Richard O. Duda et al., Second Edition, 2000.

Grading Policy

Assignments and presentation.

    Project and Paper submission.

    Final Exam.


What is a pattern? A pattern is an entity, vaguely defined, that could be given a name: a fingerprint image, a handwritten word, a human face, a speech signal, a DNA sequence, etc.


What is Recognition? It is the ability to classify.


What is Pattern Recognition (PR)? It is the study of how machines can

observe the environment,

learn to distinguish patterns of interest from their background, and

make sound and reasonable decisions about the categories of the patterns.


Other Patterns

Insurance, credit card applications - applicants are characterized by: # of accidents, make of car, year of model, income, # of dependents, credit worthiness, mortgage amount.

Web documents: keyword-based descriptions.

Housing market: location, size, year, school district, etc.


Pattern Class

A collection of similar (not necessarily identical) objects.

Inter-class variability: characters from different classes can look similar.

Intra-class variability: the letter T in different typefaces.


Pattern Class Model

Different descriptions, typically mathematical in form, for each class/population (e.g., a probability density such as a Gaussian).
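The slide's Gaussian example can be made concrete. A minimal sketch of a univariate Gaussian class model, assuming illustrative parameters (the function name, mu, and sigma are placeholders, not from the slides):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma^2): one possible
    mathematical model for a pattern class (parameters are illustrative)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

Comparing gaussian_pdf(x, mu_A, sigma_A) against gaussian_pdf(x, mu_B, sigma_B) scores which class model fits a measurement better.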


Pattern Recognition Key Objectives:

Process the sensed data to eliminate noise.

Hypothesize the models that describe each class population (e.g., recover the process that generated the patterns).

Given a sensed pattern, choose the best-fitting model for it and then assign it to the class associated with the model.


Classification vs Clustering

Classification (known categories) - supervised classification (recognition).

Clustering (creation of new categories) - unsupervised classification.


Emerging PR Applications

Problem | Input | Output

Speech recognition | Speech waveforms | Spoken words, speaker identity

Identification and counting of cells | Slides of blood samples, micro-sections of tissues | Type of cells

Detection and diagnosis of disease | EKG, EEG waveforms | Types of cardiac conditions, classes of brain conditions

Fingerprint identification | Input image from fingerprint sensors | Owner of the fingerprint, fingerprint classes

Web search | Key words specified by a user | Text relevant to the user

Character recognition (page readers, zip code, license plate) | Optical scanned image | Alphanumeric characters


Template Matching

Figure: a stored template is matched against an input scene.
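A minimal sketch of this idea, assuming grayscale images as NumPy arrays and normalized cross-correlation as the match score (the function name and scoring choice are mine, not from the slides):

```python
import numpy as np

def match_template(scene, template):
    """Slide the template over the scene and return the (row, col) of the
    best match, scored by normalized cross-correlation."""
    th, tw = template.shape
    sh, sw = scene.shape
    t = template - template.mean()
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            window = scene[r:r + th, c:c + tw]
            w = window - window.mean()
            denom = np.sqrt((w * w).sum() * (t * t).sum())
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos
```

Real systems use FFT-based correlation for speed; the brute-force loop above only shows the principle.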


Deformable Template: Corpus Callosum Segmentation

Prototype registration to the low-level segmented image.

Shape training set → prototype and variation learning → prototype warping.


    Statistical Pattern Recognition

Patterns are represented in a feature space.

A statistical model describes pattern generation in the feature space.


    Main Components of a PR system


    Example of a PR System


Complexity of PR: An Example

Fish classification: sea bass vs. salmon.

Preprocessing involves:

(1) image enhancement

(2) separating touching or occluding fish

(3) finding the boundary of the fish


How to separate sea bass from salmon? Possible features to be used:

Length

Lightness

Width

Number and shape of fins

Position of the mouth

Etc.


Decision Using Length

Choose the optimal threshold using a number of training examples.


Decision Using Average Lightness

Choose the optimal threshold using a number of training examples.

The overlap in the histograms is smaller than with the length feature.


Cost of Misclassification

There are two possible classification errors:

(1) deciding the fish was a sea bass when it was a salmon;

(2) deciding the fish was a salmon when it was a sea bass.

Which error is more important?
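Which error matters more can be encoded as costs in a minimum-expected-cost rule, previewing the Bayesian decision theory listed in the syllabus. A sketch with made-up cost values (here mislabeling a sea bass as salmon is assumed five times as costly):

```python
def decide(p_salmon, cost_salmon_as_bass=1.0, cost_bass_as_salmon=5.0):
    """Pick the class with the lower expected cost given P(salmon | x).
    The cost values are illustrative, not from the lecture."""
    p_bass = 1.0 - p_salmon
    risk_say_salmon = p_bass * cost_bass_as_salmon  # wrong if it is a bass
    risk_say_bass = p_salmon * cost_salmon_as_bass  # wrong if it is a salmon
    return "salmon" if risk_say_salmon < risk_say_bass else "sea bass"
```

With these costs the rule only answers "salmon" when P(salmon | x) is high; with equal costs it reduces to picking the more probable class.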


Decision Using Multiple Features

To improve recognition, we might have to use more than one feature at a time.

Single features might not yield the best performance; combinations of features might yield better performance.

Feature vector: x = [x1, x2]^T, where x1 = lightness and x2 = width.


Decision Boundary

Partition the feature space into two regions by finding the decision boundary that minimizes the error.
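One simple way to realize such a boundary in the two-feature case is a linear discriminant. A sketch using a least-squares fit to +/-1 labels as the error criterion (the fitting method, function names, and data are my choices for illustration, not the lecture's):

```python
import numpy as np

def fit_linear_boundary(X, y):
    """Fit w, b so that sign(w.x + b) separates two classes, via a
    least-squares fit of a linear function to +/-1 targets."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]  # weights w and bias b

def predict(X, w, b):
    """Classify each row of X by which side of the boundary it falls on."""
    return np.sign(X @ w + b)
```

The boundary is the line w.x + b = 0; later lectures replace this least-squares criterion with probabilistic and margin-based ones.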


How Many and Which Features to Use?

Issues with feature extraction:

Correlated features do not improve performance.

It might be difficult to extract certain features.

It might be computationally expensive to extract many features.

Curse of dimensionality.


Curse of Dimensionality

Adding too many features can lead to a worsening of performance:

Computational complexity

Overfitting problem
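A quick numeric illustration of why coverage degrades with dimension: to capture a fixed fraction of a unit hypercube's volume, the edge of the required sub-cube approaches 1 as the number of dimensions grows (the function name and numbers are illustrative):

```python
def edge_needed(volume_fraction, d):
    """Edge length of an axis-aligned sub-cube of the d-dimensional unit
    hypercube whose volume equals the given fraction: fraction**(1/d)."""
    return volume_fraction ** (1.0 / d)

# In 1-D, capturing 10% of the volume needs only 10% of the range;
# in 10-D it needs a cube spanning roughly 79% of every axis.
```

So a fixed training set becomes exponentially sparser as features are added, which is why more features can hurt.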


Model Complexity

We can get perfect classification performance on the training data by choosing complex models.

Complex models are tuned to the particular training samples, rather than to the characteristics of the true model.

Issue of generalization.


Generalization

The ability of the classifier to produce correct results on novel patterns.

How can we improve generalization performance?

More training examples (i.e., better pdf estimates).

Simpler models (i.e., simpler classification boundaries) usually yield better performance.


    The Design Cycle


Issues to Consider:

Noise / Segmentation

Data Collection / Feature Extraction

Pattern Representation / Invariance / Missing Features

Model Selection / Overfitting

Prior Knowledge / Context

Classifier Combination

Costs and Risks / Computational Complexity


Noise

Various types of noise (e.g., shadows, a conveyor belt might shake, etc.).

Noise can reduce the reliability of the measured feature values.

Knowledge of the noise process can help improve performance.


Segmentation

Individual patterns have to be segmented.

Can we segment without having categorized them first?

Can we categorize them without having segmented them first?

How do we "group" together the proper number of elements?


Data Collection

How do we know that we have collected an adequately large and representative set of examples for training and testing the system?


Feature Extraction

It is a domain-specific problem which influences the classifier's performance.

Which features are most promising?

Are there ways to automatically learn which features are best?

How many should we use?

Choose features that are robust to noise.

Favor features that lead to simpler decision regions.


Pattern Representation

Similar patterns should have similar representations.

Patterns from different classes should have dissimilar representations.

Pattern representations should be invariant to transformations such as: translations, rotations, size, reflections, non-rigid deformations.

Small intra-class variation, large inter-class variation.


Missing Features

Certain features might be missing (e.g., due to occlusion).

How should the classifier make the best decision with missing features?

How should we train the classifier with missing features?


Model Selection

How do we know when to reject a class of models and try another one?

Is the model selection process just a trial-and-error process?

Can we automate this process?


Overfitting

Models more complex than necessary lead to overfitting (i.e., good performance on the training data but poor performance on novel data).

How can we adjust the complexity of the model (neither too complex nor too simple)?

Are there principled methods for finding the best complexity?


Domain Knowledge

When there is not sufficient training data, incorporate domain knowledge:

Model how each pattern is generated (analysis by synthesis) - this is difficult! (e.g., recognizing all types of chairs).

Incorporate some knowledge about the pattern generation method (e.g., optical character recognition (OCR) assuming characters are sequences of strokes).


Context

Example from the slide - text with missing letters that context still lets us read: "How m ch info mation a e we mi sing?"


Classifier Combination

Performance can be improved using a "pool" of classifiers.

How should we combine multiple classifiers?


Costs and Risks

Each classification is associated with a cost or risk (e.g., classification error).

How can we incorporate knowledge about such risks?

Can we estimate the lowest possible risk of any classifier?


Computational Complexity

How does an algorithm scale with:

the number of feature dimensions

the number of patterns

the number of categories

Brute-force approaches might lead to perfect classification results but usually have impractical time and memory requirements.

What is the tradeoff between computational ease and performance?


General Purpose PR Systems

Ability to switch rapidly and seamlessly between different pattern recognition tasks.

It is difficult to design a device that is capable of performing a variety of classification tasks:

Different decision tasks may require different features.

Different features might yield different solutions.

Different tradeoffs (e.g., classification error) exist for different tasks.