Online Chinese Character Handwriting Recognition for Linux Presenter: Ran CHENG (Kelvin) Primary...
Online Chinese Character Handwriting Recognition for Linux
Presenter: Ran CHENG (Kelvin)
Primary Supervisor: Jim Hogan
Associate Supervisor: Jinhai Cai

Content
  - Background
  - Introduction
  - Related material
  - Handwriting Recognition System
  - Evaluation
  - Future work

Background
Why?
  - Why handwriting?
      - One of the most important input methods
  - Why Chinese characters?
      - Potentially large market
      - One of the I18N goals
  - Why online?
      - The only feasible runtime input method
      - Frequently used
  - Why Linux?
      - Fast-developing OS
Who?
  - Who is the sponsor?
      - Red Hat Linux
What?
  - What will be the deliverables?
      - One handwriting software prototype
      - A feasible handwriting recognition algorithm

Introduction
Handwriting types
  - Online
  - Offline
  - Signature
The current online Chinese handwriting market
  - Most products are commercial, not open source
  - Some open source recognisers exist, but not for Chinese
Aim:
  - Online handwriting recognition and recognition accuracy
  - Recognition for Chinese characters
  - Implementation of a handwriting recognition algorithm under Linux

Related material
  - Hidden Markov Model (HMM)
  - Chinese Character Processing

Hidden Markov Model (HMM)
What is an HMM?
  - A Markov process with unknown parameters
  - The challenge is to determine the hidden parameters from the observable sequence
Example
  - Two people in different cities {Bob, Carol}
  - They talk over the phone
  - Weather and activities
      - {Sunny, Rainy, Cloudy} {Walk, Shopping, Cleaning}
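
The weather example can be made concrete with a small sketch. The hidden states are the weather values and the observations are the activities reported over the phone; the Viterbi algorithm then recovers the most likely weather sequence. Every probability below is an illustrative assumption (the slides give no numbers), and the names `start`, `trans`, `emit` and `viterbi` are hypothetical.

```python
import math

states = ["Sunny", "Rainy", "Cloudy"]

# ASSUMED numbers, chosen only for illustration; not taken from the slides.
start = {"Sunny": 0.5, "Rainy": 0.2, "Cloudy": 0.3}
trans = {
    "Sunny":  {"Sunny": 0.6, "Rainy": 0.1, "Cloudy": 0.3},
    "Rainy":  {"Sunny": 0.2, "Rainy": 0.5, "Cloudy": 0.3},
    "Cloudy": {"Sunny": 0.3, "Rainy": 0.3, "Cloudy": 0.4},
}
emit = {
    "Sunny":  {"Walk": 0.6, "Shopping": 0.3, "Cleaning": 0.1},
    "Rainy":  {"Walk": 0.1, "Shopping": 0.3, "Cleaning": 0.6},
    "Cloudy": {"Walk": 0.3, "Shopping": 0.4, "Cleaning": 0.3},
}


def viterbi(observations):
    """Return (log probability, state path) of the most likely weather sequence."""
    # best[s] = (log probability of the best path ending in s, that path)
    best = {
        s: (math.log(start[s]) + math.log(emit[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Choose the best predecessor state for s.
            score, path = max(
                ((best[p][0] + math.log(trans[p][s]), best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (score + math.log(emit[s][obs]), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


log_prob, path = viterbi(["Cleaning", "Cleaning"])
print(path, math.exp(log_prob))
```

Working in log probabilities avoids numeric underflow on long observation sequences, which matters once the observations are pen samples rather than a few phone calls.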

Chinese Character Processing
  - Character segmentation
  - Pre-processing
  - Pattern Representation
  - Classification
  - Context processing

Handwriting Recognition System
  - Writing pad
  - Data collection, organization and format
  - Feature analysis
  - Training state initialisation and optimisation
  - Character recognition

Data collection
  - 42 Chinese characters covering 43 strokes and variations
      - All the Chinese character strokes
      - Frequently used characters
  - From 5 different people
  - 40 training examples for each character

Feature analysis
  - Character decomposition
      - Each stroke is represented by 5 states
  - State decomposition
      - Each state holds a statistical probability distribution over 16 features
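
As a rough illustration of this decomposition, a character model can be laid out as a list of states, five per stroke, each holding one probability per feature. The slides do not say what the 16 features are or how the model is stored, so the class names `State` and `CharacterModel` and the uniform starting distribution are assumptions.

```python
from dataclasses import dataclass, field

STATES_PER_STROKE = 5  # from the slides
NUM_FEATURES = 16      # from the slides; the features themselves are not specified


@dataclass
class State:
    # One probability per feature. ASSUMPTION: uniform until training updates it.
    feature_probs: list = field(
        default_factory=lambda: [1.0 / NUM_FEATURES] * NUM_FEATURES
    )


@dataclass
class CharacterModel:
    label: str
    stroke_count: int
    states: list = field(init=False)

    def __post_init__(self):
        # Character decomposition: every stroke contributes five states.
        self.states = [State() for _ in range(self.stroke_count * STATES_PER_STROKE)]


model = CharacterModel(label="土", stroke_count=3)
print(len(model.states))  # 3 strokes x 5 states
```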

Training state initialisation
  - Observation segmentation
  - Feature distribution
  - State transition
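
One common way to carry out these three initialisation steps is sketched below; the slides do not give the actual procedure, so this is a sketch under stated assumptions. It assumes that observations are integer feature codes, that the observation sequence is cut into equal contiguous segments (one per state), and that the model is left-to-right, so each state either repeats or moves to the next one. The function names and the `smoothing` parameter are hypothetical.

```python
def uniform_segments(num_observations, num_states):
    """Observation segmentation: contiguous, nearly equal index ranges, one per state."""
    bounds = [round(i * num_observations / num_states) for i in range(num_states + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_states)]


def initialise(observations, num_states, num_features=16, smoothing=1.0):
    """Return (feature distributions, self-transition probabilities) per state.

    ASSUMPTIONS: observations are integer feature codes in [0, num_features);
    the model is left-to-right; `smoothing` is an add-k count that keeps
    unseen features from getting zero probability.
    """
    distributions, stay_probs = [], []
    for segment in uniform_segments(len(observations), num_states):
        # Feature distribution: relative frequency of each feature in the segment.
        counts = [smoothing] * num_features
        for t in segment:
            counts[observations[t]] += 1
        total = sum(counts)
        distributions.append([c / total for c in counts])
        # State transition: a state that covers n observations makes
        # n - 1 self-transitions and one transition to the next state.
        n = max(len(segment), 1)
        stay_probs.append((n - 1) / n)
    return distributions, stay_probs


distributions, stay_probs = initialise([0, 0, 1, 1, 2, 2], num_states=3, num_features=4)
print(stay_probs)
```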

Training state optimisation (continued)
  - Observation segmentation
  - Feature distribution
  - State transition

Character recognition
  1. Create a ranking list.
  2. Take a reserved input file as the observation file for the Viterbi algorithm.
  3. Load the distribution probability and transition probability files for a character stored in the database or file system.
  4. Run the Viterbi algorithm and record the overall probability (only the overall path was used in the state transition optimisation; only the overall probability is used here).
  5. According to the probability, insert the character at the proper position in the ranking list.
  6. Repeat steps 2 to 5 until no more character data is left in the database or file system.
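
The recognition loop can be sketched as follows. This is not the presenter's implementation: the character models are held in an in-memory dictionary rather than loaded from files or a database, the model is assumed to be left-to-right (start in the first state, end in the last), and `viterbi_log_prob` and `rank_characters` are hypothetical names.

```python
import bisect
import math

NEG_INF = float("-inf")


def _log(p):
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log_prob(observations, distributions, stay_probs):
    """Log probability of the best left-to-right state path for one character model."""
    n = len(distributions)
    score = [NEG_INF] * n
    score[0] = _log(distributions[0][observations[0]])  # path starts in state 0
    for obs in observations[1:]:
        new = [NEG_INF] * n
        for s in range(n):
            stay = score[s] + _log(stay_probs[s])
            move = score[s - 1] + _log(1 - stay_probs[s - 1]) if s > 0 else NEG_INF
            new[s] = max(stay, move) + _log(distributions[s][obs])
        score = new
    return score[-1]  # path must end in the last state


def rank_characters(observations, models):
    """Steps 1 to 6: score every model and keep the list sorted, best first."""
    ranking = []  # step 1; entries are (negative log probability, label)
    for label, (distributions, stay_probs) in models.items():  # steps 3 and 6
        log_prob = viterbi_log_prob(observations, distributions, stay_probs)  # step 4
        bisect.insort(ranking, (-log_prob, label))  # step 5
    return [(label, -neg) for neg, label in ranking]


# Toy models with 2 states and 2 features; all numbers are made up.
models = {
    "A": ([[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5]),
    "B": ([[0.1, 0.9], [0.9, 0.1]], [0.5, 0.5]),
}
print(rank_characters([0, 0, 1, 1], models))
```

Only the overall probability of the best path is kept here, matching step 4; the path itself would only be needed during training.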

Evaluation
  - 67% (56/84) of the characters are correctly recognised
  - 98.8% (83/84) of the characters are recognised in the top five positions
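
Both figures are top-k accuracies over the 84 test samples: 56/84 is about 66.7% for k = 1 and 83/84 is about 98.8% for k = 5. A minimal sketch of how such a score can be computed from ranking lists is given below; the function name `top_k_accuracy` and the sample data are hypothetical, not the presenter's test set.

```python
def top_k_accuracy(results, k):
    """Fraction of samples whose true label appears in the first k ranked labels.

    `results` is a list of (true_label, ranked_labels) pairs, best guess first.
    """
    hits = sum(1 for truth, ranked in results if truth in ranked[:k])
    return hits / len(results)


# Made-up ranking lists for three test samples.
results = [
    ("土", ["土", "士", "工"]),
    ("士", ["土", "士", "工"]),
    ("工", ["土", "士", "工"]),
]
print(top_k_accuracy(results, 1), top_k_accuracy(results, 2))

# The reported figures, recomputed from the raw counts on the slide.
print(round(100 * 56 / 84, 1), round(100 * 83 / 84, 1))
```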

Future work
  - Writing Pad XInput support
  - Relative position handling
      - For instance, “工” and “土”
  - Duration handling
      - For instance, “士” and “土”