Speech Recognition
Transcript of Speech Recognition
Speech Recognition in Konkani
Nilkanth Shet Shirodkar
What is Speech Recognition
Speech recognition, also known as automatic speech recognition (ASR) or computer speech recognition, is the process by which a computer understands spoken input and performs the required task.
Where can it be used
- System control / controlling devices
- Commercial / industrial applications
- Voice dialing
[Figure: Recognition. Blocks shown: Voice Input, Analog to Digital, Acoustic Model, Language Model, Speech Engine, Display]
Speech Recognition
1. Voice recording
2. Word boundary detection
3. Feature extraction
4. Recognition with the help of language models
Components of the recognition system
1. Sound recording and word detection component: takes the input from the audio recorder, preferably a microphone, and identifies the words in the input signal. Word detection is usually done using the energy and the zero crossing rate of the signal. The output of this component is sent to the feature extractor module.
2. Feature extractor: generates the feature vectors for the audio signals it receives from the word detection component. It produces the MFCCs (Mel Frequency Cepstral Coefficients), which are used later to identify the audio signal.
3. Recognition system: an HMM-based (Hidden Markov Model) component that takes the feature vectors generated by the feature extractor as input and finds the best match in the knowledge model.
4. Knowledge model: the language dictionary that is used to identify the sound signal.
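The word detection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame length, both thresholds, and the function names `frame_features` and `detect_word_boundaries` are hypothetical and do not come from the slides.

```python
def frame_features(samples, frame_len=160):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


def detect_word_boundaries(samples, frame_len=160,
                           energy_thresh=0.01, zcr_thresh=0.25):
    """Return (first, last) indices of frames judged to be speech, or None.

    Assumed rule: a frame counts as speech if its energy is high, or if its
    zero crossing rate is high while the energy is at least a tenth of the
    energy threshold (a rough stand-in for weak, noisy consonants).
    """
    speech = []
    for i, (energy, zcr) in enumerate(frame_features(samples, frame_len)):
        loud = energy > energy_thresh
        noisy = zcr > zcr_thresh and energy > energy_thresh / 10
        if loud or noisy:
            speech.append(i)
    if not speech:
        return None
    return speech[0], speech[-1]
```

The frames between the two returned indices would then be handed to the feature extractor.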
Speech Recognition system
Acoustic Model
- Features extracted from the input sound by the feature extractor have to be compared with a predefined model to identify the spoken word.
- Two kinds of model are used: the word model and the phone model.
- Phone Model - Only parts of words, called phones, are modelled instead of modelling the word as a whole. Instead of matching the sound against each whole word, we match the sound against the phones and recognise the word from its parts.
- Word Model - The words are modelled as a whole. During recognition the input sound is matched against each word present in the model, and the best match is taken to be the spoken word.
- Phone Set - A phoneme is the basic, smallest unit of sound. Examples: aa, a, iy
- Dictionary - The dictionary, also known as the pronunciation lexicon, specifies the pronunciation of each word as a linear sequence of phonemes. Examples:
  - the: dh ax
  - on: aa n
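A pronunciation lexicon like the one above can be represented as a simple mapping from words to phone sequences. The sketch below uses only the two entries shown on the slide; the names `LEXICON` and `pronounce` are hypothetical.

```python
# Pronunciation lexicon: each word maps to a linear sequence of phones.
# Only the two entries from the slide are included.
LEXICON = {
    "the": ["dh", "ax"],
    "on": ["aa", "n"],
}


def pronounce(sentence, lexicon=LEXICON):
    """Expand a sentence into its phone sequence using the lexicon."""
    phones = []
    for word in sentence.lower().split():
        if word not in lexicon:
            # A real system would need a fallback for unknown words.
            raise KeyError(f"word not in lexicon: {word}")
        phones.extend(lexicon[word])
    return phones
```

For example, `pronounce("on the")` expands the two words into their phones in order.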
Language Model
- The language model gives the speech recognition system an idea of the context and of the words that can occur in that context. It also describes which words are possible in the language and the order in which those words may occur.
HMM for ASR
- Build an HMM for each phone.
- Combine the phone models, based on the pronunciation model, to create word-level models.
- Combine the word-level models based on the language model.
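How phone models might be chained into a word-level model can be sketched as follows. This is a simplified illustration, not the method used by any particular toolkit: the choice of three states per phone, the self-loop probability, and the names `phone_hmm` and `word_hmm` are all assumptions.

```python
STATES_PER_PHONE = 3  # assumption: a small left-to-right model per phone


def phone_hmm(phone, self_loop=0.6):
    """Left-to-right HMM for one phone.

    Each state is a tuple (name, stay_probability, advance_probability).
    """
    return [
        (f"{phone}_{i}", self_loop, 1.0 - self_loop)
        for i in range(STATES_PER_PHONE)
    ]


def word_hmm(word, lexicon):
    """Concatenate the phone HMMs in the order given by the pronunciation."""
    states = []
    for phone in lexicon[word]:
        states.extend(phone_hmm(phone))
    return states


# Usage: build the word-level model for "on" from its phones "aa" and "n".
lexicon = {"on": ["aa", "n"]}
state_names = [name for name, _, _ in word_hmm("on", lexicon)]
```

Word-level models built this way would in turn be connected according to the language model.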
How Language Models work
- Hard to compute directly: P("And nothing but the truth")
- Decompose the probability:
  P("And nothing but the truth") = P("And") × P("nothing" | "and") × P("but" | "and nothing") × P("the" | "and nothing but") × P("truth" | "and nothing but the")
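The decomposition above conditions each word on its entire history, which is rarely practical to estimate. A common simplification, which is an assumption here and not stated on the slides, is a bigram model that conditions each word only on the previous one. The sketch below estimates those probabilities from counts; the function names and the `<s>` / `</s>` boundary markers are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams, with sentence boundary markers."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        histories.update(words[:-1])           # every word that has a successor
        bigrams.update(zip(words, words[1:]))  # adjacent word pairs
    return histories, bigrams


def sentence_probability(sentence, histories, bigrams):
    """Product of P(word | previous word), estimated by relative frequency."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        if histories[prev] == 0:
            return 0.0  # unseen history; no smoothing in this sketch
        prob *= bigrams[(prev, cur)] / histories[prev]
    return prob


# Usage with a tiny, made-up training corpus.
corpus = ["and nothing but the truth", "and nothing else"]
histories, bigrams = train_bigram(corpus)
p = sentence_probability("and nothing but the truth", histories, bigrams)
```

Without smoothing, any sentence containing an unseen word pair gets probability zero, which is why practical language models add some form of smoothing or back-off.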
CMUSphinx
- Sphinx3 is the speech recognizer (decoder).
- SphinxTrain is a set of tools for acoustic modeling.
- SphinxBase is a common set of libraries used in CMU Sphinx.
Jasper
- Jasper is an open source platform for developing voice-controlled applications.
- Uses voice to ask for information.
- Jasper runs on the Raspberry Pi.
- Configure Jasper to make your own personal assistant.
Resources
- List of publications: http://cmusphinx.sourceforge.net/wiki/research
- Speech Recognition With CMU Sphinx [blog by N. Shmyrev, Sphinx developer]
- Speech recognition seminars at Leiden Institute for Advanced Computer Science, Netherlands:
  - http://www.liacs.nl/~erwin/speechrecognition.html
  - http://www.liacs.nl/~erwin/SR2003
  - http://www.liacs.nl/~erwin/SR2005
  - http://www.liacs.nl/~erwin/SR2006
  - http://www.liacs.nl/~erwin/SR2009
What is Speech Recognition
Also known as automatic speech recognition or computer speech recognition which means understanding voice by the computer and performing any required task
Where can it be used
- System controlControlling devices
- CommercialIndustrial applications
- Voice dialing
Recognition
Voice Input Analog to Digital Acoustic Model
Language Model
Display Speech Engine
Speech Recognition
bull 1 Voice recording2 Word boundary detection3 Feature extraction 4 Recognition with the help of language models
Components of the recognition system
①Sound recording and Word detection Component Takes the input from the audio recorder preferably
microphone and identifies the word in the input signal Word detection is usually done by using the energy and the zero crossing rate of the signal The output of this component is then sent to the feature extractor module
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
Where can it be used
- System controlControlling devices
- CommercialIndustrial applications
- Voice dialing
Recognition
Voice Input Analog to Digital Acoustic Model
Language Model
Display Speech Engine
Speech Recognition
bull 1 Voice recording2 Word boundary detection3 Feature extraction 4 Recognition with the help of language models
Components of the recognition system
①Sound recording and Word detection Component Takes the input from the audio recorder preferably
microphone and identifies the word in the input signal Word detection is usually done by using the energy and the zero crossing rate of the signal The output of this component is then sent to the feature extractor module
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
Recognition
Voice Input Analog to Digital Acoustic Model
Language Model
Display Speech Engine
Speech Recognition
bull 1 Voice recording2 Word boundary detection3 Feature extraction 4 Recognition with the help of language models
Components of the recognition system
①Sound recording and Word detection Component Takes the input from the audio recorder preferably
microphone and identifies the word in the input signal Word detection is usually done by using the energy and the zero crossing rate of the signal The output of this component is then sent to the feature extractor module
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
Speech Recognition
bull 1 Voice recording2 Word boundary detection3 Feature extraction 4 Recognition with the help of language models
Components of the recognition system
①Sound recording and Word detection Component Takes the input from the audio recorder preferably
microphone and identifies the word in the input signal Word detection is usually done by using the energy and the zero crossing rate of the signal The output of this component is then sent to the feature extractor module
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
Components of the recognition system
①Sound recording and Word detection Component Takes the input from the audio recorder preferably
microphone and identifies the word in the input signal Word detection is usually done by using the energy and the zero crossing rate of the signal The output of this component is then sent to the feature extractor module
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
②Feature Extractor This is responsible for generating the feature
vectors for the audio signals input to it from the word detection component It generates the MFCC (Mel Frequency Cepstrum Coe1113094fficients) which is used later to identify the audio signal
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
bull Building an HMM for each phonebull Combine the phone models based on the
pronunciation model to create word level models
bull Word level models are combined based on the language model
How Language Models work
bull Hard to compute ndash P(ldquoAnd nothing but the truthrdquo)
bull Decompose probabilityndash P(ldquoAnd nothing but the truth) = P(ldquoAndrdquo)
P(ldquonothing|andrdquo) P(ldquobut|and nothingrdquo) P(ldquothe|and nothing butrdquo) P(ldquotruth|and nothing but therdquo)
CMUSphinx
Sphinx3 is the speech recognizer (decoder) SphinxTrain is a set of tools for acoustic
modeling SphinxBase is a common set of library used in
CMU Sphinx
Jasper
bull Jasper is an open source platform for developing voice-controlled applications
bull Uses voice to ask for informationbull Jasper runs on Raspberry Pibull Configure Jasper to make own personal
Assistant
Resources
bull List of publications httpcmusphinxsourceforgenetwikiresearch Speech Recognition With CMU Sphinx [Blog by N Shmyrev
Sphinx developer] bull Speech recognition seminars at Leiden Institute for
Advanced Computer Science Netherlands bull httpwwwliacsnl~erwinspeechrecognitionhtml
httpwwwliacsnl~erwinSR2003 httpwwwliacsnl~erwinSR2005httpwwwliacsnl~erwinSR2006httpwwwliacsnl~erwinSR2009
References
bull [1] Anushree Srivastava Nivedita Singh and Shivangi Vaish Speech Recognition For Hindi Language International Journal of Engineering Research amp Technology (IJERT) April ndash 2013
bull Wiqas Ghai and Navdeep Singh ldquoAnalysis of Automatic Speech Recognition Systems for Indo-Aryan Languages Punjabi A Case Studyrdquo Vol-2 Issue-1 March 2012
bull Website httpcmusphinxsourceforgenet
- Speech Recognition in Konkani
- What is Speech Recognition
- Where can it be used
- Slide 4
- Speech Recognition
- Components of the recognition system
- Slide 7
- Slide 8
- Speech Recognition system
- Acoustic Model
- Slide 11
- Slide 12
- Language Model
- HMM for ASR
- Slide 15
- How Language Models work
- CMUSphinx
- Jasper
- Resources
- References
-
bull 3 Recognition System ndash HMM (Hidden Markov Model-based) component
which takes as input the feature vectors generated from the feature extractor component and then finds the best or most suitable match from the knowledge model
bull 4 Knowledge Model ndash language dictionary which is used to identify the
sound signal
Speech Recognition system
Acoustic Model
bull Features that were extracted from the input sound by the extraction module have to be compared with some predefined model to identify the spoken word
bull Word Modelbull Phone Model
bull Phone Model - Only parts of words called phones are modelled instead of modelling the word as a whole Instaed of matching the sound with each word we match the sound with the words and recognise the parts
bull Word Model - The words are modelled as a whole During recognition the input sound matched against each word present in the wodel and the best possible match is then considered to be spoken word
o Phone Set - Phoneme is the basic or the smallest unit of sound
o aa a iy o Dictionary bull A dictionary is also known as the pronunciation
lexicon specifies the pronunciations of the words as linear sequence of phonemes
bull the dh axbull on aa n
Language Model
bull Providing a fair idea about the context and the words that can occur in the context to the speech recognition system It also provides an idea about the different words that are possible in the language and the sequence in which these words may occur
HMM for ASR
- Build an HMM for each phone
- Combine the phone models, based on the pronunciation model, to create word-level models
- Combine the word-level models based on the language model
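The first two steps can be sketched as follows: each phone gets a small left-to-right HMM, and a word model is the concatenation of its phone models in dictionary order. Three states per phone and a fixed self-loop probability of 0.5 are assumptions for illustration, since the slides do not give a topology. Emission distributions and training are left out.

```python
STATES_PER_PHONE = 3  # assumption: not specified on the slides


def phone_hmm(phone, position):
    """Ordered state names of a left-to-right HMM for one phone occurrence."""
    return [f"{position}:{phone}.{i}" for i in range(STATES_PER_PHONE)]


def word_hmm(word, lexicon, self_loop=0.5):
    """Concatenate phone HMMs into a word-level left-to-right model.

    Returns (states, transitions), where transitions[(a, b)] is P(b | a).
    """
    states = []
    for position, phone in enumerate(lexicon[word]):
        states.extend(phone_hmm(phone, position))
    transitions = {}
    for i, state in enumerate(states):
        # Each state either repeats or moves on; the last one leaves the word.
        nxt = states[i + 1] if i + 1 < len(states) else "<exit>"
        transitions[(state, state)] = self_loop
        transitions[(state, nxt)] = 1.0 - self_loop
    return states, transitions


# Hypothetical two-word lexicon taken from the dictionary slide.
lexicon = {"the": ["dh", "ax"], "on": ["aa", "n"]}
states, transitions = word_hmm("on", lexicon)
```

The `<exit>` transition is where the third step would attach: the language model decides which word model may follow.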
How Language Models work
- Hard to compute directly: P("And nothing but the truth")
- Decompose the probability:
  P("And nothing but the truth") = P("And") × P("nothing" | "and") × P("but" | "and nothing") × P("the" | "and nothing but") × P("truth" | "and nothing but the")
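The decomposition above can be tried out with a small count-based model. This sketch makes one simplifying assumption that is not on the slide: each factor is conditioned only on the previous word (a bigram approximation), because counts for full histories are too sparse to estimate. The toy corpus and function names are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Count how often each word follows each context word."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks the sentence start
        context_counts.update(words[:-1])
        bigram_counts.update(zip(words, words[1:]))
    return context_counts, bigram_counts


def sentence_probability(sentence, context_counts, bigram_counts):
    """Multiply P(word | previous word) over the sentence (chain rule, bigram form)."""
    words = ["<s>"] + sentence.lower().split()
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob


# Hypothetical training corpus.
corpus = ["and nothing but the truth", "and nothing else", "the truth hurts"]
context_counts, bigram_counts = train_bigram(corpus)
p = sentence_probability("and nothing but the truth", context_counts, bigram_counts)
```

Unseen word pairs get probability zero in this sketch. A practical model would add smoothing.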
CMUSphinx
- Sphinx3 is the speech recognizer (decoder)
- SphinxTrain is a set of tools for acoustic modeling
- SphinxBase is a common set of libraries used in CMU Sphinx
Jasper
- Jasper is an open source platform for developing voice-controlled applications
- Use your voice to ask for information
- Jasper runs on the Raspberry Pi
- Configure Jasper to build your own personal assistant
Resources
- List of publications: http://cmusphinx.sourceforge.net/wiki/research
- Speech Recognition With CMU Sphinx [blog by N. Shmyrev, Sphinx developer]
- Speech recognition seminars at Leiden Institute for Advanced Computer Science, Netherlands:
  - http://www.liacs.nl/~erwin/speechrecognition.html
  - http://www.liacs.nl/~erwin/SR2003
  - http://www.liacs.nl/~erwin/SR2005
  - http://www.liacs.nl/~erwin/SR2006
  - http://www.liacs.nl/~erwin/SR2009
References
- [1] Anushree Srivastava, Nivedita Singh and Shivangi Vaish, "Speech Recognition For Hindi Language", International Journal of Engineering Research & Technology (IJERT), April 2013.
- Wiqas Ghai and Navdeep Singh, "Analysis of Automatic Speech Recognition Systems for Indo-Aryan Languages: Punjabi A Case Study", Vol. 2, Issue 1, March 2012.
- Website: http://cmusphinx.sourceforge.net