Temple University Goals : 1.Down sample 20 khz TIDigits data to 16 khz. 2. Use Down sample data run...

Temple University

Goals:

1. Downsample the 20 kHz TIDigits data to 16 kHz.

2. Use the downsampled data to run the regression test and compare the results with those posted on the Sphinx-4 website.

3. Based on the results, make decisions (issues with the microprocessor, floating point, etc.).

By Jaykrishna Shukla, Mubin Ahmed and Cara Santin
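
As an illustration of goal 1, here is a minimal sketch of 20 kHz to 16 kHz rate conversion in pure Python. The slides mention SoX for this step; the code below is not that tool or the authors' procedure, only a small windowed-sinc resampler showing what the conversion involves. The function name `resample` and its `half_width` parameter are hypothetical.

```python
import math

def resample(samples, src_rate=20000, dst_rate=16000, half_width=16):
    """Band-limited resampling with a Hann-windowed sinc kernel (illustrative only)."""
    # Low-pass cutoff relative to the source Nyquist frequency; below 1.0 when
    # downsampling, so content above the new Nyquist is attenuated first.
    cutoff = min(1.0, dst_rate / src_rate)
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for m in range(n_out):
        t = m * src_rate / dst_rate          # output position in input samples
        centre = math.floor(t)
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < len(samples):
                x = t - k                    # distance from input sample k
                arg = math.pi * cutoff * x
                sinc = 1.0 if x == 0 else math.sin(arg) / arg
                window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
                acc += samples[k] * cutoff * sinc * window
        out.append(acc)
    return out

# A 1 kHz tone sampled at 20 kHz, converted to 16 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 20000) for n in range(2000)]
converted = resample(tone)
print(len(tone), "->", len(converted))
```

The ratio 16000/20000 reduces to 4/5, so every five input samples yield four output samples.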

Temple University

Learned:

1. Cygwin is not effective for running SoX.

2. Effective for running the Linux command-line interface to build the application.

3. Easy to install.

Temple University

Training an Acoustic Model Using SphinxTrain

Jaykrishna Shukla, Mubin Ahmed & Cara Santin

Department of Electrical and Computer Engineering

Temple University

URL:

Temple University: Slide 11

Introduction to Training

Q1. What is an acoustic model?

A1. A model used by a speech recognizer to decode the language spoken by a person. It represents numerically how the language sounds when spoken, in a form that can be stored on a computer.

Q2. What is training?

A2. A process that converges on a solution yielding the most likely sequence of vectors for a given acoustic unit.

Q3. Why is training required?

A3. An acoustic model for any audio data can only be generated by following a particular set of steps, collectively called training. Training is therefore required to generate an acoustic model.
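
One common way to write down the criterion in A2 is as a maximum-likelihood problem. This is a sketch under assumptions, not something stated on the slide: the symbols $O$ (the observed vectors for an acoustic unit) and $\lambda$ (the model parameters) are introduced here for illustration.

```latex
\hat{\lambda} = \arg\max_{\lambda} P(O \mid \lambda),
\qquad O = (o_1, o_2, \dots, o_T)
```

Iterative procedures such as Baum-Welch re-estimation raise $P(O \mid \lambda)$ on each pass until it stops improving, which is the convergence that A2 refers to.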

Temple University: Slide 12

Training an acoustic model using SphinxTrain 1.0: overview

• Flowchart of the training procedure [figure not reproduced]

Temple University: Slide 13

SphinxTrain 1.0 & auto generation

• The new version of SphinxTrain has a build-all option that generates all the required files shown in the flowchart on the previous slide. However, to perform a task-specific function, the config file must be modified to suit the purpose of the task.

Temple University: Slide 14

This week’s accomplishments

• The two major goals that I achieved this week were:

• Finished the complete training process for the an4 demo.

• Worked on generating the feature model for the TIDigits short test data.

• Sample output of a training process (it took more than 20 minutes to run)

Temple University: Slide 15

Generating the feature vectors

• There are two main steps in generating the feature vectors:

• 1. Generate the .fileids file (it is simply a list of the paths of all the data files).

• 2. Modify make_feats (a Perl script) so that it reads in the correct data, and change the default settings that SphinxTrain comes with.
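
Step 1 can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' script: it assumes each entry is a path relative to the audio directory with the file extension removed, and the function name `write_fileids`, the `.wav` extension, and the directory names in the usage example are placeholders.

```python
import os

def write_fileids(wav_dir, out_path, extension=".wav"):
    """Write one entry per audio file: its path relative to wav_dir, minus the extension."""
    entries = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(extension):
                rel = os.path.relpath(os.path.join(root, name), wav_dir)
                # Assumed convention: forward slashes, no file extension.
                entries.append(os.path.splitext(rel)[0].replace(os.sep, "/"))
    entries.sort()
    with open(out_path, "w") as handle:
        for entry in entries:
            handle.write(entry + "\n")
    return entries

if __name__ == "__main__":
    # Hypothetical layout: audio under ./wav, list written to ./tidigits_test.fileids
    if os.path.isdir("wav"):
        print(len(write_fileids("wav", "tidigits_test.fileids")), "entries written")
```

Sorting the entries keeps the list stable between runs, which makes it easier to line up with a transcription file.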

Temple University: Slide 16

Conclusion and Future

• The main problem in feature generation is that the make_feats file has default settings for the an4 tutorial. To get it working, we have to change the configuration of both the make_feats file and the SphinxTrain config file, because the config file determines what goes into the make_feats file. Follow the example below.