Introduction to Myanmar Text-To-Speech

Introduction to Myanmar Text-To-Speech Engine Ngwe Tun Consultant Yangon Education Center for Blinds

Page 1: Introduction to Myanmar Text-To-Speech

Introduction to Myanmar Text-To-Speech Engine

Ngwe Tun

Consultant

Yangon Education Center for Blinds

Page 2: Introduction to Myanmar Text-To-Speech

What is a Text-To-Speech Engine?

A text-to-speech (TTS) system converts normal language text into speech.

It is also known as a speech synthesizer.

e.g. Microsoft Sam.

"The quick brown fox jumps over the lazy dog 1,234,567,890 times. soi"

Page 3: Introduction to Myanmar Text-To-Speech

Overview of Text-To-Speech Engine

A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end.

The front-end converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization.

The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences.

The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end.

The back-end, often referred to as the synthesizer, then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech.
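As a rough illustration of the front-end's text normalization step, the Python sketch below expands digit strings and a couple of abbreviations into written-out words. The abbreviation table, the function names, and the English number-reading rules are assumptions made for this example; they are not taken from any particular TTS engine.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = ["", "thousand", "million", "billion"]

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"e.g.": "for example", "etc.": "et cetera"}

# Digits with optional thousands separators, e.g. "1,234" or "1234".
NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")


def below_1000(n):
    """Return the words for 0 < n < 1000 as a list."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    if n > 0:
        words.append(ONES[n])
    return words


def number_to_words(n):
    """Spell out a non-negative integer below 10**12."""
    if n == 0:
        return ONES[0]
    groups = []
    scale = 0
    while n > 0:
        n, chunk = divmod(n, 1000)
        if chunk:
            words = below_1000(chunk)
            if SCALES[scale]:
                words.append(SCALES[scale])
            groups.append(" ".join(words))
        scale += 1
    return " ".join(reversed(groups))


def normalize(text):
    """Expand known abbreviations, then spell out digit strings."""
    for short, spoken in ABBREVIATIONS.items():
        text = text.replace(short, spoken)

    def spell(match):
        value = int(match.group().replace(",", ""))
        # Leave very large numbers untouched; this sketch stops at billions.
        return number_to_words(value) if value < 10**12 else match.group()

    return NUMBER.sub(spell, text)


print(normalize("The fox jumps 1,234 times."))
# prints: The fox jumps one thousand two hundred thirty four times.
```

A real front-end would also need rules for dates, currency, ordinals, and language-specific number reading, which this sketch leaves out.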

Page 4: Introduction to Myanmar Text-To-Speech

Features and Functions of Text-to-Speech Software

Text-to-speech (TTS) software tools are similar in that they speak text on a computer. However, they vary widely in their functionality.

Formatting text – Allows you to format digital text you create, download from the Internet, or scan into your computer, similar to a word processing program.

Speaking what you type – Speaks text as you type to give you support in writing. Within this function there may be the ability to set the level of support, such as speaking words or speaking each letter and then the word.

Page 5: Introduction to Myanmar Text-To-Speech

Speaking the Text

Continuous reading – reads from where you choose to begin and stops when it reaches the end of the text or when you use a stop command.

Incremental reading – reads one increment of text, such as a word, sentence, chunk/phrase, or paragraph, then stops and waits for you to request the next increment.

Highlighted text – reads just the text you highlight with the cursor. Some TTS programs read the document from a starting point until the user stops the program. Other TTS programs read only text selected and highlighted by the user.

Voices – TTS software can use one voice or allow you to choose from a selection of male, female, and even foreign-language voices.

Reading speed – you can choose to read faster or slower, in precise words per minute or in speed increments.
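The incremental-reading and reading-speed options can be sketched in Python as follows. This is only an illustration: the function names, the naive sentence-splitting rule, and the default of 150 words per minute are assumptions, not the behaviour of any specific TTS product.

```python
import re


def increments(text, unit="sentence"):
    """Split text into reading increments: words, sentences, or paragraphs."""
    if unit == "word":
        return text.split()
    if unit == "paragraph":
        return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    if unit == "sentence":
        # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    raise ValueError(f"unknown unit: {unit}")


def estimated_seconds(chunk, words_per_minute=150):
    """Estimate how long a chunk takes to speak at the chosen reading speed."""
    return len(chunk.split()) * 60.0 / words_per_minute


text = "TTS tools speak text. They vary widely. Some read continuously."
for chunk in increments(text, "sentence"):
    # A real reader would speak the chunk here, then wait for the user.
    print(f"{estimated_seconds(chunk, 180):.1f}s  {chunk}")
```

Continuous reading corresponds to looping over all increments without pausing; incremental reading pauses after each one until the user asks for the next.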

Page 6: Introduction to Myanmar Text-To-Speech

Who will be using speech synthesis?

Screen readers are TTS software tools for users with visual disabilities. Screen readers not only read text files but also give the user other audible navigation support, such as reading the user interface, indicating where the user's cursor is on the screen, and indicating when the user's cursor has passed over a folder.

Text readers are commercial TTS software tools for users who read below grade level because of a learning disability, English as a second language, a reading disability, or low vision.

Stephen Hawking is one of the most famous people using speech synthesis to communicate.

Mobile phones can speak incoming text such as SMS, e-mail, and notifications.

Page 7: Introduction to Myanmar Text-To-Speech

Advantages of Text-To-Speech

Listen to class notes, textbooks, and electronic text.

Facilitates education.

Avoids eyestrain from too much reading.

Makes proofreading more effective.

Learn English, Myanmar, or other languages.

Prepare for speeches by hearing your work read aloud.

Listen to e-books or e-material during your commute.

Amuse children by letting your PC read stories to them.

Help seniors or those with vision problems.

Page 8: Introduction to Myanmar Text-To-Speech

Demo

Page 9: Introduction to Myanmar Text-To-Speech

Myanmar Language in the Digital Era

Used on computers, mobile phones, MP3 players, watches, and other electronic devices.

Widely used in social networks and online content.

Gives access to news, information, and knowledge from international and local sources.

Localization and rich Myanmar content in electronic form.

Page 10: Introduction to Myanmar Text-To-Speech

Why do we need Myanmar Text-To-Speech?

Myanmar-language users cannot use screen readers.

Screen, file, and text readers do not support Myanmar.

Text-to-speech features would make the Myanmar language easier to use for any Myanmar computer user.

An open source Myanmar text-to-speech engine will empower other software vendors who want to develop on it or integrate it with their applications. For example, mobile phone manufacturers can integrate Myanmar language support easily, without reinventing the wheel.

Learning the Myanmar language will be easier with a text reader.

Page 11: Introduction to Myanmar Text-To-Speech

How shall we develop Myanmar TTS?

Learn

Define the scope of work

Collect digital assets

Discover the best tool to build the TTS and plan for the future

Develop

Develop the Myanmar language model/tokenization and grapheme-to-sound conversion

Train the TTS engine

Test the TTS engine internally, then through public review

Enhance

Review the work plan, look for improvements, and apply the feedback

Invite TTS specialists and improve the engine with third-party opinions

Develop tools and apply them in a real environment (e.g. audio books)

Page 12: Introduction to Myanmar Text-To-Speech

Open Source Model of Myanmar TTS

Everyone can participate in the development team.

Anyone can guide the project.

Anyone can contribute their ideas.

Open source model of a TTS engine for the Myanmar language.

We realized that 1 consultant, 1 project leader, and 3 developers cannot complete the Myanmar TTS on their own.

We need you, your feedback, and your contribution.

Page 13: Introduction to Myanmar Text-To-Speech

The purpose of a Text-to-Speech system

To convert any text into natural-sounding speech.

First, text needs to be normalized. Normalization is the process of transforming text into a single canonical form; the text is then parsed into individual tokens.

Next, the text-to-speech system assigns the appropriate phonetic transcription to each word, reflecting how the text should be pronounced in the given natural language. The synthesizer then converts the symbolic linguistic representations into sound.

The last step is to choose the right speech units, which ensures the high quality and natural sound of the generated speech.
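The tokenization and phonetic-transcription steps can be illustrated with a small dictionary lookup in Python. The lexicon entries, the ARPAbet-style symbols, and the `<unk:...>` marker for unknown words are all assumptions for this sketch; a real engine would rely on a full pronunciation dictionary plus rules or a trained model for words it has not seen.

```python
import re

# Toy pronunciation lexicon using ARPAbet-style symbols; the entries are
# illustrative assumptions, not taken from any real engine's dictionary.
LEXICON = {
    "the": ["DH", "AH"],
    "fox": ["F", "AA", "K", "S"],
    "dog": ["D", "AO", "G"],
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def to_phonemes(word):
    """Look a word up in the lexicon; unknown words are marked for fallback."""
    return LEXICON.get(word, ["<unk:" + word + ">"])


def front_end(text):
    """Return (word, phonemes) pairs: a simple symbolic representation."""
    return [(word, to_phonemes(word)) for word in tokenize(text)]


for word, phonemes in front_end("The fox, the dog."):
    print(word, phonemes)
```

The list of pairs returned by `front_end` plays the role of the symbolic linguistic representation that the synthesizer would then turn into sound.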

Page 14: Introduction to Myanmar Text-To-Speech

Architecture of Myanmar TTS

The minimal unit of sound will be the syllable or syllable chain.

က ကာ ကာား က က ကား

မနတ သစစာ ဥမမာ

Word Segmentation or Tokenization

ကလ ားလ ေလက ာငားကကာားဖြငသောားခ ကကသည။

ကလ ားလ ေလက ာငားကကာားဖြငသောားခ ကကသည။

ကလ ား / လ ေ / လက ာငား / က / ကာား / ဖြင / သောား / ခ ကကသည / ။

Concatenate syllable sounds to compose word sounds.

က+လ ား / လ ေ / လက ာငား / က / ကာား / ဖြင / သောား / ခ +ကက+သည / ။

Speed and intonation need to be adjusted between syllables and words.

ကလ ားလ ေ / လက ာငားက / ကာားဖြင / သောားခ ကကသည။
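A minimal Python sketch of syllable segmentation and concatenation follows. It assumes Unicode-encoded Myanmar text and uses a simplified, commonly used rule: a new syllable starts at a consonant that is neither stacked under the previous letter nor closed by an asat. The function names, the placeholder sample values, and the fixed silence gap (a crude stand-in for the speed and intonation adjustment) are assumptions for illustration, not the project's actual design.

```python
import re

# A syllable starts at a consonant (U+1000..U+1021) that is not stacked under
# the previous letter (no VIRAMA U+1039 before it) and is not a final
# consonant (no ASAT U+103A or VIRAMA U+1039 right after it). The punctuation
# marks U+104A and U+104B are split off as units of their own.
SYLLABLE_START = re.compile(
    r"(?<!\u1039)[\u1000-\u1021](?![\u103A\u1039])|[\u104A\u104B]"
)


def split_syllables(text):
    """Split Myanmar text into syllables with a simplified rule."""
    marked = SYLLABLE_START.sub(lambda m: "\n" + m.group(0), text)
    return [s for s in marked.split("\n") if s]


def concatenate(syllables, units, gap_samples=0):
    """Join recorded syllable waveforms (lists of samples) into one waveform."""
    output = []
    for i, syllable in enumerate(syllables):
        if syllable not in units:
            raise KeyError(f"no recording for syllable {syllable!r}")
        if i > 0:
            output.extend([0] * gap_samples)  # silence between syllables
        output.extend(units[syllable])
    return output


# The word "Myanmar" in Myanmar script, written with escapes for clarity.
word = "\u1019\u103C\u1014\u103A\u1019\u102C"
syllables = split_syllables(word)

# Placeholder sample values standing in for recorded audio.
units = {syllables[0]: [1, 2, 3], syllables[1]: [4, 5]}
print(len(syllables), concatenate(syllables, units, gap_samples=2))
# prints: 2 [1, 2, 3, 0, 0, 4, 5]
```

The rule does not cover every spelling (for example, independent vowels and digits are not treated as syllable starts), so a production segmenter would need a fuller rule set.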

Page 15: Introduction to Myanmar Text-To-Speech

Applications of Myanmar TTS

The longest-standing application is screen readers for people with visual impairment.

TTS is also commonly used by people with dyslexia and other reading difficulties, as well as by pre-literate children.

Speech synthesis techniques are also used in entertainment productions such as games and animations.

Text-to-speech communication aids for people with disabilities have become widely deployed in mass transit.

Text-to-speech is also used in second-language acquisition.

Page 16: Introduction to Myanmar Text-To-Speech

Dream for the Myanmar TTS

Text-to-speech engine integrated with mobile phones, computers, and electronic devices.

Voice commands integrated with mobile phones, computers, and electronic devices.

Everyone can read any Myanmar news, information, and electronic text with a screen reader.

Text-to-speech engine powering public announcements and weather notifications.

Screen reader functions will be integrated with OCR, so that even images can be read aloud.

Page 17: Introduction to Myanmar Text-To-Speech

Thanks for being here and participating in the Project

Sponsorship of the TTS Project by KBZ Group of Companies

Great arrangement by Yangon Education Center for Blind

Contribution of knowledge by several people

Last but not least, a warm welcome to future contributors.