Post on 23-Dec-2015

Copyright, 1998 © Alexander Schonfeld

TIP: Try and stay awake… kick sleeping neighbors. Don’t blink!

Introduction

Internationalization (i18n) is the process of designing an application so that it can be adapted to different languages and regions, without requiring engineering changes.

Localization (l10n) is the process of adapting software for a specific region or language by adding locale-specific components and translating text.

Organization of Presentation

What is i18n?
Java example of messages
What is a “locale”?
Formatting data in messages
Translation issues
Date/Time/Currency/etc
Unicode and support in Java
Iteration through text

Why is i18n important?

Build once, sell anywhere…
Modularity demands it!
– Ease of translation
“With the addition of localization data, the same executable can be run worldwide.”

Characteristics of i18n...

Textual elements such as status messages and the GUI component labels are not hardcoded in the program. Instead, they are stored outside the source code and retrieved dynamically.

Support for new languages does not require recompilation.

Other culturally-dependent data, such as dates and currencies, appear in formats that conform to the end-user's region and language.

Really why…

Carmaggedon

The rest is Java… why?

Java:
– is readable!
– has most complete built-in i18n support.
– easily illustrates correct implementation of many i18n concepts.
– concepts can be extended to any language.

For more info see:
www.coolest.com/i18n
java.sun.com/docs/books/tutorial/i18n

Java Example: Messages...

Before:
System.out.println("Hello.");
System.out.println("How are you?");
System.out.println("Goodbye.");

Too much code!

After:

Sample Run…

% java I18NSample fr FR
Bonjour.
Comment allez-vous?
Au revoir.

% java I18NSample en US
Hello.
How are you?
Goodbye.

Created MessagesBundle_fr_FR.properties, which contains these lines:

(What the translator deals with.)

In the English one?

1. So What Just Happened?

greetings = Bonjour.
farewell = Au revoir.
inquiry = Comment allez-vous?

Look!

2. Define the locale...

Look!

3. Create a ResourceBundle...

Look!

4. Get the Text from the ResourceBundle...

What is a “locale”?

Locale objects are only identifiers.
After defining a Locale, you pass it to other objects that perform useful tasks, such as formatting dates and numbers.
These objects are called locale-sensitive, because their behavior varies according to Locale.
A ResourceBundle is an example of a locale-sensitive object.

Did you get that?

currentLocale = new Locale(language, country);
(language = "fr", country = "FR")

message = ResourceBundle.getBundle("MessagesBundle", currentLocale);

MessagesBundle_en_US.properties
MessagesBundle_fr_FR.properties
MessagesBundle_de_DE.properties

greetings = Bonjour.
farewell = Au revoir.
inquiry = Comment allez-vous?

message.getString("inquiry")
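The three steps above (define the locale, create a ResourceBundle, get the text) can be sketched as one small program. This is only a sketch under stated assumptions: the class name I18NSampleSketch and the helper bundleFor are hypothetical, and the bundles are built from in-memory text with PropertyResourceBundle so the example runs without any files. The slides instead load MessagesBundle_*.properties files through ResourceBundle.getBundle.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class I18NSampleSketch {
    // Assumption: in-memory stand-ins for MessagesBundle_fr_FR.properties and
    // MessagesBundle_en_US.properties, so the sketch needs no files on disk.
    static final String FRENCH =
            "greetings = Bonjour.\nfarewell = Au revoir.\ninquiry = Comment allez-vous?\n";
    static final String ENGLISH =
            "greetings = Hello.\nfarewell = Goodbye.\ninquiry = How are you?\n";

    // Hypothetical helper: picks the bundle text by language code.
    static ResourceBundle bundleFor(Locale locale) {
        String text = "fr".equals(locale.getLanguage()) ? FRENCH : ENGLISH;
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Define the locale, create the bundle, fetch the text for one key.
    static String lookup(String language, String country, String key) {
        Locale currentLocale = new Locale(language, country);
        ResourceBundle messages = bundleFor(currentLocale);
        return messages.getString(key);
    }

    public static void main(String[] args) {
        String language = args.length > 0 ? args[0] : "en";
        String country = args.length > 1 ? args[1] : "US";
        System.out.println(lookup(language, country, "greetings"));
        System.out.println(lookup(language, country, "inquiry"));
        System.out.println(lookup(language, country, "farewell"));
    }
}
```

With real properties files on the classpath, bundleFor would be replaced by the ResourceBundle.getBundle("MessagesBundle", currentLocale) call shown on the slide.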

Got a program… need to…

What do I have to change?
What’s easily translatable?
What’s NOT?
– “It said 5:00pm on that $5.00 watch on May 5th!”
– “There are 5 watches.”
Unicode characters.
Comparing strings.

What do I have to change?

Just a few things…
messages
labels on GUI components
online help
sounds
colors
graphics
icons
dates
times
numbers
currencies
measurements
phone numbers
honorifics and personal titles
postal addresses
page layouts

What’s easily translatable? Isolate it!

Status messages
Error messages
Log file entries
GUI component labels

– BAD!
Button okButton = new Button("OK");

– GOOD!
String okLabel = ButtonLabel.getString("OkKey");
Button okButton = new Button(okLabel);

What’s NOT (easily translatable)?

“At 1:15 PM on April 13, 1998, we attack the 7 ships on Mars.”

MessageBundle_en_US.properties

template = At {2,time,short} on {2,date,long}, we attack \
the {1,number,integer} ships on planet {0}.
planet = Mars

{0}: The String in the ResourceBundle that corresponds to the "planet" key.

{2,date,long}: The date portion of a Date object. The same Date object is used for both the date and time variables. In the Object array of arguments, the index of the element holding the Date object is 2.

{2,time,short}: The time portion of a Date object. The "short" style specifies the DateFormat.SHORT formatting style.

{1,number,integer}: A Number object, further qualified with the "integer" number style.

What’s NOT = “Compound Messages”

Example!

1. Compound Messages: messageArguments...
Set the message arguments…
Remember the numbers in the template refer to the index in messageArguments!

2. Compound Messages: create formatter...
Don’t forget setting the Locale of the formatter object...

3. Compound Messages:
Get the template we defined earlier…
Then pass in our arguments!

And finally RUN...

Sample Run…

currentLocale = en_US
At 1:15 PM on April 13, 1998, we attack the 7 ships on the planet Mars.

currentLocale = de_DE
Um 13.15 Uhr am 13. April 1998 haben wir 7 Raumschiffe auf dem Planeten Mars entdeckt.

(Note: I modified the example and don’t speak German, so I couldn’t translate my changes; the German does not match.)
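The compound-message steps can be sketched with java.text.MessageFormat. This is a minimal sketch, not the original slide code: the class name CompoundMessageSketch and the method attackMessage are hypothetical, and the template is written inline rather than read from a ResourceBundle. The exact time and date text depends on the locale data of the running JDK, so no output is shown for those parts.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class CompoundMessageSketch {
    static String attackMessage(Locale locale, String planet, int ships, Date when) {
        // Same template as the slide; normally fetched with getString("template").
        String template = "At {2,time,short} on {2,date,long}, "
                + "we attack the {1,number,integer} ships on planet {0}.";

        // 1. The numbers in the template are indexes into this array.
        Object[] messageArguments = { planet, ships, when };

        // 2. Create the formatter and set its Locale before applying the pattern.
        MessageFormat formatter = new MessageFormat("");
        formatter.setLocale(locale);

        // 3. Apply the template, then pass in the arguments.
        formatter.applyPattern(template);
        return formatter.format(messageArguments);
    }

    public static void main(String[] args) {
        Date when = new GregorianCalendar(1998, Calendar.APRIL, 13, 13, 15).getTime();
        System.out.println(attackMessage(Locale.US, "Mars", 7, when));
        System.out.println(attackMessage(Locale.GERMANY, "Mars", 7, when));
    }
}
```

Note that only the data formats change with the locale here; the sentence itself stays English until a translated template is supplied for each locale.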

What’s NOT (easily translatable)? Answer = Plurals!

There are no files on XDisk.
There is one file on XDisk.
There are 2 files on XDisk.

3 possibilities for output templates.
Possible integer value in one of the templates.
Also variable...

Plurals(s)’ses!?!

ChoiceBundle_en_US.properties

pattern = There {0} on {1}.
noFiles = are no files
oneFile = is one file
multipleFiles = are {2} files

There are 2 files on XDisk.

Plurals!

What’s different?
Now we even index our templates… see fileStrings, indexed with fileLimits.
First create the array of templates.

How = Not just a pattern... Now we have formats too...

And...
Before we just called format directly after applyPattern...
Now we have setFormats too.
This is required to give us another layer of depth to our translation.

Sample Run…

currentLocale = en_US
There are no files on XDisk.
There is one file on XDisk.
There are 2 files on XDisk.
There are 3 files on XDisk.

currentLocale = fr_FR
Il n' y a pas des fichiers sur XDisk.
Il y a un fichier sur XDisk.
Il y a 2 fichiers sur XDisk.
Il y a 3 fichiers sur XDisk.
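The plural handling described above can be sketched with ChoiceFormat plugged into MessageFormat through setFormats. This is a sketch under assumptions: the class PluralsSketch and method fileMessage are hypothetical names, the strings are written inline instead of being read from ChoiceBundle_en_US.properties, and only the English templates are included.

```java
import java.text.ChoiceFormat;
import java.text.Format;
import java.text.MessageFormat;
import java.util.Locale;

class PluralsSketch {
    static String fileMessage(int fileCount, String diskName) {
        // fileStrings is indexed with fileLimits: 0 -> noFiles, 1 -> oneFile, 2 and up -> multipleFiles.
        double[] fileLimits = { 0, 1, 2 };
        String[] fileStrings = { "are no files", "is one file", "are {2} files" };
        ChoiceFormat choiceForm = new ChoiceFormat(fileLimits, fileStrings);

        MessageFormat messageForm = new MessageFormat("");
        messageForm.setLocale(Locale.US);
        messageForm.applyPattern("There {0} on {1}.");

        // Formats are assigned in the order the placeholders appear in the pattern:
        // {0} uses the ChoiceFormat, {1} keeps the default (null) formatting.
        Format[] formats = { choiceForm, null };
        messageForm.setFormats(formats);

        // Argument 0 selects the template, argument 2 fills the nested {2}.
        Object[] messageArguments = { fileCount, diskName, fileCount };
        return messageForm.format(messageArguments);
    }

    public static void main(String[] args) {
        for (int count = 0; count < 4; count++) {
            System.out.println(fileMessage(count, "XDisk"));
        }
    }
}
```

The nested {2} works because MessageFormat re-processes a ChoiceFormat result that still contains a placeholder, using the same argument array.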

Numbers and Currencies!

What’s wrong with my numbers?
– We say: 345,987.246
– Germans say: 345.987,246
– French say: 345 987,246

Numbers...

Supported through NumberFormat!

getAvailableLocales shows what locales are available. Note, you can also create custom formats if needed.

Locale[] locales = NumberFormat.getAvailableLocales();

345 987,246 fr_FR
345.987,246 de_DE
345,987.246 en_US
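A minimal sketch of locale-sensitive number formatting follows. The class NumberFormatSketch and method formatNumber are hypothetical names. The French grouping separator is a non-breaking space whose exact character depends on the JDK's locale data, so the French result is printed rather than stated.

```java
import java.text.NumberFormat;
import java.util.Locale;

class NumberFormatSketch {
    // Formats the same value according to the conventions of the given locale.
    static String formatNumber(double amount, Locale locale) {
        NumberFormat numberFormatter = NumberFormat.getNumberInstance(locale);
        return numberFormatter.format(amount);
    }

    public static void main(String[] args) {
        double amount = 345987.246;
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale locale : locales) {
            System.out.println(formatNumber(amount, locale) + " " + locale);
        }
    }
}
```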

Money!

Supported with: NumberFormat.getCurrencyInstance!

9 876 543,21 F fr_FR
9.876.543,21 DM de_DE
$9,876,543.21 en_US

Percents?

Supported with: NumberFormat.getPercentInstance!
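Currency and percent formatting follow the same pattern as plain numbers. In this sketch the class MoneyAndPercentSketch and its two methods are hypothetical names. The franc and mark amounts on the slide date from 1998; a current JDK formats fr_FR and de_DE amounts in euros, so only the US results are stated in the usage note below.

```java
import java.text.NumberFormat;
import java.util.Locale;

class MoneyAndPercentSketch {
    // Formats an amount with the currency symbol and layout of the locale.
    static String formatCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    // Formats a fraction (0.75) as a percentage (75%).
    static String formatPercent(double fraction, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale locale : locales) {
            System.out.println(formatCurrency(9876543.21, locale) + " " + locale);
            System.out.println(formatPercent(0.75, locale) + " " + locale);
        }
    }
}
```

For Locale.US the two calls return $9,876,543.21 and 75%.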

A Date and Time…

Supported with:
– DateFormat.getDateInstance
– DateFormat.getTimeInstance
– DateFormat.getDateTimeInstance

DateFormat timeFormatter = DateFormat.getTimeInstance(DateFormat.DEFAULT, currentLocale);

DateFormat dateFormatter = DateFormat.getDateInstance(DateFormat.DEFAULT, currentLocale);

DateFormat dateTimeFormatter = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG, currentLocale);

Date example...

Supported with: DateFormat.getDateInstance!

9 avr 98 fr_FR
9.4.1998 de_DE
09-Apr-98 en_US
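A short sketch of the DateFormat factory methods in use. The class DateFormatSketch and its methods are hypothetical names. The exact strings produced for each style vary between JDK versions and their locale data, which is why the slide output above may not match a current run; the sketch prints whatever the running JDK produces.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DateFormatSketch {
    // Date only, in the LONG style of the given locale.
    static String longDate(Date date, Locale locale) {
        DateFormat dateFormatter = DateFormat.getDateInstance(DateFormat.LONG, locale);
        return dateFormatter.format(date);
    }

    // Date and time together, both in the DEFAULT style.
    static String dateTime(Date date, Locale locale) {
        DateFormat dateTimeFormatter = DateFormat.getDateTimeInstance(
                DateFormat.DEFAULT, DateFormat.DEFAULT, locale);
        return dateTimeFormatter.format(date);
    }

    public static void main(String[] args) {
        Date when = new GregorianCalendar(1998, Calendar.APRIL, 13, 13, 15).getTime();
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale locale : locales) {
            System.out.println(longDate(when, locale) + " | " + dateTime(when, locale) + " " + locale);
        }
    }
}
```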

Characters...

16 bit! 65,536 code values (later Unicode versions go beyond this range; Java represents those characters as surrogate pairs)
Encodes all major languages
In Java a char is a 16-bit Unicode code unit
See unicode.org/

[Figure: the 16-bit code space from 0x0000 to 0xFFFF, with regions labeled ASCII, Greek, Symbols, Kana, Internal, Future Use, etc...]

Java support for the Unicode char...

Character API:
– isDigit
– isLetter
– isLetterOrDigit
– isLowerCase
– isUpperCase
– isSpaceChar
– isDefined

Unicode char values accessed with:
String eWithCircumflex = new String("\u00EA");

Example of some repair…

– BAD!
if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) // ch is a letter

– GOOD!
if (Character.isLetter(ch)) // ch is a letter

Get the Unicode category for a char:

– LOWERCASE_LETTER
– UPPERCASE_LETTER
– MATH_SYMBOL
– CONNECTOR_PUNCTUATION
– etc...

if (Character.getType('_') == Character.CONNECTOR_PUNCTUATION)
// ch is a "connector"
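The difference between the BAD and GOOD checks shows up as soon as a non-ASCII letter is tested. In this sketch the class CharacterSketch and its helper methods are hypothetical names; the Character calls are the ones listed on the slides.

```java
class CharacterSketch {
    // The BAD check: only recognizes the 52 unaccented ASCII letters.
    static boolean isAsciiLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    // The GOOD check: asks the Unicode character database.
    static boolean isUnicodeLetter(char ch) {
        return Character.isLetter(ch);
    }

    // Category lookup, as in the CONNECTOR_PUNCTUATION example.
    static boolean isConnector(char ch) {
        return Character.getType(ch) == Character.CONNECTOR_PUNCTUATION;
    }

    public static void main(String[] args) {
        char eWithCircumflex = '\u00EA';
        System.out.println("ASCII range check: " + isAsciiLetter(eWithCircumflex));
        System.out.println("Character.isLetter: " + isUnicodeLetter(eWithCircumflex));
        System.out.println("underscore is a connector: " + isConnector('_'));
    }
}
```

The range check rejects the accented e, while Character.isLetter accepts it.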

Comparing Strings

Called “string collation”
Collation rules provided by the Collator class
Rules vary based on Locale
Note:
– can customize rules with RuleBasedCollator
– can optimize collation time with CollationKey

•Strings of the world unite!

Collator!

As always make a new class...
Note the Unicode char definitions.
Finally note the use of the collator.compare

Sample Run!

The English Collator returns:

peach
péché
pêche
sin

According to the collation rules of the French language, the preceding list is in the wrong order. In French, "péché" should follow "pêche" in a sorted list. The French Collator thus returns:

peach
pêche
péché
sin
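A sketch of locale-aware sorting with Collator. The class CollatorSketch and method sortWords are hypothetical names, and the original slide code (which calls collator.compare directly) is not reproduced; here the Collator is passed to Arrays.sort as a comparator, which uses compare internally. The relative order of the two accented words depends on the collation rules shipped with the running JDK, so the sketch prints the result instead of stating it.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatorSketch {
    // Returns a sorted copy of the words, ordered by the locale's collation rules.
    static String[] sortWords(Locale locale, String... words) {
        String[] sorted = words.clone();
        Collator collator = Collator.getInstance(locale);
        Arrays.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        // Note the Unicode char definitions for the accented letters.
        String[] words = { "p\u00E9ch\u00E9", "sin", "peach", "p\u00EAche" };
        System.out.println("en_US: " + Arrays.toString(sortWords(Locale.US, words)));
        System.out.println("fr_FR: " + Arrays.toString(sortWords(Locale.FRANCE, words)));
    }
}
```

When the same strings are compared many times, CollationKey objects obtained from collator.getCollationKey can be sorted instead, as the slide's note on optimization suggests.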

Detecting Text Boundaries

Important for?
Word processing functions such as selecting, cutting, pasting text… etc. (double-click and select)

BreakIterator class (imaginary cursor)
– Character boundaries: getCharacterInstance
– Word boundaries: getWordInstance
– Sentence boundaries: getSentenceInstance
– Line boundaries: getLineInstance

•Beware!!! The END of the word is coming!

BreakIterator:

First we create our wordIterator.
Then attach the iterator to the target text.
Loop through the text finding boundaries and set them to carets in our footer string.

She stopped. She said, "Hello there," and then went on.
^ ^^ ^^ ^ ^^ ^^^^ ^^ ^^^^ ^^ ^^ ^^ ^
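The three steps above (create the word iterator, attach it to the text, loop over the boundaries and mark them with carets) can be sketched as follows. The class WordBoundarySketch and its methods are hypothetical names, not the original slide code.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class WordBoundarySketch {
    // Collects every word boundary offset in the text, including 0 and text.length().
    static List<Integer> wordBoundaries(String text, Locale locale) {
        BreakIterator wordIterator = BreakIterator.getWordInstance(locale);
        wordIterator.setText(text);
        List<Integer> boundaries = new ArrayList<>();
        for (int b = wordIterator.first(); b != BreakIterator.DONE; b = wordIterator.next()) {
            boundaries.add(b);
        }
        return boundaries;
    }

    // Builds the "footer string": a caret under each boundary position.
    static String markBoundaries(String text, Locale locale) {
        char[] footer = new char[text.length() + 1];
        Arrays.fill(footer, ' ');
        for (int b : wordBoundaries(text, locale)) {
            footer[b] = '^';
        }
        return new String(footer);
    }

    public static void main(String[] args) {
        String text = "She stopped. She said, \"Hello there,\" and then went on.";
        System.out.println(text);
        System.out.println(markBoundaries(text, Locale.US));
    }
}
```

Printed in a fixed-width font, the carets line up under the start and end of each word and punctuation mark.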

BreakIterator:

You see this: [image: the Arabic word for “house”] (I only speak English...)

Although this word contains three user characters, it is composed of six Unicode characters:

String house = "\u0628" + "\u064e" + "\u064a" + "\u0652" + "\u067a" + "\u064f";

Really only 3 user characters… (Imagine the characters masked on top of each other…)

BreakIterator:

First note creating the Arabic/Saudi Arabia Locale.
Then notice our 6 Unicode chars of text.
Looping through the text finding boundaries yields only 3 breaks after the beginning.

0 2 4 6
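The character-boundary case can be sketched the same way with getCharacterInstance. The class CharacterBoundarySketch and method characterBoundaries are hypothetical names; the string and the Arabic/Saudi Arabia locale are the ones from the slides.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CharacterBoundarySketch {
    // Collects the offsets between user-perceived characters.
    static List<Integer> characterBoundaries(String text, Locale locale) {
        BreakIterator charIterator = BreakIterator.getCharacterInstance(locale);
        charIterator.setText(text);
        List<Integer> boundaries = new ArrayList<>();
        for (int b = charIterator.first(); b != BreakIterator.DONE; b = charIterator.next()) {
            boundaries.add(b);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        Locale arabic = new Locale("ar", "SA");
        // Three base letters, each followed by a combining vowel mark.
        String house = "\u0628" + "\u064e" + "\u064a" + "\u0652" + "\u067a" + "\u064f";
        System.out.println("chars: " + house.length());
        System.out.println("boundaries: " + characterBoundaries(house, arabic));
    }
}
```

Each base letter and its combining mark form one user character, so the iterator reports the slide's boundaries 0, 2, 4 and 6 rather than a break after every char.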

BreakIterator (sentence boundaries):

It works with:

Please add 1.5 liters to the tank! “It’s up to us.”
^ ^ ^

Problems with:

"No man is an island . . . every man . . . "
^ ^ ^ ^ ^ ^^

My friend, Mr. Jones, has a new dog. The dog's name is Spot.
^ ^ ^ ^

BreakIterator (line boundaries):

Returns places where you can split a line (good for word wrapping):

According to a BreakIterator, a line boundary occurs after the end of a sequence of whitespace characters (space, tab, newline).

She stopped. She said, "Hello there," and then went on.
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^

Java provides:

BreakIterator:

FileInputStream fis = new FileInputStream("test.txt");
InputStreamReader defaultReader = new InputStreamReader(fis);
String defaultEncoding = defaultReader.getEncoding();

[Figure: Non-Unicode bytes → InputStreamReader → Unicode chars; Unicode chars → OutputStreamWriter → Non-Unicode bytes]

FileOutputStream fos = new FileOutputStream("test.NEW");
Writer out = new OutputStreamWriter(fos, "UTF8");

("UTF8" is the output encoding format.)
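The conversion in both directions can be sketched without touching the file system. This is a sketch under assumptions: the class EncodingSketch and its methods are hypothetical names, and in-memory byte streams stand in for the FileInputStream and FileOutputStream used on the slide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Unicode chars -> OutputStreamWriter -> UTF-8 bytes.
    static byte[] encode(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // UTF-8 bytes -> InputStreamReader -> Unicode chars.
    static String decode(byte[] data) {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String word = "p\u00EAche";
        byte[] utf8 = encode(word);
        System.out.println(word.length() + " chars, " + utf8.length + " bytes");
        System.out.println(decode(utf8).equals(word));
    }
}
```

Naming the encoding explicitly on both sides avoids depending on the platform default that getEncoding reports.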

For more info on i18n and:

– W3C and i18n
The future of HTTP, HTML, XML, CSS2…
– GUIs
– The OTHER character sets…
Scary stuff… those ISO standards
– UNIX/clones
C programming for i18n
X/Open I18N Model

•Go forth and internationalize...