Internationalizing JavaScript Applications
Norbert Lindenberg
© Norbert Lindenberg 2012. All rights reserved.
ECMAScript
• Language Specification
• Developed by Ecma TC 39
• Language syntax and semantics
• Core API: Object, String, Array, RegExp, ...
• 5.1 current
• 6 expected December 2013
ECMAScript
• Internationalization API Specification
• Developed by Ecma TC 39 + experts
• Collation, number, date & time formatting
• Started fall 2010
• Specification stable
• Implementations and test suite in progress
• Approval expected December 2012
JavaScript Environments
• Web browsers: with DOM, XHR
• Servers: Node
• Platforms: Firefox OS, Windows 8-style UI (formerly Metro), PhoneGap
• Libraries: jQuery, Dojo, YUI, GWT, and many more
Collation
Collation (Sorting)
• Old: String.prototype.localeCompare
• Only string argument
• New: Intl.Collator
• locales
• options
• Fixed: String.prototype.localeCompare
• With locales and options arguments
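A minimal sketch of the three variants listed above. The word list is made up for illustration, and the result of the old call depends on the host's default locale.

```javascript
// Hypothetical sample data: German words with and without an umlaut.
const words = ["Zebra", "Äpfel", "Apfel"];

// Old: localeCompare with only the string argument uses the host's default locale.
const oldStyle = words.slice().sort((a, b) => a.localeCompare(b));

// New: Intl.Collator takes locales and options. Its compare function is bound
// to the collator, so it can be passed straight to Array.prototype.sort.
const collator = new Intl.Collator("de", { sensitivity: "variant" });
const newStyle = words.slice().sort(collator.compare);

// Fixed: localeCompare accepts the same locales and options arguments.
const fixedStyle = words.slice().sort((a, b) =>
  a.localeCompare(b, "de", { sensitivity: "variant" }));

console.log(oldStyle, newStyle, fixedStyle);
```

When many strings are compared, creating one collator and reusing it is likely cheaper than passing locales to localeCompare on every call.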
Locales
• BCP 47 language tags
• Language, script, country codes
• “es”, “en-AU”, “zh-Hans-CN”
• Unicode locale extension
• “de-u-co-phonebk”
• Preference lists
• [“mr”, “hi”, “en-IN”]
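A short sketch of working with these tags. Note that Intl.getCanonicalLocales and Intl.Locale were added in later editions of the Internationalization API, so they are assumed to be available in the runtime.

```javascript
// Canonicalize a preference list of BCP 47 language tags.
const preferences = Intl.getCanonicalLocales(["mr", "hi", "en-in"]);
console.log(preferences); // ["mr", "hi", "en-IN"]

// Take a tag apart into language, script, and region (country) subtags.
const chinese = new Intl.Locale("zh-Hans-CN");
console.log(chinese.language, chinese.script, chinese.region); // zh Hans CN

// The Unicode locale extension carries keywords such as the collation type.
const german = new Intl.Locale("de-u-co-phonebk");
console.log(german.baseName, german.collation); // de phonebk
```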
Locale Negotiation
• BCP 47 Lookup
• [“es-GT”, “es-MX”] → “es-GT”, “es”, “es-MX”
• Best fit
• Implementation-defined
• [“es-GT”, “es-MX”] → “es-GT”, “es-MX”, “es”
• Unicode extension handled separately
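The Lookup order above can be sketched by hand. The helpers lookupCandidates and lookup are hypothetical names for illustration, not part of the API, and the sketch assumes any Unicode extension has already been stripped from the tags.

```javascript
// Hypothetical helper: list the tags a Lookup-style matcher tries, in order.
// Each requested tag is truncated subtag by subtag before moving to the next.
function lookupCandidates(requested) {
  const candidates = [];
  for (const tag of requested) {
    let candidate = tag;
    while (candidate !== "") {
      if (!candidates.includes(candidate)) candidates.push(candidate);
      const cut = candidate.lastIndexOf("-");
      candidate = cut === -1 ? "" : candidate.slice(0, cut);
    }
  }
  return candidates;
}

// Hypothetical helper: pick the first candidate that is available.
function lookup(requested, available, fallback) {
  const match = lookupCandidates(requested).find((tag) => available.includes(tag));
  return match === undefined ? fallback : match;
}

console.log(lookupCandidates(["es-GT", "es-MX"])); // ["es-GT", "es", "es-MX"]
console.log(lookup(["es-GT", "es-MX"], ["en", "es", "es-MX"], "en")); // "es"

// The built-in matcher is selected with the localeMatcher option;
// the result depends on the locales the implementation supports.
console.log(Intl.Collator.supportedLocalesOf(["es-GT", "es-MX"], { localeMatcher: "lookup" }));
```

In the second call Lookup settles on "es" although "es-MX" is available, which is the kind of result a best-fit matcher may improve on.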
Collator Extensions
• co: collation – phonebook, pinyin, ...
• kf: case first – upper, lower
• kn: numeric sorting
• kk: use normalization
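As an illustration, extension keys can be placed directly in the language tag. Whether a given collation such as phonebk is honored depends on the implementation's locale data, so the first result is logged without an expected value.

```javascript
// "co" requests a collation variant; resolvedOptions shows what was granted.
const phonebook = new Intl.Collator("de-u-co-phonebk");
console.log(phonebook.resolvedOptions().collation);

// "kn" turns on numeric sorting, so digit runs compare by numeric value.
const numeric = new Intl.Collator("en-u-kn-true");
const files = ["file10", "file9", "file1"].sort(numeric.compare);
console.log(files); // ["file1", "file9", "file10"]
```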
Collator Options
• localeMatcher: lookup, best fit
• usage: sort, search
• sensitivity: base, accent, case, variant
• ignorePunctuation
• numeric, normalization, caseFirst
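A small sketch of how the sensitivity and caseFirst options change comparison results, using the English locale as an example:

```javascript
// base: only base letters matter, so case and accents are ignored.
const base = new Intl.Collator("en", { sensitivity: "base" });
console.log(base.compare("a", "A"), base.compare("a", "á")); // 0 0

// accent: accents matter, case still does not.
const accent = new Intl.Collator("en", { sensitivity: "accent" });
console.log(accent.compare("a", "A")); // 0
console.log(accent.compare("a", "á") !== 0); // true

// caseFirst: choose whether upper or lower case sorts first.
const upperFirst = new Intl.Collator("en", { caseFirst: "upper" });
console.log(["a", "A"].sort(upperFirst.compare)); // ["A", "a"]
```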
Non-ECMAScript
• Nothing good found (some for Latin only)
• Collation is hard
• Knowledge of full Unicode character set
• Big tables
Number Formatting
Number Formatting
• Old: Number.prototype.toLocaleString
• No arguments
• New: Intl.NumberFormat
• locales
• options
• Fixed: Number.prototype.toLocaleString
• With locales and options arguments
NumberFormat Extensions
• nu: numbering system
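For example, the nu key can request Thai digits. This sketch assumes the runtime ships Thai locale data.

```javascript
// Request the Thai numbering system through the Unicode locale extension.
const thaiDigits = new Intl.NumberFormat("th-u-nu-thai", { useGrouping: false });
console.log(thaiDigits.resolvedOptions().numberingSystem); // "thai"
console.log(thaiDigits.format(2012)); // "๒๐๑๒"
```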
NumberFormat Options
• localeMatcher: lookup, best fit
• style: decimal, currency, percent
• currency: ISO 4217 currency code
• currencyDisplay: symbol, code, name
• minimum/maximum digits
• useGrouping
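A brief sketch of these options in use. The locale and amounts are arbitrary examples. Exact spacing in some outputs varies with the implementation's locale data, so the last result is logged without an expected value.

```javascript
// style "currency" requires an ISO 4217 currency code.
const usd = new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" });
console.log(usd.format(1234.5)); // "$1,234.50"

// style "percent" multiplies by 100; digit options control rounding.
const percent = new Intl.NumberFormat("en-US", { style: "percent", maximumFractionDigits: 1 });
console.log(percent.format(0.256)); // "25.6%"

// useGrouping switches the grouping separator off.
const plain = new Intl.NumberFormat("en-US", { useGrouping: false, minimumFractionDigits: 2 });
console.log(plain.format(1234.5)); // "1234.50"

// currencyDisplay chooses between symbol, code, and name.
const byName = new Intl.NumberFormat("en-US", {
  style: "currency", currency: "JPY", currencyDisplay: "name"
});
console.log(byName.format(5000));
```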
Non-ECMAScript
| Library | Currency formatting (¤) | Percent formatting (%) | Numbering systems (๙) | Digit settings (#) | Grouping separator option (,) | Supported locales (⚑) |
| --- | --- | --- | --- | --- | --- | --- |
| Globalize | + | + | - | + | - | 250+ |
| Dojo | + | + | - | + | - | 30+ |
| Closure | + | + | + | + | + | 300+ |
| Windows 8-style UI | + | + | + | + | + | 100s |
| iLib | + | + | - | + | - | 10+ |
Date and Time Formatting
Date and Time Formatting
• Old: Date.prototype.toLocale[|Date|Time]String
• No arguments
• New: Intl.DateTimeFormat
• locales
• options
• Fixed: Date.prototype.toLocale[|Date|Time]String
• With locales and options arguments
DateTimeFormat Extensions
• ca: calendar
• nu: numbering system
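As a sketch, the ca key can request a non-Gregorian calendar. The date is an arbitrary example, and the exact era text in the output depends on the locale data.

```javascript
// 25 December 2012, constructed in UTC to avoid time zone surprises.
const christmas = new Date(Date.UTC(2012, 11, 25));

// Request the Buddhist calendar through the Unicode locale extension.
const buddhist = new Intl.DateTimeFormat("en-u-ca-buddhist", {
  timeZone: "UTC", year: "numeric"
});
console.log(buddhist.resolvedOptions().calendar); // "buddhist"
console.log(buddhist.format(christmas)); // contains the Buddhist-era year 2555
```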
DateTimeFormat Options
• localeMatcher: lookup, best fit
• timeZone: UTC
• hour12
• weekday, era, year, month, day, hour, minute, second, timeZoneName: components
• formatMatcher: basic, best fit
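A minimal sketch of component options together with timeZone and hour12. The date and locale are arbitrary examples.

```javascript
// 25 December 2012, 15:30 UTC.
const moment = new Date(Date.UTC(2012, 11, 25, 15, 30));

// Request date components; the locale decides their order and punctuation.
const dateFormat = new Intl.DateTimeFormat("en-US", {
  timeZone: "UTC", weekday: "long", year: "numeric", month: "long", day: "numeric"
});
console.log(dateFormat.format(moment)); // "Tuesday, December 25, 2012"

// hour12: false overrides the locale's default 12-hour clock.
const timeFormat = new Intl.DateTimeFormat("en-US", {
  timeZone: "UTC", hour: "2-digit", minute: "2-digit", hour12: false
});
console.log(timeFormat.format(moment)); // "15:30"
```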
Non-ECMAScript
| Library | Calendars (ca) | Time zones (tz) | Numbering systems (๙) | Supported locales (⚑) |
| --- | --- | --- | --- | --- |
| Globalize | 5+ | + | - | 250+ |
| Dojo | 4 | - | - | 30+ |
| Closure | + | + | + | 300+ |
| Windows 8-style UI | ? | - | ? | ? |
| iLib | 3 | + | - | 10+ |
| YUI | - | - | - | 50+ |
Message Construction
• Substitution
• {user} went to {city}.
• {user}さんは{city}へ行きました。
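Substitution can be sketched with a few lines of code. The substitute function and the sample values are hypothetical, not part of any library.

```javascript
// Hypothetical helper: replace {name} placeholders with values.
// Unknown placeholders are left untouched.
function substitute(pattern, values) {
  return pattern.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match);
}

const values = { user: "Maria", city: "Kyoto" };
console.log(substitute("{user} went to {city}.", values)); // "Maria went to Kyoto."
console.log(substitute("{user}さんは{city}へ行きました。", values)); // "Mariaさんは Kyotoへ..." style output
```

Because the translator controls the whole pattern, word order can differ per language without any change to the code.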
Message Construction
• Plurals
• {user} est allé à {city}.
• {user1} et {user2} sont allés à {city}.
• 1-6 forms depending on language
• {number, plural {one {...} few {...} many {...}}}
Message Construction
• Gender
• {user} est allé à {city}.
• {user} est allée à {city}.
• 1-4 forms depending on language
• {gender, select {female {...} male {...} unknown {...}}}
Message Construction
{gender, select {
female {num, plural {
one {{user1} est allée à {city}.}
other {{user1} et {user2} sont allées à {city}.}}}
male {num, plural {
one {{user1} est allé à {city}.}
other {{user1} et {user2} sont allés à {city}.}}}
}}
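The nested selection above can be approximated without a message parser. This is only a sketch: the messages table and formatVisit are hypothetical, falling back to the masculine forms for an unknown gender is an assumption, and Intl.PluralRules comes from a later edition of the Internationalization API.

```javascript
// Hypothetical message table: first select by gender, then by plural category.
const messages = {
  female: {
    one: "{user1} est allée à {city}.",
    other: "{user1} et {user2} sont allées à {city}."
  },
  male: {
    one: "{user1} est allé à {city}.",
    other: "{user1} et {user2} sont allés à {city}."
  }
};

// Maps a number to a plural category ("one", "other", ...) for French.
const pluralRules = new Intl.PluralRules("fr");

function formatVisit(gender, num, values) {
  // Assumption: unknown genders fall back to the masculine forms.
  const byGender = messages[gender] || messages.male;
  const pattern = byGender[pluralRules.select(num)] || byGender.other;
  return pattern.replace(/\{(\w+)\}/g, (match, name) =>
    name in values ? String(values[name]) : match);
}

console.log(formatVisit("female", 1, { user1: "Marie", city: "Paris" }));
// "Marie est allée à Paris."
console.log(formatVisit("male", 2, { user1: "Luc", user2: "Paul", city: "Paris" }));
// "Luc et Paul sont allés à Paris."
```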
Message Construction
• Google has MessageFormat for Closure environment
• Alex Sexton provided standalone version
Occupy Wall Street. By @tanlines.
Supplementary Characters
• Characters above U+FFFF
• Emoji, rare CJK, ancient scripts, musical symbols, ...
• 2 units in UTF-16
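To illustrate, here is an emoji (U+1F600, chosen as an example) seen through the code-unit-based string API:

```javascript
// U+1F600 written as its two UTF-16 code units (a surrogate pair).
const emoji = "\uD83D\uDE00";
console.log(emoji.length); // 2

// Recombine the surrogates into the code point by hand.
const high = emoji.charCodeAt(0); // 0xD83D
const low = emoji.charCodeAt(1);  // 0xDE00
const codePoint = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
console.log(codePoint.toString(16)); // "1f600"
```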
Today: UCS-2 or UTF-16?
UCS-2:
• Regular expressions
• String comparison
• Case conversion
UTF-16:
• Source text conversion
• URI handling
• DOM, text input, text rendering, XMLHttpRequest, libraries, apps
ECMAScript 6: UTF-16
• New Unicode mode in regular expressions
• Case conversion for full Unicode
• Full Unicode in identifiers
• String accessors for code points
• But: no change to low-level string comparison
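A short sketch of the code point accessors, case conversion, and comparison behavior as they shipped in ECMAScript 6. The sample characters are arbitrary.

```javascript
const text = "a\u{1F600}b";

// length still counts UTF-16 code units, but code point accessors exist.
console.log(text.length); // 4
console.log(text.codePointAt(1).toString(16)); // "1f600"
console.log(String.fromCodePoint(0x1F600) === "\u{1F600}"); // true

// String iteration walks code points, not code units.
console.log(Array.from(text).length); // 3

// Case conversion covers supplementary characters (Deseret small to capital).
console.log("\u{10428}".toUpperCase() === "\u{10400}"); // true

// Low-level comparison is unchanged: it compares code units, so a
// supplementary character (lead surrogate 0xD83D) sorts before U+FFFF.
console.log("\u{1F600}" < "\uFFFF"); // true
```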
Rendering
• Emoji on Mac/iOS are rendered with color font
• On Mac, only Safari supports this font
• Not Firefox, Chrome, Opera
• Fonts for other supplementary characters supported in all modern browsers
Regular Expressions
• RegExp in ES5 doesn’t have much Unicode support
• No support for Unicode character properties
• No support for supplementary characters
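The gap is easy to demonstrate. The u flag (ECMAScript 6) and Unicode property escapes (a later edition) shown for contrast are assumed to be available in the runtime.

```javascript
const emoji = "\u{1F600}";

// ES5-style pattern: "." matches one code unit, so a surrogate pair fails.
console.log(/^.$/.test(emoji)); // false

// With the u flag, "." and character classes work on code points.
console.log(/^.$/u.test(emoji)); // true
console.log(/^[\u{1F600}-\u{1F64F}]$/u.test(emoji)); // true

// Unicode property escapes match by character property.
console.log(/^\p{Script=Greek}+$/u.test("αβγ")); // true
```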
Regular Expressions
• CSet (inimino): Character classes with supplementary characters
• XRegExp (Steven Levithan and Mathias Bynens): Unicode categories and properties with supplementary characters
Unicode Normalization
• Makes strings be equal that users perceive as equal (more or less)
• ä = a ¨
• ự = ự
• 김 = ㄱ ㅣ ㅁ
Unicode Normalization
• ECMAScript “assumes” normalization happens where needed
• Reality: applications have to do it
• Libraries available, but not up to date:
• unorm (Matsuza)
• Richard Ishida’s normalizer
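For comparison, ECMAScript 6 later added String.prototype.normalize. A sketch of what normalization does, assuming a runtime that provides it:

```javascript
// Precomposed ä versus a followed by a combining diaeresis.
const composed = "\u00E4";
const decomposed = "a\u0308";
console.log(composed === decomposed); // false
console.log(composed.normalize("NFC") === decomposed.normalize("NFC")); // true

// A Hangul syllable decomposes into three conjoining jamo under NFD.
const syllable = "김";
console.log(syllable.normalize("NFD").length); // 3
console.log(syllable.normalize("NFD").normalize("NFC") === syllable); // true
```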
北京大学.中国
Internationalized Domain Names
• Unicode at user interface
• ASCII under the hood
• 北京大学.中国 = xn--1lq90ic7fzpc.xn--fiqs8s
• Main steps:
• normalization (as discussed)
• punycode (Mathias Bynens has latest)
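One way to see the conversion without a separate library is the URL parser, which applies normalization and Punycode to host names. This sketch assumes a runtime with a global URL constructor (current browsers and Node).

```javascript
// The parser converts the Unicode host name to its ASCII form.
const url = new URL("http://北京大学.中国/");
const asciiHost = url.hostname;
console.log(asciiHost);

// Every label of the converted name carries the "xn--" prefix.
console.log(asciiHost.split(".").every((label) => label.startsWith("xn--"))); // true
```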
Summary
• ECMAScript Internationalization API provides core functionality
• Please review and provide feedback
• http://norbertlindenberg.com/2012/06/ecmascript-internationalization-api/
• Libraries provide more internationalization support than you may think