Java I18n and Unicode JaxJug lightning talk 4/15/09.
Uploaded by bartholomew-freeman
Java I18n and Unicode
JaxJug lightning talk, 4/15/09
Java I18n
• Native support for Unicode
• Localization
    o Create properties files
    o Locale object
    o ResourceBundles
• Other locale-dependent data
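A minimal sketch of the pieces listed above. Assumptions: the class name I18nSketch, the "greeting" key, and the in-memory strings standing in for Messages_en.properties and Messages_es.properties are all hypothetical. A real application would normally put the properties files on the classpath and call ResourceBundle.getBundle("Messages", locale) instead of choosing the text by hand.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class I18nSketch {
    // Hypothetical stand-ins for the contents of two properties files,
    // keyed by language code.
    private static final Map<String, String> FILES = new HashMap<>();
    static {
        FILES.put("en", "greeting=Hello\n");
        FILES.put("es", "greeting=Hola\n");
    }

    // Picks the properties text for the locale's language, falling back to English,
    // and parses it with the same class the JDK uses for .properties bundles.
    static ResourceBundle bundleFor(Locale locale) {
        String text = FILES.getOrDefault(locale.getLanguage(), FILES.get("en"));
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String greeting(Locale locale) {
        return bundleFor(locale).getString("greeting");
    }

    // Other locale-dependent data: number formatting follows the locale's conventions.
    static String formatNumber(Locale locale, double value) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(greeting(Locale.forLanguageTag("es-ES")));   // Hola
        System.out.println(formatNumber(Locale.GERMANY, 1234567.891));  // 1.234.567,891
        System.out.println(formatNumber(Locale.US, 1234567.891));       // 1,234,567.891
    }
}
```

The lookup by language code only imitates the fallback chain that ResourceBundle.getBundle performs; it is here so the example runs without any files on disk.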
Issues
• But what about when you are talking to someone else?
    o Different encodings
    o BE and LE (big-endian and little-endian byte order)
• String s = new String(buffer, "UTF8");
• s.getBytes("UTF8")
• // Requires: import java.io.*;
  public String parseMessage(byte[] buffer) {
      // Collects the decoded characters (renamed so it no longer clashes with the parameter)
      StringBuffer message = new StringBuffer();
      try {
          // Decode the raw bytes as little-endian UTF-16
          InputStreamReader is = new InputStreamReader(
                  new ByteArrayInputStream(buffer), "UTF-16LE");
          BufferedReader br = new BufferedReader(is);
          int ch;
          // read() returns -1 at end of stream
          while ((ch = br.read()) != -1) {
              message.append((char) ch);
          }
          br.close();
          return message.toString();
      } catch (IOException e) {
          e.printStackTrace();
          return null;
      }
  }
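To make the encoding and byte-order issue concrete, here is a small sketch that encodes the same two-character string three ways and prints the bytes. Assumptions: the class name EncodingSketch and its helper methods are hypothetical, and java.nio.charset.StandardCharsets requires Java 7 or later, which is newer than this talk.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Formats each byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes the text with the given charset and returns the bytes as hex.
    static String encodeHex(String text, Charset charset) {
        return toHex(text.getBytes(charset));
    }

    // Encodes with one charset and decodes with another, as happens when
    // sender and receiver disagree about the encoding.
    static String transcode(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        String text = "A\u00F1"; // 'A' followed by n-with-tilde
        System.out.println(encodeHex(text, StandardCharsets.UTF_8));    // 41c3b1
        System.out.println(encodeHex(text, StandardCharsets.UTF_16BE)); // 004100f1
        System.out.println(encodeHex(text, StandardCharsets.UTF_16LE)); // 4100f100
        // Matching charsets round-trip; mismatched byte order does not.
        System.out.println(text.equals(
                transcode(text, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16LE))); // true
        System.out.println(text.equals(
                transcode(text, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE))); // false
    }
}
```

Passing a Charset constant instead of a name string also avoids the checked UnsupportedEncodingException that the String-based overloads declare.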
Converting Unicode
import java.io.*;

public class UnicodeFormatter {

    static public String byteToHex(byte b) {
        // Returns hex String representation of byte b
        char hexDigit[] = {
            '0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
        };
        char[] array = { hexDigit[(b >> 4) & 0x0f], hexDigit[b & 0x0f] };
        return new String(array);
    }

    static public String charToHex(char c) {
        // Returns hex String representation of char c
        byte hi = (byte) (c >>> 8);
        byte lo = (byte) (c & 0xff);
        return byteToHex(hi) + byteToHex(lo);
    }
}
(from http://java.sun.com/docs/books/tutorial/i18n/)
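The same char-to-hex idea can also be sketched with String.format, which is handy for dumping a whole string as \uXXXX escapes. Assumptions: the class name EscapeSketch and the method toEscapes are hypothetical and are not part of the tutorial code. The sketch works on UTF-16 code units, so a character outside the Basic Multilingual Plane shows up as its two surrogates.

```java
class EscapeSketch {
    // Renders every UTF-16 code unit of the input as a backslash, the letter u,
    // and four lowercase hex digits.
    static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append("\\u").append(String.format("%04x", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Precomposed n-with-tilde: one code unit.
        System.out.println(toEscapes("\u00f1"));
        // Plain n plus a combining tilde: two code units.
        System.out.println(toEscapes("n\u0303"));
    }
}
```

Printing both forms side by side makes it easy to see which representation of a character a string actually contains.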
Regular Expressions and Unicode
• What is a character?
    o ñ == \u00F1 OR \u006E \u0303
• Matching graphemes - .
• Canonical equivalence
    o Pattern.compile(pattern, Pattern.CANON_EQ);
• http://www.regular-expressions.info/unicode.html
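A short sketch of the points above: the two spellings of ñ are different strings, they become equal after normalization, and Pattern.CANON_EQ lets a pattern written one way match text written the other way. Assumptions: the class name CanonEqSketch and its helpers are hypothetical, and the expression \P{M}\p{M}* (a non-mark followed by any combining marks) is only an approximation of a grapheme, not a full implementation of Unicode grapheme clusters.

```java
import java.text.Normalizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CanonEqSketch {
    static final String COMPOSED = "\u00F1";    // single precomposed code point
    static final String DECOMPOSED = "n\u0303"; // letter n plus combining tilde

    // Compares two strings after converting both to composed form (NFC).
    static boolean sameAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    // Full-string match with the given Pattern flags (0 for none).
    static boolean matches(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // Counts approximate graphemes: one non-mark code point plus its trailing marks.
    static int countGraphemes(String text) {
        Matcher m = Pattern.compile("\\P{M}\\p{M}*").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(COMPOSED.equals(DECOMPOSED));                      // false
        System.out.println(sameAfterNfc(COMPOSED, DECOMPOSED));               // true
        System.out.println(matches(DECOMPOSED, COMPOSED, 0));                 // false
        System.out.println(matches(DECOMPOSED, COMPOSED, Pattern.CANON_EQ));  // true
        System.out.println(DECOMPOSED.length());                              // 2
        System.out.println(countGraphemes(DECOMPOSED + "o"));                 // 2
    }
}
```

Normalizing input once at the boundary is often simpler than enabling CANON_EQ on every pattern, since the flag can carry a performance cost.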