High Performance with Java (Alto Desempenho com Java)
![Page 2: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/2.jpg)
Foreword
In the beginning was the Tao. The Tao gave birth
to Space and Time. Therefore Space and Time
are Yin and Yang of programming.
Programmers that do not comprehend the Tao are
always running out of time and space for their
programs. Programmers that comprehend the
Tao always have enough time and space to
accomplish their goals.
How could it be otherwise?
From www.canonical.org/~kragen/tao-of-programming.htm
![Page 3: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/3.jpg)
What is High Performance?
• Hitachi H8 8-bit CPU, 16 MHz
• 32 KB RAM
2 x Sun SPARC Enterprise M5000
6 quad-core 2.4 GHz SPARC VII CPUs (6 MB L2 cache), 48 hardware threads, 32 GB RAM
Sources:
Sun Microsystems: www.sun.com/servers/midrange/m5000/
Wikipedia: en.wikipedia.org/wiki/Lego_Mindstorms
Aad van der Steen HPC Page - www.phys.uu.nl/~steen/web08/sparc.html
![Page 4: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/4.jpg)
High Performance is all about
“Delivering solutions which meet
requirements within time and space
constraints using available resources
rationally”
The most important resource: brain time.
HW increases performance with time, brain
decreases performance with time.
![Page 5: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/5.jpg)
Why Java?
• Mature technology
• Speedy and stable VMs (those who were
burned in the early days still loathe it,
though)
• Lots of high quality tools
• Lots of high quality available libraries
• Large ecosystem
• NOT the language itself
![Page 7: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/7.jpg)
A small case study
• Goal: Analyse 17 GB (gzipped) worth of
MSC Call Detail Records (CDRs in mobile
operator lingo)
Snippet:
04|001|26806XXXXXXXXXX|3519XXXXXXXX|3519800049344611||||||
081105|002559|||00062|00|000-076|015-113||||MALM1
|0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|268061011305482|B
AL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305482||3e3212003
4df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|3519XXXXXXXX||25||||||0|01|
9XXXXXXXX|002559|081105|00062||2||5||||||||||||3|||||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||
Note: Sensitive information was hidden
![Page 8: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/8.jpg)
A bit more info
• Approximately 170 GB uncompressed
• Exactly 359 014 695 CDRs
• Trivia: about 3 days' worth of GSM call
logs.
• Correlate CDRs with customer information
• Performance goal: running time must be
below one hour.
![Page 9: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/9.jpg)
Performance Budget
Network Bandwidth and Latency
Disk Bandwidth and Latency
Memory
CPU
![Page 10: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/10.jpg)
If you don’t take a temperature you
can’t find a fever
• Measure the progress as the system is
implemented
• Make *honest* measurements. Prove
yourself wrong.
• Avoid premature optimization – How can
you know? If you’re within your
performance budget don’t worry
(*) Fat Man’s Law X, “The House of God”,
Samuel Shem - http://en.wikipedia.org/wiki/The_House_of_God
![Page 11: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/11.jpg)
"The journey of a thousand miles starts
with a single step." Lao Tse
• Line read performance
1,811,229-line sample
Sample timings:
real 0m13.872s
user 0m13.366s
sys 0m4.056s
ETA: ~45 minutes
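The ETA is a straight-line extrapolation from the sample run to the full data set. A minimal sketch of that arithmetic, assuming throughput stays constant over all 359 014 695 CDRs; the class and method names are illustrative and not from the talk:

```java
/** Linear extrapolation of a sample timing to the full data set (illustrative helper). */
class EtaEstimate {

    /** Estimated seconds for totalRecords, given a timed sample of sampleRecords. */
    static double etaSeconds(long totalRecords, long sampleRecords, double sampleSeconds) {
        // Assumes the per-record cost of the sample holds for the whole input.
        return (double) totalRecords / sampleRecords * sampleSeconds;
    }

    public static void main(String[] args) {
        // Figures taken from the slides: full CDR count, sample size, "real" time.
        double eta = etaSeconds(359_014_695L, 1_811_229L, 13.872);
        System.out.println("ETA in minutes: " + eta / 60.0);
    }
}
```

With the figures from the slides this comes to roughly 45.8 minutes, in line with the ~45 minutes quoted.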
![Page 12: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/12.jpg)
I/O Tips
• Use memory-mapped files (see the
FileChannel.map and MappedByteBuffer
APIs)
• Use buffered I/O - BufferedInputStream
• Optimal buffer size is a multiple of the OS page
size (usually 8k)
• If the process is I/O bound and you have fast
CPUs, consider processing compressed
files
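As a rough illustration of the buffered and compressed-file tips, the sketch below reads a gzipped file line by line. The 64 KB buffer (a multiple of 8 KB), the ISO-8859-1 charset, and the class name are assumptions for the example, not values from the talk; the try-with-resources syntax needs Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

/** Counts lines in a gzipped text file using large buffers (illustrative sketch). */
class GzipLineCounter {

    // Assumed buffer size: 64 KB, a multiple of an 8 KB page.
    static final int BUFFER_SIZE = 8 * 8 * 1024;

    static long countLines(File gzFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(
                        // GZIPInputStream buffers the compressed bytes it reads from disk.
                        new GZIPInputStream(new FileInputStream(gzFile), BUFFER_SIZE),
                        StandardCharsets.ISO_8859_1),
                BUFFER_SIZE)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }
}
```

Decompressing on the fly trades CPU time for disk bandwidth, which only pays off when the disk is the bottleneck; measure both variants before committing to one.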
![Page 13: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/13.jpg)
One more step
• Extract date of call and customer phone
number
04|001|268061100021547|3519XXXXXXXX|3519800049344611||||||
081105|002559|||00062|00|000-076|015-113||||MALM1
|0|01|9XXXXXXXX|11|||2|1|MICOUT|0|0||||||||||||||||||331985|2680610113
05482|BAL10A|15|22|12402523|||||||||||||||||||||||||||||||02|||||100001011305
482||3e32120034df00|||0|1|17|||1|||3||1|01|3519XXXXXXXX||||1|01|351
9XXXXXXXX||25||||||0|01|9XXXXXXXX|002559|081105|00062||2||5|||||||
|||||3|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Censored numbers to protect the innocent
![Page 14: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/14.jpg)
Split lines by columns
String fields[] = line.split("\\|");
Sample timings:
real 1m0.670s
user 1m1.252s
sys 0m6.015s
ETA: 3 hours, 18 minutes
~4.4x slower (by the timings above)!!! Exceeded the performance budget
![Page 15: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/15.jpg)
When in doubt, profile
~85% spent splitting fields!
![Page 16: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/16.jpg)
Tune
String fields[] = split(line, '|', 3,10,11);
Sample timings:
real 0m13.450s
user 0m13.425s
sys 0m3.965s
ETA: 44 minutes and 35 seconds
14 extra lines of Java code and we’re back on track
![Page 17: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/17.jpg)
Must get SIM card data
• SIM card Type (prepaid, postpaid, ...)
• ~ 15 million record table
• Database constantly under load
• 4000 queries/s (0.25 ms/q) spare capacity
![Page 18: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/18.jpg)
Database Tips (JDBC)
– Reuse connections!
– Read only? setReadOnly(true)
– Always use PreparedStatements
– Always explicitly close ResultSets (GC
friendly)
– Turn off autocommit
– Use batched operations and transactions in
CRUD-type accesses
– Large ResultSets? Increase the fetch size! rs.setFetchSize(XXX)
![Page 19: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/19.jpg)
Ooops
• Too slow!
• Assuming an average rate of 4000 q/s:
ETA: ~ 1 day, 56 minutes
![Page 20: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/20.jpg)
Alternatives
In-Memory Databases
• TimesTen
• solidDB
Embedded Relational
• H2
• HSQLDB
• Derby
Other Embedded
• Berkeley DB
• InfinityDB
![Page 21: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/21.jpg)
Must keep a balance
Performance
vs.
Cost, Complexity, Learning Curve (aka neuron time), Maintenance
![Page 22: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/22.jpg)
Remembering old times
• In C/C++ you could map structs to
memory
• The amount of information needed is 16
bytes per SIM card (phone number, start
date, end date, type of card – 4 * 4 bytes)
• ~343 MB if stored in a compact form (int[])
• Sort the data and wrap the array in a List
• Use Collections.binarySearch to do the
heavy lifting
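The compact layout can be sketched as follows. This version swaps in a hand-written binary search directly over the int[] instead of the List wrapper plus Collections.binarySearch described above, and it assumes records are sorted by (phone number, start date) with non-overlapping validity intervals per number. All names are hypothetical.

```java
/** Four ints per record in one flat array: number, start, end, type (illustrative sketch). */
class CompactCardTable {

    static final int FIELDS = 4;      // ints per record, i.e. 16 bytes
    private final int[] data;         // assumed sorted by (number, start)

    CompactCardTable(int[] sortedData) {
        this.data = sortedData;
    }

    int size() {
        return data.length / FIELDS;
    }

    /** Returns the card type valid for number at time t (start <= t < end), or -1 if none. */
    int lookupType(int number, int t) {
        int lo = 0;
        int hi = size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int base = mid * FIELDS;
            int n = data[base];
            int start = data[base + 1];
            int end = data[base + 2];
            if (n < number || (n == number && end <= t)) {
                lo = mid + 1;         // record sorts before the key
            } else if (n > number || start > t) {
                hi = mid - 1;         // record sorts after the key
            } else {
                return data[base + 3];
            }
        }
        return -1;
    }
}
```

Searching the primitive array in place avoids allocating an object per probe, at the cost of writing the comparison logic by hand.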
![Page 23: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/23.jpg)
Way faster!
• No extra libraries, 40 lines of simple Java
code
ETA: 1 hour, 30 minutes and 35
seconds
Still above the budget
![Page 24: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/24.jpg)
Put those extra cores to work
• 6 quad-core 2.4 GHz SPARC VII CPUs (6 MB L2
cache), 48 hardware threads,
32 GB RAM
• Split the data in work units
• Split the work units among the threads
• Collect the results when the threads finish
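One way to sketch the split / distribute / collect steps is a fixed thread pool from java.util.concurrent, where each task works on its own private accumulator and the partial results are merged only at the end. The work-unit representation (an int[] of card types per unit) and all names are assumptions for illustration; the lambda syntax needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Counts records per card type across work units, one task per unit (illustrative sketch). */
class WorkUnits {

    static long[] countByType(List<int[]> units, int typeCount, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Distribute: every work unit becomes one task.
            List<Future<long[]>> futures = new ArrayList<>();
            for (int[] unit : units) {
                futures.add(pool.submit(() -> {
                    long[] local = new long[typeCount];   // private to this task, no locking
                    for (int type : unit) {
                        local[type]++;
                    }
                    return local;
                }));
            }
            // Collect: merge the partial counts on the calling thread.
            long[] total = new long[typeCount];
            for (Future<long[]> future : futures) {
                long[] part = future.get();
                for (int i = 0; i < typeCount; i++) {
                    total[i] += part[i];
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no task writes to data another task reads, the only synchronization point is Future.get().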
![Page 25: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/25.jpg)
Concurrent tips
• Concurrent programming is really hard!
• But you’re not going to be able to avoid it
(per-core CPU speed has stalled while the
number of cores keeps increasing)
• Don’t share R/W data among threads
• Locking will kill performance
• Be aware of memory architecture
java.sun.com/javase/6/docs/technotes/guides/concurrency/index.html
![Page 26: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/26.jpg)
Mission Accomplished
• With 8 threads of the 48 possible
Real running time: 10 minutes,
23 seconds
Near linear scaling!
There’s no point in optimizing further. We’ve
just hit the law of diminishing returns
en.wikipedia.org/wiki/Diminishing_returns
![Page 27: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/27.jpg)
What about Network I/O
• 1 thread per client using blocking I/O does
not scale
• Use Nonblocking I/O
• VM implementors will (probably) use the
best API in the host OS (epoll in
Linux kernel 2.6, for example)
• NBIO is hard. Don’t reinvent the wheel.
See Apache Mina - mina.apache.org
• Scales to over 10,000 (10k) connections easily!
![Page 28: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/28.jpg)
A few extra tips
• Know your VM
• Not all VMs are created equal
• Even without changing a line of code you
can improve things, if you know what
you’re doing
• If you’re using the Sun VM try the Server
VM (default is Client VM)
• Plenty of options to fiddle with
blogs.sun.com/watt/resource/jvm-options-list.html
![Page 29: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/29.jpg)
What about designing and maintaining
complex systems
• Implement a feature complete solution in
small scale
• Learn the performance characteristics.
Implement benchmarks.
• Change the architecture if needed
• How much does it cost? It’s all about
€€€€€ (licensing, hardware, human
resources, rack space, energy, cooling
requirements, maintenance,...)
![Page 30: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/30.jpg)
Keep measuring after the system
goes live
“The only man I know who behaves sensibly
is my tailor; he takes my measurements
anew each time he sees me. The rest go
on with their old measurements and
expect me to fit them.” George Bernard Shaw -
en.wikiquote.org/wiki/George_Bernard_Shaw
• Especially if you keep adding features
![Page 31: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/31.jpg)
Code snippets – A (way) faster split
// Extracts only the requested columns (indices in ascending order)
// instead of splitting the whole line.
public static String[] split(String l, char sep, int... columns) {
    String[] fields = new String[columns.length];
    int start = 0, column = 0, end, i = 0;
    while ((end = l.indexOf(sep, start)) != -1) {
        if (column++ == columns[i]) {
            fields[i] = l.substring(start, end);
            if (++i == columns.length)
                return fields;   // all requested columns found: stop scanning
        }
        start = end + 1;
    }
    // The last field of a line has no trailing separator.
    if (column == columns[i])
        fields[i] = l.substring(start);
    return fields;
}

String fields[] = split(line, '|', 3,10,11);
![Page 32: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/32.jpg)
Static in-memory “database”: Poor
man’s solution (but as fast as it gets)
import java.io.*;
import java.util.*;

public class ClientFile implements List<CardInfo>, RandomAccess {
    static final int CLIENT_SIZE = 16;   // bytes per record: 4 ints of 4 bytes
    int[] clients;

    public ClientFile() throws FileNotFoundException, IOException {
        File f = new File("clientes.db");
        int client_count = (int) (f.length() / CLIENT_SIZE);
        clients = new int[client_count * 4];
        byte b[] = new byte[(int) f.length()];
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        try {
            // readFully keeps reading until the buffer is full;
            // a single read() call may return fewer bytes.
            in.readFully(b);
        } finally {
            in.close();
        }
        for (int i = 0; i != client_count; ++i) {
            clients[i * 4] = toi(b, i * CLIENT_SIZE);
            clients[i * 4 + 1] = toi(b, i * CLIENT_SIZE + 4);
            clients[i * 4 + 2] = toi(b, i * CLIENT_SIZE + 8);
            clients[i * 4 + 3] = toi(b, i * CLIENT_SIZE + 12);
        }
    }

    // map 4 bytes (big-endian) to an integer
    public int toi(byte[] b, int offset) {
        return ((0xFF & b[offset]) << 24) +
               ((0xFF & b[offset + 1]) << 16) +
               ((0xFF & b[offset + 2]) << 8) +
               (0xFF & b[offset + 3]);
    }
(…)
![Page 33: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/33.jpg)
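Static in-memory “database”:
(continued)
(…)
    // Build a CardInfo view over the four ints of record number index.
    public CardInfo get(int index) {
        return new CardInfo(clients[index * 4],
                            clients[index * 4 + 1],
                            clients[index * 4 + 2],
                            clients[index * 4 + 3]);
    }

    // Find the card record for a phone number at the time of the call.
    // i(...) is defined in the elided part of the class (presumably it parses digits as an int).
    public CardInfo getCardInfo(String msisdn, String yymmdd, String hhmmss) {
        Calendar cal = Calendar.getInstance();
        // Calendar months are zero-based, hence the "- 1".
        cal.set(i(yymmdd, 0, 1) + 2000, i(yymmdd, 2, 3) - 1, i(yymmdd, 4, 5),
                i(hhmmss, 0, 1), i(hhmmss, 2, 3), i(hhmmss, 4, 5));
        int idx = Collections.binarySearch(this,
                new Key(i(msisdn),
                        (int) (cal.getTimeInMillis() / 1000)));
        if (idx < 0) {
            return null;   // no matching record
        }
        return get(idx);
    }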
![Page 34: Alto Desempenho com Java](https://reader033.fdocuments.us/reader033/viewer/2022052316/5599653d1a28ab821e8b472b/html5/thumbnails/34.jpg)
Questions?
• Answers: 1 €
• Answers that require thought: 5 €
• Correct answers: 20 €
• Dumb looks: For free!