jvm goes to big data
![Page 1: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/1.jpg)
JVM goes BigData
srisatish.ambati AT gmail.com
DataStax/OpenJDK
2/28/2011
@srisatish
![Page 2: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/2.jpg)
Motivation
• A compendium of recent jvm scale issues while working with big data.
• This talk will not have details on big data.
• Thanks Sid!
![Page 3: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/3.jpg)
Trail Ahead
• synchronized
• Non-blocking HashMap
– A state transition view
• Collections
• Serialization
• UUID
• Garbage Collection
– The free parameters!
– Generations, Promotion, Fragmentation
– Offheap
• Questions & asynchronous IO
![Page 4: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/4.jpg)
tools of trade
• What the JVM is doing:
– dtrace, hprof, introscope, jconsole, visualvm, yourkit, gchisto, zvision
• Invasive JVM observation tools:
– bci, jvmti, jvmdi/pi agents, logging
• What the OS is doing:
– dtrace, oprofile, vtune, perf
• What the network/disk is doing:
– ganglia, iostat, lsof, nagios, netstat, tcpdump
![Page 5: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/5.jpg)
![Page 6: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/6.jpg)
synchronized
under the hood
– Fast path for no-contention thin lock
– Bias threads to lock or bulk revoke bias
– Store-free biasing
![Page 7: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/7.jpg)
JMM: happens-before, causality
Partial order
volatile
Piggybacking
FutureTask
BlockingQueue
jsr133
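A minimal sketch of "piggybacking" on a volatile write, added here for illustration. The class and method names are hypothetical; the point is only that a plain field written before a volatile write becomes visible to any thread that later reads that volatile, per the happens-before rules of JSR-133.

```java
class PiggybackSketch {
    private int payload;             // plain field, deliberately not volatile
    private volatile boolean ready;  // the volatile flag carries the happens-before edge

    void publish(int value) {
        payload = value;  // ordinary write...
        ready = true;     // ...made visible by the volatile write that follows it
    }

    /** Returns the payload once published, or -1 if nothing has been published yet. */
    int tryRead() {
        // The volatile read comes first; only then is the plain field read.
        return ready ? payload : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        PiggybackSketch box = new PiggybackSketch();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        int seen;
        while ((seen = box.tryRead()) == -1) {
            Thread.yield();  // spin until the volatile flag becomes visible
        }
        writer.join();
        System.out.println("reader saw " + seen);
    }
}
```

FutureTask and BlockingQueue give the same guarantee through their own internal synchronization, so a hand-rolled flag like this is rarely needed in practice.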
![Page 8: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/8.jpg)
java.util.concurrent also holds locks!
![Page 9: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/9.jpg)
Tomcat under concurrent load!
![Page 10: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/10.jpg)
Non-blocking collections: Amdahl's > Moore's!
State, Actions – key/value pairs!
get, put, delete, _resize
ByteArray to hold Data
Concurrent writes: using CAS
No locks, no volatile
Much faster than locking under heavy load
Directly reach main data array in 1 step
Resize as needed
Copy Array to a larger Array on demand. Post updates
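The CAS-based write path can be sketched roughly as below. This is not the high-scale-lib implementation: it is a fixed-size, open-addressing table with hypothetical names, no delete and no resize, and it uses AtomicReferenceArray (which has volatile semantics) rather than the raw CAS the slide alludes to. It only illustrates claiming a key slot with a single compare-and-set instead of a lock.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class CasTableSketch {
    // Flat array: slot 2*i holds a key, slot 2*i + 1 holds its value.
    private final AtomicReferenceArray<Object> slots;
    private final int capacity;  // number of key/value pairs, a power of two

    CasTableSketch(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.capacity = capacity;
        this.slots = new AtomicReferenceArray<>(2 * capacity);
    }

    /** Stores value under key; returns false only when every slot is taken by other keys. */
    boolean put(Object key, Object value) {
        int idx = key.hashCode() & (capacity - 1);
        for (int probes = 0; probes < capacity; probes++) {
            Object k = slots.get(2 * idx);
            if (k == null) {
                // Empty slot: try to claim it with a CAS. If the CAS fails,
                // another thread claimed it first, so re-read the winning key.
                k = slots.compareAndSet(2 * idx, null, key) ? key : slots.get(2 * idx);
            }
            if (k.equals(key)) {
                slots.set(2 * idx + 1, value);  // key slot is ours (or already ours): post the value
                return true;
            }
            idx = (idx + 1) & (capacity - 1);   // linear probe to the next pair
        }
        return false;
    }

    /** Returns the value for key, or null if the key is absent or its value is not yet posted. */
    Object get(Object key) {
        int idx = key.hashCode() & (capacity - 1);
        for (int probes = 0; probes < capacity; probes++) {
            Object k = slots.get(2 * idx);
            if (k == null) return null;                      // hit an empty slot: key not present
            if (k.equals(key)) return slots.get(2 * idx + 1);
            idx = (idx + 1) & (capacity - 1);
        }
        return null;
    }
}
```

Keys are never removed in this sketch, which is what makes the "re-read after a failed CAS" step safe.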
![Page 11: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/11.jpg)
Death & Taxes: Java Overheads!
• Cost of an 8-char String?
8b hdr + 12b fields + 4b ptr + 4b pad
8b hdr + 4b len + 16b data
A: 56 bytes, or a 7x blowup
• Cost of 100-entry TreeMap<Double,Double>?
48b TreeMap
per entry: 40b TreeMap$Entry + 16b Double + 16b Double
A: 7248 bytes or a ~5x blowup
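The arithmetic behind both answers, as a small sketch. The byte sizes are the ones quoted on the slide (they depend on JVM, pointer width and settings); the class and method names are hypothetical, and the "raw payload" baselines (8 bytes for the characters, 16 bytes per key/value pair of doubles) are assumptions inferred from the quoted blowup factors.

```java
class OverheadSketch {
    /** Footprint of an 8-char String, summing the sizes in the order the slide lists them. */
    static int stringBytes() {
        int firstGroup = 8 + 12 + 4 + 4;  // hdr + fields + ptr + pad
        int secondGroup = 8 + 4 + 16;     // hdr + len + data (8 chars * 2 bytes each)
        return firstGroup + secondGroup;
    }

    /** Footprint of a TreeMap<Double,Double> with the given number of entries. */
    static int treeMapBytes(int entries) {
        int perEntry = 40 + 16 + 16;      // TreeMap$Entry + boxed key + boxed value
        return 48 + entries * perEntry;   // 48 bytes for the TreeMap object itself
    }

    public static void main(String[] args) {
        int s = stringBytes();
        System.out.println("String: " + s + " bytes, blowup " + (s / 8.0) + "x");
        int t = treeMapBytes(100);
        System.out.println("TreeMap: " + t + " bytes, blowup " + (t / 1600.0) + "x");
    }
}
```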
![Page 12: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/12.jpg)
yourkit: memory profile
![Page 13: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/13.jpg)
Which collection: Mozart or Bach?
Concurrency: Non-blocking HashMap, Google Collections
Overheads: Watch out for per-element costs! Primitives can be hard to manage!
Sparse collections
Average collection size in enterprise is ~3
![Page 14: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/14.jpg)
java.io.Serializable is S.L..O.…W
True to platform
Use “transient”
ObjectStreamField[]
Avro
Google Protocol Buffers
Externalizable + byte[]
Roll your own
serializable
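A rough sketch of the "Externalizable" option: the class writes its own fields, so the stream carries no per-field metadata for it. The `Point` class and the helper names are hypothetical, chosen only for illustration; no timing claims are made here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableSketch {
    /** Hypothetical value class that controls its own wire format. */
    public static class Point implements Externalizable {
        int x;
        int y;

        public Point() { }  // Externalizable requires a public no-arg constructor

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);  // exactly 8 bytes of payload, in a fixed order
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();  // must read in the same order as written
            y = in.readInt();
        }
    }

    static byte[] toBytes(Point p) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(p);
            oos.flush();  // drain buffered block data before copying the bytes
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Point fromBytes(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Point) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```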
![Page 15: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/15.jpg)
ser+deser smaller is better
https://github.com/eishay/jvm-serializers.git
![Page 16: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/16.jpg)
avro
• Schema
– No per datum overheads
– Optional code gen
• Types are runtime
• Untagged data
• No manually-assigned field Ids
Cons:
• Schema mismatches
• Runtime only checks
![Page 17: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/17.jpg)
google-proto-buffer
• Define message format in .proto file
• All data in key/value pairs
• Generate sources
• .builder for each class with getter/setter
![Page 18: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/18.jpg)
thrift
• Type, Transport, Protocol, Version, Processors
• Separation of structure from protocol & transport
• TCompactProtocol, etc
– tag/data, compression
• TSocket, TFileTransport, etc
• colocated clients & servers
![Page 19: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/19.jpg)
UUID
java.util.UUID is slow
dominated by sha_transform costs
Leach-Salz (128-bit)
Turns out that default PRNG (via SecureRandom)
uses /dev/urandom for seed initialization
-Djava.security.egd=file:/dev/urandom
• PRNG without file is at least 20%-40% better.
Use TimeUUIDs where possible – much faster
Alternatives: JUG – java.uuid.generator, com.eaio.uuid
~10x faster
http://github.com/cowtowncoder/java-uuid-generator
http://jug.safehaus.org/
http://johannburkard.de/blog/programming/java/Java-UUID-generators-compared.htm
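To make the "TimeUUID" idea concrete, here is a hedged sketch of building a version 1 (time-based) UUID by hand with only java.util.UUID's two-long constructor. It is not the code of JUG or com.eaio.uuid; the class name, the random node id (with the multicast bit set, as the UUID spec allows when no MAC address is used) and the millisecond clock source are all assumptions made for illustration. The point is that no SecureRandom call, and hence no kernel entropy read, happens per UUID.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

class TimeUuidSketch {
    // Number of 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
    private static final long GREGORIAN_OFFSET = 0x01B21DD213814000L;
    private static final AtomicLong LAST = new AtomicLong();
    private static final long LSB = makeLsb();  // variant + clock sequence + node, fixed per JVM

    private static long makeLsb() {
        ThreadLocalRandom r = ThreadLocalRandom.current();
        long clockSeq = r.nextLong() & 0x3FFFL;                          // 14-bit clock sequence
        long node = (r.nextLong() & 0xFFFFFFFFFFFFL) | 0x010000000000L;  // 48 bits, multicast bit set
        return 0x8000000000000000L | (clockSeq << 48) | node;            // top bits "10" = Leach-Salz variant
    }

    /** Packs a 60-bit timestamp into the time_low / time_mid / version / time_hi layout. */
    static UUID fromTimestamp(long ts) {
        long msb = (ts << 32)                        // time_low: timestamp bits 0..31
                 | (((ts >>> 32) & 0xFFFFL) << 16)   // time_mid: timestamp bits 32..47
                 | 0x1000L                           // version 1
                 | ((ts >>> 48) & 0x0FFFL);          // time_hi: timestamp bits 48..59
        return new UUID(msb, LSB);
    }

    /** Returns a new time-based UUID; the timestamp is forced to be strictly increasing. */
    static UUID next() {
        long now = System.currentTimeMillis() * 10_000L + GREGORIAN_OFFSET;
        while (true) {
            long last = LAST.get();
            long ts = Math.max(now, last + 1);  // bump by one tick if the clock has not advanced
            if (LAST.compareAndSet(last, ts)) {
                return fromTimestamp(ts);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next());
    }
}
```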
![Page 20: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/20.jpg)
/**
* Returns a {@code String} object representing this {@code UUID}.
*
* <p> The UUID string representation is as described by this BNF:
* <blockquote><pre>
* {@code
* UUID = <time_low> "-" <time_mid> "-"
* <time_high_and_version> "-"
* <variant_and_sequence> "-"
* <node>
* time_low = 4*<hexOctet>
* time_mid = 2*<hexOctet>
* time_high_and_version = 2*<hexOctet>
* variant_and_sequence = 2*<hexOctet>
* node = 6*<hexOctet>
* hexOctet = <hexDigit><hexDigit>
* hexDigit =
* "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
* | "a" | "b" | "c" | "d" | "e" | "f"
* | "A" | "B" | "C" | "D" | "E" | "F"
* }</pre></blockquote>
*
* @return A string representation of this {@code UUID}
*/
public String toString() {
    return (digits(mostSigBits >> 32, 8) + "-" +
            digits(mostSigBits >> 16, 4) + "-" +
            digits(mostSigBits, 4) + "-" +
            digits(leastSigBits >> 48, 4) + "-" +
            digits(leastSigBits, 12));
}
Leach-Salz UUID
![Page 21: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/21.jpg)
PerfTop: 1485 irqs/sec  kernel:18.6%  exact: 0.0%  [1000Hz cycles], (all, 8 CPUs)

samples   pcnt   function                                                           DSO
-------   -----  -----------------------------------------------------------------  -------------------
1882.00   26.3%  intel_idle                                                         [kernel.kallsyms]
1678.00   23.5%  os::javaTimeMillis()                                               libjvm.so
 382.00    5.3%  SpinPause                                                          libjvm.so
 335.00    4.7%  Timer::ImplTimerCallbackProc()                                     libvcllx.so
 291.00    4.1%  gettimeofday                                                       /lib/libc-2.12.1.so
 268.00    3.7%  hpet_next_event                                                    [kernel.kallsyms]
 254.00    3.6%  ParallelTaskTerminator::offer_termination(TerminatorTerminator*)   libjvm.so

PerfTop: 1656 irqs/sec  kernel:59.5%  exact: 0.0%  [1000Hz cycles], (all, 8 CPUs)

samples   pcnt   function                                                           DSO
-------   -----  -----------------------------------------------------------------  -------------------
6980.00   38.5%  sha_transform                                                      [kernel.kallsyms]
2119.00   11.7%  intel_idle                                                         [kernel.kallsyms]
1382.00    7.6%  mix_pool_bytes_extract                                             [kernel.kallsyms]
 437.00    2.4%  i8042_interrupt                                                    [kernel.kallsyms]
 416.00    2.3%  hpet_next_event                                                    [kernel.kallsyms]
 390.00    2.2%  extract_buf                                                        [kernel.kallsyms]
 376.00    2.1%  ThreadInVMfromNative::~ThreadInVMfromNative()                      libjvm.so
 321.00    1.8%  T.3542                                                             libjvm.so
 298.00    1.6%  __ticket_spin_lock                                                 [kernel.kallsyms]
 296.00    1.6%  Timer::ImplTimerCallbackProc()                                     libvcllx.so
 255.00    1.4%  Unsafe_GetInt                                                      libjvm.so
![Page 22: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/22.jpg)
summary
TimebasedUUIDs vs. UUIDs
use ~4 times less kernel time on creation!
No SHA library calls!
optimized toString()
Much faster than standard java.util.UUID
- Better instructions per clock (IPC) as well.
If on EC2:
Watch out for non-cacheable file access to /dev/urandom!
![Page 23: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/23.jpg)
String theory of Java!
byte[] vs. char[]
If ver > jdk16u21 try -XX:+UseCompressedStrings
Append performance (gc) differs:
Strings vs. StringBuffers
com.google.common.base.Joiner
• Join text for cheap,
• skipNulls() or useForNull()
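For readers without Guava on the classpath, the behaviour of a null-skipping join can be approximated with a single StringBuilder, as in this sketch. The helper name is hypothetical and this is a stdlib stand-in, not Joiner's implementation; with Guava the equivalent call would be `Joiner.on(",").skipNulls().join(parts)`.

```java
class JoinSketch {
    /** Joins the non-null parts with sep, appending into one StringBuilder. */
    static String joinSkippingNulls(String sep, String... parts) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String p : parts) {
            if (p == null) {
                continue;        // skip nulls instead of printing "null"
            }
            if (!first) {
                sb.append(sep);  // separator only between emitted parts
            }
            sb.append(p);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinSkippingNulls(",", "a", null, "b"));
    }
}
```

Appending into one builder avoids the intermediate String garbage that repeated `+=` in a loop produces, which is the GC difference the slide points at.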
![Page 24: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/24.jpg)
“Null References: A billion dollar mistake”
- C.A.R. Hoare
“I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.” - qconlondon, '09
![Page 25: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/25.jpg)
Best Practices: Garbage Collection
![Page 26: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/26.jpg)
-verbose:gc
GC Logs are cheap even in production
-Xloggc:gc.log
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
A bit expensive/obscure ones:
-XX:PrintFLSStatistics=2
-XX:PrintCMSStatistics=1
-XX:+PrintCMSInitiationStatistics
-XX:+PrintFLSCensus
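Alongside the log flags, the same collection counts and cumulative times can be read from inside the process through the standard java.lang.management API. A minimal sketch, with a hypothetical class name; collector names in the output vary by JVM and by the collector selected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStatsSketch {
    /** One line per collector: name, number of collections, accumulated collection time. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())   // -1 if undefined for this collector
              .append(", timeMs=").append(gc.getCollectionTime())   // approximate accumulated time
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```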
![Page 27: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/27.jpg)
Three free parameters
Allocation Rate: your workload!
Size: defines runway!
Live Set, memory
Pause times:
Stoppages!
![Page 28: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/28.jpg)
Four free parameters
Allocation Rate: your application load!
Size: defines runway!
Live Set, system memory
Pause times:
Stoppages!
(fourth: Overheads of GC – Space & CPU.)
![Page 29: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/29.jpg)
Part I: Sizing
to be -Xmx == -Xms or not?
Young generation:
Use -Xmn for predictable performance
new Object() → jvm allocates → eden
eden → survivor spaces (survivor ratio)
survivor spaces → promotion (TenuringThreshold) → old gen
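To see how a given -Xms/-Xmx/-Xmn combination actually carved up the heap, the memory pools (eden, survivor, old gen) can be listed at runtime. A sketch using the standard MemoryPoolMXBean API; the class name is hypothetical, and pool names differ between collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPoolsSketch {
    /** Describes each heap pool with its current usage and configured maximum. */
    static List<String> heapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;  // skip code cache, metaspace/perm and other non-heap pools
            }
            long used = pool.getUsage().getUsed();
            long max = pool.getUsage().getMax();  // -1 when the pool has no defined maximum
            lines.add(pool.getName() + ": used=" + used + " max=" + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory close to maxMemory at startup suggests -Xms was set near -Xmx.
        System.out.println("total=" + rt.totalMemory() + " max=" + rt.maxMemory());
        for (String line : heapPools()) {
            System.out.println(line);
        }
    }
}
```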
![Page 30: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/30.jpg)
Part II: Pick a collector!
Serial GC – Serial new + Serial Old
Parallel GC (default): Parallel Scavenge + Serial Old
UseParallelOldGC: Parallel Scavenge + Parallel Old
UseConcMarkSweepGC: ParNew, CMS Old, Serial Old
G1/Experimental
![Page 31: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/31.jpg)
Reading GC logs – a topic/tool
Full GC is STW
Initial Mark, Rescan/WeakRef/Remark are STW
Look for promotion failures
Look for concurrent mode failures
![Page 32: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/32.jpg)
...
995.330: [CMS-concurrent-mark: 0.952/1.102 secs] [Times: user=3.69 sys=0.54, real=1.10 secs]
995.330: [CMS-concurrent-preclean-start]
995.618: [CMS-concurrent-preclean: 0.279/0.287 secs] [Times: user=0.90 sys=0.20, real=0.29 secs]
995.618: [CMS-concurrent-abortable-preclean-start]
995.695: [GC 995.695: [ParNew (promotion failed)
Desired survivor size 41943040 bytes, new threshold 1 (max 1)
- age 1: 29826872 bytes, 29826872 total
: 720596K->703760K(737280K), 0.4710410 secs]996.166: [CMS996.317: [CMS-concurrent-abortable-preclean: 0.218/0.699 secs] [Times: user=1.39 sys=0.10, real=0.70 secs]
 (concurrent mode failure): 4100132K->784070K(5341184K), 4.7478300 secs] 4780154K->784070K(6078464K), [CMS Perm : 17033K->17014K(28400K)], 5.2191410 secs] [Times: user=5.70 sys=0.01, real=5.22 secs]
...
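Spotting the two failure markers in a log like this one is easy to automate. A small sketch that counts occurrences of a marker string; the class name is hypothetical and the sample text in `main` is abbreviated from the excerpt above. A real tool would stream the file rather than hold it in one String.

```java
class GcLogScanSketch {
    /** Counts non-overlapping occurrences of marker in the log text. */
    static int count(String log, String marker) {
        int n = 0;
        for (int i = log.indexOf(marker); i >= 0; i = log.indexOf(marker, i + marker.length())) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String log = "995.695: [GC 995.695: [ParNew (promotion failed)\n"
                   + " (concurrent mode failure): 4100132K->784070K(5341184K), 4.7478300 secs]\n";
        System.out.println("promotion failures: " + count(log, "(promotion failed)"));
        System.out.println("concurrent mode failures: " + count(log, "(concurrent mode failure)"));
    }
}
```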
![Page 33: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/33.jpg)
Tuning CMS
Don’t promote too often!
Frequent promotion causes fragmentation
(avoid never tenure) TenuringThreshold
Size the generations
Min GC times are a function of Live Set
Old Gen should host steady state comfortably
Avoid CMS Initiating heuristic
-XX:+UseCMSInitiatingOccupancyOnly
Use Concurrent for System.gc()
-XX:+ExplicitGCInvokesConcurrent
![Page 34: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/34.jpg)
GC Threads
Parallelize on multicores
-XX:ParallelGCThreads=4
(default: derived from # of cpus on system)
n if n <= 8, else 8 + (n - 8) * 5/8
-XX:ParallelCMSThreads=4
(default: derived from # of parallelgcthreads)
Strategy A:
Tune minor GCs & let application data stay in eden
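The default thread counts can be worked out by hand. The sketch below encodes the commonly documented HotSpot defaults of that era as an assumption: all CPUs up to 8, then five-eighths of each CPU beyond 8, and roughly a quarter of the parallel GC threads for concurrent CMS work. The class and method names are hypothetical; check the values your own JVM reports before relying on them.

```java
class GcThreadsSketch {
    /** Assumed default for -XX:ParallelGCThreads given the number of CPUs. */
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;  // integer division, as in the VM
    }

    /** Assumed default for the concurrent CMS thread count given ParallelGCThreads. */
    static int defaultParallelCmsThreads(int parallelGcThreads) {
        return (parallelGcThreads + 3) / 4;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int gcThreads = defaultParallelGcThreads(cpus);
        System.out.println(cpus + " cpus -> " + gcThreads + " parallel GC threads, "
                + defaultParallelCmsThreads(gcThreads) + " CMS threads");
    }
}
```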
![Page 35: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/35.jpg)
Fragmentation
Performance degrades over time
Inducing “Full GC” makes problem go away
Free memory that cannot be used
Round off errors
Reduce occurrence
Use a compacting collector
Promote less often
Use uniform sized objects
![Page 36: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/36.jpg)
Not enough large contiguous space for promotion
Small objects still can fit in the holes!
Compaction – stop the world.
Unsolved on Oracle/Sun Hotspot
Azul Systems Pauseless JVM.
![Page 37: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/37.jpg)
JRockit Mission Control
![Page 38: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/38.jpg)
Example
Application suddenly transitions to back-to-back full gcs.
Cannot use free mem – too many holes!
![Page 39: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/39.jpg)
Tools
• GCHisto
• jconsole
• VisualVM/VisualGC
• Logs
• Thread dumps
• yourkit memory profile, snapshots
![Page 40: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/40.jpg)
GCSpy
![Page 41: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/41.jpg)
Gone 0xff the heap !!
ByteBuffer.allocateDirect(16 * 1024 * 1024)
Also can be mapped memory of a file region
Store long-lived objects outside jvm
Managed by native i/o ops.
JNA: dynamically load & call native libraries without compile-time declarations, unlike JNI
Works for limited use cases in the lab.
Ex: Terracotta, HBase, Cassandra
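A minimal sketch of keeping data off the Java heap with a direct buffer. The class and helper names are hypothetical; the buffer in `main` is the 16 MB allocation quoted on the slide, treated here as a flat array of longs addressed by slot number.

```java
import java.nio.ByteBuffer;

class OffHeapSketch {
    /** Allocates a direct (off-heap) buffer of the given size in bytes. */
    static ByteBuffer newStore(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    /** Writes a long into the given slot using absolute addressing (no position changes). */
    static void write(ByteBuffer store, int slot, long value) {
        store.putLong(slot * Long.BYTES, value);
    }

    /** Reads the long stored in the given slot. */
    static long read(ByteBuffer store, int slot) {
        return store.getLong(slot * Long.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer store = newStore(16 * 1024 * 1024);
        write(store, 1000, 42L);
        System.out.println("direct=" + store.isDirect() + " value=" + read(store, 1000));
    }
}
```

The garbage collector sees only the small ByteBuffer object; the 16 MB behind it is native memory, which is why its release depends on the buffer object itself being collected.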
![Page 42: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/42.jpg)
Gone 0xff the heap ?
Issues to consider:
• No clear api to de-allocate from this region
– See jbellis patch to JNA-179 for FreeableBuffer
• Object cleanup relegated to finalization
– Single finalizer thread, Bug ID: 4469299
– Behind WeakReference processing in jdk16u21
Workaround:
• -XX:MaxDirectMemorySize=<size>
• Manually trigger System.gc() to avoid “leak”
![Page 43: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/43.jpg)
Virtually there!
Ballooning driver for Memory: Disable it!
Time (TSC) issue! It's relative!
Scheduling when # of threads > # of vcpus..
Tickless _nohz kernel
GC Thread starvation = STW pauses
large ec2 instances are not all equal..
DirectPathIO & vt-d, rvi – Watch out for Sockets!
Tools: Performance counters still not virtualized!
![Page 44: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/44.jpg)
summary
• JVM is still the most popular platform for deployment for the new languages!
• JVM heartburn around scale!
– Serialization
– UUID
– Object overhead
– Garbage Collection
– Hypervisor
![Page 45: jvm goes to big data](https://reader038.fdocuments.us/reader038/viewer/2022102804/54b783ca4a7959a8698b463d/html5/thumbnails/45.jpg)
References
Chris Wimmer, http://wikis.sun.com/display/HotSpotInternals/Synchronization
Russell & Detlefs, http://www.oracle.com/technetwork/java/biasedlocking-oopsla2006-wp-149958.pdf
Google Protocol Buffers, http://code.google.com/p/protobuf
Thrift, http://incubator.apache.org/thrift/static/thrift-20070401.pdf
Leach-Salz Variant of UUID, http://www.upnp.org/resources/draft-leach-uuids-guids-00.txt
Hans Boehm, http://www.hpl.hp.com/personal/Hans_Boehm/gc/complexity.html
Brian Goetz, JSR-133, http://www.ibm.com/developerworks/java/library/j-jtp03304/
GCSpy, http://www.cs.kent.ac.uk/projects/gc/gcspy/
Understanding GC logs, http://blogs.sun.com/poonam/entry/understanding_cms_gc_logs
Cliff Click's http://sourceforge.net/projects/high-scale-lib/