Flink 1.0-slides
What’s new in Apache Flink™ 1.0
Kostas Tzoumas, @kostas_tzoumas
Flink 1.0
• Released March 8, 2016
• First release in 1.x.y series
• Initiates backwards compatibility for selected APIs
• More than 64 contributors
• More than 450 JIRAs resolved
Flink 1.0: major features
• Out of core state
• Savepoints
• CEP library
• Improved monitoring & Kafka 0.9 support
Out of core state
Out of core state
• Alternative to in-memory state
• Powered by RocksDB instances in Flink TMs
• Enabled by using the RocksDBStateBackend
• State limited by disk space only
• State checkpoints save RocksDB databases in reliable store
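To make the idea concrete, here is a minimal sketch of disk-backed state with file-level checkpoints. It is a toy model, not Flink code: Python's built-in sqlite3 stands in for the embedded RocksDB instance, a local directory stands in for the reliable store, and the class and method names (DiskBackedCounterState, checkpoint, restore) are hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile


class DiskBackedCounterState:
    """Toy keyed state kept in an embedded on-disk database (stand-in for RocksDB)."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, count INTEGER)"
        )
        self.conn.commit()

    def get(self, key):
        row = self.conn.execute(
            "SELECT count FROM state WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else 0

    def add(self, key, amount=1):
        # Read-modify-write against the database rather than an in-memory map.
        self.conn.execute(
            "INSERT OR REPLACE INTO state (key, count) VALUES (?, ?)",
            (key, self.get(key) + amount),
        )

    def checkpoint(self, reliable_dir, checkpoint_id):
        # Flush pending writes, then copy the database file to the "reliable store".
        self.conn.commit()
        target = os.path.join(reliable_dir, f"chk-{checkpoint_id}.db")
        shutil.copyfile(self.db_path, target)
        return target

    @classmethod
    def restore(cls, checkpoint_file, db_path):
        # Recovery: copy the checkpointed database back to local disk and reopen it.
        shutil.copyfile(checkpoint_file, db_path)
        return cls(db_path)


work_dir = tempfile.mkdtemp()
local_dir = os.path.join(work_dir, "local")
reliable_dir = os.path.join(work_dir, "reliable")
os.makedirs(local_dir)
os.makedirs(reliable_dir)

state = DiskBackedCounterState(os.path.join(local_dir, "state.db"))
for word in ["a", "b", "a"]:
    state.add(word)
snapshot = state.checkpoint(reliable_dir, 1)
state.add("a")  # update made after the checkpoint; not part of the snapshot

restored = DiskBackedCounterState.restore(
    snapshot, os.path.join(local_dir, "restored.db")
)
```

The restored instance sees only what was checkpointed, while the live instance keeps its later update. In Flink itself the equivalent step is choosing the RocksDBStateBackend for the job; the sketch only illustrates why such state is bounded by disk rather than memory.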
Savepoints
Production deployments
• Maintaining stateful applications in production settings comes with its own challenges
• Failures, code upgrades, cluster maintenance, …
• Streaming jobs cannot be simply stopped and restarted
Reminder: fault tolerance
• At least once, at most once, exactly once
• Flink guarantees exactly-once processing
• Flink guarantees end to end exactly-once with selected sources and sinks
• e.g., Kafka → Flink → HDFS
How? Checkpoints
• Flink guarantees fault tolerance by regularly taking checkpoints of the application state without ever stopping the execution
• At failure, the input stream is rewound to the logical time of the last checkpoint
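The checkpoint-and-rewind idea can be sketched with a toy model. This is not Flink's implementation: it runs sequentially, whereas Flink checkpoints without pausing the job, and it assumes a replayable input addressed by offset. The function name and the fail_at parameter are hypothetical.

```python
def run_with_checkpoints(events, checkpoint_interval, fail_at=None):
    """Sum a replayable stream, checkpointing (offset, total) periodically.

    fail_at is a hypothetical knob: the offset at which one simulated crash occurs.
    Returns (final total, number of events that had to be reprocessed).
    """
    offset, total = 0, 0
    last_checkpoint = (0, 0)  # (input offset, state) at the last checkpoint
    reprocessed = 0
    failed_once = False

    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed_once:
            failed_once = True
            # Crash: drop in-flight state, restore the checkpoint, rewind the input.
            reprocessed = offset - last_checkpoint[0]
            offset, total = last_checkpoint
            continue
        total += events[offset]
        offset += 1
        if offset % checkpoint_interval == 0:
            last_checkpoint = (offset, total)

    return total, reprocessed


events = list(range(1, 11))
clean_run = run_with_checkpoints(events, checkpoint_interval=4)
crashed_run = run_with_checkpoints(events, checkpoint_interval=4, fail_at=7)
```

Both runs end with the same total even though the crashed run reads three events twice: state and input position are restored together, so each event affects the state exactly once.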
Introducing savepoints
• A savepoint is a Flink checkpoint that (1) is taken by the user, (2) is accessible externally, and (3) never expires
• Command line save & resume interface
• Save: flink savepoint <JobID>
• Resume: flink run -s <path/to/savepoint> <jobJar>
Savepoints and versions
• A savepoint saves a version of a stateful application at a well-defined time
• E.g.: take snapshots of one application at well-defined times
“Like git for state”
• Branch off from savepoints creating a tree of running application versions
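A small toy model of branching from a savepoint, assuming a replayable input and state that can be copied. ToyJob and its methods are hypothetical names for illustration; in Flink the real operations are the two CLI commands shown earlier.

```python
import copy


class ToyJob:
    """A running 'application version': an update function plus keyed state."""

    def __init__(self, update_fn, state=None, offset=0):
        self.update_fn = update_fn
        self.state = state if state is not None else {}
        self.offset = offset  # position in the replayable input

    def process(self, events, upto):
        for key, value in events[self.offset:upto]:
            self.state[key] = self.update_fn(self.state.get(key, 0), value)
        self.offset = upto

    def savepoint(self):
        # User-triggered snapshot: a self-contained copy of state and position.
        return {"state": copy.deepcopy(self.state), "offset": self.offset}

    @classmethod
    def resume(cls, savepoint, update_fn):
        # Start a (possibly upgraded) job from the savepoint's state and position.
        return cls(update_fn, copy.deepcopy(savepoint["state"]), savepoint["offset"])


events = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]

v1 = ToyJob(lambda acc, v: acc + v)
v1.process(events, upto=2)
sp = v1.savepoint()

# Branch two application versions off the same savepoint.
branch_sum = ToyJob.resume(sp, lambda acc, v: acc + v)
branch_max = ToyJob.resume(sp, lambda acc, v: max(acc, v))
branch_sum.process(events, upto=4)
branch_max.process(events, upto=4)
```

Both branches start from identical state and diverge only through their own logic, and the savepoint itself stays unchanged. That is the property that makes code upgrades, A/B tests and what-if runs possible.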
Essential for production deployments
• Application code upgrades
• Flink version upgrades
• Maintenance, migration, debugging
• What-if simulations
• A/B testing
• Time travel
Complex Event Processing
FlinkCEP
• What is Complex Event Processing?
• A catch-all term
• In our context: easily detect patterns in streams
Pattern API
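As an illustration of what "detecting patterns in streams" means, here is a minimal pattern matcher over a finite list of events. It is not FlinkCEP (which is a Java/Scala library running on unbounded streams); the names begin and followed_by only loosely echo the style of its API, and detect is a hypothetical helper.

```python
class Pattern:
    """Toy sequential pattern: named stages, each with a predicate."""

    def __init__(self, stages):
        self.stages = stages  # list of (name, predicate)

    @classmethod
    def begin(cls, name, predicate):
        return cls([(name, predicate)])

    def followed_by(self, name, predicate):
        # Relaxed contiguity: non-matching events may occur in between.
        return Pattern(self.stages + [(name, predicate)])


def detect(pattern, stream):
    """Return one match (stage name -> event) per event matching the first stage,
    completing each later stage with the earliest subsequent matching event."""
    matches = []
    first_name, first_predicate = pattern.stages[0]
    for start, event in enumerate(stream):
        if not first_predicate(event):
            continue
        match = {first_name: event}
        position = start + 1
        for name, predicate in pattern.stages[1:]:
            while position < len(stream) and not predicate(stream[position]):
                position += 1
            if position == len(stream):
                match = None  # stream ended before the pattern completed
                break
            match[name] = stream[position]
            position += 1
        if match is not None:
            matches.append(match)
    return matches


stream = [
    {"type": "temp", "value": 95},
    {"type": "temp", "value": 105},
    {"type": "smoke"},
    {"type": "temp", "value": 80},
]

pattern = Pattern.begin(
    "hot", lambda e: e["type"] == "temp" and e["value"] > 100
).followed_by("smoke", lambda e: e["type"] == "smoke")

alerts = detect(pattern, stream)
```

Here a single alert is produced: the 105-degree reading followed by the smoke event. A real CEP engine must additionally handle unbounded input, time windows and partial matches kept as state, which this sketch leaves out.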
Other features in 1.0
• Support for Kafka 0.9 API (and hence MapR Streams)
• Monitoring console: job submission, checkpoint statistics, detecting bottlenecks
• See http://flink.apache.org/news/2016/03/08/release-1.0.0.html
Closing
Summary
• Flink 1.0: Initiating backwards compatibility and pushing the envelope even further for production streaming deployments
What’s next
• SQL
• Dynamic scaling (+ savepoints)
• Hybrid in-memory/out-of-core state backend
• Queryable state
• Support for Apache Mesos
• More connectors and sinks (Kinesis, Cassandra, …)
Join the community
• Follow: @ApacheFlink, @dataArtisans
• Read: flink.apache.org/blog, data-artisans.com/blog
• Subscribe: (news | dev | user)@flink.apache.org