Stream Processing in the Cloud With Data Microservices
-
Upload
mariusbogoevici -
Category
Technology
-
view
1.584 -
download
0
Transcript of Stream Processing in the Cloud With Data Microservices
© 2016 Pivotal!1
Stream Processing in the Cloud with Data Microservices
Marius Bogoevici, Software Engineer, Pivotal @mariusbogoevici
!2
Stream Processing in the Cloud with Data Microservices
• Use Cases • Predictive maintenance
• Fraud detection • QoS measurement
• Log analysis • High throughput/low latency
• Growing quantities of data • Immediate response is required
• Grouping and ordering of data • Partitioning
• Windowing
!3
Stream Processing in the Cloud with Data Microservices
• Huge quantities of data to be analyzed efficiently • Scaling requirements
• Massive storage
• Massive computing power (memory/CPU) • Massive scalability, from a few machines to data center level
• Reliance on platform’s resource management abilities • public and private cloud: AWS • cluster managers: Apache YARN, Apache Mesos,
Kubernetes
• full application platforms: Cloud Foundry
!4
Stream Processing in the Cloud with Data Microservices
• Microservice pattern applied to data processing applications
• Typical benefits of microservices: • scalability, isolation, agility, continuous deployment,
operational control • Tuning process-specific resources
• Instance count • Memory • CPU
• Event-driven interaction
• communication decoupling, distributed data consistency • There’s life after REST
!5
dataflow:> stream create demo --definition “http | file”
!6
Spring Cloud Stream
▪ Built on battle-tested code
▪ Spring Boot: full-stack standalone apps
▪ Spring Integration: EAI patterns, Connectors
▪ Opinionated primitives
▪ Persistent Publish Subscribe Semantics
▪ Consumer Groups
▪ Partitioning
▪ Pluggable Middleware binding API
▪ Write applications in a middleware agnostic manner
▪ Kafka, Rabbit, Gemfire, Redis, …
▪ Programming model focuses on input/output channels
!7
Spring Cloud Stream in a nutshell …
!8
… in a 10000 ft nutshell …
!9
Publish-subscribe semantics
▪ Fits both data streaming and event-driven use cases
!10
Consumer groups▪ Borrowed from Kafka, applied to all binders ▪ Groups of competing consumers within the pub-sub destination ▪ Durable subscriptions ▪ Used in scaling and partitioning
!11
Partitioning
▪ Required for stateful processing scenarios ▪ Outputs specify a data partitioning strategy ▪ Inputs can be bound to a specific partition
!12
Programming model: Spring Integration
packageorg.springframework.cloud.stream.messaging;
publicinterfaceSource{StringOUTPUT="output";@Output(Source.OUTPUT)MessageChanneloutput();}
@EnableBinding(Source.class)@SpringBootApplicationpublicclassApplication{
@InboundChannelAdapter(Source.OUTPUT) public String sayHello() {
return “hello” + System.currentTimeMillis();}
}
!13
Programming model: @StreamListener
packageorg.springframework.cloud.stream.messaging;
publicinterfaceSink{StringINPUT="input";@Input(Sink.input())SubscribableChannelinput();}
@EnableBinding(Processor.class)publicclassGreetingProcessor{
@AutowireGreetingServicegreetingService;
@StreamListener(Sink.INPUT)publicvoidgreet(Greetinggreeting){greetingService(greeting);}}
{greeting: “hello world”}
contentType: application/json
!14
@Enable All the Things
▪ @EnableBindings(Source.class)
▪ one output
▪ @EnableBindings(Sink.class)
▪ one input
▪ @EnableBinding(Processor.class)
▪ one input and one output
▪ @EnableBinding(MyOrderHandler.class)
▪ custom interfaces with as many inputs and outputs
▪ @EnableRxJavaProcessor
▪ OOTB support for RxJava with one input and one output
!15
Binder SPI
!16
Binder SPI - mapping on native middleware support
!17
Spring Cloud Data Flow
▪ Portable Orchestration Layer for Stream and Tasks
▪ Stream and Task DSL
http | transform | hdfs
▪ REST API
▪ Shell
▪ UI
▪ OOTB apps for common integration use-cases
!18
Spring Cloud Deployer
▪ SPI for deploying applications to modern runtimes
▪ Local (for testing)
▪ YARN
▪ Cloud Foundry
▪ Kubernetes
▪ Mesos + Marathon
!19
dataflow:> stream create demo --definition “http | file”
▪ Stream definition
▪ Launching Boot applications
▪ Can pass configuration parameters via Spring Boot
▪ Control instance count, resource allocation
!20
Spring Cloud Data Flow
Pla$orm
!21
Data Flow Developer Experience
▪ 1. Create Spring Cloud Stream Microservice App
@EnableBinding(Processor.class)publicclassUpperCase{
@Transformer(inputChannel=Processor.INPUT,outputChannel=Processor.OUTPUT)publicStringprocess(Stringmessage){ returnmessage.toUpperCase();}
}
!22
Data Flow Development Experience
dataflow:>moduleregister--nameuppercase--typeprocessor--coordinatesgroup:ar7fact:version
dataflow:>streamcreatedemo--defini7on"h=p--server.port=9000|uppercase|file--directory=/tmp/demodata"
2:BuildandInstall:
$mvncleaninstall
3:RegisterModulewithDataFlow:
4:DefineStreamviaDSL:
!23
Taps
dataflow:> stream create tap --definition ":demo.http > counter --store=redis"
dataflow:> stream create demo --definition "http --server.port=9000 | uppercase | file --directory=/tmp/devnexus"
!24
Roadmap
▪ Spring Cloud Stream 1.0.0.RC1 - March 22, 2016
▪ Spring Cloud Stream 1.0.0.GA - March 2016
▪ Spring Cloud Data Flow - 1.0.0.GA Q2 2016
!25
Links!
h"p://cloud.spring.io/spring-cloud-streamh"p://github.com/spring-cloud/spring-cloud-stream-modulesh"p://github.com/spring-cloud/spring-cloud-stream-samplesh"ps://github.com/mbogoevici/devnexus2016h"p://github.com/spring-cloud/spring-cloud-deployer*h"p://cloud.spring.io/spring-cloud-dataflow*h"p://github.com/spinnaker
*CloudFoundry,YARN,Kubernetes,Mesos+Marathonvariants
QuesNons:email:[email protected]"er:@mariusbogoevici