Building Asynchronous Applications

Johan Edstrom
SOA Executive and Apache developer.
• Apache Member
• Apache Camel PMC
• Apache ServiceMix committer
• Original CXF Blueprint Author
• Cassandra client library developer
• Author
[email protected] [email protected]


Asynchronous applications using common frameworks

Savoir Technologies - This is where we started

Where are we now?

• We have worked heavily with
  • Governments
  • Insurance
  • Utilities
  • Network companies
  • Education companies
  • Medical processing companies

What is this all about?

• Scaling
  • It is hard
  • How it is done depends
  • Is this a blueprint?
• Tips n’ Tricks
  • Things we’ve learned over time
• Experience
  • Dos and Don’ts

Before we start

• What are we looking to change?

[Diagram: JVM, DB, Actor, Business Logic]

We will look a little at these tools and libraries

• Apache Camel
• Apache Karaf, Savoirtech Aetos, ServiceMix
• Apache ActiveMQ
• Apache Cassandra / Savoirtech Hecate
• Apache CXF
• And somewhat on Akka
  • We really are just peeking at Akka to validate some ideas

Apache Camel

• Apache Camel is an open source Java framework with
  • Concrete implementations of all the widely used EIP patterns
  • Connectivity to a great variety of transports and APIs
  • Easy to use Domain Specific Languages (DSL)

Apache Camel

• Camel is a Domain Specific Language (DSL) focused on implementing Enterprise Integration Patterns (EIPs)
  • Examples: multicast, splitter, content-based router, routing slip, “dynamic routers”, aggregator
• EIPs are implemented by Camel “Processors”
  • Users can develop their own processors
  • A processor is a fundamental building block
  • The Bean language and bindings exist so that not a single Apache Camel import is necessary when integrating your existing code
• Camel can connect to a wide variety of integration technologies
  • Examples: JMS, HTTP, FTP, SOAP, File. There are ~180 components
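To make the "content-based router" pattern named above concrete, here is a minimal plain-Java sketch. It deliberately uses only the JDK and is not Camel's API: `ContentBasedRouter`, `Rule`, `when` and `route` are hypothetical names chosen for illustration, and messages are assumed to be plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

/** Plain-Java sketch of a content-based router. Hypothetical names; this is not Camel's API. */
final class ContentBasedRouter {

    /** One routing rule: a condition plus the processor that handles matching messages. */
    private static final class Rule {
        final Predicate<String> condition;
        final Function<String, String> processor;

        Rule(Predicate<String> condition, Function<String, String> processor) {
            this.condition = condition;
            this.processor = processor;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final Function<String, String> otherwise;

    ContentBasedRouter(Function<String, String> otherwise) {
        this.otherwise = otherwise;
    }

    /** Adds a rule; rules are evaluated in the order they were added. */
    ContentBasedRouter when(Predicate<String> condition, Function<String, String> processor) {
        rules.add(new Rule(condition, processor));
        return this;
    }

    /** Sends the message to the first processor whose condition matches, else to the fallback. */
    String route(String message) {
        for (Rule rule : rules) {
            if (rule.condition.test(message)) {
                return rule.processor.apply(message);
            }
        }
        return otherwise.apply(message);
    }

    public static void main(String[] args) {
        ContentBasedRouter router = new ContentBasedRouter(m -> "dead-letter <- " + m)
                .when(m -> m.startsWith("<order"), m -> "orders <- " + m)
                .when(m -> m.startsWith("<invoice"), m -> "invoices <- " + m);

        System.out.println(router.route("<order id=\"1\"/>"));
        System.out.println(router.route("<invoice id=\"7\"/>"));
        System.out.println(router.route("hello"));
    }
}
```

In Camel itself the same idea is expressed through the routing DSL (`choice()`, `when()`, `otherwise()`), with the processors and endpoints supplied by the framework rather than hand-written as here.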

Apache Camel

Do I need all of that?

• Nope, many solutions will need just a few things
  • jaxb, camel, jms, soap, rest and perhaps jdbc
  • Cut the container down to fit your needs
  • We don’t need to load all of the 100+ Apache Camel components
  • Pick and choose!
• Should I run that messaging solution inside the “ESB”?
  • Entirely up to you; let us look a little deeper at that in a sec.
• Can I test these solutions or am I stuck with System.out.println and a remote debugger?

Apache Karaf

• Mini OSGi container
  • Foundation of Apache ServiceMix, Aetos, Talend ESB, Cisco Prime, ODL platforms and quite a few other offerings
  • For scaling you certainly don’t need Karaf: all of the concepts are theoretically possible in pretty much any language and platform if you do it correctly.
    • That said, Karaf enforces modular code (OSGi) and controlled deployment, and offers up many nice things like a remote console for “free”.

[Diagram labels: JMS, JAX-WS, JAX-RS, Camel, Spring, Aries; OSGi; Console, Logging, Provision, Admin, Spring-DM, Aries]

Decoupling with OSGi

• Enforces knowledge of imports and exports
• Allows us to version
• Programming model with micro-services allows for re-use on an API level vs. a code level
• Promotes contracts

Apache ActiveMQ

• Fast, powerful and flexible messaging system
• Easily embeddable
• Can create complex topologies
• Tons of connection possibilities
• Can be scaled up / down / right / left
  • Note: currently merging with HornetQ

Apache Cassandra

• Entered Apache as an Incubator project in 2009
  • Went through incubation to build a community of committers
• Became a Top Level Apache Project in February 2010
• Has proven to be a very flexible and widely used distributed big data solution
• CQL3 changes data modeling “slightly”
• Cassandra was named after the prophetess of Greek mythology
  • Cassandra could accurately predict things that would come
  • She spurned Apollo, the god of the Oracle of Delphi (thus a possible pun)

What are some language tools we can use?

• Futures, Promises, Continuations (Async JAX-RS, RIFE), JMS, Executors, Actors, Consumers, Producers, Runnables
  • Let’s not go too deep here; frameworks in Java land are there so that you do not have to write super-duper low-level code: Mina / Netty for networking, Guava for concurrency and collection handling, and CXF for JAX-WS / JAX-RS abstraction, to name just a few.
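As a small taste of the Futures and Executors mentioned above, the following JDK-only sketch runs two independent lookups on a pool and combines the results without the caller blocking on either one. The class name `FutureComposition`, the `greet` method and the fake lookups are assumptions made up for this illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** JDK-only illustration of futures on an executor; names and lookups are hypothetical. */
final class FutureComposition {

    /** Starts two independent lookups in parallel and combines them when both are done. */
    static CompletableFuture<String> greet(ExecutorService pool, String userId) {
        // Each supplyAsync call is queued on the pool and returns immediately.
        CompletableFuture<String> name =
                CompletableFuture.supplyAsync(() -> "user-" + userId, pool);
        CompletableFuture<Integer> visits =
                CompletableFuture.supplyAsync(() -> userId.length(), pool);

        // thenCombine registers a callback; no thread waits for the two results.
        return name.thenCombine(visits, (n, v) -> n + " (" + v + " visits)");
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // join() blocks only here, at the very edge of the program.
            System.out.println(greet(pool, "abc").join());
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the sketch is that blocking is pushed to the outermost caller; everything in between is wired together with callbacks.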

Traditional “full stack” application

• Synchronous in design
  • Browser -> Servlet container is response-time sensitive and expensive
  • Servlet code -> Service code can be time consuming
  • Service code -> Connection factories can easily block
  • Probably uses a Java / JSP framework, some JS
    • Developers tend to be forced to know Java, JS and a bit of RDBMS, fiddle with servlet containers, and rely quite a bit on QA for testing; the application shows weird and spurious errors during load testing

[Diagram: JVM, DB, Actor, Business Logic]

Persistence - Are you a noSQL candidate?

• Do you need to do complex queries on your data?
  • Many JOINs and foreign keys with many relationships
  • Can be done in Cassandra with Hadoop or other MapReduce (Spark)
  • Need to weigh the development effort vs. just writing a SQL query
• Do you need very strict ACID transactionality?
  • Banking/Financial transactions could be difficult
  • ACID/Transactions can be built in Cassandra with complex application code using tools such as ZooKeeper (distributed locks exist)
  • Need to weigh the development effort vs. using an RDBMS which supports transactionality out of the box
• Do you have very complex indexing requirements?
  • Are you indexing multiple fields and having to create many Column Families to access your data many different ways?

Let us say we are.

• The work here would be in data modeling
  • The benefits we’d reap are eventual/controllable consistency
  • Extremely fast and asynchronous writes
  • Automatic data distribution and partitioning
  • No single point of failure
  • You have to unlearn some things

*Images courtesy of DataStax

What does that look like in Java?

• To create our keyspace.
• To use it with CQL3 - DataStax driver
• Using it with Hecate - Hecate maps POJOs to prepared and async statements.

Compared to a traditional RDBMS?

• In one project we mapped in 17 registry services
  • ~20 million users in the system, more than 40 million transactions / day
    • Handled all load-test scenarios on a 5-node Cassandra ring.
  • We are talking about average response times < 100ms end to end.
  • We also don’t need to worry much about
    • First- and second-level caches
    • Cache synchronization or distribution
    • Locking
    • Select for update
    • Autoscaling
    • Building out
    • Building out geographically

[Diagram: JVM, Actor, Business Logic]

To sum up Cassandra

• If you are looking at doing this:
  select a,b,c from table_X where y > 15 allow filtering;
  • You need to rethink your data model. With sufficient amounts of data, this query can take down pretty much any cluster if y is not part of a partition key or an index (and indexes are bad for other reasons as well).
  • Solve these types of problems by writing your data the way you want to retrieve it.
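"Write your data the way you want to retrieve it" can be pictured with an in-memory analogy: group rows under the key you will query by, and keep them sorted on the column you will range over. The sketch below is only that analogy in plain Java, not Cassandra or driver code; `ReadingsByDevice`, the `deviceId` partition key and the payload strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** In-memory analogy of "partition key -> rows sorted by a clustering column". */
final class ReadingsByDevice {

    // Outer map plays the role of the partition key, inner sorted map the clustering column y.
    private final Map<String, NavigableMap<Integer, String>> partitions = new ConcurrentHashMap<>();

    /** Write path: store the row under the key we will later query by. */
    void write(String deviceId, int y, String payload) {
        partitions.computeIfAbsent(deviceId, k -> new ConcurrentSkipListMap<>()).put(y, payload);
    }

    /** Read path: one partition, one sorted range; unrelated partitions are never scanned. */
    List<String> whereYGreaterThan(String deviceId, int threshold) {
        NavigableMap<Integer, String> rows = partitions.get(deviceId);
        if (rows == null) {
            return new ArrayList<>();
        }
        // tailMap(threshold, false) is the strict "y > threshold" range, in ascending y order.
        return new ArrayList<>(rows.tailMap(threshold, false).values());
    }

    public static void main(String[] args) {
        ReadingsByDevice table = new ReadingsByDevice();
        table.write("d1", 10, "a");
        table.write("d1", 20, "b");
        table.write("d1", 30, "c");
        table.write("d2", 99, "z");
        System.out.println(table.whereYGreaterThan("d1", 15));
    }
}
```

The design choice mirrored here is that the cost of the query is paid at write time: the read touches one group of rows that is already in the right order, instead of filtering every row in the table.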

Front facing stuff

[Diagram: JVM, Actor, Business Logic]

Apache CXF

• Library that is passing the TCK for the Web Profile
• Builds on the base JAX-WS / JAX-RS in the JDK
• Also does esoteric stuff like CORBA / RMI
• Can be used to do JMS
• You can build pub-sub systems (WS-Notification)

Now let’s look at the browser-facing part

• JAX-RS 2.0 provides for async

Sync

Async

And the execution of the response

• Using an executor service, with timeouts
• Same thing but without lambdas
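The suspend-then-resume idea with an executor service and a timeout can be sketched with the JDK alone. This is a stand-in, not JAX-RS code: in JAX-RS 2.0 the container hands the resource method a suspended `AsyncResponse`, whereas here a `CompletableFuture` plays that role. `AsyncHandler`, `handle`, `sleepThenReturn` and the reply strings are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** JDK-only stand-in for "suspend the request, resume it from a worker, give up after a timeout". */
final class AsyncHandler {
    private final ExecutorService workers;
    private final ScheduledExecutorService timer;

    AsyncHandler(ExecutorService workers, ScheduledExecutorService timer) {
        this.workers = workers;
        this.timer = timer;
    }

    /** Returns immediately; the reply is completed by the worker or by the timeout, whichever is first. */
    CompletableFuture<String> handle(Supplier<String> slowCall, long timeoutMillis) {
        CompletableFuture<String> reply = new CompletableFuture<>();

        // The worker does the blocking call and "resumes" the suspended reply.
        workers.execute(() -> {
            try {
                reply.complete(slowCall.get());
            } catch (RuntimeException e) {
                reply.completeExceptionally(e);
            }
        });

        // complete() is a no-op if the worker already answered.
        ScheduledFuture<?> timeout = timer.schedule(() -> {
            reply.complete("TIMEOUT");
        }, timeoutMillis, TimeUnit.MILLISECONDS);

        // Drop the pending timer task once the reply is settled either way.
        reply.whenComplete((value, error) -> timeout.cancel(false));
        return reply;
    }

    /** Demo helper that simulates a slow backend. */
    static String sleepThenReturn(long millis, String value) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the pool
        }
        return value;
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AsyncHandler handler = new AsyncHandler(workers, timer);
        try {
            System.out.println(handler.handle(() -> "200 OK", 1000).join());
            System.out.println(handler.handle(() -> sleepThenReturn(5000, "late"), 100).join());
        } finally {
            workers.shutdownNow();
            timer.shutdownNow();
        }
    }
}
```

The request thread is free as soon as `handle` returns; the "without lambdas" variant would replace each lambda with an anonymous `Runnable` and behave the same.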

Let’s put some load on this

• We use Gatling (you could use JMeter or any other tool)
• 500-user load
• 500,000 requests
• We measure the total time, the time per request and the mean.
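With many concurrent users, "total time divided by requests" and "mean response time" are different numbers, which is easy to mix up when reading load-test output. The following sketch computes both from raw measurements; `LoadSummary` and its field names are hypothetical, and the sample numbers in `main` are invented purely to show the arithmetic.

```java
/** Summarizes a load run from its wall-clock duration and the individual response times. */
final class LoadSummary {
    final double elapsedPerRequestMs; // wall-clock time divided by the number of requests
    final double meanResponseMs;      // arithmetic mean of the individual response times
    final double requestsPerSecond;   // throughput over the whole run

    LoadSummary(long totalElapsedMs, long[] responseTimesMs) {
        if (responseTimesMs.length == 0 || totalElapsedMs <= 0) {
            throw new IllegalArgumentException("need at least one request and a positive elapsed time");
        }
        long sum = 0;
        for (long t : responseTimesMs) {
            sum += t;
        }
        int n = responseTimesMs.length;
        this.elapsedPerRequestMs = (double) totalElapsedMs / n;
        this.meanResponseMs = (double) sum / n;
        this.requestsPerSecond = n * 1000.0 / totalElapsedMs;
    }

    public static void main(String[] args) {
        // Invented sample: two users each send two 100 ms requests back to back,
        // so four requests finish in 200 ms of wall-clock time.
        LoadSummary s = new LoadSummary(200, new long[] {100, 100, 100, 100});
        System.out.println("elapsed per request (ms): " + s.elapsedPerRequestMs);
        System.out.println("mean response time (ms):  " + s.meanResponseMs);
        System.out.println("requests per second:      " + s.requestsPerSecond);
    }
}
```

In the invented sample the elapsed time per request is half the mean response time, simply because two requests are always in flight at once.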

With a delay of 200-500ms - Async

[Chart: Async response times]

With a delay of 10-100ms - Sync

[Chart: Sync response times]

Observations

• With a fast response that is linear
  • Async creates overhead, response times are worse
  • With just minimal blocking introduced, sync starts choking pretty fast
  • It is hard to simulate on one machine
• Where these techniques are utilized
  • A large EC2 system is currently hitting around 163 r/s going against Cassandra with prepared (sync) statements and utilizing async JAX-RS
  • It is slated to be a replacement for a sync system that uses JSP and Cassandra (or Mongo). It does about the same load over 3 physical machines with 16GB JVMs, and yes... there were better developers involved on the second system
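Why a synchronous stack "starts choking" once blocking is introduced can be estimated on the back of an envelope: if every request holds one thread for the whole blocking time, a pool of N threads can complete at most N divided by that time per second (Little's law). The sketch below encodes that bound. `BlockingPoolBound` and its methods are hypothetical names, the pool size of 200 is an assumption for illustration, and the 163 r/s and 200-500 ms figures are borrowed from the slides only as sample inputs, not as a claim about how that system behaves.

```java
/** Back-of-the-envelope capacity bound for a thread-per-request (blocking) pool. */
final class BlockingPoolBound {

    /** Upper bound on requests per second when each request holds one thread for blockMillis. */
    static double maxRequestsPerSecond(int threads, double blockMillis) {
        if (threads <= 0 || blockMillis <= 0) {
            throw new IllegalArgumentException("threads and blockMillis must be positive");
        }
        return threads * 1000.0 / blockMillis;
    }

    /** Threads needed to sustain targetRps (Little's law: in-flight = arrival rate x time in system). */
    static int threadsNeeded(double targetRps, double blockMillis) {
        if (targetRps <= 0 || blockMillis <= 0) {
            throw new IllegalArgumentException("targetRps and blockMillis must be positive");
        }
        return (int) Math.ceil(targetRps * blockMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Assumed pool of 200 threads, each request blocked for 500 ms.
        System.out.println("ceiling (r/s): " + maxRequestsPerSecond(200, 500));
        // Threads tied up at 163 r/s for the two ends of a 200-500 ms delay.
        System.out.println("threads at 200 ms: " + threadsNeeded(163, 200));
        System.out.println("threads at 500 ms: " + threadsNeeded(163, 500));
    }
}
```

The bound only covers threads parked on blocking calls; an asynchronous design avoids holding a thread during the wait, which is why it pays its overhead back once delays appear.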

Onto the last part!

• Let’s make that business thing async too!
• We’ll make it an almost “BPEL’ey” process.
• WS call from our main web service, with the response coming via a queue
• Now we could solve this over JMS request/reply, but we want to cluster too.

[Diagram: JVM, Actor, Business Logic]

What is it we want to solve?

• We want some storage
• We treat that as completely transient
• We want to share data between nodes
• We don’t want to re-invent the wheel

Hazelcast

We really need a MemoryGrid!!!!

• We can use Hazelcast as it has neat side effects

Init()

loadConfig()

A Hazelcast Config

• You can set up Hazelcast
  • To be unicast
  • To be multicast
  • To be EC2 aware
  • To implement persistence if you want
  • To use existing adaptors for things like Hibernate

Why the Hazelcast use?

Could we do this differently?

• JMS request/reply
  • But then you need to watch for
    • Connectivity, queue speed, timeout
  • If you don’t use Camel it is
    • Significantly more complex code
    • More error prone
• And down the road it is interesting to have a grid
• This grid could be used for “slip” patterns, to park transactional stuff, or as a lock / mutex

And the callback

• We rely on Hazelcast to inform us

And once we correlate

• We can now resume our web service out of band!

What happened there?

• The entry listener is the nice part
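The callback, correlate and resume steps can be sketched without a grid at all, which may help show why the entry listener is the nice part. In the sketch below a small notifying map stands in for the Hazelcast distributed map and its entry listener; it is not Hazelcast's API. `NotifyingMap`, `Correlator`, `awaitReply` and the correlation ids are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Stand-in for a grid map that notifies listeners whenever an entry is added. */
final class NotifyingMap {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, String>> listeners = new CopyOnWriteArrayList<>();

    void addEntryListener(BiConsumer<String, String> listener) {
        listeners.add(listener);
    }

    void put(String key, String value) {
        data.put(key, value);
        for (BiConsumer<String, String> listener : listeners) {
            listener.accept(key, value);
        }
    }
}

/** Parks suspended callers by correlation id and resumes them when the reply lands in the map. */
final class Correlator {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    Correlator(NotifyingMap replies) {
        replies.addEntryListener((correlationId, reply) -> {
            CompletableFuture<String> waiting = pending.remove(correlationId);
            if (waiting != null) {
                waiting.complete(reply); // resume the parked caller out of band
            }
            // Ids this node is not waiting for are ignored.
        });
    }

    /** Called when the web service request is suspended; completes once the reply arrives. */
    CompletableFuture<String> awaitReply(String correlationId) {
        return pending.computeIfAbsent(correlationId, id -> new CompletableFuture<>());
    }

    public static void main(String[] args) {
        NotifyingMap grid = new NotifyingMap();
        Correlator correlator = new Correlator(grid);

        CompletableFuture<String> reply = correlator.awaitReply("42");
        grid.put("7", "someone else's reply");  // not ours, nothing happens
        grid.put("42", "approved");             // the queue consumer writes our reply
        System.out.println(reply.join());
    }
}
```

In a clustered setup the write into the map could happen on any node; the listener on the node holding the suspended request is what turns that write into a resume, so no node has to poll or hold a reply queue per caller.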

All done.

[Diagram: JVM, Actor, Business Logic (three instances), Memory Grid, Queueing]

Summary

• We had a monolithic app
• And we ended up having
  • Replaced persistence with a more non-blocking solution
  • Replaced all of the synchronous web side
  • Introduced queueing
  • Introduced a memory grid
  • Added asynchronous behavior to something BPEL’ey so that we can run the process across multiple callers or multiple systems in parallel

Thank you!