Enterprise (Data) Architecture
Christian Nentwich
Model Two Zero
Caveat Emptor
a) This is an introductory overview; a detailed treatment would take at least a whole term!
b) It’s fiendishly complex: nobody has a satisfactory answer
About Model Two Zero
• Software company based near “Silicon Roundabout” (Old Street, London)
• Mission: help large businesses establish the next generation of enterprise data architectures
• Innovators in:
– Executable specifications
– Code generation from (restricted) natural language
– High-volume document matching
About Me
• BSc and PhD in Computer Science from University College London
• Never held down a “proper job”...
• Founded two startups – Model Two Zero is number 2!
• Long involvement in financial services IT
– Strategic data architecture consultancy for many big banks, fund managers, clearing houses
– Pushing forward industry-wide data standards
Motivation
• Large companies build or buy new systems all the time
• Nobody likes monolithic systems, but components need to be assembled
[Diagram: Customer Records, Invoicing System and Order Management System, linked by “Batch load or real-time query”, “Batch load” and “Real-time message” connections]
Motivation
• Companies also buy other companies
• Consolidation requirements
[Diagram: Company A Accounting System and Company B Accounting System consolidated into a Joint Accounting System]
No planning Chaos
In Real Life
Format Diversity
• Relational databases
• APIs
– Java
– .Net
– Mainframe
• File formats
– Comma-separated
– Fixed width
– XML
– COBOL mainframe data files
– Tag/Value files
– Proprietary binary
• Message formats
– XML
– JSON
– Proprietary queueing systems
• Taking a consistent approach is a challenge
Complexities
• Domain complexity
• Architecture diversity
• Volume of data
• Complex, often out of date business processes
• Key person dependencies
• Lost business knowledge
• Time pressure
ESB – Key Idea
• From this: [diagram not preserved]
• To this: [diagram: systems connected to a Common Bus]
Zoomed in View
[Diagram: a Proprietary System exchanges its Proprietary Data Format with an Adapter, which exchanges the Standardised Data Format with the Common Bus]
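The adapter in the zoomed-in view can be sketched roughly as below. This is only an illustration under stated assumptions: the BusAdapter interface, the FixedWidthCustomerAdapter class, the 10-character fixed-width layout and the use of a simple map as the "standardised format" are all hypothetical and do not come from the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical adapter contract: translate between one system's proprietary
// record and the standard format used on the bus (here just a map).
interface BusAdapter {
    Map<String, String> toStandard(String proprietaryRecord);
    String fromStandard(Map<String, String> standardRecord);
}

class FixedWidthCustomerAdapter implements BusAdapter {
    // Assumed layout: 10 characters of first name, then 10 of last name.
    private static final int WIDTH = 10;

    @Override
    public Map<String, String> toStandard(String record) {
        Map<String, String> standard = new LinkedHashMap<>();
        standard.put("firstName", record.substring(0, WIDTH).trim());
        standard.put("lastName", record.substring(WIDTH, 2 * WIDTH).trim());
        return standard;
    }

    @Override
    public String fromStandard(Map<String, String> standard) {
        // Left-justify and pad each field back to the assumed width.
        return String.format("%-10s%-10s",
                standard.get("firstName"), standard.get("lastName"));
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        BusAdapter adapter = new FixedWidthCustomerAdapter();
        Map<String, String> onBus = adapter.toStandard("Christian Nentwich  ");
        System.out.println(onBus);
        System.out.println("[" + adapter.fromStandard(onBus) + "]");
    }
}
```

Only the adapter knows the proprietary layout; everything else on the bus sees the standard field names.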
Calculation
• Bilateral connections require O(n²) integration projects for n systems
• ESB with standard formats and adapters requires 2n = O(n) integration projects
• This calculation does not always work out in practice!
• Why?
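The arithmetic behind the two counts can be made concrete with a small sketch. It assumes, for illustration only, that every ordered pair of systems needs its own mapping in the bilateral case (n(n-1) projects) and that each system needs one adapter into and one out of the standard format in the bus case (2n projects). The class and method names are hypothetical.

```java
class IntegrationCount {
    // Assumption: one project per ordered pair of systems, i.e. n * (n - 1).
    static long bilateral(int n) {
        return (long) n * (n - 1);
    }

    // Assumption: one inbound and one outbound adapter per system, i.e. 2n.
    static long withBus(int n) {
        return 2L * n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 10, 50}) {
            System.out.println(n + " systems: bilateral=" + bilateral(n)
                    + ", bus=" + withBus(n));
        }
    }
}
```

Under this way of counting, the two approaches break even at three systems, and the bus only pulls ahead as the number of systems grows.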
Key Disciplines
• Data Modelling
– The data is more important than any execution technology!
– Need to define a “standard language” for interchanging data on the bus
– Very frequently XML and XML Schema
– Sometimes JSON, but schema definition is problematic
• Governance
– Curation of the standard
– Versioning (one of the hardest problems in data architecture)
– Regression testing and rollout management
Modelling - UML
Modelling – XML Schema
Quick Comparison
• UML
– Fits lots of information on one page
– Understood by relatively technically unsophisticated users
– A huge standard, contains more than is necessary for the task
– Has no “wire format”
• XML Schema
– Technically complex
– Comes with a wire format: XML
– Liked by developers
– Fairly big standard, contains more than is necessary for the task
Governance
• Central team
• Distributed teams (open source style)
[Diagrams: Standard Model with Changes; Standard Model with Change requests]
Versioning
• Versioning of standard data formats needs to deal with two important scenarios:
– Backward compatibility: a new version of a connecting system is able to read/write an old version of the standard
– Forward compatibility: an old version of a connecting system is able to read/write a newer version of the standard
• Careful not to confuse them – most people do
Versioning Example
<customerAccount>
  <firstName>Christian</firstName>
  <lastName>Nentwich</lastName>
  <address1>Somewhere</address1>
  <address2>in</address2>
  <address3>London</address3>
  <postcode>ABC CDE</postcode>
</customerAccount>
[Diagram: System A produces the document, System B consumes it, V1.xsd validates it]
Versioning Example
<customerAccount>
  <firstName>Christian</firstName>
  <lastName>Nentwich</lastName>
  <dateOfBirth>1977-01-01</dateOfBirth>
  <address1>Somewhere</address1>
  <address2>in</address2>
  <address3>London</address3>
  <postcode>ABC CDE</postcode>
</customerAccount>
[Diagram: System A produces the document, System B consumes it, V2.xsd validates it]
Changes
• Adding optional elements
– Leaves senders unaffected
– Breaks receiver forward compatibility if receivers cannot deal with unknown elements
• Adding mandatory elements
– Breaks senders
– Breaks receivers
• Removing mandatory elements
– Breaks senders
– Breaks receivers
• Removing optional elements
– Breaks senders
– Leaves receivers unaffected
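The first case, adding an optional element, can be illustrated with a receiver that simply ignores elements it does not know. This is a minimal sketch, not a technique taken from the slides: the TolerantCustomerReader class and its methods are hypothetical, and it uses the JDK's DOM parser without any schema validation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class TolerantCustomerReader {
    // Parses the XML text; any parser failure is rethrown unchecked.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Text of the first element with this name, or null if it is absent.
    static String field(Document doc, String name) {
        NodeList nodes = doc.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    // A receiver written against V1: it looks up only the elements it knows,
    // so an unknown element such as dateOfBirth is silently skipped.
    static String fullName(String xml) {
        Document doc = parse(xml);
        return field(doc, "firstName") + " " + field(doc, "lastName");
    }

    public static void main(String[] args) {
        String v2 = "<customerAccount><firstName>Christian</firstName>"
                + "<lastName>Nentwich</lastName>"
                + "<dateOfBirth>1977-01-01</dateOfBirth></customerAccount>";
        System.out.println(fullName(v2));
    }
}
```

A receiver that instead validates every incoming document strictly against V1.xsd would reject the V2 message, which is the forward-compatibility break described above.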
Service-Oriented Architecture
• Establish a landscape of coarse-grained business services
• Establish service wrappers around applications
• Build “value add” applications by composing / orchestrating services
[Diagram: Flight booking orchestration using the Reservation Service and the Customer Management Service, exposed through a REST Wrapper and a WSDL Wrapper]
Service-Oriented Architecture
[Diagram: Flight booking Orchestration (BPEL) coordinating the Reservation Service and the Customer Management Service]
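The idea of composing coarse-grained services into a "value add" application can be sketched in plain Java. The slides name BPEL as the orchestration technology; the code below is only an analogy. The two service interfaces, their method signatures and the booking logic are hypothetical.

```java
// Hypothetical coarse-grained service contracts; in practice these would be
// REST or WSDL wrappers around existing applications.
interface CustomerManagementService {
    String lookupCustomerId(String email);
}

interface ReservationService {
    String reserveSeat(String customerId, String flightNumber);
}

// The "value add" application: it owns no data and only composes services.
class FlightBookingOrchestration {
    private final CustomerManagementService customers;
    private final ReservationService reservations;

    FlightBookingOrchestration(CustomerManagementService customers,
                               ReservationService reservations) {
        this.customers = customers;
        this.reservations = reservations;
    }

    // Step 1: resolve the customer. Step 2: reserve a seat for that customer.
    String book(String email, String flightNumber) {
        String customerId = customers.lookupCustomerId(email);
        return reservations.reserveSeat(customerId, flightNumber);
    }

    public static void main(String[] args) {
        // Stub services stand in for the real wrapped applications.
        FlightBookingOrchestration booking = new FlightBookingOrchestration(
                email -> "C-1",
                (customerId, flight) -> customerId + "@" + flight);
        System.out.println(booking.book("someone@example.com", "XX123"));
    }
}
```

Because the orchestration depends only on the two interfaces, either service can be replaced by a different wrapper without touching the booking logic.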
Event-Driven Architecture
• Build an architecture built on business events rather than data
• Components of the architecture subscribe for updates and listen for events they are interested in (sources and sinks model)
From this:
<trade>
  <id>abc</id>
  <amount>500000.00</amount>
  <ccy>GBP</ccy>
  <ticker>IBM</ticker>
</trade>
To this:
<tradeEntered>
  <id>abc</id>
  <amount>500000.00</amount>
  <ccy>GBP</ccy>
  <ticker>IBM</ticker>
</tradeEntered>
Event Modelling
<tradeEntered>
  <id>abc</id>
  <amount>500000.00</amount>
  <ccy>GBP</ccy>
  <ticker>IBM</ticker>
</tradeEntered>
<cancellation>
  <id>abc</id>
</cancellation>
<amendment>
  <id>abc</id>
  <amount>400000.00</amount>
</amendment>
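One way to read the three events is as small deltas applied to a receiver's own state. The sketch below assumes a receiver that tracks only the current amount per trade id; the TradeBook class and its behaviour (for example, ignoring amendments for unknown trades) are hypothetical choices made for illustration.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class TradeBook {
    // Current amount for each live trade, keyed by trade id.
    private final Map<String, BigDecimal> amounts = new HashMap<>();

    // tradeEntered carries the full trade; here only id and amount are kept.
    void tradeEntered(String id, BigDecimal amount) {
        amounts.put(id, amount);
    }

    // amendment carries only the id and the changed field.
    void amendment(String id, BigDecimal newAmount) {
        if (amounts.containsKey(id)) {
            amounts.put(id, newAmount);
        }
    }

    // cancellation carries only the id.
    void cancellation(String id) {
        amounts.remove(id);
    }

    // Returns null when the trade is unknown or has been cancelled.
    BigDecimal amountOf(String id) {
        return amounts.get(id);
    }

    public static void main(String[] args) {
        TradeBook book = new TradeBook();
        book.tradeEntered("abc", new BigDecimal("500000.00"));
        book.amendment("abc", new BigDecimal("400000.00"));
        System.out.println(book.amountOf("abc"));
        book.cancellation("abc");
        System.out.println(book.amountOf("abc"));
    }
}
```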
Event-Driven Architecture
• Discipline: send only data required for an event
• Model things that actually happen in the business
• Let participants determine how to react to events
– No central planning
– Scalability benefits
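The sources-and-sinks model, where participants decide for themselves how to react, can be sketched with a tiny in-memory publish/subscribe bus. This is an illustration only: the EventBus class is hypothetical, payloads are plain strings, and a real deployment would use a messaging product rather than an in-process map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class EventBus {
    // Sinks registered per event type, e.g. "tradeEntered".
    private final Map<String, List<Consumer<String>>> sinks = new HashMap<>();

    // A sink declares interest in one event type.
    void subscribe(String eventType, Consumer<String> sink) {
        sinks.computeIfAbsent(eventType, k -> new ArrayList<>()).add(sink);
    }

    // A source publishes without knowing who, if anyone, is listening.
    void publish(String eventType, String payload) {
        List<Consumer<String>> interested =
                sinks.getOrDefault(eventType, Collections.emptyList());
        for (Consumer<String> sink : interested) {
            sink.accept(payload);
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        bus.subscribe("tradeEntered", p -> System.out.println("risk saw " + p));
        bus.subscribe("tradeEntered", p -> System.out.println("settlement saw " + p));
        bus.publish("tradeEntered", "abc");
        bus.publish("cancellation", "abc"); // no subscribers, nothing happens
    }
}
```

New sinks can be added without changing the source, which is where the "no central planning" property comes from.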
Semantic Integration
• Instead of “syntactic” manipulation of data, specify the full meaning of data in each system
• Relate data items to an overarching “ontology”
• Infer integration automatically
• Immature discipline – many issues to solve
[Diagram: system fields Field9234B, TrdDt and StartDate all relate to the ontology concept Trade Date]
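A very reduced picture of "infer integration automatically" is sketched below. It assumes each system's field has been related to a single named concept and infers a field-to-field mapping when two fields share a concept. The OntologyMapper class and the system names A, B and C are hypothetical, and real ontology languages such as RDF or OWL express far richer relationships than this lookup.

```java
import java.util.HashMap;
import java.util.Map;

class OntologyMapper {
    // Key: "system.field"; value: the ontology concept the field relates to.
    private final Map<String, String> conceptOf = new HashMap<>();

    void relate(String system, String field, String concept) {
        conceptOf.put(system + "." + field, concept);
    }

    // Finds a field in targetSystem that carries the same concept as
    // sourceSystem.sourceField; returns null if no such field is known.
    String inferTargetField(String sourceSystem, String sourceField,
                            String targetSystem) {
        String concept = conceptOf.get(sourceSystem + "." + sourceField);
        if (concept == null) {
            return null;
        }
        String prefix = targetSystem + ".";
        for (Map.Entry<String, String> entry : conceptOf.entrySet()) {
            if (entry.getKey().startsWith(prefix)
                    && entry.getValue().equals(concept)) {
                return entry.getKey().substring(prefix.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        OntologyMapper mapper = new OntologyMapper();
        mapper.relate("A", "Field9234B", "Trade Date");
        mapper.relate("B", "TrdDt", "Trade Date");
        mapper.relate("C", "StartDate", "Trade Date");
        System.out.println(mapper.inferTargetField("A", "Field9234B", "B"));
    }
}
```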
Epilogue: What we do
Instead of writing informal specifications and then code, create executable specifications.
What we do – Executable Specifications
Java
public void validate(Transaction transaction) {
    // Only check transactions that carry both a debit and a credit
    if (transaction.getDebit() != null && transaction.getCredit() != null) {
        // Note: == assumes getValue() returns a primitive value
        if (transaction.getDebit().getValue() == transaction.getCredit().getValue()
                && transaction.getChangeInAmount() != null) {
            exceptions.add(new ValidationError(
                "Must not specify changeInAmount if a debit is equal to a credit"));
        }
    }
}

OCL
context Transaction
inv: (self.debit = self.credit) implies (self.changeInAmount->isEmpty())

Schematron
<rule context="ns:transaction">
  <assert test="ns:debit != ns:credit or not(ns:changeInAmount)">
    Must not specify changeInAmount if a debit is equal to a credit
  </assert>
</rule>

NRL
Context: Transaction
Validation Rule "Our Sample Rule"
If a debit is present and a credit is present then no changeInAmount is present
Report 'Must not specify changeInAmount if a debit is equal to a credit'
Natural Rule Language
• An open language for expressing:
– Validation rules (constraints)
– Action rules (e.g. enrichment rules)
– Transformation / mapping
• Aimed at the core problem areas in integration
• Goals
– Read like an English sentence wherever possible
– Require no customisation to get going
– Offer symbolic and textual alternatives
• Specification: http://nrl.sourceforge.net
NRL Parser
• The NRL parser is free and open source, also on SourceForge
• Designed for processing / code generation
1. Text in NRL concrete syntax
2. Parser-generated AST
3. Code generators (from AST)
[Diagram: example AST with a ConstraintRuleDeclaration containing an IfThenStatement, which contains two ExistsStatements]
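Step 3, generating code from an AST, can be illustrated with a toy tree and generator. None of this is the NRL parser's API: the RuleNode, ExistsNode, AndNode and IfThenNode classes are hypothetical stand-ins that only loosely echo the node names in the diagram, and the generated text assumes a Java variable `t` with bean-style getters.

```java
// Hypothetical AST node: each node knows how to render itself as a Java
// boolean expression over a variable named "t".
interface RuleNode {
    String toJava();
}

// "a <attribute> is present" or, when negated, "no <attribute> is present".
class ExistsNode implements RuleNode {
    private final String attribute;
    private final boolean negated;

    ExistsNode(String attribute, boolean negated) {
        this.attribute = attribute;
        this.negated = negated;
    }

    @Override
    public String toJava() {
        String getter = "get" + Character.toUpperCase(attribute.charAt(0))
                + attribute.substring(1) + "()";
        return "t." + getter + (negated ? " == null" : " != null");
    }
}

class AndNode implements RuleNode {
    private final RuleNode left;
    private final RuleNode right;

    AndNode(RuleNode left, RuleNode right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public String toJava() {
        return left.toJava() + " && " + right.toJava();
    }
}

// "if A then B" holds when A is false or B is true.
class IfThenNode implements RuleNode {
    private final RuleNode condition;
    private final RuleNode consequence;

    IfThenNode(RuleNode condition, RuleNode consequence) {
        this.condition = condition;
        this.consequence = consequence;
    }

    @Override
    public String toJava() {
        return "!(" + condition.toJava() + ") || (" + consequence.toJava() + ")";
    }
}

class RuleCodeGeneratorDemo {
    public static void main(String[] args) {
        RuleNode rule = new IfThenNode(
                new AndNode(new ExistsNode("debit", false),
                            new ExistsNode("credit", false)),
                new ExistsNode("changeInAmount", true));
        System.out.println(rule.toJava());
    }
}
```

The same tree could be handed to a second generator that emits, say, Schematron or OCL, which is the point of separating parsing from code generation.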
Possible Projects
• Propose solutions to the model versioning / XML schema versioning problem
– Classify permissible changes in detail
– Specify a formal logic for proving the impact of changes, and establishing policies
– Review W3C approaches to the problem and contrast
Technologies to Review
• Data standards / formats
– XML / XML Schema
– JSON
• Semantic standards
– RDF
– OWL
– SBVR
– NRL
• Open source frameworks
– Spring Integration
– Mule ESB
• Messaging
– AMQP
– SOAP / XML-RPC
– JMS
• Useful
– REST
– XPath
– XQuery
– Schematron
– JAXB
– JiBX
– SAX / DOM
– Python format modules
www.modeltwozero.com