Model driven engineering for big data management systems
www.modeliosoft.com
Marcos ALMEIDA
Sarah DAHAB
Andrey SADOVYKH
Outline
• Introduction
• Model-driven development
• Big Data
• Juniper
• Sample application
• Conclusions
SOFTEAM – a French IT services / Software vendor
• SOFTEAM, a growing company
  o 25 years’ experience
  o 900 experts
  o Regular growth
• Specialist in OO technologies, new architectures, methodologies
• Banking, Defense, Telecom, …
[Figure: growth chart: 17.5 M€ (2005), 20 M€ (2006), 23 M€ (2008), 70 M€ (2013); locations: Paris, Rennes, Nantes, Sophia]
Modelio for Software and System Engineering
• UML editor with 20 years’ history
  o CloudML
  o SysML
  o MARTE
  o Code generation
  o Documentation
  o Teamwork
• Available under open source at Modelio.org
MODEL-DRIVEN DEVELOPMENT
It is all about models … Starting with UML
• Requirements: UML Use Cases
• Architecture: UML Components and Classes
• Design: Refined Classes or Domain Specific Language
• Implementation: Code generation (Java, C++, Frameworks)
Model = Code
Typical example: Control system for a frigate
• 800+ components
• Developed by 100+ engineers
• 1M+ LOC
• MDD fosters productivity and quality with
  o Code generation
  o Component reuse
  o Tracing
  o Automation
Curious DSL example: Ruby on Rails
HAML
  Haml                     HTML
  %br{:clear => 'left'}    <br clear="left"/>
  %p.foo Hello             <p class="foo">Hello</p>
  %p#foo Hello             <p id="foo">Hello</p>
  .foo                     <div class="foo">...</div>
  #foo.bar                 <div id="foo" class="bar">...</div>
Cucumber and Capybara
  Feature: User can manually add movie
    Scenario: Add a movie
      Given I am on the RottenPotatoes home page
      When I follow "Add new movie"
      Then I should be on the Create New Movie page
      When I fill in "Title" with "Men In Black"
      And I should see "Men In Black"
What do we get from MDD?
Pros
• Design once, deploy everywhere!
• Write your transformation once, transform anything!
Cons
• Transformations are hard to write…
• How to make sure they are CORRECT? i.e.
  – Is there any data/semantic loss?
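One way to probe the data/semantic loss question is a round-trip check: apply the transformation, reverse it, and compare the result with the original model. The following is a minimal Python sketch under stated assumptions: the toy model format, the type mappings and the function names (`to_relational`, `from_relational`, `lost_information`) are hypothetical illustrations, not part of Modelio or JUNIPER.

```python
# Toy round-trip check for a model transformation (illustrative only).
# Model format (assumed): {class_name: {attribute_name: abstract_type}}

TYPE_MAP = {"string": "VARCHAR", "integer": "INTEGER", "text": "VARCHAR"}
REVERSE_MAP = {"VARCHAR": "string", "INTEGER": "integer"}


def to_relational(model):
    """Forward transformation: abstract types -> SQL column types."""
    return {cls: {attr: TYPE_MAP[t] for attr, t in attrs.items()}
            for cls, attrs in model.items()}


def from_relational(schema):
    """Reverse transformation: SQL column types -> abstract types."""
    return {tbl: {col: REVERSE_MAP[t] for col, t in cols.items()}
            for tbl, cols in schema.items()}


def lost_information(model):
    """Return (class, attribute) pairs that do not survive the round trip."""
    restored = from_relational(to_relational(model))
    return [(cls, attr)
            for cls, attrs in model.items()
            for attr, t in attrs.items()
            if restored[cls][attr] != t]


movie = {"Movie": {"title": "string", "year": "integer", "synopsis": "text"}}
print(lost_information(movie))  # [('Movie', 'synopsis')]
```

In this toy mapping, `text` and `string` both become `VARCHAR`, so the distinction is lost on the way back. A round-trip check catches this kind of loss, but it does not prove that a transformation is semantically correct in general.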
BIG DATA
Volume, variety, velocity
1. E-mails sent every second: 2.9 million
2. Video uploaded to YouTube every minute: 25 hours
3. Data processed by Google every day: 24 petabytes
4. Tweets per day: 50 million
5. Products ordered on Amazon per second: 73 items
Only 0.5% of data is analyzed
• In 2012, 2,837 EB were generated; just 0.5% was actually analyzed.
That still amounts to about 14 EB (14.185 million terabytes)
Source: IDC & EMC
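The arithmetic behind these figures can be checked with a few lines of Python. The sketch assumes decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes, so 1 EB = 10^6 TB); the variable names are illustrative.

```python
# Check the slide's figures: 0.5% of 2,837 EB, expressed in EB and in TB.
# Assumes decimal units, so 1 EB = 10**6 TB.
from fractions import Fraction

generated_eb = 2837
analyzed_share = Fraction(5, 1000)  # 0.5%
tb_per_eb = 10**6

analyzed_eb = generated_eb * analyzed_share
analyzed_tb = analyzed_eb * tb_per_eb

print(float(analyzed_eb))  # 14.185
print(int(analyzed_tb))    # 14185000
```

Exact fractions are used so that the result is not affected by floating-point rounding.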
The main problem is Heterogeneity!
• Many different database management systems
  o Ex:
    • MySQL (www.mysql.com/)
    • Big Table (http://research.google.com/archive/bigtable.html)
    • SimpleDB (http://aws.amazon.com/simpledb/)
    • Memcached (http://memcached.org/)
    • …
• Many underlying data representation paradigms
  o Ex:
    • Relational Databases
    • Key-value Stores
    • Object-oriented Databases
    • Big Tables
    • …
The basis of our solution is MDE… Why?
• Separating the problem from the solution
  o In JUNIPER we model the solution
• Fostering automation
  o Analysis
  o Code generation
[Figure: abstract models (business objects) are mapped by transformations to specific models / code for HDFS, MySQL and MongoDB]
Understanding the problem… Why is it so HARD? (1/2)
• Target technologies based on different paradigms
• Example:
[Figure: class A associated with class B]
JPA
  @Entity
  public class A {
    @Basic
    public B getB() { … }
    …
  }
SQL
  create table A (…)
  create table B (…)
  create table A_B (…)
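The gap between the two paradigms can be made concrete with a small generator: a single association in the abstract model becomes three tables in the relational target. This is a minimal Python sketch, not the JUNIPER generator. The function names are hypothetical, and the column definitions (an `id` primary key and two foreign keys) are assumptions, since the slide elides the column lists.

```python
# Toy generator: one association in the abstract model becomes three tables
# in the relational target. Column definitions are assumed for illustration.

def create_table(name, columns):
    """Render a CREATE TABLE statement from (column, definition) pairs."""
    body = ", ".join(f"{col} {definition}" for col, definition in columns)
    return f"create table {name} ({body})"


def relational_ddl(entities, associations):
    """Entities map to tables; each association maps to a join table."""
    statements = [create_table(e, [("id", "integer primary key")])
                  for e in entities]
    for source, target in associations:
        statements.append(create_table(
            f"{source}_{target}",
            [(f"{source.lower()}_id", f"integer references {source}(id)"),
             (f"{target.lower()}_id", f"integer references {target}(id)")]))
    return statements


ddl = relational_ddl(["A", "B"], [("A", "B")])
for statement in ddl:
    print(statement)
```

In the JPA version the association is a single getter on `A`; in the relational version it is a separate table, which is why a direct one-to-one mapping between the two targets does not exist.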
Understanding the problem… Why is it so HARD? (2/2)
• Target structure is variable
• Example:
  o ER: here A and B are independent entities
  o NoSQL: here, for performance reasons, B is embedded in A
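The two target structures can be sketched in Python with plain dictionaries standing in for rows and documents. The record contents and the function names (`as_independent`, `as_embedded`) are hypothetical; the sketch only illustrates the structural difference, not how any particular database stores data.

```python
# Two target structures for the same A -> B association (illustrative).
import copy

a = {"id": 1, "name": "a1"}
b = {"id": 7, "label": "b1"}


def as_independent(a, b):
    """ER style: A and B stay separate; A only stores B's identifier."""
    return {"A": [dict(a, b_id=b["id"])], "B": [dict(b)]}


def as_embedded(a, b):
    """Document style: B is copied inside A, so one read returns both."""
    return {"A": [dict(a, b=copy.deepcopy(b))]}


print(as_independent(a, b))
print(as_embedded(a, b))
```

The embedded form avoids a second lookup when reading A, at the cost of duplicating B wherever it is embedded. A transformation therefore needs extra information to decide which structure to generate.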
Illustration: comparative features of MongoDB and PostgreSQL
Our solution: a component based approach to NoSQL heterogeneity
• Generic model transformation chain
  o Integrated with other Juniper tools
    • Audit rules
    • Model to model transformations
    • Code generators
• Database specific instantiations
  o Application architecture modelling
  o Data modelling
  o Hardware architecture (deployment) modelling
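The three kinds of components in the chain (audit rules, model-to-model transformations, code generators) can be sketched as three functions run in sequence. This Python sketch is hypothetical throughout: the model format, the audit rule and the generated output lines are invented for illustration and are neither Modelio APIs nor real database scripts.

```python
# Toy transformation chain: audit -> model-to-model -> code generation.

def audit(model):
    """Audit rule (assumed): every entity must declare an identifier."""
    return [f"{name}: no identifier attribute"
            for name, entity in model.items() if not entity.get("id")]


def to_document_model(model):
    """Model-to-model step: entities become collections keyed by their id."""
    return {name.lower() + "s": {"key": entity["id"],
                                 "fields": sorted(entity["attributes"])}
            for name, entity in model.items()}


def generate(doc_model):
    """Code generation step: emit one pseudo-script line per collection."""
    return [f"collection {name} key={spec['key']} fields={','.join(spec['fields'])}"
            for name, spec in sorted(doc_model.items())]


def run_chain(model):
    """Run the chain; stop before any transformation if the audit fails."""
    problems = audit(model)
    if problems:
        raise ValueError("; ".join(problems))
    return generate(to_document_model(model))


model = {"Movie": {"id": "title", "attributes": ["title", "year"]}}
print(run_chain(model))  # ['collection movies key=title fields=title,year']
```

Keeping the audit step first means that an invalid model is rejected before any code is generated. In a database-specific instantiation, only the last two functions would need to change.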
The Juniper FP7 EU project
Website: http://www.juniper-project.org/
Start Date: 2012-12-01
Duration: 36 months
Total cost: 4 M€
JUNIPER integrates Big Data technologies over MPI
[Figure: inputs (DOCs, Streams, DBs) feed data processing stages 1 to N, followed by analytical DBs, business intelligence and visualization. Data processing in JUNIPER: stages S1, S2 and S3 linked by MPI, running on Hadoop, HPC and FPGA-enabled nodes, and feeding analytical DBs.]
Modelling in Juniper
[Figure: models (high-level architecture: nodes, programs, streams…; real-time constraints) and Java code, linked by code generation (+ MPI initialization, communication, etc.) and reverse engineering; model export to the schedulability analysis tool and the scheduling advisor, which return measurements & advice; code generation of deployment scripts and configuration]
Mapping Programming Model, UML and MARTE
[Figure: mapping of programming model concepts (JUNIPER Program, Channel, Cloud Node) to UML and MARTE]
Modelling the application and real-time constraints
[Figure: JUNIPER Programs connected by a Big Data flow, annotated with real-time constraints (response time, bandwidth)]
Modelling the hardware infrastructure at a high level
[Figure: Cloud Node with a 4-core CPU and a hard drive]
MPI code generation
Overview of the Juniper programming model concepts
Next step: integrating data modelling into the programming model
Business data modelling in Juniper
• Example
  o Uniquely identifying pieces of data
  o Partitioning data in different nodes
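The slides do not say how JUNIPER assigns data to nodes. As an illustration of how a unique identifier can drive partitioning, the Python sketch below uses hash partitioning, a common technique swapped in here as an assumption. The record layout and the function names (`node_for`, `partition`) are hypothetical.

```python
# Hash partitioning by unique identifier (illustrative sketch; the actual
# JUNIPER partitioning strategy is not described in the slides).
import hashlib


def node_for(key, node_count):
    """Map a record identifier to a node index, stable across runs."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(records, key_field, node_count):
    """Group records by target node using their identifying field."""
    nodes = {index: [] for index in range(node_count)}
    for record in records:
        nodes[node_for(record[key_field], node_count)].append(record)
    return nodes


records = [{"id": n, "payload": f"item-{n}"} for n in range(10)]
placement = partition(records, "id", 3)
for index, items in placement.items():
    print(index, [r["id"] for r in items])
```

A cryptographic hash is used instead of Python's built-in `hash` so that the same identifier maps to the same node in every process. This only works if the identifying field is unique and never changes.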
Business data modelling in Juniper
• Concepts
Approach taken for dealing with heterogeneity in JUNIPER
1. Define a generic template for Modelio modules to provide support for big data management systems
2. Instantiate the template for MongoDB and PostgreSQL
MongoDB modelling module
MongoDB Example (1/2)
MongoDB Example (2/2) + DEMO Video
Database schema configuration scripts
Deployment scripts
Configuration scripts
PostgreSQL modeller module
PostgreSQL Example
master installation script
standby installation script
configuration files
DATABASE MIGRATION SAMPLE APPLICATION
[VIDEO]
CONCLUSIONS
In short…
• Challenge:
  o Big data applications: how should we handle heterogeneous data?
• Juniper response:
  o Model driven solution for designing real-time big data systems
  o Component based solution to heterogeneity
    • General business objects + big data concepts modelling
    • Database specific concepts
      – Modelling
      – Model transformations
      – Code generation
… and Perspectives / Exploitation
• Source code and documentation available on our website
  o http://forge.modelio.org/projects/juniper
  o http://forge.modelio.org/projects/mongodb-modeler
  o http://forge.modelio.org/projects/postgresql-modeler
• Tutorial + Dissemination on our forum
Questions?
Marcos Almeida
SOFTEAM | ModelioSoft
{name.surname}@softeam.fr
SOFTEAM R&D Web Site: http://rd.softeam.com
Modelio Web Site: http://www.modelio.org
http://forge.modelio.org/projects/juniper
JUNIPER Web Site: http://www.juniper-project.org
*
*for your questions