JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis
-
Upload
andrey-sadovykh -
Category
Engineering
-
view
58 -
download
2
Transcript of JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis
![Page 1: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/1.jpg)
www.modeliosoft.com
Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis
1
![Page 2: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/2.jpg)
Outlines
IntroductionModel-driven development
Big Data
JuniperCase-study
resultsConclusions
www.modeliosoft.com 2
![Page 3: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/3.jpg)
20 ME
2006
17,5 ME
2005
70 ME
2013
ParisRennesNantes
Sophia
SOFTEAM – a French IT services / Software vendor
• SOFTEAM, a growing company
25 years’ experience 850 experts Regular growth
• Specialist in OO technologies, new architectures, methodologies
• Banking, Defense, Telecom, …
www.modeliosoft.com 3
23 ME
2008
![Page 4: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/4.jpg)
Modelio for Software and System Engineering
• UML editor with 20 years’ historyo CloudML
o SysML
o MARTE
o Code generation
o Documentation
o Teamwork
www.modeliosoft.com 4
• Available under open source at Modelio.org
![Page 5: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/5.jpg)
MODEL-DRIVEN DEVELOPMENT
www.modeliosoft.com 5
![Page 6: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/6.jpg)
It is all about models … Starting with UML
www.modeliosoft.com 6
Requirements
UML Use Cases
Architecture
UML Components
and Classes
Design
Refined Classes
or Domain Specific Language
Implementation
Code generation
Java, C++, Frameworks
![Page 7: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/7.jpg)
Model = Code
www.modeliosoft.com 7
![Page 8: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/8.jpg)
Typical example: Control system for a frigate
• 800+ components
• Developed by 100+ engineers
• 1M+ LOC
• MDD fosters Productivity and Quality with
o Code generation
o Components reuse
o Tracing
o Automation
www.modeliosoft.com 8
![Page 9: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/9.jpg)
Curious DSL example: Ruby on Rails
Haml HTML
%br{:clear => left’} <br clear=”left”/>
%p.foo Hello <p class=”foo”>Hello</p>
%p#foo Hello <p id=”foo”>Hello</p>
.foo <div class=”foo”>...</div>
#foo.bar <div id=”foo” class=”bar”>...</div>
www.modeliosoft.com 9
Feature: User can manually add movie
Scenario: Add a movieGiven I am on the RottenPotatoes home pageWhen I follow "Add new movie"Then I should be on the Create New Movie pageWhen I fill in "Title" with "Men In Black"And I should see "Men In Black"
Cucumber and Capybara
HAML
![Page 10: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/10.jpg)
What do we get from MDD?
Pros
• Design once, deploy everywhere!
• Write your transformation once, transform anything!
Cons
• Transformations are hard to write…
• How to make sure they are CORRECT? i.e.
– Is there any data/semantic loss?
www.modeliosoft.com 10
![Page 11: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/11.jpg)
BIG DATA
www.modeliosoft.com 11
![Page 12: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/12.jpg)
Volume, variety, velocity
1. @-mails sent every second : 2,9 million
2. Video uploaded to YouTube every minute: 25 hours
3. Data processed by Google every day: 24 petabytes
4. Tweets per day: 50 million
5. Products ordered on Amazon per second: 73 items
www.modeliosoft.com 12
![Page 13: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/13.jpg)
Only 0,5 % of data is analyzed
• In 2012, 2 837EB generated - just 0,5% actually analyzed.
That still amounts to 14EB(or 14.185 million terabytes)
Source: IDC & EMC
www.modeliosoft.com 13
![Page 14: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/14.jpg)
SQL or Hadoop
www.modeliosoft.com 14
Do you have some data and a problem to solve?
yes
Data fits in memory?
*Inspired by: Aaron Cordova
Data fits on single RAID array?
Tons of options. Don’t need database or
Hadoop
yes
no
no
yesSolvable with SQL?
Use MySQL
yes
Can you program? Write a prog.
yes
no
no
Dead end
![Page 15: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/15.jpg)
SQL or Hadoop (continued)
www.modeliosoft.com
15
* Inspired by: Aaron Cordova
Data fits on single RAID array?
no
Have lots of money?
Solvable with Oracle SQL?
no
Buy a SAN, Use Oracle
yes
Do you have a PhD in parallel prog. ?
no Roll your own MPI solution
yes
Solvable using MapReduce?
no
yes
Can you program MapReduce jobs?
Write MapReduce on
Hadoop?
yes
no
Dead end
no
![Page 16: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/16.jpg)
Challenges
Hadoop MapReduce is the major trend
Success relies on personnel programming skills
Many problems are not solvable with Hadoop. Real-time?
MPI for high performance computing is an option when you have a lots of money and a PhD
www.modeliosoft.com 16
![Page 17: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/17.jpg)
www.modeliosoft.com 17
![Page 18: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/18.jpg)
JUNIPER integrates Big Data technologies over MPI
www.modeliosoft.com 18
DOCsStreamsDBs
Data Processing
Stage 1 Stage N
BusinessIntelligence
Analytical DBs
Visualization
dbdb
DOCsDOCs
Data Processing in JUNIPER
S1
S3
S2
Analytical DBs
mpi
mpi
mpi
mpi
FPGA-enabled nodes
Hadoop
HPC
![Page 19: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/19.jpg)
Modelling in Juniper
www.modeliosoft.com 19
Models
High level
Architecture
(Nodes,Programs,
Streams…)
Real-time
constraints
Java
Code Code
Generation (+MPI initialization, communication, etc)
Reverse
Engineering
Schedulability
Analysis
Tool
(in progress)
Scheduling
Advisor
(in progress)
Measurements &
Advice
Deployment
Scripts
(in progress)
ConfigurationModel
Export
Code
Generation
![Page 20: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/20.jpg)
Mapping Programming Model, UML and MARTE
www.modeliosoft.com 20
JUNIPERProgram
Channel
Cloud Node
ProgrammingModel
UML MARTE
![Page 21: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/21.jpg)
Modelling the application and real-time constraints
www.modeliosoft.com 21
Real-time constrains- response time- bandwidth
Big Data flowJUNIPER Programs
![Page 22: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/22.jpg)
Modelling the hardware infrastructure at a high level
www.modeliosoft.com 22
Cloud Node
CPU with 4 cores Hard drive
![Page 23: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/23.jpg)
MPI code generation
www.modeliosoft.com 23
Code Generation
JUNIPER
Application Model
![Page 24: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/24.jpg)
PETAFUEL CASE STUDY
www.modeliosoft.com 24
![Page 25: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/25.jpg)
Risk: $45 million in half day
www.modeliosoft.com 25
![Page 26: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/26.jpg)
Master Card debit card approval within 4 sec
www.modeliosoft.com 26
petaFuel - transaction approval within 4 seconds
Transactions
Events Stream
Historical Data
30 GB
Basic
Checks
Fraud Patterns
Detection
Transaction
ApprovalDecision
![Page 27: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/27.jpg)
Juniper application model
www.modeliosoft.com 27
![Page 28: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/28.jpg)
Deployment model
www.modeliosoft.com 28
![Page 29: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/29.jpg)
MPI code generation
www.modeliosoft.com 29
public class EventProcessor {
public static final int RANK = 1;
public static IEventProcessor iEventProcessorImpl = new IEventProcessor() {
@Override
public void process(Event event) {
String key = getKeyFromTimestamp(event.getTimestamp());
String value = keyValueStoreIKeyValueStore.find(key);
if (value == null) {
keyValueStoreIKeyValueStore.put(key, "1");
} else {
int count = Integer.parseInt(value);
keyValueStoreIKeyValueStore.put(key, ""+(count+1);
}
}
…
};
…
public static void main(final String[] args) {
MPI.Init(args);
…
MPI.Finalize();
}
…
}
![Page 30: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/30.jpg)
CONCLUSIONS
www.modeliosoft.com 30
![Page 31: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/31.jpg)
Juniper trade-offs
www.modeliosoft.com 31
JUNIPER
Criteria Hadoop MPI
Communication HDFS - file system (httpd) HPC cluster interconnect (Infiniband)
Data flow Map Reduce Modeling + MPI comms
Parallelization Automatic Manually based on domain decomposition
Response time guaranties None Real-time for single node
Stages in multi-format No Any (incl. Hadoop + FPGA)
Hardware Commodity cluster HP cluster
Price € €€€
Skills + ++++
Customers General audience Critical systems
![Page 32: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/32.jpg)
Prospects – more work
Work in progress
UML based language
• MPI Communication
• Timing properties
• Deployment
petaFuel case study
Future work
Modelling payload
Integrating schedulability
Running final evaluations
Final release
www.modeliosoft.com 32
![Page 33: JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogeneous Big Data Analysis](https://reader034.fdocuments.us/reader034/viewer/2022042615/55a948331a28ab9b3e8b483e/html5/thumbnails/33.jpg)
Questions?
Andrey Sadovykh
Marcos Almeida
SOFTEAM | ModelioSoft
{name.surname}@softeam.fr
SOFTEAM R&D Web Site:
http://rd.softeam.com
Modelio Web Site :
http://www.modelio.org
JUNIPER Web Site :
http://www.juniper-project.org
www.modeliosoft.com 33
*
*for your questions