Slide deck: More than Reporting - Data Analysis with Oracle R Enterprise - DOAG Development and DOAG SIG...


Page 1: Slide deck More than Reporting - Data Analysis with Oracle R Enterprise - DOAG Development and DOAG SIG BigData 2014
Page 2:

Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |

More than Reporting: Data Analysis with Oracle R Enterprise

Dr. Nadine Schöne, Sales Consultant, Oracle Direct, Sales Consulting

Dr. Michael Haupt, Principal Member of Technical Staff, Oracle Labs, Virtual Machine Research Group

25 September 2014

Page 3:

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Page 4:

Agenda

1. More than Standard Reporting?

2. Advanced Data Analysis

3. R and Oracle R Enterprise (ORE)

4. Demo

5. Benefits

6. Outlook: More Performance for R

Page 5:

More than Standard Reporting?

Page 6:

Reporting


Page 7:

Advanced Data Analysis

Page 8:

Page 9:

Sensor Data Analysis I

200,000 households

3 years

1 measurement per hour

5.256 billion measurements (26,280 measurements per customer)
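The totals on this slide follow from the three input figures. A quick check, assuming 365-day years (the slide does not say how years were counted, so the leap-day handling is an assumption):

```python
# Figures taken from the slide.
HOUSEHOLDS = 200_000
YEARS = 3
MEASUREMENTS_PER_HOUR = 1

# Assumption: every year has 365 days (no leap days).
HOURS_PER_YEAR = 365 * 24

# Measurements collected for a single household over the whole period.
per_household = YEARS * HOURS_PER_YEAR * MEASUREMENTS_PER_HOUR

# Measurements across all households.
total = HOUSEHOLDS * per_household

print(per_household)   # 26280
print(total / 1e9)     # 5.256 (billion)
```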

Page 10:

Sensor Data Analysis II

10 s per model

200,000 households ➔ 200,000 models

23 days + 4 hours ➔ 4.3 hours with Oracle R Enterprise
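The larger runtime on this slide matches what one gets when the 200,000 models are built one after another at 10 s each. The sketch below redoes that arithmetic and derives the speedup implied by the 4.3-hour figure. The slide does not state how many parallel R engines were used, so the speedup is only a ratio of the two runtimes, not a measured degree of parallelism.

```python
# Figures taken from the slide.
SECONDS_PER_MODEL = 10
MODELS = 200_000
ORE_HOURS = 4.3

# Runtime if all models are built strictly one after another.
sequential_seconds = SECONDS_PER_MODEL * MODELS

# Split into whole days plus remaining hours.
days, remainder_seconds = divmod(sequential_seconds, 24 * 3600)
remainder_hours = remainder_seconds / 3600

# Ratio between the sequential runtime and the runtime reported for ORE.
implied_speedup = sequential_seconds / (ORE_HOURS * 3600)

print(days, round(remainder_hours, 2))   # 23 3.56
print(round(implied_speedup))            # 129
```

The remainder of about 3.56 hours is what the slide rounds to "+ 4 hours".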

Page 11:

R Screenshots

Page 12:

Oracle Advanced Analytics

Broad range of in-database data mining and statistical functions

• Data Understanding & Visualization
– Summary & Descriptive Statistics
– Histograms, scatter plots, box plots, bar charts
– R graphics: 3-D plots, link plots, special R graph types
– Cross tabulations
– Tests for Correlations (t-test, Pearson’s, ANOVA)
– Selected Base SAS equivalents

• Data Selection, Preparation and Transformations
– Joins, Tables, Views, Data Selection, Data Filter, SQL time windows, Multiple schemas
– Sampling techniques
– Re-coding, Missing values
– Aggregations
– Spatial data
– R to SQL transparency and push down

• Classification Models
– Logistic Regression (GLM)
– Naive Bayes
– Decision Trees
– Support Vector Machines (SVM)
– Neural Networks (NNs)

• Regression Models
– Multiple Regression (GLM)
– Support Vector Machines

• Clustering
– Hierarchical K-means
– Orthogonal Partitioning
– Expectation Maximization

• Anomaly Detection
– Special case Support Vector Machine (1-Class SVM)

• Associations / Market Basket Analysis
– Apriori algorithm

• Feature Selection and Reduction
– Attribute Importance (Minimum Description Length)
– Principal Components Analysis (PCA)
– Non-negative Matrix Factorization
– Singular Value Decomposition

• Text Mining
– Most OAA algorithms support unstructured data (e.g. customer comments, email, abstracts)

• Transactional Data
– Most OAA algorithms support transactional data (e.g. purchase transactions, repeated measures over time)

• R packages: ability to run open source
– Broad range of R CRAN packages can be run as part of database process via R to SQL transparency and/or via Embedded R mode

* included in every Oracle Database

Slide callouts:

Descriptive data analysis & visualization

Classification & regression models

Clustering

Use of open source R packages

Data preparation & transformations

Page 13:

Key Topics for Enterprise Data Analytics

1. Scalability

2. Performance

3. Development & Production

Page 14:

R and Oracle R Enterprise (ORE)

Page 15:

Aspects of Conventional R/Database Interaction

R logo © R Foundation, from http://www.r-project.org

Page 16:

“Collaborative Execution” Model

[Diagram: three numbered components]

1. User R Engine (Desktop): R Engine, other R packages, Oracle R Enterprise packages

2. Database Compute Engine: Oracle DB, user tables (arrow labels: SQL, results)

3. R Engine(s) managed by Oracle DB: R Engine, other R packages, Oracle R Enterprise packages (arrow labels: R, results)

Diagram annotations:

Post-processing of results

Analyses that are not available in Oracle DB

Execution in collaboration with Oracle DB

Page 17:

Oracle's R Technologies

• Oracle R Distribution

• ROracle

• Oracle R Enterprise

• Oracle R Advanced Analytics for Hadoop

Freely available to the R community

Page 18:

Demo


Page 19:

Benefits


Page 20:

Benefits I

5,881 R packages

Page 21:

Benefits II

Integration

Performance & Scalability

High-performance enterprise predictive analytics applications

Low total cost of ownership

Page 22:

Outlook: More Performance for R

Page 23:

FastR

• Reimplementation of R in Java

– Uses Graal (compiler) and Truffle (AST interpreter framework)

– Dynamic compilation, scaling on heterogeneous architectures

– Contributors: Oracle Labs (Germany, USA, Austria), JKU Linz, Purdue University, TU Dortmund

[Figure: three stages of a syntax tree. (1) AST interpreter with uninitialized nodes; (2) AST interpreter with rewritten nodes, after node rewriting for profiling feedback; (3) compiled code, after compilation using partial evaluation. Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.]
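The node rewriting step in the figure can be illustrated with a toy interpreter. The sketch below is not FastR or Truffle code: the class names, the `state` field and `generic_add` are hypothetical, and the specialization state is stored on the node itself rather than by replacing the node object in the tree, as a real self-optimizing interpreter would. It only shows the idea that a node starts uninitialized, specializes to the operand type it observes, and falls back to a generic version once that assumption fails.

```python
from dataclasses import dataclass

# Legend from the slide: U = Uninitialized, I = Integer, D = Double,
# S = String, G = Generic.
KIND_OF = {int: "I", float: "D", str: "S"}


@dataclass
class Const:
    """Leaf node that evaluates to a fixed value."""
    value: object

    def execute(self):
        return self.value


def generic_add(a, b):
    """Slow path for mixed operand types (toy semantics, not R semantics)."""
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


class Add:
    """Binary '+' node that specializes on the operand types it observes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.state = "U"  # no profiling feedback collected yet

    def _observe(self, a, b):
        kind_a, kind_b = KIND_OF.get(type(a)), KIND_OF.get(type(b))
        seen = kind_a if kind_a is not None and kind_a == kind_b else "G"
        if self.state == "U":
            self.state = seen   # first execution: specialize
        elif self.state != seen:
            self.state = "G"    # assumption failed: rewrite to generic

    def execute(self):
        a, b = self.left.execute(), self.right.execute()
        self._observe(a, b)
        if self.state == "G":
            return generic_add(a, b)
        return a + b            # fast path for one known operand type


node = Add(Const(1), Const(2))
print(node.state)               # U
print(node.execute(), node.state)   # 3 I
node.left = Const(1.5)          # operand type changes from int to float
print(node.execute(), node.state)   # 3.5 G
```

Once a node has reached the generic state it stays there, which keeps the number of rewrites per node bounded.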

Page 24:

“R is a powerful and interesting tool for data analysis! ORE brings R into a scalable DB engine (solving problems of data management, analysis and scalability). We actually can obtain information and added value from not so actively used data.”

– Stefano Alberto Russo, Researcher at CERN Openlab


Page 25:

Further Information

ORE discussion forum: https://community.oracle.com/community/developer/english/business_intelligence/data_warehousing/r

Oracle Advanced Analytics: http://www.oracle.com/technetwork/database/options/advanced-analytics/index.html

ORE-Blog: https://blogs.oracle.com/R/

FastR: https://bitbucket.org/allR/fastR

Graal/Truffle: https://wiki.openjdk.java.net/display/Graal/Main

Page 26:

Contact

Dr. Nadine Schöne | Sales Consultant

Email: [email protected]

Tel: +49 331 200 7190

ORACLE Deutschland B.V. & Co. KG

Schiffbauergasse 14

14467 Potsdam

Page 27:

Page 28: