Leveraging Oracle Autonomous Data Warehouse for Machine Learning
103080
Jean Yu, Data Scientist, IBM
Kai Yu, Distinguished Engineer, Dell Technologies
June 15, 2020
Jean Yu
▪ Data Scientist, IBM Multi-Cloud Management System
▪ 21+ years of software development expertise
▪ Experience areas: machine learning, multi-cloud management, storage system management, enterprise application management, IT enterprise deployment, workforce management, and banking
▪ IBM Software Group Invention Review Board voting member since 2006
▪ IBM Software Group Master Inventor with 12 patents granted and 4 invention plateaus.
Kai Yu
▪ Distinguished Engineer, Dell Technical Leadership Community, Dell EMC Database Engineering
▪ 26+ years working in the IT industry
▪ Specializing in Oracle Database, Oracle EBS, Cloud, Virtualization
▪ Author and frequent speaker at IEEE and Oracle conferences
▪ IOUG Cloud Computing SIG co-founder and VP
▪ Oracle ACE Director
▪ Co-recipient of the OAUG Innovator of Year Award
▪ Oracle Excellence Award: Technologist of the Year: Cloud Architect by Oracle Magazine
▪ My Blog: http://kyuoracleblog.wordpress.com/
Agenda
• Oracle Machine Learning in Database
• Oracle Autonomous Data Warehouse (ADW) Overview
• Running Oracle Machine Learning with ADW
• Example: Machine Learning Model Building with Oracle Machine Learning
• Q & A
Oracle Machine Learning in Database
• Why Machine Learning: Analytics value and Maturity:
• Move the Algorithm, Not the Data ➔ In Database Machine Learning
• Oracle integrates ML across the Oracle Stack
Empower data scientists and analysts, developers, and DBAs/IT with ML
– Eliminate costly data movement and latency
– Fast and scalable data exploration, data preparation, and ML algorithms
– Over 30 algorithms supporting regression, classification, time series, clustering, feature extraction, and anomaly detection
– R and Python integration supports data scientists
– Ease of ML model and R/Python script deployment
– Automate key ML process steps
– Leverage existing backup, recovery, and security mechanisms and protocols
– That’s where most enterprise data lives – bring the algorithms to the data!
Oracle Database and Oracle Autonomous Database
• Simple SQL Syntax – Classification Model:
– Model build (PL/SQL):
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'N1_CLASS_MODEL',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'N1_TRAIN_DATA',
    case_id_column_name => 'CUSTOMER_ID',
    target_column_name  => 'CREDIT_SCORE_BIN',
    settings_table_name => 'N1_BUILD_SETTINGS');
END;
– Real-time scoring (SQL query):
select prediction_probability(N1_CLASS_MODEL, 'Good Credit'
  USING 'Rich' as WEALTH, 2000 as income, 'Silver' as customer_value_segment) Prediction_Probability
from dual;
Oracle Machine Learning for SQL
• In-database, parallel, distributed algorithms
– No extracting data to separate ML engine
– Fast and scalable
– Batch and real-time scoring
– Explanatory prediction details
• ML models as first class database objects
– Access control via permissions
– Audit user actions
– Export / import models across databases
• Leverage ML across Oracle stack
Oracle Autonomous Data Warehouse (ADW) Overview
• Oracle Autonomous Database: AI-based automation for data management and operation: self-driving, self-securing, self-repairing
• Oracle Autonomous Database built on Oracle Cloud Infrastructure (OCI)
• OCI (IaaS) provides a set of cloud infrastructure products (compute, storage, database, networking, edge) on the same flexible virtual network.
• OCI supports Oracle Cloud Platform (PaaS) and Cloud Applications (SaaS)
• Oracle Autonomous Database: optimized by workload
• Autonomous Data Warehouse (ADW) provides an easy-to-use, fully autonomous data warehouse that scales elastically, delivers fast query performance and requires no database administration.
• Autonomous Data Warehouse Architecture:
• Autonomous Data Warehouse characteristics:
– End-to-end management: provisioning, patching, backup, etc.
– Fully tuned, with good performance out of the box; ready to load and go
– Scales compute and storage to fit the workload without downtime
– Supports existing apps in the cloud or on-premises; connect with SQL*Net, JDBC/ODBC
– Use SQL Developer to connect to ADW for development and data loading
– Provides built-in web-based data analysis tools such as notebooks for designing and sharing SQL-based, data-driven, interactive documents
– Business intelligence tools: Oracle Analytics Cloud, Oracle Data Visualization Desktop, third-party BI tools
Sign Up and Access Oracle Cloud Infrastructure
• Sign up for Oracle Cloud Infrastructure:
– Oracle Cloud Free Tier
– Upgrade to Paid Account
– Always Free Services
• Oracle Cloud Free Tier:
– Free access to all services and resources
– US$300 of free credit for up to 30 days
– After that, only the Always Free services are available
– Lets you upgrade to a paid account
• Always Free Cloud Services:
– Autonomous Data Warehouse
– Autonomous Transaction Processing
– Compute
– Block Volume
– Object Storage
– Archive Storage
– Load Balancing
– Monitoring
– Notification
– Outbound Data Transfer
• Always Free Autonomous Database:
– Processor: 1 Oracle CPU processor
– Memory: 8 GB RAM
– Workloads: ADW or ATP
– Database version: Oracle 18c or 19c
– Infrastructure: shared Exadata infrastructure
– Max simultaneous database sessions: 20
– Database shuts down after 7 days of no activity; it can be restarted
– Resources are reclaimed if the database stays down for 3 consecutive months
Provisioning Autonomous Data Warehouse
• Provisioning Autonomous Data Warehouse Instance:
– Click Create Autonomous Database -> specify workload type, database name, CPU core count, storage (TB), and admin password -> creating the ADW database service takes 10-15 minutes.
Running Oracle Machine Learning with ADW
• Oracle Machine Learning: a part of the ADWC collaborative environment
– A web-based development environment for creating data mining notebooks
– Used by data scientists, developers, and business users to perform data analytics, data discovery, and data visualization
– Leverages the ADWC scalability and performance in Oracle Cloud
• Key features:
– Collaborative UI for data scientists
– Packaged with Autonomous Data Warehouse Cloud
– Easy access to shared notebooks, templates, permissions, scheduler, etc.
– SQL ML algorithms API
• Machine learning: algorithms automatically examine large amounts of data to identify patterns, discover new insights, and make predictions
• Machine learning algorithms implemented as SQL functions inside Oracle DB
Oracle 20c Machine Learning New Features
• Extended in-database algorithms with Python and R
• Easily deploy models with REST API
• Monitor model performance to recommend model rebuild
• AutoML in OML4Py
• Use Oracle Machine Learning:
– Start by setting up Oracle ML users from the ADW administration page
– Log in to Oracle ML, which takes you to the OML home page
– Look at the notebook examples
– Run SQL statements in the SQL Query Scratchpad
– Oracle Machine Learning SQL notebook: the report menu bar lets you change the result to a graph and/or export the result
Example: Machine Learning Model Building with Oracle Machine Learning
• OML example: predict customer credit score
• Based on existing customer information with known credit scores in a customer table:
– Customer_id
– Age
– Income
– Tenure
– Loan_type
– Loan_amount
– Occupation
– Marital_status
– Wealth
– Credit_limit
– ….
– Credit_score_bin
• Goal: create a machine learning model that can predict the potential credit scores for new customers whose credit scores are unknown
• You can try this yourself by following this link:
(https://github.com/oracle/learning-library/blob/master/workshops/adwc4dev/L300.md)
• Data Gathering and Preparation
• ML Model Building, Evaluation and Application
[Figure: workflow. Prepare Data Set (table or view Credit_Score) ➔ Exploratory Data Analysis ➔ Attribute Selection (top N attributes for Good Credit; top N attributes for max spent) ➔ Split Data (train data, test data) ➔ Create Model (build settings + train data ➔ class model) ➔ Model Evaluation (test data ➔ apply result, lift result) ➔ Apply Model (new data + class model ➔ new predictions)]
Data Gathering and Preparation with OML
• Prepare data set: create view credit_scoring_100k_v as select * from credit_score
• Exploratory data analysis with SQL queries: by credit score; by job and income
– Finding: good credit customers are hard to find
• Exploratory data analysis: bin the variables
PL/SQL code:
BEGIN
  -- Create the table that will hold the bin boundaries
  dbms_data_mining_transform.create_bin_num(
    bin_table_name => 'bin_num_tbl');
  -- Compute equal-width bins for the numeric columns, skipping the case ID
  dbms_data_mining_transform.insert_autobin_num_eqwidth(
    bin_table_name  => 'bin_num_tbl',
    data_table_name => 'CREDIT_SCORING_100K_V',
    bin_num         => 5,
    max_bin_num     => 10,
    exclude_list    => dbms_data_mining_transform.COLUMN_LIST('CUSTOMER_ID'));
  -- Create a view that presents the data with the binned values
  dbms_data_mining_transform.xform_bin_num(
    bin_table_name  => 'bin_num_tbl',
    data_table_name => 'CREDIT_SCORING_100K_V',
    xform_view_name => 'mining_data_bin_view');
END;
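To make concrete what equal-width binning does to a numeric column, here is a minimal sketch in plain Python. It is not Oracle's implementation: the function names, the sample income values, and the choice of 5 bins are assumptions made for illustration, and the sketch ignores refinements such as max_bin_num, rounding, and sampling.

```python
def equal_width_edges(values, bin_num):
    """Return bin_num + 1 boundaries that split [min, max] into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_num
    return [lo + i * width for i in range(bin_num + 1)]


def assign_bin(value, edges):
    """Return the 1-based bin number for value; the top edge falls in the last bin."""
    bin_num = len(edges) - 1
    width = edges[1] - edges[0]
    if width == 0:  # all values identical: everything lands in bin 1
        return 1
    index = int((value - edges[0]) / width) + 1
    return min(max(index, 1), bin_num)


# Hypothetical income values, used only for illustration.
incomes = [1000, 2500, 4000, 5500, 7000, 8500, 10000]
edges = equal_width_edges(incomes, bin_num=5)
binned = [assign_bin(v, edges) for v in incomes]
print(edges)
print(binned)
```

Each income is replaced by a small integer bin number, which is the kind of coarse value a binned view exposes to the model in place of the raw number.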
• Attribute selection (aka feature engineering/feature selection):
– Create an attribute importance machine learning model for Good Credit to find the top N attributes for good credit customers
– Create an attribute importance model for MAX_CC_SPENT_AMOUNT to find the top N attributes for max spent
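As a rough illustration of what ranking attributes by importance means, the sketch below scores each attribute by information gain with respect to the target. This is a swapped-in textbook technique, not the algorithm the database uses for its attribute importance model; the helper names, the tiny data set, and the 'Other Credit' value are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by grouping rows on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical customers, used only for illustration.
rows = [
    {"WEALTH": "Rich", "MARITAL_STATUS": "Married", "CREDIT_SCORE_BIN": "Good Credit"},
    {"WEALTH": "Rich", "MARITAL_STATUS": "Single", "CREDIT_SCORE_BIN": "Good Credit"},
    {"WEALTH": "Average", "MARITAL_STATUS": "Married", "CREDIT_SCORE_BIN": "Other Credit"},
    {"WEALTH": "Average", "MARITAL_STATUS": "Single", "CREDIT_SCORE_BIN": "Other Credit"},
]
attributes = ["MARITAL_STATUS", "WEALTH"]
ranking = sorted(
    attributes,
    key=lambda name: information_gain(rows, name, "CREDIT_SCORE_BIN"),
    reverse=True,
)
print(ranking)
```

In this toy data WEALTH separates the two classes perfectly while MARITAL_STATUS carries no information, so WEALTH ranks first; a top N list is simply the first N entries of such a ranking.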
• Split data into train data (60% of the data): N1_TRAIN_DATA, and test data (40% of the data): N1_TEST_DATA
CREATE OR REPLACE VIEW N1_TRAIN_DATA AS
  SELECT * FROM CREDIT_SCORING_100K_V SAMPLE (60) SEED (1);
CREATE OR REPLACE VIEW N1_TEST_DATA AS
  SELECT * FROM CREDIT_SCORING_100K_V
  MINUS
  SELECT * FROM N1_TRAIN_DATA;
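The split logic (a seeded random sample for training, with everything else held out for testing) can be sketched in plain Python as follows. This only mirrors the idea of a repeatable sample plus its complement; the function name, the 1,000 hypothetical customer IDs, and the use of Python's random module are assumptions, and the rows chosen will not match what the database selects.

```python
import random


def split_train_test(case_ids, train_fraction=0.6, seed=1):
    """Seeded row sampling: each ID joins the train set with probability train_fraction."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    train = {cid for cid in case_ids if rng.random() < train_fraction}
    test = set(case_ids) - train  # the complement, like the MINUS query
    return train, test


# Hypothetical customer IDs, used only for illustration.
customer_ids = list(range(1, 1001))
train_ids, test_ids = split_train_test(customer_ids)
print(len(train_ids), len(test_ids))
```

Because the test set is defined as the complement of the train set, no customer can appear in both, and the train share is only approximately 60%.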
ML Model Building, Evaluation and Application with OML
• Create a predictive model to target good credit customers:
– Add settings to the n1_build_settings table
– Build a classification model from the build settings and the train data:
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'N1_CLASS_MODEL',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'N1_TRAIN_DATA',
    case_id_column_name => 'CUSTOMER_ID',
    target_column_name  => 'CREDIT_SCORE_BIN',
    settings_table_name => 'N1_BUILD_SETTINGS');
END;
• Model evaluation: test the model by generating an apply result
– Execute the DBMS_DATA_MINING.APPLY procedure (test data: N1_TEST_DATA; apply result: N1_APPLY_RESULT):
DBMS_DATA_MINING.APPLY('N1_CLASS_MODEL', 'N1_TEST_DATA', 'CUSTOMER_ID', 'N1_APPLY_RESULT');
– Create a lift result (N1_LIFT_TABLE) by executing the DBMS_DATA_MINING.COMPUTE_LIFT procedure:
DBMS_DATA_MINING.COMPUTE_LIFT('N1_APPLY_RESULT', 'N1_TEST_DATA', 'CUSTOMER_ID',
  'CREDIT_SCORE_BIN', 'N1_LIFT_TABLE', 'Good Credit', 'PREDICTION', 'PROBABILITY', 100);
• Model evaluation: review the lift result: view the model's cumulative gains chart and decide whether it is a good model
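To show what a cumulative gains chart and a lift figure summarize, here is a minimal sketch in plain Python. It does not reproduce the columns of the database's lift table; the function name, the ten hypothetical scored rows, and the use of 5 quantiles are assumptions made for illustration.

```python
def cumulative_gains_and_lift(scored, num_quantiles):
    """scored: list of (probability_of_positive, is_positive) pairs.

    Returns one (population_share, cumulative_gain, cumulative_lift)
    tuple per quantile, with rows ranked by descending probability.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    total_positives = sum(1 for _, is_positive in ranked if is_positive)
    rows = []
    for q in range(1, num_quantiles + 1):
        cutoff = max(1, round(total * q / num_quantiles))
        captured = sum(1 for _, is_positive in ranked[:cutoff] if is_positive)
        population_share = cutoff / total
        gain = captured / total_positives  # share of all positives found so far
        rows.append((population_share, gain, gain / population_share))
    return rows


# Hypothetical scored test rows: (predicted probability, actually good credit?)
scored_test_rows = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.50, False), (0.40, True), (0.30, False), (0.20, False), (0.10, False),
]
lift_rows = cumulative_gains_and_lift(scored_test_rows, num_quantiles=5)
for share, gain, lift in lift_rows:
    print(f"top {share:.0%}: gain {gain:.2f}, lift {lift:.2f}")
```

In this toy data the top 20% of customers by predicted probability contains half of all good credit customers, a lift of 2.5 over random selection; a model whose gains curve stays close to the diagonal (lift near 1) adds little.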
• Apply model to new data:
– Apply the Oracle Machine Learning model to new customers (in credit_scoring_new_cust_v) to show the customers most likely to have good credit:
%sql
select a.customer_id, a.prob_credit_score_bin, b.age, b.income, b.tenure, b.loan_type,
       b.loan_amount, b.occupation, b.education_level, b.marital_status
from (select *
      from (select customer_id, round(prob_credit_score_bin * 100, 2) prob_credit_score_bin
            from (select customer_id,
                         prediction_probability(N1_CLASS_MODEL, NULL using *) prob_credit_score_bin
                  from credit_scoring_new_cust_v))) a,
     credit_scoring_100k_v b
where a.customer_id = b.customer_id
order by a.prob_credit_score_bin desc
• Apply model to new data:
– Put the prediction results into the credit_score_new_predictions table
– Review the new predictions table:
select * from credit_score_new_predictions
order by rank() over (order by prob_good_credit desc)
• Apply an ML model to a single record in a transactional application:
select prediction_probability(N1_CLASS_MODEL, 'Good Credit'
  USING 'Rich' as WEALTH, 2000 as income, 'Silver' as customer_value_segment) Prediction_Probability
from dual;
Summary
• Oracle Machine Learning in Database
– Oracle Machine Learning for SQL
– Move the Algorithm, Not the Data
– Oracle Database with OML, Autonomous Data Warehouse
• Oracle Autonomous Data Warehouse (ADW) Overview
• Running Oracle Machine Learning with ADW
– Machine Learning Algorithms in ADW
– SQL Query Scratchpad
– Oracle Machine Learning SQL Notebook
• Example: Machine Learning Model Building with Oracle Machine Learning
– Data Gathering and Preparation
– ML Model Building, Evaluation and Application
Thank You and Q&A
Contact me at [email protected] or visit my Oracle blog at http://kyuoracleblog.wordpress.com/