Learning How to Become a Data-Driven Institution (254230479)
Transcript of Learning How to Become a Data-Driven Institution (254230479)
1/29/2015
Learning How to Become a Data Driven Organization
Mark Dobransky, Managing Partner
John Rome, Deputy CIO
• Introduction
• Ice Breaker
• Estimating Exercise
• “How Big?” Discussion
• Data Quality Exercise
• Data Enrichment Exercise
• Becoming Data Driven
• Examples from a Data Driven University
Ice Breaker
Estimating Exercise
Which Weighs More?
Estimating Exercise
What is John’s Weight? Give Range with 90% Confidence Level (2 numbers)
What is John’s Age? Give Range with 90% Confidence Level (2 numbers)
John Rome, ASU with “Sparky”
What is Mark’s Age? Give Range with 90% Confidence Level (2 numbers)
Picture of Mark
Estimating Take-Aways
• Humans Are Bad at Estimating
– Blame Evolution
• Overconfidence is At Play
• We Work in Noisy Data Environments
– The More You Know, the Better the Deductions
• Data Analysis Often Trumps Expertise
• Humans Have Emotions
• “Data, data, data, I cannot make bricks without clay.” ‐ Robert Downey in Sherlock Holmes
• Reason Why We Need to be Data Driven!
– Hopefully Outcome of Exercise Showed This
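One way to score the estimating exercise is to count how many 90% ranges actually contain the true value. The sketch below is only an illustration: the function name hit_rate and all of the numbers are made up, not results from the session. A well-calibrated estimator should land near 90%; a much lower rate is the overconfidence the take-aways describe.

```python
def hit_rate(ranges, truths):
    """Fraction of (low, high) ranges that contain the matching true value."""
    hits = sum(1 for (low, high), truth in zip(ranges, truths) if low <= truth <= high)
    return hits / len(truths)


# Made-up answers from one hypothetical participant: ten "90% confidence" ranges.
ranges = [(30, 40), (150, 170), (45, 50), (10, 20), (100, 120),
          (5, 8), (60, 65), (200, 250), (1, 3), (70, 75)]
# Made-up true values for the same ten questions.
truths = [44, 165, 52, 15, 131, 9, 63, 240, 2, 81]

rate = hit_rate(ranges, truths)
print(f"Claimed 90%, achieved {rate:.0%}")  # Claimed 90%, achieved 50%
```

Ranges that are too narrow are the usual cause of a low hit rate, so widening them until the rate approaches 90% is the practical correction.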
How We Measure Data
Unit        Bytes   Abbr  Example
Byte        1       B     1 B ‐ A number between 0 and 255
Kilobyte    10^3    KB    2 KB ‐ A typewritten page
Megabyte    10^6    MB    1 MB ‐ A small novel
Gigabyte    10^9    GB    2 GB ‐ 30 feet of shelved books
Terabyte    10^12   TB    10 TB ‐ The printed collection of the US Library of Congress
Petabyte    10^15   PB    20 PB ‐ Production of all hard‐disk drives in 1995
Exabyte     10^18   EB    5 EB ‐ All words ever spoken by human beings
Zettabyte   10^21   ZB    0.5 ZB ‐ As of 2009, the entire WWW was estimated to contain 500 EB, which is only ½ of a zettabyte
Yottabyte   10^24   YB    1 YB ‐ In 2010, it was estimated that storing a yottabyte on terabyte‐size hard drives would require one million city‐block‐size data centers
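The unit table can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the decimal (power-of-ten) definitions shown in the table; the names UNITS, to_bytes, and convert are illustrative only.

```python
# Exponent of ten for each unit, taken from the table (decimal units assumed).
UNITS = {"B": 0, "KB": 3, "MB": 6, "GB": 9, "TB": 12,
         "PB": 15, "EB": 18, "ZB": 21, "YB": 24}


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes."""
    return value * 10 ** UNITS[unit]


def convert(value, from_unit, to_unit):
    """Convert a value from one unit to another."""
    return to_bytes(value, from_unit) / 10 ** UNITS[to_unit]


# 500 EB expressed in zettabytes, as in the Zettabyte row.
print(convert(500, "EB", "ZB"))  # 0.5

# Number of 1 TB drives needed to hold one yottabyte.
print(to_bytes(1, "YB") // to_bytes(1, "TB"))  # 1000000000000
```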
How We Store Data
First Hard Disk Drive ‐ IBM 350 – 5MB
When was the IBM 350 Introduced?
A. 1947
B. 1956
C. 1960
D. 1964
E. None of the Above
First 1 GB Hard Disk Drive ‐ IBM 3380
When was the IBM 3380 Introduced?
A. 1964
B. 1970
C. 1980
D. 1984
E. None of the Above
Three Decades of Shrinkage
What’s it Cost?
What it Costs?
Year   Size    Example
1956   5MB     IBM 350. You could not buy it. You leased it for $3,200/month as part of your mainframe.
1981   2.5GB   IBM 3380. Purchased for $81,000… computer not included!
1984   5MB     First 5.25‐inch disk drives over $3,000. Not “standard” equipment.
1994   2GB     DEC DSP 5200 – Purchased for $1,168.
2004   200GB   Seagate – Purchased for $135.
2014   3TB     Seagate – Purchased for $105.
If you wanted 3TB of Storage in 1994, what would it cost?
A. $11,680
B. $350,400
C. $1,752,000
D. No one said there would be math!
E. None of the Above
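The quiz can be worked from the 1994 row of the table. The sketch below assumes decimal units (3 TB = 3,000 GB) and that the only way to reach 3 TB in 1994 was to buy enough 2 GB drives at the listed price; the variable names are illustrative. Under those assumptions the result matches option C.

```python
PRICE_1994 = 1168       # dollars for one 2 GB DEC DSP 5200, from the table
DRIVE_GB_1994 = 2       # capacity of that drive in GB
TARGET_GB = 3 * 1000    # 3 TB, assuming decimal units

drives_needed = TARGET_GB // DRIVE_GB_1994
cost_1994 = drives_needed * PRICE_1994
print(drives_needed, cost_1994)  # 1500 1752000

# Cost per gigabyte in 1994 versus 2014 (3 TB for $105).
cost_per_gb_1994 = PRICE_1994 / DRIVE_GB_1994
cost_per_gb_2014 = 105 / TARGET_GB
print(cost_per_gb_1994, cost_per_gb_2014)
```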
“So, How Big Is Data?”
*Source: IDC Digital Universe 2014 Report
Take‐Aways
• The Digital Universe (Big Data) is REALLY big!
• It is growing exponentially because of IoT
• Most institutions should concentrate on high value data which is less than 1.5% of the Digital Universe
• Start with your structured data first.
Data Quality Exercise
What’s Wrong With This Data?
Data Quality Take-Aways
• Focus on High‐Payoff Data Elements
• Interrogate Data Elements Individually and Collectively
• Standardize on National Codes (IPEDS, etc.)
• Conduct Data Audits for Conformity of Domain
• Document Transformation Rules and Test
• Go Back to the Source if Necessary
• Seeing Better Quality Data with ERP Systems
• Data Driven Assumes Data Quality
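An audit for conformity of domain can be as simple as counting the values that fall outside each column's allowed set. The following is a rough sketch, not the presenters' method: the column names, the allowed codes in DOMAINS, and the sample rows are all hypothetical, and a real audit would load the domains from the institution's code tables.

```python
from collections import Counter

# Hypothetical domains; these are not actual IPEDS or other national codes.
DOMAINS = {
    "gender": {"F", "M", "U"},
    "state": {"AZ", "CA", "NM"},
}


def audit(rows, domains):
    """Count, per column, the values that fall outside the allowed domain."""
    violations = {column: Counter() for column in domains}
    for row in rows:
        for column, allowed in domains.items():
            value = row.get(column)
            if value not in allowed:
                violations[column][value] += 1
    return violations


# Made-up sample rows with typical problems: wrong case, free text, missing value.
rows = [
    {"gender": "F", "state": "AZ"},
    {"gender": "f", "state": "Arizona"},
    {"gender": None, "state": "CA"},
    {"gender": "M", "state": "Arizona"},
]

result = audit(rows, DOMAINS)
for column, counts in result.items():
    print(column, dict(counts))
```

Reporting counts per bad value, rather than a single pass/fail flag, shows which transformation rule (for example, mapping "Arizona" to "AZ") would fix the most rows.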
Data Enrichment Exercise
Dates
Let’s assume that you have a table called DATES which contains a row for every date in our Data Warehouse/ODS for five years prior to today and five years after today. Some of the possible columns we might include are:
Field/Column            Example
Date                    01/01/2015
Full Date Description   January 1, 2015
Day of the Week         Monday, Tuesday, etc.
Survey Says
Field/Column              Calendar                   Fiscal
Day Number in the Month   1,2,3,…,29/30/31           1,2,3,…,29/30/31
Day Number in Year        1,2,3,…,365/366            1,2,3,…,365/366
Month Name                January, February, etc.    January, February, etc.
Month Number in Year      01,02,03                   01,02,03
Week in Year              1,2,3,4,…,52
Year‐Month                2015‐01                    F2015‐01
Quarter                   Q1,Q2,Q3,Q4                FQ1,FQ2,FQ3,FQ4
Year‐Quarter              2015‐Q1                    F2015‐Q1
Year                      2015                       F2015
Holiday Indicator         Holiday, Non‐Holiday
Weekday Indicator         Weekday, Weekend
Semester                  Fall‐2015
Major Event               Fall‐2015 Registration
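Most of these columns can be derived mechanically from the date itself. The sketch below builds one enriched row using only the Python standard library. It rests on two assumptions that are not in the slides: the fiscal year starts on July 1, and a fiscal year is named for the calendar year in which it ends. The function and key names are illustrative. Week in Year, Holiday Indicator, Semester, and Major Event are left out because they depend on an institution-specific calendar.

```python
import calendar
from datetime import date

FISCAL_START_MONTH = 7  # assumption: the fiscal year starts on July 1


def date_row(d):
    """Return enriched calendar and fiscal attributes for one date."""
    # Months elapsed since the fiscal year began (0-based).
    fiscal_offset = (d.month - FISCAL_START_MONTH) % 12
    # Assumption: a fiscal year is named for the calendar year in which it ends.
    if FISCAL_START_MONTH > 1 and d.month >= FISCAL_START_MONTH:
        fiscal_year = d.year + 1
    else:
        fiscal_year = d.year
    quarter = (d.month - 1) // 3 + 1
    fiscal_quarter = fiscal_offset // 3 + 1
    return {
        "date": d.strftime("%m/%d/%Y"),
        "full_date_description": f"{calendar.month_name[d.month]} {d.day}, {d.year}",
        "day_of_week": calendar.day_name[d.weekday()],
        "day_number_in_month": d.day,
        "day_number_in_year": d.timetuple().tm_yday,
        "month_name": calendar.month_name[d.month],
        "month_number_in_year": f"{d.month:02d}",
        "year_month": f"{d.year}-{d.month:02d}",
        "fiscal_year_month": f"F{fiscal_year}-{fiscal_offset + 1:02d}",
        "quarter": f"Q{quarter}",
        "fiscal_quarter": f"FQ{fiscal_quarter}",
        "year_quarter": f"{d.year}-Q{quarter}",
        "fiscal_year_quarter": f"F{fiscal_year}-Q{fiscal_quarter}",
        "year": d.year,
        "fiscal_year": f"F{fiscal_year}",
        "weekday_indicator": "Weekday" if d.weekday() < 5 else "Weekend",
    }


row = date_row(date(2015, 1, 1))
for name, value in row.items():
    print(f"{name}: {value}")
```

Running this once for every date in the ten-year window and loading the results would populate the DATES table, so the fiscal rules live in one place instead of in each report.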
Benefits of Enriched Data
• Many DW/Report users are not versed in SQL date semantics so they can’t leverage those inherent capabilities.
• Not everyone knows ‘fiscal’ information and they should not have to.
• Enriched attributes are often used as a report column heading. Be descriptive not cryptic (e.g. Y/N versus Weekday/Weekend)
• Dates and grouping logic belong in tables. This leads to consistent values across all reporting environments.
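To illustrate the last point, the sketch below keeps the weekday/weekend rule in a small DATES table and lets a report group by the descriptive label with a plain join. It uses an in-memory SQLite database; the table and column names (dates, registrations, date_key) and the registration counts are made up for illustration.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (date_key TEXT PRIMARY KEY, weekday_indicator TEXT)")
conn.execute("CREATE TABLE registrations (date_key TEXT, students INTEGER)")

# Fill DATES for one sample week; the grouping rule is written once, here.
start = date(2015, 1, 26)  # a Monday
for offset in range(7):
    d = start + timedelta(days=offset)
    label = "Weekday" if d.weekday() < 5 else "Weekend"
    conn.execute("INSERT INTO dates VALUES (?, ?)", (d.isoformat(), label))

# Made-up registration counts, for illustration only.
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?)",
    [("2015-01-26", 120), ("2015-01-29", 80), ("2015-01-31", 15), ("2015-02-01", 5)],
)

# Report users group by the descriptive label; no SQL date functions are needed.
totals = conn.execute(
    """
    SELECT d.weekday_indicator, SUM(r.students)
    FROM registrations AS r
    JOIN dates AS d ON d.date_key = r.date_key
    GROUP BY d.weekday_indicator
    ORDER BY d.weekday_indicator
    """
).fetchall()
print(totals)  # [('Weekday', 200), ('Weekend', 20)]
```

Because every report joins to the same table, a change to a rule (for example, a new holiday) is made in one row and shows up everywhere.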
Kourier ‐ Multi‐Purpose Software
Kourier Integrator: Flexible & Adaptable, Supports ETL & EAI
[Diagram with four areas labeled Data Warehouse, Best‐in‐Class Integration, eCommerce/eBusiness, and Corporate Initiatives, and the following callouts]
• ERP Adaptors
• Storefront/Portal
• Marketplace
• EDI
• Financials
• Data Sharing
• Data Archiving
• Data Migration
• [Near] Real‐Time
• CRM / SFA
• HRIS / HRMS
• REST
• Near Real‐Time
• Multi‐Source
• Operational Data
• Data Marts
Becoming Data Driven
Data‐Driven Individuals Are:
• Always Asking Questions
• Aware of Human Brain Natural Tendencies
• Fighting Analysis Paralysis
• Pursuing Intellectual Honesty
• Constantly Testing and Measuring
• Finding New Opportunities Where Data May Play a Role
Questions to Ask Yourself
• Do I Understand the Decision(s) to be Made?
• Do I Have the Data?
• How is the Data Quality?
• Do I have the Right Talent to Process, Model and Interpret the Data?
• Am I Producing Results that are Driving the Decision?
• Does the Culture Support Data‐Driven Decision Making?
Examples from Data‐Driven University
Using ASU’s Data Warehouse (Simplified)
ASU Sample Data Areas
http://dashboard.asu.edu
ASU’s Dashboards
Where Students Are Coming From
From ASU’s Admissions Dashboard
Monitoring Class Enrollment
From ASU’s Course Enrollment Management Dashboard
Finding Students Not Registered
Tracking Student Progress
From ASU’s eAdvisor Dashboard
Early Warning/Retention Tool
From ASU’s Retention Dashboard
Retention Dashboard – Student Detail
Retention Dashboard (cont.)
From ASU’s Retention Dashboard
Dashboard Links Directly to ERP, Eventually to Salesforce
Clean Integration to Multiple Systems
ASU Gains in Freshman Retention Rates
[Chart: First‐Time Full‐Time Freshman retention rate (%), 2002 through 2012. Data labels in extraction order: 76.7, 76.8, 79, 78.5, 77.2, 79.5, 81.2, 84, 83.5, 80, 83.8]
Remember Those Items?
Which Weighs More?
Questions?
Contact Information
Mark Dobransky
Managing Partner @ Kore Technologies
(858) 678‐0030
www.koretech.com
John Rome
Deputy CIO, Arizona State University
(480) 965‐0857
www.asu.edu