Data Mining with Excel 2010 and PowerPivot
Mark Tabladillo Ph.D. http://marktab.net September 18, 2010
SQL Saturday 46 -- Raleigh NC #sqlsat46
© 2010 Mark Tabladillo Ph.D.
MarkTab & Data Mining
Outline
What is Data Mining
What is PowerPivot
Demos
Data Mining as a Service
Outline
What is Data Mining
What is PowerPivot
Demos
Data Mining Definitions
• Data mining
• Machine Learning
• Data mining algorithms -- typically use estimation or optimization to achieve results (as opposed to only calculations).
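The contrast between a plain calculation and an estimate reached by optimization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data, the function name `fit_slope`, and the learning rate are hypothetical, and gradient descent stands in for whatever estimation method a given data mining algorithm actually uses.

```python
# Plain calculation: the mean has a closed-form formula, so no search is needed.
values = [2.0, 4.0, 6.0, 8.0]
mean_value = sum(values) / len(values)

# Estimation by optimization: search for the slope w that minimizes the
# squared error of y ≈ w * x. Illustrative data generated from y = 3x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def fit_slope(xs, ys, learning_rate=0.01, steps=1000):
    """Estimate the slope by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


slope = fit_slope(xs, ys)
print(mean_value, round(slope, 3))
```

The mean is obtained in one pass, while the slope is approached step by step until the error stops shrinking, which is the sense in which such algorithms estimate rather than only calculate.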
Data Mining Tasks
• Supervised
• Answer known, what is correlated?
• Unsupervised
• Answer unknown (unspecified), what are the groups?
• Forecasting
• Given a trend, what is next?
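The three task types above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the sample numbers, and the choice of Pearson correlation, a two-group split, and a straight-line trend as simple stand-ins. They are not the algorithms Analysis Services runs.

```python
from statistics import mean


def pearson(xs, ys):
    """Supervised: the answer (ys) is known; measure how an input tracks it."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def two_means(values, iterations=10):
    """Unsupervised: no answer given; split 1-D values into two groups.

    Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        group_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        group_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(group_lo), mean(group_hi)
    return sorted(group_lo), sorted(group_hi)


def forecast_next(series):
    """Forecasting: fit a straight-line trend and extend it one step."""
    n = len(series)
    ts = list(range(n))
    mt, my = mean(ts), mean(series)
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, series))
             / sum((t - mt) ** 2 for t in ts))
    intercept = my - slope * mt
    return intercept + slope * n


correlation = pearson([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
groups = two_means([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
next_value = forecast_next([10.0, 12.0, 14.0, 16.0])
print(correlation, groups, next_value)
```

The supervised function needs the known answers as input, the grouping function receives only the raw values, and the forecast uses nothing but the order of the observations.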
Data Mining Add-In for Excel
• Requires Analysis Services instance
• Version 10.00.2531.00 (April 2009)
• 32-Bit Add-In
• Microsoft .NET Framework 2.0 (32-bit)
• Office 2007 (Professional, Professional Plus, Ultimate, Enterprise)
• SQL Server Enterprise or Standard (or Developer) 2008 or higher
The Analyze Tab
Menu Option: Data Mining Algorithm
• Analyze Key Influencers: Naïve Bayes
• Detect Categories: Clustering
• Fill from Example: Logistic Regression
• Forecast: Time Series
• Highlight Exceptions: Clustering
• Scenario Analysis (Goal Seek): Logistic Regression
• Scenario Analysis (What If): Logistic Regression
• Prediction Calculator: Logistic Regression
• Shopping Basket Analysis: Association Rules
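The menu-to-algorithm pairing above can also be written as a lookup table, which makes it easy to see which tools share an algorithm. This is a sketch only: the names `TABLE_TOOL_ALGORITHMS` and `tools_by_algorithm` are hypothetical and are not part of the add-in.

```python
from collections import defaultdict

# Menu option -> algorithm, copied from the slide. The dictionary name is
# hypothetical and chosen for this illustration.
TABLE_TOOL_ALGORITHMS = {
    "Analyze Key Influencers": "Naïve Bayes",
    "Detect Categories": "Clustering",
    "Fill from Example": "Logistic Regression",
    "Forecast": "Time Series",
    "Highlight Exceptions": "Clustering",
    "Scenario Analysis (Goal Seek)": "Logistic Regression",
    "Scenario Analysis (What If)": "Logistic Regression",
    "Prediction Calculator": "Logistic Regression",
    "Shopping Basket Analysis": "Association Rules",
}


def tools_by_algorithm(mapping):
    """Invert the table: algorithm -> list of menu options that use it."""
    grouped = defaultdict(list)
    for tool, algorithm in mapping.items():
        grouped[algorithm].append(tool)
    return dict(grouped)


grouped = tools_by_algorithm(TABLE_TOOL_ALGORITHMS)
for algorithm, tools in grouped.items():
    print(f"{algorithm}: {', '.join(tools)}")
```

Grouped this way, the nine menu options reduce to five algorithms, with Logistic Regression behind four of them.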
Data Mining Tab
Many Data Mining Capacities
SQL Server 2008 R2 Analysis Services Object: Maximum sizes/numbers
• Maximum data mining models per structure: 2^31-1 = 2,147,483,647
• Maximum data mining structures per solution: 2^31-1 = 2,147,483,647
• Maximum data mining structures per Analysis Services database: 2^31-1 = 2,147,483,647
• Maximum data mining attributes (variables) per structure: 2^31-1 = 2,147,483,647
Reference: http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/
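The figure 2^31-1 is the largest value a signed 32-bit integer can hold, which a short Python check confirms. The names below are hypothetical, and reading the limit as a 32-bit counter is an inference from the number itself, not something the slide states.

```python
import struct

# 2^31 - 1, the limit quoted for each capacity on the slide.
MAX_INT32 = 2**31 - 1
formatted = f"{MAX_INT32:,}"

# The value packs as a signed 32-bit integer ("<i" = little-endian int32).
packed = struct.pack("<i", MAX_INT32)

# One more than the limit no longer fits in 32 signed bits.
try:
    struct.pack("<i", MAX_INT32 + 1)
    overflow_fits = True
except struct.error:
    overflow_fits = False

print(formatted, len(packed), overflow_fits)
```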
Outline
What is Data Mining
What is PowerPivot
Demos
PowerPivot for Excel
• Take advantage of familiar Excel tools and features
• Process massive amounts of data in seconds
• Load even the largest data sets from virtually any source
• Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)
• Make the most of multi-core processors and gigabytes of memory
PowerPivot for Excel Sources
• SQL Server
• SQL Azure
• Oracle, Teradata, Sybase, Informix, IBM DB2
• OLEDB/ODBC
• Analysis Services (SSAS)
• Reporting Services (SSRS)
• Excel, Text File
PowerPivot Reference
• http://www.powerpivot.com (Product Site)
• http://www.powerpivotpro.com (Blog Site)
Outline
What is Data Mining
What is PowerPivot
Demos
Resources
• MarkTab.NET Blog, links, video resources and information for data mining
• Blog: http://marktab.net/datamining
• Twitter: @MarkTabNet
Regroup and Conclusion
• Main Points from this Presentation
Contact Information
• Mark Tabladillo http://marktab.net
• Also on: Twitter (@marktabnet) and LinkedIn