Post on 03-May-2020
Welcome
Citizen Data Scientist is the new Data Analyst
#TC18
Mehmet Vanli
Sales Consultant
Tableau
Australia
Citizen data scientist:
A person who creates models that use
advanced diagnostic analytics or predictive and
prescriptive capabilities, but whose primary job
function is outside the field of statistics and
analytics.
“Unicorn” Data Scientists
Technology & Automation
Data-driven organizations
Agenda
• Data Exploration
• Statistical Modelling
• Teaming up for Advanced Analytics
Data Exploration
Data exploration
Data source: https://bit.ly/2kAa1UM
Data exploration
outliers
Margin plots
• enable comparisons
• add context
• visualize the same fields in different ways
Data exploration
Data exploration using calculated fields
Statistical Modelling
Forecasting
When to use:
• You have a time series
• You want to infer future values of that time series
Forecasting
Under the hood:
• Method - Exponential Smoothing
• Observations in the recent past are more predictive of the future than observations in the distant past.
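The weighting idea above can be sketched in a few lines of Python. This is only the simplest form of exponential smoothing (no trend or seasonality), not a reproduction of Tableau's forecasting models; the function names, the smoothing weight `alpha`, and the sample values are assumptions made for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    alpha is a hypothetical smoothing weight in (0, 1]; a larger alpha
    gives recent observations more influence on the level.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start the level at the first observation
    levels = [level]
    for value in series[1:]:
        # Blend the newest observation with the previous level.
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels


def forecast_next(series, alpha):
    """One-step-ahead forecast: the most recent smoothed level."""
    return simple_exponential_smoothing(series, alpha)[-1]


# Illustrative values only, not data from the talk.
sales = [10.0, 12.0, 11.0, 15.0]
print(forecast_next(sales, alpha=0.5))
```

Because each step multiplies the old level by `1 - alpha`, the influence of an observation shrinks geometrically as it recedes into the past.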
Forecasting
Under the hood:
• Tableau automatically selects the best of up to eight models, the best being the one that generates the highest quality forecast.
95% prediction interval
Trend Line
When to use:
• To infer a relationship between two quantitative variables
• To show the overall pattern of a scatterplot
Data Source: https://usa.ipums.org/usa/index.shtml
Trend Line
Under the hood:
• Method - Ordinary Least Squares
What is the probability that this relationship is random?
How much of the variation in divorce rate can be explained by salary?
Line of best fit
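A minimal sketch of an ordinary least squares fit for one predictor, assuming plain Python lists as input. It returns the slope and intercept of the line of best fit plus R-squared, which addresses the "how much variation is explained" question. The p-value behind the "is this relationship random" question needs a t-distribution and is left out here. The function name and the sample numbers are hypothetical.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return slope, intercept, r_squared


# Illustrative values only, not the IPUMS data used in the talk.
slope, intercept, r_squared = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r_squared)
```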
Clustering
• Which customers should we target for the next Pinot campaign?
• What are the characteristics of the customers based on what they bought?
Data source: https://bit.ly/2xedqQs
Clustering
When to use:
• To group a particular set of objects based on their characteristics, aggregating them according to their similarities.
Under the hood:
• Method - k-means
• K-means locates centers through an iterative procedure that minimizes distances between individual points in a cluster and the cluster center.
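The iterative procedure described above can be sketched as follows. This is a bare-bones k-means, assuming points are tuples of numbers and that the starting centers are a random sample of the points; how Tableau initializes centers or chooses the number of clusters is not shown. All names and the sample points are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, max_iterations=100, seed=0):
    """Return (centers, labels) after iterating until the centers stop moving."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random starting centers
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its points.
        new_centers = [
            mean_point(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Illustrative points only, not the wine purchase data from the talk.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(data, k=2)
print(centers, labels)
```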
Clustering
Cluster 1: Pinot Lovers!
Cluster 3: Mostly sparkling drinkers
Cluster 4: Small buyers
Advanced Analytics
Teaming up for Advanced Analytics
• In-Database Analytics
• External Services Integration
• Export Results
• Extensions for Tableau Dashboards
External Services - R Integration
R Integration
External Services - Python Integration
Python Integration
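With Python integration, Tableau sends the values behind a calculated field to the external service as lists and reads back either a single value or a list. A minimal sketch of the kind of function that could sit on the Python side is shown below, here a Pearson correlation. The function name and sample values are hypothetical, and the wiring between Tableau and the Python service is deliberately left out.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length lists of numbers.

    Mirrors the list-in, value-out shape of a call from a calculated
    field; the name and signature are illustrative assumptions.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only.
print(pearson_correlation([1, 2, 3], [2, 4, 6]))
```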
In-database Analytics – MS SQL Server
• On-premises SQL Server 2016, 2017
• Azure SQL Server 2016, 2017
• Stored Procedure or Initial SQL
In-database Analytics – MS SQL Server
SQL Server 2016
In-database Analytics – Google BigQuery
Custom Query
Extensions for Tableau Dashboards
Extensions Gallery
DataRobot Insights Extension
• Associations between variables
Tableau dashboard powered by DataRobot
• Train a predictive model and make predictions
• What-if analysis: how do changes in parameters affect the predicted outcome?
Tableau empowers data analysts
to become citizen data scientists
RELATED SESSIONS
Data science applications with TabPy/R
Oct 24 | 12:00pm – 1:00pm | New Orleans Theater B
Oct 23 | 2:15pm – 4:45pm | MCCO – L2 – 244
Get on it, stat | Statistical analysis skills in Tableau
Oct 23 | 2:15 – 3:15 pm | MCCNO – L2- R02
Ready, set, action!
Oct 24 | 12:00 – 1:00 pm | MCCNO – L2- New Orleans Theater A
Oct 25 | 10:45am – 1:15pm | MCCO – L2 – 294
Oct 24 | 1:45pm – 4:15pm | MCCO – L2 – R08
Please complete the
session survey from the
Session Details screen
in your TC18 app
Thank you!
Resources
Content Tableau workbooks - https://bit.ly/2ye9Aas
How Airbnb Democratizes Data Science With Data University - https://bit.ly/2qg35BU
Loupe Tooltips - https://bit.ly/2Mx0c6e
Creating a Correlation Value Matrix - https://tabsoft.co/2Ia3xro
Analyze Data - https://tabsoft.co/2QwZdGf
Trend Line - https://tabsoft.co/2Da7gGD
Primer: What exactly is clustering, and why would you use it? - https://tabsoft.co/2CZTRka
Finding the Pearson Correlation - https://tabsoft.co/2QwmyrC
Advanced Analytics with Tableau - https://tabsoft.co/2p9TkCm
Working with External Services in Tableau: Python, R, and MATLAB - https://tabsoft.co/2xjUiQ3
Interpreting earnings call with Natural Language Processing - https://bit.ly/2Qx4zRJ
Leveraging Google BigQuery's machine learning capabilities for analysis in Tableau - https://tabsoft.co/2OnB7fK
How to use Tableau with SQL Server on R and Python - https://tabsoft.co/2yuHLIf
Machine Learning Services (R, Python) in SQL Server 2017 - https://bit.ly/2Mrul6V