Data analytics in computer networking
![Page 1: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/1.jpg)
Data Analytics in Computer Networking
The Case for Exploratory Data Analysis
Stenio Fernandes
Carleton University / CIn-UFPE
March 2016
![Page 2: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/2.jpg)
Outline
• Data Analysis - background
• EDA basics
• Applied EDA (Examples: WiFi simulated data)
• Q&A
• References
![Page 3: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/3.jpg)
Data Analytics - Background
![Page 4: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/4.jpg)
Data Science Pipeline
Elements of Reproducible Research
• Analytic Data
• Analytic Code
• Documentation
• Distribution
Report Writing for Data Science in R, Roger D. Peng, 2016
![Page 5: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/5.jpg)
Epicycle of Analysis
1. Stating and refining the question
2. Exploring the data
3. Building formal statistical models
4. Interpreting the results
5. Communicating the results
The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016
![Page 6: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/6.jpg)
• Descriptive: summarize the measurements in a single data set without further interpretation
• Exploratory: search for discoveries, trends, correlations, or relationships between multiple variables to generate ideas or hypotheses
• Inferential: quantify whether an observed pattern will likely hold beyond the data set in hand
• Predictive: use a subset of measurements (the features) to predict another measurement (the outcome)
• Causal: find out what happens to one measurement if you make another measurement change
• Deterministic: changing one measurement always and exclusively leads to a specific, deterministic behavior in another
The Elements of Data Analytic Style, A guide for people who want to analyze data, Jeff Leek, 2015
![Page 7: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/7.jpg)
EDA basics
![Page 8: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/8.jpg)
Why use EDA - Summary
NIST
• Maximize insight into a data set
• Uncover underlying structure
• Extract important variables
• Detect outliers and anomalies
• Test underlying assumptions
• Develop parsimonious models
• Determine optimal factor settings
JH University
• Show comparisons
• Show causality, mechanism, explanation
• Show multivariate data
• Integrate multiple modes of evidence
• Describe and document the evidence
• Content is king
![Page 9: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/9.jpg)
Answer to initial questions
What is a typical value for a certain feature?
What is the uncertainty for a typical value of a feature?
What is a good distributional fit for a feature?
What is the percentile distribution?
Does modifying one variable have an effect on another variable?
Does a factor have an effect on performance metrics?
What are the most important factors?
What is the best function for relating a response variable to other variables?
What are the best settings for factors (i.e. levels)?
Can we separate signal from noise?
Can we extract any structure from multivariate data?
Does the data have outliers?
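Several of these questions (typical value, uncertainty, percentiles, outliers) can be answered with a few summary statistics before any graph is drawn. The following is a minimal sketch, not the author's method: the slides use R/ggplot2, while this example uses only Python's standard library. The sample values are made up, and the 1.5 × IQR outlier rule is one common convention assumed here.

```python
import statistics


def summarize(values):
    """Return typical value, spread, quartiles, and IQR-rule outliers."""
    data = sorted(values)
    # Quartiles using the "inclusive" method (data treated as the full population range).
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(data),
        "median": med,
        "stdev": statistics.stdev(data),
        "quartiles": (q1, med, q3),
        "outliers": [v for v in data if v < low or v > high],
    }


# Hypothetical throughput samples in Mbps (made up for illustration).
rates = [18.0, 20.0, 21.0, 22.0, 22.0, 23.0, 24.0, 25.0, 60.0]
summary = summarize(rates)
print(summary)
```

With these made-up samples the single value of 60.0 pulls the mean well above the median, which is the kind of discrepancy that suggests looking for outliers.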
![Page 10: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/10.jpg)
EDA Graphs
Understand data properties
Find patterns in data
Suggest modeling strategies
Debug analyses
![Page 11: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/11.jpg)
Applied EDA
Using R/ggplot2
(mpg dataset) -> fake wifi dataset
![Page 12: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/12.jpg)
Practical Steps
Before performing any measurements or simulation
• Identify
  • Performance Metrics
  • Performance Factors and Levels
• Caution: sometimes you have to guess the ranges for the levels
  • Use an educated guess
Don’t run tons of simulations / experiments (As previously discussed)
Plot quick and dirty graphs
• No need for titles, labels
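Listing the factors and their levels up front also shows how many runs a design would require. The sketch below, in Python rather than the R used in the slides, enumerates a full factorial design; the factor names and chosen levels are hypothetical placeholders, and a full factorial is only one possible design, assumed here for illustration.

```python
from itertools import product

# Hypothetical factors and levels (placeholders, not the author's design).
factors = {
    "distance_m": [50, 100],
    "users_max_rate_mbps": [1.6, 1.8, 7.0],
    "user_type": ["4", "f", "r"],
}


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


runs = full_factorial(factors)
print(len(runs), "runs; first:", runs[0])
```

The run count is the product of the number of levels per factor, so adding levels or factors multiplies the number of simulations. That is one reason to pick levels carefully before running anything.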
![Page 13: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/13.jpg)
Some examples of EDA Graphs - WiFi Data (simulated)
Features (Observation Variables)
• "Vendor" - factor / levels: LinkSys, …
• "Model" - factor / levels: GST200, …
• "Users_Max_Rate" - factor (background traffic) / levels: 1.6, 1.8, …, 7.0 Mbps
• "Year" - factor / levels: 1999, 2008
• "BER" - factor / levels: 4, 5, 6, and 8
• "Type" - factor (type of user) / levels: 4, f, r
• "Rate" - performance metric (Mbps)
• "Distance" - factor (distance from the AP) / levels: 50, 100 m
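A data set with this shape, one performance metric and several factors, lends itself to per-level summaries, which is roughly what a box plot of Rate by factor shows. The sketch below is in Python rather than the R/ggplot2 used in the slides. The rows are made up, not the author's simulated data; only the column names and levels are taken from the feature list.

```python
from collections import defaultdict
from statistics import median

# Made-up observations for illustration only.
rows = [
    {"Type": "f", "Distance": 50, "Rate": 30.0},
    {"Type": "r", "Distance": 50, "Rate": 28.0},
    {"Type": "4", "Distance": 50, "Rate": 26.0},
    {"Type": "f", "Distance": 100, "Rate": 18.0},
    {"Type": "r", "Distance": 100, "Rate": 16.0},
]


def median_by(rows, factor, metric="Rate"):
    """Group the metric by the levels of one factor and take the median per level."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row[metric])
    return {level: median(vals) for level, vals in sorted(groups.items())}


by_distance = median_by(rows, "Distance")
print(by_distance)
```

Calling `median_by(rows, "Type")` instead compares the same metric across user types. Swapping the factor argument is the tabular counterpart of changing the grouping variable in a plot.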
![Page 14: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/14.jpg)
![Page 15: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/15.jpg)
![Page 16: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/16.jpg)
![Page 17: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/17.jpg)
![Page 18: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/18.jpg)
![Page 19: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/19.jpg)
![Page 20: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/20.jpg)
![Page 21: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/21.jpg)
![Page 22: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/22.jpg)
![Page 23: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/23.jpg)
![Page 24: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/24.jpg)
![Page 25: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/25.jpg)
![Page 26: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/26.jpg)
Q&A
![Page 27: Data analytics in computer networking](https://reader033.fdocuments.us/reader033/viewer/2022051707/58ed19071a28abaa148b4567/html5/thumbnails/27.jpg)
References
• NIST/SEMATECH Engineering Statistics Handbook (online)
• Report Writing for Data Science in R, Roger D. Peng, 2016
• The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016
• The Elements of Data Analytic Style, A guide for people who want to analyze data, Jeff Leek, 2015