Introduction to Data Science - Institute for Advanced...

Page 1: Introduction to Data Science - Institute for Advanced Study (projects.ias.edu/pcmi/hstp/sum2016/int/gouldpresentation.key.pdf)

Introduction to Data Science

Rob Gould

UCLA Department of Statistics

Page 2:

• Funded by the National Science Foundation as a partnership between UCLA Computer Science, UCLA Statistics, UCLA Center X, and Los Angeles Unified School District.

• Goal: Utilize the notion of participatory sensing to develop computational and statistical thinking in the context of data for high school students.

Page 3:

Why?

• To teach students to reason with data in ALL forms, not just those from random samples:

• “open data” from government sources

• data with photos, GPS coordinates, dates

• data from the internet (Twitter, Facebook)

Page 4:

• Year-long course

• Algebra I prerequisite

• Students learn R via RStudio

• Emphasis on the Statistical Investigation Process

• “Informal Inferential Reasoning”

• GAISE Levels A/B

Page 5:

What is PS (participatory sensing)?

• Mobile technology is used to create a community that shares

  • the desire to better understand an issue, topic, or set of research questions

  • data collection

  • data

  • analyses

Page 6:

“What do we throw away?”

Page 7:

Concepts Outline I

Unit 1

• What and why of data

• Organizing data

• Data investigative cycle

• Asking questions

• What is typical?

• Interpreting histograms

• Associations

Unit 2

• What do means, medians, MADs, and SDs measure, and why?

• How to compare groups when faced with variability

• How likely is an outcome?

• What is bias?

• Basic probability

• How to simulate basic probabilities

• What does it mean to say something happens “by chance”? How can we simulate this?

• How and why do we merge data files?

• What’s the “normal” model?
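The Unit 2 questions about centers, spreads, and "by chance" outcomes can be illustrated with a short sketch. The course itself teaches R via RStudio; the Python below is only an illustration, not course material. The function names `summarize` and `simulate_chance`, the sample numbers, and the coin-flip scenario are all hypothetical. MAD is assumed here to mean the mean absolute deviation from the mean; the slides do not define it.

```python
import random
import statistics


def summarize(values):
    """Return the mean, median, MAD, and SD of a list of numbers."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Assumption: MAD = mean absolute deviation from the mean.
    mad = sum(abs(v - mean) for v in values) / len(values)
    # Sample standard deviation (divides by n - 1).
    sd = statistics.stdev(values)
    return {"mean": mean, "median": median, "mad": mad, "sd": sd}


def simulate_chance(n_flips=10, at_least=8, trials=10_000, seed=1):
    """Estimate by simulation how often a fair coin gives at least
    `at_least` heads in `n_flips` flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if heads >= at_least:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Made-up data, for illustration only.
    print(summarize([2, 4, 4, 5, 10]))
    print(simulate_chance())
```

The simulated proportion is one way to make "by chance" concrete: if an outcome shows up rarely across many simulated repetitions, chance alone is a less convincing explanation for it.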

Page 8:

Concepts Outline II

Unit 3

• Anecdotes as evidence?

• What is a controlled experiment?

• Why is random assignment important?

• What is an observational study?

• How do observational studies and randomized experiments compare?

• What are surveys?

• How to write a survey question that addresses a statistical question

• How do we communicate uncertainty in estimates?

• How do we design a PS campaign?

• How are data organized on a web page, and how can we get it?

Unit 4

• Create a campaign

• What’s a statistical prediction?

• How to measure success in predictions?

• Predictions with regression

• Predictions with CART

• MEA: modeling with data

• Classification and clustering
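The Unit 4 topics "Predictions with regression" and "How to measure success in predictions?" can be sketched as follows. This is a minimal Python illustration, not the course's R code. The helper names `fit_line` and `mean_squared_error` and the data are hypothetical, and mean squared error is assumed as the success measure; the slides do not say which measure the course uses.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(actual, predicted):
    """Average squared gap between actual and predicted values
    (assumed success measure; smaller is better)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Made-up data, for illustration only.
    xs = [1, 2, 3, 4, 5]
    ys = [2, 4, 5, 4, 5]
    slope, intercept = fit_line(xs, ys)
    predictions = [slope * x + intercept for x in xs]
    print(slope, intercept)
    print(mean_squared_error(ys, predictions))
```

Comparing the error of the fitted line with the error of a simpler rule, such as always predicting the mean, gives a rough sense of whether the regression improves the predictions.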

Page 9:

http://mobilizingcs.org