Python in Data Science Work

PYTHON IN DATA SCIENCE WORK. RICK BAHAGUE, DATA SCIENTIST

Transcript of Python in Data Science Work

Page 1: Python in Data Science Work

PYTHON IN DATA SCIENCE WORK

RICK BAHAGUE, DATA SCIENTIST

Page 2: Python in Data Science Work

Our Agenda

What is Data Science?

Introduction to Python

Python Tools for Data Science

A bit of Python for Big Data Processing

Questions

Page 3: Python in Data Science Work

Data Science

Source: Python Data Analytics

Page 4: Python in Data Science Work

A data scientist asks relevant real-world questions and, hopefully, discovers actionable recommendations from data.

Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

Page 5: Python in Data Science Work

TOOLS

Page 6: Python in Data Science Work

WHAT IS PYTHON?

“The name Python comes from the surreal British comedy group Monty Python, not from the snake. Python programmers are affectionately called Pythonistas, and both Monty Python and serpentine references usually pepper Python tutorials and documentation.”

Source: Automate the Boring Stuff with Python

Page 7: Python in Data Science Work

import antigravity

Page 8: Python in Data Science Work

Installing Python

https://www.continuum.io/downloads

Page 9: Python in Data Science Work

Launching Anaconda Python Distribution

Page 10: Python in Data Science Work

When is data ready and prepared for analysis?

Image source: http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/

Page 11: Python in Data Science Work

Github: https://github.com/RickBahague/dspop

Page 12: Python in Data Science Work

Sample Data Set: Github: https://github.com/veekun/pokedex

Page 13: Python in Data Science Work

Pandas: Python Data Analysis Library

Import pandas library

Reading/Writing Data

Series

DataFrame

Selecting Internal Elements

Assigning Values to Elements
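The pandas topics on this slide can be sketched in a few lines. This is a minimal illustration, not the code from the talk: the CSV text, the names `Alpha`, `Beta`, `Gamma`, and all numeric values are made up for the example, and an in-memory string stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a real file; all values are made up.
csv_text = """name,type,hp,attack
Alpha,grass,45,49
Beta,fire,39,52
Gamma,water,44,48
"""

# Reading data: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Series: a one-dimensional labeled array.
hp = pd.Series([45, 39, 44], index=["Alpha", "Beta", "Gamma"], name="hp")

# DataFrame: a two-dimensional table; selecting one column gives a Series.
attack = df["attack"]

# Selecting internal elements by label (loc) and by position (iloc).
beta_hp = hp.loc["Beta"]
first_row = df.iloc[0]
gamma_type = df.loc[2, "type"]

# Assigning values to elements.
df.loc[1, "hp"] = 40
hp["Alpha"] = 46

# Writing data: to_csv without a path returns the CSV text as a string.
csv_out = df.to_csv(index=False)
```

Passing a file name to `read_csv` and `to_csv` instead of the in-memory buffer reads from and writes to disk.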

Page 14: Python in Data Science Work

Pandas: Python Data Analysis Library

Evaluating Values (unique, isin, value_counts, NaN)

Filtering Values

Transpose

Operations between DataFrame and Series

Statistics Functions, Correlation/Covariance
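A short sketch of the operations listed on this slide, assuming a small made-up table (the column names and values are hypothetical, chosen only to exercise each call):

```python
import numpy as np
import pandas as pd

# Hypothetical data; one hp value is deliberately missing (NaN).
df = pd.DataFrame(
    {
        "type": ["grass", "fire", "water", "fire"],
        "hp": [45.0, 39.0, np.nan, 58.0],
        "attack": [49.0, 52.0, 48.0, 64.0],
    }
)

# Evaluating values.
types = df["type"].unique()                             # distinct values
counts = df["type"].value_counts()                      # frequency of each value
is_fire_or_water = df["type"].isin(["fire", "water"])   # membership test
missing_hp = df["hp"].isna()                            # True where the value is NaN

# Filtering values with a boolean mask.
strong = df[df["attack"] > 50]

# Transpose: rows become columns and columns become rows.
transposed = df.T

# Operation between a DataFrame and a Series: subtract each column's mean.
# The Series index (hp, attack) is aligned with the DataFrame columns.
numeric = df[["hp", "attack"]]
centered = numeric - numeric.mean()

# Statistics functions, correlation and covariance (NaN values are skipped).
mean_attack = df["attack"].mean()
corr = numeric.corr()
cov = numeric.cov()
```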

Page 15: Python in Data Science Work

Scikit-learn & ML Basics

... learning from experience either with or without supervision of humans

Source: Mastering Machine Learning with scikit-learn
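The distinction between learning with and without supervision can be illustrated with scikit-learn. This is a sketch under stated assumptions: the six points and their labels are made up, and the choice of `KNeighborsClassifier` and `KMeans` is only one possible pairing, not something the slide prescribes.

```python
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: two well-separated groups of 2-D points.
X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
y = [0, 0, 0, 1, 1, 1]  # labels supplied by a human

# Supervised: the model learns from examples that come with labels.
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)
predicted = clf.predict([[1.0, 0.9], [5.0, 5.0]])

# Unsupervised: the model sees only X and groups the points on its own.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)
```

The cluster numbers assigned by `KMeans` are arbitrary, so they need not match the human labels in `y` even when the grouping is the same.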

Page 16: Python in Data Science Work

ML Flow

Image source: http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/
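A typical machine learning flow can be sketched as split, preprocess, fit, evaluate. The example below is an assumption about what such a flow looks like in scikit-learn, not a reproduction of the image: it uses the iris data set bundled with scikit-learn, and the scaler and classifier are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Load data (iris ships with scikit-learn, so nothing is downloaded).
X, y = load_iris(return_X_y=True)

# 2. Hold out a test set so evaluation uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 3. Chain preprocessing and the model so both are fit on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# 4. Evaluate on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in one pipeline keeps test data from leaking into the preprocessing step.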

Page 17: Python in Data Science Work

Machine Learning with Scikit-learn

Source: http://scikit-learn.org/stable/

Page 18: Python in Data Science Work

A bit of Big Data Processing

Source: Python Data Analytics
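The slide does not say which technique was shown, so the following is only one possible illustration: when a file is too large for memory, pandas can read it in chunks and combine partial results. The CSV text, the column names, and the chunk size are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical rows standing in for a file too large to load at once.
rows = [("fire", 39), ("water", 44), ("fire", 58), ("grass", 45), ("water", 50)]
csv_text = "type,hp\n" + "\n".join(f"{t},{h}" for t, h in rows)

totals = {}  # running sum of hp per type
counts = {}  # running row count per type

# chunksize makes read_csv yield DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    grouped = chunk.groupby("type")["hp"].agg(["sum", "count"])
    for type_name, row in grouped.iterrows():
        totals[type_name] = totals.get(type_name, 0) + row["sum"]
        counts[type_name] = counts.get(type_name, 0) + row["count"]

# Combine the partial sums and counts into a mean per type.
mean_hp = {t: totals[t] / counts[t] for t in totals}
```

Keeping sums and counts separately, rather than averaging per-chunk means, gives the correct overall mean even when chunks contain different numbers of rows per type.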

Page 19: Python in Data Science Work

Creative Commons License

Python in Data Science Work by Rick Bahague is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Based on a work at https://medium.com/@rbahaguejr.

Permissions beyond the scope of this license may be available at https://medium.com/@rbahaguejr.