LEARN BY DOING BY-SASMITA PANIGRAHI DataScience Training
Transcript of LEARN BY DOING BY-SASMITA PANIGRAHI DataScience Training
Course Details “Machine Learning gives computer the ability to learn without being explicitly programmed” ~l.Samuel
DataScience Training Build your own predictive models in 45 days with zero prior knowledge
LEARN BY DOING BY-SASMITA PANIGRAHI
In this project ,we will build a Interactive Visualization dashboard
In this course, you will learn how to apply Data Science through seven pragmatic steps - Frame, Acquire, Refine, Transform, Explore, Model, and Insight - to any business problem. The focus will be to learn the principles through an applied case study and by actually coding in Python to solve this.
Objective ‣ Learn how to employ statistical and machine
learning algorithms to solve real life problems by working on real time Projects.
‣ Develop proficiency in using python and its libraries like Pandas, numPy, Seaborn,
‣ Interactive and live coding session
‣ Taught by Real time Practitioners
Project-1
-Pandas,Plotly,python core
Project-2
Sales Forecasting
Project-3
Fraud Detection
Project-4
Document Clustering
‣ What is Artificial Intelligence ‣ Introduction To DataScience ‣ Real Time UseCases Of DataScience ‣ Who is a DataScientist?? ‣ DataScience Project Lifecycle ‣ Skillsets needed for DataScientist ‣ Difference between DataEngineer,
DataScientist and DataAnalyst ‣ 6 Steps to take in 3.5 Months for a
complete transformation to DataScience from any other domain
‣ Machine Learning-Giving Computers The ability to learn from data ‣ Supervised vs Unsupervised ‣ DeepLearning vs Machine Learning
‣ Software Installation ‣ Jupyter Notebook Tutorial ‣ Introduction to Python ‣ Comments ‣ Variable,Operators,DataTypes ‣ If Else,For and While Loops ‣ Functions ‣ Lambda Expression ‣ Taking input from keyboard ‣ List ‣ Tuple ‣ Set ‣ Dictionary ‣ INTERVIEW QUESTIONS-
ASSIGNMENT-1
Module-1(Python Basics)
Python Fundamentals
Welcome To The Course
Module-2(Python Advance)
‣ Introduction to Numpy ‣ Creating Arrays ‣ arange(),linspace() etc. ‣ Creating Arrays of Random Numbers ‣ Basic Operations on an Array ‣ Applying Universal functions on an array ‣ Linear Algebra operations on an array ‣ Numpy DataTypes ‣ Type Conversion ‣ Array Stacking
‣ASSIGNMENT-2
NumPy
Pandas
‣ Introduction to Pandas ‣ Creating DataFrames ‣ Reading data from csv,excel etc. into a
DataFrame & writing df into csv,excel ‣ Selection and Indexing ‣ Conditional Selection ‣ Groupby ‣ Pivot Table ‣ Merging , Joining, Cancatenation ‣ Missing Value Treatment ‣ Data Visualisation using Pandas
‣ASSIGNMENT-3
we’ll begin curriculum focused on various data visualization techniques and how they can help us engage and learn from our data using Plotly
Visualisation-Plotly
‣ Line Plots ‣ Scatter Plots ‣ Pair Plots ‣ Histograms ‣ Heat Maps ‣ Bar Plots ‣ Count Plots ‣ Factor Plots ‣ Box Plots ‣ Violin Plots ‣ Swarm Plots ‣ Strip Plots ‣ Pandas Builtin Visualisation Library
‣ASSIGNMENT-4
Module-3( Visualisation)
Prcatice , Practice and Practice!!!!!!! Implement what you have learnt so far by working in a real time Project……..
Project-1
Pandas Numpy Plotly
Python Core
Skills Required
Module-4 (Statistics)
‣ Descriptive vs Inferential Statistics ‣ Mean,Median,Mode ‣ Central Limit Theorm ‣ Measure of dispersion ‣ Inter Quartile Range ‣ Variance ‣ Standard Deviation ‣ Box Plot ‣ Z score ‣ Scatter Plot ‣ Pearson’s Product Moment Correlation-r ‣ R square ‣ Adjusted R-square ‣ Normal Distribution ‣ Standard Normal Distribution ‣ Emprical rule of Normal Distribution ‣ What is an Outlier ‣ Outlier Detection and Removal
Statistics
Module-5 (ML-Linear Reg)
Linear Regression, Cost Function,Gradient
Descent,sklearn
‣ Introduction to Machine Learning ‣ Supervised vs Unsupervised ‣ Regression vs Classification ‣ Linear Regression Theory ‣ Gradients/Derivative Theory ‣ Assumption of Linear Regression ‣ Cost Function ‣ Optimize Cost function using Gradient
Descent ‣ Gradient Descent Detail Explanation ‣ Mathematical Derivation ‣ Multi- Colinearity ‣ MAE ‣ MSE ‣ RMSE
Prcatice , Practice and Practice!!!!!!! Implement what you have learnt so far
Project-2(Sales Prediction)
Sklearn Pandas Plotly
Python Core Linear Regression
‣ What is Bootstap ‣ Bagging ‣ Difference between Random Forest and
Decision Tree ‣ Feature Selection using Random Forest ‣ Hyperparameter tuning
Random Forest
Module-6 (Decision Tree, Random Forest)
Decision Tree‣ What is ID3 Algorithm ‣ Entropy ‣ Calculating Information Gain ‣ Overfitting,Underfitting,Best fit
‣ Confusion Matrix ‣ Classification Report ‣ Recall ‣ Precision ‣ AUC ‣ ROC ‣ Cross Validation
CLASSIFICATION VALIDATION TECHNIQUES
‣ Implement PCA from scratch using Numpy and using sklearn is a real time dataset
Step By Step Implementation of PCA From scratch (with out sklearn) and by using
sklearn
Module-7 (PCA)
Principal Component Analysis
‣ Introduction to Dimensionality Reduction ‣ PCA Theory discussion ‣ Eigen Values , Eigen Vectors ‣ Step by Step Detail Mathematical
Derivation ‣ Individual and Cummulative Variation
Ratio
Prcatice , Practice and Practice!!!!!!! Implement what you have learnt so far bworking in a real time Project……..
Project-3(Fraud Detection)
Sklearn Python Pandas Plotly PCA
Decision Tree Random Forest
Module-9 (KMeans)
‣ Introduction to Unsupervised Machine Learning
‣ KMeans Theory ‣ How to decide K in KMeans
Module-10 (NLP)
‣ Introduction to NLP ‣ Text Preprocessing Techniques using
Space and NLTK ‣ Word Tokens ‣ StopWord Removal ‣ Count Vectorizer ‣ Tf-Idf Vectorizer
KMeans Clustering
Text Mining
Project-4(Document Clustering)
Sklearn Python Pandas Plotly PCA
KMeans NLTK,Spacy
Last Class.........
Discussion on Some Real Time Projects from •different domains(Banking,Retail,Manufacturing etc. domain projects)
Resume Preparation Tips•
Discussion of some Magic Python Packages and •Libraries for Data Analysis, Visualization, Outlier Detection, Time Series and For Handling Class Imbalance Problem
Recordings For Bonus Chapters............
Introduction DeepLearning•What is a Perceptron•Neural Network Architecture •Activation Function•
- Sigmoid - hyperbolic tangent - relu - leaky relu - parametric relu
Hidden Layers•Feed Forward, Back Propagation•Gradient Descent•Introduction to KERAS•Implementing first Deep Learning model using •KERAS
Bonus Recording