BLiNQ MEDIA Praneeth Vepakomma Senior Data Scientist
Generalization in Supervised Machine Learning
Hypothetical Knapsack of Coins:
Copper and Gold Coins. Total number of coins is fixed and is a large sample. Capture-Recapture. What is the proportion of Gold coins?
Copper and Gold Coins. Total number of coins is variable and is a large sample. Capture-Recapture. What is the proportion of Gold coins?
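A minimal sketch of the knapsack question, assuming coins can be drawn at random: the share of gold coins in a sample serves as an estimate of the share in the whole knapsack. The function name, the knapsack composition (30% gold) and the sample size are hypothetical, and the sketch does not attempt the capture-recapture step itself.

```python
import random


def estimate_gold_proportion(knapsack, sample_size, rng):
    """Estimate the share of gold coins from a random sample drawn without replacement."""
    sample = rng.sample(knapsack, sample_size)
    return sum(1 for coin in sample if coin == "gold") / sample_size


rng = random.Random(0)
# Hypothetical knapsack: 10,000 coins, of which 30% are gold.
knapsack = ["gold"] * 3000 + ["copper"] * 7000
estimate = estimate_gold_proportion(knapsack, 500, rng)
print(f"estimated proportion of gold: {estimate:.3f}")
```

The estimate varies from sample to sample; larger samples tighten it around the true proportion.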
BASIC ML/STAT TERMINOLOGY:
190 years after Gauss, the core problem of prediction remains an active area of research:
Then:
Now:
Find a mapping (an approximation) from the features to the labels.
A list of parameters is required to represent the function.
Existing Features
Known Labels
Unavailable Features
Unknown Labels
Loss Function
Assumptions
What is Supervised Learning?
Evaluating the Learned Function:
Loss Function quantifies the error in the approximation.
Learn a mapping by optimizing the loss.
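As an illustration of learning a mapping by optimizing a loss, the sketch below fits a straight line by minimizing the mean squared loss, using the closed-form least-squares solution. The squared loss, the function names and the five data points are assumptions chosen for the example; the talk does not specify them.

```python
def squared_loss(y_true, y_pred):
    """Error of a single prediction."""
    return (y_true - y_pred) ** 2


def mean_loss(slope, intercept, xs, ys):
    """Average squared loss of the line y = slope * x + intercept on the data."""
    return sum(squared_loss(y, slope * x + intercept) for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Return the slope and intercept that minimize the mean squared loss."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical features and labels.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, loss={mean_loss(slope, intercept, xs, ys):.4f}")
```

Any other choice of slope and intercept gives a mean squared loss at least as large on these five points.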
Example:
Predictions with varying parameters:
How do we generalize?
Generalization and Predictability
Empirical Risk Minimization:
True Risk Minimization:
Empirical Risk is the average (expected) loss on seen data.
True Risk is the expected loss under the process generating the (X, Y) pairs.
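The gap between the two risks can be made concrete with a small simulation. The sketch below assumes a hypothetical data-generating process (a linear signal plus Gaussian noise) and a 1-nearest-neighbor predictor; neither comes from the talk. The true risk cannot be computed exactly, so it is approximated by the average loss on a large fresh sample from the same process.

```python
import random


def draw_pairs(n, rng):
    """Draw n (x, y) pairs from a hypothetical process: y = 2x + Gaussian noise."""
    pairs = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        pairs.append((x, 2.0 * x + rng.gauss(0.0, 0.3)))
    return pairs


def one_nn_predict(train, x):
    """Predict the label of the training point whose feature is closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def average_squared_loss(train, pairs):
    """Average squared loss of the 1-nearest-neighbor predictor on the given pairs."""
    return sum((y - one_nn_predict(train, x)) ** 2 for x, y in pairs) / len(pairs)


rng = random.Random(1)
train = draw_pairs(50, rng)    # seen data
fresh = draw_pairs(2000, rng)  # unseen data from the same process

empirical_risk = average_squared_loss(train, train)
approx_true_risk = average_squared_loss(train, fresh)
print(f"empirical risk: {empirical_risk:.3f}, approximate true risk: {approx_true_risk:.3f}")
```

The predictor memorizes the seen data, so its empirical risk is zero, while its loss on fresh data is not. Minimizing empirical risk alone therefore does not guarantee generalization.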
PARAMETRIC CHARACTERIZATION OF THE MAPPING:
2D Linear function: Slope, Intercept
Cubic Spline: Number of knots, Location of knots
Nearest-Neighbor regression: Number of neighbors
Lasso: L1-L2 Weights
Support Vector Machines: Kernel width, Margin length
Random Forests: Resampling sample size
Long list of available Supervised Learning Techniques.
Most of the techniques have tuning parameters.
We can minimize out-of-sample error by tuning the technique with optimal parameters.
Tuning can be performed by cross-validation over a discrete grid of parameter combinations.
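A minimal sketch of tuning by cross-validation, assuming nearest-neighbor regression with a single tuning parameter (the number of neighbors) and a hypothetical noisy sine data set. The grid values, fold count and function names are illustrative only; with several tuning parameters the grid would hold every combination of candidate values.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the labels of the k training points closest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in neighbors) / k


def cv_error(data, k, n_folds=5):
    """Mean squared error of k-NN regression, averaged over n_folds held-out folds."""
    fold_errors = []
    for fold in range(n_folds):
        held_out = data[fold::n_folds]
        training = [pair for i, pair in enumerate(data) if i % n_folds != fold]
        errors = [(y - knn_predict(training, x, k)) ** 2 for x, y in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / n_folds


rng = random.Random(0)
# Hypothetical data: a sine curve observed with Gaussian noise.
xs = [rng.uniform(0.0, 6.0) for _ in range(200)]
data = [(x, math.sin(x) + rng.gauss(0.0, 0.3)) for x in xs]

grid = [1, 3, 5, 10, 20, 50]  # candidate numbers of neighbors
scores = {k: cv_error(data, k) for k in grid}
best_k = min(scores, key=scores.get)
print(f"best number of neighbors: {best_k}")
```

Each candidate is scored only on data it was not fitted to, so the selected value targets out-of-sample error rather than fit to the training data.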
CURSE OF DIMENSIONALITY: Flat World vs. 10D World:
CURSE OF DIMENSIONALITY: Let us validate:
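One common way to validate the curse of dimensionality, not necessarily the one shown in the talk, is to ask how wide a "local" neighborhood has to be. Assuming points spread uniformly over a unit hypercube, the sketch below computes the edge length of a sub-cube that contains a given fraction of the points. The function name and the 10% fraction are assumptions.

```python
def edge_length_for_fraction(fraction, dimensions):
    """Edge length of a sub-cube holding `fraction` of a uniform unit hypercube's volume."""
    return fraction ** (1.0 / dimensions)


for d in (2, 10):
    print(d, round(edge_length_for_fraction(0.10, d), 3))
# prints:
# 2 0.316
# 10 0.794
```

In two dimensions a neighborhood covering 10% of the data spans about a third of each axis. In ten dimensions it spans almost 80% of each axis, so it is no longer local.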
Structural Risk Minimization via Regularization:
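A minimal sketch of regularization, assuming a one-parameter linear model without intercept and a squared (ridge) penalty. The objective is the sum of squared losses plus `penalty * weight ** 2`; the penalty values and the data are hypothetical, and the talk may have used a different penalty.

```python
def ridge_weight(xs, ys, penalty):
    """Weight w minimizing sum((y - w * x) ** 2) + penalty * w ** 2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Hypothetical features and labels.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

weights = {penalty: ridge_weight(xs, ys, penalty) for penalty in (0.0, 1.0, 10.0, 100.0)}
for penalty, weight in weights.items():
    print(f"penalty={penalty:6.1f}  weight={weight:.4f}")
```

A larger penalty shrinks the weight toward zero, trading a worse fit on seen data for a simpler function. The penalty is itself a tuning parameter.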
Brief Description
Technology Overview
Hiring (What we’re looking for): http://blinqmedia.com/contact/job-openings/
Let's work with Abalone
Thank You!