IE598 Big Data Optimizationniaohe.ise.illinois.edu/IE598/IE598-lecture1-introduction.pdf · Starts...
Transcript of IE598 Big Data Optimizationniaohe.ise.illinois.edu/IE598/IE598-lecture1-introduction.pdf · Starts...
A little about me
• Assistant Professor, ISE & CSL
UIUC, 2016 –
• Ph.D. in Operations Research,
M.S. in Computational Sci. & Eng.
Georgia Tech, 2010 – 2015
• B.S. in Mathematics,
University of Sci. & Tech. of China, 2006 – 2010
2
A little about the course
Big Data Optimization
• Explore modern optimization theories, algorithms, and big data applications
• Emphasize a deep understanding of structure of optimization problems and computation complexity of numerical algorithms
• Expose to the frontier of research in the intersection of large-scale optimization and machine learning
3
Course Details
• Prerequisites: no formal ones, but assume knowledge in
– linear algebra, real analysis, and probability theory
– basic machine learning and optimization at graduate level
• Textbooks: no required ones, but recommend to read the listed references on syllabus
5Ben-Tal & Nemirovski (2011) Nesterov (2003) Beck (2017) Bubeck (2015)
Course Details
• Evaluation: groups of size 1~3
– Paper Presentation (25%)
– Final Project (75%)
• Proposal (10%) : 1~3 pages
• Report (40%): 10~15 pages under given format
• Presentation (25%): 15~20 mins
– Bonus (20 pts): a conference or journal submission
• Deadlines: see syllabus
6
Course Admin
• Syllabus & Websitehttp://niaohe.ise.illinois.edu/IE598/ (pwd: spring2018)
• Where to get help – No TAs
– Email: [email protected] with [IE 598] in your subject
– Office Location: 211 Transport Building
– Office Hours: Mon. 3:00-4:00 or by appointment via email
7
Era of Big Data
• Big data heat in industry– LinkedIn:
48,000+ Data Scientist jobs in United States
11
More than a buzzword
• Big data revolution in various areas
– Robotics and autonomous car
– Natural language processing
– Computer vision
– Healthcare
FinanceHealthcare
Environment
Aerospace
Lifestyle
12
How to do data analysis?
Key Steps
• Pose a problem
• Collect data
• Pre-process and clean data
• Formulate a mathematical model
• Find a solution
• Evaluate and interpret the results
13
What is Optimization?
• Find the optimal solution that minimize/maximize an objective function subject to constraints
14
Why do we care?
Optimization lies at the heart of many fields, especially machine learning.
• Finance– Portfolio selection, asset pricing, etc.
• Electrical Engineering – Signal and image processing, control and robotics, etc.
• Industrial Engineering– Supply chain, revenue management, transportation etc.
• Computer Science– Machine learning, computer vision, etc.
15
Example – Portfolio Selection
• Markowitz Mean-Variance Model
where
– 𝑤 is a vector of portfolio weights
– 𝑅 is the expected returns
– Σ is the variance of portfolio returns
– 𝜆 > 0 is the risk tolerance factor
16
Example – Image Denoising
• Total Variation Denoising Model
where
– 𝑥 is image matrix
– 𝑂 is the noisy image, P is the observed entries
– 𝑇𝑉(𝑥) is the total variation
17
Example – Inventory
• Newsvendor Model
where
– 𝑞 is number of newspaper to be stocked
– 𝐷 is the random demand
– 𝑐 is the unit purchase price
– 𝑝 is the sell price
18
Example – Regression
• Linear Regression Model
where
– 𝑥𝑖: predictor vector (feature)
– 𝑦𝑖: response vector (label)
– 𝑤: parameters to be learned
– 𝑛: number of data points
19
Example – Regularization
• Ridge Regression Model
▪ 𝑤2
2= σ𝑗=1
𝑑 𝑤𝑗2 is the 𝐿2-regularization
• LASSO (Least Absolute Shrinkage and Selection Operator)
▪ 𝑤1= σ𝑗=1
𝑑 |𝑤𝑗| is the 𝐿1-regularization
20
Example – Classification
• Maximum Margin Classifier Model
where
– 𝑥𝑖: predictor vector (feature)
– 𝑦𝑖 ∈ {1,−1}: label/class
– 𝑤: parameters to be learned
– 𝑛: number of data points
21
Example – Maximum Likelihood Estimation
• Assume data points 𝑥1, . . . , 𝑥𝑛 are drawn i.i.d. from some distribution and we want to fit the data with a model 𝑝(𝑥|𝑤)with parameter 𝑤, the maximum likelihood estimation is to solve
– Least square regression as a special case
– Logistic regression as a special case
23
Example – Clustering
• K-Means Model
where
– 𝑥1, … , 𝑥𝑛: data
– 𝜇1, … , 𝜇𝑘: cluster centers to be learned
– 𝐶1, … , 𝐶𝑘: clusters to be assigned to
24
Many More Examples in ML
• Supervised learning (predictive models)– Regression
– Classification
– Neural networks
– Boosting
• Unsupervised learning (data exploration)– Clustering (K-means)
– Dimension reduction (PCA)
– Density estimation
• Reinforcement learning
• Collaborative filtering
• Graphical models
• Probabilistic inference
• … … 25
Structure of Optimization
• Linear vs. Nonlinear
• Deterministic vs. Stochastic
• Continuous vs. Combinatorial
• Smooth vs. Nonsmooth
• Convex vs. Nonconvex
• Low-dimensional vs. High-dimensional
• Static vs. Online
• Single vs. Sequential Decision Making
27
Easy or Hard?
What makes an optimization problem easy or hard?
Find minimum volume ellipsoid Find maximum volume ellipsoid
Polynomial solvable NP-hard
Example from L. Xiao, CS286 seminar 28
Easy or Hard?
What makes an optimization problem easy or hard?
Linear Optimization Polynomial Optimization
Polynomial solvable P ~ NP-hard
29
Complexity and Convexity
“ The great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity.”
— R. Rockafellar, SIAM Review 1993
Non-Convex Optimization Convex Optimization30
Types of Algorithms
• Polynomial-time algorithms (dates back to 1970s or so)
– E.g., ellipsoid method, interior point method (IPM)
• First-order algorithms (dates back to 1900s, resurrection since 1980)
– E.g., accelerated gradient descent method (AGD)
• Second-order algorithms – E.g., Newton method, L-BFGS
• Stochastic and randomized algorithms (dates back to 1950s, resurrection since 2004)
– E.g., stochastic approximation (SA)
31
Courtesy Warning
• If you are looking for software/tools for big data analytics :– IE 529 Stats of Big Data & Clustering
– Check CS/CSE related data mining courses
• If you are looking for introduction-level optimization:– ECE 490 Introduction to Optimization
– IE 510 Advanced Nonlinear Programming
• If you are looking for combinatorial optimization:– IE 598 Advanced Integer Programming
• Otherwise, welcome to the class and see you next week!
33