Information Theory (IT) of Machine Learning for Big Data

Alex(ander) Jung, Aalto University

October 24, 2017

Outline

1 Introduction

2 The IT Age

3 IT for Machine Learning Research

4 Wrap Up

About Me

MSc (2008) and PhD (2012) in electrical engineering/signal processing at TU Vienna

Post-Doc stay at ETH Zurich 2012

Assistant Professor TU Vienna 2013-2015

since 2015, Assistant Professor for Machine Learning at Aalto CS

My Research Group

heading the group “Machine Learning for Big Data”

currently five PhD students, several MSc and BSc students

research revolves around fundamental limits and efficient algorithms for machine learning involving massive, decentralised datasets (big data)

My Teaching

since 2015, CS-E3210 “Machine Learning: Basic Principles” (this year 600 students)

since 2016, CS-E4020 “Convex Optimization for Big Data” (this year 50 students)

from 2018, CS-E4800 “Artificial Intelligence” (expected at least 100 students)

Some Brainy Quotes on The Data Deluge

“We’re Drowning in Information and Starving for Knowledge.” - Rutherford D. Rogers

“There is Nothing More Practical Than a Good Theory.” - Kurt Lewin

2 The IT Age

A Father of IT

Claude Elwood Shannon (1916 - 2001)

The Communication Problem

characterize noisy channel by single number C (capacity)

reliable communication possible for rates (in bit/s) < C
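
As a minimal sketch of what "a single number C" means, the Python snippet below computes the capacity of a binary symmetric channel, a textbook example that is not taken from the slides. The formula C = 1 - h2(p) gives bits per channel use; converting it to bit/s would require multiplying by the number of channel uses per second. The function names are chosen here for illustration only.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h2(p) in bits; h2(0) = h2(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(flip_prob: float) -> float:
    """Capacity (bits per channel use) of a binary symmetric channel
    that flips each transmitted bit independently with probability flip_prob."""
    return 1.0 - binary_entropy(flip_prob)


# a noiseless channel, a moderately noisy one, and a useless one
for p in (0.0, 0.11, 0.5):
    print(f"flip probability {p:.2f}: C = {bsc_capacity(p):.3f} bit per channel use")
```

A channel that flips every bit with probability 0.5 has capacity zero: its output is statistically independent of its input, so no rate above zero can be communicated reliably.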

The Evolution of IT

Shannon’s key paper on channel capacity published 1948

it took several decades to find efficient coding methods ...

a milestone was the invention of turbo codes (TC) in the early 1990s

TC come close to capacity using “simple” hardware (your mobile phone)

TC are nowadays used in:

3G and 4G mobile telephony standards

satellite communication

wireless network standards (WiMAX)

recent focus on network information theory

A Modern Communication System

3 IT for Machine Learning Research

Ski Resort Marketing

you are working in the marketing agency of a ski resort

hard disk full of webcam snapshots (gigabytes of data)

want to group them into “winter” and “summer” images

you have only a few hours for this task ...

Webcam Snapshots

i-th snapshot represented by feature vector $x^{(i)} \in \mathbb{R}^d$

find labels: $y^{(i)} = 1$ if the i-th image is from summer, else $y^{(i)} = 0$

Labeled Webcam Snapshots

select randomly N = 6 snapshots

manually categorise/label them (y = 1 for summer)
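
The random selection step can be sketched in a few lines of Python with NumPy. The collection size `num_snapshots` and the dictionary `manual_labels` are hypothetical placeholders; only N = 6 and the label convention come from the slide.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed only for reproducibility

num_snapshots = 1000   # hypothetical size of the snapshot collection
N = 6                  # number of snapshots to label by hand, as on the slide

# draw N distinct snapshot indices uniformly at random
labeled_idx = rng.choice(num_snapshots, size=N, replace=False)

# placeholder for the manual labels: y = 1 (summer), y = 0 (winter);
# a human fills these in after looking at each selected image
manual_labels = {int(i): None for i in labeled_idx}
print(sorted(manual_labels))
```

Sampling without replacement (`replace=False`) makes sure that no snapshot is labeled twice.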

Towards an ML Problem

we have few labeled snapshots

need an algorithm/method/software-app to automatically label all snapshots as either “winter” or “summer”

interpret this ML problem as communication problem ...

Machine Learning Problem = Communication Problem

labeled dataset $X^{(\mathrm{train})}$ provides the training data for learning the channel

classifier $\hat{y}(x)$ decodes the feature vector $x$ to detect the true label $y$

when is reliable classification possible?

what are good classifiers $\hat{y}(x)$?
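
The analogy can be illustrated with a small simulation, given here as a sketch under stated assumptions and not as the method of the talk. The label plays the role of the transmitted bit, a hypothetical Gaussian "observation channel" turns it into a noisy feature vector, and a nearest-centroid rule (a technique chosen here for simplicity) acts as the decoder. The class means, the noise level, and the balanced training set of 3 + 3 snapshots are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
d = 2                              # feature dimension (hypothetical)
noise_std = 1.0                    # strength of the channel noise (hypothetical)
means = np.array([[-1.0, -1.0],    # class y = 0 ("winter")
                  [1.0, 1.0]])     # class y = 1 ("summer")


def observe(labels):
    """Channel: map each label to a noisy feature vector x in R^d."""
    return means[labels] + noise_std * rng.normal(size=(len(labels), d))


# small labeled training set: 3 snapshots per class (N = 6 in total)
y_train = np.array([0, 0, 0, 1, 1, 1])
X_train = observe(y_train)

# "learn the channel": estimate one centroid per class from the training data
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])


def classify(X):
    """Decoder: assign each x to the class with the nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


# unlabeled snapshots whose true labels the decoder should recover
y_true = rng.integers(0, 2, size=5000)
X_test = observe(y_true)
error_rate = np.mean(classify(X_test) != y_true)
print(f"empirical error rate: {error_rate:.3f}")
```

Increasing `noise_std` makes the channel less informative and drives the error rate towards 0.5, which is one way to see why reliable classification is only possible when the features carry enough information about the label.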

A Modern Machine Learning Problem: Weather Prediction

will there be sun tomorrow in Helsinki? (research collaboration with the Finnish Meteorological Institute)

4 Wrap Up

Take Home Messages

machine learning is a particular form of communication

fundamental limits are given by the capacities of observation channels

efficient coding/decoding algorithms for machine learning
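
One standard way to make the "fundamental limits" statement concrete is Fano's inequality, shown below for a binary label. It is a textbook result and not necessarily the bound used in the talk. Here $P_e$ denotes the error probability of any classifier that sees only the feature vector $x$, $I(x; y)$ is the mutual information between features and label, and $h_2$ is the binary entropy function; these symbols are introduced here for illustration.

```latex
\[
  h_2(P_e) \;\ge\; H(y \mid x) \;=\; H(y) - I(x; y),
  \qquad
  h_2(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).
\]
```

For equally likely labels, $H(y) = 1$ bit, so an error probability close to zero requires $I(x; y)$ close to 1 bit: the features must carry almost all of the information in the label.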

Reading Material

C.E. Shannon, “A Mathematical Theory of Communication”, The Bell System Technical Journal, Vol. 27, pp. 379–423, 623–656, July, October, 1948.

see our papers at https://users.aalto.fi/~junga1/
