CSE 515 Statistical Methods in Computer Science Instructor: Pedro Domingos.

Page 1

CSE 515

Statistical Methods in Computer Science

Instructor:

Pedro Domingos

Page 2

Logistics

• Instructor: Pedro Domingos
– Email: [email protected]
– Office: 648 Allen Center
– Office hours: Wednesdays 3:30-4:20

• TA: Daniel Lowd
– Email: [email protected]
– Office: 216 Allen Center
– Office hours: Mondays 3:00-3:50

• Web: www.cs.washington.edu/515

• Mailing list: cse515

Page 3

Evaluation

• Four homeworks (15% each)
– Handed out on weeks 1, 3, 5 and 7
– Due two weeks later
– Include programming

• Final (40%)

Page 4

Textbook

• D. Koller & N. Friedman, Structured Probabilistic Models: Principles and Techniques, MIT Press.

• Complements:

– S. Russell & P. Norvig, Artificial Intelligence: A Modern Approach (2nd ed.), Prentice Hall, 2003.

– M. DeGroot & M. Schervish, Probability and Statistics (3rd ed.), Addison-Wesley, 2002.

– Papers, etc.

Page 5

What Is Probability?

• Probability: Calculus for dealing with nondeterminism and uncertainty

• Cf. Logic

• Probabilistic model: Says how often we expect different things to occur

• Cf. Function

Page 6

What’s in It for Computer Scientists?

• Logic is not enough

• The world is full of uncertainty and nondeterminism

• Computers need to be able to handle it

• Probability: New foundation for CS

Page 7

What Is Statistics?

• Statistics 1: Describing data

• Statistics 2: Inferring probabilistic models from data
– Structure
– Parameters
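
The two senses of statistics above can be illustrated with a minimal sketch. The coin-flip data and the helper names `describe` and `fit_bernoulli` are assumptions made for illustration, not course material. The first function only summarizes the data; the second infers the parameter of a probabilistic model, here taking the structure (a single Bernoulli variable) as given.

```python
from collections import Counter

def describe(data):
    """Statistics 1: summarize the data (value counts and mean)."""
    counts = Counter(data)
    mean = sum(data) / len(data)
    return counts, mean

def fit_bernoulli(data):
    """Statistics 2: infer a model parameter from the data.

    Assumes the structure (one Bernoulli variable) is fixed and
    estimates p = P(x = 1) by maximum likelihood, which for a
    Bernoulli variable is the fraction of ones in the data.
    """
    return sum(data) / len(data)

# Hypothetical data: 1 = heads, 0 = tails.
flips = [1, 0, 1, 1, 0, 1, 0, 1]
counts, mean = describe(flips)
p_hat = fit_bernoulli(flips)
print(counts, mean, p_hat)
```

For a single binary variable the descriptive mean and the maximum-likelihood estimate coincide; the distinction matters more once the model has several variables and its structure must also be chosen.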

Page 8

What’s in It for Computer Scientists?

• Statistics and CS are both about data

• Massive amounts of data around today

• Statistics lets us summarize and understand it

• Statistics lets data do our work for us

Page 9

Stats 101 vs. This Class

• Stats 101 is a prerequisite for this class

• Stats 101 deals with one or two variables; we deal with tens to thousands

• Stats 101 focuses on continuous variables; we focus on discrete ones

• Stats 101 ignores structure

• We focus on computational aspects

• We focus on CS applications

Page 10

Relations to Other Classes

• CSE 546: Machine Learning

• CSE 573: Artificial Intelligence

• Application classes (e.g., Comp Bio)

• Statistics classes

• EE classes

Page 11

Applications in CS (I)

• Machine learning and data mining

• Automated reasoning and planning

• Vision and graphics

• Robotics

• Natural language processing and speech

• Information retrieval

• Databases and data management

Page 12

Applications in CS (II)

• Networks and systems

• Ubiquitous computing

• Human-computer interaction

• Simulation

• Computational biology

• Computational neuroscience

• Etc.

Page 13

CSE 515 in One Slide

We will learn to:

• Put probability distributions on everything

• Learn them from data

• Do inference with them
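
As a rough illustration of these three steps, the sketch below puts a joint distribution over two binary variables, learns it from data by relative frequency, and answers a conditional query. The variables (`rain`, `wet_grass`), the observations, and the function names are all hypothetical, chosen only for this example; the course itself covers far richer models and methods.

```python
from collections import Counter

# Hypothetical observations of two binary variables: (rain, wet_grass).
data = [(1, 1), (1, 1), (0, 0), (0, 1), (0, 0), (1, 1), (0, 0), (1, 0)]

def learn_joint(samples):
    """Learn a joint distribution P(rain, wet_grass) by relative frequency."""
    counts = Counter(samples)
    total = len(samples)
    return {outcome: n / total for outcome, n in counts.items()}

def posterior_rain(joint, wet):
    """Inference: P(rain = 1 | wet_grass = wet) by conditioning the joint."""
    evidence = sum(p for (r, w), p in joint.items() if w == wet)
    return joint.get((1, wet), 0.0) / evidence

joint = learn_joint(data)            # steps 1 and 2: a distribution, learned from data
answer = posterior_rain(joint, 1)    # step 3: inference
print(joint, answer)
```

Storing the full joint as a table works only for a handful of variables, since its size grows exponentially with their number.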

Page 14

Topics (I)

• Basics of probability and statistical estimation

• Mixture models and the EM algorithm

• Hidden Markov models and Kalman filters

• Bayesian networks and Markov networks

• Exact inference

• Approximate inference

Page 15

Topics (II)

• Parameter estimation

• Structure learning

• Discriminative learning

• Maximum entropy estimation

• Dynamic Bayes nets and particle filtering

• Relational models

• Decision theory and Markov decision processes