Post on 11-Jan-2016

Research in MIT’s Laboratory for Information and Decision Systems and in The Stochastic Systems Group

Alan S. Willsky

Director, LIDS

Head, SSG

willsky@mit.edu

http://ssg.mit.edu

November 2010

A Brief History of LIDS

• The oldest continuing laboratory on campus
  o Servomechanism Lab founded in 1940
• Major contributions to crucial applications
  o Military fire control, numerically controlled machines, …
• Pushing emerging computer technologies
  o Whirlwind, APT, …
• Broadened agenda and name changes: ESL (1950s) and LIDS (1970s)
  o Radar: Porcupine point defense
  o INTREX: one of the first database systems
  o Modern control and optimization
  o Robust and adaptive control
  o Large-scale and decentralized systems
• Continuing history of involvement and partnerships with industry and government (including a number of successful start-ups)
• Continuing history of major impact on academic programs and development of widely used texts

LIDS Now

• A center of gravity for research on the analytical information and decision sciences
• Our mission: pushing the envelope and foundations of information and decision sciences in the large
• Research “centers of gravity” and traditional core disciplines
  o Systems and control
  o Optimization
  o Networks
  o Inference, estimation, learning, and fusion
  o Communications and information theory
• Major push to work across disciplines, e.g., The Science of Networked Systems
• A sampling of application areas
  o Coordination/control of autonomous vehicles
  o Energy and economic information and decision systems
  o Situational awareness
  o Biological and biomedical signal and image analysis and modeling
  o Large-scale data assimilation for the geosciences

The DARPA Urban Challenge (Joint CSAIL/LIDS): Team MIT, 02/13/2008

Example: Evasive Maneuvering

• Intention of other cars not always clear
  o Have to believe that other vehicles will behave rationally
  o Still need to be able to avoid accordingly
• Video shows safe avoidance maneuver

First demonstration of UWB Localization

From 9% to 87%

C-LOC: Increase in coverage

Improved precision

Gossip algorithm: P2P networks

• Peer-to-peer networks
  o Architecture of choice for content dissemination, e.g. BBC iPlayer
  o Need extremely simple algorithms
• Randomized gossip algorithm
  o Local, iterative and very simple
  o Robust through randomness
• Efficient gossip solutions for
  o Content dissemination
  o Code-based distributed storage
  o Separable function computation
• Performance is determined by
  o Spectral properties of network cut
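To make the "local, iterative and very simple" point concrete, here is a minimal sketch of randomized pairwise gossip used to compute an average, one of the simplest separable functions. The ring topology, the initial values, and the function name `gossip_average` are illustrative assumptions, not taken from the slides.

```python
import random

def gossip_average(values, edges, rounds, seed=0):
    """Randomized pairwise gossip: each round, one random edge averages its endpoints."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(rounds):
        i, j = rng.choice(edges)           # a random link wakes up
        x[i] = x[j] = (x[i] + x[j]) / 2.0  # purely local update; the sum is preserved
    return x

# Hypothetical example: six peers on a ring, each holding one number.
num_nodes = 6
ring = [(i, (i + 1) % num_nodes) for i in range(num_nodes)]
start = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
final = gossip_average(start, ring, rounds=2000)
print(final)  # every entry should be close to the true average of `start`
```

No node ever sees more than one neighbor's value at a time, yet all nodes approach the global average. How many rounds are needed depends on the graph, which is in line with the slide's remark that performance is governed by the network's spectral and cut properties.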

MAC Protocol that finally works!

• Contention resolution or medium access
  o Fundamental to any well-engineered system, e.g. emerging wireless networks
  o Challenge: need efficient and implementable MAC
  o Unresolved quest for over four decades
• A new queue-based MAC protocol
  o Insights from learning, statistical physics, and the theory of Markov chains
  o Essentially, each transmitter transmits or not, independently, with a probability that is a function of its own backlog
  o And that’s it!
• Theorem. This MAC protocol is efficient.
  o Received ACM SIGMETRICS best paper award, 2009
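The core mechanism, each transmitter deciding on its own from its own backlog, can be illustrated with a toy slotted simulation. This is only a sketch: the slides do not give the actual backlog-to-probability function or the channel model, so `transmit_probability`, the single shared collision channel, and the arrival rate below are all hypothetical choices, and nothing here reproduces the efficiency theorem.

```python
import random

def transmit_probability(backlog):
    """Hypothetical map from a node's own backlog to its transmit probability."""
    # 0 when the queue is empty, increasing with backlog, capped below 0.5.
    return 0.5 * backlog / (1.0 + backlog)

def simulate(num_nodes=3, arrival_rate=0.05, slots=20000, seed=1):
    """Slotted collision channel: a slot succeeds only if exactly one node transmits."""
    rng = random.Random(seed)
    backlog = [0] * num_nodes
    arrived = delivered = 0
    for _ in range(slots):
        for i in range(num_nodes):                 # independent packet arrivals
            if rng.random() < arrival_rate:
                backlog[i] += 1
                arrived += 1
        # Each node decides independently, using only its own queue length.
        senders = [i for i in range(num_nodes)
                   if rng.random() < transmit_probability(backlog[i])]
        if len(senders) == 1:                      # no collision: one packet leaves
            backlog[senders[0]] -= 1
            delivered += 1
    return arrived, delivered, backlog

arrived, delivered, backlog = simulate()
print(arrived, delivered, backlog)
```

The point of the sketch is the information structure: no node needs to know the other queues or the number of contenders.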

Learning a Large Circuit

• Evaluating yield of an SRAM cell
  o To a high degree of accuracy for low failure prob.
• Our approach
  o Identify effective failure event, inspired by the theory of Large Deviations
  o Rare events happen only in a typical manner
  o Efficient sampling mechanism based on importance sampling
• Before: 2 months; now: 2 minutes
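The idea behind importance sampling for rare events can be sketched on a textbook case: estimating the tail probability P(X > 5) of a standard normal variable. This is not the SRAM yield model from the slide. The threshold, the mean-shifted proposal N(threshold, 1), and the function name are assumptions chosen only to show the mechanism of sampling where the rare event "typically" happens and reweighting by the likelihood ratio.

```python
import math
import random

def tail_probability_is(threshold, num_samples, seed=0):
    """Estimate P(X > threshold), X ~ N(0, 1), by sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        y = rng.gauss(threshold, 1.0)   # proposal centered on the failure region
        if y > threshold:
            # Likelihood ratio p(y) / q(y) for p = N(0, 1), q = N(threshold, 1).
            total += math.exp(-threshold * y + 0.5 * threshold ** 2)
    return total / num_samples

threshold = 5.0
estimate = tail_probability_is(threshold, num_samples=20000)
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))  # closed form for comparison
print(estimate, exact)
```

Plain Monte Carlo with the same 20,000 samples would almost surely see no exceedances at all, since the event has probability below one in a million; the shifted proposal sees it about half the time.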

SMART IRG#4: Future Mobility

• Objectives:
  o Develop in and beyond Singapore new paradigms for the planning, design, and operation of future urban transportation systems
  o Sustainability, societal, and environmental well-being in a high-density, livable urban environment
• Multi-disciplinary foundational elements
  o Pillar 1: Networked Computing and Control [Frazzoli, Jaillet, Dahleh]. Enable transformative technologies for urban transportation by collecting, storing, securely processing, and exploiting fine-granularity mobility data through the increasingly powerful Internet “cloud” and personal devices.
  o Pillar 2: Integrated Models of Land Use, Mobility, and Energy and Resource Use [Jaillet, Frazzoli]. Develop advanced integrated behavioral models to predict the effects of system interventions. Development of new simulation, optimization, and evaluation tools for real-time services and system controls.
  o Pillar 3: Performance assessment and implementation. Developing “metrics that matter” to enable scalable system assessment approaches to validly and reliably measure sustainability impacts.

Hybrid Electric Vehicle (HEV)

• The HEV draws power from two sources
  o Internal Combustion Engine: primary power source
  o Battery: secondary power source
• The battery
  o assists engine at high torques (engine is less fuel-efficient at high torques)
  o allows for engine shut-off while idle (e.g., waiting at a red light)
  o recharges through regenerative braking
• Challenge: optimal battery usage utilizing GPS data

Networked Control

From classical co-located systems to networked systems

[Diagram: a system with its controller (co-located); a system and a controller connected through a network]

Vehicle Routing Problems in a Dynamic World

• “User” or “environment” model
  o Events of interest driven over time by exogenous processes.
  o Stochastic or adversarial models.
  o Complex task specifications (e.g., temporal logic constraints).
• “System” model
  o Vehicles subject to algebraic, differential, and integral constraints.
  o Local sensing and communications.
  o Limited computational resources.
  o Heterogeneous systems: different vehicles, human agents.
• Performance criteria
  o Quality of Service: minimize delays, maximize capacity/throughput.
• Approach
  o Design polynomial-time approximation algorithms.
  o Novel tools combining systems and control theory, combinatorial optimization, queueing theory, stochastic processes, game theory, learning and estimation.

Traveling Repairperson, Dial-A-Ride, Environmental Monitoring, Mobile sensor networks, Surveillance, Search and Rescue, Area Denial, Crime prevention, Security, Network connectivity, Emergency Relief, Traffic congestion management.

Dynamics in Social Networks

• Spread of different “epidemics” may have similar structures
  o Tuberculosis outbreak
  o Word-of-mouth product recommendations
• Models for understanding dynamics of fads, opinions, conventions, technological innovations, and implications of network structure
• Statistical methods for locating “rumor” sources

Verification of Real-Time Embedded Software

• Proving desired performance and absence of run-time errors in real-time embedded software is critical.
• Software can be modeled as a dynamical system.
• Specific Lyapunov-like functions can prove critical properties such as absence of variable overflow and termination in finite time.
• Optimization methods such as semidefinite programming or linear programming can be used to find these Lyapunov-like functions.

[Diagram labels: Computer Program; Terminates in Finite-time?; Runs without Overflow?; Undecidable Problems; Suitable Dynamical System Model; Lyapunov-like Functions Can Prove Certain Invariant Properties of Dynamical Systems; Optimization-Based Search (e.g. Semidefinite Programming) for Lyapunov-like Invariants; Scale up for Application to Large Programs]
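As a small illustration of how an invariant certifies absence of overflow, the sketch below treats a loop body of the form x ← A x + B u as a dynamical system and checks one candidate invariant, the box |x_i| ≤ c_i. This is an assumption-laden toy: the filter coefficients, the bounds, and the function name are hypothetical, and the code only verifies a given candidate; it does not perform the optimization-based search (semidefinite or linear programming) mentioned on the slide.

```python
def box_is_invariant(A, B, state_bounds, input_bounds):
    """Check that |x_i| <= state_bounds[i] is preserved by x <- A x + B u
    for every input with |u_k| <= input_bounds[k]."""
    for i, row in enumerate(A):
        # Worst case of |(A x + B u)_i| over the box is attained at a corner.
        worst = sum(abs(a) * c for a, c in zip(row, state_bounds))
        worst += sum(abs(b) * d for b, d in zip(B[i], input_bounds))
        if worst > state_bounds[i]:
            return False
    return True

# Hypothetical two-state filter loop with one bounded input, |u| <= 1.
A = [[0.5, 0.25],
     [-0.25, 0.5]]
B = [[1.0],
     [0.0]]

ok = box_is_invariant(A, B, state_bounds=[4.0, 2.0], input_bounds=[1.0])
too_tight = box_is_invariant(A, B, state_bounds=[1.0, 1.0], input_bounds=[1.0])
print(ok, too_tight)
```

If the check succeeds and the program starts inside the box, every state variable stays within its bound forever, so any numeric type that can represent those bounds cannot overflow. A failed check says nothing either way; it only means this particular candidate is not an invariant.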

Bandgap optimization

• Systematic design of materials for wave propagation
• Decision variables are dielectric composition
• “Good” material properties determined by spectral bandgap

Mangan, et al., OFC 2004 PDP24

SSG Themes

• Representation and extraction of information in complex data and phenomena
  o Models that capture rich classes of phenomena and also lead to scalable algorithms
  o Graphical models represent a major component of our efforts
  o Representation and extraction of geometric information
• Learning, model discovery, and data mining
  o Fusion, segmentation, etc., when models aren’t available (or trustworthy) a priori
  o Or when we desire models that have desirable properties (e.g., sparsity, tractability)
• Statistical methods for distributed phenomena
  o Graphical models/Markov random fields
  o Sensor networks and fusion
• Application areas
  o Situational awareness/multisensor fusion in complex environments
  o Computer vision
  o Sensor networks
  o Geophysical data assimilation and remote sensing
  o Medical imaging
  o …

Inference algorithms for graphical models on trees

• Message-passing algorithms for “estimation” (marginal computation)
  o Two-sweep algorithms (leaves-root-leaves)
  o For linear/Gaussian models, these are the generalizations of Kalman filters and smoothers
  o Belief propagation, sum-product algorithm
  o Non-directional (no root; all nodes are equal)
  o Lots of freedom in message scheduling
• Message-passing algorithms for “optimization” (MAP estimation)
  o Two-sweep: generalization of Viterbi/dynamic programming
  o Max-product algorithm
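The two-sweep (leaves-root-leaves) schedule can be sketched for a small discrete tree. The code below is a minimal sum-product implementation under stated assumptions: the four-node tree, the potential tables, and the function names are made up for illustration, and a brute-force enumeration is included only to confirm that the marginals are exact on a tree.

```python
import itertools

def tree_marginals(num_states, node_pot, edge_pot, root=0):
    """Exact node marginals on a tree via one upward and one downward sweep."""
    n = len(node_pot)
    nbrs = {i: [] for i in range(n)}
    for i, j in edge_pot:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):               # each undirected edge is stored once
        if (i, j) in edge_pot:
            return edge_pot[(i, j)][xi][xj]
        return edge_pot[(j, i)][xj][xi]

    order, parent = [root], {root: None}
    for u in order:                      # breadth-first order; list grows as we go
        for v in nbrs[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)

    msg = {}
    def send(i, j):                      # sum-product message from i to j
        out = []
        for xj in range(num_states):
            total = 0.0
            for xi in range(num_states):
                term = node_pot[i][xi] * psi(i, j, xi, xj)
                for k in nbrs[i]:
                    if k != j:
                        term *= msg[(k, i)][xi]
                total += term
            out.append(total)
        msg[(i, j)] = out

    for u in reversed(order):            # sweep 1: leaves to root
        if parent[u] is not None:
            send(u, parent[u])
    for u in order:                      # sweep 2: root to leaves
        if parent[u] is not None:
            send(parent[u], u)

    marginals = []
    for i in range(n):
        belief = list(node_pot[i])
        for k in nbrs[i]:
            belief = [b * m for b, m in zip(belief, msg[(k, i)])]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals

def brute_force_marginals(num_states, node_pot, edge_pot):
    """Reference answer by enumerating every joint configuration."""
    n = len(node_pot)
    marg = [[0.0] * num_states for _ in range(n)]
    for x in itertools.product(range(num_states), repeat=n):
        w = 1.0
        for i in range(n):
            w *= node_pot[i][x[i]]
        for (i, j), table in edge_pot.items():
            w *= table[x[i]][x[j]]
        for i in range(n):
            marg[i][x[i]] += w
    return [[v / sum(row) for v in row] for row in marg]

# Hypothetical binary model on a star-shaped tree: 0 - 1, 1 - 2, 1 - 3.
node_pot = [[1.0, 2.0], [3.0, 1.0], [1.0, 1.0], [2.0, 5.0]]
edge_pot = {(0, 1): [[2.0, 1.0], [1.0, 2.0]],
            (1, 2): [[1.0, 3.0], [2.0, 1.0]],
            (1, 3): [[4.0, 1.0], [1.0, 1.0]]}
bp = tree_marginals(2, node_pot, edge_pot)
ref = brute_force_marginals(2, node_pot, edge_pot)
print(bp)
```

Replacing the inner sum over `xi` with a maximum turns the same two sweeps into max-product, the Viterbi-style counterpart mentioned in the second bullet group.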

What do people do when there are loops?

• Turn graphs into trees
  o Junction trees and cutset models
  o Dimensionality/combinatorial explosion in many cases
• Learn (or approximate with) models with tractable structure
  o Multiresolution models and others with hidden variables
• Another well-oiled approach
  o Belief propagation (and max-product) are algorithms whose local form is well defined for any graph
  o However, for a loopy graph, BP fuses information based on invalid assumptions of conditional independence
  o When does this converge? What does it converge to?
• Come up with new algorithms
  o Recursive Cavity Models

Graphical Model Example #1

Near-optimal, scalable, and very large-scale data assimilation for geophysical mapping (and uncertainty quantification)

Multiresolution/Hierarchical Models: A Continuing SSG Theme

• Earliest work: MR models on pyramidal trees
• Subsequent:
  o Algorithms that use embedded trees as the kernel of iterative algorithms
  o MR models but with sparse in-scale conditional graphical structure or conditional correlations
  o Iterative algorithms with good properties
• And there’s more “in the works”

Graphical Model Example #2

• Fusion of multi-modal, multi-resolution data (and estimation of critical aggregate variables)
• Learning of hierarchical relationships/dependencies

Graphical Models Example #3: Fast algorithms supporting expert analysts

[Figure panels: initial estimates; re-estimates]

• 1757 × 1284 surface, 377,384 measurements
• 3 million nodes in the pyramidal graph
• Introduce 100 new measurements in a 17 × 17 square region
• Use adaptive multipole methods to update in 10 iterations, each of which involves fewer than 1000 nodes

Walk-sum analysis for Gaussian models

• Gaussian models are specified in terms of the inverse covariance, J
  o Sparsity pattern determines graph structure
  o Computing estimates involves solving linear equations involving J
• Message-passing algorithms involve “information walks” along paths in a graph
  o For Gaussian problems these correspond to the computation of walk sums
  o Easiest to see if J is normalized so that $J = I - R$; then $J^{-1} = I + R + R^2 + \cdots$ (makes sense if absolutely summable)
• Walk-sum analysis and walk-summability
  o Provides conditions for convergence of algorithms such as Belief Propagation (which only captures some of the walks)
  o Provides a very clear picture of when BP fails and why it does so catastrophically
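The series expansion of the inverse can be checked numerically on a tiny model. The sketch below assumes a hypothetical 3-node model with unit-diagonal J = I − R; entry (i, j) of R^k adds up the weights of all length-k walks from i to j, so the partial sums accumulate walk sums. The row-sum test used here is only a simple sufficient condition for walk-summability (it bounds the spectral radius of the entrywise absolute value of R), chosen to keep the example dependency-free.

```python
def matmul(A, B):
    """Plain dense matrix product for small lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def walk_sum_inverse(R, num_terms):
    """Partial sum I + R + R^2 + ... + R^num_terms."""
    n = len(R)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(num_terms):
        power = matmul(power, R)   # weights of walks that are one step longer
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical edge weights (partial correlations) on a 3-node loop.
R = [[0.0, 0.3, 0.2],
     [0.3, 0.0, -0.25],
     [0.2, -0.25, 0.0]]
J = [[(1.0 if i == j else 0.0) - R[i][j] for j in range(3)] for i in range(3)]

# Sufficient condition: every absolute row sum of R is below 1.
row_sum_bound = max(sum(abs(v) for v in row) for row in R)
covariance = walk_sum_inverse(R, num_terms=200)
check = matmul(J, covariance)      # should be very close to the identity
print(row_sum_bound, check)
```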

A simple example

Extensions - I

Embedded subgraph iterations

• Cut some edges to get a tractable graph (e.g., tree)
• Perform exact inference (collect all walks) in the subgraph
• Richardson iteration: correction term for effects of edges left out (corresponds to single hop across cut edges)
• Repeat, although one can cut different edges
• Result: can collect all walks this way and get exact answer asymptotically
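A minimal sketch of the embedded-subgraph iteration for solving J x = h: split J = J_tree − K, where K holds the cut edges, and repeat x ← J_tree⁻¹ (h + K x). The 4-node loop, its weights, and the helper names are hypothetical, the same edge is cut in every pass, and a dense Gaussian-elimination routine stands in for the exact tree inference step that would be used in practice.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (dense stand-in for tree inference)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def embedded_tree_solve(J, h, cut_edges, iterations=100):
    """Iterate x <- J_tree^{-1} (h + K x), where K = J_tree - J holds the cut edges."""
    n = len(h)
    J_tree = [row[:] for row in J]
    for i, j in cut_edges:               # remove edges to leave a cycle-free graph
        J_tree[i][j] = J_tree[j][i] = 0.0
    K = [[J_tree[i][j] - J[i][j] for j in range(n)] for i in range(n)]
    x = [0.0] * n
    for _ in range(iterations):
        # Correction term: one hop across the cut edges using the current estimate.
        rhs = [h[i] + sum(K[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = solve(J_tree, rhs)           # exact inference on the embedded tree
    return x

# Hypothetical model on a 4-node loop 0-1-2-3-0; cutting edge (0, 3) leaves a chain.
J = [[1.0, -0.3, 0.0, -0.3],
     [-0.3, 1.0, -0.3, 0.0],
     [0.0, -0.3, 1.0, -0.3],
     [-0.3, 0.0, -0.3, 1.0]]
h = [1.0, 0.0, 2.0, -1.0]
x_iterative = embedded_tree_solve(J, h, cut_edges=[(0, 3)])
x_exact = solve(J, h)
print(x_iterative, x_exact)
```

This example model is walk-summable (absolute row sums of the off-diagonal part are 0.6), which is the regime in which the iterates are expected to approach the exact answer.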

Extensions - II

Segregating “feedback nodes”

• Find a set of nodes so that removing them cuts all (most) cycles
• Then have three-step algorithm
  o BP in remaining graph: exact (approximate) if all (most) cycles removed
  o Solve inference on the set of feedback nodes
  o Correction BP step for the remainder of the graph
• Exact if have complete feedback vertex set
• Can yield excellent results even if don’t use complete set
• Can work even for non-walk-summable models
• Experiments indicate can get very good results with log(n) feedback nodes
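The three-step structure can be sketched as block elimination on J x = h, with nodes split into a feedback set F and the remainder T. This is a sketch under assumptions: the 4-node loop, the choice F = {0}, and the function names are hypothetical, and a dense solver replaces the BP passes on the remaining graph, so the code illustrates the algebra of the three steps rather than the message-passing implementation.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (dense stand-in for BP on a tree)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def feedback_solve(J, h, feedback):
    n = len(h)
    rest = [i for i in range(n) if i not in feedback]
    J_T = [[J[i][j] for j in rest] for i in rest]
    # Step 1: inference on the remaining graph, once for h and once per feedback node.
    y0 = solve(J_T, [h[i] for i in rest])
    Y = [solve(J_T, [J[i][f] for i in rest]) for f in feedback]
    # Step 2: small reduced problem on the feedback nodes only.
    J_hat = [[J[f][g] - sum(J[f][j] * Y[b][a] for a, j in enumerate(rest))
              for b, g in enumerate(feedback)] for f in feedback]
    h_hat = [h[f] - sum(J[f][j] * y0[a] for a, j in enumerate(rest))
             for f in feedback]
    x_F = solve(J_hat, h_hat)
    # Step 3: correct the remaining nodes using the feedback-node estimates.
    x_T = [y0[a] - sum(Y[b][a] * x_F[b] for b in range(len(feedback)))
           for a in range(len(rest))]
    x = [0.0] * n
    for b, f in enumerate(feedback):
        x[f] = x_F[b]
    for a, j in enumerate(rest):
        x[j] = x_T[a]
    return x

# Hypothetical 4-node loop 0-1-2-3-0; removing node 0 leaves the chain 1-2-3.
J = [[1.0, -0.3, 0.0, -0.3],
     [-0.3, 1.0, -0.3, 0.0],
     [0.0, -0.3, 1.0, -0.3],
     [-0.3, 0.0, -0.3, 1.0]]
h = [1.0, 0.0, 2.0, -1.0]
x_feedback = feedback_solve(J, h, feedback=[0])
x_exact = solve(J, h)
print(x_feedback, x_exact)
```

Step 1 requires one solve per feedback node plus one for h, which is why a small feedback set (the slide mentions log(n) nodes working well in experiments) keeps the cost manageable.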

Nonparametric Inference for General Graphs

• Belief Propagation: general graphs; discrete or Gaussian
• Particle Filters: Markov chains; general potentials
• Nonparametric BP: general graphs; general potentials

Problem: What is the product of two collections of particles?

Graphical Models Example #4: Dynamic fusion in complex, constrained contexts

Multisensor Data Association in Sensor Networks

• Organized network data association
• Self-organization with region-based representation

Hierarchical Dirichlet Processes and Graphical Models: From Scene/context to objects to parts/shape to features

speaker label

speaker state

observations

Speaker-specific transition densities

Speaker-specific mixture weights

Mixture parameters

Speaker-specific emission distribution – infinite Gaussian mixture

Emission distribution conditioned on speaker state

[Plot: normalized Hamming error vs. Gibbs iteration]

[Plot: speaker label vs. time at Gibbs iteration 100; ground truth and estimated]

Unsupervised extraction of structure in dynamic processes, signals, and images

Hierarchical Dirichlet Processes for Object Recognition and Extraction of Switching Dynamic Behavior

Geometry Extraction #1: Curve evolution methods for “blind” segmentation

MCMC-Curve evolution methods for aided gravity inversion

[Figure panels: top salt constraint; with additional constraint]

Principal Modes of Shape Uncertainty

Some other things

• Learning graphical models
  o Error exponents for learning tree models
  o Learning discriminative tree models
  o Learning tree models with hidden nodes
  o Applications to computer vision
  o Learning models with hidden variables that expose sparse conditional structure for the observed variables
• More nonparametrics
  o Learning hidden semi-Markov models
  o Identifying more complex hidden structures
  o Can we learn that the motion of 11 “objects” corresponds to two basketball teams and a basketball, AND can we learn the difference between offense and defense…
• Exploiting sparsity
  o Sparse reconstruction with uncertain forward operators