Robust Activity Recognition Henry Kautz University of Washington Computer Science & Engineering...

Robust Activity Recognition

Henry Kautz

University of Washington, Computer Science & Engineering

Graduate students: Don Patterson, Lin Liao, Krzysztof Gajos, Karthik Gopalratnam

CSE faculty: Dieter Fox, Gaetano Borriello

UW School of Medicine: Kurt Johnson, Pat Brown, Brian Dudgeon, Mark Harniss

Intel Research: Matthai Philipose, Mike Perkowitz, Ken Fishkin, Tanzeem Choudhury

In the Not Too Distant Future...


Pervasive sensing infrastructure
  GPS enabled phones
  RFID tags on all consumer products
  Electronic diaries (MS SenseCam)

Healthcare crisis
  Aging baby boomers – epidemic of Alzheimer’s Disease
  Deinstitutionalization of the cognitively disabled
  Nationwide shortage of caretaking professionals

...An Opportunity

Develop technology to
  Support independent living by people with cognitive disabilities
    At home
    At work
    Throughout the community
  Improve health care
    Long term monitoring of activities of daily living (ADL’s)
    Intervention before a health crisis

The UW Assisted Cognition Project

Synthesis of work in
  Ubiquitous computing
  Artificial intelligence
  Human-computer interaction

ACCESS
  Support use of public transit
  UW CSE & Rehabilitation Medicine

CARE
  ADL monitoring and assistance
  UW CSE & Intel Research

This Talk

Building models of everyday plans and goals
  From sensor data
  By mining textual description
  By engineering commonsense knowledge

Tracking and predicting a user’s behavior
  Noisy and incomplete sensor data
  Recognizing user errors

First steps

ACCESS: Assisted Cognition in Community, Employment, & Support Settings

Supported by the National Institute on Disability & Rehabilitation Research (NIDRR)


Learning & Reasoning About Transportation Routines

Task

Given a data stream from a wearable GPS unit...
  Infer the user’s location and mode of transportation (foot, car, bus, bike, ...)
  Predict where user will go
  Detect novel behavior
    User errors?
    Opportunities for learning?

Why Inference Is Not Trivial

People don’t have wheels
  Systematic GPS error

We are not in the woods
  Dead and semi-dead zones
  Lots of multi-path propagation
  Inside of vehicles
  Inside of buildings

Not just location tracking
  Mode, Prediction, Novelty

GPS Receivers We Used

Nokia 6600 Java Cell Phone with Bluetooth GPS unit

GeoStats wearable GPS logger

Geographic Information Systems

Bus routes and bus stops
  Data source: Metro GIS

Street map
  Data source: Census 2000 TIGER/Line data

Architecture

[Diagram labels: Learning Engine, Inference Engine, GIS Database; Goals, Paths, Modes, Errors]

Probabilistic Reasoning

Graphical model: Dynamic Bayesian network

Inference engine: Rao-Blackwellised particle filters

Learning engine: Expectation-Maximization (EM) algorithm

Flat Model: State Space

Transportation Mode
Velocity
Location
  Block
  Position along block
  At bus stop, parking lot, ...?
GPS Offset Error
GPS signal

Motion Model for Mode of Transportation

Rao-Blackwellised Particle Filtering

Inference: estimate current state distribution given all past readings

Particle filtering
  Evolve approximation to state distribution using samples (particles)
  Supports multi-modal distributions
  Supports discrete variables (e.g., mode)

Rao-Blackwellisation
  Particles include distributions over variables, not just single samples
  Improved accuracy with fewer particles
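The particle-filtering idea above can be illustrated with a minimal bootstrap particle filter over a one-dimensional position and a discrete transportation mode. This is only a sketch: the mode speeds, noise levels, and mode-switch probability are made-up assumptions, and it is a plain particle filter, not the Rao-Blackwellised filter over the street graph described in the talk.

```python
import math
import random

# Hypothetical mean speeds (m/s) per mode; illustrative values only.
MODE_SPEED = {"foot": 1.5, "bus": 8.0, "car": 12.0}
MODES = list(MODE_SPEED)


def step(particle, dt, rng, switch_prob=0.05):
    """Sample a successor (position, mode) from a toy motion model."""
    pos, mode = particle
    if rng.random() < switch_prob:
        mode = rng.choice(MODES)  # occasionally resample the discrete mode
    speed = max(0.0, rng.gauss(MODE_SPEED[mode], 1.0))
    return (pos + speed * dt, mode)


def likelihood(particle, gps_pos, sigma=5.0):
    """Unnormalised Gaussian likelihood of a GPS reading given a particle."""
    err = gps_pos - particle[0]
    return math.exp(-0.5 * (err / sigma) ** 2)


def particle_filter(readings, n=500, dt=2.0, seed=0):
    """Return the estimated distribution over modes after all readings."""
    rng = random.Random(seed)
    particles = [(readings[0], rng.choice(MODES)) for _ in range(n)]
    for z in readings[1:]:
        # Predict: push every particle through the motion model.
        particles = [step(p, dt, rng) for p in particles]
        # Weight: score each particle against the new reading.
        weights = [likelihood(p, z) for p in particles]
        if sum(weights) == 0.0:
            weights = [1.0] * n  # fall back to uniform if all weights underflow
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)
    counts = {m: 0 for m in MODES}
    for _, mode in particles:
        counts[mode] += 1
    return {m: c / n for m, c in counts.items()}


# Synthetic trace: 16 m every 2 s, i.e. the assumed bus speed.
readings = [8.0 * 2.0 * k for k in range(30)]
posterior = particle_filter(readings)
```

Because each particle carries a discrete mode alongside a continuous position, the same sample set represents both kinds of variable, which is the property the slide highlights.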

Tracking

blue = foot, green = bus, red = car

Learning

User model = DBN parameters
  Transitions between blocks
  Transitions between modes

Learning: Monte-Carlo EM
  Unlabeled data
  30 days of one user, logged at 2 second intervals (when outdoors)
  3-fold cross validation

Results

Model                         Mode Prediction Accuracy
Decision Tree (supervised)    55%
Prior w/o bus info            60%
Prior with bus info           78%
Learned                       84%

Prediction Accuracy

[Figure: probability of correctly predicting the future (y-axis) versus city blocks (x-axis)]

How can we improve predictive power?

Transportation Routines

[Figure labels: A, B, Work]

Goals
  work, home, friends, restaurant, doctor’s, ...

Trip segments
  Home to Bus stop A on Foot
  Bus stop A to Bus stop B on Bus
  Bus stop B to workplace on Foot

Hierarchical Model

[Diagram: two-slice DBN, with one row of nodes per level]
  g_{k-1}, g_k    Goal
  t_{k-1}, t_k    Trip segment
  m_{k-1}, m_k    Transportation mode
  x_{k-1}, x_k    x = <Location, Velocity>
  z_{k-1}, z_k    GPS reading

Hierarchical Learning

Learn flat model

Infer goals
  Locations where user is often motionless

Infer trip segment begin / end points
  Locations with high mode transition probability

Infer trip segments
  High-probability single-mode block transition sequences between segment begin / end points

Perform hierarchical EM learning
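The goal-inference step ("locations where user is often motionless") can be sketched as a simple dwell-time count. This is an illustrative simplification, not the talk's procedure, which works from the learned flat model: the input format of (block, speed) samples, the function name `candidate_goals`, and both thresholds are assumptions made for the example.

```python
from collections import Counter


def candidate_goals(samples, dt=2.0, speed_eps=0.2, min_dwell=600.0):
    """Return blocks where the user was (nearly) motionless for a long time.

    samples   -- iterable of (block_id, speed_m_per_s), one per GPS fix
    dt        -- seconds between fixes (the talk logs at 2 s intervals)
    speed_eps -- hypothetical speed below which the user counts as motionless
    min_dwell -- hypothetical total motionless time (s) to qualify as a goal
    """
    dwell = Counter()
    for block, speed in samples:
        if speed < speed_eps:
            dwell[block] += dt
    return sorted(block for block, total in dwell.items() if total >= min_dwell)


# Toy trace: long stays at home and work, a short wait at a bus stop.
trace = (
    [("home", 0.0)] * 400
    + [("street", 1.4)] * 100
    + [("stop", 0.0)] * 50
    + [("work", 0.05)] * 350
)
goals = candidate_goals(trace)
```

In this toy trace the short wait at the bus stop falls below the dwell threshold, so it is not reported as a goal; in the talk's pipeline such places are instead picked up as trip segment begin / end points through their mode transition probability.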

Inferring Goals

Inferring Trip Segments

[Figure panels: Going to work; Going home]

Correct goal and route predicted 100 blocks away

Application: Opportunity Knocks

Demonstrated at AAHA Future of Aging Services, Washington, DC, March, 2004

Novelty Detection

Approach: model-selection

Run two trackers in parallel
  Tracker 1: learned hierarchical model
  Tracker 2: untrained flat model

Estimate the likelihood of each tracker given the observations
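The model-selection idea above can be sketched as a comparison of the two trackers' likelihoods over a sliding window. This is a minimal sketch under stated assumptions: each tracker is assumed to emit a per-observation log-likelihood, and the function names, the window length, the equal prior, and the threshold are all hypothetical choices, not values from the talk.

```python
import math


def posterior_learned(loglik_learned, loglik_flat, prior_learned=0.5):
    """Posterior probability that the learned model explains the data."""
    a = math.log(prior_learned) + loglik_learned
    b = math.log(1.0 - prior_learned) + loglik_flat
    m = max(a, b)  # subtract the max for numerical stability
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))


def detect_novelty(ll_learned, ll_flat, window=5, threshold=0.1):
    """Return the first index where the learned model looks implausible.

    ll_learned, ll_flat -- per-observation log-likelihoods from each tracker
    Returns None if the learned model stays plausible throughout.
    """
    for k in range(len(ll_learned)):
        lo = max(0, k - window + 1)
        learned = sum(ll_learned[lo:k + 1])
        flat = sum(ll_flat[lo:k + 1])
        if posterior_learned(learned, flat) < threshold:
            return k
    return None


# Toy data: the learned model fits well for 10 steps, then badly.
ll_learned = [-1.0] * 10 + [-6.0] * 5
ll_flat = [-3.0] * 15
first_novel = detect_novelty(ll_learned, ll_flat)
```

The untrained flat model acts as a baseline: it explains routine behavior less well than the learned model, but it does not degrade when the user departs from routine, so the comparison flips when behavior is novel.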

Missing the bus stop

Novelty Detection

CARE: Cognitive Assistance in Real-world Environments

Supported by the Intel Research Council


Learning & Inferring Activities of Daily Living

Research Hypothesis

Observation: activities of daily living involve the manipulation of many physical objects

Cooking, cleaning, eating, personal hygiene, exercise, hobbies, ...

Hypothesis: can recognize activities from a time-sequence of object “touches”

Such models are robust and easily learned or engineered

Sensing Object Manipulation

RFID: Radio-frequency ID tags
  Small
  Semi-passive
  Durable
  Cheap

Where Can We Put Tags?

How Can We Sense Them?

coming... wall-mounted “sparkle reader”

Example Data Stream

Technical Approach

Define (or learn) activities in simple, high-level language
  Multi-step, partially-ordered activities
  Varying durations
  Probabilistic association between activities and objects

Compile to a DBN

Infer behavior using particle filtering

Making Tea

Activity Library

Building Models

Core ADL’s amenable to classic knowledge engineering

Open-ended, fine-grained models: infer from natural language texts?

Perkowitz et al., “Mining Models of Human Activities from the Web”, WWW-2004

Translation to DBN

Tricky issues:
  Time
  Partial orders
  Object-use probabilities
    80% chance of using the teapot sometime during the “heat water” step
    Instantaneous probability of seeing teapot is not fixed!

Consider: 100% chance of using teapot if making tea
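One way to see why the instantaneous probability cannot be fixed is a short calculation. The sketch below assumes discrete time steps and independent touches at a constant per-step probability, which is an illustrative assumption rather than the talk's encoding; the function names are hypothetical.

```python
def prob_touch_at_least_once(p_step, n_steps):
    """Chance of at least one touch in n_steps at constant per-step rate."""
    return 1.0 - (1.0 - p_step) ** n_steps


def per_step_prob(target, n_steps):
    """Constant per-step rate needed to reach `target` over n_steps."""
    return 1.0 - (1.0 - target) ** (1.0 / n_steps)


# With a fixed per-step probability, the overall chance depends on duration:
short_step = prob_touch_at_least_once(0.1, 5)    # about 0.41
long_step = prob_touch_at_least_once(0.1, 30)    # about 0.96

# Hitting the 80% target requires a rate that changes with the duration:
rate_for_10 = per_step_prob(0.8, 10)
rate_for_60 = per_step_prob(0.8, 60)

# A 100% target would force a touch at every single time step:
rate_for_certain = per_step_prob(1.0, 10)        # 1.0
```

Under this assumption a single constant cannot deliver "80% sometime during the step" for steps of varying duration, and the 100% case degenerates entirely, which suggests the touch probability has to depend on the step's duration and on what has already been observed.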

DBN Encoding: Duration

[Diagram: DBN fragment over nodes D_t, A_t, A_{t+1}, D_{t+1}]

DBN Encoding: Partial Orders

[Diagram: DBN fragment over nodes P_t, A_t, A_{t+1}]

DBN Encoding: Object Probabilities

[Diagram: DBN fragment over nodes z_t, D_t, A_t, O_t, H_t]

Instantaneous probability of touching an object cannot be a constant

DBN Encoding

[Diagram: full DBN over nodes z_t, P_t, D_t, A_t, O_t, H_t, A_{t+1}, D_{t+1}, H_{t+1}]

What’s in a Particle?

Sample of Activity
Starting time – sufficient to represent distribution of Duration
History list of objects
Partial-order “credits”
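The particle contents listed above map naturally onto a small record type. The sketch below is one possible layout, not the authors' implementation: the class and field names are hypothetical, and representing the partial-order "credits" as a set of completed step names is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Particle:
    activity: str                                  # sampled current activity
    start_time: int                                # when the activity began
    history: list = field(default_factory=list)    # objects touched so far
    credits: set = field(default_factory=set)      # assumed: completed steps

    def duration(self, now):
        """Elapsed time; derivable from start_time, so it is not stored."""
        return now - self.start_time

    def observe(self, obj):
        """Record an object touch in the history list."""
        self.history.append(obj)


p = Particle(activity="heat water", start_time=10)
p.observe("teapot")
elapsed = p.duration(now=25)
```

Storing the starting time rather than the duration itself mirrors the slide's remark that the starting time is sufficient to represent the duration.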

Experimental Setup

Hand-built library of 14 ADL’s
17 test subjects
Each asked to perform 12 of the ADL’s
Data not segmented
No training on individual test subjects

Sample Output

Results

 #   ADL                  Precision (%)   Recall (%)
 1   Grooming                  92             92
 2   Tooth brushing            70             78
 3   Toileting                 73             73
 4   Dishwashing              100             33
 5   Housecleaning            100             75
 6   Appliance use             84             78
 7   Adjust furnace           100             73
 8   Laundry                  100             78
 9   Prepare snack             75             60
10   Prepare beverage          64             64
11   Use telephone            100             79
12   Leisure activities       100             58
13   Infant care              100             93
14   Take medication          100             82
     Overall                   88             73

Key Next Steps

Parameter learning
  Timing
  Object probabilities

Structure learning
  New activities from sensor data

Efficient inference for
  Interrupted activities
  Abandoned activities
  Malformed activities

Relational models
  Hierarchical classes of objects
  Hierarchical classes of activities

Ultimately...

Affective state
  agitated, calm, attentive, ...

Physiological states
  hungry, tired, dizzy, ...

Interactions between people
  T. Choudhury – Social dynamics

Principled human-computer interaction

Decision-theoretic control of interventions

Why Now?

A goal of much work in AI in the 1970’s was to create programs that could understand the narrative of ordinary human experience

This area pretty much disappeared
  Missing probabilistic tools
  Systems not able to experience world
  Lacked focus – “understand” to what end?

Today: the tools, the sensors, motivation

That Other Talk...

Combining Component Caching and Clause Learning for Effective Model Counting

Beame, Bacchus, Kautz, Pitassi, & Sang (SAT 2004, Vancouver BC)

Unifies algorithms for SAT and Bayesian inference

DPLL-based, generalizes recursive conditioning

Exact inference in large, non-tree-like networks

Need to solve #P? Let me know!
