
M.A.Sc. Thesis Presentation

Automated Reading Assistance System

Using Point-of-Gaze Estimation

Jeffrey J. Kang

Supervisor: Dr. Moshe Eizenman

Department of Electrical and Computer Engineering

Institute of Biomaterials and Biomedical Engineering

January 24, 2006

Introduction

Reading
• Visual examination of text
• Conversion of words to sounds to activate word recognition

We learn the appropriate conversions through repeated exposure to word-to-sound mappings

Insufficient reader skill or irregular spelling can cause the conversion to fail: assistance is required

Objective: Develop an automated reading assistance system that automatically vocalizes unknown words in real-time on the reader’s behalf. The system should operate within a natural reading setting.

What We Need To Do — Step 1

1. Identify the word being read, in real-time

2. Detect when the word being read is an unknown word

3. Vocalize the unknown word

Identifying the Word Being Read

Identify the viewed word using point-of-gaze estimation

Point-of-gaze is:

• Where we are looking with the highest-acuity region of the retina

• The intersection of the visual axes of the two eyes within the 3D scene

• The intersection of the visual axis of one eye with a 2D plane

Point-of-Gaze Estimation Methodologies

1. Head-mounted
2. Remote (no head-worn components)

Head-mounted Point-of-Gaze Estimation

Based on the principle of tracking the pupil centre and corneal reflections to measure eye position

Point-of-gaze is estimated with respect to a coordinate system attached to the head

[Figure: head-mounted eye-tracker — scene camera, eye camera, IR LEDs, hot mirror; eye image shows the corneal reflections and pupil centre]

Point-of-Gaze in Head Coordinate System

Point-of-gaze is measured in the head coordinate system, and placed on the scene camera image

Locating the Reading Object

The position of the reading object is determined by tracking markers

Mapping the Point-of-Gaze

Establish point correspondences from
• the estimated positions of the markers in the scene image
• the known positions of the markers on the reading object

Homographic mapping of point-of-gaze from the scene camera image to the reading object coordinate system
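The mapping step above can be sketched as follows. This is not the thesis's implementation, just a minimal direct linear transform (DLT) estimate of the homography from four or more marker correspondences, applied to the point-of-gaze:

```python
import numpy as np

def homography(src, dst):
    """Estimate H from 4+ corresponding (x, y) marker positions via DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H (up to scale) is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def map_point(H, p):
    """Map point p = (x, y) through H, dividing out the projective scale."""
    q = H @ np.array([p[0], p[1], 1.0])
    return (q[0] / q[2], q[1] / q[2])
```

With the markers' image positions as `src` and their known positions on the reading object as `dst`, `map_point(H, gaze_xy)` gives the point-of-gaze in reading-object coordinates.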

Identify the Reading Object

Extract the barcode from the scene camera image to identify the reading object (e.g. page number)

Match barcode to database of reading objects to determine what text is being read

Identifying the Word Being Read

Using the mapped point-of-gaze, identify the word being read by table lookup
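The table lookup might look like the following sketch, where the layout of word bounding boxes (in reading-object coordinates) is hypothetical and would come from the database entry matched by the barcode:

```python
def word_at(gaze_xy, layout):
    """Return the word whose bounding box contains the mapped point-of-gaze.

    layout: dict mapping word -> (x0, y0, x1, y1) box on the reading object.
    """
    x, y = gaze_xy
    for word, (x0, y0, x1, y1) in layout.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return word
    return None  # gaze fell between words
```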

Sample Reading Video

Mapping Accuracy

[Figure: distribution of mapping error — x-axis: Mapping Error (mm), 0 to 6; y-axis: Proportion of Trials, 0 to 0.4]

Point-of-Gaze Estimation Methodologies

1. Head-mounted
2. Remote (no head-worn components)

[Figure: remote estimation geometry — the visual axis from the centre of corneal curvature C intersects the 2D scene object at the point-of-gaze P; object coordinate system O, X, Y, Z]

Remote Point-of-Gaze Estimation

Point-of-gaze is estimated with respect to a fixed coordinate system

• C – centre of corneal curvature

• P – point-of-gaze

[Figure: remote system — IR LEDs and eye camera at a computer screen; the visual axis from C gives point-of-gaze P on the assumed position of the 2D scene object, but P' on its true position]

Moving Reading Card

How can point-of-gaze be estimated to a coordinate system attached to a moving reading object?


Estimate Motion

[Figure: the 2D scene object moves between t0 and t1 by rotation R and translation T]

Use a Scene Camera and Targets

[Figure: a scene camera views the object; at t0, homography H0 relates the object plane to the scene camera image]

Calculate Two Homographies

[Figure: at t1, a second homography H1 relates the object plane to the scene camera image]

Decompose Homography Matrices

[Figure: decomposing H0 and H1 yields the object poses R0, T0 and R1, T1 relative to the scene camera]

Calculate Motion of 2D Scene Object

[Figure: the object motion R, T between t0 and t1 is computed from the poses R0, T0 and R1, T1]
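The final composition step can be sketched as follows, assuming R0, T0 and R1, T1 are the object's poses in the scene-camera frame at t0 and t1 (camera point = R_i · object point + T_i); the function name is illustrative:

```python
import numpy as np

def object_motion(R0, T0, R1, T1):
    """Motion of the scene object between t0 and t1, in the camera frame.

    If x1 = R1 X + T1 and x0 = R0 X + T0 for an object point X, then
    x1 = R x0 + T with R = R1 R0^T and T = T1 - R T0.
    """
    R = R1 @ R0.T
    T = T1 - R @ T0
    return R, T
```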

Point-of-Gaze Accuracy

[Figure: error in point-of-gaze (mm), 0 to 16, vs. image noise standard deviation (pixels), 0 to 1.1]

What We Need To Do: Step 2

1. Identify the word being read, in real-time

2. Detect when the word being read is an unknown word

3. Vocalize the unknown word

Dual Route Reading Model

[Figure: dual route model (Coltheart, M. et al., 2001) — text → Orthographic Analysis; lexical route: Orthographic Input Lexicon → Semantic System → Phonological Output Lexicon; non-lexical route: Grapheme-Phoneme Rule System; both routes feed the Response Buffer → speech]

Lexical route: each word's graphemes are processed in parallel


Non-lexical route: each word's graphemes are individually converted into phonemes based on mapping rules

Detecting Unknown Words

For unknown words, the lexical route fails and the slower non-lexical route is used

Hypothesis: we can differentiate between known and unknown words by processing time

Processing Time

[Figure: gaze duration (Subject P.L., aloud reading) vs. word length (letters), for normal and difficult words]

Setting a Threshold Curve

[Figure: the same gaze-duration data with a threshold curve separating normal from difficult words]

The threshold curve is a function of word length

Model the processing time for known words of length k as a Gaussian random variable N(μk, σk²)

Estimate μk, σk² from a short training set for each subject

Each point on the threshold curve is given by

Tk = μk + √2 σk erf⁻¹(1 − 2α)

• α is the constrained probability of false alarm
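The per-length threshold, the (1 − α) quantile of the fitted Gaussian, can be sketched with the Python standard library (the function name is illustrative, not from the thesis):

```python
from statistics import NormalDist, mean, stdev

def threshold(known_durations, alpha=0.10):
    """Threshold T_k for one word length: P(duration > T_k | known word) = alpha.

    The (1 - alpha) quantile of N(mu_k, sigma_k^2) equals
    mu_k + sqrt(2) * sigma_k * erfinv(1 - 2*alpha).
    """
    mu, sigma = mean(known_durations), stdev(known_durations)
    return NormalDist(mu, sigma).inv_cdf(1 - alpha)
```

A word whose measured processing time exceeds the threshold for its length would be flagged as unknown.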

Setting the Threshold

Experiment: Detecting Unknown Words

Remote point-of-gaze estimation system
• Reading material presented on a computer screen
• Head position stabilized using a chinrest

Four subjects read from 40 passages of text
• 20 passages aloud and 20 passages silently
• Divided into a training set, used to "learn" μk, σk² and set the detection threshold curves, and a test set

Set false alarm probability α = 0.10
Evaluate detection performance
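Evaluating detection performance amounts to computing a detection rate over the truly unknown words and a false alarm rate over the known ones; a hypothetical sketch:

```python
def rates(is_unknown, flagged):
    """Detection and false alarm rates for a list of read words.

    is_unknown: True where the word was actually unknown to the reader.
    flagged:    True where the detector flagged the word.
    """
    hits = sum(1 for u, f in zip(is_unknown, flagged) if u and f)
    false_alarms = sum(1 for u, f in zip(is_unknown, flagged) if not u and f)
    detection_rate = hits / sum(1 for u in is_unknown if u)
    false_alarm_rate = false_alarms / sum(1 for u in is_unknown if not u)
    return detection_rate, false_alarm_rate
```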

Experiment: Detecting Unknown Words

[Figure: detection performance — detection rate and false alarm rate (0 to 1) for Subjects 1-4, on the training and test sets]

Experiment: Natural-Setting Reading Assistance

Natural reading pose
• Unrestricted head movement
• Reading material is hand-held

Head-mounted eye-tracker
• Identify the viewed word in real-time
• Measure per-word processing time

Detecting unknown words
• Processing-time threshold curves established in the previous experiment

Assistance
• Detection of an unknown word activates vocalization

Experiment: Natural-Setting Reading Assistance

Results

The point-of-gaze mapping method accommodated head and reading-material movement without reducing detection performance

Subject   Detection Rate   False Alarm Rate
M.E.      0.94             0.10
P.L.      0.95             0.09

Conclusions

Developed methods to map point-of-gaze estimates to an object coordinate system attached to a moving 2D scene object (e.g. reading card)
• Head-mounted system
• Remote system

Developed method to detect when a reader encounters an unknown word

Demonstrated principle of operation for an automated reading assistance system

Future Work

Implement reading assistant using remote-gaze estimation methodology

Validate the efficacy of the system as a teaching tool for unskilled English readers, in collaboration with an audiologist

Evaluate other forms of assistive intervention

• e.g. translation, definition

Questions?