Speech and Audio Research Laboratory of the SAIVT program

Centre for Built Environment and Engineering Research

MOBILITY ENHANCEMENT USING

SIMULATED ARTIFICIAL HUMAN VISION

Jason Dowling

BAppSc/BComp(Hons)

SUBMITTED AS A REQUIREMENT OF

THE DEGREE OF

DOCTOR OF PHILOSOPHY

AT

QUEENSLAND UNIVERSITY OF TECHNOLOGY

BRISBANE, QUEENSLAND

29 MAY 2007

Keywords

Artificial Human Vision, visual prosthesis, blind mobility, image processing, computer vision, mobility assessment, visual simulation, Human Computer Interface

Abstract

The electrical stimulation of appropriate components of the human visual system can result in the perception of blobs of light (or phosphenes) in totally blind patients. By stimulating an array of closely aligned electrodes it is possible for a patient to perceive very low-resolution images from spatially aligned phosphenes. Using this approach, a number of international research groups are working toward developing multiple electrode systems (called Artificial Human Vision (AHV) systems or visual prostheses) to provide a phosphene-based substitute for normal human vision.

Despite this great promise, current AHV systems face a number of constraints. These include limitations in the number of electrodes which can be implanted and in the perceived spatial layout and display frequency of phosphenes. The development of computer vision techniques that maximise the visualisation value of the limited number of phosphenes would therefore be useful in compensating for these constraints. The lack of an objective method for comparing different AHV system displays, and for comparing AHV systems with other blind mobility aids (such as the long cane), has also been a significant problem for AHV researchers. Finally, AHV research in Australia and many other countries relies strongly on theoretical models and animal experimentation due to the difficulty of prototype human trials. Because of this constraint the experiments conducted in this thesis were limited to simulated AHV devices with normally sighted research participants, and the true impact on blind people can only be regarded as approximated.

In light of these constraints, this thesis has two general aims. The first aim is to investigate, evaluate and develop effective techniques for mobility assessment which will allow the objective comparison of different AHV system phosphene presentation methods. The second aim is to develop a useful display framework to guide the development of AHV information presentation, and use this framework to guide the development of an AHV simulation device.

The first research contribution resulting from this work is a conceptual framework based on literature reviews of blind and low vision mobility, AHV technology, and computer vision. This framework incorporates a comprehensive set of factors which affect the effectiveness of information presentation in an AHV system. Experiments reported in this thesis have investigated a number of these factors using simulated AHV with human participants. It has been found that higher spatial resolution is associated with more accurate walking (reduced veering), whereas higher display rate is associated with faster walking speeds. In this way it has been demonstrated that the conceptual framework supports and guides the development of an adaptive AHV system, with the dynamic adjustment of display properties in real-time.

The second research contribution addresses mobility assessment, which has been identified as an important issue in the AHV literature. This thesis presents the adaptation of a mobility assessment method from the blind and low vision literature to measure simulated AHV mobility performance using real-time computer based analysis. This method of mobility assessment (based on parameters for walking speed, obstacle contacts and veering) is demonstrated experimentally in two different indoor mobility courses. These experiments involved sixty-five participants wearing a head-mounted simulation device.
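As an illustration of the walking speed parameter, the sketch below computes the percentage preferred walking speed (PPWS) from the definitions given in the caption of Table 7.3: PWS is the benchmark distance divided by the benchmark time, SMC is read here as the course length divided by the course time (an interpretation of that caption), and PPWS is SMC/PWS multiplied by 100. The function name, argument names and example times are hypothetical; this is a minimal Python sketch and not the analysis software used in the thesis.

```python
def ppws(benchmark_s, course_s, benchmark_m=10.0, course_m=45.0):
    """Percentage preferred walking speed (PPWS).

    benchmark_s: time in seconds for the guided benchmark walk
    course_s:    time in seconds taken on the mobility course
    The default distances (10 m and 45 m) follow the Table 7.3 caption.
    """
    if benchmark_s <= 0 or course_s <= 0:
        raise ValueError("times must be positive")
    pws = benchmark_m / benchmark_s  # PWS: benchmark distance / benchmark time
    smc = course_m / course_s        # SMC: course length / course time
    return 100.0 * smc / pws


# Hypothetical participant: 8 s for the benchmark walk, 90 s on the course.
print(ppws(8.0, 90.0))
```

A PPWS of 100 would mean the participant walked the course at their benchmark pace; lower values indicate slower walking on the course.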

The final research contribution in this thesis is the development and evaluation of an original real-time looming obstacle detector, based on coarse optical flow, and implemented on a Windows PocketPC based Personal Digital Assistant (PDA) using a CF card camera. PDA based processors are a preferred main processing platform for AHV systems due to their small size, light weight and ease of software development. However, PDA devices are currently constrained by restricted random access memory, lack of a floating point unit and slow internal bus speeds. Therefore any real-time software needs to maximise the use of integer calculations and minimise memory usage. This contribution was significant as the resulting device provided a selection of experimental results and subjective opinions.
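The integer-only constraint can be illustrated with a small sketch of an expanding block search of the kind summarised in the caption of Figure 6.6: the same block position is checked first, then the 8 surrounding blocks, then the next 16 (25 positions in total). The sketch is written in Python for readability and is not the PocketPC code used for the thesis; the function names and the matching criterion (equal quantised grey level) are assumptions made for illustration only.

```python
def ring_offsets(radius):
    """Offsets (dy, dx) lying exactly `radius` blocks from the centre."""
    if radius == 0:
        return [(0, 0)]
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if max(abs(dy), abs(dx)) == radius]


def find_block(current, previous, row, col, max_radius=2):
    """Search `previous` for the block matching current[row][col].

    Both arrays hold small integer grey levels (one value per block).
    The search expands ring by ring (1, then 8, then 16 positions) and
    returns the first matching offset (dy, dx), or None if no match is
    found. Only integer comparisons and additions are used.
    """
    target = current[row][col]
    rows, cols = len(previous), len(previous[0])
    for radius in range(max_radius + 1):
        for dy, dx in ring_offsets(radius):
            r, c = row + dy, col + dx
            if 0 <= r < rows and 0 <= c < cols and previous[r][c] == target:
                return (dy, dx)
    return None


# Hypothetical 5x5 block arrays with one distinctive block (grey level 7).
previous = [[0] * 5 for _ in range(5)]
current = [[0] * 5 for _ in range(5)]
previous[1][3] = 7
current[2][2] = 7
print(find_block(current, previous, 2, 2))
```

Stopping at the first ring that contains a match keeps the common case (little or no motion) cheap, which suits a processor without a floating point unit.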

Contents

Keywords

Abstract

List of Tables

List of Figures

List of Abbreviations

Authorship

Acknowledgments

Publications

1 Introduction

1.1 Motivation and Overview

1.2 Aims and Research Questions

1.3 Research Scope

1.4 Thesis Organisation

1.5 Original Contributions of Thesis

2 Blind and Low Vision Mobility Issues and Assessment

2.1 Introduction

2.2 Blindness and Low Vision

2.3 Mobility and related issues identified for people with low vision and blindness

2.4 Primary mobility devices for the blind

2.4.1 Long Cane

2.4.2 Guide Dogs

2.4.3 Electronic Travel Aids

2.5 Orientation Aids

2.6 Environmental accessibility

2.7 The ecological approach to perception

2.8 Mobility assessment

2.8.1 Mobility assessment: Self report research

2.8.2 Mobility assessment: Field Experiment research

2.8.3 Mobility assessment: Artificial environment research

2.8.4 Mobility assessment: Combined Field experiment and artificial environment research

2.8.5 Mobility Assessment Conclusion

2.9 Chapter Summary

3 A Review of Artificial Human Vision

3.1 Introduction

3.2 Review of the Human Visual System

3.3 AHV technology and requirements

3.4 Cortical stimulation

3.4.1 Cortical surface stimulation

3.4.2 Intracortical stimulation

3.5 Retinal Stimulation

3.5.1 Subretinal stimulation

3.5.2 Other subretinal methods

3.5.3 Epiretinal stimulation

3.6 Optic Nerve devices

3.7 AHV simulation studies

3.8 Evaluation of current AHV systems

3.9 Chapter Summary

4 A Framework for Blind Mobility Improvement via Computer Vision

4.1 Introduction

4.1.1 An information processing approach to computer vision

4.2 Computer Vision

4.2.1 Information reduction

4.2.2 Scene understanding

4.3 Previous applications of computer vision to assist the vision impaired

4.4 Relationship between computer vision methods and the Human Vision System (HVS)

4.5 A conceptual framework for AHV system information display

4.5.1 Hypothesised operational scenario

4.5.2 Benefits of a conceptual framework for AHV information display

4.5.3 Application of the conceptual framework for previous AHV research

4.6 Chapter Summary

5 AHV Mobility Assessment using Static Images

5.1 Introduction

5.2 Method

5.2.1 Images selected

5.2.2 Assessing mobility information

5.2.3 Procedure

5.3 Results

5.4 Discussion

5.5 Chapter Summary

6 AHV Simulation and Obstacle Detection using a Personal Digital Assistant

6.1 Introduction

6.2 Method

6.2.1 Hardware

6.2.2 Obstacle avoidance

6.2.3 Block Based Obstacle Alert

6.2.4 AHV Simulation Implementation

6.2.5 Procedure

6.2.6 Statistical methods

6.3 Experimental results

6.4 Discussion

6.5 Chapter Summary

7 Mobility Assessment using a PDA-based AHV Simulation in a course environment

7.1 Introduction

7.2 Method

7.2.1 AHV Simulation Device

7.2.2 Assessment of mobility performance

7.2.3 Questionnaire

7.2.4 Participants

7.2.5 Procedure

7.3 Results

7.4 Discussion

7.5 Chapter Summary

8 Effects of Spatial and Temporal Resolution on Mobility Assessment

8.1 Introduction

8.2 Method

8.2.1 Simulation Hardware

8.2.2 Simulation Software

8.2.3 Mobility course

8.2.4 Participants

8.2.5 Questionnaire

8.2.6 Statistical methods

8.2.7 Procedure

8.3 Results

8.3.1 Phosphene spatial resolution

8.3.2 Frame Rate

8.3.3 Prior experience with immersive VR

8.3.4 Gender

8.3.5 Age

8.3.6 Corrected Vision

8.3.7 Learning effects

8.4 Discussion

8.4.1 Can specific main factors be identified as highly significant for providing mobility information in an AHV system?

8.4.2 Can objective measures be developed for the comparison of effectiveness between AHV systems in providing mobility information?

8.4.3 Can computer vision techniques be adopted and modified to provide mobility information in an AHV system?

8.4.4 Connections with Vision Research

8.5 Chapter Summary

9 Conclusion and Future Work

9.1 Conclusions

9.1.1 Can specific main factors be identified as highly significant for providing mobility information in an AHV system?

9.1.2 Can objective measures be developed for the comparison of effectiveness between AHV systems in providing mobility information?

9.1.3 Can computer vision techniques be adopted and modified to provide mobility information in an AHV system?

9.2 Future Work

9.2.1 Mobility experiments with AHV system recipients

9.2.2 Symbolic display

9.2.3 Real world mobility assessment environments

9.2.4 Integration of information from other sensors

9.2.5 Standard set of mobility related images

9.3 Final Remarks

Bibliography

A AHV project web sites

B Chapter 7 and 8 experiment materials

List of Tables

2.1 Nottingham Blind Mobility Unit dependent variable measures [6].

2.2 Revised Nottingham Mobility Unit measures from Dodds [51]. Shorelining refers to following a path or wall using tactile or auditory information.

2.3 Mobility measures used in Geruschat & de l’Aune [72]

2.4 Obstacle types used in Lovie-Kitchin et al. [137]

2.5 Mobility and daily activities assessment from West et al. [243]

2.6 Mobility measures used in Marron et al. [145]

2.7 Mobility incidents scored in Long et al. [135]

2.8 Summary of mobility assessment research discussed in this Chapter. ‘Time’ is the amount of time on the course, ‘obst.’ is a count of obstacle contacts, and ‘veer.’ is the number of incidents of veering from a path.

4.1 The feature set used in Everingham et al. [63]

4.2 Overview of computer vision functionality performed by each part of the HVS (Based on Thorpe [227]).

5.1 Mobility related image components identified for each image.

5.2 Image edge detection and line enhancement thresholds for each image. Note that the Canny sensitivity listed is the high threshold value. The low threshold value was set to 0.4 times the high threshold.

5.3 This table shows the ranges used to convert the original x and y coordinates (recorded for each question and image combination) into a simplified 5x5 element array. For example the x,y value (227,156) would be re-coded to (5,4). The simplified values were then compared against an array of ‘correct responses’ for each question type.

5.4 Steps in identifying correct/incorrect and identified/not identified grid responses.

5.5 Summary of response classification for each image type. Note that 203 responses of ‘Don’t know’ have been excluded from classification.

6.1 The number of frames in each image sequence, along with the duration of each captured sequence.

6.2 Postal box image sequence results.

6.3 Bus shelter image sequence results.

7.1 AHV simulation display types used for the pilot study

7.2 Questionnaire responses for each participant.

7.3 PPWS results for each trial for each participant. The Benchmark column is the time taken during the 10m guided walk. PWS is 10/Benchmark time. Course (s) is the number of seconds taken while walking through the 45m mobility course. SMC is 45/Course (s). PPWS is SMC/PWS multiplied by 100.

7.4 PPWS results for each task type and trial.

7.5 Mobility error results for each trial for each participant.

7.6 Mobility error results for each task type and trial.

7.7 Mobility error summary for each display type.

7.8 PPWS summary for each display type.

7.9 Effect sizes (η²) for the main mobility factors in this pilot study. ‘DV’ represents the dependent variable, ‘F’ is the F-test result and ‘Sig’ represents significance.

8.1 Gender and age groups of experiment participants.

8.2 Mean number of obstacle contacts (with standard deviations) for different resolution and frame rate.

8.3 Mean number of veering errors (with standard deviations) for different resolution and frame rate.

8.4 Mean benchmark speeds over 10m (with standard deviations) for resolution and frame rate. Benchmark no. 1 was recorded before the first mobility trial. Benchmark no. 2 was recorded after the second mobility trial. PWS is 10 divided by each Benchmark score. The combined PWS score in the table is the average PWS for the two benchmarks for each participant.

8.5 Mean scores (with standard deviations) for the amount of time spent walking through the mobility course during each trial, and for PPWS (calculated using combined PWS) during each trial.

9.1 Summary of the main scientific contributions of this thesis.

List of Figures

3.1 Horizontal diagram of the human eye. The locations for the epi- and sub-retinal implants and the optic nerve electrode are shown. Adapted from Gregory [83].

3.2 A simplified diagrammatic representation of the cellular layers of the retina. Light passes through the outer layers of the retina before being absorbed by the rods and cones of the photoreceptor layer. Adapted from Sharp and Phillips [199].

3.3 The cortical lobes of the human brain. The primary visual cortex, which is the site for cortical electrode array implants, is also shown. Adapted from Wandell [239].

3.4 Diagram of the main pathways in the HVS. Adapted from Bruce et al. [24].

4.1 Example of reduced visual information in an AHV system: Image (a) shows a street scene image in suburban Brisbane; in image (b) the resolution of this image has been reduced to 25x25 pixels. Image (c) shows a simulated 25x25 phosphene display of the same image. A sample symbolic representation of the mobility hazards contained in the street scene is shown in image (d).

4.2 The 3x3 Laplacian kernel for high pass filtering (for image sharpening).

4.3 An example 3x3 Gaussian low pass filter kernel for image smoothing. Note the centre element has the greatest weight (0.2042) compared to the others.

4.4 An example of high and low pass filtering on an image. A grey scale post box image is shown in image (a). Image (b) shows the image after it has been filtered using the 3x3 Laplacian high pass filter (detailed in Figure 4.2). Image (c) shows the result of applying the Gaussian low pass filter from Figure 4.3. This image has been taken from an image sequence captured using a low quality PDA card camera (this sequence and camera are described in more detail in Chapter 6). As the camera was moving at the time of capture there is a significant amount of motion blur. It is anticipated that image quality from cameras used for AHV systems will improve as technology advances.

4.5 An example of contrast enhancement by histogram expansion. The base image (a) shows a Brisbane suburban bus shelter. (b) shows the distribution of the 256 grey-scale values in image (a). The contrast in image (c) has been enhanced using histogram equalisation. The histogram of image (c) is shown in (d).

4.6 The 3x3 kernel used for Sobel horizontal edge detection.

4.7 The 3x3 kernel used for Sobel vertical edge detection.

4.8 Sobel edge detection applied to captured image of a post box (a). Image (b) shows the result of the horizontal Sobel edge kernel. The output from the vertical Sobel edge kernel is shown in image (c).

4.9 A comparison of different edge detection methods applied to an image of suburban footpath (a). The output from the Canny detector is shown in (b). The Sobel detector is shown in (c), and the Roberts edge detector is displayed in (d).

4.10 Example application of the Hough transform for locating the fence boundary shown in image (a). Image (b) shows the output from Sobel edge detection. The corresponding Hough transform output is shown in image (c), with the origin in the top left hand side of the image. This transform image was generated using software from Seul et al. [198]. The horizontal axis represents r, and the vertical axis represents θ, which increases from 0 radians in the top left corner to π radians at the bottom. The dominant peak, indicating the dominant line, is shown with a superimposed box. Image (d) shows the pixels which are present along the dominant line found by the Hough transform.

4.11 Factors which influence the display processing for an Artificial Human Vision system.

4.12 Conceptual framework applied to Cha et al.’s simulated AHV mobility experiment. Factors which are not included in this study are marked with a line pattern.

4.13 Conceptual framework applied to Long et al.’s low vision mobility experiment. Factors which are not included in this study are marked with a line pattern.

5.1 Conceptual framework diagram showing factors which influence simulated AHV display effectiveness in this chapter. Factors from Chapter 4 which are excluded from this chapter are marked with a line pattern.

5.2 Original mobility images used in this chapter. A brief description of each image is given in Table 5.1.

5.3 Example Matlab code used for generating images. This example creates the output from Sobel edge detection for image A (Child on street).

5.4 Flowchart showing the image processing steps applied for each of the four image types used in this Chapter.

5.5 Image processing applied to image A (Child on street). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.6 Image processing applied to image B (Path near road). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.7 Image processing applied to image C (Person in office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

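5.8 Image processing applied to image D (Person in bathroom). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.9 Image processing applied to image E (Sparse office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.10 Image processing applied to image F (Street scene with tree). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.11 Image processing applied to image G (Phone booth obstacle). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).

5.12 Image processing applied to image H (Railway platform). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally the Sobel edge detection output is shown in image (e).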

5.13 A sample screen from the static image experiment. The x and y values on the right hand side of the screen show which part of the image received a mouse click for each question. If the participant selected a response of ‘5=Definitely No’ for a question, x and y were set to -1 by default.

5.14 Summary of question responses for each of the 32 images presented.

5.15 Summary of responses for each image processing method used in this experiment.

5.16 Results for questions 1-4 for each processing method on image 1 (Child on street).

5.17 Results for questions 1-4 for each processing method on image 2 (Path near road).

5.18 Results for questions 1-4 for each processing method on image 3 (Person in office).

5.19 Results for questions 1-4 for each processing method on image 4 (Person in bathroom).

5.20 Results for questions 1-4 for each processing method on image 5 (Sparse office).

5.21 Results for questions 1-4 for each processing method on image 6 (Street scene with tree).

5.22 Results for questions 1-4 for each processing method on image 7 (Phone booth obstacle).

5.23 Results for questions 1-4 for each processing method on image 8 (Railway platform).

5.24 Question 5 (‘next step’) result summary for each type of image.

6.1 Front and side views of the AHV simulator used in the present study.

6.2 Grid showing the 7x10 pixel blocks used from each 120x160 pixel image for the PDA motion estimation described below.

6.3 Motion vectors calculated by the PDA. The origin of each motion vector is the centre of each of the grid blocks in Figure 6.2. In certain directions the arrow heads look like white blobs.

6.4 Processing steps for the PDA block-based AHV simulation

6.5 The first five steps of the block based approach are illustrated in these images of a suburban footpath. The number of grey-levels in the base image (a) was first reduced to 8 grey-levels (b), before a median filter was applied (c). Finally the image was spatially reduced from 160x120 pixels to 32x24 blocks (d).

6.6 The maximum search area used in Step 7. Each block from the current image block array (shown on the left) is compared against the previous image block array (shown on the right). Initially only the matching block position is compared. The search then checks the 8 blocks surrounding this position. Finally if a matching block has not been identified the surrounding 16 blocks are searched (giving a total search area of 25 blocks).

6.7 An example block based alert, shown in (d), which has been triggered in response to looming branches in front of a head-mounted camera.

6.8 Frames 10, 70, 130 and 190 from the postal box mid morning sequence (on left) and the bus stop early afternoon sequence (on right).

6.9 Postal Box recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.

6.10 Bus shelter recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.

6.11 An example incorrect alert warning. The shadow shown in the original median filtered and 8 grey-level image (a) is incorrectly segmented from the lower resolution image (b) and is assumed to be a looming obstacle in front of the camera (d). The object segments which have been identified are shown in image (c).

6.12 Images 153 (top) to 156 (bottom) of the mid morning post box sequence. The images on the left have been reduced to 8 grey levels and median filtered. On the right is the segmentation result for each image. An obstacle alert (shown with an ‘x’ pattern) was identified for frame 156.

7.1 Factors which influenced simulated AHV display effectiveness in this chapter. Excluded factors are marked with a line pattern.

7.2 Processing steps for the AHV simulator used in the pilot study. Note the display type is initialised before the images are processed. The three display types are listed in Table 7.1.

7.3 Examples of the image types used in this study. Figure (a) is the base 160x120 pixel 256 grey-level image. The simulator image using display type 3 is shown in image (b). Image (c) shows the base image from (a) with 8 grey-levels and a 3x3 median filter applied. In image (d), image (c) has been reduced to a 32x24 phosphene display (this is used for simulator display types 1 and 2).

7.4 Images taken of the Gait Lab before the mobility course was set up. Image (a) shows the black curtains surrounding the lab. The change area ‘tent’, and raised wooden platform are visible in image (b).

7.5 Example soft obstacle set up for the mobility course.

7.6 The indoor course used for mobility assessment in this Chapter.

7.7 Total number of mobility errors for both trials during the mobility course experiments

8.1 Factors which influence simulated AHV display effectiveness in this

chapter. Excluded factors are marked with a line pattern. . . . . . 204

8.2 Phosphenes (top row) displayed as grey level pixels in reduced

resolution images . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

8.3 Original 160x120 pixel captured image . . . . . . . . . . . . . . . 207

8.4 Original image reduced to 32x24 phosphenes . . . . . . . . . . . . 208

8.5 Original image reduced to 16x12 phosphenes . . . . . . . . . . . . 208

8.6 Map of the 30m mobility course built for this study. The grey

shaded area is the path identified by black tape on the floor. The

numbers refer to the placement of obstacles and the black lines

denote office partitions. . . . . . . . . . . . . . . . . . . . . . . . . 209

8.7 Different types of grey shading on each obstacle shown in Figure 8.6209

xxvii

8.8 Summary of obstacle errors during trials 1 and 2 by resolution and frame rate (FPS). The boxes show the middle 50 per cent of observations, with the median shown by the solid line in the box. The whiskers coming from each box show the largest value excluding outliers (which are shown as small circles).

8.9 Frequency of obstacle contacts by obstacle number for different resolution types and frame rates (FPS). The obstacle types are displayed in Figure 8.7.

8.10 Summary of veering errors during trials 1 and 2 by resolution and frame rate (FPS).

8.11 Percentage of Preferred Walking Speed (PPWS) results for trials 1 (PPWS1) and 2 (PPWS2) displayed by resolution type and frame rate (FPS).

8.12 Time spent walking through the mobility course for trials 1 (Time1) and 2 (Time2) displayed by resolution type and frame rate (FPS).

8.13 Variation of trial 1 PPWS scores by frame rate (FPS) and resolution. These results suggest a confounding variable, perhaps anxiety, during the initial trial.

8.14 Variation of trial 2 PPWS scores by frame rate (FPS) and resolution. These results show an increase in walking confidence as frame rate and resolution increase.

8.15 Variation of time spent walking during trial 1 by frame rate (FPS) and resolution.

8.16 Variation of time spent walking during trial 1 by frame rate (FPS) and resolution.

8.17 Median time spent walking through the mobility course during trials 1 (Time1) and 2 (Time2) for different age groups.

8.18 Median PPWS scores from trials 1 (PPWS1) and 2 (PPWS2) for different age groups.

8.19 Participant Preferred Walking Speed (PWS) results during trials 1 and 2 (r = 0.74).

8.20 Participant Percentage Preferred Walking Speed (PPWS) results during trials 1 and 2 (r = 0.88).

8.21 Participant time spent walking during trials 1 and 2 (r = 0.87).

8.22 Participant Speed on Mobility Course (SMC) results for trials 1 and 2 (r = 0.87).

8.23 Veering incidents for each participant during trials 1 and 2 (r = 0.60).

8.24 Participant obstacle contacts during trials 1 and 2 (r = 0.16).

B.1 Coversheet provided to participants before the AHV simulation experiments described in Chapters 7 and 8.

B.2 Questionnaire provided to participants before the AHV simulation experiments described in Chapters 7 and 8.

B.3 Record sheet used by the experimenter during the AHV simulation experiments described in Chapters 7 and 8. The locate object task was not used for the Chapter 8 experiment.

List of Abbreviations

AHV Artificial Human Vision

CCD Charge-Coupled Device

CMOS Complementary Metal-Oxide Semiconductor

CF CompactFlash

ETA Electronic Travel Aid (for the visually impaired)

fMRI functional Magnetic Resonance Imaging

FPU Floating Point Unit

FPS Frames Per Second

HCI Human Computer Interface

HMD Head Mounted Display

HVS Human Visual System

LGN Lateral Geniculate Nucleus

LVES Low Vision Enhancement System

MPDA Microphotodiode Array

NIH National Institutes of Health (United States)

O&M Orientation and Mobility

PWI Productive Walking Index

PWS Preferred Walking Speed

PPWS Percentage of Preferred Walking Speed

ROI Region of Interest

RP Retinitis Pigmentosa

SAD Sum of Absolute Differences

SMC Speed on the Mobility Course

TMS Transcranial Magnetic Stimulation

UEA Utah Electrode Array

USB Universal Serial Bus

VR Virtual Reality

WHO World Health Organization

Authorship

The work contained in this thesis has not been previously submitted for a degree

or diploma at any other higher educational institution. To the best of my knowl-

edge and belief, the thesis contains no material previously published or written

by another person except where due reference is made.

Signed:

Date:

Acknowledgments

I greatly appreciate the supervision, time, books, motivation and patience shown

by Anthony Maeder and Wageeh Boles, my two supervisors over the course of this

PhD. Many thanks also to my associate supervisor Jim Patrick and the financial

assistance from Cochlear Inc and the Australian Research Council.

I was lucky to have worked with Justin Boyle during the early stages of this

PhD and have really appreciated his advice, friendship, and comments on my

papers. Thanks also to everyone in the SAIVT lab, especially to Chris McCool,

Patrick Lucey, Brendan Baker and Robbie Vogt. Also thanks to Michael Mason

for assistance over the past three years.

I am grateful to Vivek Chowdhury (who originally suggested researching AHV mobility), and also to Gislin Dagnelie, Greg Suaning and Luke Hallum for helpful comments. Thanks also to Peter Meijer for his encouragement and references; Jason Ford for his comments on insect vision included in Chapter 8 of this thesis; Trevor Laimer for his assistance with setting up the L Block experiment in Chapter 8; Graham Kerr for allowing me to borrow the gait lab at Kelvin Grove campus for the experiment in Chapter 7; Andy Boud (VR Solutions) for allowing me to trial the VR HMD; and Darren Stacey for helping with the PDA headgear attachment used in Chapter 7. Thanks to Bashir Ebrahim (Guide Dogs Queensland) for helpful discussions and letting me trial a number of ETAs. Thanks also to Jan Lovie-Kitchin, Graham Kerr, Grace Soong, Doug Mahar (all QUT) and Anthony Richardson (CSIRO/UQ) for experimental design and statistical advice.

Thanks also to my parents for their support and encouragement of my education over my life.

Finally, and most importantly, to Joanne and Lewis: I truly appreciate your

support, laughs, encouragement and company, and look forward to having more

time to spend with you! I couldn’t have done this without you both.

Jason Dowling

Queensland University of Technology

May 2007

Publications

The research has resulted in the following fully refereed publications (or abstract

refereed where indicated by an asterisk).

Book Chapters

1. Dowling, J., Boles, W., & Maeder, A. (2007). Visual Prostheses for the

Blind: A Framework for Information Presentation. Mechatronics and

Machine Vision in Practice, Billingsley J. (ed), Springer-Verlag,

Heidelberg (In Press).

Journal Articles

2. Dowling, J. (2005). Artificial human vision. Expert Review of Medical

Devices, 2(1), 73-85.

3. Dowling, J., Maeder, A., & Boles, W. (2006). Effects of low spatial

resolution and frame rate on mobility assessment using simulated artificial

human vision. Displays (Submitted)

Conference Papers

4. Dowling, J., Maeder, A., & Boles, W. (2003). Intelligent image processing

constraints for blind mobility facilitated through artificial vision.

Proceedings of the 8th Australian and New Zealand Intelligent Information

Systems Conference (ANZIIS), 109-114.

5. *Dowling, J., Maeder, A., & Boles, W. (2004). Mobility enhancement and

assessment for a visual prosthesis. In A. A. Amini & A. Manduca (Eds.),

Proceedings of SPIE – Volume 5369 Medical Imaging 2004: Physiology,

Function, and Structure from Medical Images (pp. 780-791). San Diego.

6. Dowling, J., Maeder, A. J., & Boles, W. W. (2005). A PDA based

Artificial Human Vision Simulator. Proceedings of the APRS Workshop

on Digital Image Computing, February 2005, Brisbane, Australia, 109-114.

7. Dowling, J., Boles, W., & Maeder, A. (2005). Mobility assessment using

simulated Artificial Human Vision. Proceedings of the 2005 IEEE

Computer Society Conference on Computer Vision and Pattern

Recognition (CVPR’05) Volume 03 (pp. 32): IEEE Computer Society.

8. *Dowling, J., Boles, W., & Maeder, A. (2005). Enhancing Artificial

Human Vision Systems to Assist Blind Mobility. Proceedings of the 2005

Smart Systems - Postgraduate Research Conference (pp. 133-141):

Queensland University of Technology.

9. *Dowling, J., Boles, W., & Maeder, A. (2005). The effect of frame rate

and spatial resolution on mobility using simulated Artificial Human

Vision. Digital Image Computing: Techniques and Applications (DICTA

2005) - Unreviewed poster session, December 2005, Cairns, Australia.

10. Dowling, J., Boles, W., & Maeder, A. (2006). Simulated Artificial Human

Vision: The Effects of Spatial Resolution and Frame Rate on Mobility.

Advances in Intelligent IT - Active Media Technology 2006 (pp. 138-143),

June 2006, Brisbane, Australia.

11. Dowling, J., Boles, W., & Maeder, A. (2006). A Display Framework for

Artificial Human Vision information presentation. Thirteenth Annual

Conference on Mechatronics and Machine Vision in Practice (M2VIP

2006), December 2006, Toowoomba, Australia.

Chapter 1

Introduction

1.1 Motivation and Overview

There are currently a number of research teams investigating the partial restoration of sight to blind people. As early as 1929 it was noted by Otfrid Foerster that stimulating the visual cortex of a person led to the perception of spots of ‘light’ [90]. These spots are referred to as phosphenes [90]. More recently it has been found that stimulation of other components of the visual pathway (such as the optic nerve or retina) can lead to phosphene perception. Phosphenes have also been reported as a result of magnetic stimulation [82], hallucinogenic drugs [202], and space travel [192]. With recent advances in digital camera technology, computing, neuroscience and electrode technology, much progress has been made toward building a useful Artificial Human Vision (AHV) system to present phosphene information to a blind person.

The main motivation for AHV research is the promise of an improved quality of life for blind people. Economic productivity could also be enhanced: for example, it has been estimated that if all avoidable blindness in the USA in persons under 20 and working-age adults were prevented, the federal budget could save US$1.0 billion per year [245].

However, despite their great promise, current AHV systems face a number of constraints, including limitations in the number of electrodes which can be implanted and in the perceived spatial layout and frame rate of phosphenes. The development of computer vision techniques that can maximise the value of the limited number of phosphenes would be useful in compensating for these constraints. A further problem is the limited number of people who have received an AHV system implant and who are available for further research. For this reason, there is a need to conduct simulated AHV experiments with normally sighted research participants.

Three main functional requirements for blind users of an AHV system are the ability to read text [47], [70], to recognise faces [230], [226], and to be mobile [29], [117]. Although reading and face recognition have received attention in simulation studies (for example [226], [19]), there has been less research conducted on mobility. One reason for this gap in the literature is the difficulty in measuring mobility objectively. The lack of an objective method to assess mobility performance, which would allow comparison between different devices, was expressed by AHV pioneer William Dobelle in 2000:

“We know of no objective method for comparing our ‘artificial vision’

system with a cane, guide dog, or other aid for the blind. For example,

there is no standard obstacle course on which such devices, or the

performance of volunteers using them, can be rated.” [48]

This problem remains unresolved, as reported by Trick in 2004 [229]. However, in this thesis it is argued that there is existing literature on low vision and blind orientation and mobility (O&M) which can be adapted for this purpose during AHV system development.

1.2 Aims and Research Questions

The general aims of the work described in this thesis are:

(i) To investigate, develop and evaluate techniques for mobility assessment

which will allow the objective comparison of different AHV system phosphene

presentation methods.

(ii) To develop a display framework for the presentation of AHV system information, and use this framework to guide the development of an AHV simulation device.

The research questions which will be addressed in this thesis are:

1. Can specific main factors be identified as highly significant for providing mobility information in an AHV system?

2. Can objective measures be developed for the comparison of effectiveness between AHV systems in providing mobility information?

3. Can computer vision techniques be adopted and modified to provide mobility information in an AHV system?

1.3 Research Scope

(i) Currently AHV development in Australia (and in many other countries) is limited to animal experiments as there is no access to implanted patients. To overcome this limitation, the mobility experiments described in this thesis present an artificial visual scene simulation display to normally sighted research participants. It is anticipated that these mobility experiments could be repeated using implanted patients when available.


(ii) The images presented to participants in this thesis are shown in an ordered

array (for example, a rectangular grid of 12 x 16 phosphenes). Current

generation AHV systems may not be capable of aligning phosphenes well in

such an ordered array, and these phosphenes may appear irregular in shape.

However, it is anticipated that future AHV technology will be capable of

addressing this current limitation.

(iii) A focus of this thesis is methodology. It is not the intention to validate a

particular method to a high degree of integrity.

(iv) The AHV simulations used in this thesis are currently more easily applied to AHV systems involving a retinal implant. However, it is anticipated that the methods used will eventually be applicable to other types of implants. In support of this statement, it has been reported that the perceived display from the Dobelle Institute cortical surface electrode system has been enhanced by Sobel edge detection [46].

(v) The aim of the work contained in this thesis was not to develop the best AHV simulation device. A guiding principle in this work was to keep the cost of simulation hardware to a minimum.

(vi) This thesis also identifies shortcomings of AHV simulation methods and

additional research areas of future opportunity.
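The ordered phosphene array described in item (ii) can be illustrated with a minimal sketch. It assumes a greyscale image stored as a list of rows with values from 0 to 255, and reduces it to a coarse grid by block averaging followed by grey-level quantisation. The function name, the block-averaging step, the reading of "12 x 16" as 12 rows by 16 columns, and the default of 8 grey levels are all illustrative assumptions rather than the simulator implementation used in this thesis.

```python
def to_phosphene_grid(image, grid_rows, grid_cols, levels=8):
    """Reduce a greyscale image (list of rows, values 0-255) to a coarse grid.

    Hypothetical helper: each output cell is the integer mean of one
    rectangular block of pixels, quantised to `levels` grey-level indices.
    """
    height, width = len(image), len(image[0])
    if height % grid_rows or width % grid_cols:
        raise ValueError("image size must be a multiple of the grid size")
    block_h, block_w = height // grid_rows, width // grid_cols
    step = 256 // levels  # width of each quantisation bin
    grid = []
    for gr in range(grid_rows):
        row = []
        for gc in range(grid_cols):
            total = 0
            for y in range(gr * block_h, (gr + 1) * block_h):
                for x in range(gc * block_w, (gc + 1) * block_w):
                    total += image[y][x]
            mean = total // (block_h * block_w)  # integer arithmetic only
            row.append(min(mean // step, levels - 1))  # index 0..levels-1
        grid.append(row)
    return grid


# Usage: a 160x120 pixel test image, dark on the left half, bright on the right.
image = [[0 if x < 80 else 255 for x in range(160)] for _ in range(120)]
grid = to_phosphene_grid(image, grid_rows=12, grid_cols=16)
```

A real AHV system may not produce such a regular layout, as noted in item (ii), so a sketch like this only represents the idealised display assumed for simulation.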

1.4 Thesis Organisation

This thesis is structured as two main sections, background (Chapters 1-4) and experimental (Chapters 5-8), followed by a conclusion (Chapter 9).

The background section provides a detailed review of fundamental theory relevant to AHV mobility. Chapter 2 includes an overview of blindness and mobility before providing a comprehensive review of blind and low vision mobility assessment. Chapter 3 describes current AHV system research, including technology, requirements, stimulation locations, constraints and simulation research. The final part of the background section is Chapter 4, which presents methods for processing image information for AHV systems and a discussion of previous applications of computer vision to assist the vision impaired. At the end of Chapter 4 a proposed conceptual framework for the display of mobility related information using an AHV system is provided. This framework drives research questions around which the remaining thesis chapters are based.

The experimental chapters explore aspects of the conceptual framework presented in Chapter 4. Chapter 5 investigates the effect of four different image processing methods on the recognisability of mobility components contained in low quality AHV simulation images. The remaining three experimental chapters focus on mobility assessment using custom built head-mounted AHV simulations. These chapters are based on processing image sequences (i.e. video). Chapter 6 evaluates the use of a Personal Digital Assistant (PDA) based real-time ‘looming obstacle’ alert in a typical outdoor environment. In Chapter 7 this obstacle alert is compared with two other AHV simulation display types using an indoor artificial mobility course. In Chapter 8 the effects of two significant factors constraining AHV systems (temporal and spatial resolution) are investigated in an artificial mobility experiment involving 60 normally sighted volunteers.

Chapter 9 summarises the work in this thesis and provides a discussion of

how the research can be extended.

1.5 Original Contributions of Thesis

Resulting from this work are three significant original contributions to knowledge.

These contributions are explored through examination of the research questions


presented in Section 1.2.

The first research contribution resulting from this work is a conceptual framework based on literature reviews of blind and low vision mobility, AHV technology, and computer vision. This framework incorporates a comprehensive set of factors which affect the effectiveness of information presentation in an AHV system. Experiments reported in this thesis have investigated a number of these factors using simulated AHV with human participants. It has been found that higher spatial resolution is associated with more accurate walking (reduced veering), whereas higher display rate is associated with faster walking speeds. In this way it has been demonstrated that the conceptual framework supports and guides the development of an adaptive AHV system, with the dynamic adjustment of display properties in real-time.

The second research contribution addresses mobility assessment, which has been identified as an important issue in the AHV literature. This thesis presents the adaptation of a mobility assessment method from the blind and low vision literature to measure simulated AHV mobility performance using real-time computer based analysis. This method of mobility assessment (based on parameters for walking speed, obstacle contacts and veering) is demonstrated experimentally in two different indoor mobility courses. These experiments involved sixty-five participants wearing a head-mounted simulation device.
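The walking speed parameter can be illustrated with a short sketch. It assumes the definition commonly used in the blind mobility literature, in which the Percentage of Preferred Walking Speed (PPWS) is the Speed on the Mobility Course (SMC) expressed as a percentage of the Preferred Walking Speed (PWS). The function names, distances, times and error counts below are hypothetical and are not experimental data from this thesis.

```python
def walking_speed(distance_m, time_s):
    """Average walking speed in metres per second."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def ppws(course_speed, preferred_speed):
    """Percentage of Preferred Walking Speed.

    Assumed definition: 100 * SMC / PWS.
    """
    if preferred_speed <= 0:
        raise ValueError("preferred walking speed must be positive")
    return 100.0 * course_speed / preferred_speed


# Hypothetical participant (illustrative values only).
pws = walking_speed(distance_m=20.0, time_s=16.0)  # unobstructed walk: 1.25 m/s
smc = walking_speed(distance_m=30.0, time_s=60.0)  # mobility course: 0.5 m/s
score = ppws(smc, pws)                             # 40.0 per cent

# One trial record combining the three assessment parameters.
trial = {"ppws": score, "obstacle_contacts": 3, "veering_errors": 1}
```

Expressing course speed relative to each participant's own preferred speed allows participants with different natural walking speeds to be compared on the same scale.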

The final research contribution in this thesis is the development and evaluation of an original real-time looming obstacle detector, based on coarse optical flow, and implemented on a Windows PocketPC based Personal Digital Assistant (PDA) using a CF card camera. PDA based processors are a preferred main processing platform for AHV systems due to their small size, light weight and ease of software development. However, PDA devices are currently constrained by restricted random access memory, lack of a floating point unit and slow internal bus speeds. Therefore any real-time software needs to maximise the use of integer calculations and minimise memory usage. This contribution was significant as the resulting device provided a selection of experimental results and subjective opinions. Experiments using this device were conducted in both indoor and outdoor environments and are discussed in Chapters 6 and 7.
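To illustrate what integer-only motion estimation can look like, the following minimal sketch estimates the displacement of one image block between two frames using a Sum of Absolute Differences (SAD) search. This is a generic block-matching technique shown under stated assumptions; it is not the looming detector developed in this thesis, and the function names, block size and search range are hypothetical.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in `prev` and the
    same-sized block in `curr` shifted by (dy, dx). Integer arithmetic only."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(prev[top + y][left + x] - curr[top + dy + y][left + dx + x])
    return total


def block_motion(prev, curr, top, left, size=4, search=2):
    """Return the (dy, dx) shift within +/-search pixels with the lowest SAD."""
    height, width = len(curr), len(curr[0])
    best_shift, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # shifted block would leave the frame
            cost = sad(prev, curr, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift


def make_frame(top, left, frame_size=12):
    """Hypothetical test frame: a bright 4x4 square on a black background."""
    frame = [[0] * frame_size for _ in range(frame_size)]
    for y in range(top, top + 4):
        for x in range(left, left + 4):
            frame[y][x] = 200
    return frame


prev_frame = make_frame(4, 4)
curr_frame = make_frame(4, 6)  # the square has moved two pixels to the right
shift = block_motion(prev_frame, curr_frame, top=4, left=4)
```

Because the search uses only integer additions, subtractions and comparisons, and stores a single running cost, it suits a processor without a floating point unit and with limited memory. A looming alert would additionally need to interpret the pattern of such block motions across the image, which is not attempted here.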

In addition, this thesis is original as it synthesises information from a number of different research areas and supports this synthesis through scientific experimentation.

Chapter 2

Blind and Low Vision Mobility

Issues and Assessment

2.1 Introduction

Before it is possible to measure how beneficial an Artificial Human Vision (AHV) system will be, it is necessary to establish the main physical and psychological requirements for such a system. In addition, it is necessary to establish what the quality requirements are and how they can be met. This chapter commences by defining blindness and low vision. A review of the main mobility issues for the blind is then provided, followed by a discussion of existing blind mobility aids (such as the long cane, guide dog, and electronic devices). Orientation and environmental accessibility are important factors which affect blind mobility, and are briefly defined and discussed. A section on the ecological psychology work of James Gibson is also provided in this chapter, as it provides a bridge between mobility research and computer vision (for example regarding the concept of optic flow). Finally, the bulk of this chapter provides a thorough review of the literature on mobility assessment.

2.2 Blindness and Low Vision

This section provides a brief overview of the definition and causes of blindness.

The term Blindness usually refers to Legal Blindness, which is defined as a visual acuity of less than 20/200 after correction (that is, objects at 20 feet appear as if they are at 200 feet) or where the visual field is restricted to 20° or less (a normally sighted person has a visual field of almost 180°). This definition of blindness is used in this thesis. Blindness can also include a variety of highly specific defects such as a loss of vision in a particular visual field area or a lessened ability to see in low illumination [180]. In 1997 the World Health Organization (WHO) estimated that there were close to 150 million people with significant visual disability worldwide, with 38 million of those people being legally blind [246].

Approximately 10% of legally blind people have no light perception at all [143], and are described as totally blind. Currently, most AHV research is targeted at the totally blind.

People defined as having Low Vision have serious visual impairments (such

as central field loss, tunnel vision or blurred vision), but are not necessarily

legally blind. Because AHV systems are expected to restore vision only partially,

previous mobility research conducted with low vision participants should provide

useful insight for AHV system requirements.

Common causes of blindness include hereditary retinal degeneration and age-related macular degeneration [143]. Due to projected ageing and growth of the Australian population, by 2030 the number of Australians aged over 50 years with severe visual impairment is expected to more than double (from 25,590 to 57,930 people) [68].

In economically developed societies, the leading cause of blindness and visual disability in adults is diabetic retinopathy. Around 120 million people worldwide have diabetes and after 15 years approximately 2% of those people become blind while about 10% develop severe visual disability. Eye injuries account for around 1 million cases of blindness each year [247]. The most common non-preventable cause of blindness in the developed world is age-related macular degeneration, which occurs in 25% of people aged 80 years and over [248].

Nine out of ten blind people live in the developing world and 18% of the world’s blind are estimated to live in China [249]. In general, more than two-thirds of today’s blindness could be prevented or treated by applying existing knowledge and technology [247]. Nearly half of all blindness is due to cataract and a quarter of the world’s blindness is due to trachoma (infection with Chlamydia trachomatis, spread by flies from infected excreta). Other major causes of blindness are glaucoma (a group of eye diseases characterized by an increase in intraocular pressure), onchocerciasis (a parasitic disease) and xerophthalmia (caused by vitamin A deficiency) [30].

2.3 Mobility and related issues identified for people with low vision and blindness

A number of issues concerned with movement and perception for the blind and

those with low vision have been discussed in the literature. This information

is valuable for AHV research as a successful AHV device should be capable of

enhancing the ability of a user in coping with these issues.

Mobility is commonly defined as the ability to travel between locations gracefully, safely, comfortably and independently (Foulke 1970; cited in [200]). For example, mobility includes the ability to move through space without accidental contact with obstacles [200]. Mobility is a complex task that can be affected by variations in environmental conditions, the personal characteristics of travellers, and situational factors such as the traveller’s familiarity with the area [135].


Congenitally blind children (those who are blind from birth) often have hypotonia, or abnormally low muscle tone (due to delayed sensorimotor development), which can affect mobility. Congenitally blind children usually commence crawling after 13 months of age (compared to the sighted average of eight months) [188]. An additional problem, experienced by most blind patients with no light perception, is falling out of phase with the 24 hour day, often leading to severe sleep disorders [221]. Blind people may also have multiple disabilities, such as hearing loss.

Factors identified for low vision mobility include the amount of residual vision, the age at onset of visual impairment, posture and balance, intelligence, space orientation, auditory-tactile abilities and personality [145]. Age is another factor, as many of the blind are elderly, which can restrict their ability to use some mobility aids (such as a guide dog). Where there is substantial visual field loss, the loss of the peripheral field affects mobility more than central field loss [145].

Two significant problems in street locomotion for the blind are the reliable

perception of objects, such as obstacles and landmarks, and adequate spatial and

geographic orientation [22]. Landmarks can also be obstacles, such as a tree or

steps.

Some of the everyday problems experienced by blind people involve street-crossing, identifying and locating building entrances and interacting with automatic teller machines and information kiosks [189]. Street crossings can cause significant anxiety [73]. Guth et al. [87] have reported that crossing a street usually involves the following four main tasks:

• Detecting the street (ramp slope, sound, traffic, texture).

• Aligning the body (for example, by using the bars on a sewer grate, or by tracking traffic sound).

• Deciding when to initiate the crossing (which can be difficult with quiet cars and bicycles).

• Walking across the road in a straight line (veering from a straight line is common).

A summary of blind pedestrian needs was published by the National Research Council in 1996 [64]. These needs were reported as:

1. Detection of obstacles in the travel path from ground level to head height

for the full body width.

2. Travel surface information (including texture and discontinuities).

3. Detection of objects bordering the travel path.

4. Distant object and cardinal direction information (particularly for the projection of a straight line).

5. Landmark location and identification information.

6. Information enabling self-familiarisation and mental mapping of an environment.

Dangerous situations for the blind or partially sighted have been reported by Pelli [172] as drop offs (such as train platforms) and moving vehicles. This is supported by Brambring [22], who stated that the most dangerous obstacles are downward steps and low or fast moving obstacles. An additional problem involves making unwanted contact with a pedestrian, which can be socially awkward and may pose a threat to a person’s safety [74]. Additional problems have been reported by Geruschat and Smith [73] as lighting conditions and glare (e.g. light adaptation), changes in terrain and depth (stairs, curbs), differences between reduced acuity (reading, etc.) and restricted fields (trouble with groups of people, shopping, etc.) and visual clutter (for example, many signs or complex signs).


2.4 Primary mobility devices for the blind

This section provides an overview of the main mobility devices currently available for the blind. An understanding of these mobility devices is important, as an AHV system may need to provide similar functionality to compete commercially. To be successful, the benefit from using a mobility aid must outweigh the cost (which includes financial, mental, emotional and physical aggravation [90]). It is also possible that a traditional mobility device and an AHV system may be used in combination.

A primary mobility aid is one which can provide enough information about

a person’s immediate environment to allow them to be mobile. The amount of

time which a person has to react to the environment is important and is referred

to as preview time. The most important mobility aids are the long cane and the

guide dog, which are briefly described below.

2.4.1 Long Cane

The long cane is capable of providing a blind person with sufficient information for safe movement in the immediate environment at a low cost. Mobility enhancement in a blind person after long cane training has been described as dramatic [52]. Long canes are usually made from fibreglass and are designed to provide good vibration conductivity (different tips can be used on the end of the cane to provide this information).

The high visibility of the long cane to drivers and other pedestrians can be an

advantage, although this may also make it easier for criminals to identify a target.

The most significant problem with the long cane is that it only provides two paces

of preview. A long cane user needs fast reaction times due to the limited amount

of preview information. Cane use requires a high level of concentration and the

arm movements can cause tiredness. A significant problem with the cane is


that it does not protect a person against obstacle collisions to the upper part of

their body (in Australia, an example of such an obstacle is a wall-mounted public

telephone, shown in Figure 4.1). There is also a risk of tripping other pedestrians

with a cane [64].

2.4.2 Guide Dogs

Guide dogs provide good mobility assistance by pulling in the same way that a

human guide would. They are able to respond to hand and voice signals (such as

‘forward’) and are trained to avoid obstacles, prevent veering in street crossings,

and stop if there is a dangerous situation. Guide dogs are also trained to intel-

ligently disobey commands that are not safe. Another benefit is that a dog can

remember common landmarks (such as a particular shop door).

However, guide dogs are not suitable for people who are not comfortable with

dogs, are not physically fit or cannot maintain a dog [244]. In addition, a guide

dog user still requires a high level of mobility skill and would usually be required

to have long cane training. Guide dogs are also expensive to train and there are

a large number of dogs who cannot be trained to the required high standards.

2.4.3 Electronic Travel Aids

A large number of Electronic Travel Aids (ETAs) for the blind have been developed

over the past 40 years. ETAs are designed to transform environmental

information into a form that can be conveyed through other sensory modalities

(auditory or tactile) [64]. Information from the ETA is usually obtained through

three sensor types: ultrasonic, laser or visible light. An AHV system could be

considered to be an ETA device (although this has not been reported in the AHV

literature).

As mentioned above, preview is important for the visually impaired traveller,


as it allows them to anticipate problems before they occur. The ability of a device

to provide greater preview information than the long cane could be very useful.

Therefore, many ETAs attach their sensors to the long cane.

Ultrasound based devices (which generally provide auditory information) in-

clude the SonicTorch, SonicGuide and SonicPathfinder [99]. More recent ultra-

sonic devices include the Navbelt and Guidecane [201]. Greg Phillips from the

Australian company, GDP Research [176], has developed a low-cost, hand held

ultrasonic device called the Miniguide. The Miniguide has been successfully tri-

alled by the Guide Dog Association of New South Wales, and is supplied at no

cost to blind people by the Guide Dogs for the Blind Association of Queensland

[101]. A recent ultrasound device is the Ultracane, developed by researchers from

Sound Foresight in the UK [69]. This device emits ultrasonic waves from the

cane handle, which are recorded as they bounce back from objects. The Ultra-

cane uses two buttons which vibrate, enabling distance and direction information

to be transmitted.

The original laser ETA device, developed during the 1960s, was the Laser

Cane, which used three low power lasers [53]. This device used the different

lasers to attempt to detect drop offs, overhanging obstacles and forward obstacles.

However, the tactile and auditory output was found to be confusing and the device

was very expensive. A problem with using laser energy is that wet surfaces can

provide misleading information, as light is reflected away from the device. A

recent version of the laser cane has been developed by the German company

VISTAC [121]. This device uses a single laser to detect objects directly above

the cane in the head and chest area. The cane then vibrates when an obstacle is

present.

An alternative ETA approach is to process images captured from a head

mounted camera and provide an auditory representation of these images via head-

phones. There are currently two systems which use this method. The first is the


vOICe system developed by Peter Meijer [150] which processes image informa-

tion (using a 64x64 pixel array) and provides a sound representation once per

second. The second device is the Prosthesis for Substitution of Vision by Audition

(PSVA), which maps images at a lower resolution (128 pixels, comprising

8x8 pixels for the centre of the image and 8x8 pixels for the image periphery)

but provides information at a higher rate of 25 images per second [177].
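The trade-off between resolution and update rate in these two systems can be made concrete with a rough calculation. The sketch below uses only the figures quoted above; the function name is illustrative, and raw pixels per second is assumed here as a crude proxy, not a measure of the information a listener can actually extract from the sound.

```python
def pixels_per_second(pixels_per_frame: int, frames_per_second: float) -> float:
    """Raw pixel throughput: pixels in one image times images per second."""
    return pixels_per_frame * frames_per_second


# vOICe: one 64x64 image per second (figures from the text above).
voice_rate = pixels_per_second(64 * 64, 1)

# PSVA: 8x8 central pixels plus 8x8 peripheral pixels, 25 images per second.
psva_rate = pixels_per_second(8 * 8 + 8 * 8, 25)

print(f"vOICe: {voice_rate:.0f} pixels/s, PSVA: {psva_rate:.0f} pixels/s")
```

On this crude measure the two systems are of the same order: the higher frame rate of the PSVA largely offsets its lower resolution.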

Despite the large range of ETAs which have been developed, none has achieved

widespread acceptance by blind people. The main reasons for the low uptake of

devices has been that they offer little benefit in mobility, are expensive and are

cosmetically unattractive [52]. The lack of objective assessment of mobility benefits from

ETA devices has also hindered development. There is often little or no published

research on the benefits of these devices, which makes it difficult for consumers

to compare the benefits of different devices. Also, training programs need to be

developed and conducted for device users.

2.5 Orientation Aids

Orientation is defined by Brabyn et al. [21] as a person knowing where they are

in absolute terms of reference. Blind people may use different problem solving

strategies such as locating landmarks, recalling mental maps of familiar places,

asking for help, or using systematic familiarisation to explore an environment

[134]. It should be possible to integrate orientation information with an AHV

system (for example using GPS data).

Existing orientation aids include large print or tactile maps which can provide

cognitive maps for the blind traveller. The most useful tactile maps are those that

model the real-world - such as Lego or matchbox cars. Environmental regulari-

ties, such as parking meters or sidewalks, can be monitored to solve orientation

problems. There is usually a pattern in number systems, such as house numbering


[14]. Verbal aids can be useful for the rote learning of routes: these may involve

a running documentary of a route in the actual time frame or a recording of a

route from observation [251]. Future orientation devices will probably consist of

electronic databases of geographic information combined with GPS to provide

orientation maps. It should be possible to integrate this type of information with

an AHV system.

2.6 Environmental accessibility

An alternate approach to assisting with blind mobility is to make changes to

the environment. The environmental information which is often unavailable to a

blind traveller includes: the names of streets and landmarks, room numbers, bus

numbers/destinations, directional information in train/bus stations, intersection

configuration/type of traffic control and the status of traffic light cycles [14].

Changes to the environment include tactile guide strips (also known as Braille

strips) which can be used to indicate the direction of paths, or warning of drop-

offs, such as stairs. When tactile strips are correctly applied, the safety and

confidence of a blind person can be significantly increased, while only mildly

inconveniencing other pedestrians [124].

Environmental accessibility for the blind and partially sighted can involve

the use of a logical design layout (for example, stairs should be next to a lift),

assistance with visibility (for example, hand rails should have high contrast)

and adequate lighting (which should be 50-100% greater than that required

for normally sighted) [14].

Useful mobility and navigation information can also be provided by locating

transmitters at strategic locations in the environment; a blind person can then

use a hand held receiver to receive directional information about the landmark.

This approach has been used in the ‘Talking Signs’ program which has been


implemented in many locations in San Francisco [222]. An adaptable AHV system

may be able to integrate sign recognition and landmark location, and present this

information to a blind traveller.

Further details of computer vision research for the blind are provided in Chapter

4.

2.7 The ecological approach to perception

The work of perceptual psychologist James Jerome Gibson provides a bridge

between visual perception, mobility and computer vision research, and his work

is heavily cited in the literature for each of these fields, in addition to the fields

of ergonomics and design. Gibson suggested that perceptual systems evolved in

moving organisms, and to understand perception it is necessary to consider the

immediate environment within which the organism has evolved. This ecological

approach to visual perception emphasises movement in a complex and changing

environment. This suggests that the assessment of mobility should be conducted

while research participants are moving.

Blind mobility is possible because perception involves senses apart from vision.

For example, people with normal vision generally do not notice the difference in

reading on a bright versus a dull day or hearing a tune in a different key [52].

Similarly, a blind person may perceive an obstacle near their head, but may not

realise that they have used ‘facial vision’ (auditory echo location) [77]. In the

context of blind mobility, perception can be considered to be the combination

of exploratory actions and knowledge of the surroundings gained from looking,

touching and hearing [87].

Invariant components of an environment are those that maintain their identity

but may change in appearance (such as the ground or a cup). The ecological


approach suggests that invariant components provide affordances, such as walk-

on-ability, grasp-ability or collision [139]. Surfaces are particularly important for

perception, as the surface affords locomotion to an organism. The ground is the

most important surface. However, it is possible to misperceive affordances (such

as mistaking a glass wall for an opening) [78].

The movement of the body, head or eyes of an observer produces a trans-

formation in the entire retinal image, which Gibson called optic flow [77]. For

example, there will be a lateral flow of information across the retina when the

observer moves their head from left to right and an expanding pattern as the

observer moves forward. As a person walks through an environment the trans-

formations in optic flow reveal the layout of surfaces by occlusions/disocclusions

and expanding patterns produced by approaching obstacles [53]. The direction

of observer motion is indicated by the point in the optic flow from which motion

vectors originate (this is called the focus of expansion). Optic flow computation

has generated a large amount of literature in computer vision, as it is useful for

solving motion problems with stationary or moving cameras, can provide rela-

tive distances of objects in an image sequence, and can be used to represent the

three-dimensional motion of an object across a two-dimensional image [206]. This

method is discussed further in Chapters 4, 6 and 8 of this thesis. Gibson’s theo-

ries on motion led to the speculation and discovery of higher order visual neurons

that analysed differences between the centre and surround of a receptive field,

independent of direction [153].
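The idea of the focus of expansion can be illustrated with a small numerical sketch. It assumes an idealised case (pure forward motion, noise-free flow vectors that all point directly away from a single image point) and estimates that point by least squares. The function name and the synthetic data are illustrative assumptions; this is not the optic flow method used later in this thesis.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_focus_of_expansion(points: List[Point], flows: List[Point]) -> Point:
    """Least-squares estimate of the point from which flow vectors radiate."""
    # A flow vector (u, v) at image point (x, y) that points directly away
    # from the focus (fx, fy) satisfies: v*fx - u*fy = v*x - u*y.
    # Accumulate the 2x2 normal equations for this linear system.
    a = b = c = p = q = 0.0
    for (x, y), (u, v) in zip(points, flows):
        rhs = v * x - u * y
        a += v * v
        b += -u * v
        c += u * u
        p += v * rhs
        q += -u * rhs
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("flow vectors are parallel; focus of expansion undefined")
    return ((c * p - b * q) / det, (a * q - b * p) / det)


# Synthetic expanding flow field radiating from a known point (assumed values).
true_foe = (3.0, 4.0)
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
flows = [(0.1 * (x - true_foe[0]), 0.1 * (y - true_foe[1])) for x, y in points]

foe = estimate_focus_of_expansion(points, flows)
print(f"Estimated focus of expansion: ({foe[0]:.2f}, {foe[1]:.2f})")
```

With real image sequences the flow vectors are noisy and may include independently moving objects, so a robust estimator would normally be needed in place of plain least squares.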

Although Gibson’s theories have been influential, two main limitations of his

work have been identified. First, Gibson underestimated the difficulty of the

visual system detecting invariants. Secondly, his work did not explain how

three-dimensional information is detected by a moving observer [144]. Later

information processing approaches to visual perception (such as the seminal work of

David Marr which is discussed briefly in Chapter 4) have provided greater insight


into these limitations.

2.8 Mobility assessment

As stated in Chapter 1, a number of AHV researchers have commented that

they are unable to objectively compare AHV system information presentation

methods. This section presents a critical review of blind and low vision mobility

assessment research. This research has generally been undertaken to test the

effects of instruction by orientation and mobility specialists on mobility (learning

effect) and the evaluation of specific mobility aids (such as a cane, or electronic

device).

Mobility assessment should be able to allow the objective testing of different

techniques or devices. However, it is difficult to generalise tightly controlled

findings from a laboratory setting to real world mobility tasks. Even within the

laboratory it is difficult to manipulate variables experimentally or to measure

responses [214].

Three main methods for assessing mobility are reported in the literature:

self report questionnaires, field experiments and artificial environments. These

methods are presented in the following three subsections. A trend in recent

research has been to include both artificial and field experiments, and these papers

are reviewed in the fourth subsection.

2.8.1 Mobility assessment: Self report research

Self report research is the most common method of assessing mobility performance

and may be the simplest method to obtain information on which parts

of the environment contribute to mobility problems. However, this method of

research is less reliable than the other methods of mobility assessment, as it relies

on subjective reports which may be biased. This method of mobility assessment is


not widely reported in the literature; however, the following three papers illustrate

the types of interesting results that can be obtained from this approach.

In the first research paper, two experiments were conducted by Brambring

[22] into mobility problems experienced by blind people while walking on streets.

These investigations provide some insight into what information is most useful

for blind mobility. In the first investigation, four blind students (aged in their

twenties) were asked to describe a single walk on their daily routine from a dormitory

to their bus stop. Their descriptions were recorded and later analysed to

reveal that landmarks were the most frequently mentioned items, with distances,

directions and obstacles less frequently mentioned.

In the second experiment by Brambring, nine blind subjects (with an average

age of 25 years) were asked to walk along two different routes, and describe the

route for another blind person. In addition, nine sighted subjects (with similar

age, sex and education to the blind subjects) were asked to walk along the same

two routes and record descriptions for another sighted person. Transcriptions

of the descriptions suggested that the blind need considerably more information

for mobility than sighted people, which would require greater memory and more

effort to recall. The blind subjects made more explicit statements about distances

than the sighted subjects. Although these experiments provide some subjective

mobility information, a problem with this approach is that individual differences

(for example, in the ability to provide a description) could be significant variables.

Another example of a self report experiment, designed to gain an understand-

ing of how different types of visual problems affect mobility, was conducted by

Passini et al. [168] and involved interviewing 47 subjects who ranged from total

blindness to having strong residual vision. Passini et al. reported that the most

mobility problems occurred in vast spaces (indoor and outdoor), shopping centres,

department stores, hotel lobbies, and public transport stations. In addition, potential

sources of mobility accidents were reported as: descending stairs (without


a structural warning or handrail), benches, half-open doors, and objects which

cannot be detected with a cane on the ground (such as public telephones fixed on

walls or rear vision mirrors on trucks). Subjects were most worried about cars.

Passini et al. noted that mobility research needs to involve communication with

visually impaired people. Although this study provides useful information, one

problem is that the interviewees may not have been aware of, or may not have

remembered, all mobility problems they had encountered.

2.8.2 Mobility assessment: Field Experiment research

Some people (for example those with cognitive impairment) may have trouble

expressing mobility performance in a self-report assessment, and a performance-based

assessment may be more accurate. Field experiments involve the use of

real-world environments for mobility assessment, such as streets or shopping cen-

tres. Although there is less control over variables in field experiments compared

to laboratory study (such as the frequency of pedestrians, noises, varying light

sources and unpredictable obstacles), they can be used to measure participant

behaviour in an objective way and the results may be more generalisable to real-life

mobility situations than laboratory experiments. A number of alternate field

experiment studies will now be discussed.

Productive Walking Index

During the 1970s, an influential Nottingham University research team developed

a mobility assessment technique which has been used to evaluate a number of

mobility aids. This assessment involved measuring a subject’s performance over

a 1300 meter course through an urban environment, which included a range of

typical mobility situations [6]. A video recording was made of the subjects as they

moved through this environment. Three trials were used on this course: in the


first trial the subject used their conventional aid (such as a cane), in the second

they used the mobility device to be tested. In the third trial, the subject was

again recorded using their conventional aid. The purpose of the third trial was

to control for the effects of familiarisation with the test route (a similar approach

is used in Chapters 7 and 8 of this thesis).

The three video recordings were then analysed for safety, efficiency and psy-

chological stress information. The definitions used for these scores are listed in

Table 2.1. The safety measures were designed to record the frequency and type

of body contact with the environment and the number of accidental departures

from the footpath. The Productive Walking Index (PWI) was used to measure

the mobility efficiency of the subject. PWI is the ratio between the time taken

to complete the course and actual time spent walking in the correct direction

[100]. If a subject spent time standing still or backtracking after an orientation

error, this would be reflected in the PWI. In a later follow-up study Dodds et al.

[54] used an outdoor mobility course which involved walking along a foot path

to a telephone booth. The course involved three road crossings and a number of

natural obstacles such as trees, lamp posts and a hedge. From the results of this

study, Dodds et al. reported that the PWI was a reliable mobility measure.
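As a worked illustration of the PWI as defined above (time taken to complete the course divided by time spent walking in the correct direction), the following sketch uses made-up timings. The function name and the example values are assumptions for illustration only and are not taken from the Nottingham studies.

```python
def productive_walking_index(total_time_s: float, productive_time_s: float) -> float:
    """PWI as described in the text: total course time / productive walking time.

    A value of 1.0 means continuous progress; larger values indicate time
    lost to standing still or backtracking.
    """
    if not 0 < productive_time_s <= total_time_s:
        raise ValueError("productive time must be positive and not exceed total time")
    return total_time_s / productive_time_s


# Hypothetical trial: 1200 s on the course, 200 s of which were spent
# standing still or backtracking after an orientation error.
total_time_s = 1200.0
unproductive_time_s = 200.0
pwi = productive_walking_index(total_time_s, total_time_s - unproductive_time_s)
print(f"PWI = {pwi:.2f}")
```

Under this definition, a subject who never stops or backtracks has a PWI of exactly 1, so the index isolates continuity of progress from walking speed.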

The average stride length was later removed from the Blind Mobility Unit

evaluation. Stride length was meant to give a measure of subject stress, however

it was found to be unreliable (as it relies on the number of steps taken in each part

of the route which would be dependent on a person’s gait) [54]. During the early

1980s the list of dependent variables in the Blind Mobility Unit evaluation method

was simplified and Dodds [51] proposed the measures in Table 2.2. However, a

problem with the Blind Mobility Unit assessment was the unreliability of data

between trials. Despite the same physical environment used for each trial, the

mobility route used by subjects differed slightly, which meant some obstacles

or environmental features were not within range of the participants [51]. An

additional problem uncovered from the Blind Mobility Unit studies was that many

long-cane users already travelled well and the effect of a secondary device, such as

an ultrasound-based Sonic Torch device, did not lead to a significant improvement

[51]. Also, when a subject was using a long cane, bodily contacts with obstacles

were very rare, which meant that most subjects scored high on safety and efficiency.

Table 2.1: Nottingham Blind Mobility Unit dependent variable measures [6].

1. Safety scores
   Body contacts at, or rising from, ground level
   Body contacts with the inner shoreline
   Body contacts with obstacles at, or near, head height
   Accidental departures from side curb
   Accidental departures from down curb
   Trips at up curb

2. Efficiency scores
   Average walking speed (metres per second)
   Continuousness of progress (Productive Walking Index)
   Variation in pavement position
   Proportion of landmarks detected
   Average angle of veering on road crossings (degrees)
   Crossing efficiency index

3. Psychological stress scores
   Average stride length

Percentage of Preferred Walking Speed

The Productive Walking Index was reviewed by Clarke-Carter et al. [41] who

suggested that a better measure was the Percentage of Preferred Walking Speed

(PPWS). The Preferred Walking Speed (PWS) is defined as the speed of a visually

impaired person walking at their preferred speed, with a sighted guide holding

their arm. PWS requires an instructor guiding a participant over a known dis-

tance and dividing the distance by the time taken. The walking efficiency of

experimental participants can then be measured and normalised over a longer

mobility course by calculating their measured speed as a percentage of the PWS

[208]:

$$\mathrm{PPWS} = \frac{\mathrm{SMC}}{\mathrm{PWS}} \times 100 \tag{2.1}$$

where both Preferred Walking Speed (PWS, measured over a short distance)

and Speed on the Mobility Course (SMC) are defined as:

$$\mathrm{PWS} = \mathrm{SMC} = \frac{\text{distance}}{\text{time}} \tag{2.2}$$

Table 2.2: Revised Nottingham Mobility Unit measures from Dodds [51]. Shore-lining refers to following a path or wall using tactile or auditory information.

1. Productive Walking Index (time taken / time spent walking)
2. Body contacts with obstacles
3. Cane contacts with inner shoreline
4. Cane contacts with outer shoreline
5. Pavement position
6. Body contact with shoreline
7. Major safety errors (tripping, bodily contacts with obstacles)

The PPWS can be used as a between-participants measure to compare differ-

ent walking speeds, in addition to assessing mobility changes in a single partici-

pant. The PPWS allowed the assessment of subjects who did not use a cane and

allowed different subjects to be compared.
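The PPWS calculation can be sketched in a few lines. Each speed is distance divided by time over its own course, and the PPWS expresses the mobility course speed as a percentage of the preferred speed. The function names and the distances and times below are hypothetical values chosen for illustration.

```python
def walking_speed(distance_m: float, time_s: float) -> float:
    """Speed in metres per second over a measured distance."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def ppws(speed_on_course: float, preferred_speed: float) -> float:
    """Percentage of Preferred Walking Speed: (SMC / PWS) x 100."""
    if preferred_speed <= 0:
        raise ValueError("preferred walking speed must be positive")
    return speed_on_course / preferred_speed * 100.0


# Hypothetical participant: guided over 20 m in 16 s to obtain the PWS,
# then walking a 220 m mobility course unguided in 275 s.
pws = walking_speed(20.0, 16.0)
smc = walking_speed(220.0, 275.0)
score = ppws(smc, pws)
print(f"PWS = {pws:.2f} m/s, SMC = {smc:.2f} m/s, PPWS = {score:.0f}%")
```

Because each participant is normalised against their own preferred speed, a naturally slow walker and a naturally fast walker can obtain the same PPWS, which is what allows comparison between participants.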

Clarke-Carter et al. [41] tested the new PPWS measure by recording walking

speed using a large backpack strapped to the research subject. This backpack was

then connected to a cumbersome device consisting of three wheels which would

have limited the usefulness of this measure (for example, when walking up stairs).

However, this study was able to show that participants using a guide dog had

significantly higher PPWS scores than participants who used the long cane.

A further field experiment using the PPWS was conducted by Dodds et al.


[55]. This study investigated if mobility in low vision clients could be predicted

by current theories of perceptual functioning. The authors developed a series of

four visual tasks based on the perceptual learning theories of James Gibson (discussed

in Section 2.7 above) and Ulric Neisser. These tasks (referred to as the

OCULA assessment and training suite) involved a subject pressing a computer

key as soon as they perceived movement or recognisable objects in simulations of

textural shearing, degraded figures, embedded figures and peripheral attention.

The results from 37 subjects suggest that the visual tasks were better predictors

of visual performance than visual field and acuity measures. This research also

suggested that learning from the OCULA tasks could be applied to real-life mo-

bility situations. A similar approach could be used in training people in using an

AHV system display.

The PPWS measure has also been used to measure mobility in a study of

simulated Retinitis Pigmentosa (RP) by Haymes et al. [94]. This study involved

20 normally sighted subjects wearing swimming goggles which had been painted

on the inside and with a 5 mm hole cut through the centre of each eye piece to

simulate advanced RP. Different filters were fixed inside the goggles to simulate

different lighting conditions. The outdoor mobility route was a 220 meter resi-

dential street of ‘moderate difficulty’, with obstacles such as driveways, cracked

concrete and overhanging branches. Haymes et al. stated that the effects of

learning a route on mobility performance diminish after two attempts, so sub-

jects were measured on the route twice, walking with normal vision. The time

taken for this was considered the preferred walking speed for the PPWS calcu-

lation. Ten subjects repeated the mobility experiments indoors; however there

were no details of the test route used. A significant finding from this study was

that the clinical vision measures (such as visual acuity and the Melbourne Edge

Test) taken indoors were not useful in predicting outdoor mobility performance.

In an alternate PPWS-based study by Haymes et al. [95], three real world


mobility routes in Melbourne, Australia were used to measure the effects of vision

and psychological variables. Eighteen subjects with varying degrees of blindness

from RP were involved in the study. The first route (228 meters) involved a

quiet residential street, the second (202 meters) involved an outdoor small busi-

ness area with numerous obstacles (pedestrians, rubbish bins, seats, etc), and the

third (254 meters) took place in an indoor shopping centre with many obstacles

(pedestrians, escalators, plants, etc). Subjects walked the routes twice, in random

order and only the walking speed from the second trial was used. The preferred

speed of subjects was also calculated on a separate, obstacle free course and this

measurement was used for the PPWS calculation. Subjects also had a vision as-

sessment and took the NEO-PI (Neuroticism Extraversion Openness Personality

Inventory-Revised) to measure personality variables. This study found a highly

significant correlation between vision assessment and retinitis pigmentosa. How-

ever, there were no significant correlations found between psychological variables

and mobility in this study.

Travel time and mobility incidents

In 1998 the effects of RP on mobility were examined by Geruschat et al. [74]. In

this study measures of visual function and a self-report mobility questionnaire

were recorded, in addition to mobility measures which included: travel time and

the number of mobility incidents (bumps, stumbles, neglecting to detect stairs,

and problems remaining oriented once travelling in the correct direction). Two

courses were used: the first was a simple course built in a basement hallway (49

meters) which had paper cups hanging at varying heights from overhead vents and

floor mats as obstacles. Two different illumination levels were randomly allocated

to the first course. The second course was the main corridor in the Johns Hopkins

Hospital Outpatient Centre (444 meters), which had many obstacles including

pedestrians, elevators, and small shops. Forty-one subjects were involved in the


study, 16 of whom were normally sighted and the remaining 25 with RP. Very

few mobility incidents occurred in the first course; however, it was found that

subjects with RP were five times more likely to have mobility incidents than

sighted subjects. The RP subjects were found to walk more slowly than normally

sighted subjects. It was noted that very few contacts occurred with pedestrians in

the second course, as pedestrians will usually move out of the way before contact

is made. However, a problem with the results from this study is the potential

variations between trials, caused by movement in the crowded hospital outpatient

centre. A large sample size would be required to rule out the possibility that one

of the groups was exposed to more obstacles than the other group.

Detection of curb ramps

Curb ramps provide a smooth surface from a footpath onto a road and have

been designed to assist with the mobility of people with physical disabilities (for

example, with a wheelchair). However, they can increase the danger to blind

pedestrians, who may not realise they have entered a road. These ramps were

studied by Benzen and Barlow [15], who assessed the detection rate of 80 subjects

in eight U.S. cities. The measure used in this study was whether the subject

stopped before the road or not. Subjects failed to stop before entering a street

in 39% of approaches by a curb ramp, and this increased to 48% when the slope

was minimal. There was no difference found in curb detection between subjects

who travelled frequently (six or more times per week) and those who travelled

fewer than three times a week. These findings support the use of tactile strips at

either end of curb ramps to provide a warning for blind pedestrians.

Heart Rate

Heart rate has also been considered as an objective measure of stress in mobility.

For example, it was found that the heart rates of blind and partially sighted


subjects were significantly higher when an instructor was not present on the

same mobility course [223]. However, the use of heart rate can be affected by

the momentary work load during mobility tasks [6]. Probably for this reason

heart rate has not been considered in any other Orientation and Mobility (O&M)

assessments reviewed in this Chapter.

Mobility measurement reliability and validity

Mobility field experiments generally involve the use of an O&M instructor’s observations

of mobility (for example in scoring the frequency of obstacle contacts).

The reliability and validity of these observations were studied by Geruschat et al.

[72], whose study involved 36 subjects (mean age 57) with varying levels of visual

impairment walking a route which included residential travel and small business

travel. Five O&M instructors assessed subjects on the mobility problem types

listed in Table 2.3. These measures were selected by Geruschat et al. as they

had a low cost and did not involve expensive laboratory equipment. Interrater

reliability (the degree to which the O&M instructor scores were similar) between

the instructors was found to be satisfactory.

In a second component of the Geruschat et al. study, 19 of the 36 subjects

were assessed using the measures in Table 2.3 before and after mobility training.

Mobility ratings were reported to have improved significantly after training. To

assess whether the measures were valid, the five O&M instructors were asked to

rank subjects in terms of mobility improvement. The combined mean instructor

ranking was found to correlate significantly with the change from pre- to post-instruction.

However, it has been suggested that an improvement could have been

expected without training due to the effects of practice on the mobility course

[208]. An additional problem, discussed by the authors, was that there was a low

number of recorded pre-test mobility incidents.


Table 2.3: Mobility measures used in Geruschat & de l’Aune [72]

1. Unsafe street crossing (crossing at an inappropriate time or to an incorrect area)

2. Bumps (body contact (excluding hands) with any person or object)

3. Stumbling (change in posture/gait as a result of contact below the knee)

4. Orientation (change in direction that does not match instruction or subject is unable to complete section)

5. Drop-offs (unexpectedly stepping off a curb or step)

Navigation

In more recent research, Loomis et al. [136] conducted an experiment involving

GPS navigation aids for the blind. For their study, the research participant was

taken to a large open field and was requested to walk along a route specified by

a computer. Loomis et al. suggest that visually impaired subjects will soon be

navigating indoors and outdoors using GPS-based navigation systems and local position

technology (such as Talking Signs). If a person’s mobility was combined with

the ability to navigate (for example by using the PPWS measure), this type

of experiment could be a valuable way of assessing the benefits of GPS-based

navigation devices.

In summary, the most frequently used measure in field experiments is the

PPWS. Contacts between obstacles and a person’s body are also frequently used

as dependent variables.

2.8.3 Mobility assessment: Artificial environment research

An artificial or laboratory-based study involves designing and constructing a mo-

bility course for use in mobility assessment. The two significant advantages of

artificial environments over field environments are that they provide better

experimental control over variables, and that the results should be easier to replicate.

Although artificial environments provide the most reliable results, there is the risk


that the results may be too artificial to generalise to real-world situations. However,

it has been reported by West et al. [243] that performance in their artificial as-

sessment environment and a person’s home environment were highly correlated.

Jansson [115] suggests that the artificial environment method should be the main

method during the development phase of a new ETA.

However, an artificial environment may not include many of the variables

that affect mobility performance, such as traffic sounds, pedestrian density and

variation of footpath surface [98]. Also, artificial environments do not involve the

same level of risk (such as a collision with a moving object) as a real world course

[172], and this may affect generalisability. An additional criticism of artificial

environments has been that they may be biased to favour a particular mobility

device [98].

Walking speed

In 1963, Michunas and Sheridan (cited in [200]) provided the first published

experiment on a simulated mobility environment. In their study they built a 51.8

meter long course with obstacles such as environmental sounds, steps up and

down and obstacles at head and ground level. The measures used were the total

time on course and a count of operationally defined harmful events. However,

the effects of the obstacles and masking sounds used on mobility performance in

this experiment were inconclusive due to masking conditions being provided in

the same order for all five blind participants.

Echolocation

An investigation into the usefulness of long-cane tapping noises for echolocation

(using reflected sound to identify objects) for blind people was conducted by

Schenkman & Jansson [193]. Subjects were presented with a range of objects to

detect under experimental conditions. The results indicated that objects could

be detected and localised by tapping sound alone, but it was difficult and the

results varied with the size of the object.

Table 2.4: Obstacle types used in Lovie-Kitchin et al. [137]

1. Suspended horizontal objects with a base at a minimum height of 140 cm

2. Suspended vertical objects with a base at a minimum height of 140 cm

3. Ground level objects with a maximum height of 37 cm

4. Ground level objects with a maximum height between 38 and 139 cm

5. Ground level large objects with a maximum height 140 cm or greater.

Percentage of Preferred Walking Speed

In a large artificial environment, Lovie-Kitchin et al. [137] examined which areas

of the visual field were important for safe efficient mobility. Eighteen subjects

were involved in the study, nine of whom were classified as low vision, with the

remaining nine normally sighted control subjects. Subjects received visual field

measurement and were then assessed twice on an indoor mobility course, each

under different illumination conditions. The mobility course was 79 meters in

length, and used 87 different obstacles (which are listed in Table 2.4). Mobility

was measured as time taken to complete the course, number of contacts with

obstacles and the number of times subjects strayed from the path or required

reorientation. The study found that the loss of visual field in the mid-peripheral

or peripheral inferior and lateral areas had the most significant effect on reduced

mobility.

In 2000, a non-guided version of the PWS was developed and assessed by

Soong et al. [207] and was found to be as reliable as the guided version. In this

version, the PWS is obtained by recording the subject walking down a 20 meter

corridor which does not contain obstacles.
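The relationship between the PWS and the PPWS can be illustrated with a short sketch. It assumes the usual reading of the measure (walking speed on the mobility course expressed as a percentage of the speed on the obstacle-free corridor); the function names and the timings for the hypothetical subject are illustrative only and are not taken from Soong et al.

```python
def walking_speed(distance_m: float, time_s: float) -> float:
    """Return average walking speed in metres per second."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def ppws(course_distance_m: float, course_time_s: float, pws_m_per_s: float) -> float:
    """Course walking speed as a percentage of the preferred walking speed (PWS)."""
    if pws_m_per_s <= 0:
        raise ValueError("preferred walking speed must be positive")
    return 100.0 * walking_speed(course_distance_m, course_time_s) / pws_m_per_s


# Hypothetical subject: walks the 20 m obstacle-free corridor in 16 s,
# then completes a 79 m mobility course in 120 s.
pws = walking_speed(20.0, 16.0)   # 1.25 m/s
score = ppws(79.0, 120.0, pws)    # roughly 53% of preferred speed
print(f"PWS = {pws:.2f} m/s, PPWS = {score:.1f}%")
```

Because the score is normalised by each subject's own preferred speed, it allows comparison between subjects who naturally walk at different speeds.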

PPWS was also used to evaluate the effect of O&M training on mobility


performance by Soong et al. [208]. Thirty-seven visually impaired subjects were

involved in the study: 19 underwent mobility training and a control group of 18

subjects did not receive training. The unguided PPWS and mobility incidents

were used to assess subject mobility on an indoor course. Two visits to the

mobility course were conducted - each subject conducted the mobility trial twice

on each visit. Where subjects in the training group were prescribed mobility aids,

these were used in the second trial. The indoor mobility course was constructed

in two linked laboratories and was 78.9 meters long. The allocation of the 100

obstacles used was similar to that in [137], with five different height ranges used (0-13cm,

14-49cm, 50-99cm, 100-150cm, 151+ cm). Half of the obstacles were covered in

light grey paper to provide high luminance, and the other half were covered in dark

grey to provide low luminance. Subjects were asked to proceed through the course

while carrying out two typical mobility tasks: taking a small packet to a bench;

and picking up three empty food containers, placing them in a bag and carrying

them to another table. Errors were defined as body contacts with obstacles;

errors made while conducting the two tasks; and straying off the mobility path

(which was marked by rolled-up bubble wrap). If a subject was unable to reorient

after contacting an obstacle or straying from the path, two errors were

counted. Surprisingly, this study found that O&M training did not enhance

mobility performance compared to the control group, and any improvement was

simply the result of practice.

Walking speed and obstacle contacts

A simulation of an AHV system and the effects on mobility were investigated

by Cha et al. [29]. Their ‘pixelized vision simulator’ device consisted of a video

camera connected to a monitor in front of the subject’s eyes. A perforated mask

was used in front of the monitor to reproduce the effect of individual phosphenes.

Optical lenses were then placed between the mask and the subject’s eye to reduce


the size of the image. This device was then used to investigate the feasibility of

achieving visually guided mobility with a visual prosthesis. An indoor maze was

constructed which allowed the test path and obstacle positions to be randomly

varied for each trial. The obstacles used were cylindrical paper columns which

were 5cm in diameter and 1.8m in length. The room was divided into 1.4 x 1.4

meter square blocks. Walls, cloth screens and the floor were white, whereas the

obstacles were black. Three 2.5 cm wide black strips about 50cm apart were

placed horizontally on the walls and screens to provide a high contrast indicator

of wall or screen location. Normally sighted, undergraduate subjects wore the

pixelized vision simulator while moving through the maze. Each subject was

tested in one two hour session per day over 8-10 trials. Subjects were asked

to move as quickly as possible, and walking speed and the number of contacts

with obstacles and walls were measured. These measurements were designed to

evaluate mobility performance as a function of pixel number, pixel spacing, object

minification and field of view.

Cha et al. [29] found that a foveally projected visual scene consisting of 625

(25 x 25) or more pixels with a field of view of about 30° allowed nearly normal

walking speeds and reported that this would provide good obstacle avoidance

and a sense of confidence to patients in familiar environments. Another finding

from Cha et al. was that head movement helped depth perception and improved

spatial resolution, but that this movement needed to be efficient to avoid a loss

of body balance.
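The 25 x 25 pixel condition can be approximated in software by block averaging a greyscale image. The sketch below is an assumption for illustration only: Cha et al. used an optical device (monitor, perforated mask and lenses), not this code, and the function name `pixelise` and the test image are hypothetical.

```python
def pixelise(image, grid=25):
    """Reduce a greyscale image (list of equal-length rows) to grid x grid block means."""
    rows, cols = len(image), len(image[0])
    if rows < grid or cols < grid:
        raise ValueError("image must be at least grid x grid pixels")
    out = []
    for gy in range(grid):
        # Row range covered by this output pixel.
        y0, y1 = gy * rows // grid, (gy + 1) * rows // grid
        out_row = []
        for gx in range(grid):
            # Column range covered by this output pixel.
            x0, x1 = gx * cols // grid, (gx + 1) * cols // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 100 x 100 test image: left half black (0), right half white (255).
test_image = [[0 if x < 50 else 255 for x in range(100)] for _ in range(100)]
sim = pixelise(test_image, grid=25)

# With a field of view of about 30 degrees across 25 pixels,
# each simulated pixel spans about 1.2 degrees of visual angle.
degrees_per_pixel = 30 / 25
```

Each output value is the mean brightness of one block, so an edge that falls inside a block appears as an intermediate grey level rather than a sharp boundary.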

There are two limitations with Cha et al.’s study. The first is that individ-

ual differences in walking speed are not taken into account, which a normalised

measure such as the PPWS would have done. This may restrict the study results

to examining how an individual subject was able to learn to move through the

course (and not differences between subjects). In addition the mobility course

was very artificial and the results may not generalise to a real-life environment.


For example the walls and floor were painted white, and obstacles were all the

same shape and size with very high contrast (black on white).

Another study which used walking speed as a dependent variable was by Kuyk

et al. who examined the effects of changing light level on mobility performance

[122]. This study involved the mobility assessment of 88 visually impaired adults

under different lighting effects. The performance measures used were: time taken

to walk the course and the total number of contacts (recorded by a trained O&M

instructor walking behind the subject) with objects in the course. These measures

were taken for each subject in normal and reduced illumination. Subjects wore

modified sun shades to simulate lower illumination levels. The mobility course

had two start points and one dead-end - it is unclear if the start points or illumi-

nation levels were randomised for different subjects. The course, constructed in

a laboratory, involved 60 objects, mostly foam cylinders of different types (such

as step-over, shoulder-to-head level and walk-around) in fixed locations. Each

object was rated as low or high contrast. The pathway was 3 to 5 feet wide and

was usually marked with dark blue tape on the floor. The contrast and location

of objects significantly affected mobility through the course. Also, the ability to

avoid these objects, particularly step-over objects, was significantly reduced with

low illumination.

In a follow up study, Kuyk et al. [123] evaluated the mobility of a further

156 subjects on the same course used in their earlier paper in order to assess how

mobility performance relates to visual function. Visual field extent and scanning

ability were found to be the best predictor variables for mobility performance.

Performance based measures

The Salisbury Eye Evaluation (SEE) project was designed to determine the relationship

between visual impairment and everyday tasks [243]. The measures for

everyday tasks include self-report and performance on the tasks listed below in

Table 2.5. These tasks are broader than typical O&M studies, and may provide

a good set of tests for artificial human vision assessment. However, recent publications

from the SEE project have focused on assessment of remaining visual

functions (such as visual field), PPWS and obstacle contacts [233], [169].

Table 2.5: Mobility and daily activities assessment from West et al. [243]

Category                   Task                  Measure
Mobility                   Walk 4m               m/s
                           Ascend 7 steps        steps/s
                           Descend 7 steps       steps/s
                           Chair ascent/step     Time to finish
Daily Living Tasks         Insert plug           s
                           Insert key            s
                           Dial telephone no.    s
Visually Intensive Tasks   Reading speed         words/min
                           Face recognition      no. recognised

Veering

Veering is a typical problem for a blind pedestrian when environmental cues are

unavailable for guidance. A tennis court which had been marked into a grid with

duct tape was used to measure the veering tendency of blind pedestrians in a

paper by Guth & LaDuke [86]. Four blind adults were assessed over three 15 trial

sessions which commenced with the subject standing against a portable wall and

then being asked to walk in a straight line for 25 meters. The overall average

veering error was found to be 11.5 degrees. A similar assessment method could be

used to determine how helpful different image processing methods are to prevent

veering in an AHV system.
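A veering error of this kind can be derived from grid measurements as the angle between the intended straight line and the line joining the start and end positions. The following is a minimal sketch under that assumption; the exact computation used by Guth & LaDuke is not described here, and the function names and trial values are hypothetical.

```python
import math


def veering_angle_deg(lateral_offset_m: float, forward_distance_m: float) -> float:
    """Signed angle (degrees) between the intended line and the start-to-end line."""
    return math.degrees(math.atan2(lateral_offset_m, forward_distance_m))


def mean_abs_veer(trials) -> float:
    """Mean absolute veering angle over (lateral_offset_m, forward_distance_m) pairs."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(abs(veering_angle_deg(lat, fwd)) for lat, fwd in trials) / len(trials)


# Hypothetical trials: positive offsets veer right, negative offsets veer left.
trials = [(5.0, 24.5), (-3.0, 24.8), (1.0, 25.0)]
print(f"mean absolute veer: {mean_abs_veer(trials):.1f} degrees")
```

Taking the absolute value before averaging prevents leftward and rightward veers from cancelling each other out.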


Search strategies

Search strategies are important in the navigation and orientation of visually im-

paired people. These strategies were examined by Hill et al. [102] in a study

which involved 65 subjects (mean age 33 years), who were blind or only had light

perception. The same testing environment was constructed in eight different lo-

cations, and involved a 15 x 15 foot square, bordered by four strips of rubberised

matting. A baseball, glove, hat and a cup were placed on plastic baseball tees in

specific locations within the square. Subjects were monitored after being asked

to judge the directions from some of the targets to others. Subjects who used a

range of search strategies (such as perimeter, mental image or object to object)

performed better than those who relied on a single strategy.

2.8.4 Mobility assessment: Combined field experiment and artificial environment research

By combining both artificial environment and field experiments it may be possible

to generate results which are generalizable to the real-world, and which can also

be controlled and replicated.

Obstacle contacts and disorientation

Early work evaluating low vision mobility (rather than blind mobility) was pub-

lished in 1982 by Marron & Bailey [145]. This experiment investigated visual

factors in mobility in both outdoor and indoor test courses. The outside course

was a city block which contained a series of objects such as high contrast mail-

boxes and low contrast footpath edges. The illumination and contrast conditions

were kept approximately the same for all 19 subjects by testing in the early afternoon

during a three-week period. The indoor test course was a long corridor (12.2 meters

long and 2.4 meters wide), with walls covered to make them a similar colour and

luminance to the floor. Paper cylinders of varying diameters and lengths were

hung from the ceiling. Details of how many cylinders were used, or of their location on the

course, were not provided. The cylindrical shape was chosen to reduce sharp edges

and shadows which could be used as detection cues. The error scores (listed in

Table 2.6) from these courses were then compared to the results of visual field,

spatial contrast sensitivity and visual acuity assessments. There was a poor correlation

found between subjects’ performance on the Snellen visual acuity chart and

mobility performance; however, the combined effects of spatial contrast sensitivity

and visual fields were found to correlate significantly with mobility performance.

Table 2.6: Mobility measures used in Marron & Bailey [145]

Problem type                                                   Score
Contact with obstacle or disoriented for less than 5 seconds   1 point
Contact with obstacle or disoriented for 5 to 15 seconds       2 points
Longer than 15 seconds to reorient and required assistance     3 points
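The scoring scheme in Table 2.6 can be expressed as a small function. This is a sketch only: the handling of the boundaries (exactly 5 or 15 seconds) and the treatment of assistance as sufficient for 3 points are assumptions, since the table does not state them, and the function name and incident data are hypothetical.

```python
def incident_score(seconds_to_recover: float, needed_assistance: bool = False) -> int:
    """Score one mobility incident on the 1 to 3 point scale of Table 2.6."""
    if seconds_to_recover < 0:
        raise ValueError("recovery time cannot be negative")
    # Assumption: needing assistance, or taking longer than 15 s, scores 3 points.
    if needed_assistance or seconds_to_recover > 15:
        return 3
    # Assumption: the 5 to 15 second band includes both end points.
    if seconds_to_recover >= 5:
        return 2
    return 1


# Hypothetical incidents for one subject: (recovery time in seconds, assisted?)
incidents = [(2.0, False), (7.5, False), (22.0, True)]
total_error_score = sum(incident_score(t, a) for t, a in incidents)
```

The sketch makes the dependence on response time explicit: two subjects with the same number of obstacle contacts can receive different totals purely because one takes longer to recover.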

A problem with this study is that the dependent variables involve timing the

response to mobility incidents (such as scoring 3 points for taking longer than 15

seconds to reorient). These differences in responding to incidents may have been

related more to individual differences (such as personality) than to the effects of different

levels of existing vision. In addition there can be significant differences in lighting

levels over a three week period, which could have been measured and recorded.

In a 1990 follow up study to Marron and Bailey, Long et al. [135] examined

the mobility of subjects with moderate levels of low vision. Twenty-two subjects were

involved in this study, with an average age of 36.1 years. An assessment of vision

was conducted before the mobility assessment, which involved subjects walking

through six unfamiliar routes. These routes consisted of two paths in three different

environments (classroom building, residential area and small business area).

To simulate low illumination, subjects wore 1 percent ultraviolet sunglasses for

one path per setting. An effort was made in this study to prevent disorientation,

and subjects were provided with navigation instructions and corrected by

an O&M instructor when moving in the wrong direction. Mobility behaviours

that occurred during periods of disorientation were ignored. During the mobility

tasks, subjects were also asked to identify whether a tone, presented every 10-20

seconds, was high or low. This response was recorded, and tested to see if performance

on the tone task varied as a function of the demands of the primary

mobility task; however, no results were provided for this part of the study. Mobility

performance was videotaped and assessed by one of three pairs of scorers,

who counted the frequency of behaviours listed in Table 2.7. The percentage

agreement between observers across different routes and levels of illumination

was reported as 86.4% (agreement for the indoor course was highest, at 93%, followed

by 90% for the residential route and 68% agreement for the small business

environment). One difficulty with this study is the lack of detail on the mobility

courses used (such as the number and type of obstacles), which makes it difficult

to replicate the results. An individual’s visual fields and contrast sensitivity were

found to be related to mobility performance. Visual acuity was not found to be

related to mobility performance.

Table 2.7: Mobility incidents scored in Long et al. [135]

1. High stepping (looking for a step which is nonexistent or unexpectedly shallow)

2. Missed curb (overstepping a curb because it was not seen)

3. Loss of Balance (tripping, stumbling or mild unsteadiness in balance)

4. Object contact (with any part of the body)

5. Shuffling (sliding foot forward to investigate the path)

6. Stop (stopping inappropriately)

7. Spotter intervention

8. Veer (abrupt change in direction of travel or side-stepping)

9. Off Path (veering off footpath or into adjacent hallways or open areas)
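A percentage agreement figure of the kind reported by Long et al. can be computed from two scorers' records of the same trials. The sketch below assumes the simplest definition (the proportion of scored items on which both observers recorded the same value); the exact procedure used in the study is not described here, and the function name and data are hypothetical.

```python
def percent_agreement(scorer_a, scorer_b) -> float:
    """Percentage of items on which two scorers recorded the same value."""
    if len(scorer_a) != len(scorer_b):
        raise ValueError("both scorers must rate the same number of items")
    if not scorer_a:
        raise ValueError("at least one rated item is required")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)


# Hypothetical incident counts recorded by two scorers for four route segments.
scorer_a = [1, 0, 2, 1]
scorer_b = [1, 0, 1, 1]
agreement = percent_agreement(scorer_a, scorer_b)
print(f"agreement: {agreement:.1f}%")
```

Simple percentage agreement does not correct for agreement expected by chance, which is one reason it tends to look more favourable when incidents are rare.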


Pass or Fail mark

A combination of controlled and field experimentation was also used in a 2002

study of night vision goggles for people with degenerative retinal diseases (which

can impair night vision). Spandau et al. [210] used a totally darkened room to

test mobility in an artificial environment. The average age of the 42 subjects

involved in this study was 35 years (range of 10 to 70 years). The dark room task

required the subjects to walk around the room avoiding obstacles, name objects

in the room and read a visual acuity chart. The field experiment component of

this study took place at night in Heidelberg, Germany, and included a residential

area, a strip mall with bars, restaurants and shops (with many pedestrians),

high traffic and noise areas. In this study mobility was assessed by an O&M

instructor who allocated each subject a pass or fail mobility grade (no further

details were given about the method of assessment). Subjects were also asked to

fill out a pre-test and post-test mobility questionnaire. Most subjects were found

to adjust quickly to the night vision device and improved their mobility at night.

2.8.5 Mobility Assessment Conclusion

The development of objective, valid and reliable assessment techniques should

enable the comparison of O&M performance across different ETAs, visual

prostheses and other mobility devices. These comparisons should be conducted by

independent observers to reduce bias. This section has reviewed a large number

of research efforts in assessing mobility using self-report, field and artificial ex-

periments. These papers are briefly summarised in Table 2.8. The most widely

supported mobility measures have been PPWS and mobility incidents (generally

defined as contact with obstacles, although veering is also frequently used).

The layout and type of obstacles used could be standardized to increase the

comparability of studies. In addition the layout of the mobility test route should


be changeable to reduce practice effects.

Although psychological variables (such as personality factors) have been considered

in some studies, none of the reviewed papers have considered the combined effects

of blindness and other impairments (such as hearing loss) on mobility. In addition,

as pointed out by Geruschat et al. [72], the sample size used in experiments is often

small, which limits the ability to generalise the research.

2.9 Chapter Summary

This chapter has reviewed major mobility issues which a blind or vision impaired

person might experience. The main hazardous situations for blind mobility are

drop-offs, obstacles and fast moving objects. Mobility aids, both traditional and

electronic, have also been reviewed. The common feature of these aids is that they

provide additional preview information to the blind traveller. The information

from these reviews is important for the development of an Artificial Human Vision

(AHV) device, and it provides areas of need to be targeted by a device. The

assessment of mobility was reviewed in this chapter in some detail, because this

research provides a methodology which can be used in future simulation research

and allows the comparison of AHV mobility research with other literature.


Table 2.8: Summary of mobility assessment research discussed in this chapter. ‘Time’ is the amount of time on the course, ‘obst.’ is a count of obstacle contacts, and ‘veer.’ is the number of incidents of veering from a path.

Authors                      Main method               Comment

Self report
Brambring [22]               Transcription             Walking experience reported
Passini et al. [168]         Interview                 Major mobility problems reported

Field experiments
Armstrong [6]                Video                     13 measures used
Dodds [54]                   PWI                       Outdoor path
Clarke-Carter et al. [41]    PPWS                      Large backpack used
Dodds [55]                   PPWS & PC based           Visual task performance used
Haymes et al. [94], [95]     PPWS, personality         Acuity not useful
Geruschat et al. [74]        Multiple measures         Pedestrians avoid collision
Beizen et al. [15]           Road identified           Curb ramps often missed
Tanaka et al. [223]          Heart rate                Confounded variable
Geruschat et al. [72]        Instructor ratings        Inter-rater ok
Loomis et al. [136]          Instructions followed     Promising GPS system

Artificial environment
Michunas et al. [200]        Time, obst.               No sig. findings
Schenkman et al. [193]       Object detected           Echolocation study
Lovie-Kitchin et al. [137]   Time, obst., veer.        Peripheral vision important
Soong et al. [207]           PPWS, obst., veer.        O&M training not effective
Cha et al. [29]              Time, obst.               1st AHV simulation
Kuyk et al. [122], [123]     Time, obst.               Illumination important
Guth et al. [86]             Veering angle             Consistent veering
Hill et al. [102]            Search path               Search strategy important

Combined
Marron & Bailey [145]        Obst. & vision measures   Some support for vision measures
Long et al. [135]            Obst., veer. and others   Acuity not related to mobility
Spandau et al. [210]         Pass or fail              Night vision goggle study

Chapter 3

A Review of Artificial Human Vision

3.1 Introduction

This chapter provides a comprehensive review of AHV technology and research.

An overview of the Human Visual System (HVS) and requirements for an AHV

system are given, followed by a discussion of work in the four main locations

for electrode stimulation: either on the surface or penetrating the visual cortex

(Section 3.3), either behind or just in front of the retina (Section 3.4) and, finally,

surrounding the optic nerve (Section 3.4). As the number of implanted individuals

is limited, and psychophysical experiments on normally sighted subjects using

AHV simulation are often used, Section 3.6 presents a review of the literature on

AHV simulation.

3.2 Review of the Human Visual System

This section provides a description of the HVS, and defines many related terms

which are used in the remainder of this chapter.


Figure 3.1: Horizontal diagram of the human eye. The locations for the epi- and sub-retinal implants and the optic nerve electrode are shown. Adapted from Gregory [83]. (Labelled structures: optic nerve, lens, cornea, retina, sclera and choroid, sub-retinal implant, epi-retinal implant, optic nerve cuff electrode.)

In the functioning human vision system, two types of photoreceptors (cells

which convert light energy to neural responses) in the retina (known as rods and

cones) are activated by light which has been focused by the lens and cornea in

the eye (see Figure 3.1). The rods provide scotopic vision, the ability to see in

dim light. Rods, which number around 91 million, are not colour sensitive but

are specialised for sensitivity to light [156]. In addition to the rods, there are

three types of cones which each contain a photopigment sensitive to different

wavelengths of light (blue, green and red). The retina contains approximately

4.5 million cones which are concentrated in a small region in the centre of the

retina called the fovea. It is this retinal location which provides the highest

visual acuity and colour vision [239]. The firing frequency of a receptor and

its neuronal connections is reduced under constant stimulation, which is called

adaptation [156].


Figure 3.2: A simplified diagrammatic representation of the cellular layers of the retina. Light passes through the outer layers of the retina before being absorbed by the rods and cones of the photoreceptor layer. Adapted from Sharp and Phillips [199]. (Labels: photoreceptor cells; horizontal, bipolar and amacrine cells; ganglion cells; to optic nerve; light.)

Electrical signals from these photoreceptors are then passed through a layer of

bipolar cells to the ganglion cells within the retina (see Figure 3.2). However, before reaching

the ganglion cells, the output of the bipolar cells may be modified by two other cell types:

horizontal cells (which laterally inhibit the output of the bipolar cells, producing

concentric receptive fields, such as ‘off-centre, on-surround’) and amacrine cells

(associated with temporal responses shown by some ganglion cells) [199].

There is a strong convergence of signals from the rods, and a single rod bipolar

cell can integrate the signal from 1500 rods. Therefore, the amount of information

entering the eye is reduced considerably, from the signals of approximately 94.5 million

rods and cones to those of around 1 million ganglion cells [156].

The axons of the ganglion cells make up the optic nerve which carries visual

information from the eye (via the optic disk) to the optic chiasma (located at the

base of the hypothalamus) (see Figure 3.4). The human brain is divided into two

separate halves (hemispheres) which are connected by a bundle of fibres called the


corpus callosum. In humans the axons in the optic nerve from the left halves of

each retina run through the optic chiasma to the left lateral geniculate nucleus

(LGN) and the opposite occurs for the right halves of each retina. This allows

images of the same object formed on the right and left retinas to be processed

in the same part of the brain [24]. The axons of the LGN then travel in an

optic radiation (shown in Figure 3.4) and terminate in the primary visual cortex

(which is part of the occipital cortex). The projection from each LGN to the

primary visual cortex is ordered, and each part of the retina is represented in the

primary visual cortex [253]. The map of the retina on the cortex is an example

of a retinotopic map [239].

Not all of the optic nerve is connected to each LGN. About 20-30% of the

ganglion cell axons connect to the superior colliculus (a small part of the brain

present in each hemisphere), which provides a cruder retinotopic mapping and is

responsible for eye movements [83]. This location is responsible for a phenomenon

called blindsight, in which a person who has had their visual cortex removed is

aware of the location of objects although they are unable to recognise them [24].

The primary visual cortex is also known as the striate cortex because of its

distinctive visible striation (layers) [253]. The primary visual cortex is also known

as V1, while associated areas in the occipital lobe are known as V2 through V6.

Six ordered layers are visible, with layer 1 being closest to the surface of the

brain. Layer 4 consists of a number of subdivisions: 4A, 4B, 4Ca and 4Cb [199].

In addition to the layers, the striate cortex is structured into ocular dominance

columns (which can combine input from both eyes for the purpose of depth per-

ception) [156]. The ocular dominance columns are in turn divided into orientation

preference columns which respond to the orientation of receptive fields (such as

bars of light or edges in a particular orientation).

Although most visual information appears to be processed first in the V1 area,

there are many other areas involved in processing visual information such as V2,

V3, the mid-temporal cortex (where neurons are particularly sensitive to stimulus

movement), V4 (colour processing) and the inferotemporal cortex (where stimulus

size, shape, contrast and colour appear to be processed) [227].

Figure 3.3: The cortical lobes of the human brain. The primary visual cortex, which is the site for cortical electrode array implants, is also shown. Adapted from Wandell [239]. (Labelled regions: primary visual cortex, occipital lobe, parietal lobe, frontal lobe, temporal lobe.)

Figure 3.4: Diagram of the main pathways in the HVS. Adapted from Bruce et al. [24]. (Labelled structures: optic nerve, optic chiasm, lateral geniculate nucleus, superior colliculus, optic radiation, visual cortex.)

3.3 AHV technology and requirements

The development of an AHV system is a multidisciplinary field, involving input

from neuroscience, engineering, computer science, and ophthalmology, in addition

to orientation and mobility specialists.

With the exception of subretinal prostheses, most AHV systems have similar

system requirements. The main components, which will need to function in real

time, are:

A Camera is required to capture and digitise image information from the

environment. Charge-Coupled Device (CCD) based digital cameras are inexpensive,

small and can be easily interfaced to other system components. An

adaptive mechanism (such as an automatic gain in current video cameras) will

also be required to allow the device to function at different levels of illumination

[45]. Complementary Metal-Oxide Semiconductor (CMOS) sensors are an alternative to

CCD based cameras. Both CCD and CMOS camera sensors have a linear response

to light intensity. A logarithmic camera, in contrast, has a response similar to that of the

human visual system, and can reduce saturation in high contrast visual scenes.

The use of a logarithmic camera in an AHV is being investigated in at least one

current research project [170]; however, this method could also be applied to a

CCD or CMOS camera using a log transform of intensity.
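A log transform of intensity can be sketched in a few lines. The scaling below, which maps an 8-bit linear intensity back onto the same 0 to 255 range, is an assumption chosen for illustration and is not taken from the cited project; the function name is hypothetical.

```python
import math


def log_transform(intensity: float, max_value: float = 255.0) -> float:
    """Map a linear intensity in [0, max_value] to a log-compressed value in the same range."""
    if not 0 <= intensity <= max_value:
        raise ValueError("intensity out of range")
    # log1p(x) = log(1 + x), so an intensity of 0 maps to 0 without a special case.
    return max_value * math.log1p(intensity) / math.log1p(max_value)


# Dark values are spread over a wider output range and bright values are compressed.
for value in (0, 15, 63, 255):
    print(value, round(log_transform(value), 1))
```

Applied to every pixel, this allocates more of the output range to darker regions, which is the property that helps reduce saturation in high contrast scenes.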

Image processing: There will usually be more data retrieved from the cam-

era than can be used in an AHV device. The image data will usually be pre-

processed to reduce noise. After this, an information reduction (such as edge

detection or segmentation) or a scene understanding approach (attempting to ex-

tract information) can be used. Further details on the image processing methods

are provided in Chapter 4. Cortical prosthesis research by the Dobelle Institute

has found that edge detection and image reversal enhance the ability of subjects

to recognise important scene components (such as doorways) [48]. An alterna-

tive approach to traditional image processing is the use of neuromorphic vision

systems, designed to mimic the design of the human visual system [16].
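The information reduction step can be illustrated with a deliberately simple example. The sketch below reduces a greyscale image to a small grid of brightness values by averaging blocks of pixels, with optional image reversal. Block averaging is used here only as a stand-in: it is not the edge detection reported by the Dobelle Institute, and the function name, grid size and 8-bit range are assumptions.

```python
def reduce_to_grid(image, grid_rows, grid_cols, reverse=False, max_value=255):
    """Reduce a greyscale image (list of rows) to grid_rows x grid_cols values
    by block averaging. With reverse=True, dark and light are swapped."""
    rows, cols = len(image), len(image[0])
    if rows % grid_rows or cols % grid_cols:
        raise ValueError("image size must be a multiple of the grid size")
    block_h, block_w = rows // grid_rows, cols // grid_cols
    grid = []
    for gr in range(grid_rows):
        out_row = []
        for gc in range(grid_cols):
            # Sum the pixels that fall inside this block
            total = 0
            for r in range(gr * block_h, (gr + 1) * block_h):
                for c in range(gc * block_w, (gc + 1) * block_w):
                    total += image[r][c]
            mean = total / (block_h * block_w)
            out_row.append(max_value - mean if reverse else mean)
        grid.append(out_row)
    return grid


# A 4x4 image that is dark on the left and bright on the right
image = [[0, 0, 255, 255] for _ in range(4)]
print(reduce_to_grid(image, 2, 2))
print(reduce_to_grid(image, 2, 2, reverse=True))
```

Each output value would correspond to the brightness of one phosphene; the methods actually evaluated in this thesis are described in Chapter 4.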

Transmitter/Receiver: A link is required from the camera/image process-

ing components to the stimulator and electrode array, which are usually located

inside the body. Percutaneous connections, which involve a wire or cable fed

through the skin, have been used for most research because they are simple and

reliable; however, the risk of chronic infection is higher with this type of connection

[159]. The Dobelle Institute system uses a percutaneous connecting pedestal for


connection to the image processing unit (a notebook PC). A transcutaneous con-

nection does not involve cables passing through the skin. This type of connection

is commonly used in cochlear implants [133], and uses radio frequency telemetry

to send data and power to the embedded stimulator, reducing the risk of infec-

tion. Most AHV research projects are planning to eventually use transcutaneous

connections. Reverse telemetry can also be used to provide details of stimulation

voltage waveforms, impedance measurements and reconstruction of stimulation

voltage waveforms [217]. A good description of a high efficiency transcutaneous

data link for implanted electronic devices is provided in a 1992 paper by Troyk

and Schwan [231].

Stimulator/Electrodes: An electrode is a thin wire, which allows a small

amount of precisely controlled electrical current to pass through it. Electrodes

can be used for either stimulation or recording the electrical activity of the brain.

Two important parameters which can be varied for electrodes are amplitude

(the highest value reached by a current) and pulse duration (generally defined

to be the time interval between the pulse amplitude reaching half of its final

value and the time where the pulse amplitude returns to that value again [2]).

The purpose of the stimulator is to send current through multiple electrodes.

There are two main types of electrodes discussed in the AHV literature: surface

electrodes, which lie flat against the stimulation/recording target; and penetrat-

ing electrodes, which are inserted inside the stimulation/recording target. The

biocompatibility, long-term effectiveness, and safe threshold levels for implanted

electrodes need to be carefully considered [46]. Electrodes can stimulate tissue

using monophasic (either positive or negative) or diphasic (alternating between

positive and negative) stimulation. However, the monophasic method can cause

cell damage [221].
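The two stimulation parameters defined above, and the difference between monophasic and diphasic stimulation, can be sketched on sampled waveforms. In the Python example below, pulse duration is measured between the half-amplitude points (for an ideal rectangular pulse the peak and final values coincide), and net charge is computed as current multiplied by time. The waveforms, the sample period and the function names are hypothetical. The link between net charge and tissue damage is the commonly given explanation for preferring balanced pulses and is stated here as background, not as a result of this thesis.

```python
def pulse_duration(samples, sample_period):
    """Width of a single-phase pulse measured between its half-amplitude points.

    samples: current values at equally spaced times (any consistent unit).
    sample_period: time between samples; the result is in the same unit.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0
    # Indices of samples at or above half of the peak magnitude
    above = [i for i, s in enumerate(samples) if abs(s) >= peak / 2]
    return (above[-1] - above[0] + 1) * sample_period


def net_charge(samples, sample_period):
    """Net charge delivered: sum of current multiplied by time over all samples."""
    return sum(samples) * sample_period


# Hypothetical waveforms sampled every 50 microseconds, in microamperes
monophasic = [0, 0, 100, 100, 100, 100, 0]
diphasic = [0, -100, -100, -100, -100, 100, 100, 100, 100, 0]

print(pulse_duration(monophasic, 50))  # width in microseconds
print(net_charge(monophasic, 50))      # microamperes x microseconds = picocoulombs
print(net_charge(diphasic, 50))        # balanced phases cancel
```

In this sketch the monophasic pulse leaves a non-zero net charge at the electrode, whereas the two phases of the balanced diphasic pulse cancel.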

3.4 Cortical stimulation

Cortical-based AHV systems use either surface or intracortical (using penetrating

electrodes) stimulation. Cortical stimulation is the only treatment available for

blindness caused by glaucoma, optic atrophy or diseases of the central visual

pathways (such as brain injuries or stroke). The main negative feature of a cortical

implant is that it bypasses the preliminary processing normally performed earlier in the visual pathway (particularly in

the retina, where much of the information reduction takes place).

Most research on AHV has focused on sending a captured image to the brain

as a bitmap representation. The ‘bitmap’ approach to cortical devices has been

questioned [230]. Research by Hubel and Wiesel [105] on macaque monkeys has

found that, in addition to spatial location of a stimulus in the visual field, neurons

in the visual cortex are selective for spatial, temporal, chromatic and binocular

cues. A greater knowledge of cortical physiology may be required before a cortical

prosthesis provides useful vision. Evidence has also been found that there may be

specialised cortical areas for the analysis of biologically important images (such

as faces) [187].

3.4.1 Cortical surface stimulation

The early developments in cortical prostheses involved surface electrode arrays.

The first person to expose the human occipital lobe to electrical stimulation

was the German researcher Otfrid Foerster, who in 1929 noticed that stimulation

caused the subject to see a spot of light in a position which depended on the site

of stimulation [90].

Early surface stimulation research

Brindley and Lewin published the results of their groundbreaking study on corti-

cal stimulation in 1968 [23]. In their study a 52-year-old legally blind subject was


implanted with an array of eighty platinum electrodes, a design which had previ-

ously been tested in baboons. These electrodes were stimulated by pulsed radio

signals from an oscillator. Stimulation of these electrodes produced discernible

phosphenes. Brindley and Lewin suggested that there was probably no flicker

fusion frequency (i.e. the frequency at which an intermittent light stimulus is

perceived as continuous) for this implant. They also found that phosphenes

moved with eye movements and that phosphene perception usually (but not al-

ways) stopped when stimulation ceased. Stimulation of one electrode was found

to produce multiple phosphenes, and when multiple electrodes in close vicinity

were activated a larger, straight light phosphene was produced. Unfortunately,

the monophasic stimulus pulses used long-term in these earlier studies were also

likely to cause irreversible damage at the electrode-tissue interface [221].

William Dobelle

Brindley and Lewin’s research inspired pioneering work on 37 human subjects by

Dobelle and Mladejovsky in 1974 [49], where electrical stimulation was applied to

patients hospitalised for cranial surgery. Supporting Brindley and Lewin’s work,

they found eye movements caused phosphenes to move, and multiple phosphenes

could be produced from a single electrode. However, Dobelle and Mladejovsky

found that constant stimulation caused phosphenes to fade (suggesting that re-

fresh of the phosphenes is required). In a later paper [50], it was reported that

subjects were able to read electrode-induced Braille characters more efficiently

than using their tactile sense.

In 2000, Dobelle published a paper [48] describing a subject who had been

using a cortical visual prosthesis system for over 20 years. The system used a 64-

channel electrode array, which had been implanted on the mesial surface (towards

the middle) of the subject’s right occipital lobe in 1978. When stimulated, each

electrode produced 1-4 closely spaced phosphenes. The stimulation parameters


and phosphene locations had been stable for the past 20 years; however the elec-

trode thresholds required a 15 minute recalibration every morning. This system

utilised a black and white camera connected to a notebook computer. Cables

from the notebook were connected to a percutaneous connecting pedestal, which

interfaced to the microcontroller, stimulus generator and electrode array. Dobelle

reported that ‘frame rates’ of around 4 frames per second were found to be

optimal. Using the device, Dobelle found that the subject had a visual acuity of

approximately 20/200.
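To put an acuity of 20/200 in context, a Snellen fraction can be converted to the decimal and logMAR scales. The short Python example below uses the standard conversions (decimal acuity is the fraction itself; logMAR is the base-10 logarithm of its reciprocal); the function names are chosen here for illustration.

```python
import math


def snellen_to_decimal(test_distance, reference_distance):
    """Decimal acuity is the Snellen fraction evaluated as a number."""
    return test_distance / reference_distance


def snellen_to_logmar(test_distance, reference_distance):
    """logMAR is log10 of the reciprocal of the Snellen fraction."""
    return math.log10(reference_distance / test_distance)


print(snellen_to_decimal(20, 200), snellen_to_logmar(20, 200))
print(snellen_to_decimal(20, 20), snellen_to_logmar(20, 20))
```

On these scales 20/200 corresponds to one tenth of the 20/20 reference acuity.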

Bionic Eye Research Project

Although research in the early 1990s moved toward intracortical stimulation,

a recently commenced project by Chowdhury et al. at the University of New

South Wales is investigating the use of technology adapted from cochlear implants

(which generally use surface electrodes) [39], [40]. An in vivo model has been

successfully applied in animal experiments involving cats, where the transcallosal

evoked response to cortical stimulation is recorded on the opposite hemisphere to

the site of stimulation (this is possible as there are direct corpus callosum neural

pathways between surface points on the two hemispheres of the brain). Future

experiments are planned with a human subject who, unlike the cat, will be able

to describe their subjective response to stimulation.

3.4.2 Intracortical stimulation

National Institutes of Health

The Neuroprosthesis Program at the U.S. National Institutes of Health (NIH) was

the first to publish research concerning the use of intracortical stimulation to pro-

duce phosphenes. In a study by Bak et al. [8], three normally sighted patients,

undergoing occipital craniotomies (opening of the skull) for other conditions, were


tested for an hour each. Surface stimulation produced the same phosphenes de-

scribed by Dobelle and Brindley. After this, a dual microelectrode was inserted

to level 4B in the primary visual cortex and stimulation applied. Unlike surface

electrodes, the intracortical electrode phosphenes did not flicker. An additional

important finding from this research was the discovery that intracortical stimu-

lation required 10-100 times less electrical current to produce phosphenes than

surface electrodes. Also, intracortical electrodes spaced as closely as 500 µm apart

could evoke distinct phosphenes.

A more detailed experiment by the NIH team was described in 1996 by

Schmidt et al. [194]. Thirty-eight microelectrodes were inserted into the right

visual cortex of a 42-year-old woman for four months. The patient, who had been

blind for 22 years, was consistently able to perceive phosphenes at stable posi-

tions in visual space. Phosphenes were produced with 34 of the microelectrodes,

at thresholds usually at 25 µA. It was found that these phosphenes did not flicker

and changing the stimulus amplitude, frequency and pulse duration could change

phosphene brightness. A perception of depth from the stimulation was reported.

It was also found that as the stimulation level was increased, the phosphenes

generally changed colour (varying from white, ‘yellowish’ and ‘greyish’). Sup-

porting earlier research, phosphenes moved with eye movements. Schmidt et al.

suggested that, using this method, electrodes could be placed five times closer together than

with surface stimulation. An important result of this study concerned after-discharge:

one phosphene was observed for up to 25 minutes after cessation of stimulation,

which suggests that even small electrical currents from repeated, patterned stim-

ulation may cause epilepsy. At least six of the electrode leads broke during the

study, due to accidental movement of the patient during sleep, which limited test-

ing on pattern recognition. The percutaneous leads and electrodes were removed

after four months.

The NIH Neuroprosthesis Program described above was discontinued by 2001


[183]. However, there is continuing collaboration with the Intracortical visual

prosthesis team at Illinois Institute of Technology (see below).

University of Utah

The University of Utah currently has an active intracortical research group led

by Richard Normann. This team has focused mainly on electrode array design

for stimulation and recording, behavioural experiments and psychophysical ex-

periments (for example the Cha et al. AHV simulation studies described in the

previous chapter).

The University of Utah has developed an array of 100 penetrating cortical

electrodes, each 1.5 mm in length and separated by 400 µm. This length has

been selected to reach level 4Cβ of the primary visual cortex (area V1). Level 4Cβ

is an area responsible for receiving form information from the lateral geniculate

nucleus (LGN), in which neurons have the smallest and simplest receptive fields,

and where lower thresholds can be used for generating phosphenes [157]. Manual

insertion of the array was found to cause cortical deformation; therefore a pneumatic

insertion device was also developed and tested [190]. The biocompatibility

of this array has been extensively evaluated, and arrays have been inserted for

up to 14 months in cats [158]. The Utah Electrode Array (UEA) has been in-

vestigated as a recording structure for potential brain-computer interfaces [148],

and recently for investigating representations of simple visual stimuli in the cat

visual cortex [160]. A modification of the UEA is available which has graded elec-

trodes, allowing stimulation and recording to be conducted in both horizontal and

vertical directions [147].
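The geometry implied by these figures can be worked through in a few lines of Python. The sketch assumes that the 100 electrodes are arranged as a square 10 x 10 grid with a uniform 400 µm pitch; the text above gives only the electrode count and spacing, so the layout and the function name are assumptions.

```python
def electrode_positions_um(n_per_side=10, pitch_um=400):
    """(x, y) positions in micrometres for an assumed square grid of electrodes."""
    return [(col * pitch_um, row * pitch_um)
            for row in range(n_per_side)
            for col in range(n_per_side)]


positions = electrode_positions_um()
# Distance between the outermost electrodes along one side
span_um = max(x for x, _ in positions)
# Electrodes per square millimetre at this pitch
density_per_mm2 = (1000 / 400) ** 2

print(len(positions), span_um, density_per_mm2)
```

Under these assumptions the outermost electrodes are 3.6 mm apart along each side, so a single array covers only a small patch of cortex.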

Cortical Implant for the Blind (CORTIVIS)

The CORTIVIS project, commenced in 2001, is led by Eduardo Fernandez of

the University of Miguel Hernandez in Spain, and involves additional researchers


from Germany, Austria, France and Portugal.

This group has investigated the use of the UEA in animal experiments (cats,

rabbits and rats) over a period of 12 hours to six months. The electrodes were

found to be well-tolerated by the cortex, despite some inflammation in the vicinity

of the electrode tracks [65].

In order to develop a methodology to assess the feasibility of a cortical

prosthesis for a patient, and the preferred location for the prosthesis, Fernandez et

al. [67] have used transcranial magnetic stimulation (TMS) to evoke phosphenes

in 13 legally blind and 19 normally sighted patients. The advantage of TMS is

that it is painless and non-invasive. For each patient, twenty-eight positions ar-

ranged in a 2x2 cm grid over the occipital area were stimulated, and phosphenes

were perceived by 94% of the normally sighted participants. Interestingly,

however, only 54% of the legally blind patients perceived phosphenes using TMS

(even after adjusting the stimulation parameters). Evoked phosphenes were to-

pographically organised and the mapping results could generally be reproduced

between participants.

The CORTIVIS project is also developing a retina-like processor [171], de-

signed to simulate the functioning of the human retina to produce optimal elec-

trode stimulation at the cortical level. The output of this system is a series of

spike patterns, which could be used to stimulate neurons in the visual cortex.

In a 2003 study of brain plasticity by the CORTIVIS group [66], fMRI was used

to study the differences in reading Braille in normally sighted and congenitally

blind people. Unlike normally sighted participants, activation of the occipital

cortex (which contains the primary visual cortex) was recorded in blind partic-

ipants. The authors note that where cross-modal plasticity has been activated

in this way, the processing of tactile information is associated with significantly

improved tactile reading skill.

Intracortical visual prosthesis

This project is led by Philip R. Troyk, Director of the Laboratory of Neuro-

prosthetic Research in the United States, and involves collaboration with other

institutions and former staff from the NIH Neuroprosthesis Program. Their

approach is to use small implanted arrays (each consisting of eight intracortical electrodes)

in groups which ‘tile’ the visual cortex. In a recent (2003) paper,

Troyk et al. [230] describe an interesting animal research model, using a male

macaque monkey, designed to investigate visual prosthesis functioning with this

tiled design. Before implantation, the animal was presented with a flash of light,

and then trained to continue staring at the flash location (so only the memory

of the flash remains). One hundred and ninety-two tiled electrodes were then

implanted into area V1 of the animal. Only 114 electrodes were functioning af-

ter implantation. The receptive field co-ordinates for each implanted electrode

were estimated, and a phosphene was generated in that location. The macaque

received a reward if its eye position moved within 2° of the known receptive field

for that electrode. The reported preliminary results indicated that this approach

provides a useful model for future AHV research.
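The reward criterion in this animal model amounts to a distance test in visual field co-ordinates. A minimal Python sketch is given below; it assumes that eye position and receptive field centre are both expressed in degrees of visual angle and that the 2° tolerance is a circular window. Neither detail is specified above, and the function name is hypothetical.

```python
import math


def within_reward_window(eye_xy_deg, field_xy_deg, window_deg=2.0):
    """True if the eye position lies within window_deg of the receptive field
    centre (Euclidean distance, in degrees of visual angle)."""
    dx = eye_xy_deg[0] - field_xy_deg[0]
    dy = eye_xy_deg[1] - field_xy_deg[1]
    return math.hypot(dx, dy) <= window_deg


print(within_reward_window((1.0, 1.0), (0.0, 0.0)))  # about 1.4 degrees away
print(within_reward_window((3.0, 0.0), (0.0, 0.0)))  # 3 degrees away
```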

3.5 Retinal Stimulation

The most common non-preventable reason for blindness in the developed world

is age-related macular degeneration. This condition affects the retina at the back

of the eye, while leaving the remaining components of the visual system intact.

Retinal prosthesis research aims to use the remaining visual pathway components

to provide partial restoration of sight. In 1956 an Australian researcher, G.E.

Tassicker, was the first to describe placing a light sensitive selenium plate behind

a blind person’s retina and restoring some intermittent light sensation [225].


There are significant advantages to the retinal approach to AHV. Implantation

of a cortical prosthesis requires intracranial neurosurgery, which may expose

a patient to higher risk. At a fine scale, the mapping of a stimulus to the appro-

priate place on the cortex may be variable between subjects [216]. An alternate

approach is to stimulate the eye rather than the brain. A retinal prosthesis could

assist people who still have a functioning optic nerve. In post-mortem exami-

nations of people without light perception, 80% of the optic nerve was found to

be functioning and approximately 30% of the ganglion cell layer was found to

be functioning [109]. However, there may also be continual remodelling by the

retina which could lead to spatial corruption and cryptic synapse formation after

a retinal implant has been attached [142].

The two types of retinal prosthesis are subretinal and epiretinal, located respectively

behind the retina and on its inner surface, as explained in more detail below.

3.5.1 Subretinal stimulation

As mentioned, the information from approximately 95 million receptors in the

retina is reduced down to 1 million fibres in the optic nerve [156]. This information

reduction takes place in the inner nuclear layer (consisting of amacrine,

bipolar and horizontal cell nuclei) of the retina. Targeting this layer, a subretinal

implant is located behind the photoreceptor layer of the retina and in front of the

pigmented layer called the retinal pigment epithelium. Therefore the subretinal

approach (unlike the epiretinal) may be able to utilise the information reduction

functions in the retina, provided the electric field produced does not interfere

with other retina components (such as the ganglion cell layer).


Optobionics Corporation (United States)

Since the 1980s Alan and Vincent Chow have been investigating subretinal mi-

crophotodiodes for subretinal stimulation [36], and their company, Optobionics,

was awarded the original patent for an artificial subretinal device in 1991 [33].

In an early animal experiment, an implanted strip electrode was inserted be-

hind the photoreceptor layer in a rabbit’s eye. The electrical evoked response of

stimulation to the operated eye was compared to the normal eye by presenting

a flash of light, and then measuring the response from the scalp over the visual

cortex. It was found that a brief electrical spike was generated during stimulation

[35]. This experiment demonstrated the feasibility of converting light into elec-

trical energy using subretinal stimulation to produce a cortical electrical evoked

response [143].

A further animal experiment focused on the long term biocompatibility of subretinal

stimulation [37]. Cats were selected for this study as they have both retinal

and choroidal circulation (unlike rabbits). The implants, approximately 50 µm in

thickness with a diameter of 2 to 2.5 mm, consisted of a doped and ion implanted

silicon substrate, surrounded with a gold electrode layer. After implantation in

the cat’s right eye, the arrays were evaluated over 10 to 27 months. During this

time, a gradually decreased response to light was found, due to the dissolution of

the gold electrode layer. In addition, the silicon substrate blocked choroidal nour-

ishment to the retina, which led to a degeneration of the photoreceptors (which

are highly dependent on blood supply for oxygenation). The loss of photoreceptors

may not be important as they may be damaged anyway; however, design work

commenced on a fenestrated design (one containing holes to improve the flow of

nutrients from the choroid to the retina) [37]. The positive findings from this

study were that the implant maintained a stable position over time and there

was no rejection, inflammation, or degeneration of the retina outside the location


of the implant [166].

By June 2000, Optobionics received approval from the U.S. Food and Drug

Administration (FDA) to commence safety and feasibility trials in 6 patients [34].

The Artificial Silicon Retina (ASR), consisting of 5000 microelectrode-tipped

microphotodiodes in a 2 mm diameter device, was implanted into the right eyes

of 6 legally blind patients with retinitis pigmentosa. During a follow-up period

of 6 to 18 months, all ASRs were found to function electrically and there were

no signs of rejection, inflammation, erosion, retinal detachment or migration of

the device. During this study it was found that all patients experienced im-

provements in visual function (such as improved colour perception), and there

were also unexpected improvements in retinal areas distant from the implant.

These improvements may have been due to neurotrophic effects (meaning that

the improvement may have occurred due to the presence of a foreign body in the

subretinal space, and not as a result of microphotodiode functioning), and further

studies are intended to explore this improvement. Additional research is planned

to examine the implant and age related macular degeneration; and whether the

neurotrophic effect can be effective in earlier stages of retinitis pigmentosa [34].

An issue with the Optobionics research has been the lack of an experimental

control (by implanting an inactive device or conducting sham surgery), to evaluate

against the ASR. Pardue et al. [165] have recently conducted research addressing

this issue. Their experiment involved 15 RCS rats, which have a genetic mutation

resulting in photoreceptor degeneration over approximately 77 days. The rats

received either the ASR device, an inactive device, sham surgery, or no surgery.

The outer retinal function was assessed with weekly electroretinogram (ERG)

recordings. After 4-6 weeks there was a 30-70% higher b-wave amplitude response

with the ASR compared with the inactive device, indicating that the ASR device

appears to produce some temporary improvement in retinal function. However,

after 8 weeks, there was no significant difference in b-wave amplitude response


between the inactive and active devices. At 8 weeks, there was a significantly

greater number of photoreceptors remaining in rats that had received either the

ASR or inactive device compared to those rats that had undergone sham surgery

or no surgery. Pardue et al. [165] suggest that enhanced protective effects from

the ASR may be possible by altering its design to increase current levels or by

increasing environmental light levels to produce higher stimulation levels.

MPD-Array project

After collaborating with the Optobionics group between 1994 and 1995 [38], a

Southern German team led by Eberhart Zrenner at the University Eye Hospital

in Tübingen was formed in 1995 to develop a subretinal prosthesis. In 1996 the

Institute of Micro-Electronics in Stuttgart developed a prototype microphotodi-

ode array (MPDA) containing 7600 microelectrodes on a 3 mm disc, 50 µm in

diameter [258]. In vitro techniques have been predominantly reported by the

German subretinal project.

The first generation of MPDAs were tested using a ‘sandwich technique’,

which involved the retinae from newly hatched chickens being removed and ad-

hered to a recording multielectrode array (the ganglion cell side was adhered).

The photoreceptor outer segments were then damaged, and an MPDA placed

onto the retina. This technique allowed the recording of stimuli from the MPDA

[258]. A later study [259] examined degenerated rat retinae. The retinae were

removed and cut into 5x5 mm segments, then attached to a 60-electrode

microelectrode array. Beams of white light were flashed onto the MPDA and it was

found that intrinsic ganglion cell activity could be recorded even with a highly

degenerated retinal network. Further experiments have shown that it should be

possible to transform the basic features of images, such as points, bars and edges

into activity of the existing retinal network; which suggests that shape percep-

tion and object location may be possible with a subretinal device [213]. However,


recent epiretinal results from Rizzo et al. [184] have not confirmed the pattern

perception of phosphenes from patterned electrical stimulation of the retina.

Further tests have been conducted to assess the biocompatibility and stability

of the MPDA. Various materials were placed in Petri dishes with the retinae

of pigmented rats. For comparison, a control dish contained only the retinae and

solution. None of the MPDA materials showed a toxic effect. Retinal cell cultures

from rats were also used by Guenther et al. to screen for technical implant ma-

terial [85]. Although most materials (including iridium and silica) showed good

biocompatibility, a reduced biocompatibility was found for titanium materials.

Interestingly, a later paper by Hämmerle et al. [91] found that titanium nitride

had excellent biostability, both in vivo and in vitro.

In a similar method to the Optobionics research, electroretinography was per-

formed in rabbits and rats to measure the effectiveness of the MPDA. Because

the MPDA are sensitive to infrared light, it is possible to stimulate the retina and

measure the discharged current. This method should be useful for localising

electrical responses from an MPDA.

As with the early Optobionics MPDA [35], Zrenner et al. found in their early

work that metabolic processes in the photoreceptor layer can be disrupted by the

MPDA, and they placed very thin holes in their device to allow nutrients to be

passed [258].

As natural photoreceptors in the retina are far more efficient than photodiodes,

visible light is not powerful enough to stimulate the MPDA. Therefore infrared

enhancement of the photodiode arrays (by inserting an additional layer in the

array) has been suggested to enhance the stimulation current [195].

The German team commenced in vivo experiments in 2000, when evoked

cortical potentials were measured from Yucatan micropigs and rabbits. The

micropigs have eyes which are comparable in size and function to human eyes

[257]. Fourteen months after implantation, the implant and retina surrounding


it were examined, and there were no noticeable changes to anatomical integrity

[71]. However, because the existing MPDA does not function in ambient light

conditions, an electrode foil prototype with similar properties was implanted.

The micropigs required a higher threshold level than the rabbits [196]; however,

the implants were successful in producing evoked cortical potentials in half of

the animals tested. The thresholds identified in this study were similar to those

required in epiretinal stimulation [196].

The latest reports from this group concern the results of in vivo experiments

on cats. Volker et al. described the use of optical coherence tomography to

examine the morphological and circulatory conditions of the cat neuroretina and

its interface with an implanted MPDA [237].

3.5.2 Other subretinal methods

A team of Japanese researchers led by Tohru Yagi of Nagoya University has

been investigating the attachment of cultured neurons onto electrodes, and then

guiding the axons towards the central nervous system. As this ‘hybrid retinal

implant’ will not require retinal ganglion cells or an optic nerve, it could be useful

for patients with diseases in these components of the visual pathway. Results of

an experiment culturing neural cells obtained from the spinal cords of a 3-4 week

old rat are described by Ito et al. [112], who found that it was difficult to guide

neurons to grow in a particular direction. Another study by this team investigated

electrical stimulation requirements by stimulating the lateral geniculate nucleus in

a cat. Recordings of the evoked potentials from the cat’s cortex found that pulse

amplitude was a more important factor than pulse duration, and that a biphasic

pulse pattern was the most effective stimulation pattern [120]. Further studies

have suggested using a computer model for the 3-D configuration of electrode

arrays [119].


Peterman et al. are also investigating the use of directed cell growth and lo-

calised neurotransmitter release for a retinal interface. They have been successful

in directing the growth of neurons in a defined direction, using micropatterned

substrates [173] and have demonstrated that the localised chemical stimulation

of excitable cells is feasible. The authors suggest that chemical stimulation can

have a similar spatial resolution as an electrical stimulation, but with the ability

to mimic the major functions of synaptic transmission [174].

An interesting design for an MPDA has been recently reported by Ziegler et al.

(2003), who propose a device where each pixel acts as an independent oscillator

whose frequency is controlled by light intensity [256].
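The idea of a pixel acting as an oscillator whose frequency follows light intensity can be sketched in Python as follows. The linear mapping, the 10 to 200 Hz range, the 8-bit intensity scale and the function names are all assumptions made for illustration; they are not taken from Ziegler et al. [256].

```python
def pixel_frequency(intensity, f_min_hz=10.0, f_max_hz=200.0, max_value=255):
    """Assumed linear mapping from light intensity to oscillation frequency."""
    if not 0 <= intensity <= max_value:
        raise ValueError("intensity out of range")
    return f_min_hz + (f_max_hz - f_min_hz) * (intensity / max_value)


def pulse_times(intensity, duration_s, **mapping):
    """Times (in seconds) at which one pixel emits a pulse within duration_s."""
    f = pixel_frequency(intensity, **mapping)
    if f <= 0:
        return []
    times = []
    k = 0
    # One pulse per oscillation period, starting at time zero
    while k / f < duration_s:
        times.append(k / f)
        k += 1
    return times


print(len(pulse_times(0, 0.5)))    # dark pixel: few pulses
print(len(pulse_times(255, 0.5)))  # bright pixel: many pulses
```

Because each pixel runs independently, brightness is carried by pulse rate and no frame-by-frame scan of the array is required in this sketch.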

Kanda (2004) has suggested an alternative stimulation method for a reti-

nal device called Suprachoroidal-Transretinal Stimulation (STS), which does not

involve the attachment of electrodes to the retina and may result in less compli-

cated surgery for blind patients. In this method the anodic stimulating electrode

is located on the choroidal membrane, and the cathode is located in the vitreous

body. This technique has been used in animal experiments where evoked poten-

tials were recorded from the superior colliculus in rats. The authors are planning

long term, in vivo, biocompatibility studies [118]. However, it has been

demonstrated that neural cells should not be separated from electrodes by more than a

few micrometres (due to overheating, cross-talk between neighbouring pixels, and

electrochemical erosion) [164]. The thickness of the choroid is approximately 400

µm; therefore suprachoroidal placement precludes close proximity between

electrodes and cells, which will limit the potential visual acuity of the STS approach.

3.5.3 Epiretinal stimulation

An epiretinal device involves a neurostimulator chip being implanted against the

ganglion cells in the retina. This location is different from subretinal implants, in


that it bypasses the information reduction components of the retina. The advan-

tage of the epiretinal approach, however, is that the remaining retinal neurons can

be stimulated in patients who are blind from end-stage photoreceptor diseases.

Retinal Implant

Formerly from the Wilmer Ophthalmological Institute, Johns Hopkins Hospital,

Mark Humayun and Eugene De Juan Jr. are currently based at the Doheny

Retina Institute at the University of Southern California. Humayun’s 1992 PhD

thesis demonstrated that a visually impaired person could perceive phosphenes

during stimulation of the retina [106]. The engineering aspects of developing

electronic stimulators and supporting electronics have been mainly conducted by

Wentai Liu and his team at North Carolina State University [130].

In the first experiment to demonstrate successful phosphene perception from

local electrical stimulation of the retina [110], 14 patients (12 with retinitis pig-

mentosa, and two with age-related macular degeneration) had their inner retinal

surface electrically stimulated under local anaesthesia. The responses were

retinotopically correct (i.e. the perceived phosphene location matched the location of

stimulation) in 13 of the patients, with the remaining patient (who was blind from

birth) unable to distinguish anything apart from flashing light. The phosphenes

were perceived exactly with the timing of the electrical stimulation [110]. Flicker

fusion was tested in two subjects and found to occur at approximately 50 Hz (the

phosphenes also appeared brighter at higher frequency) [107]. An earlier 1996

paper also reported on five of these patients [108].

In 1999, a further experiment was reported [109] on nine subjects, involving

arrays of nine or 25 electrodes. The electrodes were placed against the

retinal surface and were hand-held in place using a silicon-coated cable with the

guidance of a surgical microscope. The flicker fusion frequency was found to be

50 Hz in two subjects and 40 Hz in another two subjects (the remaining subjects


were not tested). By scanning with the head-mounted camera, subjects were able

to perceive simple shapes in response to stimulation (eg. horizontal and vertical

lines and ‘U’ and ‘H’ shapes).

A report on the long term biocompatibility of an implanted, inactive epireti-

nal device was also published in 1999 [140]. Twenty-five platinum disc-shaped

electrodes in a silicon matrix were implanted into the retinal surface of four nor-

mally sighted dogs. The arrays were held in place using metal alloy tacks. Over

a six-month period the implants were biologically well tolerated, mechanically

stable, and could be securely attached to the retinal surface.

A design for a functioning retinal prosthesis system has been described in joint

papers by Liu et al. at North Carolina State University and the Johns Hopkins

team in 1999 [129], [128]. The proposed device, called the Multiple Unit Artificial

Retina Chipset (MARC), consists of the extraocular unit containing the video

camera and video processing board, connected by a telemetric inductive link to

the intraocular unit. The power and signal transceiver, stimulation driver and

electrode array are contained in the intraocular unit.

In 2003, after obtaining FDA approval, the Doheny Eye Institute team and

Second Sight, a company formed by former North Carolina State University team

member, Robert Greenberg and Alfred Mann, developed the first human epireti-

nal implant. A subject with advanced retinitis pigmentosa received an implanted

4x4-electrode array, connected by a subcutaneous cable to an extraocular unit

which was surgically attached to the temporal area of the skull. A wireless link

transferred data and power from a belt-worn visual-processing unit to the

extraocular unit. All 16 electrodes produced phosphenes, and the subject was able

to detect ambient light, motion and correctly recognise the location of phosphenes

(eg. left vs right, or ‘upside down’). Future plans are to develop more complex

stimulation control and provide a higher number of electrodes [111]. The use of

microwire glass is also being investigated as a method to assist with the mapping


of flat microelectric stimulator chips and curved neuronal tissue [116].

Retinal Prosthesis Project

Following earlier collaborative work with Humayun and de Juan, Wentai Liu and

his team have continued with the development of an epiretinal prosthesis. A

60-electrode stimulating chip, which integrates power transfer and back telemetry,

has been developed [131]. One of the advantages of this system would be removing

the requirement for the cable connecting the intraocular and extraocular units

described in the Doheny Eye Institute team implant [111].

Boston Retinal Implant Project

This project is a collaboration between Joseph Rizzo (Massachusetts Eye and Ear

Infirmary-Harvard Medical School) and John Wyatt (Massachusetts Institute of

Technology) to develop an epiretinal prosthesis. The main difference between

their approach and Humayun et al. is the use of a miniature laser, located in

a pair of glasses, to transfer power and data to a stimulator chip. Although

the laser is required to be accurately directed to the implant, and needs to cope

with blinking, it will not be affected by electronic noise interference (unlike radio

frequency transmission) [182]. Electrically invoked cortical potentials have been

successfully recorded from stimulation of a rabbit retina with this method [181].

Recently the Boston retinal implant project microelectrode arrays have been

tested with six patients, five of them legally blind from retinitis pigmentosa. The

sixth patient was normally sighted, however their eye required removal due to

orbital cancer. All patients were able to perceive phosphenes in response to stim-

ulation, however the results were mixed. Threshold charge densities were found

to be significantly higher, and above safe levels, in blind patients compared to

the normally sighted patient [184]. In this study, it was often found (for example,

60% of tests in one subject) that multiple phosphenes would be presented when


a single electrode was stimulated. In addition, multiple-electrode stimulation did

not reliably produce matching phosphenes [185].

EPI-RET

Rolf Eckmiller, from the University of Bonn, leads the German EPI-RET project,

which involves 14 research groups. The aim of their first epiretinal device is to

allow blind people to identify the location and shape of large objects [59]. Their

approach involves replicating a healthy retina with a ‘retinal encoder’ device,

which consists of a photosensor array of 10,000-100,000 pixel inputs and simu-

lated output of 100-1,000 ‘ganglion cells’. Eventually this project aims to embed

this encoder into a contact lens. The output from the encoder is then sent to

an implanted retinal stimulator. Eckmiller et al. suggest that a future

epiretinal prosthesis will be tuned (to optimise phosphene perception) during a dialog

between a subject and their retinal encoder [60], [12], [13], [11]. More recently,

a ‘Learning Active Vision Encoder’ (LAVIE) has been proposed to compensate

for spontaneous eye movements (drift or nystagmus) and head movements in the

absence of vision. A smooth pursuit function is also being investigated [61].

Flat platinum microelectrodes have been developed for the EPI-RET project

and evoked cortical potentials have been recorded after stimulation in rabbits

[238]. In 2000, Hesse et al. reported problems with the fixation of the electrode

film and the retina in a cat experiment, partly due to the very thin posterior

sclera [97]. Research into alternate electrode shape and fixation techniques was

planned.

The company Intelligent Implants was formed in 1998 to commercialise re-

search by the EPI-RET group [61].


University of NSW and University of Newcastle Vision Prosthesis

Project

Australian research on an epiretinal prosthetic vision system is occurring at the

Vision Prosthesis Project at the Universities of NSW and Newcastle, led by Gregg

Suaning and Nigel Lovell. This project aims to extend concepts from the devel-

opment of cochlear prostheses.

A 100-channel neurostimulator circuit for the retina has been developed, which

uses bidirectional radio-frequency telemetry for transferring data and power [216],

[217]. A data format protocol has been introduced. The 100-channel neurostim-

ulator was found to function and successfully produce evoked potentials in sheep

[218], [219], [89]. An inexpensive technique for manufacturing platinum spher-

ical electrodes has also been proposed [220]. Recently, an hexagonal mosaic of

intraocular electrodes has been suggested by Hallum et al. [88] to optimise the

placement of electrodes and therefore improve visual acuity in prosthesis patients.

A proposed prototype for an epiretinal system, capable of 840 stimulating events

per second, using this electrode placement combined with a filtering approach to

image processing, has also been described [215].

3.6 Optic Nerve devices

The optic nerve is a collection of one million individual fibres running from the

retina to the lateral geniculate body in the centre of the brain. This nerve can be

reached surgically and could provide a suitable location for implanting a stimu-

lation electrode array.


Microsystems Based Visual Prosthesis (MiVip) and OPTIVIP projects

(ESPRIT programme of the European Union)

The MiVip team, led by Claude Veraart of the Neural Rehabilitation Engineering

Laboratory, Université Catholique de Louvain in Belgium, has developed a

prosthesis system which includes a spiral cuff silicon electrode to stimulate the optic

nerve.

In February 1998 a 59-year-old blind patient was implanted with the op-

tic nerve visual prosthesis. Localised phosphenes were successfully produced

throughout the visual field, and changing pulse duration or amplitude could alter

their brightness. After training it was reported that the patient could perceive

different shapes, line orientations and even letters [236]. However, this system

only displays one phosphene at a time and pattern recognition was achieved by

the subject scanning with a head-mounted camera over a time period of up to 3

minutes. An interesting feature of this study has been the different phosphene

shapes that have been generated: if these could be reliably replicated they might

add a useful dimension to prosthetic vision. The cuff electrode consists of four

platinum contacts and is able to adapt continuously to the diameter of the optic

nerve. Initially a subcutaneous connector conducted stimulation of the electrode;

however in August 2000 a neurostimulator and antenna were implanted and con-

nected to the electrode. An external controller with telemetry was then used for

stimulating the cuff electrode. Recently, an adaptive neural network technique

has been proposed to classify the phosphenes generated by this device [5], [4].

3.7 AHV simulation studies

Due to the difficulty in obtaining experimental participants with an AHV device

implanted, a number of simulation studies have been conducted with normally


sighted subjects. The simulation approach assumes that normally sighted people

are receiving the same experience as a blind recipient of an AHV system. How-

ever, criticism of this approach has been raised by Weiland and Humayun (2003)

who have stated that human implant studies are the only way of verifying the

effectiveness of a visual prosthesis and have questioned the validity of simula-

tion studies [242]. This criticism has been addressed by Dagnelie (2006) who has

defended the use of AHV simulation studies as they can help identify require-

ments and find solutions for vision tasks; provide examples of prosthetic vision to

clinicians and the public; and also assist in designing rehabilitation programs for

future AHV system recipients [46]. In addition, simulation studies may reduce

the number of animals sacrificed in AHV studies.

As discussed in Section 2.8.3, an often cited prosthetic vision simulation was

conducted in 1992 at the University of Utah by Cha et al. [29], in order to cal-

culate the minimum number of phosphenes required for adequate mobility. The

main findings from this research were that a 25x25 array of phosphenes with a

field of view of 30° would be required for a successful device. However, the

simulation display in Cha et al. used a simple television-like display. Hayes et al. have

described a more sophisticated approach [93]: in their study, two different image

processing applications were used to display simulated phosphenes to a seated

subject, who wore a Head Mounted Display (HMD). The first image processing

application used a simple square phosphene array, where each phosphene con-

sisted of a solid grey scale value equal to the mean luminance of the contributing

image pixels. The second image processing application used a Gaussian filter.

Array size, contrast level, drop-out percentage, simulated phosphene size, and

background noise were adjustable features of the simulation. Object recognition

(plate, cup, spoon, etc), reading, candy pouring and cutting accuracy tasks were

conducted under different simulation conditions. The main result was to conclude

that the phosphene array size would be the most important factor in a usable


prosthesis.

Another image processing approach investigated the requirements for AHV

facial recognition [226]. A Low Vision Enhancement System (LVES) connected

to a PC and driven by a Visual Basic program was used to display the images.

Subjects were required to select which simulation image best matched a set of

four normal images of human faces (the images of the same person were varied by

head angle and whether the person was smiling or serious). All images displayed

occupied a visual field of 13° horizontally and 17° vertically. The simulation

display was presented in a circular ‘dot mask’, rather than the contiguous square

blocks. Electrode properties (such as drop outs; size and gaps), contrast and grey

levels could be varied experimentally. The grid sizes used in this study varied

from 10x10 to 32x32 phosphenes. The authors found high accuracy for all high

contrast tests (except those with significant drop out and two gray levels) and

suggest that reliable face recognition using a crude pixelized grid is feasible.

Research at the Queensland University of Technology (QUT), Australia by

Boyle et al. [19], has examined the use of various image processing techniques

(such as enhancing edges, using different grey scales and extracting the most im-

portant image features) to identify a recognition threshold for low quality station-

ary images. These images are used to represent the limited number of phosphenes

available to the subject (typically a 25x25 array). This research has found that at

these low information levels the use of image processing techniques is not helpful

in the identification of static scenes, although an automatic zoom feature did help

image understanding.

3.8 Evaluation of current AHV systems

With the current understanding of neuronal mechanisms in the visual system,

AHV systems do not appear likely to replace the functioning of normal human


vision for some time. It is also currently difficult for microelectrodes to pro-

vide a regularly organised array of phosphenes [230]. It should be noted that

as the development of AHV systems continues, research into retinal transplan-

tation, growth factors and gene therapy has commenced which may also provide

alternative treatment options for blindness.

In the long term AHV systems are likely to offer many benefits, including mo-

bility, face recognition and reading which will have a profoundly positive effect

on the blind recipient. AHV research also offers important insight on the func-

tioning of the human visual system, and in brain-computer interface technology.

However, in the immediate future it will be important to consider whether the

benefits from the use of these systems outweigh the cost. Despite the overloading

of another sensory input channel, traditional mobility aids and ETA devices (such

as the vOICe system from Peter Meijer [150]), are currently cheaper, less invasive

and may require a similar amount of training to AHV systems. Additionally,

many people who are classified as blind are elderly, and still have some remaining

vision, and therefore may not be suited to an AHV system.

The need for standard psychophysical assessment methods has been noted

by a number of AHV researchers (for example, [215], [229] and [132]). To inform

consumers on the benefits of an AHV system compared to other technical aids

for the blind, future research comparing the effectiveness of these devices would

be useful. The lack of a method to compare mobility was also raised by Dobelle

in 2000 [48]. However, as discussed in the previous chapter, there are a number

of mobility assessment methods presented in the Orientation and Mobility Lit-

erature which could be useful for comparison of AHV systems and other devices

(recent examples are [137], [95], [74]).

A number of additional AHV review papers (which cover the same literature

discussed in this chapter) have been published. These reviews include: [46], [143],

[147], [240] and [235] (in German). A list of AHV project web sites is provided


in Appendix 1 of this thesis.

3.9 Chapter Summary

The subretinal implants developed by the Optobionics Corporation show the

greatest promise in restoring some vision; however there are doubts over whether

the improvements in vision are due to neurotrophic effect or the device itself.

Further tests to determine the reason for the improvements are planned. If the

device is responsible, it is conceivable that their implants may be available in the

next few years.

The cortical implant system from the Dobelle Institute is commercially

available; however it has not been approved by the U.S. Food and Drug

Administration. It is difficult to obtain outcome information from the Dobelle system;

however one recent article in the Wall Street Journal [152] reported a 33-year-old

female recipient was able to use it for only 15 minutes per day (because it was

tiring and caused head pain).

The remaining cortical and optic nerve systems are still in varying stages of

preliminary human or animal testing. Preliminary research has also commenced

on microstimulation of the lateral geniculate nucleus [175]. Although progress

is being made, it does not appear likely that a commercial system using these

methods will be available within the next five years.

Finally, psychophysical and mobility assessment standards would help in com-

paring AHV systems with other technical aids for the blind.

Chapter 4

A Framework for Blind Mobility

Improvement via Computer

Vision

4.1 Introduction

The previous two chapters have reviewed mobility problems and assessment for

the blind, and current AHV system technology. This chapter examines how infor-

mation can be effectively presented to a blind person via the perceived phosphenes

from an AHV system. The main constraint on the amount of information which

can be provided using an AHV system is the limited number of electrodes which

can be stimulated, which limits the display spatial resolution. As a result, meth-

ods are required in an AHV system to reduce the resolution of images captured

from a video source. An example of this reduction in spatial resolution is shown in

Figure 4.1. Figure 4.1a shows a typical mobility hazard in the form of a telephone

booth in Latrobe St., East Brisbane, Australia. A reduced 25x25 resolution

image and its 625-phosphene representation are shown in Figures 4.1b and 4.1c.


A symbolic phosphene display, which would highlight the telephone booth (at

appropriate proximity) is displayed in Figure 4.1d. There are a large number of

computer vision methods which could be discussed in this Chapter, however the

research reviewed has been limited to those methods which are computationally

efficient and stable (as mentioned in the research scope section of Chapter 1).
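
The reduction in spatial resolution described above can be illustrated with a minimal sketch. The function below averages rectangular blocks of pixels, which is one plausible way to obtain a 25x25 image from a larger frame; the block-mean rule, the function name `reduce_resolution` and the list-of-rows image representation are assumptions made for illustration and are not taken from the experimental software described in this thesis.

```python
def reduce_resolution(image, out_rows, out_cols):
    """Reduce a grey-scale image (a list of rows) by averaging pixel blocks.

    Illustrative sketch only: each output value is the mean of the source
    pixels that fall inside the corresponding block of the input image.
    """
    in_rows, in_cols = len(image), len(image[0])
    reduced = []
    for r in range(out_rows):
        # Range of source rows covered by this output row (at least one row).
        r0 = r * in_rows // out_rows
        r1 = max(r0 + 1, (r + 1) * in_rows // out_rows)
        row = []
        for c in range(out_cols):
            # Range of source columns covered by this output column.
            c0 = c * in_cols // out_cols
            c1 = max(c0 + 1, (c + 1) * in_cols // out_cols)
            block = [image[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        reduced.append(row)
    return reduced


# Example: a synthetic 160x120 frame reduced to a 25x25 array.
frame = [[(x + y) % 256 for x in range(160)] for y in range(120)]
phosphene_values = reduce_resolution(frame, 25, 25)
```

Block averaging is used here because it is simple and cheap to compute; other reduction rules (for example, sampling a single pixel per block) would fit the same interface.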

Computer vision, also known as image analysis or machine vision, usually in-

volves analysing an image or sequence of images and providing information about

the image contents (for example, by recognising an object within an image). A

closely related, and overlapping, field, image processing, involves the enhancement,

compression, reconstruction and restoration of images (for example, by highlight-

ing edges). The output from image processing is an image, whereas the output

from computer vision is usually information (which can often be used to

construct new images) [205]. Both of these fields are important in enhancing the

effectiveness of an AHV system display. As computer vision methods often con-

sist of image processing methods (such as noise reduction), in this thesis image

processing is considered a subset of computer vision.

This chapter contains four related sections. The first section is an overview

of current computer vision (and image processing) methods which are useful

for blind mobility enhancement. All of the methods presented have either been

applied in experiments described in this thesis, or in the second section in this

chapter which provides a review of computer vision applications developed to

assist the blind and visually impaired. The third section briefly discusses the

links between computer vision methods and functionality provided by the Human

Vision System (HVS), which was discussed in the previous chapter. In the fourth

and final section, the literature reviews from this and the previous two chapters

are integrated to develop a novel conceptual framework for AHV research. This

framework is used to guide the remaining chapters in this thesis.


Figure 4.1: Example of reduced visual information in an AHV system. Image (a) shows a street scene in suburban Brisbane; in image (b) the resolution of this image has been reduced to 25x25 pixels. Image (c) shows a simulated 25x25 phosphene display of the same image. A sample symbolic representation of the mobility hazards contained in the street scene is shown in image (d).


4.1.1 An information processing approach to computer

vision

The work of J.J. Gibson was discussed in Chapter 2, and has been influential

in the fields of perception (particularly visual), O&M and computer vision

(particularly the concept of optic flow). Although Gibson’s work helped identify

what information is required for a person or animal to function in an environ-

ment, it did not provide details on how the central nervous system performs

these functions. In the late 1970s the approach of David Marr provided a modern

framework which acts as a bridge between brain neurophysiology, visual percep-

tion and computer vision. Marr’s main contributions to computer vision were

on edge detection, stereopsis (using the combined information from two slightly

different images to calculate depth), and object representation in the brain. Marr

proposed that three levels of understanding are required for a system to carry out

an information processing task [144]:

1. Computational theory This level addresses the question: what is the goal of

this task and what is the logic required to carry it out? For example, how

can an object be identified from an image? Research at this level is similar

for biological and computer vision [141].

2. Representation and algorithm The main question addressed at this level is

how is the computational theory actually implemented? For example, how

are images recorded and processed (for example, by neuronal operations)?

3. Hardware implementation For a biological system, this level is mainly con-

cerned with anatomy and optics of the eye. A computer vision implemen-

tation would involve computer hardware.


4.2 Computer Vision

The goal of computer vision has been defined as achieving machine behavior

which is similar to biological systems [206]. Computer vision is a complex field

with an enormous amount of literature. In this chapter an attempt will be made

to summarise only the most relevant material. This chapter also includes refer-

ences to relevant visual perception research (such as attention, motion and depth

perception).

Computer vision involves the processing of images captured from an image

source (such as a digital camera). A digital image can be considered a two-

dimensional (2D) array of numbers (or pixels). These numbers can represent light

intensities, distances or other physical quantities [232]. The spatial resolution of

an image refers to the size of the image array (for example, 160x120 pixels).

Image sequences (video) can be used to record temporal-spatial information. A

computer vision system typically has three hardware components: a camera,

which is usually a Charge-Coupled Device (CCD) or CMOS sensor, a frame grabber

to convert the camera signal to a rectangular array of NxM integer values, and a

host computer. Colour images require red, green and blue (RGB) values for each

pixel, which are often represented by a 24 bit value, where 8 bits (containing 256

values) are allocated for each colour. The pixels in a grey-scale image are usually

represented by an 8-bit number (256 values).

There are two main complementary methods for the processing of captured

images in an AHV system: information reduction and scene understanding.

4.2.1 Information reduction

Most existing AHV system efforts are aimed at the information reduction level,

which is concerned with the reduction or collapse of visual information. Infor-

mation reduction, which overlaps with some aspects of image processing, is also


referred to as low-level information processing, and usually assumes very little

knowledge about the content of images. Operations on images at this level are

designed to improve image saliency, or to emphasise features of particular impor-

tance or relevance, for example curbs or walls.

Methods at this level which are useful for blind mobility improvement com-

monly involve image compression, image filtering, edge detection and image sharp-

ening to identify objects within the image.

Image filtering

Image convolution is a fundamental operation in image analysis. It involves

setting the value of a pixel using a transformation function based on the values

of neighbouring pixels. A mask called a kernel (usually a square array) is used

and its values are often referred to as ‘weights’. An example 3 x 3 kernel is shown

in Figure 4.2. This kernel is then moved across each pixel position in the image;

at each position, the pixel values under the kernel are multiplied by the corresponding

kernel entries and the products are summed to give the output value for that pixel.
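
As a minimal sketch of this operation, the function below slides a square kernel over a grey-scale image and, at each position, sums the products of kernel weights and the pixels beneath them. The function name `convolve`, the list-of-rows image representation and the border rule (reusing the nearest edge pixel) are assumptions made for illustration. The kernel is applied without flipping, which makes no difference for symmetric kernels such as the one in Figure 4.2.

```python
def convolve(image, kernel):
    """Apply a square, odd-sized kernel to a grey-scale image (list of rows)."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    output = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    # Clamp coordinates so border pixels reuse the nearest edge value
                    # (an assumed border rule; other choices are possible).
                    sy = min(max(y + ky - half, 0), rows - 1)
                    sx = min(max(x + kx - half, 0), cols - 1)
                    total += weight * image[sy][sx]
            output[y][x] = total
    return output


# The 3x3 sharpening kernel from Figure 4.2.
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

# A single bright pixel on a dark background is amplified at its centre
# and produces negative values at its four direct neighbours.
spot = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
sharpened = convolve(spot, SHARPEN)
```

Because the weights of this kernel sum to one, regions of constant brightness are left unchanged, while local differences are exaggerated.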

Image filtering is a method of transforming image intensities to reduce noise,

or emphasise certain features. Convolution can be useful for image smoothing,

for example by using a kernel with the same weighting for all values (called a

uniform filter). A non-linear alternative, which is not itself a convolution, is to take the

median of the grey values surrounding each pixel (a 3x3 median filter). The Gaussian filter is another widely

used low pass filter and uses a kernel where central pixels have a higher weighting.

A 3x3 Gaussian filter kernel is shown in Figure 4.3. A high pass filter can be used

to sharpen images, such as the Laplacian kernel shown in Figure 4.2. The effects

of low and high pass filtering are shown in Figure 4.4.
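
The two smoothing approaches can be sketched briefly. The first function generates a normalised Gaussian kernel; with a 3x3 size and a standard deviation of 1.0 (an assumed value, chosen because it reproduces the weights listed in Figure 4.3 to four decimal places) the centre weight is approximately 0.2042. The second function is a simple 3x3 median filter. Both function names and the border handling are illustrative assumptions.

```python
import math
from statistics import median


def gaussian_kernel(size=3, sigma=1.0):
    """Return a size x size Gaussian kernel whose weights sum to one."""
    half = size // 2
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in range(-half, half + 1)]
               for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    rows, cols = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(rows):
        for x in range(cols):
            # Border pixels reuse the nearest edge value (assumed border rule).
            window = [image[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            output[y][x] = median(window)
    return output


kernel = gaussian_kernel(3, 1.0)

# An isolated noise pixel is removed entirely by the median filter.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter_3x3(noisy)
```

The median filter discards isolated outliers rather than spreading them over neighbouring pixels, which is why it is often preferred for impulse noise.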

The computer vision methods discussed to this point have dealt with images

in the spatial domain (that is based on a Cartesian grid composed of pixels). The

Fourier transform is a commonly used tool to separate the specific frequency


0 −1 0

−1 5 −1

0 −1 0

Figure 4.2: The 3x3 Laplacian kernel for high pass filtering (for image sharpening).

0.0751 0.1238 0.0751

0.1238 0.2042 0.1238

0.0751 0.1238 0.0751

Figure 4.3: An example 3x3 Gaussian low pass filter kernel for image smoothing. Note the centre element has the greatest weight (0.2042) compared to the others.

ranges in images. Filtering in the frequency domain can be conducted by re-

stricting an output image to certain frequencies (for example, a low-pass filter

would block high frequency content). The fast Fourier transform (FFT) provides

an efficient implementation of the Fourier transform for image processing. For large

kernels it is generally computationally more efficient to filter images in the frequency domain

than to perform convolution in the spatial domain. The regularity in the

arrangement of objects can be identified more easily in the frequency domain. For

example, leaves on a tree typically show a random spatial arrangement, whereas

bricks in a wall would produce highly structured patterns [198]. There are many

alternate processing approaches which can be used, including the Gabor filter

(which has been found to model the behaviour of receptive fields in the visual

cortex of monkeys; for example, [178]), the Haar transform and the Laplace

transform.
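
Frequency-domain low-pass filtering can be sketched as follows: transform the image, set the high-frequency coefficients to zero, and transform back. For brevity this sketch uses a direct discrete Fourier transform rather than the FFT, so it is only practical for very small images; the function names and the square cut-off rule are assumptions made for illustration.

```python
import cmath


def dft_1d(values, inverse=False):
    """Direct (slow) discrete Fourier transform of a sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        result.append(total / n if inverse else total)
    return result


def dft_2d(image, inverse=False):
    """Two-dimensional transform: rows first, then columns."""
    rows = [dft_1d(row, inverse) for row in image]
    cols = [dft_1d(col, inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def low_pass(image, cutoff):
    """Zero every coefficient whose frequency index exceeds cutoff."""
    rows, cols = len(image), len(image[0])
    spectrum = dft_2d(image)
    for u in range(rows):
        for v in range(cols):
            # Indices above n/2 represent negative frequencies.
            fu = min(u, rows - u)
            fv = min(v, cols - v)
            if fu > cutoff or fv > cutoff:
                spectrum[u][v] = 0
    restored = dft_2d(spectrum, inverse=True)
    return [[value.real for value in row] for row in restored]


# A checkerboard contains only the mean level and the highest frequency,
# so removing all non-zero frequencies leaves a uniform image.
board = [[2 * ((x + y) % 2) for x in range(4)] for y in range(4)]
smooth = low_pass(board, 0)
```

A practical implementation would replace `dft_1d` with an FFT routine; the filtering step itself would be unchanged.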

Global image processing

Global image analysis considers the entire image. For example, histogram

equalisation can be used to provide an image with a uniform distribution of grey scale


Figure 4.4: An example of high and low pass filtering on an image. A grey scale post box image is shown in image (a). Image (b) shows the image after it has been filtered using the 3x3 Laplacian high pass filter (detailed in Figure 4.2). Image (c) shows the result of applying the Gaussian low pass filter from Figure 4.3. This image has been taken from an image sequence captured using a low quality PDA card camera (this sequence and camera are described in more detail in Chapter 6). As the camera was moving at the time of capture there is a significant amount of motion blur. It is anticipated that image quality from cameras used for AHV systems will improve as technology advances.


Figure 4.5: An example of contrast enhancement by histogram expansion. The base image (a) shows a Brisbane suburban bus shelter. (b) shows the distribution of the 256 grey-scale values in image (a). The contrast in image (c) has been enhanced using histogram equalisation. The histogram of image (c) is shown in (d).


values. This can be achieved by obtaining a histogram of the image; then obtaining

the cumulative distribution of grey levels; and finally, replacing the original

grey level intensities with those from the cumulative distribution. Grey levels can

also be set to black, white, or selected grey levels to emphasise an area of

a certain brightness (this is called thresholding).
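
The three steps of histogram equalisation, and simple thresholding, can be sketched as below. The scaling of the cumulative distribution to the range 0 to 255, the function names, and the two-level form of thresholding are assumptions made for illustration.

```python
def equalise(image, levels=256):
    """Histogram equalisation of a grey-scale image (list of rows of ints)."""
    pixels = [p for row in image for p in row]

    # Step 1: histogram of grey levels.
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Step 2: cumulative distribution of grey levels.
    cumulative, running = [], 0
    for count in histogram:
        running += count
        cumulative.append(running)

    # Step 3: replace each grey level with its scaled cumulative value.
    total = len(pixels)
    lookup = [round((levels - 1) * c / total) for c in cumulative]
    return [[lookup[p] for p in row] for row in image]


def threshold(image, cutoff, low=0, high=255):
    """Set pixels at or above cutoff to high and all others to low."""
    return [[high if p >= cutoff else low for p in row] for row in image]


# Five dark, closely spaced grey levels are spread over the full range.
dull = [[10, 20, 30, 40, 50]]
stretched = equalise(dull)
binary = threshold([[10, 200]], 128)
```

In this small example the five input levels map to 51, 102, 153, 204 and 255, which shows how equalisation spreads a narrow band of grey values across the available range.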

Colour transformation is also a global operation. Images captured from a

digital camera generally use the RGB colour format, whereas most computer

vision operations are performed using grey-scale values. To convert from RGB

format to grey-scale, the value of each pixel (Y) can be set using formula 4.1

(from [191]). This formula has been used to convert captured images in the

experiments described in Chapters 5 to 8 of this thesis.

Y = 0.299R + 0.587G + 0.114B (4.1)
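
Formula 4.1 can be applied directly to each pixel, as in the sketch below. The helper `unpack_rgb` assumes that a 24-bit colour value is stored with red in the most significant byte (0xRRGGBB); this byte order, the rounding to the nearest integer and both function names are illustrative assumptions, since camera and frame grabber formats vary.

```python
def unpack_rgb(value):
    """Split a 24-bit colour value (assumed 0xRRGGBB) into 8-bit components."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def rgb_to_grey(r, g, b):
    """Grey-scale value from Formula 4.1, rounded to the nearest integer."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# Pure green contributes far more to perceived brightness than pure blue.
red, green, blue = unpack_rgb(0x00FF00)
grey = rgb_to_grey(red, green, blue)
```

The three weights sum to one, so a white pixel (255, 255, 255) maps to a grey value of 255 and a black pixel maps to 0.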

Edge detection

Edge detection is a frequently used technique for information reduction in an

image. The output from edge detection is useful for further processing (line

detection and image segmentation). Cortical prosthesis research by the Dobelle

Institute has found that edge detection and image reversal enhance the ability

of subjects to recognise important scene components (such as doorways) [48]. In

this section, two widely used edge detection methods are briefly described. A

comparison is made between the Sobel and Canny methods for reduced spatial

resolution in Chapter 5. A comparison of the Sobel, Roberts and Canny edge

detection methods is shown in Figure 4.9.

The simplest edge detection methods involve checking for local spatial vari-

ations in pixel values. This variation can be obtained by calculating a discrete

approximation of the directional difference (ie. the gradient) between adjacent

pixels. A significant brightness change in a small spatial area will indicate an


−1 −2 −1

0 0 0

1 2 1

Figure 4.6: The 3x3 kernel used for Sobel horizontal edge detection.

−1 0 1

−2 0 2

−1 0 1

Figure 4.7: The 3x3 kernel used for Sobel vertical edge detection.

edge. Differences between adjacent pixels which are greater than a certain thresh-

old value are identified as edges. Changes in texture, lighting or image noise can

incorrectly result in identifying pixels as edges. The Sobel filters (which are

shown in Figures 4.6 and 4.7) are commonly used for edge detection and combine

differentiation with Gaussian smoothing to reduce noise [113].
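
A minimal sketch of thresholded Sobel edge detection is given below. It applies the two kernels from Figures 4.6 and 4.7 at each interior pixel, combines the two responses into a gradient magnitude, and marks pixels whose magnitude exceeds a threshold. Combining the responses with the Euclidean norm, leaving border pixels unmarked, and the function names are assumptions made for illustration.

```python
import math

SOBEL_HORIZONTAL = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # Figure 4.6
SOBEL_VERTICAL = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]     # Figure 4.7


def apply_kernel(image, kernel, y, x):
    """Weighted sum of the 3x3 neighbourhood centred on (y, x)."""
    return sum(kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
               for ky in range(3) for kx in range(3))


def sobel_edges(image, threshold):
    """Return a binary edge map (1 = edge) for a grey-scale image."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    # Border pixels are skipped because their neighbourhood is incomplete.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gy = apply_kernel(image, SOBEL_HORIZONTAL, y, x)
            gx = apply_kernel(image, SOBEL_VERTICAL, y, x)
            if math.hypot(gx, gy) > threshold:
                edges[y][x] = 1
    return edges


# A dark region meeting a bright region produces a vertical edge.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
edge_map = sobel_edges(step, 200)
```

In this example only the vertical kernel responds, and the edge is marked on both sides of the brightness step, which illustrates why Sobel edges are typically more than one pixel wide.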

Canny [26] proposed a more advanced edge detection method with three main

objectives: a low error rate, well-localised edge points, and only one response

to a single edge (as opposed to using horizontal and vertical responses to

calculate the overall response of the Sobel edge detector). The Canny edge

detector also uses Gaussian smoothing to find image gradients with high spatial

derivatives; however, a number of additional steps are then applied. In order to

detect weak edges, the Canny method uses two thresholds (low and high). Edges

above the lower threshold are only included if they are connected to an edge which

is above the higher threshold. An example of the reduced noise and effective edge

identification from the Canny method is demonstrated in Figure 4.9b.
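The two-threshold step described above (often called hysteresis thresholding) can be sketched as follows. This is an illustrative fragment only: it assumes a gradient magnitude image is already available, uses 8-connectivity, and omits the Gaussian smoothing and other steps of the full Canny method. The function name is hypothetical.

```python
from collections import deque


def hysteresis_threshold(magnitude, low, high):
    """Keep pixels above `high`, plus pixels above `low` connected to them."""
    height, width = len(magnitude), len(magnitude[0])
    edges = [[0] * width for _ in range(height)]
    queue = deque()
    # Seed the search with every strong edge pixel.
    for y in range(height):
        for x in range(width):
            if magnitude[y][x] > high:
                edges[y][x] = 1
                queue.append((y, x))
    # Grow outwards through weak pixels that touch an accepted edge pixel.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < height and 0 <= nx < width
                        and not edges[ny][nx] and magnitude[ny][nx] > low):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges


# The weak response at the far right is not connected to a strong edge,
# so it is discarded, while the weak pixels beside the strong one are kept.
result = hysteresis_threshold([[0, 50, 90, 50, 0, 50]], low=40, high=80)
```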

Line detection

Many of the mobility tasks discussed in Chapter 2 involve the identification of

lines in the environment (for example path following, doorway identification).


Figure 4.8: Sobel edge detection applied to a captured image of a post box (a). Image (b) shows the result of the horizontal Sobel edge kernel. The output from the vertical Sobel edge kernel is shown in image (c).


Figure 4.9: A comparison of different edge detection methods applied to an image of a suburban footpath (a). The output from the Canny detector is shown in (b). The Sobel detector is shown in (c), and the Roberts edge detector is displayed in (d).


The Hough transform is a common method for detecting straight or curved lines

and is robust to noise and additional structures in the image [206]. This transform

is used in a number of prototype devices described in Section 4.3.

The Hough transform is applied to the binary output image from an edge de-

tection algorithm, and identifies which points are associated with particular lines.

This can be done by representing lines by their cartesian or polar coordinates.

Using the polar coordinate system, each pixel (x,y) in the input binary image is

converted using Formula 4.2 to the Hough transform parameters (r,θ).

r = x sinθ + y cosθ (4.2)

Generally an accumulator array is used to store the Hough parameter results

generated from Formula 4.2. If one or more lines exist in the image, there will

be multiple pixels with the same parameter results in Hough space, and therefore

the accumulator array values will be highest for these lines. The Hough transform

does not return the exact length of lines in an image, but returns the description

of these lines. Because the line orientation is available from the (r, θ) parameters,

it is possible to search for lines with a specific orientation (for example, when

searching for stairs in an image).
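The accumulator voting described above can be sketched in a few lines of pure Python. The sketch uses Formula 4.2 exactly as written (many texts use the equivalent form r = x cos θ + y sin θ, which measures θ from the other axis). The number of θ steps (`n_theta`), the rounding of r to whole pixels and the dictionary-based accumulator are illustrative assumptions, and the function names are hypothetical.

```python
import math


def hough_accumulator(edge_image, n_theta=180):
    """Vote in (r, theta) space for every edge pixel of a binary image.

    Uses Formula 4.2 as written: r = x sin(theta) + y cos(theta).
    Returns a dict mapping (r, theta_index) -> number of votes.
    """
    votes = {}
    for y, row in enumerate(edge_image):
        for x, value in enumerate(row):
            if not value:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                r = round(x * math.sin(theta) + y * math.cos(theta))
                votes[(r, t)] = votes.get((r, t), 0) + 1
    return votes


def dominant_line(edge_image, n_theta=180):
    """Return ((r, theta_in_radians), votes) for an accumulator peak."""
    votes = hough_accumulator(edge_image, n_theta)
    (r, t), count = max(votes.items(), key=lambda item: item[1])
    return (r, math.pi * t / n_theta), count


# A horizontal line of five edge pixels at y = 2.
edge_image = [[0] * 5 for _ in range(5)]
edge_image[2] = [1] * 5
accumulator = hough_accumulator(edge_image)
```

For this test image all five edge pixels vote for r = 2 at θ = 0. Restricting the search to a range of θ indices would give the orientation-specific search mentioned above.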

An example of applying the Hough transform is provided in Figure 4.10. In

this example, the transform is used to find the dominant line, which is the lower

edge of a fence.

Figure 4.10: Example application of the Hough transform for locating the fence boundary shown in image (a). Image (b) shows the output from Sobel edge detection. The corresponding Hough transform output is shown in image (c), with the origin in the top left hand side of the image. This transform image was generated using software from Seul et al. [198]. The horizontal axis represents r, and the vertical axis represents θ, which increases from 0 radians in the top left corner to π radians at the bottom. The dominant peak, indicating the dominant line, is shown with a superimposed box. Image (d) shows the pixels which are present along the dominant line found by the Hough transform.

Morphology

Morphology is a useful technique for image analysis, which is often used for noise

reduction and feature detection in binary images [198]. The morphological

operator is referred to as a structuring element, and it is similar to a filter kernel.

The two fundamental operations in morphology are erosion (which is used to thin

region boundaries or increase the size of gaps in an image) and dilation (which is

used to thicken region boundaries or close holes in an image) [56]. In Chapter 5,

a dilation operation was applied to enhance the appearance of edge information

in the reduced 50x50 pixel resolution images.
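Erosion and dilation of a binary image can be sketched as follows. The 3x3 square structuring element, the treatment of pixels outside the image as background and the function names are assumptions made for illustration; they are not necessarily the settings used in Chapter 5.

```python
def dilate(image):
    """Binary dilation with a 3x3 square structuring element."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # A pixel is set if any pixel in its 3x3 neighbourhood is set.
            if any(image[ny][nx]
                   for ny in range(max(0, y - 1), min(height, y + 2))
                   for nx in range(max(0, x - 1), min(width, x + 2))):
                out[y][x] = 1
    return out


def erode(image):
    """Binary erosion with a 3x3 square structuring element.

    Pixels outside the image are treated as background, so border
    pixels are always cleared.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # A pixel survives only if its whole 3x3 neighbourhood is set.
            if all(image[ny][nx]
                   for ny in range(y - 1, y + 2)
                   for nx in range(x - 1, x + 2)):
                out[y][x] = 1
    return out


# A single edge pixel is thickened to a 3x3 block by dilation.
point = [[0] * 5 for _ in range(5)]
point[2][2] = 1
thick = dilate(point)
```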

Image segmentation

A common requirement for computer vision systems is to extract image compo-

nents (such as people, faces, cars and other objects) from the image background.

The process of subdividing an image into parts is called image segmentation.

The main methods of segmentation include:

• The simplest segmentation method is to apply luminance thresholding to

individual pixels. For example, if a dark object is located against a light-grey

background (such as text on a printed page) the object could be identified

from the dark pixels and the background could be ignored. This method

could also be used to identify particular colours (for example, in identifying

fruit in an automatic harvesting application).

• Boundary detection can be used to identify segments from an edge detection

output image. Edge linking methods (such as the Hough transform, curve

fitting, or active contours) can be used to join segments with similar object

boundaries.

• Model based segmentation can be used when the object’s geometric shape

is known a priori. Parallel lines in an image are easily identifiable in Hough

space as the peaks also occur in parallel (which is useful for identifying a

square, or lines in a stair-case). The generalised Hough transform [9] can

be used to identify circles or other shape types.

• Region based methods involve ‘growing’ segments from individual pixels or

small groups of pixels. Region growing can group neighbouring pixels with

similar characteristics such as grey-levels. The split and merge technique uses

small blocks which are joined if they have a similar grey-level [104]. In the

watershed algorithm [186] local minima are found throughout the image, and

these pixels are ‘seeded’. These seeds are then ‘grown’ (or flooded) until

region boundaries (areas of high edge magnitude) are reached, which prevent

the flood from spreading into neighbouring regions.

• Texture based methods can be used to segment an image into regions

with similar textures. Autocorrelation (a measure of the amount of

repetition within an image) and statistical methods, such as the grey-level

co-occurrence matrix and run length matrices, have been previously used to

measure texture [224]. Frequency domain methods, such as Gabor filters

and wavelets, are also useful for texture analysis [84].

• In addition, when using multiple images, motion can be used to segment

objects of interest. A difference image can be created by subtracting a

previous image from a current image which will identify parts of the image

which have changed (this method is useful for identifying moving objects

against a static background, for example in video surveillance applications).

Motion can also be used to estimate optical flow [103],[42] which is discussed

in the next section on scene understanding.
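Two of the simpler methods in the list above, luminance thresholding and the difference image, can be sketched as follows. The function names and the threshold values are hypothetical and chosen only for illustration.

```python
def threshold_segment(image, threshold):
    """Label pixels darker than the threshold as object (1), others as 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def difference_mask(previous, current, min_change):
    """Mark pixels whose grey level changed by more than min_change."""
    return [[1 if abs(c - p) > min_change else 0
             for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


# Dark text (about 30) on a light page (200).
page = [[200, 30, 200],
        [200, 35, 200]]
text_mask = threshold_segment(page, threshold=100)

# One pixel brightens between two frames of a static scene.
moving = difference_mask([[10, 10, 10]], [[10, 90, 12]], min_change=20)
```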

Human perception research

An additional method of segmenting an image is to use information from human

eye tracking experiments. This can be achieved by recording a person’s eye

movements while they are looking at an image or image sequence. By computing

which parts of each image have received the most eye fixations, it is possible to

determine the important regions of interest (ROI) within images (assuming that


the eyes fixate on the more important areas of the image). A number of image

components which influence eye fixations have been identified and include motion,

contrast, colour, size, location, shape, foreground/centrally located objects, edges,

texture, prior instruction and context, people, gestalt properties, clutter and

complexity, and unusual stimuli [161].

Task and context have been found to be important in eye movements. If

sighted observers are asked to view a picture with a specific task or context in

mind, these become important predictors of eye movements [250]. An importance

map concept has been previously developed [162] in which the most visually

important areas of an image receive a weighting. These weightings compare well

to recorded eye movements of the same images. The importance map concept

has also been extended to the automatic detection of important areas in complex

video sequences [163]. These areas include moving objects and centrally located

objects.

For an AHV system, it may be useful to record the eye movements of normally

sighted people while they accomplish common mobility tasks (such as a road

crossing). These data could then be used to estimate the most useful image

components for an AHV system user, and possibly highlight these regions during

real-time use of an AHV system.

As discussed in Chapter 3, previous work on static image AHV simulation by

Boyle et al. [18],[20] has examined the use of various information reduction tech-

niques such as enhancing edges, using different grey-scale levels and presenting

the results of importance mapping on image recognition. Boyle et al. reported

that at low information levels (generally 25x25 pixels) the use of image process-

ing techniques is not helpful in the identification of static scenes, although an

automatic zoom feature was found to aid image understanding.


4.2.2 Scene understanding

The scene understanding component of computer vision is concerned with identi-

fying features and extracting information [81]. This level is also referred to as high

level computer vision. This section will provide a brief overview of object recog-

nition, symbolic representation, motion analysis, obstacle avoidance and machine

learning.

An example application of scene understanding might be to identify a bus

stop, fire hydrant or traffic light in an image. It has been suggested that because

reading and navigation tasks by the blind are possible using non-implant devices

(such as text-to-sound conversion or a cane) the most useful tasks for an AHV

system user may involve scene understanding (such as face recognition) [230]. It

may also be useful to know the distance to the object (number of steps, or time

at current walking speed).

Object Recognition

One of the aims of scene understanding is to determine what the objects are in an

image. To do this, the characteristics or features of these objects must be known

a priori. Segmentation (discussed in the previous section) is usually required to

break the image into regions before each segment is processed to determine

whether it belongs to a particular type of object.

Three main approaches to object recognition can be identified, although there

is some overlap between these approaches (for example, the output from template

matching and shape analysis can be used as features for pattern recognition):

1. Template matching, also known as matched filtering, involves searching for a

known shape within an image. This involves creating a template array, then

moving this array over each pixel position in the image, and calculating the

correlation between the template and the pixel neighbourhood. The

correlation result is obtained by multiplying each image pixel with the matching

template pixel and summing the results: the pixel position with the highest

result should be the closest match to the template [179]. This method only

works for objects of the same shape and size within an image, so an array of

templates for different geometric transformations could be generated to help

find varied instances of objects of interest. However, an array of templates

increases the computational complexity of finding the best template match

within an image.

2. Shape analysis involves matching image segments against previously iden-

tified object geometric features. These features, which are usually obtained

from binary images, include the area, perimeter boundary and curvature

of objects. Statistical moments are also commonly used to describe ob-

jects, and provide attributes which are independent of size, position and

orientation [198]. Fourier descriptors can also be used to describe region

boundaries [81]. A good review of shape representation and description can

be found in Zhang and Lu [254].

3. Statistical pattern recognition attempts to classify object classes where they

have previously been defined (supervised classification) or attempts to de-

fine differences between object classes (called unsupervised classification or

clustering) [241]. Object recognition applications generally use supervised

classification, where the first main stage involves identifying features of an

object class (such as colour, shape, texture) [96]. Linear transforms, such

as Principal Component Analysis (PCA), have also been widely used for

feature extraction and dimensionality reduction (for example in face recog-

nition) [255]. The next object recognition stage involves the classification


of the extracted features into a class of objects. Jain et al. [114] have

suggested that the three main methods used for classifiers are similarity (like the

template approach, the pattern is assigned to the class with which it is most

closely correlated), probability (the pattern is assigned to the class with the

maximum posterior probability) or decision boundaries (which focus on the

minimisation of criteria such as mean squared error). Artificial Neural Networks

(ANNs) are often used as an efficient implementation platform for classic

statistical pattern recognition methods [114]. A good recent review of image

processing using neural networks is presented by Egmont-Peterson et al.

[62]. An example ANN classifier for enhanced blind mobility is described

in Section 4.4. Additional pattern recognition methods include Support

Vector Machines (SVM) [25] and Hidden Markov Models (HMM) [125].
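The correlation step described for template matching (the first approach above) can be sketched as follows. This is a minimal illustration of the unnormalised multiply-and-sum calculation as described; the function name is hypothetical, and the reported position is assumed to be the top-left corner of the best matching window.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best unnormalised correlation match."""
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    best_pos, best_score = None, None
    # Slide the template over every position where it fits entirely.
    for y in range(image_h - templ_h + 1):
        for x in range(image_w - templ_w + 1):
            score = sum(image[y + j][x + i] * template[j][i]
                        for j in range(templ_h) for i in range(templ_w))
            if best_score is None or score > best_score:
                best_pos, best_score = (y, x), score
    return best_pos, best_score


image = [[0, 0, 0, 0],
         [0, 0, 1, 2],
         [0, 0, 3, 4],
         [0, 0, 0, 0]]
template = [[1, 2],
            [3, 4]]
position, score = match_template(image, template)
```

Because the raw sum of products also grows with overall image brightness, practical systems commonly normalise the correlation; that refinement is omitted here for brevity.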

Motion analysis

The motion of a person provides visual information about movement relative

to the environment and information about the depths of observed scene points

[154]. Therefore the analysis of image sequences is desirable in a mobility device.

An interesting feature of image sequences is that less spatial resolution may be

required when image contents move [80]. One monocular method of judging

depth is motion parallax which is used when objects are moving at equal speed:

those which are closer to the observer seem to move faster. This information is

one method used to obtain mobility information by sighted people (for example,

when approaching a railway platform).

Obstacle avoidance

Obstacle detection and avoidance involves a combined estimate of ego-motion and

scene structure, as objects only become obstacles when they are in an observer’s

anticipated path. When a human perceives that an object is about to hit their


head there are usually stereotypical movements of the head and closing of the eyes.

Gibson has suggested that this action is based on the characteristic increases in

the size of the object as it approaches the head [77]. Human infants have been

found to display this behavior in computer simulations of collision, and diving

gannets also appear to use changes in the retinal image to decide when collision

with water is about to occur [17]. A simple approach to obstacle avoidance is to

examine optical flow in the left and right halves of the visual field and to turn

in the direction of smallest optical flow (bees use a similar method for travelling

down a corridor) [141]. An alternate method is to obtain two or more contiguous,

segmented images from an image sequence and calculate which segmented

components of the image have changed size between images. If these segments

continue to expand past a threshold rate and size (perhaps 25% of the display)

then a looming alert warning should occur. Following this assumption, the block

based obstacle alert presented in Chapter 6 estimates the optical flow (discussed

in Section 2.7) of looming image segments in front of a head-mounted camera to

provide a real-time warning to research participants using an AHV simulation.
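The segment-expansion test described above can be sketched as follows. This is a simplified illustration and not the block based method of Chapter 6: it assumes the pixel area of one tracked segment is already known for each frame, and the growth ratio (`min_growth`) is a hypothetical parameter. The 25% size limit follows the figure suggested in the text.

```python
def looming_alert(segment_areas, display_area,
                  min_growth=1.2, max_fraction=0.25):
    """Return True if a segment keeps expanding and passes the size limit.

    segment_areas: pixel areas of the same segment in consecutive frames.
    min_growth: assumed minimum frame-to-frame growth ratio.
    max_fraction: fraction of the display above which the alert fires.
    """
    if len(segment_areas) < 2:
        return False
    expanding = all(later >= earlier * min_growth
                    for earlier, later in zip(segment_areas,
                                              segment_areas[1:]))
    return expanding and segment_areas[-1] > max_fraction * display_area


display = 50 * 50  # a 50x50 pixel display
approaching = looming_alert([100, 150, 700], display)
distant = looming_alert([100, 150, 200], display)
```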

Symbolic representation

When objects of interest have been recognised by an AHV system it could be

appropriate to present a symbolic representation, where an idealised or reduced

image is presented. For example a small part of the phosphene grid (perhaps 5x5

phosphenes) could be used for information on obstacle locations in the current

environment. Figure 4.1d shows a symbolic depiction of a looming obstacle.

Auditory information could also be provided either by tone or through natural

language. A scene description mode could be useful (similar to the system by

Tou et al. [228], discussed below in Section 4.4). Research on raised line pictures

for the blind could be useful for deciding on symbolic representations of objects.

Knowledge representation

The interpretation of objects depends on knowledge of possible objects, and might

also depend on context (for example, an outdoor scene versus a home environ-

ment). For orientation, it may be useful if an AHV system using a scene under-

standing approach could learn to recognise new objects - for example, an image

of a particular type of building (such as a tram stop) could be added to the ex-

isting object knowledge base. This interpretation combines methods from object

recognition with knowledge of the expected image content (an area of Artificial

Intelligence (AI) known as knowledge representation [138]).

4.3 Previous applications of computer vision to

assist the vision impaired

This section reviews research which has applied computer vision methods to as-

sist vision impaired people. In 2005 the first workshop on computer vision for

the visually impaired was held during the annual Computer Vision and Pattern

Recognition (CVPR 2005) conference in San Diego. Topics presented included

wayfinding (orientation); visual, audio and tactile interfaces; and sign detection. A

paper based on Chapter 7 of this thesis [57] was also presented at this conference.

The purpose of this section is to discuss the different image-based approaches to

the task of providing useful information to the blind and to evaluate these efforts.

It may be useful to integrate components of this research with AHV system soft-

ware. The eleven papers which are surveyed below all involve prototype systems

only. An additional aid, the vOICe system, was discussed in Section 2.4.3 and

is the only freely available blind mobility software which uses a computer vision

approach.


Obstacle avoidance

An early mobility device was reported in 1985 by Tou & Adjouadi [228]. This

system used spoken output to describe the current scene. Two modes were pro-

vided: the first attempted to identify a safe route for the traveller, using an

analysis of grey levels within the image. The second mode used scene analysis in

an ‘object-identification’ mode. This mode attempted to use the aspect ratio (the

ratio between the width and height of the image) of identified objects to

categorise each object into one of three classes: long thin objects (such as a pole or mail box),

square or circular objects (such as a pot hole), and large objects (such as a car or

wall). When an obstacle was detected, the system provided a warning and asked

the user to walk slowly. If object identification was required, the blind person

would need to stop walking and wait while this processing took place. An image

correspondence technique was used to identify drop-offs. Although this system

was too slow for real-time use, it demonstrated that computer vision

techniques could be useful with future improvements in hardware speed.

More recently, a real-time hazard detection system for people with low vision

was proposed at the Human Interface Technology Laboratory at the University of

Washington [3]. Although results were not provided, this paper presented an

interesting computer vision approach. The real-time system collected image data

using a small head-mounted camera, and displayed an enhanced image on to

an optical scanning virtual retinal display. The system assumed that common

hazards (such as curbs, stairways and doors) would be based on straight lines;

therefore the Hough transform was used to detect straight lines. The set of

lines for each image were then passed to a neural network for classification. The

input vectors used were the orientation and position of each line, and the output

vector was a confidence value reflecting the likelihood of a hazard. The system

was trained using several minutes of video from doors and staircases captured


at the University of Washington. Misclassification was associated with poor lighting or

failure of the camera to focus or adjust to lighting conditions (camera details were

not provided).

A 1998 paper by Snaith et al. [204] reported on the use of edge detection

to determine the positions of lines in an image. The grouping of these lines

was used to classify objects (such as doorways). Paths were also identified using

edges and the Hough transform was used to group these into straight lines. The

dominant vanishing point was then identified to indicate a person’s direction of

travel. A similar approach for a blind mobility device was investigated by Molton

et al. [151]. Their device used stereo vision, combined with sonar, to detect

obstacles and curbs. Once an image was captured, edge points were detected and

the Hough transform used to locate parallel line clusters (which were assumed to

represent curb or path information).

Distance information

A head-worn stereo camera based device was proposed by Cyganek and Borgosz

[43]. Their software calculated depth information using a disparity map created

from two captured images using a small central window of the combined images.

In a hardware implementation the authors propose that this window could be

adjusted by the user depending on the context. The calculated depth information

was then converted to a stereo sound output for a blind pedestrian. This paper

shows impressive output from three images; however, it is difficult to evaluate

the software as no details were provided on computational efficiency or the

accuracy of the depth mapping.

Stair case detection

The identification of stair cases was addressed by Se & Brady [197]. This research

used a texture detection method (using Gabor filters) to locate distant stair cases.


Once a person had moved close enough to the stairs, they were then detected by

searching for groups of concurrent lines. The intensity variation was then used

to partition the convex and concave lines. Once the stairs were identified, a

further step was applied to find their vertical rotation and slope (to help a blind

person find the stair base and work out how steep the steps were). The vertical

rotation and slope of the stair case were then used to transform the captured

image into a new image with the camera facing the stairs. Although reasonable

results were achieved, the approach was found to be slow and not suitable for

real-time applications.

General object recognition

An additional object recognition system for blind mobility was developed by

Everingham et al. at the University of Bristol [63]. This system used a trained neural

network implementation to classify segments from an image into previously de-

termined object classes (road, pavement, sky, building, vegetation, obstacle or ve-

hicle). Once identified, these object regions were displayed in different colours to

people with low vision. Two databases, the Bristol Image Database (200 outdoor

and suburban images) and the Bristol Blind Mobility Database (10 ‘challenging’

urban scenes) were used to train the neural network classifier. Thirty-five differ-

ent features from each object were used for each feature vector (listed in Table

4.1). This system performed adequately on a restricted number of environments;

however, the Gabor calculations are computationally expensive and the system

required 9.5 seconds for each image. By discarding selected texture frequencies

the processing time dropped to 300 ms at slightly lower classification accuracy. A

pilot experiment using this approach with static images was conducted with 16

legally blind participants and was found to increase the rate of object recognition

(compared to the unmodified original images) by more than 100%.


Table 4.1: The feature set used in Everingham et al. [63]

Feature   Description
1         Size (proportion of image)
2-3       Position ((x, y) co-ordinates)
4-5       Orientation (sine & cosine of angle)
6-8       Colour (mean colour components)
9-18      Shape (invariant Fourier descriptors)
19-35     Texture (mean Gabor magnitude)

Sign Detection

Chen and Yuille [32] investigated automatic text detection using a cascade classi-

fier. The cascade approach is efficient as it searches for individual sets of features

(such as regions which do not contain text) and then eliminates these regions

from further classification. This method was used to develop text-detection soft-

ware which was trained on 423 street scene images and 4000 images without text.

When tested on a database of 530 test images, the system was able to process

40 fps with a 91% detection rate. Chen and Yuille suggest that this application

could be useful to identify image regions of interest for a blind person, who could

then choose to zoom in on these regions.
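The early-rejection idea behind a cascade can be sketched as follows. This is a minimal illustration only: the two stage tests (`has_contrast` and `has_many_transitions`) and their thresholds are hypothetical stand-ins, and are not the features used by Chen and Yuille [32].

```python
def has_contrast(region, min_range=50):
    """Cheap first stage (hypothetical): reject flat regions."""
    pixels = [p for row in region for p in row]
    return max(pixels) - min(pixels) >= min_range


def has_many_transitions(region, threshold=128, min_transitions=4):
    """Costlier second stage (hypothetical): count dark/light changes."""
    transitions = 0
    for row in region:
        bits = [p >= threshold for p in row]
        transitions += sum(a != b for a, b in zip(bits, bits[1:]))
    return transitions >= min_transitions


def cascade_accepts(region, stages=(has_contrast, has_many_transitions)):
    """Return True only if every stage accepts; stop at the first rejection."""
    for stage in stages:
        if not stage(region):
            return False  # region eliminated, later stages never run
    return True


flat = [[10] * 6, [10] * 6]                      # uniform background
striped = [[0, 255, 0, 255, 0, 255]] * 2         # text-like strokes
single_step = [[0, 0, 0, 255, 255, 255]] * 2     # one edge only
```

Most regions of a street scene are expected to fail the cheap first test, so the more expensive tests only run on the few remaining candidate regions.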

Two different papers on sign detection and classification were also presented

at CVPR 2005 from researchers at the University of Massachusetts. In the first,

Silapachote et al. [203] use local colour and texture features to identify sign

regions, and then classify these regions by comparison with previously identified

sign classes. A correct sign classification rate of 97% was reported. In the second

paper, Mattar et al. [146] use a similar approach to sign detection, and provide

the results of tests on 3975 sign images from two different image datasets (in-

corporating variations in lighting, orientation and viewing angle). Mattar et al.

reported a recognition accuracy of 99.5% with 35 sign classes, and 92.8% when


65 sign classes were used.

Drop off detection

As discussed in Chapter 2, changes in terrain depth (drop-offs) are a significant

problem for blind mobility. The only paper which has explicitly addressed drop-

off detection is by Yuan and Manduchi [252]. In their novel approach a prototype

hand held ‘virtual white cane’ was built which contained a laser pointer and a

camera. Matched filtering was used to detect the laser light return from captured

images. This range information was then provided to the user using a tactile or

auditory display at 15 Hz.

Pedestrian crossing detection

The final paper reviewed in this section, by Uddin and Shioyama [234],

investigated the use of computer vision to assist blind mobility by detecting the

standard black and white lines of pedestrian crossings. Their approach was to

segment the image and search for regions which are highly bipolar (that is, in a

histogram there will be two peaks representing the darker and lighter grayscale

pixels). The location, direction and band frequency were then analysed (by

checking that there are four or more white bands in the region) to extract the

crossing. A collection of 100 static images was processed, resulting in 95%

detection accuracy; however, no details were provided on computational

efficiency or the effects of other

scene factors (such as lighting, clutter or occlusions).
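The two checks described above (a bipolar grey-level distribution and four or more white bands) can be sketched for a single line of pixels taken across a candidate region. This is a rough simplification, not the method of Uddin and Shioyama [234]: the fixed threshold, the `min_fraction` and `min_separation` parameters and all function names are assumptions made for illustration.

```python
def count_white_bands(scanline, threshold=128):
    """Count runs of consecutive pixels at or above the threshold."""
    bands = 0
    in_band = False
    for pixel in scanline:
        is_white = pixel >= threshold
        if is_white and not in_band:
            bands += 1
        in_band = is_white
    return bands


def is_bipolar(pixels, threshold=128, min_fraction=0.25, min_separation=100):
    """Rough two-peak test: both groups well populated and far apart."""
    dark = [p for p in pixels if p < threshold]
    light = [p for p in pixels if p >= threshold]
    if not dark or not light:
        return False
    smaller = min(len(dark), len(light)) / len(pixels)
    separation = sum(light) / len(light) - sum(dark) / len(dark)
    return smaller >= min_fraction and separation >= min_separation


def looks_like_crossing(scanline, min_bands=4):
    """Combine the bipolarity test with the white band count."""
    return is_bipolar(scanline) and count_white_bands(scanline) >= min_bands


crossing = ([20] * 3 + [230] * 3) * 4   # four dark/white stripe pairs
road = [20] * 24                        # plain road surface
two_bands = [20] * 6 + [230] * 6 + [20] * 6 + [230] * 6
```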

In summary, a number of different papers have been investigated which pro-

vide various experimental computer vision based devices for the blind. The most

widely used computer vision techniques are the use of edges, the Hough trans-

form and statistical pattern recognition. Most of the systems described process

individual images in an image sequence, rather than using information (such as

object movement) from the differences between images. A common constraint in

much of this research has been the difficulty in providing output quickly enough

for the device to be useful for a mobile pedestrian. Few of the systems described

in these papers were evaluated by visually impaired people. It is also unclear

in a number of papers how the proposed systems would be used in practice (for

example how the information would be conveyed to a user). It would be useful

to assess objectively whether these computer vision based devices lead to an im-

provement in mobility performance compared to traditional mobility aids. None

of the proposed systems has been developed commercially.

4.4 Relationship between computer vision meth-

ods and the Human Vision System (HVS)

One of the main functions of an AHV system is to take information captured

from a non-biological sensor (such as a camera) and convert these signals into a

representation which can be interpreted by a blind person. There are significant

parallels between computer vision methods (many of which have been inspired

by biological systems) and the methods used by the human vision system. In

this section a brief discussion of the relationship between artificial and biological

algorithms for extracting information from images is provided. The aim of this

section is to provide a link between the computer vision methods discussed in this

chapter and the HVS review provided in the previous chapter.

Light is initially captured and converted to neural signals by approximately

100 million photoreceptors in the retina of the human eye [239]. An artificial

system relies on a camera (such as a Charge-Coupled Device (CCD)) which

converts a captured image into a two dimensional array of numbers.

As discussed in Chapter 3, a large amount of processing occurs on the signals

generated from the photoreceptors, which reduces the signals from approximately

100 million retinal rods and cones to around 1 million ganglion cells [156]. The retinal

ganglion cells generally have concentric receptive fields which react to light falling

on a central region, but are inhibited by light falling in the surrounding area (an

ON-centre cell). As these cells do not react to uniform patches of light, they

provide information on contrast borders [227]. In addition, the photoreceptors in

the central fovea of the eye are more densely located than in other parts of the

retina which assists in data reduction. These biological methods are similar to

the image processing methods of filtering, edge detection and data compression.
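The response of an ON-centre cell resembles a centre-surround image filter. A minimal sketch, assuming a 3x3 neighbourhood with a centre weight of 8 and surround weights of −1 (illustrative values, not physiological measurements), and with negative responses clipped to zero:

```python
def on_centre_response(image, y, x):
    """Centre-surround response at (y, x), which must not lie on the border.

    The centre pixel is excitatory and its eight neighbours are inhibitory.
    """
    surround = sum(image[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0))
    return max(0, 8 * image[y][x] - surround)


uniform = [[10, 10, 10]] * 3             # uniform patch of light
spot = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]  # light on the centre only
border = [[0, 10, 10]] * 3               # dark/light contrast border
```

The uniform patch produces no response because the excitation and inhibition cancel, while the spot and the contrast border both produce a positive response.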

The lateral geniculate nucleus (LGN) is responsible for combining and

routing signals from the left and right sides of the retina in each eye to the visual

cortex [24]. The layers in the primary visual cortex are responsible for processing

depth perception and the orientation of receptive fields (such as bars of light or

edges in a particular orientation) [156]. Although most visual information appears

to be processed first in the primary visual cortex, there are many other areas in-

volved in processing visual information such as V2, V3, the mid-temporal cortex

(where neurons are particularly sensitive to stimulus movement), V4 (colour pro-

cessing) and the inferotemporal cortex (where stimulus size, shape, contrast and

colour appear to be processed) [227]. Attempts to mirror these biological pro-

cessing methods in computer vision research include texture processing, contour

extraction, segmentation, shape analysis, depth perception, motion detection and

face and object recognition.

In summary, increasingly sophisticated information is extracted in the human

vision system as information moves through each stage. These methods of infor-

mation extraction from images are approximated by computer vision methods,

which generally rely on a camera, frame grabber and computer. Table 4.2 pro-

vides a summary of the main types of computer vision functionality processed by

the HVS.

Table 4.2: Overview of computer vision functionality performed by each part of the HVS (based on Thorpe [227]).

HVS Location                      Functionality approximated by computer vision
--------------------------------  ---------------------------------------------
1. Retina (Photoreceptors)        Image capture

2. Retina (Processing)            Image enhancement (e.g. ON-centre cells for
                                  contrasting borders)
                                  Edge detection
                                  Data compression
                                  Short range motion detection
                                  Colour constancy

3. Lateral Geniculate Nucleus     Routing of information from each eye

4. Cortical processing (V1-V4)    Texture processing
                                  Contours and spatial frequency
                                  Segregation between figure and ground
                                  Segmentation
                                  Determining shape of objects
                                  Stereoscopic depth perception
                                  Long range motion detection
                                  Colour of visual stimuli
                                  Face recognition
                                  Object recognition

4.5 A conceptual framework for AHV system

information display

Although the development of an AHV system involves research from a diverse

range of specialists, there has not been a unifying framework which combines the

requirements of blind end-users with different AHV system components. This
section presents a new conceptual framework (shown in Figure 4.11), which is
based on the literature reviews of blind mobility (Chapter 2), AHV technology
(Chapter 3) and computer vision (Chapter 4). The conceptual framework is
discussed in detail below, and also provides the context in which the
remainder of this thesis is presented.

The conceptual framework is made up of different influences (for example, the

weather or location) which will affect how information from an AHV system (or

other mobility device) is perceived. The arrows in the framework indicate how

different factors interact: for example, both lighting (a dynamic, scene factor) and

camera resolution (an external, AHV technology factor) will influence the effectiveness

of a number of computer vision methods (for example, histogram equalisation may

be effective in enhancing images in dull lighting). The output from the computer

vision processing stage can either be used for the AHV display, or it can be

presented by another display modality (for example with an auditory warning).

Dynamic factors can often have an effect on mobility effectiveness without any

computer vision processing: an example of this would be a person’s knowledge

that they are holding a handrail while walking down a sloping path.
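The histogram equalisation example mentioned above can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `equalise_histogram` and the list-of-rows image representation are assumptions made for illustration, and only the Python standard library is used.

```python
def equalise_histogram(image, levels=256):
    """Return a histogram-equalised copy of a grey-level image.

    `image` is a list of rows of integer grey levels in [0, levels - 1].
    Hypothetical helper for illustration only.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Count how often each grey level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of grey levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Every pixel has the same grey level: nothing to stretch.
        return [row[:] for row in image]

    # Map each grey level so that the output uses the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A dull image whose grey levels are squeezed into a narrow band.
dull = [[50, 51], [52, 53]]
enhanced = equalise_histogram(dull)
```

In this toy case the four closely spaced grey levels are spread across the whole 0 to 255 range, which is the effect that could help a low-contrast scene survive reduction to a small number of phosphenes.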

In the next section each main component involved in the conceptual framework

will be briefly discussed. This is followed by a hypothesised scenario involving

a person using an AHV system to perform a number of mobility related tasks.

Finally the hypothesised scenario is linked back to the framework and a number

of benefits of a framework approach are discussed.

Dynamic factors

In the proposed framework, dynamic factors are those which relate to the current

situation and goals of a mobile person. As a person moves, these inter-related

factors can change rapidly. The identified dynamic factors are:

• Context: Situations and environments evoke different expectations of what
behaviour and actions are possible. Context is used here to describe the
expectations (or schema) that a person will have in different situations,
which will in turn affect the type of mobility information required. For
example, a person walking along a beach may not need to identify straight
lines within captured images. However, if the beach is very crowded a
looming obstacle alert may be useful.

[Figure 4.11: Factors which influence the display processing for an Artificial Human Vision system. The diagram groups factors into: Dynamic factors (Context, Scene properties, Task, Sensory Information, Environment); External factors (AHV Technology, Human Factors, Non-image sensors); Computer Vision methods for AHV; AHV Display Type (Alert Information, Symbolic, Standard Display); Other display modalities; and Mobility Performance.]


• Scene properties: The extraction of information from captured scenes is

affected by a number of properties, such as lighting, low contrast, clutter

and texture. Different methods (such as automatic gain in a camera, or

filters) may be required to compensate for undesirable scene properties. A

number of scene properties have been identified in Chapter 2 as important

for low vision mobility (and likely also for AHV mobility). These properties

included lighting conditions, glare and visual clutter.

• Task: Different information is required depending on the current task.

A road crossing task may emphasise a straight path to the opposite curb

(to prevent veering), whereas a task involving identifying a set of keys on a

cluttered table may involve zooming or object recognition. Face recognition,

reading and walking up and down stairs were also identified as important

tasks for blind and low vision mobility in Chapter 2.

• Sensory Information: Auditory cues (such as the sound of an approach-

ing object) are particularly important for blind mobility and navigation.

Tactile cues (such as hand-rails or braille strips on a footpath) are also

important for effective mobility. In addition, temperature changes (for ex-

ample from an air-conditioning unit) and smells (such as a bakery or partic-

ular plants) also provide sensory information. Finally, proprioception (the

sensation of motion, position, location and orientation of a person’s body

in space) provides dynamic information about the current environment (for

example, when walking up or down a slope).


• Environment: The dynamic properties of the physical environment are

also important for a blind traveller. These properties include the weather

(for example, rain), landmarks, people or rubbish bins on a footpath. Af-

fordances (discussed in Chapter 2) are properties of the environment which

represent relationships between a person and the environment (such as a

door handle which affords a person the means to open a door) [79]. Addi-

tional examples of affordances are signs and paths.

External factors

This group of inter-related factors is important for displaying information;
however, these factors do not change while a person is moving (and are therefore
external to the current mobility situation).

• AHV Technology: The different components of an AHV system (dis-

cussed in detail in Chapter 3) affect the amount of information which can

be obtained (by camera properties such as frame rate, resolution and field

of view), processed and displayed (for example by the limited number of

electrodes and presentation frame rate restrictions). Potential bottlenecks

in an AHV system include the (possibly wireless) links between camera,

processor and stimulator unit.

• Human factors: Individual psychological and physical differences between

people may also affect the information display required from an AHV sys-

tem. Example differences include the amount of mobility training received,

duration of blindness, motivation, age, memory, expectancies and gender.

• Non-image sensors: In addition to image information captured using a

camera, information about the environment can be provided from other

sensors. Ultrasound and laser technologies have been used in a number of


ETAs for the blind and the information from these sensors could be inte-

grated into an AHV system display (for example for collision detection).

Global Positioning System technology could provide information on a per-

son’s current location and could be integrated with mapping software to

display navigation information.

Computer vision methods

After they are acquired from a camera, images need to be processed before they
can be used for an AHV display. As there are a limited number of electrodes
available, captured images will usually need to be reduced to a lower spatial
resolution (for example from 160x120 pixels to 16x12 phosphenes). A large number of
additional computer vision methods, discussed earlier in this chapter, can be applied to

enhance the effectiveness of an AHV display. The computer vision methods used

will depend on the dynamic and external factors discussed above. For example,

if a person is searching for a blue shop sign, a helpful computer vision method

may be to apply a blue colour filter and select an edge display (assuming the sign

has straight edges, for example in a square or rectangle). The computer vision

system may also combine information from non-image sensors.
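The reduction from 160x120 pixels to 16x12 phosphenes described above can be sketched as a simple block average. This is an illustrative assumption rather than the method used in the thesis (whose example code relies on Matlab's `imresize`); the function name `downsample_to_phosphenes` is hypothetical.

```python
def downsample_to_phosphenes(image, out_cols, out_rows):
    """Reduce a grey-level image to out_cols x out_rows by block averaging.

    `image` is a list of rows of integer grey levels. The input size must
    be an exact multiple of the output size (e.g. 160x120 -> 16x12).
    """
    in_rows, in_cols = len(image), len(image[0])
    if in_rows % out_rows or in_cols % out_cols:
        raise ValueError("input size must be a multiple of the output size")

    block_h = in_rows // out_rows
    block_w = in_cols // out_cols
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Sum every pixel that falls inside this phosphene's block.
            block_sum = sum(
                image[r * block_h + dy][c * block_w + dx]
                for dy in range(block_h)
                for dx in range(block_w)
            )
            row.append(block_sum // (block_h * block_w))
        result.append(row)
    return result


# Synthetic 160x120 frame: dark left half, bright right half.
frame = [[0 if x < 80 else 200 for x in range(160)] for _ in range(120)]
phosphenes = downsample_to_phosphenes(frame, out_cols=16, out_rows=12)
```

Each phosphene here summarises a 10x10 block of camera pixels, which makes the scale of the information loss concrete: one hundred pixel values are replaced by a single brightness.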

AHV Display Type

• Standard Display: The method used in most current AHV prototype

systems is to resize captured images to a lower resolution and then use

each pixel in the reduced resolution image to drive a single electrode. The

resizing may be combined with a smoothing filter (to reduce noise) and

edge detection (as in the Dobelle system [48]). A simulation of the standard

display is shown in Figure 4.1b.

• Alert: As discussed in Chapter 2, looming obstacles and drop-offs are


serious problems for a blind pedestrian. It would be beneficial if an AHV

system could continually search for hazardous features of the current scene.

These alerts, such as an approaching tree branch (obstacle detection) or

descending stairs (drop off) could run as background tasks, and interrupt

the current display when required (for example, by filling a quarter of the

current display with bright phosphenes).

• Symbolic: This type of display, shown in Figure 4.1d, would extract salient

objects from captured images and display a symbolic, or cartoon-like repre-

sentation. Therefore this display mode would rely on a scene understanding

approach. For example, a person searching for a sign could have any sign

objects in the current image shown as a group of four phosphenes.
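The alert behaviour described in the list above (interrupting the current display by filling a quarter of it with bright phosphenes) could be sketched as follows. The function name `overlay_alert`, the quadrant labels and the brightness value are assumptions made for illustration; the thesis does not specify an implementation.

```python
QUADRANTS = {"top-left", "top-right", "bottom-left", "bottom-right"}


def overlay_alert(display, quadrant, bright=255):
    """Return a copy of the phosphene display with one quadrant set to bright.

    `display` is a list of rows of phosphene brightness values.
    """
    if quadrant not in QUADRANTS:
        raise ValueError("unknown quadrant: " + quadrant)

    rows, cols = len(display), len(display[0])
    half_r, half_c = rows // 2, cols // 2
    row_range = range(0, half_r) if quadrant.startswith("top") else range(half_r, rows)
    col_range = range(0, half_c) if quadrant.endswith("left") else range(half_c, cols)

    # Copy so that the underlying display can be restored after the alert.
    out = [row[:] for row in display]
    for r in row_range:
        for c in col_range:
            out[r][c] = bright
    return out


# A blank 16x12 phosphene display with a looming obstacle on the upper left.
blank = [[0] * 16 for _ in range(12)]
alerted = overlay_alert(blank, "top-left")
```

Returning a copy rather than modifying the display in place reflects the idea that an alert interrupts the current display temporarily and the previous display mode then resumes.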

Other display modalities

Although the primary method of displaying AHV information would be from elec-

trodes, additional information (particularly a warning) could be presented using

auditory channels (for example, an alarm sounding in the left ear could represent

a looming collision on that side of a person’s body), or tactile channels (such as a

vibrating alert on the left shoulder for a collision). However, sensory substitution

may overload an existing sensory input which could reduce the effectiveness of an

AHV device (for example if a person was required to wear headphones to hear the

audio alert). By including a non-AHV display in the framework, it is possible to

compare traditional and ETA mobility aids with an AHV system (these devices

were discussed in Chapter 2).

Mobility Performance

The final component of the mobility framework represents the dependent vari-

ables used to measure mobility. The mobility performance component allows the


experimenter to assess how the factors in other framework components affect an
individual’s mobility effectiveness. As discussed in Chapter 2, a large number
of these measures have been included in previous O&M studies, with three of the most

common being Percentage of Preferred Walking Speed (PPWS), number of times

veering has occurred, and contact with obstacles.
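As a worked illustration of the first of these measures, PPWS is commonly computed as the walking speed achieved on a mobility course expressed as a percentage of the person's preferred walking speed. The exact definition adopted in Chapter 2 is not reproduced in this section, so the formula and the function name `ppws` below should be read as assumptions.

```python
def ppws(course_distance_m, course_time_s, preferred_speed_m_per_s):
    """Percentage of Preferred Walking Speed for one course traversal.

    Assumed definition: (speed on course / preferred speed) x 100.
    """
    if course_time_s <= 0 or preferred_speed_m_per_s <= 0:
        raise ValueError("time and preferred speed must be positive")
    course_speed = course_distance_m / course_time_s
    return 100.0 * course_speed / preferred_speed_m_per_s


# Hypothetical figures: a 20 m course walked in 40 s by a person whose
# preferred walking speed is 1.0 m/s.
score = ppws(course_distance_m=20, course_time_s=40, preferred_speed_m_per_s=1.0)
```

With these hypothetical figures the person walks the course at half of their preferred speed, giving a PPWS of 50.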

4.5.1 Hypothesised operational scenario

The following simplified example illustrates the relationship between the factors

shown in Figure 4.11 and shows the type of computer vision system which is

possible with current techniques.

This hypothetical example takes place a few years in the future and involves a

40 year old female, K, who has no light perception. K lost her sight five years ago

as a result of non-arteritic ischemic optic neuropathy (a condition which prevents
the optic nerve from receiving sufficient blood flow). As this condition damages the

optic nerve, K is unable to use a retinal or optic nerve prosthesis. Twelve months

ago K received a new generation intra-cortical implant. After surgery, training

and calibration K is able to use this system for around 10 hours each day.

In this scenario, K needs to travel from her suburban house by bus to a mu-

sic store in the city. K has travelled with a normally-sighted friend previously,

however this is her first independent trip. K exits her house, follows the drive-

way to a gate, and steps onto the pavement. K knows the tactile feeling of the

pavement. She orients herself in the direction of the local bus stop, and presses

one of the buttons on a small wireless computer located inside her pocket. A tiny

wireless camera, located inside a pair of glasses worn by K, captures and transmits
images to this computer. The 32x24 phosphene display is updated to show a

familiar symbolic menu. K selects the sign recognition display mode. As the bus

approaches the stop, K is able to confirm the bus number, and signals for the bus


to stop.

As the bus travels toward the city, K watches for known landmarks. She

cannot rely on timing the journey or counting the number of times the bus stops,

as many of the stops are empty and the bus continues without stopping. Therefore

K switches to a symbolic map mode display and selects her destination in the

city. The computer uses GPS information to plot K’s current location. As the

target location becomes closer, K confirms the location with the bus driver, who

stops at the required stop. K exits the bus and uses the GPS map to orient herself

in the correct direction, and then switches back to the 32x24 phosphene display.

As K walks along a pavement toward the city centre, the display automatically

switches to 16x12 phosphenes with a faster frame rate. K slows down as she

approaches an intersection, and the display switches automatically back to 32x24

phosphene mode. This intersection has traffic lights, so K selects a ‘walk traffic

signal’ recognition mode, which flashes when the walk signal is displayed. K

also listens to ensure that the traffic has stopped before crossing. As she walks

toward the shops the path becomes crowded and the automatic looming obstacle

alert is frequently shown. K remembers the locations of tactile strips in the

pavement and uses these while walking into the main shopping area. K switches

her display to a doorway identification mode (using object recognition software)

which utilises text identification software to automatically alert K to written signs.

K navigates to the music shop and enters the shop. In addition to the sound of

the salesperson’s voice, face recognition software confirms that K is talking with

the same person she spoke with the week before.


4.5.2 Benefits of a conceptual framework for AHV infor-

mation display

In the hypothesised scenario above, various computer vision methods are used to

enable K’s trip from home to a music shop. Although the AHV system provides

useful information for K, much of the information she requires is provided from

other sources (such as auditory, tactile, pre-existing knowledge of the objects

in the environment and mental maps to help her navigate). The AHV system

provides a number of different processing modes, such as object recognition,
character recognition and looming obstacle alerts. These modes can be initiated by K, or
are displayed dynamically (such as the looming obstacle alert). To be useful the
system needs to provide a consistent interface (for example, an intuitive method
to select the processing mode, and a standard layout for any symbolic displays).

The reliability of the system is critical as incorrect information could reduce K’s

confidence in the system, and could lead to serious injury.

The conceptual framework is a significant contribution of this thesis which

supports and guides the development of an adaptive AHV system, and enables

the dynamic adjustment of display properties in real-time. The benefits of the

conceptual framework include:

• Experimental control of different factors : By manipulating and controlling

different factors from the framework the effectiveness of different AHV sys-

tem displays can be measured (such as altering display temporal resolution

while using a standard mobility assessment technique). To support this
point, the application of the framework to two previous mobility experiments
is discussed in the next section.

• Common language: The framework allows a common language for AHV

users, medical specialists, engineers, scientists, software developers, O&M

specialists and other groups.


• Standardised requirements : Research on the effects of different factors (such

as display types) can lead to a standardised set of requirements for AHV sys-

tem components (for example a common display interface used for menus).

This may lead to interchangeable components and a standardised testing

methodology.

• Training : The effects of different factors impacting on mobility effectiveness

(such as age of onset of blindness) can be examined. Using these results dif-

ferent training strategies and training assessment methods can be developed

and compared.

• Finally, the framework supports the development of adaptive systems which
alter their method of computer vision processing depending on a number
of dynamic and external factors (for example, depending on the current task
being performed by an end-user).
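The adaptive behaviour described in the final point can be sketched as a small rule table that maps framework factors to computer vision methods. The rules below are drawn only from examples given in this section (dull lighting, a blue shop sign, keys on a cluttered table, a crowded path); the function name `select_methods` and the string labels are hypothetical, and a real system would need a far richer model.

```python
def select_methods(task, context, lighting):
    """Choose computer vision methods from the current framework factors.

    Illustrative rule set only; labels are assumptions, not thesis terms.
    """
    # Spatial reduction is always needed because electrodes are limited.
    methods = ["reduce spatial resolution"]

    if lighting == "dull":
        methods.append("histogram equalisation")

    if task == "find blue sign":
        methods += ["blue colour filter", "edge detection"]
    elif task == "find keys":
        methods += ["zoom mode", "object recognition"]

    if context == "crowded":
        # Alerts run in the background and interrupt the display.
        methods.append("looming obstacle alert")

    return methods


plan = select_methods(task="find blue sign", context="crowded", lighting="dull")
```

Keeping the rules in one function makes each factor's influence explicit, which is also what allows individual factors to be manipulated and compared experimentally.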

4.5.3 Application of the conceptual framework for previ-

ous AHV research

The conceptual framework needs to be sufficiently flexible to encompass different

types of mobility research. This section demonstrates how the framework can be

applied to two previous mobility experiments discussed in Chapter 2.

AHV mobility simulation

The first paper considered is the seminal AHV simulation mobility research by

Cha et al. from the University of Utah [29]. The conceptual framework for this

research is shown in Figure 4.12. The AHV simulation from this study did not

use computer vision methods: instead simulated phosphenes were displayed by

attaching different types of mask onto the display screen. The context of the


study was an indoor artificial mobility course, and participants were asked to

move through a maze without hitting obstacles (which is specified in the frame-

work task component). Environment factors available include the wall and door-

way markings. Sensory information and scene properties (such as lighting and

contrast) would have been influences on individual mobility performance. The

external factors include the head mounted camera used to capture images, and

the lenses used to reduce the field of view available to participants. Finally the

mobility assessment dependent variables used in this study were obstacle contacts

and time spent in the maze for each participant.

Low vision mobility assessment

The second paper discussed in this section is a mobility experiment published by

Long et al. [135]. In this paper, 22 participants with low vision had their vision

assessed before being asked to walk through two different paths in three unfamiliar

mobility environments. The conceptual framework for Long et al. is shown in

Figure 4.13. The context for this study is the three mobility environments (for

example, a classroom building). The tasks involved walking along a path and

avoiding obstacles. Again, computer vision methods were not required in this

research. However, participants wore sunglasses with different levels of reduced

illumination, and these have been included as external factors. Impaired vision

has been included as an ‘other display modality’ as this study did not involve

AHV simulation.

[Figure 4.12: Conceptual framework applied to Cha et al.’s simulated AHV mobility experiment. Factors which are not included in this study are marked with a line pattern. Entries specific to this study: Context (indoor artificial mobility course); Task (walking through maze, obstacle avoidance); Computer vision methods for AHV (none); Environment (signs, landmarks); Standard Display (phosphene masks); Mobility Performance (obstacle contacts, time on course).]

[Figure 4.13: Conceptual framework applied to Long et al.’s low vision mobility experiment. Factors which are not included in this study are marked with a line pattern. Entries specific to this study: Context (classroom building, residential area, small business area); Task (walking along path, obstacle avoidance); Computer vision methods for AHV (none); reduced illumination sunglasses; Other display modalities (auditory, tactile, impaired vision); Mobility Performance (loss of balance, veering, shuffling and others).]

4.6 Chapter Summary

Computer vision methods provide a critical link between the camera and
electrode array of an effective AHV system. This chapter has examined the main
methods from computer vision for the reduction of unimportant information in
the identification of image components. A number of prototype systems to assist
the blind were then reviewed to illustrate how these methods can be applied.
Despite the number of systems reviewed, many struggle to provide useful
mobility information in real-time, and only one has received a reasonable level of
acceptance (the auditory vOICe system).

This chapter has also presented a framework for AHV system information

display. This framework, based on the reviews of blindness, blind mobility, AHV
systems and computer vision, includes the main factors which impact on a blind
traveller. The main benefits of using this framework are enhanced communication
between AHV researchers and the ability to explore and compare different factors
experimentally (such as age or gender, different types of computer vision methods,
and different environments).

The framework presented in this chapter has guided the mobility assessment

methodologies contained in the next four chapters of this thesis. In Chapter 5

the ability to identify mobility-related information from degraded static images

is explored. Chapters 6 to 8 investigate the effect of various image processing

methods on the mobility of subjects wearing an AHV simulation.

Chapter 5

AHV Mobility Assessment using

Static Images

5.1 Introduction

This chapter describes a computer-based AHV simulation experiment using static

images. As discussed in Chapter 3, static AHV simulation images have previously

been used by Boyle et al. [18] to examine the effects of various image processing

techniques on object recognition. This experiment aimed to investigate the three

main research questions presented in Chapter 1:

Can specific main factors be identified as highly significant for provid-

ing mobility information in an AHV system?

It is anticipated that some mobility information will be available from low reso-

lution images. In this experiment participants were asked to identify a number

of mobility related components from low resolution static images. These image

components were selected based on the Chapter 2 review and included: people;

tall obstacles (such as a tree or pole); low obstacles (such as a chair) and drop
offs (such as a down-stair). In addition, the eight images chosen for this
experiment contained different image contexts (such as indoor office, outdoor path) and

scene properties (such as image clutter). Participants were also asked to imag-

ine they were walking while using the low resolution image as their only visual

information, and to predict where their next step should be placed.

Can objective measures be developed for the comparison of effective-

ness between AHV systems in providing mobility information?

This experiment has been performed using custom computer software and static

images. Reduced resolution static images have been used by other authors in a

range of simulated AHV experiments (for example, [226], [18] and [126]), there-

fore there is reason to believe that static images may also be effective for the

identification of mobility related image components.

Can computer vision techniques be adopted and modified to provide

mobility information in an AHV system?

As discussed in the previous chapter, edge detection can reduce the amount of

data, while preserving the important structural information in an image. Previous

AHV simulation research by Boyle et al. [18] has indicated that edge detection

may not be very useful for low resolution static images (25 x 25 phosphenes or
below). However (as mentioned in Chapter 3), this is contradicted by the Dobelle

institute [48] who found that Sobel edge detection was useful in their commercial

AHV device for recognising useful scene components, such as doorways. There-

fore, one aim of the experiment presented in this chapter is to investigate whether

edge detection would be beneficial at a higher 50x50 pixel resolution.

Additionally this investigation attempts to determine if different types of edge

detection affect the recognition of low resolution mobility-related images. As

discussed in Chapter 4, two of the most widely implemented methods are the
Canny and Sobel algorithms, and these two methods are applied to each of the

static images and compared. As the Canny method convolves the image with

a Gaussian smoothing operator before calculating the edge locations, it is less

sensitive to noise, and was expected to result in better recognition performance
than the Sobel method.
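The effect of Gaussian smoothing on noise sensitivity can be illustrated with a small sketch. It applies the Sobel operator to a flat image containing a single noisy pixel, with and without a 3x3 Gaussian-like pre-filter. This covers only the smoothing step: the full Canny method also includes non-maximum suppression and hysteresis thresholding, which are omitted here. The helper names and the particular 3x3 kernel are assumptions for illustration.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16


def filter3x3(image, kernel, divisor=1):
    """Apply a 3x3 kernel with replicated borders (no kernel flip)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for kr in range(3):
                for kc in range(3):
                    rr = min(max(r + kr - 1, 0), rows - 1)
                    cc = min(max(c + kc - 1, 0), cols - 1)
                    acc += kernel[kr][kc] * image[rr][cc]
            out[r][c] = acc / divisor
    return out


def sobel_magnitude(image):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [
        [(gx[r][c] ** 2 + gy[r][c] ** 2) ** 0.5 for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# Flat grey image with one noisy pixel in the centre.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 200

raw_peak = max(max(row) for row in sobel_magnitude(noisy))
smoothed = filter3x3(noisy, GAUSS_3X3, divisor=16)
smoothed_peak = max(max(row) for row in sobel_magnitude(smoothed))
```

The isolated pixel produces a much weaker gradient response after smoothing, so it is less likely to exceed an edge threshold and appear as a spurious edge.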

These questions are closely connected to a number of the conceptual frame-

work components shown in Figure 5.1. Note that in this framework figure, com-

ponents which are not relevant to this study are shown in grey. The dynamic

factors addressed include image context and scene properties. The computer vi-

sion methods applied involved reduced resolution, grey-scale images, filtering and

edge detection. External factors which could influence the identification of mobil-

ity related components could include human factors (such as age, or experience

with low resolution displays) and camera properties.

5.2 Method

This study involved presenting low resolution static images which had been pro-

cessed using four different methods. Research participants were required to per-

form two main tasks for each image: (a) whether they were able to identify a

number of mobility related components and (b) to click on each image to record

where they believed the mobility related component was located.

5.2.1 Images selected

This experiment involved eight mobility-related images (shown in Figure 5.2).

Three of these images were captured by the author using a 160x120 pixel resolu-

tion PDA Pretec CompactFlash card Camera (images a, g and h). Five images

were obtained from web searches (images b, c, d, e and f). For consistency all
images were resized to 256x256 pixel resolution. These images were chosen as
they included both indoor and outdoor scenes, different mobility related objects
of interest (such as cars, people and drop-offs), and different levels of clutter and
brightness.

[Figure 5.1: Conceptual framework diagram showing factors which influence simulated AHV display effectiveness in this chapter. Factors from Chapter 4 which are excluded from this chapter are marked with a line pattern. Entries specific to this chapter: Computer vision methods for AHV (spatial resolution, number of grey levels, low pass filter (smoothing), high pass filter (edges)); Context (indoor office, indoor bathroom, outdoor path, outdoor train station); Scene properties (texture, complexity, lighting, contrast, type of objects); Task (walking); Mobility Performance (scene components identified, selected next step).]

[Figure 5.2: Original mobility images used in this chapter, panels (a) to (h). A brief description of each image is shown in Table 5.1.]


As mentioned, this experiment assumed that a hypothesised AHV system

was capable of displaying a 50x50 phosphene resolution. To reduce the risk of

participants having a priori knowledge of the images, it was decided to use unique

images for this experiment which were unlikely to have been previously seen.

Each of the eight images was processed using four different methods (creating

a test set of 32 images):

1. Binary only

2. Binary output from Canny edge detection

3. Grey-level (8 bit)

4. Binary output from Sobel edge detection

The stages of processing are summarised in Figure 5.4 below. Edge detection

sensitivity thresholds, resulting in the most accurate representation of mobility

information, were (subjectively) selected for each image (these are listed in Table

5.2). For Canny edge detection, the standard deviation of the Gaussian filter (σ)

was equal to 1 for all images. In addition, the edge images were dilated using a

flat, disk-shaped morphological structuring element, which helped to retain the

edge information after the image was resized. To simulate the pixelization effect

in a 50x50 phosphene resolution from an AHV system, all images were then

resized from their original 256x256 pixel resolution to 50x50 pixel resolution and

then back to 256x256 pixels. Finally, a 3x3 neighbourhood median filter was

applied to each image to soften the pixelization effects after resizing the images.

All image processing was conducted using the Matlab Image Processing Toolbox.


iptsetpref('ImshowBorder','tight');
x = imread('Z:\ChildOnStreet.bmp');
figure, imshow(x);
% Sobel edge detection (sensitivity threshold for image A)
x1 = edge(x,'sobel',.09);
figure, imshow(x1);
% These resize commands are designed to
% emulate the pixelization from 50x50 phosphenes
x2 = imresize(x1,[50 50]);
x3 = imresize(x2,[256 256]);
% Highlight edge information (this step only
% used for Canny and Sobel image types)
se = strel('disk',1);
x3 = imdilate(x3,se);
figure, imshow(x3);

Figure 5.3: Example Matlab code used for generating images. This example creates the output from Sobel edge detection for image A (Child on street).

The Matlab code for generating the Sobel edge detection output for image A is

shown in Figure 5.3.

5.2.2 Assessing mobility information

For each of the 32 processed images, participants were asked a series of five

questions. These questions were selected to investigate the amount of mobility

related information which could be identified by participants and were based on

the National Research Council’s summary of blind pedestrians’ needs (discussed

in Section 2.3). Each participant was required to respond to the following five

questions for each image:

Q1. Can you identify a person in this image?

Q2. Can you identify a tall obstacle (e.g. pole/tree)?

Q3. Can you identify a drop off in this image?


Table 5.1: Mobility related image components identified for each image.

Image Person Tall Obstacle Drop Off Low Obstacle

A. Child on street X X X

B. Path near road X X

C. Person in office X X X

D. Person in bathroom X X

E. Sparse office X

F. Street scene with tree X X

G. Phone booth obstacle X X X

H. Railway platform X X

Table 5.2: Image edge detection and line enhancement thresholds for each image. Note that the Canny sensitivity listed is the high threshold value. The low threshold value was set to 0.4 times the high threshold.

Image                       Sobel sensitivity   Canny sensitivity   Dilation disk size
A. Child on street          .09                 .30                 2
B. Path near road           .18                 .45                 1
C. Person in office         .14                 .45                 1
D. Person in bathroom       .16                 .40                 3
E. Sparse office            .17                 .60                 2
F. Street scene with tree   .10                 .45                 1
G. Phone booth obstacle     .14                 .35                 2
H. Railway platform         .17                 .40                 1

[Figure 5.4 flowchart: the input image is converted to 256 grey levels. Image type 1 is then converted to 2 grey levels (binary), image type 2 has Canny edge detection applied, image type 4 has Sobel edge detection applied, and image type 3 is left as grey-scale. Every image is then resized to 50x50 pixels, resized back to 256x256 pixels and passed through a 3x3 median filter to give the output image.]

Figure 5.4: Flowchart showing the image processing steps applied for each of the four image types used in this chapter. Image types: 1. Binary (no edge detection); 2. Binary (Canny edge detection); 3. Grey-scale; 4. Binary (Sobel edge detection).


Figure 5.5: Image processing applied to image A (Child on street). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.6: Image processing applied to image B (Path near road). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.7: Image processing applied to image C (Person in office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.8: Image processing applied to image D (Person in bathroom). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.9: Image processing applied to image E (Sparse office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.10: Image processing applied to image F (Street scene with tree). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.11: Image processing applied to image G (Phone booth obstacle). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).

Figure 5.12: Image processing applied to image H (Railway platform). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in figure (a). The binary image is shown in image (b) and the Canny edge detection image is shown in image (c). The 50x50 8 bit grey-scale image is shown in image (d). Finally, the Sobel edge detection output is shown in image (e).


Q4. Can you identify a low obstacle (such as a chair)?

Q5. Please imagine you are moving through the scene and this image is the only

visual information available to you. Where would you aim your next step?

Please click on this button and then select the location on the image.

For each of the first four questions, the subject was then required to select a rank on a 5-point Likert scale as follows:

1. Definitely yes

2. Probably yes

3. Don’t Know

4. Probably no

5. Definitely no

Software implementation details

The software for this research was written using Microsoft Visual Basic 6.0 and

presented on a Windows 2000 laptop. The user interface comprised two screens:

a) An initial screen with a randomly generated participant ID, and b) the main

experiment screen (shown in Figure 5.13). The image presentation sequence

was randomized for each volunteer. For each question, if a ranking between 1

(‘definitely yes’) and 4 (‘probably no’) was selected, the subject was prompted

to click on the image location which best matched the object referred to in the

question. If a subject selected a ranking of ‘Definitely no’, they were not required

to click on an image location. These coordinates, along with the participant ID

and display image details, were recorded in an ASCII delimited text file for further

analysis.


Figure 5.13: A sample screen from the static image experiment. The x and y values on the right hand side of the screen show which part of the image received a mouse click for each question. If the participant selected a response of ‘5=Definitely No’ for a question, x and y were set to -1 by default.


Table 5.3: This table shows the ranges used to convert the original x and y coordinates (recorded for each question and image combination) into a simplified 5x5 element array. For example, the x,y value (227,156) would be re-coded to (5,4). The simplified values were then compared against an array of ‘correct responses’ for each question type.

Original x or y value Grid x or y value

Between 0 and 51 1

Between 52 and 102 2




5.2.3 Procedure

Ten postgraduate students or staff at the Queensland University of Technology

volunteered to participate in the study. All participants had normal or corrected-

to-normal vision. Each participant was asked to sit in front of a computer with

the software loaded and the initial screen displayed. A definition of a ‘drop-

off’ was verbally provided to all subjects (as this is not a common expression

outside the O&M literature). Following this, participants were asked to read the

instructions on the screen and click on a start button. The first of the 32 randomly allocated images was then presented on the computer.

Statistical analysis

To simplify the process of evaluating participant responses, recorded image co-

ordinates for each question/image were first reduced in scale from 256x256 to a

5x5 grid. The conversion values used are shown in Table 5.3. An example of the

5x5 grid is shown in Figure 5.5a. The grid origin is in the top left corner.
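The re-coding from pixel coordinates to grid cells can be illustrated with a short sketch. This is not the software used in the study: only the first two ranges (0 to 51 and 52 to 102) are listed in Table 5.3, so the remaining upper bounds below are an assumption, chosen in steps of 51 pixels so that the example value (227,156) re-codes to (5,4). The function names are hypothetical.

```python
# Inclusive upper bound of each grid cell along one axis.
# Only 51 and 102 come from Table 5.3; 153, 204 and 255 are assumed.
ASSUMED_UPPER_BOUNDS = [51, 102, 153, 204, 255]


def to_grid_index(value):
    """Map one pixel coordinate (0-255) to a grid index (1-5)."""
    if value < 0:
        raise ValueError(f"coordinate {value} is negative")
    for index, upper in enumerate(ASSUMED_UPPER_BOUNDS, start=1):
        if value <= upper:
            return index
    raise ValueError(f"coordinate {value} is outside the 256x256 image")


def to_grid_cell(x, y):
    """Re-code a mouse click to a 5x5 grid cell (origin in the top left).

    A click of (-1, -1) is the default recorded for 'Definitely no',
    so it has no grid cell and None is returned.
    """
    if x == -1 and y == -1:
        return None
    return (to_grid_index(x), to_grid_index(y))


if __name__ == "__main__":
    print(to_grid_cell(227, 156))
```

Under these assumed ranges, `to_grid_cell(227, 156)` gives (5, 4), matching the example in the caption of Table 5.3.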

Prior to conducting the experiment, a matrix of ‘correct’ grid locations for each image and question type combination was generated (this file consisted of 169 entries). After the experiment was completed, the grid locations selected by each participant were compared to the list of ‘correct grid locations’ to identify correct locations.

Table 5.4: Steps in identifying correct/incorrect and identified/not identified grid responses.

Response   Is the question valid for this image?   Response classified as
1, 2       Yes                                     True Positive
1, 2       No                                      False Positive
4, 5       Yes                                     False Negative
4, 5       No                                      True Negative

Not all questions were valid for all images. The mobility related components identified for each image are shown in Table 5.1; for example, there are no people in images 5, 6 and 8 (images E, F and H). However, participants may have incorrectly identified these objects or people (false positives), which may be important for mobility (for example, a low resolution image of a phone booth may be misunderstood to be a person). Therefore question responses were classified into true positive, false positive, true negative and false negative categories according to Table 5.4. Question responses of ‘3. Don’t Know’ were excluded from this classification.
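The classification rule in Table 5.4 can be written as a small function. This is an illustrative sketch rather than the code used for the analysis; the function name and the string labels are hypothetical.

```python
def classify_response(rank, question_valid):
    """Classify one Likert response according to Table 5.4.

    rank: 1 (definitely yes) to 5 (definitely no).
    question_valid: True if the queried component is present in the image
    (see Table 5.1).
    Returns None for rank 3 ('Don't know'), which is excluded.
    """
    if rank not in (1, 2, 3, 4, 5):
        raise ValueError(f"rank must be between 1 and 5, got {rank}")
    if rank == 3:
        return None
    answered_yes = rank in (1, 2)
    if answered_yes:
        return "true positive" if question_valid else "false positive"
    return "false negative" if question_valid else "true negative"
```

For example, a ‘Probably yes’ response to the person question on an image with no person would be classified as a false positive.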

Unless stated otherwise, statistical significance was at the p<.05 level. The

Statistical Package for the Social Sciences (SPSS) (2004, SPSS Inc, Chicago,

USA) was used for all statistical calculations. Due to the small sample size of 10

participants and ordinal scale data recorded from this experiment, nonparametric

statistics have been used for analysis.


5.3 Results

The response results for questions 1-4 on all presented images are displayed in

Figure 5.14. The response of ‘Definitely no’ comprised 45% of responses to ques-

tions 1 to 4, and was highest (60%) for the identification of people (question

1).

Figure 5.15 shows responses by different image types. Most of the ‘definitely

yes’ responses (65%) were related to Grey scale images, which also had the lowest

proportion of ‘don’t know’ responses (9%). The results for binary and edge

detected images were similar.

As discussed in Section 5.2.3, the grid locations selected by subjects were

divided into four groups: true positive, true negative, false positive and false

negative. The results for all images are summarised in Table 5.5. Participants

clicked on the correct image locations in 77% of ‘Definitely Yes’ responses and

59.7% of ‘Probably Yes’ responses. Interestingly in 28.5% of responses, where

the participant selected ‘Probably No’, they actually selected the correct image

location for that image. Results for each image are presented in Figures 5.16-5.23.

The results for questions 1 to 4 were significantly different for the eight types

of image (χ²=281.83, n=1077, p<0.01). Image D (person in bathroom) and

E (sparse office) received the highest frequency of true identifications (91% and

87.4% respectively). Images C (person in office), G (phone booth obstacle) and

B (path near road) (39.1%, 40.2% and 42.2% respectively) received the lowest

frequency of correct identifications.

Grey scale images received the highest number of true responses (71%). There was no significant difference between results for the two types of edge detection over all images (χ²=.055, df=1, n=523, p=0.815). The results for the two types of edge detection were significantly worse than both binary and grayscale methods of image processing over all images (χ²=17.08, df=2, n=1077, p<0.01). There was no significant difference between true responses for binary and edge detected images (χ²=0.63, df=1, n=785, p=0.43).

Table 5.5: Summary of response classification for each image type. Note that 203 responses of ‘Don’t know’ have been excluded from classification.

Image type    True Positive   True Negative   False Positive   False Negative   Total
Binary only   40              106             12               104              262
Canny         41              110             10               94               255
Grey-scale    108             100             32               52               292
Sobel         43              113             10               102              268
Total         232             429             64               352              1077

Figure 5.14: Summary of question responses for each of the 32 images presented. [Chart: user response (Definitely yes, Probably yes, Don't know, Probably no, Definitely no) for each question (Person, Tall obstacle, Drop off, Low obstacle).]
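The two df=1 comparisons above can be checked against the counts in Table 5.5 with a short calculation. The sketch below assumes that a ‘true’ response is a true positive or true negative, that a ‘false’ response is a false positive or false negative, and that an uncorrected Pearson chi-square was used; these groupings are an assumption, although they do reproduce the reported sample sizes (n=523 and n=785). The statistics in this chapter were calculated in SPSS, not with this code.

```python
def pearson_chi_square(table):
    """Uncorrected Pearson chi-square for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, n


# Counts from Table 5.5: (true positive, true negative, false positive, false negative)
TABLE_5_5 = {
    "Binary only": (40, 106, 12, 104),
    "Canny": (41, 110, 10, 94),
    "Grey-scale": (108, 100, 32, 52),
    "Sobel": (43, 113, 10, 102),
}


def true_false(counts):
    """Collapse the four categories into (true responses, false responses)."""
    tp, tn, fp, fn = counts
    return (tp + tn, fp + fn)


canny = true_false(TABLE_5_5["Canny"])
sobel = true_false(TABLE_5_5["Sobel"])
binary = true_false(TABLE_5_5["Binary only"])
edge = (canny[0] + sobel[0], canny[1] + sobel[1])

canny_vs_sobel = pearson_chi_square([canny, sobel])
binary_vs_edge = pearson_chi_square([binary, edge])

if __name__ == "__main__":
    print("Canny vs Sobel: chi2=%.3f, df=%d, n=%d" % canny_vs_sobel)
    print("Binary vs edge: chi2=%.3f, df=%d, n=%d" % binary_vs_edge)
```

Under these assumptions the two statistics come out at approximately 0.055 and 0.63, in line with the values reported above. The same table gives 208 true responses out of 292 for grey-scale images, which is the 71% quoted.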

Question 5 asked subjects to select where they would place their next step. Results for this question (and the ‘correct’ grid locations) were similar for each image. As shown in Figure 5.24, the grayscale images scored the highest percentage of correct responses to question 5. There was no significant difference in results between Binary, Sobel and Canny images (χ²=1.39, df=2, n=240, p=0.50). The results for Sobel and Canny edge detection methods for this question were identical (84% correct).


Figure 5.15: Summary of responses for each image processing method used in this experiment. [Chart: user response (Definitely yes, Probably yes, Don't know, Probably no, Definitely no) for each image type (Binary only, Canny, Grey-scale, Sobel).]

Figure 5.16: Results for questions 1-4 for each processing method on image 1 (Child on street).

Figure 5.17: Results for questions 1-4 for each processing method on image 2 (Path near road).

Figure 5.18: Results for questions 1-4 for each processing method on image 3 (Person in office).

Figure 5.19: Results for questions 1-4 for each processing method on image 4 (Person in bathroom).

Figure 5.20: Results for questions 1-4 for each processing method on image 5 (Sparse office).

Figure 5.21: Results for questions 1-4 for each processing method on image 6 (Street scene with tree).

Figure 5.22: Results for questions 1-4 for each processing method on image 7 (Phone booth obstacle).

Figure 5.23: Results for questions 1-4 for each processing method on image 8 (Railway platform).

Figure 5.24: Question 5 (‘next step’) result summary for each type of image.


5.4 Discussion

The purpose of this study was to investigate the benefits of some simple forms of

image processing on mobility related static images at 50x50 pixel resolution.



There were significant differences between results for the different base images (shown in Figure 5.2). The correct recognition of mobility related components was best for images 1 (Child on street) and 4 (Person in bathroom). These images are less cluttered than the other images. Images 2 (Path near road), 5 (Sparse office) and 8 (Railway platform) had significantly less correct recognition. The original resizing of images to 160x120 pixels may also have contributed to these results. The results for the ‘next step’ question (Figure 5.24) were high for all image types. This indicates that a greater range of mobility related images (such as doorways or stairs) may be required.



This experiment has demonstrated that a static image based AHV simulation can provide useful information regarding mobility. There are a number of advantages associated with static image experiments over mobility course studies. These include: portability (for example using a laptop); control of extraneous variables (such as lighting conditions); an increased number of participants (it is easier to recruit people for a computer based study); and ease of data recording (participant responses can be automatically recorded during experiments). However, the use of static images simplifies the mobility task. Additional important components of mobility include the effects of auditory, tactile, kinesthetic (the feeling of motion) and proprioceptive (information about a person’s position, location and movement in space) information.

By using image sequences (video) rather than static images, it should be possible to use a lower pixel resolution when a subject is able to use ego and object motion to assist with object identification: Cha et al. [30] found that head movements were important in improving mobility performance at 25x25 resolution. Applying an ecological approach to AHV system development could emphasise this movement in a complex and changing environment. The movement of a head-mounted camera of a visual prosthesis patient would produce a transformation in captured images (optic flow) which can be useful for segregating an image into component parts [32], and the rate of expansion can be used to calculate the amount of time before a collision will occur. By processing and presenting image sequences it should also be possible to use fewer pixels for similar object recognition performance. Therefore the next three chapters will focus on processing image sequences for AHV simulation.
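As an illustration of how the rate of expansion relates to the time before a collision, the following sketch estimates time to contact from the apparent size of an object in two consecutive frames. It assumes a constant closing speed and an object whose image size is inversely proportional to its distance; the function name and parameters are hypothetical and this is not the method implemented later in this thesis.

```python
def time_to_contact(size_prev, size_curr, frame_interval):
    """Estimate seconds until contact from the expansion of an object.

    size_prev, size_curr: apparent size (for example width in pixels) of the
    same object in the previous and current frames.
    frame_interval: time between the two frames in seconds.
    Returns None if the object is not expanding (no approach detected).

    With image size proportional to 1/distance and constant closing speed,
    the time remaining at the current frame is
    frame_interval * size_prev / (size_curr - size_prev).
    """
    if size_prev <= 0 or size_curr <= 0 or frame_interval <= 0:
        raise ValueError("sizes and frame interval must be positive")
    growth = size_curr - size_prev
    if growth <= 0:
        return None
    return frame_interval * size_prev / growth
```

For example, an object that grows from 20 to 22 pixels wide over 0.2 seconds has closed one eleventh of its earlier distance, giving an estimated 2 seconds to contact. Note that no knowledge of the object's real size or distance is needed.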

One issue with the research design for this experiment was that participants

may have been primed by previous exposure to an image (for example, if they saw

a grey-scale image of the office scene and then viewed the binary version of the same

image). This is a consequence of randomising the display order of the 32 images.

In future work it would be beneficial to add a constraint to the ordering of images,

so that different processed versions of the same image cannot be repeated during

the experiment.



In this experiment four different methods were used to process images. The 256

grey-level image type resulted in significantly better recognition of mobility com-

ponents than the other image types. No significant difference was found between


either the Canny or Sobel edge detection methods and the binary threshold image

type. Therefore the results presented in this chapter do not support the use of

edge detection at this resolution of static images. Based on these results, if edge

detection was required in an AHV system, the Sobel method appears more suited

to an AHV system due to its lower computational cost than the Canny method.

5.5 Chapter Summary

This chapter has described a computer based experiment using low resolution

static images. A number of the factors from the AHV display framework presented

in Chapter 4 have been involved in this experiment. Neither the Canny nor

Sobel methods of edge detection were found to be more useful than a standard

thresholded binary image representation for recognising mobility related scene

components. A number of benefits associated with using a static image approach

have been discussed. However mobility involves movement of a person’s body in

a dynamic environment, and these factors are not considered in a static image

AHV simulation. Therefore, the remainder of this thesis will focus on the use of

image sequences and wearable AHV simulation.

Chapter 6

AHV Simulation and Obstacle

Detection using a Personal

Digital Assistant

6.1 Introduction

The previous chapter on static image research identified a number of limitations concerning AHV mobility assessment. In that chapter it was argued that mobility assessment should be more valid and generalisable if the effects of auditory, tactile, kinesthetic and proprioceptive information on mobility are also considered. Therefore this chapter and the remaining experimental chapters of this thesis are concerned with portable, head-mounted simulations.

This chapter addresses the following research question from Chapter 1:




As discussed in Chapter 2, the aesthetic appearance of Electronic Travel Aids
(ETAs) for the blind is very important. It would be preferable if AHV system users

were able to use a small, hidden computer for camera image processing. Current

generation Personal Digital Assistant (PDA) computers meet this requirement,

and can be easily concealed in a pocket or on a belt. Therefore this chapter

describes the development of a PDA-based AHV simulation. The PDA display

itself is used to present the phosphene simulation which enables a normally sighted

participant to be assessed on various mobility tasks under different contexts, alert

scenarios and image processing conditions.

The PDA chosen for this experiment was a Hewlett Packard iPaq 2210 us-

ing an Intel PXA255 processor. This PDA has a number of constraints (which

are common to all low cost PDAs): no floating point processor, relatively slow

bus speed and processor and limited Random Access Memory (RAM). These

constraints are likely to be typical of actual AHV devices (for example, [215]).

Working within these constraints, this chapter describes the development and

evaluation of a novel method of providing a real-time looming obstacle alert. This

alert display is hypothesised to result in reduced obstacle contacts by participants

during mobility assessment.

6.2 Method

6.2.1 Hardware

There are three main components to the PDA AHV simulator described in this chapter: the PDA itself; the camera; and the headgear used to attach the PDA centrally in front of a participant’s eyes. Each of these components is discussed below.

Hewlett Packard iPaq 2210 Pocket PC

The main benefits of using a PDA are its small size, light weight and a reduction in the number of connecting cables. Current generation PDAs are, however, constrained by a relatively slow CPU, and lack a floating-point unit for real number computation.

The main operating systems for PDAs are currently Palm OS and Microsoft Windows Mobile Pocket PC 2003. The Microsoft operating system was chosen due to the availability of free development software. The HP iPaq was selected because (at the time of purchase) it contained the fastest processor (a 400 MHz Intel XScale PXA255) and fastest internal bus speed (200 MHz).

Lifeview Flycam CompactFlash Camera Card

For image capture, a Lifeview Flycam CompactFlash (CF) Camera Card is used,

capable of capturing static images at a resolution up to 640x480 pixels, although

only 160x120 is available for video capture. This camera uses a 1/4 inch CMOS

sensor, has a viewing angle of 52°, and has automatic gain and exposure. A ring

outside the lens provides a manual focus for the camera.

Ideally a head mounted camera could be connected (preferably via a wireless link) to a concealed PDA. However, this was not possible at the time of development; therefore the camera was required to be inserted into the CF card slot at the top of the PDA. This meant that the PDA itself needed to be head-mounted, which was feasible as the combined weight of the camera and PDA was 164 grams. The advantage of placing the PDA in front of a person’s eyes was that the PDA display itself could function as the simulation display.



Figure 6.1: Front and side views of the AHV simulator used in the present study.

Headgear

A standard head brace device was adapted to include a bracket for holding the

PDA in front of a person’s eyes (see Figure 6.1). The viewing distance from the

eyes was approximately 65 cm. The PDA screen display was 8.89 cm diagonal

with a resolution of 240 x 320 pixels.

6.2.2 Obstacle avoidance

The inclusion of an automatic looming obstacle alert would be a beneficial func-

tion in an AHV system. There are two main methods to achieve this goal on a

portable AHV system:

1. Process distance information provided by ultrasound or laser sensors. Apart from the disadvantages of these technologies discussed in the section on Electronic Travel Aids in Chapter 2, processing the received distance information would place an additional burden on the AHV system.

2. Use the images captured from the AHV system camera. In this thesis a single camera is used, as this is simpler to implement, is easier to conceal, requires less processing resources, and is biologically feasible (humans with monocular vision are still able to judge distances and impending collisions). A different, and more computationally intensive, approach would be to use multiple cameras (which, when calibrated, would provide stereo information in the same way as binocular human vision).

The camera used in this experiment was located on a person’s head. In

contrast to a camera located on a different part of the body (such as the feet), a

single, head-mounted, camera would identify objects which are expanding toward

the head. As discussed in Chapter 2, chest or head high looming obstacles are

often difficult to detect (for example, in Figure 4.1 the main body of the phone

booth is at chest height, while the base is quite narrow). In addition, a collision to the head, particularly while wearing an implant system, is probably more serious than a collision to other parts of the body.

The traditional approach to image based obstacle avoidance, using a single

camera, is to estimate the optical flow within the image sequence, compensate for

camera motion (ego motion), and suggest turning toward the direction where the

optical flow is smaller [141]. The PDA based looming obstacle alert developed and

presented in this chapter is broadly based on optic flow and motion estimation,

both of which will be briefly described below.

Motion Estimation

Optic flow is discussed in the blind mobility, visual perception and computer vi-

sion literature, and was first proposed by the experimental psychologist James

J. Gibson [76]. When a person (or camera) is moving, optic flow provides in-

formation on the spatial structure of the outside environment. If an object is

located directly in front of the observer while they move toward it, the central

part of the object will have no optic flow. However, the object edges will move

out as the object expands. The most popular methods for computing optical

flow include the differential, region based matching, frequency and phase based

approaches [10]. However, using these methods to extract optical flow on a PDA



in near real-time (for example, 5 fps) is computationally challenging (at the time

of writing). Therefore in this chapter, a block based approach is used to estimate

optic flow.

A block-based approach to calculating displacement vectors is implemented in

the widely used MPEG 1 and 2 video compression standards, where a single mo-

tion vector is estimated for each 16x16 block. As the differences between images

in an image sequence are often small, during video encoding the movement of

blocks within an image can be calculated and stored, instead of the actual block.

As the block motion vectors require much less space than storing the entire block,

a considerable amount of inter-frame compression can be achieved.

Block matching searches for the location of the best-matching block in the next frame(s) based on a distance criterion. Generally the Mean Square Error (MSE) or Mean Absolute Error (MAE) is used as the matching criterion. A number of fast search methods have been developed to assist with motion estimation [75].

Due to the lack of an FPU, a PDA implementation needs to rely on integer arithmetic as far as possible. To evaluate motion estimation information on a PDA, a fast integer-based search algorithm from Srinivasan and Rao [212] was implemented. The present implementation involved dividing a captured image (120x160 pixels in size) into 7x10 blocks, each of which is 16x16 pixels. These blocks are shown in the sample image in Figure 6.2. The block-based approach involves searching for the location of each block in the previous image. A 5x5 pixel search area from the centre of each block was used. The Sum of Absolute Differences (SAD) between pixel brightness in the current and previous image blocks was computed. The block with the lowest SAD value is assumed to be the same block, and the motion vector is stored. The motion vectors are estimated at around 6 frames per second on the PDA.
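The block matching step can be sketched as follows. This is not the fast search of Srinivasan and Rao [212] used on the PDA: for clarity it substitutes an exhaustive search over every candidate offset, and it interprets the 5x5 pixel search area as offsets of up to 2 pixels in each direction, which is an assumption. Only integer arithmetic is used. Images are lists of rows of grey-level values, and all names are hypothetical.

```python
def block_sad(curr, prev, cx, cy, px, py, size):
    """Sum of Absolute Differences between a block in the current image
    (top left at cx, cy) and a block in the previous image (top left at px, py)."""
    total = 0
    for dy in range(size):
        curr_row = curr[cy + dy]
        prev_row = prev[py + dy]
        for dx in range(size):
            total += abs(curr_row[cx + dx] - prev_row[px + dx])
    return total


def block_motion_vectors(curr, prev, block=16, radius=2):
    """Estimate one motion vector per block by exhaustive SAD search.

    Returns a dict mapping (block_column, block_row) to (dx, dy), the
    displacement of the block content from the previous to the current image.
    """
    height, width = len(curr), len(curr[0])
    # Try the zero offset first so that ties favour 'no motion'.
    offsets = sorted(
        ((ox, oy) for oy in range(-radius, radius + 1)
         for ox in range(-radius, radius + 1)),
        key=lambda o: abs(o[0]) + abs(o[1]),
    )
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best_cost, best_vector = None, (0, 0)
            for ox, oy in offsets:
                px, py = bx + ox, by + oy
                if px < 0 or py < 0 or px + block > width or py + block > height:
                    continue  # candidate block would fall outside the image
                cost = block_sad(curr, prev, bx, by, px, py, block)
                if best_cost is None or cost < best_cost:
                    best_cost, best_vector = cost, (bx - px, by - py)
            vectors[(bx // block, by // block)] = best_vector
    return vectors
```

For a 120x160 pixel image this yields the 7x10 grid of vectors described above (the bottom 8 pixel rows are not covered by a full 16x16 block). An exhaustive search evaluates 25 candidates per block, so a fast search method would be expected to reduce this cost on constrained hardware.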

Figure 6.2: Grid showing the 7x10 pixel blocks used from each 120x160 pixel image for the PDA motion estimation described in this section.

An example of the motion vectors extracted using the iPaq 2210 PDA is shown in Figure 6.3. These images were captured as the camera was moved toward an obstacle (a knee high concrete bench). The image on the left was captured immediately before the image on the right, and the motion vectors have been calculated between these images. Because the camera was carried by a person who was walking there is general camera movement (ego motion) to the right; however, the bench has increased slightly in size in the second image, and this is indicated by the direction of the vectors around the bench. One possible use of the information from these motion vectors is to identify (or segment) objects from within the image sequence, and to work out if they represent hazards (obstacles or drop offs) for a blind traveller.

6.2.3 Block Based Obstacle Alert

The main functions of the looming obstacle alert are to:

1. Segment an image sequence in real time to identify visible objects.

2. Detect when a segmented object is growing larger (that is, approaching the camera) at a sufficient rate to suggest that a collision is imminent.

Figure 6.3: Motion vectors calculated on the PDA. The origin of each motion vector is the centre of each of the grid blocks in Figure 6.2. In certain directions the arrow heads look like white blobs.

6.2.4 AHV Simulation Implementation

The main purpose of the AHV simulation software was to convert input from the

camera into an on-screen simulated phosphene display. In addition, background

processing needed to determine if an alert warning should be displayed. The

alert processing was based on a block-matching approach. To be representative

of current AHV prototypes and maintain the same aspect ratio of the display

device, the PDA based AHV simulator reduced the resolution of captured images

from 160x120 RGB to 32x24 grey-level ‘phosphenes’, displayed as squares.

The Flycam-CF Software Development Kit was used for accessing images from

the camera. The simulator software was developed in Microsoft embedded Visual


C++ version 4.0 on a Windows XP PC. After compilation, files were transferred

to the PDA using a USB connection and Microsoft ActiveSync. A Windows XP

test application was also developed using Microsoft Visual C++ version 6.0 to

test methods on image sequences previously captured from the PocketPC and

camera.

The approach used to provide the obstacle alert was to:

1. Segment each image based on grey-level values.

2. Check the size and rate of expansion of each segment between contiguous

images.

3. If a segment has expanded quickly and comprises at least 20 5x5 blocks

then display the alert warning.
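The decision in step 3 can be sketched as a simple test on segment sizes. The minimum size of 20 blocks comes from the list above, but the text does not state what counts as expanding ‘quickly’, so the growth ratio below is a purely hypothetical placeholder, as are the function names.

```python
MIN_SEGMENT_BLOCKS = 20       # minimum segment size, from step 3 above
ASSUMED_GROWTH_RATIO = 1.25   # hypothetical: 25% growth between images


def should_alert(prev_size, curr_size, growth_ratio=ASSUMED_GROWTH_RATIO):
    """Return True if a segment is large enough and has expanded quickly.

    prev_size, curr_size: segment size in blocks in the previous and
    current images. A segment with no previous size cannot be compared,
    so it never raises an alert.
    """
    if curr_size < MIN_SEGMENT_BLOCKS or prev_size <= 0:
        return False
    return curr_size >= prev_size * growth_ratio


def segments_to_alert(prev_sizes, curr_sizes):
    """Return the ids of segments that should trigger the alert warning.

    prev_sizes, curr_sizes: dicts mapping a segment id to its size in blocks.
    """
    return [
        segment_id
        for segment_id, size in curr_sizes.items()
        if should_alert(prev_sizes.get(segment_id, 0), size)
    ]
```

For example, with the assumed ratio a segment growing from 20 to 30 blocks would raise the alert, while one growing from 10 to 15 blocks would not because it is still smaller than 20 blocks.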

The main steps used in the PDA simulation are shown in Figure 6.4. A set of arrays for both the current and previous image is maintained, including the block grey-level value, warning segments, and segment size. An array of allocated segments is also maintained across images. To improve computation time, each 5x5 pixel area from the original 160x120 pixel image was used to generate one block of the 32x24 block array. Further implementation details are provided below.
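The reduction from 160x120 pixels to 32x24 blocks can be sketched as below, taking the median of each 5x5 pixel area (as described for step 5 in Figure 6.4). This is an illustrative Python version with hypothetical names, not the embedded Visual C++ code used on the PDA. The image is a list of rows of grey-level values.

```python
def reduce_to_blocks(image, block=5):
    """Reduce an image to one value per block x block pixel area.

    Each output value is the median of the pixels in the corresponding
    area. For a 160x120 image (120 rows of 160 pixels) and block=5 the
    result has 24 rows of 32 values.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for by in range(0, height - block + 1, block):
        row = []
        for bx in range(0, width - block + 1, block):
            values = sorted(
                image[by + dy][bx + dx]
                for dy in range(block)
                for dx in range(block)
            )
            # 25 values for a 5x5 area, so the middle element is the median.
            row.append(values[len(values) // 2])
        blocks.append(row)
    return blocks
```

Using the median rather than the mean means that a few outlying pixels inside a 5x5 area (for example sensor noise) do not change the value of the resulting block.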

Steps 1-4

Initially, each 160x120 pixel RGB bitmap supplied by the camera was captured (step one). In step two, the image was converted to a 256 grey-level image and then reduced to an 8 grey-level array. This reduction of grey-level information improves the execution speed of image segmentation and filtering.
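As a rough illustration of step two, the sketch below converts an RGB pixel to a 0-255 grey value and then quantises it to 8 levels. The thesis does not state the conversion weights or the quantisation rule, so the simple channel mean and the equal-width bins are assumptions, as are the function names.

```python
def rgb_to_grey(r, g, b):
    """Convert one 24-bit RGB pixel to a 0-255 grey value.

    A plain channel mean is used here; this weighting is an assumption.
    """
    return (r + g + b) // 3


def quantise_grey(value, levels=8):
    """Map a 0-255 grey value onto `levels` equal-width bins (0 .. levels - 1)."""
    return value * levels // 256


def to_eight_level_image(rgb_image):
    """rgb_image is a list of rows, each row a list of (r, g, b) tuples."""
    return [[quantise_grey(rgb_to_grey(*pixel)) for pixel in row] for row in rgb_image]
```

Working with 8 levels means that neighbouring pixels are far more likely to share an identical value, which is what the later matching steps rely on.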

1. Get current 160x120 pixel image
2. Convert image colour from 24 bit RGB colour to 8 bit (256) grey-level image
3. If the current image greyscale sum > threshold, reset the previous image arrays and image segments
4. Apply 3x3 median filter
5. Reduce spatial resolution of image to 32x24 blocks (using median pixel values)
6. Segment image based on neighbourhood greyscale values
7. Search for the current block segment in the previous image
8. Smooth the updated segment values in current image
9. Calculate rate of expansion for each segment from previous image and set alerts if required
10. Display simulator output (32x24 blocks)
11. Copy current image to previous image

Figure 6.4: Processing steps for the PDA block-based AHV simulation

A constant brightness level is required for motion estimation algorithms, otherwise changes in brightness may be incorrectly identified as an object. Therefore, if the difference between the sum of grey-level values in the current image and the sum of grey levels in the previous image was greater than a threshold, the current scene was assumed to have changed and the previous image and segment arrays were reset (step three). The threshold used was 245760, chosen as a 10% change in total grey-level for the image: (160x120x128)/10.
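The threshold arithmetic and the scene-change test can be sketched as follows. This is a minimal illustration only: the constant and function names are hypothetical, and taking the absolute difference of the two sums is an assumption about how "difference" was evaluated.

```python
WIDTH, HEIGHT = 160, 120
MID_GREY = 128  # nominal grey level used in the thesis to scale the threshold

# 10% of the nominal total grey level: (160 x 120 x 128) / 10
SCENE_CHANGE_THRESHOLD = WIDTH * HEIGHT * MID_GREY // 10


def scene_changed(current, previous, threshold=SCENE_CHANGE_THRESHOLD):
    """Return True when the summed grey levels differ by more than `threshold`."""
    total_current = sum(sum(row) for row in current)
    total_previous = sum(sum(row) for row in previous)
    return abs(total_current - total_previous) > threshold
```

When `scene_changed` returns True, the previous image and segment arrays would be cleared before processing continues.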

Following these steps, a 3x3 median filter was applied in step four to reduce

image noise.
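A 3x3 median filter of the kind used in step four can be sketched in a few lines. The handling of border pixels is not described in the thesis, so copying them unchanged is an assumption, and the function name is illustrative.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Return a copy of `image` with each interior pixel replaced by its 3x3 median.

    Border pixels are copied unchanged (an assumption).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median_low(window)  # 9 values, so this is the middle one
    return result
```

Unlike an averaging filter, the median removes isolated noisy pixels without introducing grey levels that were not already present in the neighbourhood.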

These processing steps are illustrated in Figure 6.5a-c.

Step 5

In this step (shown in Figure 6.5d) the 32x24 block array was generated from each image. The value of each block was calculated from the median value of the 25 contributing pixels in the original 160x120 image. Image segments that were expanding at a certain rate and were larger than a certain size were later used to determine the presence of a looming obstacle. The loss of spatial resolution (from 160x120 pixels to 32x24 blocks) should be partly compensated by improved search time in the following segmentation steps.
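The block reduction described in this step can be sketched as below. The function name is hypothetical; the tile size of 5 and the use of the median follow the text.

```python
from statistics import median_low


def reduce_to_blocks(image, block=5):
    """Collapse each `block` x `block` pixel tile into one value (its median)."""
    height, width = len(image), len(image[0])
    blocks = []
    for by in range(0, height - block + 1, block):
        row = []
        for bx in range(0, width - block + 1, block):
            tile = [image[by + dy][bx + dx] for dy in range(block) for dx in range(block)]
            row.append(median_low(tile))  # 25 values, so this is the middle one
        blocks.append(row)
    return blocks
```

Applied to a 160x120 image, this yields 24 rows of 32 blocks, matching the phosphene array used for display.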

Step 6

Steps 6 through 10 use the 32x24 block array. In step 6 the eight neighboring

blocks of each block were scanned in a clockwise manner for a matching grey-

level value. If any of the grey-level values matched, and the matching block had

been allocated to a segment, the current block segment was set to the matching

block segment. If there were no matching grey-level or segment available, a new

segment was allocated.
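A minimal sketch of this neighbourhood segmentation is given below. It assumes the block array is visited in raster order and that the clockwise scan starts from the block directly above; neither detail is stated in the thesis, and the names are illustrative.

```python
# Eight neighbours as (dy, dx), clockwise starting from the block directly above.
CLOCKWISE = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def segment_blocks(blocks):
    """Label each block with a segment id by matching neighbouring grey levels."""
    height, width = len(blocks), len(blocks[0])
    labels = [[None] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            for dy, dx in CLOCKWISE:
                ny, nx = y + dy, x + dx
                if (0 <= ny < height and 0 <= nx < width
                        and blocks[ny][nx] == blocks[y][x]
                        and labels[ny][nx] is not None):
                    # Adopt the segment of the first matching, already labelled neighbour.
                    labels[y][x] = labels[ny][nx]
                    break
            else:
                # No matching labelled neighbour was found: allocate a new segment.
                labels[y][x] = next_label
                next_label += 1
    return labels
```

A single pass of this kind can give two labels to parts of the same region, which is one reason a later smoothing pass over the segment values is useful.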



Figure 6.5: The first five steps of the block based approach are illustrated in these images of a suburban footpath. The number of grey-levels in the base image (a) was first reduced to 8 grey-levels (b), before a median filter was applied (c). Finally the image was spatially reduced from 160x120 pixels to 32x24 blocks (d).

Step 7

It is common for a camera to move between captured frames due to ego motion (movement of the camera due to small head movements or walking gait). To compensate for this, in step 7 the algorithm searched for the position of each current block array element in the previous image. As in the previous step, a matching grey-level value signifies a match. Ego motion was considered by searching over a 5x5 block area in the previous block array in the following manner:

• The current block value was first compared against the block at the same position in the previous block array.

• If there was no match, a search was conducted over the neighboring 8 blocks in the previous block array.

• If a match was still not made, a search was conducted over the 16 blocks neighboring the 8 blocks (and therefore a total of 25 blocks are compared). This search area is shown in Figure 6.6.

• If there was no match from any of the 25 blocks, a new segment was allocated to the current block.

Figure 6.6: The maximum search area used in Step 7. Each block from the current image block array (shown on the left) is compared against the previous image block array (shown on the right). Initially only the matching block position is compared. The search then checks the 8 blocks surrounding this position. Finally if a matching block has not been identified the surrounding 16 blocks are searched (giving a total search area of 25 blocks).
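The expanding search described in the list above (1 block, then 8, then 16) can be sketched as follows. The order in which blocks are visited within each ring is not given in the thesis, so the raster order used here is an assumption, and the function names are hypothetical.

```python
def ring_offsets(radius):
    """Offsets (dy, dx) lying exactly `radius` blocks from the centre."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span if max(abs(dy), abs(dx)) == radius]


def find_previous_match(y, x, grey, prev_blocks, max_radius=2):
    """Search outward for a block in the previous array with the same grey level.

    Returns the (row, column) of the first match, or None if none of the
    (up to) 25 candidate blocks match.
    """
    height, width = len(prev_blocks), len(prev_blocks[0])
    for radius in range(max_radius + 1):
        for dy, dx in ring_offsets(radius):
            py, px = y + dy, x + dx
            if 0 <= py < height and 0 <= px < width and prev_blocks[py][px] == grey:
                return (py, px)
    return None
```

A return value of None corresponds to the final bullet, where a new segment is allocated to the current block.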

Step 8

The final part of the segmentation stage was designed to integrate the segmenta-

tion results from steps 6 and 7. For each block, a search was performed on the



immediate 8-block neighbourhood and, if there was a matching grey-level value,

the current segment was updated to the matching block’s segment.

Step 9

In general, objects which are closer to the camera will expand at a faster rate than objects which are further away. To check the rate of expansion, in this step the area (number of blocks) of each segment in the current image block array was compared with its area in the previous image block array. Segments that were smaller than a preset threshold (currently 20 blocks in area) were ignored. If the rate of expansion (defined as the current image allocated segment size / previous image allocated segment size) was greater than 1.15 (determined heuristically from test image sequences), an alert was set for that segment.
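The expansion test can be sketched as below, using the two thresholds quoted in the text. Applying the minimum size to the current segment, and skipping segments with no counterpart in the previous image, are assumptions; the names are illustrative.

```python
from collections import Counter

MIN_SEGMENT_BLOCKS = 20     # segments smaller than this are ignored
EXPANSION_THRESHOLD = 1.15  # current size / previous size


def segment_sizes(labels):
    """Count the number of blocks assigned to each segment id."""
    return Counter(label for row in labels for label in row)


def looming_segments(current_labels, previous_labels):
    """Return the ids of segments that are large enough and expanding quickly."""
    current = segment_sizes(current_labels)
    previous = segment_sizes(previous_labels)
    alerts = set()
    for label, size in current.items():
        if size < MIN_SEGMENT_BLOCKS or previous.get(label, 0) == 0:
            continue  # too small, or not present in the previous image
        if size / previous[label] > EXPANSION_THRESHOLD:
            alerts.add(label)
    return alerts
```

For example, a segment that grows from 20 to 24 blocks between frames has an expansion rate of 1.2 and would raise an alert, while growth from 20 to 22 blocks (1.1) would not.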

Steps 10-11

Finally, the phosphenes were displayed on the PDA display. Each phosphene was

displayed as a grey-level square. In this chapter, a 32x24 phosphene array was

displayed, therefore there was a simple one-to-one mapping between the block

array and the phosphene array. As the Pocket PC operating system does not

support the Microsoft DirectX set of APIs for high performance graphic display,

the older Game Application Programming Interface (GAPI) was used to directly

access the display memory. The block array was expanded to fill the 240x320

pixel PDA display. To improve efficiency, blocks were only displayed if they were

different from the previous display.
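The redraw optimisation mentioned above amounts to comparing the new block array with the one last drawn. The sketch below is not GAPI code; it only illustrates the bookkeeping, and the scale factor of 10 (32x24 blocks onto 320x240 pixels) and both function names are assumptions.

```python
def changed_blocks(current, previous):
    """List (row, column) positions whose value differs from the last displayed frame."""
    return [(y, x)
            for y, row in enumerate(current)
            for x, value in enumerate(row)
            if previous is None or previous[y][x] != value]


def block_rectangle(y, x, scale=10):
    """Pixel rectangle (left, top, right, bottom) covered by block (y, x) on screen."""
    return (x * scale, y * scale, (x + 1) * scale, (y + 1) * scale)
```

Only the rectangles returned for changed blocks would need to be written to display memory on each frame.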

If an alert was triggered in step 9 above, the phosphenes in the area of that segment were identified with an ‘alert colour’. The colour red was selected to stand out from the grey-levels otherwise used for phosphenes. For clarity in a printed thesis, the example figures in this chapter have had the red alert areas filled with an ‘x’ shaped pattern (for example, Figure 6.7d below).

Figure 6.7: An example block based alert, shown in (d), which has been triggered in response to looming branches in front of a head-mounted camera.

Figure 6.7 demonstrates the algorithm steps on a single image taken from a

suburban Brisbane footpath. In this image sequence the experimenter, wearing

the head-mounted camera, veered into bushes next to a path. Figure 6.7a is the

original 160x120 pixel grey scale image. Figure 6.7b is the same image after me-

dian filtering and conversion to 8 grey level values. Figure 6.7c is the 32x24 block

representation of Figure 6.7a. Figure 6.7d shows the location of alert segments,

which match the location of looming branches, that have been set for this image.



Table 6.1: The number of frames in each image sequence, along with the duration of each captured sequence.

                     Postal Box             Bus Stop
Time of capture      Frames    Seconds      Frames    Seconds
Mid morning          234       29           204       25
Early afternoon      244       30           200       25
Mid afternoon        268       33           171       21
Late afternoon       284       35           179       22
Mean                 257.5     31.75        188.5     23.25

6.2.5 Procedure

To evaluate the performance of the obstacle alert component of the AHV simula-

tion, image sequences were captured at different times of the day using the AHV

simulation hardware. Two locations were used to capture the image sequences:

1. Postal Box sequence. The first sequence involved walking slowly around a

bend and toward a postal box (approximately 15 metres in total).

2. Bus Shelter sequence. In the second sequence, the experimenter walked

toward a bus shelter obstacle along a path with overhanging trees (a distance

of approximately 10 metres).

The number of frames and duration of each sequence are shown in Table 6.1.

An obstacle alert should be triggered as a person moves towards a large loom-

ing obstacle. Therefore, both sets of image sequences ended with the camera

approximately a centimetre from the final obstacle (either the postal box or the

bus shelter).

6.2.6 Statistical methods

These image sequences were then analysed on the PC based version of the alert software. The alerts identified by the block based obstacle alert were subjectively rated as either correct (fences, overhanging trees, etc) or incorrect (incorrectly segmented objects, shadows, etc).

Figure 6.8: Frames 10, 70, 130 and 190 from the postal box mid morning sequence (on left) and the bus stop early afternoon sequence (on right). Panels (a), (c), (e) and (g) show the postal box frames; panels (b), (d), (f) and (h) show the bus stop frames.

There are two issues which should be noted regarding the analysis of results

from this experiment:

1. The first issue is that the number of correct alerts can be divided by the total number of alerts (recall). This information indicates how often false alerts, which may distract an AHV system user, are presented, and gives an indication of the usefulness of the obstacle alert feature. However, as the total number of possible obstacles is unknown, the precision of the results in this chapter ([the number of correct alerts]/[the total number of possible alerts]) cannot be presented. A similar problem exists for Web searches, where the total number of correct documents which can be retrieved is often unknown.

2. The second issue is that image sequences captured at different times of the

day have different numbers of image frames and obstacles. This has oc-

curred due to slight differences between head movements, walking speed,

and daylight levels (causing varying contrast between obstacles and back-

ground).
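The ratio described in the first issue is simple to compute. The sketch below keeps the chapter's naming (recall = correct alerts / total alerts) and uses the totals reported in Tables 6.2 and 6.3; the function name is illustrative only. Note that the information retrieval literature usually calls this ratio precision, and reserves recall for the ratio over all possible items.

```python
def alert_ratio(correct_alerts, total_alerts):
    """Fraction of raised alerts that were rated correct; None if no alerts were raised."""
    if total_alerts == 0:
        return None
    return correct_alerts / total_alerts


postal_box = alert_ratio(29, 37)           # totals from Table 6.2
bus_shelter = alert_ratio(45, 81)          # totals from Table 6.3
overall = alert_ratio(29 + 45, 37 + 81)    # both routes combined
```

Because the sequences contain different numbers of frames and obstacles (the second issue), ratios of this kind are easier to compare across sequences than raw alert counts.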

6.3 Experimental results

For all image sequences the post box and bus shelter were correctly identified as

looming obstacles at least once.

The results for the postal box sequence (Table 6.2) were influenced by a white fence on one side of the path. During the sequence captured at early afternoon, this fence was captured less frequently which led to a reduction in valid alerts. This suggests that following known structures, such as walls or fences, may be a useful method of using an AHV system (a similar method, called shorelining, is frequently used by blind people while walking next to walls or paths). Aside from the early afternoon sequence, the ratio of correct/total number of alerts (Figure 6.9) decreased as the experimenter moved away from the fence and increased again toward the postal box. An example of correct obstacle identification for the mid morning postal box sequence is shown in Figure 6.12.

Table 6.3 shows the results for the Bus shelter sequence. More alerts were

triggered for this sequence, as there was more visual clutter in front of the camera

as the images were recorded (from overhanging bushes along the section of path

leading to the bus shelter). In contrast to the early afternoon results for the postal

box sequence, only 37% of alerts were correct at this time for this sequence. This

can be partly explained by increased cloud cover while capturing this sequence

(note that the mean grey level is also reduced). Excluding the early afternoon result, recall for the other captured sequences on this route decreased steadily during the day as natural illumination decreased. Figure 6.10 shows the mean recall for the bus shelter sequence over time.

Figures 6.9 and 6.10 show that in 7 out of 8 of the image sequences the recall

increased during the final 10% of the image sequence. This is desirable as each

sequence ended with the camera around two centimetres from the obstacle.

False alerts were usually shadows on the path, or the area surrounding an

obstacle. An example of a false alert is shown in Figure 6.11 where a path

shadow is incorrectly identified as an obstacle. The median filtered and 8 grey

level image is shown in Figure 6.11a. The 32 x 24 block image figure has been

segmented in Figure 6.11c. Figure 6.11d shows the alert segment, which has been

incorrectly identified.



[Line graph: mean recall (0.00 to 1.00) plotted against percent of image sequence completed (10 to 100), with one line per time of day: mid morning, early afternoon, mid afternoon and late afternoon.]

Figure 6.9: Postal Box recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.


[Line graph: mean recall (0.00 to 1.00) plotted against percent of image sequence completed (10 to 100), with one line per time of day: mid morning, early afternoon, mid afternoon and late afternoon.]

Figure 6.10: Bus shelter recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.



Figure 6.11: An example incorrect alert warning. The shadow shown in the original median filtered and 8 grey-level image (a) is incorrectly segmented from the lower resolution image (b) and is assumed to be a looming obstacle in front of the camera (d). The object segments which have been identified are shown in image (c).


Figure 6.12: Images 153 (top) to 156 (bottom) of the mid morning post box sequence. The images on the left have been reduced to 8 grey levels and median filtered. On the right is the segmentation result for each image. An obstacle alert (shown with an ‘x’ pattern) was identified for frame 156.



Table 6.2: Postal box image sequence results.

Time of Capture    Mean Grey level    Correct Alerts    Total Alerts    Recall
Mid morning        72.73              7                 8               0.87
Early afternoon    110.48             18                18              1.00
Mid afternoon      76.44              3                 7               0.30
Late afternoon     81.82              1                 4               0.25
Total              85.37              29                37              0.78

Table 6.3: Bus shelter image sequence results.

Time of Capture    Mean Grey level    Correct Alerts    Total Alerts    Recall
Mid morning        110.60             13                18              0.72
Early afternoon    102.05             7                 19              0.36
Mid afternoon      91.91              14                21              0.67
Late afternoon     87.70              11                23              0.48
Total              98.07              45                81              0.56

6.4 Discussion

There are currently no other AHV simulations based on PDA technology. Limitations in current PDA technology (processor speed, FPU, and reduced memory) have constrained the PDA AHV simulation presented in this chapter to performing obstacle detection on reduced resolution images. An additional constraint on the simulation device used in this chapter was the low quality of images captured from the CF card camera.

The PDA based alert system was developed to partly answer the following

thesis question: Can computer vision techniques be adopted and modified

to provide mobility information in an AHV system? The alert method

has demonstrated that the computer vision method of motion estimation can be

adapted to provide a warning to an AHV system user. The alert algorithm used

the rate of expansion of segmented objects to provide a warning of an impending

collision. The two main obstacles (bus shelter and postal box) were correctly


identified as hazards in every image sequence captured. However, there was a

wide variation in the number of alerts presented (between 1 and 18 for the postal

box sequence), due to differences in lighting and camera direction while recording

the images. Therefore, it would be useful to measure how frequently alerts should

be presented to assist with the mobility of an AHV system user.

An obvious method to improve the segmentation performance of the looming

obstacle alert would be to conduct all motion processing on the higher resolution

image before reducing the spatial resolution. It should be possible to use this

approach as PDA technology improves.

The algorithm described measures the rate of expansion of objects which are

approaching the camera. In an AHV system this camera will usually be mounted

on a person’s head. Therefore a person could make contact with an obstacle, such as a chair or car, which is not within the camera’s field of view. However, as discussed in Chapter 2, head-high obstacles (such as a telephone booth jutting out from a wall) can be particularly dangerous obstacles for the blind.

In future studies on the effect of light levels on obstacle alerts, the use of a single sequence could be helpful. One way to do this might be to adjust the grey-levels within an image sequence to reproduce decreasing illumination. However, caution needs to be used, as there may be confounding factors which affect mobility at different times of the day (for example, shadow effects and temperature may also affect walking speed). A set of standard mobility-related image sequences would be very useful for the development and testing of alert algorithms.

6.5 Chapter Summary

In this chapter a novel method of processing images using a PDA and attached camera was presented. This method detects obstacles which are looming in front of the camera and provides an alert to the wearer. The results of two experiments at four illumination levels have indicated that the initial segmentation and adequate illumination are significant factors in system performance. The overall recall value of 63% indicates that the block based method shows reasonable performance for development in future AHV systems, although it will be important to consider what ratio of correct alerts versus false alerts will be acceptable for system usability (this question is addressed in the next chapter).

An important question is whether the display of alert information in an AHV

simulation results in a reduction in the number of collisions during a mobility

assessment. In the next chapter an indoor mobility experiment will be described

to evaluate the use of the alerts on mobility effectiveness.

Chapter 7

Mobility Assessment using a

PDA-based AHV Simulation in a

course environment

7.1 Introduction

This chapter presents a pilot experiment in which an artificial mobility course was

constructed (based on the artificial courses discussed in Chapter 2). This course

allowed the mobility performance of volunteers to be assessed while they wore a

PDA-based AHV simulator. This simulator consisted of three modes including

the alert processing mode discussed in the previous chapter.

An image processing based AHV simulation has not previously been used in mobility assessment. As discussed in Chapter 2, in the only paper focussing on AHV mobility, Cha et al. [29] investigated walking speed and obstacle contacts in a high contrast maze-like environment. However, their work used different masks attached to a monitor in front of participants’ eyes (there was no processing of images captured from the camera). Also, Cha et al.’s paper reported walking speed through their maze, which (unlike PPWS) does not consider the effects of individual differences in normal walking speed. Therefore a pilot experiment with a small number of participants was conducted to investigate the use of PPWS and computer vision methods in an AHV simulation.

The aims of this experiment were to investigate the following two main thesis

research questions:



The use of an artificial mobility course in this experiment should allow a larger

range of mobility related factors to be investigated than static image simulations.

The increased number of influences on mobility are shown in figure 7.1 and include

dynamic factors (such as sensory and environmental information) and human

factors (such as walking gait). A specific hypothesis investigated in this chapter

was that PPWS and mobility performance should increase during trials and with

repeated use of the simulator.


Can computer vision techniques be adopted and modified to provide mobility information in an AHV system?

In this chapter the alert display developed in the previous chapter was compared

to two other display types. Using the artificial mobility course it was expected

that the frequency of mobility errors and time required to perform mobility tasks

should be less when the alert display is activated compared to the other display

types.


[Diagram of factors. Dynamic factors. Computer vision methods for AHV: spatial resolution, frame rate, number of grey levels, low pass filter (smoothing), motion estimation, obstacle detection. Context: indoor artificial mobility course. Scene properties: texture, complexity, lighting, glare, contrast, type of objects. Task: walking along path, obstacle avoidance, finding keys. Sensory information: taste, heat, olfactory, tactile, auditory, proprioception. Environment: affordances, path boundaries, landmarks. Other display modalities: auditory, tactile. AHV display type: alert information (looming obstacles), symbolic, standard display. Mobility performance: PPWS, obstacles, veering.]

Figure 7.1: Factors which influenced simulated AHV display effectiveness in this chapter. Excluded factors are marked with a line pattern.



7.2 Method

This experiment involved human participants wearing a custom PDA-based AHV

simulator. Participants were required to walk through an artificial mobility course

while their mobility performance was assessed.

7.2.1 AHV Simulation Device

Custom hardware and software were developed for the novel AHV simulation

used in this chapter.

Hardware

A standard head-brace device (weighing 250 grams) was adapted to include a

bracket (400 grams in weight) for holding a PDA in front of the participant’s

eyes. This head-brace and PDA setup enables the simulation of AHV within

different experimental environments.

External light (not from the PDA display) was restricted by each participant wearing a pair of modified ski goggles with the lenses removed and a sheet of block out curtain sewn to the bottom of the frame. This curtain was then lifted over the headgear and tied behind the head of each participant. A layer of black felt was also attached to the nose area of the goggles to restrict light.

The combined weight of the camera and PDA was only 164 grams. To conserve

battery life, the PDA display brightness was adjusted to 50% and bluetooth

communication settings were disabled. To prevent the PDA shutting down during

an experiment, all power saving options were also disabled.

Simulation software

The PDA software was an enhanced version of the program developed in the previous chapter. The simulator software was developed in Microsoft embedded Visual C++ version 4.0 on a Windows XP PC. Functions from the Flycam-CF Software Development Kit library were used to obtain images from the camera. After compilation, the program files were transferred to the PDA using a USB connection and Microsoft ActiveSync.

Display type    Image processing
1               8 grey-scale median filtered display with Alerts
2               8 grey-scale median filtered display
3               256 grey-scale average display

Table 7.1: AHV simulation display types used for the pilot study

Three types of display were compared in this experiment. These display types

are described in Table 7.1 and shown in Figure 7.3. To exclude frame rate as

a confounding variable, each display type was standardised to 7.5 frames per

second (fps). These displays presented 32x24 simulated phosphenes, which filled

the 320x240 pixel PDA display.

The main steps in the image processing algorithm are shown in Figure 7.2.

The original 160x120 pixel camera image was captured and converted to either 8 grey-levels (display types 1 and 2) or 256 grey-levels (display type 3). A 3x3 median filter was applied to display types 1 and 2.

Detailed information on display type 1 (the alert display) was presented in

chapter 6. The main purpose of this display is to segment each image and com-

pare the growth of segments between images. If a segment expands above a

predetermined threshold, and takes up more than 40% of the screen, then an

alert is displayed (the expanding segment is shown as red on the PDA screen).

For display type 2, the reduced grey-level and median filtered output was

reduced to a 32x24 block array based on the average pixel values. Display type

3 was simply the original grayscale image reduced to a 32x24 ‘block’ array based

on average pixel values.
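The average-based reduction used for display types 2 and 3 can be sketched as below. The function name is hypothetical, and rounding the mean down to an integer is an assumption.

```python
def average_blocks(image, block=5):
    """Reduce `image` to one value per `block` x `block` tile using the integer mean."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[by + dy][bx + dx] for dy in range(block) for dx in range(block))
            // (block * block)
            for bx in range(0, width - block + 1, block)
        ]
        for by in range(0, height - block + 1, block)
    ]
```

Compared with the median used for display type 1, the mean can produce grey levels that do not occur in the tile itself, which matters less here because display types 2 and 3 do not perform segmentation.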



[Flowchart: get current 160x120 pixel image; convert from RGB to 256 grey-level image; if display = 1 or 2, reduce image to 8 grey levels and apply 3x3 median filter; if display is not 1, reduce spatial resolution of image to 32x24 blocks (using average pixel values); if display = 1, reset the previous array and image segments when the current image greyscale sum > threshold, reduce spatial resolution of image to 32x24 blocks (using median pixel values), segment image based on neighbourhood greyscale values, search for the current block segment in the previous image, smooth the updated segment values in current image, then calculate rate of expansion for each segment from previous image and set alerts if required; display simulator output (32x24 blocks); copy current image to previous image.]

Figure 7.2: Processing steps for the AHV simulator used in the pilot study. Note the display type is initialised before the images are processed. The three display types are listed in Table 7.1.


Figure 7.3: Examples of the image types used in this study. Figure (a) is the base 160x120 pixel 256 grey-level image. The simulator image using display type 3 is shown in image (b). Image (c) shows the base image from (a) with 8 grey-levels and a 3x3 median filter applied. In image (d), image (c) has been reduced to a 32x24 phosphene display (this is used for simulator display types 1 and 2).



Figure 7.4: Images taken of the Gait Lab before the mobility course was set up. Image (a) shows the black curtains surrounding the lab. The change area ‘tent’, and raised wooden platform are visible in image (b).

7.2.2 Assessment of mobility performance

To assess mobility performance using the AHV simulation, an indoor mobility

course (shown in Figure 7.6) was constructed within an 11m x 10m laboratory

(used for gait analysis at the School of Human Movement, Queensland University

of Technology). The walls of this laboratory were covered with black curtains.

The course consisted of a winding path, approximately 1.2m in width. Path

boundaries were marked with 48mm black duct tape. A wooden platform (raised

approximately 8cm from the floor) was incorporated into the mobility path and

is visible in Figure 7.4b. The floor of the course consisted of wood and concrete

(painted light grey). The total length of the course was approximately 45m.

Eleven obstacles of differing heights were placed through the course (Figure 7.6). Two of the obstacles were suspended from the ceiling to a height of 1.2 m. All obstacles along the path were made of soft materials. Obstacle number 4 from Figure 7.6, a 100cm tall light grey obstacle, is shown in Figure 7.5. These obstacles were chosen as a safe way of replicating various obstacles found in real life such as chairs or people. The obstacle types were based on those used in previous low vision mobility studies (for example, Lovie-Kitchin et al. [137]).

Figure 7.5: Example soft obstacle set up for the mobility course.

One of the main dependent variables used in this pilot experiment was PPWS.

As discussed in Chapter 2, PPWS allows the objective comparison of different

people by normalising their walking speed. Therefore a straight, unobstructed,

10m section of the course was used to measure the Preferred Walking Speed

(PWS) of each participant. This area is shown in Figure 7.6.

In this experiment each participant was required to perform two different tasks

(A and B). The order of these tasks alternated for each participant. These tasks

were:

1. Task A: Navigate through the course from the start to a clearly marked end



point.

2. Task B: Find a set of keys, located on a table next to the path, and carry

these keys to the end of the course.

7.2.3 Questionnaire

A number of human factors, which have been identified in the display framework

(Figure 7.1) were also recorded for each participant in this experiment. Therefore

before commencing the mobility tasks, each participant was asked to fill in a

questionnaire comprising the following questions:

1. What is your gender? Male Female

2. Please indicate your age (years): <20 yrs 20-30 yrs 30-40 yrs 40-50 yrs 50-60

yrs over 60 yrs

3. Are you wearing any vision correction device (glasses or contact lenses)?

Yes No

4. Have you ever used an immersive Virtual Reality (VR) environment (using

a head mounted VR display) before? Yes No

5. If you have used an immersive VR environment before, approximately how

many times have you done this?

The questions on vision correction or prior experience with virtual reality were

included as these factors may enhance a person’s ability to compensate for a lower

resolution AHV simulation display.

[Floor plan of the course showing the mobility path, change room, tables, gait analysis equipment, start positions A and B, the 10 metre Preferred Walking Speed (PWS) measurement area, and the numbered obstacle positions.]

Obstacle types:
1. 90 cm high from floor, 48 cm diameter
2. 120 cm high from floor, 70 cm x 30 cm
3. 30 cm high from floor, 55 cm x 40 cm
4. 100 cm high from floor, 50 cm diameter
5. 120 cm high suspended from ceiling, 35 cm x 35 cm x 10 cm depth

Figure 7.6: The indoor course used for mobility assessment in this Chapter.



7.2.4 Participants

This experiment was designed as a pilot to investigate the use of a PDA based

AHV simulator in an artificial mobility course. Therefore a small sample of

five people participated in this pilot study. Three volunteers were selected from

the postgraduate student population and another from academic staff, at the

Queensland University of Technology. The final volunteer was an undergraduate

student at the University of Queensland. All participants had normal or corrected

to normal vision. Three participants were aged 20-30 years, one was less than 20

years, and one was aged 40-50 years.

Level 1 (Low risk) ethical clearance (number 3887H) was obtained from the

QUT University Human Research Ethics Committee for this experiment.

7.2.5 Procedure

Each participant was randomly allocated one of the three display types, and was allocated to commence the first trial with one of the two task types (Task A, which involved moving through the course, or Task B, which involved searching for a set of keys in addition to moving through the course). An hour was allocated for

testing each individual. Study participants were met in a waiting room, blind-

folded, and led to a screened ‘change room’, where they were asked to read a

consent sheet and fill out the questionnaire. The simulation headgear was then

explained and fitted. Each participant was allowed two minutes to familiarise

themselves with the display. If the alert based display was used, the red flashing

display sections (obstacle warning) were explained. The guided PWS was then

recorded over 10m (this area is shown by a dotted line in Figure 7.6). After this

the participant was led to the task starting location and the first mobility task

was conducted. Each participant was offered a short break before the second

task was conducted. Finally, the PWS was again measured. During the mobility


tasks, a single experimenter recorded walking speed, obstacle contacts, the num-

ber of times participants were told they were walking backwards (that is they

were walking normally but had become disoriented and started walking in the

opposite direction to that required to complete the course) and the number of

times participants veered outside the path boundary.

7.3 Results

The questionnaire responses are shown in Table 7.2. None of the participants had

personal experience with Virtual Reality environments. Three participants played

computer games monthly, one played weekly and one played daily. There was no

significant relationship found between game playing and mobility performance.

The number of recorded mobility errors for each two minute interval during

the mobility course is presented in Figure 7.7. The number of errors decreased

steadily as participants adapted to the simulation device. The number of errors

also peaked during the first two minutes for both the first and second trials.

The calculations used for PWS, SMC and PPWS are shown in Table 7.3 for

trials 1 and 2. These calculations are based on equations 2.1 and 2.2 in Chapter

2. Overall PPWS increased significantly between the first and second trials (F (1,9) = 9.70, p<0.05). A reduction in mobility errors was also found between the first and second trials: the mean number of obstacle contacts fell from 5.8 in the first trial to 5.2 in the second, veering errors fell from 10.4 to 6.4, and walking backwards errors fell from 1.8 to 0.8. The overall mean PPWS improved from

the first trial (mean value of 14.76) to the second trial (mean value of 21.52).

No participants were successful in finding the keys during the searching mo-

bility task. However, in this pilot study the type of mobility task did not appear

to make a difference in mobility performance (Tables 7.4 and 7.6). A summary

of mobility errors for each task is shown in Table 7.5.


Table 7.2: Questionnaire responses for each participant.

No.  Gender  Age    Video game use  Used VR  Times VR used
1    Male    20-30  Weekly          No       0
2    Male    40-50  Monthly         No       0
3    Male    20-30  Monthly         No       0
4    Male    20-30  Daily           No       0
5    Male    <20    Monthly         No       0

Table 7.3: PPWS results for each trial for each participant. D is the display type and T is the task type. The Benchmark column is the time taken during the 10m guided walk. PWS is 10/Benchmark time, and Ave is the mean of the two PWS values. Time (s) is the number of seconds taken while walking through the 45m mobility course. SMC is 45/Time. PPWS is SMC/PWS multiplied by 100.

           Benchmark (s) PWS (m/s)         Time (s)  SMC (m/s)   PPWS
No.  D  T  1      2      1     2     Ave   1    2    1     2     1      2
1    1  B  16.82  13.98  0.59  0.72  0.66  775  297  0.06  0.15  8.86   23.13
2    3  A  12.50  13.34  0.80  0.74  0.77  357  320  0.13  0.14  16.37  18.26
3    2  B  19.53  16.30  0.51  0.61  0.56  475  288  0.09  0.16  16.92  27.90
4    2  A  13.50  15.22  0.74  0.66  0.70  459  325  0.10  0.14  14.01  19.78
5    1  A  14.98  17.59  0.67  0.57  0.62  420  357  0.11  0.13  17.28  20.33
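The calculation described in the caption of Table 7.3 can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function names and the `course_length_m` and `benchmark_length_m` parameters are assumptions introduced here, and equations 2.1 and 2.2 themselves are not reproduced.

```python
def walking_speed(distance_m, time_s):
    """Speed in metres per second over a measured distance."""
    if time_s <= 0:
        raise ValueError("time must be positive")
    return distance_m / time_s


def ppws(course_time_s, benchmark_times_s,
         course_length_m=45.0, benchmark_length_m=10.0):
    """Percentage of Preferred Walking Speed (PPWS) for one trial.

    Hypothetical helper: PWS is averaged over the supplied benchmark
    walks, mirroring the 'Ave' column of Table 7.3.
    """
    pws_values = [walking_speed(benchmark_length_m, t) for t in benchmark_times_s]
    mean_pws = sum(pws_values) / len(pws_values)
    # SMC: walking speed achieved on the mobility course itself.
    smc = walking_speed(course_length_m, course_time_s)
    return 100.0 * smc / mean_pws


# Participant 1 in Table 7.3: benchmark walks of 16.82 s and 13.98 s,
# course times of 775 s (trial 1) and 297 s (trial 2).
trial_1 = ppws(775, [16.82, 13.98])
trial_2 = ppws(297, [16.82, 13.98])
print(f"PPWS trial 1: {trial_1:.2f}, trial 2: {trial_2:.2f}")
```

For this participant the sketch gives values within about 0.01 of the tabulated 8.86 and 23.13; differences of this size are expected when intermediate values are rounded before the final division.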

Table 7.4: PPWS results for each task type and trial.

PPWS    Trial 1  Trial 2
Task A  15.89    25.52
Task B  12.89    19.46

Table 7.5: Mobility error results by error type for each task type and trial.

            Obstacle contacts Veering           Walk backwards
Task/Trial  1        2        1        2        1        2
A           5.00     6.00     10.33    8.33     2.00     1.33
B           7.00     4.00     10.50    3.50     1.50     0.00

Table 7.6: Mobility error results for each task type and trial.

Mobility Errors  Trial 1  Trial 2
Task A           17.33    15.67
Task B           19.00    7.50

[Figure 7.7 plot not reproduced. x-axis: Minutes, in two-minute intervals from 0-2 to 12-14; y-axis: Number of mobility errors (0 to 40); series: Trial 1 and Trial 2.]

Figure 7.7: Total number of mobility errors for both trials during the mobility course experiments


Table 7.7: Mobility error summary for each display type.

Display Type               1   2   3   Total
Obstacle contacts trial 1  15  9   5   29
Obstacle contacts trial 2  13  10  3   26
Veering errors trial 1     21  23  8   52
Veering errors trial 2     19  7   6   32
Walking backwards trial 1  3   6   0   9
Walking backwards trial 2  2   1   1   4
Total:                     73  56  23  152

Table 7.8: PPWS summary for each display type.

Display Type        1      2      3      Mean
Trial 1 mean PPWS   13.07  15.46  15.76  14.76
Trial 2 mean PPWS   21.73  23.84  19.00  21.52
Combined mean PPWS  17.40  19.65  17.38  18.14

Table 7.9: Effect sizes (η²) for the main mobility factors in this pilot study. ‘DV’ represents the dependent variable, ‘F’ is the F-test result and ‘Sig’ represents significance.

Factor   DV                 F      Sig    η²
Trial    PPWS               9.700  0.014  0.548
Trial    Veering errors     3.008  0.121  0.273
Trial    Obstacle contacts  0.151  0.707  0.019
Display  PPWS               0.196  0.827  0.053
Display  Veering errors     0.472  0.642  0.119
Display  Obstacle contacts  1.683  0.253  0.323

The results for each display type are summarised in Tables 7.7 and 7.8. Display type 2 (the 8 grey-level median filtered display) resulted in the highest PPWS results. Display type 3 (the 256 grey-level average display) resulted in only 23 mobility errors, compared to 56 for Display type 2 and 73 for Display type 1 (the 8 grey-level median filtered display with alerts) (Table 7.7).

Table 7.9 shows the effect sizes from this pilot study. η² was used as a measure of effect and represents the proportion of variance in the dependent variable that is attributable to each factor. The greatest degrees of association were found between trial number (first or second) and PPWS (η² = 0.548), and between display type and obstacle contacts (η² = 0.323).
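As a rough illustration of how an effect size of this kind relates to an F-test result, the sketch below recovers partial η² from an F statistic and its degrees of freedom. Several points are assumptions made here rather than details taken from the study: the function name is hypothetical, it is assumed (not stated in the text) that the tabulated values are partial η², and the error degrees of freedom are not reported in Table 7.9, so the value of 8 is chosen purely for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F = 9.700 is the Trial/PPWS entry in Table 7.9.
# df_error = 8 is an ASSUMPTION for illustration; it is not reported there.
effect = partial_eta_squared(9.700, df_effect=1, df_error=8)
print(round(effect, 3))
```

Under these assumptions the result is close to the tabulated 0.548. With a different error term the recovered value would differ, so this should be read as a consistency check rather than a description of how the thesis values were produced.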

7.4 Discussion

This pilot study has demonstrated the feasibility of using a low cost PDA-based

AHV simulator to assess mobility performance. PPWS and mobility errors have

provided a useful method of measuring the three display types used in this study.

This chapter aimed to address the following questions:



The use of PPWS within an artificial mobility environment has been demon-

strated. This environment has enabled the comparison of three different AHV

simulation display types on mobility performance.

PPWS and mobility performance should increase during trials and

with repeated use of the simulator.

Learning effects were demonstrated with the two trials used in this study. Cha et al. [27] have previously noted the effects of learning on mobility skill, in particular


the use of head movements to help depth perception and familiarity with the

environment. During this study participants learnt to recognise and follow the

path boundaries, usually by slight head movements. However, this meant that

participants tended to bend over, and walk in a shuffling gait during the mobility

course, similar to the gait of the elderly or the congenitally blind [188].



In this chapter the alert display developed in the previous chapter was compared

to two other display types. All of these display types have used image process-

ing techniques to process captured camera input and provide a low resolution

simulation image to participants.

The frequency of mobility errors and time required to perform mobility

tasks should be less when the alert display is activated compared to

the other display types.

Display type 1 (the alert display) did not assist with mobility performance, and

led to the highest number of mobility errors (see Table 7.7). One problem with this display type was that, although all three display types were standardised at 7.5 fps, the alert display could temporarily pause with large changes in

luminance (generally due to large head movements). The reason for the delay was

the alert software re-initialising and performing initial segmentation. The large

number of false positives with the alert display also reportedly confused participants. Although the idea of checking the rate of expansion of segmented objects is probably sound, these results indicate that this processing needs to be performed efficiently on the full size image rather than on the reduced 32x24 block image (which was chosen to reduce the computational burden on the PDA). At the lower spatial resolutions, the human brain appears better prepared to extract looming obstacles than the alert system presented in this chapter. This

supports Boyle et al. [19] who found that further processing on low resolution

images does not result in greater image understanding, and that the most impor-

tant current constraint on AHV systems is the limited number of electrodes (and

therefore reduced spatial resolution).

The usefulness of the alert modes could depend on the location of the AHV

electrode implant and the degree of learning, neuroplasticity and general health

available to the recipient. For example, a retinal implant recipient may be less

likely to require alert assistance due to further processing in the human visual

system.

Although the PDA simulator tended to pull down on the participant's forehead, none of the participants asked to stop the experiment. Two participants

needed a break between trials due to nausea and dry eyes. Nausea is a well-known

side effect of display lag within VR environments (making vestibulo-ocular adap-

tation difficult for the participant) [1]. A lighter, and less conspicuous, simulation

device would be useful, particularly for outdoor mobility assessment. It should

be feasible to connect a wireless head mounted camera with a PDA and send the

display to either VR goggles, or a Low Vision Enhancement System (as used in

[226]). In addition the material shroud used to block external light had the effect

of trapping heat generated from the PDA, which added to participant discomfort.

7.5 Chapter Summary

In summary, the simulator has demonstrated that it is possible to capture and

perform image processing on camera input using a PDA device. Although only

a small number of subjects were involved, this study has not supported the use

of the ‘intelligent’ alert display developed in Chapter 6. All participants, with all

display types, were able to improve mobility performance, measured by PPWS


and mobility errors, over only two trials. Although the PDA provided a small and

low cost simulation platform, a problem with the PDA based simulator was the

weight of the bracket on the front of the head brace, which may have altered the

movement of participants, and which could have affected mobility performance.

In addition, the square grey boxes displayed to represent phosphenes are not

representative of those described by human recipients of an AHV system. The next

chapter describes the development and use of a more advanced and comfortable

AHV simulator which overcomes these issues.

Chapter 8

Effects of Spatial and Temporal

Resolution on Mobility

Assessment

8.1 Introduction

This chapter presents the results of an AHV mobility experiment involving a large

number of participants. The design of this experiment builds on the pilot mobility

course described in Chapter 7; however, the heavy PDA based simulator has been upgraded to a Virtual Reality (VR) type Head Mounted Display (HMD), which is connected to a standard Windows XP laptop running custom software. A guiding principle used throughout this thesis was to develop a low cost simulation, and this has guided design decisions on the hardware and software used in this chapter.

The following thesis research questions are addressed in this chapter:



One of the aims of the research presented in this chapter is to investigate the

effects of frame rate and spatial resolution on mobility performance. There have

been reported limits to the temporal resolution at which phosphenes can be per-

ceived (for example, Dobelle has reported that 4 frames per second (FPS) was the

most effective temporal resolution for his cortical device [48]). Also, although faster stimulation has been reported in the literature, much research on temporal

resolution (for example, Eckhorn et al. [58]) is currently based on animal experi-

ments and the effects of chronic electrical stimulation on the human visual system

may cause a reduction in temporal resolution. In addition sensory substitution

devices for the blind also provide information at low frame rates (for example,

the vOICe auditory based device provides soundscapes at 1 fps [149]).

Although a focus of much current AHV research is to increase the number of

implantable electrodes, and therefore to increase perceived spatial resolution, the

effect of frame rate on mobility for an AHV display has not yet been examined. It

is hypothesised that mobility performance should increase with increased spatial

resolution and also with frame rate. This chapter investigates the interaction

between display frame rate (1, 2 and 4 FPS) and spatial resolution (32x24 and

16x12 phosphenes).

A number of additional factors from the proposed AHV mobility display

framework were also evaluated (see Figure 8.1). Participants with corrected-

to-normal vision were hypothesised to demonstrate better mobility performance

than normally sighted participants due to their previous experience compensating

for minor visual loss. Similarly, this experiment was also interested in whether


participants who had previous experience with immersive virtual reality environments could perform more effectively with the AHV simulation than those without previous experience. The effect of gender on mobility performance was also examined, as research has reported gender differences in navigation speed related to virtual reality display field of view [44] and in mental rotation (for example [167]).



The experiment described in this chapter uses a custom developed indoor artificial

course to compare mobility between different people using an AHV simulation

device. This course is similar to the course used in the previous chapter; however, the obstacle types are standardised, background clutter has been reduced by the

use of office partitions along the course, and the level of noise has been reduced.

The dependent variables used in this experiment are walking speed through the

course, PPWS, and mobility errors (obstacle contacts and veering). One aim of

this study was to explore the suitability of these mobility related variables as

objective measures for comparing different AHV display types.



Both the resolution and frame rate of the simulation display were programmatically controlled for each participant. Captured camera images were processed to reduce both the resolution and the number of grey-levels before display. The

alert display, found to be confusing and ineffective in the previous chapter, was

not used in this experiment.


[Figure 8.1 diagram not reproduced. Recoverable box labels: Dynamic factors; Computer Vision methods for AHV (Spatial Resolution, Frame Rate, Number of grey levels, Gaussian filter); Context (Indoor artificial mobility course); Scene properties (Texture, Complexity, Lighting, Glare, Contrast, Type of objects); Task (Walking along path, Obstacle avoidance); Sensory Information (Taste, Heat, Olfactory, Tactile, Auditory, Proprioception); Environment (Affordances, Path boundaries, Landmarks); Human Factors (Experience/training, Psychological factors, Physical factors (gait, etc), Use of secondary aid, Age, Corrected vision, Experience with virtual reality, ...); Non-image sensors (Ultrasound, Laser, GPS, ...); Other modalities (Auditory, Tactile, ...); Other display modalities (Auditory, Tactile, ...); AHV Display Type; Alert Information (Looming obstacles, Symbolic); Standard Display; Mobility Performance (PPWS, Obstacles, Veering).]

Figure 8.1: Factors which influence simulated AHV display effectiveness in this chapter. Excluded factors are marked with a line pattern.

8.2 Method

The experimental set up used in this chapter is similar to the previous chapter. A

significant difference was the use of a VR display device to display the AHV simulation. Custom software was also developed to display more realistic simulated

phosphenes. An indoor mobility course was set up in a large civil engineering

concrete laboratory.

8.2.1 Simulation Hardware

The hardware used in this study consisted of three components:

Head Mounted Display

The Head Mounted Display (HMD) used in this study was the i-O Display Sys-

tems (Sacramento, CA) i-glasses PC/SVGA, which provided a selected resolution

of 640-by-480 and a total field of view of 26.5° at a 60Hz refresh rate. This display was chosen due to its low cost (AU$1230) and simple interface to a laptop PC. An external

lithium polymer battery (cost AU$215) powered the HMD.

To block out external light, a custom shroud was constructed from block out

curtain and sewn around the HMD (with slots to allow ventilation).

Camera

A Swann Netmate Universal Serial Bus (USB) camera was attached, at eye level,

to the front of the HMD. This camera was selected due to its low cost (AU$53),

small size, light weight and simple integration with the Windows operating system. The camera used a 1/7 inch CMOS sensor, and had automatic gain compensation, exposure and white balance. The field of view (FOV) for this camera was manually calculated at 34° horizontal and 27° vertical.


Laptop PC

A Toshiba Tecra laptop (1.6GHz Centrino processor) was either worn by partic-

ipants in a backpack, or carried by the experimenter. The camera was powered

from the USB port of this computer.

8.2.2 Simulation Software

The main requirement for the AHV simulation software was to convert input

from the USB camera into an on-screen phosphene display. To be representative

of current AHV prototypes, and to maintain the same aspect ratio as the display device, the simulation reduced the resolution of captured images from 160x120

RGB colour to 32x24 or 16x12 simulated phosphenes. As discussed in Chapter

3, it is possible to modulate the size of phosphenes to represent a limited number

of grey levels. Therefore in this simulation it is assumed that eight grey levels for

each phosphene can be displayed.

The simulation software was written in Visual C++ 6.0 (Microsoft, Redmond,

WA), using the Microsoft Video for Windows library to capture incoming video

images. These images were sub-sampled (using the mean grey level of contributing

pixels) to a lower resolution image, which was then converted to 8 grey levels. To

simulate a perceived electrode response, the low resolution image was displayed as

a phosphene array using the DirectDraw component of Microsoft DirectX. Figure

8.2 shows the mapping between image grey levels and the different phosphene

representations. Each phosphene was generated from an original circle, 40 pixels

in diameter, filled with the matching grey level and blurred with a Gaussian filter

(Radius=10). Examples of the simulation display are shown in Figures 8.3 to

8.5. These simulated phosphenes are similar to those generated by Thompson et

al. [226] and Dagnelie et al. [47].
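The sub-sampling and grey-level reduction steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Visual C++ code used in the study: the function names are hypothetical, the input is assumed to be already converted from RGB to a single grey channel, and the uniform mapping of 0-255 onto eight levels is an assumption because the exact quantisation rule is not given in the text. Rendering each level as a blurred phosphene is not covered.

```python
def block_mean_downsample(image, out_width, out_height):
    """Reduce a 2D grey-level image by averaging non-overlapping pixel blocks."""
    in_height, in_width = len(image), len(image[0])
    if in_width % out_width or in_height % out_height:
        raise ValueError("input size must be a multiple of the output size")
    block_w, block_h = in_width // out_width, in_height // out_height
    result = []
    for row in range(out_height):
        out_row = []
        for col in range(out_width):
            total = 0
            # Sum every contributing pixel in this block.
            for y in range(row * block_h, (row + 1) * block_h):
                for x in range(col * block_w, (col + 1) * block_w):
                    total += image[y][x]
            out_row.append(total / (block_w * block_h))
        result.append(out_row)
    return result


def quantise(image, levels=8, max_value=255):
    """Map grey values in [0, max_value] onto integer levels 0..levels-1.

    Uniform bins are an assumption made for this sketch.
    """
    return [[min(int(v * levels / (max_value + 1)), levels - 1) for v in row]
            for row in image]


# Synthetic 160x120 test frame: a horizontal grey ramp (not study data).
frame = [[x * 255 // 159 for x in range(160)] for _ in range(120)]
phosphenes = quantise(block_mean_downsample(frame, 32, 24))
print(len(phosphenes[0]), len(phosphenes))  # grid width and height
```

With a 160x120 input, the 32x24 grid averages 5x5 pixel blocks and the 16x12 grid averages 10x10 blocks, so both resolutions keep the 4:3 aspect ratio of the captured image.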

Figure 8.2: Phosphenes (top row) displayed as grey level pixels in reduced resolution images

Figure 8.3: Original 160x120 pixel captured image

8.2.3 Mobility course

To assess mobility performance, an indoor mobility course (Figure 8.6) was con-

structed within an emptied 30x40m civil engineering laboratory at the Queens-

land University of Technology. The mobility course consisted of a winding path,

approximately 1m wide and 30m long. Path boundaries were marked with 48mm

black duct tape. The floor of the course was concrete, which was painted light

grey; however, a 3m² section had been painted white for a previous experiment. Grey

office partitions, approximately 200 cm tall, were placed on either side of the path

to reduce visual clutter and to prevent participants from confusing the neighbor-

ing path with the current path.


Figure 8.4: Original image reduced to 32x24 phosphenes

Figure 8.5: Original image reduced to 16x12 phosphenes

Figure 8.6: Map of the 30m mobility course built for this study. The grey shaded area is the path identified by black tape on the floor. The numbers refer to the placement of obstacles and the black lines denote office partitions.

Figure 8.7: Different types of grey shading on each obstacle shown in Figure 8.6

Eight obstacles, painted in different shades of matt grey, were placed through

the course (see Figure 8.7). Two of the obstacles were suspended from the ceiling

to a height of 1.2 m above floor level. All obstacles along the path were made

from empty packing boxes (450x410x300mm). As in the mobility course described

in Chapter 7, these obstacles were designed to replicate obstacles which a blind

person could encounter in the real world.

A straight, unobstructed, 10m section of the course (shown in Figure 8.6)

was used to measure the Preferred Walking Speed (PWS) of each participant.

8.2.4 Participants

Ten female and 50 male volunteers were recruited from staff and students at dif-

ferent faculties at the Queensland University of Technology (QUT). The method


of recruitment involved emails and posters placed around the three QUT cam-

puses. The age and gender distribution of participants is shown in Table 8.1.

All participants had normal or corrected-to-normal vision.

Level 1 (Low risk) ethical clearance (number 3887H) was obtained from the

QUT University Human Research Ethics Committee for this experiment.

8.2.5 Questionnaire

As in the previous chapter, before commencing the experiment each participant completed a questionnaire, which collected details of gender, age and whether the participant was wearing glasses or contact lenses. In addition, participants were asked how many times (if any) they had used an immersive Virtual Reality environment. The questionnaire is included in Appendix B of

this thesis.

8.2.6 Statistical methods

Unless stated otherwise, statistical significance was at the p<.05 level. The Sta-

tistical Package for the Social Sciences (SPSS) (2004, SPSS Inc, Chicago, USA)

was used for all statistical calculations. Multifactorial ANOVA was used to test

for significant effects among the experimental factors. Normality and homogene-

ity of variance were assessed visually and found to be acceptable for the use of

parametric statistics.

The formulae for calculating PPWS are provided in Chapter 2 (equations 2.1 and 2.2). In the experiment described in this chapter the preferred walking speed (PWS) for each participant was measured before and after their two mobility trials. These two PWS scores were defined as distance (metres) divided by time (seconds). As previous research has found the PWS to be a stable mobility

measure for individuals (for example, Soong et al. [209] and Lovie-Kitchin et al. [137]), the two PWS results were averaged. This average PWS score was used for all PPWS calculations.

Table 8.1: Gender and age groups of experiment participants.

Age    Male  Female  Total
0-19   3     1       4
20-29  27    5       32
30-39  11    1       12
40-49  6     3       9
50+    3     0       3
Total  50    10      60

8.2.7 Procedure

Each participant was randomly allocated to one frame rate (1, 2 or 4 fps) and one

display type level (16x12 or 32x24 phosphenes) and commenced their first trial

with one of the two course start locations (marked ‘A’ or ‘B’ in Figure 8.6). One

hour was allocated for testing each individual. Study participants were met in

a corridor outside the lab, read a consent sheet and filled out the questionnaire.

The simulation headgear was then explained and fitted before the participant was

led into the lab. Each participant was then allowed two minutes to familiarise

themselves with the display. The guided PWS was then recorded over 10m. After

this the participant was led to the trial starting location and the first mobility

trial was conducted. Participants were offered a short break of approximately one

minute before the second trial was conducted. Finally, the PWS was measured

for the second time. During the mobility trials, a single experimenter recorded

walking speed, obstacle contacts, the number of times participants were told they

were walking backwards and the number of times participants veered outside the

path boundary. Mobility performance was recorded for each participant on the


experimenter sheet shown in Appendix B.
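One way to carry out a random allocation of this kind is sketched below. The thesis does not describe the randomisation mechanism, so everything here is an assumption for illustration: the function name is hypothetical, equal group sizes across the six frame rate and resolution combinations are assumed, and the allocation of the course start location (A or B) is not covered.

```python
import random
from itertools import product


def allocate(participant_ids, frame_rates=(1, 2, 4),
             resolutions=("16x12", "32x24"), seed=None):
    """Randomly assign participants to conditions with equal group sizes."""
    conditions = list(product(frame_rates, resolutions))
    if len(participant_ids) % len(conditions):
        raise ValueError("participants must divide evenly among conditions")
    per_cell = len(participant_ids) // len(conditions)
    # One slot per participant, with each condition repeated equally often.
    slots = conditions * per_cell
    rng = random.Random(seed)
    rng.shuffle(slots)
    return dict(zip(participant_ids, slots))


# Sixty participants, as in this experiment; the seed only makes the
# example repeatable and has no meaning in the study.
allocation = allocate(list(range(1, 61)), seed=0)
print(allocation[1])  # (frame rate, resolution) for participant 1
```

Shuffling a pre-built list of slots, rather than drawing a condition independently for each participant, keeps the design balanced, which simplifies a multifactorial ANOVA of the kind described in Section 8.2.6.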

8.3 Results

The average number of obstacle contacts by frame rate and resolution are sum-

marised in Table 8.2. A boxplot showing the median obstacle contacts is shown

in Figure 8.8. The frequency of contact with different obstacle types is shown

in Figure 8.9.

Table 8.3 summarises the average number of veering errors by frame rate and

resolution. Figure 8.10 demonstrates a trend in reduced veering errors as frame

rate and spatial resolution increase.

A summary of the two benchmark times, used to calculate PWS, is shown in Table 8.4. A similar summary for the time spent walking through the mobility course and PPWS is presented in Table 8.5. The decrease in walking time and the increase in PPWS are more noticeable in the second trial.

A boxplot showing the median PPWS on trials 1 and 2 by resolution and

frame rate is shown in Figure 8.11, which shows a general increase in PPWS as

frame rate increased (although there was an unexpected lower score for 2 frames

per second at the 16x12 resolution level). A similar plot for the time spent walking

through the course (referred to as Time 1 and Time 2 from now on) is shown in

Figure 8.12.

The interaction between resolution, frame rate and PPWS during each trial is

shown in Figures 8.13 and 8.14. Similar graphs showing the interaction between resolution, frame rate and Time 1 and Time 2 are shown in Figures 8.15 and 8.16.

8.3.1 Phosphene spatial resolution

As shown in Figure 8.10, overall veering was significantly less with a higher level of

spatial resolution (F (1,54) = 21.25, p<0.01). There was no significant difference

Table 8.2: Mean number of obstacle contacts (with standard deviations) for different resolution and frame rate.

Resolution  Frame Rate  Obstacle Trial 1  Obstacle Trial 2  Total
16x12       1           4.30 (1.70)       3.70 (1.57)       8.00 (2.67)
16x12       2           4.10 (1.73)       3.20 (1.55)       7.30 (2.26)
16x12       4           3.40 (1.35)       4.30 (1.34)       7.70 (1.95)
32x24       1           3.90 (1.37)       2.70 (1.16)       6.80 (2.04)
32x24       2           3.70 (1.77)       4.10 (1.29)       7.80 (2.44)
32x24       4           3.10 (1.10)       2.80 (1.62)       5.90 (2.42)

Figure 8.8: Summary of obstacle errors during trials 1 and 2 by resolution and frame rate (FPS). The boxes show the middle 50 per cent of observations, with the median shown by the solid line in the box. The whiskers coming from each box show the largest value excluding outliers (which are shown as small circles).


[Figure 8.9 plot not reproduced. Panels: FPS (1.00, 2.00, 4.00) by Resolution (16x12, 32x24); y-axis: No. of Obstacle Contacts (0 to 14); categories: Obstacle1 to Obstacle6, ObstacleH1 and ObstacleH2.]

Figure 8.9: Frequency of obstacle contacts by obstacle number for different resolution types and frame rates (FPS). The obstacle types are displayed in Figure 8.7.

Table 8.3: Mean number of veering errors (with standard deviations) for different resolution and frame rate.

Resolution  Frame Rate  Veering Trial 1  Veering Trial 2  Total
16x12       1           11.50 (2.76)     11.50 (4.88)     23.00 (5.91)
16x12       2           12.50 (4.01)     10.10 (4.93)     22.60 (7.11)
16x12       4           10.30 (2.79)     9.30 (1.77)      19.60 (3.63)
32x24       1           10.00 (4.50)     7.80 (4.02)      17.80 (7.96)
32x24       2           7.10 (3.63)      5.40 (3.13)      12.50 (6.11)
32x24       4           6.50 (3.47)      5.30 (4.08)      11.80 (7.21)

Figure 8.10: Summary of veering errors during trials 1 and 2 by resolution and frame rate (FPS)

Table 8.4: Mean benchmark times over 10m (with standard deviations) for resolution and frame rate. Benchmark no. 1 was recorded before the first mobility trial. Benchmark no. 2 was recorded after the second mobility trial. PWS is 10 divided by each Benchmark score. The combined PWS score in the table is the average PWS for the two benchmarks for each participant.

Resolution  Frame Rate  Benchmark no. 1 (s)  Benchmark no. 2 (s)  Combined PWS (m/s)
16x12       1           16.14 (3.54)         15.01 (2.81)         0.67 (0.12)
16x12       2           16.08 (2.34)         16.70 (4.55)         0.63 (0.11)
16x12       4           16.45 (1.94)         17.06 (2.68)         0.61 (0.07)
32x24       1           16.75 (4.30)         16.11 (5.20)         0.65 (0.14)
32x24       2           17.56 (5.30)         15.93 (3.33)         0.63 (0.13)
32x24       4           14.06 (1.38)         14.53 (2.40)         0.71 (0.08)


Table 8.5: Mean scores (with standard deviations) for the amount of time spent walking through the mobility course during each trial, and for PPWS (calculated using combined PWS) during each trial.

Resolution  Frame Rate  Time (s) Trial 1  Time (s) Trial 2  PPWS Trial 1   PPWS Trial 2
16x12       1           326.40 (190.66)   317.80 (207.14)   27.87 (15.49)  29.76 (18.05)
16x12       2           353.20 (142.64)   376.80 (208.75)   24.50 (11.83)  26.07 (15.27)
16x12       4           245.30 (61.91)    237.50 (55.93)    31.79 (6.34)   32.88 (7.20)
32x24       1           306.20 (86.82)    251.80 (112.80)   25.55 (8.90)   32.39 (11.27)
32x24       2           266.10 (78.50)    264.10 (122.14)   29.85 (9.68)   31.74 (10.43)
32x24       4           204.60 (79.80)    178.70 (68.93)    35.84 (10.34)  40.00 (12.62)

[Figure 8.11 boxplot not reproduced. Panels: FPS (1.00, 2.00, 4.00) by Resolution (16 x 12, 32 x 24); x-axis: PPWS1 and PPWS2; y-axis: 0 to 70.]

Figure 8.11: Percentage of Preferred Walking Speed (PPWS) results for trials 1 (PPWS1) and 2 (PPWS2) displayed by resolution type and frame rate (FPS).

[Figure 8.12 plot not reproduced. Panels: FPS (1.00, 2.00, 4.00) by Resolution (16 x 12, 32 x 24); x-axis: Time1 and Time2; y-axis: 200 to 800.]

Figure 8.12: Time spent walking through the mobility course for trials 1 (Time1) and 2 (Time2) displayed by resolution type and frame rate (FPS).


[Figure 8.13 plot not reproduced. x-axis: FPS (1.00, 2.00, 4.00); y-axis: Estimated Marginal Means, PPWS Trial 1 (24.00 to 36.00); series: Resolution 16 x 12 and 32 x 24.]

Figure 8.13: Variation of trial 1 PPWS scores by frame rate (FPS) and resolution. These results suggest a confounding variable, perhaps anxiety, during the initial trial.

[Figure 8.14 plot not reproduced. x-axis: FPS (1.00, 2.00, 4.00); y-axis: Estimated Marginal Means, PPWS Trial 2 (30.00 to 40.00); series: Resolution 16 x 12 and 32 x 24.]

Figure 8.14: Variation of trial 2 PPWS scores by frame rate (FPS) and resolution. These results show an increase in walking confidence as frame rate and resolution increase.

[Figure 8.15 plot not reproduced. x-axis: FPS (1.00, 2.00, 4.00); y-axis: Estimated Marginal Means of Time Spent Walking During Trial 1 (200.00 to 350.00); series: Resolution 16 x 12 and 32 x 24.]

Figure 8.15: Variation of time spent walking during trial 1 by frame rate (FPS) and resolution.

[Figure 8.16 plot not reproduced. x-axis: FPS (1.00, 2.00, 4.00); y-axis: Estimated Marginal Means of Time Spent Walking During Trial 2 (150.00 to 400.00); series: Resolution 16 x 12 and 32 x 24.]

Figure 8.16: Variation of time spent walking during trial 2 by frame rate (FPS) and resolution.


found between the two levels of display spatial resolution and overall obstacle

contacts (F (1,54) = 0.08, p=0.78). However, there was a significant interaction

between frame rate, resolution and obstacle avoidance on the second trial (F (2,54)

= 9.16, p<0.05). Contact with obstacle 5 on both trials was significantly less with increased resolution (F (1,54) = 9.16, p<0.01).

There were no significant relationships found between resolution and PPWS

on the first (F (1,54) = 0.51, p=0.48) or second trials (F (1,54) = 2.37, p=0.30). The results for Time 1 were also not significantly different (F (1,54) = 2.52, p=0.12). However, participants using the higher resolution did spend significantly less time walking through the course during the second trial (F (1,54) = 4.40, p<0.05).

8.3.2 Frame Rate

Using PPWS as the dependent variable, frame rate was not related to improved

performance on the first (F (2,54) = 1.80, p=0.18) or second trials (F (2,54) =

2.33, p=0.11). However, time spent walking through the mobility course was

significantly affected by frame rate on both the first trial (F (2,54) = 3.86, p<0.05)

and the second trial (F (2,54) = 3.24, p<0.05). Post-hoc Tukey’s HSD analysis

revealed significant differences in time spent walking between frame rates of 1 and 4 FPS (p<0.05) on the first trial, and between frame rates of 2 and 4 FPS (p<0.05) on the second trial.

There was also a marginally significant relationship between frame rate and

overall veering on both trials (F (2,54) = 2.68, p=0.08). Overall contact with

obstacle 5 was related to frame rate (F (2,54) = 3.21, p<0.05); however, frame

rate was not related to overall obstacle contacts (F (2,54) = 0.59, p=0.56).
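As a check on the frame rate statistics above, the p-value of an F test with two numerator degrees of freedom has a closed form, p = (1 + 2F/d2)^(-d2/2), where d2 is the denominator degrees of freedom. The sketch below applies it to the F (2,54) values reported in this section. The function name is illustrative and is not part of the analysis software used in this thesis.

```python
def p_value_f2(f_stat: float, df_den: int) -> float:
    """Upper-tail p-value for an F statistic with 2 numerator degrees of freedom.

    For F(2, d2) the survival function has the closed form
    (1 + 2F/d2) ** (-d2/2); this shortcut does not hold for other numerator df.
    """
    if f_stat < 0 or df_den <= 0:
        raise ValueError("f_stat must be >= 0 and df_den must be > 0")
    return (1.0 + 2.0 * f_stat / df_den) ** (-df_den / 2.0)


# F(2,54) values reported above for the frame rate factor.
reported = {
    "PPWS, trial 1": 1.80,
    "PPWS, trial 2": 2.33,
    "time walking, trial 1": 3.86,
    "time walking, trial 2": 3.24,
    "overall veering": 2.68,
    "obstacle 5 contacts": 3.21,
    "overall obstacle contacts": 0.59,
}

for label, f_stat in reported.items():
    print(f"{label}: F(2,54) = {f_stat:.2f}, p = {p_value_f2(f_stat, 54):.3f}")
```

The recomputed values agree with the reported ones to two decimal places (for example, 0.18 for PPWS on trial 1 and 0.08 for overall veering).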


8.3.3 Prior experience with immersive VR

Eleven (18.3%) participants had previous experience with VR. However, no signif-

icant relationships were found between VR and overall obstacle contacts (F (1,54)

= 0.24, p=0.62), overall veering (F (1,54) = 0.08, p=0.78), or PPWS on trial 1

(F (1,54) = 1.39, p=0.24) or 2 (F (1,54)=0.053, p=0.82). Similarly VR was not

related to time spent walking on the course during trial 1 (F (1,54) = 0.932,

p=0.338) or trial 2 (F (1,54) = 0.217, p=0.643).

Significant interactions were found between resolution, frame rate and VR

experience and contacts with obstacle 1 (F (2,54) = 3.53, p<0.05), and the same

factors and contact with obstacle 3 (F (2,54) = 3.42, p<0.05).

8.3.4 Gender

Although only 10 participants (16.7%) in this study were female, they had significantly fewer obstacle contacts than males overall (F (1,54) = 9.27, p<0.01). Further analysis showed that this difference in obstacle contacts was significant on the first trial (F (1,54) = 7.84, p<0.01), but not on the second trial (F (1,54) = 2.75, p=0.10). Across both trials, significant gender differences were found for contacts with obstacle 1 (F (1,54) = 5.89, p<0.05) and obstacle 3 (F (1,54) = 5.55, p<0.05). No gender difference in veering was found during either trial.

On average, females scored higher on PPWS in both trials (Trial 1: male mean=28.31, SD=10.46; female mean=32.82, SD=13.06; Trial 2: male mean=30.85, SD=12.18; female mean=38.62, SD=16.09); however, these differences were not significant (Trial 1: F (1,54) = 1.40, p=0.24; Trial 2: F (1,54) = 3.04, p=0.08). Similarly, there was no significant gender difference in time spent walking in trial 1 (F (1,54) = 0.384, p=0.54) or trial 2 (F (1,54) = 0.729, p=0.40).


8.3.5 Age

Boxplots summarising age group results for PPWS and time spent walking through

the course are provided in Figures 8.17 and 8.18. Age was not significantly related

to overall obstacle contacts (F (5,54) = 1.93, p=0.11), overall veering (F (5,54) =

0.36, p=0.87), PPWS on trial 1 (F (5,54) = 0.49, p=0.74) or Trial 2 (F (5,54) =

0.70, p=0.59), or time spent walking during Trial 1 (F (5,54) = 0.48, p=0.75) or

trial 2 (F (5,54) = 1.52, p=0.21).

There was a significant difference with obstacle 1 contact (F (5,54) = 3.78,

p<0.01), probably due to the 50-60 and 60-70 year age groups making contact

with this obstacle on every trial (at least double any other age group). However,

there were only three participants within those age groups. A significant interaction

between age and resolution type was also found for obstacle 6 (F (3,54)

= 4.00, p<0.05). Grouping ages 0-30 (n=36) and participants with an age >30

(n=24) did not reveal any significant differences.

8.3.6 Corrected Vision

Twenty-two (36.7%) participants had corrected-to-normal vision. Corrected

vision was not significantly related to overall obstacle contacts (F (1,54)=0.66,

p=0.42), overall veering (F (1,54) = 0.25, p=0.61), PPWS on trial 1 (F (1,54) =

0.18, p=0.67) or 2 (F (1,54) = 0.25, p=0.62), or time spent walking through the

course during trial 1 (F (1,54) = 0.74, p=0.39) or 2 (F (1,54) = 1.45, p=0.23).

Significant interactions were found between resolution and corrected vision

for contacts with obstacle 3 (F (1,54) = 4.82, p<0.05), and between frame rate

and corrected vision for contacts with obstacle 6 (F (2,54) = 3.27, p<0.05).


[Figure 8.17 boxplots. x-axis: age group (0-19, 20-29, 30-39, 40-49, 50+); y-axis: time spent walking (axis range 200 to 800); series: Time1 and Time2.]

Figure 8.17: Median time spent walking through the mobility course during Trial 1 (Time1) and Trial 2 (Time2) for different age groups.


[Figure 8.18 boxplots. x-axis: age group (0-19, 20-29, 30-39, 40-49, 50+); y-axis: PPWS (axis range 0 to 70); series: PPWS1 and PPWS2.]

Figure 8.18: Median PPWS scores from Trial 1 (PPWS1) and Trial 2 (PPWS2) for different age groups.


8.3.7 Learning effects

The initial and final measurements of preferred walking speed (PWS) were sig-

nificantly correlated (r = 0.74, p<0.01). However, the correlation between time

spent on the mobility course during the two mobility trials was higher (r =

0.87, p<0.01). The relationship between PPWS1 and PPWS2 was also significant

(r = 0.88, p<0.01), although this relationship is artificially enhanced due to the

same (average) PWS score being used for the calculation of PPWS on the first

and second trials for each participant. These results do not support the reliability

of the PPWS measure over simply recording the time taken by participants dur-

ing the mobility course. Scatterplots showing the correlation between dependent

variables on the first and second trial are shown in Figures 8.19 to 8.24.
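To make the PPWS calculation concrete, the following minimal sketch assumes the usual definition of PPWS: speed on the mobility course (SMC) expressed as a percentage of preferred walking speed (PWS). The course length, times and PWS values are hypothetical and chosen only for illustration. The sketch also shows why a shared PWS inflates the relationship between PPWS1 and PPWS2: with a common denominator, the ratio of the two PPWS scores is determined entirely by the two course times.

```python
def speed_on_course(course_length_m: float, time_s: float) -> float:
    """Speed on Mobility Course (SMC) in metres per second."""
    return course_length_m / time_s


def ppws(smc_m_per_s: float, pws_m_per_s: float) -> float:
    """Percentage Preferred Walking Speed: SMC as a percentage of PWS."""
    return 100.0 * smc_m_per_s / pws_m_per_s


# Hypothetical participant (values are NOT taken from the experiment).
COURSE_LENGTH_M = 30.0                        # assumed course length
pws_initial, pws_final = 0.70, 0.80           # assumed PWS measurements (m/s)
pws_mean = (pws_initial + pws_final) / 2.0    # same averaged PWS used for both trials
time1_s, time2_s = 300.0, 240.0               # assumed course times

ppws1 = ppws(speed_on_course(COURSE_LENGTH_M, time1_s), pws_mean)
ppws2 = ppws(speed_on_course(COURSE_LENGTH_M, time2_s), pws_mean)

print(f"PPWS1 = {ppws1:.1f}, PPWS2 = {ppws2:.1f}")
# With a shared PWS, PPWS2 / PPWS1 equals time1 / time2.
print(f"PPWS2/PPWS1 = {ppws2 / ppws1:.3f}, time1/time2 = {time1_s / time2_s:.3f}")
```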

Repeated measures ANOVA between Trials 1 and 2 showed a significant de-

crease in veering errors (F (1,54) = 7.97, p<0.01), but only a marginally significant

change in PPWS (F (1,54) = 3.58, p=0.06). There was not a significant reduction

in obstacle contacts between trials (F (1,54) = 1.35, p=0.25).

8.4 Discussion

The experiment described in this chapter has provided further information ad-

dressing the following main thesis questions:


[Figure 8.19 scatterplot. x-axis: Preferred Walking Speed 1; y-axis: Preferred Walking Speed 2; linear fit R² = 0.53.]

Figure 8.19: Participant Preferred Walking Speed (PWS) results during trial 1 and 2 (r = 0.74).

[Figure 8.20 scatterplot. x-axis: PPWS1; y-axis: PPWS2; linear fit R² = 0.775.]

Figure 8.20: Participant Percentage Preferred Walking Speed (PPWS) results during trial 1 and 2 (r = 0.88).


[Figure 8.21 scatterplot. x-axis: Time1; y-axis: Time2; linear fit R² = 0.751.]

Figure 8.21: Participant time spent walking during trial 1 and 2 (r = 0.87).

[Figure 8.22 scatterplot. x-axis: SMC1; y-axis: SMC2; linear fit R² = 0.761.]

Figure 8.22: Participant Speed on Mobility Course (SMC) results for trial 1 and 2 (r = 0.87).


[Figure 8.23 scatterplot. x-axis: Veering1; y-axis: Veering2; linear fit R² = 0.365.]

Figure 8.23: Veering incidents for each participant during trial 1 and 2 (r = 0.60).

[Figure 8.24 scatterplot. x-axis: Obstacles1; y-axis: Obstacles2; linear fit R² = 0.027.]

Figure 8.24: Participant obstacle contacts during trial 1 and 2 (r = 0.16).


8.4.1 Can specific main factors be identified as highly

significant for providing mobility information in an

AHV system?

Effects of phosphene spatial resolution and frame rate

An increase in spatial resolution from 16x12 phosphenes to 32x24 phosphenes was

associated with a significant reduction in veering errors between participants.

However, frame rate, during the second of two trials for each participant, was

significantly related to increased walking speed. The variability of results for the

first PPWS trial could be due to learning effects, and mixed levels of comfort

and confidence by participants. The results from this study indicate that spatial

resolution is more useful than increased frame rate for following a path without

veering. However, the display frame rate has a significant effect on a person’s

preferred walking speed. These findings suggest the development of an adaptive

AHV system which could provide a lower resolution/faster display mode while

a person is moving, and a higher resolution/slower display when a person has

ceased movement.
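One way such an adaptive display could be organised is sketched below. The two modes pair the resolutions used in this experiment with its lowest and highest frame rates, which is an assumption made for illustration. The speed threshold, the names, and the idea of estimating walking speed from a motion sensor are hypothetical and were not part of the simulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayMode:
    columns: int   # phosphenes per row
    rows: int      # phosphenes per column
    fps: float     # display updates per second


# Illustrative modes only: the resolutions are those tested in this chapter,
# but pairing them with these frame rates is an assumption.
MOVING_MODE = DisplayMode(columns=16, rows=12, fps=4.0)       # lower resolution, faster
STATIONARY_MODE = DisplayMode(columns=32, rows=24, fps=1.0)   # higher resolution, slower

# Hypothetical threshold below which the wearer is treated as stationary.
STATIONARY_SPEED_M_PER_S = 0.05


def select_mode(walking_speed_m_per_s: float) -> DisplayMode:
    """Choose a display mode from an estimate of the wearer's walking speed."""
    if walking_speed_m_per_s < STATIONARY_SPEED_M_PER_S:
        return STATIONARY_MODE
    return MOVING_MODE


for speed in (0.0, 0.02, 0.3):
    mode = select_mode(speed)
    print(f"{speed:.2f} m/s -> {mode.columns}x{mode.rows} at {mode.fps:g} FPS")
```

A practical system would probably also need some hysteresis or smoothing so that the display does not switch modes repeatedly when the speed estimate hovers near the threshold.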

Interestingly, three participants reported useful echolocation from nearby partitions

as they were walking. One participant reported trying to use sound to

assist with navigation.

Effects of gender, age, corrected vision and VR experience

Although there were some significant interactions for particular obstacles, prior

experience with VR was not found to be an important factor in improved mo-

bility performance using the simulator. Similarly, age and corrected vision were

not significantly associated with improved performance (despite some interaction

effects).


There was a significant reduction in obstacle contacts during the first trial by

female participants. This may reflect a more cautious approach during this trial

rather than innate gender differences. The finding was not repeated in the second

trial. The obstacle avoidance finding suggests that a confounding variable may

have been the gender of the experimenter. Therefore a balance of

male and female experimenters should be used for further studies investigating

gender differences in mobility.

Learning effects

It would be interesting to assess the effect resolution and frame rate have on

mobility over a number of repeated trials. However, it would be difficult and

time-consuming to maintain a sufficient number of participants for reasonable

statistical results over a period of time. Learning effects have been found in many

AHV simulation studies (for example, [29], [31] and [70]). Mean scores generally

improved between the first and second trials in the current experiment; however,

an extraneous variable could be the level of confidence each participant felt while

being effectively blindfolded in a strange environment. Some participants also

required time to adjust to the location of the camera and the associated difference

in display viewing angle from their usual vision. The following suggestions, drawn

from participant feedback and observation during the sessions, may enhance

future AHV mobility performance:

• During training, to assist in obstacle avoidance, allow participants to ob-

serve the increased rate of expansion from a high-contrast looming object

as they walk toward it.

• Advise participants to adjust their walking speed to the speed of display

(for example, 1 step per display update).

• Demonstrate the width of the camera field of view (FOV) by showing an


object of a known width (for example, a doorway) and allow the participant

to touch the object.

• To reduce veering, show participants the black tape marking the path

boundaries and ask participants to touch it.

• Suggest using slow head movements to compensate for the narrow displayed

FOV (see insect vision peering behaviour comments below). However, ex-

plain that faster head movements may result in image corruption due to

motion blur.

In addition, some participants tended to point the head mounted camera too

high to locate the path boundaries. Therefore, an artificial horizon indicator may

be useful to assist with camera orientation.

8.4.2 Can objective measures be developed for the com-

parison of effectiveness between AHV systems in

providing mobility information?

The highly significant relationships between pre- and post-trial Preferred Walk-

ing Speed offer some support for the use of the PPWS method as a mobility

assessment measure for AHV research. However, this relationship is artificially

enhanced as PWS scores from the beginning and end of sessions were averaged.

In fact the correlation between PWS scores was lower (r = 0.74) than that between the times

spent on the mobility course during trial 1 and trial 2 (r = 0.87). The difference

in PWS scores is probably due to learning effects (as the first measurement takes

place soon after participants wear the AHV simulator for the first time). Therefore

the stability of PPWS should improve as people spend more time training

with the simulator.


The mean PPWS results for this experiment range from 24.5 for 16x12 phosphene

resolution at 2 FPS to 40.0 for 32x24 resolution at 4 FPS. These results are higher

than those reported in Chapter 7 (where the overall mean PPWS was 18.14), indicating

that the weight (and discomfort) of the PDA head-gear may have affected

walking speed. Participants generally moved at a slow pace, and spent time

scanning for both obstacles and the edges of the path. However, these results are

similar to those reported in Jones et al. [117], who recorded PPWS while inves-

tigating eight visually impaired adults and the effectiveness of an image based

ETA.

In conclusion, time spent walking through the course, combined with veering

and obstacle contacts, forms the basis for an objective method to assess the effects

of different image processing methods in both simulated and real AHV systems.

This method of assessment could also be extended to comparing different blind

mobility aids with an implanted AHV system, for example comparing the freeware

vOICe auditory electronic aid for the blind (which is limited to presenting one

frame per second [150]) with simulated AHV.

8.4.3 Can computer vision techniques be adopted and

modified to provide mobility information in an AHV

system?

The experimental hardware and software performed reliably. No participants

reported nausea during the experiment, although two required a rest between

trials. The front of the HMD sometimes became warm during the experiment, due

to the shroud attached to block external light. One hardware constraint in this

study was the narrow 34° field of view (FOV) of the Swann USB camera, which is

a similar constraint to current generation night-vision goggles (e.g. [92]). However,

an image captured with a wider FOV may not necessarily enhance mobility, as


the spatial resolution will still need to be greatly reduced for an electrode array.

It would be useful in future work to compare the effect of different camera fields

of view on mobility.
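The trade-off can be illustrated with simple arithmetic: for a fixed number of phosphene columns, widening the camera FOV increases the visual angle that each phosphene must represent. The sketch below assumes that the quoted FOV is horizontal and is divided evenly across the columns (lens distortion is ignored). The 90 degree FOV is a hypothetical comparison value.

```python
def degrees_per_phosphene(fov_deg: float, columns: int) -> float:
    """Horizontal visual angle covered by one phosphene column, assuming
    the field of view is divided evenly across the columns."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    return fov_deg / columns


for fov_deg in (34.0, 90.0):       # 34 degrees: camera used here; 90: hypothetical
    for columns in (16, 32):       # column counts of the two tested resolutions
        angle = degrees_per_phosphene(fov_deg, columns)
        print(f"FOV {fov_deg:g} deg, {columns} columns: {angle:.3f} deg per phosphene")
```

Under these assumptions, each phosphene in the 32 x 24 display covers about one degree with the 34 degree camera, and proportionally more with a wider lens.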

8.4.4 Connections with Vision Research

Insects have fixed visual systems, with fixed focus, that lack stereoscopy, but are

still able to determine the distance to features within their environment with

enough precision to exhibit safe mobility [211]. It has been demonstrated that

the insect principles for mobility can be effectively translated to solve mobility

problems in the context of autonomous systems [211].

The limitations of the insect visual system appear similar to the features of

the mobility problem faced by participants in the experiment described in this

chapter. These features include safe mobility using a fixed focus, low resolution

and a monoscopic vision system. Therefore the mobility strategies used by insects

may provide some insight into the principles for mobility for AHV system users.

Within the biology community, it is believed that optical flow is a fundamental

quantity in enabling safe mobility of insects [211]. As discussed in Chapter 6,

optical flow is the apparent motion of brightness patterns in an image

sequence. An insect can use the apparent motion of objects in its environment to

make a good estimate of their distance. For the optical information to be reliable,

the motion of a brightness pattern must be observed. If self motion is too fast,

the apparent brightness pattern is too fast to see; if self motion is too slow, no

apparent motion occurs. It is believed that some insects (such as grasshoppers)

artificially generate optical flow when stationary by exhibiting peering behaviour

(translation of the head) [211]. Insects have also developed large fields of view to

improve the robustness of navigation based on their low resolution, monoscopic

information.


These observations provide one possible explanation as to why walking speed

was found to be strongly dependent on frame rate. At a fixed resolution and low

frame rate, there may be a threshold below which there is insufficient optical flow

information. As the frame rate increases, motion at a faster speed is possible.
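The threshold idea can be sketched numerically. Assuming a pinhole camera, straight-line walking at constant speed, and a stationary point at a given lateral offset and depth, the bearing of the point changes between successive frames by an amount that depends on how far the walker moves per frame. Dividing that change by the angle covered by one phosphene gives the per-frame image shift in phosphene widths. All scene values below are hypothetical.

```python
import math


def flow_per_frame_phosphenes(speed_m_s, fps, lateral_m, depth_m, deg_per_phosphene):
    """Image shift (in phosphene widths) of a stationary point between two
    successive frames while walking straight ahead at constant speed."""
    depth_next = depth_m - speed_m_s / fps      # distance remaining after one frame
    if depth_next <= 0:
        raise ValueError("the point is passed within one frame")
    bearing_now = math.degrees(math.atan2(lateral_m, depth_m))
    bearing_next = math.degrees(math.atan2(lateral_m, depth_next))
    return (bearing_next - bearing_now) / deg_per_phosphene


DEG_PER_PHOSPHENE = 34.0 / 32      # 34 degree FOV over 32 columns (even split assumed)

for fps in (1, 2, 4):
    shift = flow_per_frame_phosphenes(
        speed_m_s=0.5, fps=fps, lateral_m=0.5, depth_m=2.0,
        deg_per_phosphene=DEG_PER_PHOSPHENE,
    )
    print(f"{fps} FPS: shift of {shift:.2f} phosphenes per frame")
```

For the same walking speed, the shift per frame falls as the frame rate rises, so successive frames stay similar enough to follow at higher walking speeds. This is one reading of the threshold suggested above, not a model fitted to the experimental data.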

Another observation from insect mobility is that both peering type behaviour

and head rotation behaviours (to increase the effective field-of-view) improve the

information available and therefore improve the safety of mobility. This behaviour

was demonstrated by participants in the current simulated AHV experiment, and

has also been reported in previous AHV simulation research [28] and research

on auditory vision substitution devices [7].

Another aspect of frame rate can be understood in terms of existing results

from the computer vision community. It has previously been shown that the effec-

tive resolution of one image frame within an image sequence can be improved by

considering the information contained in other image frames [127]. This process

of improving the effective resolution is known as super-resolution. In the context

of AHV systems, the principle of super-resolution suggests that higher frame rates

can effectively increase the resolution of information available to participants.
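A toy one-dimensional illustration of this principle follows. Two low-resolution frames sample the same scene with a half-sample offset between them; interleaving them recovers samples at twice the density of either frame. This is a deliberately idealised sketch (known offset, point sampling, no noise or blur) and is not intended to represent the method of [127].

```python
def sample(scene, start, step):
    """Take every `step`-th value of `scene`, beginning at index `start`."""
    return scene[start::step]


def interleave(frame_a, frame_b):
    """Merge two equally long frames whose sample positions alternate."""
    merged = []
    for a, b in zip(frame_a, frame_b):
        merged.extend([a, b])
    return merged


# Hypothetical high-resolution brightness profile along one image row.
scene = [0, 0, 9, 9, 0, 0, 9, 0]

frame_1 = sample(scene, start=0, step=2)   # camera at its first position
frame_2 = sample(scene, start=1, step=2)   # camera shifted by half a low-resolution sample

print("frame 1: ", frame_1)
print("frame 2: ", frame_2)
print("combined:", interleave(frame_1, frame_2))
```

Neither frame alone distinguishes the single bright sample near the end of the row from the wider bright region, whereas the combined sequence does.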

8.5 Chapter Summary

The research described in this chapter fills a gap in the AHV literature, by pro-

viding evidence that a method of mobility assessment adapted from the low vision

community (time spent walking through a mobility course, obstacle contacts and veering)

can be used as a practical method to objectively assess AHV system technology.

For example, this method could be easily used to compare the effects of different

image processing algorithms. In this chapter, this assessment method was used

with a custom AHV simulator to investigate the effects of frame rate and res-

olution. Higher spatial resolution was important for accurate walking (reduced


veering), and higher frame rate resulted in faster walking speeds. Female partic-

ipants made contact with a significantly lower number of obstacles than males.

Prior experience with immersive virtual reality, age and corrected vision did not

significantly affect mobility performance.

Chapter 9

Conclusion and Future Work

This chapter contains a summary of the work presented in each chapter of this

thesis. Additionally, conclusions are drawn and possible avenues for future work

are identified. The main scientific contributions of this work are summarised in

Table 9.1.

9.1 Conclusions

This thesis has provided thorough reviews of blind and low vision mobility, Arti-

ficial Human Vision (AHV) and computer vision. The original work in this thesis

is primarily aimed at two particular areas of AHV mobility:

1. The first aim of this thesis was to investigate, evaluate and develop

techniques for mobility assessment which will allow the objec-

tive comparison of different AHV system phosphene presentation

methods. The lack of an objective method for comparing different AHV

system displays, in addition to comparing AHV systems and other blind

mobility aids (such as the long cane), has been identified by other authors

as a significant problem. In this thesis a number of different methods have

been developed to evaluate differences in the perception of information from

phosphene displays. These methods have included a computer-based static

image simulation, a PDA-based simulation display, and a Virtual Reality

Head Mounted Display.

Table 9.1: Summary of the main scientific contributions of this thesis.

No. | Contribution

1 | A conceptual framework based on literature reviews of blind and low vision mobility, AHV technology, and computer vision. This framework incorporates a comprehensive number of factors which affect the effectiveness of information presentation in an AHV system.

2 | The adaptation of a mobility assessment method from the blind and low vision literature to measure simulated AHV mobility performance using real-time computer based analysis. This method of mobility assessment (based on parameters for walking speed, obstacle contacts and veering) is demonstrated experimentally in two different indoor mobility courses.

3 | The development and evaluation of an original real-time looming obstacle detector, based on coarse optical flow, and implemented on a Windows PocketPC based Personal Digital Assistant (PDA) using a CF card camera.

4 | The development of a novel head-mounted Windows PocketPC PDA based AHV simulator.

5 | The development of a novel Windows XP based AHV simulation with an immersive Head Mounted Display.

2. The second aim was to develop a display framework for the presenta-

tion of AHV system information, and use this framework to guide

the development of an AHV simulation device. A novel framework

for AHV system information display was developed and presented in Chap-

ter 4. This framework has been based on the literature reviews of blind-

ness, blind mobility, AHV systems, and computer vision. The framework

includes the main factors which impact on a blind traveller. The main ben-

efits of using this framework are enhanced communication between AHV

researchers and the ability to explore experimentally and compare different

factors (such as age or gender, different types of computer vision methods,

and different environments). Experimental work contained in this thesis

has been guided by this original framework.

The research questions which have been addressed in this thesis are:

9.1.1 Can specific main factors be identified as highly

significant for providing mobility information in an

AHV system?

Chapter 2 provided a review of major mobility issues which a blind or vision

impaired person might experience. The main hazardous situations for blind mo-

bility were identified as drop-offs, obstacles and fast moving objects. Mobility

aids, both traditional and electronic were also reviewed. The main benefit of


these devices is that they provide additional preview information to blind pedes-

trians. As mentioned above, one of the novel contributions in this thesis is the

conceptual framework based on literature reviews of blind and low vision mobility,

AHV, and computer vision presented in Chapter 4. This framework incorporates

a comprehensive number of factors which affect the effectiveness of information

presentation in an AHV system. Experiments reported in this thesis have investigated

a number of these factors. In Chapter 5, it was found that less cluttered

images resulted in the highest number of correct recognitions of mobility-related

scene components.

In the experiment reported in Chapter 8 it was found that higher spatial

resolution is associated with accurate walking (reduced veering), whereas higher

frame rate is associated with faster walking speeds. This finding supports the

development of an adaptive AHV system, with dynamic adjustment of display

properties in real-time. This experiment also found that prior experience with

immersive VR was not an important factor in improved mobility performance

using the simulator. Similarly, age and corrected vision were not significantly

associated with improved performance.

9.1.2 Can objective measures be developed for the com-

parison of effectiveness between AHV systems in

providing mobility information?

In this thesis a number of different methods have been used to evaluate differences

in the perception of information.

In Chapter 5, novel computer-based static image software was developed,

which demonstrated that a static-image based AHV simulation can provide useful

information regarding mobility. The advantages of using a static image approach


for such studies include portability, control of extraneous variables, ease of par-

ticipant recruitment and ease of data recording. However, a number of important

mobility related sensations (such as auditory, tactile, kinesthetic and proprioceptive

input) are not considered in a static image study. Because these sensations

strongly influence mobility, their absence severely limits the deductions that can

be made about mobility from static images. Therefore, the static image method was

not used for the remainder of the thesis.

A PDA-based AHV simulator was developed in Chapter 6. This simulator

was used in a pilot artificial mobility course study, described in Chapter 7. The

simulator demonstrated the feasibility of capturing, processing and displaying

images using a PDA device. In this pilot study it was found that all participants

were able to improve mobility performance, measured by PPWS and mobility

errors, over two trials. However, the PDA simulator and head brace bracket

were found to be heavy after a few minutes of mobility assessment, and this may

have altered the movement of participants (which could have affected mobility

performance).

Chapter 8 presented the results of an experiment using a custom VR HMD-based simulator and an artificial mobility course. This experiment provided evidence that a method of mobility assessment adapted from the low vision community (time spent walking through a mobility course, obstacle contacts and veering) can be used as a practical and useful method to objectively assess AHV system technology.


9.1.3 Can computer vision techniques be adopted and

modified to provide mobility information in an AHV

system?

Computer vision methods provide a critical link between the camera and electrode

array of an effective AHV system. All of these systems currently need to reduce

the resolution and the number of colours of captured images to match the number

of stimulating electrodes. This thesis has demonstrated the use of a number of

computer vision techniques to provide mobility information.
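A minimal sketch of this reduction step is given below, assuming a greyscale image stored as a list of rows and a target grid whose dimensions divide the image dimensions exactly. Block averaging and uniform quantisation are used here for illustration; they are not necessarily the operations used in the simulators described in this thesis, and the function names are hypothetical.

```python
def block_average(image, out_rows, out_cols):
    """Reduce a greyscale image (list of rows, values 0-255) to out_rows x out_cols
    by averaging non-overlapping blocks. Image size must be an exact multiple."""
    in_rows, in_cols = len(image), len(image[0])
    if in_rows % out_rows or in_cols % out_cols:
        raise ValueError("image size must be a multiple of the output size")
    bh, bw = in_rows // out_rows, in_cols // out_cols
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            block = [image[r * bh + i][c * bw + j] for i in range(bh) for j in range(bw)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def quantise(value, levels):
    """Map a 0-255 brightness to one of `levels` evenly spaced grey levels."""
    if levels < 2:
        raise ValueError("at least two grey levels are required")
    index = min(int(value * levels / 256), levels - 1)
    return round(index * 255 / (levels - 1))


# Hypothetical 4x4 image reduced to a 2x2 "phosphene" grid with 4 grey levels.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 200, 200],
    [100, 100, 200, 200],
]
phosphenes = [[quantise(v, 4) for v in row] for row in block_average(image, 2, 2)]
print(phosphenes)
```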

Chapter 4 examined the main computer vision methods for the reduction of

unimportant image information, and the extraction of important features from

images. A number of prototype systems to assist the blind were then reviewed

to illustrate how these methods can be applied. Despite the number of systems

reviewed, many struggle to provide useful mobility information in real-time, and

only one has received wide acceptance (viz. the auditory vOICe system).

In Chapter 5, four different methods were used to process static images. The

256 grey-level image type resulted in significantly better recognition of mobility

components than binary or edge detected image types. There was no significant

difference found between two different types of edge detection (the Canny or

Sobel methods). These results did not support the use of edge detection for low

resolution static images.

Chapter 6 described the development and evaluation of an original real-time

looming obstacle detector, based on coarse optical flow, and implemented on

a Windows PocketPC based PDA using a Compact Flash (CF) card camera.

This method detected obstacles which were looming in front of the camera and

provided an alert to the wearer. The results of two experiments at four different

lighting levels indicated that the initial segmentation and adequate lighting were

significant factors in system performance. The accuracy of alerts ranged from


100% for a sequence captured in the early afternoon, down to 25% for an image

captured in the late afternoon.
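For intuition about looming, the sketch below estimates time-to-contact from the apparent width of an object in two successive frames. It assumes a pinhole camera, a constant approach speed, and an object that is small relative to its distance, so that apparent width is inversely proportional to distance. It is a generic illustration and does not reproduce the detector described in Chapter 6; the alert threshold and scene values are hypothetical.

```python
def time_to_contact(width_prev, width_now, fps):
    """Estimate seconds until contact from the apparent width of an object in
    two successive frames (any consistent unit, e.g. pixels).

    With apparent width inversely proportional to distance and constant
    approach speed, tau = dt * width_prev / (width_now - width_prev).
    Returns None if the object is not expanding (no approach detected).
    """
    if width_now <= width_prev:
        return None
    dt = 1.0 / fps
    return dt * width_prev / (width_now - width_prev)


ALERT_THRESHOLD_S = 2.0   # hypothetical warning threshold

# Hypothetical case: object 4.0 m away, approached at 1.0 m/s, viewed at 4 FPS.
k = 100.0                            # arbitrary scale: apparent width = k / distance
w_prev, w_now = k / 4.0, k / 3.75    # widths at the distances of successive frames
tau = time_to_contact(w_prev, w_now, fps=4)
print(f"time to contact: {tau:.2f} s, alert: {tau < ALERT_THRESHOLD_S}")
```

In this example the estimate equals the remaining distance divided by the approach speed (3.75 m at 1.0 m/s), which is what the constant-speed assumption implies.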

The VR HMD-based simulator presented in Chapter 8 reduced the number of

grey-levels and resolution of images captured from a head mounted camera, and

provided realistic simulated arrays of phosphenes to experiment participants.

9.2 Future Work

A number of avenues for future work have been identified during the undertaking

of this thesis. These are summarised below:

9.2.1 Mobility experiments with AHV system recipients

There is a limited amount of published data regarding AHV system recipients.

This appears to be due to commercial reasons (for example, the lack of reported

outcomes from the Dobelle cortical AHV system), or because the technology

is still being tested on animals (for example, much of the current retinal im-

plant research). Therefore a number of assumptions must be made regarding

the phosphene display which a person may perceive (for example, the shape and

layout of phosphenes or temporal resolution). As AHV technology (particularly

electrode array technology) develops over the next five to ten years it should be

possible to provide more realistic and generalizable simulations. Ultimately it

would be beneficial to measure the mobility performance of a number of AHV

system recipients objectively, and use these findings to drive system improve-

ments.


9.2.2 Symbolic display

There is a large amount of human factors research on Human Computer Inter-

face (HCI) and the perception of information by people (such as in aircraft dis-

plays). This research could be applied and extended to provide a more effective

AHV system display. For example, object recognition, discussed in Chapter 4,

could provide a simplified display for blind people (such as displaying a standard

symbol for a doorway or tactile strip). Additionally, a standard for ‘phosphene

menus’ could also be developed and assessed. Additional fields of study which

also overlap with an AHV symbolic display include wearable computing, mobile

communication, augmented reality and virtual reality.

9.2.3 Real world mobility assessment environments

This thesis has described experiments involving static images and indoor artificial

mobility courses. However, as discussed in Chapter 2, real world mobility

assessment would also be a useful method of comparing and assessing the effec-

tiveness of simulated AHV displays. In addition, self reported observations from

AHV system recipients would provide valuable information on different display

methods.

9.2.4 Integration of information from other sensors

Current generation AHV systems are based on image data captured from a single

head-mounted camera. However, there has been a large amount of research on the

development of different sensor based ETAs for the blind (using ultrasound or

laser reflection). It could be useful to integrate information from these sensors into

an AHV system, such as using ultrasound to provide a phosphene ‘distance map’.

Additionally, the use of multiple cameras could be a useful source of

depth information. Finally, although technically a navigation aid (rather than


mobility) the integration of GPS data with an AHV system would allow

current location and directional maps to be displayed (possibly in a symbolic

format).

9.2.5 Standard set of mobility related images

One approach which has been successfully applied in the field of Information

Retrieval [155] and could be useful for AHV research is the development of a

standard set of images/image sequences for evaluation and comparability of com-

puter vision methods. The main benefit of a standard is that different algorithms

could be objectively tested against each other and measured for efficiency and

effectiveness. Sample image sequences might be captured from a subject walking

at normal pace toward different obstacles, over different drop-offs or toward a

door. Different environments (such as indoors/outdoors) and lighting conditions

could also be included.

9.3 Final Remarks

The field of AHV and mobility research is entering a period of challenges as the

enabling technology is developed in advance of our knowledge and understanding

of the human factor aspects. This research has attempted to examine and stimulate

some investigative paths in this space. It should be seen as a beginning,

rather than as a conclusive or advanced stage, in what will necessarily be

a lengthy process, requiring many different perspectives and approaches to be

considered, given the human, subjective nature of this topic.

Bibliography

[1] R. Allison, L. Harris, M. Jenkin, U. Jasiobedzka, and J. Zacher, “Tolerance

of temporal delay in virtual environments,” in Proceedings of Virtual Reality

2001, pp. 247–254, 2001.

[2] American National Standards Institute, Inc., “ATIS Telecom Glossary 2000,”

(accessed July 2006).

[3] J. Andersen and E. Seibel, “Real-time hazard detection via machine vision

for wearable low vision aids,” in Fifth International Symposium on Wearable

Computers (ISWC’01), pp. 182–183, 2001.

[4] C. Archambeau, J. Delbeke, C. Veraart, and M. Verleysen, “Prediction of

visual perceptions with artificial neural networks in a visual prosthesis for

the blind,” Artificial Intelligence in Medicine, vol. 32, no. 3, pp. 183–194,

2004.

[5] C. Archambeau, J. Delbeke, and M. Verleysen, “Classification of visual sen-

sations generated electrically in the visual field of the blind,” in Proceedings

of the 5th IFAC symposium on Modelling and Control in Biomedical Systems,

(Melbourne, Australia), pp. 223–228, 2003.

[6] J. D. Armstrong, “Evaluation of man-machine systems in the mobility of

the visually handicapped,” in Human factors in health care (R. Pickett and

T. Triggs, eds.), pp. 331–343, Lexington: Lexington Books, 1975.

[7] P. Arno, A. Vanlierde, E. Streel, M. Wanet-Defalque, S. Sanabria-Bohorquez,

and C. Veraart, “Auditory substitution of vision: pattern recognition by the

blind,” Applied Cognitive Psychology, vol. 15, no. 5, pp. 509–519, 2001.

[8] M. Bak, J. Girvin, F. Hambrecht, C. Kufta, G. Loeb, and E. Schmidt,

“Visual sensations produced by intracortical microstimulation of the human

occipital cortex,” Medical & Biological Engineering & Computing, vol. 28,

pp. 257–259, 1990.

[9] D. Ballard, “Generalizing the Hough transform to detect arbitrary shapes,”

Pattern Recognition, vol. 13, no. 2, pp. 111–122, 1981.

[10] J. Barron, D. Fleet, and S. Beauchemin, “Performance of optical flow tech-

niques,” International Journal of Computer Vision, vol. 12, no. 1, pp. 43–77,

1994.

[11] O. Baruth, R. Eckmiller, and D. Neumann, “Retina encoder tuning and data

encryption for learning retina implants,” in Proceedings of the International

Joint Conference on Neural Networks, vol. 2, pp. 1249–1252, 2003.

[12] M. Becker, M. Braun, and R. Eckmiller, “Retina implant adjustment with

reinforcement learning,” in Proceedings of the 1998 IEEE International Con-

ference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1181–1184,

1998.

[13] M. Becker, R. Eckmiller, and R. Hunermann, “Psychophysical test of a tun-

able retina encoder for retina implants,” in Proceedings of the International

Joint Conference on Neural Networks, vol. 1, pp. 192–195, 1999.

[14] B. Bentzen, “Environmental accessibility,” in Foundations of Orientation

and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York: American

Foundation for the Blind, 2nd ed., 1997.

[15] B. Bentzen and J. Barlow, “Impact of curb ramps on the safety of persons

who are blind,” Journal of Visual Impairment & Blindness, vol. 89, pp. 319–

328, 1995.

[16] K. Boahen, “A retinomorphic vision system,” IEEE Micro, vol. 16, no. 5,

pp. 30–39, 1996.

[17] R. G. Boothe, Perception of the visual environment. New York: Springer-

Verlag, 2002.

[18] J. Boyle, Improving Perception From Electronic Visual Prostheses. PhD

thesis, Queensland University of Technology, 2005.

[19] J. R. Boyle, A. J. Maeder, and W. W. Boles, “Can environmental knowledge

improve perception with electronic visual prostheses?,” in Proceedings of the

World Congress on Medical Physics and Biomedical Engineering (WC2003),

(Sydney, Australia), 2003.

[20] J. Boyle, A. Maeder, and W. Boles, “Inherent visual information for low

quality image presentation,” in Proceedings of the 2003 APRS workshop on

digital image computing, (Brisbane, Australia), pp. 51–56, 2003.

[21] J. Brabyn, “A review of mobility aids and means of assessment,” in Elec-

tronic Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.),

pp. 13–27, Dordrecht: Martinus Nijhoff Publishers, 1985.

[22] M. Brambring, “Mobility and orientation processes of the blind,” in Elec-



[23] G. S. Brindley, “Sensations produced by electrical stimulation of the occip-

ital poles of the cerebral hemispheres, and their use in constructing visual

prostheses,” Annals of the Royal College of Surgeons of England, vol. 47,

no. 2, pp. 106–108, 1970.

[24] V. Bruce, P. Green, and M. Georgeson, Visual perception. Psychology Press:

New York, 4th ed., 2003.

[25] C. J. C. Burges, “A tutorial on support vector machines for pattern recog-

nition,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167,

1998.

[26] J. Canny, “A computational approach to edge detection,” IEEE Transac-

tions on Pattern Analysis and Machine Intelligence, vol. 8, pp. 679–698,

1986.

[27] K. Cha, K. Horch, and R. Normann, “Reading speed with a pixelized vi-

sion system,” Journal of the Optical Society of America A-Optics & Image

Science, vol. 9, no. 5, pp. 673–677, 1992.

[28] K. Cha, K. Horch, and R. Normann, “Simulation of a phosphene field based

visual prosthesis,” in Proceedings of the IEEE International Conference on

Systems, Man and Cybernetics, pp. 921–923, 1990.

[29] K. Cha, K. Horch, and R. Normann, “Mobility performance with a pixelised

vision system,” Vision Research, vol. 32, no. 7, pp. 1367–1372, 1992.

[30] H. Chawla, Essential Ophthalmology. Edinburgh: Churchill Livingstone, 1981.

[31] S. C. Chen, L. Hallum, N. Lovell, and G. J. Suaning, “Visual acuity mea-

surement of prosthetic vision: a virtual-reality simulation study,” Journal of

Neural Engineering, vol. 2, pp. 135–145, 2005.

[32] X. Chen and A. Yuille, “A time-efficient cascade for real-time object de-

tection: With applications for the visually impaired,” in IEEE Computer

Society Conference on Computer Vision and Pattern Recognition, vol. 3,

pp. 28–28, 2005.

[33] A. Chow, “Artificial retina device.” Optobionics Corporation, 1991.

[34] A. Chow, “First trials and future technologies for artificial retinas,” in Pro-

ceedings of the 14th Annual Meeting of the IEEE Lasers and Electro-Optics

Society, vol. 2, pp. 734–735, 2001.

[35] A. Y. Chow and V. Y. Chow, “Subretinal electrical stimulation of the rabbit

retina,” Neuroscience Letters, vol. 225, no. 1, pp. 13–16, 1997.

[36] A. Chow, V. Chow, M. Pardue, G. Peyman, C. Liang, J. Perlman, and

N. Peachey, “The semiconductor-based microphotodiode array artificial sil-

icon retina,” in Proceedings of the IEEE International Conference on Sys-

tems, Man, and Cybernetics, vol. 4, pp. 404–408, 1999.

[37] A. Chow, M. Pardue, V. Chow, G. Peyman, C. Liang, J. Perlman, and

N. Peachey, “Implantation of silicon chip microphotodiode arrays into the cat

subretinal space,” IEEE Transactions on Neural Systems and Rehabilitation

Engineering, vol. 9, no. 1, pp. 86–95, 2001.

[38] A. Y. Chow and N. S. Peachey, “The subretinal microphotodiode array reti-

nal prosthesis,” Ophthalmic Research, vol. 30, pp. 195–198, 1998.

[39] V. Chowdhury, J. W. Morley, and M. T. Coroneo, “An in-vivo paradigm

for the evaluation of stimulating electrodes for use with a visual prosthesis,”

ANZ Journal of Surgery, vol. 74, no. 5, pp. 372–378, 2004.

[40] V. Chowdhury, J. W. Morley, and M. T. Coroneo, “Surface stimulation of

the brain with a prototype array for a visual cortex prosthesis,” Journal of

Clinical Neuroscience, vol. 11, no. 7, pp. 331–341, 2004.

[41] D. D. Clarke-Carter, A. D. Heyes, and C. Howarth, “The efficiency and walk-

ing speed of visually impaired people,” Ergonomics, vol. 29, no. 6, pp. 779–

789, 1986.

[42] M. Coimbra and M. E. Davies, “Approximating optical flow within the

MPEG-2 compressed domain,” IEEE Transactions on Circuits and Systems

for Video Technology, vol. 15, no. 1, pp. 96–100, 2005.

[43] B. Cyganek and J. Borgosz, Computer platform for transformation of vi-

sual information into sound sensations for vision impaired people, vol. 2626.

Springer: Berlin, 2003.

[44] M. Czerwinski, D. S. Tan, and G. G. Robertson, “Women take a wider view,”

in Proceedings of the SIGCHI conference on Human factors in computing

systems CHI ’02, (New York, USA), pp. 195–202, 2002.

[45] G. Dagnelie, “Toward an artificial eye,” IEEE Spectrum, vol. 33, no. 5,

pp. 20–29, 1996.

[46] G. Dagnelie, “Visual prosthetics 2006: Assessment and expectations,” Expert

Review of Medical Devices, vol. 3, no. 3, pp. 315–325, 2006.

[47] G. Dagnelie, D. Barnett, M. Humayun, and R. Thompson Jr., “Paragraph

text reading using a pixelized prosthetic vision simulator: Parameter depen-

dence and task learning in free-viewing conditions,” Investigative Ophthal-

mology and Visual Science, vol. 47, pp. 1241–1250, 2006.

[48] W. Dobelle, “Artificial vision for the blind by connecting a television camera

to the brain,” ASAIO Journal, vol. 46, no. 1, pp. 3–9, 2000.

[49] W. H. Dobelle and M. G. Mladejovsky, “Phosphenes produced by electrical

stimulation of human occipital cortex, and their application to the

development of a prosthesis for the blind,” The Journal of Physiology, vol. 243,

no. 2, pp. 553–576, 1974.

[50] W. Dobelle, M. Mladejovsky, J. Evans, T. Roberts, and J. Girvin, “‘Braille’

reading by a blind volunteer by visual cortex stimulation,” Nature, vol. 259,

no. 5539, pp. 111–112, 1976.

[51] A. Dodds, “Evaluating mobility aids: an evolving methodology,” in Elec-



[52] A. Dodds, Mobility Training for Visually Handicapped People: A Person-

Centred Approach. London: Croom Helm, 1988.

[53] A. Dodds, Rehabilitating Blind and Visually Impaired People. London:

Chapman and Hall, 1993.

[54] A. G. Dodds, D. D. Carter, and C. I. Howarth, “Improving objective

measures of mobility,” Journal of Visual Impairment & Blindness, vol. 77, no. 9,

p. 438, 1983.

[55] A. G. Dodds and D. P. Davis, “Assessment and training of low vision clients

for mobility,” Journal of Visual Impairment & Blindness, pp. 439–446, 1989.

[56] E. R. Dougherty, An introduction to morphological image processing. SPIE

Optical Engineering Press, 1992.

[57] J. Dowling, A. J. Maeder, and W. W. Boles, “Mobility assessment using

simulated artificial human vision,” in IEEE Computer Society Conference

on Computer Vision and Pattern Recognition (CVPR 2005), vol. 3, pp. 32–

32, 2005.

[58] R. Eckhorn, M. Wilms, T. Schanze, M. Eger, L. Hesse, U. T. Eysel, Z. F.

Kisvárday, E. Zrenner, F. Gekeler, and H. Schwahn, “Visual resolution with

retinal implants estimated from recordings in cat visual cortex,” Vision

Research, in press, 2006.

[59] R. Eckmiller, “Learning retina implants with epiretinal contacts,” Oph-

thalmic Research, vol. 29, pp. 281–289, 1997.

[60] R. Eckmiller, M. Becker, and R. Hunermann, “Dialog concepts for learning

retina encoders,” in Proceedings of the International Conference on Neural

Networks, vol. 4, pp. 2315–2320, 1997.

[61] R. Eckmiller, M. Becker, and R. Hunermann, “Towards a learning retina

implant with epiretinal contacts,” in Proceedings of the IEEE International

Conference on Systems, Man, and Cybernetics, vol. 4, pp. 396–399, 1999.

[62] M. Egmont-Petersen, D. de Ridder, and H. Handels, “Image processing with

neural networks-A review,” Pattern Recognition, vol. 35, no. 10, pp. 2279–

2301, 2002.

[63] M. R. Everingham, B. T. Thomas, T. Troscianko, and D. Easty, “A

neural-network virtual reality mobility aid for the severely visually impaired,” in 2nd

Annual Conference on Disability, Virtual Reality and Associated Technolo-

gies, pp. 183–192, 1998.

[64] L. Farmer and D. Smith, “Adaptive technology,” in Foundations of Ori-

entation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:

American Foundation for the Blind, 2nd ed., 1997.

[65] E. Fernandez, P. Ahnelt, P. Rabischong, C. Botella, F. Garcia-de Quiros,

P. Bonomini, C. Marin, R. Climent, J. Tormos, and R. Normann, “Towards

a cortical visual neuroprosthesis for the blind,” Proceedings of the 2nd In-

ternational Federation for Medical & Biological Engineering (IFMBE) Con-

ference, vol. 3, no. 2, pp. 1690–1691, 2002.

[66] J. Fernandez, A. Alfaro, P. Bonomini, J. Tormos, L. Concepcion, F. Pelayo,

and E. Fernandez, “Brain plasticity: feasibility of a cortical visual prosthesis

for the blind,” in Proceedings of the 25th Annual International Conference

of the IEEE Engineering in Medicine and Biology Society, vol. 3, pp. 2027–

2030, 2003.

[67] E. Fernandez, A. Alfaro, J. M. Tormos, R. Climent, M. Martinez, H. Vi-

lanova, V. Walsh, and A. Pascual-Leone, “Mapping of the human visual cor-

tex using image-guided transcranial magnetic stimulation,” Brain Research

Protocols, vol. 10, no. 2, pp. 115–124, 2002.

[68] S. Foran, J. J. Wang, E. Rochtchina, and P. Mitchell, “Projected number of

Australians with visual impairment in 2000 and 2030,” Clinical and

Experimental Ophthalmology, vol. 28, pp. 143–145, 2000.

[69] J. Fowler, “The next generation of mobility aid,” in 2nd Australasian orien-

tation and mobility conference, (Gold Coast, Australia), 2003.

[70] L. Fu, S. Cai, H. Zhang, G. Hu, and X. Zhang, “Psychophysics of reading

with a limited number of pixels: Towards the rehabilitation of reading ability

with visual prosthesis,” Vision Research, vol. 46, no. 8-9, pp. 1292–1301,

2006.

[71] F. Gekeler, H. Schwahn, A. Stett, K. Kohler, and E. Zrenner, “Subretinal

microphotodiodes to replace photoreceptor-function. A review of the current

state,” in Vision, sensations et environnement (M. Doly, M.-T. Droy, and

Y. Christen, eds.), pp. 77–95, Paris: Irvinn, 2001.

[72] D. R. Geruschat and W. de l’Aune, “Reliability and validity of O&M in-

structor observations,” Journal of Visual Impairment & Blindness, vol. 83,

pp. 457–60, 1989.

[73] D. Geruschat and A. J. Smith, “Low vision and mobility,” in Foundations of

Orientation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:


[74] D. Geruschat, K. A. Turano, and J. Stahl, “Traditional measures of mobility

performance and retinitis pigmentosa,” Optometry and Vision Science, vol. 75,

no. 7, pp. 525–537, 1998.

[75] M. Ghanbari, Video coding: an introduction to standard codecs. London:

The Institute of Electrical Engineers, 1999.

[76] J. J. Gibson, The Perception of the Visual World. Boston: Houghton-Mifflin,

1950.

[77] J. J. Gibson, The senses considered as perceptual systems. Massachusetts:

Houghton-Mifflin, 1966.

[78] J. J. Gibson, “The theory of affordances,” in Perceiving, acting, and know-

ing: toward an ecological psychology (R. Shaw and J. Bransford, eds.), New

Jersey: Lawrence Erlbaum Associates, 1977.

[79] J. J. Gibson, The ecological approach to visual perception. Hillsdale, NJ:

Lawrence Erlbaum Associates, 1979.

[80] B. Girod, “What’s wrong with mean-squared error?,” in Digital Images and

Human Vision (A. Watson, ed.), Cambridge: MIT Press, 1993.

[81] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Reading, Mas-

sachusetts: Addison-Wesley, 1992.

[82] J. Gothe, S. A. Brandt, K. Irlbacher, S. Roricht, B. A. Sabel, and B.-U.

Meyer, “Changes in visual cortex excitability in blind subjects as demon-

strated by transcranial magnetic stimulation,” Brain, vol. 125, no. 3,

pp. 479–490, 2002.

[83] R. L. Gregory, Eye and Brain: The Psychology of Seeing. Tokyo: Oxford

University Press, 5th ed., 1998.

[84] S. Grigorescu, N. Petkov, and P. Kruizinga, “Comparison of texture features

based on Gabor filters,” IEEE Transactions on Image Processing, vol. 11,

no. 10, pp. 1160–1167, 2002.

[85] E. Guenther, B. Troger, B. Schlosshauer, and E. Zrenner, “Long-term sur-

vival of retinal cell cultures on retinal implant materials,” Vision Research,

vol. 39, no. 24, pp. 3988–3994, 1999.

[86] D. Guth and R. LaDuke, “Veering by blind pedestrians: Individual differ-

ences and their implications for instruction,” Journal of Visual Impairment

& Blindness, vol. 89, pp. 28–37, 1995.

[87] D. A. Guth and J. J. Rieser, “Perception and the control of locomotion

by blind and visually impaired pedestrians,” in Foundations of Orientation

and Mobility (B. B. Blasch and W. R. Weiner, eds.), pp. 9–39, New York:


[88] L. E. Hallum, D. S. Taubman, G. J. Suaning, J. W. Morley, and N. H. Lovell,

“A filtering approach to artificial vision: A phosphene visual tracking task,”

in Proceedings of the World Congress on Medical Physics and Biomedical

Engineering, (Sydney, Australia), 2003.

[89] L. Hallum, G. Tsafnet, N. Lovell, and G. Suaning, “Artificial vision for the

blind,” Australasian Science, vol. 30, no. 1, pp. 21–23, 2003.

[90] F. T. Hambrecht, “The history of neural stimulation and its relevance to

future neural prostheses,” in Neural Prostheses: Fundamental Studies (W. F.

Agnew and D. B. McCreery, eds.), New Jersey: Prentice Hall, 1990.

[91] H. Hammerle, K. Kobuch, K. Kohler, W. Nisch, H. Sachs, and M. Stel-

zle, “Biostability of micro-photodiode arrays for subretinal implantation,”

Biomaterials, vol. 23, no. 3, pp. 797–804, 2002.

[92] D. T. Hartong, F. F. Jorritsma, J. J. Neve, B. J. M. Melis-Dankers, and A. C.

Kooijman, “Improved mobility and independence of night-blind people using

night-vision goggles,” Investigative Ophthalmology & Visual Science, vol. 45,

no. 6, pp. 1725–1731, 2004.

[93] J. S. Hayes, V. T. Yin, D. Piyathaisere, J. D. Weiland, M. S. Humayun, and

G. Dagnelie, “Visually guided performance of simple tasks using simulated

prosthetic vision,” Artificial Organs, vol. 27, no. 11, pp. 1016–1028, 2003.

[94] S. Haymes, D. Guest, A. Heyes, and A. Johnston, “Comparison of functional

mobility performance with clinical vision measures in simulated retinitis pig-

mentosa,” Optometry and Vision Science, vol. 71, no. 7, pp. 442–453, 1994.

[95] S. Haymes, D. Guest, A. Heyes, and A. Johnston, “Mobility of people with

retinitis pigmentosa as a function of vision and psychological variables,”

Optometry and Vision Science, vol. 73, no. 10, pp. 621–637, 1996.

[96] B. Heisele, “Visual object recognition with supervised learning,” IEEE In-

telligent Systems, vol. 18, no. 3, pp. 38–42, 2003.

[97] L. Hesse, T. Schanze, M. Wilms, and M. Eger, “Implantation of retina stim-

ulation electrodes and recording of electrical stimulation responses in the

visual cortex of the cat,” Graefe’s Archive for Clinical and Experimental

Ophthalmology, vol. 238, no. 10, pp. 840–845, 2000.

[98] A. D. Heyes, “The sonic pathfinder - a new travel aid for the blind,” in High

technology aids for the disabled (W. J. Perkins, ed.), pp. 165–171, London:

Butterworth, 1983.

[99] T. Heyes, “The sonic pathfinder: An electronic travel aid for the vision

impaired.” http://www.sonicpathfinder.org/, (accessed July 2006).

[100] A. D. Heyes, A. G. Dodds, D. D. C. Carter, and C. I. Howarth, “Evaluation

of the mobility of blind pedestrians,” in High technology aids for the disabled

(W. J. Perkins, ed.), pp. 14–19, London: Butterworth, 1983.

[101] J. Hill and J. Black, “The miniguide: A new electronic travel device,”

Journal of Visual Impairment & Blindness, vol. 97, no. 10, pp. 655–656,

2003.

[102] E. Hill, J. Rieser, M. Hill, M. Hill, J. Halpin, and R. Halpin, “How persons

with visual impairments explore novel spaces: Strategies of good and poor

performers,” Journal of Visual Impairment & Blindness, vol. 87, 1993.

[103] B. K. P. Horn and B. G. Schunck, “Determining optical flow,” Artificial

Intelligence, vol. 17, no. 1, pp. 185–203, 1981.

[104] S. Horowitz and T. Pavlidis, “Picture segmentation by a tree traversal

algorithm,” Journal of the ACM, vol. 23, no. 2, pp. 368–388, 1976.

[105] D. H. Hubel, “Exploration of the primary visual cortex, 1955-78,” in Cogni-

tive Neuroscience: A reader (M. S. Gazzaniga, ed.), Massachusetts: Black-

well, 2000.

[106] M. S. Humayun, “Is surface electrical stimulation of the retina a feasible

approach towards the development of a visual prosthesis?,” PhD thesis, Uni-

versity of North Carolina at Chapel Hill, 1992.

[107] M. S. Humayun and E. de Juan, “Artificial vision,” Eye, vol. 12, pp. 605–

607, Jun 1998.

[108] M. Humayun, E. De Juan Jr., G. Dagnelie, R. Greenberg, R. Propst, and

D. Phillips, “Visual perception elicited by electrical stimulation of retina in

blind humans,” Archives of Ophthalmology, vol. 114, no. 1, pp. 40–46, 1996.

[109] M. S. Humayun, E. de Juan Jr., J. D. Weiland, G. Dagnelie, S. Katona,

R. Greenberg, and S. Suzuki, “Pattern electrical stimulation of the human

retina,” Vision Research, vol. 39, pp. 2569–2576, 1999.

[110] M. S. Humayun, Y. Sato, R. Propst, and E. de Juan Jr, “Can potentials

from the visual cortex be elicited electronically despite severe retinal de-

generation and a markedly reduced electroretinogram?,” German Journal of


[111] M. S. Humayun, J. D. Weiland, G. Y. Fujii, R. Greenberg, R. Williamson,

J. Little, B. Mech, V. Cimmarusti, G. Van Boemel, and G. Dagnelie, “Visual

perception in a blind subject with a chronic microelectronic retinal prosthe-

sis,” Vision Research, vol. 43, no. 24, pp. 2573–2581, 2003.

[112] Y. Ito, T. Yagi, H. Kanda, S. Tanaka, M. Watanabe, and Y. Uchikawa,

“Cultures of neurons on micro-electrode array in hybrid retinal implant,”

in Proceedings of the IEEE International Conference on Systems, Man, and

Cybernetics, vol. 4, pp. 414–417, 1999.

[113] B. Jähne and H. Haußecker, eds., Computer Vision and Applications. Aca-

demic Press: San Diego, 2000.

[114] A. Jain, R. Duin, and J. Mao, “Statistical pattern recognition: a review,”

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22,

no. 1, pp. 4–37, 2000.

[115] G. Jansson, “Development and evaluation of mobility aids for the visually

handicapped,” in Development of electronic aids for the visually impaired

(P. Emiliani, ed.), Dordrecht: Martinus Nijhoff, 1986.

[116] L. Johnson, F. K. Perkins, T. O’Hearn, P. Skeath, C. Merritt, J. Frieble,

S. Sadda, M. Humayun, and D. Scribner, “Electrical stimulation of isolated

retina with microwire glass electrodes,” Journal of Neuroscience Methods,

vol. 137, no. 2, pp. 265–273, 2004.

[117] T. Jones and T. Troscianko, “Mobility performance of low-vision adults

using an electronic mobility aid,” Clinical and Experimental Optometry,

vol. 89, no. 1, pp. 10–17, 2006.

[118] H. Kanda, T. Morimoto, T. Fujikado, Y. Tano, Y. Fukuda, and H. Sawai,

“Electrophysiological studies of the feasibility of suprachoroidal-transretinal

stimulation for artificial vision in normal and RCS rats,” Investigative

Ophthalmology & Visual Science, vol. 45, no. 2, pp. 560–566, 2004.

[119] H. Kanda, T. Yagi, Y. Ito, S. Tanaka, M. Watanabe, and Y. Uchikawa, “Ef-

ficient stimulation inducing neural activity in retinal implant,” Proceedings

of IEEE Systems, Man, and Cybernetics Conference, vol. 4, pp. 409–413,

1999.

[120] H. Kanda, T. Yagi, T. Nakatsu, M. Watanabe, and Y. Uchikawa, “A study

on electrical stimulation to visual nervous system in visual prosthesis,” in

Proceedings of the 26th Annual Conference of the IEEE, vol. 1, pp. 108–113, 2000.

[121] L. Konig, “The laser long cane,” in 2nd Australasian orientation and mo-

bility conference, (Gold Coast, Australia), 2003.

[122] T. Kuyk, J. L. Elliott, J. Biehl, and P. S. Fuhr, “Environmental variables

and mobility performance in adults with low vision,” Journal of the Ameri-

can Optometric Association, vol. 67, no. 7, pp. 403–409, 1996.

[123] T. Kuyk, J. L. Elliott, and P. S. Fuhr, “Visual correlates of mobility in

real world settings in older adults with low vision,” Optometry and Vision

Science, vol. 75, pp. 538–47, Jul 1998.

[124] P. Larcombe, “Tactile ground surface indicators and the law,” in 2nd Aus-

tralasian orientation and mobility conference, (Gold Coast, Australia), 2003.

[125] J. Li, A. Najmi, and R. M. Gray, “Image classification by a two dimensional

hidden Markov model,” IEEE Transactions on Signal Processing, vol. 48,

no. 2, pp. 517–533, 2000.

[126] R. Li, X. Zhang, and G. Hu, “A computational pixelization model based on

selective attention for artificial visual prosthesis,” Lecture Notes in Computer

Science, pp. 654–662, 2005.

[127] D. S. H. Ling, H. Hsu, G. C. Lin, and S. Lee, “Enhanced image-based

coordinate measurement using a super-resolution method,” Robotics and

Computer-Integrated Manufacturing, vol. 21, no. 6, pp. 579–588, 2005.

[128] W. Liu, E. McGucken, R. Cavin, M. Clements, K. Vichienchom, C. Demarco,

M. Humayun, E. de Juan, J. Weiland, and R. Greenberg, “A retinal

prosthesis to benefit the visually impaired,” in Intelligent Systems and Tech-

nologies in Rehabilitation Engineering (H.-N. L. Teodorescu and L. C. Jain,

eds.), Boca Raton, Florida: CRC Press, 2001.

[129] W. Liu, E. McGucken, K. Vichienchom, S. Clements, S. Demarco, M. Hu-

mayun, E. de Juan, J. Weiland, and R. Greenberg, “Retinal prosthesis to aid

the visually impaired,” in Proceedings of the IEEE International Conference

on Systems, Man, and Cybernetics, vol. 4, pp. 364–369, 1999.

[130] W. Liu, E. McGucken, K. Vichienchom, M. Clements, E. de Juan, and

M. Humayun, “Dual unit visual intraocular prosthesis,” in Proceedings of the

19th Annual International Conference of the IEEE Engineering in Medicine

and Biology society, vol. 5, pp. 2303–2306, 1997.

[131] W. Liu, M. Sivaprakasam, P. R. Singh, R. Bashirullah, and G. Wang, “Elec-

tronic visual prosthesis,” Artificial Organs, vol. 27, no. 11, pp. 986–995, 2003.

[132] J. I. Loewenstein, S. R. Montezuma, and J. F. Rizzo III, “Outer reti-

nal degeneration: An electronic retinal prosthesis as a treatment strategy,”

Archives of Ophthalmology, vol. 122, no. 4, pp. 587–596, 2004.

[133] P. C. Loizou, “Introduction to cochlear implants,” IEEE Signal Processing

Magazine, vol. 15, no. 5, pp. 101–130, 1998.

[134] R. Long and E. Hill, “Establishing and maintaining orientation for mobil-

ity,” in Foundations of Orientation and Mobility (B. B. Blasch and W. R.

Weiner, eds.), New York: American Foundation for the Blind, 2nd ed., 1997.

[135] R. Long, J. Rieser, and E. Hill, “Mobility in individuals with moderate

visual impairments,” Journal of Visual Impairment & Blindness, vol. 84,

1990.

[136] J. M. Loomis, R. L. Klatzky, and R. G. Golledge, “Navigating without

vision: Basic and applied research,” Optometry and Vision Science, vol. 78,

no. 5, pp. 282–289, 2001.

[137] J. Lovie-Kitchin, J. Mainstone, J. Robinson, and B. Brown, “What areas of

the visual field are important for mobility in low vision patients?,” Clinical

Vision Sciences, vol. 5, no. 3, 1990.

[138] G. Luger and W. Stubblefield, Artificial Intelligence: Structures and

strategies for complex problem solving. Benjamin/Cummings: California, 2nd ed.,

1993.

[139] W. M. Mace, “James J. Gibson’s strategy for perceiving: Ask not what’s

inside your head, but what your head is inside of,” in Perceiving, acting, and

knowing: toward an ecological psychology (R. Shaw and J. Bransford, eds.),

pp. 43–66, New Jersey: Lawrence Erlbaum Associates, 1977.

[140] A. B. Majji, M. S. Humayun, J. D. Weiland, S. Suzuki, S. A. D’Anna, and

E. de Juan Jr., “Long-term histological and electrophysiological results

of an inactive epiretinal electrode array implantation in dogs,” Investigative

Ophthalmology & Visual Science, vol. 40, no. 9, pp. 2073–2081, 1999.

[141] H. A. Mallot, Computational vision : Information processing in perception

and visual behavior. Cambridge, Mass.: MIT Press, 2000.

[142] R. E. Marc, B. W. Jones, C. B. Watt, and E. Strettoi, “Neural remodeling

in retinal degeneration,” Progress in Retinal and Eye Research, vol. 22, no. 5,

pp. 607–655, 2003.

[143] E. Margalit, M. Maia, J. D. Weiland, R. J. Greenberg, G. Y. Fujii, G. Tor-

res, D. V. Piyathaisere, T. M. O’Hearn, W. Liu, and G. Lazzi, “Retinal

prosthesis for the blind,” Survey of Ophthalmology, vol. 47, no. 4, pp. 335–

356, 2002.

[144] D. Marr, Vision. San Francisco, USA: W. H. Freeman, 1982.

[145] J. Marron and I. Bailey, “Visual factors and orientation-mobility perfor-

mance,” American Journal of Optometry and Physiological Optics, vol. 59,

no. 5, 1982.

[146] M. Mattar, A. Hanson, and E. Learned-Miller, “Sign classification using lo-

cal and meta-features,” in IEEE Computer Society Conference on Computer

Vision and Pattern Recognition, vol. 3, pp. 26–26, 2005.

[147] E. M. Maynard, “Visual prostheses,” Annual Review of Biomedical Engi-

neering, vol. 3, no. 1, pp. 145–168, 2001.

[148] E. M. Maynard, C. T. Nordhausen, and R. A. Normann, “The Utah intra-

cortical electrode array: A recording structure for potential brain-computer

interfaces,” Electroencephalography and Clinical Neurophysiology, vol. 102,

no. 3, pp. 228–239, 1997.

[149] P. B. Meijer, “An experimental system for auditory image representations,”

IEEE Transactions on Biomedical Engineering, vol. 39, no. 2, pp. 112–121,

1992.

[150] P. B. Meijer, “Vision technology for the totally blind.”

http://www.seeingwithsound.com/, (accessed July 2006).

[151] N. Molton, S. Se, M. Brady, D. Lee, and P. Probert, “Robotic sensing for

the partially sighted,” Robotics and Autonomous Systems Journal, vol. 26,

no. 3, pp. 185–201, 1999.

[152] G. Naik and A. Regalado, “An inventor struggles to restore sight,” Wall

Street Journal, p. B.1, August 27 2003.

[153] K. Nakayama, “James J. Gibson - an appreciation,” in Cognitive Neuro-

science (M. S. Gazzaniga, ed.), Oxford: Blackwell, 2000.

[154] V. S. Nalwa, A guided tour of computer vision. Reading, Massachusetts:

Addison-Wesley, 1993.

[155] National Institute of Standards and Technology, “Text REtrieval Conference

(TREC).” http://trec.nist.gov/, (accessed July 2006).

[156] C. Noback, N. Strominger, R. Demarest, and D. Ruggiero, The Human

Nervous System. Humana Press: New Jersey, 6th ed., 2005.

[157] R. Normann, “A penetrating, cortical electrode array: design considera-

tions,” in Proceedings of IEEE International Conference on Systems, Man

and Cybernetics, pp. 918–920, 1990.

[158] R. Normann, “Visual neuroprosthetics-functional vision for the blind,”

IEEE Engineering in Medicine and Biology Magazine, vol. 14, no. 1, pp. 77–

83, 1995.

[159] R. Normann, E. Maynard, K. Guillory, and D. Warren, “Cortical implants

for the blind,” IEEE Spectrum, vol. 33, no. 5, pp. 54–59, 1996.

[160] R. Normann, D. Warren, and A. Koulakov, “Representations and dynamics

of representations of simple visual stimuli by ensembles of neurons in cat

visual cortex studied with a microelectrode array,” in Proceedings of the

First International IEEE EMBS Conference on Neural Engineering, pp. 91–

94, 2003.

[161] W. Osberger, Perceptual vision models for picture quality assessment and

compression applications. PhD thesis, Queensland University of Technology,

1999.

[162] W. Osberger and A. Maeder, “Automatic identification of perceptually im-

portant regions in an image using a model of the human vision system,”

Proceedings of the 14th International Conference on Pattern Recognition,

pp. 701–704, 1998.

[163] W. Osberger and A. Rohaly, “Automatic detection of regions of interest

in complex video sequences,” in Human Vision and Electronic Imaging VI,

vol. 4299, pp. 361–372, Bellingham, USA: SPIE - The International Society

for Optical Engineering, 2001.

[164] D. Palanker, P. Huie, A. Vankov, Y. Freyvert, H. Fishman, M. Marmor, and

M. Blumenkranz, “Attracting retinal cells to electrodes for high-resolution

stimulation,” in Ophthalmic Technologies, (SPIE vol. 5314), pp. 306–313,

2004.

[165] M. T. Pardue, M. J. Phillips, H. Yin, B. Sippy, S. Webb-Wood, A. Y.

Chow, and S. L. Ball, “Neuroprotective effect of subretinal implants in the

RCS rat,” Investigative Ophthalmology & Visual Science, vol. 46, pp. 674–

682, 2004.

[166] M. T. Pardue, E. B. Stubbs, Jr., J. I. Perlman, K. Narfstrom, A. Y. Chow,

and N. S. Peachey, “Immunohistochemical studies of the retina following

long-term implantation with subretinal microphotodiode arrays,” Experi-

mental Eye Research, vol. 73, no. 3, pp. 333–343, 2001.

[167] T. D. Parsons, P. Larson, K. Kratz, M. Thiebaux, B. Bluestein, J. G.

Buckwalter, and A. A. Rizzo, “Sex differences in mental rotation and spatial

rotation in a virtual environment,” Neuropsychologia, vol. 42, no. 4, pp. 555–

562, 2004.

[168] R. Passini, A. Dupre, and C. Langlois, “Spatial mobility of the visually

handicapped active person: A description study,” Journal of Visual Impair-

ment & Blindness, pp. 904–907, 1986.

[169] I. Patel, K. A. Turano, A. T. Broman, K. Bandeen-Roche, B. Munoz, and

S. K. West, “Measures of visual function and percentage of preferred walking

speed in older adults: The Salisbury Eye Evaluation project,” Investigative

Ophthalmology & Visual Science, vol. 47, pp. 65–71, Jan 2006.

[170] F. Pelayo, A. Martinez, S. Romero, C. Morillas, E. Ros, and E. Fer-

nandez, “Cortical visual neuro-prosthesis for the blind: Retina-like soft-

ware/hardware preprocessor,” in Proceedings of the First International IEEE

EMBS Conference on Neural Engineering, pp. 150–153, 2003.

[171] F. Pelayo, S. Romero, C. Morillas, A. Martinez, E. Ros, and E. Fernan-

dez, “Translating image sequences into spike patterns for cortical neuro-

stimulation,” Neurocomputing, vol. 58-60, pp. 885–892, 2003.

[172] D. G. Pelli, “The visual requirements of mobility,” in Low Vision: Princi-

ples and Applications (G. C. Woo, ed.), pp. 134–146, New York: Springer-

Verlag, 1986.

[173] M. C. Peterman, D. M. Bloom, C. Lee, S. F. Bent, M. F. Marmor, M. S.

Blumenkranz, and H. A. Fishman, “Localized neurotransmitter release for

use in a prototype retinal interface,” Investigative Ophthalmology & Visual

Science, vol. 44, no. 7, pp. 3144–3149, 2003.

[174] M. C. Peterman, N. Z. Mehenti, K. V. Bilbao, C. J. Lee, T. Leng,

J. Noolandi, S. F. Bent, M. S. Blumenkranz, and H. A. Fishman, “The

artificial synapse chip: A flexible retinal interface based on directed retinal

cell growth and neurotransmitter stimulation,” Artificial Organs, vol. 27,

no. 11, pp. 975–985, 2003.

[175] J. Pezaris and R. Reid, “Microstimulation in LGN produces focal visual

percepts,” Journal of Vision, vol. 5, no. 8, p. 367, 2005.

[176] G. Phillips, “Gpd research.” http://www.gpd-research.com.au/, (accessed

July 2006).

[177] C. Poirier, M.-A. Richard, D. T. Duy, and C. Veraart, “Assessment of

sensory substitution prosthesis - Potentialities in minimalist conditions of

learning,” Applied Cognitive Psychology, vol. 20, no. 4, pp. 447–460, 2006.

[178] D. A. Pollen and S. Ronner, “Visual cortical neurons as localized spatial

frequency filters,” IEEE Transactions on Systems, Man, and Cybernetics,

vol. 13, pp. 907–916, 1983.

[179] W. K. Pratt, Digital Image Processing: PIKS Inside. John Wiley & Sons,

Inc., 3rd ed., 2001.

[180] A. S. Reber, Dictionary of Psychology. London: Penguin, 2nd ed., 1995.

[181] J. F. Rizzo, S. Miller, T. Denison, and J. Wyatt, “Electrically-evoked cor-

tical potentials from stimulation of rabbit retina with a microfabricated

electrode array (abstract),” Investigative Ophthalmology & Visual Science,

vol. 37:S707, 1996.

[182] J. F. Rizzo and J. Wyatt, “Prospects for a visual prosthesis,” The Neuro-

scientist, vol. 3, no. 4, pp. 251–262, 1997.

[183] J. F. Rizzo, J. Wyatt, M. Humayun, E. de Juan, W. Liu, A. Chow, R. Eckmiller,

E. Zrenner, T. Yagi, and G. Abrams, “Retinal prosthesis: An encouraging

first decade with major challenges ahead,” Ophthalmology, vol. 108,

no. 1, pp. 13–14, 2001.

[184] J. Rizzo, J. Wyatt, J. Loewenstein, S. Kelly, and D. Shire, “Methods and

perceptual thresholds for short-term electrical stimulation of human retina

with microelectrode arrays,” Investigative Ophthalmology & Visual Science,

vol. 44, no. 12, pp. 5355–5361, 2003.

[185] J. Rizzo, J. Wyatt, J. Loewenstein, S. Kelly, and D. Shire, “Perceptual

efficacy of electrical stimulation of human retina with a microelectrode ar-

ray during short-term surgical trials,” Investigative Ophthalmology & Visual

Science, vol. 44, no. 12, pp. 5362–5369, 2003.

[186] J. Roerdink and A. Meijster, “The watershed transform: Definitions, algorithms

and parallelization strategies,” Fundamenta Informaticae, vol. 41,

pp. 187–228, 2000.

[187] S. F. Ronner, “Electrical excitation of CNS neurons,” in Neural Prostheses:

Fundamental Studies (W. F. Agnew and D. B. McCreery, eds.), New Jersey:

Prentice Hall, 1990.

[188] S. Rosen, “Kinesiology and sensorimotor function,” in Foundations of Ori-

entation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:


[189] D. Ross, “Wearable computers as a virtual environment interface for people

with visual impairment,” Virtual Reality, vol. 3, pp. 212–221, 1998.

[190] P. Rousche and R. Normann, “A system for impact insertion of a 100 elec-

trode array into cortical tissue,” in Proceedings of the Twelfth Annual In-

ternational Conference of the IEEE Engineering in Medicine and Biology

Society, pp. 494–495, 1990.

[191] J. Russ, The Image Processing Toolkit. CRC Press, 2002.

[192] W. G. Sannita, L. Narici, and P. Picozza, “Positive visual phenomena in

space: A scientific case and a safety issue in space travel,” Vision Research,

vol. 46, no. 14, pp. 2159–2165, 2006.

[193] B. N. Schenkman and G. Jansson, “The detection and localization of objects

by the blind with the aid of long-cane tapping sounds,” Human Factors,

vol. 28, no. 5, pp. 607–618, 1986.

[194] E. M. Schmidt, M. Bak, F. Hambrecht, C. Kufta, D. K. O’Rourke, and

P. Vallabhanath, “Feasibility of a visual prosthesis for the blind based on

intracortical microstimulation of the visual cortex,” Brain, vol. 119, no. 2,

pp. 507–522, 1996.

[195] M. B. Schubert, A. Hierzenberger, H. J. Lehner, and J. H. Werner, “Op-

timizing photodiode arrays for the use as retinal implants,” Sensors and

Actuators A: Physical, vol. 74, no. 1-3, pp. 193–197, 1999.

[196] H. N. Schwahn, F. Gekeler, K. Kohler, K. Kobuch, H. G. Sachs, F. Schul-

meyer, W. Jakob, V. P. Gabel, and E. Zrenner, “Studies on the feasibility of a

subretinal visual prosthesis: Data from Yucatan micropig and rabbit,” Graefe’s

Archive for Clinical and Experimental Ophthalmology, vol. 239, no. 12,

pp. 961–967, 2001.

[197] S. Se and M. Brady, “Vision-based detection of stair-cases,” in Proceedings

of Fourth Asian Conference on Computer Vision, vol. 1, (Taipei), pp. 535–

540, 2000.

[198] M. Seul, L. O’Gorman, and M. Sammon, Practical algorithms for image

analysis. Cambridge University Press, 2000.

[199] P. Sharp and R. Phillips, “Physiological optics,” in The Perception of Visual

Information (W. Hendee and P. Wells, eds.), Springer: New York, 2nd ed.,

1997.

[200] C. A. Shingledecker and E. Foulke, “A human factors approach to the

assessment of mobility of blind pedestrians,” Human Factors, vol. 20, no. 3,

pp. 273–286, 1978.

[201] S. Shoval, I. Ulrich, and J. Borenstein, “Computerized obstacle avoidance

systems for the blind and visually impaired,” in Intelligent Systems and

Technologies in Rehabilitation Engineering (H. N. L. Teodorescu and L. C.

Jain, eds.), pp. 414–448, CRC Press, 2000.

[202] R. Siegel, “Hallucinations,” Scientific American, vol. 237, no. 4, pp. 132–

140, 1977.

[203] P. Silapachote, J. Weinman, A. Hanson, and M. Mattar, “Automatic sign

detection and recognition in natural scenes,” in IEEE Computer Society

Conference on Computer Vision and Pattern Recognition (CVPR 2005),

vol. 3, pp. 27–27, 2005.

[204] M. Snaith, D. Lee, and P. Probert, “A low-cost system using sparse vision

for navigation in the urban environment,” Image and Vision Computing,

vol. 16, no. 4, pp. 223–292, 1998.

[205] W. Snyder and H. Qi, Machine Vision. Cambridge University Press,

2004.

[206] M. Sonka, V. Hlavac, and R. Boyle, Image processing, analysis, and machine

vision. California: Brooks/Cole, 2nd ed., 1999.

[207] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Preferred walking speed

for assessment of mobility performance: Sighted guide versus non-sighted

guide techniques,” Clinical and Experimental Optometry, vol. 83, no. 5,

pp. 279–282, 2000.

[208] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Does mobility perfor-

mance of visually impaired adults improve immediately after orientation and

mobility training?,” Optometry and Vision Science, vol. 78, no. 9, pp. 657–

66, 2001.

[209] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Measurement of pre-

ferred walking speed in subjects with central and peripheral vision loss,”

Ophthalmic and Physiological Optics, vol. 24, no. 4, pp. 291–295, 2004.

[210] U. H. M. Spandau, S. Wechsler, and A. Blankenagel, “Testing night vi-

sion goggles in a dark outside environment,” Optometry and Vision Science,

vol. 79, no. 1, pp. 39–45, 2002.

[211] M. V. Srinivasan, J. S. Chahl, K. Weber, S. Venkatesh, M. G. Nagle, and

S. W. Zhang, “Robot navigation inspired by principles of insect vision,”

Robotics and Autonomous Systems, vol. 26, no. 2-3, pp. 203–216, 1999.

[212] R. Srinivasan and K. Rao, “Predictive coding based on efficient motion estimation,”

IEEE Transactions on Communications, vol. 33, no. 8, pp. 888–

896, 1985.

[213] A. Stett, W. Barth, S. Weiss, H. Haemmerle, and E. Zrenner, “Electrical

multisite stimulation of the isolated chicken retina,” Vision Research, vol. 40,

no. 13, pp. 1785–1795, 2000.

[214] E. R. Strelow, “What is needed for a theory of mobility: Direct perception

and cognitive maps - lessons from the blind,” Psychological Review, vol. 92,

no. 2, pp. 226–248, 1985.

[215] G. J. Suaning, L. E. Hallum, S. C. Chen, P. J. Preston, and N. H. Lovell,

“Phosphene vision: Development of a portable visual prosthesis system for

the blind,” in Proceedings of the 25th Annual International Conference of

the IEEE/EMBS, (Cancun, Mexico), 2003.

[216] G. J. Suaning and N. H. Lovell, “A 100 channel neural stimulator for ex-

citation of retinal ganglion cells,” Proceedings of the 20th Annual Interna-

tional Conference of the IEEE Engineering in Medicine and Biology Society,

vol. 20, no. 4, pp. 2232–2235, 1998.

[217] G. Suaning and N. Lovell, “CMOS neurostimulation system with 100 chan-

nels, scaleable output and bi-directional radio frequency telemetry,” IEEE

Transactions on Biomedical Engineering, vol. 48, no. 2, pp. 248–260, 2001.

[218] G. Suaning, N. Lovell, and Y. Kerdraon, “Physiological response in ovis

aries resulting from electrical stimuli delivered by an implantable vision pros-

thesis,” in Proceedings of the 23rd Annual International Conference of the

IEEE Engineering in Medicine and Biology Society, vol. 2, pp. 1419–1422,

2001.

[219] G. Suaning, N. Lovell, and Y. Kerdraon, “Trans-retinal electrical stimu-

lation using a neuroprosthesis: The effects of damage to the r-membrane,”

in Proceedings of the Second Joint Annual Conference and the Annual Fall

Meeting of the Biomedical Engineering Society, vol. 3, pp. 2091–2092, 2002.

[220] G. J. Suaning, N. H. Lovell, and C. Y. Kwok, “Fabrication of platinum

spherical electrodes in an intra-ocular prosthesis using high-energy electrical

discharge,” Sensors and Actuators A: Physical, vol. 108, no. 1-3, pp. 155–

161, 2003.

[221] G. Suaning, N. Lovell, K. Schindhelm, and A. Coroneo, “The bionic eye

(electronic visual prosthesis): A review,” Australian and New Zealand Journal

of Ophthalmology, vol. 26, no. 3, pp. 195–202, 1998.

[222] Talking Signs Inc, “Talking signs infrared communications system.”

http://www.talkingsigns.com/, 2003.

[223] I. Tanaka, T. Murakami, and O. Shimizu, “Heart rate as an objective measure

of stress in mobility,” Visual Impairment and Blindness, vol. 75, no. 2,

pp. 55–60, 1981.

[224] X. Tang, “Texture information in run-length matrices,” IEEE Transactions

on Image Processing, vol. 7, no. 11, pp. 1602–1609, 1998.

[225] G. E. Tassicker. U.S. patent 2,760,483, 1956.

[226] R. W. Thompson, G. D. Barnett, M. Humayun, and G. Dagnelie, “Facial

recognition using simulated prosthetic pixelized vision,” Investigative

Ophthalmology & Visual Science, vol. 44, no. 11, pp. 5035–5042, 2003.

[227] S. Thorpe, “Image processing by the human visual system,” tech. rep.,

Eurographics ’90 : Image Processing by the Human Visual System, 1990.

[228] J. T. Tou and M. Adjouadi, “Computer vision for the blind,” in Electronic

Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.), pp. 83–

124, Dordrecht: Martinus Nijhoff Publishers, 1985.

[229] G. Trick, “Artificial vision: What are we hoping to restore?,” in The Eye

and The Chip 2004: World Congress on Artificial Vision, (Detroit, Michigan,

USA), 2004.

[230] P. Troyk, M. Bak, J. Berg, D. Bradley, S. Cogan, R. Erickson, C. Kufta,

D. McCreery, E. Schmidt, and V. Towle, “A model for intracortical visual

prosthesis research,” Artificial Organs, vol. 27, no. 11, pp. 1005–1015, 2003.

[231] P. Troyk and M. Schwan, “Closed-loop class E transcutaneous power and

data link for microimplants,” IEEE Transactions on Biomedical Engineering,

vol. 39, no. 6, pp. 589–599, 1992.

[232] E. Trucco and A. Verri, Introductory techniques for 3-D computer vision.

New Jersey: Prentice-Hall, 1998.

[233] K. Turano, A. Broman, K. Bandeen-Roche, B. Munoz, G. Rubin, S. West,

and SEE Project Team, “Association of visual field loss and mobility performance

in older adults: Salisbury Eye Evaluation study,” Optometry and

Vision Science, vol. 81, no. 5, pp. 298–307, 2004.

[234] M. Uddin and T. Shioyama, “Detection of pedestrian crossing using bipo-

larity feature - an image based approach,” IEEE Transactions on Intelligent

Transportation Systems, vol. 6, no. 4, pp. 439–445, 2005.

[235] C. E. Uhlig, S. Taneri, F. P. Benner, and H. Gerding, “Elektrostimulation

des visuellen systems,” Ophthalmologe, vol. 98, no. 11, pp. 1089–1096, 2001.

[236] C. Veraart, M.-C. Wanet-Defalque, B. Grard, A. Vanlierde, and J. Del-

beke, “Pattern recognition with the optic nerve visual prosthesis,” Artificial

Organs, vol. 27, no. 11, pp. 996–1004, 2003.

[237] M. Volker, K. Shinoda, H. Sachs, H. Gmeiner, T. Schwarz, K. Kohler,

W. Inhoffen, K. Bartz-Schmidt, E. Zrenner, and F. Gekeler, “In vivo assess-

ment of subretinally implanted microphotodiode arrays in cats by optical

coherence tomography and fluorescein angiography,” Graefe’s Archive for

Clinical and Experimental Ophthalmology, vol. 242, no. 9, pp. 792–799, 2004.

[238] P. Walter and K. Heimann, “Evoked cortical potentials after electrical stim-

ulation of the inner retina in rabbits,” Graefe’s Archive for Clinical and

Experimental Ophthalmology, vol. 238, no. 4, pp. 315–318, 2000.

[239] B. Wandell, Foundations of vision. Sinauer Associates: Massachusetts,

1995.

[240] D. J. Warren and R. A. Normann, “Visual neuroprostheses,” in Handbook

of Neuroprosthetic Methods (W. E. Finn and P. G. LoPresti, eds.), Boca

Raton: CRC Press, 2003.

[241] A. Webb, Statistical Pattern Recognition. Wiley, 2nd ed., 2002.

[242] J. D. Weiland and M. S. Humayun, “Past, present, and future of artificial

vision,” Artificial Organs, vol. 27, no. 11, pp. 961–962, 2003.

[243] S. K. West, G. S. Rubin, A. T. Broman, B. Munoz, K. Bandeen-Roche,

and K. Turano, “How does visual impairment affect performance on tasks

of everyday life? The SEE Project. Salisbury Eye Evaluation,” Archives of


[244] R. Whitestock, L. Frank, and R. Haneline, “Dog guides,” in Foundations of

Orientation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:


[245] World Health Organization, “WHO fact sheet no. 145: Blindness and visual

disability: Socioeconomic aspects,” 1997.

[246] World Health Organization, “WHO fact sheet no. 146: Blindness and visual

disability: Seeing ahead - projections into the next century,” 1997.


disability: Major causes worldwide,” 1999.


disability: Other leading causes worldwide,” 1999.

[249] World Health Organization, “WHO fact sheet no. 233: Blindness as a public

health problem in China,” 1999.

[250] A. L. Yarbus, Eye Movements and Vision. New York: Plenum, 1967.

[251] C. S. Yoon, “Audible maps - a simple and effective tool,” in 2nd Aus-

tralasian orientation and mobility conference, (Gold Coast, Australia), 2003.

[252] D. Yuan and R. Manduchi, “Dynamic environment exploration using a

virtual white cane,” in Proceedings of IEEE Computer Society Conference

on Computer Vision and Pattern Recognition, pp. 243–249, 2005.

[253] S. Zeki, A vision of the brain. Blackwell Scientific Publications: London,

1993.

[254] D. Zhang and G. Lu, “Review of shape representation and description tech-

niques,” Pattern Recognition, vol. 37, no. 1, pp. 1–19, 2004.

[255] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition:

A literature survey,” ACM Computing Surveys, vol. 35, no. 4, pp. 399–458,

2003.

[256] D. Ziegler, P. Linderholm, M. Mazza, S. Ferazzutti, D. Bertrand, A. M.

Ionescu, and P. Renaud, “An active microphotodiode array of oscillating

pixels for retinal stimulation,” Sensors and Actuators A: Physical, vol. 110,

no. 1-3, pp. 11–17, 2003.

[257] E. Zrenner, “The subretinal implant: Can microphotodiode arrays re-

place degenerated retinal photoreceptors to restore vision?,” Ophthalmolog-

ica, vol. 216, pp. Suppl 1:8–20, 2002.

[258] E. Zrenner, K.-D. Miliczek, V. P. Gabel, H. G. Graf, E. Guenther, H. Haem-

merle, B. Hoefflinger, K. Kohler, W. Nisch, M. Schubert, A. Stett, and

S. Weiss, “The development of subretinal microphotodiodes for replacement

of degenerated photoreceptors,” Ophthalmic Research, vol. 29, pp. 269–280,

1997.

[259] E. Zrenner, A. Stett, S. Weiss, R. B. Aramant, E. Guenther, K. Kohler,

K.-D. Miliczek, M. J. Seiler, and H. Haemmerle, “Can subretinal micropho-

todiodes successfully replace degenerated photoreceptors?,” Vision Research,

vol. 39, pp. 2555–2567, 1999.

Appendix A

AHV project web sites

A list of AHV project web sites (current at July 2006) and main contacts is

provided below:

Bionic Eye Research Project (Cortical Neuroprosthesis - UNSW, Australia)

Vivek Chowdhury and John Morley

http://ophthalmology.med.unsw.edu.au/bioniceye.htm

Cortical Implant for the Blind (CORTIVIS, Europe)

Eduardo Fernandez

http://cortivis.umh.es/

EPI RET (Retina implant research in Cologne, Germany)

Rolf Eckmiller

http://www.medizin.uni-koeln.de/kliniken/augenklinik/epi-ret3e.htm

Intracortical Visual Prosthesis (Illinois Institute of Technology, United States)

Philip Troyk

http://neural.iit.edu/intro.html

Microsystems Based Visual Prosthesis (MiVip, now OPTIVIP, Europe)

Claude Veraart

http://www.md.ucl.ac.be/gren/Projets/mivip.html

OPTIVIP projects (ESPRIT programme of the European Union)

Claude Veraart

http://www.dice.ucl.ac.be/optivip/

Optobionics Corporation (United States)

Alan Chow and Vincent Chow

http://www.optobionics.com

Retinal Implant (Doheny Retina Institute, United States)

Mark Humayun and Eugene de Juan Jr

http://www.usc.edu/hsc/doheny/

Retinal Implant & Bio-hybrid Implant (Japan)

Tohru Yagi

http://www.bmc.riken.jp/yagi/retina/

Retinal Implant-AG (was SUB RET project, Germany)

Eberhart Zrenner

http://www.retina-implant.de/tour/

Retinal Prosthesis Project (North Carolina State University, United States)

Wentai Liu

http://www.icat.ncsu.edu/projects/retina/

Retinomorphic chip (University of Pennsylvania, United States)

http://www.neuroengineering.upenn.edu/boahen/pub/fs pub.htm

Second Sight (California, United States)

Alfred E. Mann and Robert Greenberg

http://www.2-sight.com/

The Boston Retinal Implant Project (United States)

John Wyatt and Joseph Rizzo

http://www.bostonretinalimplant.org/

The Dobelle Institute (Lisbon, Portugal)

William Dobelle

http://www.dobelle.com/

University of Utah (Intracortical prosthesis, United States)

Richard A. Normann

http://www.bioen.utah.edu/cni/projects/blindness.htm

Vision Prosthesis Project (UNSW and Newcastle University, Australia)

Gregg Suaning

http://bionic.gsbme.unsw.edu.au/

Appendix B

Chapter 7 and 8 experiment materials

Participant Information Sheet

“Mobility enhancement using simulated Artificial Human Vision”

Jason Dowling (PhD Candidate, EESE, S1102, Gardens Point, 3864 1608 [email protected])

Description

Artificial Human Vision (AHV) systems are designed to help restore some sense of vision to the blind by electrically stimulating a component of the visual pathway. However, there are limits to the amount of visual information that can be provided to a person using an AHV system. We are interested in how we can process images from a camera to enhance mobility for blind recipients of AHV systems.

The research team requests your assistance in testing one method of information display and its effect on your mobility performance. Your participation will involve wearing a head mounted simulation device. The device display will be your only visual information during the experiment. After the device has been placed on your head:

• You will be allowed two minutes for familiarization with the device display;

• Your walking speed while wearing the device will be measured;

• You will be asked to complete two tasks within a mobility course;

• Your walking speed while wearing the device will be measured again.

The experiment is expected to take approximately 40 minutes.

Expected benefits

It is expected that this project will not benefit you. However, it may benefit the mobility performance of blind people who use an artificial human vision system. This research may also be useful for the development of non-surgical image processing based electronic travel aids for the blind.

Risks

1. Some people may experience disorientation or nausea while using Virtual Reality (VR) headgear. If you feel sick during the experiment please tell the experimenter, who will immediately stop the experiment.

2. As the simulation presents a reduced amount of visual information, there is a risk of tripping or hitting obstacles during the experiment. However, we have designed the mobility course to reduce these risks. In addition, the experimenter will walk directly behind you during the experiment to monitor and prevent any personal danger.

Confidentiality

All comments and responses are anonymous and will be treated confidentially. The names of individual persons are not required in any of the responses. During the experiment we may record your movements on video: this video data will be coded and destroyed within two months of the experiment.

Voluntary participation

Your participation in this project is voluntary. If you do agree to participate, you can withdraw from participation at any time during the project without comment or penalty. Your decision to participate will in no way impact upon your current or future relationship with QUT.

Questions / further information

Please contact the researchers if you require further information about the project, or to have any questions answered.

Concerns / complaints

Please contact the Research Ethics Officer on 3864 2340 or [email protected] if you have any concerns or complaints about the ethical conduct of the project.

Consent

The return of the completed questionnaire is accepted as an indication of your consent to participate in this project.

Thank you for your time in completing this questionnaire.

Figure B.1: Coversheet provided to participants before the AHV simulation experiments described in Chapter 7 and 8.

Mobility enhancement using simulated Artificial Human Vision

1. What is your gender: □ Female □ Male

2. Please indicate your age (years):

□ 0-20 yrs □ 20-30 yrs □ 30-40 yrs □ 40-50 yrs □ 50-60 yrs □ over 60 yrs

3. How frequently do you play computer/video games?

□ Never □ Once a year □ Monthly □ Weekly □ Daily

4. Have you ever used an immersive Virtual Reality (VR) environment (using a head mounted VR display) before?

□ Yes □ No

If you have used an immersive VR environment before, approximately how many times have you done this? ____ times

Figure B.2: Questionnaire provided to participants before the AHV simulation experiments described in Chapter 7 and 8.

Mobility enhancement using simulated Artificial Human Vision

Experimenter Sheet: Jason Dowling ([email protected], x1608)

Participant id#: ____________

1. PWS(a) Duration: ___________________________

2. Task 1 (mobility measure/locate object)

Start Time: ______________________________

End Time: ______________________________

Duration: ______________________________

Time          Obstacle #    Veering #    Other #
0-2 mins
2-4 mins
4-6 mins
6-8 mins
8-10 mins
10-12 mins
12-14 mins
14-16 mins
16-18 mins
18-20 mins

3. Task 2 (mobility measure/locate object)

Start Time: ______________________________

End Time: ______________________________

Duration: ______________________________

Time          Obstacle #    Veering #    Other #
0-2 mins
2-4 mins
4-6 mins
6-8 mins
8-10 mins
10-12 mins
12-14 mins
14-16 mins
16-18 mins
18-20 mins

4. PWS(b) Duration: ___________________________

Comments:

Figure B.3: Record sheet used by the experimenter during the AHV simulation experiments described in Chapter 7 and 8. The locate object task was not used for the Chapter 8 experiment.
