Speech and Audio Research Laboratory of the SAIVT program
Centre for Built Environment and Engineering Research
MOBILITY ENHANCEMENT USING
SIMULATED ARTIFICIAL HUMAN VISION
Jason Dowling
BAppSc/BComp(Hons)
SUBMITTED AS A REQUIREMENT OF
THE DEGREE OF
DOCTOR OF PHILOSOPHY
AT
QUEENSLAND UNIVERSITY OF TECHNOLOGY
BRISBANE, QUEENSLAND
29 MAY 2007
Keywords
Artificial Human Vision, visual prosthesis, blind mobility, image processing, com-
puter vision, mobility assessment, visual simulation, Human Computer Interface
Abstract
The electrical stimulation of appropriate components of the human visual sys-
tem can result in the perception of blobs of light (or phosphenes) in totally
blind patients. By stimulating an array of closely aligned electrodes it is pos-
sible for a patient to perceive very low-resolution images from spatially aligned
phosphenes. Using this approach, a number of international research groups are
working toward developing multiple electrode systems (called Artificial Human
Vision (AHV) systems or visual prostheses) to provide a phosphene-based sub-
stitute for normal human vision.
Despite their great promise, current AHV systems are subject to a number of
constraints. These include limitations in the number of electrodes
which can be implanted and the perceived spatial layout and display frequency of
phosphenes. Therefore the development of computer vision techniques that can
maximise the visualisation value of the limited number of phosphenes would be
useful in compensating for these constraints. The lack of an objective method for
comparing different AHV system displays, and for comparing AHV systems
with other blind mobility aids (such as the long cane), has been a significant
problem for AHV researchers. Finally, AHV research in Australia and many
other countries relies strongly on theoretical models and animal experimentation
due to the difficulty of prototype human trials. Because of this constraint the
experiments conducted in this thesis were limited to simulated AHV devices with
normally sighted research participants, and the true impact on blind people can
only be approximated.
In light of these constraints, this thesis has two general aims. The first aim is
to investigate, evaluate and develop effective techniques for mobility assessment
which will allow the objective comparison of different AHV system phosphene
presentation methods. The second aim is to develop a useful display framework to
guide the development of AHV information presentation, and use this framework
to guide the development of an AHV simulation device.
The first research contribution resulting from this work is a conceptual framework
based on literature reviews of blind and low vision mobility, AHV technology,
and computer vision. This framework incorporates a comprehensive set of
factors which affect the effectiveness of information presentation in an AHV
system. Experiments reported in this thesis have investigated a number of these
factors using simulated AHV with human participants. It has been found that
higher spatial resolution is associated with more accurate walking (reduced veering),
whereas higher display rate is associated with faster walking speeds. In this way
it has been demonstrated that the conceptual framework supports and guides
the development of an adaptive AHV system, with the dynamic adjustment of
display properties in real-time.
The second research contribution addresses mobility assessment, which has
been identified as an important issue in the AHV literature. This thesis presents
the adaptation of a mobility assessment method from the blind and low vision
literature to measure simulated AHV mobility performance using real-time com-
puter based analysis. This method of mobility assessment (based on parameters
for walking speed, obstacle contacts and veering) is demonstrated experimentally
in two different indoor mobility courses. These experiments involved sixty-five
participants wearing a head-mounted simulation device.
The final research contribution in this thesis is the development and evalua-
tion of an original real-time looming obstacle detector, based on coarse optical
flow, and implemented on a Windows PocketPC based Personal Digital Assistant
(PDA) using a CF card camera. PDA based processors are a preferred main pro-
cessing platform for AHV systems due to their small size, light weight and ease
of software development. However, PDA devices are currently constrained by
restricted random access memory, lack of a floating point unit and slow internal
bus speeds. Therefore any real-time software needs to maximise the use of inte-
ger calculations and minimise memory usage. This contribution was significant
as the resulting device provided a selection of experimental results and subjective
opinions.
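As an illustration of the integer-only constraint described above, the following Python sketch shows one way a coarse block search could be written without floating point arithmetic. It loosely follows the search pattern given in the caption of Figure 6.6 (the same block position first, then the 8 surrounding blocks, then the next 16, for a total of 25). Python is used only for readability. The function names, the shift-based reduction to 8 grey levels, and the use of exact grey-level equality as the matching test are assumptions made for this sketch and are not taken from the thesis implementation.

```python
def quantise(pixel):
    """Map an 8-bit grey value (0..255) to one of 8 levels using a bit shift."""
    return pixel >> 5


def ring_offsets(radius):
    """Return the (dy, dx) offsets lying exactly on the square ring at this radius."""
    if radius == 0:
        return [(0, 0)]
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if max(abs(dy), abs(dx)) == radius]


def match_block(curr, prev, row, col, max_radius=2):
    """Search prev for a block equal to curr[row][col], nearest ring first.

    Returns the (dy, dx) displacement of the first match, or None if no
    block within max_radius matches. Equality matching is an assumption.
    """
    target = curr[row][col]
    rows, cols = len(prev), len(prev[0])
    for radius in range(max_radius + 1):
        for dy, dx in ring_offsets(radius):
            r, c = row + dy, col + dx
            if 0 <= r < rows and 0 <= c < cols and prev[r][c] == target:
                return (dy, dx)
    return None
```

Every operation here is a shift, comparison or integer addition, which is the kind of arithmetic that suits a processor without a floating point unit.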
Contents
Keywords i
Abstract iii
List of Tables xiv
List of Figures xviii
List of Abbreviations xxxi
Authorship xxxiii
Acknowledgments xxxv
Publications xxxvii
1 Introduction 1
1.1 Motivation and Overview . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Aims and Research Questions . . . . . . . . . . . . . . . . . . . . 3
1.3 Research Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Thesis Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Original Contributions of Thesis . . . . . . . . . . . . . . . . . . . 5
2 Blind and Low Vision Mobility Issues and Assessment 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Blindness and Low Vision . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Mobility and related issues identified for people with low vision
and blindness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Primary mobility devices for the blind . . . . . . . . . . . . . . . 14
2.4.1 Long Cane . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Guide Dogs . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.3 Electronic Travel Aids . . . . . . . . . . . . . . . . . . . . 15
2.5 Orientation Aids . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.6 Environmental accessibility . . . . . . . . . . . . . . . . . . . . . . 18
2.7 The ecological approach to perception . . . . . . . . . . . . . . . . 19
2.8 Mobility assessment . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.8.1 Mobility assessment: Self report research . . . . . . . . . . 21
2.8.2 Mobility assessment: Field Experiment research . . . . . . 23
2.8.3 Mobility assessment: Artificial environment research . . . . 31
2.8.4 Mobility assessment: Combined Field experiment and arti-
ficial environment research . . . . . . . . . . . . . . . . . . 38
2.8.5 Mobility Assessment Conclusion . . . . . . . . . . . . . . . 41
2.9 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3 A Review of Artificial Human Vision 45
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Review of the Human Visual System . . . . . . . . . . . . . . . . 45
3.3 AHV technology and requirements . . . . . . . . . . . . . . . . . 49
3.4 Cortical stimulation . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4.1 Cortical surface stimulation . . . . . . . . . . . . . . . . . 53
3.4.2 Intracortical stimulation . . . . . . . . . . . . . . . . . . . 55
3.5 Retinal Stimulation . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.5.1 Subretinal stimulation . . . . . . . . . . . . . . . . . . . . 60
3.5.2 Other subretinal methods . . . . . . . . . . . . . . . . . . 65
3.5.3 Epiretinal stimulation . . . . . . . . . . . . . . . . . . . . 66
3.6 Optic Nerve devices . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.7 AHV simulation studies . . . . . . . . . . . . . . . . . . . . . . . 72
3.8 Evaluation of current AHV systems . . . . . . . . . . . . . . . . . 74
3.9 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4 A Framework for Blind Mobility Improvement via Computer
Vision 77
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.1.1 An information processing approach to computer vision . . 80
4.2 Computer Vision . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.2.1 Information reduction . . . . . . . . . . . . . . . . . . . . 81
4.2.2 Scene understanding . . . . . . . . . . . . . . . . . . . . . 95
4.3 Previous applications of computer vision to assist the vision impaired 99
4.4 Relationship between computer vision methods and the Human
Vision System (HVS) . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.5 A conceptual framework for AHV system information display . . . 107
4.5.1 Hypothesised operational scenario . . . . . . . . . . . . . . 114
4.5.2 Benefits of a conceptual framework for AHV information
display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.5.3 Application of the conceptual framework for previous AHV
research . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5 AHV Mobility Assessment using Static Images 123
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.2.1 Images selected . . . . . . . . . . . . . . . . . . . . . . . . 125
5.2.2 Assessing mobility information . . . . . . . . . . . . . . . . 129
5.2.3 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6 AHV Simulation and Obstacle Detection using a Personal Digital
Assistant 155
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.2.1 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.2.2 Obstacle avoidance . . . . . . . . . . . . . . . . . . . . . . 158
6.2.3 Block Based Obstacle Alert . . . . . . . . . . . . . . . . . 161
6.2.4 AHV Simulation Implementation . . . . . . . . . . . . . . 162
6.2.5 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.2.6 Statistical methods . . . . . . . . . . . . . . . . . . . . . . 170
6.3 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7 Mobility Assessment using a PDA-based AHV Simulation in a
course environment 181
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
7.2.1 AHV Simulation Device . . . . . . . . . . . . . . . . . . . 184
7.2.2 Assessment of mobility performance . . . . . . . . . . . . . 188
7.2.3 Questionnaire . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.2.4 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.2.5 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
7.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 199
8 Effects of Spatial and Temporal Resolution on Mobility Assess-
ment 201
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.2.1 Simulation Hardware . . . . . . . . . . . . . . . . . . . . . 205
8.2.2 Simulation Software . . . . . . . . . . . . . . . . . . . . . 206
8.2.3 Mobility course . . . . . . . . . . . . . . . . . . . . . . . . 207
8.2.4 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . 209
8.2.5 Questionnaire . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.2.6 Statistical methods . . . . . . . . . . . . . . . . . . . . . . 210
8.2.7 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
8.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8.3.1 Phosphene spatial resolution . . . . . . . . . . . . . . . . . 212
8.3.2 Frame Rate . . . . . . . . . . . . . . . . . . . . . . . . . . 220
8.3.3 Prior experience with immersive VR . . . . . . . . . . . . 221
8.3.4 Gender . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
8.3.5 Age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
8.3.6 Corrected Vision . . . . . . . . . . . . . . . . . . . . . . . 222
8.3.7 Learning effects . . . . . . . . . . . . . . . . . . . . . . . 225
8.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
8.4.1 Can specific main factors be identified as highly significant
for providing mobility information in an AHV system? . . 229
8.4.2 Can objective measures be developed for the comparison of
effectiveness between AHV systems in providing mobility
information? . . . . . . . . . . . . . . . . . . . . . . . . . . 231
8.4.3 Can computer vision techniques be adopted and modified
to provide mobility information in an AHV system? . . . . 232
8.4.4 Connections with Vision Research . . . . . . . . . . . . . . 233
8.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9 Conclusion and Future Work 237
9.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
9.1.1 Can specific main factors be identified as highly significant
for providing mobility information in an AHV system? . . 239
9.1.2 Can objective measures be developed for the comparison of
effectiveness between AHV systems in providing mobility
information? . . . . . . . . . . . . . . . . . . . . . . . . . . 240
9.1.3 Can computer vision techniques be adopted and modified
to provide mobility information in an AHV system? . . . . 242
9.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
9.2.1 Mobility experiments with AHV system recipients . . . . . 243
9.2.2 Symbolic display . . . . . . . . . . . . . . . . . . . . . . . 244
9.2.3 Real world mobility assessment environments . . . . . . . 244
9.2.4 Integration of information from other sensors . . . . . . . . 244
9.2.5 Standard set of mobility related images . . . . . . . . . . 245
9.3 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Bibliography 246
A AHV project web sites 281
B Chapter 7 and 8 experiment materials 285
List of Tables
2.1 Nottingham Blind Mobility Unit dependent variable measures [6]. 25
2.2 Revised Nottingham Mobility Unit measures from Dodds [51]. Shore-
lining refers to following a path or wall using tactile or auditory
information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Mobility measures used in Geruschat & de l’Aune [72] . . . . . . . 31
2.4 Obstacle types used in Lovie-Kitchin et al. [137] . . . . . . . . . . 33
2.5 Mobility and daily activities assessment from West et al. [243] . . 37
2.6 Mobility measures used in Marron et al. [145] . . . . . . . . . . . 39
2.7 Mobility incidents scored in Long et al. [135] . . . . . . . . . . . 40
2.8 Summary of mobility assessment research discussed in this Chap-
ter. ‘Time’ is the amount of time on the course, ‘obst.’ is a count of
obstacle contacts, and ‘veer.’ is the number of incidents of veering
from a path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1 The feature set used in Everingham et al. [63] . . . . . . . . . . . 103
4.2 Overview of computer vision functionality performed by each part
of the HVS (Based on Thorpe [227]). . . . . . . . . . . . . . . . . 107
5.1 Mobility related image components identified for each image. . . . 130
5.2 Image edge detection and line enhancement thresholds for each
image. Note that the Canny sensitivity listed is the high thresh-
old value. The low threshold value was set to 0.4 times the high
threshold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.3 This table shows the ranges used to convert the original x and y co-
ordinates (recorded for each question and image combination) into
a simplified 5x5 element array. For example the x,y value (227,156)
would be re-coded to (5,4). The simplified values were then com-
pared against an array of ‘correct responses’ for each question type. 142
5.4 Steps in identifying correct/incorrect and identified/not identified
grid responses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.5 Summary of response classification for each image type. Note that
203 responses of ‘Don’t know’ have been excluded from classification. 145
6.1 The number of frames in each image sequence, along with the
duration of each captured sequence. . . . . . . . . . . . . . . . . . 170
6.2 Postal box image sequence results. . . . . . . . . . . . . . . . . . . 178
6.3 Bus shelter image sequence results. . . . . . . . . . . . . . . . . . 178
7.1 AHV simulation display types used for the pilot study . . . . . . . 185
7.2 Questionnaire responses for each participant. . . . . . . . . . . . . 194
7.3 PPWS results for each trial for each participant. The Benchmark
column is the time taken during the 10m guided walk. PWS is
10/Benchmark time. Course (s) is the number of seconds taken
while walking through the 45m mobility course. SMC is 45/Course
time. PPWS is SMC/PWS multiplied by 100. . . . . . . . . . . . . 194
7.4 PPWS results for each task type and trial. . . . . . . . . . . . . . 194
7.5 Mobility error results for each trial for each participant. . . . . . 194
7.6 Mobility error results for each task type and trial. . . . . . . . . . 195
7.7 Mobility error summary for each display type. . . . . . . . . . . . 196
7.8 PPWS summary for each display type. . . . . . . . . . . . . . . . 196
7.9 Effect sizes (η²) for the main mobility factors in this pilot study.
‘DV’ represents the dependent variable, ‘F’ is the F-test result and
‘Sig’ represents significance. . . . . . . . . . . . . . . . . . . . . . 196
8.1 Gender and age groups of experiment participants. . . . . . . . . 211
8.2 Mean number of obstacle contacts (with standard deviations) for
different resolution and frame rate. . . . . . . . . . . . . . . . . . 213
8.3 Mean number of veering errors (with standard deviations) for dif-
ferent resolution and frame rate. . . . . . . . . . . . . . . . . . . . 214
8.4 Mean benchmark speeds over 10m (with standard deviations) for
resolution and frame rate. Benchmark no. 1 was recorded before
the first mobility trial. Benchmark no. 2 was recorded after the
second mobility trial. PWS is 10 divided by each Benchmark score.
The combined PWS score in the table is the average PWS for the
two benchmarks for each participant. . . . . . . . . . . . . . . . . 215
8.5 Mean scores (with standard deviations) for the amount of time
spent walking through the mobility course during each trial, and
for PPWS (calculated using combined PWS) during each trial. . 216
9.1 Summary of the main scientific contributions of this thesis. . . . 238
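Two of the captions above describe simple calculations: the PPWS measure (Table 7.3) and the re-coding of click coordinates into a 5x5 grid (Table 5.3). The Python sketch below is one possible reading of those captions. The function names are hypothetical, and the equal-width grid bins over a 256 pixel image are an assumption; the actual ranges are those listed in Table 5.3 itself. Under that assumption the caption's example, (227,156) re-coded to (5,4), is reproduced.

```python
def ppws(benchmark_s, course_s, benchmark_m=10.0, course_m=45.0):
    """Percentage of preferred walking speed, following the Table 7.3 caption."""
    pws = benchmark_m / benchmark_s   # preferred walking speed over the 10m walk
    smc = course_m / course_s         # speed on the 45m mobility course
    return smc / pws * 100.0


def recode_click(x, y, image_size=256, grid=5):
    """Map a pixel click to a 1-based (x, y) grid cell.

    Returns None when x or y is negative, which stands for the default of -1
    recorded when no location was clicked. Equal-width bins are an assumption.
    """
    if x < 0 or y < 0:
        return None
    return (x * grid // image_size + 1, y * grid // image_size + 1)
```

For example, a participant who takes 10 seconds for the benchmark walk and 90 seconds for the course walks the course at half of their preferred speed, giving a PPWS of 50.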
List of Figures
3.1 Horizontal diagram of the human eye. The locations for the epi-
and sub-retinal implants and the optic nerve electrode are shown.
Adapted from Gregory [83]. . . . . . . . . . . . . . . . . . . . . . 46
3.2 A simplified diagrammatic representation of the cellular layers of
the retina. Light passes through the outer layers of the retina
before being absorbed by the rods and cones of the photoreceptor
layer. Adapted from Sharp and Phillips [199]. . . . . . . . . . . . 47
3.3 The cortical lobes of the human brain. The primary visual cortex,
which is the site for cortical electrode array implants, is also shown.
Adapted from Wandell [239]. . . . . . . . . . . . . . . . . . . . . 49
3.4 Diagram of the main pathways in the HVS. Adapted from Bruce
et al. [24]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1 Example of reduced visual information in an AHV system: Image
(a) shows a street scene image in suburban Brisbane; in image
(b) the resolution of this image has been reduced to 25x25 pixels.
Image (c) shows a simulated 25x25 phosphene display of the same
image. A sample symbolic representation of the mobility hazards
contained in the street scene is shown in image (d). . . . . . . . . 79
4.2 The 3x3 Laplacian kernel for high pass filtering (for image sharp-
ening). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.3 An example 3x3 Gaussian low pass filter kernel for image smooth-
ing. Note the centre element has the greatest weight (0.2042) com-
pared to the others. . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.4 An example of high and low pass filtering on an image. A grey scale
post box image is shown in image (a). Image (b) shows the image
after it has been filtered using the 3x3 Laplacian high pass filter
(detailed in Figure 4.2). Image (c) shows the result of applying
the Gaussian low pass filter from Figure 4.3. This image has been
taken from an image sequence captured using a low quality PDA
card camera (this sequence and camera are described in more detail
in Chapter 6). As the camera was moving at the time of capture
there is a significant amount of motion blur. It is anticipated that
image quality from cameras used for AHV systems will improve as
technology advances. . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.5 An example of contrast enhancement by histogram expansion. The
base image (a) shows a Brisbane suburban bus shelter. (b) shows
the distribution of the 256 grey-scale values in image (a). The con-
trast in image (c) has been enhanced using histogram equalisation.
The histogram of image (c) is shown in (d). . . . . . . . . . . . . 85
4.6 The 3x3 kernel used for Sobel horizontal edge detection. . . . . . 87
4.7 The 3x3 kernel used for Sobel vertical edge detection. . . . . . . . 87
4.8 Sobel edge detection applied to captured image of a post box (a).
Image (b) shows the result of the horizontal Sobel edge kernel. The
output from the vertical Sobel edge kernel is shown in image (c). . 88
4.9 A comparison of different edge detection methods applied to an
image of a suburban footpath (a). The output from the Canny detector
is shown in (b). The Sobel detector is shown in (c), and the
Roberts edge detector is displayed in (d). . . . . . . . . . . . . . . 89
4.10 Example application of the Hough transform for locating the fence
boundary shown in image (a). Image (b) shows the output from
Sobel edge detection. The corresponding Hough transform output
is shown in image (c), with the origin in the top left hand side
of the image. This transform image was generated using software
from Seul et al. [198]. The horizontal axis represents r, and the
vertical axis represents θ, which increases from 0 radians in the
top left corner to π radians at the bottom. The dominant peak,
indicating the dominant line, is shown with a superimposed box.
Image (d) shows the pixels which are present along the dominant
line found by the Hough transform. . . . . . . . . . . . . . . . . . 91
4.11 Factors which influence the display processing for an Artificial Hu-
man Vision system. . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.12 Conceptual framework applied to Cha et al.’s simulated AHV mo-
bility experiment. Factors which are not included in this study are
marked with a line pattern. . . . . . . . . . . . . . . . . . . . . . 119
4.13 Conceptual framework applied to Long et al.’s low vision mobil-
ity experiment. Factors which are not included in this study are
marked with a line pattern. . . . . . . . . . . . . . . . . . . . . . 120
5.1 Conceptual framework diagram showing factors which influence
simulated AHV display effectiveness in this chapter. Factors from
Chapter 4 which are excluded from this chapter are marked with
a line pattern. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.2 Original mobility images used in this chapter. A brief description
of each image is given in Table 5.1. . . . . . . . . . . . . . . . . 127
5.3 Example Matlab code used for generating images. This example
creates the output from Sobel edge detection for image A. Child
On Street. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.4 Flowchart showing the image processing steps applied for each of
the four image types used in this Chapter. . . . . . . . . . . . . . 131
5.5 Image processing applied to image A (Child on street). The orig-
inal image (converted to 8 bit grey-scale and 256x256 pixel reso-
lution) is shown with the 5x5 grid mask in figure (a). The binary
image is shown in image (b) and the Canny edge detection image
shown in image (c). The 50x50 8 bit grey-scale image is shown
in image (d). Finally the Sobel edge detection output is shown in
image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.6 Image processing applied to image B (Path near road). The orig-
inal image (converted to 8 bit grey-scale and 256x256 pixel reso-
lution) is shown with the 5x5 grid mask in figure (a). The binary
image is shown in image (b) and the Canny edge detection image
shown in image (c). The 50x50 8 bit grey-scale image is shown
in image (d). Finally the Sobel edge detection output is shown in
image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.7 Image processing applied to image C (Person in office). The orig-
inal image (converted to 8 bit grey-scale and 256x256 pixel reso-
lution) is shown with the 5x5 grid mask in figure (a). The binary
image is shown in image (b) and the Canny edge detection image
shown in image (c). The 50x50 8 bit grey-scale image is shown
in image (d). Finally the Sobel edge detection output is shown in
image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.8 Image processing applied to image D (Person in bathroom). The
original image (converted to 8 bit grey-scale and 256x256 pixel
resolution) is shown with the 5x5 grid mask in figure (a). The
binary image is shown in image (b) and the Canny edge detection
image shown in image (c). The 50x50 8 bit grey-scale image is
shown in image (d). Finally the Sobel edge detection output is
shown in image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.9 Image processing applied to image E (Sparse office). The original
image (converted to 8 bit grey-scale and 256x256 pixel resolution)
is shown with the 5x5 grid mask in figure (a). The binary image
is shown in image (b) and the Canny edge detection image shown
in image (c). The 50x50 8 bit grey-scale image is shown in image
(d). Finally the Sobel edge detection output is shown in image (e). 136
5.10 Image processing applied to image F (Street scene with tree). The
original image (converted to 8 bit grey-scale and 256x256 pixel
resolution) is shown with the 5x5 grid mask in figure (a). The
binary image is shown in image (b) and the Canny edge detection
image shown in image (c). The 50x50 8 bit grey-scale image is
shown in image (d). Finally the Sobel edge detection output is
shown in image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.11 Image processing applied to image G (Phone booth obstacle). The
original image (converted to 8 bit grey-scale and 256x256 pixel
resolution) is shown with the 5x5 grid mask in figure (a). The
binary image is shown in image (b) and the Canny edge detection
image shown in image (c). The 50x50 8 bit grey-scale image is
shown in image (d). Finally the Sobel edge detection output is
shown in image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.12 Image processing applied to image H (Railway platform). The
original image (converted to 8 bit grey-scale and 256x256 pixel
resolution) is shown with the 5x5 grid mask in figure (a). The
binary image is shown in image (b) and the Canny edge detection
image shown in image (c). The 50x50 8 bit grey-scale image is
shown in image (d). Finally the Sobel edge detection output is
shown in image (e). . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.13 A sample screen from the static image experiment. The x and y
values on the right hand side of the screen show which part of the
image received a mouse click for each question. If the participant
selected a response of ‘5=Definitely No’ for a question, x and y
were set to -1 by default. . . . . . . . . . . . . . . . . . . . . . . . 141
5.14 Summary of question responses for each of the 32 images presented. 145
5.15 Summary of responses for each image processing method used in
this experiment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.16 Results for questions 1-4 for each processing method on image 1
(Child on street). . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.17 Results for questions 1-4 for each processing method on image 2
(Path near road). . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.18 Results for questions 1-4 for each processing method on image 3
(Person in office). . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.19 Results for questions 1-4 for each processing method on image 4
(Person in bathroom). . . . . . . . . . . . . . . . . . . . . . . . . 148
5.20 Results for questions 1-4 for each processing method on image 5
(Sparse office). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.21 Results for questions 1-4 for each processing method on image 6
(Street scene with tree). . . . . . . . . . . . . . . . . . . . . . . . 149
5.22 Results for questions 1-4 for each processing method on image 7
(Phone booth obstacle). . . . . . . . . . . . . . . . . . . . . . . . 149
5.23 Results for questions 1-4 for each processing method on image 8
(Railway platform). . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.24 Question 5 (‘next step’) result summary for each type of image. . 150
6.1 Front and side views of the AHV simulator used in the present study. 158
6.2 Grid showing the 7x10 pixel blocks used from each 120x160 pixel
image for the PDA motion estimation described below. . . . . . . 161
6.3 Motion vectors calculated by the PDA. The origin of
each motion vector is the centre of each of the grid blocks in
Figure 6.2. In certain directions the arrow heads look like white
blobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.4 Processing steps for the PDA block-based AHV simulation . . . . 164
6.5 The first five steps of the block based approach are illustrated in
these images of a suburban footpath. The number of grey-levels
in the base image (a) was first reduced to 8 grey-levels (b), before
a median filter was applied (c). Finally the image was spatially
reduced from 160x120 pixels to 32x24 blocks (d). . . . . . . . . . 166
6.6 The maximum search area used in Step 7. Each block from the
current image block array (shown on the left) is compared against
the previous image block array (shown on the right). Initially only
the matching block position is compared. The search then checks
the 8 blocks surrounding this position. Finally if a matching block
has not been identified the surrounding 16 blocks are searched
(giving a total search area of 25 blocks). . . . . . . . . . . . . . . 167
6.7 An example block based alert, shown in (d), which has been trig-
gered in response to looming branches in front of a head-mounted
camera. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.8 Frames 10, 70, 130 and 190 from the postal box mid morning sequence
(on left) and the bus stop early afternoon sequence (on
right). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.9 Postal Box recall graph: This graph shows the recall (the number
of correct alerts/the number of alerts) at different stages during
each captured image sequence. . . . . . . . . . . . . . . . . . . . . 174
6.10 Bus shelter recall graph: This graph shows the recall (the number
of correct alerts/the number of alerts) at different stages during
each captured image sequence. . . . . . . . . . . . . . . . . . . . . 175
6.11 An example incorrect alert warning. The shadow shown in the
original median filtered and 8 grey-level image (a) is incorrectly
segmented from the lower resolution image (b) and is assumed to
be a looming obstacle in front of the camera (d). The objects
segments which have been identified are shown in image (c). . . . 176
6.12 Images 153 (top) to 156 (bottom) of the mid morning post box
sequence. The images on the left have been reduced to 8 grey
levels and median filtered. On the right is the segmentation result
for each image. An obstacle alert (shown with an ‘x’ pattern) was
identified for frame 156. . . . . . . . . . . . . . . . . . . . . . . . 177
7.1 Factors which influenced simulated AHV display effectiveness in
this chapter. Excluded factors are marked with a line pattern. . . 183
7.2 Processing steps for the AHV simulator used in the pilot study.
Note the display type is initialised before the images are processed.
The three display types are listed in table 7.1. . . . . . . . . . . 186
7.3 Examples of the image types used in this study. Figure (a) is
the base 160x120 pixel 256 grey-level image. The simulator image
using display type 3 is shown in image (b). Image (c) shows the
base image from (a) with 8 grey-levels and a 3x3 median filter
applied. In image (d), image (c) has been reduced to a 32x24
phosphene display (this is used for simulator display types 1 and 2).
7.4 Images taken of the Gait Lab before the mobility course was set
up. Image (a) shows the black curtains surrounding the lab. The
change area ‘tent’, and raised wooden platform are visible in image
(b).
7.5 Example soft obstacle set up for the mobility course.
7.6 The indoor course used for mobility assessment in this chapter.
7.7 Total number of mobility errors for both trials during the mobility
course experiments.
8.1 Factors which influence simulated AHV display effectiveness in this
chapter. Excluded factors are marked with a line pattern.
8.2 Phosphenes (top row) displayed as grey level pixels in reduced
resolution images.
8.3 Original 160x120 pixel captured image.
8.4 Original image reduced to 32x24 phosphenes.
8.5 Original image reduced to 16x12 phosphenes.
8.6 Map of the 30m mobility course built for this study. The grey
shaded area is the path identified by black tape on the floor. The
numbers refer to the placement of obstacles and the black lines
denote office partitions.
8.7 Different types of grey shading on each obstacle shown in Figure 8.6.
8.8 Summary of obstacle errors during trials 1 and 2 by resolution
and frame rate (FPS). The boxes show the middle 50 per cent
of observations, with the median shown by the solid line in the
box. The whiskers coming from each box show the largest value
excluding outliers (which are shown as small circles).
8.9 Frequency of obstacle contacts by obstacle number for different
resolution types and frame rates (FPS). The obstacle types are
displayed in Figure 8.7.
8.10 Summary of veering errors during trials 1 and 2 by resolution and
frame rate (FPS).
8.11 Percentage of Preferred Walking Speed (PPWS) results for trials 1
(PPWS1) and 2 (PPWS2) displayed by resolution type and frame
rate (FPS).
8.12 Time spent walking through the mobility course for trials 1 (Time1)
and 2 (Time2) displayed by resolution type and frame rate (FPS).
8.13 Variation of trial 1 PPWS scores by frame rate (FPS) and resolution.
These results suggest a confounding variable, perhaps
anxiety, during the initial trial.
8.14 Variation of trial 2 PPWS scores by frame rate (FPS) and resolution.
These results show an increase in walking confidence as
frame rate and resolution increase.
8.15 Variation of time spent walking during trial 1 scores by frame rate
(FPS) and resolution.
8.16 Variation of time spent walking during trial 1 scores by frame rate
(FPS) and resolution.
8.17 Median time spent walking through the mobility course during trials
1 (Time1) and 2 (Time2) for different age groups.
8.18 Median PPWS scores from trials 1 (PPWS1) and 2 (PPWS2) for
different age groups.
8.19 Participant Preferred Walking Speed (PWS) results during trial 1
and 2 (r = 0.74).
8.20 Participant Percentage Preferred Walking Speed (PPWS) results
during trial 1 and 2 (r = 0.88).
8.21 Participant time spent walking during trial 1 and 2 (r = 0.87).
8.22 Participant Speed on Mobility Course (SMC) results for trial 1 and
2 (r = 0.87).
8.23 Veering incidents for each participant during trial 1 and 2 (r = 0.60).
8.24 Participant obstacle contacts during trial 1 and 2 (r = 0.16).
B.1 Coversheet provided to participants before the AHV simulation
experiments described in Chapters 7 and 8.
B.2 Questionnaire provided to participants before the AHV simulation
experiments described in Chapters 7 and 8.
B.3 Record sheet used by the experimenter during the AHV simulation
experiments described in Chapters 7 and 8. The locate object task
was not used for the Chapter 8 experiment.
List of Abbreviations
AHV Artificial Human Vision
CCD Charge-Coupled Device
CMOS Complementary Metal-Oxide Semiconductor
CF CompactFlash
ETA Electronic Travel Aid (for the visually impaired)
fMRI functional Magnetic Resonance Imaging
FPU Floating Point Unit
FPS Frames Per Second
HCI Human Computer Interface
HMD Head Mounted Display
HVS Human Visual System
LGN Lateral Geniculate Nucleus
LVES Low Vision Enhancement System
MPDA Microphotodiode Array
NIH National Institutes of Health (United States)
O&M Orientation and Mobility
PWI Productive Walking Index
PWS Preferred Walking Speed
PPWS Percentage of Preferred Walking Speed
ROI Region of Interest
RP Retinitis Pigmentosa
SAD Sum of Absolute Differences
SMC Speed on the Mobility Course
TMS Transcranial Magnetic Stimulation
UEA Utah Electrode Array
USB Universal Serial Bus
VR Virtual Reality
WHO World Health Organisation
Authorship
The work contained in this thesis has not been previously submitted for a degree
or diploma at any other higher educational institution. To the best of my knowl-
edge and belief, the thesis contains no material previously published or written
by another person except where due reference is made.
Signed:
Date:
Acknowledgments
I greatly appreciate the supervision, time, books, motivation and patience shown
by Anthony Maeder and Wageeh Boles, my two supervisors over the course of this
PhD. Many thanks also to my associate supervisor Jim Patrick and the financial
assistance from Cochlear Inc and the Australian Research Council.
I was lucky to have worked with Justin Boyle during the early stages of this
PhD and have really appreciated his advice, friendship, and comments on my
papers. Thanks also to everyone in the SAIVT lab, especially to Chris McCool,
Patrick Lucey, Brendan Baker and Robbie Vogt. Also thanks to Michael Mason
for assistance over the past three years.
I am grateful to Vivek Chowdhury (who originally suggested researching AHV
mobility), and also to Gislin Dagnelie, Greg Suaning and Luke Hallum for helpful
comments. Thanks also to Peter Meijer for his encouragement, and references;
Jason Ford for his comments on insect vision included in Chapter 8 of this the-
sis; Trevor Laimer for his assistance with setting up the L Block experiment in
Chapter 8; Graham Kerr for allowing me to borrow the gait lab at Kelvin Grove
campus for the experiment in Chapter 7; Andy Boud (VR Solutions) for allowing
me to trial the VR HMD; and Darren Stacey for helping with the PDA headgear
attachment used in Chapter 7. Thanks to Bashir Ebrahim (Guide Dogs Queensland)
for helpful discussions and letting me trial a number of ETAs. Thanks also
to Jan Lovie-Kitchin, Graham Kerr, Grace Soong, Doug Mahar (all QUT) and
Anthony Richardson (CSIRO/UQ) for experimental design and statistical advice.
Thanks also to my parents for their support and encouragement of my edu-
cation over my life.
Finally, and most importantly, to Joanne and Lewis: I truly appreciate your
support, laughs, encouragement and company, and look forward to having more
time to spend with you! I couldn’t have done this without you both.
Jason Dowling
Queensland University of Technology
May 2007
Publications
The research has resulted in the following fully refereed publications (or abstract
refereed where indicated by an asterisk).
Book Chapters
1. Dowling, J., Boles, W., & Maeder, A. (2007). Visual Prostheses for the
Blind: A Framework for Information Presentation. Mechatronics and
Machine Vision in Practice, Billingsley J. (ed), Springer-Verlag,
Heidelberg (In Press).
Journal Articles
2. Dowling, J. (2005). Artificial human vision. Expert Review of Medical
Devices, 2(1), 73-85.
3. Dowling, J., Maeder, A., & Boles, W. (2006). Effects of low spatial
resolution and frame rate on mobility assessment using simulated artificial
human vision. Displays (Submitted)
Conference Papers
4. Dowling, J., Maeder, A., & Boles, W. (2003). Intelligent image processing
constraints for blind mobility facilitated through artificial vision.
Proceedings of the 8th Australian and New Zealand Intelligent Information
Systems Conference (ANZIIS), 109-114.
5. *Dowling, J., Maeder, A., & Boles, W. (2004). Mobility enhancement and
assessment for a visual prosthesis. In A. A. Amini & A. Manduca (Eds.),
Proceedings of SPIE – Volume 5369 Medical Imaging 2004: Physiology,
Function, and Structure from Medical Images (pp. 780-791). San Diego.
6. Dowling, J., Maeder, A. J., & Boles, W. W. (2005). A PDA based
Artificial Human Vision Simulator. Proceedings of the APRS Workshop
on Digital Image Computing, February 2005, Brisbane, Australia, 109-114.
7. Dowling, J., Boles, W., & Maeder, A. (2005). Mobility assessment using
simulated Artificial Human Vision. Proceedings of the 2005 IEEE
Computer Society Conference on Computer Vision and Pattern
Recognition (CVPR’05) Volume 03 (pp. 32): IEEE Computer Society.
8. *Dowling, J., Boles, W., & Maeder, A. (2005). Enhancing Artificial
Human Vision Systems to Assist Blind Mobility. Proceedings of the 2005
Smart Systems - Postgraduate Research Conference (pp. 133-141):
Queensland University of Technology.
9. *Dowling, J., Boles, W., & Maeder, A. (2005). The effect of frame rate
and spatial resolution on mobility using simulated Artificial Human
Vision. Digital Image Computing: Techniques and Applications (DICTA
2005) - Unreviewed poster session, December 2005, Cairns, Australia.
10. Dowling, J., Boles, W., & Maeder, A. (2006). Simulated Artificial Human
Vision: The Effects of Spatial Resolution and Frame Rate on Mobility.
Advances in Intelligent IT - Active Media Technology 2006 (pp. 138-143),
June 2006, Brisbane, Australia.
11. Dowling, J., Boles, W., & Maeder, A. (2006). A Display Framework for
Artificial Human Vision information presentation. Thirteenth Annual
Conference on Mechatronics and Machine Vision in Practice (M2VIP
2006), December 2006, Toowoomba, Australia.
Chapter 1
Introduction
1.1 Motivation and Overview
There are currently a number of research teams investigating the partial restora-
tion of sight to blind people. As early as 1929 it was noted by Otfrid Foerster
that stimulating the visual cortex of a person led to the perception of spots of
‘light’ [90]. These spots are referred to as phosphenes [90]. More recently it has
been found that stimulation of other components of the visual pathway (such as
the optic nerve or retina) can lead to phosphene perception. Phosphenes have
also been reported as a result of magnetic stimulation [82], hallucinogenic drugs
[202], and space travel [192]. With recent advances in digital camera technology,
computing, neuroscience and electrode technology, much progress has been
made toward building a useful Artificial Human Vision (AHV) system to present
phosphene information to a blind person.
The main motivation for AHV research is the promise of an improved quality
of life for blind people. Economic productivity could also be enhanced: for exam-
ple, it has been estimated that if all avoidable blindness in the USA in persons
under 20 and working-age adults were prevented, the federal budget could save
US$1.0 billion per year [245].
However, despite their great promise, current AHV systems face a number of
constraints, including limitations in the number of electrodes
which can be implanted and the perceived spatial layout and frame rate of
phosphenes. The development of computer vision techniques that can maximise
the value of the limited number of phosphenes would be useful in compensating
for these constraints. A further problem is the limited number of people who have
received an AHV system implant and who are available for further research. For
this reason, there is a need to conduct simulated AHV experiments with normally
sighted research participants.
Three main functional requirements for blind users of an AHV system are
the ability to read text [47], [70], to recognise faces [230], [226], and to travel
independently (mobility) [29], [117]. Although reading and face recognition research have received attention in
simulation studies (for example [226], [19]) there has been less research conducted
on mobility. One reason for this gap in the literature is the difficulty in measuring
mobility objectively. The lack of an objective method to assess mobility perfor-
mance, allowing the comparison between different devices, was expressed by AHV
pioneer William Dobelle in 2000:
“We know of no objective method for comparing our ‘artificial vision’
system with a cane, guide dog, or other aid for the blind. For example,
there is no standard obstacle course on which such devices, or the
performance of volunteers using them, can be rated.” [48]
This problem remains unresolved, as reported by Trick in 2004 [229]. However,
in this thesis it is argued that there is existing literature on low vision and blind
orientation and mobility (O&M) which can be adapted for this purpose during
AHV system development.
1.2 Aims and Research Questions
The general aims of the work described in this thesis are:
(i) To investigate, develop and evaluate techniques for mobility assessment
which will allow the objective comparison of different AHV system phosphene
presentation methods.
(ii) To develop a display framework for the presentation of AHV system in-
formation, and use this framework to guide the development of an AHV
simulation device.
The research questions which will be addressed in this thesis are:
1. Can specific main factors be identified as highly significant for
providing mobility information in an AHV system?
2. Can objective measures be developed for the comparison of effec-
tiveness between AHV systems in providing mobility information?
3. Can computer vision techniques be adopted and modified to pro-
vide mobility information in an AHV system?
1.3 Research Scope
(i) Currently AHV development in Australia (and in many other countries)
is limited to animal experiments as there is no access to implanted patients.
To overcome this limitation, the mobility experiments described in
this thesis present an artificial visual scene simulation display to normally
sighted research participants. It is anticipated that the mobility experi-
ments described in this thesis could be repeated using implanted patients
when available.
(ii) The images presented to participants in this thesis are shown in an ordered
array (for example, a rectangular grid of 12 x 16 phosphenes). Current
generation AHV systems may not be capable of aligning phosphenes well in
such an ordered array, and these phosphenes may appear irregular in shape.
However, it is anticipated that future AHV technology will be capable of
addressing this current limitation.
(iii) A focus of this thesis is methodology. It is not the intention to validate a
particular method to a high degree of integrity.
(iv) The AHV simulations used in this thesis are currently more easily applied to
AHV systems involving a retinal implant. However, it is anticipated that the
methods used will eventually be applicable for other types of implants in the
future. In support of this statement, it has been reported that the perceived
display from the Dobelle Institute cortical surface electrode system has been
enhanced by Sobel edge detection [46].
(v) The aim of the work contained in this thesis was not to develop the best
AHV simulation device. A guiding principle in this work was to keep the
cost of simulation hardware to a minimum.
(vi) This thesis also identifies shortcomings of AHV simulation methods and
additional areas for future research.
1.4 Thesis Organisation
This thesis is structured as two main sections, background (Chapters 1-4) and
experimental (Chapters 5-8), followed by a conclusion (Chapter 9).
The background section provides a detailed review of fundamental theory
relevant to AHV mobility. Chapter 2 includes an overview of blindness and
mobility before providing a comprehensive review of blind and low vision mobil-
ity assessment. Chapter 3 describes current AHV system research, including
technology, requirements, stimulation locations, constraints and simulation re-
search. The final part of the background section is Chapter 4, which presents
methods for processing image information for AHV systems and a discussion of
previous applications of computer vision to assist the vision impaired. At the end
of Chapter 4 a proposed conceptual framework for the display of mobility related
information using an AHV system is provided. This framework drives research
questions around which the remaining thesis chapters are based.
The experimental chapters explore aspects of the conceptual framework pre-
sented in Chapter 4. Chapter 5 investigates the effect of four different image
processing methods on the recognisability of mobility components contained in
low quality AHV simulation images. The remaining three experimental chapters
focus on mobility assessment using custom built head-mounted AHV simulations.
These chapters are based on processing image sequences (i.e. video). Chapter 6
evaluates the use of a Personal Digital Assistant (PDA) based real-time ‘looming
obstacle’ alert in a typical outdoor environment. In Chapter 7 this obstacle alert is
compared with two other AHV simulation display types using an indoor artificial
mobility course. In Chapter 8 the effects of two significant factors constraining
AHV systems (temporal and spatial resolution) are investigated in an artificial
mobility experiment involving 60 normally sighted volunteers.
Chapter 9 summarises the work in this thesis and provides a discussion of
how the research can be extended.
1.5 Original Contributions of Thesis
Resulting from this work are three significant original contributions to knowledge.
These contributions are explored through examination of the research questions
presented in Section 1.2.
The first research contribution resulting from this work is a conceptual frame-
work based on literature reviews of blind and low vision mobility, AHV technol-
ogy, and computer vision. This framework incorporates a comprehensive number
of factors which affect the effectiveness of information presentation in an AHV
system. Experiments reported in this thesis have investigated a number of these
factors using simulated AHV with human participants. It has been found that
higher spatial resolution is associated with accurate walking (reduced veering),
whereas higher display rate is associated with faster walking speeds. In this way
it has been demonstrated that the conceptual framework supports and guides
the development of an adaptive AHV system, with the dynamic adjustment of
display properties in real-time.
The second research contribution addresses mobility assessment which has
been identified as an important issue in the AHV literature. This thesis presents
the adaptation of a mobility assessment method from the blind and low vision
literature to measure simulated AHV mobility performance using real-time com-
puter based analysis. This method of mobility assessment (based on parameters
for walking speed, obstacle contacts and veering) is demonstrated experimentally
in two different indoor mobility courses. These experiments involved sixty-five
participants wearing a head-mounted simulation device.
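As a minimal sketch of how the walking speed parameters could be computed, the following Python fragment assumes that Speed on the Mobility Course (SMC) is the course length divided by the completion time, and that Percentage of Preferred Walking Speed (PPWS) is SMC expressed as a percentage of Preferred Walking Speed (PWS). The function names and the example values are hypothetical and are not taken from the experiments reported in this thesis.

```python
def speed_on_course(course_length_m: float, time_s: float) -> float:
    """Speed on the Mobility Course (SMC) in metres per second (assumed definition)."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return course_length_m / time_s


def ppws(smc_m_per_s: float, pws_m_per_s: float) -> float:
    """Percentage of Preferred Walking Speed (assumed: 100 * SMC / PWS)."""
    if pws_m_per_s <= 0:
        raise ValueError("pws_m_per_s must be positive")
    return 100.0 * smc_m_per_s / pws_m_per_s


# Hypothetical participant: 30 m course walked in 60 s, preferred speed 1.25 m/s.
smc = speed_on_course(30.0, 60.0)
score = ppws(smc, 1.25)
print(f"SMC = {smc:.2f} m/s, PPWS = {score:.1f}%")
```

Expressing speed relative to each participant's own preferred speed allows participants with different natural walking speeds to be compared on the same scale.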
The final research contribution in this thesis is the development and evalua-
tion of an original real-time looming obstacle detector, based on coarse optical
flow, and implemented on a Windows PocketPC based Personal Digital Assistant
(PDA) using a CF card camera. PDA based processors are a preferred main pro-
cessing platform for AHV systems due to their small size, light weight and ease
of software development. However, PDA devices are currently constrained by
restricted random access memory, lack of a floating point unit and slow internal
bus speeds. Therefore any real-time software needs to maximise the use of inte-
ger calculations and minimise memory usage. This contribution was significant
as the resulting device provided a selection of experimental results and subjec-
tive opinions. Experiments using this device were conducted in both indoor and
outdoor environments and are discussed in Chapters 6 and 7.
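The kind of integer-only processing that these constraints imply can be sketched as follows. This Python fragment is an illustration only and is not the PDA implementation: it assumes an 8-bit grey-level image stored as a list of rows, reduces it to 8 grey levels with a bit shift, averages 5x5 pixel tiles using integer division (so that a 160x120 image becomes 32x24 blocks), and compares two block arrays using a Sum of Absolute Differences (SAD). All function names and the synthetic test frames are hypothetical.

```python
def reduce_grey_levels(image, shift=5):
    """Map 8-bit pixels (0-255) to 8 levels (0-7) using an integer right shift."""
    return [[pixel >> shift for pixel in row] for row in image]


def to_blocks(image, block=5):
    """Average each block x block tile using integer arithmetic only."""
    height, width = len(image), len(image[0])
    area = block * block
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(block) for dx in range(block)) // area
            for x in range(0, width - block + 1, block)
        ]
        for y in range(0, height - block + 1, block)
    ]


def sad(blocks_a, blocks_b):
    """Sum of Absolute Differences between two equally sized block arrays."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(blocks_a, blocks_b)
        for a, b in zip(row_a, row_b)
    )


# Synthetic 160x120 frames: a uniform mid-grey frame, and a copy in which
# a brighter 40x40 pixel region has appeared.
previous = [[128] * 160 for _ in range(120)]
current = [row[:] for row in previous]
for y in range(40, 80):
    for x in range(60, 100):
        current[y][x] = 255

prev_blocks = to_blocks(reduce_grey_levels(previous))
curr_blocks = to_blocks(reduce_grey_levels(current))
difference = sad(prev_blocks, curr_blocks)
print(len(curr_blocks[0]), "x", len(curr_blocks), "blocks, SAD =", difference)
```

Every operation here is a shift, an addition, a subtraction or an integer division, so no floating point unit is needed, and only two small block arrays have to be kept in memory between frames.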
In addition, this thesis is original as it synthesises information from a number
of different research areas and supports this synthesis through scientific experi-
mentation.
Chapter 2
Blind and Low Vision Mobility
Issues and Assessment
2.1 Introduction
Before it is possible to measure how beneficial an Artificial Human Vision (AHV)
system will be, it is necessary to establish the main physical and psychological
requirements for such a system. In addition there is a requirement to establish
what the quality requirements are, and how they can be met. This chapter com-
mences by defining blindness and low vision. A review of the main mobility issues
for the blind is then provided, followed by a discussion of existing blind mobility
aids (such as the long cane, guide dog, and electronic devices). Orientation and
environmental accessibility are important factors which affect blind mobility, and
are briefly defined and discussed. A section on the ecological psychology work of
James Gibson is also provided in this chapter, as it provides a bridge between
mobility research and computer vision (for example regarding the concept of optic
flow). Finally the bulk of this chapter provides a thorough review of the literature
on mobility assessment.
2.2 Blindness and Low Vision
This section provides a brief overview of the definition and causes of blindness.
The term Blindness usually refers to Legal Blindness which is defined as a
visual acuity of less than 20/200 after correction (that is, detail which a normally
sighted person can resolve at 200 feet can only be resolved at 20 feet) or where
the visual field is restricted to 20° or less (a normally sighted person has a
visual field of almost 180°). This definition
of blindness is used in this thesis. Blindness can also include a variety of highly
specific defects such as a loss of vision in a particular visual field area or a lessened
ability to see in low illumination [180]. In 1997 the World Health Organization
(WHO) estimated that there were close to 150 million people with significant
visual disability worldwide, with 38 million of those people being legally blind
[246].
Approximately 10% of legally blind people have no light perception at all [143],
and are described as totally blind. Currently, most AHV research is
targeted at the totally blind.
People defined as having Low Vision have serious visual impairments (such
as central field loss, tunnel vision or blurred vision), but are not necessarily
legally blind. Because AHV systems are expected to restore vision only partially,
previous mobility research conducted with low vision participants should provide
useful insight for AHV system requirements.
Common causes of blindness include hereditary retinal degeneration and
age-related macular degeneration [143]. Due to projected ageing and growth of the
Australian population, by 2030 the number of Australians aged over 50 years with
severe visual impairment is expected to more than double (from 25,590 to 57,930 people) [68].
In economically developed societies, the leading cause of blindness and visual
disability in adults is diabetic retinopathy. Around 120 million people worldwide
have diabetes and after 15 years approximately 2% of those people become blind
while about 10% develop severe visual disability. Eye injuries account for around
1 million cases of blindness each year [247]. The most common non-preventable
cause of blindness in the developed world is age-related macular degeneration,
which occurs in 25% of people aged 80 years and over [248].
Nine out of ten blind people live in the developing world and 18% of the
world’s blind are estimated to live in China [249]. In general, more than two-
thirds of today’s blindness could be prevented or treated by applying existing
knowledge and technology [247]. Nearly half of all blindness is due to cataract
and a quarter of the world’s blindness is due to trachoma (infection with
Chlamydia trachomatis, spread by flies from infected excreta). Other major causes of
blindness are glaucoma (a group of eye diseases characterized by an increase in
intraocular pressure), onchocerciasis (a parasitic disease) and
xerophthalmia (caused by vitamin A deficiency) [30].
2.3 Mobility and related issues identified for people
with low vision and blindness
A number of issues concerned with movement and perception for the blind and
those with low vision have been discussed in the literature. This information
is valuable for AHV research as a successful AHV device should be capable of
enhancing the ability of a user in coping with these issues.
Mobility is commonly defined as the ability to travel between locations grace-
fully, safely, comfortably and independently (Foulke 1970; cited in [200]). For
example, mobility includes the ability to move through space without accidental
contact with obstacles [200]. Mobility is a complex task that can be affected by
variations in environmental conditions, the personal characteristics of travellers,
and situational factors such as the traveller’s familiarity with the area [135].
Congenitally blind (meaning those who are blind from birth) children often have
hypotonia, or abnormally low muscle tone (due to delayed sensorimotor develop-
ment), which can affect mobility. Congenitally blind children usually commence
crawling after 13 months of age (compared to the sighted average of eight months)
[188]. An additional problem, experienced by most blind patients with no light
perception, is falling out of phase with the 24 hour day, often leading to severe
sleep disorders [221]. Blind people may also have multiple disabilities, such as
hearing loss.
Factors identified for low vision mobility include the amount of residual vision,
the age of onset of visual impairment, posture and balance, intelligence, space
orientation, auditory-tactile abilities and personality [145]. Age is another factor,
as many of the blind are elderly, which can restrict their ability to use some
mobility aids (such as a guide dog). Where there is substantial visual field loss,
the loss of the peripheral field affects mobility more than central field loss [145].
Two significant problems in street locomotion for the blind are the reliable
perception of objects, such as obstacles and landmarks, and adequate spatial and
geographic orientation [22]. Landmarks can also be obstacles, such as a tree or
steps.
Some of the everyday problems experienced by blind people involve street-
crossing, identifying and locating building entrances and interacting with auto-
matic teller machines and information kiosks [189]. Street crossings can cause
significant anxiety [73]. For example, Guth et al. [87] have reported that crossing
a street usually involves the following four main tasks:
• Detecting the street (ramp slope, sound, traffic, texture).
• Aligning the body (for example, by using the bars on a sewer grate, or by
tracking traffic sound).
• Deciding when to initiate the crossing (which can be difficult with quiet
cars and bicycles).
• Walking across the road in a straight line (veering from a straight line is
common).
A summary of blind pedestrian needs was published by the National Research
Council in 1996 [64]. They were reported as:
1. Detection of obstacles in the travel path from ground level to head height
for the full body width.
2. Travel surface information (including texture and discontinuities).
3. Detection of objects bordering the travel path.
4. Distant object and cardinal direction information (particularly for the pro-
jection of a straight line).
5. Landmark location and identification information.
6. Information enabling self-familiarisation and mental mapping of an envi-
ronment.
Dangerous situations for the blind or partially sighted have been reported by
Pelli [172] as drop-offs (such as train platforms) and moving vehicles. This is
supported by Brambring [22], who stated that the most dangerous obstacles are
downward steps and low or fast moving obstacles. An additional problem involves
making unwanted contact with a pedestrian which can be socially awkward and
may pose a threat to a person’s safety [74]. Additional problems have been
reported by Geruschat and Smith [73] as lighting conditions and glare (e.g. light
adaptation), changes in terrain and depth (stairs, curbs), differences between
reduced acuity (reading, etc) and restricted fields (trouble with groups of people,
shopping, etc) and visual clutter (for example, many signs or complex signs).
2.4 Primary mobility devices for the blind
This section provides an overview of the main mobility devices currently available
for the blind. An understanding of these mobility devices is important, as an AHV
system may need to provide similar functionality to compete commercially. To be
successful, the benefit from using a mobility aid must outweigh the cost
(which includes financial, mental, emotional and physical aggravation [90]). It is
also possible that a traditional mobility device and an AHV system may be used
in combination.
A primary mobility aid is one which can provide enough information about
a person’s immediate environment to allow them to be mobile. The amount of
time which a person has to react to the environment is important and is referred
to as preview time. The most important mobility aids are the long cane and the
guide dog, which are briefly described below.
2.4.1 Long Cane
The long cane is capable of providing a blind person with sufficient information
for safe movement in the immediate environment at a low cost. Mobility enhance-
ment in a blind person after long cane training has been described as dramatic
[52]. Long canes are usually made from fibreglass and are designed to provide
good vibration conductivity (different tips can be used on the end of the cane to
provide this information).
The high visibility of the long cane to drivers and other pedestrians can be an
advantage, although this may also make it easier for criminals to identify a target.
The most significant problem with the long cane is that it only provides two paces
of preview. A long cane user needs fast reaction times due to the limited amount
of preview information. Cane use requires a high level of concentration and the
arm movements can cause tiredness. A further significant problem with the cane is
that it does not protect a person against obstacle collisions to the upper part of
their body (in Australia, an example of such an obstacle is a wall-mounted public
telephone, shown in Figure 4.1). There is also a risk of tripping other pedestrians
with a cane [64].
2.4.2 Guide Dogs
Guide dogs provide good mobility assistance by pulling in the same way that a
human guide would. They are able to respond to hand and voice signals (such as
‘forward’) and are trained to avoid obstacles, prevent veering in street crossings,
and stop if there is a dangerous situation. Guide dogs are also trained to intel-
ligently disobey commands that are not safe. Another benefit is that a dog can
remember common landmarks (such as a particular shop door).
However, guide dogs are not suitable for people who are not comfortable with
dogs, are not physically fit or cannot maintain a dog [244]. In addition, a guide
dog user still requires a high level of mobility skill and would usually be required
to have long cane training. Guide dogs are also expensive to train and there are
a large number of dogs who cannot be trained to the required high standards.
2.4.3 Electronic Travel Aids
A large number of Electronic Travel Aids (ETAs) for the blind have been developed
over the past 40 years. ETAs are designed to transform environmental
information into a form that can be conveyed through other sensory modalities
(auditory or tactile) [64]. Information from the ETA is usually obtained through
three sensor types: ultrasonic, laser or visible light. An AHV system could be
considered to be an ETA device (although this has not been reported in the AHV
literature).
As mentioned above, preview is important for the visually impaired traveller,
as it allows them to anticipate problems before they occur. The ability of a device
to provide greater preview information than the long cane could be very useful.
Therefore many ETAs attach the sensors to the long cane.
Ultrasound based devices (which generally provide auditory information) in-
clude the SonicTorch, SonicGuide and SonicPathfinder [99]. More recent ultra-
sonic devices include the Navbelt and Guidecane [201]. Greg Phillips from the
Australian company, GDP Research [176], has developed a low-cost, hand held
ultrasonic device called the Miniguide. The Miniguide has been successfully tri-
alled by the Guide Dog Association of New South Wales, and is supplied at no
cost to blind people by the Guide Dogs for the Blind Association of Queensland
[101]. A recent ultrasound device is the Ultracane, developed by researchers from
Sound Foresight in the UK [69]. This device emits ultrasonic waves from the
cane handle, which are recorded as they bounce back from objects. The Ultra-
cane uses two buttons which vibrate, enabling distance and direction information
to be transmitted.
The original laser ETA device, developed during the 1960s, was the Laser
Cane, which used three low power lasers [53]. This device used the different
lasers to attempt to detect drop offs, overhanging obstacles and forward obstacles.
However, the tactile and auditory output was found to be confusing and the device
was very expensive. A problem with using laser energy is that wet surfaces can
provide misleading information, as light is reflected away from the device. A
recent version of the laser cane has been developed by the German company
VISTAC [121]. This device uses a single laser to detect objects directly above
the cane in the head and chest area. The cane then vibrates when an obstacle is
present.
An alternative ETA approach is to process images captured from a head
mounted camera and provide an auditory representation of these images via head-
phones. There are currently two systems which use this method. The first is the
vOICe system developed by Peter Meijer [150] which processes image informa-
tion (using a 64x64 pixel array) and provides a sound representation once per
second. The second device is the Prosthesis for Substitution of Vision by Audition
(PSVA), which maps images at a lower resolution (128 pixels, comprising
8x8 pixels for the centre of the image and 8x8 pixels for the image periphery),
but provides information at a higher rate of 25 images per second [177].
Despite the large range of ETAs which have been developed, none has achieved
widespread acceptance by blind people. The main reasons for the low uptake of
devices have been that they offer little benefit in mobility, are expensive and are
cosmetically unattractive [52]. The lack of objective assessment of mobility benefits
from ETA devices has also hindered development. There is often little or no published
research on the benefits of these devices, which makes it difficult for consumers
to compare the benefits of different devices. Training programs also need to be
developed and conducted for device users.
2.5 Orientation Aids
Orientation is defined by Brabyn et al. [21] as a person knowing where they are
in absolute terms of reference. Blind people may use different problem solving
strategies such as locating landmarks, recalling mental maps of familiar places,
asking for help, or using systematic familiarisation to explore an environment
[134]. It should be possible to integrate orientation information with an AHV
system (for example using GPS data).
Existing orientation aids include large print or tactile maps which can provide
cognitive maps for the blind traveller. The most useful tactile maps are those that
model the real world (for example, using Lego or Matchbox cars). Environmental
regularities, such as parking meters or sidewalks, can be monitored to solve orientation
problems. There is usually a pattern in number systems, such as house numbering
[14]. Verbal aids can be useful for the rote learning of routes: these may involve
a running documentary of a route in the actual time frame or a recording of a
route from observation [251]. Future orientation devices will probably consist of
electronic databases of geographic information combined with GPS to provide
orientation maps. It should be possible to integrate this type of information with
an AHV system.
2.6 Environmental accessibility
An alternate approach to assisting with blind mobility is to make changes to
the environment. The environmental information which is often unavailable to a
blind traveller includes: the names of streets and landmarks, room numbers, bus
numbers/destinations, directional information in train/bus stations, intersection
configuration/type of traffic control and the status of traffic light cycles [14].
Changes to the environment include tactile guide strips (also known as Braille
strips) which can be used to indicate the direction of paths, or warning of drop-
offs, such as stairs. When tactile strips are correctly applied, the safety and
confidence of a blind person can be significantly increased, while only mildly
inconveniencing other pedestrians [124].
Environmental accessibility for the blind and partially sighted can involve
the use of a logical design layout (for example, stairs should be next to a lift),
assistance with visibility (for example, hand rails should have high contrast)
and adequate lighting (which should be 50-100% greater than that required
for the normally sighted) [14].
Useful mobility and navigation information can also be provided by locating
transmitters at strategic locations in the environment; a blind person can then
use a hand held receiver to receive directional information about the landmark.
This approach has been used in the ‘Talking Signs’ program which has been
implemented in many locations in San Francisco [222]. An adaptable AHV system
may be able to integrate sign recognition and landmark location, and present this
information to a blind traveller.
Further details of computer vision research for the blind are provided in Chapter
4.
2.7 The ecological approach to perception
The work of perceptual psychologist James Jerome Gibson provides a bridge
between visual perception, mobility and computer vision research, and his work
is heavily cited in the literature for each of these fields, in addition to the fields
of ergonomics and design. Gibson suggested that perceptual systems evolved in
moving organisms, and to understand perception it is necessary to consider the
immediate environment within which the organism has evolved. This ecological
approach to visual perception emphasises movement in a complex and changing
environment. This suggests that the assessment of mobility should be conducted
while research participants are moving.
Blind mobility is possible because perception involves senses apart from vision.
For example, people with normal vision generally do not notice the difference in
reading on a bright versus a dull day or hearing a tune in a different key [52].
Similarly, a blind person may perceive an obstacle near their head, but may not
realise that they have used ‘facial vision’ (auditory echo location) [77]. In the
context of blind mobility, perception can be considered to be the combination
of exploratory actions and knowledge of the surroundings gained from looking,
touching and hearing [87].
Invariant components of an environment are those that maintain their identity
but may change in appearance (such as the ground or a cup). The ecological
approach suggests that invariant components provide affordances, such as walk-
on-ability, grasp-ability or collision [139]. Surfaces are particularly important for
perception, as the surface affords locomotion to an organism. The ground is the
most important surface. However, it is possible to misperceive affordances (such
as mistaking a glass wall for an opening) [78].
The movement of the body, head or eyes of an observer produces a trans-
formation in the entire retinal image, which Gibson called optic flow [77]. For
example, there will be a lateral flow of information across the retina when the
observer moves their head from left to right and an expanding pattern as the
observer moves forward. As a person walks through an environment the trans-
formations in optic flow reveal the layout of surfaces by occlusions/disocclusions
and expanding patterns produced by approaching obstacles [53]. The direction
of observer motion is indicated by the point in the optic flow from which motion
vectors originate (this is called the focus of expansion). Optic flow computation
has generated a large amount of literature in computer vision, as it is useful for
solving motion problems with stationary or moving cameras, can provide rela-
tive distances of objects in an image sequence, and can be used to represent the
three-dimensional motion of an object across a two-dimensional image [206]. This
method is discussed further in Chapters 4, 6 and 8 of this thesis. Gibson’s theo-
ries on motion led to the speculation and discovery of higher order visual neurons
that analysed differences between the centre and surround of a receptive field,
independent of direction [153].
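The idea that flow vectors radiate from the focus of expansion can be illustrated with a short sketch. This is not the method used in this thesis; it is a minimal least-squares illustration under the assumption of pure forward translation, where every flow vector lies on a line through the focus of expansion. The function name, the synthetic points and the scaling factor are all hypothetical.

```python
def estimate_focus_of_expansion(points, flows):
    """Least-squares estimate of the focus of expansion (FOE).

    points: list of (x, y) image positions
    flows:  list of (u, v) flow vectors measured at those positions
    Assumes pure forward translation, so that every flow vector lies on a
    line passing through the FOE (an illustrative simplification).
    """
    # Each vector gives one linear constraint: v*x0 - u*y0 = v*x - u*y
    s_vv = s_uu = s_uv = r0 = r1 = 0.0
    for (x, y), (u, v) in zip(points, flows):
        b = v * x - u * y
        s_vv += v * v
        s_uu += u * u
        s_uv += u * v
        r0 += v * b
        r1 += -u * b
    # Solve the 2x2 normal equations
    # [[s_vv, -s_uv], [-s_uv, s_uu]] @ [x0, y0] = [r0, r1]
    det = s_vv * s_uu - s_uv * s_uv
    if abs(det) < 1e-12:
        raise ValueError("flow vectors are parallel or zero; FOE is not determined")
    x0 = (r0 * s_uu + s_uv * r1) / det
    y0 = (s_vv * r1 + s_uv * r0) / det
    return x0, y0


# Synthetic expanding flow field centred on a hypothetical FOE at (4, 6)
true_foe = (4.0, 6.0)
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
flows = [(0.5 * (x - true_foe[0]), 0.5 * (y - true_foe[1])) for x, y in points]
foe_x, foe_y = estimate_focus_of_expansion(points, flows)
```

With noisy flow from a real image sequence the constraints would no longer be exactly consistent, and the same normal equations would return the point that best fits them in the least-squares sense.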
Although Gibson’s theories have been influential, two main limitations of his
work have been identified. First, Gibson underestimated the difficulty of the
visual system detecting invariants. Secondly, his work did not explain how three-dimensional
information is detected by a moving observer [144]. Later information
processing approaches to visual perception (such as the seminal work of
David Marr which is discussed briefly in Chapter 4) have provided greater insight
into these limitations.
2.8 Mobility assessment
As stated in Chapter 1, a number of AHV researchers have commented that
they are unable to objectively compare AHV system information presentation
methods. This section presents a critical review of blind and low vision mobility
assessment research. This research has generally been undertaken to test the
effects of instruction by orientation and mobility specialists on mobility (learning
effect) and the evaluation of specific mobility aids (such as a cane, or electronic
device).
Mobility assessment should be able to allow the objective testing of different
techniques or devices. However, it is difficult to generalise tightly controlled
findings from a laboratory setting to real-world mobility tasks. Even within the
laboratory it is difficult to manipulate variables experimentally or to measure
responses [214].
Three main methods for assessing mobility are reported in the literature:
self report questionnaires, field experiments and artificial environments. These
methods are presented in the following three subsections. A trend in recent
research has been to include both artificial and field experiments, and these papers
are reviewed in the fourth subsection.
2.8.1 Mobility assessment: Self report research
Self report research is the most common method of assessing mobility performance
and may be the simplest method to obtain information on which parts
of the environment contribute to mobility problems. However, this method of
research is less reliable than the other methods of mobility assessment, as it relies
on subjective reports which may be biased. This method of mobility assessment is
not widely reported in the literature; however, the following three papers illustrate
the types of interesting results that can be obtained from this approach.
In the first research paper, two experiments were conducted by Brambring
[22] into mobility problems experienced by blind people while walking on streets.
These investigations provide some insight into what information is most useful
for blind mobility. In the first investigation, four blind students (aged in their
twenties) were asked to describe a single walk on their daily routine from a dormitory
to their bus stop. Their descriptions were recorded and later analysed to
reveal that landmarks were the most frequently mentioned items, with distances,
directions and obstacles less frequently mentioned.
In the second experiment by Brambring, nine blind subjects (with an average
age of 25 years) were asked to walk along two different routes, and describe the
route for another blind person. In addition, nine sighted subjects (with similar
age, sex and education to the blind subjects) were asked to walk along the same
two routes and record descriptions for another sighted person. Transcriptions
of the descriptions suggested that the blind need considerably more information
for mobility than sighted people, which would require greater memory and more
effort to recall. The blind subjects made more explicit statements about distances
than the sighted subjects. Although these experiments provide some subjective
mobility information, a problem with this approach is that individual differences
(for example, in the ability to provide a description) could be significant variables.
Another example of a self report experiment, designed to gain an understand-
ing of how different types of visual problems affect mobility, was conducted by
Passini et al. [168] and involved interviewing 47 subjects who ranged from total
blindness to having strong residual vision. Passini et al. reported that the most
mobility problems occurred in vast spaces (indoor and outdoor), shopping centres,
department stores, hotel lobbies, and public transport stations. In addition,
potential sources of mobility accidents were reported as: descending stairs (without
a structural warning or handrail), benches, half-open doors, and objects which
cannot be detected with a cane on the ground (such as public telephones fixed on
walls or rear vision mirrors on trucks). Subjects were most worried about cars.
Passini et al. noted that mobility research needs to involve communication with
visually impaired people. Although this study provides useful information, one
problem is that the interviewees may not have been aware of, or may not have
remembered, all mobility problems they had encountered.
2.8.2 Mobility assessment: Field Experiment research
Some people (for example, those with cognitive impairment) may have trouble
expressing mobility performance in a self-report assessment, and for them a
performance-based assessment may be more accurate. Field experiments involve the use of
real-world environments for mobility assessment, such as streets or shopping centres.
Although there is less control over variables in field experiments compared
to laboratory studies (such as the frequency of pedestrians, noises, varying light
sources and unpredictable obstacles), they can be used to measure participant
behaviour in an objective way and the results may be more generalisable to real-life
mobility situations than laboratory experiments. A number of field
experiment studies will now be discussed.
Productive Walking Index
During the 1970s, an influential Nottingham University research team developed
a mobility assessment technique which has been used to evaluate a number of
mobility aids. This assessment involved measuring a subject’s performance over
a 1300 meter course through an urban environment, which included a range of
typical mobility situations [6]. A video recording was made of the subjects as they
moved through this environment. Three trials were used on this course: in the
first trial the subject used their conventional aid (such as a cane), in the second
they used the mobility device to be tested. In the third trial, the subject was
again recorded using their conventional aid. The purpose of the third trial was
to control for the effects of familiarisation with the test route (a similar approach
is used in Chapters 7 and 8 of this thesis).
The three video recordings were then analysed for safety, efficiency and psy-
chological stress information. The definitions used for these scores are listed in
Table 2.1. The safety measures were designed to record the frequency and type
of body contact with the environment and the number of accidental departures
from the footpath. The Productive Walking Index (PWI) was used to measure
the mobility efficiency of the subject. PWI is the ratio between the time taken
to complete the course and actual time spent walking in the correct direction
[100]. If a subject spent time standing still or backtracking after an orientation
error, this would be reflected in the PWI. In a later follow-up study Dodds et al.
[54] used an outdoor mobility course which involved walking along a foot path
to a telephone booth. The course involved three road crossings and a number of
natural obstacles such as trees, lamp posts and a hedge. From the results of this
study, Dodds et al. reported that the PWI was a reliable mobility measure.
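As a rough illustration of how the PWI reflects time lost to standing still or backtracking, the following sketch computes the ratio in the form listed in Table 2.2 (time taken / time spent walking). The function name and the timings are hypothetical and are not data from the Nottingham studies; if the reciprocal convention is preferred, the two arguments can simply be swapped.

```python
def productive_walking_index(time_taken_s, time_walking_s):
    """Ratio of total course time to time spent walking in the correct direction.

    Follows the form given in Table 2.2 (time taken / time spent walking).
    Both arguments are in seconds; the names are illustrative assumptions.
    """
    if time_taken_s <= 0 or time_walking_s <= 0:
        raise ValueError("times must be positive")
    if time_walking_s > time_taken_s:
        raise ValueError("productive walking time cannot exceed total time")
    return time_taken_s / time_walking_s


# Hypothetical trial: 600 s on the course, of which 500 s were spent
# walking in the correct direction (100 s standing still or backtracking).
pwi_with_delays = productive_walking_index(600.0, 500.0)

# Hypothetical trial with no stops or backtracking.
pwi_continuous = productive_walking_index(600.0, 600.0)
```

In this form a value of 1.0 indicates continuous progress, and larger values indicate more time lost to stops or orientation errors.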
The average stride length was later removed from the Blind Mobility Unit
evaluation. Stride length was meant to give a measure of subject stress, however
it was found to be unreliable (as it relies on the number of steps taken in each part
of the route which would be dependent on a person’s gait) [54]. During the early
1980s the list of dependent variables in the Blind Mobility Unit evaluation method
was simplified and Dodds [51] proposed the measures in Table 2.2. However, a
problem with the Blind Mobility Unit assessment was the unreliability of data
between trials. Despite the same physical environment used for each trial, the
mobility route used by subjects differed slightly, which meant some obstacles
or environmental features were not within range of the participants [51]. An
additional problem uncovered from the Blind Mobility Unit studies was that many
long-cane users already travelled well and the effect of a secondary device, such as
an ultrasound-based Sonic Torch device, did not lead to a significant improvement
[51]. Also, when a subject was using a long cane, bodily contacts with obstacles
were very rare, which meant that most subjects scored high on safety and efficiency.
Table 2.1: Nottingham Blind Mobility Unit dependent variable measures [6].
1. Safety scores                 Body contacts at, or rising from, ground level
                                 Body contacts with the inner shoreline
                                 Body contacts with obstacles at, or near, head height
                                 Accidental departures from side curb
                                 Accidental departures from down curb
                                 Trips at up curb
2. Efficiency scores             Average walking speed (metres per second)
                                 Continuousness of progress (Productive Walking Index)
                                 Variation in pavement position
                                 Proportion of landmarks detected
                                 Average angle of veering on road crossings (degrees)
                                 Crossing efficiency index
3. Psychological stress scores   Average stride length
Percentage of Preferred Walking Speed
The Productive Walking Index was reviewed by Clarke-Carter et al. [41] who
suggested that a better measure was the Percentage of Preferred Walking Speed
(PPWS). The Preferred Walking Speed (PWS) is defined as the speed of a visually
impaired person walking at their preferred speed, with a sighted guide holding
their arm. PWS requires an instructor guiding a participant over a known dis-
tance and dividing the distance by the time taken. The walking efficiency of
experimental participants can then be measured and normalised over a longer
mobility course by calculating their measured speed as a percentage of the PWS
[208]:
$$PPWS = \frac{SMC}{PWS} \times 100 \qquad (2.1)$$
where both Preferred Walking Speed (PWS, measured over a short distance)
and Speed on the Mobility Course (SMC) are defined as:
$$PWS = SMC = \frac{\text{distance}}{\text{time}} \qquad (2.2)$$
Table 2.2: Revised Nottingham Mobility Unit measures from Dodds [51]. Shore-lining refers to following a path or wall using tactile or auditory information.
1. Productive Walking Index (time taken / time spent walking)
2. Body contacts with obstacles
3. Cane contacts with inner shoreline
4. Cane contacts with outer shoreline
5. Pavement position
6. Body contact with shoreline
7. Major safety errors (tripping, bodily contacts with obstacles)
The PPWS can be used as a between-participants measure to compare differ-
ent walking speeds, in addition to assessing mobility changes in a single partici-
pant. The PPWS allowed the assessment of subjects who did not use a cane and
allowed different subjects to be compared.
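The calculation in Equations 2.1 and 2.2 can be sketched in a few lines. The function names and the distances and times below are hypothetical values chosen for illustration only; they are not measurements from any of the studies reviewed here.

```python
def walking_speed(distance_m, time_s):
    """Speed as distance divided by time (Equation 2.2), in metres per second."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def ppws(smc, pws):
    """Percentage of Preferred Walking Speed (Equation 2.1)."""
    if pws <= 0:
        raise ValueError("preferred walking speed must be positive")
    return smc / pws * 100.0


# Hypothetical participant: preferred speed measured with a sighted guide
# over a short 20 m walk taking 16 s, then a 220 m mobility course in 220 s.
preferred_speed = walking_speed(20.0, 16.0)    # 1.25 m/s
course_speed = walking_speed(220.0, 220.0)     # 1.0 m/s
score = ppws(course_speed, preferred_speed)    # 80% of preferred speed
```

Because the score is normalised by each participant's own preferred speed, two participants with different natural walking speeds can be compared on the same scale.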
Clarke-Carter et al. [41] tested the new PPWS measure by recording walking
speed using a large backpack strapped to the research subject. This backpack was
then connected to a cumbersome device consisting of three wheels which would
have limited the usefulness of this measure (for example, when walking up stairs).
However, this study was able to show that participants using a guide dog had
significantly higher PPWS scores than participants who used the long cane.
A further field experiment using the PPWS was conducted by Dodds et al.
[55]. This study investigated if mobility in low vision clients could be predicted
by current theories of perceptual functioning. The authors developed a series of
four visual tasks based on the perceptual learning theories of James Gibson (discussed
in Section 2.7 above) and Ulric Neisser. These tasks (referred to as the
OCULA assessment and training suite) involved a subject pressing a computer
key as soon as they perceived movement or recognisable objects in simulations of
textural shearing, degraded figures, embedded figures and peripheral attention.
The results from 37 subjects suggest that the visual tasks were better predictors
of visual performance than visual field and acuity measures. This research also
suggested that learning from the OCULA tasks could be applied to real-life mo-
bility situations. A similar approach could be used in training people in using an
AHV system display.
The PPWS measure has also been used to measure mobility in a study of
simulated Retinitis Pigmentosa (RP) by Haymes et al. [94]. This study involved
20 normally sighted subjects wearing swimming goggles which had been painted
on the inside and with a 5 mm hole cut through the centre of each eye piece to
simulate advanced RP. Different filters were fixed inside the goggles to simulate
different lighting conditions. The outdoor mobility route was a 220 meter resi-
dential street of ‘moderate difficulty’, with obstacles such as driveways, cracked
concrete and overhanging branches. Haymes et al. stated that the effects of
learning a route on mobility performance diminish after two attempts, so sub-
jects were measured on the route twice, walking with normal vision. The time
taken for this was considered the preferred walking speed for the PPWS calcu-
lation. Ten subjects repeated the mobility experiments indoors; however, there
were no details of the test route used. A significant finding from this study was
that the clinical vision measures (such as visual acuity and the Melbourne Edge
Test) taken indoors were not useful in predicting outdoor mobility performance.
In another PPWS-based study by Haymes et al. [95], three real-world
mobility routes in Melbourne, Australia were used to measure the effects of vision
and psychological variables. Eighteen subjects with varying degrees of blindness
from RP were involved in the study. The first route (228 meters) involved a
quiet residential street, the second (202 meters) involved an outdoor small busi-
ness area with numerous obstacles (pedestrians, rubbish bins, seats, etc), and the
third (254 meters) took place in an indoor shopping centre with many obstacles
(pedestrians, escalators, plants, etc). Subjects walked the routes twice, in random
order and only the walking speed from the second trial was used. The preferred
speed of subjects was also calculated on a separate, obstacle free course and this
measurement was used for the PPWS calculation. Subjects also had a vision as-
sessment and took the NEO-PI (Neuroticism Extraversion Openness Personality
Inventory-Revised) to measure personality variables. This study found a highly
significant correlation between the vision assessment and mobility performance
in subjects with retinitis pigmentosa. However,
there were no significant correlations found between psychological variables
and mobility in this study.
Travel time and mobility incidents
In 1998 the effects of RP on mobility were examined by Geruschat et al. [74]. In
this study measures of visual function and a self-report mobility questionnaire
were recorded, in addition to mobility measures which included: travel time and
the number of mobility incidents (bumps, stumbles, neglecting to detect stairs,
and problems remaining oriented once travelling in the correct direction). Two
courses were used: the first was a simple course built in a basement hallway (49
meters) which had paper cups hanging at varying heights from overhead vents and
floor mats as obstacles. Two different illumination levels were randomly allocated
to the first course. The second course was the main corridor in the Johns Hopkins
Hospital Outpatient Centre (444 meters), which had many obstacles including
pedestrians, elevators, and small shops. Forty-one subjects were involved in the
study, 16 of whom were normally sighted and the remaining 25 with RP. Very
few mobility incidents occurred in the first course; however, it was found that
subjects with RP were five times more likely to have mobility incidents than
sighted subjects. The RP subjects were found to walk more slowly than normally
sighted subjects. It was noted that very few contacts occurred with pedestrians in
the second course, as pedestrians will usually move out of the way before contact
is made. However, a problem with the results from this study is the potential
variation between trials, caused by movement in the crowded hospital outpatient
centre. A large sample size would be required to rule out the possibility that one
of the groups was exposed to more obstacles than the other group.
Detection of curb ramps
Curb ramps provide a smooth surface from a footpath onto a road and have
been designed to assist with the mobility of people with physical disabilities (for
example, with a wheelchair). However, they can increase the danger to blind
pedestrians, who may not realise they have entered a road. These ramps were
studied by Bentzen and Barlow [15], who assessed the detection rate of 80 subjects
in eight U.S. cities. The measure used in this study was whether the subject
stopped before the road or not. Subjects failed to stop before entering a street
in 39% of approaches by a curb ramp, and this increased to 48% when the slope
was minimal. There was no difference found in curb detection between subjects
who travelled frequently (six or more times per week) and those who travelled
fewer than three times a week. These findings support the use of tactile strips at
either end of curb ramps to provide a warning for blind pedestrians.
Heart Rate
Heart rate has also been considered as an objective measure of stress in mobility.
For example, it was found that the heart rates of blind and partially sighted
subjects were significantly higher when an instructor was not present on the
same mobility course [223]. However, heart rate can be affected by
the momentary work load during mobility tasks [6]. Probably for this reason,
heart rate has not been considered in any other Orientation and Mobility (O&M)
assessments reviewed in this chapter.
Mobility measurement reliability and validity
Mobility field experiments generally involve the use of an O&M instructor's
observations of mobility (for example in scoring the frequency of obstacle contacts).
The reliability and validity of these observations were studied by Geruschat et al.
[72], whose study involved 36 subjects (mean age 57) with varying levels of visual
impairment walking a route which included residential travel and small business
travel. Five O&M instructors assessed subjects on the mobility problem types
listed in Table 2.3. These measures were selected by Geruschat et al. as they
had a low cost and did not involve expensive laboratory equipment. Interrater
reliability (the degree to which the O&M instructor scores were similar) between
the instructors was found to be satisfactory.
In a second component of the Geruschat et al. study, 19 of the 36 subjects
were assessed using the measures in Table 2.3 before and after mobility training.
Mobility ratings were reported to have improved significantly after training. To
assess whether the measures were valid, the five O&M instructors were asked to
rank subjects in terms of mobility improvement. The combined mean instructor
ranking was found to correlate significantly with the change from pre- to post-instruction.
However, it has been suggested that an improvement could have been
expected without training due to the effects of practice on the mobility course
[208]. An additional problem, discussed by the authors, was that there was a low
number of recorded pre-test mobility incidents.
2.8. Mobility assessment 31
Table 2.3: Mobility measures used in Geruschat & de l’Aune [72]
1. Unsafe street crossing (crossing at an inappropriate time or to an incorrect area)
2. Bumps (body contact (excluding hands) with any person or object)
3. Stumbling (change in posture/gait as a result of contact below the knee)
4. Orientation (change in direction that does not match instruction or subject
is unable to complete section)
5. Drop-offs (unexpectedly stepping off a curb or step)
Navigation
In more recent research, Loomis et al. [136] conducted an experiment involving
GPS navigation aids for the blind. For their study, the research participant was
taken to a large open field and was requested to walk along a route specified by
a computer. Loomis et al. suggest that visually impaired subjects will soon be
navigating indoors and outdoors using GPS-based navigation systems and local position
technology (such as Talking Signs). If a measure of a person's mobility (for example,
the PPWS) were combined with a measure of the ability to navigate, this type
of experiment could be a valuable way of assessing the benefits of GPS-based
navigation devices.
In summary, the most frequently used measure in field experiments is the
PPWS. Contacts between obstacles and a person’s body are also frequently used
as dependent variables.
2.8.3 Mobility assessment: Artificial environment research
An artificial or laboratory-based study involves designing and constructing a mobility
course for use in mobility assessment. The two significant advantages of
artificial environments over field environments are that they provide better experimental
control over variables, and that the results should be easier to replicate.
Although artificial environments provide the most reliable results, there is the risk
that the results may be too artificial to generalise to real-world situations. However,
it has been reported by West et al. [243] that performance in their artificial as-
sessment environment and a person’s home environment were highly correlated.
Jansson [115] suggests that the artificial environment method should be the main
method during the development phase of a new ETA.
However, an artificial environment may not include many of the variables
that affect mobility performance, such as traffic sounds, pedestrian density and
variation of footpath surface [98]. Also, artificial environments do not involve the
same level of risk (such as a collision with a moving object) as a real world course
[172], and this may affect generalisability. An additional criticism of artificial
environments has been that they may be biased to favour a particular mobility
device [98].
Walking speed
In 1963, Michunas and Sheridan (cited in [200]) provided the first published
experiment on a simulated mobility environment. In their study they built a 51.8
meter long course with obstacles such as environmental sounds, steps up and
down and obstacles at head and ground level. The measures used were the total
time on course and a count of operationally defined harmful events. However,
the effects of the obstacles and masking sounds used on mobility performance in
this experiment were inconclusive due to masking conditions being provided in
the same order for all five blind participants.
Echolocation
An investigation into the usefulness of long-cane tapping noises for echolocation
(using reflected sound to identify objects) for blind people was conducted by
Schenkman & Jansson [193]. Subjects were presented with a range of objects to
detect under experimental conditions. The results indicated that objects could
be detected and localised by tapping sound alone, but it was difficult and the
results varied with the size of the object.
Table 2.4: Obstacle types used in Lovie-Kitchin et al. [137]
1. Suspended horizontal objects with a base at a minimum height of 140 cm
2. Suspended vertical objects with a base at a minimum height of 140 cm
3. Ground level objects with a maximum height of 37 cm
4. Ground level objects with a maximum height between 38 and 139 cm
5. Ground level large objects with a maximum height of 140 cm or greater
Percentage of Preferred Walking Speed
In a large artificial environment, Lovie-Kitchin et al. [137] examined which areas
of the visual field were important for safe efficient mobility. Eighteen subjects
were involved in the study, nine of whom were classified as low vision, with the
remaining nine normally sighted control subjects. Subjects received visual field
measurement and were then assessed twice on an indoor mobility course, each
under different illumination conditions. The mobility course was 79 meters in
length, and used 87 different obstacles (which are listed in Table 2.4). Mobility
was measured as time taken to complete the course, number of contacts with
obstacles and the number of times subjects strayed from the path or required
reorientation. The study found that the loss of visual field in the mid-peripheral
or peripheral inferior and lateral areas had the most significant effect on reduced
mobility.
In 2000, a non-guided version of the PWS was developed and assessed by
Soong et al. [207] and was found to be as reliable as the guided version. In this
version, the PWS is obtained by recording the subject walking down a 20 meter
corridor which does not contain obstacles.
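The PPWS calculation can be illustrated with a short sketch. It assumes the usual definition of the PPWS as the walking speed on the mobility course expressed as a percentage of the preferred walking speed (PWS) measured on an obstacle-free path. The function names and the sample timings are hypothetical and are not taken from Soong et al.

```python
def walking_speed(distance_m, time_s):
    """Return walking speed in metres per second."""
    if time_s <= 0:
        raise ValueError("time must be positive")
    return distance_m / time_s


def ppws(course_distance_m, course_time_s, pws_m_per_s):
    """Percentage of preferred walking speed (assumed definition)."""
    if pws_m_per_s <= 0:
        raise ValueError("preferred walking speed must be positive")
    course_speed = walking_speed(course_distance_m, course_time_s)
    return 100.0 * course_speed / pws_m_per_s


# Hypothetical subject: a 20 m obstacle-free corridor walked in 16 s,
# followed by a 78.9 m obstacle course walked in 120 s.
pws = walking_speed(20.0, 16.0)
score = ppws(78.9, 120.0, pws)
```

Because the score is normalised by each subject's own preferred speed, a naturally slow walker and a naturally fast walker can be compared on the same scale.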
PPWS was also used to evaluate the effect of O&M training on mobility
performance by Soong et al. [208]. Thirty-seven visually impaired subjects were
involved in the study: 19 underwent mobility training and a control group of 18
subjects did not receive training. The unguided PPWS and mobility incidents
were used to assess subject mobility on an indoor course. Two visits to the
mobility course were conducted - each subject conducted the mobility trial twice
on each visit. Where subjects in the training group were prescribed mobility aids,
these were used in the second trial. The indoor mobility course was constructed
in two linked laboratories and was 78.9 meters long. The allocation of the 100
obstacles used was similar to that in [137], with five different height ranges used (0-13cm,
14-49cm, 50-99cm, 100-150cm, 151+ cm). Half of the obstacles were covered in
light grey paper to provide high luminance, and the other half were covered in dark
grey to provide low luminance. Subjects were asked to proceed through the course
while carrying out two typical mobility tasks: taking a small packet to a bench;
and picking up three empty food containers, placing them in a bag and carrying
them to another table. Errors were defined as body contacts with obstacles;
errors made while conducting the two tasks; and straying off the mobility path
(which was marked by rolled up bubble wrap). If a subject was unable to re-
orient after contacting an obstacle or straying from the path two errors were
counted. Surprisingly, this study found that O&M training did not enhance
mobility performance compared to the control group, and any improvement was
simply the result of practice.
Walking speed and obstacle contacts
A simulation of an AHV system and the effects on mobility were investigated
by Cha et al. [29]. Their ‘pixelized vision simulator’ device consisted of a video
camera connected to a monitor in front of the subject’s eyes. A perforated mask
was used in front of the monitor to reproduce the effect of individual phosphenes.
Optical lenses were then placed between the mask and the subject’s eye to reduce
the size of the image. This device was then used to investigate the feasibility of
achieving visually guided mobility with a visual prosthesis. An indoor maze was
constructed which allowed the test path and obstacle positions to be randomly
varied for each trial. The obstacles used were cylindrical paper columns which
were 5cm in diameter and 1.8m in length. The room was divided into 1.4 x 1.4
meter square blocks. Walls, cloth screens and the floor were white, whereas the
obstacles were black. Three 2.5 cm wide black strips about 50cm apart were
placed horizontally on the walls and screens to provide a high contrast indicator
of wall or screen location. Normally sighted, undergraduate subjects wore the
pixelized vision simulator while moving through the maze. Each subject was
tested in one two hour session per day over 8-10 trials. Subjects were asked
to move as quickly as possible, and walking speed and the number of contacts
with obstacles and walls were measured. These measurements were designed to
evaluate mobility performance as a function of pixel number, pixel spacing, object
minification and field of view.
Cha et al. [29] found that a foveally projected visual scene consisting of 625
(25 x 25) or more pixels with a field of view of about 30◦ allowed nearly normal
walking speeds and reported that this would provide good obstacle avoidance
and a sense of confidence to patients in familiar environments. Another finding
from Cha et al. was that head movement helped depth perception and improved
spatial resolution, but that this movement needed to be efficient to avoid a loss
of body balance.
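The reduction of a camera image to a small grid of phosphene-like pixels can be sketched in software. Cha et al. used an optical mask rather than image processing, so the block-averaging approach below is an assumption made for illustration only; the function name and the synthetic test image are hypothetical.

```python
def pixelize(image, grid):
    """Reduce a square greyscale image (list of rows) to grid x grid block means."""
    size = len(image)
    if size % grid != 0 or any(len(row) != size for row in image):
        raise ValueError("image must be square and divisible by grid")
    block = size // grid
    result = []
    for by in range(grid):
        row_out = []
        for bx in range(grid):
            total = 0
            # Sum every source pixel that falls inside this block.
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += image[y][x]
            row_out.append(total / (block * block))
        result.append(row_out)
    return result


# Hypothetical 100 x 100 image: white background (255) with a black
# vertical column 4 pixels wide, loosely mimicking a maze obstacle.
image = [[0 if 48 <= x < 52 else 255 for x in range(100)] for _ in range(100)]
phosphenes = pixelize(image, 25)  # 25 x 25 = 625 values
```

In this example the obstacle falls exactly on one column of the 25 x 25 grid; an obstacle that straddles two blocks would instead appear as two grey columns, which is one reason pixel number and spacing affect obstacle detection.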
There are two limitations with Cha et al.’s study. The first is that individ-
ual differences in walking speed are not taken into account, which a normalised
measure such as the PPWS would have done. This may restrict the study results
to examining how an individual subject was able to learn to move through the
course (and not differences between subjects). In addition the mobility course
was very artificial and the results may not generalise to a real life environment.
For example the walls and floor were painted white, and obstacles were all the
same shape and size with very high contrast (black on white).
Another study which used walking speed as a dependent variable was by Kuyk
et al. who examined the effects of changing light level on mobility performance
[122]. This study involved the mobility assessment of 88 visually impaired adults
under different lighting effects. The performance measures used were: time taken
to walk the course and the total number of contacts (recorded by a trained O&M
instructor walking behind the subject) with objects in the course. These measures
were taken for each subject in normal and reduced illumination. Subjects wore
modified sun shades to simulate lower illumination levels. The mobility course
had two start points and one dead-end - it is unclear if the start points or illumi-
nation levels were randomised for different subjects. The course, constructed in
a laboratory, involved 60 objects, mostly foam cylinders of different types (such
as step-over, shoulder-to-head level and walk-around) in fixed locations. Each
object was rated as low or high contrast. The pathway was 3 to 5 feet wide and
was usually marked with dark blue tape on the floor. The contrast and location
of objects significantly affected mobility through the course. Also, the ability to
avoid these objects, particularly step-over objects, was significantly reduced with
low illumination.
In a follow up study, Kuyk et al. [123] evaluated the mobility of a further
156 subjects on the same course used in their earlier paper in order to assess how
mobility performance relates to visual function. Visual field extent and scanning
ability were found to be the best predictor variables for mobility performance.
Performance based measures
The Salisbury Eye Evaluation (SEE) project was designed to determine the relationship
between visual impairment and everyday tasks [243]. The measures for
everyday tasks include self-report and performance on the tasks listed below in
Table 2.5. These tasks are broader than those in typical O&M studies, and may provide
a good set of tests for artificial human vision assessment. However, recent publications
from the SEE project have focused on assessment of remaining visual
functions (such as visual field), PPWS and obstacle contacts [233], [169].
Table 2.5: Mobility and daily activities assessment from West et al. [243]
Category                  Task                Measure
Mobility                  Walk 4m             m/s
                          Ascend 7 steps      steps/s
                          Descend 7 steps     steps/s
                          Chair ascent/step   Time to finish
Daily Living Tasks        Insert plug         s
                          Insert key          s
                          Dial telephone no.  s
Visually Intensive Tasks  Reading speed       words/min
                          Face recognition    no. recognised
Veering
Veering is a typical problem for a blind pedestrian when environmental cues are
unavailable for guidance. A tennis court which had been marked into a grid with
duct tape was used to measure the veering tendency of blind pedestrians in a
paper by Guth & LaDuke [86]. Four blind adults were assessed over three 15 trial
sessions which commenced with the subject standing against a portable wall and
then being asked to walk in a straight line for 25 meters. The overall average
veering error was found to be 11.5 degrees. A similar assessment method could be
used to determine how helpful different image processing methods are to prevent
veering in an AHV system.
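A veering error of this kind can be computed from positions read off a marked grid. The sketch below assumes that the error is the angle between the intended straight line and the line from the start point to the subject's end point; the exact computation used by Guth & LaDuke is not described here, and the function names and trial values are hypothetical.

```python
import math


def veering_angle_deg(lateral_offset_m, forward_distance_m):
    """Angle between the intended straight line and the line to the end point."""
    if forward_distance_m <= 0:
        raise ValueError("forward distance must be positive")
    return math.degrees(math.atan2(abs(lateral_offset_m), forward_distance_m))


def mean_veering_deg(trials):
    """Mean absolute veering angle over (lateral_offset_m, forward_distance_m) pairs."""
    angles = [veering_angle_deg(offset, distance) for offset, distance in trials]
    return sum(angles) / len(angles)


# Hypothetical trials: lateral offsets read from the grid after 25 m of travel.
# Negative offsets are veers to the left, positive offsets veers to the right.
trials = [(5.0, 25.0), (-4.0, 25.0), (6.5, 25.0)]
average = mean_veering_deg(trials)
```

Under this assumption, the reported mean error of 11.5 degrees corresponds to a lateral drift of roughly 5 meters over a 25 meter walk.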
Search strategies
Search strategies are important in the navigation and orientation of visually im-
paired people. These strategies were examined by Hill et al. [102] in a study
which involved 65 subjects (mean age 33 years), who were blind or only had light
perception. The same testing environment was constructed in eight different lo-
cations, and involved a 15 x 15 foot square, bordered by four strips of rubberised
matting. A baseball, glove, hat and a cup were placed on plastic baseball tees in
specific locations within the square. Subjects were monitored after being asked
to judge the directions from some of the targets to others. Subjects who used a
range of search strategies (such as perimeter, mental image or object to object)
performed better than those who relied on a single strategy.
2.8.4 Mobility assessment: Combined Field experiment
and artificial environment research
By combining both artificial environment and field experiments it may be possible
to generate results which are generalizable to the real-world, and which can also
be controlled and replicated.
Obstacle contacts and disorientation
Early work evaluating low vision mobility (rather than blind mobility) was pub-
lished in 1982 by Marron & Bailey [145]. This experiment investigated visual
factors in mobility in both outdoor and indoor test courses. The outside course
was a city block which contained a series of objects such as high contrast mail-
boxes and low contrast footpath edges. The illumination and contrast conditions
were kept approximately the same for all 19 subjects by testing in the early afternoon
during a three-week period. The indoor test course was a long corridor (12.2 meters
long and 2.4 meters wide), with walls covered to make them a similar colour and
luminance to the floor. Paper cylinders of varying diameters and lengths were
hung from the ceiling. Details of the number of cylinders and their locations on the
course were not provided. The cylindrical shape was chosen to reduce sharp edges
and shadows which could be used as detection cues. The error scores (listed in
Table 2.6) from these courses were then compared to the results of visual field,
spatial contrast sensitivity and visual acuity assessments. There was a poor correlation
found between subjects’ performance on the Snellen visual acuity chart and
mobility performance; however, the combined effects of spatial contrast sensitivity
and visual fields were found to correlate significantly with mobility performance.
Table 2.6: Mobility measures used in Marron et al. [145]
Problem type (Score)
Contact with obstacle or disoriented for less than five seconds. (1 Point)
Contact with obstacle or disoriented for five-15 seconds. (2 Points)
Longer than 15 seconds to reorient and required assistance. (3 Points)
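The scoring scheme in Table 2.6 can be expressed as a small function. This is only a sketch: the handling of the boundary values (exactly 5 and exactly 15 seconds) is an assumption, the function names are hypothetical, and the sample durations are invented.

```python
def incident_score(duration_s):
    """Score one mobility incident from the time taken to recover (Table 2.6)."""
    if duration_s < 0:
        raise ValueError("duration cannot be negative")
    if duration_s > 15:
        return 3  # longer than 15 s; assistance assumed to have been required
    if duration_s >= 5:
        return 2  # five to 15 seconds (boundaries assumed inclusive)
    return 1      # less than five seconds


def course_error_score(durations_s):
    """Total error score for one subject on one course."""
    return sum(incident_score(d) for d in durations_s)


# Hypothetical subject with three incidents lasting 2 s, 7 s and 20 s.
total = course_error_score([2.0, 7.0, 20.0])
```

Because the score depends on recovery time rather than only on the number of incidents, two subjects with the same number of contacts can receive different totals.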
A problem with this study is that the dependent variables involve timing the
response to mobility incidents (such as scoring 3 points for taking longer than 15
seconds to reorient). These differences in responding to incidents may have been
related more to individual differences (such as personality) than to the effects of different
levels of existing vision. In addition there can be significant differences in lighting
levels over a three week period, which could have been measured and recorded.
In a 1990 follow up study to Marron and Bailey, Long et al. [135] examined
the mobility of subjects with moderate levels of low vision. 22 subjects were
involved in this study, with an average age of 36.1 years. An assessment of vision
was conducted before the mobility assessment, which involved subjects walking
through six unfamiliar routes. These routes consisted of two paths in three differ-
ent environments (classroom building, residential area and small business area).
To simulate low illumination, subjects wore 1 percent ultraviolet sunglasses for
one path per setting. An effort was made in this study to prevent disorientation,
and subjects were provided with navigation instructions and corrected by
an O&M instructor when moving in the wrong direction. Mobility behaviours
that occurred during periods of disorientation were ignored. During the mobility
tasks, subjects were also asked to identify whether a tone, presented every 10-20
seconds, was high or low. This response was recorded, and tested to see if performance
on the tone task varied as a function of the demands of the primary
mobility task - however no results were provided for this part of the study. Mobility
performance was videotaped and assessed by one of three pairs of scorers,
who counted the frequency of behaviours listed in Table 2.7. The percentage
agreement between observers across different routes and levels of illumination
was reported as 86.4% (agreement for the indoor course was highest, at 93%, followed
by 90% for the residential route and 68% agreement for the small business
environment). One difficulty with this study is the lack of detail on the mobility
courses used (such as the number and type of obstacles), which makes it difficult
to replicate the results. An individual's visual fields and contrast sensitivity were
found to be related to mobility performance. Visual acuity was not found to be
related to mobility performance.
Table 2.7: Mobility incidents scored in Long et al. [135]
1. High stepping (looking for a step which is nonexistent or unexpectedly shallow)
2. Missed curb (overstepping a curb because it was not seen)
3. Loss of Balance (tripping, stumbling or mild unsteadiness in balance)
4. Object contact (with any part of the body)
5. Shuffling (sliding foot forward to investigate the path)
6. Stop (stopping inappropriately)
7. Spotter intervention
8. Veer (abrupt change in direction of travel or side-stepping)
9. Off Path (veering off footpath or into adjacent hallways or open areas)
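A percentage agreement between two scorers can be computed as in the sketch below. How Long et al. calculated their agreement figures is not stated here, so the item-by-item comparison is an assumption, and the behaviour codes and sample records are hypothetical.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percentage of scored items on which two observers recorded the same code."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("both scorers need the same non-zero number of items")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)


# Hypothetical codes for five video segments scored by a pair of observers.
scorer_a = ["veer", "stop", "none", "object contact", "none"]
scorer_b = ["veer", "stop", "none", "shuffling", "none"]
agreement = percent_agreement(scorer_a, scorer_b)
```

Simple percentage agreement does not correct for agreement expected by chance, which is worth keeping in mind when comparing figures across routes with different incident rates.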
Pass or Fail mark
A combination of controlled and field experimentation was also used in a 2002
study of night vision goggles for people with degenerative retinal diseases (which
can impair night vision). Spandau et al. [210] used a totally darkened room to
test mobility in an artificial environment. The average age of the 42 subjects
involved in this study was 35 years (range of 10 to 70 years). The dark room task
required the subjects to walk around the room avoiding obstacles, name objects
in the room and read a visual acuity chart. The field experiment component of
this study took place at night in Heidelberg, Germany, and included a residential
area, a strip mall with bars, restaurants and shops (with many pedestrians),
high traffic and noise areas. In this study mobility was assessed by an O&M
instructor who allocated each subject a pass or fail mobility grade (no further
details were given about the method of assessment). Subjects were also asked to
fill out a pre-test and post-test mobility questionnaire. Most subjects were found
to adjust quickly to the night vision device and improved their mobility at night.
2.8.5 Mobility Assessment Conclusion
The development of objective, valid and reliable assessment techniques should
enable the comparison of O&M performance from different ETAs, visual prostheses
and other mobility devices. These comparisons should be conducted by
independent observers to reduce bias. This section has reviewed a large number
of research efforts in assessing mobility using self-report, field and artificial ex-
periments. These papers are briefly summarised in Table 2.8. The most widely
supported mobility measures have been PPWS and mobility incidents (generally
defined as contact with obstacles, although veering is also frequently used).
The layout and type of obstacles used could be standardized to increase the
comparability of studies. In addition the layout of the mobility test route should
be changeable to reduce practice effects.
Although psychological variables (such as personality factors) have been considered
in some studies, none of the reviewed papers have considered the combined effects
of blindness and other impairments (such as hearing impairment) on mobility. In addition,
as pointed out by Geruschat et al. [72], the sample size used in experiments is often
small, which limits the ability to generalise the research.
2.9 Chapter Summary
This chapter has reviewed major mobility issues which a blind or vision impaired
person might experience. The main hazardous situations for blind mobility are
drop-offs, obstacles and fast moving objects. Mobility aids, both traditional and
electronic, have also been reviewed. The common feature of these aids is that they
provide additional preview information to the blind traveller. The information
from these reviews is important for the development of an Artificial Human Vision
(AHV) device, and it provides areas of need to be targeted by a device. The
assessment of mobility was reviewed in this chapter in some detail, because this
research provides a methodology which can be used in future simulation research
and allows the comparison of AHV mobility research with other literature.
Table 2.8: Summary of mobility assessment research discussed in this Chapter. ‘Time’ is the amount of time on the course, ‘obst.’ is a count of obstacle contacts, and ‘veer.’ is the number of incidents of veering from a path.
Authors                     Main Method              Comment
Self Report
Brambring [22]              Transcription            Walking experience reported
Passini et al. [168]        Interview                Major mobility problems reported
Field experiments
Armstrong [6]               Video                    13 measures used
Dodds [54]                  PWI                      Outdoor path
Clarke-Carter et al. [41]   PPWS                     Large backpack used
Dodds [55]                  PPWS & PC based          Visual task performance used
Haymes et al. [94], [95]    PPWS, personality        Acuity not useful
Geruschat et al. [74]       Multiple measures        Pedestrians avoid collision
Bentzen et al. [15]         Road identified          Curb ramps often missed
Tanaka et al. [223]         Heart rate               Confounded variable
Geruschat et al. [72]       Instructor ratings       Inter-rater ok
Loomis et al. [136]         Instructions followed    Promising GPS system
Artificial environment
Michumus et al. [200]       Time, obst.              No sig. findings
Schenkman et al. [193]      Object detected          Echolocation study
Lovie-Kitchin et al. [137]  Time, obst., veer.       Peripheral vision important
Soong et al. [207]          PPWS, obst., veer.       O&M training not effective
Cha et al. [29]             Time, obst.              1st AHV simulation
Kuyk et al. [122], [123]    Time, obst.              Illumination important
Guth et al. [86]            Veering angle            Consistent veering
Hill et al. [102]           Search path              Search strategy important
Combined
Marron et al. [145]         Obst. & vision measures  Some support for vision measures
Long et al. [135]           Obst., veer. and others  Acuity not related to mobility
Spandau et al. [210]        Pass or fail             Night vision goggle study
Chapter 3
A Review of Artificial Human
Vision
3.1 Introduction
This chapter provides a comprehensive review of AHV technology and research.
An overview of the Human Visual System (HVS) and requirements for an AHV
system are given, followed by a discussion of work in the four main locations
for electrode stimulation: either on the surface or penetrating the visual cortex
(Section 3.3), either behind or just in front of the retina (Section 3.4) and, finally,
surrounding the optic nerve (Section 3.4). As the number of implanted individuals
is limited, and psychophysical experiments on normally sighted subjects using
AHV simulation are often used, Section 3.6 presents a review of the literature on
AHV simulation.
3.2 Review of the Human Visual System
This section provides a description of the HVS, and defines many related terms
which are used in the remainder of this chapter.
[Figure 3.1 labels: Optic Nerve, Lens, Sub-Retinal Implant, Epi-Retinal Implant, Optic Nerve Cuff Electrode, Cornea, Retina, Sclera and Choroid]
Figure 3.1: Horizontal diagram of the human eye. The locations for the epi- and sub-retinal implants and the optic nerve electrode are shown. Adapted from Gregory [83].
In the functioning human vision system, two types of photoreceptors (cells
which convert light energy to neural responses) in the retina (known as rods and
cones) are activated by light which has been focused by the lens and cornea in
the eye (see Figure 3.1). The rods provide scotopic vision, the ability to see in
dim light. Rods, which number around 91 million, are not colour sensitive but
are specialised for sensitivity to light [156]. In addition to the rods, there are
three types of cones which each contain a photopigment sensitive to different
wavelengths of light (blue, green and red). The retina contains approximately
4.5 million cones which are concentrated in a small region in the centre of the
retina called the fovea. It is this retinal location which provides the highest
visual acuity and colour vision [239]. The firing frequency of a receptor and
its neuronal connections is reduced under constant stimulation, which is called
adaptation [156].
[Figure 3.2 labels: Photoreceptor cells; Horizontal, Bipolar and Amacrine cells; Ganglion Cells; To Optic Nerve; Light]
Figure 3.2: A simplified diagrammatic representation of the cellular layers of the retina. Light passes through the outer layers of the retina before being absorbed by the rods and cones of the photoreceptor layer. Adapted from Sharp and Phillips [199].
Electrical signals from these photoreceptors are then passed through a layer of
bipolar cells to the ganglion cells within the retina (see Figure 3.2). However, before reaching
the ganglion cells, the output of the bipolar cells may be modified by two other cell types:
horizontal cells (which laterally inhibit the output of the bipolar cells, producing
concentric receptive fields, such as ‘off-centre, on-surround’) and amacrine cells
(associated with temporal responses shown by some ganglion cells) [199].
There is a strong convergence of signals from the rods: a single rod bipolar
cell can integrate the signal from 1500 rods. Therefore, the amount of information
entering the eye is reduced considerably, from approximately 94.5 million
rods and cones to around 1 million ganglion cells [156].
The axons of the ganglion cells make up the optic nerve which carries visual
information from the eye (via the optic disk) to the optic chiasma (located at the
base of the hypothalamus) (see Figure 3.4). The human brain is divided into two
separate halves (hemispheres) which are connected by a bundle of fibres called the
corpus callosum. In humans the axons in the optic nerve from the left halves of
each retina run through the optic chiasma to the left lateral geniculate nucleus
(LGN) and the opposite occurs for the right halves of each retina. This allows
images of the same object formed on the right and left retinas to be processed
in the same part of the brain [24]. The axons of the LGN then travel in an
optic radiation (shown in Figure 3.4) and terminate in the primary visual cortex
(which is part of the occipital cortex). The projection from each LGN to the
primary visual cortex is ordered, and each part of the retina is represented in the
primary visual cortex [253]. The map of the retina on the cortex is an example
of a retinotopic map [239].
Not all of the optic nerve fibres connect to the LGN. About 20-30% of the
ganglion cell axons connect to the superior colliculus (a small part of the brain
present in each hemisphere), which provides a cruder retinotopic mapping and is
responsible for eye movements [83]. This location is responsible for a phenomenon
called blindsight, in which a person who has had their visual cortex removed is
aware of the location of objects although they are unable to recognise them [24].
The primary visual cortex is also known as V1, or as the striate cortex because of its
distinctive visible striation (layers) [253]. Associated areas in the occipital lobe are known as V2 through V6.
Six ordered layers are visible in the striate cortex, with layer 1 being closest to the surface of the
brain. Layer 4 consists of a number of subdivisions: 4A, 4B, 4Cα and 4Cβ [199].
In addition to the layers, the striate cortex is structured into ocular dominance
columns (which can combine input from both eyes for the purpose of depth per-
ception) [156]. The ocular dominance columns are in turn divided into orientation
preference columns which respond to the orientation of receptive fields (such as
bars of light or edges in a particular orientation).
Although most visual information appears to be processed first in the V1 area,
there are many other areas involved in processing visual information such as V2,
V3, the mid-temporal cortex (where neurons are particularly sensitive to stimulus
movement), V4 (colour processing) and the inferotemporal cortex (where stimulus
size, shape, contrast and colour appear to be processed) [227].
[Figure 3.3 labels: Primary Visual Cortex, Occipital Lobe, Parietal Lobe, Frontal Lobe, Temporal Lobe]
Figure 3.3: The cortical lobes of the human brain. The primary visual cortex, which is the site for cortical electrode array implants, is also shown. Adapted from Wandell [239].
3.3 AHV technology and requirements
The development of an AHV system is a multidisciplinary field, involving input
from neuroscience, engineering, computer science, and ophthalmology, in addition
to orientation and mobility specialists.
With the exception of subretinal prostheses, most AHV systems have similar
system requirements. The main components, which will need to function in real
time, are:
Figure 3.4: Diagram of the main pathways in the HVS (labelled structures: optic nerve, optic chiasm, lateral geniculate nucleus, optic radiation, visual cortex and superior colliculus). Adapted from Bruce et al. [24].
Camera: A camera is required to capture and digitise image information from the
environment. Charge-Coupled Device (CCD) based digital cameras are inexpensive,
small and can be easily interfaced to other system components. An adaptive
mechanism (such as the automatic gain control in current video cameras) will also
be required to allow the device to function at different levels of illumination
[45]. Complementary Metal-Oxide Semiconductor (CMOS) sensors are an alternative
to CCD based cameras. Both CCD and CMOS camera sensors have a linear response to
light intensity. A logarithmic camera, in contrast, has a response similar to
that of the human visual system, and can reduce saturation in high contrast
visual scenes. The use of a logarithmic camera in an AHV system is being
investigated in at least one current research project [170]; however, a
comparable response could also be obtained from a CCD or CMOS camera by applying
a log transform to the intensity values.
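The log transform of intensity mentioned for linear CCD or CMOS sensors can be sketched as follows. This is only an illustrative sketch, not the method used in [170]: the function names, the 8-bit range and the choice of log(1 + x) scaling are all assumptions made for the example.

```python
import math

def log_transform(intensity, max_value=255):
    """Compress a linear sensor value in [0, max_value] with a log curve.

    Assumption: 8-bit intensities and a log(1 + x) mapping scaled so that
    0 maps to 0 and max_value maps to max_value.
    """
    if not 0 <= intensity <= max_value:
        raise ValueError("intensity outside the expected range")
    scale = max_value / math.log1p(max_value)
    return scale * math.log1p(intensity)

def log_transform_image(image, max_value=255):
    """Apply the transform to every pixel of an image given as a list of rows."""
    return [[round(log_transform(p, max_value)) for p in row] for row in image]
```

With this mapping, dark values are spread over a wider output range while bright values are compressed, which is the property that reduces saturation in high contrast scenes.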
Image processing: There will usually be more data retrieved from the cam-
era than can be used in an AHV device. The image data will usually be pre-
processed to reduce noise. After this, an information reduction (such as edge
detection or segmentation) or a scene understanding approach (attempting to
extract information) can be used. Further details on the image processing methods
are provided in Chapter 4. Cortical prosthesis research by the Dobelle Institute
has found that edge detection and image reversal enhance the ability of subjects
to recognise important scene components (such as doorways) [48]. An alterna-
tive approach to traditional image processing is the use of neuromorphic vision
systems, designed to mimic the design of the human visual system [16].
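As a rough illustration of the information reduction step described above, the sketch below reduces an image to a coarse grid, marks edges and reverses intensities. It is a minimal example under stated assumptions: the neighbour-difference edge detector, the threshold and all function names are hypothetical choices for this sketch and are not the processing used by the Dobelle Institute.

```python
def block_average(image, grid_rows, grid_cols):
    """Reduce an image (list of rows) to grid_rows x grid_cols by averaging blocks.

    Assumption: any remainder rows or columns that do not fill a block are ignored.
    """
    block_h = len(image) // grid_rows
    block_w = len(image[0]) // grid_cols
    out = []
    for gr in range(grid_rows):
        out_row = []
        for gc in range(grid_cols):
            block = [image[r][c]
                     for r in range(gr * block_h, (gr + 1) * block_h)
                     for c in range(gc * block_w, (gc + 1) * block_w)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def edge_map(image, threshold):
    """Mark pixels whose right or lower neighbour differs by more than threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = abs(image[r][c + 1] - image[r][c])
            gy = abs(image[r + 1][c] - image[r][c])
            if max(gx, gy) > threshold:
                edges[r][c] = 255
    return edges

def reverse(image, max_value=255):
    """Image reversal: swap dark and bright intensities."""
    return [[max_value - p for p in row] for row in image]
```

For example, an image whose left half is dark and right half is bright yields a single vertical line of edge pixels, and `block_average` reduces it to a grid in which each cell holds the mean intensity of its block.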
Transmitter/Receiver: A link is required from the camera/image processing
components to the stimulator and electrode array, which are usually located
inside the body. Percutaneous connections, which involve a wire or cable fed
through the skin, have been used for most research because they are simple and
reliable; however, the risk of chronic infection is higher with this type of
connection [159]. The Dobelle Institute system uses a percutaneous connecting
pedestal for connection to the image processing unit (a notebook PC). A
transcutaneous connection does not involve cables passing through the skin. This
type of connection is commonly used in cochlear implants [133], and uses radio
frequency telemetry to send data and power to the embedded stimulator, reducing
the risk of infection. Most AHV research projects are planning to eventually use
transcutaneous connections. Reverse telemetry can also be used to provide
impedance measurements and to reconstruct stimulation voltage waveforms [217]. A
good description of a high efficiency transcutaneous data link for implanted
electronic devices is provided in a 1992 paper by Troyk and Schwan [231].
Stimulator/Electrodes: An electrode is a thin wire, which allows a small
amount of precisely controlled electrical current to pass through it. Electrodes
can be used for either stimulation or recording the electrical activity of the brain.
Two important electrode stimulation parameters are amplitude
(the highest value reached by a current) and pulse duration (generally defined
as the time interval between the pulse amplitude reaching half of its final
value and the time at which the pulse amplitude returns to that value again [2]).
The purpose of the stimulator is to send current through multiple electrodes.
There are two main types of electrodes discussed in the AHV literature: surface
electrodes, which lie flat against the stimulation/recording target; and
penetrating electrodes, which are inserted inside the stimulation/recording
target. The biocompatibility, long-term effectiveness, and safe threshold levels
for implanted electrodes need to be carefully considered [46]. Electrodes can
stimulate tissue using monophasic (either positive or negative) or diphasic
(alternating between positive and negative) stimulation. However, the monophasic
method can cause cell damage [221].
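The pulse duration definition and the difference between monophasic and diphasic pulses can be illustrated with a short sketch. The function names, the sampled-waveform representation and the rectangular phase model are assumptions made for this example. The pulse is assumed to be positive-going, and no interpolation between samples is attempted.

```python
def pulse_duration(times, currents):
    """Interval between the first and last samples at or above half the peak.

    Assumption: a single positive-going pulse sampled at the given times.
    """
    half_peak = max(currents) / 2
    above = [t for t, i in zip(times, currents) if i >= half_peak]
    return above[-1] - above[0]

def net_charge_pc(phases):
    """Net charge in picocoulombs for a sequence of rectangular phases.

    Each phase is (amplitude in microamps, duration in microseconds);
    1 microamp x 1 microsecond = 1 picocoulomb.
    """
    return sum(amplitude * duration for amplitude, duration in phases)
```

Under this simple model, a single 25 µA, 200 µs monophasic phase delivers a net charge of 5000 pC, whereas a diphasic pair of equal and opposite phases delivers no net charge, which is a commonly cited reason for preferring balanced diphasic pulses.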
3.4 Cortical stimulation
Cortical-based AHV systems use either surface or intracortical (using penetrating
electrodes) stimulation. Cortical stimulation is the only treatment available for
blindness caused by glaucoma, optic atrophy or diseases of the central visual
pathways (such as brain injuries or stroke). The main negative feature of a cortical
implant is the lack of preliminary processing by the brain (particularly in
the retina, where much of the information reduction takes place).
Most research on AHV has focused on sending a captured image to the brain
as a bitmap representation. The ‘bitmap’ approach to cortical devices has been
questioned [230]. Research by Hubel and Wiesel [105] on macaque monkeys has
found that, in addition to spatial location of a stimulus in the visual field, neurons
in the visual cortex are selective for spatial, temporal, chromatic and binocular
cues. A greater knowledge of cortical physiology may be required before a cortical
prosthesis provides useful vision. Evidence has also been found that there may be
specialised cortical areas for the analysis of biologically important images (such
as faces) [187].
3.4.1 Cortical surface stimulation
The early developments in cortical prostheses involved surface electrode arrays.
The first person to expose the human occipital lobe to electrical stimulation
was the German researcher Otfrid Foerster, who in 1929 noticed that stimulation
caused the subject to see a spot of light in a position which depended on the site
of stimulation [90].
Early surface stimulation research
Brindley and Lewin published the results of their groundbreaking study on corti-
cal stimulation in 1968 [23]. In their study a 52-year-old legally blind subject was
implanted with an array of eighty platinum electrodes, a design which had previ-
ously been tested in baboons. These electrodes were stimulated by pulsed radio
signals from an oscillator. Stimulation of these electrodes produced discernible
phosphenes. Brindley and Lewin suggested that there was probably no flicker
fusion frequency (i.e. the frequency at which an intermittent light stimulus is
perceived as continuous) for this implant. They also found that phosphenes
moved with eye movements and that phosphene perception usually (but not al-
ways) stopped when stimulation ceased. Stimulation of one electrode was found
to produce multiple phosphenes, and when multiple electrodes in close vicinity
were activated a larger, straight light phosphene was produced. Unfortunately,
the monophasic stimulus pulses used long-term in these earlier studies were also
likely to cause irreversible damage at the electrode-tissue interface [221].
William Dobelle
Brindley and Lewin’s research inspired pioneering work on 37 human subjects by
Dobelle and Mladejovsky in 1974 [49], where electrical stimulation was applied to
patients hospitalised for cranial surgery. Supporting Brindley and Lewin’s work,
they found eye movements caused phosphenes to move, and multiple phosphenes
could be produced from a single electrode. However, Dobelle and Mladejovsky
found that constant stimulation caused phosphenes to fade (suggesting that re-
fresh of the phosphenes is required). In a later paper [50], it was reported that
subjects were able to read electrode-induced Braille characters more efficiently
than using their tactile sense.
In 2000, Dobelle published a paper [48] describing a subject who had been
using a cortical visual prosthesis system for over 20 years. The system used a 64-
channel electrode array, which had been implanted on the mesial surface (towards
the middle) of the subject’s right occipital lobe in 1978. When stimulated, each
electrode produced 1-4 closely spaced phosphenes. The stimulation parameters
and phosphene locations had been stable for the past 20 years; however the elec-
trode thresholds required a 15 minute recalibration every morning. This system
utilised a black and white camera connected to a notebook computer. Cables
from the notebook were connected to a percutaneous connecting pedestal, which
interfaced to the microcontroller, stimulus generator and electrode array. Dobelle
reported that ‘frame rates’ of around 4 frames per second have been found to be
optimal. Using the device, Dobelle found that the subject had a visual acuity of
approximately 20/200.
Bionic Eye Research Project
Although research in the early 1990s moved toward intracortical stimulation,
a recently commenced project by Chowdhury et al. at the University of New
South Wales is investigating the use of technology adapted from cochlear implants
(which generally use surface electrodes) [39], [40]. An in vivo model has been
successfully applied in animal experiments involving cats, where the transcallosal
evoked response to cortical stimulation is recorded on the opposite hemisphere to
the site of stimulation (this is possible as there are direct corpus callosum neural
pathways between surface points on the two hemispheres of the brain). Future
experiments are planned with a human subject who, unlike the cat, will be able
to describe their subjective response to stimulation.
3.4.2 Intracortical stimulation
National Institutes of Health
The Neuroprosthesis Program at the U.S. National Institutes of Health (NIH) was
the first to publish research concerning the use of intracortical stimulation to pro-
duce phosphenes. In a study by Bak et al. [8], three normally sighted patients,
undergoing occipital craniotomies (opening of the skull) for other conditions, were
tested for an hour each. Surface stimulation produced the same phosphenes de-
scribed by Dobelle and Brindley. After this, a dual microelectrode was inserted
to level 4B in the primary visual cortex and stimulation applied. Unlike surface
electrodes, the intracortical electrode phosphenes did not flicker. An additional
important finding from this research was the discovery that intracortical stimu-
lation required 10-100 times less electrical current to produce phosphenes than
surface electrodes. Also, intracortical electrodes located as closely as 500 µm
could evoke distinct phosphenes.
A more detailed experiment by the NIH team was described in 1996 by
Schmidt et al. [194]. Thirty-eight microelectrodes were inserted into the right
visual cortex of a 42 year old woman for four months. The patient, who had been
blind for 22 years, was consistently able to perceive phosphenes at stable posi-
tions in visual space. Phosphenes were produced with 34 of the microelectrodes,
at thresholds usually at 25 µA. It was found that these phosphenes did not flicker
and changing the stimulus amplitude, frequency and pulse duration could change
phosphene brightness. A perception of depth from the stimulation was reported.
It was also found that as the stimulation level was increased, the phosphenes
generally changed colour (varying between white, ‘yellowish’ and ‘greyish’).
Supporting earlier research, phosphenes moved with eye movements. Schmidt et al.
suggested that using this method electrodes could be placed five times closer
together than with surface stimulation. An important result of this study concerned after-discharge:
one phosphene was observed for up to 25 minutes after cessation of stimulation,
which suggests that even small electrical currents from repeated, patterned stim-
ulation may cause epilepsy. At least six of the electrode leads broke during the
study, due to accidental movement of the patient during sleep, which limited test-
ing on pattern recognition. The percutaneous leads and electrodes were removed
after four months.
The NIH Neuroprosthesis Program described above was discontinued by 2001
[183]. However, there is continuing collaboration with the Intracortical visual
prosthesis team at Illinois Institute of Technology (see below).
University of Utah
The University of Utah currently has an active intracortical research group led
by Richard Normann. This team has focused mainly on electrode array design
for stimulation and recording, behavioural experiments and psychophysical ex-
periments (for example the Cha et al. AHV simulation studies described in the
previous chapter).
The University of Utah has developed an array of 100 penetrating cortical
electrodes, each 1.5 mm in length and separated by 400 microns. This length has
been selected to reach level 4Cb of the primary visual cortex (area V1). Level 4Cb
is an area responsible for receiving form information from the lateral geniculate
nucleus (LGN), in which neurons have the smallest and simplest receptive fields,
and where lower thresholds can be used for generating phosphenes [157]. Manual
insertion of the array was found to cause cortical deformation, therefore a pneu-
matic insertion device was also developed and tested [190]. The biocompatibility
of this array has been extensively evaluated, and arrays have been inserted for
up to 14 months in cats [158]. The Utah Electrode Array (UEA) has been in-
vestigated as a recording structure for potential brain-computer interfaces [148],
and recently for investigating representations of simple visual stimuli in the cat
visual cortex [160]. A modification of the UEA is available which has graded elec-
trodes, allowing stimulation and recording to be conducted in both horizontal and
vertical directions [147].
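The footprint implied by 100 electrodes at a 400 micron separation can be worked out with a short sketch. The square 10 x 10 layout is an assumption made here for illustration (the text above gives only the electrode count and spacing), and the function name is hypothetical.

```python
def grid_positions(rows=10, cols=10, pitch_um=400):
    """Return (x, y) electrode positions in micrometres for a regular grid.

    Assumption: a square 10 x 10 layout with uniform 400 micrometre pitch.
    """
    return [(c * pitch_um, r * pitch_um) for r in range(rows) for c in range(cols)]

positions = grid_positions()
# Distance between the first and last electrode along one side: 9 gaps of 400 um.
span_um = max(x for x, _ in positions)
```

Under this assumption the outermost electrodes along one side are 3600 µm (3.6 mm) apart, so the electrodes cover roughly a 3.6 mm x 3.6 mm patch of cortex.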
Cortical Implant for the Blind (CORTIVIS)
The CORTIVIS project, commenced in 2001, is led by Eduardo Fernandez of
the University of Miguel Hernandez in Spain, and involves additional researchers
from Germany, Austria, France and Portugal.
This group has investigated the use of the UEA in animal experiments (cats,
rabbits and rats) over a period of 12 hours to six months. The electrodes were
found to be well-tolerated by the cortex, despite some inflammation in the vicinity
of the electrode tracks [65].
In order to develop a methodology to identify feasibility of a cortical pros-
thesis for a patient, and the preferred location for the prosthesis, Fernandez et
al. [67] have used transcranial magnetic stimulation (TMS) to evoke phosphenes
in 13 legally blind and 19 normally sighted patients. The advantage of TMS is
that it is painless and non-invasive. For each patient, twenty-eight positions ar-
ranged in a 2x2 cm grid over the occipital area were stimulated, and phosphenes
were perceived by 94% of the normally sighted participants. Interestingly how-
ever, only 54% of the legally blind patients perceived phosphenes using TMS
(even after adjusting the stimulation parameters). Evoked phosphenes were to-
pographically organised and the mapping results could generally be reproduced
between participants.
The CORTIVIS project is also developing a retina-like processor [171], de-
signed to simulate the functioning of the human retina to produce optimal elec-
trode stimulation at the cortical level. The output of this system is a series of
spike patterns, which could be used to stimulate neurons in the visual cortex.
In a 2003 study of brain plasticity by the CORTIVIS group [66], fMRI was used
to study the differences in reading Braille in normally sighted and congenitally
blind people. Unlike normally sighted participants, activation of the occipital
cortex (which contains the primary visual cortex) was recorded in blind partic-
ipants. The authors note that where cross-modal plasticity has been activated
in this way, the processing of tactile information is associated with significantly
improved tactile reading skill.
Intracortical visual prosthesis
This project is led by Philip R. Troyk, Director of the Laboratory of Neuro-
prosthetic Research in the United States, and involves collaboration with other
institutions and former staff from the NIH Neuroprosthesis Program. Their ap-
proach is to use small implanted arrays (consisting of eight electrodes) in groups
of intracortical electrodes which ‘tile’ the visual cortex. In a recent (2003) paper,
Troyk et al. [230] describe an interesting animal research model, using a male
macaque monkey, designed to investigate visual prosthesis functioning with this
tiled design. Before implantation, the animal was presented with a flash of light,
and then trained to continue staring at the flash location (so only the memory
of the flash remains). One hundred and ninety-two tiled electrodes were then
implanted into area V1 of the animal. Only 114 electrodes were functioning af-
ter implantation. The receptive field co-ordinates for each implanted electrode
were estimated, and a phosphene was generated in that location. The macaque
received a reward if its eye position moved within 2◦ of the known receptive field
for that electrode. The reported preliminary results indicated that this
approach is a useful method for future AHV research.
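The reward criterion in this animal model (eye position within 2° of the known receptive field) can be expressed as a simple distance test. This is a hedged sketch only: representing the receptive field by a single centre point, using Euclidean distance in degrees of visual angle, and the function name are all assumptions, not details taken from [230].

```python
import math

def within_tolerance(eye_deg, field_deg, tolerance_deg=2.0):
    """True if the eye position lies within tolerance_deg of the field centre.

    Both positions are (horizontal, vertical) pairs in degrees of visual angle.
    Assumption: the receptive field is summarised by its centre point.
    """
    dx = eye_deg[0] - field_deg[0]
    dy = eye_deg[1] - field_deg[1]
    return math.hypot(dx, dy) <= tolerance_deg
```

For instance, with a receptive field centred at (3, 4) degrees, an eye position of (3.5, 5) degrees would be rewarded, while an eye position that remained at the fixation point (0, 0) would not.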
3.5 Retinal Stimulation
The most common non-preventable cause of blindness in the developed world
is age-related macular degeneration. This condition affects the retina at the back
of the eye, while leaving the remaining components of the visual system intact.
Retinal prosthesis research aims to use the remaining visual pathway components
to provide partial restoration of sight. In 1956 an Australian researcher, G.E.
Tassicker, was the first to describe placing a light sensitive selenium plate behind
a blind person’s retina and restoring some intermittent light sensation [225].
There are significant advantages to the retinal approach to AHV. Implantation
of a cortical prosthesis requires intracranial neurosurgery, which may expose
a patient to higher risk. At a fine scale, the mapping of a stimulus to the
appropriate place on the cortex may be variable between subjects [216]. An alternate
approach is to stimulate the eye rather than the brain. A retinal prosthesis could
assist people who still have a functioning optic nerve. In post-mortem exami-
nations of people without light perception, 80% of the optic nerve was found to
be functioning and approximately 30% of the ganglion cell layer was found to
be functioning [109]. However, there may also be continual remodelling by the
retina which could lead to spatial corruption and cryptic synapse formation after
a retinal implant has been attached [142].
The two types of retinal prosthesis are subretinal and epiretinal, located re-
spectively inside and outside the retinal layer as explained in more detail below.
3.5.1 Subretinal stimulation
As mentioned, the information from approximately 95 million receptors in the
retina is reduced down to 1 million fibres in the optic nerve [156]. This
information reduction takes place in the inner nuclear layer (consisting of amacrine,
bipolar and horizontal cell nuclei) of the retina. Targeting this layer, a subretinal
implant is located behind the photoreceptor layer of the retina and in front of the
pigmented layer called the retinal pigment epithelium. Therefore the subretinal
approach (unlike the epiretinal) may be able to utilise the information reduction
functions in the retina, provided the electric field produced does not interfere
with other retina components (such as the ganglion cell layer).
Optobionics Corporation (United States)
Since the 1980s Alan and Vincent Chow have been investigating subretinal mi-
crophotodiodes for subretinal stimulation [36], and their company, Optobionics,
was awarded the original patent for an artificial subretinal device in 1991 [33].
In an early animal experiment, an implanted strip electrode was inserted be-
hind the photoreceptor layer in a rabbit’s eye. The electrical evoked response of
stimulation to the operated eye was compared to the normal eye by presenting
a flash of light, and then measuring the response from the scalp over the visual
cortex. It was found that a brief electrical spike was generated during stimulation
[35]. This experiment demonstrated the feasibility of converting light into elec-
trical energy using subretinal stimulation to produce a cortical electrical evoked
response [143].
A further animal experiment focused on the long term biocompatibility of
subretinal stimulation [37]. Cats were selected for this study as they have both retinal
and choroidal circulation (unlike rabbits). The implants, approximately 50 µm in
thickness with a diameter of 2 to 2.5 mm, consisted of a doped and ion implanted
silicon substrate, surrounded with a gold electrode layer. After implantation in
the cat’s right eye, the arrays were evaluated over 10 to 27 months. During this
time, a gradually decreased response to light was found, due to the dissolution of
the gold electrode layer. In addition, the silicon substrate blocked choroidal nour-
ishment to the retina, which led to a degeneration of the photoreceptors (which
are highly dependent on blood supply for oxygenation). The loss of photorecep-
tors may not be important as they may be damaged anyway, however design work
commenced on a fenestrated design (one containing holes to improve the flow of
nutrients from the choroid to the retina) [37]. The positive findings from this
study were that the implant maintained a stable position over time and there
was no rejection, inflammation, or degeneration of the retina outside the location
of the implant [166].
By June 2000, Optobionics received approval from the U.S. Food and Drug
Administration (FDA) to commence safety and feasibility trials in 6 patients [34].
The Artificial Silicon Retina (ASR), consisting of 5000 microelectrode-tipped mi-
crophotodiodes in a 2mm diameter device, was implanted into the right eyes
of 6 legally blind patients with retinitis pigmentosa. During a follow-up period
of 6 to 18 months, all ASRs were found to function electrically and there were
no signs of rejection, inflammation, erosion, retinal detachment or migration of
the device. During this study it was found that all patients experienced im-
provements in visual function (such as improved colour perception), and there
were also unexpected improvements in retinal areas distant from the implant.
These improvements may have been due to neurotrophic effects (meaning that
the improvement may have occurred due to the presence of a foreign body in the
subretinal space, and not as a result of microphotodiode functioning), and further
studies are intended to explore this improvement. Additional research is planned
to examine the implant and age-related macular degeneration, and whether the
neurotrophic effect can be effective in earlier stages of retinitis pigmentosa [34].
An issue with the Optobionics research has been the lack of an experimental
control (by implanting an inactive device or conducting sham surgery), to evaluate
against the ASR. Pardue et al. [165] have recently conducted research addressing
this issue. Their experiment involved 15 RCS rats, which have a genetic mutation
resulting in photoreceptor degeneration over approximately 77 days. The rats
received either the ASR device, an inactive device, sham surgery, or no surgery.
The outer retinal function was assessed with weekly Electroretinogram (ERG)
recordings. After 4-6 weeks there was a 30-70% higher b-wave amplitude response
with the ASR compared with the inactive device, indicating that the ASR device
appears to produce some temporary improvement in retinal function. However,
after 8 weeks, there was no significant difference in b-wave amplitude response
between the inactive and active devices. At 8 weeks, there was a significantly
greater number of photoreceptors remaining for rats that had received either the
ASR or inactive device compared to those rats that had undergone sham surgery
or no surgery. Pardue et al. [165] suggest that enhanced protective effects from
the ASR may be possible by altering its design to increase current levels or by
increasing environmental light levels to produce higher stimulation levels.
MPD-Array project
After collaborating with the Optobionics group between 1994 and 1995 [38], a
Southern German team led by Eberhart Zrenner at the University Eye Hospital
in Tübingen was formed in 1995 to develop a subretinal prosthesis. In 1996 the
Institute of Micro-Electronics in Stuttgart developed a prototype microphotodi-
ode array (MPDA) containing 7600 microelectrodes on a 3 mm disc, 50 µm in
diameter [258]. In vitro techniques have been predominantly reported by the
German subretinal project.
The first generation of MPDAs were tested using a ‘sandwich technique’,
which involved the retinae from newly hatched chickens being removed and ad-
hered to a recording multielectrode array (the ganglion cell side was adhered).
The photoreceptor outer segments were then damaged, and an MPDA placed
onto the retina. This technique allowed the recording of stimuli from the MPDA
[258]. A later study [259] examined degenerated rat retinae. The retinae were
removed and cut into 5x5mm segments, then attached to a 60-electrode micro-
electrode array. Beams of white light were flashed onto the MPDA and it was
found that intrinsic ganglion cell activity could be recorded even with a highly
degenerated retinal network. Further experiments have shown that it should be
possible to transform the basic features of images, such as points, bars and edges
into activity of the existing retinal network; which suggests that shape percep-
tion and object location may be possible with a subretinal device [213]. However,
recent epiretinal results from Rizzo et al. [184] have not confirmed the pattern
perception of phosphenes from patterned electrical stimulation of the retina.
Further tests have been conducted to assess the biocompatibility and stability
of the MPDA. Various materials were placed in Petri dishes with the retinae
of pigmented rats. For comparison, a control dish contained only the retinae and
solution. None of the MPDA materials showed a toxic effect. Retinal cell cultures
from rats were also used by Guenther et al. to screen for technical implant ma-
terial [85]. Although most materials (including iridium and silica) showed good
biocompatibility, a reduced biocompatibility was found for titanium materials.
Interestingly, a later paper by Hammerle et al. [91] found that titanium nitride
had excellent biostability, both in vivo and in vitro.
Using a method similar to that of the Optobionics research, electroretinography was
performed in rabbits and rats to measure the effectiveness of the MPDA. Because
the MPDA is sensitive to infrared light, it is possible to stimulate the retina and
measure the discharged current. This method should be useful for localising
electrical responses from an MPDA.
As with the early Optobionics MPDA [35], Zrenner et al. found in their early
work that metabolic processes in the photoreceptor layer can be disrupted by the
MPDA, and they placed very thin holes in their device to allow nutrients to be
passed [258].
As natural photoreceptors in the retina are far more efficient than photodiodes,
visible light is not powerful enough to stimulate the MPDA. Therefore infrared
enhancement of the photodiode arrays (by inserting an additional layer in the
array) has been suggested to enhance the stimulation current [195].
The German team commenced in vivo experiments in 2000, when evoked
cortical potentials were measured from Yucatan micropigs and rabbits. The
micropigs have eyes which are comparable in size and function to human eyes
[257]. Fourteen months after implantation, the implant and retina surrounding
it were examined, and there were no noticeable changes to anatomical integrity
[71]. However, because the existing MPDA does not function in ambient light
conditions, an electrode foil prototype with similar properties was implanted.
The micropigs required a higher threshold level than the rabbits [196], however
the implants were successful in producing evoked cortical potentials in half of
the animals tested. The thresholds identified in this study were similar to those
required in epiretinal stimulation [196].
The latest reports from this group concern the results of in vivo experiments
on cats. Volker et al. described the use of optical coherence tomography to
examine the morphological and circulatory conditions of the cat neuroretina and
its interface with an implanted MPDA [237].
3.5.2 Other subretinal methods
A team of Japanese researchers led by Tohru Yagi of Nagoya University has
been investigating the attachment of cultured neurons onto electrodes, and then
guiding the axons towards the central nervous system. As this ‘hybrid retinal
implant’ will not require retinal ganglion cells or an optic nerve, it could be useful
for patients with diseases in these components of the visual pathway. Results of
an experiment culturing neural cells obtained from the spinal cords of a 3-4 week
old rat are described in Ito et al. [112] who found that it was difficult to guide
neurons to grow in a particular direction. Another study by this team investigated
electrical stimulation requirements by stimulating the lateral geniculate nucleus in
a cat. Recordings of the evoked potentials from the cat’s cortex found that pulse
amplitude was a more important factor than pulse duration, and that a biphasic
pulse pattern was the most effective stimulation pattern [120]. Further studies
have suggested using a computer model for the 3-D configuration of electrode
arrays [119].
Peterman et al. are also investigating the use of directed cell growth and lo-
calised neurotransmitter release for a retinal interface. They have been successful
in directing the growth of neurons in a defined direction, using micropatterned
substrates [173] and have demonstrated that the localised chemical stimulation
of excitable cells is feasible. The authors suggest that chemical stimulation can
have a similar spatial resolution as an electrical stimulation, but with the ability
to mimic the major functions of synaptic transmission [174].
An interesting design for a MPDA has been recently reported by Ziegler et al.
(2003), who propose a device where each pixel acts as an independent oscillator
whose frequency is controlled by light intensity [256].
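The idea of a pixel acting as an oscillator whose frequency depends on light intensity can be illustrated with a toy model. Everything specific in this sketch is an assumption made for illustration: the linear intensity-to-frequency mapping, the 1 Hz to 100 Hz range, the 8-bit intensity scale and the function names are not taken from Ziegler et al. [256].

```python
def pixel_frequency(intensity, f_min_hz=1.0, f_max_hz=100.0, max_intensity=255):
    """Map light intensity to an oscillation frequency.

    Assumption: a linear mapping, clamped to [0, max_intensity].
    """
    fraction = min(max(intensity, 0), max_intensity) / max_intensity
    return f_min_hz + fraction * (f_max_hz - f_min_hz)

def pulse_times(intensity, duration_s):
    """Times (in seconds) at which an ideal oscillator would fire within duration_s."""
    frequency = pixel_frequency(intensity)
    count = int(duration_s * frequency)
    return [k / frequency for k in range(count)]
```

In this toy model a brightly lit pixel produces many more pulses per second than a dark one, so light intensity is encoded in pulse rate rather than pulse amplitude.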
Kanda (2004) has suggested an alternative stimulation method for a reti-
nal device called Suprachoroidal-Transretinal Stimulation (STS), which does not
involve the attachment of electrodes to the retina and may result in less compli-
cated surgery for blind patients. In this method the anodic stimulating electrode
is located on the choroidal membrane, and the cathode is located in the vitreous
body. This technique has been used in animal experiments where evoked poten-
tials were recorded from the superior colliculus in rats. The authors are planning
long term, in vivo, biocompatibility studies [118]. However, it has been
demonstrated that neural cells should not be separated from electrodes by more than a
few micrometers (due to overheating, cross-talk between neighboring pixels, and
electrochemical erosion) [164]. The thickness of the choroid is approximately 400
µm, therefore suprachoroidal placement precludes close proximity between elec-
trodes and cells, which will limit the potential visual acuity of the STS approach.
3.5.3 Epiretinal stimulation
An epiretinal device involves a neurostimulator chip being implanted against the
ganglion cells in the retina. This location is different from subretinal implants, in
that it bypasses the information reduction components of the retina. The advan-
tage of the epiretinal approach, however, is that the remaining retinal neurons can
be stimulated in patients who are blind from end-stage photoreceptor diseases.
Retinal Implant
Formerly from the Wilmer Ophthalmological Institute, Johns Hopkins Hospital,
Mark Humayun and Eugene De Juan Jr. are currently based at the Doheny
Retina Institute at the University of Southern California. Humayun’s 1992 PhD
thesis demonstrated that a visually impaired person could perceive phosphenes
during stimulation of the retina [106]. The engineering aspects of developing
electronic stimulators and supporting electronics have been mainly conducted by
Wentai Liu and his team at North Carolina State University [130].
In the first experiment to demonstrate successful phosphene perception from
local electrical stimulation of the retina [110], 14 patients (12 with retinitis pig-
mentosa, and two with age-related macular degeneration) had their inner retinal
surface electrically stimulated under local anaesthesia. The responses were retino-
topically correct (ie. the perceived phosphene location matched the location of
stimulation) in 13 of the patients, with the remaining patient (who was blind from
birth) unable to distinguish anything apart from flashing light. The phosphenes
were perceived exactly with the timing of the electrical stimulation [110]. Flicker
fusion was tested in two subjects and found to occur at approximately 50 Hz (the
phosphenes also appeared brighter at higher frequency) [107]. An earlier 1996
paper also reported on five of these patients [108].
In 1999, a further experiment was reported [109] on nine subjects, involving
arrays of either nine or 25 electrodes. The electrodes were placed against the
retinal surface and were hand-held in place using a silicon-coated cable with the
guidance of a surgical microscope. The flicker fusion frequency was found to be
50 Hz in two subjects and 40 Hz in another two subjects (the remaining subjects
were not tested). By scanning with the head-mounted camera, subjects were able
to perceive simple shapes in response to stimulation (eg. horizontal and vertical
lines and ‘U’ and ‘H’ shapes).
A report on the long term biocompatibility of an implanted, inactive epireti-
nal device was also published in 1999 [140]. Twenty-five platinum disc-shaped
electrodes in a silicon matrix were implanted into the retinal surface of four nor-
mally sighted dogs. The arrays were held in place using metal alloy tacks. Over
a six-month period the implants were biologically well tolerated, mechanically
stable, and could be securely attached to the retinal surface.
A design for a functioning retinal prosthesis system has been described in joint
papers by Liu et al. at North Carolina State University and the Johns Hopkins
team in 1999 [129], [128]. The proposed device, called the Multiple Unit Artificial
Retina Chipset (MARC), consists of the extraocular unit containing the video
camera and video processing board, connected by a telemetric inductive link to
the intraocular unit. The power and signal transceiver, stimulation driver and
electrode array are contained in the intraocular unit.
In 2003, after obtaining FDA approval, the Doheny Eye Institute team and
Second Sight, a company formed by former North Carolina State University team
member, Robert Greenberg and Alfred Mann, developed the first human epireti-
nal implant. A subject with advanced retinitis pigmentosa received an implanted
4x4-electrode array, connected by a subcutaneous cable to an extraocular unit
which was surgically attached to the temporal area of the skull. A wireless link
transferred data and power from a belt worn visual-processing unit to the ex-
traocular unit. All 16 electrodes produced phosphenes, and the subject was able
to detect ambient light, motion and correctly recognise the location of phosphenes
(eg. left vs right, or ‘upside down’). Future plans are to develop more complex
stimulation control and provide a higher number of electrodes [111]. The use of
microwire glass is also being investigated as a method to assist with the mapping
of flat microelectric stimulator chips and curved neuronal tissue [116].
Retinal Prosthesis Project
Following earlier collaborative work with Humayun and de Juan, Wentai Liu and
his team have continued with the development of an epiretinal prosthesis. A 60
electrode stimulating chip, which integrates power transfer and back telemetry
has been developed [131]. One of the advantages of this system would be removing
the requirement for the cable connecting the intraocular and extraocular units
described in the Doheny Eye Institute team implant [111].
Boston Retinal Implant Project
This project is a collaboration between Joseph Rizzo (Massachusetts Eye and Ear
Infirmary-Harvard Medical School) and John Wyatt (Massachusetts Institute of
Technology) to develop an epiretinal prosthesis. The main difference between
their approach and that of Humayun et al. is the use of a miniature laser, located in
a pair of glasses, to transfer power and data to a stimulator chip. Although
the laser is required to be accurately directed to the implant, and needs to cope
with blinking, it will not be affected by electronic noise interference (unlike radio
frequency transmission) [182]. Electrically evoked cortical potentials have been
successfully recorded from stimulation of a rabbit retina with this method [181].
Recently the Boston retinal implant project microelectrode arrays have been
tested with six patients, five of them legally blind from retinitis pigmentosa. The
sixth patient was normally sighted, however their eye required removal due to
orbital cancer. All patients were able to perceive phosphenes in response to stim-
ulation, however the results were mixed. Threshold charge densities were found
to be significantly higher, and above safe levels, in blind patients compared to
the normally sighted patient [184]. In this study, it was often found (for example,
60% of tests in one subject) that multiple phosphenes would be presented when
a single electrode was stimulated. In addition, multiple-electrode stimulation did
not reliably produce matching phosphenes [185].
EPI-RET
Rolf Eckmiller, from the University of Bonn, leads the German EPI-RET project,
which involves 14 research groups. The aim of their first epiretinal device is to
allow blind people to identify the location and shape of large objects [59]. Their
approach involves replicating a healthy retina with a ‘retinal encoder’ device,
which consists of a photosensor array of 10,000-100,000 pixel inputs and simu-
lated output of 100-1,000 ‘ganglion cells’. Eventually this project aims to embed
this encoder into a contact lens. The output from the encoder is then sent to
an implanted retinal stimulator. Eckmiller et al. suggest that a future epiretinal
prosthesis will be tuned (to optimize phosphene perception) during a dialog
between a subject and their retinal encoder [60], [12], [13], [11]. More recently,
a ‘Learning Active Vision Encoder’ (LAVIE) has been proposed to compensate
for spontaneous eye movements (drift or nystagmus) and head movements in the
absence of vision. A smooth pursuit function is also being investigated [61].
Flat platinum microelectrodes have been developed for the EPI-RET project
and evoked cortical potentials have been recorded after stimulation in rabbits
[238]. In 2000, Hesse et al. reported problems with the fixation of the electrode
film and the retina in a cat experiment, partly due to the very thin posterior
sclera [97]. Research into alternate electrode shape and fixation techniques was
planned.
The company Intelligent Implants was formed in 1998 to commercialise re-
search by the EPI-RET group [61].
University of NSW and University of Newcastle Vision Prosthesis
Project
Australian research on an epiretinal prosthetic vision system is occurring at the
Vision Prosthesis Project at the Universities of NSW and Newcastle, led by Gregg
Suaning and Nigel Lovell. This project aims to extend concepts from the devel-
opment of cochlear prostheses.
A 100-channel neurostimulator circuit for the retina has been developed, which
uses bidirectional radio-frequency telemetry for transferring data and power [216],
[217]. A data format protocol has been introduced. The 100-channel neurostim-
ulator was found to function and successfully produce evoked potentials in sheep
[218], [219], [89]. An inexpensive technique for manufacturing platinum spherical
electrodes has also been proposed [220]. Recently, a hexagonal mosaic of
intraocular electrodes has been suggested by Hallum et al. [88] to optimise the
placement of electrodes and therefore improve visual acuity in prosthesis patients.
A proposed prototype for an epiretinal system, capable of 840 stimulating events
per second, using this electrode placement combined with a filtering approach to
image processing, has also been described [215].
3.6 Optic Nerve devices
The optic nerve is a collection of approximately one million individual fibres running from the
retina to the lateral geniculate body in the centre of the brain. This nerve can be
reached surgically and could provide a suitable location for implanting a stimu-
lation electrode array.
Microsystems Based Visual Prosthesis (MiVip) and OPTIVIP projects
(ESPRIT programme of the European Union)
The MiVip team, led by Claude Veraart of the Neural Rehabilitation Engineering
Laboratory, Université Catholique de Louvain in Belgium, has developed a prosthesis
system which includes a spiral cuff silicon electrode to stimulate the optic
nerve.
In February 1998 a 59-year-old blind patient was implanted with the op-
tic nerve visual prosthesis. Localised phosphenes were successfully produced
throughout the visual field, and changing pulse duration or amplitude could alter
their brightness. After training it was reported that the patient could perceive
different shapes, line orientations and even letters [236]. However, this system
only displays one phosphene at a time and pattern recognition was achieved by
the subject scanning with a head-mounted camera over a time period of up to 3
minutes. An interesting feature of this study has been the different phosphene
shapes that have been generated: if these could be reliably replicated they might
add a useful dimension to prosthetic vision. The cuff electrode consists of four
platinum contacts and is able to adapt continuously to the diameter of the optic
nerve. Initially a subcutaneous connector conducted stimulation of the electrode;
however in August 2000 a neurostimulator and antenna were implanted and con-
nected to the electrode. An external controller with telemetry was then used for
stimulating the cuff electrode. Recently, an adaptive neural network technique
has been proposed to classify the phosphenes generated by this device [5], [4].
3.7 AHV simulation studies
Due to the difficulty in obtaining experimental participants with an AHV device
implanted, a number of simulation studies have been conducted with normally
sighted subjects. The simulation approach assumes that normally sighted people
are receiving the same experience as a blind recipient of an AHV system. How-
ever, criticism of this approach has been raised by Weiland and Humayun (2003)
who have stated that human implant studies are the only way of verifying the
effectiveness of a visual prosthesis and have questioned the validity of simula-
tion studies [242]. This criticism has been addressed by Dagnelie (2006) who has
defended the use of AHV simulation studies as they can help identify require-
ments and find solutions for vision tasks; provide examples of prosthetic vision to
clinicians and the public; and also assist in designing rehabilitation programs for
future AHV system recipients [46]. In addition, simulation studies may reduce
the number of animals sacrificed in AHV studies.
As discussed in Section 2.8.3, an often cited prosthetic vision simulation was
conducted in 1992 at the University of Utah by Cha et al. [29], in order to cal-
culate the minimum number of phosphenes required for adequate mobility. The
main findings from this research were that a 25x25 array of phosphenes with a
field of view of 30° would be required for a successful device. However, the simulation
display in Cha et al. used a simple television-like display. Hayes et al. have
described a more sophisticated approach [93]: in their study, two different image
processing applications were used to display simulated phosphenes to a seated
subject, who wore a Head Mounted Display (HMD). The first image processing
application used a simple square phosphene array, where each phosphene con-
sisted of a solid grey scale value equal to the mean luminance of the contributing
image pixels. The second image processing application used a Gaussian filter.
Array size, contrast level, drop-out percentage, simulated phosphene size, and
background noise were adjustable features of the simulation. Object recognition
(plate, cup, spoon, etc), reading, candy pouring and cutting accuracy tasks were
conducted under different simulation conditions. The main conclusion was
that the phosphene array size would be the most important factor in a usable
prosthesis.
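The first display described above (a square array in which each phosphene takes the mean luminance of its contributing pixels) can be sketched as follows. This is only an illustrative sketch and not the implementation used by Hayes et al.; the function name block_mean_phosphenes and the use of NumPy are assumptions, and the input is assumed to be a grey-scale image at least as large as the phosphene grid.

```python
import numpy as np

def block_mean_phosphenes(image, grid=25):
    """Reduce a grey-scale image to a grid x grid array of block means.

    Hypothetical helper: each output cell is the mean luminance of the
    image pixels that fall inside the corresponding block.
    Assumes the image has at least `grid` rows and columns.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    # Block boundaries; linspace copes with sizes not divisible by `grid`.
    r_edges = np.linspace(0, rows, grid + 1).astype(int)
    c_edges = np.linspace(0, cols, grid + 1).astype(int)
    out = np.empty((grid, grid))
    for i in range(grid):
        for j in range(grid):
            block = image[r_edges[i]:r_edges[i + 1], c_edges[j]:c_edges[j + 1]]
            out[i, j] = block.mean()
    return out
```

For example, a 160x120 camera frame passed to block_mean_phosphenes(frame, grid=25) would give a 25x25 array with one value per simulated phosphene.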
Another image processing approach investigated the requirements for AHV
facial recognition [226]. A Low Vision Enhancement System (LVES) connected
to a PC and driven by a Visual Basic program was used to display the images.
Subjects were required to select which simulation image best matched a set of
four normal images of human faces (the images of the same person were varied by
head angle and whether the person was smiling or serious). All images displayed
occupied a visual field of 13° horizontally and 17° vertically. The simulation
display was presented in a circular ‘dot mask’, rather than the contiguous square
blocks. Electrode properties (such as drop-outs, size and gaps), contrast and grey
levels could be varied experimentally. The grid sizes used in this study varied
from 10x10 to 32x32 phosphenes. The authors found high accuracy for all high
contrast tests (except those with significant drop out and two gray levels) and
suggest that reliable face recognition using a crude pixelized grid is feasible.
Research at the Queensland University of Technology (QUT), Australia by
Boyle et al. [19], has examined the use of various image processing techniques
(such as enhancing edges, using different grey scales and extracting the most im-
portant image features) to identify a recognition threshold for low quality station-
ary images. These images are used to represent the limited number of phosphenes
available to the subject (typically a 25x25 array). This research has found that at
these low information levels the use of image processing techniques is not helpful
in the identification of static scenes, although an automatic zoom feature did help
image understanding.
3.8 Evaluation of current AHV systems
With the current understanding of neuronal mechanisms in the visual system,
AHV systems do not appear likely to replace the functioning of normal human
vision for some time. It is also currently difficult for microelectrodes to pro-
vide a regularly organised array of phosphenes [230]. It should be noted that
as the development of AHV systems continues, research into retinal transplan-
tation, growth factors and gene therapy has commenced which may also provide
alternative treatment options for blindness.
In the long term AHV systems are likely to offer many benefits, including mo-
bility, face recognition and reading which will have a profoundly positive effect
on the blind recipient. AHV research also offers important insight on the func-
tioning of the human visual system, and in brain-computer interface technology.
However, in the immediate future it will be important to consider whether the
benefits from the use of these systems outweigh the cost. Despite the overloading
of another sensory input channel, traditional mobility aids and ETA devices (such
as the vOICe system from Peter Meijer [150]), are currently cheaper, less invasive
and may require a similar amount of training to AHV systems. Additionally,
many people who are classified as blind are elderly, and still have some remaining
vision, and therefore may not be suited to an AHV system.
The need for standard psychophysical assessment methods has been noted
by a number of AHV researchers (for example, [215], [229] and [132]). To inform
consumers on the benefits of an AHV system compared to other technical aids
for the blind, future research comparing the effectiveness of these devices would
be useful. The lack of a method to compare mobility was also raised by Dobelle
in 2000 [48]. However, as discussed in the previous chapter, there are a number
of mobility assessment methods presented in the Orientation and Mobility Lit-
erature which could be useful for comparison of AHV systems and other devices
(recent examples are [137], [95], [74]).
A number of additional AHV review papers (which cover the same literature
discussed in this chapter) have been published. These reviews include: [46], [143],
[147], [240] and [235] (in German). A list of AHV project web sites is provided
in Appendix 1 of this thesis.
3.9 Chapter Summary
The subretinal implants developed by the Optobionics Corporation show the
greatest promise in restoring some vision; however there are doubts over whether
the improvements in vision are due to a neurotrophic effect or to the device itself.
Further tests to determine the reason for the improvements are planned. If the
device is responsible, it is conceivable that their implants may be available in the
next few years.
The cortical implant system from the Dobelle Institute is commercially available;
however it has not been approved by the U.S. Food and Drug Administration.
It is difficult to obtain outcome information from the Dobelle system;
however, one recent article in the Wall Street Journal [152] reported that a 33-year-old
female recipient was able to use it for only 15 minutes per day (because it was
tiring and caused head pain).
The remaining cortical and optic nerve systems are still in varying stages of
preliminary human or animal testing. Preliminary research has also commenced
on microstimulation of the lateral geniculate nucleus [175]. Although progress
is being made, it does not appear likely that a commercial system using these
methods will be available within the next five years.
Finally, psychophysical and mobility assessment standards would help in com-
paring AHV systems with other technical aids for the blind.
Chapter 4
A Framework for Blind Mobility
Improvement via Computer
Vision
4.1 Introduction
The previous two chapters have reviewed mobility problems and assessment for
the blind, and current AHV system technology. This chapter examines how infor-
mation can be effectively presented to a blind person via the perceived phosphenes
from an AHV system. The main constraint on the amount of information which
can be provided using an AHV system is the limited number of electrodes which
can be stimulated, which limits the display spatial resolution. As a result, meth-
ods are required in an AHV system to reduce the resolution of images captured
from a video source. An example of this reduction in spatial resolution is shown in
Figure 4.1. Figure 4.1a shows a typical mobility hazard in the form of a telephone
booth in Latrobe St., East Brisbane, Australia. A reduced 25x25 resolution image
and its 625-phosphene representation are shown in Figures 4.1b and 4.1c.
A symbolic phosphene display, which would highlight the telephone booth (at
appropriate proximity) is displayed in Figure 4.1d. There are a large number of
computer vision methods which could be discussed in this Chapter, however the
research reviewed has been limited to those methods which are computationally
efficient and stable (as mentioned in the research scope section of Chapter 1).
Computer vision, also known as image analysis or machine vision, usually in-
volves analysing an image or sequence of images and providing information about
the image contents (for example, by recognising an object within an image). A
closely related, and overlapping, field, image processing, involves the enhancement,
compression, reconstruction and restoration of images (for example, by highlighting
edges). The output from image processing is an image, whereas the output
from computer vision is usually information (which can often be used to construct
new images) [205]. Both of these fields are important in enhancing the
effectiveness of an AHV system display. As computer vision methods often con-
sist of image processing methods (such as noise reduction), in this thesis image
processing is considered a subset of computer vision.
This chapter contains four related sections. The first section is an overview
of current computer vision (and image processing) methods which are useful
for blind mobility enhancement. All of the methods presented have been applied
either in the experiments described in this thesis, or in the applications reviewed
in the second section of this chapter, which covers computer vision applications
developed to assist the blind and visually impaired. The third section briefly
discusses the links between computer vision methods and functionality provided
by the Human Vision System (HVS), which was discussed in the previous chapter.
In the fourth and final section, the literature reviews from this and the previous two chapters
are integrated to develop a novel conceptual framework for AHV research. This
framework is used to guide the remaining chapters in this thesis.
Figure 4.1: Example of reduced visual information in an AHV system. Image (a) shows a street scene in suburban Brisbane; in image (b) the resolution of this image has been reduced to 25x25 pixels. Image (c) shows a simulated 25x25 phosphene display of the same image. A sample symbolic representation of the mobility hazards contained in the street scene is shown in image (d).
4.1.1 An information processing approach to computer
vision
The work of J.J. Gibson was discussed in Chapter 2, and has been influential
in the fields of perception (particularly visual), O&M and computer vision (particularly
the concept of optic flow). Although Gibson’s work helped identify
what information is required for a person or animal to function in an environ-
ment, it did not provide details on how the central nervous system performs
these functions. In the late 1970s the approach of David Marr provided a modern
framework which acts as a bridge between brain neurophysiology, visual percep-
tion and computer vision. Marr’s main contributions to computer vision were
on edge detection, stereopsis (using the combined information from two slightly
different images to calculate depth), and object representation in the brain. Marr
proposed that three levels of understanding are required for a system to carry out
an information processing task [144]:
1. Computational theory: This level addresses the question: what is the goal of
this task and what is the logic required to carry it out? For example, how
can an object be identified from an image? Research at this level is similar
for biological and computer vision [141].
2. Representation and algorithm: The main question addressed at this level is
how the computational theory is actually implemented. For example, how
are images recorded and processed (for example, by neuronal operations)?
3. Hardware implementation: For a biological system, this level is mainly concerned
with the anatomy and optics of the eye. A computer vision implementation
would involve computer hardware.
4.2 Computer Vision
The goal of computer vision has been defined as achieving machine behavior
which is similar to biological systems [206]. Computer vision is a complex field
with an enormous amount of literature. In this chapter an attempt will be made
to summarise only the most relevant material. This chapter also includes refer-
ences to relevant visual perception research (such as attention, motion and depth
perception).
Computer vision involves the processing of images captured from an image
source (such as a digital camera). A digital image can be considered a two-
dimensional (2D) array of numbers (or pixels). These numbers can represent light
intensities, distances or other physical quantities [232]. The spatial resolution of
an image refers to the size of the image array (for example, 160x120 pixels).
Image sequences (video) can be used to record temporal-spatial information. A
computer vision system typically has three hardware components: a camera,
which is usually a Charge-Coupled Device (CCD) or CMOS sensor, a frame grabber
to convert the camera signal to a rectangular array of NxM integer values, and a
host computer. Colour images require red, green and blue (RGB) values for each
pixel, which are often represented by a 24 bit value, where 8 bits (containing 256
values) are allocated for each colour. The pixels in a grey-scale image are usually
represented by an 8-bit number (256 values).
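The image representations described above can be illustrated with a short sketch. It assumes NumPy arrays are used to hold the images; the helper names pack_rgb and unpack_rgb, and the choice of placing red in the most significant byte of the 24-bit value, are assumptions made for illustration only.

```python
import numpy as np

# A 160x120 grey-scale image: one 8-bit value (0..255) per pixel.
# NumPy indexes arrays as (rows, columns), i.e. (height, width).
grey = np.zeros((120, 160), dtype=np.uint8)

# The same spatial resolution in colour: three 8-bit channels per pixel.
rgb = np.zeros((120, 160, 3), dtype=np.uint8)
rgb[0, 0] = (255, 128, 0)  # set the top-left pixel to an orange colour

def pack_rgb(pixel):
    """Pack (R, G, B) into one 24-bit integer (assumed byte order: R, G, B)."""
    r, g, b = (int(v) for v in pixel)
    return (r << 16) | (g << 8) | b

def unpack_rgb(value):
    """Recover the three 8-bit channel values from a packed 24-bit integer."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```

Each channel occupies 8 bits, so a packed colour pixel can take one of 2^24 values, compared with 256 values for a grey-scale pixel.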
There are two main complementary methods for the processing of captured
images in an AHV system: information reduction and scene understanding.
4.2.1 Information reduction
Most existing AHV system efforts are aimed at the information reduction level,
which is concerned with the reduction or collapse of visual information. Infor-
mation reduction, which overlaps with some aspects of image processing, is also
referred to as low-level information processing, and usually assumes very little
knowledge about the content of images. Operations on images at this level are
designed to improve image saliency, or to emphasise features of particular impor-
tance or relevance, for example curbs or walls.
Methods at this level which are useful for blind mobility improvement com-
monly involve image compression, image filtering, edge detection and image sharp-
ening to identify objects within the image.
Image filtering
Image convolution is a fundamental operation in image analysis. It involves
setting the value of a pixel using a transformation function based on the values
of neighbouring pixels. A mask called a kernel (usually a square array) is used
and its values are often referred to as ‘weights’. An example 3 x 3 kernel is shown
in Figure 4.2. This kernel is then moved across each pixel position in the image;
at each position the neighbouring pixel values are multiplied by the corresponding
kernel entries, and the products are summed to give the output of the convolution operation.
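The convolution operation described above can be sketched directly. This is a deliberately simple (and slow) illustration rather than an optimised routine; the function name convolve2d, the replication of border pixels at the image edges, and the restriction to odd-sized kernels are assumptions.

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 2D convolution with an odd-sized kernel.

    Border pixels are replicated so the output has the same size as the
    input (an assumed border policy; others are equally valid).
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Strict convolution flips the kernel; for symmetric kernels
    # (uniform, Gaussian, the kernel in Figure 4.2) this has no effect.
    flipped = kernel[::-1, ::-1]
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    out = np.empty_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + kh, c:c + kw]
            # Multiply each neighbour by its weight, then sum the products.
            out[r, c] = np.sum(window * flipped)
    return out
```

Applying convolve2d with the kernel from Figure 4.2 leaves a constant image unchanged, because the kernel weights sum to one.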
Image filtering is a method of transforming image intensities to reduce noise,
or emphasise certain features. Convolution can be useful for image smoothing,
for example by using a kernel with the same weighting for all values (called a
uniform filter). Smoothing can also be performed by replacing each pixel with the
median of the grey-values which surround it (a 3x3 median filter), although this is a
non-linear operation rather than a convolution. The Gaussian filter is another widely
used low pass filter and uses a kernel where central pixels have a higher weighting.
A 3x3 Gaussian filter kernel is shown in Figure 4.3. A high pass filter can be used
to sharpen images, such as the Laplacian kernel shown in Figure 4.2. The effects
of low and high pass filtering are shown in Figure 4.4.
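The smoothing and sharpening kernels discussed above can be constructed as in the following sketch. The weights shown in Figure 4.3 are consistent with a 3x3 Gaussian sampled with a standard deviation of one pixel and normalised to sum to one; this value of sigma is inferred from the figure rather than stated in the text. The function names are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    """Sample a 2D Gaussian on a size x size grid and normalise to sum 1."""
    half = size // 2
    ax = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def median_filter3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")  # replicate border pixels
    # Stack the nine shifted views of the image, then take the median.
    shifts = [padded[r:r + rows, c:c + cols]
              for r in range(3) for c in range(3)]
    return np.median(np.stack(shifts), axis=0)

# Uniform (mean) filter: every weight is the same.
uniform = np.full((3, 3), 1.0 / 9.0)

# The kernel of Figure 4.2: the identity kernel minus the 4-neighbour
# Laplacian, which is why its weights sum to one.
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
```

The median filter is useful for isolated noise pixels, since a single extreme value in a 3x3 neighbourhood never becomes the median.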
0 −1 0
−1 5 −1
0 −1 0
Figure 4.2: The 3x3 Laplacian kernel for high pass filtering (for image sharpening).
0.0751 0.1238 0.0751
0.1238 0.2042 0.1238
0.0751 0.1238 0.0751
Figure 4.3: An example 3x3 Gaussian low pass filter kernel for image smoothing. Note the centre element has the greatest weight (0.2042) compared to the others.
The computer vision methods discussed to this point have dealt with images
in the spatial domain (that is, based on a Cartesian grid composed of pixels). The
Fourier transform is a commonly used tool to separate the specific frequency
ranges in images. Filtering in the frequency domain can be conducted by restricting
an output image to certain frequencies (for example, a low-pass filter
would block high frequency content). The fast Fourier transform (FFT) provides
an efficient implementation of the Fourier transform for image processing. For
large kernels it is generally computationally more efficient to filter images in the
frequency domain than to perform convolution in the spatial domain. The regularity
in the arrangement of objects can be identified more easily in the frequency domain. For
example, leaves on a tree typically show a random spatial arrangement, whereas
bricks in a wall would produce highly structured patterns [198]. There are many
alternate processing approaches which can be used, including the Gabor filter
(which has been found to model the behaviour of receptive fields in the visual
cortex of monkeys; for example, [178]), the Haar transform and the Laplace transform.
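Low-pass filtering in the frequency domain, as described above, can be sketched with NumPy's FFT routines. The function name fft_low_pass and the use of an ideal circular mask (which keeps every frequency within a cutoff radius and removes the rest) are assumptions; an ideal mask is the simplest choice but is known to introduce ringing, so smoother masks are often preferred in practice.

```python
import numpy as np

def fft_low_pass(image, cutoff):
    """Keep only spatial frequencies within `cutoff` samples of zero frequency."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    # Forward transform, shifted so zero frequency sits at the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Distance of each frequency sample from the centre of the spectrum.
    r = np.arange(rows) - rows // 2
    c = np.arange(cols) - cols // 2
    dist = np.sqrt(r[:, None] ** 2 + c[None, :] ** 2)
    mask = dist <= cutoff
    # Block the high frequencies, then transform back to the spatial domain.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real
```

With a cutoff of zero only the mean intensity of the image survives, while a very large cutoff returns the original image.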
Figure 4.4: An example of high and low pass filtering on an image. A grey scale post box image is shown in image (a). Image (b) shows the image after it has been filtered using the 3x3 Laplacian high pass filter (detailed in Figure 4.2). Image (c) shows the result of applying the Gaussian low pass filter from Figure 4.3. This image has been taken from an image sequence captured using a low quality PDA card camera (this sequence and camera are described in more detail in Chapter 6). As the camera was moving at the time of capture there is a significant amount of motion blur. It is anticipated that image quality from cameras used for AHV systems will improve as technology advances.
Global image processing
Global image analysis considers the entire image. For example, histogram equalisation
can be used to provide an image with a more uniform distribution of grey scale
values. This can be achieved by obtaining a histogram of the image; then obtaining
the cumulative distribution of grey levels; and finally, replacing the original
grey level intensities with those from the (scaled) cumulative distribution. Grey levels can
also be set to black, white, or selected grey levels to emphasise areas of
a certain brightness (this is called thresholding).
Figure 4.5: An example of contrast enhancement by histogram expansion. The base image (a) shows a Brisbane suburban bus shelter. (b) shows the distribution of the 256 grey-scale values in image (a). The contrast in image (c) has been enhanced using histogram equalisation. The histogram of image (c) is shown in (d).
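The three histogram equalisation steps and the thresholding operation described above can be sketched as follows. The function names and the scaling of the cumulative distribution to the range 0 to 255 are assumptions; the input is assumed to be an 8-bit grey-scale image.

```python
import numpy as np

def equalise_histogram(image):
    """Histogram equalisation for an 8-bit grey-scale image."""
    image = np.asarray(image, dtype=np.uint8)
    # Step 1: histogram of the 256 possible grey levels.
    hist = np.bincount(image.ravel(), minlength=256)
    # Step 2: cumulative distribution, normalised to the range 0..1.
    cdf = np.cumsum(hist) / image.size
    # Step 3: replace each grey level by its scaled cumulative value.
    lookup = np.round(255 * cdf).astype(np.uint8)
    return lookup[image]

def threshold(image, level):
    """Set pixels at or above `level` to white (255) and the rest to black (0)."""
    return np.where(np.asarray(image) >= level, 255, 0).astype(np.uint8)
```

A low contrast image whose grey levels occupy only a narrow band is spread over the full range by equalise_histogram, while the ordering of the grey levels is preserved.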
Colour transformation is also a global operation. Images captured from a
digital camera generally use the RGB colour format, whereas most computer
vision operations are performed using grey-scale values. To convert from RGB
format to grey-scale, the value of each pixel (Y) can be set using formula 4.1
(from [191]). This formula has been used to convert captured images in the
experiments described in Chapters 5 to 8 of this thesis.
Y = 0.299R + 0.587G + 0.114B (4.1)
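Formula 4.1 is a weighted sum of the three colour channels, so it can be applied to a whole image in one operation. The sketch below assumes an RGB image stored as a height x width x 3 NumPy array with the channels in R, G, B order; the function name is hypothetical.

```python
import numpy as np

# Channel weights from Formula 4.1, in R, G, B order.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_grey(rgb):
    """Convert an (H, W, 3) RGB array to an (H, W) array of grey-scale values Y."""
    # The matrix product takes the weighted sum over the last (channel) axis.
    return rgb.astype(np.float64) @ LUMA_WEIGHTS
```

The weights sum to one, so a white pixel keeps its full brightness, while the large green weight means that a pure green pixel appears much brighter than a pure blue one.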
Edge detection
Edge detection is a frequently used technique for information reduction in an
image. The output from edge detection is useful for further processing (line
detection and image segmentation). Cortical prosthesis research by the Dobelle
Institute has found that edge detection and image reversal enhance the ability
of subjects to recognise important scene components (such as doorways) [48]. In
this section, two widely used edge detection methods are briefly described. A
comparison is made between the Sobel and Canny methods for reduced spatial
resolution in Chapter 5. A comparison of the Sobel, Roberts and Canny edge
detection methods is shown in Figure 4.9.
The simplest edge detection methods involve checking for local spatial variations
in pixel values. This variation can be obtained by calculating a discrete
approximation of the directional difference (i.e. the gradient) between adjacent
pixels. A significant brightness change in a small spatial area will indicate an
−1 −2 −1
0 0 0
1 2 1
Figure 4.6: The 3x3 kernel used for Sobel horizontal edge detection.
−1 0 1
−2 0 2
−1 0 1
Figure 4.7: The 3x3 kernel used for Sobel vertical edge detection.
edge. Differences between adjacent pixels which are greater than a certain thresh-
old value are identified as edges. Changes in texture, lighting or image noise can
incorrectly result in identifying pixels as edges. The Sobel filters (which are
shown in Figures 4.6 and 4.7) are commonly used for edge detection and combine
differentiation with Gaussian smoothing to reduce noise [113].
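As an illustration of how the two kernels in Figures 4.6 and 4.7 are used, the following sketch filters an image with each kernel, combines the two responses into a gradient magnitude and applies a threshold. It is a simplified example under stated assumptions: the image is a 2-D NumPy array, the one pixel border is discarded, and the function names and the choice of threshold are hypothetical.

```python
import numpy as np

# Kernels as shown in Figures 4.6 and 4.7.
SOBEL_HORIZONTAL = np.array([[-1, -2, -1],
                             [ 0,  0,  0],
                             [ 1,  2,  1]], dtype=float)
SOBEL_VERTICAL = np.array([[-1, 0, 1],
                           [-2, 0, 2],
                           [-1, 0, 1]], dtype=float)

def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image; the output omits the 1 pixel border."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

def sobel_edges(image, threshold):
    """Return a boolean edge map where the gradient magnitude exceeds threshold."""
    image = image.astype(float)
    gy = filter3x3(image, SOBEL_HORIZONTAL)  # responds to horizontal edges
    gx = filter3x3(image, SOBEL_VERTICAL)    # responds to vertical edges
    return np.hypot(gx, gy) > threshold
```

For a vertical step from 0 to 100, the vertical kernel gives a response of 400 in the two output columns that straddle the step and zero elsewhere, so any threshold below 400 marks exactly those columns as edges.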
Canny [26] proposed a more advanced edge detection method with three main
objectives: low error rate, well-localised edge points and finally to have only one
response to a single edge (as opposed to using horizontal and vertical responses
to calculate the overall response of the Sobel edge detector). The Canny edge
detector also uses Gaussian smoothing to find image gradients with high spatial
derivatives; however, a number of additional steps are then applied. In order to
detect weak edges, the Canny method uses two thresholds (low and high). Edges
above the lower threshold are only included if they are connected to an edge which
is above the higher threshold. An example of the reduced noise and effective edge
identification from the Canny method is demonstrated in Figure 4.9b.
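The two-threshold step can be illustrated in isolation. The sketch below is not a full Canny detector (it omits the smoothing, gradient and edge thinning stages); it only shows how weak edge pixels are kept when they are connected to a strong edge pixel. It assumes a pre-computed gradient magnitude array and 8-connected neighbourhoods, and the function name is hypothetical.

```python
from collections import deque

import numpy as np

def hysteresis_threshold(magnitude, low, high):
    """Keep pixels above `high`, plus pixels above `low` connected to them."""
    strong = magnitude > high
    candidate = magnitude > low
    edges = strong.copy()
    h, w = magnitude.shape
    # Grow outwards from every strong pixel through neighbouring candidates.
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and candidate[ny, nx] and not edges[ny, nx]):
                    edges[ny, nx] = True
                    queue.append((ny, nx))
    return edges
```

A weak response that sits next to a strong one is therefore retained, whereas an isolated weak response of the same magnitude (which is more likely to be noise) is discarded.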
Line detection
Many of the mobility tasks discussed in Chapter 2 involve the identification of
lines in the environment (for example path following, doorway identification).
Figure 4.8: Sobel edge detection applied to a captured image of a post box (a). Image (b) shows the result of the horizontal Sobel edge kernel. The output from the vertical Sobel edge kernel is shown in image (c).
Figure 4.9: A comparison of different edge detection methods applied to an image of a suburban footpath (a). The output from the Canny detector is shown in (b). The Sobel detector is shown in (c), and the Roberts edge detector is displayed in (d).
The Hough transform is a common method for detecting straight or curved lines
and is robust to noise and additional structures in the image [206]. This transform
is used in a number of prototype devices described in Section 4.3.
The Hough transform is applied to the binary output image from an edge de-
tection algorithm, and identifies which points are associated with particular lines.
This can be done by representing lines by their Cartesian or polar coordinates.
Using the polar coordinate system, each pixel (x,y) in the input binary image is
converted using Formula 4.2 to the Hough transform parameters (r,θ).
r = x sinθ + y cosθ (4.2)
Generally an accumulator array is used to store the Hough parameter results
generated from Formula 4.2. If one or more lines exist in the image, there will
be multiple pixels with the same parameter results in Hough space and therefore
the accumulator array will be highest for these lines. The Hough transform does
not return the exact length of lines in an image, but returns the description of
these lines. Because the line orientation is available from the (r, θ) parameters,
it is possible to search for lines with a specific orientation (for example, when
searching for stairs in an image).
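The accumulator array can be sketched as follows. The example uses Formula 4.2 exactly as written above (note that many texts swap the sine and cosine terms, which only changes the reference direction from which θ is measured). The one degree angular step, the rounding of r to whole pixels and the function names are assumptions made for illustration.

```python
import numpy as np

def hough_accumulate(edges, n_theta=180):
    """Vote in (theta, r) space for every edge pixel of a binary image."""
    ys, xs = np.nonzero(edges)
    thetas = np.arange(n_theta) * np.pi / n_theta   # 0 <= theta < pi
    r_max = int(np.ceil(np.hypot(*edges.shape)))    # largest possible |r|
    accumulator = np.zeros((n_theta, 2 * r_max + 1), dtype=int)
    for t, theta in enumerate(thetas):
        # Formula 4.2: r = x sin(theta) + y cos(theta), rounded to a whole pixel.
        r = np.round(xs * np.sin(theta) + ys * np.cos(theta)).astype(int)
        # Offset by r_max so that negative r values index valid columns.
        accumulator[t] += np.bincount(r + r_max, minlength=2 * r_max + 1)
    return accumulator, thetas, r_max

def dominant_line(edges):
    """Return (r, theta) of the accumulator cell that received the most votes."""
    accumulator, thetas, r_max = hough_accumulate(edges)
    t, r_index = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return int(r_index) - r_max, float(thetas[t])
```

With this form of the formula, a horizontal line along image row 7 places all of its votes in the cell r = 7, θ = 0. Restricting the search to rows of the accumulator near a chosen θ is one way to look only for lines of a specific orientation.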
An example of applying the Hough transform is provided in Figure 4.10. In
this example, the transform is used to find the dominant line, which is the lower
edge of a fence.
Morphology
Morphology is a useful technique for image analysis, which is often used for noise
reduction and feature detection in binary images [198]. The morphological op-
erator is referred to as a structuring element, and it is similar to a filter kernel.
The two fundamental operations in morphology are erosion (which is used to thin
Figure 4.10: Example application of the Hough transform for locating the fence boundary shown in image (a). Image (b) shows the output from Sobel edge detection. The corresponding Hough transform output is shown in image (c), with the origin in the top left hand side of the image. This transform image was generated using software from Seul et al. [198]. The horizontal axis represents r, and the vertical axis represents θ, which increases from 0 radians in the top left corner to π radians at the bottom. The dominant peak, indicating the dominant line, is shown with a superimposed box. Image (d) shows the pixels which are present along the dominant line found by the Hough transform.
region boundaries or increase the size of gaps in an image) and dilation (which is
used to thicken region boundaries or close holes in an image) [56]. In Chapter 5,
a dilation operation was applied to enhance the appearance of edge information
in the reduced 50x50 pixel resolution images.
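Erosion and dilation of a binary image can be sketched with a square structuring element. This is an illustrative example only: the 3x3 default element, the treatment of pixels outside the image as background, and the function names are assumptions, and are not necessarily those used for the experiments in Chapter 5.

```python
import numpy as np

def dilate(binary, size=3):
    """Thicken regions: a pixel is set if ANY pixel under the element is set."""
    pad = size // 2
    padded = np.pad(binary.astype(bool), pad, constant_values=False)
    h, w = binary.shape
    out = np.zeros((h, w), dtype=bool)
    for dy in range(size):
        for dx in range(size):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(binary, size=3):
    """Thin regions: a pixel is kept only if ALL pixels under the element are set."""
    pad = size // 2
    padded = np.pad(binary.astype(bool), pad, constant_values=False)
    h, w = binary.shape
    out = np.ones((h, w), dtype=bool)
    for dy in range(size):
        for dx in range(size):
            out &= padded[dy:dy + h, dx:dx + w]
    return out
```

Dilating a single edge pixel with the 3x3 element produces a 3x3 block, which is why dilation makes thin edges more visible at low resolution; eroding that block with the same element returns the original pixel.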
Image segmentation
A common requirement for computer vision systems is to extract image compo-
nents (such as people, faces, cars and other objects) from the image background.
The process of subdividing an image into parts is called image segmentation.
The main methods of segmentation include:
• The simplest segmentation method is to apply luminance thresholding to
individual pixels. For example, if a dark object is located against a light-grey
background (such as text on a printed page) the object could be identified
from the dark pixels and the background could be ignored. This method
could also be used to identify particular colours (for example, in identifying
fruit in automatic harvesting applications).
• Boundary detection can be used to identify segments from an edge detection
output image. Edge linking methods (such as the Hough transform, curve
fitting, or active contours) can be used to join segments with similar object
boundaries.
• Model based segmentation can be used when the object’s geometric shape
is known a priori. Parallel lines in an image are easily identifiable in Hough
space as the peaks also occur in parallel (which is useful for identifying a
square, or lines in a stair-case). The generalised Hough transform [9] can
be used to identify circles or other shape types.
• Region based methods involve ‘growing’ segments from individual or small
groups of pixels. Region growing can group neighbouring pixels with simi-
lar characteristics such as grey-levels. The split and merge technique uses
small blocks which are joined if they have a similar grey-level [104]. In the
watershed algorithm [186] local minima are found through the image, and
these pixels are ‘seeded’. These seeds are then ‘grown’ (or flooded) until the
region boundaries are established (areas of high edge magnitude) to prevent
the flood from spreading into neighbouring regions.
• Texture based methods can be used to segment an image into regions
with similar textures. Autocorrelation (a measure of the amount of repetition
within an image) and statistical methods, such as the grey-level
co-occurrence matrix and run length matrices, have been previously used to
measure texture [224]. Frequency domain methods, such as Gabor filters
and wavelets, are also useful for texture analysis [84].
• In addition, when using multiple images, motion can be used to segment
objects of interest. A difference image can be created by subtracting a
previous image from a current image which will identify parts of the image
which have changed (this method is useful for identifying moving objects
against a static background, for example in video surveillance applications).
Motion can also be used to estimate optical flow [103],[42] which is discussed
in the next section on scene understanding.
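The difference image described in the last item of the list above can be sketched as follows. The example assumes two 8-bit grey-scale frames of equal size from a static camera; the function name and the default threshold of 25 grey levels are hypothetical.

```python
import numpy as np

def difference_mask(previous, current, threshold=25):
    """Mark pixels whose grey level changed by more than `threshold`."""
    # Convert to a signed type first so that the subtraction of unsigned
    # 8-bit values cannot wrap around.
    diff = np.abs(current.astype(np.int16) - previous.astype(np.int16))
    return diff > threshold
```

The resulting boolean mask identifies the parts of the image which have changed between the two frames. With a moving camera (as in a head-mounted device) almost every pixel changes, which is one reason optical flow methods are considered instead.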
Human perception research
An additional method of segmenting an image is to use information from human
eye tracking experiments. This can be achieved by recording a person’s eye
movements while they are looking at an image or image sequence. By computing
which parts of each image have received the most eye fixations, it is possible to
determine the important regions of interest (ROI) within images (assuming that
the eyes fixate on the more important areas of the image). A number of image
components which influence eye fixations have been identified and include motion,
contrast, colour, size, location, shape, foreground/centrally located objects, edges,
texture, prior instruction and context, people, gestalt properties, clutter and
complexity, and unusual stimuli [161].
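One simple way of computing which parts of an image received the most fixations is to count fixations in a coarse grid laid over the image. The sketch below is an assumption made for illustration and is not the method used in the cited studies: fixations are taken to be (x, y) pixel coordinates, and the 4x4 grid and the function name are hypothetical.

```python
import numpy as np

def fixation_counts(fixations, image_shape, grid=(4, 4)):
    """Count eye fixations falling in each cell of a grid over the image."""
    height, width = image_shape
    counts = np.zeros(grid, dtype=int)
    for x, y in fixations:
        # Map the pixel position to a grid cell, clamping to the last cell.
        row = min(int(y * grid[0] / height), grid[0] - 1)
        col = min(int(x * grid[1] / width), grid[1] - 1)
        counts[row, col] += 1
    return counts
```

The cell with the highest count can then be treated as a candidate region of interest, under the assumption stated above that the eyes fixate on the more important areas of the image.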
Task and context have been found to be important in eye movements. If
sighted observers are asked to view a picture with a specific task or context in
mind, these become important predictors of eye movements [250]. An importance
map concept has been previously developed [162] in which the most visually
important areas of an image receive a weighting. These weightings compare well
to recorded eye movements of the same images. The importance map concept
has also been extended to the automatic detection of important areas in complex
video sequences [163]. These areas include moving objects and centrally located
objects.
For an AHV system, it may be useful to record the eye movements of normally
sighted people while they accomplish common mobility tasks (such as a road
crossing). These data could then be used to estimate the most useful image
components for an AHV system user, and possibly highlight these regions during
real-time use of an AHV system.
As discussed in Chapter 3, previous work on static image AHV simulation by
Boyle et al. [18],[20] has examined the use of various information reduction tech-
niques such as enhancing edges, using different grey-scale levels and presenting
the results of importance mapping on image recognition. Boyle et al. reported
that at low information levels (generally 25x25 pixels) the use of image process-
ing techniques is not helpful in the identification of static scenes, although an
automatic zoom feature was found to aid image understanding.
4.2.2 Scene understanding
The scene understanding component of computer vision is concerned with identi-
fying features and extracting information [81]. This level is also referred to as high
level computer vision. This section will provide a brief overview of object recog-
nition, symbolic representation, motion analysis, obstacle avoidance and machine
learning.
An example application of scene understanding might be to identify a bus
stop, fire hydrant or traffic light in an image. It has been suggested that because
reading and navigation tasks by the blind are possible using non-implant devices
(such as text-to-sound conversion or a cane) the most useful tasks for an AHV
system user may involve scene understanding (such as face recognition) [230]. It
may also be useful to know the distance to the object (number of steps, or time
at current walking speed).
Object Recognition
One of the aims of scene understanding is to determine what the objects are in an
image. To do this, the characteristics or features of these objects must be known
a priori. Segmentation (discussed in the previous section) is usually required to
break the image into regions before each of those segments is then processed to
determine if it belongs to a particular type of object.
Three main approaches to object recognition can be identified, although there
is some overlap between these approaches (for example, the output from template
matching and shape analysis can be used as features for pattern recognition):
1. Template matching, also known as matched filtering, involves searching for a
known shape within an image. This involves creating a template array, then
moving this array over each pixel position in the image, and calculating the
correlation between the template and the pixel neighbourhood. The correlation
result is obtained by multiplying each image pixel with the matching
template pixel and summing the results: the pixel position with the highest
result should be the closest match to the template [179]. However, this
method only works for objects of the same shape and size within an image.
An array of templates for different geometric transformations could
be generated to help find varied instances of objects of interest, although
this increases the computational complexity of finding the
best template match within an image.
2. Shape analysis involves matching image segments against previously iden-
tified object geometric features. These features, which are usually obtained
from binary images, include the area, perimeter boundary and curvature
of objects. Statistical moments are also commonly used to describe ob-
jects, and provide attributes which are independent of size, position and
orientation [198]. Fourier descriptors can also be used to describe region
boundaries [81]. A good review of shape representation and description can
be found in Zhang and Lu [254].
3. Statistical pattern recognition attempts to classify object classes where they
have previously been defined (supervised classification) or attempts to de-
fine differences between object classes (called unsupervised classification or
clustering) [241]. Object recognition applications generally use supervised
classification, where the first main stage involves identifying features of an
object class (such as colour, shape, texture) [96]. Linear transforms, such
as Principal Component Analysis (PCA), have also been widely used for
feature extraction and dimensionality reduction (for example in face recog-
nition) [255]. The next object recognition stage involves the classification
of the extracted features into a class of objects. Jain et al. [114] have suggested
the three main methods used for classifiers are similarity (like the
template approach, the pattern is assigned to the class with which it is most
closely correlated), probability (assign a pattern to the class with the maximum
posterior probability) or decision boundaries (which focus on the minimisation
of criteria such as mean squared error). Artificial Neural Networks
(ANNs) are often used as an efficient implementation platform for classic
statistical pattern recognition methods [114]. A good recent review of image
processing using neural networks is presented by Egmont-Peterson et al.
[62]. An example ANN classifier for enhanced blind mobility is described
in Section 4.3. Additional pattern recognition methods include Support
Vector Machines (SVM) [25] and Hidden Markov Models (HMM) [125].
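The correlation used for template matching (the first approach in the list above) can be sketched directly from its description: the template is moved over every position, the overlapping pixels are multiplied and summed, and the highest score is taken as the best match. The function name is hypothetical, and the image and template are assumed to be 2-D NumPy arrays.

```python
import numpy as np

def match_template(image, template):
    """Return the (row, col) of the best match and the full array of scores."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y + th, x:x + tw]
            # Multiply each image pixel by the matching template pixel and sum.
            scores[y, x] = np.sum(window * template)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return (int(row), int(col)), scores
```

Note that this unnormalised sum also grows with the overall brightness of the window, so a very bright region can outscore a true match; normalised forms of correlation are commonly used in practice to reduce this effect. The nested loops also make clear why searching with an array of templates multiplies the computational cost.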
Motion analysis
The motion of a person provides visual information about movement relative
to the environment and information about the depths of observed scene points
[154]. Therefore the analysis of image sequences is desirable in a mobility device.
An interesting feature of image sequences is that less spatial resolution may be
required when image contents move [80]. One monocular method of judging
depth is motion parallax which is used when objects are moving at equal speed:
those which are closer to the observer seem to move faster. This information is
one method used to obtain mobility information by sighted people (for example,
when approaching a railway platform).
Obstacle avoidance
Obstacle detection and avoidance involves a combined estimate of ego-motion and
scene structure, as objects only become obstacles when they are in an observer’s
anticipated path. When a human perceives that an object is about to hit their
head there are usually stereotypical movements of the head and closing of the eyes.
Gibson has suggested that this action is based on the characteristic increases in
the size of the object as it approaches the head [77]. Human infants have been
found to display this behaviour in computer simulations of collision, and diving
gannets also appear to use changes in the retinal image to decide when collision
with water is about to occur [17]. A simple approach to obstacle avoidance is to
examine optical flow in the left and right halves of the visual field and to turn
in the direction of smallest optical flow (bees use a similar method for travelling
down a corridor) [141]. An alternate method is to obtain two or more contiguous,
segmented images from an image sequence and calculate which segmented
components of the image have changed size between images. If these segments
continue to expand past a threshold rate and size (perhaps 25% of the display)
then a looming alert warning should occur. Following this assumption, the block
based obstacle alert presented in Chapter 6 estimates the optical flow (discussed
in Section 2.7) of looming image segments in front of a head-mounted camera to
provide a real-time warning to research participants using an AHV simulation.
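The size-based looming test described above can be sketched as a simple rule. This is only an illustration of the stated idea and is not the block based obstacle alert of Chapter 6: it assumes the area (in pixels) of one tracked segment is already available for each frame, and the growth factor of 1.2 per frame and the function name are hypothetical. The 25% size fraction follows the figure suggested in the text.

```python
def looming_alert(areas, display_area, min_growth=1.2, size_fraction=0.25):
    """Return True if a segment keeps expanding and has become large.

    areas: areas in pixels of the same segment in consecutive frames.
    display_area: total number of pixels in the display.
    """
    if len(areas) < 2:
        return False
    # Threshold rate: every frame must be at least min_growth times the last.
    growing = all(later >= min_growth * earlier
                  for earlier, later in zip(areas, areas[1:]))
    # Threshold size: the segment now covers enough of the display.
    large = areas[-1] >= size_fraction * display_area
    return growing and large
```

For a 50x50 display (2500 pixels), a segment whose area grows through 200, 300, 450 and 700 pixels triggers the alert, whereas a large but static segment, or a growing segment that is still small, does not.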
Symbolic representation
When objects of interest have been recognised by an AHV system it could be
appropriate to present a symbolic representation, where an idealised or reduced
image is presented. For example a small part of the phosphene grid (perhaps 5x5
phosphenes) could be used for information on obstacle locations in the current
environment. Figure 4.1d shows a symbolic depiction of a looming obstacle.
Auditory information could also be provided either by tone or through natural
language. A scene description mode could be useful (similar to the system by
Tou et al. [228], discussed below in Section 4.3). Research on raised line pictures
for the blind could be useful for deciding on symbolic representations of objects.
Knowledge representation
The interpretation of objects depends on knowledge of possible objects, and might
also depend on context (for example, an outdoor scene versus a home environ-
ment). For orientation, it may be useful if an AHV system using a scene under-
standing approach could learn to recognise new objects - for example, an image
of a particular type of building (such as a tram stop) could be added to the ex-
isting object knowledge base. This interpretation combines methods from object
recognition with knowledge of the expected image content (an area of Artificial
Intelligence (AI) known as knowledge representation [138]).
4.3 Previous applications of computer vision to
assist the vision impaired
This section reviews research which has applied computer vision methods to assist
vision impaired people. In 2005 the first workshop on computer vision for
the visually impaired was held during the annual Computer Vision and Pattern
Recognition (CVPR 2005) conference in San Diego. Topics of the papers presented
included wayfinding (orientation); visual, audio and tactile interfaces; and sign
detection. A paper based on Chapter 7 of this thesis [57] was also presented at
this conference.
The purpose of this section is to discuss the different image-based approaches to
the task of providing useful information to the blind and to evaluate these efforts.
It may be useful to integrate components of this research with AHV system soft-
ware. The eleven papers which are surveyed below all involve prototype systems
only. An additional aid, the vOICe system, was discussed in section 2.4.3 and
is the only freely available blind mobility software which uses a computer vision
approach.
Obstacle avoidance
An early mobility device was reported in 1985 by Tou & Adjouadi [228]. This
system used spoken output to describe the current scene. Two modes were pro-
vided: the first attempted to identify a safe route for the traveller, using an
analysis of grey levels within the image. The second mode used scene analysis in
an ‘object-identification’ mode. This mode attempted to use the aspect ratio (the
ratio between the width and height of the image) of identified objects to cate-
gorise an object into three classes: long thin objects (such as a pole or mail box),
square or circular objects (such as a pot hole), and large objects (such as a car or
wall). When an obstacle was detected, the system provided a warning and asked
the user to walk slowly. If object identification was required, the blind person
would need to stop walking and wait while this processing took place. An image
correspondence technique was used to identify drop-offs. Although this system
was too slow for real-time use, it demonstrated that computer vision
techniques could be useful with future improvements in hardware speed.
More recently, a real-time hazard detection system for low vision was proposed
by the Human Interface Technology Laboratory at the University of
Washington [3]. Although results were not provided, this paper presented an
interesting computer vision approach. The real-time system collected image data
using a small head-mounted camera, and displayed an enhanced image on to
an optical scanning virtual retinal display. The system assumed that common
hazards (such as curbs, stairways and doors) would be based on straight lines;
therefore the Hough transform was used to detect straight lines. The set of
lines for each image were then passed to a neural network for classification. The
input vectors used were the orientation and position of each line, and the output
vector was a confidence value reflecting the likelihood of a hazard. The system
was trained using several minutes of video from doors and staircases captured
at the University of Washington. Misclassification was associated with poor lighting or
failure of the camera to focus or adjust to lighting conditions (camera details were
not provided).
A 1998 paper by Snaith et al. [204] reported on the use of edge detection
to determine the positions of lines in an image. The grouping of these lines
was used to classify objects (such as doorways). Paths were also identified using
edges and the Hough transform was used to group these into straight lines. The
dominant vanishing point was then identified to indicate a person’s direction of
travel. A similar approach for a blind mobility device was investigated by Molton
et al. [151]. Their device used stereo vision, combined with sonar for obstacle
detection and curbs. Once an image was captured, edge points were detected and
the Hough transform used to locate parallel line clusters (which were assumed to
represent curb or path information).
Distance information
A head-worn stereo camera based device was proposed by Cyganek and Borgosz
[43]. Their software calculated depth information using a disparity map created
from two captured images using a small central window of the combined images.
In a hardware implementation the authors propose that this window could be
adjusted by the user depending on the context. The calculated depth information
was then converted to a stereo sound output for a blind pedestrian. This paper
shows impressive output from three images; however, it is difficult to evaluate
the software as no details were provided on computational efficiency or results of
depth mapping accuracy.
Stair case detection
The identification of stair cases was addressed by Se & Brady [197]. This research
used a texture detection method (using Gabor filters) to locate distant stair cases.
Once a person had moved close enough to the stairs, they were then detected by
searching for groups of concurrent lines. The intensity variation was then used
to partition the convex and concave lines. Once the stairs were identified, a
further step was applied to find their vertical rotation and slope (to help a blind
person find the stair base and work out how steep the steps were). The vertical
rotation and slope of the stair case were then used to transform the captured
image into a new image with the camera facing the stairs. Although reasonable
results were achieved, the approach was found to be slow and not suitable for
real-time applications.
General object recognition
An additional object recognition system for blind mobility was developed by
Everingham et al. at the University of Bristol [63]. This system used a trained neural
network implementation to classify segments from an image into previously de-
termined object classes (road, pavement, sky, building, vegetation, obstacle or ve-
hicle). Once identified, these object regions were displayed in different colours to
people with low vision. Two databases, the Bristol Image Database (200 outdoor
and suburban images) and the Bristol Blind Mobility Database (10 ‘challenging’
urban scenes) were used to train the neural network classifier. Thirty-five differ-
ent features from each object were used for each feature vector (listed in Table
4.1). This system performed adequately on a restricted number of environments,
however the Gabor calculations are computationally expensive and the system
required 9.5 seconds for each image. By discarding selected texture frequencies
the processing time dropped to 300 ms at slightly lower classification accuracy. A
pilot experiment using this approach with static images was conducted with 16
legally blind participants and was found to increase the rate of object recognition
(compared to the unmodified original images) by more than 100%.
Table 4.1: The feature set used in Everingham et al. [63]
Feature   Description
1         Size (proportion of image)
2-3       Position ((x, y) co-ordinates)
4-5       Orientation (sin & cosine of angle)
6-8       Colour (mean color components)
9-18      Shape (invariant Fourier descriptors)
19-35     Texture (mean Gabor magnitude)
Sign Detection
Chen and Yuille [32] investigated automatic text detection using a cascade classi-
fier. The cascade approach is efficient as it searches for individual sets of features
(such as regions which do not contain text) and then eliminates these regions
from further classification. This method was used to develop text-detection soft-
ware which was trained on 423 street scene images and 4000 images without text.
When tested on a database of 530 test images, the system was able to process
40 fps with a 91% detection rate. Chen and Yuille suggest that this application
could be useful to identify image regions of interest for a blind person, who could
then choose to zoom in on these regions.
Two different papers on sign detection and classification were also presented
at CVPR 2005 from researchers at the University of Massachusetts. In the first,
Silapachote et al. [203] use local colour and texture features to identify sign
regions, and then classify these regions by comparison with previously identified
sign classes. A correct sign classification rate of 97% was reported. In the second
paper, Mattar et al. [146] use a similar approach to sign detection, and provide
the results of tests on 3975 sign images from two different image datasets (in-
corporating variations in lighting, orientation and viewing angle). Mattar et al.
reported a recognition accuracy of 99.5% with 35 sign classes, and 92.8% when
65 sign classes were used.
Drop off detection
As discussed in Chapter 2, changes in terrain depth (drop-offs) are a significant
problem for blind mobility. The only paper which has explicitly addressed drop-
off detection is by Yuan and Manduchi [252]. In their novel approach a prototype
hand held ‘virtual white cane’ was built which contained a laser pointer and a
camera. Matched filtering was used to detect the laser light return from captured
images. This range information was then provided to a user using a tactile or
auditory display at 15 Hz.
Pedestrian crossing detection
The final paper reviewed in this section is by Uddin and Shioyama [234], who
investigated the use of computer vision to assist blind mobility by detecting the
standard black and white lines for pedestrian crossings. Their approach was to
segment the image and search for regions which are highly bipolar (that is, in a
histogram there will be two peaks representing the darker and lighter grey-scale
pixels). The location, direction and band frequency were then analysed (by checking
that there were four or more white bands in the region) to extract the crossing. A
collection of 100 static images was processed, resulting in 95% detection accuracy;
however, there were no details provided on computational efficiency or the effects
of other scene factors (such as lighting, clutter or occlusions).
In summary, a number of papers have been reviewed which describe
various experimental computer vision based devices for the blind. The most
widely used computer vision techniques are the use of edges, the Hough trans-
form and statistical pattern recognition. Most of the systems described process
individual images in an image sequence, rather than using information (such as
object movement) from the differences between images. A common constraint in
much of this research has been the difficulty in providing output quickly enough
for the device to be useful for a mobile pedestrian. Few of the systems described
in these papers were evaluated by visually impaired people. It is also unclear
in a number of papers how the proposed systems would be used in practice (for
example how the information would be conveyed to a user). It would be useful
to assess objectively whether these computer vision based devices lead to an im-
provement in mobility performance compared to traditional mobility aids. None
of the proposed systems has been developed commercially.
4.4 Relationship between computer vision methods
and the Human Vision System (HVS)
One the main functions for an AHV system is to convert information captured
from a non-biological sensor (such as a camera) and convert these signals into a
representation which can be interpreted by a blind person. There are significant
parallels between computer vision methods (many of which have been inspired
by biological systems) and the methods used by the human vision system. In
this section a brief discussion of the relationship between artificial and biological
algorithms for extracting information from images is provided. The aim of this
section is to provide a link between the computer vision methods discussed in this
chapter and the HVS review provided in the previous chapter.
Light is initially captured and converted to neural signals by approximately
100 million photoreceptors in the retina of the human eye [239]. An artificial
system relies on a camera (such as a Charge-Coupled Device (CCD)) which
converts a captured image into a two dimensional array of numbers.
As discussed in Chapter 3, a large amount of processing occurs on the signals
generated by the photoreceptors, which reduces approximately 100 million signals from
the retinal rods and cones to around 1 million ganglion cells [156]. The retinal
ganglion cells generally have concentric receptive fields which react to light falling
on a central region, but are inhibited by light falling in the surrounding area (an
ON-centre cell). As these cells do not react to uniform patches of light, they
provide information on contrast borders [227]. In addition, the photoreceptors in
the central fovea of the eye are more densely located than in other parts of the
retina which assists in data reduction. These biological methods are similar to
the image processing methods of filtering, edge detection and data compression.
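The parallel between ON-centre receptive fields and image filtering can be illustrated with a minimal sketch. It assumes a greyscale image stored as a list of lists and models a cell very crudely as the centre pixel minus the mean of its eight neighbours; the function names are hypothetical and this is not a physiological model.

```python
def on_centre_response(image, row, col):
    """Centre pixel minus the mean of its 8 neighbours (a crude ON-centre cell)."""
    surround = [
        image[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if (r, c) != (row, col)
    ]
    return image[row][col] - sum(surround) / len(surround)


def centre_surround_map(image):
    """Apply the ON-centre response to every interior pixel of a 2D list."""
    height, width = len(image), len(image[0])
    return [
        [on_centre_response(image, r, c) for c in range(1, width - 1)]
        for r in range(1, height - 1)
    ]


# A uniform patch of light produces no response.
uniform = [[5] * 5 for _ in range(5)]
# A dark-to-bright border (between columns 2 and 3) does.
border = [[0, 0, 0, 8, 8] for _ in range(5)]

print(centre_surround_map(uniform)[0])  # [0.0, 0.0, 0.0]
print(centre_surround_map(border)[0])   # [0.0, -3.0, 3.0]
```

The non-zero values appear only next to the contrast border, which is the behaviour the text attributes to cells that "do not react to uniform patches of light".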
The left lateral geniculate nucleus (LGN) is responsible for combining and
routing signals from the left and right sides of the retina in each eye to the visual
cortex [24]. The layers in the primary visual cortex are responsible for processing
depth perception and the orientation of receptive fields (such as bars of light or
edges in a particular orientation) [156]. Although most visual information appears
to be processed first in the primary visual cortex, there are many other areas in-
volved in processing visual information such as V2, V3, the mid-temporal cortex
(where neurons are particularly sensitive to stimulus movement), V4 (colour pro-
cessing) and the inferotemporal cortex (where stimulus size, shape, contrast and
colour appear to be processed) [227]. Attempts to mirror these biological pro-
cessing methods in computer vision research include texture processing, contour
extraction, segmentation, shape analysis, depth perception, motion detection and
face and object recognition.
In summary, increasingly sophisticated information is extracted in the human
vision system as information moves through each stage. These methods of infor-
mation extraction from images are approximated by computer vision methods,
which generally rely on a camera, frame grabber and computer. Table 4.2 pro-
vides a summary of the main types of computer vision functionality processed by
the HVS.
Table 4.2: Overview of computer vision functionality performed by each part of the HVS (based on Thorpe [227]).

HVS Location                      Functionality approximated by computer vision
1. Retina (Photoreceptors)        Image capture
2. Retina (Processing)            Image enhancement (e.g. ON-centre cells for contrasting borders)
                                  Edge detection
                                  Data compression
                                  Short range motion detection
                                  Colour constancy
3. Lateral Geniculate Nucleus     Routing of information from each eye
4. Cortical processing (V1-V4)    Texture processing
                                  Contours and spatial frequency
                                  Segregation between figure and ground
                                  Segmentation
                                  Determining shape of objects
                                  Stereoscopic depth perception
                                  Long range motion detection
                                  Colour of visual stimuli
                                  Face recognition
                                  Object recognition
4.5 A conceptual framework for AHV system
information display
Although the development of an AHV system involves research from a diverse
range of specialists, there has not been a unifying framework which combines the
requirements of blind end-users with different AHV system components. In this
section a new proposed conceptual framework (shown in Figure 4.11) is presented
which is based on the literature reviews of blind mobility (presented in Chapter
2), AHV technology (Chapter 3) and computer vision literature reviews (Chapter
4). The conceptual framework is discussed in detail below, and also provides the
context in which the remainder of this thesis is presented.
The conceptual framework is made up of different influences (for example, the
weather or location) which will affect how information from an AHV system (or
other mobility device) is perceived. The arrows in the framework indicate how
different factors interact: for example, both lighting (a dynamic, scene factor) and
camera resolution (an external, AHV technology factor) will influence the effectiveness
of a number of computer vision methods (for example, histogram equalisation may
be effective in enhancing images in dull lighting). The output from the computer
vision processing stage can either be used for the AHV display, or it can be
presented by another display modality (for example with an auditory warning).
Dynamic factors can often have an effect on mobility effectiveness without any
computer vision processing: an example of this would be a person’s knowledge
that they are holding a handrail while walking down a sloping path.
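As an illustration of the histogram equalisation example mentioned above, the following is a minimal sketch of the textbook cumulative-histogram method. It assumes a greyscale image stored as a list of lists of integers; the function name and the tiny test image are hypothetical, and a practical system would more likely use an optimised library routine.

```python
def equalise_histogram(image, levels=256):
    """Histogram-equalise a greyscale image (2D list of ints in [0, levels))."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Only one grey level is present, so there is nothing to stretch.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# A "dull" image whose grey levels occupy only a narrow band.
dull = [[100, 101], [102, 103]]
print(equalise_histogram(dull))  # [[0, 85], [170, 255]]
```

The four closely spaced grey levels are spread across the full 0 to 255 range, which is why the method can help when a scene is poorly lit.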
In the next section each main component involved in the conceptual framework
will be briefly discussed. This is followed by a hypothesised scenario involving
a person using an AHV system to perform a number of mobility related tasks.
Finally the hypothesised scenario is linked back to the framework and a number
of benefits of a framework approach are discussed.
Dynamic factors
In the proposed framework, dynamic factors are those which relate to the current
situation and goals of a mobile person. As a person moves, these inter-related
factors can change rapidly. The identified dynamic factors are:
• Context: Situations and environments evoke different expectations of what
behaviour and actions are possible. Context is used here to describe the
expectations (or schema) that a person will have in different situations,
which will in turn affect the type of mobility information required. For
example, a person walking along a beach may not need to identify straight
lines within captured images. However, if the beach is very crowded a
looming obstacle alert may be useful.
[Figure 4.11: Factors which influence the display processing for an Artificial Human Vision system. Components shown: dynamic factors (context, scene properties, task, sensory information, environment), external factors (AHV technology, human factors, non-image sensors), computer vision methods for AHV, AHV display type, other display modalities and mobility performance.]
• Scene properties: The extraction of information from captured scenes is
affected by a number of properties, such as lighting, low contrast, clutter
and texture. Different methods (such as automatic gain in a camera, or
filters) may be required to compensate for undesirable scene properties. A
number of scene properties have been identified in Chapter 2 as important
for low vision mobility (and likely also for AHV mobility). These properties
include lighting conditions, glare and visual clutter.
• Task: Different information is required depending on the current task.
A road crossing task may emphasise a straight path to the opposite curb
(to prevent veering), whereas a task involving identifying a set of keys on a
cluttered table may involve zooming or object recognition. Face recognition,
reading and walking up and down stairs were also identified as important
tasks for blind and low vision mobility in Chapter 2.
• Sensory Information: Auditory cues (such as the sound of an approach-
ing object) are particularly important for blind mobility and navigation.
Tactile cues (such as hand-rails or braille strips on a footpath) are also
important for effective mobility. In addition, temperature changes (for ex-
ample from an air-conditioning unit) and smells (such as a bakery or partic-
ular plants) also provide sensory information. Finally, proprioception (the
sensation of motion, position, location and orientation of a person’s body
in space) provides dynamic information about the current environment (for
example, when walking up or down a slope).
• Environment: The dynamic properties of the physical environment are
also important for a blind traveller. These properties include the weather
(for example, rain), landmarks, people or rubbish bins on a footpath. Af-
fordances (discussed in Chapter 2) are properties of the environment which
represent relationships between a person and the environment (such as a
door handle which affords a person the means to open a door) [79]. Addi-
tional examples of affordances are signs and paths.
External factors
This group of inter-related factors is important for displaying information;
however, these factors do not change while a person is moving (and are therefore
external to the current mobility situation).
• AHV Technology: The different components of an AHV system (dis-
cussed in detail in Chapter 3) affect the amount of information which can
be obtained (by camera properties such as frame rate, resolution and field
of view), processed and displayed (for example by the limited number of
electrodes and presentation frame rate restrictions). Potential bottlenecks
in an AHV system include the (possibly wireless) links between camera,
processor and stimulator unit.
• Human factors: Individual psychological and physical differences between
people may also affect the information display required from an AHV sys-
tem. Example differences include the amount of mobility training received,
duration of blindness, motivation, age, memory, expectancies and gender.
• Non-image sensors: In addition to image information captured using a
camera, information about the environment can be provided from other
sensors. Ultrasound and laser technology has been used in a number of
ETAs for the blind and the information from these sensors could be inte-
grated into an AHV system display (for example for collision detection).
Global Positioning System technology could provide information on a per-
son’s current location and could be integrated with mapping software to
display navigation information.
Computer vision methods
After they are acquired from a camera, images need to be processed before they
can be used for an AHV display. As there are a limited number of electrodes
available, captured images will usually need to be reduced to a lower spatial reso-
lution (for example from 160x120 pixels to 16x12 phosphenes). A large number of
additional computer vision methods, discussed in Section 4.5, can be applied to
enhance the effectiveness of an AHV display. The computer vision methods used
will depend on the dynamic and external factors discussed above. For example,
if a person is searching for a blue shop sign, a helpful computer vision method
may be to apply a blue colour filter and select an edge display (assuming the sign
has straight edges, for example in a square or rectangle). The computer vision
system may also combine information from non-image sensors.
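The reduction from 160x120 pixels to 16x12 phosphenes can be sketched as follows. This is a minimal illustration which assumes block averaging as the resizing method (the text does not specify one) and a greyscale image stored as a list of rows; the function name is hypothetical.

```python
def downsample(image, out_rows, out_cols):
    """Reduce a 2D greyscale image by averaging non-overlapping blocks."""
    in_rows, in_cols = len(image), len(image[0])
    if in_rows % out_rows or in_cols % out_cols:
        raise ValueError("input size must be a multiple of the output size")
    block_h, block_w = in_rows // out_rows, in_cols // out_cols
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            block = [
                image[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


# Synthetic 160x120 frame (120 rows of 160 pixels): dark left half, bright right half.
frame = [[0] * 80 + [255] * 80 for _ in range(120)]
phosphenes = downsample(frame, 12, 16)

print(len(phosphenes), len(phosphenes[0]))  # 12 16
print(phosphenes[0][7], phosphenes[0][8])   # 0.0 255.0
```

Each phosphene here summarises a 10x10 pixel block, so detail finer than a block is lost. This is the reason the additional enhancement methods described in the text may be needed.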
AHV Display Type
• Standard Display: The method used in most current AHV prototype
systems is to resize captured images to a lower resolution and then use
each pixel in the reduced resolution image to drive a single electrode. The
resizing may be combined with a smoothing filter (to reduce noise) and
edge detection (as in the Dobelle system [48]). A simulation of the standard
display is shown in Figure 4.1b.
• Alert: As discussed in Chapter 2, looming obstacles and drop-offs are
serious problems for a blind pedestrian. It would be beneficial if an AHV
system could continually search for hazardous features of the current scene.
These alerts, such as an approaching tree branch (obstacle detection) or
descending stairs (drop off) could run as background tasks, and interrupt
the current display when required (for example, by filling a quarter of the
current display with bright phosphenes).
• Symbolic: This type of display, shown in Figure 4.1d, would extract salient
objects from captured images and display a symbolic, or cartoon-like repre-
sentation. Therefore this display mode would rely on a scene understanding
approach. For example, a person searching for a sign could have any sign
objects in the current image shown as a group of four phosphenes.
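The alert behaviour described in the list above (interrupting the current display by filling a quarter of it with bright phosphenes) could be sketched as below. The choice of quadrant, the brightness value and the function name are assumptions made for illustration only; the text does not specify which quarter of the display would be used.

```python
QUADRANTS = {"top_left", "top_right", "bottom_left", "bottom_right"}


def overlay_alert(display, quadrant="top_left", bright=255):
    """Return a copy of the phosphene display with one quadrant fully lit."""
    if quadrant not in QUADRANTS:
        raise ValueError("unknown quadrant: " + quadrant)
    rows, cols = len(display), len(display[0])
    half_r, half_c = rows // 2, cols // 2
    row_range = range(0, half_r) if quadrant.startswith("top") else range(half_r, rows)
    col_range = range(0, half_c) if quadrant.endswith("left") else range(half_c, cols)
    alerted = [row[:] for row in display]  # leave the original display untouched
    for r in row_range:
        for c in col_range:
            alerted[r][c] = bright
    return alerted


display = [[0] * 16 for _ in range(12)]  # blank 16x12 phosphene display
alerted = overlay_alert(display, "top_left")
lit = sum(value == 255 for row in alerted for value in row)
print(lit, 16 * 12)  # 48 192
```

Because the function returns a copy, the interrupted display can be restored once the hazard has passed.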
Other display modalities
Although the primary method of displaying AHV information would be from elec-
trodes, additional information (particularly a warning) could be presented using
auditory channels (for example, an alarm sounding in the left ear could represent
a looming collision on that side of a person’s body), or tactile channels (such as a
vibrating alert on the left shoulder for a collision). However, sensory substitution
may overload an existing sensory input which could reduce the effectiveness of an
AHV device (for example if a person was required to wear headphones to hear the
audio alert). By including a non-AHV display in the framework, it is possible to
compare traditional and ETA mobility aids with an AHV system (these devices
were discussed in Chapter 2).
Mobility Performance
The final component of the mobility framework represents the dependent vari-
ables used to measure mobility. The mobility performance component allows the
experimenter to assess how the factors in other framework components affect an
individual’s mobility effectiveness. As discussed in Chapter 2, a large number
of these have been included in previous O&M studies, with three of the most
common being Percentage of Preferred Walking Speed (PPWS), number of times
veering has occurred, and contact with obstacles.
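As a brief illustration of the first of these measures, PPWS can be computed as sketched below. The formula is not given in this section, so the sketch assumes the conventional definition (walking speed on the assessment course expressed as a percentage of the person's preferred walking speed); the function and parameter names are hypothetical.

```python
def ppws(course_distance_m, course_time_s, preferred_speed_m_per_s):
    """Percentage of Preferred Walking Speed, under the assumed definition."""
    course_speed = course_distance_m / course_time_s
    return 100.0 * course_speed / preferred_speed_m_per_s


# A 30 m course walked in 40 s by a person whose preferred speed is 1.5 m/s.
print(ppws(30.0, 40.0, 1.5))  # 50.0
```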
4.5.1 Hypothesised operational scenario
The following simplified example illustrates the relationship between the factors
shown in Figure 4.11 and shows the type of computer vision system which is
possible with current techniques.
This hypothetical example takes place a few years in the future and involves a
40 year old female, K, who has no light perception. K lost her sight five years ago
as a result of non-arteritic ischemic optic neuropathy (a condition in which the
optic nerve does not receive sufficient blood flow). As this condition damages the
optic nerve, K is unable to use a retinal or optic nerve prosthesis. Twelve months
ago K received a new generation intra-cortical implant. After surgery, training
and calibration K is able to use this system for around 10 hours each day.
In this scenario, K needs to travel from her suburban house by bus to a music
store in the city. K has travelled with a normally-sighted friend previously;
however, this is her first independent trip. K exits her house, follows the driveway
to a gate, and steps onto the pavement. K knows the tactile feeling of the
pavement. She orients herself in the direction of the local bus stop, and presses
one of the buttons on a small wireless computer located inside her pocket. A tiny
wireless camera, located inside a pair of glasses worn by K, captures and transmits
images to this computer. The 32x24 phosphene display is updated to show a
familiar symbolic menu. K selects the sign recognition display mode. As the bus
approaches the stop, K is able to confirm the bus number, and signals for the bus
to stop.
As the bus travels toward the city, K watches for known landmarks. She
cannot rely on timing the journey or counting the number of times the bus stops,
as many of the stops are empty and the bus continues without stopping. Therefore
K switches to a symbolic map mode display and selects her destination in the
city. The computer uses GPS information to plot K’s current location. As the
bus approaches the target location, K confirms the location with the bus driver, who
stops at the required stop. K exits the bus and uses the GPS map to orient herself
in the correct direction, and then switches back to the 32x24 phosphene display.
As K walks along a pavement toward the city centre, the display automatically
switches to 16x12 phosphenes with a faster frame rate. K slows down as she
approaches an intersection, and the display switches automatically back to 32x24
phosphene mode. This intersection has traffic lights, so K selects a ‘walk traffic
signal’ recognition mode, which flashes when the walk signal is displayed. K
also listens to ensure that the traffic has stopped before crossing. As she walks
toward the shops the path becomes crowded and the automatic looming obstacle
alert is frequently shown. K remembers the locations of tactile strips in the
pavement and uses these while walking into the main shopping area. K switches
her display to a doorway identification mode (using object recognition software)
which utilises text identification software to automatically alert K to written signs.
K navigates to the music shop and enters the shop. In addition to the sound of
the salesperson’s voice, face recognition software confirms that K is talking with
the same person she spoke with the week before.
4.5.2 Benefits of a conceptual framework for AHV infor-
mation display
In the hypothesised scenario above, various computer vision methods are used to
enable K’s trip from home to a music shop. Although the AHV system provides
useful information for K, much of the information she requires is provided by
other sources (such as auditory and tactile cues, pre-existing knowledge of the objects
in the environment, and mental maps to help her navigate). The AHV system
provides a number of different processing modes, such as object recognition,
character recognition and looming obstacle alerts. These modes can be initiated by K, or
displayed automatically (such as the looming obstacle alert). To be useful, the
system needs to provide a consistent interface (for example, an intuitive method
to select the processing mode, and a standard layout for any symbolic displays).
The reliability of the system is critical as incorrect information could reduce K’s
confidence in the system, and could lead to serious injury.
The conceptual framework is a significant contribution of this thesis which
supports and guides the development of an adaptive AHV system, and enables
the dynamic adjustment of display properties in real-time. The benefits of the
conceptual framework include:
• Experimental control of different factors : By manipulating and controlling
different factors from the framework the effectiveness of different AHV sys-
tem displays can be measured (such as altering display temporal resolution
while using a standard mobility assessment technique). To support this
point, the application of the framework to two previous mobility experiments
is discussed in the next section.
• Common language: The framework allows a common language for AHV
users, medical specialists, engineers, scientists, software developers, O&M
specialists and other groups.
• Standardised requirements : Research on the effects of different factors (such
as display types) can lead to a standardised set of requirements for AHV sys-
tem components (for example a common display interface used for menus).
This may lead to interchangeable components and a standardised testing
methodology.
• Training : The effects of different factors impacting on mobility effectiveness
(such as age of onset of blindness) can be examined. Using these results dif-
ferent training strategies and training assessment methods can be developed
and compared.
• Finally, the framework supports the development of adaptive systems which
alter their method of computer vision processing depending on a number
of factors (for example, the current task being performed
by an end-user).
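The adaptive behaviour described in the final benefit can be sketched as a simple lookup from the current task to a list of processing steps. The task names, step names and the fallback to a standard display are all hypothetical; they loosely follow the examples given earlier in this section (straight lines for a street crossing, zooming or object recognition for finding keys, a colour filter and edge display for a blue sign, histogram equalisation in dull lighting).

```python
# Steps used when no task-specific entry exists (assumed default).
STANDARD_DISPLAY = ["reduce_resolution", "smoothing_filter"]

# Hypothetical mapping from task to computer vision methods.
TASK_METHODS = {
    "street_crossing": ["reduce_resolution", "line_detection"],
    "finding_keys": ["zoom", "object_recognition"],
    "finding_blue_sign": ["colour_filter", "edge_detection"],
}


def select_methods(task, dull_lighting=False):
    """Choose processing steps from the task and one scene property."""
    methods = list(TASK_METHODS.get(task, STANDARD_DISPLAY))
    if dull_lighting:
        # A scene property can add a step regardless of the task.
        methods.insert(0, "histogram_equalisation")
    return methods


print(select_methods("finding_keys"))
print(select_methods("walking_along_path", dull_lighting=True))
```

A table-driven design of this kind keeps the factor-to-method relationships explicit, which suits the experimental manipulation of individual factors that the framework is intended to support.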
4.5.3 Application of the conceptual framework for previ-
ous AHV research
The conceptual framework needs to be sufficiently flexible to encompass different
types of mobility research. This section demonstrates how the framework can be
applied to two previous mobility experiments discussed in Chapter 2.
AHV mobility simulation
The first paper considered is the seminal AHV simulation mobility research by
Cha et al. from the University of Utah [29]. The conceptual framework for this
research is shown in Figure 4.12. The AHV simulation from this study did not
use computer vision methods: instead simulated phosphenes were displayed by
attaching different types of mask onto the display screen. The context of the
study was an indoor artificial mobility course, and participants were asked to
move through a maze without hitting obstacles (which is specified in the frame-
work task component). Environment factors available include the wall and door-
way markings. Sensory information and scene properties (such as lighting and
contrast) would have been influences on individual mobility performance. The
external factors include the head mounted camera used to capture images, and
the lenses used to reduce the field of view available to participants. Finally the
mobility assessment dependent variables used in this study were obstacle contacts
and time spent in the maze for each participant.
Low vision mobility assessment
The second paper discussed in this section is a mobility experiment published by
Long et al. [135]. In this paper, 22 participants with low vision had their vision
assessed before being asked to walk through two different paths in three unfamiliar
mobility environments. The conceptual framework for Long et al. is shown in
Figure 4.13. The context for this study is the three mobility environments (for
example, a classroom building). The tasks involved walking along a path and
avoiding obstacles. Again, computer vision methods were not required in this
research. However, participants wore sunglasses with different levels of reduced
illumination, and these have been included as external factors. Impaired vision
has been included as an ‘other display modality’ as this study did not involve
AHV simulation.
[Figure 4.12: Conceptual framework applied to Cha et al.’s simulated AHV mobility experiment. Factors which are not included in this study are marked with a line pattern. Entries specific to this study: context (indoor artificial mobility course); task (walking through maze, obstacle avoidance); computer vision methods (none); standard display (phosphene masks); mobility performance (obstacle contacts, time on course).]
[Figure 4.13: Conceptual framework applied to Long et al.’s low vision mobility experiment. Factors which are not included in this study are marked with a line pattern. Entries specific to this study: context (classroom building, residential area, small business area); task (walking along path, obstacle avoidance); computer vision methods (none); reduced illumination sunglasses; other display modalities (impaired vision); mobility performance (loss of balance, veering, shuffling and others).]
4.6 Chapter Summary
Computer vision methods provide a critical link between the camera and electrode
array of an effective AHV system. This chapter has examined the main
methods from computer vision for the reduction of unimportant information in
the identification of image components. A number of prototype systems to assist
the blind were then reviewed to illustrate how these methods can be applied.
Despite the number of systems reviewed, many struggle to provide useful mo-
bility information in real-time, and only one has received a reasonable level of
acceptance (the auditory vOICe system).
This chapter has also presented a framework for AHV system information
display. This framework, based on the reviews of blindness, blind mobility, AHV
systems, and computer vision includes the main factors which impact on a blind
traveller. The main benefits of using this framework are enhanced communication
between AHV researchers and the ability to explore and compare different factors
experimentally (such as age or gender, different types of computer vision methods,
and different environments).
The framework presented in this chapter has guided the mobility assessment
methodologies contained in the next four chapters of this thesis. In Chapter 5
the ability to identify mobility-related information from degraded static images
is explored. Chapters 6 to 8 investigate the effect of various image processing
methods on the mobility of subjects wearing an AHV simulation.
Chapter 5
AHV Mobility Assessment using
Static Images
5.1 Introduction
This chapter describes a computer-based AHV simulation experiment using static
images. As discussed in Chapter 3, static AHV simulation images have previously
been used by Boyle et al. [18] to examine the effects of various image processing
techniques on object recognition. This experiment aimed to investigate the three
main research questions presented in Chapter 1:
Can specific main factors be identified as highly significant for provid-
ing mobility information in an AHV system?
It is anticipated that some mobility information will be available from low reso-
lution images. In this experiment participants were asked to identify a number
of mobility related components from low resolution static images. These image
components were selected based on the Chapter 2 review and included: people;
tall obstacles (such as a tree or pole); low obstacles (such as a chair) and drop
offs (such as a down-stair). In addition, the ten images chosen for this experi-
ment contained different image contexts (such as indoor office, outdoor path) and
scene properties (such as image clutter). Participants were also asked to imag-
ine they were walking while using the low resolution image as their only visual
information, and to predict where their next step should be placed.
Can objective measures be developed for the comparison of effective-
ness between AHV systems in providing mobility information?
This experiment has been performed using custom computer software and static
images. Reduced resolution static images have been used by other authors in a
range of simulated AHV experiments (for example, [226], [18] and [126]), there-
fore there is reason to believe that static images may also be effective for the
identification of mobility related image components.
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
As discussed in the previous chapter, edge detection can reduce the amount of
data, while preserving the important structural information in an image. Previous
AHV simulation research by Boyle et al. [18] has indicated that edge detection
may not be very useful for low resolution static images (25 x 25 phosphenes or
below). However, as mentioned in Chapter 3, this is contradicted by the Dobelle
Institute [48], who found that Sobel edge detection was useful in their commercial
AHV device for recognising useful scene components, such as doorways. There-
fore, one aim of the experiment presented in this chapter is to investigate whether
edge detection would be beneficial at a higher 50x50 pixel resolution.
Additionally this investigation attempts to determine if different types of edge
detection affect the recognition of low resolution mobility-related images. As
discussed in Chapter 4, two of the most widely implemented methods are the
Canny and Sobel algorithms, and these two methods are applied to each of the
static images and compared. As the Canny method convolves the image with
a Gaussian smoothing operator before calculating the edge locations, it is less
sensitive to noise, and was expected to result in better recognition performance
than the Sobel method.
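The expected benefit of smoothing before computing gradients can be illustrated with a minimal sketch. It applies the standard 3x3 Sobel kernels to a flat image containing one noisy pixel, with and without a small Gaussian-like smoothing step. This is not the Canny algorithm (which also includes non-maximum suppression and hysteresis thresholding) and it is not the software used in this experiment; the 3x3 smoothing kernel and all function names are assumptions made for illustration.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16


def filter3x3(image, kernel, divisor=1):
    """Apply a 3x3 kernel to interior pixels; output loses a 1-pixel border."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(
                kernel[i][j] * image[r - 1 + i][c - 1 + j]
                for i in range(3)
                for j in range(3)
            ) / divisor
            for c in range(1, width - 1)
        ]
        for r in range(1, height - 1)
    ]


def sobel_magnitude(image, smooth_first=False):
    """Gradient magnitude from the Sobel kernels, optionally after smoothing."""
    if smooth_first:
        image = filter3x3(image, GAUSS_3X3, 16)
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [
        [math.hypot(a, b) for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# Flat 7x7 image with a single noisy pixel in the centre.
noisy = [[0] * 7 for _ in range(7)]
noisy[3][3] = 16

raw_peak = max(max(row) for row in sobel_magnitude(noisy))
smoothed_peak = max(max(row) for row in sobel_magnitude(noisy, smooth_first=True))
print(raw_peak, smoothed_peak)  # 32.0 12.0
```

The isolated noisy pixel produces a much weaker spurious edge response after smoothing, which is the property that motivated the expectation stated above.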
These questions are closely connected to a number of the conceptual frame-
work components shown in Figure 5.1. Note that in this framework figure, com-
ponents which are not relevant to this study are shown in grey. The dynamic
factors addressed include image context and scene properties. The computer vi-
sion methods applied involved reduced resolution, grey-scale images, filtering and
edge detection. External factors which could influence the identification of mobil-
ity related components could include human factors (such as age, or experience
with low resolution displays) and camera properties.
5.2 Method
This study involved presenting low resolution static images which had been processed
using four different methods. Research participants were required to perform
two main tasks for each image: (a) to indicate whether they were able to identify a
number of mobility related components and (b) to click on each image to record
where they believed the mobility related component was located.
5.2.1 Images selected
This experiment involved eight mobility-related images (shown in Figure 5.2).
Three of these images were captured by the author using a 160x120 pixel resolution
PDA Pretec CompactFlash card camera (images a, g and h). Five images
were obtained from web searches (images b, c, d, e and f). For consistency, all
images were resized to 256x256 pixel resolution. These images were chosen as
126 Chapter 5. AHV Mobility Assessment using Static Images
[Figure 5.1 diagram omitted. Box labels recoverable from the diagram: Dynamic factors; Computer vision methods for AHV; Context; Scene properties; Task; Sensory information; External factors; AHV technology; Environment; Human factors; Non-image sensors; Other modalities; AHV display type; Alert information; Standard display; Mobility performance.]
Figure 5.1: Conceptual framework diagram showing factors which influence simulated AHV display effectiveness in this chapter. Factors from Chapter 4 which are excluded from this chapter are marked with a line pattern.
[Figure 5.2 images omitted: panels (a) to (h).]
Figure 5.2: Original mobility images used in this chapter. A brief description of each image is given in Table 5.1.
they included both indoor and outdoor scenes, different mobility related objects
of interest (such as cars, people and drop-offs), and different levels of clutter and
brightness.
As mentioned, this experiment assumed that a hypothesised AHV system
was capable of displaying a 50x50 phosphene resolution. To reduce the risk of
participants having a priori knowledge of the images, it was decided to use unique
images for this experiment which were unlikely to have been previously seen.
Each of the eight images was processed using four different methods (creating
a test set of 32 images):
1. Binary only
2. Binary output from Canny edge detection
3. Grey-level (8 bit)
4. Binary output from Sobel edge detection
The stages of processing are summarised in Figure 5.4. Edge detection
sensitivity thresholds which resulted in the most accurate representation of mobility
information were (subjectively) selected for each image (these are listed in Table
5.2). For Canny edge detection, the standard deviation of the Gaussian filter (σ)
was equal to 1 for all images. In addition, the edge images were dilated using a
flat, disk-shaped morphological structuring element, which helped to retain the
edge information after the image was resized. To simulate the pixelization effect
of a 50x50 phosphene display in an AHV system, all images were then
resized from their original 256x256 pixel resolution to 50x50 pixel resolution and
then back to 256x256 pixels. Finally, a 3x3 neighbourhood median filter was
applied to each image to soften the pixelization effects after resizing the images.
All image processing was conducted using the Matlab Image Processing Toolbox.
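The pixelization and smoothing stages described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in this study: it assumes nearest-neighbour resampling (the interpolation method is not stated in the text), it leaves the one-pixel image border unfiltered, and the function names are hypothetical.

```python
def resize_nearest(img, new_rows, new_cols):
    """Resample a 2-D list of pixel values using nearest-neighbour lookup."""
    rows, cols = len(img), len(img[0])
    return [[img[(r * rows) // new_rows][(c * cols) // new_cols]
             for c in range(new_cols)]
            for r in range(new_rows)]


def median3x3(img):
    """3x3 median filter; border pixels are copied unchanged (assumption)."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = sorted(img[r + dr][c + dc]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            out[r][c] = window[4]  # middle of the 9 sorted values
    return out


def simulate_phosphene_grid(img, grid=50):
    """Downsample to grid x grid, upsample to the original size, then smooth."""
    small = resize_nearest(img, grid, grid)
    restored = resize_nearest(small, len(img), len(img[0]))
    return median3x3(restored)
```

After this round trip a 256x256 image contains at most 50 distinct values along any row, which is the blocky appearance the resizing step is intended to produce.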
% Remove the border around displayed figures
iptsetpref('ImshowBorder','tight');
x=imread('Z:\ChildOnStreet.bmp');
figure, imshow(x);
% Sobel edge detection (sensitivity threshold from Table 5.2)
x1=edge(x,'sobel',.09);
figure, imshow(x1);
% These resize commands are designed to
% emulate the pixelization from 50x50 phosphenes
x2=imresize(x1,[50 50]);
x3=imresize(x2,[256 256]);
% Highlight edge information (this step only
% used for Canny and Sobel image types)
se = strel('disk',1);
x3 = imdilate(x3,se);
figure, imshow(x3);
Figure 5.3: Example Matlab code used for generating images. This example creates the output from Sobel edge detection for image A (Child on street).
The Matlab code for generating the Sobel edge detection output for image A is
shown in Figure 5.3.
5.2.2 Assessing mobility information
For each of the 32 processed images, participants were asked a series of five
questions. These questions were selected to investigate the amount of mobility
related information which could be identified by participants and were based on
the National Research Council’s summary of blind pedestrians’ needs (discussed
in Section 2.3). Each participant was required to respond to the following five
questions for each image:
Q1. Can you identify a person in this image?
Q2. Can you identify a tall obstacle (e.g. pole/tree)?
Q3. Can you identify a drop off in this image?
Table 5.1: Mobility related image components identified for each image.
Image Person Tall Obstacle Drop Off Low Obstacle
A. Child on street X X X
B. Path near road X X
C. Person in office X X X
D. Person in bathroom X X
E. Sparse office X
F. Street scene with tree X X
G. Phone booth obstacle X X X
H. Railway platform X X
Table 5.2: Image edge detection and line enhancement thresholds for each image. Note that the Canny sensitivity listed is the high threshold value. The low threshold value was set to 0.4 times the high threshold.
Image                      Sobel Sensitivity   Canny Sensitivity   Dilation Disk size
A. Child on street         .09                 .30                 2
B. Path near road          .18                 .45                 1
C. Person in office        .14                 .45                 1
D. Person in bathroom      .16                 .40                 3
E. Sparse office           .17                 .60                 2
F. Street scene with tree  .10                 .45                 1
G. Phone booth obstacle    .14                 .35                 2
H. Railway platform        .17                 .40                 1
[Figure 5.4 flowchart omitted. Box labels recoverable from the flowchart: Input Image; Convert image to 256 grey-levels; Image Type = 3?; Image Type = 1?; Convert image to 2 grey-levels (binary); Image Type = 2?; Apply Canny edge detection; Apply Sobel edge detection; Resize image resolution to 50x50 pixels; Resize image resolution to 256x256 pixels; Apply 3x3 median filter; Output Image. Image types: 1. Binary (no edge detection); 2. Binary (Canny edge detection); 3. Grey-scale; 4. Binary (Sobel edge detection).]
Figure 5.4: Flowchart showing the image processing steps applied for each of the four image types used in this chapter.
Figure 5.5: Image processing applied to image A (Child on street). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.6: Image processing applied to image B (Path near road). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.7: Image processing applied to image C (Person in office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.8: Image processing applied to image D (Person in bathroom). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.9: Image processing applied to image E (Sparse office). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.10: Image processing applied to image F (Street scene with tree). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.11: Image processing applied to image G (Phone booth obstacle). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Figure 5.12: Image processing applied to image H (Railway platform). The original image (converted to 8 bit grey-scale and 256x256 pixel resolution) is shown with the 5x5 grid mask in (a). The binary image is shown in (b) and the Canny edge detection image in (c). The 50x50 8 bit grey-scale image is shown in (d). Finally, the Sobel edge detection output is shown in (e).
Q4. Can you identify a low obstacle (such as a chair)?
Q5. Please imagine you are moving through the scene and this image is the only
visual information available to you. Where would you aim your next step?
Please click on this button and then select the location on the image.
For each of the first four questions, the subject was required to select a rating
on a 5 point Likert scale as follows:
1. Definitely yes
2. Probably yes
3. Don’t Know
4. Probably no
5. Definitely no
Software implementation details
The software for this research was written using Microsoft Visual Basic 6.0 and
presented on a Windows 2000 laptop. The user interface comprised two screens:
a) An initial screen with a randomly generated participant ID, and b) the main
experiment screen (shown in Figure 5.13). The image presentation sequence
was randomized for each volunteer. For each question if a ranking between 1
(‘definitely yes’) and 4 (‘probably no’) was selected, the subject was prompted
to click on the image location which best matched the object referred to in the
question. If a subject selected a ranking of ‘Definitely no’, they were not required
to click on an image location. The selected coordinates, along with the participant ID
and display image details, were recorded in an ASCII delimited text file for further
analysis.
Figure 5.13: A sample screen from the static image experiment. The x and y values on the right hand side of the screen show which part of the image received a mouse click for each question. If the participant selected a response of ‘5=Definitely No’ for a question, x and y were set to -1 by default.
Table 5.3: This table shows the ranges used to convert the original x and y coordinates (recorded for each question and image combination) into a simplified 5x5 element array. For example the x,y value (227,156) would be re-coded to (5,4). The simplified values were then compared against an array of ‘correct responses’ for each question type.
Original x or y value     Grid x or y value
Between 0 and 51          1
Between 52 and 102        2
Between 103 and 153       3
Between 154 and 204       4
Between 205 and 255       5
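The re-coding in Table 5.3 can be expressed as a short lookup. The following Python sketch is illustrative only (the experiment software was written in Visual Basic, and the function names here are hypothetical); it uses the ranges from the table and treats -1 as the value recorded when no click was required.

```python
# Inclusive upper bound of each grid cell, taken from Table 5.3.
GRID_UPPER_BOUNDS = (51, 102, 153, 204, 255)


def to_grid(value):
    """Map a pixel coordinate (0-255) to a grid index (1-5).

    Returns None for -1, the value recorded when no click was required.
    """
    if value == -1:
        return None
    if not 0 <= value <= 255:
        raise ValueError("coordinate outside the 256x256 image")
    for cell, upper in enumerate(GRID_UPPER_BOUNDS, start=1):
        if value <= upper:
            return cell


def recode_click(x, y):
    """Convert an (x, y) click into a (grid x, grid y) pair."""
    return to_grid(x), to_grid(y)
```

For example, recode_click(227, 156) gives (5, 4), matching the example in the caption of Table 5.3.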
5.2.3 Procedure
Ten postgraduate students or staff at the Queensland University of Technology
volunteered to participate in the study. All participants had normal or corrected-
to-normal vision. Each participant was asked to sit in front of a computer with
the software loaded and the initial screen displayed. A definition of a ‘drop-
off’ was verbally provided to all subjects (as this is not a common expression
outside the O&M literature). Following this, participants were asked to read the
instructions on the screen and click on a start button. The first of the 32
randomly ordered images was then presented on the computer.
Statistical analysis
To simplify the process of evaluating participant responses, recorded image co-
ordinates for each question/image were first reduced in scale from 256x256 to a
5x5 grid. The conversion values used are shown in Table 5.3. An example of the
5x5 grid is shown in Figure 5.5a. The grid origin is in the top left corner.
Prior to conducting the experiment, a matrix of ‘correct’ grid locations for
each image and question type combination was generated (this file consisted of
169 entries). After the experiment was completed, the grid locations selected by
each participant were compared to the list of ‘correct grid locations’ to identify
correct locations.
Table 5.4: Steps in identifying correct/incorrect and identified/not identified grid responses.
Response   Is the question valid for this image?   Response classified as
1,2        Yes                                     True Positive
1,2        No                                      False Positive
4,5        Yes                                     False Negative
4,5        No                                      True Negative
Not all questions were valid for all images. The mobility related components
identified for each image are shown in Table 5.1; for example, there are
no people in images 5 (E), 6 (F) and 8 (H). However, participants may have incorrectly
identified these objects or people (false positives), which may be important for
mobility (for example, a low resolution image of a phone booth may be
misunderstood to be a person). Therefore, question responses were classified into
true/false positive/negative categories according to Table 5.4. Question responses
of ‘3. Don’t Know’ were excluded from this classification.
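The classification rule in Table 5.4 can be written as a small function. This Python sketch is an illustration rather than the analysis code used in the study; the function name and the string labels are hypothetical.

```python
def classify_response(rating, question_valid):
    """Classify a Likert rating (1-5) following Table 5.4.

    rating: 1 = Definitely yes ... 5 = Definitely no.
    question_valid: True if the queried component is present in the image.
    Returns None for 3 ("Don't Know"), which is excluded from classification.
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be between 1 and 5")
    if rating == 3:
        return None
    answered_yes = rating in (1, 2)
    if answered_yes:
        return "True Positive" if question_valid else "False Positive"
    return "False Negative" if question_valid else "True Negative"
```

A rating of 2 (‘Probably yes’) for a person in an image that contains no person would therefore be counted as a false positive.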
Unless stated otherwise, statistical significance was at the p<.05 level. The
Statistical Package for the Social Sciences (SPSS) (2004, SPSS Inc, Chicago,
USA) was used for all statistical calculations. Due to the small sample size of 10
participants and ordinal scale data recorded from this experiment, nonparametric
statistics have been used for analysis.
5.3 Results
The response results for questions 1-4 on all presented images are displayed in
Figure 5.14. The response of ‘Definitely no’ comprised 45% of responses to ques-
tions 1 to 4, and was highest (60%) for the identification of people (question
1).
Figure 5.15 shows responses by different image types. Most of the ‘definitely
yes’ responses (65%) were related to grey-scale images, which also had the smallest
proportion of ‘don’t know’ responses (9%). The results for binary and edge
detected images were similar.
As discussed in Section 5.2.3, the grid locations selected by subjects were
divided into four groups: true positive, true negative, false positive and false
negative. The results for all images are summarised in Table 5.5. Participants
clicked on the correct image locations in 77% of ‘Definitely Yes’ responses and
59.7% of ‘Probably Yes’ responses. Interestingly, in 28.5% of responses where
the participant selected ‘Probably No’, they actually selected the correct image
location for that image. Results for each image are presented in Figures 5.16-5.23.
The results for questions 1 to 4 were significantly different for the eight base
images (χ2=281.83, n=1077, p<0.01). Images D (person in bathroom) and
E (sparse office) received the highest frequency of true identifications (91% and
87.4% respectively). Images C (person in office), G (phone booth obstacle) and
B (path near road) (39.1%, 40.2% and 42.2% respectively) received the lowest
frequency of correct identifications.
Grey-scale images received the highest proportion of true responses (71%). There
was no significant difference between results for the two types of edge detection
over all images (χ2=.055, df=1, n=523, p=0.815). The results for the two types of
edge detection were significantly worse than both binary and grey-scale methods
of image processing over all images (χ2=17.08, df=2, n=1077, p<0.01). There
was no significant difference between true responses for binary and edge detected
images (χ2=0.63, df=1, n=785, p=0.43).
Table 5.5: Summary of response classification for each image type. Note that 203 responses of ‘Don’t know’ have been excluded from classification.
              True Positive   True Negative   False Positive   False Negative   Total
Binary only   40              106             12               104              262
Canny         41              110             10               94               255
Grey-scale    108             100             32               52               292
Sobel         43              113             10               102              268
Total         232             429             64               352              1077
[Figure 5.14 chart omitted: user response (Definitely yes, Probably yes, Don't Know, Probably no, Definitely no) by question (Person, Tall obstacle, Drop off, Low obstacle).]
Figure 5.14: Summary of question responses for each of the 32 images presented.
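As a worked check on the edge detection comparison, the following Python sketch computes a Pearson chi-square statistic for a 2x2 table of true and false responses. The statistics in this chapter were calculated with SPSS; the sketch assumes (this is not stated in the text) that no continuity correction was applied and that ‘true’ responses are the sum of the true positive and true negative counts in Table 5.5. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts from Table 5.5: true = TP + TN, false = FP + FN.
canny_true, canny_false = 41 + 110, 10 + 94
sobel_true, sobel_false = 43 + 113, 10 + 102
statistic, p_value = chi_square_2x2(canny_true, canny_false,
                                    sobel_true, sobel_false)
```

Under these assumptions the sketch gives a statistic of about 0.055 and a p value of about 0.815 for the Canny versus Sobel comparison, in line with the values reported above.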
Question 5 asked subjects to select where they would place their next step.
Results for this question (and the ‘correct’ grid locations) were similar for each
image. As shown in Figure 5.24, the grey-scale images scored the highest percentage
of correct responses to question 5. There was no significant difference in results between
Binary, Sobel or Canny images (χ2=1.39, df=2, n=240, p=0.50). The results for
Sobel and Canny edge detection methods for this question were identical (84%
correct).
[Figure 5.15 chart omitted: user response (Definitely yes, Probably yes, Don't Know, Probably no, Definitely no) by image type (Binary only, Canny, Grey-scale, Sobel).]
Figure 5.15: Summary of responses for each image processing method used in this experiment.
Figure 5.16: Results for questions 1-4 for each processing method on image 1 (Child on street).
Figure 5.17: Results for questions 1-4 for each processing method on image 2 (Path near road).
Figure 5.18: Results for questions 1-4 for each processing method on image 3 (Person in office).
Figure 5.19: Results for questions 1-4 for each processing method on image 4 (Person in bathroom).
Figure 5.20: Results for questions 1-4 for each processing method on image 5 (Sparse office).
Figure 5.21: Results for questions 1-4 for each processing method on image 6 (Street scene with tree).
Figure 5.22: Results for questions 1-4 for each processing method on image 7 (Phone booth obstacle).
Figure 5.23: Results for questions 1-4 for each processing method on image 8 (Railway platform).
Figure 5.24: Question 5 (‘next step’) result summary for each type of image.
5.4 Discussion
The purpose of this study was to investigate the benefits of some simple forms of
image processing on mobility related static images at 50x50 pixel resolution.
Can specific main factors be identified as highly significant for providing
mobility information in an AHV system?
There were significant differences between results for the different base images
(shown in Figure 5.2). The correct recognition of mobility related components
was best for images 1 (Child on street) and 4 (Person in bathroom). These images
are less cluttered than other images. Images 2 (Path near road), 5 (Sparse office)
and 8 (Railway platform) had significantly less correct recognition. The original
resizing of images to 160x120 pixels may also have contributed to these results.
The results for the ‘next step’ question (Figure 5.24) were high for all image types.
This indicates that a greater range of mobility related images (such as doorways
or stairs) may be required.
Can objective measures be developed for the comparison of effectiveness
between AHV systems in providing mobility information?
This experiment has demonstrated that a static image based AHV simulation can
provide useful information regarding mobility. There are a number of advantages
associated with static image experiments over mobility course studies. These
include: portability (for example using a laptop), control of extraneous variables
(such as lighting conditions), increased number of participants (it is easier to
recruit people for a computer based study), and ease of data recording (participant
responses can be automatically recorded during experiments). However, the use of
static images simplifies the mobility task. Additional important components of
mobility include the effects of auditory, tactile, kinesthetic (the feeling of motion)
and proprioceptive (a person’s position, location and movement
in space) information.
By using image sequences (video) rather than static images, it should be possible
to use a lower pixel resolution when a subject is able to use ego and object
motion to assist with object identification: Cha et al. [30] found that head movements
were important in improving mobility performance at 25x25 resolution.
Applying an ecological approach to AHV system development could emphasise
this movement in a complex and changing environment. The movement of a
head-mounted camera worn by a visual prosthesis patient would produce a transformation
in captured images (optic flow) which can be useful for segregating an image
into component parts [32], and the rate of expansion can be used to calculate the
amount of time before a collision will occur. By processing and presenting image
sequences it should also be possible to use fewer pixels for similar object recognition
performance. Therefore the next three chapters will focus on processing
image sequences for AHV simulation.
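The relationship between image expansion and the time before a collision can be sketched as follows. This is a minimal illustration and not a method from this thesis: it assumes a pinhole camera, an object approaching along the optical axis at constant speed, and that the image size of the object can be measured in two consecutive frames. The function name and parameters are hypothetical.

```python
def time_to_contact(size_prev, size_curr, frame_interval):
    """Estimate seconds until contact from the expansion of an object's image.

    size_prev, size_curr: image size of the object (e.g. width in pixels) in
    the previous and current frames; frame_interval: seconds between frames.
    Returns None if the image is not expanding (no approach detected).
    """
    expansion = size_curr - size_prev
    if expansion <= 0:
        return None
    # With image size proportional to 1/distance and constant approach speed,
    # this ratio equals (current distance) / (approach speed).
    return frame_interval * size_prev / expansion
```

Note that the estimate needs neither the true size of the object nor its distance, only the change in its image size between frames.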
One issue with the research design for this experiment was that participants
may have been primed by previous exposure to an image (for example, if they saw
a grey-scale image of the office scene and then viewed the binary version of the same
image). This is a consequence of randomising the display order of the 32 images.
In future work it would be beneficial to add a constraint to the ordering of images,
so that different processed versions of the same image cannot be repeated during
the experiment.
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
In this experiment four different methods were used to process images. The 256
grey-level image type resulted in significantly better recognition of mobility
components than the other image types. No significant difference was found between
the Canny or Sobel edge detection methods and the binary threshold image
type. Therefore the results presented in this chapter do not support the use of
edge detection at this resolution of static images. Based on these results, if edge
detection were required in an AHV system, the Sobel method appears better suited
due to its lower computational cost than the Canny method.
5.5 Chapter Summary
This chapter has described a computer based experiment using low resolution
static images. A number of the factors from the AHV display framework presented
in Chapter 4 have been involved in this experiment. Neither the Canny nor
Sobel methods of edge detection were found to be more useful than a standard
thresholded binary image representation for recognising mobility related scene
components. A number of benefits associated with using a static image approach
have been discussed. However mobility involves movement of a person’s body in
a dynamic environment, and these factors are not considered in a static image
AHV simulation. Therefore, the remainder of this thesis will focus on the use of
image sequences and wearable AHV simulation.
Chapter 6
AHV Simulation and Obstacle
Detection using a Personal
Digital Assistant
6.1 Introduction
The previous chapter on static image research identified a number of limitations
concerning AHV mobility assessment. In that chapter it was argued that mobility
assessment should be more valid and generalisable if the effects of auditory, tactile,
kinesthetic and proprioceptive information on mobility are also considered. Therefore this
chapter and the remaining experimental chapters of this thesis are concerned
with portable, head-mounted simulations.
This chapter addresses the following research question from Chapter 1:
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
As discussed in Chapter 2, the aesthetic appearance of Electronic Travel Aids
(ETAs) for the blind is very important. It would be preferable if AHV system users
were able to use a small, hidden computer for camera image processing. Current
generation Personal Digital Assistant (PDA) computers meet this requirement,
and can be easily concealed in a pocket or on a belt. Therefore this chapter
describes the development of a PDA-based AHV simulation. The PDA display
itself is used to present the phosphene simulation, which enables a normally sighted
participant to be assessed on various mobility tasks under different contexts, alert
scenarios and image processing conditions.
The PDA chosen for this experiment was a Hewlett Packard iPaq 2210 using
an Intel PXA255 processor. This PDA has a number of constraints (which
are common to all low cost PDAs): no floating point processor, a relatively slow
bus and processor, and limited Random Access Memory (RAM). These
constraints are likely to be typical of actual AHV devices (for example, [215]).
Working within these constraints, this chapter describes the development and
evaluation of a novel method of providing a real-time looming obstacle alert. This
alert display is hypothesised to result in reduced obstacle contacts by participants
during mobility assessment.
6.2 Method
6.2.1 Hardware
There are three main components to the PDA AHV simulator described in this
chapter: the PDA itself; the camera; and the headgear used to attach the PDA
centrally in front of a participant’s eyes. Each of these components is discussed
below.
Hewlett Packard iPaq 2210 Pocket PC
The main benefits of using a PDA are its small size, light weight and the reduced
number of connecting cables. Current generation PDAs are, however,
constrained by a relatively slow CPU and lack a floating-point unit for real number
computation.
The main operating systems for PDAs are currently Palm OS and Microsoft
Windows Mobile Pocket PC 2003. The Microsoft operating system was chosen
due to the availability of free development software. The HP iPaq was selected
because (at the time of purchase) it contained the fastest processor (a 400 MHz
Intel XScale PXA255) and fastest internal bus speed (200 MHz).
Lifeview Flycam CompactFlash Camera Card
For image capture, a Lifeview Flycam CompactFlash (CF) Camera Card is used,
capable of capturing static images at a resolution up to 640x480 pixels, although
only 160x120 is available for video capture. This camera uses a 1/4 inch CMOS
sensor, has a viewing angle of 52°, and has automatic gain and exposure. A ring
outside the lens provides a manual focus for the camera.
Ideally a head mounted camera could be connected (preferably via a wireless
link) to a concealed PDA. However, this was not possible at the time of development,
so the camera had to be inserted into the CF card slot at
the top of the PDA. This meant that the PDA itself needed to be head-mounted,
which was feasible as the combined weight of the camera and PDA was 164 grams.
The advantage of placing the PDA in front of a person’s eyes was that the
PDA display itself could function as the simulation display.
Figure 6.1: Front and side views of the AHV simulator used in the present study.
Headgear
A standard head brace device was adapted to include a bracket for holding the
PDA in front of a person’s eyes (see Figure 6.1). The viewing distance from the
eyes was approximately 65 cm. The PDA screen display was 8.89 cm diagonal
with a resolution of 240 x 320 pixels.
6.2.2 Obstacle avoidance
The inclusion of an automatic looming obstacle alert would be a beneficial func-
tion in an AHV system. There are two main methods to achieve this goal on a
portable AHV system:
1. Process distance information provided by ultrasound or laser sensors. Apart
from the disadvantages of these technologies discussed in the section on
Electronic Travel Aids in Chapter 2, processing the received distance information
would place an additional burden on the AHV system.
2. Use the images captured from the AHV system camera. In this thesis a
single camera is used, as this is simpler to implement, is easier to conceal,
requires less processing resources, and is biologically feasible (humans with
monocular vision are still able to judge distances and impending collisions).
A different, and more computationally intensive approach would be to use
multiple cameras (which, when calibrated, would provide stereo information
in the same way as binocular human vision).
The camera used in this experiment was located on a person’s head. In
contrast to a camera located on a different part of the body (such as the feet), a
single, head-mounted, camera would identify objects which are expanding toward
the head. As discussed in Chapter 2, chest or head high looming obstacles are
often difficult to detect (for example, in Figure 4.1 the main body of the phone
booth is at chest height, while the base is quite narrow). In addition, a collision to
the head, particularly while wearing an implant system, is probably more serious
than a collision to other parts of the body.
The traditional approach to image based obstacle avoidance, using a single
camera, is to estimate the optical flow within the image sequence, compensate for
camera motion (ego motion), and suggest turning toward the direction where the
optical flow is smaller [141]. The PDA based looming obstacle alert developed and
presented in this chapter is broadly based on optic flow and motion estimation,
both of which will be briefly described below.
Motion Estimation
Optic flow is discussed in the blind mobility, visual perception and computer vi-
sion literature, and was first proposed by the experimental psychologist James
J. Gibson [76]. When a person (or camera) is moving, optic flow provides in-
formation on the spatial structure of the outside environment. If an object is
located directly in front of the observer while they move toward it, the central
part of the object will have no optic flow. However, the object edges will move
out as the object expands. The most popular methods for computing optical
flow include the differential, region based matching, frequency and phase based
approaches [10]. However, using these methods to extract optical flow on a PDA
in near real-time (for example, 5 fps) is computationally challenging (at the time
of writing). Therefore in this chapter, a block based approach is used to estimate
optic flow.
A block-based approach to calculating displacement vectors is implemented in
the widely used MPEG 1 and 2 video compression standards, where a single mo-
tion vector is estimated for each 16x16 block. As the differences between images
in an image sequence are often small, during video encoding the movement of
blocks within an image can be calculated and stored, instead of the actual block.
As the block motion vectors require much less space than storing the entire block,
a considerable amount of inter-frame compression can be achieved.
Block matching searches for the location of the best-matching block in the
next frame(s) based on a distance criterion. Generally the Mean Square Error
(MSE) or Mean Absolute Error (MAE) is used as the matching criterion. A number
of fast search methods have been developed to assist with motion estimation [75].
Due to the lack of an FPU, a PDA implementation needs to use integer arithmetic
wherever possible. To evaluate motion estimation information on a PDA, a fast
integer-based search algorithm from Srinivasan and Rao [212] was implemented. The
present implementation involved dividing a captured image (120x160 pixels in
size) into 7x10 blocks, each of which is 16x16 pixels. These blocks are shown in
the sample image in Figure 6.2. The block-based approach involves searching
for the location of each block in the previous image. A 5x5 pixel search area
from the centre of each block was used. The Sum of Absolute Differences (SAD)
between pixel brightness in the current and previous image blocks was computed.
The block with the lowest SAD value is assumed to be the same block, and the
motion vector is stored. The motion vectors are estimated at around 6 frames
per second on the PDA.
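The block search described above can be illustrated with a short sketch. This is not the fast search of Srinivasan and Rao [212]; it is a plain exhaustive search, written in Python for readability, and it assumes that the 5x5 pixel search area corresponds to displacements of up to two pixels in each direction. The function names are hypothetical. Only integer arithmetic is used, in keeping with the constraint noted above.

```python
def sad(cur, prev, bx, by, dx, dy, size):
    """Sum of Absolute Differences between a block in `cur` and the block
    displaced by (dx, dy) in `prev`. Images are lists of rows of integers."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
    return total


def block_motion_vectors(cur, prev, size=16, radius=2):
    """Return {(block_col, block_row): (dx, dy)}, where (dx, dy) is the offset
    at which each block of `cur` is best matched in `prev`.

    Exhaustive integer search (an assumption; the thesis uses a fast search).
    """
    height, width = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, height - size + 1, size):
        for bx in range(0, width - size + 1, size):
            best = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip candidates that fall outside the previous image.
                    if not (0 <= by + dy and by + dy + size <= height
                            and 0 <= bx + dx and bx + dx + size <= width):
                        continue
                    score = sad(cur, prev, bx, by, dx, dy, size)
                    # Lowest SAD wins; ties go to the smaller displacement.
                    key = (score, abs(dx) + abs(dy))
                    if best is None or key < best[0]:
                        best = (key, (dx, dy))
            vectors[(bx // size, by // size)] = best[1]
    return vectors
```

For a 120x160 pixel image and 16x16 pixel blocks this sketch produces 7x10 = 70 vectors; the eight pixel columns left over at the right-hand edge are not covered by any block.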
An example of the motion vectors extracted using the iPaq 2210 PDA is
shown in Figure 6.3. These images were captured as the camera was moved toward
Figure 6.2: Grid showing the 7x10 grid of pixel blocks used from each 120x160 pixel image for the PDA motion estimation described below.
an obstacle (a knee high concrete bench). The image on the left was captured
immediately before the image on the right, and the motion vectors have been
calculated between these images. Because the camera was carried by a person
who was walking, there is general camera movement (ego motion) to the right.
However, the bench has increased slightly in size in the second image, and this is
indicated by the direction of the vectors around the bench. One possible use of
the information from these motion vectors is to identify (or segment) objects from
within the image sequence, and to work out if they represent hazards (obstacles
or drop offs) for a blind traveller.
6.2.3 Block Based Obstacle Alert
The main functions of the looming obstacle alert are to:
1. Segment an image sequence in real time to identify visible objects.
Figure 6.3: Motion vectors extracted from the PDA. The origin of each motion vector is the centre of each of the grid blocks in Figure 6.2. In certain directions the arrow heads look like white blobs.
2. Detect when a segmented object is growing larger (that is, approaching the
camera) at a sufficient rate to suggest that a collision is imminent.
6.2.4 AHV Simulation Implementation
The main purpose of the AHV simulation software was to convert input from the
camera into an on-screen simulated phosphene display. In addition, background
processing needed to determine if an alert warning should be displayed. The
alert processing was based on a block-matching approach. To be representative
of current AHV prototypes and maintain the same aspect ratio of the display
device, the PDA based AHV simulator reduced the resolution of captured images
from 160x120 RGB to 32x24 grey-level ‘phosphenes’, displayed as squares.
The Flycam-CF Software Development Kit was used for accessing images from
the camera. The simulator software was developed in Microsoft embedded Visual
C++ version 4.0 on a Windows XP PC. After compilation, files were transferred
to the PDA using a USB connection and Microsoft ActiveSync. A Windows XP
test application was also developed using Microsoft Visual C++ version 6.0 to
test methods on image sequences previously captured from the PocketPC and
camera.
The approach used to provide the obstacle alert was to:
1. Segment each image based on grey-level values.
2. Check the size and rate of expansion of each segment between contiguous
images.
3. If a segment has expanded quickly and comprises at least 20 5x5 blocks
then display the alert warning.
The main steps used in the PDA simulation are shown in Figure 6.4. A set
of arrays for both the current and previous image is maintained, including the
block grey-level value, warning segments, and segment size. An array of allocated
segments is also maintained across images. To improve computation time, each
5x5 pixel area from the original 160x120 pixel image was used to generate one
block in the 32x24 block array. Implementation details will now be provided.
Steps 1-4
Initially each 160x120 pixel RGB bitmap supplied by the camera was captured as
step one. In step two, the 256 level image was converted to an 8 grey-level
array. This reduction of grey-level information assists with the execution speed
of image segmentation and filtering.
A constant brightness level is required for motion estimation algorithms, oth-
erwise changes in brightness may be incorrectly identified as an object. Therefore,
if the difference between the sum of grey-level values in the current image and the
1. Get current 160x120 pixel image
2. Convert image colour from 24 bit RGB colour to 8 bit (256) grey-level image
Decision: Is current image greyscale sum > threshold? (Yes: step 3. No: step 4.)
3. Reset the previous image arrays and image segments
4. Apply 3x3 median filter
5. Reduce spatial resolution of image to 32x24 blocks (using median pixel values)
6. Segment image based on neighbourhood greyscale values
7. Search for the current block segment in the previous image
8. Smooth the updated segment values in current image
9. Calculate rate of expansion for each segment from previous image and set alerts if required
10. Display simulator output (32x24 blocks)
11. Copy current image to previous image
Figure 6.4: Processing steps for the PDA block-based AHV simulation
sum of grey levels in the previous image was greater than a threshold, the current
scene was assumed to have changed and the previous and segment arrays were
reset (step three). The threshold used was 245760, chosen as a 10% change in
the total grey-level of the image: (160x120x128)/10.
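The scene-change test can be sketched as follows. The sketch assumes that the comparison uses the absolute difference between the two grey-level sums, and the function and constant names are hypothetical. It also confirms the arithmetic behind the threshold value.

```python
WIDTH, HEIGHT = 160, 120
MID_GREY = 128  # mid-point of the 0-255 grey-level range

# 10% of the total grey level of a uniformly mid-grey image (integer arithmetic).
SCENE_CHANGE_THRESHOLD = (WIDTH * HEIGHT * MID_GREY) // 10  # 245760


def scene_changed(current, previous, threshold=SCENE_CHANGE_THRESHOLD):
    """Return True if the summed grey levels of two images differ by more
    than the threshold (absolute difference is an assumption)."""
    current_sum = sum(sum(row) for row in current)
    previous_sum = sum(sum(row) for row in previous)
    return abs(current_sum - previous_sum) > threshold
```

With this threshold, a uniform change of 13 grey levels or more across the whole 160x120 image (13 x 19200 = 249600) would trigger a reset, while a change of 12 levels (230400) would not.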
Following these steps, a 3x3 median filter was applied in step four to reduce
image noise.
These processing steps are illustrated in Figure 6.5a-c.
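A minimal sketch of the grey-level reduction and median filtering is given below. The thesis does not state how the 256 grey levels were mapped to 8, so equal-width bins are assumed here; leaving the border pixels unfiltered is also an assumption, and the function names are hypothetical.

```python
def reduce_grey_levels(image, levels=8):
    """Map 0-255 grey values onto `levels` equal-width bins (0 .. levels-1)."""
    bin_width = 256 // levels
    return [[pixel // bin_width for pixel in row] for row in image]


def median_filter_3x3(image):
    """Apply a 3x3 median filter; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(image[y + j][x + i]
                            for j in (-1, 0, 1) for i in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the nine sorted values
    return out
```

Because only eight values remain after the reduction, isolated noisy pixels are replaced by the dominant grey level of their neighbourhood, which tends to give larger uniform regions for the segmentation that follows.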
Step 5
In this step (shown in Figure 6.5d) the 32x24 block array was generated from
each image. The value of each block was calculated from the median value of
the 25 contributing pixels in the original 160x120 image. Image segments that
were expanding at a certain rate and were larger than a certain size were used to
determine the presence of a looming obstacle. The loss of spatial resolution (from
160x120 pixels to 32x24 blocks) should be partly compensated by improved search time
in the following segmentation steps.
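The reduction in this step can be sketched as follows, assuming the image is held as a list of rows and that each block is formed from a non-overlapping 5x5 pixel area. The function name is hypothetical.

```python
def reduce_to_blocks(image, block=5):
    """Collapse each non-overlapping block x block pixel area to its median."""
    height, width = len(image), len(image[0])
    blocks = []
    for by in range(0, height - block + 1, block):
        row = []
        for bx in range(0, width - block + 1, block):
            values = sorted(image[by + j][bx + i]
                            for j in range(block) for i in range(block))
            row.append(values[len(values) // 2])  # 13th of the 25 sorted values
        blocks.append(row)
    return blocks
```

Applied to a 160x120 pixel image this gives 24 rows of 32 blocks, so the later segmentation steps visit 768 blocks rather than 19200 pixels.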
Step 6
Steps 6 through 10 use the 32x24 block array. In step 6 the eight neighboring
blocks of each block were scanned in a clockwise manner for a matching grey-
level value. If any of the grey-level values matched, and the matching block had
been allocated to a segment, the current block segment was set to the matching
block segment. If there were no matching grey-level or segment available, a new
segment was allocated.
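The neighbourhood scan in this step can be sketched as below. The raster scan order and the choice of the top-left neighbour as the starting point of the clockwise scan are assumptions, as is the function name.

```python
# Clockwise (dx, dy) offsets, starting from the top-left neighbour (assumed).
CLOCKWISE = [(-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]


def segment_blocks(blocks):
    """Label each block with a segment id using its eight neighbours."""
    height, width = len(blocks), len(blocks[0])
    segments = [[None] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            for dx, dy in CLOCKWISE:
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and blocks[ny][nx] == blocks[y][x]
                        and segments[ny][nx] is not None):
                    # Join the segment of the first matching, allocated neighbour.
                    segments[y][x] = segments[ny][nx]
                    break
            else:
                # No matching grey-level or allocated segment: start a new one.
                segments[y][x] = next_id
                next_id += 1
    return segments
```

A single pass of this kind can give two labels to one connected region (for example, a U-shaped area whose arms are reached before its base), which is one reason a later smoothing pass over the segment labels is useful.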
(a) (b)
(c) (d)
Figure 6.5: The first five steps of the block based approach are illustrated in these images of a suburban footpath. The number of grey-levels in the base image (a) was first reduced to 8 grey-levels (b), before a median filter was applied (c). Finally the image was spatially reduced from 160x120 pixels to 32x24 blocks (d).
Step 7
It is common for a camera to move between captured frames due to ego mo-
tion (movement of the camera due to small head movements or walking gait).
Therefore to compensate for this in step 7 the algorithm searched for the position
of each current block array element in the previous image. As in the previous
step, a matching grey-level value signifies a match. Ego motion was considered
by searching over a 5x5 block area in the previous block array in the following
Figure 6.6: The maximum search area used in Step 7. Each block from the current image block array (shown on the left) is compared against the previous image block array (shown on the right). Initially only the matching block position is compared. The search then checks the 8 blocks surrounding this position. Finally if a matching block has not been identified the surrounding 16 blocks are searched (giving a total search area of 25 blocks).
manner:
• The current block value was first compared against the previous block array
value.
• If there was no match, a search was conducted over the neighboring 8 blocks
in the previous block array.
• If a match was still not made, a search was conducted over the 16 blocks
neighboring the 8 blocks (and therefore a total of 25 blocks are compared).
This search area is shown in Figure 6.6.
• If there was no match from any of the 25 blocks, a new segment was allocated
to the current block.
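The expanding search listed above can be sketched as follows. The order in which blocks within each ring are visited is not given in the text, so a simple row-by-row order is assumed; the function names are hypothetical.

```python
def ring_offsets(radius):
    """Offsets (dx, dy) exactly `radius` blocks away (Chebyshev distance)."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if max(abs(dx), abs(dy)) == radius]


def find_previous_segment(x, y, value, prev_blocks, prev_segments, max_radius=2):
    """Return the segment id of the nearest block in the previous block array
    with a matching grey-level, or None if the 5x5 search area has no match."""
    height, width = len(prev_blocks), len(prev_blocks[0])
    for radius in range(max_radius + 1):  # 1 block, then 8, then 16
        for dx, dy in ring_offsets(radius):
            nx, ny = x + dx, y + dy
            if (0 <= nx < width and 0 <= ny < height
                    and prev_blocks[ny][nx] == value):
                return prev_segments[ny][nx]
    return None  # the caller allocates a new segment
```

Searching the rings in order of increasing distance means that a stationary block is matched after a single comparison, and the full 25 comparisons are only needed when no nearby match exists.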
Step 8
The final part of the segmentation stage was designed to integrate the segmenta-
tion results from steps 6 and 7. For each block, a search was performed on the
immediate 8-block neighbourhood and, if there was a matching grey-level value,
the current segment was updated to the matching block’s segment.
Step 9
In general, objects which are closer to the camera will expand at a faster rate
than objects which are further away. To check the rate of expansion, in this step a
comparison was made between the area (number of blocks) of each segment in the
current image block array compared to the previous image block array. Segments
that were smaller than a preset threshold (currently 20 blocks in area) were
ignored. If the rate of expansion (defined as the current image allocated segment
size/Previous image allocated segment size) was greater than 1.15 (determined
heuristically from test image sequences), an alert was set for that segment.
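The expansion test can be sketched as below. The sketch assumes that the 20 block minimum applies to the segment size in the current image and that segments with no counterpart in the previous image are skipped. The ratio test is written in integer form (size x 100 > previous size x 115), which is equivalent to a ratio greater than 1.15 but avoids floating point; this form and the names used are illustrative choices rather than the implemented code.

```python
from collections import Counter

MIN_SEGMENT_BLOCKS = 20   # smaller segments are ignored
EXPANSION_PERCENT = 115   # alert if current/previous size exceeds 1.15


def segment_sizes(segments):
    """Count the number of blocks allocated to each segment id."""
    return Counter(seg for row in segments for seg in row)


def looming_segments(current_segments, previous_segments):
    """Return the ids of segments that have expanded enough to raise an alert."""
    current = segment_sizes(current_segments)
    previous = segment_sizes(previous_segments)
    alerts = set()
    for seg, size in current.items():
        prev_size = previous.get(seg, 0)
        if size < MIN_SEGMENT_BLOCKS or prev_size == 0:
            continue
        # Integer form of size / prev_size > 1.15.
        if size * 100 > prev_size * EXPANSION_PERCENT:
            alerts.add(seg)
    return alerts
```

For example, a segment growing from 20 to 24 blocks between images (a ratio of 1.20) would raise an alert, whereas growth from 20 to 23 blocks (exactly 1.15) would not.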
Steps 10-11
Finally, the phosphenes were displayed on the PDA display. Each phosphene was
displayed as a grey-level square. In this chapter, a 32x24 phosphene array was
displayed, therefore there was a simple one-to-one mapping between the block
array and the phosphene array. As the Pocket PC operating system does not
support the Microsoft DirectX set of APIs for high performance graphic display,
the older Game Application Programming Interface (GAPI) was used to directly
access the display memory. The block array was expanded to fill the 240x320
pixel PDA display. To improve efficiency, blocks were only displayed if they were
different from the previous display.
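The selective redraw can be sketched as follows. The GAPI calls themselves are not shown; the sketch only identifies which blocks need redrawing and the pixel rectangle each one covers, assuming a 10x10 pixel square per block so that 32x24 blocks fill 320x240 pixels. The function names are hypothetical.

```python
def changed_blocks(current, previous):
    """Yield (x, y, value) for each block that differs from the last display.
    If there is no previous display, every block is yielded."""
    for y, row in enumerate(current):
        for x, value in enumerate(row):
            if previous is None or previous[y][x] != value:
                yield x, y, value


def block_rectangle(x, y, scale=10):
    """Pixel rectangle (left, top, right, bottom) covered by block (x, y)."""
    return (x * scale, y * scale, (x + 1) * scale, (y + 1) * scale)
```

When the scene is largely static, only a small fraction of the 768 blocks changes between frames, so most writes to display memory can be skipped.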
If an alert had been triggered in step 9 above, the phosphenes in the area of
that segment were identified with an ‘alert colour’. The colour red was selected
to stand out from the grey-levels which were otherwise used for phosphenes. For
clarity in a printed thesis, the example figures in this chapter have had the red
alert areas filled with an ‘x’ shaped pattern (for example, Figure 6.7d below).
(a) (b)
(c) (d)
Figure 6.7: An example block based alert, shown in (d), which has been triggered in response to looming branches in front of a head-mounted camera.
Figure 6.7 demonstrates the algorithm steps on a single image taken from a
suburban Brisbane footpath. In this image sequence the experimenter, wearing
the head-mounted camera, veered into bushes next to a path. Figure 6.7a is the
original 160x120 pixel grey scale image. Figure 6.7b is the same image after me-
dian filtering and conversion to 8 grey level values. Figure 6.7c is the 32x24 block
representation of Figure 6.7a. Figure 6.7d shows the location of alert segments,
which match the location of looming branches, that have been set for this image.
Table 6.1: The number of frames in each image sequence, along with the duration of each captured sequence.
                     Postal Box            Bus Stop
Time of capture      Frames    Seconds     Frames    Seconds
Mid morning          234       29          204       25
Early afternoon      244       30          200       25
Mid afternoon        268       33          171       21
Late afternoon       284       35          179       22
Mean                 257.5     31.75       188.5     23.25
6.2.5 Procedure
To evaluate the performance of the obstacle alert component of the AHV simula-
tion, image sequences were captured at different times of the day using the AHV
simulation hardware. Two locations were used to capture the image sequences:
1. Postal Box sequence. The first sequence involved walking slowly around a
bend and toward a postal box (approximately 15 metres in total).
2. Bus Shelter sequence. In the second sequence, the experimenter walked
toward a bus shelter obstacle along a path with overhanging trees (a distance
of approximately 10 metres).
The number of frames and duration of each sequence are shown in Table 6.1.
An obstacle alert should be triggered as a person moves towards a large loom-
ing obstacle. Therefore, both sets of image sequences ended with the camera
approximately a centimetre from the final obstacle (either the postal box or the
bus shelter).
6.2.6 Statistical methods
These image sequences were then analysed on the PC based version of the alert
software. The alerts identified by the block based obstacle alert were subjectively
Figure 6.8: Frames 10, 70, 130 and 190 from the postal box mid morning sequence (on left; panels a, c, e and g) and the bus stop early afternoon sequence (on right; panels b, d, f and h).
rated as either correct (fences, overhanging trees, etc) or incorrect (incorrectly
segmented objects, shadows, etc).
There are two issues which should be noted regarding the analysis of results
from this experiment:
1. The first issue is that the number of correct alerts can be divided by the
total number of alerts presented (referred to as recall in this chapter). This
ratio indicates how often false alerts, which may distract an AHV system user,
are presented, and gives an indication of the usefulness of the obstacle alert
feature. However, as the total number of possible obstacles is unknown, the
precision of the results in this chapter ([the number of correct alerts]/[the
total number of possible alerts]) cannot be presented. A similar problem exists
for Web searches, where the total number of correct documents which can be
retrieved is often unknown.
2. The second issue is that image sequences captured at different times of the
day have different numbers of image frames and obstacles. This has oc-
curred due to slight differences between head movements, walking speed,
and daylight levels (causing varying contrast between obstacles and back-
ground).
6.3 Experimental results
For all image sequences the post box and bus shelter were correctly identified as
looming obstacles at least once.
The results for the postal box sequence (Table 6.2) were influenced by a white fence on
one side of the path. During the sequence captured at early afternoon, this fence
was captured less frequently which led to a reduction in valid alerts. This suggests
that following known structures, such as walls or fences, may be a useful method of
using an AHV system (a similar method, called shorelining, is frequently used by
blind people while walking next to walls or paths). Aside from the early afternoon
sequence, the ratio of correct/total number of alerts (Figure 6.9) decreased as the
experimenter moved away from the fence and increased again toward the postal
box. An example of correct obstacle identification for the midmorning postal box
sequence is shown in Figure 6.12.
Table 6.3 shows the results for the Bus shelter sequence. More alerts were
triggered for this sequence, as there was more visual clutter in front of the camera
as the images were recorded (from overhanging bushes along the section of path
leading to the bus shelter). In contrast to the early afternoon results for the postal
box sequence, only 37% of alerts were correct at this time for this sequence. This
can be partly explained by increased cloud cover while capturing this sequence
(note that the mean grey level is also reduced). Excluding the early afternoon
result, the other captured sequences for this route followed a similar pattern of
steadily decreasing during the day as natural illumination decreased. Figure 6.10
shows the mean recall for the bus shelter sequence over time.
Figures 6.9 and 6.10 show that in 7 out of 8 of the image sequences the recall
increased during the final 10% of the image sequence. This is desirable as each
sequence ended with the camera around two centimetres from the obstacle.
False alerts were usually shadows on the path, or the area surrounding an
obstacle. An example of a false alert is shown in Figure 6.11 where a path
shadow is incorrectly identified as an obstacle. The median filtered and 8 grey
level image is shown in Figure 6.11a. The segmentation of the 32x24 block image is
shown in Figure 6.11c. Figure 6.11d shows the alert segment, which has been
incorrectly identified.
[Line graph: mean recall (vertical axis, 0.00 to 1.00) against percent of image sequence completed (horizontal axis, 10 to 100), with one line for each time of day: mid morning, early afternoon, mid afternoon and late afternoon.]
Figure 6.9: Postal Box recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.
[Line graph: mean recall (vertical axis, 0.00 to 1.00) against percent of image sequence completed (horizontal axis, 10 to 100), with one line for each time of day: mid morning, early afternoon, mid afternoon and late afternoon.]
Figure 6.10: Bus shelter recall graph: This graph shows the recall (the number of correct alerts/the number of alerts) at different stages during each captured image sequence.
(a) (b)
(c) (d)
Figure 6.11: An example incorrect alert warning. The shadow shown in the original median filtered and 8 grey-level image (a) is incorrectly segmented from the lower resolution image (b) and is assumed to be a looming obstacle in front of the camera (d). The object segments which have been identified are shown in image (c).
Figure 6.12: Images 153 (top) to 156 (bottom) of the mid morning post box sequence. The images on the left have been reduced to 8 grey levels and median filtered. On the right is the segmentation result for each image. An obstacle alert (shown with an ‘x’ pattern) was identified for frame 156.
Table 6.2: Postal box image sequence results.
Time of Capture     Mean Grey level    Correct Alerts    Total Alerts    Recall
Mid morning         72.73              7                 8               0.87
Early afternoon     110.48             18                18              1.00
Mid afternoon       76.44              3                 7               0.30
Late afternoon      81.82              1                 4               0.25
Total               85.37              29                37              0.78

Table 6.3: Bus shelter image sequence results.
Time of Capture     Mean Grey level    Correct Alerts    Total Alerts    Recall
Mid morning         110.60             13                18              0.72
Early afternoon     102.05             7                 19              0.36
Mid afternoon       91.91              14                21              0.67
Late afternoon      87.70              11                23              0.48
Total               98.07              45                81              0.56
6.4 Discussion
There are currently no other AHV simulations based on PDA technology. Current
processor speed, the lack of an FPU, and limited memory have constrained the PDA AHV
simulation presented in this chapter to performing obstacle detection on reduced
resolution images. An additional constraint on the simulation device used in this
chapter was the low quality of images captured from the CF card camera.
The PDA based alert system was developed to partly answer the following
thesis question: Can computer vision techniques be adopted and modified
to provide mobility information in an AHV system? The alert method
has demonstrated that the computer vision method of motion estimation can be
adapted to provide a warning to an AHV system user. The alert algorithm used
the rate of expansion of segmented objects to provide a warning of an impending
collision. The two main obstacles (bus shelter and postal box) were correctly
identified as hazards in every image sequence captured. However, there was a
wide variation in the number of alerts presented (between 1 and 18 for the postal
box sequence), due to differences in lighting and camera direction while recording
the images. Therefore, it would be useful to measure how frequently alerts should
be presented to assist with the mobility of an AHV system user.
An obvious method to improve the segmentation performance of the looming
obstacle alert would be to conduct all motion processing on the higher resolution
image before reducing the spatial resolution. It should be possible to use this
approach as PDA technology improves.
The algorithm described measures the rate of expansion of objects which are
approaching the camera. In an AHV system this camera will usually be mounted
on a person’s head. Therefore a person could make contact with an obstacle such
as a chair or car which is not within the camera’s field of view. However, as
discussed in Chapter 2, head-high obstacles (such as a telephone booth jutting out
from a wall) can be particularly dangerous obstacles for the blind.
In future studies on the effect of light levels on obstacle alerts, the use of a
single sequence could be helpful. One way to do this might be to adjust the
grey-levels within an image sequence to reproduce decreasing illumination.
However, caution needs to be used, as there may be confounding factors which affect
mobility at different times of the day (for example, shadow effects and
temperature may also affect walking speed). A set of standard mobility-related image
sequences would be very useful for the development and testing of alert algo-
rithms.
6.5 Chapter Summary
In this chapter a novel method of processing images using a PDA and attached
camera was presented. This method detects obstacles which are looming in front
of the camera and provides an alert to the wearer. The results of two experi-
ments at four illumination levels have indicated that the initial segmentation and
adequate illumination are significant factors in system performance. The overall
recall value of 63% indicates that the block based method shows reasonable per-
formance for development in future AHV systems, although it will be important
to consider what ratio of correct alerts versus false alerts will be acceptable for
system usability (this question is addressed in the next chapter).
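As a check on the arithmetic, pooling the totals from Tables 6.2 and 6.3 reproduces the overall value quoted above. The short sketch below assumes that the overall figure is the pooled ratio of correct alerts to total alerts across both routes; the names are hypothetical.

```python
# (correct alerts, total alerts) from the "Total" rows of Tables 6.2 and 6.3.
POSTAL_BOX = (29, 37)
BUS_SHELTER = (45, 81)


def alert_ratio(correct, total):
    """Proportion of alerts rated as correct (termed recall in this chapter)."""
    return correct / total


pooled_correct = POSTAL_BOX[0] + BUS_SHELTER[0]   # 74
pooled_total = POSTAL_BOX[1] + BUS_SHELTER[1]     # 118

print(round(alert_ratio(*POSTAL_BOX), 2))                    # 0.78
print(round(alert_ratio(*BUS_SHELTER), 2))                   # 0.56
print(round(alert_ratio(pooled_correct, pooled_total), 2))   # 0.63
```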
An important question is whether the display of alert information in an AHV
simulation results in a reduction in the number of collisions during a mobility
assessment. In the next chapter an indoor mobility experiment will be described
to evaluate the use of the alerts on mobility effectiveness.
Chapter 7
Mobility Assessment using a
PDA-based AHV Simulation in a
course environment
7.1 Introduction
This chapter presents a pilot experiment in which an artificial mobility course was
constructed (based on the artificial courses discussed in Chapter 2). This course
allowed the mobility performance of volunteers to be assessed while they wore a
PDA-based AHV simulator. This simulator consisted of three modes including
the alert processing mode discussed in the previous chapter.
An image processing based AHV simulation has not previously been used
in mobility assessment. As discussed in Chapter 2, in the only paper focussing on
AHV mobility Cha et al. [29] investigated walking speed and obstacle contacts
in a high contrast maze-like environment. However, their work used different
masks attached to a monitor in front of participants’ eyes (there was no processing
of images captured from the camera). Also, Cha et al.’s paper reported walking
speed through their maze, which (unlike PPWS) does not consider the effects of
individual differences in normal walking speed. Therefore a pilot experiment with
a small number of participants was conducted to investigate the use of PPWS
and computer vision methods in an AHV simulation.
The aims of this experiment were to investigate the following two main thesis
research questions:
Can objective measures be developed for the comparison of effective-
ness between AHV systems in providing mobility information?
The use of an artificial mobility course in this experiment should allow a larger
range of mobility related factors to be investigated than static image simulations.
The additional influences on mobility are shown in Figure 7.1 and include
dynamic factors (such as sensory and environmental information) and human
factors (such as walking gait). A specific hypothesis investigated in this chapter
was that PPWS and mobility performance should increase during trials and with
repeated use of the simulator.
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
In this chapter the alert display developed in the previous chapter was compared
to two other display types. Using the artificial mobility course it was expected
that the frequency of mobility errors and time required to perform mobility tasks
should be less when the alert display is activated compared to the other display
types.
[Diagram; box labels as extracted:]
Dynamic factors
• Computer Vision methods for AHV: Spatial Resolution, Frame Rate, Number of grey levels, Low pass filter (smoothing), Motion estimation, Obstacle detection
• Context: Indoor artificial mobility course
• Scene properties: Texture, Complexity, Lighting, Glare, Contrast, Type of objects
• Task: Walking along path, Obstacle avoidance, Finding Keys
• Sensory Information: Taste, Heat, Olfactory, Tactile, Auditory, Proprioception
External factors
• AHV Technology
  Camera: Resolution, Frame rate, Field of View ...
  Electrode Array: Location, Number of electrodes, Type of electrode, Spatial Layout of electrodes
  Infrastructure: Transmitter/Receiver, Telemetry method, Stimulator type, Processor
• Environment: Affordances, Path boundaries, Landmarks
• Human Factors: Experience/training, Psychological factors (motivation, depression, etc), Physical factors (gait, etc), Use of secondary aid, Age, Gender ...
• Non-image sensors: Ultrasound, Laser, GPS ...
• Other modalities: Auditory, Tactile ...
• Other display modalities: Auditory, Tactile ...
AHV Display Type: Alert Information (Looming obstacles), Symbolic, Standard Display
Mobility Performance: PPWS, Obstacles, Veering
Figure 7.1: Factors which influenced simulated AHV display effectiveness in this chapter. Excluded factors are marked with a line pattern.
7.2 Method
This experiment involved human participants wearing a custom PDA-based AHV
simulator. Participants were required to walk through an artificial mobility course
while their mobility performance was assessed.
7.2.1 AHV Simulation Device
Custom hardware and software were developed for the novel AHV simulation
used in this chapter.
Hardware
A standard head-brace device (weighing 250 grams) was adapted to include a
bracket (400 grams in weight) for holding a PDA in front of the participant’s
eyes. This head-brace and PDA setup enables the simulation of AHV within
different experimental environments.
External light (not from the PDA display) was restricted by each participant
wearing a pair of modified ski goggles, with the lenses removed and a sheet of block out
curtain sewn to the bottom of the frame. This curtain was then lifted over the
headgear and tied behind the head of each participant. A layer of black felt
was also attached to the nose area of the goggles to restrict light.
The combined weight of the camera and PDA was only 164 grams. To conserve
battery life, the PDA display brightness was adjusted to 50% and bluetooth
communication settings were disabled. To prevent the PDA shutting down during
an experiment, all power saving options were also disabled.
Simulation software
The PDA software was an enhanced version of the program developed in the
previous chapter. The simulator software was developed in Microsoft embedded
Display type    Image processing
1               8 grey-scale median filtered display with Alerts
2               8 grey-scale median filtered display
3               256 grey-scale average display
Table 7.1: AHV simulation display types used for the pilot study
Visual C++ version 4.0 on a Windows XP PC. Functions from the Flycam-CF
Software Development Kit library were used to obtain images from the camera.
After compilation, the program files were transferred to the PDA using a USB
connection and Microsoft ActiveSync.
Three types of display were compared in this experiment. These display types
are described in Table 7.1 and shown in Figure 7.3. To exclude frame rate as
a confounding variable, each display type was standardised to 7.5 frames per
second (fps). These displays presented 32x24 simulated phosphenes, which filled
the 320x240 pixel PDA display.
The main steps in the image processing algorithm are shown in Figure 7.2.
The original 160x120 pixel camera image was captured, converted to either 8
(display types 1 and 2) or 256 grey-levels (display type 3). A 3x3 median filter
was applied to display types 1 and 2.
Detailed information on display type 1 (the alert display) was presented in
chapter 6. The main purpose of this display is to segment each image and com-
pare the growth of segments between images. If a segment expands above a
predetermined threshold, and takes up more than 40% of the screen, then an
alert is displayed (the expanding segment is shown as red on the PDA screen).
For display type 2, the reduced grey-level and median filtered output was
reduced to a 32x24 block array based on the average pixel values. Display type
3 was simply the original grayscale image reduced to a 32x24 ‘block’ array based
on average pixel values.
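The average-based reduction used for display types 2 and 3 can be sketched as below. Integer (floor) division is assumed for the mean, in keeping with the integer-only arithmetic used on the PDA, and the function name is hypothetical.

```python
def reduce_to_blocks_average(image, block=5):
    """Collapse each non-overlapping block x block pixel area to its integer mean."""
    height, width = len(image), len(image[0])
    area = block * block
    blocks = []
    for by in range(0, height - block + 1, block):
        row = []
        for bx in range(0, width - block + 1, block):
            total = sum(image[by + j][bx + i]
                        for j in range(block) for i in range(block))
            row.append(total // area)  # floor of the mean of the 25 pixels
        blocks.append(row)
    return blocks
```

Unlike the median, the mean can produce grey-level values that do not occur in any contributing pixel, so a display built this way will generally contain a wider range of intermediate values.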
[Flowchart; steps as extracted:]
• Get current 160x120 pixel image
• Convert from RGB to 256 grey-level image
• Display = 1 or 2? Yes: reduce image to 8 grey levels, then apply 3x3 median filter. No: continue.
• Display = 1? No: reduce spatial resolution of image to 32x24 blocks (using average pixel values). Yes: continue with the steps below.
  - Current image greyscale sum > threshold? Yes: reset the previous array and image segments. No: continue.
  - Reduce spatial resolution of image to 32x24 blocks (using median pixel values)
  - Segment image based on neighbourhood greyscale values
  - Search for the current block segment in the previous image
  - Smooth the updated segment values in current image
  - Calculate rate of expansion for each segment from previous image and set alerts if required
• Display simulator output (32x24 blocks)
• Copy current image to previous image
Figure 7.2: Processing steps for the AHV simulator used in the pilot study. Note the display type is initialised before the images are processed. The three display types are listed in Table 7.1.
Figure 7.3: Examples of the image types used in this study. Figure (a) is the
base 160x120 pixel 256 grey-level image. The simulator image using display type
3 is shown in image (b). Image (c) shows the base image from (a) with 8
grey-levels and a 3x3 median filter applied. In image (d), image (c) has been
reduced to a 32x24 phosphene display (this is used for simulator display types
1 and 2).
Figure 7.4: Images taken of the Gait Lab before the mobility course was set up.
Image (a) shows the black curtains surrounding the lab. The change area ‘tent’,
and raised wooden platform are visible in image (b).
7.2.2 Assessment of mobility performance
To assess mobility performance using the AHV simulation, an indoor mobility
course (shown in Figure 7.6) was constructed within an 11m x 10m laboratory
(used for gait analysis at the School of Human Movement, Queensland University
of Technology). The walls of this laboratory were covered with black curtains.
The course consisted of a winding path, approximately 1.2m in width. Path
boundaries were marked with 48mm black duct tape. A wooden platform (raised
approximately 8cm from the floor) was incorporated into the mobility path and
is visible in Figure 7.4b. The floor of the course consisted of wood and concrete
(painted light grey). The total length of the course was approximately 45m.
Eleven obstacles of differing heights were placed through the course (Figure
7.6). Two of the obstacles were suspended from the ceiling to a height of 1.2
m. All obstacles along the path were made of soft materials. Obstacle number 4
from Figure 7.6, a 100cm tall light grey obstacle, is shown in Figure 7.5. These
obstacles were chosen as a safe way of replicating various obstacles found in real
life such as chairs or people. The obstacle types were based on those used in
previous low vision mobility studies (for example, Lovie-Kitchin et al. [137]).
Figure 7.5: Example soft obstacle set up for the mobility course.
One of the main dependent variables used in this pilot experiment was PPWS.
As discussed in Chapter 2, PPWS allows the objective comparison of different
people by normalising their walking speed. Therefore a straight, unobstructed,
10m section of the course was used to measure the Preferred Walking Speed
(PWS) of each participant. This area is shown in Figure 7.6.
In this experiment each participant was required to perform two different tasks
(A and B). The order of these tasks alternated for each participant. These tasks
were:
1. Task A: Navigate through the course from the start to a clearly marked end
point.
2. Task B: Find a set of keys, located on a table next to the path, and carry
these keys to the end of the course.
7.2.3 Questionnaire
A number of human factors, which have been identified in the display framework
(Figure 7.1), were also recorded for each participant in this experiment.
Therefore, before commencing the mobility tasks, each participant was asked to
fill in a questionnaire comprising the following questions:
1. What is your gender? Male Female
2. Please indicate your age (years): <20 yrs 20-30 yrs 30-40 yrs 40-50 yrs 50-60
yrs over 60 yrs
3. Are you wearing any vision correction device (glasses or contact lenses)?
Yes No
4. Have you ever used an immersive Virtual Reality (VR) environment (using
a head mounted VR display) before? Yes No
5. If you have used an immersive VR environment before, approximately how
many times have you done this?
The questions on vision correction or prior experience with virtual reality were
included as these factors may enhance a person’s ability to compensate for a lower
resolution AHV simulation display.
Figure 7.6: The indoor course used for mobility assessment in this Chapter.
[Course map showing the mobility path, change room, tables, gait analysis
equipment, obstacle positions, start positions A and B, and the 10 metre
Preferred Walking Speed (PWS) measurement area. Obstacle types listed in the
legend: 90 cm high from floor, 48 cm diameter; 120 cm high from floor, 70 cm x
30 cm; 30 cm high from floor, 55 cm x 40 cm; 100 cm high from floor, 50 cm
diameter; 120 cm high suspended from ceiling, 35 cm x 35 cm x 10 cm depth.]
7.2.4 Participants
This experiment was designed as a pilot to investigate the use of a PDA based
AHV simulator in an artificial mobility course. Therefore a small sample of
five people participated in this pilot study. Three volunteers were selected from
the postgraduate student population and another from academic staff, at the
Queensland University of Technology. The final volunteer was an undergraduate
student at the University of Queensland. All participants had normal or corrected
to normal vision. Three participants were aged 20-30 years, one was less than 20
years, and one was aged 40-50 years.
Level 1 (Low risk) ethical clearance (number 3887H) was obtained from the
QUT University Human Research Ethics Committee for this experiment.
7.2.5 Procedure
Each participant was randomly allocated one of the three display types, and was
allocated to commence the first trial with one of the two task types (Task A,
which involved moving through the course, or Task B, which involved searching
for a set of keys in addition to moving through the course). An hour was
allocated for testing each individual. Study participants were met in a waiting
room, blindfolded, and led to a screened ‘change room’, where they were asked to
read a consent sheet and fill out the questionnaire. The simulation headgear was then
explained and fitted. Each participant was allowed two minutes to familiarise
themselves with the display. If the alert based display was used, the red flashing
display sections (obstacle warning) were explained. The guided PWS was then
recorded over 10m (this area is shown by a dotted line in Figure 7.6). After this
the participant was led to the task starting location and the first mobility task
was conducted. Each participant was offered a short break before the second
task was conducted. Finally, the PWS was again measured. During the mobility
tasks, a single experimenter recorded walking speed, obstacle contacts, the num-
ber of times participants were told they were walking backwards (that is they
were walking normally but had become disoriented and started walking in the
opposite direction to that required to complete the course) and the number of
times participants veered outside the path boundary.
7.3 Results
The questionnaire responses are shown in Table 7.2. None of the participants had
personal experience with Virtual Reality environments. Three participants played
computer games monthly, one played weekly and one played daily. There was no
significant relationship found between game playing and mobility performance.
The number of recorded mobility errors for each two minute interval during
the mobility course is presented in Figure 7.7. For both the first and second
trials, the number of errors peaked during the first two minutes and then
decreased steadily as participants adapted to the simulation device.
The calculations used for PWS, SMC and PPWS are shown in Table 7.3 for
trials 1 and 2. These calculations are based on equations 2.1 and 2.2 in Chapter
2. Overall PPWS increased significantly between the first and second trials
(F (1,9) = 9.70, p<0.05), with the overall mean PPWS improving from 14.76 in the
first trial to 21.52 in the second trial. A reduction in mobility errors was
also found between the first and second trials: the mean number of obstacle
contacts fell from 5.8 to 5.2, veering errors from 10.4 to 6.4, and walking
backwards errors from 1.8 to 0.8.
No participants were successful in finding the keys during the searching mo-
bility task. However, in this pilot study the type of mobility task did not appear
to make a difference in mobility performance (Tables 7.4 and 7.6). A summary
of mobility errors for each task is shown in Table 7.5.
Table 7.2: Questionnaire responses for each participant.
No. Gender Age Video game use Used VR Times VR used
1 Male 20-30 Weekly No 0
2 Male 40-50 Monthly No 0
3 Male 20-30 Monthly No 0
4 Male 20-30 Daily No 0
5 Male <20 Monthly No 0
Table 7.3: PPWS results for each trial for each participant. D is the display
type and T is the task performed in the first trial. The Benchmark column is
the time taken during the 10m guided walk. PWS is 10/Benchmark time. Time (s)
is the number of seconds taken to walk through the 45m mobility course. SMC is
45/Time. PPWS is SMC/PWS multiplied by 100.
            Benchmark (s)   PWS (m/s)           Time (s)   SMC (m/s)    PPWS
No.  D  T   1      2        1     2     Ave     1    2     1     2      1      2
1    1  B   16.82  13.98    0.59  0.72  0.66    775  297   0.06  0.15   8.86   23.13
2    3  A   12.50  13.34    0.80  0.74  0.77    357  320   0.13  0.14   16.37  18.26
3    2  B   19.53  16.30    0.51  0.61  0.56    475  288   0.09  0.16   16.92  27.90
4    2  A   13.50  15.22    0.74  0.66  0.70    459  325   0.10  0.14   14.01  19.78
5    1  A   14.98  17.59    0.67  0.57  0.62    420  357   0.11  0.13   17.28  20.33
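The calculation described in the caption of Table 7.3 can be sketched as a short function. The function name and argument names are hypothetical, and the use of the average of the two PWS measurements as the denominator is an assumption based on the ‘Ave’ column of the table; equations 2.1 and 2.2 in Chapter 2 remain the authoritative definitions.

```python
def ppws(benchmark_times_s, course_time_s, benchmark_m=10.0, course_m=45.0):
    """Percentage Preferred Walking Speed for one trial.

    benchmark_times_s: times (s) for the guided 10m walks (before and after).
    course_time_s: time (s) taken to walk through the 45m mobility course.
    """
    # PWS (m/s) for each benchmark walk, then averaged (assumed, see 'Ave').
    pws_values = [benchmark_m / t for t in benchmark_times_s]
    pws = sum(pws_values) / len(pws_values)
    # Speed through the mobility course (m/s).
    smc = course_m / course_time_s
    return 100.0 * smc / pws

# Participant 1 from Table 7.3 (benchmark times 16.82 s and 13.98 s).
trial_1 = ppws([16.82, 13.98], 775)
trial_2 = ppws([16.82, 13.98], 297)
```

With these inputs the results are close to the tabulated values of 8.86 and 23.13; small differences are expected because the table appears to have been calculated from rounded intermediate values.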
Table 7.4: PPWS results for each task type and trial.
PPWS Trial 1 Trial 2
Task A 15.89 25.52
Task B 12.89 19.46
Table 7.5: Mobility error results for each trial for each participant.
            Obstacle contacts   Veering          Walk backwards
Task/Trial  1      2            1      2         1      2
A           5.00   6.00         10.33  8.33      2.00   1.33
B           7.00   4.00         10.50  3.50      1.50   0.00
Table 7.6: Mobility error results for each task type and trial.
Mobility Errors Trial 1 Trial 2
Task A 17.33 15.67
Task B 19.00 7.50
Figure 7.7: Total number of mobility errors for both trials during the mobility
course experiments.
[Plot of the number of mobility errors (vertical axis, 0 to 40) against time in
two minute intervals from 0-2 to 12-14 minutes (horizontal axis), with one line
each for Trial 1 and Trial 2.]
Table 7.7: Mobility error summary for each display type.
Display Type 1 2 3 Total
Obstacle contacts trial 1 15 9 5 29
Obstacle contacts trial 2 13 10 3 26
Veering errors trial 1 21 23 8 52
Veering errors trial 2 19 7 6 32
Walking backwards trial 1 3 6 0 9
Walking backwards trial 2 2 1 1 4
Total: 73 56 23 152
Table 7.8: PPWS summary for each display type.
Display Type 1 2 3 Mean
Trial 1 mean PPWS 13.07 15.46 15.76 14.76
Trial 2 mean PPWS 21.73 23.84 19.00 21.52
Combined mean PPWS 17.40 19.65 17.38 18.14
Table 7.9: Effect sizes (η2) for the main mobility factors in this pilot study.
‘DV’ represents the dependent variable, ‘F’ is the F-test result and ‘Sig’
represents significance.
Factor DV F Sig η2
Trial PPWS 9.700 0.014 0.548
Trial Veering errors 3.008 0.121 0.273
Trial Obstacle contacts 0.151 0.707 0.019
Display PPWS 0.196 0.827 0.053
Display Veering errors 0.472 0.642 0.119
Display Obstacle contacts 1.683 0.253 0.323
The results for each display type are summarised in Tables 7.7 and 7.8. Display
type 2 (the 8 grey-level median filtered display) resulted in the highest PPWS
results. Display type 3 (the 256 grey-level average display) resulted in only 23
mobility errors, compared to 56 for Display type 2 and 73 errors for Display
type 1 (8 grey-level median filtered display with Alerts) (Table 7.7).
Table 7.9 shows the effect sizes from this pilot study. η2 was used as a measure
of effect and represents the proportion of variance in the dependent variable that
is attributable to each factor. The greatest degrees of association were found
between the trial number (first or second) and PPWS (η2 = 0.548), and between
display type and obstacle contacts (η2 = 0.323).
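As an illustration of how η2 relates to the F statistic, the sketch below computes both for the per-participant PPWS values in Table 7.3. It treats the two trials as independent groups in a one-way ANOVA, which is an assumption: the exact model used for the analysis is not stated here, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, eta squared) for a list of groups of observations."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the factor (between groups).
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Unexplained variation (within groups).
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    # Proportion of total variance attributable to the factor.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared

# PPWS values for trials 1 and 2 from Table 7.3.
trial_1 = [8.86, 16.37, 16.92, 14.01, 17.28]
trial_2 = [23.13, 18.26, 27.90, 19.78, 20.33]
f_value, eta_squared = one_way_anova([trial_1, trial_2])
```

Under this assumption the results are close to the Trial/PPWS row of Table 7.9 (F = 9.700, η2 = 0.548).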
7.4 Discussion
This pilot study has demonstrated the feasibility of using a low cost PDA-based
AHV simulator to assess mobility performance. PPWS and mobility errors have
provided a useful method of measuring the three display types used in this study.
This chapter aimed to address the following questions:
Can objective measures be developed for the comparison of effective-
ness between AHV systems in providing mobility information?
The use of PPWS within an artificial mobility environment has been demon-
strated. This environment has enabled the comparison of three different AHV
simulation display types on mobility performance.
PPWS and mobility performance should increase during trials and
with repeated use of the simulator.
Learning effects were demonstrated with the two trials used in this study. Cha et
al. [27] have previously noted the effects of learning on mobility skill, in particular
the use of head movements to help depth perception and familiarity with the
environment. During this study participants learnt to recognise and follow the
path boundaries, usually by slight head movements. However, this meant that
participants tended to bend over, and walk in a shuffling gait during the mobility
course, similar to the gait of the elderly or the congenitally blind [188].
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
In this chapter the alert display developed in the previous chapter was compared
to two other display types. All of these display types have used image process-
ing techniques to process captured camera input and provide a low resolution
simulation image to participants.
The frequency of mobility errors and time required to perform mobility
tasks should be less when the alert display is activated compared to
the other display types.
Display type 1 (the alert display) did not assist with mobility performance, and
led to the highest number of mobility errors (see table 7.7). One problem with
this display type was that although each of the three display types was
standardised at 7.5 fps, the alert display could temporarily pause with large
changes in luminance (generally due to large head movements). The reason for the
delay was the alert software re-initialising and performing initial segmentation.
The large number of false positives with the alert display also reportedly
confused participants. Although the idea of checking the rate of expansion of
segmented objects is probably sound, these results indicate that this processing
needs to be performed efficiently on the full size image and not on the reduced
32x24 block spatial image (which was chosen to reduce the computational burden on
the PDA). At the lower spatial resolution, the human brain appears better prepared
to detect looming obstacles than the alert system presented in this chapter. This
supports Boyle et al. [19] who found that further processing on low resolution
images does not result in greater image understanding, and that the most
important current constraint on AHV systems is the limited number of electrodes
(and therefore reduced spatial resolution).
The usefulness of the alert modes could depend on the location of the AHV
electrode implant and the degree of learning, neuroplasticity and general health
available to the recipient. For example, a retinal implant recipient may be less
likely to require alert assistance due to further processing in the human visual
system.
Although the PDA simulator tended to pull down on the participants’ foreheads,
none of the participants asked to stop the experiment. Two participants
needed a break between trials due to nausea and dry eyes. Nausea is a well-known
side effect of display lag within VR environments (making vestibulo-ocular adap-
tation difficult for the participant) [1]. A lighter, and less conspicuous, simulation
device would be useful, particularly for outdoor mobility assessment. It should
be feasible to connect a wireless head mounted camera with a PDA and send the
display to either VR goggles, or a Low Vision Enhancement System (as used in
[226]). In addition, the material shroud used to block external light had the effect
of trapping heat generated from the PDA, which added to participant discomfort.
7.5 Chapter Summary
In summary, the simulator has demonstrated that it is possible to capture and
perform image processing on camera input using a PDA device. Although only
a small number of subjects were involved, this study has not supported the use
of the ‘intelligent’ alert display developed in Chapter 6. All participants, with all
display types, were able to improve mobility performance, measured by PPWS
and mobility errors, over only two trials. Although the PDA provided a small and
low cost simulation platform, a problem with the PDA based simulator was the
weight of the bracket on the front of the head brace, which may have altered the
movement of participants, and which could have affected mobility performance.
In addition, the square grey boxes displayed to represent phosphenes are not
representative of those described by human recipients of an AHV system. The next
chapter describes the development and use of a more advanced and comfortable
AHV simulator which overcomes these issues.
Chapter 8
Effects of Spatial and Temporal
Resolution on Mobility
Assessment
8.1 Introduction
This chapter presents the results of an AHV mobility experiment involving a large
number of participants. The design of this experiment builds on the pilot mobility
course described in Chapter 7; however, the heavy PDA based simulator has been
upgraded to a Virtual Reality (VR) type Head Mounted Display (HMD), which is
connected to a standard Windows XP laptop running custom software. A guiding
principle used throughout this thesis was to develop a low cost simulation, and
this has guided design decisions on hardware and software used in this Chapter.
The following thesis research questions are addressed in this chapter:
Can specific main factors be identified as highly significant for provid-
ing mobility information in an AHV system?
One of the aims of the research presented in this chapter is to investigate the
effects of frame rate and spatial resolution on mobility performance. There have
been reported limits to the temporal resolution at which phosphenes can be
perceived (for example, Dobelle has reported that 4 frames per second (FPS) was
the most effective temporal resolution for his cortical device [48]). Also,
although faster stimulation has been reported in the literature, much research on
temporal resolution (for example, Eckhorn et al. [58]) is currently based on
animal experiments, and the effects of chronic electrical stimulation on the
human visual system may cause a reduction in temporal resolution. In addition,
sensory substitution devices for the blind also provide information at low frame
rates (for example, the vOICe auditory based device provides soundscapes at 1 fps
[149]).
Although a focus of much current AHV research is to increase the number of
implantable electrodes, and thereby increase perceived spatial resolution, the
effect of frame rate on mobility for an AHV display has not yet been examined. It
is hypothesised that mobility performance should increase with increased spatial
resolution and also with frame rate. This chapter investigates the interaction
between display frame rate (1, 2 and 4 FPS) and spatial resolution (32x24 and
16x12 phosphenes).
A number of additional factors from the proposed AHV mobility display
framework were also evaluated (see Figure 8.1). Participants with
corrected-to-normal vision were hypothesised to demonstrate better mobility
performance than normally sighted participants due to their previous experience
compensating for minor visual loss. Similarly, this experiment also examined
whether participants who had previous experience with immersive virtual reality
environments could perform more effectively with the AHV simulation than those
without previous experience. The effect of gender on mobility performance was
also examined, as research has reported gender differences in navigation speed
related to virtual reality display field of view [44] and in mental rotation (for
example [167]).
Can objective measures be developed for the comparison of effective-
ness between AHV systems in providing mobility information?
The experiment described in this chapter uses a custom developed indoor artificial
course to compare mobility between different people using an AHV simulation
device. This course is similar to the course used in the previous chapter; however,
the obstacle types are standardized, background clutter has been reduced by the
use of office partitions along the course, and the level of noise has been reduced.
The dependent variables used in this experiment are walking speed through the
course, PPWS, and mobility errors (obstacle contacts and veering). One aim of
this study was to explore the suitability of these mobility related variables as
objective measures for comparing different AHV display types.
Can computer vision techniques be adopted and modified to provide
mobility information in an AHV system?
Both the resolution and frame rate of the simulation display were programmatically
controlled for each participant. Captured camera images were processed to reduce
both the resolution and the number of grey-levels before display. The
alert display, found to be confusing and ineffective in the previous chapter, was
not used in this experiment.
Figure 8.1: Factors which influence simulated AHV display effectiveness in this
chapter. Excluded factors are marked with a line pattern.
[Framework diagram. Panel titles: Dynamic factors (Computer Vision methods for
AHV, Context, Task, Sensory Information); External factors (AHV Technology,
Environment, Human Factors, Non-image sensors, Other display modalities); AHV
Display Type (Alert Information, Standard Display); Mobility Performance (PPWS,
Obstacles, Veering).]
8.2 Method
The experimental set up used in this chapter is similar to the previous chapter. A
significant difference was the use of a VR display device to display the AHV
simulation. Custom software was also developed to display more realistic simulated
phosphenes. An indoor mobility course was set up in a large civil engineering
concrete laboratory.
8.2.1 Simulation Hardware
The hardware used in this study consisted of three components:
Head Mounted Display
The Head Mounted Display (HMD) used in this study was the i-O Display Systems
(Sacramento, CA) i-glasses PC/SVGA, which provided a selected resolution
of 640-by-480 and a total field of view of 26.5◦ at a 60Hz refresh rate. This
display was chosen due to its low cost (AU$1230) and simple interface to a laptop
PC. An external lithium polymer battery (cost AU$215) powered the HMD.
To block out external light, a custom shroud was constructed from block out
curtain and sewn around the HMD (with slots to allow ventilation).
Camera
A Swann Netmate Universal Serial Bus (USB) camera was attached, at eye level,
to the front of the HMD. This camera was selected due to its low cost (AU$53),
small size, light weight and simple integration with the Windows operating
system. The camera used a 1/7 inch CMOS sensor, and had automatic gain
compensation, exposure and white balance. The field of view (FOV) for this camera
was manually calculated at 34◦ horizontal and 27◦ vertical.
Laptop PC
A Toshiba Tecra laptop (1.6GHz Centrino processor) was either worn by partic-
ipants in a backpack, or carried by the experimenter. The camera was powered
from the USB port of this computer.
8.2.2 Simulation Software
The main requirement for the AHV simulation software was to convert input
from the USB camera into an on-screen phosphene display. To be representative
of current AHV prototypes and maintain the same aspect ratio of the display
device the simulation reduced the resolution of captured images from 160x120
RGB colour to 32x24 or 16x12 simulated phosphenes. As discussed in Chapter
3, it is possible to modulate the size of phosphenes to represent a limited number
of grey levels. Therefore in this simulation it is assumed that eight grey levels for
each phosphene can be displayed.
The simulation software was written in Visual C++ 6.0 (Microsoft, Redmond,
WA), using the Microsoft Video for Windows library to capture incoming video
images. These images were sub-sampled (using the mean grey level of contributing
pixels) to a lower resolution image, which was then converted to 8 grey levels. To
simulate a perceived electrode response, the low resolution image was displayed as
a phosphene array using the DirectDraw component of Microsoft DirectX. Figure
8.2 shows the mapping between image grey levels and the different phosphene
representations. Each phosphene was generated from an original circle, 40 pixels
in diameter, filled with the matching grey level and blurred with a Gaussian filter
(Radius=10). Examples of the simulation display are shown in Figures 8.3 to
8.5. These simulated phosphenes are similar to those generated by Thompson et
al. [226] and Dagnelie et al. [47].
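The processing chain described above (mean sub-sampling, conversion to 8 grey levels, and rendering of each phosphene as a blurred circle) can be sketched with NumPy. This is not the Visual C++ and DirectDraw implementation used in the study. The function names, the 80 pixel tile size and the linear mapping from grey level to brightness are assumptions, and the stated Gaussian radius of 10 is interpreted here as the standard deviation of the filter, which may differ from the original tool's definition.

```python
import numpy as np

def subsample_mean(grey, out_w, out_h):
    """Reduce an image to out_h x out_w using the mean of contributing pixels."""
    h, w = grey.shape
    bh, bw = h // out_h, w // out_w
    blocks = grey[:out_h * bh, :out_w * bw].reshape(out_h, bh, out_w, bw)
    return blocks.mean(axis=(1, 3))

def to_levels(img, levels=8):
    """Map 0-255 grey values to integer level indices 0..levels-1."""
    idx = np.floor(np.asarray(img, dtype=float) * levels / 256.0).astype(int)
    return np.clip(idx, 0, levels - 1)

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at three standard deviations."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with zero padding outside the tile."""
    k = gaussian_kernel(sigma)
    assert len(k) <= min(img.shape), "tile must be at least as large as kernel"
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def make_phosphene(level, levels=8, diameter=40, sigma=10.0, tile=80):
    """Render one phosphene: a filled circle of the given level, then blurred."""
    grey = 255.0 * level / (levels - 1)  # assumed linear level-to-grey mapping
    yy, xx = np.mgrid[:tile, :tile]
    centre = (tile - 1) / 2.0
    inside = (xx - centre) ** 2 + (yy - centre) ** 2 <= (diameter / 2.0) ** 2
    disc = np.where(inside, grey, 0.0)
    return blur(disc, sigma)

# One pre-rendered tile per grey level, indexed by the values in the
# low resolution image (32x24 or 16x12).
phosphene_tiles = [make_phosphene(level) for level in range(8)]
```

Pre-rendering the eight tiles once means that each frame only requires a lookup per phosphene, which is one plausible way to keep the per-frame cost low.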
Figure 8.2: Phosphenes (top row) displayed as grey level pixels in reduced
resolution images
Figure 8.3: Original 160x120 pixel captured image
8.2.3 Mobility course
To assess mobility performance, an indoor mobility course (Figure 8.6) was con-
structed within an emptied 30x40m civil engineering laboratory at the Queens-
land University of Technology. The mobility course consisted of a winding path,
approximately 1m wide and 30m long. Path boundaries were marked with 48mm
black duct tape. The floor of the course was concrete, which was painted light
grey; however, a 3m² section had been painted white for a previous experiment. Grey
office partitions, approximately 200 cm tall, were placed on either side of the path
to reduce visual clutter and to prevent participants from confusing the neighbor-
ing path with the current path.
Figure 8.4: Original image reduced to 32x24 phosphenes
Figure 8.5: Original image reduced to 16x12 phosphenes
Figure 8.6: Map of the 30m mobility course built for this study. The grey shaded
area is the path identified by black tape on the floor. The numbers refer to the
placement of obstacles and the black lines denote office partitions.
Figure 8.7: Different types of grey shading on each obstacle shown in Figure 8.6
Eight obstacles, painted in different shades of matt grey, were placed through
the course (see Figure 8.7). Two of the obstacles were suspended from the ceiling
to a height of 1.2 m above floor level. All obstacles along the path were made
from empty packing boxes (450x410x300mm). As in the mobility course described
in Chapter 7, these obstacles were designed to replicate obstacles which a blind
person could encounter in the real world.
A straight, unobstructed, 10m section of the course (shown in Figure 8.6)
was used to measure the Preferred Walking Speed (PWS) of each participant.
8.2.4 Participants
Ten female and 50 male volunteers were recruited from staff and students at
different faculties at the Queensland University of Technology (QUT). The method
of recruitment involved emails and posters placed around the three QUT
campuses. The age and gender distribution of participants is shown in Table 8.1.
All participants had normal or corrected-to-normal vision.
Level 1 (Low risk) ethical clearance (number 3887H) was obtained from the
QUT University Human Research Ethics Committee for this experiment.
8.2.5 Questionnaire
As in the previous chapter, before commencing the experiment each participant
completed a questionnaire which recorded gender, age and whether the participant
was wearing glasses or contact lenses. In addition,
participants were asked how many times (if any) they had used an immersive
Virtual Reality environment. The Questionnaire is included in Appendix B of
this thesis.
8.2.6 Statistical methods
Unless stated otherwise, statistical significance was at the p<.05 level. The Sta-
tistical Package for the Social Sciences (SPSS) (2004, SPSS Inc, Chicago, USA)
was used for all statistical calculations. Multifactorial ANOVA was used to test
for significant effects among the experimental factors. Normality and homogene-
ity of variance were assessed visually and found to be acceptable for the use of
parametric statistics.
The formulae for calculating PPWS are provided in Chapter 2 (equations 2.1
and 2.2). In the experiment described in this chapter the preferred walking speed
(PWS) for each participant was measured before and after their two mobility
trials. These two PWS scores were defined as distance (metres) divided by time
(seconds). As previous research has found the PWS to be a stable mobility
measure for individuals (for example, Soong et al. [209] and Lovie-Kitchin et al.
[137]), the two PWS results were averaged. This average PWS score was used
for all PPWS calculations.
Table 8.1: Gender and age groups of experiment participants.
Age    Male  Female  Total
0-19   3     1       4
20-29  27    5       32
30-39  11    1       12
40-49  6     3       9
50+    3     0       3
Total  50    10      60
8.2.7 Procedure
Each participant was randomly allocated to one frame rate (1, 2 or 4 fps) and one
display type level (16x12 or 32x24 phosphenes) and commenced their first trial
with one of the two course start locations (marked ‘A’ or ‘B’ in Figure 8.6). One
hour was allocated for testing each individual. Study participants were met in
a corridor outside the lab, read a consent sheet and filled out the questionnaire.
The simulation headgear was then explained and fitted before the participant was
led into the lab. Each participant was then allowed two minutes to familiarise
themselves with the display. The guided PWS was then recorded over 10m. After
this the participant was led to the trial starting location and the first mobility
trial was conducted. Participants were offered a short break of approximately one
minute before the second trial was conducted. Finally, the PWS was measured
for the second time. During the mobility trials, a single experimenter recorded
walking speed, obstacle contacts, the number of times participants were told they
were walking backwards and the number of times participants veered outside the
path boundary. Mobility performance was recorded for each participant on the
experimenter sheet shown in Appendix B.
8.3 Results
The average number of obstacle contacts by frame rate and resolution is
summarised in Table 8.2. A boxplot showing the median obstacle contacts is shown
in Figure 8.8. The frequency of contact with different obstacle types is shown
in Figure 8.9.
Table 8.3 summarises the average number of veering errors by frame rate and
resolution. Figure 8.10 demonstrates a trend in reduced veering errors as frame
rate and spatial resolution increase.
A summary of the two benchmark speeds, used to calculate PWS, is shown
in Table 8.4. A similar summary for the time spent walking through the mobility
course and PPWS is presented in Table 8.5. The decrease in walking time and
the increase in PPWS are more noticeable in the second trial.
A boxplot showing the median PPWS on trials 1 and 2 by resolution and
frame rate is shown in Figure 8.11, which shows a general increase in PPWS as
frame rate increased (although there was an unexpected lower score for 2 frames
per second at the 16x12 resolution level). A similar plot for the time spent walking
through the course (referred to as Time 1 and Time 2 from now on) is shown in
Figure 8.12.
The interaction between resolution, frame rate and PPWS during each trial is
shown in Figures 8.13 and 8.14. Similar graphs showing the interaction between
resolution, frame rate and Time 1 and Time 2 are shown in Figures 8.15 and 8.16.
8.3.1 Phosphene spatial resolution
As shown in Figure 8.10, overall veering was significantly less with a higher level of
spatial resolution (F (1,54) = 21.25, p<0.01). There was no significant difference
Table 8.2: Mean number of obstacle contacts (with standard deviations) for different resolution and frame rate.
Resolution  Frame Rate  Obstacle Trial 1  Obstacle Trial 2  Total
16x12       1           4.30 (1.70)       3.70 (1.57)       8.00 (2.67)
16x12       2           4.10 (1.73)       3.20 (1.55)       7.30 (2.26)
16x12       4           3.40 (1.35)       4.30 (1.34)       7.70 (1.95)
32x24       1           3.90 (1.37)       2.70 (1.16)       6.80 (2.04)
32x24       2           3.70 (1.77)       4.10 (1.29)       7.80 (2.44)
32x24       4           3.10 (1.10)       2.80 (1.62)       5.90 (2.42)
Figure 8.8: Summary of obstacle errors during trials 1 and 2 by resolution and frame rate (FPS). The boxes show the middle 50 per cent of observations, with the median shown by the solid line in the box. The whiskers coming from each box show the largest value excluding outliers (which are shown as small circles).
[Plot: No. of Obstacle Contacts (0.00 to 14.00) for Obstacle1 to Obstacle6, ObstacleH1 and ObstacleH2, by Resolution (16x12, 32x24) and FPS (1.00, 2.00, 4.00).]
Figure 8.9: Frequency of obstacle contacts by obstacle number for different resolution types and frame rates (FPS). The obstacle types are displayed in Figure 8.7.
Table 8.3: Mean number of veering errors (with standard deviations) for different resolution and frame rate.
Resolution  Frame Rate  Veering Trial 1  Veering Trial 2  Total
16x12       1           11.50 (2.76)     11.50 (4.88)     23.00 (5.91)
16x12       2           12.50 (4.01)     10.10 (4.93)     22.60 (7.11)
16x12       4           10.30 (2.79)     9.30 (1.77)      19.60 (3.63)
32x24       1           10.00 (4.50)     7.80 (4.02)      17.80 (7.96)
32x24       2           7.10 (3.63)      5.40 (3.13)      12.50 (6.11)
32x24       4           6.50 (3.47)      5.30 (4.08)      11.80 (7.21)
Figure 8.10: Summary of veering errors during trials 1 and 2 by resolution and frame rate (FPS).
Table 8.4: Mean benchmark speeds over 10 m (with standard deviations) for resolution and frame rate. Benchmark no. 1 was recorded before the first mobility trial. Benchmark no. 2 was recorded after the second mobility trial. PWS is 10 divided by each Benchmark score. The combined PWS score in the table is the average PWS for the two benchmarks for each participant.
Resolution  Frame Rate  Benchmark no. 1 (s)  Benchmark no. 2 (s)  Combined PWS (m/s)
16x12       1           16.14 (3.54)         15.01 (2.81)         0.67 (0.12)
16x12       2           16.08 (2.34)         16.70 (4.55)         0.63 (0.11)
16x12       4           16.45 (1.94)         17.06 (2.68)         0.61 (0.07)
32x24       1           16.75 (4.30)         16.11 (5.20)         0.65 (0.14)
32x24       2           17.56 (5.30)         15.93 (3.33)         0.63 (0.13)
32x24       4           14.06 (1.38)         14.53 (2.40)         0.71 (0.08)
Table 8.5: Mean scores (with standard deviations) for the amount of time spent walking through the mobility course during each trial, and for PPWS (calculated using combined PWS) during each trial.
Resolution  Frame Rate  Time (s) Trial 1   Time (s) Trial 2   PPWS Trial 1    PPWS Trial 2
16x12       1           326.40 (190.66)    317.80 (207.14)    27.87 (15.49)   29.76 (18.05)
16x12       2           353.20 (142.64)    376.80 (208.75)    24.50 (11.83)   26.07 (15.27)
16x12       4           245.30 (61.91)     237.50 (55.93)     31.79 (6.34)    32.88 (7.20)
32x24       1           306.20 (86.82)     251.80 (112.80)    25.55 (8.90)    32.39 (11.27)
32x24       2           266.10 (78.50)     264.10 (122.14)    29.85 (9.68)    31.74 (10.43)
32x24       4           204.60 (79.80)     178.70 (68.93)     35.84 (10.34)   40.00 (12.62)
[Boxplots: PPWS1 and PPWS2 (0 to 70) by FPS (1.00, 2.00, 4.00) and Resolution (16 x 12, 32 x 24).]
Figure 8.11: Percentage of Preferred Walking Speed (PPWS) results for trials 1 (PPWS1) and 2 (PPWS2) displayed by resolution type and frame rate (FPS).
[Boxplots: Time1 and Time2 (200 to 800) by FPS (1.00, 2.00, 4.00) and Resolution (16 x 12, 32 x 24).]
Figure 8.12: Time spent walking through the mobility course for trials 1 (Time1) and 2 (Time2) displayed by resolution type and frame rate (FPS).
[Plot: Estimated Marginal Means of PPWS Trial 1 (24.00 to 36.00) by FPS (1.00, 2.00, 4.00), with one line per Resolution (16 x 12, 32 x 24).]
Figure 8.13: Variation of trial 1 PPWS scores by frame rate (FPS) and resolution. These results suggest a confounding variable, perhaps anxiety, during the initial trial.
[Plot: Estimated Marginal Means of PPWS Trial 2 (30.00 to 40.00) by FPS (1.00, 2.00, 4.00), with one line per Resolution (16 x 12, 32 x 24).]
Figure 8.14: Variation of trial 2 PPWS scores by frame rate (FPS) and resolution. These results show an increase in walking confidence as frame rate and resolution increase.
[Plot: Estimated Marginal Means of Time Spent Walking During Trial 1 (200.00 to 350.00) by FPS (1.00, 2.00, 4.00), with one line per Resolution (16 x 12, 32 x 24).]
Figure 8.15: Variation of time spent walking during trial 1 by frame rate (FPS) and resolution.
[Plot: Estimated Marginal Means of Time Spent Walking During Trial 2 (150.00 to 400.00) by FPS (1.00, 2.00, 4.00), with one line per Resolution (16 x 12, 32 x 24).]
Figure 8.16: Variation of time spent walking during trial 2 by frame rate (FPS) and resolution.
found between the two levels of display spatial resolution and overall obstacle
contacts (F (1,54) = 0.08, p=0.78). However, there was a significant interaction
between frame rate, resolution and obstacle avoidance on the second trial (F (2,54)
= 9.16, p<0.05). Contact with obstacle 5 on both trials was significantly less
with increased resolution (F (1,54) = 9.16, p<0.01).
There were no significant relationships found between resolution and PPWS
on the first (F (1,54) = 0.51, p=0.48) or second trials (F (1,54) = 2.37, p=0.30).
The results for Time 1 were also not significantly different (F (1,54) = 2.52,
p=0.12). However, participants using the higher resolution did spend significantly
less time walking through the course during the second trial (F (1,54) = 4.40, p<0.05).
8.3.2 Frame Rate
Using PPWS as the dependent variable, frame rate was not related to improved
performance on the first (F (2,54) = 1.80, p=0.18) or second trials (F (2,54) =
2.33, p=0.11). However, time spent walking through the mobility course was
significantly affected by frame rate on both the first trial (F (2,54) = 3.86, p<0.05)
and the second trial (F (2,54) = 3.24, p<0.05). Post-hoc Tukey's HSD analysis
revealed a significant difference in time spent walking between frame rates of
1 and 4 FPS on the first trial (p<0.05), and between frame rates of 2 and 4 FPS
on the second trial (p<0.05).
There was also a marginally significant relationship between frame rate and
overall veering on both trials (F (2,54) = 2.68, p=0.08). Overall contact with
obstacle 5 was related to frame rate (F (2,54) = 3.21, p<0.05); however, frame
rate was not related to overall obstacle contacts (F (2,54) = 0.59, p=0.56).
8.3.3 Prior experience with immersive VR
Eleven (18.3%) participants had previous experience with VR. However, no significant
relationships were found between VR experience and overall obstacle contacts (F (1,54)
= 0.24, p=0.62), overall veering (F (1,54) = 0.08, p=0.78), or PPWS on trial 1
(F (1,54) = 1.39, p=0.24) or 2 (F (1,54) = 0.053, p=0.82). Similarly, VR experience was not
related to time spent walking on the course during trial 1 (F (1,54) = 0.932,
p=0.338) or trial 2 (F (1,54) = 0.217, p=0.643).
Significant interactions were found between resolution, frame rate and VR
experience and contacts with obstacle 1 (F (2,54) = 3.53, p<0.05), and the same
factors and contact with obstacle 3 (F (2,54) = 3.42, p<0.05).
8.3.4 Gender
Although only 10 participants (16.7%) in this study were female, they had significantly
fewer obstacle contacts than males overall (F (1,54) = 9.27, p<0.01).
Further analysis showed this difference in obstacle contacts was significant on the
first trial (F (1,54) = 7.84, p<0.01), but not on the second trial (F (1,54) = 2.75,
p=0.10). Overall, significant gender differences were found for obstacles
1 (F (1,54) = 5.89, p<0.05) and 3 (F (1,54) = 5.55, p<0.05). There was
no difference found between genders on veering during either trial.
On average, females scored higher on PPWS in both trials (Trial 1: male
mean=28.31, SD=10.46; female mean=32.82, SD=13.06; Trial 2: male mean=30.85,
SD=12.18; female mean=38.62, SD=16.09); however, these differences were not
significant (Trial 1: F (1,54) = 1.40, p=0.24; Trial 2: F (1,54) = 3.04, p=0.08).
Similarly, there was no significant difference between genders in time spent walking
in trial 1 (F (1,54) = 0.384, p=0.54) or trial 2 (F (1,54) = 0.729, p=0.40).
8.3.5 Age
Boxplots summarising age group results for time spent walking through the course
and for PPWS are provided in Figures 8.17 and 8.18 respectively. Age was not significantly related
to overall obstacle contacts (F (5,54) = 1.93, p=0.11), overall veering (F (5,54) =
0.36, p=0.87), PPWS on trial 1 (F (5,54) = 0.49, p=0.74) or Trial 2 (F (5,54) =
0.70, p=0.59), or time spent walking during Trial 1 (F (5,54) = 0.48, p=0.75) or
trial 2 (F (5,54) = 1.52, p=0.21).
There was a significant difference with obstacle 1 contact (F (5,54) = 3.78,
p<0.01), probably due to the 50-60 and 60-70 year age groups making contact
with this obstacle on every trial (at least double any other age group). However,
there were only three participants within those age groups. A significant interaction
between age and resolution type was also found for obstacle 6 (F (3,54)
= 4.00, p<0.05). Grouping ages 0-30 (n=36) and participants with an age >30
(n=24) did not reveal any significant differences.
8.3.6 Corrected Vision
Twenty-two (36.7%) participants had corrected-to-normal vision. Corrected
vision was not significantly related to overall obstacle contacts (F (1,54) = 0.66,
p=0.42), overall veering (F (1,54) = 0.25, p=0.61), PPWS on trial 1 (F (1,54) =
0.18, p=0.67) or 2 (F (1,54) = 0.25, p=0.62), or time spent walking through the
course during trial 1 (F (1,54) = 0.74, p=0.39) or 2 (F (1,54) = 1.45, p=0.23).
Significant interactions were found between resolution and corrected vision
and contacts with obstacle 3 (F (1,54) = 4.82, p<0.05), and between frame rate
and corrected vision with contact with obstacle 6 (F (2,54) = 3.27, p<0.05).
[Boxplots: Time1 and Time2 (200 to 800) by Age group (0-19, 20-29, 30-39, 40-49, 50+).]
Figure 8.17: Median time spent walking through the mobility course during Trial 1 (Time1) and Trial 2 (Time2) for different age groups.
[Boxplots: PPWS1 and PPWS2 (0 to 70) by Age group (0-19, 20-29, 30-39, 40-49, 50+).]
Figure 8.18: Median PPWS scores from Trial 1 (PPWS1) and Trial 2 (PPWS2) for different age groups.
8.3.7 Learning effects
The initial and final measurements of preferred walking speed (PWS) were sig-
nificantly correlated (r = 0.74, p<0.01). However, the correlation between time
spent on the mobility course during the two mobility trials was higher (r =
0.87, p<0.01). The relationship between PPWS1 and PPWS2 was also significant
(r = 0.88, p<0.01), although this relationship is artificially enhanced due to the
same (average) PWS score being used for the calculation of PPWS on the first
and second trials for each participant. These results do not support the reliability
of the PPWS measure over simply recording the time taken by participants dur-
ing the mobility course. Scatterplots showing the correlation between dependent
variables on the first and second trials are shown in Figures 8.19 to 8.24.
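For reference, the test-retest correlations reported here are Pearson coefficients, which can be computed as in the sketch below. The function name and the four pairs of course times are hypothetical and are included only to show the calculation. They are not data from this experiment.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical course times (s) for four participants on trials 1 and 2.
time_1 = [210.0, 260.0, 340.0, 480.0]
time_2 = [190.0, 270.0, 300.0, 455.0]
r = pearson_r(time_1, time_2)
r_squared = r * r   # proportion of variance shared between the two trials
```

Squaring r gives the proportion of shared variance, so a correlation of r = 0.87 corresponds to roughly three quarters of the variance being shared between trials.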
Repeated measures ANOVA between Trials 1 and 2 showed a significant de-
crease in veering errors (F (1,54) = 7.97, p<0.01), but only a marginally significant
change in PPWS (F (1,54) = 3.58, p=0.06). There was not a significant reduction
in obstacle contacts between trials (F (1,54) = 1.35, p=0.25).
8.4 Discussion
The experiment described in this chapter has provided further information ad-
dressing the following main thesis questions:
[Scatterplot: Preferred Walking Speed 1 (0.30 to 0.90) against Preferred Walking Speed 2 (0.30 to 1.00). R Sq Linear = 0.53.]
Figure 8.19: Participant Preferred Walking Speed (PWS) results during trial 1 and 2 (r = 0.74).
[Scatterplot: PPWS1 (10.00 to 60.00) against PPWS2 (0.00 to 70.00). R Sq Linear = 0.775.]
Figure 8.20: Participant Percentage Preferred Walking Speed (PPWS) results during trial 1 and 2 (r = 0.88).
[Scatterplot: Time1 (100.00 to 800.00) against Time2 (200.00 to 800.00). R Sq Linear = 0.751.]
Figure 8.21: Participant time spent walking during trial 1 and 2 (r = 0.87).
[Scatterplot: SMC1 (0.10 to 0.40) against SMC2 (0.10 to 0.40). R Sq Linear = 0.761.]
Figure 8.22: Participant Speed on Mobility Course (SMC) results for trial 1 and 2 (r = 0.87).
[Scatterplot: Veering1 (0.00 to 20.00) against Veering2 (0.00 to 20.00). R Sq Linear = 0.365.]
Figure 8.23: Veering incidents for each participant during trial 1 and 2 (r = 0.60).
[Scatterplot: Obstacles1 (1.00 to 7.00) against Obstacles2 (0.00 to 7.00). R Sq Linear = 0.027.]
Figure 8.24: Participant obstacle contacts during trial 1 and 2 (r = 0.16).
8.4.1 Can specific main factors be identified as highly significant for providing mobility information in an AHV system?
Effects of phosphene spatial resolution and frame rate
An increase in spatial resolution from 16x12 phosphenes to 32x24 phosphenes was
associated with a significant reduction in veering errors between participants.
However, frame rate, during the second of two trials for each participant, was
significantly related to increased walking speed. The variability of results for the
first PPWS trial could be due to learning effects, and mixed levels of comfort
and confidence by participants. The results from this study indicate that spatial
resolution is more useful than increased frame rate for following a path without
veering. However, the display frame rate has a significant effect on a person’s
preferred walking speed. These findings suggest the development of an adaptive
AHV system which could provide a lower resolution/faster display mode while
a person is moving, and a higher resolution/slower display when a person has
ceased movement.
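The adaptive display idea can be sketched as a simple mode switch. This is an illustration only: the pairing of resolution and frame rate values reuses the levels tested in this experiment, while the speed threshold, the source of the speed estimate and all names are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayMode:
    columns: int   # phosphenes across
    rows: int      # phosphenes down
    fps: int       # display updates per second


# Assumed pairings built from the levels used in this experiment.
MOVING_MODE = DisplayMode(columns=16, rows=12, fps=4)       # coarser, faster
STATIONARY_MODE = DisplayMode(columns=32, rows=24, fps=1)   # finer, slower


def select_display_mode(speed_m_per_s, moving_threshold=0.1):
    """Pick a display mode from an estimate of the wearer's walking speed.

    The threshold (m/s) is a placeholder; a real system would need to
    tune it and to smooth the speed estimate to avoid rapid switching.
    """
    if speed_m_per_s > moving_threshold:
        return MOVING_MODE
    return STATIONARY_MODE


mode_while_walking = select_display_mode(0.5)
mode_while_standing = select_display_mode(0.0)
```

The two-mode switch is the simplest design. Intermediate modes or a gradual trade-off between resolution and frame rate would also be consistent with the findings above.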
Interestingly, three participants reported useful echolocation from nearby
partitions as they were walking. One participant reported trying to use sound to
assist with navigation.
Effects of gender, age, corrected vision and VR experience
Although there were some significant interactions for particular obstacles, prior
experience with VR was not found to be an important factor in improved mo-
bility performance using the simulator. Similarly, age and corrected vision were
not significantly associated with improved performance (despite some interaction
effects).
There was a significant reduction in obstacle contacts during the first trial by
female participants. This may reflect a more cautious approach during this trial
rather than innate gender differences. The finding was not repeated in the second
trial. The obstacle avoidance finding suggests that a confounding variable may
have been the gender of the experimenter. Therefore a balance between both
male and female experimenters should be used for further studies investigating
gender differences in mobility.
Learning effects
It would be interesting to assess the effect resolution and frame rate have on
mobility over a number of repeated trials. However, it would be difficult and
time-consuming to maintain a sufficient number of participants for reasonable
statistical results over a period of time. Learning effects have been found in many
AHV simulation studies (for example, [29], [31] and [70]). Mean scores generally
improved between the first and second trials in the current experiment, however
an extraneous variable could be the level of confidence each participant felt while
being effectively blindfolded in a strange environment. Some participants also
required time to adjust to the location of the camera and the associated difference
in display viewing angle from their usual vision. The following suggestions, drawn
from participant feedback and observation during the sessions, may
enhance future AHV mobility performance:
• During training, to assist in obstacle avoidance, allow participants to ob-
serve the increased rate of expansion from a high-contrast looming object
as they walk toward it.
• Advise participants to adjust their walking speed to the speed of display
(for example, 1 step per display update).
• Demonstrate the width of the camera field of view (FOV) by showing an
object of a known width (for example, a doorway) and allow the participant
to touch the object.
• To reduce veering, show participants the black tape marking the path
boundaries and ask participants to touch it.
• Suggest using slow head movements to compensate for the narrow displayed
FOV (see insect vision peering behaviour comments below). However, ex-
plain that faster head movements may result in image corruption due to
motion blur.
In addition, some participants tended to point the head mounted camera too
high to locate the path boundaries. Therefore, an artificial horizon indicator may
be useful to assist with camera orientation.
8.4.2 Can objective measures be developed for the comparison of effectiveness between AHV systems in providing mobility information?
The highly significant relationships between pre- and post-trial Preferred Walk-
ing Speed offer some support for the use of the PPWS method as a mobility
assessment measure for AHV research. However, this relationship is artificially
enhanced as PWS scores from the beginning and end of sessions were averaged.
In fact the correlation between PWS scores was lower (r = 0.74) than the times
spent on the mobility course during trial 1 and trial 2 (r = 0.87). The difference
in PWS scores is probably due to learning effects (as the first measurement takes
place soon after participants wear the AHV simulator for the first time). Therefore
the stability of PPWS should improve as people spend more time training
with the simulator.
The mean PPWS results for this experiment range from 24.5 for 16x12 phosphene
resolution at 2 FPS to 40.0 for 32x24 resolution at 4 FPS. These results are higher
than those reported in Chapter 7 (where the overall mean PPWS was 18.14), indicating
that the weight (and discomfort) of the PDA head-gear may have affected
walking speed in that study. Participants generally moved at a slow pace, and spent time
scanning for both obstacles and the edges of the path. However, these results are
similar to those reported in Jones et al. [117], who recorded PPWS while inves-
tigating eight visually impaired adults and the effectiveness of an image based
ETA.
In conclusion, time spent walking through the course, combined with veering
and obstacle contacts form the basis for an objective method to assess the effects
of different image processing methods in both simulated and real AHV systems.
This method of assessment could also be extended to comparing different blind
mobility aids with an implanted AHV system, for example comparing the freeware
vOICe auditory electronic aid for the blind (which is limited to presenting one
frame per second [150]) with simulated AHV.
8.4.3 Can computer vision techniques be adopted and modified to provide mobility information in an AHV system?
The experimental hardware and software performed reliably. No participants
reported nausea during the experiment, although two required a rest between
trials. The front of the HMD sometimes became warm during the experiment, due
to the shroud attached to block external light. One hardware constraint in this
study was the narrow 34° field of view (FOV) of the Swann USB camera, which is
a similar constraint to current generation night-vision goggles (e.g. [92]). However,
an image captured with a wider FOV may not necessarily enhance mobility, as
the spatial resolution will still need to be greatly reduced for an electrode array.
It would be useful in future work to compare the effect of different camera fields
of view on mobility.
8.4.4 Connections with Vision Research
Insects have fixed visual systems, with fixed focus, that lack stereoscopy, but are
still able to determine the distance to features within their environment with
enough precision to exhibit safe mobility [211]. It has been demonstrated that
the insect principles for mobility can be effectively translated to solve mobility
problems in the context of autonomous systems [211].
The limitations of the insect visual system appear similar to the features of
the mobility problem faced by participants in the experiment described in this
chapter. These features include safe mobility using a fixed focus, low resolution
and a monoscopic vision system. Therefore the mobility strategies used by insects
may provide some insight into the principles for mobility for AHV system users.
Within the biology community, it is believed that optical flow is a fundamental
quantity in enabling safe mobility of insects [211]. As discussed in Chapter 6,
optical flow is the apparent motion of brightness patterns in an image
sequence. An insect can use the apparent motion of objects in its environment to
make a good estimate of their distance. For the optical information to be reliable,
the motion of a brightness pattern must be observed. If self motion is too fast,
the apparent brightness pattern is too fast to see; if self motion is too slow, no
apparent motion occurs. It is believed that some insects (such as grasshoppers)
artificially generate optical flow when stationary by exhibiting peering behaviour
(translation of the head) [211]. Insects have also developed large fields of view to
improve the robustness of navigation based on their low resolution, monoscopic
information.
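The link between self motion, apparent motion and distance can be written down for the simplest case. The sketch below assumes pure translation at a known speed past a stationary feature, using the standard motion parallax relation (angular velocity = speed x sin(bearing) / distance). It is an idealisation for illustration, not a model of insect vision from [211], and the function name and example values are assumptions.

```python
import math


def distance_from_flow(speed_m_per_s, bearing_deg, flow_deg_per_s):
    """Estimate the distance (m) to a stationary feature.

    speed_m_per_s  : observer's translation speed
    bearing_deg    : angle between the heading and the feature
    flow_deg_per_s : measured apparent angular motion of the feature
    """
    if flow_deg_per_s <= 0:
        raise ValueError("no apparent motion: distance cannot be estimated")
    flow_rad_per_s = math.radians(flow_deg_per_s)
    return speed_m_per_s * math.sin(math.radians(bearing_deg)) / flow_rad_per_s


# Example: walking at 0.6 m/s, a feature 30 degrees off the heading
# appears to move at 5 degrees per second.
estimated_distance_m = distance_from_flow(0.6, 30.0, 5.0)
```

The error raised for zero flow mirrors the point made above: when self motion is too slow, no apparent motion occurs and distance cannot be recovered.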
These observations provide one possible explanation as to why walking speed
was found to be strongly dependent on frame rate. At a fixed resolution and low
frame rate, there may be a threshold below which there is insufficient optical flow
information. As the frame rate increases, motion at a faster speed is possible.
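One way to make this explanation concrete is to estimate how far a feature shifts across the phosphene display between successive frames. The sketch below is a rough illustration under stated assumptions: pure translation past a stationary feature, a uniform mapping of the 34° camera field of view onto the display columns, and hypothetical function and parameter names.

```python
import math


def shift_per_frame(speed_m_per_s, distance_m, bearing_deg, fps,
                    columns=32, fov_deg=34.0):
    """Approximate shift between frames, in phosphene widths."""
    if distance_m <= 0 or fps <= 0:
        raise ValueError("distance and frame rate must be positive")
    # Apparent angular motion of the feature (motion parallax relation).
    flow_rad_per_s = (speed_m_per_s * math.sin(math.radians(bearing_deg))
                      / distance_m)
    shift_deg = math.degrees(flow_rad_per_s) / fps
    # Assumes the field of view is spread evenly over the columns.
    return shift_deg * columns / fov_deg


# Same walking speed and feature, displayed at 1 and at 4 frames per second.
shift_at_1_fps = shift_per_frame(0.5, 2.0, 17.0, fps=1)
shift_at_4_fps = shift_per_frame(0.5, 2.0, 17.0, fps=4)
```

For a given walking speed the shift per frame is inversely proportional to the frame rate, so a higher frame rate allows faster walking before features jump too far between updates to be followed.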
Another observation from insect mobility is that both peering type behaviour
and head rotation behaviours (to increase the effective field-of-view) improve the
information available and therefore improve the safety of mobility. This behaviour
was demonstrated by participants in the current simulated AHV experiment, and
has also been reported in previous AHV simulation research [28] and research
on auditory vision substitution devices [7].
Another aspect of frame rate can be understood in terms of existing results
from the computer vision community. It has previously been shown that the effec-
tive resolution of one image frame within an image sequence can be improved by
considering the information contained in other image frames [127]. This process
of improving the effective resolution is known as super-resolution. In the context
of AHV systems, the principle of super-resolution suggests that higher frame rates
can effectively increase the resolution of information available to participants.
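The principle can be shown with a deliberately idealised example: two low-resolution frames of the same one-dimensional scene, sampled half a low-resolution pixel apart, are interleaved to recover the finer sampling. This is only the simplest case of the multi-frame idea, assuming a known offset and noise-free point sampling. It is not the method of [127], and the function name is hypothetical.

```python
def interleave_frames(frame_a, frame_b):
    """Merge two low-resolution rows sampled half a pixel apart.

    frame_a holds the even-position samples of the finer grid and
    frame_b the odd-position samples.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same length")
    merged = []
    for a, b in zip(frame_a, frame_b):
        merged.extend((a, b))
    return merged


# A fine-grained scene row, and two coarse frames taken with a small
# camera movement between them.
scene = [0, 1, 4, 9, 16, 25, 36, 49]
frame_a = scene[0::2]   # first frame
frame_b = scene[1::2]   # second frame, offset by half a coarse pixel
recovered = interleave_frames(frame_a, frame_b)
```

In practice the offset between frames is unknown and must be estimated, which is one reason small head movements combined with a higher frame rate could add usable detail.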
8.5 Chapter Summary
The research described in this chapter fills a gap in the AHV literature, by pro-
viding evidence that a method of mobility assessment adapted from the low vision
community (time spent walking through mobility course, obstacles and veering)
can be used as a practical method to objectively assess AHV system technology.
For example, this method could be easily used to compare the effects of different
image processing algorithms. In this chapter, this assessment method was used
with a custom AHV simulator to investigate the effects of frame rate and res-
olution. Higher spatial resolution was important for accurate walking (reduced
veering), and higher frame rate resulted in faster walking speeds. Female partic-
ipants made contact with a significantly lower number of obstacles than males.
Prior experience with immersive virtual reality, age and corrected vision did not
significantly affect mobility performance.
Chapter 9
Conclusion and Future Work
This chapter contains a summary of the work presented in each chapter of this
thesis. Additionally, conclusions are drawn and possible avenues for future work
are identified. The main scientific contributions of this work are summarised in
Table 9.1.
9.1 Conclusions
This thesis has provided thorough reviews of blind and low vision mobility, Arti-
ficial Human Vision (AHV) and computer vision. The original work in this thesis
is primarily aimed at two particular areas of AHV mobility:
1. The first aim of this thesis was to investigate, evaluate and develop
techniques for mobility assessment which will allow the objec-
tive comparison of different AHV system phosphene presentation
methods. The lack of an objective method for comparing different AHV
system displays, in addition to comparing AHV systems and other blind
mobility aids (such as the long cane), has been identified by other authors
as a significant problem. In this thesis a number of different methods have
Table 9.1: Summary of the main scientific contributions of this thesis.
1 A conceptual framework based on literature reviews of blind and low vision
mobility, AHV technology, and computer vision. This framework incorporates
a comprehensive number of factors which affect the effectiveness of information
presentation in an AHV system.
2 The adaptation of a mobility assessment method from the blind and low vision
literature to measure simulated AHV mobility performance using real-time
computer based analysis. This method of mobility assessment (based on
parameters for walking speed, obstacle contacts and veering) is demonstrated
experimentally in two different indoor mobility courses.
3 The development and evaluation of an original real-time looming obstacle
detector, based on coarse optical flow, and implemented on a Windows PocketPC
based Personal Digital Assistant (PDA) using a CF card camera.
4 The development of a novel head-mounted Windows PocketPC PDA based AHV
simulator.
5 The development of a novel Windows XP based AHV simulation with an immersive
Head Mounted Display.
been developed to evaluate differences in the perception of information from
phosphene displays. These methods have included a computer-based static
image simulation, a PDA-based simulation display, and a Virtual Reality
Head Mounted Display.
2. The second aim was to develop a display framework for the presenta-
tion of AHV system information, and use this framework to guide
the development of an AHV simulation device. A novel framework
for AHV system information display was developed and presented in Chap-
ter 4. This framework has been based on the literature reviews of blind-
ness, blind mobility, AHV systems, and computer vision. The framework
includes the main factors which impact on a blind traveller. The main ben-
efits of using this framework are enhanced communication between AHV
researchers and the ability to explore experimentally and compare different
factors (such as age or gender, different types of computer vision methods,
and different environments). Experimental work contained in this thesis
has been guided by this original framework.
The research questions which have been addressed in this thesis are:
9.1.1 Can specific main factors be identified as highly significant for providing mobility information in an AHV system?
Chapter 2 provided a review of major mobility issues which a blind or vision
impaired person might experience. The main hazardous situations for blind mo-
bility were identified as drop-offs, obstacles and fast moving objects. Mobility
aids, both traditional and electronic were also reviewed. The main benefit of
these devices is that they provide additional preview information to blind pedes-
trians. As mentioned above, one of the novel contributions in this thesis is the
conceptual framework based on literature reviews of blind and low vision mobility,
AHV, and computer vision presented in Chapter 4. This framework incorporates
a comprehensive number of factors which affect the effectiveness of information
presentation in an AHV system. Experiments reported in this thesis have investigated
a number of these factors. In Chapter 5, it was found that less cluttered
images resulted in the highest number of correct recognitions of mobility-related
scene components.
In the experiment reported in Chapter 8 it was found that higher spatial
resolution is associated with accurate walking (reduced veering), whereas higher
frame rate is associated with faster walking speeds. This finding supports the
development of an adaptive AHV system, with dynamic adjustment of display
properties in real-time. This experiment also found that prior experience with
immersive VR was not an important factor in improved mobility performance
using the simulator. Similarly, age and corrected vision were not significantly
associated with improved performance.
9.1.2 Can objective measures be developed for the comparison of effectiveness between AHV systems in providing mobility information?
In this thesis a number of different methods have been used to evaluate differences
in the perception of information.
In Chapter 5 novel computer-based static image software was developed
which demonstrated that a static-image based AHV simulation can provide useful
information regarding mobility. The advantages of using a static image approach
for such studies include portability, control of extraneous variables, ease of
participant recruitment and ease of data recording. However, a number of important
mobility-related sensations (such as auditory, tactile, kinesthetic and proprioceptive
input) are not considered in a static image study. Because these sensations
strongly influence mobility, their absence severely limits the deductions that can
be made about mobility. Therefore, the static image method was not used for the
remainder of the thesis.
A PDA-based AHV simulator was developed in Chapter 6. This simulator
was used in a pilot artificial mobility course study, described in Chapter 7. The
simulator demonstrated the feasibility of capturing, processing and displaying
images using a PDA device. In this pilot study it was found that all participants
were able to improve mobility performance, measured by PPWS and mobility
errors, over two trials. However, the PDA simulator and head brace bracket
were found to be heavy after a few minutes of mobility assessment, and this may
have altered the movement of participants (which could have affected mobility
performance).
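The PPWS measure used in the pilot study can be illustrated with a minimal sketch. It assumes the usual definition of PPWS (percentage of preferred walking speed: speed on the mobility course divided by the participant's unobstructed preferred speed, multiplied by 100). The function names and the distances and times below are hypothetical and are not data from this thesis.

```python
def walking_speed(distance_m, time_s):
    """Average walking speed in metres per second."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return distance_m / time_s


def ppws(course_speed, preferred_speed):
    """Percentage of preferred walking speed (assumed standard definition)."""
    if preferred_speed <= 0:
        raise ValueError("preferred_speed must be positive")
    return 100.0 * course_speed / preferred_speed


# Hypothetical values for one participant, not results from the thesis.
preferred = walking_speed(20.0, 16.0)  # unobstructed walk: 1.25 m/s
trial_1 = walking_speed(30.0, 60.0)    # first trial on the course: 0.5 m/s
trial_2 = walking_speed(30.0, 48.0)    # second trial on the course: 0.625 m/s

print(ppws(trial_1, preferred), ppws(trial_2, preferred))  # prints "40.0 50.0"
```

Normalising by each participant's own preferred speed allows an improvement between trials to be compared across participants who naturally walk at different speeds.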
Chapter 8 presented the results of an experiment using a custom VR HMD-based
simulator and an artificial mobility course. This experiment provided
evidence that a method of mobility assessment adapted from the low vision
community (time spent walking through the mobility course, obstacles and veering) can
be used as a practical and useful method to objectively assess AHV system
technology.
9.1.3 Can computer vision techniques be adopted and
modified to provide mobility information in an AHV
system?
Computer vision methods provide a critical link between the camera and electrode
array of an effective AHV system. All of these systems currently need to reduce
the resolution and the number of colours of captured images to match the number
of stimulating electrodes. This thesis has demonstrated the use of a number of
computer vision techniques to provide mobility information.
Chapter 4 examined the main computer vision methods for the reduction of
unimportant image information, and the extraction of important features from
images. A number of prototype systems to assist the blind were then reviewed
to illustrate how these methods can be applied. Despite the number of systems
reviewed, many struggle to provide useful mobility information in real-time, and
only one has received wide acceptance (viz. the auditory vOICe system).
In Chapter 5, four different methods were used to process static images. The
256 grey-level image type resulted in significantly better recognition of mobility
components than binary or edge detected image types. There was no significant
difference found between the two types of edge detection (the Canny and
Sobel methods). These results did not support the use of edge detection for low
resolution static images.
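As an illustration of one of the edge detected image types compared above, the following is a minimal sketch of Sobel edge detection on a small grey-level image stored as a list of rows. It is a generic textbook formulation, not the software used in Chapter 5; the function names and the threshold value are assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def to_binary_edges(magnitude, threshold):
    """Mark pixels whose gradient magnitude reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row] for row in magnitude]


# A 5x5 test image with a vertical step from black (0) to white (255).
step_image = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = to_binary_edges(sobel_magnitude(step_image), threshold=500)
```

At very low resolutions each edge pixel occupies a large share of the display, which is one possible reason why an edge map may carry less usable information than the grey-level image it was derived from.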
Chapter 6 described the development and evaluation of an original real-time
looming obstacle detector, based on coarse optical flow, and implemented on
a Windows PocketPC based PDA using a Compact Flash (CF) card camera.
This method detected obstacles which were looming in front of the camera and
provided an alert to the wearer. The results of two experiments at four different
lighting levels indicated that the initial segmentation and adequate lighting were
significant factors in system performance. The accuracy of alerts ranged from
100% for a sequence captured in the early afternoon, down to 25% for an image
captured in the late afternoon.
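The idea behind looming detection from coarse optical flow can be sketched as follows. This is not the detector developed in Chapter 6, whose segmentation and flow estimation steps are not reproduced here. It is a generic illustration which assumes that a coarse grid of (dx, dy) flow vectors has already been estimated, and which treats a strong average outward (expanding) flow as a sign of an approaching object. The function names and the threshold are hypothetical.

```python
def expansion_score(flow):
    """Mean outward (radial) flow component over a coarse grid.

    flow[r][c] is a (dx, dy) displacement for the grid cell at row r, column c.
    Positive scores indicate flow away from the grid centre (expansion).
    """
    rows, cols = len(flow), len(flow[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            rx, ry = c - cx, r - cy
            norm = (rx * rx + ry * ry) ** 0.5
            if norm == 0:
                continue  # the centre cell has no radial direction
            dx, dy = flow[r][c]
            total += (dx * rx + dy * ry) / norm
            count += 1
    return total / count if count else 0.0


def is_looming(flow, threshold=0.5):
    """Raise an alert when the average expansion exceeds the threshold."""
    return expansion_score(flow) > threshold


# Synthetic 3x3 flow fields: expanding, sideways translation, contracting.
expanding = [[(c - 1, r - 1) for c in range(3)] for r in range(3)]
translating = [[(1, 0) for _ in range(3)] for _ in range(3)]
contracting = [[(1 - c, 1 - r) for c in range(3)] for r in range(3)]
```

A pure sideways translation gives a score near zero because the radial components cancel across the grid, so camera panning alone does not trigger an alert under this measure.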
The VR HMD-based simulator presented in Chapter 8 reduced the number of
grey-levels and resolution of images captured from a head mounted camera, and
provided realistic simulated arrays of phosphenes to experiment participants.
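The reduction of resolution and grey-levels described above can be sketched as block averaging followed by quantisation. This is a minimal illustration under stated assumptions, not the simulator's actual code: the image is a list of rows of grey values in the range 0 to 255, its size is an exact multiple of the phosphene grid, and the function name and parameters are hypothetical.

```python
def to_phosphene_grid(image, grid_rows, grid_cols, levels):
    """Block-average a grey-level image down to a grid and quantise it.

    Returns a grid of level indices in the range 0 .. levels - 1, one per
    simulated phosphene.
    """
    h, w = len(image), len(image[0])
    if h % grid_rows or w % grid_cols:
        raise ValueError("image size must be a multiple of the grid size")
    if levels < 2:
        raise ValueError("at least two grey levels are required")
    bh, bw = h // grid_rows, w // grid_cols
    step = 255 / (levels - 1)  # spacing between evenly spaced grey levels
    grid = []
    for gr in range(grid_rows):
        row = []
        for gc in range(grid_cols):
            block = [image[y][x]
                     for y in range(gr * bh, (gr + 1) * bh)
                     for x in range(gc * bw, (gc + 1) * bw)]
            mean = sum(block) / len(block)
            row.append(round(mean / step))  # index of the nearest grey level
        grid.append(row)
    return grid


# A 4x4 image made of four uniform 2x2 blocks, reduced to 2x2 phosphenes.
sample = [[0, 0, 255, 255],
          [0, 0, 255, 255],
          [100, 100, 200, 200],
          [100, 100, 200, 200]]
print(to_phosphene_grid(sample, 2, 2, levels=4))  # prints "[[0, 3], [1, 2]]"
```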
9.2 Future Work
A number of avenues for future work have been identified during the undertaking
of this thesis. These are summarised below:
9.2.1 Mobility experiments with AHV system recipients
There is a limited amount of published data regarding AHV system recipients.
This appears to be due to commercial reasons (for example, the lack of reported
outcomes from the Dobelle cortical AHV system), or because the technology
is still being tested on animals (for example, much of the current retinal im-
plant research). Therefore a number of assumptions must be made regarding
the phosphene display which a person may perceive (for example, the shape and
layout of phosphenes or temporal resolution). As AHV technology (particularly
electrode array technology) develops over the next five to ten years it should be
possible to provide more realistic and generalizable simulations. Ultimately it
would be beneficial to measure the mobility performance of a number of AHV
system recipients objectively, and use these findings to drive system improve-
ments.
9.2.2 Symbolic display
There is a large amount of human factors research on Human Computer Inter-
face (HCI) and the perception of information by people (such as in aircraft dis-
plays). This research could be applied and extended to provide a more effective
AHV system display. For example, object recognition, discussed in Chapter 4,
could provide a simplified display for blind people (such as displaying a standard
symbol for a doorway or tactile strip). A standard for ‘phosphene
menus’ could also be developed and assessed. Other fields of study which
overlap with an AHV symbolic display include wearable computing, mobile
communication, augmented reality and virtual reality.
9.2.3 Real world mobility assessment environments
This thesis has described experiments involving static images and an indoor
artificial mobility course. However, as discussed in Chapter 2, real world mobility
assessment would also be a useful method of comparing and assessing the effec-
tiveness of simulated AHV displays. In addition, self reported observations from
AHV system recipients would provide valuable information on different display
methods.
9.2.4 Integration of information from other sensors
Current generation AHV systems are based on image data captured from a single
head-mounted camera. However, there has been a large amount of research on the
development of different sensor based ETAs for the blind (using ultrasound or
laser reflection). It could be useful to integrate information from these sensors into
an AHV system, such as using ultrasound to provide a phosphene ‘distance map’.
Additionally, the use of multiple cameras could be a useful source of
depth information. Finally, although technically a navigation aid (rather than
a mobility aid), the integration of GPS data with an AHV system would allow
current location and directional maps to be displayed (possibly in a symbolic
format).
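The phosphene ‘distance map’ suggested above could, for example, map each range reading to a phosphene brightness level so that nearer objects appear brighter. The sketch below is purely illustrative: the sensing range (0.5 m to 4.0 m), the number of brightness levels and the function names are assumptions, not properties of any existing ETA or AHV system.

```python
def distance_to_brightness(distance_m, near_m=0.5, far_m=4.0, levels=4):
    """Map a range reading to a brightness level; nearer objects are brighter."""
    clamped = min(max(distance_m, near_m), far_m)
    closeness = (far_m - clamped) / (far_m - near_m)  # 1.0 at near_m, 0.0 at far_m
    return round(closeness * (levels - 1))


def distance_map(readings, **kwargs):
    """Convert a grid of range readings (metres) into brightness levels."""
    return [[distance_to_brightness(d, **kwargs) for d in row]
            for row in readings]


# Hypothetical 2x2 grid of range readings in metres.
print(distance_map([[0.5, 3.0], [4.0, 10.0]]))  # prints "[[3, 1], [0, 0]]"
```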
9.2.5 Standard set of mobility related images
One approach which has been successfully applied in the field of Information
Retrieval [155] and could be useful for AHV research is the development of a
standard set of images/image sequences for evaluation and comparability of com-
puter vision methods. The main benefit of a standard is that different algorithms
could be objectively tested against each other and measured for efficiency and
effectiveness. Sample image sequences might be captured from a subject walking
at normal pace toward different obstacles, over different drop-offs or toward a
door. Different environments (such as indoors/outdoors) and lighting conditions
could also be included.
9.3 Final Remarks
The field of AHV and mobility research is entering a period of challenges as the
enabling technology is developed in advance of our knowledge and understanding
of the human factor aspects. This research has attempted to examine and
stimulate some investigative paths in this space. It should be seen as a beginning,
rather than as a conclusive or advanced stage, in what will necessarily be
a lengthy process, requiring many different perspectives and approaches to be
considered, given the human, subjective nature of this topic.
Bibliography
[1] R. Allison, L. Harris, M. Jenkin, U. Jasiobedzka, and J. Zacher, “Tolerance
of temporal delay in virtual environments,” in Proceedings of Virtual Reality
2001, pp. 247–254, 2001.
[2] American National Standards Institute, Inc., “ATIS Telecom Glossary 2000,”
(accessed July 2006).
[3] J. Andersen and E. Seibel, “Real-time hazard detection via machine vision
for wearable low vision aids,” in Fifth International Symposium on Wearable
Computers (ISWC’01), pp. 182–183, 2001.
[4] C. Archambeau, J. Delbeke, C. Veraart, and M. Verleysen, “Prediction of
visual perceptions with artificial neural networks in a visual prosthesis for
the blind,” Artificial Intelligence in Medicine, vol. 32, no. 3, pp. 183–194,
2004.
[5] C. Archambeau, J. Delbeke, and M. Verleysen, “Classification of visual sen-
sations generated electrically in the visual field of the blind,” in Proceedings
of the 5th IFAC symposium on Modelling and Control in Biomedical Systems,
(Melbourne, Australia), pp. 223–228, 2003.
[6] J. D. Armstrong, “Evaluation of man-machine systems in the mobility of
the visually handicapped,” in Human factors in health care (R. Pickett and
T. Triggs, eds.), pp. 331–343, Lexington: Lexington Books, 1975.
[7] P. Arno, A. Vanlierde, E. Streel, M. Wanet-Defalque, S. Sanabria-Bohorquez,
and C. Veraart, “Auditory substitution of vision: pattern recognition by the
blind,” Applied Cognitive Psychology, vol. 15, no. 5, pp. 509–519, 2001.
[8] M. Bak, J. Girvin, F. Hambrecht, C. Kufta, G. Loeb, and E. Schmidt,
“Visual sensations produced by intracortical microstimulation of the human
occipital cortex,” Medical & Biological Engineering & Computing, vol. 28,
pp. 257–259, 1990.
[9] D. Ballard, “Generalizing the Hough transform to detect arbitrary shapes,”
Pattern Recognition, vol. 13, no. 2, pp. 111–122, 1981.
[10] J. Barron, D. Fleet, and S. Beauchemin, “Performance of optical flow tech-
niques,” International Journal of Computer Vision, vol. 12, no. 1, pp. 43–77,
1994.
[11] O. Baruth, R. Eckmiller, and D. Neumann, “Retina encoder tuning and data
encryption for learning retina implants,” in Proceedings of the International
Joint Conference on Neural Networks, vol. 2, pp. 1249–1252, 2003.
[12] M. Becker, M. Braun, and R. Eckmiller, “Retina implant adjustment with
reinforcement learning,” in Proceedings of the 1998 IEEE International Con-
ference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1181–1184,
1998.
[13] M. Becker, R. Eckmiller, and R. Hunermann, “Psychophysical test of a tun-
able retina encoder for retina implants,” in Proceedings of the International
Joint Conference on Neural Networks, vol. 1, pp. 192–195, 1999.
[14] B. Bentzen, “Environmental accessibility,” in Foundations of Orientation
and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York: American
Foundation for the Blind, 2nd ed., 1997.
[15] B. Bentzen and J. Barlow, “Impact of curb ramps on the safety of persons
who are blind,” Journal of Visual Impairment & Blindness, vol. 89, pp. 319–
328, 1995.
[16] K. Boahen, “A retinomorphic vision system,” IEEE Micro, vol. 16, no. 5,
pp. 30–39, 1996.
[17] R. G. Boothe, Perception of the visual environment. New York: Springer-
Verlag, 2002.
[18] J. Boyle, Improving Perception From Electronic Visual Prostheses. PhD
thesis, Queensland University of Technology, 2005.
[19] J. R. Boyle, A. J. Maeder, and W. W. Boles, “Can environmental knowledge
improve perception with electronic visual prostheses?,” in Proceedings of the
World Congress on Medical Physics and Biomedical Engineering (WC2003),
(Sydney, Australia), 2003.
[20] J. Boyle, A. Maeder, and W. Boles, “Inherent visual information for low
quality image presentation,” in Proceedings of the 2003 APRS workshop on
digital image computing, (Brisbane, Australia), pp. 51–56, 2003.
[21] J. Brabyn, “A review of mobility aids and means of assessment,” in Elec-
tronic Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.),
pp. 13–27, Dordrecht: Martinus Nijhoff Publishers, 1985.
[22] M. Brambring, “Mobility and orientation processes of the blind,” in Elec-
tronic Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.),
pp. 493–508, Dordrecht: Martinus Nijhoff Publishers, 1985.
[23] G. S. Brindley, “Sensations produced by electrical stimulation of the occip-
ital poles of the cerebral hemispheres, and their use in constructing visual
prostheses,” Annals Of The Royal College Of Surgeons Of England, vol. 47,
no. 2, pp. 106–108, 1970.
[24] V. Bruce, P. Green, and M. Georgeson, Visual perception. Psychology Press:
New York, 4th ed., 2003.
[25] C. J. C. Burges, “A tutorial on support vector machines for pattern recog-
nition,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167,
1998.
[26] J. Canny, “A computational approach to edge detection,” IEEE Transac-
tions on Pattern Analysis and Machine Intelligence, vol. 8, pp. 679–698,
1986.
[27] K. Cha, K. Horch, and R. Norman, “Reading speed with a pixelized vi-
sion system,” Journal of the Optical Society of America A-Optics & Image
Science, vol. 9, no. 5, pp. 673–677, 1992.
[28] K. Cha, K. Horch, and R. Normann, “Simulation of a phosphene field based
visual prosthesis,” in Proceedings of the IEEE International Conference on
Systems, Man and Cybernetics, pp. 921–923, 1990.
[29] K. Cha, K. Horch, and R. Normann, “Mobility performance with a pixelised
vision system,” Vision Research, vol. 32, no. 7, pp. 1367–1372, 1992.
[30] H. Chawla, Essential Ophthalmology. Edinburgh: Churchill Livingstone, 1981.
[31] S. C. Chen, L. Hallum, N. Lovell, and G. J. Suaning, “Visual acuity mea-
surement of prosthetic vision: a virtual-reality simulation study,” Journal of
Neural Engineering, vol. 2, pp. 135–145, 2005.
[32] X. Chen and A. Yuille, “A time-efficient cascade for real-time object de-
tection: With applications for the visually impaired,” in IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, vol. 3,
pp. 28–28, 2005.
[33] A. Chow, “Artificial retina device.” Optobionics Corporation, 1991.
[34] A. Chow, “First trials and future technologies for artificial retinas,” in Pro-
ceedings of the 14th Annual Meeting of the IEEE Lasers and Electro-Optics
Society, vol. 2, pp. 734–735, 2001.
[35] A. Y. Chow and V. Y. Chow, “Subretinal electrical stimulation of the rabbit
retina,” Neuroscience Letters, vol. 225, no. 1, pp. 13–16, 1997.
[36] A. Chow, V. Chow, M. Pardue, G. Peyman, C. Liang, J. Pearlman, and
N. Peachey, “The semiconductor-based microphotodiode array artificial sil-
icon retina,” in Proceedings of the IEEE International Conference on Sys-
tems, Man, and Cybernetics, vol. 4, pp. 404–408, 1999.
[37] A. Chow, M. Pardue, V. Chow, G. Peyman, C. Liang, J. Perlman, and
N. Peachey, “Implantation of silicon chip microphotodiode arrays into the cat
subretinal space,” IEEE Transactions on Neural Systems and Rehabilitation
Engineering, vol. 9, no. 1, pp. 86–95, 2001.
[38] A. Y. Chow and N. S. Peachey, “The subretinal microphotodiode array reti-
nal prosthesis,” Ophthalmic Research, vol. 30, pp. 195–198, 1998.
[39] V. Chowdhury, J. W. Morley, and M. T. Coroneo, “An in-vivo paradigm
for the evaluation of stimulating electrodes for use with a visual prosthesis,”
ANZ Journal of Surgery, vol. 74, no. 5, pp. 372–378, 2004.
[40] V. Chowdhury, J. W. Morley, and M. T. Coroneo, “Surface stimulation of
the brain with a prototype array for a visual cortex prosthesis,” Journal of
Clinical Neuroscience, vol. 11, no. 7, pp. 331–341, 2004.
[41] D. D. Clarke-Carter, A. D. Heyes, and C. Howarth, “The efficiency and walk-
ing speed of visually impaired people,” Ergonomics, vol. 29, no. 6, pp. 779–
789, 1986.
[42] M. Coimbra and M. E. Davies, “Approximating optical flow within the
MPEG-2 compressed domain,” IEEE Transactions on Circuits and Systems
for Video Technology, vol. 15, no. 1, pp. 96–100, 2005.
[43] B. Cyganek and J. Borgosz, Computer platform for transformation of vi-
sual information into sound sensations for vision impaired people, vol. 2626.
Springer: Berlin, 2003.
[44] M. Czerwinski, D. S. Tan, and G. G. Robertson, “Women take a wider view,”
in Proceedings of the SIGCHI conference on Human factors in computing
systems CHI ’02, (New York, USA), pp. 195–202, 2002.
[45] G. Dagnelie, “Toward an artificial eye,” IEEE Spectrum, vol. 33, no. 5,
pp. 20–29, 1996.
[46] G. Dagnelie, “Visual prosthetics 2006: Assessment and expectations,” Expert
Review of Medical Devices, vol. 3, no. 3, pp. 315–325, 2006.
[47] G. Dagnelie, D. Barnett, M. Humayun, and R. Thompson Jr., “Paragraph
text reading using a pixelized prosthetic vision simulator: Parameter depen-
dence and task learning in free-viewing conditions,” Investigative Ophthal-
mology and Visual Science, vol. 47, pp. 1241–1250, 2006.
[48] W. Dobelle, “Artificial vision for the blind by connecting a television camera
to the brain,” ASAIO Journal, vol. 46, no. 1, pp. 3–9, 2000.
[49] W. H. Dobelle and M. G. Mladejovsky, “Phosphenes produced by electrical
stimulation of human occipital cortex, and their application to the devel-
opment of a prosthesis for the blind,” The Journal Of Physiology, vol. 243,
no. 2, pp. 553–576, 1974.
[50] W. Dobelle, M. Mladejovsky, J. Evans, T. Roberts, and J. Girvin, “‘Braille’
reading by a blind volunteer by visual cortex stimulation,” Nature, vol. 259,
no. 5539, pp. 111–112, 1976.
[51] A. Dodds, “Evaluating mobility aids: an evolving methodology,” in Elec-
tronic Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.),
pp. 191–200, Dordrecht: Martinus Nijhoff Publishers, 1985.
[52] A. Dodds, Mobility Training for Visually Handicapped People: A Person-
Centred Approach. London: Croom Helm, 1988.
[53] A. Dodds, Rehabilitating Blind and Visually Impaired People. London:
Chapman and Hall, 1993.
[54] A. G. Dodds, D. D. Carter, and C. I. Howarth, “Improving objective mea-
sures of mobility,” Journal of Visual Impairment & Blindness, vol. 77, no. 9,
p. 438, 1983.
[55] A. G. Dodds and D. P. Davis, “Assessment and training of low vision clients
for mobility,” Journal of Visual Impairment & Blindness, pp. 439–446, 1989.
[56] E. R. Dougherty, An introduction to morphological image processing. SPIE
Optical Engineering Press, 1992.
[57] J. Dowling, A. J. Maeder, and W. W. Boles, “Mobility assessment using
simulated artificial human vision,” in IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (CVPR 2005), vol. 3, pp. 32–
32, 2005.
[58] R. Eckhorn, M. Wilms, T. Schanze, M. Eger, L. Hesse, U. T. Eysel, Z. F.
Kisvárday, E. Zrenner, F. Gekeler, and H. Schwahn, “Visual resolution with
retinal implants estimated from recordings in cat visual cortex,” Vision Re-
search, vol. In Press, 2006.
[59] R. Eckmiller, “Learning retina implants with epiretinal contacts,” Oph-
thalmic Research, vol. 29, pp. 281–289, 1997.
[60] R. Eckmiller, M. Becker, and R. Hunermann, “Dialog concepts for learning
retina encoders,” in Proceedings of the International Conference on Neural
Networks., vol. 4, pp. 2315–2320, 1997.
[61] R. Eckmiller, M. Becker, and R. Hunermann, “Towards a learning retina
implant with epiretinal contacts,” in Proceedings of the IEEE International
Conference on Systems, Man, and Cybernetics, vol. 4, pp. 396–399, 1999.
[62] M. Egmont-Petersen, D. de Ridder, and H. Handels, “Image processing with
neural networks-A review,” Pattern Recognition, vol. 35, no. 10, pp. 2279–
2301, 2002.
[63] M. R. Everingham, B. T. Thomas, T. Troscianko, and D. Easty, “A neural-
network virtual reality mobility aid for the severely visually impaired,” in 2nd
Annual Conference on Disability, Virtual Reality and Associated Technolo-
gies, pp. 183–192, 1998.
[64] L. Farmer and D. Smith, “Adaptive technology,” in Foundations of Ori-
entation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:
American Foundation for the Blind, 2nd ed., 1997.
[65] E. Fernandez, P. Ahnelt, P. Rabischong, C. Botella, F. Garcia-de Quiros,
P. Bonomini, C. Marin, R. Climent, J. Tormos, and R. Normann, “Towards
a cortical visual neuroprosthesis for the blind,” Proceedings of the 2nd In-
ternational Federation for Medical & Biological Engineering (IFMBE) Con-
ference, vol. 3, no. 2, pp. 1690–1691, 2002.
[66] J. Fernandez, A. Alfaro, P. Bonomini, J. Tormos, L. Concepcion, F. Pelayo,
and E. Fernandez, “Brain plasticity: feasibility of a cortical visual prosthesis
for the blind,” in Proceedings of the 25th Annual International Conference
of the IEEE Engineering in Medicine and Biology Society., vol. 3, pp. 2027–
2030, 2003.
[67] E. Fernandez, A. Alfaro, J. M. Tormos, R. Climent, M. Martinez, H. Vi-
lanova, V. Walsh, and A. Pascual-Leone, “Mapping of the human visual cor-
tex using image-guided transcranial magnetic stimulation,” Brain Research
Protocols, vol. 10, no. 2, pp. 115–124, 2002.
[68] S. Foran, J. J. Wang, E. Rochtchina, and P. Mitchell, “Projected number of
Australians with visual impairment in 2000 and 2030,” Clinical and Experi-
mental Ophthalmology, vol. 28, pp. 143–145, 2000.
[69] J. Fowler, “The next generation of mobility aid,” in 2nd Australasian orien-
tation and mobility conference, (Gold Coast, Australia), 2003.
[70] L. Fu, S. Cai, H. Zhang, G. Hu, and X. Zhang, “Psychophysics of reading
with a limited number of pixels: Towards the rehabilitation of reading ability
with visual prosthesis,” Vision Research, vol. 46, no. 8-9, pp. 1292–1301,
2006.
[71] F. Gekeler, H. Schwahn, A. Stett, K. Kohler, and E. Zrenner, “Subretinal
microphotodiodes to replace photoreceptor-function. A review of the current
state,” in Vision, sensations et environnement (M. Doly, M.-T. Droy, and
Y. Christen, eds.), pp. 77 – 95, Paris: Irvinn, 2001.
[72] D. R. Geruschat and W. de l’Aune, “Reliability and validity of O&M in-
structor observations,” Journal of Visual Impairment & Blindness, vol. 83,
pp. 457–60, 1989.
[73] D. Geruschat and A. J. Smith, “Low vision and mobility,” in Foundations of
Orientation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:
American Foundation for the Blind, 2nd ed., 1997.
[74] D. Geruschat, K. A. Turano, and J. Stahl, “Traditional measures of mobility
performance and retinitis pigmentosa,” Optometry and Vision Science, vol. 75,
no. 7, pp. 525–537, 1998.
[75] M. Ghanbari, Video coding: an introduction to standard codecs. London:
The Institute of Electrical Engineers, 1999.
[76] J. J. Gibson, The Perception of the Visual World. Boston: Houghton-Mifflin,
1950.
[77] J. J. Gibson, The senses considered as perceptual systems. Massachusetts:
Houghton-Mifflin, 1966.
[78] J. J. Gibson, “The theory of affordances,” in Perceiving, acting, and know-
ing: toward an ecological psychology (R. Shaw and J. Bransford, eds.), New
Jersey: Lawrence Erlbaum Associates, 1977.
[79] J. J. Gibson, The ecological approach to visual perception. Hillsdale, NJ:
Lawrence Erlbaum Associates, 1979.
[80] B. Girod, “What’s wrong with mean-squared error?,” in Digital Images and
Human Vision (A. Watson, ed.), Cambridge: MIT Press, 1993.
[81] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Reading, Mas-
sachusetts: Addison-Wesley, 1992.
[82] J. Gothe, S. A. Brandt, K. Irlbacher, S. Roricht, B. A. Sabel, and B.-U.
Meyer, “Changes in visual cortex excitability in blind subjects as demon-
strated by transcranial magnetic stimulation,” Brain, vol. 125, no. 3,
pp. 479–490, 2002.
[83] R. L. Gregory, Eye and Brain: The Psychology of Seeing. Tokyo: Oxford
University Press, 5th ed., 1998.
[84] S. Grigorescu, N. Petkov, and P. Kruizinga, “Comparison of texture features
based on Gabor filters,” IEEE Transactions on Image Processing, vol. 11,
no. 10, pp. 1160–1167, 2002.
[85] E. Guenther, B. Troger, B. Schlosshauer, and E. Zrenner, “Long-term sur-
vival of retinal cell cultures on retinal implant materials,” Vision Research,
vol. 39, no. 24, pp. 3988–3994, 1999.
[86] D. Guth and R. LaDuke, “Veering by blind pedestrians: Individual differ-
ences and their implications for instruction,” Journal of Visual Impairment
& Blindness, vol. 89, pp. 28–37, 1995.
[87] D. A. Guth and J. J. Rieser, “Perception and the control of locomotion
by blind and visually impaired pedestrians,” in Foundations of Orientation
and Mobility (B. B. Blasch and W. R. Weiner, eds.), pp. 9–39, New York:
American Foundation for the Blind, 2nd ed., 1997.
[88] L. E. Hallum, D. S. Taubman, G. J. Suaning, J. W. Morley, and N. H. Lovell,
“A filtering approach to artificial vision: A phosphene visual tracking task,”
in Proceedings of the World Congress on Medical Physics and Biomedical
Engineering, (Sydney, Australia), 2003.
[89] L. Hallum, G. Tsafnet, N. Lovell, and G. Suaning, “Artificial vision for the
blind,” Australasian Science, vol. 30, no. 1, pp. 21–23, 2003.
[90] F. T. Hambrecht, “The history of neural stimulation and its relevance to
future neural prostheses,” in Neural Prostheses: Fundamental Studies (W. F.
Agnew and D. B. McCreery, eds.), New Jersey: Prentice Hall, 1990.
[91] H. Hammerle, K. Kobuch, K. Kohler, W. Nisch, H. Sachs, and M. Stel-
zle, “Biostability of micro-photodiode arrays for subretinal implantation,”
Biomaterials, vol. 23, no. 3, pp. 797–804, 2002.
[92] D. T. Hartong, F. F. Jorritsma, J. J. Neve, B. J. M. Melis-Dankers, and A. C.
Kooijman, “Improved mobility and independence of night-blind people using
night-vision goggles,” Investigative Ophthalmology & Visual Science, vol. 45,
no. 6, pp. 1725–1731, 2004.
[93] J. S. Hayes, V. T. Yin, D. Piyathaisere, J. D. Weiland, M. S. Humayun, and
G. Dagnelie, “Visually guided performance of simple tasks using simulated
prosthetic vision,” Artificial Organs, vol. 27, no. 11, pp. 1016–1028, 2003.
[94] S. Haymes, D. Guest, A. Heyes, and A. Johnston, “Comparison of functional
mobility performance with clinical vision measures in simulated retinitis pig-
mentosa,” Optometry and Vision Science, vol. 71, no. 7, pp. 442–453, 1994.
[95] S. Haymes, D. Guest, A. Heyes, and A. Johnston, “Mobility of people with
retinitis pigmentosa as a function of vision and psychological variables,”
Optometry and Vision Science, vol. 73, no. 10, pp. 621–637, 1996.
[96] B. Heisele, “Visual object recognition with supervised learning,” IEEE In-
telligent Systems, vol. 18, no. 3, pp. 38–42, 2003.
[97] L. Hesse, T. Schanze, M. Wilms, and M. Eger, “Implantation of retina stim-
ulation electrodes and recording of electrical stimulation responses in the
visual cortex of the cat,” Graefe’s Archive for Clinical and Experimental
Ophthalmology, vol. 238, no. 10, pp. 840–845, 2000.
[98] A. D. Heyes, “The sonic pathfinder - a new travel aid for the blind,” in High
technology aids for the disabled (W. J. Perkins, ed.), pp. 165–171, London:
Butterworth, 1983.
[99] T. Heyes, “The sonic pathfinder: An electronic travel aid for the vision
impaired.” http://www.sonicpathfinder.org/, (accessed July 2006).
[100] A. D. Heyes, A. G. Dodds, D. D. C. Carter, and C. I. Howarth, “Evaluation
of the mobility of blind pedestrians,” in High technology aids for the disabled
(W. J. Perkins, ed.), pp. 14–19, London: Butterworth, 1983.
[101] J. Hill and J. Black, “The miniguide: A new electronic travel device,”
Journal of Visual Impairment & Blindness, vol. 97, no. 10, pp. 655–656,
2003.
[102] E. Hill, J. Rieser, M. Hill, M. Hill, J. Halpin, and R. Halpin, “How persons
with visual impairments explore novel spaces: Strategies of good and poor
performers,” Journal of Visual Impairment & Blindness, vol. 87, 1993.
[103] B. K. P. Horn and B. G. Schunck, “Determining optical flow,” Artificial
Intelligence, vol. 17, no. 1, pp. 185–203, 1981.
[104] S. Horowitz and T. Pavlidis, “Picture segmentation by a tree traversal
algorithm,” Journal of the ACM, vol. 23, no. 2, pp. 368–388, 1976.
[105] D. H. Hubel, “Exploration of the primary visual cortex, 1955-78,” in Cogni-
tive Neuroscience: A reader (M. S. Gazzaniga, ed.), Massachusetts: Black-
well, 2000.
[106] M. S. Humayun, “Is surface electrical stimulation of the retina a feasible
approach towards the development of a visual prosthesis?,” PhD thesis, Uni-
versity of North Carolina at Chapel Hill, 1992.
[107] M. S. Humayun and E. de Juan, “Artificial vision,” Eye, vol. 12, pp. 605–
607, Jun 1998.
[108] M. Humayun, E. De Juan Jr., G. Dagnelie, R. Greenberg, R. Propst, and
D. Phillips, “Visual perception elicited by electrical stimulation of retina in
blind humans,” Archives of Ophthalmology, vol. 114, no. 1, pp. 40–46, 1996.
[109] M. S. Humayun, E. de Juan Jr., J. D. Weiland, G. Dagnelie, S. Katona,
R. Greenberg, and S. Suzuki, “Pattern electrical stimulation of the human
retina,” Vision Research, vol. 39, pp. 2569–2576, 1999.
[110] M. S. Humayun, Y. Sato, R. Propst, and E. de Juan Jr, “Can potentials
from the visual cortex be elicited electronically despite severe retinal de-
generation and a markedly reduced electroretinogram?,” German Journal of
Ophthalmology, vol. 4, no. 1, pp. 57–64, 1995.
[111] M. S. Humayun, J. D. Weiland, G. Y. Fujii, R. Greenberg, R. Williamson,
J. Little, B. Mech, V. Cimmarusti, G. Van Boemel, and G. Dagnelie, “Visual
perception in a blind subject with a chronic microelectronic retinal prosthe-
sis,” Vision Research, vol. 43, no. 24, pp. 2573–2581, 2003.
[112] Y. Ito, T. Yagi, H. Kanda, S. Tanaka, M. Watanabe, and Y. Uchikawa,
“Cultures of neurons on micro-electrode array in hybrid retinal implant,”
in Proceedings of the IEEE International Conference on Systems, Man, and
Cybernetics, vol. 4, pp. 414–417, 1999.
[113] B. Jahne and H. Haußecker, eds., Computer Vision and Applications. Aca-
demic Press: San Diego, 2000.
[114] A. Jain, R. Duin, and J. Mao, “Statistical pattern recognition: a review,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22,
no. 1, pp. 4–37, 2000.
[115] G. Jansson, “Development and evaluation of mobility aids for the visually
handicapped,” in Development of electronic aids for the visually impaired
(P. Emiliani, ed.), Dordrecht: Martinus Nijhoff, 1986.
[116] L. Johnson, F. K. Perkins, T. O’Hearn, P. Skeath, C. Merritt, J. Frieble,
S. Sadda, M. Humayun, and D. Scribner, “Electrical stimulation of isolated
retina with microwire glass electrodes,” Journal of Neuroscience Methods,
vol. 137, no. 2, pp. 265–273, 2004.
[117] T. Jones and T. Troscianko, “Mobility performance of low-vision adults
using an electronic mobility aid,” Clinical and Experimental Optometry,
vol. 89, no. 1, pp. 10–17, 2006.
[118] H. Kanda, T. Morimoto, T. Fujikado, Y. Tano, Y. Fukuda, and H. Sawai,
“Electrophysiological studies of the feasibility of suprachoroidal-transretinal
stimulation for artificial vision in normal and RCS rats,” Investigative Oph-
thalmology & Visual Science, vol. 45, no. 2, pp. 560–566, 2004.
[119] H. Kanda, T. Yagi, Y. Ito, S. Tanaka, M. Watanabe, and Y. Uchikawa, “Ef-
ficient stimulation inducing neural activity in retinal implant,” Proceedings
of IEEE Systems, Man, and Cybernetics Conference, vol. 4, pp. 409 – 413,
1999.
[120] H. Kanda, T. Yagi, T. Nakatsu, M. Watanabe, and Y. Uchikawa, “A study
on electrical stimulation to visual nervous system in visual prosthesis,” in
Proceedings of the 26th Annual Conference of the IEEE, vol. 1, pp. 108–113,
2000.
[121] L. Konig, “The laser long cane,” in 2nd Australasian orientation and mo-
bility conference, (Gold Coast, Australia), 2003.
[122] T. Kuyk, J. L. Elliott, J. Biehl, and P. S. Fuhr, “Environmental variables
and mobility performance in adults with low vision,” Journal of the Ameri-
can Optometric Association, vol. 67, no. 7, pp. 403–409, 1996.
[123] T. Kuyk, J. L. Elliott, and P. S. Fuhr, “Visual correlates of mobility in
real world settings in older adults with low vision,” Optometry and Vision
Science, vol. 75, pp. 538–47, Jul 1998.
[124] P. Larcombe, “Tactile ground surface indicators and the law,” in 2nd Aus-
tralasian orientation and mobility conference, (Gold Coast, Australia), 2003.
[125] J. Li, A. Najmi, and R. M. Gray, “Image classification by a two dimensional
hidden Markov model,” IEEE Transactions on Signal Processing, vol. 48,
no. 2, pp. 517–533, 2000.
[126] R. Li, X. Zhang, and G. Hu, “A computational pixelization model based on
selective attention for artificial visual prosthesis,” Lecture Notes in Computer
Science, pp. 654–662, 2005.
[127] D. S. H. Ling, H. Hsu, G. C. Lin, and S. Lee, “Enhanced image-based
coordinate measurement using a super-resolution method,” Robotics and
Computer-Integrated Manufacturing, vol. 21, no. 6, pp. 579–588, 2005.
[128] W. Liu, E. McGucken, R. Cavin, M. Clements, K. Vichienchom, C. De-
marco, M. Humayun, E. d. Juan, J. Weiland, and R. Greenberg, “A retinal
prosthesis to benefit the visually impaired,” in Intelligent Systems and Tech-
nologies in Rehabilitation Engineering (H.-N. L. Teodorescu and L. C. Jain,
eds.), Boca Raton, Florida: CRC Press, 2001.
[129] W. Liu, E. McGucken, K. Vichienchom, S. Clements, S. Demarco, M. Hu-
mayun, E. de Juan, J. Weiland, and R. Greenberg, “Retinal prosthesis to aid
the visually impaired,” in Proceedings of the IEEE International Conference
on Systems, Man, and Cybernetics, vol. 4, pp. 364–369, 1999.
[130] W. Liu, E. McGucken, K. Vichienchom, M. Clements, E. de Juan, and
M. Humayun, “Dual unit visual intraocular prosthesis,” in Proceedings of the
19th Annual International Conference of the IEEE Engineering in Medicine
and Biology Society, vol. 5, pp. 2303–2306, 1997.
[131] W. Liu, M. Sivaprakasam, P. R. Singh, R. Bashirullah, and G. Wang, “Elec-
tronic visual prosthesis,” Artificial Organs, vol. 27, no. 11, pp. 986–995, 2003.
[132] J. I. Loewenstein, S. R. Montezuma, and J. F. Rizzo III, “Outer reti-
nal degeneration: An electronic retinal prosthesis as a treatment strategy,”
Archives of Ophthalmology, vol. 122, no. 4, pp. 587–596, 2004.
[133] P. C. Loizou, “Introduction to cochlear implants,” IEEE Signal Processing
Magazine, vol. 15, no. 5, pp. 101–130, 1998.
[134] R. Long and E. Hill, “Establishing and maintaining orientation for mobil-
ity,” in Foundations of Orientation and Mobility (B. B. Blasch and W. R.
Weiner, eds.), New York: American Foundation for the Blind, 2nd ed., 1997.
[135] R. Long, J. Rieser, and E. Hill, “Mobility in individuals with moderate
visual impairments,” Journal of Visual Impairment & Blindness, vol. 84,
1990.
[136] J. M. Loomis, R. L. Klatzky, and R. G. Golledge, “Navigating without
vision: Basic and applied research,” Optometry and Vision Science, vol. 78,
no. 5, pp. 282–289, 2001.
[137] J. Lovie-Kitchin, J. Mainstone, J. Robinson, and B. Brown, “What areas of
the visual field are important for mobility in low vision patients?,” Clinical
Vision Sciences, vol. 5, no. 3, 1990.
[138] G. Luger and W. Stubblefield, Artificial Intelligence: Structures and strategies
for complex problem solving. Benjamin/Cummings: California, 2nd ed.,
1993.
[139] W. M. Mace, “James J. Gibson’s strategy for perceiving: Ask not what’s
inside your head, but what your head is inside of,” in Perceiving, acting, and
knowing: toward an ecological psychology (R. Shaw and J. Bransford, eds.),
pp. 43–66, New Jersey: Lawrence Erlbaum Associates, 1977.
[140] A. B. Majji, M. S. Humayun, J. D. Weiland, S. Suzuki, S. A. D’Anna, and
E. de Juan Jr., “Long-term histological and electrophysiological results
of an inactive epiretinal electrode array implantation in dogs,” Investigative
Ophthalmology & Visual Science, vol. 40, no. 9, pp. 2073–2081, 1999.
[141] H. A. Mallot, Computational vision : Information processing in perception
and visual behavior. Cambridge, Mass.: MIT Press, 2000.
[142] R. E. Marc, B. W. Jones, C. B. Watt, and E. Strettoi, “Neural remodeling
in retinal degeneration,” Progress in Retinal and Eye Research, vol. 22, no. 5,
pp. 607–655, 2003.
[143] E. Margalit, M. Maia, J. D. Weiland, R. J. Greenberg, G. Y. Fujii, G. Tor-
res, D. V. Piyathaisere, T. M. O’Hearn, W. Liu, and G. Lazzi, “Retinal
prosthesis for the blind,” Survey of Ophthalmology, vol. 47, no. 4, pp. 335–
356, 2002.
[144] D. Marr, Vision. San Francisco, USA: W. H. Freeman, 1982.
[145] J. Marron and I. Bailey, “Visual factors and orientation-mobility perfor-
mance,” American Journal of Optometry and Physiological Optics, vol. 59,
no. 5, 1982.
[146] M. Mattar, A. Hanson, and E. Learned-Miller, “Sign classification using lo-
cal and meta-features,” in IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 3, pp. 26–26, 2005.
[147] E. M. Maynard, “Visual prostheses,” Annual Review of Biomedical Engi-
neering, vol. 3, no. 1, pp. 145–168, 2001.
[148] E. M. Maynard, C. T. Nordhausen, and R. A. Normann, “The Utah intra-
cortical electrode array: A recording structure for potential brain-computer
interfaces,” Electroencephalography and Clinical Neurophysiology, vol. 102,
no. 3, pp. 228–239, 1997.
[149] P. B. Meijer, “An experimental system for auditory image representations,”
IEEE Transactions on Biomedical Engineering, vol. 39, no. 2, pp. 112–121,
1992.
[150] P. B. Meijer, “Vision technology for the totally blind.”
http://www.seeingwithsound.com/, (accessed July 2006).
[151] N. Molton, S. Se, M. Brady, D. Lee, and P. Probert, “Robotic sensing for
the partially sighted,” Robotics and Autonomous Systems Journal, vol. 26,
no. 3, pp. 185–201, 1999.
[152] G. Naik and A. Regalado, “An inventor struggles to restore sight,” Wall
Street Journal, p. B.1, August 27 2003.
[153] K. Nakayama, “James J. Gibson - an appreciation,” in Cognitive Neuro-
science (M. S. Gazzaniga, ed.), Oxford: Blackwell, 2000.
[154] V. S. Nalwa, A guided tour of computer vision. Reading, Massachusetts:
Addison-Wesley, 1993.
[155] National Institute of Standards and Technology, “Text retrieval conference
(trec).” http://trec.nist.gov/, (accessed July 2006).
[156] C. Noback, N. Strominger, R. Demarest, and D. Ruggiero, The Human
Nervous System. Humana Press: New Jersey, 6th ed., 2005.
[157] R. Normann, “A penetrating, cortical electrode array: design considera-
tions,” in Proceedings of IEEE International Conference on Systems, Man
and Cybernetics, pp. 918–920, 1990.
[158] R. Normann, “Visual neuroprosthetics-functional vision for the blind,”
IEEE Engineering in Medicine and Biology Magazine, vol. 14, no. 1, pp. 77–
83, 1995.
[159] R. Normann, E. Maynard, K. Guillory, and D. Warren, “Cortical implants
for the blind,” IEEE Spectrum, vol. 33, no. 5, pp. 54–59, 1996.
[160] R. Normann, D. Warren, and A. Koulakov, “Representations and dynamics
of representations of simple visual stimuli by ensembles of neurons in cat
visual cortex studied with a microelectrode array,” in Proceedings of the
First International IEEE EMBS Conference on Neural Engineering, pp. 91–
94, 2003.
[161] W. Osberger, Perceptual vision models for picture quality assessment and
compression applications. PhD thesis, Queensland University of Technology,
1999.
[162] W. Osberger and A. Maeder, “Automatic identification of perceptually im-
portant regions in an image using a model of the human vision system,”
Proceedings of the 14th International Conference on Pattern Recognition,
pp. 701–704, 1998.
[163] W. Osberger and A. Rohaly, “Automatic detection of regions of interest
in complex video sequences,” in Human Vision and Electronic Imaging VI,
vol. 4299, pp. 361–372, Bellingham, USA: SPIE - The International Society
for Optical Engineering, 2001.
[164] D. Palanker, P. Huie, A. Vankov, Y. Freyvert, H. Fishman, M. Marmor, and
M. Blumenkranz, “Attracting retinal cells to electrodes for high-resolution
stimulation,” in Ophthalmic Technologies, (SPIE vol.5314), pp. 306–313,
2004.
[165] M. T. Pardue, M. J. Phillips, H. Yin, B. Sippy, S. Webb-Wood, A. Y.
Chow, and S. L. Ball, “Neuroprotective effect of subretinal implants in the
RCS rat,” Investigative Ophthalmology & Visual Science, vol. 46, pp. 674–
682, 2004.
[166] M. T. Pardue, E. B. Stubbs, Jr., J. I. Perlman, K. Narfstrom, A. Y. Chow,
and N. S. Peachey, “Immunohistochemical studies of the retina following
long-term implantation with subretinal microphotodiode arrays,” Experi-
mental Eye Research, vol. 73, no. 3, pp. 333–343, 2001.
[167] T. D. Parsons, P. Larson, K. Kratz, M. Thiebaux, B. Bluestein, J. G.
Buckwalter, and A. A. Rizzo, “Sex differences in mental rotation and spatial
rotation in a virtual environment,” Neuropsychologia, vol. 42, no. 4, pp. 555–
562, 2004.
[168] R. Passini, A. Dupre, and C. Langlois, “Spatial mobility of the visually
handicapped active person: A description study,” Journal of Visual Impair-
ment & Blindness, pp. 904–907, 1986.
[169] I. Patel, K. A. Turano, A. T. Broman, K. Bandeen-Roche, B. Munoz, and
S. K. West, “Measures of visual function and percentage of preferred walking
speed in older adults: The Salisbury Eye Evaluation project,” Investigative
Ophthalmology & Visual Science, vol. 47, pp. 65–71, Jan 2006.
[170] F. Pelayo, A. Martinez, S. Romero, C. Morillas, E. Ros, and E. Fer-
nandez, “Cortical visual neuro-prosthesis for the blind: Retina-like soft-
ware/hardware preprocessor,” in Proceedings of the First International IEEE
EMBS Conference on Neural Engineering, pp. 150–153, 2003.
[171] F. Pelayo, S. Romero, C. Morillas, A. Martinez, E. Ros, and E. Fernan-
dez, “Translating image sequences into spike patterns for cortical neuro-
stimulation,” Neurocomputing, vol. 58-60, pp. 885–892, 2003.
[172] D. G. Pelli, “The visual requirements of mobility,” in Low Vision: Princi-
ples and Applications (G. C. Woo, ed.), pp. 134–146, New York: Springer-
Verlag, 1986.
[173] M. C. Peterman, D. M. Bloom, C. Lee, S. F. Bent, M. F. Marmor, M. S.
Blumenkranz, and H. A. Fishman, “Localized neurotransmitter release for
use in a prototype retinal interface,” Investigative Ophthalmology & Visual
Science, vol. 44, no. 7, pp. 3144–3149, 2003.
[174] M. C. Peterman, N. Z. Mehenti, K. V. Bilbao, C. J. Lee, T. Leng,
J. Noolandi, S. F. Bent, M. S. Blumenkranz, and H. A. Fishman, “The
artificial synapse chip: A flexible retinal interface based on directed retinal
cell growth and neurotransmitter stimulation,” Artificial Organs, vol. 27,
no. 11, pp. 975–985, 2003.
[175] J. Pezaris and R. Reid, “Microstimulation in LGN produces focal visual
percepts,” Journal of Vision, vol. 5, no. 8, p. 367, 2005.
[176] G. Phillips, “Gpd research.” http://www.gpd-research.com.au/, (accessed
July 2006).
[177] C. Poirier, M.-A. Richard, D. T. Duy, and C. Veraart, “Assessment of
sensory substitution prosthesis - Potentialities in minimalist conditions of
learning,” Applied Cognitive Psychology, vol. 20, no. 4, pp. 447–460, 2006.
[178] D. A. Pollen and S. Ronner, “Visual cortical neurons as localized spatial
frequency filters,” IEEE Transactions on Systems, Man, and Cybernetics,
vol. 13, pp. 907–916, 1983.
[179] W. K. Pratt, Digital Image Processing: PIKS Inside. John Wiley & Sons,
Inc., 3rd ed., 2001.
[180] A. S. Reber, Dictionary of Psychology. London: Penguin, 2nd ed., 1995.
[181] J. F. Rizzo, S. Miller, T. Denison, and J. Wyatt, “Electrically-evoked cor-
tical potentials from stimulation of rabbit retina with a microfabricated
electrode array (abstract),” Investigative Ophthalmology & Visual Science,
vol. 37:S707, 1996.
[182] J. F. Rizzo and J. Wyatt, “Prospects for a visual prosthesis,” The Neuro-
scientist, vol. 3, no. 4, pp. 251–262, 1997.
[183] J. F. Rizzo, J. Wyatt, M. Humayun, E. de Juan, W. Liu, A. Chow, R. Eckmiller,
E. Zrenner, T. Yagi, and G. Abrams, “Retinal prosthesis: An encouraging
first decade with major challenges ahead,” Ophthalmology, vol. 108,
no. 1, pp. 13–14, 2001.
[184] J. Rizzo, J. Wyatt, J. Loewenstein, S. Kelly, and D. Shire, “Methods and
perceptual thresholds for short-term electrical stimulation of human retina
with microelectrode arrays,” Investigative Ophthalmology & Visual Science,
vol. 44, no. 12, pp. 5355–5361, 2003.
[185] J. Rizzo, J. Wyatt, J. Loewenstein, S. Kelly, and D. Shire, “Perceptual
efficacy of electrical stimulation of human retina with a microelectrode ar-
ray during short-term surgical trials,” Investigative Ophthalmology & Visual
Science, vol. 44, no. 12, pp. 5362–5369, 2003.
[186] J. Roerdink and A. Meijster, “The watershed transform: Definitions, algorithms
and parallelization strategies,” Fundamenta Informaticae, vol. 41,
pp. 187–228, 2000.
[187] S. F. Ronner, “Electrical excitation of CNS neurons,” in Neural Prostheses:
Fundamental Studies (W. F. Agnew and D. B. McCreery, eds.), New Jersey:
Prentice Hall, 1990.
[188] S. Rosen, “Kinesiology and sensorimotor function,” in Foundations of Ori-
entation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:
American Foundation for the Blind, 2nd ed., 1997.
[189] D. Ross, “Wearable computers as a virtual environment interface for people
with visual impairment,” Virtual Reality, vol. 3, pp. 212–221, 1998.
[190] P. Rousche and R. Normann, “A system for impact insertion of a 100 elec-
trode array into cortical tissue,” in Proceedings of the Twelfth Annual In-
ternational Conference of the IEEE Engineering in Medicine and Biology
Society, pp. 494–495, 1990.
[191] J. Russ, The Image Processing Toolkit. CRC Press, 2002.
[192] W. G. Sannita, L. Narici, and P. Picozza, “Positive visual phenomena in
space: A scientific case and a safety issue in space travel,” Vision Research,
vol. 46, no. 14, pp. 2159–2165, 2006.
[193] B. N. Schenkman and G. Jansson, “The detection and localization of objects
by the blind with the aid of long-cane tapping sounds,” Human Factors,
vol. 28, no. 5, pp. 607–618, 1986.
[194] E. M. Schmidt, M. Bak, F. Hambrecht, C. Kufta, D. K. O’Rourke, and
P. Vallabhanath, “Feasibility of a visual prosthesis for the blind based on
intracortical microstimulation of the visual cortex,” Brain, vol. 119, no. 2,
pp. 507–522, 1996.
[195] M. B. Schubert, A. Hierzenberger, H. J. Lehner, and J. H. Werner, “Op-
timizing photodiode arrays for the use as retinal implants,” Sensors and
Actuators A: Physical, vol. 74, no. 1-3, pp. 193–197, 1999.
[196] H. N. Schwahn, F. Gekeler, K. Kohler, K. Kobuch, H. G. Sachs, F. Schul-
meyer, W. Jakob, V. P. Gabel, and E. Zrenner, “Studies on the feasibility of a
subretinal visual prosthesis: Data from Yucatan micropig and rabbit,” Graefe’s
Archive for Clinical and Experimental Ophthalmology, vol. 239, no. 12,
pp. 961–967, 2001.
[197] S. Se and M. Brady, “Vision-based detection of stair-cases,” in Proceedings
of Fourth Asian Conference on Computer Vision, vol. 1, (Taipei), pp. 535–
540, 2000.
[198] M. Seul, L. O’Gorman, and M. Sammon, Practical algorithms for image
analysis. Cambridge University Press, 2000.
[199] P. Sharp and R. Phillips, “Physiological optics,” in The Perception of Visual
information (W. Hendee and P. Wells, eds.), Springer: New York, 2nd ed.,
1997.
[200] C. A. Shingledecker and E. Foulke, “A human factors approach to the
assessment of mobility of blind pedestrians,” Human Factors, vol. 20, no. 3,
pp. 273–286, 1978.
[201] S. Shoval, I. Ulrich, and J. Borenstein, “Computerized obstacle avoidance
systems for the blind and visually impaired,” in Intelligent Systems and
Technologies in Rehabilitation Engineering (H. N. L. Teodorescu and L. C.
Jain, eds.), pp. 414–448, CRC Press, 2000.
[202] R. Siegel, “Hallucinations,” Scientific American, vol. 237, no. 4, pp. 132–
140, 1977.
[203] P. Silapachote, J. Weinman, A. Hanson, and M. Mattar, “Automatic sign
detection and recognition in natural scenes,” in IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (CVPR 2005),
vol. 3, pp. 27–27, 2005.
[204] M. Snaith, D. Lee, and P. Probert, “A low-cost system using sparse vision
for navigation in the urban environment,” Image and Vision Computing,
vol. 16, no. 4, pp. 223–292, 1998.
[205] W. Snyder and H. Qi, Machine Vision. Cambridge University Press,
2004.
[206] M. Sonka, V. Hlavac, and R. Boyle, Image processing, analysis, and machine
vision. California: Brooks/Cole, 2nd ed., 1999.
[207] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Preferred walking speed
for assessment of mobility performance: Sighted guide versus non-sighted
guide techniques,” Clinical and Experimental Optometry, vol. 83, no. 5,
pp. 279–282, 2000.
[208] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Does mobility perfor-
mance of visually impaired adults improve immediately after orientation and
mobility training?,” Optometry and Vision Science, vol. 78, no. 9, pp. 657–
66, 2001.
[209] G. P. Soong, J. E. Lovie-Kitchin, and B. Brown, “Measurement of pre-
ferred walking speed in subjects with central and peripheral vision loss,”
Ophthalmic and Physiological Optics, vol. 24, no. 4, pp. 291–295, 2004.
[210] U. H. M. Spandau, S. Wechsler, and A. Blankenagel, “Testing night vi-
sion goggles in a dark outside environment,” Optometry and Vision Science,
vol. 79, no. 1, pp. 39–45, 2002.
[211] M. V. Srinivasan, J. S. Chahl, K. Weber, S. Venkatesh, M. G. Nagle, and
S. W. Zhang, “Robot navigation inspired by principles of insect vision,”
Robotics and Autonomous Systems, vol. 26, no. 2-3, pp. 203–216, 1999.
[212] R. Srinivasan and K. Rao, “Predictive coding based on efficient motion es-
timation,” Communications, IEEE Transactions on, vol. 33, no. 8, pp. 888–
896, 1985.
[213] A. Stett, W. Barth, S. Weiss, H. Haemmerle, and E. Zrenner, “Electrical
multisite stimulation of the isolated chicken retina,” Vision Research, vol. 40,
no. 13, pp. 1785–1795, 2000.
[214] E. R. Strelow, “What is needed for a theory of mobility: Direct perception
and cognitive maps - lessons from the blind,” Psychological Review, vol. 92,
no. 2, pp. 226–248, 1985.
[215] G. J. Suaning, L. E. Hallum, S. C. Chen, P. J. Preston, and N. H. Lovell,
“Phosphene vision: Development of a portable visual prosthesis system for
the blind,” in Proceedings of the 25th Annual International Conference of
the IEEE/EMBS, (Cancun, Mexico), 2003.
[216] G. J. Suaning and N. H. Lovell, “A 100 channel neural stimulator for ex-
citation of retinal ganglion cells,” Proceedings of the 20th Annual Interna-
tional Conference of the IEEE Engineering in Medicine and Biology Society,
vol. 20, no. 4, pp. 2232–2235, 1998.
[217] G. Suaning and N. Lovell, “CMOS neurostimulation system with 100 chan-
nels, scaleable output and bi-directional radio frequency telemetry,” IEEE
Transactions on Biomedical Engineering, vol. 48, no. 2, pp. 248–260, 2001.
[218] G. Suaning, N. Lovell, and Y. Kerdraon, “Physiological response in ovis
aries resulting from electrical stimuli delivered by an implantable vision pros-
thesis,” in Proceedings of the 23rd Annual International Conference of the
IEEE Engineering in Medicine and Biology Society, vol. 2, pp. 1419–1422,
2001.
[219] G. Suaning, N. Lovell, and Y. Kerdraon, “Trans-retinal electrical stimu-
lation using a neuroprosthesis: The effects of damage to the r-membrane,”
in Proceedings of the Second Joint Annual Conference and the Annual Fall
Meeting of the Biomedical Engineering Society, vol. 3, pp. 2091–2092, 2002.
[220] G. J. Suaning, N. H. Lovell, and C. Y. Kwok, “Fabrication of platinum
spherical electrodes in an intra-ocular prosthesis using high-energy electrical
discharge,” Sensors and Actuators A: Physical, vol. 108, no. 1-3, pp. 155–
161, 2003.
[221] G. Suaning, N. Lovell, K. Schindhelm, and A. Coroneo, “The bionic eye
(electronic visual prosthesis): A review,” Australian and New Zealand Journal
of Ophthalmology, vol. 26, no. 3, pp. 195–202, 1998.
[222] Talking Signs Inc, “Talking signs infrared communications system.”
http://www.talkingsigns.com/, 2003.
[223] I. Tanaka, T. Murakami, and O. Shimizu, “Heart rate as an objective measure
of stress in mobility,” Visual Impairment and Blindness, vol. 75, no. 2,
pp. 55–60, 1981.
[224] X. Tang, “Texture information in run-length matrices,” IEEE Transactions
on Image Processing, vol. 7, no. 11, pp. 1602–1609, 1998.
[225] G. E. Tassicker. U.S. patent 2,760,483, 1956.
[226] R. W. Thompson, G. D. Barnett, M. Humayun, and G. Dagnelie, “Facial
recognition using simulated prosthetic pixelized vision,” Investigative
Ophthalmology & Visual Science, vol. 44, no. 11, pp. 5035–5042, 2003.
[227] S. Thorpe, “Image processing by the human visual system,” tech. rep.,
Eurographics ’90 : Image Processing by the Human Visual System, 1990.
[228] J. T. Tou and M. Adjouadi, “Computer vision for the blind,” in Electronic
Spatial Sensing for the Blind (D. H. Warren and E. R. Strelow, eds.), pp. 83–
124, Dordrecht: Martinus Nijhoff Publishers, 1985.
[229] G. Trick, “Artificial vision: What are we hoping to restore?,” in The Eye
and The Chip 2004: World Congress on Artificial Vision, (Detroit, Michigan,
USA), 2004.
[230] P. Troyk, M. Bak, J. Berg, D. Bradley, S. Cogan, R. Erickson, C. Kufta,
D. McCreery, E. Schmidt, and V. Towle, “A model for intracortical visual
prosthesis research,” Artificial Organs, vol. 27, no. 11, pp. 1005–1015, 2003.
[231] P. Troyk and M. Schwan, “Closed-loop class E transcutaneous power and
data link for microimplants,” IEEE Transactions on Biomedical Engineering,
vol. 39, no. 6, pp. 589–599, 1992.
[232] E. Trucco and A. Verri, Introductory techniques for 3-D computer vision.
New Jersey: Prentice-Hall, 1998.
[233] K. Turano, A. Broman, K. Bandeen-Roche, B. Munoz, G. Rubin, S. West,
and SEE Project Team, “Association of visual field loss and mobility performance
in older adults: Salisbury Eye Evaluation study,” Optometry and
Vision Science, vol. 81, no. 5, pp. 298–307, 2004.
[234] M. Uddin and T. Shioyama, “Detection of pedestrian crossing using bipo-
larity feature - an image based approach,” IEEE Transactions on Intelligent
Transportation Systems, vol. 6, no. 4, pp. 439–445, 2005.
[235] C. E. Uhlig, S. Taneri, F. P. Benner, and H. Gerding, “Elektrostimulation
des visuellen systems,” Ophthalmologe, vol. 98, no. 11, pp. 1089–1096, 2001.
[236] C. Veraart, M.-C. Wanet-Defalque, B. Gérard, A. Vanlierde, and J. Delbeke,
“Pattern recognition with the optic nerve visual prosthesis,” Artificial
Organs, vol. 27, no. 11, pp. 996–1004, 2003.
[237] M. Volker, K. Shinoda, H. Sachs, H. Gmeiner, T. Schwarz, K. Kohler,
W. Inhoffen, K. Bartz-Schmidt, E. Zrenner, and F. Gekeler, “In vivo assess-
ment of subretinally implanted microphotodiode arrays in cats by optical
coherence tomography and fluorescein angiography,” Graefe’s Archive For
Clinical And Experimental Ophthalmology, vol. 242, no. 9, pp. 792–799, 2004.
[238] P. Walter and K. Heimann, “Evoked cortical potentials after electrical stim-
ulation of the inner retina in rabbits,” Graefe’s Archive for Clinical and
Experimental Ophthalmology, vol. 238, no. 4, pp. 315–318, 2000.
[239] B. Wandell, Foundations of vision. Sinauer Associates: Massachusetts,
1995.
[240] D. J. Warren and R. A. Normann, “Visual neuroprostheses,” in Handbook
of Neuroprosthetic Methods (W. E. Finn and P. G. LoPresti, eds.), Boca
Raton: CRC Press, 2003.
[241] A. Webb, Statistical Pattern Recognition. Wiley, 2nd ed., 2002.
[242] J. D. Weiland and M. S. Humayun, “Past, present, and future of artificial
vision,” Artificial Organs, vol. 27, no. 11, pp. 961–962, 2003.
[243] S. K. West, G. S. Rubin, A. T. Broman, B. Munoz, K. Bandeen-Roche,
and K. Turano, “How does visual impairment affect performance on tasks
of everyday life? The SEE Project. Salisbury eye evaluation,” Archives of
Ophthalmology, vol. 120, no. 6, pp. 774–80, 2002.
[244] R. Whitestock, L. Frank, and R. Haneline, “Dog guides,” in Foundations of
Orientation and Mobility (B. B. Blasch and W. R. Weiner, eds.), New York:
American Foundation for the Blind, 2nd ed., 1997.
[245] World Health Organization, “WHO fact sheet no. 145: Blindness and visual
disability: Socioeconomic aspects,” 1997.
[246] World Health Organization, “WHO fact sheet no. 146: Blindness and visual
disability: Seeing ahead - projections into the next century,” 1997.
[247] World Health Organization, “WHO fact sheet no. 143: Blindness and visual
disability: Major causes worldwide,” 1999.
[248] World Health Organization, “WHO fact sheet no. 144: Blindness and visual
disability: Other leading causes worldwide,” 1999.
[249] World Health Organization, “WHO fact sheet no. 233: Blindness as a public
health problem in China,” 1999.
[250] A. L. Yarbus, Eye Movements and Vision. New York: Plenum, 1967.
[251] C. S. Yoon, “Audible maps - a simple and effective tool,” in 2nd Aus-
tralasian orientation and mobility conference, (Gold Coast, Australia), 2003.
[252] D. Yuan and R. Manduchi, “Dynamic environment exploration using a
virtual white cane,” in Proceedings of IEEE Computer Society Conference
on Computer Vision and Pattern Recognition, pp. 243–249, 2005.
[253] S. Zeki, A vision of the brain. Blackwell Scientific Publications: London,
1993.
[254] D. Zhang and G. Lu, “Review of shape representation and description tech-
niques,” Pattern Recognition, vol. 37, no. 1, pp. 1–19, 2004.
[255] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition:
A literature survey,” ACM Computing Surveys, vol. 35, no. 4, pp. 399–458,
2003.
[256] D. Ziegler, P. Linderholm, M. Mazza, S. Ferazzutti, D. Bertrand, A. M.
Ionescu, and P. Renaud, “An active microphotodiode array of oscillating
pixels for retinal stimulation,” Sensors and Actuators A: Physical, vol. 110,
no. 1-3, pp. 11–17, 2003.
[257] E. Zrenner, “The subretinal implant: Can microphotodiode arrays replace
degenerated retinal photoreceptors to restore vision?,” Ophthalmologica,
vol. 216, suppl. 1, pp. 8–20, 2002.
[258] E. Zrenner, K.-D. Miliczek, V. P. Gabel, H. G. Graf, E. Guenther, H. Haem-
merle, B. Hoefflinger, K. Kohler, W. Nisch, M. Schubert, A. Stett, and
S. Weiss, “The development of subretinal microphotodiodes for replacement
of degenerated photoreceptors,” Ophthalmic Research, vol. 29, pp. 269–280,
1997.
[259] E. Zrenner, A. Stett, S. Weiss, R. B. Aramant, E. Guenther, K. Kohler,
K.-D. Miliczek, M. J. Seiler, and H. Haemmerle, “Can subretinal micropho-
todiodes successfully replace degenerated photoreceptors?,” Vision Research,
vol. 39, pp. 2555–2567, 1999.
Appendix A
AHV project web sites
A list of AHV project web sites (current at July 2006) and main contacts is
provided below:
Bionic Eye Research Project (Cortical Neuroprosthesis - UNSW, Australia)
Vivek Chowdhury and John Morley
http://ophthalmology.med.unsw.edu.au/bioniceye.htm
Cortical Implant for the Blind (CORTIVIS, Europe)
Eduardo Fernandez
http://cortivis.umh.es/
EPI RET (Retina implant research in Cologne, Germany)
Rolf Eckmiller
http://www.medizin.uni-koeln.de/kliniken/augenklinik/epi-ret3e.htm
Intracortical Visual Prosthesis (Illinois Institute of Technology, United States)
Philip Troyk
http://neural.iit.edu/intro.html
Microsystems Based Visual Prosthesis (MiVip, now OPTIVIP, Europe)
Claude Veraart
http://www.md.ucl.ac.be/gren/Projets/mivip.html
OPTIVIP projects (ESPRIT programme of the European Union)
Claude Veraart
http://www.dice.ucl.ac.be/optivip/
Optobionics Corporation (United States)
Alan Chow and Vincent Chow
http://www.optobionics.com
Retinal Implant (Doheny Retina Institute, United States)
Mark Humayun and Eugene de Juan Jr
http://www.usc.edu/hsc/doheny/
Retinal Implant & Bio-hybrid Implant (Japan)
Tohru Yagi
http://www.bmc.riken.jp/yagi/retina/
Retinal Implant-AG (was SUB RET project, Germany)
Eberhart Zrenner
http://www.retina-implant.de/tour/
Retinal Prosthesis Project (North Carolina State University, United States)
Wentai Liu
http://www.icat.ncsu.edu/projects/retina/
Retinomorphic chip (University of Pennsylvania, United States)
http://www.neuroengineering.upenn.edu/boahen/pub/fs pub.htm
Second Sight (California, United States)
Alfred E. Mann and Robert Greenberg
http://www.2-sight.com/
The Boston Retinal Implant Project (United States)
John Wyatt and Joseph Rizzo
http://www.bostonretinalimplant.org/
The Dobelle Institute (Lisbon, Portugal)
William Dobelle
http://www.dobelle.com/
University of Utah (Intracortical prosthesis, United States)
Richard A. Normann
http://www.bioen.utah.edu/cni/projects/blindness.htm
Vision Prosthesis Project (UNSW and Newcastle University, Australia)
Gregg Suaning
http://bionic.gsbme.unsw.edu.au/
Appendix B
Chapter 7 and 8 experiment materials
Participant Information Sheet
“Mobility enhancement using simulated Artificial Human Vision”
Jason Dowling (PhD Candidate, EESE, S1102, Gardens Point, 3864 1608 [email protected])

Description
Artificial Human Vision (AHV) systems are designed to help restore some sense of vision to the blind by electrically stimulating a component of the visual pathway. However there are limits to the amount of visual information that can be provided to a person using an AHV system. We are interested in how we can process images from a camera to enhance mobility for blind recipients of AHV systems.
The research team requests your assistance in testing one method of information display and its effect on your mobility performance. Your participation will involve wearing a head mounted simulation device. The device display will be your only visual information during the experiment. After the device has been placed on your head:
• You will be allowed two minutes for familiarization with the device display;
• Your walking speed while wearing the device will be measured;
• You will be asked to complete two tasks within a mobility course;
• Your walking speed while wearing the device will be measured again.
The experiment is expected to take approximately 40 minutes.

Expected benefits
It is expected that this project will not benefit you. However, it may benefit the mobility performance of blind people who use an artificial human vision system. This research may also be useful for the development of non-surgical image processing based electronic travel aids for the blind.

Risks
1. Some people may experience disorientation or nausea while using Virtual Reality (VR) headgear. If you feel sick during the experiment please tell the experimenter who will immediately stop the experiment.
2. As the simulation presents a reduced amount of visual information, there is a risk of tripping or hitting obstacles during the experiment. However we have designed the mobility course to reduce these risks. In addition, the experimenter will walk directly behind you during the experiment to monitor and prevent any personal danger.

Confidentiality
All comments and responses are anonymous and will be treated confidentially. The names of individual persons are not required in any of the responses. During the experiment we may record your movements on video: this video data will be coded and destroyed within two months of the experiment.

Voluntary participation
Your participation in this project is voluntary. If you do agree to participate, you can withdraw from participation at any time during the project without comment or penalty. Your decision to participate will in no way impact upon your current or future relationship with QUT.

Questions / further information
Please contact the researchers if you require further information about the project, or to have any questions answered.

Concerns / complaints
Please contact the Research Ethics Officer on 3864 2340 or [email protected] if you have any concerns or complaints about the ethical conduct of the project.

Consent
The return of the completed questionnaire is accepted as an indication of your consent to participate in this project.
Thank you for your time in completing this questionnaire.

Figure B.1: Coversheet provided to participants before the AHV simulation experiments described in Chapter 7 and 8.
Mobility enhancement using simulated Artificial Human Vision
1. What is your gender: □ Female □ Male
2. Please indicate your age (years):
□ 0-20 yrs □ 20-30 yrs □ 30-40 yrs □ 40-50 yrs □ 50-60 yrs □ over 60 yrs
3. How frequently do you play computer/video games?
□ Never □ Once a year □ Monthly □ Weekly □ Daily
4. Have you ever used an immersive Virtual Reality (VR) environment (using a head mounted VR display) before?
□ Yes □ No
If you have used an immersive VR environment before, approximately how many times have you done this? ____ times
Figure B.2: Questionnaire provided to participants before the AHV simulation experiments described in Chapter 7 and 8.
Mobility enhancement using simulated Artificial Human Vision
Experimenter Sheet: Jason Dowling ([email protected], x1608)
Participant id#: ____________
1. PWS(a) Duration: ___________________________
2. Task 1 (mobility measure/locate object)
Start Time: ______________________________
End Time: ______________________________
Duration: ______________________________
Time       | Obstacle # | Veering # | Other #
0-2 mins   |            |           |
2-4 mins   |            |           |
4-6 mins   |            |           |
6-8 mins   |            |           |
8-10 mins  |            |           |
10-12 mins |            |           |
12-14 mins |            |           |
14-16 mins |            |           |
16-18 mins |            |           |
18-20 mins |            |           |
3. Task 2 (mobility measure/locate object)
Start Time: ______________________________
End Time: ______________________________
Duration: ______________________________
Time       | Obstacle # | Veering # | Other #
0-2 mins   |            |           |
2-4 mins   |            |           |
4-6 mins   |            |           |
6-8 mins   |            |           |
8-10 mins  |            |           |
10-12 mins |            |           |
12-14 mins |            |           |
14-16 mins |            |           |
16-18 mins |            |           |
18-20 mins |            |           |
4. PWS(b) Duration: ___________________________
Comments:
Figure B.3: Record sheet used by the experimenter during the AHV simulation experiments described in Chapter 7 and 8. The locate object task was not used for the Chapter 8 experiment.