An Investigation of the Use of Real-time Image Mosaicing for Facilitating
Global Spatial Awareness in Visual Search
by
Anthony Soung Yee
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering
University of Toronto
© Copyright by Anthony Soung Yee 2013
Abstract
An Investigation of the Use of Real-time Image Mosaicing for Facilitating Global
Spatial Awareness in Visual Search
Anthony Soung Yee
Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering
University of Toronto
2013
Three experiments have been completed to investigate whether and how a software technique
called real-time image mosaicing applied to a restricted field of view (FOV) might influence
target detection and path integration performance in simulated aerial search scenarios,
representing local and global spatial awareness tasks respectively. The mosaiced FOV
(mFOV) was compared with a single-size FOV (sFOV) and with a FOV of double the single size (dFOV).
In addition to advancing our understanding of visual information in mosaicing, the present
study examines the advantages and limitations of a number of metrics used to evaluate
performance in path integration tasks, with particular attention paid to measuring
performance in identifying complex routes.
The highlights of the results are summarized as follows, according to Experiments 1 through
3 respectively.
1. A novel response method for evaluating route identification performance was developed.
Contrary to the surmised benefits of the mFOV relative to the sFOV and dFOV, no
significant differences in performance were found for the relatively simple route shapes
tested. Compared to the mFOV and dFOV conditions, target detection performance in the
local task was found to be superior in the sFOV condition.
2. In order to appropriately quantify the observed differences in complex route selections
made by the participants, a novel analysis method was developed using the Thurstonian
Paired Comparisons Method.
3. To investigate the effect of display size and elevation angle (EA) in a complex route
environment, a 2x3 experiment was conducted for the two spatial tasks, at a height
selected from Experiment 2. Although no significant differences were found in the target
detection task, contrasts in the Paired Comparisons Method results revealed that route
identification performance was as hypothesised: mFOV > dFOV > sFOV for EA = 90°.
Results were similar for EA = 45°, but with mFOV being no different from dFOV. As
hypothesised, EA was found to have an effect on route selection performance, with a top
down view performing better than an angled view for the mFOV and sFOV conditions.
Acknowledgments
I would like to thank my supervisor Paul Milgram for his countless efforts in helping to make
this dissertation a reality. I can still recall the first day that I met him in Toronto; I got lost
due to the construction around what was then Taddle Creek Rd., ending up somewhere in the
Mining building. Paul had to pick me up and guide me to his office (which I’ll also never
forget), and he has been there every step of the way. I am indebted to him for his support in
research, teaching et pour nos discussions en français. Merci.
I would like to acknowledge my thesis committee members Birsen Donmez and Justin
Hollands, as well as my external members Mark Chignell and Colin Ware. Their comments
have undoubtedly served to strengthen my work, as well as my belief in it.
I would like to thank my extended family of academic brothers and sisters in the ETC Lab,
whose influences can be found everywhere in this work. In particular, the support of my
friends Winnie Chen and Bardia Bina cannot be overstated; we have stuck together through
thick and thin, and I am honoured to be graduating alongside them.
I would like to thank the lovely Audrey Kuo for her friendship, dedication and patience over
the years. Her adventurous spirit has made me realise what important things are waiting for
me after a hard day’s work, both at my doorstep and out in the world.
Finally, I would like to thank my brother and parents. Perhaps Lawrence can claim to have
first nurtured the curiosities of a budding researcher, by patiently answering his little
brother’s parade of “Why? Why? Why?”. In any case, I am grateful for his support to this
day. I am indebted to Mom & Dad for their unwavering support in whatever I choose to do.
To that point, I look forward someday to actually explaining to them what this thesis is
about!
Table of Contents
Abstract ii
Acknowledgments iv
Table of Contents v
List of Tables x
List of Figures xi
List of Abbreviations xv
Chapter 1. Introduction 1
1.1 Background and motivation 1
1.2 Image mosaicing/image stitching 3
1.3 Spatial awareness tasks in aerial search 4
1.4 Objectives 5
Chapter 2. Literature review and concepts 7
2.1 Introduction 7
2.2 Cognitive maps 7
2.3 Evaluation of cognitive maps and global awareness 9
2.4 Image mosaicing 12
2.4.1 Basic principle of mosaic construction 12
2.4.2 Off-line applications 12
2.4.3 Potential applications of real-time image mosaicing 13
2.5 Studies evaluating human spatial performance in real-time mosaicing 15
2.5.1 Aerial search – Morse et al. (2008) 15
2.5.2 Desktop augmented reality – Jeon and Kim (2008) 16
2.6 Global and local spatial awareness in (teleoperated) aerial search 18
2.6.1 Global awareness in aerial search 18
2.6.2 Local spatial awareness in aerial search 19
2.7 Relevant parameters for the present study 20
2.7.1 Viewing perspective/Elevation angle 20
2.7.2 Display size (and resolution) 22
2.7.3 Speed of traversal and height above terrain 23
2.8 Summary 24
Chapter 3. Experiment 1 26
3.1 Introduction 26
3.2 Experimental tasks 26
3.2.1 Target detection 26
3.2.2 Route identification 28
3.3 Response method 30
3.4 Platform 31
3.5 Procedure 33
3.6 Experimental parameters 35
3.7 Experimental Hypotheses 37
3.8 Results 37
3.8.1 Target detection 37
3.8.2 Route identification 40
3.9 Discussion 42
3.9.1 Route identification results 43
3.9.2 Target detection results 44
3.9.3 Comparison of results to literature 44
3.9.4 Synthesis 45
Chapter 4. Experiment 2 47
4.1 Introduction 47
4.2 Experimental task 49
4.3 Results of Analysis 53
4.3.1 Challenge of Defining Objective Scoring Method 54
4.3.2 Paired Comparisons Method 59
4.3.3 Application of Paired Comparisons 62
4.3.4 Outlier analysis 65
4.3.5 Statistical tests and checking assumptions 68
4.4 Discussion 72
Chapter 5. Experiment 3 74
5.1 Introduction 74
5.2 Experimental procedure 77
5.3 Results 82
5.3.1 Target detection task 82
5.3.2 Route identification task 83
5.3.3 Participant subjective ratings of six viewing conditions 87
5.4 Discussion 89
5.4.1 Target Detection Results 89
5.4.2 Route Identification Results 91
5.4.3 Participants’ Subjective Rating Results 94
5.4.4 Summary 95
Chapter 6. Conclusions 97
6.1 Summary of experimental results 98
6.1.1 Experiment 1 98
6.1.2 Experiment 2 99
6.1.3 Experiment 3 101
6.1.4 Synthesis 103
6.2 Limitations 103
6.3 Contributions 105
6.4 Suggestions for future work 106
References 108
Appendix 1. Statistical Outputs (Descriptive measures and ANOVA results) 115
A1.1 Experiment 1 results 115
A1.2 Experiment 3 results 117
Appendix 2. Parameters for the long river for Experiment 2 120
Appendix 3. Aggregated Route Selections by Participants 121
A3.1 Experiment 2 Routes 121
Appendix 4. Instructions for the set of paired comparisons 123
A4.1 Copy of Experiment 2 Paired comparisons instructions and form 123
A4.2 Copy of Experiment 3 Paired comparisons instructions and interface 125
Appendix 5. Calculations for the linear scales using the Paired comparisons method 129
A5.1 Experiment 2: Paired comparisons without the Route 5 comparisons 129
A5.2 Experiment 3: Judge Paired comparisons 130
A5.3 Experiment 3: Participant Paired comparisons 132
Appendix 6. Statistical tests for assumptions of Thurstone’s Case V method 135
A6.1 Experiment 2: Paired comparisons without the Route 5 comparisons 135
A6.2 Experiment 3: Participant Paired comparisons 139
A6.3 Experiment 3: Judge Paired comparisons 141
Appendix 7. Estimates of the Discriminal Dispersions for Paired Comparisons Method 144
Appendix 8. Contrasts for Paired comparisons 147
A8.1 Experiment 2: Height Paired comparison contrasts 147
A8.2 Experiment 3: Judge Paired comparison contrasts 148
A8.3 Experiment 3: Participant Paired comparison contrasts 149
Appendix 9. Calculation of Number of Mosaiced Frames for Equivalent Size to dFOV
condition 151
Appendix 10. Additional approaches and pilot tests 152
A10.1 Experiment 1 152
A10.2 Experiment 2 158
A10.3 Experiment 3 159
List of Tables
Table 4.1 - Aggregated confusion matrix of paired comparison judgements for performance
at four Heights: H1, H2, H3, H4. The table should be read as preferences for the column
element over the row element. 63
Table 4.2 - Aggregated scores converted to proportions of the total number of judgments
over all judges (in this case 126). 64
Table 4.3 - Proportion scores in the confusion matrix converted to Z scale units. The values
are then summed along the columns to compute the mean Z values. Finally, the values are
shifted by the minimum value to anchor the values to 0. 64
Table 4.4 - Aggregated confusion matrix, with column totals, ai. 71
Table 4.5 - Results of pairwise contrasts between levels of Height in Experiment 2, following
the Scheffé method outlined in Starks and David (1961). The value in each cell represents a
Q2 test statistic for the column element being preferred over the row element. Critical values
at α = 0.05 and 0.01 are indicated by * and ** respectively. 72
Table 5.1 – The six combinations of display condition and camera elevation angle used in
Experiment 3. 76
Table 5.2 – Sets of pairwise contrasts for the judges in Experiment 3, following the Scheffé
method outlined in Starks and David (1961). Each pair of contrasts is indicated by an X in a
particular row. The value in the last column represents a Q2 test statistic for the column
element being preferred over the row element. Critical values at α = 0.05 and 0.01 are
indicated by * and ** respectively. 85
Table 5.3 - Set of pairwise contrasts for the participant ratings in Experiment 3, following the
Scheffé method outlined in Starks and David (1961). The value in the last column represents
a Q2 test statistic for the column element being preferred over the row element. Critical
values at α = 0.05 and 0.01 are indicated by * and ** respectively. 88
List of Figures
Figure 1.1 - Example of wide (left) and narrow (right) fields of view, taken from Google
Earth 1
Figure 1.2 - Example of an image mosaic, generated by aligning and blending a set of images
with overlapping content. 3
Figure 1.3 - Example of an image mosaic, generated in real-time from a set of video images
as the camera pans from left to right. The white border represents the most recent image
frame in the video. 4
Figure 2.1 - Three displays providing a perspective viewpoint, (a) nominal size FOV, (b)
mosaic FOV, (c) enlarged FOV 21
Figure 3.1 - Example of target used in Experiment 1: (a) target magnified to show textures,
(b) target within flyover terrain. The red dot in (b) represents shadow of the aircraft directly
beneath. In this screenshot, the target is found to the right of the red dot. Forward flyover
motion was along the blue river, from bottom to top, resulting in overall motion of the image
from top to bottom, as indicated by the arrow. 28
Figure 3.2 - 10x10 response grid for novel response method in which participants selected
the route they flew over. 30
Figure 3.3 – Illustrations of (a) route elements, including the curved and straight portions, (b)
Grid layout from which the participants selected the route they flew over. The values for the
length ratio and curvature radii are included here for illustrative purposes; participants only
saw the grid of routes. 32
Figure 3.4 - Illustration of neutral zones and target zones in each flyover. Target zones either
did or did not contain a target, while neutral zones never contained targets. Note that the red box
representing the FOV of the sFOV (travelling from left to right) covers half the length of an
event. 33
Figure 3.5 - Screenshots of the three display conditions for Experiment 1, (a) single size:
sFOV (b) mosaic: mFOV, (c) double size: dFOV 36
Figure 3.6 - Proportion of targets detected in Experiment 1 for each display condition, for all
participants 39
Figure 3.7 - Proportion of targets detected in Experiment 1 for each display condition, for
each participant 39
Figure 3.8 - Example of Route Selected by a participant, and the Correct Route. The
measures of Euclidean and City block distance are also shown. 40
Figure 3.9 - Graph showing the Euclidean distance error over all participants for Route
identification in Exp. 1 41
Figure 3.10 - Graph showing the Euclidean distance error, for each participant for Route
identification in Exp. 1 42
Figure 4.1 – (a) Winding river landscape; (b) analogous computer generated ‘river’,
consisting of sum of four sinusoids. 48
Figure 4.2 - Screenshots of one terrain segment, with constant 60° FOV, displayed at four
heights, (a) H1 = 20m, (b) H2 = 56m, (c) H3 = 92m, (d) H4 = 164m 49
Figure 4.3 - Display of the six Routes selected for Experiment 2, chosen from the long
continuous river (Left). Routes on right show start of each Route with a green marker and
end of each Route with a red marker. 50
Figure 4.4 - Screenshot of Route identification Window. Left: top-down view of entire river.
Centre: response buttons, for controlling response; Right: instantaneous indication of selected
route. Green and red markers indicate respective start and end points of currently selected
route. 52
Figure 4.5 - Examples of ensembles of selected Routes collapsed over all participants at (a)
Height H2, (b) Height H3. Each plot contains 14 Routes in black ink (two for each of the
seven participants), as well as one Route in dashed red ink representing the correct Route.
The routes are translated so that their starting points coincide, while maintaining the original
North up representation (as seen in the Route identification window). 53
Figure 4.6 – An illustration of a Correct Route (in red) and a selected route (in black). The
resulting RMSE score between these two routes would be large, despite the fact that the
shapes are quite similar. 55
Figure 4.7 - An illustration of a Correct Route (in red) and a selected route (in black), where
small deviations in the route occur between the two routes. The resulting RMSE score
between these two routes would be large, despite the fact that the overall shapes are quite
similar. 56
Figure 4.8 - An illustration of a Correct Route (in red) and a selected route (in black), that
exhibit similar shapes but that are mirrored with respect to each other. 57
Figure 4.9 – Examples of four types of errors observed in route selections (black dotted line)
compared to Correct route (red line), (a) Translation error, (b) Phase shift error, (c) Partial
matching error, (d) Mirroring error. 58
Figure 4.10 - Illustration of four types of errors observed in route selections (black dotted
line) compared to Correct route (red line), shown with starting points matching for (a)
Translation error, (b) Phase shift error, (c) Partial matching error, (d) Mirroring error. 59
Figure 4.11 - Final PCM scale values for route identification task for four Heights, from
Table 4.3. 65
Figure 4.12 - Aggregated route selections for Route 2 and Route 5, for each of the four
Heights H1 to H4. Each plot contains 14 Routes in black ink (two for each of the seven
participants), as well as one Route in dashed red ink representing the correct Route. The
routes are translated so that their starting points coincide, while maintaining the original
North up representation (as seen in the Route identification window). 66
Figure 4.13 – Graphs for PCM results for each of the six Routes, aggregated over all
participants, for Routes 1 to 6. 67
Figure 4.14 – Final PCM scale values: (a) using all comparisons; (b) using all comparisons
except those from Route 5. 68
Figure 4.15 - Three computed scale values for Experiment 2 results, (a) all data, Case V
method (b) all data excluding Route 5, Case V method, (c) all data excluding Route 5, Case
III method. 69
Figure 4.16 - Experiment 2 PCM values for all data excluding Route 5, Case III method. The
actual Heights in metres are shown. 70
Figure 4.17 - Plot of the PCM scale values, including contrasts results. Each line indicates
that a significant contrast was found between the conditions at the endpoints of that line. 72
Figure 5.1 - Illustration of the angled (45°) and top down (90°) viewpoints used in
Experiment 3. 74
Figure 5.2 - Display of the six Routes selected for Experiment 3, chosen from the long
continuous river (Left). Routes on right show start of each Route with a green marker and
end of each Route with a red marker. 78
Figure 5.3 - Example of target used in Experiment 3: (a) target magnified to show textures,
(b) target within flyover terrain. In this screenshot, the target is located on the bottom right of
the FOV. 78
Figure 5.4 - Screenshot of the ‘Route flyover’ window for the dFOV condition. Participants
were asked to press the ‘Target detected’ button beneath the image when a target appeared
within the area designated by the red markers. Note: a target is currently showing in the
screenshot, half-covered at the top of the FOV. 79
Figure 5.5 – Screenshot of the ‘Participant Paired Comparison’ Window, presented to the
participants after completing all experimental trials. 81
Figure 5.6 - Graph of target detection performance for the six experimental conditions:
{45°,90°}x{sFOV, mFOV, dFOV}. 82
Figure 5.7 - Plots of PCM results for closeness of aggregated route selections to Correct route
performance: (a) across the three FOV conditions for each Elevation angle, (b) across the two
Elevation angles for each Display size. 83
Figure 5.8 - Two-dimensional plots for data from Experiment 3 PCM route identification
performance, with pairwise contrast results, (a) across the three FOV conditions for each
Elevation angle, (b) across the two Elevation angles for each Display size. 86
Figure 5.9 - Plots of PCM results generated by participants, for the question “which of the
two viewing conditions allowed you to more accurately identify the shape of the Route?”, (a)
across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles
for each Display size. 87
Figure 5.10 - Two-dimensional plot for participant ratings of Display conditions, with
pairwise contrast results, (a) across the three FOV conditions for each Elevation angle, (b)
across the two Elevation angles for each Display size. 89
Figure 5.11 – A route used in Experiment 3 shown at a height of 457.2 m above the terrain.
The entire route is shown in the single FOV (sFOV). 90
Figure 5.12 - Graph highlighting (with a dotted circle) the three scale values that were not
found to be significantly different from each other in pairwise contrasts. 94
List of Abbreviations
ANOVA Analysis of variance
AR Augmented reality
dFOV Double size field of view
DRDC Defence Research and Development Canada
EA Elevation angle
FLIR Forward-looking infrared
FOV Field of view
mFOV Mosaic field of view
PCM Paired comparisons method
RMSE Root mean square error
SAR Search and rescue
SAR tech Search and rescue technician
sFOV Single size field of view
SDT Signal detection theory
UAV Unmanned aerial vehicle
Chapter 1. Introduction
1.1 Background and motivation
The tradeoff between local detail and global context in visual search is commonplace in
daily life. Reading the words on this page requires that the reader ‘zoom in’ on this
particular line on the page, at the expense of the global context of the overall layout of the
page’s paragraphs. In the analogous context of camera viewing for tasks such as
surveillance, reconnaissance, command and control, quality control, etc., these two concepts
are often associated with the size of the field of view (FOV) provided. Extracting local detail
is often performed with a ‘zoomed in’ or high magnification narrow FOV, whereas the
global context of the surroundings is communicated by a relatively wide FOV. This is shown
in Figure 1.1, where the wide FOV (left side) allows the observer to extract the shape of the
canyon, while the narrow FOV (right side) affords local details of a particular area of the
canyon. Unfortunately, a narrow FOV can restrict the assimilation of more global features,
contributing to an operator becoming ‘lost’ with respect to global surroundings (Wickens &
Hollands, 1999).
Figure 1.1 - Example of wide (left) and narrow (right) fields of view, taken from Google Earth
More formally, these acts typically involve some combination of visual search and
wayfinding, both of which require attentional resources to be carried out. Visual search
involves scanning and extracting visual information from scenes to locate a particular object
or feature of interest (Wickens & Hollands, 1999), such as when looking for a particular
article of clothing from a rack, or searching for a set of keys from a cluttered desk. This may
be categorised as a local spatial awareness task, since necessary information can be gleaned
from within the current field of view (FOV). Wayfinding, the purposeful act of orienting
oneself in physical space while navigating between points of interest, is performed, for
example, when travelling between landmarks in a new city or returning home from the
grocery store. That is, on the basis of aggregated views of an environment, one can
develop a global understanding of the surrounding context.
Consider now the interplay between tasks that involve the assimilation of local detail and
those that rely on global understanding of the surrounding environment, such as is found in
visual search and wayfinding tasks. Woods (1984) coined the term ‘keyhole effect’, as an
analogue to the limited visual field experienced while peering through the keyhole in a door.
In a keyhole effect, only local information can be viewed at any one time, leaving the
observer to spatially integrate successive views through the limited FOV. The real-world
manifestation of this effect is often encountered with mediated viewing through remote
cameras, or new imaging technologies that offer high resolution or highly magnified views of
the world, often at the cost of being able to easily understand the surrounding context of the
environment being observed or scanned. Examples can be found in aerial search tasks such
as those in search and rescue (SAR) or unmanned aerial vehicle (UAV) surveillance
operations.
Whenever a task necessitates both local spatial task performance and knowledge of the
global environment simultaneously, the operator may be forced to trade off performance in
one task to maintain performance in the other. This has been known to result in poor
performance in both the local and global tasks, as well as in disorientation. An important underlying
question therefore relates to understanding the factors that might affect the available visual
information that allows operators to perform these tasks. Consequently, with the motivation
of compensating for cognitive deficiencies in mission critical tasks that require both local and
global spatial awareness, the present study investigates whether a real-time image processing
technique called ‘image mosaicing’ (explained below) might be able to enhance spatial task
performance in (simulated) aerial search tasks.
Paramount to making meaningful claims about enhanced task performance is a
critical examination of how performance is in fact evaluated in such spatial awareness tasks.
In particular, the present study also investigates methods for evaluating wayfinding
performance over complex winding routes for which objective computational methods are
not appropriate.
1.2 Image mosaicing/image stitching
An image mosaic is a spatially continuous image created by combining a set of smaller size
images, each containing some overlapping spatial content. An example of an image mosaic is
shown in Figure 1.2, generated from a series of images using a computer software algorithm.
In their most basic form, image mosaicing techniques are based on the ability to align
different images (or tiles) from a scene into a spatially continuous set and to blend them
together (Szeliski, 1996). Research in mosaicing algorithms generally aims to develop
mosaics with as few alignment or blending errors as possible, to create seamless mosaic
images (Irani et al., 1996).
Figure 1.2 - Example of an image mosaic, generated by aligning and blending a set of images with overlapping
content.
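The align-and-blend principle described above can be sketched in a few lines of code. The following Python fragment is an illustrative simplification only, not any cited mosaicing algorithm: the function name `blend_tiles`, the tile offsets, and the canvas size are all hypothetical, and the translation of each tile within the mosaic is assumed to be known in advance, whereas real mosaicing algorithms must estimate the alignment from the overlapping image content itself.

```python
import numpy as np

def blend_tiles(tiles, offsets, canvas_shape):
    """Compose a mosaic from overlapping grayscale tiles.

    tiles:   list of 2-D intensity arrays
    offsets: (row, col) of each tile's top-left corner in the mosaic;
             assumed known here, estimated from image content in practice
    """
    acc = np.zeros(canvas_shape, dtype=float)     # summed intensities
    weight = np.zeros(canvas_shape, dtype=float)  # tiles covering each pixel
    for tile, (r, c) in zip(tiles, offsets):
        h, w = tile.shape
        acc[r:r + h, c:c + w] += tile
        weight[r:r + h, c:c + w] += 1.0
    # Average intensities where tiles overlap; uncovered pixels stay 0.
    return np.divide(acc, weight, out=np.zeros_like(acc), where=weight > 0)

# Two 2x3 tiles overlapping by one column on a 2x5 canvas:
a = np.full((2, 3), 100.0)
b = np.full((2, 3), 200.0)
mosaic = blend_tiles([a, b], [(0, 0), (0, 2)], (2, 5))
# The shared column (index 2) is the average of the two tiles: 150.
```

Averaging in the overlap region is the crudest possible blending rule; published algorithms use feathering or multi-band blending to hide seams, but the structure of the computation is the same.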
Advances in computational power and the decreasing cost of graphics hardware have
afforded many new possibilities in software implementation. For example, it is possible to
create image mosaics of impressive size and detail, on the order of gigapixels per mosaic,
using off-the-shelf hardware. Many modern cell phones in fact contain software applications
for generating image mosaics using the phone’s CPU and graphics chip. Of particular interest
in the present context is the existence of real-time image processing, which allows an image
mosaic to be generated from a video source as the images are captured. Whereas the image
in Figure 1.2 was created offline some time after the image tiles were captured, the
composite image mosaic in Figure 1.3 was generated in real-time from a video source. As the
camera is panned from left to right in the scene, the composited image is updated by aligning
and blending new image frames to the mosaic. The white border represents the most recent
frame in the image mosaic.
Figure 1.3 - Example of an image mosaic, generated in real-time from a set of video images as the camera pans from
left to right. The white border represents the most recent image frame in the video.
The immediately observable effect of mosaicing, evident in Figure 1.2 and Figure 1.3, is
quite simply that more visual information is present in the image mosaic than in any single
image frame. If the white border in Figure 1.3 represents the field of view (FOV) size of a standard
camera setup, one can artificially extend the FOV, by aligning and blending just recently
viewed image frames to the most current image frame. The ability to provide an extended
FOV in software is often cited as a benefit of the technique (Lo, 2008; Shum & Szeliski,
2000; Irani & Peleg, 1991), and is one reason why the refinement of mosaicing algorithms
remains an active topic of research in computer science.
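The real-time variant differs from the offline case mainly in that the mosaic is extended incrementally, one frame at a time, with the most recent frame drawn on top (like the bordered current-frame region in Figure 1.3). The sketch below is again hypothetical: the class name `RollingMosaic` is invented for illustration, and frame offsets are assumed known (e.g. from telemetry), whereas a real system would register each incoming frame against the existing mosaic.

```python
import numpy as np

class RollingMosaic:
    """Incrementally extend a mosaic as new video frames arrive."""

    def __init__(self, canvas_shape):
        self.canvas = np.zeros(canvas_shape, dtype=float)
        self.covered = np.zeros(canvas_shape, dtype=bool)

    def add_frame(self, frame, offset):
        r, c = offset
        h, w = frame.shape
        # The newest frame overwrites older content where they overlap,
        # so the current view always appears crisp atop the mosaic.
        self.canvas[r:r + h, c:c + w] = frame
        self.covered[r:r + h, c:c + w] = True

# A camera panning left to right: three overlapping 2x3 frames
# pasted onto a 2x7 canvas, each shifted two columns to the right.
m = RollingMosaic((2, 7))
for i, frame_value in enumerate([50.0, 100.0, 150.0]):
    m.add_frame(np.full((2, 3), frame_value), (0, 2 * i))
```

After the loop, the canvas spans all seven columns even though each individual frame covers only three, which is exactly the artificially extended FOV described above.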
1.3 Spatial awareness tasks in aerial search
Tasks requiring simultaneous local and global spatial awareness often pose great perceptual
and cognitive challenges to the operators performing them. For example, operators charged
with controlling a UAV through remote video feeds require an awareness of the surrounding
context to be able to understand for example, the vehicle’s current position relative to the
environment, the locations of objects relative to each other and the areas they have already
flown over. The search and rescue technician (SAR tech) must also maintain awareness of
the surrounding environment, as environmental features such as shapes and landmarks are
useful in communicating to the pilot where a target was spotted when the aircraft is to be
brought back around. This can be challenging if the terrain is bereft of landmarks or the
aircraft follows a path whose trajectory is complex.
In addition to maintaining global spatial awareness, performing local detection tasks can also
be very challenging. Visual search tasks in aerial SAR can be time-critical, where the lives of
crash site victims are often in danger, and finding targets is a constant challenge. SAR techs
and UAV operators may use image enhancements such as infrared imaging, with the goal of
improving detection performance. This may be necessary whenever the resolution required to
identify targets is very high. However, high resolution or highly magnified camera systems
are often limited in their field of view, as the cost inevitably increases to produce larger or
wider FOV sensors at high resolution.
Thus, it is imperative that these operators be able to maintain both local and global spatial
awareness in order to successfully complete mission goals.
1.4 Objectives
Given the perceptual issues experienced in mediated viewing tasks, a display whose
effectively enlarged field of view comprises a real-time image mosaic would seem to be a
viable solution to the keyhole effect. That is, for perceptual tasks that are normally
performed with a relatively narrow FOV but at an adequate resolution to accomplish the
search task, one could artificially broaden the FOV in software to supplement the restricted
view of the environment. This would in theory enhance operator performance in spatial tasks,
without any additional hardware. Despite the continued interest in improving the
computational speed and efficiency of mosaicing algorithms, relatively little empirical work
has been done to determine if there is in fact any improvement in operator performance using
real-time mosaiced displays. As such, the present study aims to determine whether the
technique of image mosaicing can enhance human operator performance in tasks requiring
both local and global spatial awareness. Furthermore, the present study uses the notable
example of aerial search as a basis from which to design three controlled studies.
Of critical importance in evaluating the effect of image mosaicing is the appropriateness of
the measures used to evaluate performance, particularly for tasks that require path
integration. Thus in addition to understanding visual information in mosaicing, the present
study examines the advantages and limitations of a number of metrics used to evaluate
performance in spatial awareness tasks, with particular attention paid to measuring
performance in identifying complex routes.
Chapter 2. Literature review and concepts
2.1 Introduction
This chapter begins with a discussion of concepts related to cognitive mapping, dedicated to
understanding the processes, strategies and difficulties involved in forming accurate
cognitive representations of the environment. It is an important component of wayfinding -
the act of purposeful navigation from one place to another (Bowman, 2002), whereby
features from successive glances of the scene are assimilated and combined to continually
update current estimates of position and orientation (Kitchin & Blades, 2002). The methods
of evaluating cognitive maps are also discussed, as careful consideration must be given to
ensure that the collected data are able to provide insights into performance of the complex
spatial tasks. Next, the spatial awareness tasks in aerial search are considered, as an example
from which to base the experimental investigations in this thesis. A brief discussion of the
important concepts and applications of image mosaicing is provided to highlight the unique
properties of the image mosaic and the potential benefits of such a display system. Finally,
the chapter discusses the factors of Viewing Perspective and Display size, identified as being
both relevant to real-life visual search conditions and controllable in an experimental setting.
2.2 Cognitive maps
Concomitant with the study of global spatial awareness is a discussion on the formation of
“cognitive maps”, also known as abstract maps, mental maps and conceptual representations.
The diversity of terms is evidence of the continued and varied interest in understanding
spatial behaviour in a number of disciplines, including geography, psychology, computer
science and anthropology. Cognitive maps are useful in that accessing them can provide
answers to questions such as “Where am I?”, “Where did I come from?” and “Along which
route did I travel to get here?”. Used to describe the spatial, geographical and environmental
knowledge acquired by people in wayfinding tasks, the term cognitive map refers to the
information encoded to embody a person’s cognitive representation of the environment
(Kitchin and Blades, 2002) using both short-term and long term memory (Gärling et al.,
1985). The information contained in cognitive maps generally falls into three categories:
objects or places (Lynch, 1960; Canter, 1997; Russell, 1982), spatial relations between places
(Kuipers, 1978), and travel plans or routes (Siegel and White, 1975). However, one must be
careful not to regard the term cognitive map necessarily as a form of cartographic map ‘in the
head’ (Kuipers, 1982; Cadwallader, 1979; Lowrey, 1970), but rather as a human’s
representation of space and spatial relationships (Kitchin and Blades, 2002; Gärling et al.,
1985).
Related to these concepts are three generally accepted forms of spatial knowledge acquired while traversing an environment: landmark knowledge, route knowledge and configurational
knowledge. Siegel and White (1975) proposed that the levels are arranged in a set pattern of
development. Landmarks are any salient features in the environment that help to anchor
zones or regions of interest to the observer. They are seen as fundamental building blocks to
developing the second level, route knowledge, which involves understanding the connections
between salient landmarks in the environments. With the development of route knowledge,
one acquires knowledge of routes between landmarks or nodes (Golledge, 1978), as well as
estimates of the distances between salient features. The highest level of spatial knowledge
comes in the form of configurational, or survey, knowledge. Configurational knowledge
allows the observer to understand the relationships between different routes in the
environment, which can help to develop higher order spatial knowledge, such as new routes
and shortcuts between points in the environment. While a body of research has sought to
clarify the hierarchical relationships between the three levels (e.g. Golledge et al., 1993;
Ferguson and Hegarty, 1994; Gärling et al., 1981), the general consensus is that the elements
observed from the environment help to develop knowledge about higher order spatial
relationships, which can be used to accomplish spatial awareness tasks successfully.
The interest in cognitive maps and the acquisition of spatial knowledge in the present study is
two-fold. First, there is an intrinsic interest in investigating the viewpoint parameters that
potentially affect spatial behaviour in forming cognitive maps across different proposed
information displays. For the evaluation of human spatial performance using a mosaiced
display, the factors of FOV size and viewpoint height were investigated in Experiments 1 and
2 respectively, while FOV size and viewpoint perspective were varied in Experiment 2.
Second, the results of the present study have many implications for real tasks that require the
accurate formation of cognitive maps during live operations. Among the number of domains
discussed in the following, the spatial processes involved in aerial search are explored in
experimentally controlled (and admittedly contrived) spatial tasks.
2.3 Evaluation of cognitive maps and global awareness
In recognition of the fact that cognitive maps are not a physical manifestation of the spatial
representations encoded in memory, there are two important considerations for the present
study. First, cognitive maps are prone to interference, which can result in distortions of one’s
cognitive map and ultimately a feeling of disorientation or of being lost. It is this deficiency
that new information displays may be able to help resolve. Second, for evaluating
performance in tasks involving the formation of cognitive maps, careful attention must be
paid to how cognitive maps are externalized by observers, as “spatial products” (Liben,
1982). These can be elicited using a variety of techniques, including sketching, estimation,
reproduction and modelling.
Kitchin & Blades (2002) provide a classification of tasks used for assessing spatial
performance using cognitive maps. Unidimensional tasks are used to determine a
participant’s knowledge of the relationship between two locations, and can be divided into
two categories: distance tasks and direction tasks. In distance tasks, the participant is asked to
report the distance between two points, either as a magnitude (e.g. Cadwallader, 1979) or as a
ratio relative to some standard distance (e.g. Lloyd and Heivly, 1987). In some cases,
participants may be asked to draw places on a map on a scale smaller than the estimated
environment, in order that the distances between points of interest can be compared against
the true distances (e.g. Montello, 1991).
In direction tasks, participants are asked to estimate the direction between two places in the
environment, usually requiring the participant to point from a given place to a target place,
either on paper, on a computer screen using a mouse (Kearns et al., 2002; Kitchin & Blades, 2002), or by pointing in the physical world (e.g. Fujita et al., 2010; Loomis et al., 1999). One
such task is the so-called triangle completion task (Tan et al. 2004), commonly used for
evaluating spatial task performance in relatively simple traversed routes. In a triangle
completion task, the participant performs three movements in sequence: translation, rotation
and another translation. The participant must then point back to where he started the
sequence of movements, and the angular error from the participant’s response to the actual
direction represents the sum of all perceptual errors in spatial task performance.
While the triangle completion task is often regarded as one of the most direct ways of
measuring spatial ability (Golledge, 1999), this method, or any method that provides a single
computed value for error, may not be appropriate for evaluating global task performance
when traversing more complex routes, since the rather coarse angular error measure cannot
provide any insight into where along a complex route any misjudgements may have occurred.
Some researchers have questioned the validity of such one-dimensional techniques for evaluating cognitive maps (e.g. Montello et al., 1999; Kitchin & Blades, 2002). In response to the limitations of one-dimensional techniques, several two-dimensional data
collection techniques have been proposed to elicit the participant’s knowledge of the spatial
relationships between elements in the environment. Kitchin & Blades (2002) propose three
types of two-dimensional tasks: completion tasks, graphic tasks and recognition tasks.
Completion tasks involve the participant receiving some portion of a map or diagram
containing information pertinent to a spatial task, and being asked to fill in the rest of the
information. That is, the partial information is meant to prime the participant, relieving him
from having to reproduce an entire map from scratch. For example, Thorndyke & Hayes-
Roth (1982) asked participants to indicate a single location relative to two given points on a
map, while also being provided with scale and orientation information relevant to the task.
Kitchin (1996) varied the amount of information provided to the participant in similar spatial
tasks, and found that the amount of cueing information had an effect on the responses
provided.
Graphic tasks involve the participant producing a sketch or map of the environment (Kitchin
& Blades, 2002). A basic sketch map is one that is minimally defined by the experimenter.
For example, a participant might be given a blank piece of paper and asked to sketch a map
of a city, with no further instructions. By contrast, the participant may be given some constraints, such as being asked to sketch a city with only major roads and street names. In this
case, the participant is said to produce a normal sketch map. There are several advantages to
the sketch map technique, as it requires the participant to express environmental features in
relation to one another. The technique is also simple to employ, and most adult participants
are familiar with the idea of drawing sketches.
However, there are also a number of limitations to the sketch map technique. Most notably, the quality of a sketch map depends on the graphical skill of the participant, as well as the participant's ability to express a cognitive map as a sketch. Furthermore, Beck and Wood (1976) reported that participants were unwilling to adjust details of features they had already positioned on paper, which may result in distorted sketch maps. Finally, once the sketch maps are collected, the challenge lies in evaluating or comparing their features. In other words, because participants are given relatively few restrictions, their responses may be difficult to quantify.
In recognition tasks, participants are asked to identify a configuration of objects or places to
which they have been exposed. Participants may be asked to identify a feature on a map or
aerial photograph of a familiar area, or be shown several configurations and be asked to
identify the correct spatial configuration. Wang (2005) used the latter technique, showing
participants physical scale models of possible tunnel configurations in order to evaluate
global spatial awareness. Evans et al. (1980) had participants walk routes through buildings
and then asked them to identify the individual routes from a number of floor plans.
Recognition tasks may be advantageous in that they provide a closed set of possible
responses, allowing the participant to recognise the correct configuration rather than recall
its exact properties. Compared to reproducing a sketch map, a participant may be more adept at identifying the correct map among a set of alternatives, thus obviating the
graphical skills required to produce a sketch of the environment. Finally, as with the graphic
tasks, recognition tasks may have ecological validity in that participants may also be familiar
with tasks that involve identifying areas or routes along a given map.
Given the wide variety of spatial tasks used in exploring cognitive maps, particular attention
must be paid to the methodological issues involved in selecting an appropriate method of
evaluation. Indeed, Kitchin and Blades (2002) caution that experimenters frequently do not
provide a justification for their choice of technique, which can introduce significant
problems, since some techniques may be more appropriate or less appropriate for certain
environments and populations. As is discussed in the experiment chapters of the present
study, careful attention was paid to the selection of an appropriate evaluation technique,
given the demands of the two tasks and the complexity of the environment being traversed.
2.4 Image mosaicing
2.4.1 Basic principle of mosaic construction
The implementation of real-time image mosaicing used in this study was developed in a
MASc thesis project by Hok Man Herman Lo (2008) at the University of Toronto, as part of
a suite of software tools for improving visualization in laparoscopic surgery. While the reader
is invited to consult that MASc thesis for a full description of the mosaicing algorithm, a high
level description of the process is offered here to familiarise the reader with the important
concepts.
Constructing an image mosaic consists of three major steps: image registration, projection,
and blending (Mann, 2002). In the alignment (or registration) phase, we attempt to develop a
model of the geometric relationships between pairs of images (Schmidt et al., 2000). The
mapping between two images is called the projective coordinate transformation, which
ranges from relatively simple translations and rotations, to more complex full projective
transformations (Brown, 1992). The software implementation used in the present study
matches specific groups of pixels called ‘feature points’ that appear in the images. The
images in subsequent frames are then projected to be in alignment with one of the images
(called the reference image) and then blended together using one of several techniques
(Szeliski, 2006).
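As a deliberately simplified sketch of these three steps (not Lo's algorithm, which matches feature points and fits full projective transformations), consider one-dimensional "frames": registration reduces to finding an integer shift, projection to placing the frame at that shift in the reference coordinate system, and blending to averaging the overlap:

```python
def register(ref, frame, max_shift):
    """Registration: find the integer shift aligning `frame` to `ref`
    by minimising the mean squared difference over the overlap."""
    best_shift, best_cost = 0, float("inf")
    for s in range(0, max_shift + 1):
        overlap = len(ref) - s
        if overlap <= 0:
            break
        cost = sum((ref[s + i] - frame[i]) ** 2 for i in range(overlap)) / overlap
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return best_shift

def blend(ref, frame, shift):
    """Projection and blending: place `frame` at `shift` in the reference
    coordinate system and average the two images where they overlap."""
    width = max(len(ref), shift + len(frame))
    mosaic = []
    for x in range(width):
        vals = []
        if x < len(ref):
            vals.append(ref[x])
        if 0 <= x - shift < len(frame):
            vals.append(frame[x - shift])
        mosaic.append(sum(vals) / len(vals))
    return mosaic

scene = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
ref, frame = scene[:8], scene[4:]          # two overlapping "camera frames"
shift = register(ref, frame, max_shift=6)
mosaic = blend(ref, frame, shift)
print(shift, mosaic == [float(v) for v in scene])  # 4 True
```

The toy recovers the full scene from two partial views, which is the essence of how a mosaic broadens the effective FOV beyond any single frame.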
2.4.2 Off-line applications
Some early examples of image mosaicing can be traced to underwater photographic
surveying, where photographs of the ocean floor were physically laid out and taped together
to form a single coherent map (Pollio, 1968). More recently, the fields of photogrammetry,
computer vision, image processing and computer graphics have sought to develop algorithms
that automate the process of mosaicing (Irani et al., 1996; Shum & Szeliski, 2000), with the
goal of minimising the ‘residuals’, or remaining misalignments, when integrating individual images into the mosaic. While the traditional applications for automated
mosaicing algorithms are found in satellite and aerial imagery (Shum & Szeliski, 2000), its
adoption in different fields is growing rapidly. Applications have expanded for example to
video processing, including file compression, search and indexing, and change detection
(Irani et al., 1996). Mosaicing has also been used to increase photo resolution (Irani & Peleg,
1991) and to emulate the effects of true panoramic cameras (Irani et al., 1996; Szeliski,
1994).
2.4.3 Potential applications of real-time image mosaicing
With the advent of powerful, low-cost computing hardware, mosaicing algorithms can now be
applied to live video images, allowing mosaiced images to be ‘painted’ from the streaming
output of a video source (Shum & Szeliski, 2000). Consider a display where, instead of
continuously updated live images from a video source, an observer is presented with an
augmented video stream comprising frames captured in the recent past that are automatically
aligned and integrated into a single larger image.
This technique opens up a number of interesting possibilities for assisting an operator in
performing a live search task (i.e. in real time), where a narrow camera field of view often
restricts assimilation of global context as local details are taken in. A display augmented
with mosaiced images effectively constructs a broadened field of view using previously
captured camera images, at the same resolution needed to identify features and objects. This
mosaiced view provides a more global context of areas viewed in the recent past. The
following are examples of some domains in which such a real-time image mosaicing
capability might be used.
2.4.3.1 Histopathology
A histopathologist examines pathology slides (glass slides containing thin slices of tissue
stained with chemical dyes) to identify features in cells that are consistent with known
diseases. The histopathologist examines the slide using a microscope, with a set of eyepiece
lenses and (fixed magnification) objective lenses. The combined magnifications commonly
reach up to 400x. To examine different areas of the slide, the histopathologist either moves
the slide directly with her hand (freehand) or, depending on the circumstances, manipulates
control knobs to move the mechanical platform holding the slide. She must switch back and
forth between discrete levels of magnification to maintain context while examining local
details. This procedure is complicated by the movement of the slide, as even small
displacements cause an amplified change in scene under high magnification.
In response to these challenges, the work of Lo (2008) has been spun off into a company,
ViewsIQ (http://viewsiq.ca/), which uses his real-time image mosaicing algorithm to
automatically integrate the image frames coming from the imaging scope into a mosaic.
2.4.3.2 Remote camera surveillance
Pan, zoom and tilt cameras are commonly used for surveying large areas, allowing the
operator to control the orientation and FOV of the image produced. When objects such as
faces or license plate numbers must be identified, the human operator (HO) may need to
zoom out to maintain an awareness of the object's location relative to the area being
surveyed. An extra challenge occurs when the object of interest is moving within the area, in
which case the operator must track the object to keep the object within the camera's FOV.
2.4.3.3 Aerial search and rescue
A search and rescue (SAR) operator scans the environment outside the cockpit of a fixed
wing aircraft traversing its flight path. Perhaps the aircraft is passing over a canopy of trees
while the operator or ‘spotter’ must identify objects of interest such as wreckage or survivors.
To complicate the task of searching out the window of a moving aircraft, consider the spotter
using a view mediated by camera sensors (using infrared imaging, for example) to extract
local details of a small portion of the terrain below. In this situation, the spotter has no contextual information outside the narrow FOV of the sensor. The operator must either pan slowly to scan the forest without feeling disoriented (Carver, 1990), or zoom out to regain global context. In either case, the operator may miss important local
details as the forest canopy rushes past.
A research project at Defence Research and Development Canada (DRDC) was undertaken
to develop a real-time mosaic system using a Forward-Looking Infrared (FLIR) sensor.
As part of the ‘Infrared Eye’ project, an opto-mechanical pointing system rapidly steers the
narrow FOV IR camera to capture high resolution images that are stitched together,
producing high-resolution wide FOV images at very high speed. The goal was to provide
“the operator with fast access to points of interest without losing situation awareness.”
(Lavigne and Ricard, 2005). A system testbed was constructed but unfortunately never flew,
due to funding constraints.
2.5 Studies evaluating human spatial performance in real-
time mosaicing
The present study focuses on investigating performance in local and global awareness tasks
using a mosaiced FOV. More specifically, I investigated global awareness, in the form of
route identification, and local spatial awareness, in the form of target detection. With regards
to previous research on this topic, the interest of the research community has focussed
primarily on improving the computational effectiveness and speed of mosaicing algorithms,
with few studies dedicated to evaluating the performance of users of mosaicing displays.
From the research that has been found on this topic, we first consider work evaluating human operator (HO) performance in search tasks¹ using real-time image mosaic displays, followed by desktop augmented reality applications.
2.5.1 Aerial search – Morse et al. (2008)
Morse et al. (2008) conducted a HO performance evaluation of a real-time mosaicing system,
termed a ‘temporally local mosaic’, which appended a limited number of mosaiced frames to
the single size FOV. The participants were given the primary task of identifying stationary
targets (red umbrellas) embedded in short video clips of flight paths over a terrain, while
completing a secondary task of identifying red coloured spots among multi-coloured spots on
a second monitor. Two display conditions were investigated; the participants used either a
relatively small fixed size FOV or the mosaiced FOV. There were no explicit non-target
distractors in the experiment.
Morse et al. found that more targets were detected in the mosaic FOV condition compared to
only the single FOV, with no difference in secondary task performance between the two conditions. By examining where in the display participants identified the targets, Morse et al. also confirmed that participants did in fact detect targets in the mosaic that they had missed in the single FOV.

¹ With regards to the potential of automatic detection algorithms in aerial SAR, Baker and Youngson (2007) considered it unlikely that a sufficiently effective system could be developed, citing reasons of low signal-to-noise ratio of many targets in the environment, as well as the high false alarm rates that might be expected. As such, human operators continue to be deployed for these demanding visual search tasks.
While the results of this investigation suggest that there may be benefits to a mosaiced image
display system for local detection tasks, two issues with the experimental design are noted.
First, as there were no explicit trials without targets present, the participants could have
simply guessed whether they saw a target. In other words, there were no recorded Correct
Rejections and thus no possibility to compute the signal detection parameters d’ and Beta.
However, Morse et al. found that participants responded to artefacts in the display caused by
noise in the video transmission or misalignment of the frames. It was found that more of
these ‘false positives’ occurred for trials using the mosaic display.
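For reference, the signal detection parameters mentioned above are computed from hit and false alarm rates as follows. This is textbook signal detection theory, not an analysis performed by Morse et al.; it simply shows what their design made impossible to compute:

```python
import math
from statistics import NormalDist

def signal_detection(hit_rate, fa_rate):
    """Return (d', Beta) from hit and false-alarm rates.

    d' = z(H) - z(FA) measures sensitivity; Beta is the likelihood ratio
    at the decision criterion. Rates of exactly 0 or 1 must be corrected
    before calling (e.g. the common 1/(2N) adjustment), since z is
    undefined at the extremes.
    """
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    d_prime = z_hit - z_fa
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2.0)
    return d_prime, beta

# Symmetric performance: 80% hits, 20% false alarms.
d, b = signal_detection(0.8, 0.2)
print(round(d, 2), round(b, 2))  # 1.68 1.0
```

Without target-absent trials there is no false alarm rate over Correct Rejections, so neither quantity can be estimated from Morse et al.'s data.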
The second issue with Morse et al.’s design is that they provided no indication that the
effective FOV of the mosaic remained constant (within reason) throughout the experiment.
In fact, it appears from their screenshots that the size of the mosaic varied during the trial,
caused by the fact that the trajectory of their camera followed a non-linear path. Therefore, it
is difficult to establish any relationship between the increase in size of the effective FOV and
target detection performance.
2.5.2 Desktop augmented reality – Jeon and Kim (2008)
Jeon and Kim (2008) investigated the effect of FOV and display size (including real-time
image mosaicing) on task performance in a desktop tangible augmented reality (DTAR)
application. Augmented reality (AR) was used to enhance camera scenes viewing a desktop
environment, by overlaying virtual computer graphic generated elements onto the camera
view. A “nominal” FOV afforded by a conventional webcam mounted to the participant’s
head provided a limited FOV (40 degrees) of the desktop (72 x 120 cm). A “mosaiced” FOV
generated from the webcam provided an extended FOV (although no details on the number of mosaiced frames were provided). The experimental setup was somewhat unconventional,
as the camera views augmented with AR elements were observed on an upright computer
monitor behind the desktop. In other words, for the “nominal” and “mosaiced” conditions,
the participant had to point his head toward the desktop while gazing at the monitor at a
different location.
Participants were shown a virtual object of a certain shape and colour in the middle of the
desktop, and were tasked with moving the virtual object to a different location on the
desktop. To investigate the effect of the two display conditions – with and without mosaicing
– on performance, Jeon and Kim collected time to completion data from 20 participants who
performed 100 trials for each condition. The results indicated that the times to complete the trials in the mosaiced FOV condition were significantly lower than for the nominal FOV, and
there were significantly fewer movements of the head. Participants also showed a preference
for the mosaiced FOV based on Likert scale ratings to the question “How easy/convenient
was it to carry out the task in this environment?” The authors concluded that the extended
FOV afforded by the mosaiced condition provided performance enhancements, despite the
artefacts seen in the mosaiced images.
In the same paper, Jeon and Kim (2008) conducted a follow up experiment to compare
performance using a mosaiced FOV with that of a “fixed” FOV, afforded by a wide FOV
camera mounted directly above the desktop that provided a view of the entire desktop. Thus
the fixed FOV represented the widest FOV of the three DTAR conditions. The results from
12 participants indicated that the times to complete the trials in the fixed FOV condition were significantly lower than those for the mosaiced FOV; participants also preferred the fixed FOV based on Likert scale ratings of the preferred viewing condition. The authors
posited that a mosaiced FOV could be useful in situations where providing a view of the
entire work area is impractical or impossible. They also reiterated the importance of
providing a FOV that is contextually relevant for the task at hand, positing that the results
may have been different for tasks involving close up manipulation.
This last point is particularly relevant for the experiments prepared in the present study. Jeon
and Kim (2008) investigated a task where there was no tradeoff in providing a larger view of
the environment. In other words, providing a view of the entire desktop area would
invariably be expected to provide the best performance compared to narrower fields of view
and thus, perhaps the results are not that surprising. In the present research, which focuses on
tradeoffs between local and global spatial awareness, it will be seen (in Experiment 2) that it is
necessary to calibrate the height parameter in order to create a fair comparison – i.e., an
experiment whose results are not easily predictable in advance – between the different
display conditions.
2.6 Global and local spatial awareness in (teleoperated)
aerial search
As outlined in the introduction, it is common for the complementary features of wide and
narrow FOVs to be traded off in many tasks. As stated by Vos (1990): ‘... the gain in
visibility is obtained at the cost of “searchability”’. Thus, in situations where both local
detail and global context are needed simultaneously, the ability to trade off these forms of
visual information has a critical impact on the operator's ability to conduct visual search.
2.6.1 Global awareness in aerial search
Understanding the spatial relationships between the UAV and the terrain, other aircraft, points of interest (such as refuelling stations), and targets in the environment is crucial to successfully completing UAV operations (Drury et al., 2006). Because the UAV operator is remotely located from the aircraft, the multisensory information about the surrounding environment that one normally receives when directly flying an aircraft is no longer available.
For example, UAV operators do not have access to kinaesthetic cues pilots of manned
aircraft use to gain an understanding of turbulence, aircraft movement and gravitational
forces (Hopcroft et al., 2006). Thus the operator performs control manoeuvres using visual
information provided by cameras mounted onto the UAV, transmitted via a data link to the
operator. However, the visual information can be limited in terms of image quality and FOV
(Draper and Ruff, 2000; van Erp, 1999), making the formation of accurate and up-to-date cognitive maps of the environment challenging.
The benefits of providing greater context in global spatial awareness tasks are well known in
the literature. For example, Hodgson (1998) found that enlarging window size, and thus
global context, increases accuracy of human operators in identifying land use types from
aerial photographs. A coordinated series of zooming and panning movements was found to
be a prevalent technique for preserving global context in long distance pointing tasks using
multi-scale interfaces (Bourgeois et al., 2001; Pietriga et al., 2007).
Much research has been carried out on the impairment of spatial cognition based on display
parameters and positional relationships between the observer and the environment. In
particular, the current study focuses on global tasks such as map reading tasks or route
identification, whose response or output relies on an exocentric perspective. That is because,
for the aerial search tasks simulated in the experiments presented here, global spatial
awareness was evaluated using an exocentric recognition task, in spite of the fact that the
(simulated) world was experienced from an egocentric viewpoint, i.e. looking out the
window of an aircraft in a SAR-like task. Clearly, providing an exocentric response requires
a mental transformation between the two perspectives.
2.6.2 Local spatial awareness in aerial search
Efforts to understand the process of visual search in aerial SAR tasks generally fall under two categories. On the one hand, one group of studies has investigated eye movement data
collected during SAR search, to make inferences about the gaze behaviours of spotters (Croft
et al., 2007; Stager and Angus, 1978; Stager and Angus, 1975). Those studies have been
particularly useful for estimating visual field coverage during simulated SAR, for identifying
differences in scanning behaviours between novice and expert spotters, and for determining
the effectiveness of new training programs for spotters.
On the other hand, efforts have also been placed in empirically determining the factors
related to the environment, the targets, and the operational parameters that influence visual
search in aerial SAR. The work of Stager (1974, 1978) was important in this regard, as it
showed that the rate of motion perceived by an operator varies as a function of the viewing angle away from the perpendicular from the aircraft to the ground (Stager, 1974). This
angular velocity is defined as the rate of change of the angle subtending points or objects
moving across the terrain. As the operator gazes away from the terrain beneath the aircraft
(i.e. as the camera elevation angle decreases) towards the horizon, the angular velocity
decreases, which allows an object to remain within a given fixation radius for a
proportionally longer time. Furthermore, the aircraft’s altitude has an effect on angular
velocity; as the altitude is increased, the angular velocity decreases. Stager (1974) cited
practical concerns of searching for objects when altitude was low, as searching through dense
forests would become “nearly impossible”.
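The geometry behind these observations can be captured in a simple model (my own formulation for illustration, not Stager's equations): for an aircraft at altitude h moving at ground speed v, a ground point at horizontal distance x from the nadir subtends an angle atan(x/h) from the vertical, which changes at rate v·h/(h² + x²). Both effects described above fall out of this expression:

```python
def angular_velocity(v, h, x):
    """Angular velocity (rad/s) of a ground point at horizontal distance x
    from the nadir, viewed from altitude h at ground speed v.
    From phi = atan(x / h):  d(phi)/dt = v * h / (h**2 + x**2)."""
    return v * h / (h ** 2 + x ** 2)

v, h = 60.0, 300.0  # assumed ground speed (m/s) and altitude (m)
# Gazing toward the horizon (larger x) slows the apparent motion...
print(angular_velocity(v, h, 0.0) > angular_velocity(v, h, 1000.0))           # True
# ...as does flying higher, for points near the nadir (h > x).
print(angular_velocity(v, 300.0, 200.0) > angular_velocity(v, 600.0, 200.0))  # True
```

The fastest apparent motion occurs directly beneath the aircraft (x = 0, where the rate is v/h), which is consistent with Stager's observation that objects remain within a fixation radius proportionally longer as gaze moves toward the horizon.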
While attempts have been made to fully automate micro-UAV operations using artificial intelligence and computer vision techniques, thereby obviating the need for global spatial awareness, the fact remains that certain operations may require operators to remain "in the loop". For a survey of these efforts, see Michael et al. (2012). Two examples of such operations are covert
reconnaissance missions where enemy positions are to be visually confirmed and missions
for designating targets by training lasers on objects of interest (Austin, 2010). Furthermore,
in those cases where multiple objects must be tracked outside the FOV of the UAV’s camera,
errors in global and local spatial awareness can occur. Thus it is clear that both forms of
spatial awareness should be considered if image mosaicing is to be investigated as potentially
beneficial to performance in spatial tasks.
2.7 Relevant parameters for the present study
Taking into account the visual properties afforded by a real-time image mosaicing algorithm,
as well as the wealth of factors that may affect spatial task performance in aerial search type
tasks, it was crucial to identify a subset of those factors that are not only relevant to real-life visual search conditions but whose effects could also be manipulated in a controlled
experimental setting. To that end, it was decided to consider the effects of Viewing
Perspective and Display Size.
2.7.1 Viewing perspective/Elevation angle
Wayfinding errors can occur during the encoding of spatial information due to a variety of
factors, including incorrect sensing of velocity and time and distorted frames of reference as
the world is traversed (Golledge, 1999). The viewing perspective, manifested through the
camera’s elevation angle relative to the terrain’s surface, has implications for the present
study for two reasons.
First, it stands to reason that distortions such as those caused by perspective foreshortening
due to relatively small elevation angles (Andre et al., 1991), in combination with a restricted
FOV, might contribute to encoding errors in the formation of cognitive maps, and thus
degrade global spatial awareness (Wickens et al., 1989). The relevance of this issue has
increased with the advent of computer graphics, prompting a number of investigations into
performance with the use of so-called 3D displays, which usually provide information about
a scene from a perspective viewpoint. Performance using 3D displays is normally contrasted
with that of single or multiple coplanar 2D views of the same environment, to observe any
differences in, for example, route planning, judgements of relative position and mental
workload during operation. Experimental results are decidedly mixed, as many studies have
found 3D displays to be superior to 2D displays (e.g. Ellis et al., 1987; van Breda & Veltman,
1998); some studies have found performance to be roughly equal (e.g. Wickens et al., 1996);
while others have found 3D displays to be inferior (e.g. O’Brien & Wickens, 1997; Boyer et
al., 1995). After considering the wealth of studies in planar vs. perspective viewing, Haskell
& Wickens (1993) drew the logical conclusion that the results may be task dependent.
Second, the effect of viewing perspective has practical implications for an operator who
might use a real-time image mosaicing display in practice. Because stitching algorithms are
based on geometric relationships between individual frames, the mosaic can manifest itself
on the display in a pronounced way. Figure 2.1 shows three display conditions: on the left,
Figure 2.1(a) shows a nominal size FOV. On the right, Figure 2.1(c) shows an enlarged FOV,
whose display size is twice that of the FOV in Figure 2.1(a). This view is what could be
expected if one were to use a different lens, and thus simply extend the view of the
environment at the same resolution as the nominal size FOV.
(a) (b) (c)
Figure 2.1 - Three displays providing a perspective viewpoint, (a) nominal size FOV, (b) mosaic FOV, (c) enlarged
FOV
The mosaic FOV condition shown in Figure 2.1(b) provides not only an extended size FOV,
but also, as a consequence of changes to the image content in the video frames, displays the
shape of the path of motion traversed in the most recently displayed frames. For example, the
screenshots in Figure 2.1 show the path of the camera following a curved trajectory, which in
the case of the mosaic FOV (Figure 2.1(b)) is shown directly on the screen. In other words,
the mosaic view imparts not only that there is a curved river in the environment at that
moment in time, but also that the camera just recently travelled through the scene in a curved
motion path. The image frames in the other FOV conditions in Figure 2.1 only convey the
former information, that there is a river present in the scene.
As one adds more frames to the mosaic in a perspective view, the mosaic view creates a
“tunnel effect” (a term coined by the author), where past images are appended to the outsides
of the most recent frame, as the camera traverses the scene. This additional visual
information, computed purely from the relative motion between successive frames and
superimposed along with the most recent image frame of the camera, is unique to the mosaic
display.2
Of course, the shape of the camera’s path can also be inferred by spatial integration when
viewing a sequence of images (i.e. video) of the camera travelling through the environment
using any of these displays. However, the mosaic directly provides additional information
about the terrain features, as seen in the (in the present example, doubled) FOV, as well as
the path-shape feature, without the need to mentally integrate that information over
successive image frames. It is these surmised advantages that form some of the bases of the empirical
investigations of the mosaic FOV in the present study.
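The compositing step just described can be sketched compactly. The following is not the thesis software (the experimental platform, described in Chapter 3, was built in Matlab); it is a minimal Python illustration, under the assumption of pure forward translation between frames, with toy frame sizes chosen for readability:

```python
def composite_mosaic(frames, row_offsets):
    """Composite frames that are related by pure forward (row-wise)
    translation. Each frame is a 2-D list; row_offsets[i] is frame i's
    shift, in pixels, along the direction of travel. Later frames
    overwrite earlier ones where they overlap, so the most recent
    image stays intact, with older imagery appended behind it."""
    h, w = len(frames[0]), len(frames[0][0])
    base = min(row_offsets)
    height = h + max(row_offsets) - base
    canvas = [[None] * w for _ in range(height)]
    for frame, off in zip(frames, row_offsets):
        for i, row in enumerate(frame):
            canvas[off - base + i] = list(row)
    return canvas

# Seven toy frames (8 rows x 6 columns, filled with the frame index),
# each shifted 2 px further along the track: a pure forward translation.
frames = [[[k] * 6 for _ in range(8)] for k in range(7)]
mosaic = composite_mosaic(frames, [2 * k for k in range(7)])
print(len(mosaic), len(mosaic[0]))  # 20 6: the mosaic grows along the motion axis
```

As in the mosaic display itself, the extra extent comes entirely from the frame-to-frame offsets; a straight trajectory yields a straight strip, while a curved trajectory (offsets with a lateral component, not handled in this simplified sketch) traces the shape of the path.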
2.7.2 Display size (and resolution)
Because human observers use knowledge of landmarks as a means to form higher order
spatial relationships, increasing the size of the FOV through mosaicing has the potential
benefit of providing more time for those landmarks to be encoded into spatial memory.
However, there appears to be little consensus in the literature as to whether a wider FOV and
larger display sizes will necessarily result in a gain in spatial performance in aerial SAR type
tasks. For example, Brickner & Foyle (1990) found that navigation through a
computer-simulated slalom course was more accurate with a wider FOV of 55 degrees than
with a narrower FOV of 25 degrees. In surveying target detection performance, Crebolder et
al. (2003) posited that an “intermediate” FOV size provides optimal performance when using
multisensor surveillance imaging systems to support visual search tasks under poor visibility.

2 It is interesting to note that this extra cue provided by the successive frames has not been programmed to appear, but is a direct consequence of the technology itself. It is also relevant to point out that, in contrast to the curved trajectory illustrated in Figure 2.1, if the trajectory were to have been straight, then this extra cue would not be visible, and Figure 2.1(b) would, in principle, be identical to Figure 2.1(c).
A parameter related to Display size is the resolution of the sensor capturing the visual
information, as well as the display resolution of the monitor used for target detection tasks.
Warner & Hubbard (1992) found that using a higher resolution narrow FOV camera sensor
improved detection performance compared to a low resolution wide FOV in a flight
simulation. Given these results, it was critical in the present study to consider only Display
conditions whose camera and display resolutions scale equally, so that only the size of the
Display is manipulated relative to the baseline FOV condition. To that end, the present study
includes two Display sizes: the mosaic FOV and the double size FOV, both of which share
the same resolution but are twice the size of the nominal (or single) FOV. The reader
should refer to Appendix A10.1 for a treatment of other Display conditions that were
considered for the present study.
One confounding issue is the proportion of the display area that can be effectively scanned
by an observer as the environment moves past the camera’s field of view. For example,
although a larger size display may allow the observer more time to detect an object passing
through the FOV, the amount of displayed information that must be searched also increases.
In the presence of distractors (non-target objects in the environment), a wide FOV or large
size display also means that more distractors are present at any one time. Thus, the observer
may not be able to effectively scan the entire area, and may miss targets due to poor
coverage.
2.7.3 Speed of traversal and height above terrain
Other factors that were considered for investigation were the height of the aircraft above the
terrain and the speed of traversal across the terrain. Concerning the height, it was decided
that a fixed height would be used, since the height above the terrain would influence the
visibility of the targets in the environment. In order to avoid the confound of targets being
easier to detect at lower heights, the height was fixed for all display conditions in
Experiments 1 and 3. However, it will be shown that a calibration of height was necessary for
the route identification task, which was the goal of Experiment 2, in which case no target
detection task was deployed.
Varying the speed of traversal was considered as well, as a way to control for the time that an
object (i.e. target) remains on the screen during a flyover, between one FOV and one of twice
the size. In other words, if a target remains in the FOV of Figure 2.1(a) for 2 seconds, for
example, one should double the speed of traversal to ensure that the target appears for the
same amount of time in the FOVs in Figure 2.1(b) and (c). However, one caveat in that case
is that the total flyover time would be cut in half, presenting a disadvantage to display
conditions with a larger Display size when trying to understand global features of the
environment. For this reason, the speed of traversal was fixed for all display conditions for
all three Experiments.
2.8 Summary
The author will henceforth refer to the local spatial awareness task as the target detection
task, and to the global spatial awareness task as the route identification task.
A review of the literature on the formation of cognitive maps as a means of maintaining
spatial awareness provides a basis from which to begin investigating the potential benefits of
real-time image mosaicing during traversal of an environment. This software technique
provides an artificially broadened effective field of view (FOV), by
aligning and blending images from a relatively narrow FOV to form a larger spatially
continuous image.
Furthermore, during a flyover manoeuvre, the real-time image mosaic also augments the
display with additional shape properties that directly indicate the path of the camera’s
motion. Thus, another potential benefit unique to mosaicing is that the spatial relationships
among the most recently viewed frames promote more efficient formation of route knowledge,
compared to both a smaller FOV and an equivalently sized but fixed FOV. Accordingly, the
empirical investigations presented in this study seek to investigate these potential benefits
explicitly, as well as to supplement the existing research on human performance using real-
time image mosaicing.
Chapter 3. Experiment 1
3.1 Introduction
Experiment 1 was designed as an exploratory investigation of simultaneous local and global
task performance, to investigate the theoretical benefits of using a mosaiced FOV (mFOV)
display over “conventional displays”. In the experiment, the mosaiced FOV was contrasted
with a baseline single size FOV (sFOV) as well as a FOV of twice the single size (dFOV).
Each participant was asked to watch a series of recorded videos of a flyover of a simulated
terrain from a top down perspective, and was asked to identify targets as he flew over the
terrain. The image presented looked like some variation of Figure 3.1(b), for which motion of
the image was from top to bottom (representing forward motion from bottom to top relative
to the terrain). The vertically oriented blue line in the figure represented a river, along which
the simulated aircraft flew during the flyover. After the video had ended, each participant was
asked to select the shape of the route he had flown over.
In the following sections, the rationale behind the design of the experiment is discussed, as
well as a description of the platform, experiment parameters and hypotheses. (For further
considerations made in Experiment 1 regarding display conditions, moving targets, response
methods and data analysis methods, please refer to Appendix A10.1.) Results and analysis of
performance in the target detection and route identification tasks are discussed, followed by a
general discussion.
3.2 Experimental tasks
3.2.1 Target detection
As described in Chapter 1, the present study was inspired by the difficulty in aerial search of
discriminating between a signal (a difficult to spot target on the ground) and noise (any non-
target object), while in motion above the terrain. A number of options were considered for an
appropriate paradigm that would reflect the visual information processing demands of two
simultaneous spatial awareness tasks. While it cannot be claimed that the target detection
task devised was completely representative of an actual aerial search operation, the task was
nevertheless considered to adhere to most generic attributes of aerial signal detection.
In the study by Morse et al. (2008), participants were asked to identify all targets (appearing
as red umbrellas), without regard for misidentified or missed targets, resulting in no means of
scoring any Correct Rejections or False Alarms. In the present study, the environment was
segmented into zones, to act as discrete events and thus enable analysis using signal detection
theory (SDT). That is, some of these segments contained no targets, so that data on False
Alarms and Misses could be collected. Between 3 and 6 targets were implanted in each
flyover route, with each segment containing no target or one target.
The flyover videos were recorded from a top down or “bird’s eye” perspective, with the
camera viewpoint fixed and pointing down towards the terrain, perpendicular to the direction
of travel. This was done to avoid the potential confound of perspective foreshortening, where
objects or parts of objects would appear compressed depending on the camera’s perspective
relative to the objects. This would have potentially complicated performance in the target
detection task, as targets would have appeared as expanded (i.e. larger) for the extended FOV
display conditions.
In order to avoid potential floor and ceiling effects in the detection task, targets were
designed to be ‘moderately difficult’ to detect relative to the background texture. This
involved several iterations of pilot tests, in which several sizes, shapes and textures were
considered for the targets and matched against the background terrain. All candidate targets
were evaluated informally by both the author and his colleagues until a single target was
decided upon, as illustrated in Figure 3.1(a).
The reader should note that this approach is markedly different from that of Morse et al.
(2008). In contrast to the target red umbrellas used in their study, which presumably were the
only red objects in the environment3, the targets in the present study were designed to blend
into the surrounding environment more closely. This was done to simulate the kinds of
challenges faced in real aerial search tasks, where discrimination between targets and
distractors (any non-target object) can be difficult (Croft et al., 2007). For this reason, highly
conspicuous (red) targets such as those used by Morse et al. (2008) were not implemented in
the present study.4 This distinction will become important when discussing the results of
Experiment 1.

3 This was based on the screenshots included in the Morse et al. (2008) paper, as well as the fact that no explicit distractors were included in their study.
(a) (b)
Figure 3.1 - Example of target used in Experiment 1: (a) target magnified to show textures, (b) target within flyover
terrain. The red dot in (b) represents shadow of the aircraft directly beneath. In this screenshot, the target is found to
the right of the red dot. Forward flyover motion was along the blue river, from bottom to top, resulting in overall
motion of the image from top to bottom, as indicated by the arrow.
Each participant was asked to identify targets as the video flyover was played. In order to
ensure that the participants were in fact responding to the target they had detected (and not
some distractor on the screen), measures were taken to record where the actual target was
located. Participants were instructed to indicate the location of the target as quickly as
possible, primarily in order to minimise any interference with the route identification task,
described below.
3.2.2 Route identification
A route identification task was devised to reflect the perceptual processes needed for some
generic aspect of global spatial awareness – that is, tasks for which spatial updating was
required, by continuous spatial integration (Tan et al., 2004). Furthermore, the desired task
would allow the component parts of the externalised cognitive map to be analysed offline.
That is, as discussed in Section 2.2, rather than relying on a single number to represent the
collectivity of all perceptual processes, one of the goals was to be able to examine the
constituent parts of the responses and the relationships among them. In other words, it
seemed useful to request a response that reflects the entire shape of the route that the
participant believed he flew over. Second, it was beneficial that participants not have to rely
solely on memory to identify the shape of the route; providing aids to the participants would
make the act of responding less prone to errors in recalling the shape of the route from
memory.

4 Furthermore, the targets used in Morse et al. (2008) were 2D images of umbrellas, seen only from a top down view. As will be discussed in Experiments 2 and 3, 3D models were needed to simulate targets viewed from an angled viewpoint.
It was decided that the participants would perform a route identification task, for which they
would retrospectively select the route they had just flown over from among a rectangular grid
of possible routes. The rationale was that in order to assess the participants’ cognitive
mapping performance across different display conditions, the task should require continual
attention to the shape of the flyover, as opposed to only intermittent attention. Furthermore,
whereas in a recall task such as sketch mapping participants would need to memorise the
whole route to be able to reproduce it, selecting from among a number of routes in a
recognition task would theoretically be less demanding for the participant. In addition, if a
sufficiently large set of routes is presented, a route identification task would also provide
some granularity to the recorded data, which could provide the opportunity for analysis that
would be more extensive than scoring by means of a single metric.
Each route that was overflown consisted of three contiguous segments, ordered as straight,
curved, straight. For carrying out the route identification after the flyover, the routes were
arranged along two dimensions, where the rows represent the curvature change in the second
segment, and the columns represent the ratio of the lengths of the two straight segments, as
shown in Figure 3.3(a). There are a number of reasons why this response method was
selected. First of all, it made the response straightforward for the participant, allowing him to
quickly communicate the cognitive map formed during the preceding flyover. It was
important to allow participants to do this quickly, since it was recognised that significant
forgetting would likely occur if the delay were too long.
Another reason for devising this method, instead of asking participants for example to sketch
their routes, as was done by Kitchin and Blades (2002), was because it was desired that the
responses provided be standardized and easily analysed. This therefore both solved the
challenge of developing a method to digitise and quantify participants’ sketches, and avoided
potential confounds in analysing drawings.
Finally, the actual arrangement of the grid gave participants a means of providing a
discrete response to a continuous task, by allowing them to examine an essentially
continuous spectrum of responses whose individual elements can easily be compared to their
neighbours in the grid.
Yet another important aspect of this response method was that the routes in the grid were
presented in a canonical (nominally north-up) representation, meaning that all routes were
aligned according to world-centred coordinates. This had implications for the participants,
since the flyovers themselves were presented track-up, and this mismatch was ultimately
identified as a limitation of Experiment 1.
3.3 Response method
A novel response method was introduced, whereby the participant was asked to select the route
he had flown over from a number of alternatives placed on a 10x10 grid, as shown in Figure
3.2. The grid elements were arranged in a meaningful order, varying by curvature in
horizontal direction, and ratio of the lengths of the straight segments in the vertical direction.
For further details on the values selected for these dimensions, please see Section 3.4.
Figure 3.2 - 10x10 response grid for novel response method in which participants selected the route they flew over.
The 10x10 response grid exhibits the properties discussed earlier, namely that the entire route
is preserved for later analysis, as well as the fact that participants can presumably accomplish
the task through recognition of the route, rather than recall. Furthermore, another benefit of
this response method is that the researcher is able to control the granularity of the response
by changing the number of elements as required. In this case, a 10x10 grid was selected after
pilot testing grids with both fewer and more elements.
As far as it has been determined from a review of the literature on externalised cognitive
maps, the response method described here is a novel one, with a number of potential benefits
for researchers trying to evaluate participants’ performance in identifying route traversals.
3.4 Platform
The experimental platform for Experiment 1 consisted of a computer program developed
using Matlab and Psychtoolbox (Kleiner et al., 2011). All routes were generated in a virtual
environment in Google Sketchup and Google Earth, made up of a computer generated grass
terrain and a river running the length of the route5. Targets were also placed along the length
of the route.
First, the participant watched a flyover video in the Route Flyover window. Whenever a
target was believed to be present, the participant used the space bar to pause the video,
which brought up the video mask. The participant used the mouse to click on the screen
where he believed the target to be located, after which the mask was removed and the video
resumed playing. For a description of another response method that was considered, namely
of an N-alternative forced choice method, please see Appendix A10.1. At the end of each
video, the Route Identification window appeared, and the participant used the mouse to click
on a Route on the 10 x 10 response grid representing his best estimate of the route that had
just been traversed. After the participant confirmed his selection, the next experimental trial
began after a 10 sec break.
Route layout: The route consisted of three contiguous segments traversed in this order:
straight, curve, straight. Both the length of the curved sections and the combined length of
the straight sections were constant for all routes, with the combined straight segments being
1.5 times as long as the length of the curved section. The curved section’s length was Lc =
2.75 km, so the combined length of the straight sections was Ls1 + Ls2 = 1.5 × 2.75 = 4.125
km and the total length of the route was L = Ls1 + Ls2 + Lc = 6.875 km. For each route the
radius of the curved section remained constant along its length.

5 Note that the scales and distances used in all three experiments were selected arbitrarily, within the default settings of Google Sketchup. As such, they are not representative of distances used in aerial search tasks.
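The stated proportion (combined straight length equal to 1.5 times the curved length) fixes the segment lengths, and the split between Ls1 and Ls2 follows from any grid ratio. A quick check (the straight_splits helper is illustrative, not from the thesis):

```python
def straight_splits(ratio, ls_total):
    """Split the combined straight length into (Ls1, Ls2) given the
    ratio Ls1/Ls2."""
    ls2 = ls_total / (1 + ratio)
    return ratio * ls2, ls2

Lc = 2.75                # curved segment length, km (as stated)
Ls_total = 1.5 * Lc      # combined straight length: 4.125 km
L = Ls_total + Lc        # total route length: 6.875 km

ls1, ls2 = straight_splits(10.0, Ls_total)   # the most extreme grid ratio
print(L, round(ls1, 3), round(ls2, 3))  # 6.875 3.75 0.375
```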
Grid layout: A 10 x 10 matrix of routes was developed as the response grid, shown in Figure
3.3(b). The radius of curvature of the elements of the grid ranged from 687.9 to 1.91 × 10^5
metres, while the Ls1/Ls2 ratios of the straight segments had a range of 0.1 to 10. Note that
corresponding right curving grids were shown after right curving routes had been flown
(assuming that participants would never make an overall right versus left curving error).
(a) (b)
Figure 3.3 – Illustrations of (a) route elements, including the curved and straight portions, (b) Grid layout from
which the participants selected the route they flew over. The values for the length ratio and curvature radii are
included here for illustrative purposes; participants only saw the grid of routes.
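The reported endpoints (radii from 687.9 m to 1.91 × 10^5 m, ratios from 0.1 to 10) are consistent with values spaced evenly on a logarithmic scale, although the exact spacing is not stated. A hypothetical reconstruction of the 100 parameter pairs behind the grid:

```python
def geom_space(lo, hi, n):
    """n values spaced geometrically (i.e. evenly on a log scale)
    between lo and hi inclusive."""
    step = (hi / lo) ** (1.0 / (n - 1))
    return [lo * step ** i for i in range(n)]

radii = geom_space(687.9, 1.91e5, 10)   # radius of curvature, metres
ratios = geom_space(0.1, 10.0, 10)      # Ls1/Ls2 straight-length ratios

# One grid cell per (ratio, radius) combination
grid = [(q, r) for q in ratios for r in radii]
print(len(grid))  # 100
```

Geometric spacing has the attractive property, for a recognition response, that adjacent cells differ by a constant multiplicative factor rather than a constant increment.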
Targets: For the signal detection task, the terrain was populated with stationary targets, all of
the same size, shape and image texture, as illustrated in Figure 3.1(a). The targets were
inserted at predefined locations to create target zones (or ‘events’). ‘Neutral zones’
containing no targets were added between events, with the size of each neutral zone being
twice that of the sFOV, to ensure that the terrain containing two target events could not
appear at any one time. These elements are shown in Figure 3.4.
The length of the events was also set at twice the size of the sFOV, so that the
dFOV and mFOV displays covered the entire length of an event. Note that the events and the
neutral zones were invisible to the participant, as he observed only a continuous forested
terrain.
Figure 3.4 - Illustration of neutral zones and target zones in each flyover. Target zones either did or did not contain a target,
while neutral zones did not contain targets. Note that the red box representing the FOV of the sFOV (travelling from
left to right) covers half the length of an event.
Flyover: Each flyover took approximately 1.5 minutes to complete. As the camera passed
over the terrain, each target took approximately 3 seconds to traverse the display in the sFOV
condition (and thus approximately 6 seconds for the mFOV and dFOV conditions). The
height remained fixed at 325 m over the surface, which was selected so that the river and
trees were visible at all times.
Camera elevation angle: In order to avoid the potential confounds of perspective
foreshortening described in Section 2.7.1, the camera was oriented pointing down towards
the terrain. Because the curves of the river were very gentle, the simulated aircraft (perhaps
somewhat unrealistically) perfectly followed the path of the river. The result was a top-down
view with a continuously changing track-up orientation. This meant that the world appeared
to rotate continuously, so that the camera heading remained tangent to the local curvature of
the river, with the general flow of the terrain being from the top to the bottom of the display.
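Concretely, a track-up presentation amounts to rotating the world by the negative of the camera's instantaneous heading, where the heading is simply the tangent direction of the flight path. A small sketch (the arc waypoints and the helper name track_up_heading are illustrative only):

```python
import math

def track_up_heading(path, i):
    """Camera heading (radians, counter-clockwise from +x) at waypoint
    i, taken as the tangent (chord) direction of the path. A track-up
    display rotates the world by the negative of this angle, so the
    direction of travel always points 'up' on the screen."""
    (x0, y0), (x1, y1) = path[i], path[i + 1]
    return math.atan2(y1 - y0, x1 - x0)

# Waypoints along a gentle left-curving arc (a stand-in for the river)
path = [(math.cos(t), math.sin(t)) for t in (0.00, 0.05, 0.10, 0.15)]
headings = [track_up_heading(path, i) for i in range(3)]
print(all(b > a for a, b in zip(headings, headings[1:])))  # True: steady rotation
```

On a constant-radius curve the heading changes at a constant rate, which is why the world appeared to rotate continuously during the curved segment and not at all during the straight segments.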
3.5 Procedure
A fully within-subjects experiment was performed with 9 graduate students from the
University of Toronto. All were 18-40 years of age, with normal or corrected-to-normal
vision. None reported having experience as a search and rescue operator. Only male
participants were recruited in order to avoid any confounding factors of inter-gender
differences in spatial awareness performance. This decision was based on frequent reports in
the literature that males in general are known to perform better than females in mental
rotation and other spatial tasks (Linn & Petersen, 1985; Halpern, 2000).
Participants were briefed on the scenario and the two experimental tasks. They were asked to
perform both tasks equally well. The participants conducted 3 training trials for each display
condition, with feedback provided on both tasks. In the target detection task, the experimenter
watched the training video alongside the participant, and made note of any missed targets or
False Alarms. During training participants were told that there could be any number of
targets in each route, but that only one target would be present at any given time. At the
conclusion of each training video, the experimenter replayed the video and pointed out the
missed targets. For the global task, the participant was shown the correct route on the
response grid after having made each of his selections.
After the training period, each participant completed a total of 24 experimental trials divided
into 2 blocks, with each block containing 3 sets (for each of the 3 Display conditions) of 4
randomised trials per set. Three pseudorandom sequences of the 6 sets (2 blocks x 3 sets per
block) were generated and distributed among the 9 participants, three participants per
sequence. Trials within each set were randomised, even though participants may have
received the same pseudorandom sequence of sets.
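The blocked structure described above (2 blocks, each containing one 4-trial set per Display condition, with trials randomised within sets) could be generated along these lines; this is an illustrative reconstruction, not the actual generator used in the experiment:

```python
import random

def build_sequence(conditions, trials_per_set, n_blocks, rng):
    """One participant's trial sequence: within each block, shuffle the
    order of the condition sets, then randomise the trials (routes)
    within each set independently."""
    seq = []
    for _ in range(n_blocks):
        sets = list(conditions)
        rng.shuffle(sets)
        for cond in sets:
            trials = list(range(trials_per_set))
            rng.shuffle(trials)
            seq += [(cond, t) for t in trials]
    return seq

seq = build_sequence(["sFOV", "dFOV", "mFOV"], 4, 2, random.Random(1))
print(len(seq))  # 24 trials: 2 blocks x 3 sets x 4 trials per set
```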
Each trial consisted of identifying targets during a video flyover, and then selecting the route
he had just flown over once the video was completed. The participants completed the entire
experiment within two hours, including 10 second breaks between trials, and a 3 min break
after every set of 4 trials. Participants were compensated $30 for their participation.
3.6 Experimental parameters
Display condition was manipulated as a within subject factor with 3 levels:
Single field of view (sFOV)
Double the size of the single field of view (dFOV)
Mosaic field of view (mFOV)
The single FOV (sFOV), shown in Figure 3.5(a), was selected as a baseline display
condition. The sFOV condition has a field of view of 60°, equivalent to a display size of 8 cm
by 10 cm, or 344 pixels by 424 pixels, corresponding to a simulated real-world area of
approximately 300 x 375 m viewed from a flyover height of 325 m. The mFOV, as shown in
Figure 3.5(b), was selected to be approximately double the size of the sFOV, as a reasonable
size increase to compare performance. The resulting size is equivalent to the dFOV condition
when the movement of the camera viewpoint is purely translational. In the implementation of
the image mosaicing algorithm used, the size of the mFOV is determined by the
number of overlapping frames that are composited into the image mosaic. For a camera
viewpoint travelling in the forward direction, the image mosaic grows in size as more images
are superimposed in the mosaic. Through trial and error testing, it was decided to use 7
frames in the mosaicing algorithm.
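As a consistency check on the stated geometry: if the 60 degree field of view is assumed to span the longer (424 px, along-track) image axis, a nadir-pointing camera at 325 m altitude sees a ground footprint close to the quoted 300 x 375 m, and the stated 3 s target dwell time in the sFOV then implies a ground speed of roughly 125 m/s. (The axis assumption and the speed figure are inferred here, not stated in the text.)

```python
import math

def ground_footprint(altitude_m, fov_deg, px_w, px_h):
    """Ground area seen by a nadir-pointing camera, assuming the FOV
    angle spans the taller (along-track) image axis; the cross-track
    extent follows from the pixel aspect ratio."""
    along = 2 * altitude_m * math.tan(math.radians(fov_deg) / 2)
    cross = along * px_w / px_h
    return cross, along

cross, along = ground_footprint(325, 60, 344, 424)
print(round(cross), round(along))  # 304 375, close to the stated 300 x 375 m

# If a target crosses the sFOV in about 3 s, the implied ground speed is
speed = along / 3.0
print(round(speed))  # about 125 m/s
```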
(a) (b) (c)
Figure 3.5 - Screenshots of the three display conditions for Experiment 1, (a) single size: sFOV (b) mosaic: mFOV, (c)
double size: dFOV
The double size FOV (dFOV), shown in Figure 3.5(c), provides a display of size equal to the
mFOV, but without the unique shape properties of the mFOV, described in Section 2.7.1.
The reason for including the dFOV condition is so that any potential differences in
performance between sFOV and mFOV can also be evaluated in comparison with simply
using a display with double the size. In other words, it is conceivable that any mFOV
performance benefits may be a consequence simply of the larger FOV rather than from the
unique shape properties of the mosaic.
The dFOV was twice the size of the single field of view - equivalent to a display size of 8 cm
by 20 cm, or 344 pixels by 848 pixels. This corresponds with a simulated real-world area of
approximately 300 x 750 m viewed from a flyover height of 325 m.
It should be noted that, to eliminate one potential confound, the spatial resolution of the
dFOV was set to be equal to that of the sFOV, in spite of the fact that it was twice the
physical size of the sFOV. Due to the nature of the mosaicing algorithm, the mFOV display
had the same resolution as the other two displays, thus permitting a fair comparison for
which FOV size was the main factor.
For Experiment 1, eight routes, comprising different combinations of two straight segments
and one curved segment were chosen for the experiment. The routes were treated as a
random factor.
3.7 Experimental Hypotheses
For the target detection task, it was hypothesised that the longer a target remains within an
extended FOV, as in the mFOV and dFOV conditions, the better target detection performance
would be, relative to the sFOV condition.
Concerning the route identification task, it was hypothesised that the larger size FOV in the
mFOV and dFOV conditions would result in better route identification performance
compared to that of the sFOV.
Furthermore, it was hypothesised that the unique shape properties in the image mosaic
(mFOV) would improve route identification performance, compared to the fixed shape in the
dFOV condition.
3.8 Results
3.8.1 Target detection
As mentioned earlier, the data for the target detection task comprise a record of where
participants indicated, with the cursor, that targets were located. In order for the experimenter
to determine whether participants responded to the target or some other environmental
feature, a Matlab script was written to display the screenshot that was recorded at the
moment of pausing together with the location of the recorded mouse click. Because it was
reasonable to expect that participants would report the location with some error, the location
of the mouse click was represented by a box of 100 x 100 pixels, to provide some tolerance
for defining a correct detection. The experimenter visually assessed all reported target
detections to classify them as Hits, Misses, False Alarms, or Correct Rejections.
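The tolerance-box classification described above can be sketched as follows; this is a Python illustration (the thesis used a Matlab script), assuming the 100 x 100 pixel box is centred on the mouse click.

```python
def is_hit(click_xy, target_xy, box_size=100):
    """Classify a reported detection as a Hit if the true target centre
    falls within a box_size x box_size pixel box centred on the mouse
    click, providing tolerance for participants' pointing error."""
    half = box_size / 2
    dx = abs(click_xy[0] - target_xy[0])
    dy = abs(click_xy[1] - target_xy[1])
    return dx <= half and dy <= half

print(is_hit((400, 300), (430, 320)))  # within the 100 x 100 box -> True
print(is_hit((400, 300), (480, 300)))  # 80 px off horizontally -> False
```

In the actual procedure the final Hit/Miss/False Alarm/Correct Rejection labels were assigned by visual inspection of the screenshots, so a check like this would only be a first-pass filter.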
The first observation was that there were no False Alarms among any of the participants; that
is, there were no incidents of reporting a target where there was no target. Given the effort
spent developing targets that blended in with the surrounding grass and tree features, and
thus were intended to be difficult to detect, it was expected that there would be some False
Alarms. One possible explanation is that the strength (d’) of the signals (targets) was larger
than anticipated. Another is that participants may have adopted a conservative response
criterion, resulting in more Misses at the cost of fewer (i.e., no) False Alarms. Overall, this
may have been a failure to create a classic signal detection task, and thus it stands to reason
that a signal detection theory analysis (calculating d’ and Beta) was not appropriate for these
data. A simpler measure of the proportion of correctly identified targets for each route was
thus used.
Figure 3.6 displays the target detection performance for the three display conditions,
collapsed over all nine participants. It suggests that overall, detection performance was
highest for the single size FOV (sFOV) compared to the other two display conditions.
Similarly, for the plots of each participant, shown in Figure 3.7, participants appeared to
perform better for sFOV in comparison to either mFOV or dFOV.
Statistical analyses were conducted to verify these observations. Because Experiment 1
included one independent variable (display condition) and two dependent variables (the
proportion of targets detected in the target detection task and the Distance error in the
route identification task), a repeated measures MANOVA was performed. All SPSS output
tables are presented in Appendix A1.1.
Mauchly’s test indicated that the assumption of sphericity was not violated for the local
measure (χ2(2) = .67, p > 0.05) across the three display conditions. The one-way within-
subject ANOVA indicated a significant main effect of display condition on the target
detection measure (F(1.58,14.66) = 13.55, p = 0.001). The within-subjects contrasts verified
that there were significant differences between the sFOV and mFOV conditions (F(1,8) =
26.04, p = 0.001) and between the sFOV and dFOV conditions (F(1,8) = 20.90, p = 0.002).
All other contrasts were found to be not significant. These results are consistent with the
observations made in Figure 3.6, that the target detection performance was highest using the
single FOV condition.
Figure 3.6 - Proportion of targets detected in Experiment 1 for each display condition, for all participants
Figure 3.7 - Proportion of targets detected in Experiment 1 for each display condition, for each participant
3.8.2 Route identification
The data recorded from the route identification task came in the form of the routes selected
by the participant after watching each flyover video. Figure 3.8 shows an example of a
response from one participant, in the form of a route selected, together with the Correct
Route. The challenge for analysing such data was to develop a measure of performance,
based on participants’ selections relative to the correct response for each flyover.
Figure 3.8 - Example of Route Selected by a participant, and the Correct Route. The measures of Euclidean and City
block distance are also shown.
Note that different routes on the grid are represented by changes in curvature in the
horizontal direction and changes in straight line (Ls1/Ls2) ratios in the vertical direction.
Consequently, the distance between the participant’s response and the correct response was
defined as a measure of the participant’s error for this task, with lower Distance error
corresponding to better route identification performance. Because errors along the horizontal
and vertical axes were weighted equally, this Distance error measure should be unbiased
along either dimension.
The Distance error was computed first as the Euclidean distance – the hypotenuse of the right
triangle formed by the vertical and horizontal errors.6 For each participant, the Euclidean
distance scores were averaged across 8 trials, for each of the three display conditions, as
shown in Figure 3.10. Although the error in the global task appeared to be lower for the
mFOV and dFOV conditions for some participants, collectively the performance plots for the
global task did not reveal any clear differences between the three displays. In addition, the
aggregate Euclidean distance error plot of route identification performance data collapsed
over all participants in Figure 3.9 suggests that the mosaiced field of view provided no
advantage for correctly identifying the flyover route.
Mauchly’s test indicated that the assumption of sphericity was not violated for the global
measure (χ2(2) = 2.19, p > 0.05) across the three display conditions. The one-way within-
subject ANOVA (display condition) indicated a non-significant display main effect on the
global measure (F(1.83,14.66) = 0.622, p > 0.05), which was consistent with the graphical
results.
Figure 3.9 - Graph showing the Euclidean distance error over all participants for Route identification in Exp. 1
6. Note that units of Distance error computed here are unrelated to physical distances in the simulated world. The difference
between adjacent objects on the grid was assumed to represent an equal change of perceived curvature and length ratio for
horizontal and vertical dimensions, respectively.
Figure 3.10 - Graph showing the Euclidean distance error, for each participant for Route identification in Exp. 1
After examining the participants’ responses plotted on the grid, it was observed in some trials
that errors along one dimension (curvature or length ratio) were much larger than in the
other. Because the “Euclidean distance” error involves squaring both terms, the error value
skews toward the larger of the two terms. Therefore, a “city block” analysis was also
conducted, where, as illustrated in Figure 3.8, the distance error was represented by the sum
of the errors along both grid dimensions. The goal here was to dampen the influence of any
individual dimension for trials in which large differences exist between the two dimensions.
However, the computed City-block distance error yielded results that were similar to the
Euclidean distance error; that is, no difference in global task performance was found between
the three display conditions.
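Both error measures on the 10x10 response grid can be expressed in a few lines. The sketch below uses Python for illustration (the thesis analyses used Matlab), with hypothetical grid coordinates; one axis indexes curvature and the other the straight-segment length ratio, weighted equally.

```python
import math

def grid_errors(selected, correct):
    """Distance errors on the 10x10 response grid, where one axis indexes
    curvature and the other the straight-line length ratio.
    Returns (euclidean, city_block) in grid units."""
    d_curv = selected[0] - correct[0]
    d_ratio = selected[1] - correct[1]
    euclidean = math.hypot(d_curv, d_ratio)        # hypotenuse of both errors
    city_block = abs(d_curv) + abs(d_ratio)        # sum along both dimensions
    return euclidean, city_block

# Hypothetical trial: response at grid cell (3, 7), correct route at (6, 3).
eucl, city = grid_errors((3, 7), (6, 3))
print(round(eucl, 2), city)  # 5.0 and 7
```

The city-block form damps the influence of one dominant dimension because neither term is squared, which is precisely the motivation given above.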
3.9 Discussion
Based on the results from Experiment 1, none of the hypotheses for the two spatial awareness
tasks were supported, as performance was not found to be higher in the mosaiced display
condition for either the global or target detection task. Performance was also not higher in
either task for the double FOV relative to the single FOV case.
3.9.1 Route identification results
The route identification task was developed to provide insight into the potential use of
mosaicing as a means of enhancing global spatial awareness. It was surmised that appending
recently viewed information to present information would aid participants in updating their
cognitive map of the environment to come up with an integrated image by the end of the trial.
Thus it was hypothesised that, compared to a single FOV size, using the image mosaic
display would on average result in smaller errors when identifying the route within a
rectangular grid of routes. The results in Figure 3.9 and Figure 3.10 suggest that overall, the
participants were able to identify the correct Route with an average error of between 2 and 3
“Euclidean” units of distance. Taking into consideration that there were 100 possible
responses presented in the 10x10 grid, as shown in Figure 3.8, performance was deemed to
be quite good overall. In fact, the average error in the sFOV condition was smaller than
expected; it was surmised that a flyover time of 1 minute and 30 seconds and the demands of
the two tasks would render the task rather difficult, leading to a larger average error in the
sFOV condition than what was observed. However, no differences were found in the route
identification task between the three display conditions. An examination of the experimental
procedure may offer some explanations for this.
Although the routes were designed so that participants would form a cognitive map of the
flyover, the simplicity of the routes may not have offered enough of a challenge for
participants to take advantage of the potential benefits of the mosaiced display. For example,
because visual cues for understanding the shape of a straight segment are not necessarily
diminished by a smaller field of view, it was, admittedly in hindsight, perhaps not reasonable
to expect any differences for those portions of the route.
However, for the curved portion of the route, the larger displays should provide a wider
portion of the curve from which the participant can more easily extract the degree of
curvature. (Note that this same effect could also be achieved for a relatively wide FOV but at
a low altitude, which will become important in a later discussion.) Thus, with more
information presented from an extended FOV, it was expected that differences would be
observed relative to the single FOV. Perhaps, in hindsight, the length of this portion of the
route was sufficiently long such that, even in the sFOV condition, the degree of curvature
could be easily determined over the duration of the flight over the curved portion. In
particular, the curved portion maintained a constant curvature throughout, perhaps lending
itself to no benefit for the mFOV and dFOV conditions when identifying the correct route.
3.9.2 Target detection results
Concerning the target detection task, the participants unexpectedly performed with higher
accuracy when using the single (smaller) FOV. Referring to Figure 3.5, it is surmised that
two factors could have played a role here: scanning coverage and the inclusion of more
clutter in the larger displays. First, note that each target would take approximately 3 seconds
to cross the display in the sFOV condition, and by extension approximately 6 seconds in the
mFOV and dFOV conditions, which were twice as long. Given that no instructions were
given that might have prescribed any particular scanning behaviour, it is conceivable that
with the extended FOV conditions, participants may have returned to areas they had
previously scanned without detecting a target, while neglecting other areas in their visual
field. In other words, within the context of the display conditions, it is possible that the
additional time available for revisiting previously scanned areas was not appropriately
exploited.
Furthermore, the presence of distractors in the visual field, in the form of trees and
background texture, could have contributed to the comparatively higher performance in the
sFOV condition. For a display twice the size of the sFOV at the same spatial resolution,
the number of distractors present in the visual field at any one time is also doubled.
Combined with a suboptimal scanning pattern, it was surmised that these confounds for
viewing the terrain with an enlarged FOV outweighed the potential benefit of additional time
in detecting targets.
3.9.3 Comparison of results to literature
Although the target detection results were surprising, they are not unprecedented. As
discussed in Section 2.7.2, Crebolder et al. (2003) conducted a survey of Defence Research
and Development Canada (DRDC) research on the effect of Display size on target detection
performance, concluding that results are task dependent and that an “intermediate” size FOV
offers the best performance. In other words, adopting a larger FOV size did not necessarily
result in better performance. Taking this work into consideration, as well as the number of
target sizes, shapes and textures that were pilot tested to avoid “floor” and “ceiling” effects in
the target detection task, perhaps it was not unreasonable to find that performance was better
in the sFOV.
Experiment 1 was motivated by the results of Morse et al. (2008), where a significant
improvement in the number of detected targets was found using a mosaicing display.
Although there appears to be contradicting results between Experiment 1 and the work of
Morse et al., there is one important difference in the target’s features that may explain the
discrepancy. Morse et al. (2008) reported that red umbrellas were used as targets. It is
presumed from the screenshots provided in their paper, as well as from the fact that no
explicit distractors were used, that the targets were the only red coloured features in the
environment. Thus their targets could be identified along a single salient dimension of
colour, making the task amenable to parallel search (Wickens and Hollands, 1999). In the
case of parallel search, search times for identifying targets have been shown to be relatively
unaffected by the number of distractors (or, equivalently, by an extended FOV), as was the
case for the mosaic FOV used by Morse et al. (2008).
In the case of the present study, the targets were selected to have characteristics similar to
those of the surrounding environment. In order to make the target search moderately difficult,
the targets shared similar shape, size and colour with the surrounding trees and grassy terrain,
as shown in Figure 3.1(b). Because the target could not be identified along a single salient
dimension, serial search along multiple dimensions was assumed to be required, where each
object must be scanned before moving on to the next, to identify the target among the
distractors. Thus, with more distractors appearing in the extended FOV conditions at any one
time, participants perhaps adopted a serial search strategy in a larger search area, leading to
worse performance compared to the smaller FOV.
3.9.4 Synthesis
Taken together, it was perhaps possible for participants to accomplish the global task of
identifying the route without continuously updating their cognitive maps. For example, rather
than spatially integrate the information from successive views of the display, participants
could have simply memorised the time taken to traverse the straight parts. The single curve in
the route could have been determined relatively early in the curved portion, since it followed
a constant curvature throughout. Thus, instead of forming a cognitive map, the route could
have been reconstructed by combining three relatively simple judgments and then matching
the route to the corresponding shapes on the response grid.
Furthermore, although participants were asked to perform the detection and route
identification tasks “equally well”, the results suggested that, perhaps due to the actual
relative difficulties of the two tasks, the participants focused more on the target detection task
than the route identification task. Because the route identification task could be accomplished
relatively easily without continuous updating of one’s cognitive map, perhaps more attention
was placed on the detection task. Thus both tasks were too easy, allowing the tasks to be
accomplished without the benefits of an extended FOV, and hence the highest overall target
detection accuracy was found in the sFOV condition, where the smallest amount of area
needed to be searched.
In summary, this investigation into spatial task performance using three display conditions
yielded results that did not support the hypotheses predicting better performance for a
mosaiced FOV display. It was observed that for relatively simple routes, comprising portions
of two straight lines plus a constant curvature, performance using the extended FOV afforded
by either the mosaic or enlarged FOV was worse in the case of target detection and resulted
in no difference in the case of route identification. A thorough examination of the
experimental hypotheses, platform and results revealed the potential for a number of changes
in the procedure required to test the hypotheses.
Chapter 4. Experiment 2
4.1 Introduction
With the lessons learned from Experiment 1, it was decided to focus next on the benefits of
mosaicing for enhancing global spatial awareness, by excluding the target detection task and
thereby any confounds of dual task performance. To that end, a number of changes were
made to the simulated environment, as well as the viewpoint parameters of the virtual camera
providing a view of the terrain. First and foremost, it was decided that the most effective way
to test whether mosaicing has the potential to enhance route identification performance was
to devise a much more challenging global mental mapping task, one that encouraged
participants to sample the displayed visual information continuously. The relatively simple
routes in Experiment 1 were thus replaced by more complex winding routes that were not
predictable, thus requiring constant attention to accomplish the route identification task.
A single continuous shape consisting of a sum of four sinusoids was generated, representing
a long winding river of path length 18,750 m, contained in a forested area. This was inspired
by the naturalistic shapes of rivers often found in aerial landscape photography, such as that
shown in Figure 4.1(a). A number of segments within the long river were designated for the
purpose of the experiment as Correct Routes. Each of the routes had a different shape and
each had the same path length of 1,500 m. The amplitudes, frequencies and phase shifts of the
four sinusoids were adjusted in order to generate, through trial and error, a set of routes with
sufficient variety that the details of the entire set of routes could not reasonably be
memorised. Details of the sinusoid parameters are given in Appendix 2.
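The river-generation scheme can be sketched as a sum of four sinusoids. The sketch below is in Python for illustration; the amplitude, frequency and phase values are placeholders, since the actual parameters are given in the thesis's Appendix 2 and are not reproduced here.

```python
import math

def river_offset(x, params):
    """Lateral offset of the river centreline at along-track position x (m),
    formed as a sum of four sinusoids: sum_i A_i * sin(2*pi*f_i*x + phi_i)."""
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in params)

# Placeholder (amplitude m, spatial frequency 1/m, phase rad) triples only;
# the values actually used in the experiment appear in Appendix 2.
PARAMS = [(120, 1/2500, 0.0), (60, 1/900, 1.3), (35, 1/400, 2.1), (15, 1/150, 0.7)]

# Sample the 18,750 m path every 10 m.
centreline = [(x, river_offset(x, PARAMS)) for x in range(0, 18750, 10)]
print(len(centreline))  # 1875 sampled points
```

Adjusting the four parameter triples and inspecting the resulting shape mirrors the trial-and-error process described above for producing sufficiently varied routes.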
In order to further elicit differences between the various viewing conditions, a presumably
more difficult camera elevation angle was used in Experiment 2. As described in Section
2.7.1, distortions caused by perspective foreshortening may lead to encoding errors in
cognitive mapping. Thus, the camera elevation angle was fixed at a forward-facing 45-degree
angle, instead of the top-down view used in Experiment 1.
It was surmised that a crucial factor in performing the spatial awareness task would lie in the
height above the terrain. Figure 4.2 shows a particular segment of the long river, displayed
with the same 60° FOV, but presented at four different heights. Flying at a higher altitude
allows the observer to view a larger portion of the terrain below, which should in principle
afford better understanding of the global shape of the flown over route. In fact, one can
imagine the extreme case where the camera viewpoint is placed at such a high altitude that
the entire route is shown in the camera’s FOV, making the task of identifying the route
essentially trivial. What was not clear for the present environmental features and camera
parameters, however, was at what altitude global performance improvements might begin to
saturate. In other words, logic suggested that a point may be reached at which increasing the
altitude has no further benefit for identifying the route.
Figure 4.1 – (a) Winding river landscape; (b) analogous computer generated ‘river’, consisting of sum of four
sinusoids.
Figure 4.2 - Screenshots of one terrain segment, with constant 60° FOV, displayed at four heights, (a) H1 = 20m, (b)
H2 = 56m, (c) H3 = 92m, (d) H4 = 164m
In relation to the topic of mosaicing, recall that one of the anticipated benefits of the enlarged
FOV is that more of the terrain can be seen at any one time, an effect surmised to be
analogous to an increase in height. Therefore, with regards to our goal of determining
whether such benefits actually exist, it was important to avoid a situation for which the route
was so easily identifiable at a selected height in the sFOV condition that extending the
effective FOV by introducing the mFOV and dFOV conditions would have no measurable
benefit. The primary goal of the present experiment, therefore, was to determine empirically
the effect of the camera’s height above the terrain on participants’ performance in the route
identification task, using only the single FOV display condition. This experiment was thus
considered as a ‘calibration experiment’, in that the results would be used to select an
appropriate height for which to test the mosaiced FOV in the next experiment.
The hypothesis for the present investigation, therefore, was that as height is increased,
performance in identifying the traversed route should improve.
4.2 Experimental task
The experimental task for Experiment 2 was similar to the route identification task performed
in Experiment 1, but without the target detection task. The participant was first shown a 20
sec flyover video of an out-the-window view of an aircraft flying above the terrain along a
predetermined flight path. After completion of the video, the participant was asked to
identify which route he flew over. One experimental trial consisted of watching one video
and providing the Route identification. An important distinction from Experiment 1 was the
method of responding, explained below.
The experimental platform consisted of a computer program developed using Matlab. All
routes were generated in a virtual environment created in Google Sketchup and Google
Earth, made up of a computer generated grass terrain and a river running the length of the
route. Figure 4.3 illustrates six separate Routes selected from the long river, representing the
Correct Routes for Experiment 2. Each trial consisted of an overflight of one of the six
Routes, while the participant watched the flyover video in the Route Flyover window, similar
to those shown in Figure 4.2.
Figure 4.3 - Display of the six Routes selected for Experiment 2, chosen from the long continuous river (Left). Routes
on right show start of each Route with a green marker and end of each Route with a red marker.
At the end of each video, the Route Selection window appeared, as shown in Figure 4.4, and
the participant used the mouse to interactively indicate the route he believed he had just
flown over. The Route Selection window showed on the left side a top-down view of the
entire river, with green and red markers designating the respective start and end points of the
currently selected route. The participant clicked on the buttons in the centre to shift the start
and end markers together7 in either direction along the length of the river. The window on the
right showed a magnified version of the currently selected (highlighted) route resulting from
the button clicks.
Although there was theoretically an infinite number of start/stop positions that could have
been chosen along the continuous long river, in fact it was divided up into a total of 460
discrete equal length routes. The single arrow control buttons were used to displace the
markers along one segment at a time, while the double arrow control buttons moved the
markers more coarsely, 20 routes at a time. (Because of the large number of discrete
segments, and because the boundaries of those segments were not visible to the participants,
it is believed that participants were unaware that their inputs were discretised.) When
the participant was content with his selection, he clicked on the Submit button at the bottom
right to enter that selection8. For a description of an alternative response method considered
in Experiment 2, combining the response grid from Experiment 1 with the complex routes,
please refer to Appendix A10.2.
7. Because all Routes had a common fixed length, only one degree of freedom was necessary for manipulating one’s
response; consequently the green and red dots moved together as participants clicked on the response buttons.
8. Although the form of the response has changed from the 10x10 grid used in Experiment 1, the route selection method
exhibits many of the same characteristics, despite the absence of an actual grid. Instead of 10x10 = 100 choices, there are
now 460 choices, placed on a 1-dimensional response layout. This should arguably make the task easier for participants
selecting a route. Furthermore, the shape of the route is still recorded as in the grid method used in Experiment 1.
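The discretisation and marker-stepping behaviour described above can be sketched as follows; this is an illustrative Python fragment (the actual platform was written in Matlab), and the function names are hypothetical.

```python
import math

def cumulative_arclength(points):
    """Cumulative distance along a polyline given as (x, y) points,
    used to cut the long river into equal-length route segments."""
    s = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        s.append(s[-1] + math.hypot(x1 - x0, y1 - y0))
    return s

def step_route(index, step, n_routes=460):
    """Move the selected-route index: single arrows step by +/-1, double
    arrows by +/-20, clamped to the 460 discrete routes."""
    return min(max(index + step, 0), n_routes - 1)

print(step_route(5, -20))   # clamped at the first route: 0
print(step_route(450, 20))  # clamped at the last route: 459
```

Because every route has the same 1,500 m arc length, a single index fully determines both the green start marker and the red end marker, matching the one-degree-of-freedom response described in footnote 7.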
Figure 4.4 - Screenshot of Route identification Window. Left: top-down view of entire river. Centre: response
buttons, for controlling response; Right: instantaneous indication of selected route. Green and red markers indicate
respective start and end points of currently selected route.
A fully within-subjects experiment was conducted by recruiting seven male participants from
the University of Toronto. All participants had normal or corrected-to-normal vision, and ranged in age
from 18 to 40. None reported having had prior experience with aerial search and rescue type
tasks.
Following an explanation and two practice trials to become familiar with the platform,
participants performed 6 training trials, each at a different height and for a different Route.
The participants viewed each of the six randomised routes at each of the four heights in
Figure 4.2: H = {20, 56, 92, 164} metres. There were also two heights, 128 and 200 m, used
in the practice session that were not used during the experiment. After each selection, the
participants were shown the correct Route.
During the data gathering phase, participants completed a total of 48 experimental trials
divided into 2 blocks, with each block containing 4 sets (for each of the 4 Heights) of 6
randomised trials (for each of the 6 Correct routes) per set. Two pseudorandom sequences of
the 8 sets (2 blocks X 4 sets per block) were generated and distributed among the seven
participants: one pseudorandom sequence was given to 4 participants, while the other was
given to 3 participants. Trials within each set were randomised, even though participants
may have received the same pseudorandom sequence of sets.
3 seconds was given between trials (participants could extend this if desired), and a two
minute break was enforced in between blocks.
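The trial structure above (2 blocks x 4 height-sets x 6 route trials = 48 trials) can be sketched as follows. This is a Python illustration with a hypothetical seed; the actual pseudorandom sequences used in the experiment are not reproduced here.

```python
import random

HEIGHTS = [20, 56, 92, 164]     # metres, as in Figure 4.2
ROUTES = [1, 2, 3, 4, 5, 6]     # the six Correct Routes

def build_trial_sequence(seed=0):
    """48 trials: 2 blocks, each containing the 4 height-sets in a
    pseudorandom order, with the 6 routes shuffled within each set."""
    rng = random.Random(seed)
    trials = []
    for _block in range(2):
        sets = HEIGHTS[:]
        rng.shuffle(sets)            # pseudorandom order of the 4 sets
        for h in sets:
            routes = ROUTES[:]
            rng.shuffle(routes)      # randomised routes within the set
            trials.extend((h, r) for r in routes)
    return trials

seq = build_trial_sequence(seed=7)
print(len(seq))  # 48
```

Each height appears in 12 trials and each route in 8, matching the balanced design described above.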
4.3 Results of Analysis
The selected routes were collected for all participants and were plotted for each Height along
with the corresponding Correct Routes. The set of six graphs for Height 2 (56m) and Height
3 (92m) are given in Figure 4.5. The complete set of graphs for all four heights can be found
in Appendix A3.1. Each plot contained 14 selected routes in black ink (two for each of the
seven participants), as well as one Correct Route in red ink representing one of the six
Correct Routes. Note that the red Correct Routes (1 to 6) are identical in the two sets of
graphs, because only the Heights were varied.
Figure 4.5 - Examples of ensembles of selected Routes collapsed over all participants at (a) Height H2, (b) Height H3.
Each plot contains 14 Routes in black ink (two for each of the seven participants), as well as one Route in dashed red
ink representing the correct Route. The routes are translated so that their starting points coincide, while maintaining
the original North up representation (as seen in the Route identification window).
4.3.1 Challenge of Defining Objective Scoring Method
Given the variety of route selections shown in Figure 4.5, one challenge is that of defining an
objective scoring method for evaluating performance. One class of performance measures
involves a purely computational approach, where routes are broken down into constituent
parts and then objectively compared to come up with an overall measure of error. To
demonstrate how one might apply an objective quantifiable metric to assess differences
between routes, Figure 4.6 presents two routes from the long river with slightly differing start
and end points, calling one the Correct Route and other the selected route.
Using the conventional root mean square error (RMSE) metric, one could simply sample N
points along each of the curves (CR_correct and CR_selected), define an appropriate distance score
between corresponding samples, and then compute an aggregated error over all points
between the curves by the equation:
RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d_i^2},
where d_i is the distance between the i-th pair of corresponding sample points on CR_correct and CR_selected.
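This computation can be sketched in a few lines; the thesis analyses were implemented in Matlab, and Python is used here purely for illustration. The shifted-route example also previews why RMSE can grow even when two route shapes are identical.

```python
import math

def route_rmse(correct_pts, selected_pts):
    """Root mean square error between two routes sampled at N corresponding
    points: sqrt((1/N) * sum_i d_i^2), with d_i the Euclidean distance
    between the i-th pair of sample points."""
    n = len(correct_pts)
    sq = sum((cx - sx) ** 2 + (cy - sy) ** 2
             for (cx, cy), (sx, sy) in zip(correct_pts, selected_pts))
    return math.sqrt(sq / n)

correct = [(x, 0.0) for x in range(10)]
shifted = [(x, 1.0) for x in range(10)]   # same shape, offset by 1 unit
print(route_rmse(correct, correct))  # 0.0 for a perfect selection
print(route_rmse(correct, shifted))  # 1.0 despite identical shapes
```

A uniform offset of one unit yields an RMSE of one unit even though the two curves have exactly the same shape, which is the kind of inflation discussed next.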
The RMSE method is a standard practice in a number of domains, and served as a starting
point for comparing the route identifications with the Correct route. In the present context,
RMSE = 0 if the selected Route matches the Correct route, growing as differences between
samples along the two routes increase. Although this formula seems at first glance quite
straightforward, and thus appropriate for evaluating the accuracy of selected routes, it quickly
became clear that neither this RMS error measure nor most other ‘obvious’ computational
metrics would necessarily capture the real extent of errors for this particular route selection
task. The challenge posed in analysing the results of the present experiment involved how not
to inflate the error scores for route selections for which the errors were arguably in fact not
very large. The following are three examples to illustrate the inappropriateness of using RMS
error measures to score route selections used in Experiment 2.
The left side of Figure 4.6 depicts an error where the two segments lie very close to each
other on the long river, but with a slight offset. Because this discrepancy seems rather small,
one would intuitively expect a relatively low RMS error score to be produced by the formula
above. On the right side of Figure 4.6, however, where the two routes are presented with
a common starting point (as would be the case if the RMSE equation were applied), the
source of error between the two Routes lies in how far along the Route one travelled before
approaching the first curve to the left. In this case, the left turn occurs earlier in the trajectory,
as if the turn were shifted along the curve. Even though some elements of the flyover were
judged correctly, the RMSE score would still accumulate all of the differences in positions
along the Routes, potentially leading to inflated RMSE scores for curves that are shifted
relative to each other.
Figure 4.6 – An illustration of a Correct Route (in red) and a selected route (in black). The resulting RMSE score
between these two routes would be large, despite the fact that the shapes are quite similar.
Another type of error occurs where the overall shape of the selected route tracks the Correct
route except for a small number of deviations somewhere along the trajectory. This is shown
in Figure 4.7, where the routes exhibit similar shapes despite being at different areas along
the long river. The only discrepancy lies in the relative lengths of the beginning and ending
segments. For the present example, the error in judging the first curve would carry over
throughout the rest of the error computation, causing the RMSE score to become inflated in
spite of only small differences in the shapes of the two Routes.
Figure 4.7 - An illustration of a Correct Route (in red) and a selected route (in black), where small deviations in the
route occur between the two routes. The resulting RMSE score between these two routes would be large, despite the
fact that the overall shapes are quite similar.
Furthermore, given the complexity of the routes found in the long river, it was possible for
the participant to select a route with similar overall shape to the Correct Route, but simply
mirrored along the vertical axis, as shown in Figure 4.8. In other words, the shape was
correct, except for reversing the direction of the turns. Clearly the RMSE value describing
the magnitude difference between the two routes would be high, leading one to conclude that
the routes are very dissimilar. However, taking into consideration the demands of the flyover
task, I believe that the two routes are actually similar, with the important difference being the
turn directions. That is, the reader is reminded that the flyover videos are presented from a
track-up perspective without a canonical North, meaning that in sections where the path is
straight, it is impossible to know in which world-referenced direction one is travelling. In
other words, upon the basis of only watching the video, flying along an Eastbound direction
along a straight path is indistinguishable from flying along a Westbound direction along a
straight path. As such, a simple reversal of the turn directions can lead to the participant
selecting a mirrored route. In the Correct Route the first turn occurs to the left, whereas the
first turn occurs to the right in the selected route. Each subsequent turn direction is also
reversed, and even though the sequence of straight and curved portions is more or less
correct, as well as their relative proportions, the reversed turn directions lead to the mirrored
shape being selected.
Figure 4.8 - An illustration of a Correct Route (in red) and a selected route (in black), that exhibit similar shapes but
that are mirrored with respect to each other.
After considering these three examples of errors in the participants’ route selections, I
endeavoured to classify different types of errors based on the route identifications made by
the participants. Four types are identified, with examples of each type shown in Figure 4.9:
Translation errors: An error in which the overall shape was correctly identified, but
where the start point differed from the Correct route by a large amount. The
occurrence of this type of error was a consequence of the fact that the underlying long
river was a quasi-random signal, generated by the sum of a series of sine waves,
which resulted in similar (but not identical) patterns occurring along its length.
Phase shift errors: An error in which the start point of the selected route was close to
the Correct route except for some (small) offset. This type of error reflects a
participant’s having recalled the route very well, but with only a relatively small error
in recalling its start (or end) point. The result is that there is a slight offset between
the correct and designated routes, analogous to (in Engineering terms) the two routes
being “out of phase” with each other. Figure 4.6 is an example of this type of error.
Partial matching errors: An error in which most, but not all, of the overflown route
was recalled correctly, resulting in the choice of a route segment containing a small
number of deviations along the trajectory. An example of a partial matching error is
shown in Figure 4.7.
Mirroring errors: An error in which the relative distances were tracked well, but
with reversals of the left and right turns. As mentioned earlier, Figure 4.8 shows an
example of mirroring error, where even the steepness of the turns was tracked
adequately, but not the direction of the turns.
Figure 4.9 – Examples of four types of errors observed in route selections (black dotted line) compared to Correct
route (red line), (a) Translation error, (b) Phase shift error, (c) Partial matching error, (d) Mirroring error.
It became clear that any scoring method would have to adequately characterise these
different types of errors, given the cognitive challenges in accomplishing the task. Applying
an RMSE score to these four types of errors produces mixed results. Consider Figure 4.10,
which shows the routes in Figure 4.9 with the starting points aligned. In the case of a
translation error (Figure 4.10(a)), an RMSE score would appropriately score the error as low
(i.e. RMSE = 0), and thus an RMSE score might be sufficient. However, it still remains that
for the other 3 error types (phase shift, partial matching and mirroring), RMSE would be
artificially inflated, as errors are carried over throughout the lengths of the trajectories, even
if the shapes are considered similar.
Figure 4.10 - Illustration of four types of errors observed in route selections (black dotted line) compared to Correct
route (red line), shown with starting points matching for (a) Translation error, (b) Phase shift error, (c) Partial
matching error, (d) Mirroring error.
Taking into account the kinds of errors described above, as well as the inability of most
RMSE measures⁹ to adequately capture the errors observed, the use of purely computational
methods was abandoned. With no viable alternatives in using objective measures, I turned to
a subjective method of evaluation, namely Thurstone’s Method of Paired Comparisons to
evaluate the Route selections in Experiment 2.
4.3.2 Paired Comparisons Method

The paired comparisons method (PCM) is one of several well-known scaling methods for
comparing object attributes (Dunn-Rankin et al., 2004). Thurstone (1927) proposed the PCM
as a law of comparative judgment for placing objects along a psychological continuum of
quality. The ratings can refer to essentially any properties, ranging from perceived weights
(Thurstone, 1927), to opinions on political issues, to perceived video quality (Woods et al.,
2010).
The essence of PCM is to aggregate a set of judgments about attributes of objects, carried out
two at a time, and to transform those aggregations onto a single rating scale. Judgments of
quality between any two objects may vary across judges and may also vary in time within
judges from one comparison to the next, even for the same objects. Thus the underlying
neural or psychological processes whereby the quality of objects is judged dictate the spacing
of this scale, typically based on the premise that the underlying processes follow a normal
distribution. In the present work, we consider Case V of Thurstone’s law of comparative
judgment, which provides the most simplifying assumptions for the standard deviations and
correlations of the distributions of these “discriminal processes”.

⁹ In addition to the computation of RMS error in terms of geometric distance between the two curves, other RMS measures were investigated and ultimately abandoned. For example, both first- and second-order derivatives of the selected routes were matched against those of the Correct routes in order to develop separate RMS velocity and acceleration measures. The rationale there was that participants might be especially sensitive to changes in direction and/or rate of change of direction, and that a measure should recognise such performance accordingly. Similarly, a metric of RMS curvature error was also computed, under the rationale that participants might excel in recalling the continuity of the overflown route. In all cases, inflated values of error were observed, for essentially the same reasons as described above. It was thus concluded that these alternative objective RMS error measures could also not adequately evaluate the participants’ performance in identifying routes.
With PCM, pairs may be evaluated by a single judge performing multiple repeated
comparisons or, as in the present work, by multiple judges, who are informed about the
criteria along which they must compare the objects’ quality. It is this multiplicity of
judgments that provides the basis for satisfying the model’s requirement of underlying
discriminal processes.
For a comparison set of t objects, the total number of comparisons required is: t*(t-1)/2.
Judgments for all pairs of objects are tabulated in a confusion matrix. For example, if object
A is judged to be of higher quality than object B, then the cell entry for column A / row B is
increased by one. If the converse judgment is made, then the cell entry for column B / row A
is incremented. After all judgments have been tabulated, the cells in the matrix are converted
to proportions of the total number of comparisons, and then converted into standard normal
Z-scores. The latter values are averaged down the columns of the matrix to generate a mean
Z-score for each of the t objects.
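The tabulation step can be sketched as follows (toy objects and judgments, not the experimental data; the names are illustrative only):

```python
from collections import defaultdict

def tabulate(judgments, objects):
    """Build a confusion matrix where cell [row][col] counts how often
    the column object was judged of higher quality than the row object."""
    wins = defaultdict(int)  # (loser, winner) -> count
    for winner, loser in judgments:
        wins[(loser, winner)] += 1
    return {r: {c: wins[(r, c)] for c in objects if c != r}
            for r in objects}

# Each judgment is a (preferred, not preferred) pair.
objs = ["A", "B", "C"]
data = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("C", "B")]
m = tabulate(data, objs)
print(m["B"]["A"])  # A preferred over B twice -> 2
```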
Due to the underlying assumption of normally distributed discriminal processes, the resulting
mean Z-scores can be shown to lie along an equal interval scale, meaning that unit
differences between two points at one place on the scale are equal to unit differences at other
places on the scale.¹⁰ The spacing between objects on the scale represents the ‘psychological
distance’ along the particular perceived continuum of quality. Implicit to Thurstone’s method
is the fact that the relative spacing between objects is what is important, not the values
themselves. This has two implications for Thurstone’s scales. First, the linear scale can be
adjusted by means of any linear transformation. For example, a common approach is to scale
the PCM values by √2 to transform the scale to units of Standard Normal deviates
(Thurstone, 1927). Second, because only relative spacing is important, the assignment of the
0 value is arbitrary. Typically one assigns one of the psychological objects, usually the one
with the lowest mean Z-score, as an anchor and assigns it a value of zero.

¹⁰ A common example is that of the Celsius temperature scale. The difference between 20ºC and 35ºC is the same as the difference between 100ºC and 115ºC.
A number of variations and modifications to Thurstone’s original method have been
proposed, including corrections for extreme proportions of less than 0.02 or greater than
0.98¹¹
(Dunn-Rankin et al., 2004), eliminating bias in presentation of objects by generating
ordered pairs (Ross, 1934), and relaxing some of Thurstone’s original assumptions
(Mosteller, 1951). Furthermore, a number of statistical tests for significance of paired
comparison scores have been proposed (David, 1988; Gridgeman, 1963; Jackson and
Fleckstein, 1957; Starks and David, 1961; Woods et al., 2010). For example, Edwards (1957)
provides a comprehensive treatment of the data from PCM, including checking for
consistency among data, testing assumptions of Thurstone’s Case V method, and estimating
the properties of the discriminal dispersions of psychological objects.
An interesting consequence of Thurstone’s model is that, even though a (preferably) large
number of judges carries out the set of comparisons between different instances of a desired
quality, it is the aggregation of those judgments that is used to form proportions, which are
then transformed, using the Normal distribution function, to generate equal interval scale
ratings. As a result, the aggregated data¹² may not be normally distributed and may have
inter-subject dependencies, which violates the assumptions of a conventional analysis of
variance (ANOVA). In other words, the transformations performed on the aggregate data do
not scale appropriately to conventional statistics, and for this reason an ANOVA was not
applied. However, statistical tests for one-way effects and contrasts between conditions are
available (Starks and David, 1961). This is discussed in further detail in Section 4.3.5.
Of particular interest in the present context was the need to evaluate performance of
participants in estimating the shape of complex routes, in the absence of an alternative
directly computable scoring method. The premise was that PCM would provide insight into
the collective performance of the participants in the assigned route identification task, based
on their aggregated data.

¹¹ The problem with extreme proportions is that Z approaches −∞ as the probability approaches zero, and Z approaches +∞ as the probability approaches one.
¹² For example, one can imagine a table whose rows represent the judges and whose columns represent preferences of one object over another. In other words, each cell represents the number of times each option was preferred (out of the choices provided) within each person.
In concluding this section, three important points merit mention.
• The paired comparisons method as used here relied on sets of pairwise judgments that
were carried out on the response data recorded during the experimental trials. As such,
applying the paired comparisons method to the participants’ route identification data
represents, as far as is known to the author, a previously unexplored method for
evaluating spatial task performance.
• Further to the point above, it is important to distinguish between two very different
populations of “judges” in this experiment:
o The participants who observed the video flyovers and then used their judgement
to select for each flight the route that had just been overflown.
o The volunteer judges who were recruited (see below) to judge the responses of the
route-selecting participants in the experiment.
• In spite of the fact that in carrying out the paired comparisons each of the (second) group
of judges provided their individual subjective judgments regarding the different pairs of
route identification data, it is important to realise that the scale values produced by their
collective judgments represent a set of objective scores of spatial task performance.
4.3.3 Application of Paired Comparisons

In setting up the paired comparisons, it was reasoned that, because the interest was in scaling
performance as a function of height, there was no point in comparing performance across
different routes, especially given that the routes were quite different from each other. (For
example, with reference to Figure 4.5, there was no point in comparing performance at
Height 2 for Route 1 with performance at Height 3 for Route 2.) Consequently a separate set
of paired comparisons of route identifications across the different heights was carried out for
each of the six routes. That is, for Route 1 performance for H1 was compared with that for
H2, H3 and H4, for a total of 6 (=4*3/2) judgements. This was repeated for each of the 6
routes, for a total of 36 comparisons.
Twenty-one volunteer (i.e. unpaid) judges were recruited by email to carry out the paired
comparisons. Please see Appendix A4.1 for a copy of the instructions and response format.
They were presented with pairs of ensembles of selected routes (the black curves), similar to
the sample ensembles shown in Figure 4.5, and were instructed for each pair of ensembles to
indicate which of the two more closely matched the corresponding Correct Route, shown in
the figures as the red curve. Note that for all comparisons the two red curves were always
identical; it was only the relationships of the ensembles of black curves to the red curves that
were being compared.
All judges completed the full set of 36 comparisons by replying on a Google form created for
the purposes of collecting the PCM data. Using Thurstone’s method, the judgments were
compiled into a single 4x4 confusion matrix for the four Heights, by combining judgments
for all of the six routes for each height pair. For example, all the comparisons between H1
and H2 are grouped for Routes 1 to 6 in the same cell. In other words, completing the task for
the six routes was considered to represent six instances of the same perceptual task, at a
given Height. The aggregated confusion matrix and other calculations are shown in Table
4.1.
H1 H2 H3 H4
H1 - 87 79 103
H2 39 - 46 75
H3 47 80 - 80
H4 23 51 46 -
Table 4.1 - Aggregated confusion matrix of paired comparison judgements for performance at four Heights: H1, H2,
H3, H4. The table should be interpreted as preferences of the column element over the row element.
The raw matrix values were converted into proportions of the total number of judgements, as
shown in Table 4.2. Note that the diagonal entries of the matrix are filled with a proportion of
0.5, as it is assumed that, if presented with two identical sets of Routes, the judges would
overall select either set 50% of the time.
H1 H2 H3 H4
H1 0.500 0.690 0.627 0.817
H2 0.310 0.500 0.365 0.595
H3 0.373 0.635 0.500 0.635
H4 0.183 0.405 0.365 0.500
Table 4.2 - Aggregated scores converted to proportions of the total number of judgments over all judges (in this case
126).
The proportions in Table 4.2 were then converted to Z-score values using standard normal
tables. The columns in the confusion matrix were then summed and averaged over the
number of objects (4 in this case) to obtain the mean Z-scale for performance at each of the
four Heights. Because this is an equal interval scale, a shift of all the values does not affect
the distances between the scale values. Thus, as a last step, the scale values are shifted so that
the lowest scale value acts as an anchor at a value of 0 for the scale. Calculations are shown in
Table 4.3, and the final scale values are plotted in Figure 4.11.
H1 H2 H3 H4
H1 0.000 0.497 0.324 0.906
H2 -0.497 0.000 -0.345 0.241
H3 -0.324 0.345 0.000 0.345
H4 -0.906 -0.241 -0.345 0.000
Sums -1.727 0.601 -0.366 1.492
Means -0.432 0.150 -0.091 0.373
Means + 0.432 0 0.582 0.340 0.805
Table 4.3 - Proportion scores in the confusion matrix converted to Z scale units. The values are then summed along
the columns to compute the mean Z values. Finally, the values are shifted by the minimum value to anchor the values
to 0.
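These steps can be sketched with the standard library, starting from the Table 4.2 proportions (small differences from Table 4.3 reflect the rounding of normal-table lookups):

```python
from statistics import NormalDist

# Proportions from Table 4.2 (column preferred over row), rows/cols H1..H4.
P = [[0.500, 0.690, 0.627, 0.817],
     [0.310, 0.500, 0.365, 0.595],
     [0.373, 0.635, 0.500, 0.635],
     [0.183, 0.405, 0.365, 0.500]]

inv = NormalDist().inv_cdf
Z = [[inv(p) for p in row] for row in P]          # proportions -> Z-scores

t = len(P)
means = [sum(Z[r][c] for r in range(t)) / t for c in range(t)]
scale = [m - min(means) for m in means]           # anchor lowest object at 0
print([round(s, 3) for s in scale])               # ≈ [0.0, 0.581, 0.339, 0.803]
```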
Although it was hypothesised that increased height would result in better performance in
identifying the flown over Route, the results obtained in Table 4.3 were unexpected. In terms
of the order of the psychological objects, the performance at H1 and H4, corresponding to
the least and greatest heights respectively, fell in line with the expected results, with the
worst performance at the lowest altitude and the best performance at the greatest altitude.
The results also suggested, however, that performance at H2 was better than at H3, which
was inconsistent with the hypothesis for Experiment 2. These unexpected results led to further
analysis of the data, as follows.
Figure 4.11 - Final PCM scale values for route identification task for four Heights, from Table 4.3.
4.3.4 Outlier analysis

An examination of the raw data revealed an interesting observation about the participants’
responses. In particular, as illustrated in Figure 4.12, there appeared to be some confusion
between Routes 2 and 5, which had some similarities in overall shape but whose left and
right turns were reversed.¹³
In retrospect, it was surmised that this important commonality
may have confounded performance for the Route 5 trials, by leading the participants to
frequently select routes resembling Route 2 instead. This is especially evident in the bottom
row of Figure 4.12.
¹³ Although Route 2 and Route 5 might appear different in the (exocentric) North up representations shown in Figure 4.12,
the reader is reminded that the participants carried out these tasks from a track-up (rotating azimuth) perspective, meaning
that there was no easy way to maintain a sense of a canonical North as the participant flew over the route.
Figure 4.12 - Aggregated route selections for Route 2 and Route 5, for each of the four Heights H1 to H4. Each plot
contains 14 Routes in black ink (two for each of the seven participants), as well as one Route in dashed red ink
representing the correct Route. The routes are translated so that their starting points coincide, while maintaining the
original North up representation (as seen in the Route identification window).
This supposition was further supported by examining the individual PCM graphs derived
from the non-aggregated confusion matrices for each of the six Routes, shown in Figure 4.13.
For the most part, the graphs for each of the routes follow a general trend where a greater
height resulted in better Route identification performance. However, the relative scores for
the four Heights {H1, H2, H3, H4} for Route 5 stood in stark contrast to those for the other
five routes, with the greatest height H4 exhibiting the worst performance, followed by H3.
Figure 4.13 – Graphs for PCM results for each of the six Routes, aggregated over all participants, for Routes 1 to 6.
Taking into consideration what appeared to be a confounding outlier Route among the set of
six, PCM scale values were recomputed without the comparisons for Route 5. (The
calculations are provided in Appendix A5.1.) The resulting graph is shown in Figure 4.14(b),
along with the graph from Figure 4.11 for all six Routes reproduced for comparison in Figure
4.14(a).
Although the order of the psychological objects remained the same, comparing the graphs
reveals an interesting change in the spacing between performances at the different heights.
Whereas the objects in the original scale were spread out fairly evenly, the adjusted graph in
Figure 4.14(b) indicates non-uniform differences between objects. In particular,
performances at H2 and H3 appear to be much closer to each other, whereas the distances
between those two performances and that at H1 were enlarged. In other words, it appears that
performance at both H2 and H3 was better compared to H1, but that the difference between
H2 and H3 was relatively small. Similarly, the distances between H4 and all other
performances were even more pronounced in the adjusted scale.
Figure 4.14 – Final PCM scale values: (a) using all comparisons; (b) using all comparisons except those from Route 5.
4.3.5 Statistical tests and checking assumptions

As mentioned earlier, the Case V model of PCM uses simplifying assumptions concerning
the standard deviation of a discriminal process, which is referred to as its discriminal
dispersion. In order to check the assumptions of the Case V model, Edwards (1957) provides
a significance test which is sensitive to the property of additivity among other assumptions.
Furthermore, after the original 1927 paper on the PCM method, Thurstone (1932) and Burros
(1951) proposed methods for estimating discriminal dispersions of the psychological objects.
This would allow the objects to be scaled based on the dispersion values, in the event that
assumptions of the Case V model were not found to be tenable (Edwards, 1957).
Applying this test led to the conclusion that the assumptions of Thurstone’s Case V model
were not tenable for the Experiment 2 PCM data. The explanation and calculations are
provided in Appendix A6.1. Thus, adjustments were made to the scale values by estimating
the discriminal dispersions of the psychological objects, in order to interpret the data under
Thurstone’s Case III model. The Case III model imposes fewer restrictions on the discriminal
processes, namely that the standard deviations (or discriminal dispersions) are not assumed to
be equal. As this was the case for the Experiment 2 data, the discriminal dispersions were
calculated, as shown in Appendix 7. Essentially, the Case III model could be followed
instead of the more restrictive Case V model. Figure 4.15(c) shows the linear scale values with
the inclusion of the discriminal dispersions, under the Case III model.
Figure 4.15 - Three computed scale values for Experiment 2 results, (a) all data, Case V method (b) all data excluding
Route 5, Case V method, (c) all data excluding Route 5, Case III method.
In order to further illustrate the differences at the various levels of Height, a two-dimensional
plot is presented in Figure 4.16, with the X-axis showing the Height in metres above the
simulated terrain and the Y-axis showing the scale values.
Figure 4.16 - Experiment 2 PCM values for all data excluding Route 5, Case III method. The actual Heights in
metres are shown.
Follow-up analyses on the adjusted confusion matrix were performed to test for statistically
significant differences among scores. Starks and David (1961) developed a test statistic D for
this purpose:
D = 4[ Σᵢ₌₁ᵗ (aᵢ − ā)² ] / (N·t),

where aᵢ is the total number of preferences recorded for treatment i (the column totals of the raw data matrix) and ā = (Σᵢ aᵢ)/t.¹⁴

¹⁴ Note that special consideration must be taken for the value of N with regards to Experiment 2. There were 21
judges performing the paired comparisons for the 4 objects (different heights), meaning t(t-1)/2 = 6 pairs.
Furthermore, each of the 6 pairs of comparisons was repeated for each of the six Routes, for a total of 36 paired
comparisons per judge. It is believed that the original interpretation of N representing the number of judges is
somewhat misleading in this case, as each judge in essence provides a multitude of sets of judgements for the
PCM, albeit for pairs representing different instances of the same perceptual task. In other words, five of the six
sets of judgments would not be accounted for under the original interpretation of the statistical test. For this
reason, the value was modified to be N = 21*6 = 126.
This is the special case of a more general test proposed by Durbin (1951), and equivalent to
one proposed by Kendall and Smith (1940). It follows a χ² distribution with t−1 degrees of
freedom for the number of psychological objects compared. The test statistic uses the matrix
of raw data, operating on the null hypothesis that all treatments (in this case, Heights) are
alike in the response they evoke. In other words, the judgments from the paired comparisons
are matched against a null hypothesis of there being no preference between each pair of
objects, across all comparisons, which in turn is equivalent to postulating that all discriminal
dispersions are equal. The alternative hypothesis is that there is a preference between pairs
of objects. The ai values from the Experiment 2 data are shown in Table 4.4.
H1 H2 H3 H4
H1 - 71 73 98
H2 34 - 43 74
H3 32 62 - 75
H4 7 31 30 -
ai 73 164 146 247
Table 4.4 - Aggregated confusion matrix, with column totals, ai.
In this case, ā = 630/4 = 157.5, t = 4, and N = 21*6 = 126. Analysis showed that the differences
observed in the PCM scores were statistically significant: D = 121.63 against a critical value
χ²c(3) = 7.82, p < .05. Because D > χ²c, the null hypothesis was rejected. Thus,
results from the Paired Comparison Method suggested that Height did have a significant
effect on participants’ ability to perform the route identification task.
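The reported value can be checked numerically from the column totals in Table 4.4, taking the statistic in the form D = 4·Σ(aᵢ − ā)²/(N·t), a form consistent with the value given in the text (the χ² critical value of 7.82 is taken from the source):

```python
# Column totals a_i from Table 4.4; N = 21 judges x 6 routes; t = 4 Heights.
a = [73, 164, 146, 247]
t, N = 4, 21 * 6
a_bar = sum(a) / t                       # 630 / 4 = 157.5
D = 4 * sum((ai - a_bar) ** 2 for ai in a) / (N * t)
print(round(D, 2))                       # 121.63
chi2_crit = 7.82                         # critical value, chi-squared, t-1 = 3 df
print(D > chi2_crit)                     # True -> reject the null hypothesis
```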
Pairwise contrasts were calculated for the six pairs of combinations of Heights using a
method analogous to the Scheffé method (corrected for Type I error), developed by Starks
and David (1961). The method uses an approximation of the distribution of joint probabilities
of the preferences for one treatment against the remaining treatments in the set. For the
specified contrasts, a Q² test statistic is created to match the differences between sets of those
approximations against a critical χ² value, different from the one described above, which has
been adjusted for the contrast between two treatments. The results in Table 4.5 show that 5 of
the 6 contrasts were statistically significant for the column element being preferred over the
row element. Please refer to Appendix A8.1 for calculations of the contrasts.
H1 H2 H3 H4
H1 - 65.72** 42.29** 240.29**
H2 - 54.67**
H3 2.57 - 80.96**
H4 -
Table 4.5 - Results of pairwise contrasts between levels of Height in Experiment 2, following the Scheffé method
outlined in Starks and David (1961). The value in each cell represents a Q² test statistic for the column element being
preferred over the row element. Critical values at α = 0.05 and 0.01 are indicated by * and ** respectively.
Figure 4.17 - Plot of the PCM scale values, including contrasts results. Each line indicates that a significant contrast
was found between the conditions at the endpoints of that line.
4.4 Discussion
Overall, the results confirmed the hypothesis of increased performance with increased height,
but with some important nuances. It was clear from Figure 4.15(c) and Figure 4.17 that
performance at the lowest height H1 = 20m afforded the worst performance, and that
performance was best at the highest height, H4 = 164m. Pairwise contrasts revealed that
performance at these two Heights was found to be significantly different from all other
Height levels. However, performance at H2 and H3 was not found to be significantly
different from each other, suggesting that increasing the height from 56m to 92m provided no
additional benefit, even though an increase from 20m to 56m showed a statistically
significant performance increase.
In conclusion, the results of Experiment 2 therefore indicated that Height did have an effect
on performance in the route identification task. Consequently, one of the height values was
selected as a fixed value for Experiment 3. In deciding which height that would be, it should
be recalled that, with a fixed-size FOV, increasing the height above the terrain increases
the amount of terrain that can be observed in any one image. A functionally equivalent effect
might be achieved by providing a FOV whose size is double that of the single size (i.e. in the
dFOV condition), such that it affords a larger portion of the flyover route to be viewed at any
one time, albeit with a larger display size.¹⁵ Therefore, in order to select a height value from
Experiment 2 to be used in Experiment 3, a value was chosen with the intention that there
would be potential for improved performance from a larger view of the terrain, whether by
increased height or by extending the FOV. In the case of Experiment 3, I sought to
investigate whether performance could be improved by extending the FOV through mosaicing.
Within the context of the Experiment 2 results, H4 = 164m was found to produce much
improved performance relative to H3 = 92m. Thus setting the height at H3 for the single
FOV condition was expected to allow sufficient room for potential improvements to be found
using the dFOV and mFOV conditions, whose extended FOVs might provide an effect
analogous to increasing the height to H4.
In summary, given the complex winding routes and perspective viewpoint introduced in
Experiment 2, this investigation was carried out to determine the effect of height on route
identification performance. As such, the results had important implications for the follow-up
experiment, as the equal interval scale allowed the appropriate height to be selected at which
the route identification task could be accomplished. Furthermore, a new method was developed
for evaluating human performance in route identification involving complex winding routes.
The Paired Comparisons Method has proven useful in objectively evaluating aggregate data
along a psychological continuum of quality, to provide a measure of performance that could
not be attained by purely computational methods.
15
The only other difference would be that the spatial resolution with which the terrain is viewed changes as the
height is varied. However, I do not believe that this is important for the purposes of selecting the height for the
global spatial awareness task in the present investigation.
Chapter 5. Experiment 3
5.1 Introduction
After developing a complex route in the form of a long winding river, the results of
Experiment 2 served to determine an appropriate height above the terrain, H3 = 92m, at
which performance differences between the three display conditions were expected to be
observable. Furthermore, as a consequence of the limitations of conventional computational
methods for evaluating global task performance, the method of Paired comparisons was
adopted for evaluating the routes selected by the participants.
It should be recalled that a perspective viewpoint was adopted in Experiment 2, as another
factor that was presumed to make the task of performing route identification more difficult. It
was surmised that the introduction of perspective foreshortening in an angled view would
lead to a greater challenge in forming an accurate cognitive map of the terrain compared to a
top down view. In Experiment 3, I sought to explicitly test the effect of viewing perspective
on route identification performance. In particular, two camera elevation angle (EA) values
were tested: EA = 90°, corresponding to the top down view used in Experiment 1, and EA =
45°, equivalent to the viewpoint used in Experiment 2. The two EA values are shown in
Figure 5.1. The results are expected to have practical implications for the use of mosaic
displays, whose shape properties depend on the parameters of the camera’s viewpoint
relative to the terrain. For a discussion of changes made to the flyover environment based on
pilot testing of EA, please refer to Appendix A10.3.
Figure 5.1 - Illustration of the angled (45°) and top down (90°) viewpoints used in Experiment 3.
Using the results and lessons learned from the first two experiments, Experiment 3 sought to
determine the effect of the mosaiced FOV on performance in a spatial awareness task
involving traversal over and identification of a complex winding route. Thus in Experiment 3
I revisited the hypothesis presented in Experiment 1 – that the unique shape properties and
increased size of the mosaic FOV condition should afford increased performance in both the
local and global spatial awareness tasks compared to a fixed single FOV. Thus three display
conditions were tested in Experiment 3 in order to investigate the effect of image mosaicing
on spatial task performance.
Single field of view (sFOV)
Double the size of the single field of view (dFOV)
Mosaic field of view (mFOV)
In order for the mFOV and dFOV conditions to have roughly equivalent display sizes, an
analysis of the display areas determined that an image mosaic composed of 10 superimposed
frames matched the display size of the dFOV. Details of the procedure are provided in
Appendix 9.
Taken together, the two camera elevation angles and the three display sizes produce a total of
six combinations of viewing conditions, as shown in Table 5.1.
Table 5.1 – The six combinations of display condition and camera elevation angle used in Experiment 3: the two camera elevation angles (45°, 90°) crossed with the three display sizes (sFOV, mFOV, dFOV).
As the Elevation angle is changed from 90° to 45°, the viewpoint will cause perspective
foreshortening, which is expected to distort the participants’ judgements of the spatial
relationships of terrain features in the scene viewed from the camera’s FOV. In the 90°
viewpoint, the participant should be better able to integrate the spatial information of the
environment between successive views, and thus it was hypothesised that performance in
identifying the traversed route should be better in the 90° condition compared to the 45°
condition. Concerning the target detection task, the EA = 45° condition provides a ‘preview’ of the
upcoming terrain compared to the view afforded by the EA = 90° condition. Compared to the
targets in the EA = 90° condition, whose sizes remain fixed during the flyover, the foreshortening
causes the targets to appear smaller as they enter the FOV at the top of the display, and then
to grow in size as they approach the bottom of the display. As Stager (1974) pointed out,
targets appear in the fixation field for a longer period of time as the operator gazes away
from the terrain beneath himself, and thus it was hypothesised that target detection
performance would be higher at EA = 45°.
The single FOV (sFOV) acts as a baseline condition, whose display size is smaller than both
the mosaic FOV (mFOV) and the double size FOV (dFOV). It was hypothesised that the
larger size afforded by the mFOV and dFOV would result in better route identification
performance compared to the sFOV condition. In addition, it was expected that the unique
shape properties of the mFOV would help in forming the spatial relationships between the
objects and textures in the environment. Accordingly, it was hypothesised that task
performance in the mFOV condition would be better than in the dFOV condition. Concerning the target detection task, as in
Experiment 1, it was hypothesised that the extended FOV would afford higher detection
performance in the dFOV and mFOV conditions, compared to the sFOV condition.
5.2 Experimental procedure
Combining the procedures of Experiments 1 and 2, participants were asked to watch a series
of 25 sec flyover videos, generated in a virtual environment created in Google Sketchup and
Google Earth. As shown in Figure 5.2, six routes were selected from the same long river used
in Experiment 2. A similar approach to that of Experiment 1 for defining ‘target zones’ was
taken (as described in Section 3.3), by including ‘neutral zones’ to ensure that only one target
could appear within the display at any one time. Each video consisted of a computer
generated grass terrain and a river running the length of the route, with each route containing
seven targets, shown in Figure 5.3, placed within the grass terrain. During the
flyover, participants indicated that they detected a target by pressing on a large button located
on the bottom of the ‘Route flyover’ window using a computer mouse, shown in Figure 5.4.
After each video, participants were asked to select the route flown over from within the long
winding river using the buttons in the ‘Route identification’ window, as in Experiment 2.
Figure 5.2 - Display of the six Routes selected for Experiment 3, chosen from the long continuous river (Left). Routes
on right show start of each Route with a green marker and end of each Route with a red marker.
(a) (b)
Figure 5.3 - Example of target used in Experiment 3: (a) target magnified to show textures, (b) target within flyover
terrain. In this screenshot, the target is located on the bottom right of the FOV.
In order to ensure that display areas of equal size were being searched for targets for all
conditions, and thus that no advantage was given to the larger mFOV or dFOV conditions,
participants were asked to detect the targets within a portion of the screen demarcated by red
markers superimposed on the display window. In the screenshot shown in Figure 5.4, the
markers flanked the sides of the entire sFOV display area. In the mFOV and dFOV
conditions, the markers demarcated an area equal to that of the sFOV condition in the top
half of the display.
In their briefing on the scenario and the two experimental tasks, participants were asked to
perform both tasks equally well. They then conducted 12 training trials with feedback on
both tasks. Two trials for each combination of display condition and viewing perspective
were provided. In the target detection task, the experimenter watched the training video
alongside the participant, and made notes of any missed targets or False Alarms. Participants
were told that there could be any number of targets in each route during the training and
experimental trials. At the conclusion of the video, the experimenter replayed the video and
pointed out the missed targets. For the global awareness task, the participant was shown the
correct route on the response grid after having made his selection.
Figure 5.4 - Screenshot of the ‘Route flyover’ window for the dFOV condition. Participants were asked to press the
‘Target detected’ button beneath the image when a target appeared within the area designated by the red markers.
Note: a target is currently showing in the screenshot, half-covered at the top of the FOV.
To ensure that participants were searching within the markers, as well as to generate baseline
performance data, 3 trials were conducted after the training session for which participants
were asked to perform only the target detection task. The participants then completed a
series of 42 experimental trials, carried out in 6 sets of 7 trials in each set. In each of the six
sets, one combination of display size (3) and camera elevation angle (2) was presented. The
first trial of each set of 7 trials was a flyover video different from the 6 Correct routes chosen
for the experiment.16 (This first trial from each block was excluded from the analysis to avoid
any confounding transfer effect between blocks.) The remaining 6 trials corresponded to the
6 Correct routes illustrated in Figure 5.2, randomised for each set and for each participant.
Four pseudorandom sequences of the 6 sets of display conditions were generated, and
distributed among the 13 participants. One sequence was given to 4 participants while the
other three were given to 3 participants. A break of 1 minute was enforced between trial
blocks.
After completing all of the experimental trials, the participants conducted a subjective rating
of the six viewing conditions, to elicit some knowledge about which conditions they felt
afforded the best performance in the global task. The participants were asked to perform a
series of randomised paired comparisons of each of the 15 (=6*5/2) combinations of display
conditions and viewing parameters. In each comparison, the Participant Paired Comparison
Window showed two flyover videos of the same Route but with different viewing parameters
playing simultaneously. The participant was then asked to indicate “which of the two viewing
conditions allowed you to more accurately identify the shape of the Route?” A screenshot of
the ‘Participant Paired Comparison’ Window is shown in Figure 5.5. The participant
responded to each comparison by filling out a Google form on another computer monitor
containing the options LEFT or RIGHT for each pair.
16
Three different routes were chosen for the purposes of being the first trial in the set. Thus, across the 6 sets of
routes, each of these three routes was used twice.
Figure 5.5 – Screenshot of the ‘Participant Paired Comparison’ Window, presented to the participants after
completing all experimental trials.
A fully within subjects experiment was performed with 13 graduate students from the
University of Toronto. All were between the ages of 18 and 40, with normal or corrected to
normal vision. Only male participants were recruited in order to avoid any confounding inter-
gender differences in spatial awareness performance. None reported any previous experience
as a search and rescue operator.
The participants completed the entire experiment within approximately two hours and were
compensated with $30.
A similar procedure to Experiment 2 was carried out for using the Paired Comparisons
method to evaluate the performance in the route identification task. The route selections were
aggregated over all participants in order to provide a representation of the collective
performance across the 6 viewing conditions. For each of the routes, 15 pairs (=6*5/2) of
display plus viewing conditions were compared. For the 6 routes, this resulted in a total of 90
pairs (6*15) of diagrams. Twenty-four volunteer judges were recruited to carry out the
paired comparisons. Appendix A4.2 shows the instructions sent to the judges, as well as
screenshots of the interface for performing the paired comparisons.
5.3 Results
5.3.1 Target detection task
The procedure for analysing the target detection performance was similar to that of
Experiment 1. Screenshots of the display were recorded when the participants clicked on the
‘Target detected’ button on the interface. To measure performance on the task, the
experimenter visually assessed all instances of reported target detections for all participants.
While there were instances where some targets were missed, there were no instances for
which a participant had detected a target when there was in fact no target present (i.e. False
Alarms). As was done in Experiment 1, a simple performance measure of proportion of
identified targets was thus adopted.
The baseline target detection trials, carried out prior to the regular flyovers, when the
participants performed only the detection task, showed perfect performance – that is, all
targets were detected, with no False Alarms. Figure 5.6 shows target detection performance
for the six combinations of display conditions, aggregated over all 13 participants. The graph
suggests that, overall, the different levels of FOV size and viewing perspective did not appear
to influence the detection rate.
Figure 5.6 - Graph of target detection performance for the six experimental conditions: {45°,90°}x{sFOV, mFOV,
dFOV}.
A 2-way within-subject ANOVA (elevation angle × display condition) found no significant
main effect of angle (F(1,12) = .156, p > 0.05), no significant main effect of display
condition (F(2,11) = .156, p > 0.05), and no significant interaction effect (F(2,11) = .156,
p > 0.05). Please refer to Appendix A1.2 for the statistical tables and the results of the
assumption tests.
5.3.2 Route identification task
Using Thurstone’s Case V method, the judgments were compiled into a
6x6 confusion matrix for the six combinations of display condition and viewing condition, by
combining judgments for all of the six routes.17 Given that the computed scale values
represent six combinations of a two-factor design, the scale values were plotted into two 2-
dimensional graphs separated by Display size and Elevation angle in Figure 5.7, to illustrate
the relationships between the treatments. The scale values are shown in units of Standard
Normal deviates, for reasons of convenience when discussing the results. The calculations for
the scale values are shown in Appendix A5.2.
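For reference, Case V scaling can be sketched in a few lines. The preference matrix below is hypothetical; the thesis’s actual computations are those in Appendix A5.2.

```python
from statistics import NormalDist

# Sketch of Thurstone Case V scaling (hypothetical data).
# P[i][j] = proportion of judgments preferring condition j over condition i.

def case_v_scale(P):
    n = len(P)
    # z-transform each off-diagonal proportion via the inverse normal CDF
    z = [[NormalDist().inv_cdf(P[i][j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    # scale value of condition j = mean of column j of the z matrix
    means = [sum(z[i][j] for i in range(n)) / n for j in range(n)]
    lo = min(means)
    return [m - lo for m in means]  # shift so the lowest condition scores 0

# Hypothetical 3-condition preference matrix
P = [[0.5, 0.7, 0.9],
     [0.3, 0.5, 0.8],
     [0.1, 0.2, 0.5]]
print(case_v_scale(P))  # scale values increase with preference strength
```

Because the column means are in units of standard normal deviates, the resulting values form an equal-interval scale, which is what permits the distances between conditions in Figure 5.7 to be compared directly.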
(a) (b)
Figure 5.7 - Plots of PCM results for closeness of aggregated route selections to Correct route performance: (a)
across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for each Display size.
17
Note that there was no interest in rating performance for the different routes, which is why data were
combined over routes.
It was observed that the worst route identification performance was judged to be at {EA=45°,
sFOV}, while the best performance was judged to be at {EA=90°, mFOV}. The results in
Figure 5.7(a) suggest that for each of the three Display sizes, performance at EA = 90° was
judged to be better than at EA = 45°. The separation between the EA values appears larger
for the sFOV and mFOV than for the dFOV. The results in Figure 5.7(b) suggest
that for each level of Elevation angle, performance in route identification was judged to be
best for the mFOV, followed by the dFOV and then the sFOV. However, for EA = 45°, the
separation between dFOV and mFOV was relatively small, compared to the equivalent
conditions for EA = 90°. Among the three Display sizes, the route selections in the sFOV
were judged to be of the lowest overall quality; that is, they least closely matched the correct
route.
A one-way statistical analysis showed that the differences observed in the PCM scores were
statistically significant, D = 233.16 against a critical value χC2(5, N = 24*6) = 11.07, p < .05.
Because D > χC2, the null hypothesis was rejected. In other words, there were significant
differences among the six combinations of Display size and Elevation angle; consequently, a
set of pairwise contrasts between pairs of viewing condition combinations was computed.18
The contrasts between the three Display sizes (contrasts 1-6) and between the two Elevation
angles (contrasts 7-9) are shown in Table 5.2. The calculations for the complete set of all 15
pairwise treatments contrasts are provided in Appendix A8.2.
18
One issue that arises here is that of being able to collapse across conditions, such as Display size or Elevation
angle, to investigate the overall effect of each factor on its own using pairwise contrasts. In order to do this, one
would require that a separate set of paired comparisons be carried out with the routes presented on the same
graph across conditions. For example, to perform pairwise contrasts collapsed over Elevation angle, one would
need to present judges sets of routes for the various Display sizes, aggregated over EA = 45° and 90°. Such
data were not collected in this study, and thus pairwise contrasts to test for such main effects are not possible.
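As a side note, the χ² critical value quoted above can be reproduced with a standard quantile function; the use of scipy here is illustrative (the thesis presumably used statistical tables).

```python
from scipy.stats import chi2

# chi-square critical value at alpha = 0.05 with 5 degrees of freedom,
# matching the 11.07 quoted for the overall test of the six conditions
print(round(chi2.ppf(0.95, df=5), 2))  # -> 11.07
```

Note that the Scheffé-adjusted critical values used for the pairwise contrasts (22.14 and 30.18) are larger, since they account for the multiplicity of comparisons.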
Contrast 1: {45°, sFOV} vs {45°, mFOV}, Q2 = 185.19**
Contrast 2: {45°, sFOV} vs {45°, dFOV}, Q2 = 114.12**
Contrast 3: {45°, mFOV} vs {45°, dFOV}, Q2 = 8.56
Contrast 4: {90°, sFOV} vs {90°, mFOV}, Q2 = 183.34**
Contrast 5: {90°, sFOV} vs {90°, dFOV}, Q2 = 31.89**
Contrast 6: {90°, mFOV} vs {90°, dFOV}, Q2 = 62.30**
Contrast 7: {45°, sFOV} vs {90°, sFOV}, Q2 = 45.38**
Contrast 8: {45°, mFOV} vs {90°, mFOV}, Q2 = 44.46**
Contrast 9: {45°, dFOV} vs {90°, dFOV}, Q2 = 2.89
Table 5.2 – Sets of pairwise contrasts for the judges in Experiment 3, following the Scheffé method outlined in Starks and David (1961). Each row lists the pair of treatments compared and the resulting Q2 test statistic. Critical values: χC2(5, 0.05)* = 22.14, χC2(5, 0.01)** = 30.18.
The results revealed a number of significant pairwise contrasts, as 7 of the 9 contrasts were
found to be significant at α = 0.01. Contrasts 1 to 6, between the three Display sizes for each
Elevation angle, were found to be significant except for that between {45°, mFOV} and
{45°, dFOV}. Contrasts 7 and 8, between the two Elevation angles for the sFOV and mFOV
respectively, were found to be significant. Contrast 9, for {45°, dFOV} and {90°, dFOV}
was not found to be significantly different. Figure 5.8 shows the results from the paired
contrasts graphically, where each line on the graph represents a significant contrast between
the conditions at the end points of the line.
(a)
(b)
Figure 5.8 - Two-dimensional plots for data from Experiment 3 PCM route identification performance, with pairwise
contrast results, (a) across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for
each Display size.
5.3.3 Participant subjective ratings of six viewing conditions
The participants were asked to conduct a series of paired comparisons for the 6 different
viewing conditions, each of which represented a combination of viewing perspective and
display FOV. The participants were told that they should respond to which viewing condition
they felt supported better performance in the global task (regardless of the particular route
shown in the paired comparison). In other words, the videos shown in the paired comparisons
were meant to remind the participant of all of the different viewing conditions used during
the experimental task for the particular route shown. The resulting scale values are shown as
two-dimensional graphs in Figure 5.9. Please see Appendix A5.3 for the related
computations.
(a) (b)
Figure 5.9 - Plots of PCM results generated by participants, for the question “which of the two viewing conditions
allowed you to more accurately identify the shape of the Route?”, (a) across the three FOV conditions for each
Elevation angle, (b) across the two Elevation angles for each Display size.
Overall, participants found that identifying the route shape was easier in the EA = 45° angle
case compared to EA = 90°. Participants also indicated that an enlarged FOV, either mFOV
or dFOV, made identifying the route shape easier compared to sFOV. In the EA = 90°
condition, participants found the global task was easier with the dFOV compared to the
mFOV. The positions were reversed in the angled viewpoint, as participants found the
mFOV easier than the dFOV when performing the global task.
Analysis showed that the differences observed in the PCM scores were statistically
significant, D = 68.38 against a critical value χC2(5, N = 13) = 11.07, p < .05. Because D >
χC2, the null hypothesis was rejected. Thus, results from the paired comparison method
suggested that there were overall significant differences between the participants’ subjective
ratings of the viewing conditions.
Pairwise contrasts were computed for all combinations of the six Display conditions. The
contrasts between the three Display sizes (contrasts 1-6) and between the two Elevation
angles (contrasts 7-9) are shown in Table 5.3, showing that only 3 of the 9 contrasts were
significantly different. Please refer to Appendix A8.3 for calculations, as well as the full set
of contrasts.
Contrast 1: {45°, sFOV} vs {45°, mFOV}, Q2 = 11.54
Contrast 2: {45°, sFOV} vs {45°, dFOV}, Q2 = 18.51
Contrast 3: {45°, mFOV} vs {45°, dFOV}, Q2 = 0.82
Contrast 4: {90°, sFOV} vs {90°, mFOV}, Q2 = 16.62
Contrast 5: {90°, sFOV} vs {90°, dFOV}, Q2 = 27.13*
Contrast 6: {90°, mFOV} vs {90°, dFOV}, Q2 = 1.28
Contrast 7: {45°, sFOV} vs {90°, sFOV}, Q2 = 34.67**
Contrast 8: {45°, mFOV} vs {90°, mFOV}, Q2 = 37.38**
Contrast 9: {45°, dFOV} vs {90°, dFOV}, Q2 = 16.62
Table 5.3 - Set of pairwise contrasts for the participant ratings in Experiment 3, following the Scheffé method outlined in Starks and David (1961). Each row lists the pair of treatments compared and the resulting Q2 test statistic. Critical values: χC2(5, 0.05)* = 22.14, χC2(5, 0.01)** = 30.18.
Figure 5.10 shows the two-dimensional graphs with the statistically significant pairwise
contrasts highlighted.
(a) (b)
Figure 5.10 - Two-dimensional plot for participant ratings of Display conditions, with pairwise contrast results, (a)
across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for each Display size.
5.4 Discussion
Experiment 3 built on the results of Experiment 2, from which an appropriate Height was
selected at which there was a reasonable chance that differences might be observed
between the three display conditions: {sFOV, dFOV, mFOV}. Two camera elevation angles
were also tested: {45°, 90°}, the perspectives used in Experiments 2 and 1 respectively.
5.4.1 Target Detection Results
While there were some missed targets in the participants’ target detection responses, no False
Alarms occurred, which was consistent with results for Experiment 1. The results showed no
significant differences in performance across the six display FOV and elevation angle
conditions, with a mean detection rate of approximately 78% of targets. (See Figure 5.6.)
Thus, the hypotheses were not confirmed for the target detection task.
As described in Section 2.6.2, Stager’s (1974) investigation of the effect of Elevation angle
(EA) on target detection performance revealed that an angled viewpoint would allow targets
to remain in the visual field for a longer period of time compared to a top down view. This
led to the hypothesis that target detection performance would be better at EA = 45° compared
to EA = 90°. However, the target detection results revealed no differences between the two
levels of EA. In explaining this result, it should be noted that, although the simulated
environment was created to replicate some of the environmental features of aerial search, the
height above the simulated terrain was not scaled to real aerial search scenarios. Typically
real aerial search operations may be conducted at heights of up to 1000 ft (approximately 305
m) above the terrain (Stager, 1974). Although it was anticipated that the effect would still
manifest itself at a lower height, as indicated by pilot studies showing encouraging results, no
effect was observed in Experiment 3.
The reader is reminded that the height above the terrain in Experiment 3 was set at 92 m as a
result of calibrating the route identification task in Experiment 2. Presenting the terrain at a
height of 457.2 m would have resulted in the entire route being visible in the sFOV case, as
shown in Figure 5.11, making the route identification task trivial. However, in hindsight,
perhaps an important factor is the apparent size of the targets appearing at a height of 305 m.
In other words, the targets appearing at a height of 92 m in Experiment 3 could have been
scaled to be approximately the same size as objects appearing in the visual field at a height of
305 m. Accomplishing this would involve searching the literature for the sizes of typical
objects sought in aerial search scenarios, and calculating each object’s subtended visual
angle at 305 m. Such a change might have revealed some advantages of the angled viewpoint,
in accordance with those reported by Stager (1974).
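The scaling described above amounts to matching subtended visual angles. A minimal sketch follows; the 4 m object size is hypothetical, since the thesis does not specify target dimensions.

```python
import math

# Sketch of the visual-angle scaling described above.

def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of size_m at distance_m."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

def scaled_size(real_size_m, real_height_m=305.0, sim_height_m=92.0):
    """Size a simulated target needs at sim_height_m to subtend the same
    visual angle as real_size_m viewed from real_height_m."""
    theta = 2 * math.atan(real_size_m / (2 * real_height_m))
    return 2 * sim_height_m * math.tan(theta / 2)

print(round(visual_angle_deg(4.0, 305.0), 2))  # angle of a 4 m object at 305 m
print(round(scaled_size(4.0), 2))              # ~1.21 m equivalent at 92 m
```

For such small angles the exact computation reduces to simple proportional scaling by the ratio of heights (92/305), so a hypothetical 4 m object would be rendered at roughly 1.21 m in the 92 m simulation.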
Figure 5.11 – A route used in Experiment 3 shown at a height of 457.2 m above the terrain. The entire route is shown
in the single FOV (sFOV).
Display size was hypothesised to result in better performance in the dFOV and mFOV
conditions, as targets appeared in the FOV longer compared to the sFOV condition.
However, the results showed no differences between the three Display sizes. It is possible to
explain this result, admittedly in hindsight, by considering the particular structure of the task.
In order to ensure that the participants were searching in approximately equal areas of the
display, they were asked to identify targets only within markers placed on the screen. The
markers demarcated an area equal to the size of the single size FOV in all three conditions.
Thus, in retrospect perhaps it is not surprising that no differences in target detection
performance were found. It should also be noted that two participants found the mFOV
condition to be distracting during the flyover, as the mosaic FOV displayed the occasional
jittery frame. This may have shifted their attention away from the zone demarcated by the red
markers, contributing to lower target detection performance in the mFOV. However, one side
effect of the apparently uniform performance levels in the target detection task across
conditions is that any observed differences in the route identification task can be argued to be
the result solely of differences in the independent variables, and not the influence of the
participants’ performance in the target detection task.
5.4.2 Route Identification Results
The Route selections were aggregated across participants to gain insight into collective
performance across the different conditions. The 24 volunteer judges performed the Paired
comparisons method by selecting, for each pair of Route ensembles, which set of Routes
collectively more closely resembled the Correct Route. Figure 5.7 shows that performance
was judged to be highest at {90°, mFOV}, while the worst performance was judged to be at
{45°, sFOV}, with the remaining four conditions clustered together. Because Thurstonian scaling
produces an equal-interval scale, the smaller relative distances between the four points
indicate that these conditions were judged to be closer in quality (i.e. in matching the Correct
route). A one-way analysis confirmed that there were significant differences between the six
viewing conditions, and two-dimensional graphs (Figure 5.8) with contrasts were generated
to illustrate the differences in performance between the conditions.
It was hypothesised that performance at EA = 90° (top-down view) would be better than at
EA = 45°, due to the difficulties imposed by perspective foreshortening from an angled
viewpoint. Figure 5.7(a) illustrates the relationships between the six conditions separated by
Display size for each level of Elevation angle. For each Display size, performance in the EA
= 90°, or top down view condition, was judged to be better compared to EA = 45°, as
hypothesised. In both the sFOV and mFOV, pairwise contrasts revealed significant
differences between the two levels of EA; however this was not the case for dFOV.
The reasons are not clear, as the perceived expansion and compression of objects in the FOV
from an angled viewpoint would be even more pronounced in an extended FOV compared to the
sFOV (where an effect was observed). Perhaps in the dFOV condition the effect of preview
was sufficient to aid participants in forming a cognitive map of the flyover route, countering
the potential negative effects of perspective foreshortening in identifying the route.
Furthermore, the fact that route features remained in the FOV longer compared to the sFOV
may help to explain the differing results between the sFOV and dFOV conditions. That is,
the extended FOV allowed the participant to sample terrain features over a longer period of
time compared to the sFOV, facilitating the formation of the cognitive map, such that parity
was reached between the two levels of Elevation angle.
The relative placement of the sFOV, mFOV and dFOV in Figure 5.7(b) followed the same
pattern for each camera Elevation angle (EA), EA=45° and EA=90°. Performance in
identifying the Route was judged to be worst in the sFOV, likely due to the relatively limited
FOV that displayed fewer features of the environment at any one time compared to the
extended FOV conditions (mFOV and dFOV). Because the displayed features remained in
the FOV for a relatively short period of time, less accurate cognitive maps were formed,
leading to less accurate Route selections.
Performance in the dFOV condition was judged to be better than for sFOV. The extended
FOV afforded more environmental features of the complex route, as well as their spatial
relationships to be viewed at any given time, so that features could be integrated more easily
during the traversal.
Finally, the mFOV performance was judged to be the highest among the three display
conditions, for both EA = 45° and EA = 90°. In addition to having an extended FOV at
roughly the same size as the dFOV, the important difference between the dFOV and mFOV
conditions was the shape property of the mosaiced image, which directly displayed the shape
of the path followed by the simulated aircraft traversing the terrain.
Contrasts revealed significant pairwise differences between the three Display sizes for each
level of EA, which was consistent with the hypotheses, except for one contrast between {45°,
dFOV} and {45°, mFOV}. In the condition where the angled viewpoint was used, the
potential advantage of the explicit shape of the camera’s path for the mosaic FOV did not
produce significantly better performance.
As discussed earlier, performance at EA = 90° was found to be better than at EA = 45°,
which was consistent with the hypothesis that the perspective foreshortening would cause
distortions in cognitive maps formed during a flyover. Perhaps a “ceiling effect” in
performance was reached at EA = 45°, in that the benefits of the additional shape
information could not completely overcome the deficiencies imposed by the perspective
foreshortening.
Note that a significant contrast was found between dFOV and mFOV in the top-down EA =
90° condition, however. In order to investigate at what value of EA the advantage of viewing
the shape of the camera’s path on the screen manifests itself over having an extended FOV
(i.e. the dFOV condition), more levels of Elevation angle should be tested. This is noted in
Section 6.4 as a suggestion for future work.
Relating to the previous discussion of the effect of Elevation angle, Figure 5.12 highlights the
three scale values {45°, mFOV}, {45°, dFOV} and {90°, dFOV} that were not found to be
significantly different from each other. Perhaps the collection of additional data from
participants would have elicited further separation between these conditions.
Figure 5.12 - Graph highlighting (with a dotted circle) the three scale values that were not found to be significantly
different from each other in pairwise contrasts.
5.4.3 Participants’ Subjective Rating Results
In addition to carrying out the route identification task, the participants were also asked to
give subjective ratings of the effectiveness of the different display conditions for carrying out
that task. After completing all flyover trials, they used the paired comparison procedure to
compare the six display conditions with regards to which conditions allowed them in general
to achieve the best performance in identifying routes. The resulting graphs (Figure 5.7) had
both similarities and differences in comparison with the route identification results obtained
from the judges (Figure 5.9).
One interesting result from the subjective rating scale was that participants felt that the
viewpoint perspective EA = 45° allowed them to perform better than with the top down view
at EA = 90°. Significant contrasts found this to be the case for sFOV and mFOV, but not for
dFOV. A number of participants noted that the angled viewpoint allowed them to see further
ahead, providing the effect of “preview” for the incoming terrain features. However, the
results from the external judges’ evaluation showed that performance in the top down view
was generally better than with the angled viewpoint. Referring to known difficulties in
acquiring spatial knowledge from perspective views, the distortions caused by perspective
foreshortening for EA = 45° appeared to be greater than the participants’ perceived benefits
of preview. However, from the top down viewpoint, no perspective foreshortening occurred,
allowing the benefits of the real-time mosaic to manifest themselves.
In general, the conditions with the extended FOV (mFOV and dFOV) were found to be more
helpful than the sFOV. However, there was a reversal between the mFOV and dFOV
conditions about which provided better performance in the EA = 45° and EA = 90° cases,
respectively. In the EA = 45° condition, the mFOV was found to be better, whereas the dFOV
was found to be better in the EA = 90° condition.
Although these differences were not found to be statistically significant, the written
comments provided by the participants after they completed the experiment may offer some
insights into the preference of the extended FOV. One participant wrote that the 45° angle
condition with the “mosaic view also helped a lot, mostly through redundancy (not just the
line of river indicating curvature, but angle at which frames are connected, too)”, referring
presumably to the shape directly displayed in the mFOV. As described earlier, the tunnel
effect may have provided extra visual cues about the sharpness of curves along the route.
It should be recalled that no such “tunnel effect” was present for the mFOV condition in the
top down view. Two participants noted verbally that the mFOV condition could have
distracted them during the flyover, as the overlapping frames of the mosaic FOV displayed
the occasional jittery frame, depending on the terrain features captured in the camera’s field
of view. One of these two participants reported that in a few of the EA = 90° trials, his
attention shifted to the bottom of the mFOV as it changed shape and size. This may have led
participants to select the dFOV condition as being easier to identify the shape of the route.
In summary, the participants’ ratings of the Display conditions revealed some interesting
findings. First, the ordinal relationships between the single FOV and the extended FOV
conditions were consistent with the hypotheses, as well as the judges’ PCM results. However,
participants expressed a clear preference for the angled viewpoint over the top down view,
which was counter to the hypotheses and the evaluation of their route identification data.
5.4.4 Summary
Contrary to the hypothesised improvements in performing the target detection task using
angled viewpoints and in the extended FOV conditions, the results for Experiment 3 did not
support the hypotheses. The results showed no differences in target detection performance
between the six combinations of Display size and Elevation angle.
The route identification results, on the other hand, did support both hypotheses concerning
the Display size and Camera Elevation angle. For Display size, performance was judged to
be highest in the mFOV condition, followed by dFOV and then sFOV, for both EA = 45° and
90°. Pairwise contrasts revealed significant differences in the hypothesised directions, except
between {45°, dFOV} and {45°, mFOV}. For the Elevation angle, performance using the
top-down view (EA = 90°) was judged to be better than at an angled viewpoint (EA = 45°)
for all three Display sizes. Contrasts revealed that all pairwise differences were significant,
except between {45°, dFOV} and {90°, dFOV}.
The participants’ ratings of the Display conditions revealed some interesting findings. First,
the ordinal relationships between the single FOV and the extended FOV conditions were
consistent with the hypotheses, as well as the judges’ PCM results. However, participants
expressed a clear preference for the angled viewpoint over the top down view, which was
counter to the hypotheses and the evaluation of their route identification data.
Chapter 6. Conclusions
The demands of visual tasks in a number of domains have brought to light the frequently
encountered difficulty in forming and maintaining an accurate cognitive map while scanning
a scene for objects of interest. One notable example of this challenge is in aerial search, in
which operators may be tasked with searching for targets using a narrow field of view (FOV)
camera system, while maintaining global awareness of the environment in the event that a
target is detected. In doing so, the operator may be forced to trade off performance in one
task in order to accomplish the other, which may lead to targets being missed, disorientation
and/or a loss of understanding of the surrounding environment. An investigation of the
research literature on target detection and route identification tasks revealed a concerted
effort to provide means for enhancing human performance in aerial search type tasks. More
broadly, the extensive literature on how humans form cognitive maps and the range of
methods related to how cognitive maps can be externalised and evaluated provides a strong
theoretical basis for the tasks and evaluation methods used in the present study.
In considering ways to potentially improve human performance in these types of tasks, the
continued increase in computer processing power and the dropping cost of computing
hardware present an intriguing software solution in the form of real-time image mosaicing.
Essentially one is able to generate and present an artificially broadened FOV by using a
series of previously viewed image frames, aligned and stitched together from a video source.
This has some interesting properties, particularly for cases in which the camera is traversing
a landscape. The ribbon of images formed from image mosaicing directly displays the path of
the camera, without requiring the observer to mentally integrate successive image frames.
Furthermore, whenever the mosaicing algorithm uses specific features of the input image as a
reference, and whenever any camera rotations relative to that reference occur, the result is an
easily perceivable explicit rotation of the outer mosaic image frame, which is able to
communicate the extent of the relative rotation directly to the observer. It was surmised that
this shape property, in addition to the extended FOV size, would enhance operator
performance in spatial awareness tasks. Furthermore, the present study sought to investigate
the effect of viewing perspective on spatial awareness, in light of its practical implications
for conducting search tasks from an elevated viewpoint, as well as the many issues cited in
the research on perspective views in 2D and 3D displays.
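The mosaicing software used in this work is not reproduced at code level in the thesis. As a minimal illustration of the underlying idea only, the sketch below aligns successive video frames and pastes them onto a growing canvas, with the most recent frame overwriting older ones. It assumes pure camera translation between frames and uses phase correlation for alignment; the actual system also handles rotation relative to a reference frame, which is what produces the rotated outer mosaic frame described above.

```python
import numpy as np

def estimate_shift(ref, new):
    """Estimate the integer (dy, dx) translation of `new` relative to
    `ref` using phase correlation on the normalised cross-power spectrum."""
    F_ref, F_new = np.fft.fft2(ref), np.fft.fft2(new)
    cross = F_new * np.conj(F_ref)
    cross /= np.abs(cross) + 1e-12          # keep phase, discard magnitude
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Shifts past the half-frame point wrap around to negative values
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)

def mosaic(frames):
    """Stitch equal-sized grayscale frames into a strip mosaic, assuming
    pure camera translation between successive frames."""
    h, w = frames[0].shape
    offsets = [(0, 0)]                      # absolute placement of each frame
    for prev, new in zip(frames, frames[1:]):
        dy, dx = estimate_shift(prev, new)
        oy, ox = offsets[-1]
        offsets.append((oy - dy, ox - dx))  # camera motion is minus content shift
    ys = [oy for oy, _ in offsets]
    xs = [ox for _, ox in offsets]
    y0, x0 = min(ys), min(xs)
    canvas = np.zeros((max(ys) - y0 + h, max(xs) - x0 + w), dtype=frames[0].dtype)
    for frame, (oy, ox) in zip(frames, offsets):
        # Later frames overwrite earlier ones, as in a live mosaic display
        canvas[oy - y0:oy - y0 + h, ox - x0:ox - x0 + w] = frame
    return canvas
```

For example, three overlapping crops of a wider scene are reassembled into a single strip whose outline traces the camera's (here purely horizontal) path, without any mental integration by the observer.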
6.1 Summary of experimental results
6.1.1 Experiment 1
Experiment 1 was an exploratory investigation into the potential benefits of real-time image
mosaicing. Inspired by the work of Morse et al. (2008), who found that an extended FOV
through mosaicing was helpful in a target detection task, the first experiment comprised three
major modifications to Morse et al.’s experiment. First, an additional type of display that was
twice the size (dFOV) of the single FOV (sFOV) was introduced, in order to determine
whether any potential improvements in performance with a mosaic display (mFOV) were due
to the size of the mosaic or the shape property of the mosaic. Second, the target detection
task was modified to provide a means to analyse the task under a signal detection theory
paradigm. This was accomplished by creating discrete events that either did or did not
contain targets. Third, a route identification task was introduced, by having the camera fly
over Routes that contained straight and curved sections of different lengths and curvatures. The
9 participants were asked to watch a video of the flyover for approximately 90 sec while
searching for targets. After the video ended, they were asked to identify the Route overflown
from within a 10x10 grid that contained the correct Route. It was hypothesised that
performance using the mosaiced FOV (mFOV) would be greater for both the target detection
and route identification tasks compared to sFOV and dFOV.
The results for the target detection and global tasks for Experiment 1 were not consistent
with the hypotheses. In the target detection task, there were no False Alarms recorded, thus
preventing a signal detection theory analysis. Using a simpler percent accuracy metric,
performance was found to be highest for the smallest size FOV, despite the fact that targets
appeared in the FOV twice as long in the mFOV and dFOV. The lower levels of performance
in the two extended FOV conditions were approximately equal. The influence of distractors
in the environment may have been the cause for this, as the number of target and non-target
objects to scan increased in the extended FOV conditions.
Whereas Morse et al. (2008) used targets that could be identified along a single salient
dimension of colour, the targets in the present study were more closely matched to the
environment with respect to colour, texture and size. In retrospect, it was surmised that this
difference required participants to serially search the enlarged FOVs, which contained more
distractors. Coupled with inadequate scanning of the display as the objects passed across the
extended FOV, this likely led to better detection performance in the relatively small sFOV.
Although the target detection results were surprising, they were not unprecedented.
For example, Crebolder et al. (2003) concluded that an “intermediate” FOV offered best
performance in a survey of DRDC research in aerial search tasks, stating that the results are
task dependent. Indeed, this appears to be the case here (and in Experiment 3), where the
characteristics of the targets appeared to play a role as well.
In the route identification tasks, the participants’ route selections were compared to the
correct Route by computing a metric of “distance” between the two Routes using a novel grid
response technique. Two metrics, the “Euclidean” and “City block”, were used. Neither set
of results showed any significant differences between the three display conditions. In other
words, the participants performed the task with similar accuracy despite having an extended
FOV in some conditions. It was surmised that the parity in the performances was due to the
simplicity of the scene about which they were asked to maintain spatial awareness. The
sequence of straight and curved sections in the Routes was too predictable, and the long
flyover time may not have demanded that continual attention be paid to the global task.
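As an illustration of the two metrics, the sketch below computes a mean per-cell "distance" between a selected route and the correct Route. The exact grid encoding used in the experiment is not reproduced here, so representing a route as an equal-length sequence of (row, column) cells is an assumption made for this example.

```python
import math

def route_distance(selected, correct, metric="euclidean"):
    """Mean cell-by-cell distance between two routes, each given as an
    equal-length sequence of (row, col) grid coordinates."""
    if len(selected) != len(correct):
        raise ValueError("routes must be sampled at the same number of cells")
    total = 0.0
    for (r1, c1), (r2, c2) in zip(selected, correct):
        if metric == "euclidean":
            total += math.hypot(r1 - r2, c1 - c2)
        else:                       # "city block" (Manhattan) distance
            total += abs(r1 - r2) + abs(c1 - c2)
    return total / len(selected)
```

A perfectly matching selection scores 0 under either metric; the city-block variant penalises diagonal offsets more heavily than the Euclidean one.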
6.1.2 Experiment 2
The lessons learned from Experiment 1 led to a focus in Experiment 2 on the issue of
enhancing global task performance. Target detection was thus put aside in favour of devising
an appropriate environment that would require continual attention to form an accurate
cognitive map. To this end, two major changes were made. First, the top-down view of
Experiment 1 was replaced with an angled viewpoint, in order to simulate the kind of
viewing perspective that might be adopted in a SAR type task. Secondly, inspired by
complex terrain features often found in naturalistic landscapes, the environment was changed
to consist of a simulated forest terrain with a winding river running through it. The river was
designed with a number of turns of various curvatures to ensure that sections would not
reasonably be memorised. The participants flew over sections (called “Routes”) of the long
river for approximately 20 seconds, after which they were asked to identify which section of
the long river they had just flown over.
The critical factor investigated in Experiment 2 was the Height at which the camera flew
over the terrain. Increasing the Height has an effect similar to extending the FOV, as a
greater portion of the terrain can be viewed at once. On the one hand, an altitude that
was too high might allow the route identification task to be completed without any need for
an extended FOV. On the other hand, an altitude that was too low might render the task too
difficult to accomplish under any conditions. Therefore, in order to avoid potential floor and
ceiling effects in the Route identification data, one of the goals of Experiment 2 was to select
an appropriate Height for the new spatial task environment and the sFOV condition. Four
levels of Height were tested; it was hypothesised that increasing Height above the terrain
would result in progressively better Route identification performance.
Data from 7 participants were collected in the form of their Route selections. Observing the
aggregate Route data from all participants relative to the correct Routes, it was clear that
objective computational measures such as RMS error could not adequately characterise the
subtleties in evaluating the closeness of the participants’ selections relative to the correct Route.
As Kitchin and Blades (2002) cautioned, careful attention must be paid to the method of
evaluating the accuracy of cognitive maps. As such, another objective method was devised,
whereby performance was evaluated as a series of paired comparisons. Volunteer judges
were asked to select which of two sets of route selections (each representing the collective
responses from the participants for a particular Route) more closely matched the correct
Route. The aggregated paired comparison data from 21 volunteer judges were processed
using Thurstone’s (1927) Paired Comparisons method (PCM) to produce an equal interval
scale that quantified route identification performance in terms of closeness to the Correct
Route.
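The core of Thurstone's Case V procedure can be sketched as follows: convert each pairwise win proportion into a unit normal deviate (a probit), then average those deviates to place each item on an equal-interval scale. This is a simplified illustration, not the exact computation used in the present study (which also involved outlier checks and significance tests); the clamping constant is an assumption, included only to keep probits finite when one item wins or loses every comparison.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Thurstone Case V scaling from a square win-count matrix, where
    wins[i][j] is the number of judges preferring item i over item j.
    Returns scale values shifted so the lowest item sits at zero."""
    n = len(wins)
    nd = NormalDist()
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            # Clamp proportions away from 0 and 1 so the probit is finite
            p = min(max(wins[i][j] / total, 0.01), 0.99)
            z[i][j] = nd.inv_cdf(p)
    scale = [sum(row) / n for row in z]      # mean deviate per item
    low = min(scale)
    return [s - low for s in scale]
```

For instance, with 21 judges and three hypothetical conditions, a win matrix such as `[[0, 18, 20], [3, 0, 15], [1, 6, 0]]` yields scale values ordering the first condition highest.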
The PCM results confirmed the hypothesis that the Height has a significant effect on Route
identification. The one way analysis and contrasts revealed that as the Height increased, the
aggregated Routes were judged to more closely match the Correct Route. The only
discrepancy occurred between Heights of 56m and 92m, where no significant difference was
found. Of equal importance within the broader research context was that a particular Height
for this particular winding river plus route identification task, H3 = 92m, could be selected
for the subsequent experiment. The third outcome of Experiment 2 was confirmation of the
viability of using the Paired Comparisons method and a group of informed external judges to
evaluate subtle yet complex differences between selected routes, in the absence of simpler
objective metrics.
6.1.3 Experiment 3
The final experiment in the present study revisited the original research question regarding
the effect of image mosaicing on target detection and route identification. As in Experiment
1, performance in both target detection and route identification tasks was evaluated, once
again across three display conditions: {sFOV, mFOV, dFOV}. Concurrently, the effect of
viewing perspective was tested, for two camera elevation angles: {EA=45°, EA=90°}, in
order to determine whether the Experiment 2 supposition regarding the expected difficulty in
forming accurate cognitive maps from an angled viewpoint was in fact valid. In total, 6
combinations of conditions were evaluated. The 13 participants were again asked to perform
a target detection task during a 25 second flyover video and then identify a Route within the
long winding river, using the same response method as in Experiment 2. They were also
presented a series of paired comparisons to indicate which conditions they felt allowed them
to more accurately identify the Correct Route.
The target detection data contained, as in Experiment 1, no False Alarms, and a simple metric
of percent detection accuracy was thus used. No difference was found in performance across
the three display conditions, and thus the hypotheses were not supported. Concerning the
factor of Elevation angle, the work of Stager (1974) suggests that an angled viewpoint would
allow objects to remain in the visual field longer due to the decreased angular velocity when
gazing away from the perpendicular underneath the aircraft. Thus it was hypothesised that
target detection at EA = 45° would be better than at EA = 90°. However, it was found that
there was no difference at the two levels of EA. It is possible that scaling the terrain and its
features to more closely approximate a realistic altitude (of approximately 1500 ft) would
have brought the results in line with those of Stager (1974).
Concerning the effect of Display size, it was hypothesised that target detection would be
better for an enlarged FOV (mFOV and dFOV) compared to sFOV. However, no significant
differences were found between the Display sizes. An important difference may have related
to the structure of the task, in which participants were asked to respond to targets within a
designated area of the screen. Despite the assurance gained from knowing where on the
screen a target was detected (as opposed to the more disruptive response method used in
Experiment 1), this may have inadvertently led to no differences in performance between
Display sizes.
For the route identification task, PCM data from 24 external judges were used to derive an
equal interval scale for the six conditions along the continuum of closeness to matching the
correct Route. Performance was found to be significantly different among the six conditions.
Plotting the scale values on 2D graphs showed that route identification performance for
mFOV was judged to be superior, followed by dFOV, and then sFOV for both levels of
Elevation angle. This was confirmed by contrasts, which showed significant pairwise
differences between conditions in all but one contrast. The unique shape properties of the
image mosaic, created by the trail of images aligned and stitched together, are believed to have
facilitated a more accurate cognitive map of the routes traversed during the flyover.
Furthermore, performance at EA = 90° was judged to be superior to that at EA = 45° for all
three Display sizes, but only reached statistical significance for sFOV and mFOV.
The results from the participants’ subjective ratings of the effectiveness of the different
display conditions with regards to route identification revealed that they found the extended
FOV conditions (dFOV and mFOV) to afford better performance, although the judgments of
which of these two conditions was more effective appeared to depend on the camera
Elevation angle. Curiously, the participants found that the angled viewpoint allowed them to
identify the Route more accurately, even though the data evaluated by the external judges
was in disagreement. The evaluation of the selected Routes showed that performance in the
top down view was generally better. This may have been due to problems of perspective
foreshortening outweighing the benefits of “preview” of upcoming spatial information.
6.1.4 Synthesis
The work in this dissertation comprises three experiments motivated by the question of
whether a software technology called real-time image mosaicing can enhance performance in
spatial awareness tasks. The conclusions drawn from this investigation suggest that the
answer is a (qualified) “yes”. For the route identification tasks developed throughout
Experiments 1 – 3, the hypothesised advantages of mosaicing, including the extended FOV
and the shape of the camera’s path being displayed directly on the screen, appeared to be
helpful in tasks for identifying complex routes, relative to a FOV of equivalent or smaller
fixed size. Furthermore, the effect of Elevation angle was consistent with the literature, in
that route identification from a top down view was better than at an angled (45°) viewpoint.
However, counterintuitive results were discovered in the target detection tasks in
Experiments 1 and 3. These results were explainable when compared to previous work in
visual search (e.g. Crebolder et al. (2003)), as well as looking at the important differences in
target characteristics between the present work and the work of Morse et al. (2008).
6.2 Limitations
Although the three experiments were carefully designed on the basis of both surmised
advantages of image mosaicing and past research in cognitive mapping in real-time spatial
tasks, a number of limitations are acknowledged.
The route identification task was completed in two steps: the participants watched the flyover
video using an ego-centric track-up view, and then identified the route they flew over from a
number of options presented from an exocentric North-up perspective. Even though this may
have been representative of actual aerial search scenarios, a transformation from one
perspective to the other nonetheless may have been a confounding factor in the experiment.
In Experiments 1 and 3, efforts were made to design the target detection task so that the data
could be analysed under a signal detection theory paradigm. This involved iterating through
several different target sizes, shapes and textures in order to develop a target, and a
background, whose discriminability was such that participants would occasionally respond to
targets when in fact there were none (i.e. False Alarms). However, no False Alarms occurred
in either Experiment. As such, although a preliminary measure of target detection
performance could be gleaned from the data, no insights could be gained about detection
sensitivity (d’) nor about the biases of the participants in responding YES or NO to targets in
the search environment.
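Had any False Alarms occurred, sensitivity would follow from the standard signal detection formula d′ = z(hit rate) − z(false-alarm rate). The sketch below also shows one common workaround for empty cells, the log-linear correction (adding 0.5 to each count); this correction was not applied in the present study and is included only as an illustration of how zero false-alarm counts could be handled.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Detection sensitivity d' = z(hit rate) - z(false-alarm rate),
    with a log-linear correction (add 0.5 to each count) so that a
    zero false-alarm count still yields a finite probit."""
    nd = NormalDist()
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return nd.inv_cdf(hit_rate) - nd.inv_cdf(fa_rate)
```

With, say, 18 hits out of 20 target-present events and 0 false alarms out of 20 target-absent events, the corrected rates give a large positive d′, while chance-level performance gives d′ = 0.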
In Experiment 1, it was identified that the route identification task was too easy for the
participants, due to the relatively simple route shapes. As such, no differences were found
between the three Display sizes, which was contrary to the hypotheses.
In Experiment 2, an outlier analysis revealed that the similarity between Routes 2 and 5
resulted in a scale for Route 5 that was inconsistent compared to the scales for the other five
Routes. This presented a limitation in the selection of the Routes, and another set of Routes
was selected for Experiment 3.
In Experiment 3, markers on the Route flyover window demarcated where participants were
asked to search for targets. This was done to ensure that the participants were searching in
approximately equal areas of the display. However, admittedly in hindsight, this may have
contributed to there being no differences in performance across the display conditions.
Although the method of Paired Comparisons appeared to be appropriate for evaluating the
complex Routes in the present study, one limitation of the technique concerns the granularity
of the information contained in each individual judgement. Owing to the practical
considerations of presenting a reasonable number of paired comparisons to the judges, the
route selections were aggregated over all participants for each of the conditions. (One
approach, for example, would have been to present only two selected routes along with one
Correct route for each paired comparison; across the three experiments in the present study,
this would have resulted in hundreds of judgments, hence the aggregation.) Nevertheless,
however impractical, providing comparisons between two individual route selections would
have provided more granular data.
Another limitation of the Thurstonian method of Paired Comparisons was that the aggregate
data do not lend themselves to conventional methods for testing statistical significance.
Although a one-way analysis and contrasts based on the Scheffé method using χ² test
statistics are available, the preferences indicated by judges violate the assumptions of the
more conventional ANOVA.
Furthermore, the reader is reminded that in order to obtain data from a large number of
judges, they were recruited on a volunteer basis and were asked to complete the paired
comparisons on the Web. Although considerable effort was put into providing instructions
that explained the task the participants performed as well as the types of route selection
errors that could occur, it was impossible to know how well the judges understood the nature
of the task they were asked to perform.
6.3 Contributions
The primary contribution of this research was in showing that global spatial awareness can be
enhanced for some tasks by using real-time image mosaicing. The experimental results
revealed that the mosaicing software, by using recently captured image frames, both
extended the size of the useful FOV and directly displayed information of the shape of the
camera’s path. These features are believed to have provided a means for participants to more
easily form a cognitive map of the environment in comparison to fields of fixed shape and/or
smaller size.
A careful examination of the most common methods of evaluating cognitive maps led to the
conclusion that conventional algorithmic methods would be inadequate for assessing the
subtle complexities of Route selections carried out on the basis of the recently formed
cognitive maps. The Paired Comparisons method (PCM) proved to be an effective, although
labour intensive method of objectively evaluating Route selections relative to a known
Correct Route, especially when those Routes are complex. The reliance on a large number of
judges reduces the effects of bias, as long as they are well informed about the desired quality
that is to be evaluated.
Given the variety of domains in which the conflicting demands of maintaining both local and
global spatial awareness exist, it is envisioned that the concept introduced in this research of
providing an extended FOV through image mosaicing may be readily transferrable beyond
the class of visual search (and rescue) scenarios used here as examples. Furthermore, because
such solutions can be implemented solely in software, without the need for additional camera
hardware, there is a potential for the output of any existing video system to be retrofitted with
this software.
Finally, in recognition of a number of characteristics that researchers find desirable when
evaluating performance in cognitive mapping, novel response methods were developed for
evaluating performance in identifying flyover routes. In Experiment 1, participants selected
routes within a two-dimensional grid of alternatives, varying along two dimensions
(curvature and length ratio). The response method provides a number of benefits including
granularity, the ability to recognise the route rather than recall it, and the recording of the
entire route. Furthermore, these benefits were carried over into the response method for
Experiments 2 and 3 when more complex routes were adopted.
6.4 Suggestions for future work
The three experiments presented here represent a starting point for investigating the potential
benefits of real-time image mosaicing for enhancing spatial task performance. During the
development of the experiments, a number of issues surfaced. For example, although the
factor of Height was identified as critical to the research, there were others that fell beyond
the scope of this project but nevertheless have theoretical and practical implications for the
use of real-time mosaicing.
One of the more intriguing issues is that of objects moving in the environment. Consider for
example, an object such as a car travelling alongside the river as the camera flies over the
terrain. Not surprisingly, using a camera system with a single size FOV, the car would be
seen to be moving within the scene until it falls outside the boundaries of the display.
Similarly, in the mosaicing condition, the car would move within the boundaries of the most
recent image frame. However, when the car exits that last image frame, that final frame
becomes stitched into the mosaic. In other words, the object that was just seen moving now
appears frozen in the image mosaic (Szeliski, 1996).
One can imagine a host of interesting real-time scenarios where this might be either useful or
a potential hindrance. For example, if a UAV were tracking a moving object using a camera
system with real-time image mosaicing, the human operator observing the camera feed may
misjudge the location of the object, as it may have changed speed or direction after appearing
in the mosaic. On the other hand, a frozen image mosaic may be useful for an operation in
which the instantaneous positional relationships among multiple objects must be tracked. In
such situations, creating a mosaic from multiple images may reveal configurations that
would not be possible with a conventional, relatively limited FOV.
Finally, as mentioned earlier, the experiments in the present study cannot be claimed to be
directly representative of actual live aerial search tasks. However, the experimental results
make a compelling case for continuing the effort to move this technology forward, into the
domains discussed earlier, such as telerobotics and surgery, histopathology and remote
camera surveillance.
References
Andre, A., Wickens, C., & Moorman, L. (1991). Display formatting techniques for
improving situation awareness in the aircraft cockpit. The International Journal of
Aviation Psychology, 1(3), 205–218.
Austin, R. (2010). Unmanned aircraft systems: UAVS design, development and deployment.
West Sussex, UK: John Wiley and Sons.
Baker, K., & Youngson, G. (2007). Advanced Integrated Multi-sensor Surveillance (AIMS)
Operator Machine Interface (OMI) Definition Study (Tech. Rep.). Toronto: Defence
R&D Canada.
Beck, R. J., & Wood, D. (1976). Cognitive Transformation of Information from Urban
Geographic Fields to Mental Maps. Environment and Behavior, 8(2), 199–238.
Bourgeois, F., Guiard, Y., & Lafon, M. B. (2001). Pan-zoom coordination in multi-scale
pointing. In CHI ’01 extended abstracts on Human factors in computing systems - CHI
’01 (p. 157). New York, New York, USA: ACM Press.
Bowman, D. (2002). Principles for the design of performance-oriented interaction
techniques. In K. Stanney (Ed.), Handbook of Virtual Environments (pp. 277–300).
Mahwah, NJ: Lawrence Erlbaum.
Boyer, B., Campbell, M., May, P., Merwin, D., & Wickens, C. D. (1995). Three-
Dimensional Displays for Terrain and Weather Awareness in the National Airspace
System. Proceedings of the Human Factors and Ergonomics Society Annual Meeting,
39(1), 6–10.
Brickner, M. S., & Foyle, D. C. (1990). Field of View Effects on a Simulated Flight Task
with Head-Down and Head-Up Sensor Imagery Displays. Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, 34(19), 1567–1571.
Brown, L. (1992). A survey of image registration techniques. ACM Computing Surveys
(CSUR), 24(4), 325–376.
Burros, R. (1951). The application of the method of paired comparisons to the study of
reaction potential. Psychological review(2), 60–66.
Cadwallader, M. (1979). Problems in Cognitive Distance: Implications for Cognitive
Mapping. Environment and Behavior, 11(4), 559–576.
Canter, D. (1977). The Psychology of Place. London: Architectural Press.
Carver, E. (1990). Search of imagery from airborne sensors-implications for selection of
sensor and method of changing field of view. In D. Brogan (Ed.), Visual Search.
London, United Kingdom: Taylor & Francis.
Crebolder, J., Unruh, T., & McFadden, S. (2003). Search performance using imaging
displays with restricted field of view (Tech. Rep.). Toronto, Canada: Defence R&D
Canada.
Croft, J., Pittman, D., & Scialfa, C. (2007). Gaze behavior of spotters during an air-to-ground
search. Human Factors, 49(4), 671–678.
David, H. (1988). The method of paired comparisons (2nd ed.). London: Griffin.
Draper, M., & Ruff, H. (2000). Multi-sensory displays and visualization techniques
supporting the control of unmanned air vehicles. In IEEE International Conference on
Robotics and Automation. San Francisco, California.
Drury, J. L., Riek, L., & Rackliffe, N. (2006). A decomposition of UAV-related situation
awareness. In Proceeding of the 1st ACM SIGCHI/SIGART conference on human-robot
interaction - HRI ’06 (pp. 88–94). New York, New York, USA: ACM Press.
Dunn-Rankin, P., Knezek, G. A., Wallace, S. R., & Zhang, S. (2004). Scaling methods.
Mahwah, NJ: Lawrence Erlbaum.
Durbin, J. (1951). Incomplete blocks in ranking experiments. British Journal of Statistical
Psychology, 4(2), 85–90.
Edwards, A. (1957). Techniques of attitude scale construction. New York, NY: Appleton-
Century-Crofts, Inc.
Ellis, S., Mcgreevy, M. W., & Hitchcock, R. J. (1987). Perspective traffic display format and
airline pilot traffic avoidance. Human Factors: The Journal of the Human Factors and
Ergonomics Society, 29(4), 371–382.
Evans, G. W., Fellows, J., Zorn, M., & Doty, K. (1980). Cognitive mapping and architecture.
Journal of Applied Psychology, 65(4), 474–478.
Ferguson, E. L., & Hegarty, M. (1994). Properties of cognitive maps constructed from texts.
Memory & cognition, 22(4), 455–73.
Fujita, N., Klatzky, R. L., Loomis, J. M., & Golledge, R. G. (2010). The Encoding-Error
Model of Pathway Completion without Vision. Geographical Analysis, 25(4), 295–314.
Gärling, T., Böök, A., & Lindberg, E. (1985). Adults’ memory representations of the spatial
properties of their everyday physical environment. In R. Cohen (Ed.), The development
of spatial cognition (pp. 141–184). Hillsdale, NJ: Lawrence Erlbaum.
Gärling, T., Böök, A., Lindberg, E., & Nilsson, T. (1981). Memory for the spatial layout of
the everyday physical environment: Factors affecting rate of acquisition. Journal of
Environmental Psychology, 1(4), 263–277.
Golledge, R. (1993). Geographical perspectives on spatial cognition. In T. Garling & R. G.
Golledge (Eds.), Behavior and environment psychological and geographical
approaches (pp. 16–46). Amsterdam: Elsevier.
Golledge, R. (1999). Human wayfinding and cognitive maps. In R. G. Golledge (Ed.),
Wayfinding behavior: Cognitive mapping and other spatial processes (pp. 5–45).
Baltimore, MD: The Johns Hopkins University Press.
Golledge, R. G. (1978). Representing, Interpreting and Using Cognized Environments.
Papers in Regional Science, 41(1), 169–204.
Gridgeman, N. (1963). Significance and adjustment in paired comparisons. Biometrics, 19(2),
213–228.
Halpern, D. (2000). Sex differences in cognitive abilities (3rd ed.). Mahwah, NJ: Lawrence
Erlbaum.
Haskell, I., & Wickens, C. (1993). Two-and three-dimensional displays for aviation: A
theoretical and empirical comparison. The International Journal of Aviation
Psychology, 3(2), 87–109.
Hodgson, M. (1998). What size window for image classification? A cognitive perspective.
Photogrammetric Engineering & Remote Sensing, 64(8), 797–807.
Hopcroft, R., Burchat, E., & Vince, J. (2006). Unmanned aerial vehicles for maritime patrol:
human factors issues (DSTO-GD-0463) (Tech. Rep.). Victoria, Australia: DSTO
Defence Science and Technology Organisation.
Irani, M., Anandan, P., Bergen, J., Kumar, R., & Hsu, S. (1996). Efficient representations of
video sequences and their applications. Signal Processing: Image Communication, 8(4),
327–351.
Irani, M., & Peleg, S. (1991). Improving resolution by image registration. CVGIP: Graphical
Models and Image Processing, 53(3), 231–239.
Jackson, J., & Fleckenstein, M. (1957). An evaluation of some statistical techniques used in
the analysis of paired comparison data. Biometrics, 13(1), 51–64.
Jeon, S., & Kim, G. J. (2008). Providing a Wide Field of View for Effective Interaction in
Desktop Tangible Augmented Reality. In 2008 IEEE Virtual Reality conference (pp. 3–
10). IEEE.
Kearns, M. J., Warren, W. H., Duchon, A. P., & Tarr, M. J. (2002). Path integration from
optic flow and body senses in a homing task. Perception, 31(3), 349–374.
Kendall, M., & Smith, B. (1940). On the method of paired comparisons. Biometrika, 31(3),
324–345.
Kitchin, R. (1996). Methodological convergence in cognitive mapping research:
Investigating configurational knowledge. Journal of Environmental Psychology, 16,
163–185.
Kitchin, R., & Blades, M. (2002). The cognition of geographic space. New York, NY: I.B.
Tauris & Co Ltd.
Kleiner, M., Brainard, D., & Pelli, D. (2011). Psychtoolbox Wiki. Retrieved June 14, 2013,
from http://psychtoolbox.org/
Kuipers, B. (1978). Modeling Spatial Knowledge. Cognitive Science, 2, 129–153.
Lavigne, V., & Ricard, B. (2005). Step-Stare Image Gathering for High-Resolution
Targeting - RTO-MP-SET-092 (Tech. Rep.). Neuilly-sur-Seine, France: RTO. Retrieved
from
http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA4720
01
Liben, L. S. (1982). Children’s large-scale spatial cognition: Is the measure the message?
New Directions for Child and Adolescent Development, 1982(15), 51–64.
Linn, M. C., & Petersen, A. C. (1985). Emergence and characterization of sex differences in
spatial ability: a meta-analysis. Child development, 56(6), 1479–98.
Lloyd, R., & Heivly, C. (1987). Systematic Distortions in Urban Cognitive Maps. Annals of
the Association of American Geographers, 77(2), 191–207.
Lo, H. M. H. (2008). ImProViSur : An Image Processing System for Improving Visualization
for Laparoscopic Surgery (Unpublished doctoral dissertation). University of Toronto.
Loomis, J. M., Klatzky, R. L., Golledge, R. G., & Philbeck, J. W. (1999). Human navigation
by path integration. In R. G. Golledge (Ed.), Wayfinding behavior: Cognitive mapping
and other spatial processes (pp. 125–151). Baltimore, MD: Johns Hopkins University
Press.
Lowrey, R. (1970). Distance concepts of urban residents. Environment and Behavior, 2, 52–
73.
Lynch, K. (1960). The Image of the City. Cambridge, MA: The MIT Press.
Mann, S. (2002). Intelligent image processing. Wiley-IEEE Press.
Michael, N., Scaramuzza, D., & Kumar, V. (2012). Special issue on micro-UAV perception
and control. Autonomous Robots, 33, 1–3.
Montello, D. (1991). The measurements of cognitive distance: methods and construct
validity. Journal of Environmental Psychology, 11, 101–122.
Montello, D., Lovelace, K. L., Golledge, R. G., & Self, C. M. (1999). Sex-Related
Differences and Similarities in Geographic and Environmental Spatial Abilities. Annals
of the Association of American Geographers, 89(3), 515–534.
Morse, B. S., Gerhardt, D., Engh, C., Goodrich, M. A., Rasmussen, N., Thornton, D., &
Eggett, D. (2008). Application and evaluation of spatiotemporal enhancement of live
aerial video using temporally local mosaics. In IEEE conference on computer vision and
pattern recognition (pp. 1–8). IEEE.
Mosteller, F. (1951). Remarks on the method of paired comparisons: I. The least squares
solution assuming equal standard deviations and equal correlations. Psychometrika,
16(1), 3–9.
O’Brien, J., & Wickens, C. (1997). Free flight cockpit displays of traffic and weather: Effects
of dimensionality and data base integration. In Proceedings of the Human Factors and
Ergonomics Society Annual Meeting (pp. 18–22).
Pietriga, E., Appert, C., & Beaudouin-Lafon, M. (2007). Pointing and beyond: an
operationalization and preliminary evaluation of multi-scale searching. In Proceedings
of the SIGCHI conference on Human factors in computing systems (pp. 1215–1224).
Pollio, J. (1968). Stereo-Photographic Mapping From Submersibles. In C. N. DeMund (Ed.),
Underwater photo-optical instrumentation applications ii.
Ross, R. (1934). Optimum orders for the presentation of pairs in the method of paired
comparisons. Journal of Educational Psychology, 375–382.
Russell, J. A., & Ward, L. M. (1982). Environmental psychology. Annual Review of
Psychology, 32, 651–688.
Schmid, C., Mohr, R., & Bauckhage, C. (2000). Evaluation of Interest Point Detectors.
International Journal of Computer Vision, 37(2), 151–172.
Shum, H., & Szeliski, R. (2000). Systems and experiment paper: Construction of panoramic
image mosaics with global and local alignment. International Journal of Computer
Vision, 36(2), 101–130.
Siegel, A., & White, S. (1975). The development of spatial representations of large-scale
environments. In H. W. Reese (Ed.), Advances in child development and behavior (Vol.
10). New York, NY: Academic Press.
Stager, P. (1974). Visual search capability in Search And Rescue (SAR) – DCIEM report no.
74-R-1009 (Tech. Rep.). Toronto: DCIEM: Defence and Civil Institute of
Environmental Medicine.
Stager, P., & Angus, R. (1975). Eye-movements and related performance in SAR visual
search - DCIEM report no. 75-X11 (Tech. Rep.). Toronto: DCIEM: Defence and Civil
Institute of Environment Medicine. Retrieved from http://pubs.drdc-
rddc.gc.ca/BASIS/pcandid/www/engpub/DDW?W%3DSYSNUM=93393
Stager, P., & Angus, R. (1978). Human Factors: The Journal of the Human Factors and
Ergonomics Society.
Starks, T., & David, H. (1961). Significance tests for paired-comparison experiments.
Biometrika, 48(1), 95–108.
Szeliski, R. (1994). Image mosaicing for tele-reality applications. In Proceedings of the
Second IEEE Workshop on Applications of Computer Vision (pp. 44–53). Cambridge:
IEEE Computer Society Press.
Szeliski, R. (1996). Video mosaics for virtual environments. IEEE Computer Graphics and
Applications, 16(2), 22–30.
Szeliski, R. (2006). Image Alignment and Stitching: A Tutorial, 273–292.
Tan, D. S., Gergle, D., Scupelli, P. G., & Pausch, R. (2004). Physically large displays
improve path integration in 3D virtual navigation tasks. Proceedings of the 2004
conference on Human factors in computing systems - CHI ’04 , 6(1), 439–446.
Thorndyke, P. W., & Hayes-Roth, B. (1982). Differences in spatial knowledge acquired
from maps and navigation. Cognitive Psychology, 14(4), 560–589.
Thurstone, L. (1927). A law of comparative judgment. Psychological Review (34), 273–286.
Thurstone, L. (1932). Stimulus dispersions in the method of constant stimuli. Journal of
Experimental Psychology, 15(3), 284–297.
van Breda, L., & Veltman, H. A. (1998). Perspective information in a cockpit as a target
acquisition aid. Journal of Experimental Psychology: Applied, 4(1), 55–68.
van Erp, J. B. (2000). Controlling unmanned vehicles: The human factors solution
(ADPO10325) (Tech. Rep.). Soesterberg, Netherlands: TNO Human Factors Research
Institute.
Vos, J. (1990). Visual search: Trade off between magnification and field width. In D. Brogan
(Ed.), Visual search. London, United Kingdom: Taylor and Francis.
Wang, W. (2005). Human navigation performance using 6 degree of freedom dynamic
viewpoint tethering in virtual environments (Unpublished doctoral dissertation).
University of Toronto.
Warner, H., & Hubbard, D. (1992). Area-of-Interest Display Resolution and Stimulus
Characteristics Effects on Visual Detection Thresholds (Report No. AL-TR-1991-0134)
(Tech. Rep.). Williams Air Force Base: Armstrong Laboratory.
Wickens, C. D., & Hollands, J. G. (1999). Engineering psychology and human performance
(3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Wickens, C. D., Liang, C. C., Prevett, T., & Olmos, O. (1996). Electronic maps for terminal
area navigation: effects of frame of reference and dimensionality. The International
journal of aviation psychology, 6(3), 241–71.
Wickens, C. D., Todd, S., & Seidler, K. (1989). Three-dimensional displays: Perception,
implementation, and applications (Tech. Rep. No. ARL-89-11/CSERIAC-89-1). Savoy,
IL: Aviation Research Laboratory. Retrieved from
http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA259937
Woods, D. (1984). Visual momentum: a concept to improve the cognitive coupling of person
and computer. International Journal of Man-Machine Studies, 21, 229–244.
Woods, R. L., Satgunam, P., & Bronstad, E. P. (2010). Statistical analysis of subjective
preferences for video enhancement. In Proceedings of SPIE – Human Vision and
Electronic Imaging XV (pp. 1–10).
Appendix 1. Statistical Outputs (Descriptive measures and
ANOVA results)
A1.1 Experiment 1 results
Descriptive Statistics

              Mean     Std. Deviation   N
localm_D1     .8878    .10341           9
localm_D2     .7344    .15396           9
localm_D3     .7222    .18827           9
globalm_D1    2.5589   .68490           9
globalm_D2    2.3289   .50513           9
globalm_D3    2.4511   .34916           9
A1.2 Experiment 3 results
Descriptive Statistics
Mean Std. Deviation N
a45sFOV .7802 .03525 13
a45mFOV .7821 .03990 13
a45dFOV .7930 .03280 13
a90sFOV .7766 .03577 13
a90mFOV .7875 .03566 13
a90dFOV .7875 .03431 13
Appendix 2. Parameters for the long river for Experiment 2
The long river consisted of four sine functions superimposed to form a continuous winding
path. The total path length of the river was 18750m. The amplitude, frequency and phase
shift values of the four sine waves were adjusted to achieve an overall terrain with a number
and variety of curves along its path.
The following function was selected for the long river:
R(x) = A1*sin(2pi*f1*x+p1) + A2*sin(2pi*f2*x +p2) + A3*sin(2pi*f3*x +p3) +
A4*sin(2pi*f4*x +p4)
The parameters are listed in the table below.
Sinusoidal Component (i)   Amplitude (Ai)   Frequency (fi)        Phase shift (pi)
1                          0.5              1/14 = 0.0714         0.2
2                          0.04             2.7*2.97/10 = 0.8     0
3                          0.25             2.97/10 = 0.3         0
4                          0.03             4.5*2.97/10 = 1.34    0
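As a concrete illustration, the river centreline can be generated directly from these parameters. The sketch below (Python; the sampling range is arbitrary and the x units are those of the table) evaluates R(x):

```python
import math

# Parameters from the table above: (amplitude Ai, frequency fi, phase shift pi)
COMPONENTS = [
    (0.5,  1 / 14,          0.2),  # component 1
    (0.04, 2.7 * 2.97 / 10, 0.0),  # component 2
    (0.25, 2.97 / 10,       0.0),  # component 3
    (0.03, 4.5 * 2.97 / 10, 0.0),  # component 4
]

def river(x):
    """R(x): four superimposed sine waves forming the winding path."""
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in COMPONENTS)

# Sample the lateral offset at regular intervals along x.
centreline = [river(x) for x in range(100)]
```

Because the four amplitudes sum to 0.82, the lateral offset never exceeds ±0.82 in these units; the number and sharpness of curves are governed by the frequency terms.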
Appendix 3. Aggregated Route Selections by Participants
A3.1 Experiment 2 Routes
Height 1 = 20m
Height 2 = 56m
Height 3 = 128m
Height 4 = 164m
Appendix 4. Instructions for the set of paired comparisons
A4.1 Experiment 2 Paired comparisons instructions and form
Copy of instructions for Paired comparisons method
Samples of diagram pairs in the Paired comparisons method
A4.2 Experiment 3 Paired comparisons instructions and interface
You'll be asked to respond to a series of 90 comparisons, each comparison showing two diagrams. It should take less than 20 minutes to complete. An example is shown in Figure 1.
Figure 1
Each diagram contains a single dotted red line as well as a set of solid black lines. The red line
represents the shape of part of a river that I asked participants to identify from a very long winding
river in a perceptual experiment. They first watched a video flyover over a small section of the river,
and then had to identify which section they flew over. Here’s a video showing one trial of the
experiment:
http://www.youtube.com/watch?v=dr266Kin2ws

The two diagrams represent participant responses under different experimental conditions; however, the
red lines in the two diagrams will always be identical. The black lines represent the responses that the
participants provided. So you will see a set of black lines whose shapes approximate the shape of the red
line. But, as you can see in the diagrams, they made some errors in responding to the shape of the line.
For each comparison, please indicate which collective set of black lines (TOP or BOTTOM) more closely
resembles the shape of the red line.
In each set of diagrams, you may see a few black lines that are “outliers”, so that at first glance they
may seem very different from the red shape. Here are some descriptions of possible outliers.
1 - The shapes might be “mirrored” so that the turns end up reversed (i.e. turning to the left when the
river actually turned to the right), such as shown in Figure 2. In other words, you may decide that this
is not such a large error if, for example, everything else about that particular shape is very close to
the actual (red) shape.
Figure 2
2 - Only a small portion of the route may be incorrect, which may nevertheless make the overall
shape look very different. For example, participants may have misjudged the length of only one
straight segment, which may have had the effect of causing the overall path to appear to deviate
greatly after that part. Or, similarly, they may have misjudged the sharpness of only one curve, which
may have caused an apparently large change to the overall shape. In other words, as before, you
have to decide whether deviations from the ideal red shape due to such factors should be weighted
strongly or weakly.
3 - Similar to the point above, in some cases participants may have committed a relatively small error in selecting the starting point for their chosen shape. Such a slight error could have resulted in an apparently large discrepancy between the chosen and the ideal (red) shape, even if the shapes were to match almost perfectly, simply due to the fact that they are “out of phase.” This is shown in Figure 3.
Figure 3
It is imperative, therefore, that you make your selection based on the ensemble of all the routes. In
other words, make sure you consider the collective behaviour of all of the black lines, including those
for which there are apparently large deviations that may have been a consequence of some of the
relatively minor errors discussed above. Conversely, do not make your selection based on only the
smallest number of apparent outliers.
In summary, for each of the 90 comparisons, please indicate which collective set of black lines (TOP or BOTTOM) more closely resembles the shape of the red line.
Thank you for your time!
==========
The judges were presented with pairs of ensembles of selected routes (the black curves) and
were instructed for each pair of ensembles to indicate which of the two ensembles in general
more closely matches the corresponding correct route (the red curve). All judges completed
the full set of 90 comparisons by visiting a Website that presented the 90 pairs in a
randomised sequence. The judges responded by clicking on buttons indicating their selection
of either the TOP or BOTTOM set of lines, before moving on to the next comparison.
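As an illustration only (the actual Website implementation is not documented here), a randomised sequence of such comparisons could be generated as follows; the condition labels, seed, and function name are assumptions:

```python
import itertools
import random

# Illustrative labels for the six Experiment 3 conditions.
conditions = ["45,sFOV", "45,mFOV", "45,dFOV", "90,sFOV", "90,mFOV", "90,dFOV"]
pairs = list(itertools.combinations(conditions, 2))  # 15 unordered pairs

def presentation_order(seed):
    """Return one judge's randomised sequence of pairs, also randomising
    which member of each pair appears on TOP versus BOTTOM."""
    rng = random.Random(seed)
    order = pairs.copy()
    rng.shuffle(order)
    return [(b, a) if rng.random() < 0.5 else (a, b) for a, b in order]

sequence = presentation_order(seed=42)
```

Repeating each of the 15 pairs for each of six routes would yield the 90 comparisons described above.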
Appendix 5. Calculations for the linear scales using the
Paired Comparisons Method (PCM)
A5.1 Experiment 2: Paired comparisons without Route 5
Raw matrix values for the paired comparisons method without the Route 5 comparisons.
H1 H2 H3 H4
H1 - 71 73 98
H2 34 - 43 74
H3 32 62 - 75
H4 7 31 30 -
The raw matrix values were converted into proportions of the total number of judgements,
5 × 21 = 105. Note that the diagonal entries of the matrix are filled with a proportion of 0.5,
as it is assumed that, if presented with two identical sets of Routes, the judges overall would
select either set 50% of the time.
H1 H2 H3 H4
H1 0.500 0.676 0.695 0.933
H2 0.324 0.500 0.410 0.705
H3 0.305 0.590 0.500 0.714
H4 0.067 0.295 0.286 0.500
The proportions are then converted to Z-score values using the standard normal tables. The
columns in the confusion matrix are then summed and averaged over the number of stimuli
(4 in this case) to obtain the mean Z-scale value for performance at each of the four Heights.
Because this is an equal-interval scale, a shift of all the values does not affect the distances
between the scale values. Thus, as a last step, the scale values are shifted so that the lowest
scale value acts as an anchor at a value of 0 for the scale.
H1 H2 H3 H4
H1 0 0.457 0.511 1.501
H2 -0.457 0 -0.229 0.538
H3 -0.511 0.229 0 0.566
H4 -1.501 -0.538 -0.566 0
Sums -2.469 0.148 -0.284 2.605
Means -0.617 0.037 -0.071 0.651
Means + 0.617 0 0.654 0.546 1.269
The final equal interval scale is presented below.
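The steps above can be reproduced in a few lines. The sketch below (Python, standard library only) is an illustrative reimplementation of the Case V scaling, not the software used in the thesis:

```python
from statistics import NormalDist

def case_v_scale(wins, n_judgements):
    """Thurstone Case V scaling from a square matrix of preference counts.

    wins[i][j] = number of times stimulus i was preferred over stimulus j.
    Returns scale values anchored so that the lowest value is 0.
    """
    nd = NormalDist()
    k = len(wins)
    # Proportion matrix, with 0.5 on the diagonal by assumption.
    p = [[0.5 if i == j else wins[i][j] / n_judgements for j in range(k)]
         for i in range(k)]
    # Z-score matrix, column means, then shift so the minimum becomes 0.
    z = [[nd.inv_cdf(p[i][j]) for j in range(k)] for i in range(k)]
    means = [sum(z[i][j] for i in range(k)) / k for j in range(k)]
    lowest = min(means)
    return [m - lowest for m in means]

# Raw counts for H1..H4 from the matrix above (Route 5 excluded),
# 105 judgements per pair.
raw = [[0, 71, 73, 98],
       [34, 0, 43, 74],
       [32, 62, 0, 75],
       [7, 31, 30, 0]]
scale = case_v_scale(raw, 105)  # ≈ [0, 0.654, 0.546, 1.269]
```

Using the exact counts rather than the rounded proportions reproduces the scale values tabulated above to within rounding.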
A5.2 Experiment 3: Judge Paired comparisons
45,sFOV 45,mFOV 45,dFOV 90,sFOV 90, mFOV 90, dFOV
45, sFOV - 97 100 91 120 108
45, mFOV 47 - 68 64 80 57
45, dFOV 44 76 - 56 95 88
90, sFOV 53 80 88 - 108 88
90, mFOV 24 64 49 36 - 45
90, dFOV 36 87 56 56 99 -
[Figure: the final equal-interval scale for Experiment 2, panels (a)–(c), plotting scale value
(0–2) against Height (H1–H4).]
Proportions of the total number of judgements, 6*24 = 144 in this case.
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV 0.500 0.674 0.694 0.632 0.833 0.750
45, mFOV 0.326 0.500 0.472 0.444 0.556 0.396
45, dFOV 0.306 0.528 0.500 0.389 0.660 0.611
90, sFOV 0.368 0.556 0.611 0.500 0.750 0.611
90, mFOV 0.167 0.444 0.340 0.250 0.500 0.313
90, dFOV 0.250 0.604 0.389 0.389 0.688 0.500
Z-score values using the standard normal tables, with calculations of adjusted Z-score values
anchored to lowest value.
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV 0 0.450 0.508 0.337 0.967 0.674
45, mFOV -0.450 0 -0.070 -0.140 0.140 -0.264
45, dFOV -0.508 0.070 0 -0.282 0.412 0.282
90, sFOV -0.337 0.140 0.282 0 0.674 0.282
90, mFOV -0.967 -0.140 -0.412 -0.674 0 -0.489
90, dFOV -0.674 0.264 -0.282 -0.282 0.489 0
Sums -2.937 0.784 0.027 -1.042 2.682 0.486
Means -0.490 0.131 0.005 -0.174 0.447 0.081
Means + 0.490   0   0.620   0.494   0.316   0.937   0.571
Means*√2 0 0.877 0.699 0.447 1.32 0.807
The final equal interval scale and 2D plots for the Judge Paired comparisons in Experiment 3
are shown below.
A5.3 Experiment 3: Participant Paired comparisons
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV - 11 10 2 4 5
45, dFOV 2 - 8 1 4 2
45, mFOV 3 5 - 1 3 1
90, sFOV 11 12 12 - 12 11
90, dFOV 9 9 10 1 - 6
90, mFOV 8 11 12 2 7 -
Proportions of the total number of judgements, 13 in this case.
[Figure: the final equal-interval scale for the Judge Paired comparisons in Experiment 3,
with 2D plots of scale value versus display size (sFOV, dFOV, mFOV) for elevation angles
EA = 45° and EA = 90°, and versus elevation angle for each display size.]
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV 0.5 0.85 0.78 0.15 0.31 0.38
45, dFOV 0.15 0.5 0.62 0.08 0.31 0.16
45, mFOV 0.23 0.38 0.5 0.08 0.23 0.08
90, sFOV 0.85 0.92 0.92 0.5 0.92 0.85
90, dFOV 0.69 0.69 0.77 0.08 0.5 0.46
90, mFOV 0.62 0.85 0.92 0.15 0.54 0.5
Z-score values using the standard normal tables, with calculations of adjusted Z-score values
anchored to lowest value.
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV 0 1.02 0.74 -1.02 -0.50 -0.29
45, dFOV -1.02 0 0.29 -1.43 -0.50 -1.02
45, mFOV -0.74 -0.29 0 -1.43 -0.74 -1.43
90, sFOV 1.02 1.43 1.43 0 1.43 1.02
90, dFOV 0.50 0.50 0.74 -1.43 0 -0.10
90, mFOV 0.29 1.02 1.43 -1.02 0.10 0
Sums 0.06 3.68 4.62 -6.32 -0.22 -1.82
Means 0.01 0.61 0.77 -1.06 -0.04 -0.30
Means + 1.06   1.07   1.67   1.82   0   1.02   0.76
The final equal interval scale and 2D graphs for the Participant Paired comparisons in
Experiment 3 are shown below.
[Figure: the final equal-interval scale and 2D graphs for the Participant Paired comparisons
in Experiment 3: scale value versus display size (sFOV, dFOV, mFOV) for EA = 45° and
EA = 90°, and versus elevation angle for each display size.]
Appendix 6. Statistical tests for assumptions of
Thurstone’s Case V method
A6.1 Experiment 2: Paired comparisons without Route 5
Edwards (1957) details a procedure to check the assumptions of Thurstone’s Case V method
for Paired comparisons method (PCM), namely the property of additivity of the obtained
scale values. For example, consider three stimuli with the scales values ZS1 < ZS2 < ZS3 with
equal discriminal dispersions. If additivity holds, then obtaining D21 as the distance between
ZS1 and ZS2 on the scale and D23 as the distance between ZS2 and ZS3, implies that the
distance between ZS1 and ZS3 should equal D12 + D23. To test this assumption, Mosteller
(1951) developed a χ2 test of significance that is sensitive to this property of additivity as
well as the other assumptions of the Case V model (Edwards, 1957).
The test involves the discrepancies between the ‘observed’ and ‘theoretical’ proportions of
the experimental data, using an arcsine transformation given by:

θ = arcsin √p

The transformed value θ is approximately normally distributed with variance equal to:

σ²θ = 821/N (with θ expressed in degrees)

The ‘observed’ values refer to the proportions obtained from the paired comparisons, while
the ‘theoretical’ proportions are determined by taking the mean scale values obtained from
Thurstone’s PCM and computing back to proportions for each of the entries in the confusion
matrix (Edwards, 1957).
Note that special consideration must be taken for the value of N with regard to Experiment
2. There were 21 judges performing the paired comparisons for the 4 stimuli (different
Heights), meaning 4(4 − 1)/2 = 6 pairs. Furthermore, each of the 6 pairs of comparisons
was repeated for each of the six Routes, for a total of 36 paired comparisons per judge. It is
believed that the original interpretation of N as the number of judges is somewhat
misleading in this case, as each judge in essence provides multiple sets of judgements
for the PCM, albeit for pairs representing different instances of the same perceptual task. In
other words, five of the six sets of judgements would not be accounted for under the original
interpretation of the statistical test. For this reason, the value was modified to be
N = 21 × 6 = 126.
First, the observed proportions are taken from the lower half of the confusion matrix,
representing paired comparisons.
Observed proportions, pij
H1 H2 H3 H4
H1
H2 0.324
H3 0.305 0.590
H4 0.067 0.295 0.286
The arcsine transformation is applied to all the values to form Θij.
Arcsine transformed observed proportions, Θij
H1 H2 H3 H4
H1
H2 34.68
H3 33.51 50.21
H4 14.96 32.91 32.31
Next, the theoretical proportions p′ij are calculated from the scale values obtained from the
PCM, Z̄ = {0, 0.654, 0.546, 1.27}. Each entry is given by the difference of the
corresponding scale values:

Z′ij = Z̄i − Z̄j

For example, Z′12, designating the theoretical Z score for the preference of stimulus 2 over
stimulus 1, is:

Z′12 = Z̄1 − Z̄2 = 0 − 0.654 = −0.654
The matrix of theoretical scale values is filled in for the lower half of the matrix.
Theoretical scale values, Z’ij
H1 H2 H3 H4
H1
H2 -0.654
H3 -0.546 0.108
H4 -1.270 -0.616 -0.724
The theoretical Z′ij values are then converted back to proportions using the standard normal cumulative distribution function, p′ij = Φ(Z′ij):
Theoretical proportions, p’ij
H1 H2 H3 H4
H1
H2 0.26
H3 0.29 0.54
H4 0.10 0.27 0.23
The theoretical proportions are converted using the arcsine transformation:
Arcsine transformed theoretical proportions, Θ′ij
H1 H2 H3 H4
H1
H2 30.43
H3 32.74 47.47
H4 18.63 31.24 28.97
Using the Θij and Θ′ij matrices, the test statistic can be computed as:

χ² = (N/821) Σ (Θij − Θ′ij)²

Under this test, rejecting the null hypothesis indicates that the assumptions of the Case V
method are not tenable (Edwards, 1957). For the data in Experiment 2, χ² = 8.23. This was
compared to a critical value χ²C(df = 3, N = 21 × 6) = 7.82. As such, χ² > χ²C, and the
assumptions of the Case V method involved in finding the scale values for the Experiment 2
data were not tenable.
It should be noted that using the original value of N = 21 would have resulted in the test
statistic χ² = 1.37, and therefore χ² < χ²C. In this case, the assumptions of the Case V method
would have been tenable. However, follow up calculations of the discriminal dispersions
revealed that they were in fact quite different from each other (as described in Appendix 7),
which was in violation of one of the assumptions of the Case V model. Although that test is
most sensitive to the property of additivity, it has also been shown in some cases to detect
violations of equal dispersions between the stimuli (Edwards, 1957). Taking into
consideration the interpretation proposed by Edwards (1957), the modification of the value of
N was deemed to be appropriate for the data in Experiment 2.
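The arithmetic of this test can be sketched end to end. The code below (Python, standard library only) recomputes the Experiment 2 statistic from the observed proportions and the Case V scale values, with θ in degrees and the variance constant 821/N as in Edwards (1957); it is an illustrative reconstruction, not the original analysis:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def arcsine_deg(p):
    """Arcsine transformation θ = arcsin(√p), in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))

# Observed proportions p_ij (lower half of the A5.1 matrix) and the
# Case V scale values for H1..H4 from Appendix 5.
observed = {(1, 0): 0.324, (2, 0): 0.305, (2, 1): 0.590,
            (3, 0): 0.067, (3, 1): 0.295, (3, 2): 0.286}
scale = [0.0, 0.654, 0.546, 1.269]

N = 21 * 6  # modified N for Experiment 2, as discussed above
chi2 = 0.0
for (i, j), p_obs in observed.items():
    p_theo = nd.cdf(scale[j] - scale[i])  # theoretical proportion from the scale
    chi2 += (arcsine_deg(p_obs) - arcsine_deg(p_theo)) ** 2
chi2 *= N / 821  # variance of θ (in degrees) is approximately 821/N

# chi2 comes out near the reported 8.23, exceeding the 0.05 critical
# value of 7.82 on 3 degrees of freedom.
```

Small differences from the tabulated value arise only from rounding of the proportions and scale values.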
A6.2 Experiment 3: Participant Paired comparisons
Observed proportions, pij
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV
45, dFOV 0.15
45, mFOV 0.23 0.38
90, sFOV 0.85 0.92 0.92
90, dFOV 0.69 0.69 0.77 0.08
90, mFOV 0.62 0.85 0.92 0.15 0.54
Arcsine transformed observed proportions, Θij
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV
45, dFOV 1.33
45, mFOV 1.69 1.82
90, sFOV 1.80 1.92 2.19
90, dFOV 1.01 1.21 1.60 1.71
90, mFOV 0.93 1.14 1.55 1.67 0.75
Theoretical scale values, Z’ij
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV
45, dFOV -0.52
45, mFOV -0.69 -0.18
90, sFOV 0.92 1.44 1.62
90, dFOV 0.03 0.55 0.72 -0.90
90, mFOV 0.35 0.87 1.04 -0.57 0.32
Theoretical proportions, p’ij
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV
45, dFOV 0.30
45, mFOV 0.24 0.43
90, sFOV 0.82 0.93 0.95
90, dFOV 0.51 0.71 0.77 0.18
90, mFOV 0.64 0.81 0.85 0.28 0.63
Arcsine transformed theoretical proportions, Θ′ij
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV
45, dFOV 33.33
45, mFOV 29.58 40.99
90, sFOV 65.08 74.17 76.73
90, dFOV 45.64 57.28 61.00 25.47
90, mFOV 52.95 63.98 67.37 32.12 52.32
Using the Θij and Θ′ij matrices, the test statistic can be computed as:

χ² = (N/821) Σ (Θij − Θ′ij)²

Under this test, rejecting the null hypothesis indicates that the assumptions of the Case V
method are not tenable (Edwards, 1957). For the data in Experiment 3, χ² = 11.30. This was
compared to the critical value χ²C(df = 10, N = 13) = 18.31. As such, χ² < χ²C, and the
assumptions of the Case V method involved in finding the scale values for the Experiment 3
data were tenable.
A6.3 Experiment 3: Judge Paired comparisons
Observed proportions, pij
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV
45, mFOV 0.33
45, dFOV 0.31 0.53
90, sFOV 0.37 0.56 0.61
90, mFOV 0.17 0.44 0.34 0.25
90, dFOV 0.25 0.60 0.39 0.39 0.69
Arcsine transformed observed proportions, Θij
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV
45, mFOV 34.84
45, dFOV 33.56 46.59
90, sFOV 37.35 48.19 51.42
90, mFOV 24.09 41.81 35.69 30.00
90, dFOV 30.00 51.01 38.58 38.58 56.01
Theoretical scale values, Z’ij
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV
45, mFOV -0.62
45, dFOV -0.49 0.13
90, sFOV -0.32 0.30 0.18
90, mFOV -0.94 -0.32 -0.44 -0.62
90, dFOV -0.57 0.05 -0.08 -0.25 0.37
Theoretical proportions, p’ij
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV
45, mFOV 0.27
45, dFOV 0.31 0.55
90, sFOV 0.38 0.62 0.57
90, mFOV 0.17 0.38 0.33 0.27
90, dFOV 0.28 0.52 0.47 0.40 0.64
Arcsine transformed theoretical proportions, Θ′ij
45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV
45, sFOV
45, mFOV 31.15
45, dFOV 33.87 47.88
90, sFOV 37.82 51.92 49.06
90, mFOV 24.69 37.81 35.00 31.14
90, dFOV 32.21 46.13 43.25 39.20 53.30
Using the Θij and Θ′ij matrices, the test statistic can be computed as:

χ² = Σi<j (Θij − Θ′ij)² / (821/N)

Under this test, failing to reject the null hypothesis indicates that the assumptions of the Case V
method are tenable (Edwards, 1957). For the data in Experiment 3, χ² = 3.26. This was
compared to the critical value χ²C(6, N = 24 × 6) = 18.31 (10 degrees of freedom, α = 0.05). As
such, χ² < χ²C, and the assumptions of the Case V method involved in finding the scale values
for the Experiment 3 data were tenable.
Appendix 7. Estimates of the Discriminal Dispersions for the
Paired Comparison Method (PCM)
Estimates of the standard deviations of each of the psychological stimuli, referred to as
discriminal dispersions, can be calculated from the Paired comparisons Z score matrix. A
derivation of the formulas used to calculate these dispersions is offered in Edwards (1957). In
essence, the equations for the scale values are rearranged to relate the dispersion terms:

Sj − Si = Zij √(σi² + σj²)

Sk − Si = Zik √(σi² + σk²)
These equations can be manipulated such that all variables can be related through
expressions containing the subscripts i and k only. In addition, a substitution Vi is used to
represent the standard deviation of the ith row of the confusion matrix. A series of
calculations are carried out, shown in the table below to calculate the discriminal dispersions.
H1 H2 H3 H4
H1 0 0.457 0.511 1.501
H2 -0.457 0 -0.229 0.538
H3 -0.511 0.229 0 0.566
H4 -1.501 -0.538 -0.566 0
(1) ΣZij²               2.723    0.551    0.634    2.863
(2) ΣZij               -2.469    0.148   -0.284    2.605
(3) (ΣZij)²/n           1.524    0.005    0.020    1.697
(4) ΣZij² − (ΣZij)²/n   1.199    0.545    0.613    1.166
(5) V²                  0.300    0.136    0.153    0.292
(6) V                   0.548    0.369    0.392    0.540
(7) 1/V                 1.826    2.708    2.554    1.852
(8) σ                   0.634    1.423    1.285    0.657
Taking the values from Row 7, the value of the constant ‘a’ can be calculated:

a = (1/n) Σi (1/Vi) = 8.94/4 = 2.23

The discriminal dispersions for each of the column entries of the Z scale matrix can be
computed as:

σj = (2/a)(1/Vj) − 1
As shown in Row 8, the discriminal dispersions of the four stimuli were quite different,
ranging from 0.657 to 1.423. This was consistent with the results of testing the assumptions
of the Case V model for the Experiment 2 data, which indicated that the assumptions were
not tenable. The scale values using the Case III method can now be computed using the
estimates of the discriminal dispersions and the scale values from the Case V method
(Edwards, 1957). Each entry in the corrected Z matrix, denoted Zc, is computed as:

Zc,ij = Zij √(σi² + σj²)
Using the corrected Z matrix, the new scale values are computed.
Corrected Z matrix, Zc
H1 H2 H3 H4
H1 0 0.712 0.732 1.371
H2 -0.712 0 -0.439 0.844
H3 -0.732 0.439 0 0.817
H4 -1.371 -0.844 -0.817 0
(1) Sums            -2.815    0.307   -0.524    3.032
(2) Means           -0.704    0.077   -0.131    0.758
(3) Means + 0.704    0        0.781    0.573    1.462
The corrected Z scale values for the four heights, as shown in Row 3, are Zc = {0, 0.781,
0.573, 1.462}. The equal interval scale and 2D plot for the Experiment 2 data under the Case
III model are shown in the figure, top and bottom respectively.
[Figure: equal-interval scales (top) and scale value versus Height plots (bottom) for panels (a), (b) and (c), each spanning scale values 0 to 2 for stimuli H1 to H4.]
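The Appendix 7 calculations can be sketched end-to-end as follows. This is a minimal sketch assuming NumPy, starting from the Case V Z matrix tabulated above; the formulas for a and σ are the forms that reproduce Rows 7 and 8 of the table within rounding.

```python
import numpy as np

# Case V Z-score matrix for the four Heights (from Appendix 7)
Z = np.array([
    [ 0.000,  0.457,  0.511,  1.501],
    [-0.457,  0.000, -0.229,  0.538],
    [-0.511,  0.229,  0.000,  0.566],
    [-1.501, -0.538, -0.566,  0.000],
])
n = Z.shape[0]

# Variance terms V^2 (Row 5) and V (Row 6), computed column-wise
V2 = (np.sum(Z**2, axis=0) - np.sum(Z, axis=0)**2 / n) / n
V = np.sqrt(V2)

# The constant 'a' from Row 7, then the discriminal dispersions (Row 8)
a = np.mean(1.0 / V)
sigma = (2.0 / a) * (1.0 / V) - 1.0

# Case III correction: scale each entry by sqrt(sigma_i^2 + sigma_j^2)
Zc = Z * np.sqrt(sigma[:, None]**2 + sigma[None, :]**2)

# Corrected scale values: column means, shifted so the lowest is zero
scale = Zc.mean(axis=0)
scale = scale - scale.min()
```

Within rounding, `sigma` reproduces the tabulated dispersions {0.634, 1.423, 1.285, 0.657} and `scale` the corrected scale values {0, 0.781, 0.573, 1.462}.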
Appendix 8 – Contrasts for Paired comparisons
A8.1 Experiment 2: Height Paired comparison contrasts
The contrast method creates a Q² test statistic, matched against a critical χ² value, based on
the aggregated data of the raw PCM matrix. For detailed proofs of the method, please refer to
Starks and David (1961).
The preference total for each treatment is computed as ai.
H1 H2 H3 H4
H1 - 71 73 98
H2 34 - 43 74
H3 32 62 - 75
H4 7 31 30 -
ai 242 151 169 68
For each desired treatment contrast, Q² is computed from the difference between the two values
of ai for the respective treatments:

Q² = (ai − aj)² / c

where the scaling constant c depends on the number of judgements per pair and the number of
treatments (Starks and David, 1961). The value of Q² is compared against the D-statistic
developed in the one-way statistical analysis for differences among the treatments, at α = 0.01
and α = 0.05. Contrasts whose value of Q² exceeds the critical value of 22.89 are considered
statistically significant.
      H1        H2         H3        H4
H1    -         65.72**    42.29**   240.29**
H2              -                    54.67**
H3              2.57       -         80.96**
H4                                   -
* indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.05, with the column element being dominant over the row element.
** indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.01, with the column element being dominant over the row element.
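As a sketch, the contrast computation for the Height data can be reproduced as follows, assuming NumPy. Starks and David (1961) derive the scaling constant from the number of judgements per pair and the number of treatments; here it is simply set to the divisor of 126 implied by the tabulated Q² values, so treat that value as an assumption rather than the published formula.

```python
import numpy as np

# Raw PCM preference counts for the four Heights (Appendix 8, A8.1)
counts = np.array([
    [0, 71, 73, 98],
    [34, 0, 43, 74],
    [32, 62, 0, 75],
    [7, 31, 30, 0],
])

# Preference totals a_i for each treatment (242, 151, 169, 68)
a = counts.sum(axis=1)

# Scaling constant implied by the tabulated Q^2 values (an assumption here;
# Starks and David, 1961, derive it from the judgements per pair and treatments)
C = 126.0

def q_squared(i, j):
    """Q^2 contrast between treatments i and j."""
    return (a[i] - a[j]) ** 2 / C
```

For example, `q_squared(0, 1)` reproduces the tabulated H1–H2 contrast of 65.72, and `q_squared(2, 3)` the H3–H4 contrast of 80.96.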
A8.2 Experiment 3: Judge Paired comparison contrasts
45,sFOV 45,mFOV 45,dFOV 90,sFOV 90, mFOV 90, dFOV
45, sFOV - 97 100 91 120 108
45, mFOV 47 - 68 64 80 57
45, dFOV 44 76 - 56 95 88
90, sFOV 53 80 88 - 108 88
90, mFOV 24 64 49 36 - 45
90, dFOV 36 87 56 56 99 -
ai 204 404 361 303 502 386
            45°, sFOV  45°, mFOV  45°, dFOV  90°, sFOV  90°, mFOV  90°, dFOV
45°, sFOV   -          185.19**   114.12**   45.38**    411.13**   154.35**
45°, mFOV              -                                44.46**
45°, dFOV              8.56       -                     92.04**    2.89
90°, sFOV              47.23**    15.57      -          183.34**   31.89**
90°, mFOV                                               -
90°, dFOV              1.5                              62.30**    -
* indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.05, with the column element being dominant over the row element.
** indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.01, with the column element being dominant over the row element.
A8.3 Experiment 3: Participant Paired comparison contrasts
45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV
45, sFOV - 11 10 2 4 5
45, dFOV 2 - 8 1 4 2
45, mFOV 3 5 - 1 3 1
90, sFOV 11 12 12 - 12 11
90, dFOV 9 9 10 1 - 6
90, mFOV 8 11 12 2 7 -
ai 32 17 13 58 35 40
            45°, sFOV  45°, dFOV  45°, mFOV  90°, sFOV  90°, dFOV  90°, mFOV
45°, sFOV   -          11.54      18.51
45°, dFOV              -          0.82
45°, mFOV                         -
90°, sFOV   34.67**    86.21**    103.85**   -          27.13*     16.62
90°, dFOV   0.46       16.62      24.82*                -
90°, mFOV   3.28       27.13*     37.38**               1.28       -
* indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.05, with the column element being dominant over the row element.
** indicates that the contrast between the column and row elements was found to be statistically
significant at α = 0.01, with the column element being dominant over the row element.
Appendix 9 – Calculation of Number of Mosaiced Frames for
Equivalent Size to dFOV Condition
In order to determine the number of frames ‘N’ to create an image mosaic whose size was
approximately equal to that of the double size field of view (dFOV) condition, a Matlab
program was written to compute the number of screen pixels contained in an image frame.
Image frames from the single field of view (sFOV) and dFOV conditions were fed through
the algorithm. Next, the same subroutine was run for a series of videos using the mFOV
display condition with different values of N, to determine empirically the mFOV size that
was reasonably close to the size of the dFOV condition. It was determined that an image
mosaic composed of 10 frames was approximately equal in display size to the dFOV.
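The original comparison was performed in a Matlab program; a rough Python equivalent of the pixel-counting step is sketched below. The function names, the thresholding rule, and the example sizes are illustrative assumptions, not taken from the thesis software.

```python
import numpy as np

def display_pixel_count(frame, threshold=0):
    """Count the screen pixels occupied by imagery in a display frame.

    frame: 2-D greyscale array of the rendered display; pixels brighter
    than `threshold` are counted as image content rather than background.
    """
    return int(np.count_nonzero(np.asarray(frame) > threshold))

def best_frame_count(mosaic_sizes, target_size):
    """Pick the value of N whose mosaic pixel count is closest to the
    target display size (e.g. the dFOV pixel count).

    mosaic_sizes: dict mapping N (frames in the mosaic) to pixel counts.
    """
    return min(mosaic_sizes, key=lambda n: abs(mosaic_sizes[n] - target_size))
```

In use, the dFOV pixel count would be measured once with `display_pixel_count`, and `best_frame_count` applied to the counts measured for mosaics built from candidate values of N.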
Appendix 10 – Additional approaches and pilot tests
A number of alternatives for the experimental tasks, response methods, experimental factors
and analyses were investigated throughout the present study. The following is a brief account
of those considerations.
A10.1 Experiment 1
Display sizes
Three Display sizes were selected for the experiments in the present study: {sFOV, mFOV,
dFOV}. The double size FOV was selected to be the same size as the mFOV, but without the
mosaic’s distinctive property of showing the shape of the camera’s path on the screen.
Both the dFOV and mFOV were twice the size of the single FOV.
Prior to selecting these three Display sizes, a number of alternatives were considered, shown
in the table below, taking into account both the resolution of the camera sensor and the
resolution of the monitor on which the information is displayed. The table also includes
Display conditions in which the speed and height of traversal are manipulated.
The displays selected for investigation in the present study are Cases A, B and F for the
sFOV, mFOV and dFOV, respectively.
Case  Description                                   Monitor pixels  Monitor size  Sensor pixels  Distance covered by FOV
A     Control condition                             MR              MS            K              L
B     Add mosaicing                                 2MR             2MS           2K             2L
C     ½ live FOV + ½ mosaic                         MR              MS            K              L
D     Add mosaicing + 2X speed                      2MR             2MS           2K             2L
E     Add mosaicing + 2X height                     2MR             2MS           2K             4L
F     2X screen size + wide angle camera system     2MR             2MS           2K             2L
G     2X screen size + keep narrow FOV              2MR             2MS           K              L
H     Same screen size + wide angle camera system   MR              MS            2K             2L
I     Same screen size + ½X speed                   MR              MS            K              L
J     Same screen size + ½X height                  MR              MS            K              ½L

Monitor
MR = no. of monitor pixels, in units
MS = monitor size, in cm
Camera
K = no. of camera sensor pixels, in units
L = distance covered by camera, in metres
The Display condition in Case C was pilot tested but ultimately abandoned. Case C
represents a Display whose total display area equalled the sFOV, but was composed of a
fixed size display of half the length of the sFOV with the remaining half as a mosaic. It was
of academic interest since it provided a display exhibiting properties of a mosaic, but equal in
size to the single FOV. However, in pilot testing Case C, it was observed that, for the most
part, the displayed information looked identical to that of the sFOV. In the absence of a
surmised benefit of this admittedly contrived display configuration, Case C was abandoned.
Targets
Although stationary targets were ultimately used in Experiments 1 and 3, considerable effort
was put into investigating the effect of moving targets on spatial performance. Because the
mosaic operates on images of the recent past, objects moving in the “live FOV” appear
frozen in the mosaic. However, because the objects continue to move in the terrain, the
mosaic shows “stale” information, in that the display no longer presents accurate spatial
information of the objects in the terrain. It was surmised that observers tasked with detecting
and localising moving targets might benefit from the extra time to locate targets, but their
ability to localise them in the real world would suffer because the mosaic presents “stale”
imagery.
Another interesting consequence is that objects leaving the “live FOV” and entering the
mosaiced portion of the display appear distorted in the mosaic, shown below, creating a
number of artefacts in the displayed information. This is cited in a number of papers,
including Morse et al. (2008), as a source of error in the mosaicing algorithm. The extent to
which the object appears distorted depends on the relative velocity (i.e. both magnitude and
direction) between the FOV and the object. It was discovered through several iterations of
pilot testing that the effects of distortion were too difficult to control for, because of the
interaction between the magnitude and direction of travel of the FOV and objects over
straight and curved trajectories. For this reason, the investigation of moving objects was put
aside as important future work for investigating the effect of mosaicing on target detection.
Target detection task
Participants responded in the target detection task in Experiment 1 by first hitting the
spacebar on the keyboard and then indicating the location of the target using the mouse. This
was done to ensure that participants were actually responding to targets they detected in the
environment, as opposed to simply guessing that a target was present. This was one of
several alternatives considered for the response method for the target detection task.
One method that was considered was to use an N-alternative forced choice model of signal
detection. In a forced choice model, each event (i.e. a portion of the flyover route) is divided
into N areas, or ‘alternatives’, only one of which contains a target. The figure below shows
an event divided into alternatives A and B with a target located in alternative A. The
participant is asked to indicate whether the target appears in A or B, after having flown over
the terrain. In other words, he is forced to select one of the alternatives. The rationale is that
the participant must indicate, with some granularity, where along the event the target was
detected. One can imagine subdividing the event into any number of alternatives so that the
participant provides information about the approximate location of the target while
performing the task.
This approach was abandoned for two reasons. First, it became clear that the attention paid to
the target detection task would vary depending on whether and when the participant had detected
the target. For example, if the participant detected the target towards the start of the zone, he
would no longer need to search for the target until the start of the next event. In this case, the
participant would be able to allocate his attention to the route identification task. However, if
the participant had not detected a target, because no target was present up until that point or
the target was missed, then attention would have to be devoted to target detection. In other
words, the attentional demands in accomplishing the detection task would vary throughout
the task, which would have varying effects on performance in the route identification task.
Second, it was unclear how to score a situation where the participant detected a target in one
zone, when in fact it appeared in a different zone. For example, if the participant selects
alternative B, when the correct answer was alternative A, it could be scored as a Miss (since
he missed the target in the first half of the event) or a False Alarm (since he detected a target
in the second half of the event when there was none). Because of the ambiguity resulting
from this approach, the N-alternative forced choice model was abandoned.
Analysis of route identification data
A number of analysis methods were attempted for the route identification data in Experiment
1. The methods described in the main text relate to “distance errors” between the selected
route and the Correct route.
One alternative approach was to allow some error tolerance in the participants’ responses.
For example, errors below some threshold distance away from the Correct route could be
treated as being a correct response. The figure below shows examples of ‘Hit zones’, for
what would be considered correct route selections for distances of less than 2, 3 and 4 units
away from the Correct route.
The graph below shows the results of using hit zones to determine the percentage of routes
that were correctly identified for each of the three Displays. Unfortunately, this approach
revealed no differences between the three Displays.
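The hit-zone scoring can be sketched as follows, assuming NumPy. The tolerance values match the 2, 3 and 4 unit zones described above; the strict inequality and the sample error values are assumptions for illustration.

```python
import numpy as np

def hit_zone_rates(distance_errors, tolerances=(2, 3, 4)):
    """Percentage of trials scored as correct under each hit-zone tolerance.

    distance_errors: per-trial distance (in grid units) between the
    selected route and the Correct route.
    """
    errors = np.asarray(distance_errors, dtype=float)
    # A trial counts as a hit when its error falls inside the hit zone
    return {t: 100.0 * float(np.mean(errors < t)) for t in tolerances}
```

For hypothetical errors of [0, 1, 2.5, 3.5, 5] units, this yields 40%, 60% and 80% correct for the 2, 3 and 4 unit zones respectively.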
A10.2 Experiment 2
With the realisation that the routes in Experiment 1 may not have required the participant to
pay continual attention to the route identification task, a set of more complex routes was
designed. A number of alternatives were considered for the response method before deciding
on the one-dimensional route identification task used in Experiments 2 and 3.
One alternative that was considered was the two-dimensional grid that was used in
Experiment 1, with more complex routes. The figure below shows an 8 x 8 grid of more
complex routes varying along two dimensions, manipulating one of the component sinusoids:
frequency along the horizontal axis, and phase along the vertical axis.
This response method was eventually abandoned, since the presentation of a large set of
complex routes ultimately made the task of recognising the route too difficult. With a large
set of alternatives, the participant was forced to serially search through each route in the grid,
and ultimately began to forget the shape of the route he was trying to identify. Pilot
participants reported being frustrated by this particular response method, and it was dropped
in favour of the one-dimensional response method.
As described in the study, the final response method in the form of the long river retains
many of the same benefits as the two-dimensional grid, including high granularity, recording
of the entire route, and the affordance of recognising rather than recalling the shape of the
route. Furthermore, the presentation of the long river has ecological validity, as participants
have experience in using maps containing winding roads and rivers.
A10.3 Experiment 3
In Experiment 3, pilot studies were used to redesign the search environment for the target
detection task. Whereas it was initially thought that the terrain features and targets from
Experiment 1 could be reused in Experiment 3, it quickly became clear that the experimental
parameters selected would cause problems concerning the visibility of the targets. In
Experiment 1, the 3D modelled trees placed in the terrain were viewed only from a top down
view. Targets were placed so as to ensure that the trees did not occlude the targets in the
environment from the top down view.
However, Experiment 3 investigated two Camera Elevation angles: the top down and angled
viewpoints. It was discovered that from the angled viewpoint, the trees occluded many of the
targets that would have been visible from the top down view, presenting an unfair advantage
to the top down view. For this reason, the trees were removed in Experiment 3.
Furthermore, pilot testing revealed that the targets used in Experiment 1 were much easier to
detect in Experiment 3, because the Height was lower in Experiment 3 (92 m, compared to
325 m in Experiment 1). For this reason, the targets had to be redesigned for the fixed height
of 92 m.