Perception of 3-D location based on vision, touch, and extended touch
Nicholas A. Giudice,1 Roberta L. Klatzky,2 Christopher R. Bennett,1 and Jack M. Loomis3

1 Spatial Informatics Program, School of Computing and Information Science, University of Maine, Orono, Maine, USA
2 Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
3 Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
Abstract. Perception of the near environment gives rise to spatial images in working memory that continue to
represent the spatial layout even after cessation of sensory input. As the observer moves, these spatial images are
continuously updated. This research is concerned with (1) whether spatial images of targets are formed when they
are sensed using extended touch (i.e., using a probe to extend the reach of the arm) and (2) the accuracy with which
such targets are perceived. In Experiment 1, participants perceived the 3-D locations of individual targets from a
fixed origin and were then tested with an updating task involving blindfolded walking followed by placement of the
hand at the remembered target location. Twenty-four target locations, representing all combinations of two
distances, two heights, and six azimuths, were perceived by vision or by blindfolded exploration with the bare hand,
a 1-m probe, or a 2-m probe. Systematic errors in azimuth were observed for all targets, reflecting errors in
representing the target locations and updating. Overall, updating after visual perception was best, but the
quantitative differences between conditions were small. Experiment 2 demonstrated that auditory information
signifying contact with the target was not a factor. Overall, the results indicate that 3-D spatial images can be formed
of targets sensed by extended touch and that perception by extended touch, even out to 1.75 m, is surprisingly
accurate.
Keywords: Spatial image, extended touch, haptic perception, spatial cognition
Correspondence should be sent to Nicholas Giudice, Spatial Informatics Program, School of Computing and
Information Science, University of Maine, Orono, ME 04469. Phone 207-581-2187, Fax 207-581-2206
E-mail: [email protected]
Introduction
Vision, hearing, touch, and language provide information for the creation of perceptual representations (percepts) of
our surroundings. Such percepts give rise to transient spatial representations in working memory that persist well after the input has ceased, and they also serve as the source of enduring representations in long-term memory.
Although most research in spatial cognition has focused on vision as the input modality for these processes, there is
growing interest in multimodal spatial cognition—the development of representations based on different sensory and
linguistic inputs and their subsequent use in spatial judgments and action.
We are interested in a form of spatial representation maintained in working memory that plays a role in the control
of action when the source stimulus is temporarily interrupted or removed. We refer to this representation as a
"spatial image" (Loomis, Klatzky, & Giudice, in press-a). For an isolated visual, auditory, or haptic target, the
spatial image of a target is spatially coincident with the percept. As such, perceptual errors in distance or direction
are inherited by the spatial image and remain once the stimulus and resulting percept are no longer present. There is
growing evidence that the spatial image is amodal in nature: once formed, it no longer retains modality-specific information with respect to subsequent tasks that rely on it (Giudice, Betty, & Loomis, 2011; Giudice, Klatzky, & Loomis, 2009; Loomis et al., 2012; Loomis et al., in press). In this article, our interest is whether there are
spatial images of the perceived locations of targets sensed by extended touch (i.e., using a probe to extend the reach
of the arm) as there are with normal touch (e.g., Giudice et al., 2011) and, if so, to assess the accuracy of extended
touch relative to normal touch and vision.
Our interest in extended touch was prompted by the well-known phenomenological result that feeling with a probe
causes the observer to perceive the location where the probe contacts the surface rather than just perceiving the
vibrations in the handle (Gibson, 1966; Katz, 1925/1989; Lotze, 1894; Polanyi, 1966; Weber, 1846/1978). This
perceptual externalization of sensory information is often referred to as distal attribution (Epstein et al., 1986;
Loomis, 1992; Siegle & Warren, 2010). Going beyond phenomenology, many elegant empirical studies have been
conducted on "dynamic touch" (Gibson, 1966; Turvey, 1996), of which extended touch (involving contact of a probe
with a surface) is a special case. Turvey and associates have shown that people wielding a probe can judge the
length of the probe as well as the distance to the surface being contacted (Carello, Fitzpatrick, & Turvey, 1992;
Chan & Turvey, 1991) and the size of an aperture defined by two edges (Barac-Cikoja & Turvey, 1991). In making
these judgments, people are able to sense intrinsic properties of the wielded probe (e.g., its moments of inertia)
despite considerable variations in how the probe is held and the joints about which limb rotation occurs (e.g., elbow
and wrist). Related work has shown that bi-manual wielding of crossed rods enables observers to accurately judge
the intersection distance (Cabe, Wright & Wright, 2003).
While we appreciate that people are able to sense the intrinsic properties of the wielded probe as the basis for
judging where the contact point is in space, we are still fascinated by the phenomenological claim that people
actually experience the point of contact at the end of the probe. For a point of contact to be perceived without
ambiguity, the observer must effectively model the linkage from the body to the externalized world (Loomis, 1992).
The work cited above indicates that such modeling is possible when the extension is not geometrically complex,
such as a hand-held rod. This elegant achievement of our evolutionary history is evidenced not only in humans, but also in other animals that exhibit tool use (Maravita & Iriki, 2004; Povinelli, Reaux, & Frey, 2011).
Based on other research we have done on development of the spatial image as representing space around the person
(for review, see Loomis et al., in press), we assert that once striking with the probe has ceased, people continue to
represent the perceived contact point within working memory. The resulting spatial image can then serve to
support spatial behaviors, such as spatial updating, whereby the person can move flexibly through space while
updating the spatial locations of the initially perceived contact points. In addition to investigating whether such
spatial images can be formed with extended touch, we were also interested in assessing the accuracy and precision
of locations perceived using extended touch relative to normal touch and full-cue visual viewing.
To address these issues, we used a spatial updating task, where participants perceived a single target from a specific
observation point (the origin), then sidestepped 1 m left or right, and walked without vision toward the remembered
target location. Upon reaching the target, participants indicated its 3-D location by gesturing with the hand, a
response method developed by Ooi, Wu, & He (2004) and subsequently used in other research (Ooi & He, 2006;
Ooi, Wu, & He, 2006; Wu, Ooi, & He, 2004). Under the assumption that spatial updating is performed without
systematic error (see Loomis & Philbeck, 2008), the location indicated by the gesturing hand is a best estimate of the
initially perceived 3-D location. With good distance information available, the average gestured location to targets
above the ground and up to 7 m away is within 15 cm of the target (Ooi & He, 2006). In reduced-cue environments,
systematic biases in visual perception can be demonstrated by this blind walking/gesturing technique. For example,
when it is used to measure the perceived 3-D locations of nearby dim points of light viewed in an otherwise dark
room, the indicated location typically is farther from the viewing point than the target, indicating over-perception of
target distance, but essentially in line with the target, indicating correct perception of target direction (Ooi et al., 2001, 2006; Wu et al., 2004). We note, however, that the claim of correct perception of direction has been
challenged recently. Using a different method for measuring perceived direction, Li and Durgin (2012) concluded
that the visual direction of a target is systematically misperceived, with a consequent misperception of its distance.
Our primary concern in this experiment is to use the walking/gesturing technique to assess the accuracy with which
people can perceive the distance and height of a target using extended touch. However, we also varied the azimuth
of the target relative to the starting orientation of the observer as a manipulation of secondary interest. In the
experiment, targets were presented at two heights, two distances from the origin, and six azimuths relative to the
participants' starting orientation. Perception of all 24 target combinations was assessed for four experimental
conditions (the first three done under blindfold): (1) extended touch with a 2-m aluminum pole, (2) extended touch
with a 1-m aluminum pole following stepping forward 0.75 m, (3) physical touch after walking to the vicinity of the
target, and (4) normal visual viewing of the targets. The 1-m pole condition was included as a hybrid of the 2-m pole
and hand touch conditions inasmuch as it involved both extended touch and the proprioception of walking. In the
two conditions where the participants walked forward (2 and 3), they stepped backward to the origin after
responding, so that in all conditions, responses were initiated after side-stepping from the origin.
Having participants sidestep prior to walking and gesturing was intended to eliminate two alternative strategies that
would not rely on a spatial image. First, it precludes the possibility that the walking and gesturing responses might
be computed from the observation point and performed ballistically. Second, it prevents participants in the normal
touch condition from simply repeating the motor movements during the response phase that they made to reach the
target in the perception phase. In order for participants to respond to the initially perceived target following
sidestepping, they must mentally represent the target in working memory (i.e., form a spatial image) and then update
this representation during both sidestepping and the subsequent blind walking/gesturing. Good performance in the extended touch conditions of this study is possible only if accurate spatial images can be formed of the locations contacted by the end of the pole.
Method
Participants. Fourteen participants (7 female), ages 20 to 30 (M = 24.6, SD = 3.5), took part in the study. The
research was approved by the University of Maine’s local ethics committee and written informed consent was
obtained from all participants, who received monetary compensation for their time.
Apparatus. Targets consisted of two 6.6 cm diameter field hockey balls, which were affixed to the tops of two
microphone stands (0.5 and 1.5 m height). PVC tubing (4.5 cm diameter) was put over the stalk of the stand to
ensure a uniform surface along its extent. Two aluminum poles (1 m and 2 m, 1.5 cm diameter) served as the extended probes used to apprehend the targets (see Figure 1). A fingerless cycling glove with an infrared LED
affixed to the dorsal surface was worn in each condition. The LED was used to track user movement during the
blind walking tests by means of a four camera optical tracking system (PPT, Worldviz Inc., Santa Barbara, CA).
Recording of tracking data and sequencing of experimental trials was done using the Vizard 3D rendering suite
(version 3.17, Worldviz). A blindfold (Mindfold Inc., Tucson, AZ) was worn during all experimental trials, except
for those requiring visual inspection. A Nintendo Wiimote was used for making test responses in the blind walking
trials.
Figure 1. Three of the panels illustrate a participant feeling a target in the long pole, short pole, and hand touch
conditions. The fourth panel shows the participant making a gesturing response to the target location following blind
walking to its vicinity.
Procedure. The experiment had four separate conditions, each with 24 trials, and adopted a within-participants design, with each participant completing every condition. The 24 trials represented the different combinations of height, distance, and azimuth for target placement. The two heights used were 0.5 and 1.5 meters; the two distances were 1.75 and 1.25 meters from the origin; and the six azimuths used were -135º, -45º, 0º, 45º, 135º, and 180º
relative to the starting orientation (0º) at the origin (see Figure 2), with negative values indicating azimuths to the
left. Within our coordinate frame, x corresponds to left/right variation in the figure, y corresponds to height, and z is
parallel to the starting orientation. The 24 target location presentations were randomized over the 14 participants,
and condition order was counterbalanced as fully as possible given the number of participants using a Latin Square
design. The exception was the vision condition, which was always performed last. This was done as viewing was
performed with full cues and thus provided information about the room, which could have biased the non-visual
conditions if run earlier.
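To make the geometry of the design concrete, the following sketch (ours, not part of the original study) enumerates the 24 nominal target locations and converts each (distance, azimuth, height) triple into the x/y/z frame just described; the sign convention for x (negative to the left) is an assumption made only for illustration.

import itertools
import math

# Nominal design values from the Procedure (meters and degrees).
HEIGHTS = [0.5, 1.5]                      # y: target height above the floor
DISTANCES = [1.25, 1.75]                  # horizontal distance from the origin
AZIMUTHS = [-135, -45, 0, 45, 135, 180]   # relative to the starting orientation (0 deg)

def target_xyz(distance, azimuth_deg, height):
    """Map (distance, azimuth, height) to the frame used in the text:
    x = left/right, y = height, z = along the starting orientation."""
    a = math.radians(azimuth_deg)
    return (distance * math.sin(a), height, distance * math.cos(a))

# All 24 height x distance x azimuth combinations.
targets = [target_xyz(d, az, h)
           for h, d, az in itertools.product(HEIGHTS, DISTANCES, AZIMUTHS)]
assert len(targets) == 24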
Figure 2. Top down view of the space with a participant at the origin in the initial orientation (0°). Situated around the
person are all of the possible physical locations of the targets (height collapsed) as well as the two drop off points (1m to
the left and 1m to the right).
Prior to the start of the experiment proper, the blind walking task with target distances of 3, 5, and 10 ft (0.30, 1.50,
and 3.05 m) was demonstrated in a hallway outside of the lab room. Participants looked at a taped marker on the
floor, walked to the point with eyes closed, and then opened their eyes to get corrective feedback about differences
between the target and the stopping point. This was done for three replications at each distance.1 The participant
then entered the lab room. All experimental trials comprised a perception phase followed by a response phase. For
the three conditions involving touch, one practice trial preceded the experimental trials to acquaint the participant
with the procedure for exploring the target and then responding by blind walking and gesturing (with the target
removed). Participants were not allowed to view or haptically explore the rod and no feedback about the accuracy of
the response was provided.
The procedures for perceiving the target's location in the four different perception conditions are given below (see
Figure 1).
1) Long pole: In this condition, the blindfolded participant used a 2-m pole to explore each target and thus to
perceive its location. To begin, the participant stood at an origin within the lab facing the starting orientation (0º
azimuth), as indicated by a toe rest used to facilitate non-visual alignment. The participant held onto the top (handle
end) of the pole which was vertically oriented directly in front of the origin. Once the target was silently positioned,
the experimenter lifted the tip of the pole, which had been resting on the floor, until it was parallel to the floor,
rotated the pole until aligned with the target azimuth, and rested the tip on the target’s dorsal surface. The participant
was instructed to remain at the origin and rotate in place to follow the movement of the pole, such that it always
remained in front. Once the end of the pole was placed on the target, the participant had 20 sec to explore and thus
perceive the target’s location with respect to his or her own location in the room. This 20 sec period was found to be
more than sufficient from pilot studies in the lab. To explore the target, the participant was allowed unrestricted
movement of the pole up and down along the stand upon which the target was mounted. The participant could also
move the pole left and right to allow for triangulation or forward and back to calibrate the distance against the length
of the pole. To minimize the use of auditory and body-based distance cues, the participant was not allowed to rotate
or translate the head or body during exploration, was discouraged from tapping the target stand with the pole, and
could not sweep the pole back and forth between the target’s location and the origin (i.e. providing a distance metric
beyond the internalized length of the pole). After this perception phase, the experimenter grasped the tip of the pole
and rotated it (and the participant) back to the starting orientation.
2) Short pole: A 1-m pole was used in this condition. The initial procedure was identical to the long pole condition
except that after the participant rotated through the angle from the original orientation to the target location, he/she
stepped 0.75 m forward before the tip of the pole was placed on the top of the target by the experimenter. (This
distance was sufficient to reach both targets.) He/she then had 20 sec to explore and perceive the target as described
earlier. When finished, the participant translated 0.75 m backward to the origin and was rotated back to the starting
orientation with a short guidance rod.
3) Hand touch: In this condition, the blindfolded participant began at the origin. After target placement, the
experimenter handed him/her a short 10 cm rod which was used to guide him/her while rotating into alignment with
the target azimuth. Once the participant was aligned, he/she stepped forward until standing 0.25 meters in front of
the target (placing it within comfortable arm’s reach). The experimenter then guided the participant's hand to rest on
the top of the target, and the participant was allowed the 20 sec exploration period. He/she then translated backward
to the origin and was then rotated back to the starting azimuth with the guidance rod.
4) Vision: The initially blindfolded participant stood at the origin. Once the target was in place, he/she was
instructed to lift the blindfold while remaining at the origin, view the target, and then to rotate so that the target was
directly ahead. Once aligned, he/she had 20 seconds to view the target and then was asked to rotate back to the
starting azimuth under visual control and lower the blindfold.
For all conditions, the response phase began after the participant had returned to the starting azimuth following the
20 sec exploration/viewing period for each trial. It is important to note that the participant was informed that he or
she was being returned to the same origin, with the experimenter-controlled (or, with vision, self-controlled) rotation
and toe rest ensuring the same orientation for every trial. Prior to the actual response, the participant was instructed
to sidestep 1 m either left or right (stepping direction was counterbalanced over all trials, conditions, and
participants). Once positioned at the drop-off point, as indicated by toe rests for all conditions to facilitate alignment
with the starting orientation, he/she pressed a button on the Wiimote controller held in the left hand to indicate the
start of the response itself. Because the drop-off points differed from the origin, task performance required spatial
updating of the target during sidestepping in order to know the correct direction for walking toward the target. To
perform the blind walking/gesturing response, participants attempted to turn and then walk directly from the drop-
off location to the initially perceived and subsequently updated target location (which could be in front or behind
them) and to place the palm of their right hand at the height where they believed the object, now removed, had been
located (see lower right panel of Figure 1). Immediately upon reaching the target location and indicating its height,
the participant pressed a second button on the Wiimote controller. The button press logged the precise x, y, z
position calculated from the LED affixed to the glove on the right hand. This location constituted the response in the
subsequent analysis. After making the response, the participant was led back to the starting location and once again
was aligned with the starting azimuth. This perception/test procedure was repeated for each of the 24 trials in every
condition, for a total of 96 trials.
Results and Discussion
Prior to analysis, the nominal target locations were converted to measured locations with the LED affixed to the
glove of an experimenter who positioned his hand on top of each of the four targets at the 0° azimuth; this was
necessary to compensate for the fact that the LED affixed to the glove was several centimeters above the top of the
target when the hand was in the correct position. The initial analysis examined the target-to-response distance error
for effects of the order in which participants completed the haptic exploratory modes (recall that vision was
always administered last). The effect of order was small and inconsistent across exploratory modes (mean error was
61, 60, and 57 cm for orders 1, 2, and 3), and a set of t-tests comparing all possible orders within each exploratory
mode (with N for combinations of order and mode ranging from 3 to 6) produced no significant results.
Accordingly, data were pooled across order for subsequent analyses.
Biases in the Azimuths of the Response Centroids
Figure 3 gives a top down view of the response centroids (averaged over the two heights) as a function of perception
condition, distance of the target, azimuth of the target, and drop off point. The azimuths of the response centroids
exhibit large biases relative to those of the respective targets, depending on response location. The response
centroids from the location to the right tend to be leftward of targets, whereas responses from the location to the left
tend to be rightward of targets. The biases may reflect two different influences: an error in the representation of
target azimuth and an influence of sidestepping on the subsequent response. A factor contributing to representational
error may have been the relatively transparent geometry of the target layout, given the symmetry and small number
of locations; geometric regularities are known to induce response biases in spatial tasks (Huttenlocher, Newcombe,
& Sandberg, 1994). To the extent that sidestepping itself induces error, this tendency is inconsistent with previous
work showing that spatial updating is performed without systematic error (e.g., Loomis et al., 2002; Philbeck,
Loomis, & Beall, 1997). In particular, the influence of sidestepping was not observed in a previous study by the
authors (Klatzky et al., 2003).
The biases that depend on the response location are all approximately symmetric about the axis of the starting
orientation. This observation was confirmed by summing the signed errors perpendicular to this axis for each pair of
symmetric targets and finding that the 95% confidence interval (-3.8 ± 9.6 cm, using the 16 symmetric pairs as units
of observation) included zero. Accordingly, Figure 4 presents the same data after averaging over the two directions
of sidestepping and averaging by reflection over the two sides of this axis. In this figure, the response azimuth tends
to be pulled toward the axis of the starting orientation for targets off to one side, whether in front or in back.
These biases, after averaging out the systematic error due to response location, are evidence of residual influences in
how the target azimuth is represented. First, the degree of bias is much greater for targets off to one side than for the
targets directly ahead and behind. Average azimuthal errors for the folded data are 2.0º, -2.9º, -9.2º, and 8.0º for
targets with azimuths of zero, 180º, 45º, and 135º, constituting an approximately 4:1 ratio of error for oblique
relative to sagittal targets. Second, the centroids for vision are considerably closer to the targets than are the
centroids for the other three conditions. The average centroid-to-target distances are 12.1 cm for vision, versus 24.7
cm for hand touch, 21.3 cm for long pole, and 18.4 cm for short pole. There is good reason to expect that perception
of azimuth is more accurate with vision, for participants were able to see the separation between the starting
orientation and the target azimuth both before and during the rotation to face the target as well as to sense the body
rotation using proprioception. In the other three conditions, the only basis for perceiving target azimuth was to use
proprioception to sense the passive rotation of the body, as guided by the experimenter.
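As an illustration of the folding operation just described, the sketch below (ours; the function names and array layout are assumptions, not the authors' analysis code) reflects left-side responses about the starting-orientation axis before averaging them with their right-side counterparts, and sums the signed lateral errors of each symmetric pair as the symmetry check.

import numpy as np

def fold_pair(responses_left, responses_right):
    """Average responses to a symmetric target pair (azimuths -theta and +theta)
    after reflecting the left-side responses about the starting-orientation (z) axis.
    Each argument is an (N, 2) array of (x, z) coordinates, one row per response."""
    reflected = responses_left * np.array([-1.0, 1.0])   # flip x, keep z
    return (reflected + responses_right) / 2.0

def lateral_symmetry_sum(x_error_left, x_error_right):
    """Sum of signed lateral (x) errors for a symmetric pair; values near zero
    across pairs indicate that left and right biases mirror one another."""
    return x_error_left + x_error_right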
Figure 3. Top down view of the centroids of the response locations (averaged over the two heights), as a function of
perception condition, target location, and drop off points (to the left and right). The same symbol is used for the drop-off
centroid and drop-off point (DOC and DOP, respectively), as the locations of the latter immediately left and right of
center are clear.
Figure 4. Top down view of the response centroids, averaged over the two heights, over the left and right drop off points,
and averaged over left and right targets, after folding over the axis of the initial orientation (0°).
Accuracy and precision of perceiving and updating targets varying in height and distance
As mentioned in the introduction, our primary concern is with the accuracy with which distance and height are
perceived using extended touch. Given the clear azimuthal biases just discussed, our subsequent analysis
concentrates on the perception of distance and height. For this subset, analyses of exploratory-condition order again
showed no consistent effect. As is apparent in Figures 3 and 4, errors in distance (i.e., difference between walked
distance and origin-to-target distance) tended to vary little across values of azimuth. ANOVAs on signed error
conducted within each perception condition, with factors of target azimuth and target distance, confirmed the
absence of effects involving azimuth for all perception conditions except vision, where the main effect of azimuth was significant, F(5,13) = 2.83, p = .019, η2p = 0.10. The tendency in the vision condition was for responses along the 0° azimuth to be more accurate than along the obliques (average signed error = 1.7 cm vs. 9.6 cm, respectively).
We therefore chose to analyze the accuracy of distance and height responses for only this azimuth. If we assume that
participants erred in the direction of walking as a result of sidestepping, the distances of the responses would be
slightly underestimated if we simply took the coordinates of the responses in the z direction. Instead, we computed
the Euclidean distance of each response from the origin, taking into account both the x and z coordinates. By
partialing out azimuthal error in this way, we effectively defined a modified response point that lies within the yz
plane. Subsequent calculations of the centroids and response errors are based on these slightly modified response
points. Figure 5 shows, for each perception condition, a side view of the targets, the centroids of the responses for 0°
azimuth, and the respective standard errors of the mean for both height and distance.
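A minimal sketch of the correction just described (our illustration; the function name is hypothetical): each response is replaced by a point in the yz plane whose distance from the origin equals the response's horizontal distance, so that azimuthal error is partialed out while distance and height are preserved.

import math

def collapse_to_yz(x, y, z):
    """Project a response onto the yz plane, keeping its height (y) and its
    horizontal (x-z) Euclidean distance from the origin but discarding azimuth."""
    return (0.0, y, math.hypot(x, z))

# Example: a response 15 cm off-axis still counts its full horizontal distance.
print(collapse_to_yz(0.15, 0.52, 1.70))   # -> (0.0, 0.52, ~1.707)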
Figure 5. Side view of the targets and responses in each of four perception conditions. These data are based on only the
responses to the targets at 0° azimuth. Here a side profile of a “participant” is shown and in front are the four possible
physical targets (1.25m and 1.75m distances in combination with 0.5m and 1.5m heights). Because of slight azimuthal
errors, the centroids depicted here are based on the heights of the responses and their distances measured from the origin.
Associated with each centroid are the standard errors of the means in the height and distance directions. There were 14
participants in this experiment. Included in the panel for Vision is the mean gestured location ("centroid") to a target
viewed under reduced-cue viewing, showing the substantial distance overestimation typical of near targets (data from
Figure 5c of Ooi et al., 2006).
We would like to draw conclusions about the accuracy and precision with which targets are perceived from the
accuracy and precision of the responses. Clearly, the responses reflect more than perception, for the participant must remember the target's location, sense his/her motion through space, update the remembered location, and respond by hand
gesturing. The accuracy and precision of the responses reflect all of these sources of variation (for discussion, see
Loomis & Philbeck, 2008). However, to the extent that blind walking and gesturing is calibrated for the near environment under full-cue vision, as appears to be the case (Ooi & He, 2006; Wu et al., 2004), the
precision and accuracy of the responses place lower limits on the precision and accuracy of perception and allow
conclusions about the relative accuracy and precision of the four different modes of perception.
The responses to visual targets in Figure 5 constitute a gold standard for performance in our experiment. As in the
full-cue condition in the study by Ooi and He (2006), performance here is very good. More important is the
generally high accuracy of the responses for all four perception conditions, as indicated by the proximity of the
centroids to the targets. Especially of interest are the results for the long pole condition, which involves pure haptic
information without translations of the body. For all conditions, updating accuracy was assessed for distance and
height by calculating for each participant the signed error, i.e., the signed distance between the target and the
response either along the distance or height axis. Small values indicate high accuracy. Separate repeated-measures
ANOVAs were conducted on the height and distance errors, with factors of target height, target distance, and
perception condition. The analysis on distance error showed no significant effects. The analysis on height error
showed effects of perception condition, F(3,13) = 3.93, p = 0.015, η2p = 0.23, with means of 1, 9, 7, and 3 cm for
hand, long pole, short pole and vision, respectively. Paired t-tests were used to compare vision to each other
condition. There were significant effects (2-tailed) for long pole (p < 0.05) and short pole (p < 0.01), but not hand (p
> 0.50). Thus, as expected, vision tended to produce more accurate responses. There were also effects of distance,
F(1,13) = 5.79, p = .032, η2p = 0.31, and a height x distance interaction, F(1,13) = 9.34, p = 0.009, η2p = 0.42,
reflecting some over-estimation at the longer distance and greater height.
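The accuracy measure defined above reduces to a simple signed difference; the sketch below (ours, with hypothetical numbers) computes it along one axis for a set of per-participant responses.

import numpy as np

def mean_signed_error(responses, target):
    """Accuracy: mean of (response - target) along one axis (distance or height);
    values near zero indicate high accuracy, the sign indicates over/undershoot."""
    return float(np.mean(np.asarray(responses, dtype=float) - target))

# Hypothetical height responses (m) for a 1.5-m target.
print(mean_signed_error([1.53, 1.48, 1.61, 1.55], target=1.5))   # ~0.043 m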
We interpret the proximity of all of the centroids to the respective targets in Figure 5 as evidence of accurate
perception in all four conditions. However, because "accurate" is a relative term, it is useful to compare perception
in our experiment with perception in other studies. The panel in Figure 5 for vision includes the results for one
target viewed under conditions of reduced distance cues (from Figure 5c of Ooi et al., 2006). As in the current
study, participants indicated the perceived location of the target using the blind walking and gesturing procedure.
The centroid of the gestured locations, for 8 participants, can be seen to be more or less in line with the target from
the viewing location, indicating fairly accurate perception of direction, but is much farther away, indicating
significant over-perception of distance when distance information is impoverished.
Figure 6. Accuracy of distance perception in the current study and three other studies. The dashed line represents
accurate distance perception. The four lines in the center of the figure give the mean distances of the indicated locations
in the four conditions of the current study after averaging over the two heights used. The results of another extended
touch study are those from Experiment 1 of Chan and Turvey (1991). Large distance overestimation errors occur with
audition in a field outdoors (data from the Dynamic-4 m condition of Figure 3 of Speigle & Loomis, 1993) and with
reduced-cue vision in a darkened room (data taken from the Motion, Binocular condition of Figure 2 of Philbeck &
Loomis, 1997). Both of these latter studies used the blind walking method to measure perceived distance.
Other studies using similar methods of response also show evidence of large perceptual errors. Figure 6 gives the
results of two such studies. Speigle and Loomis (1993) had participants perform blind walking to sounds delivered
by loudspeakers in an outdoor field. Even with dynamic distance cues available during a short period of walking
prior to the blind walking response, auditory targets were perceived to be more than a meter farther away than they
actually were. Similarly, when participants viewed glowing targets at eye level in an otherwise dark room, very near
targets were perceived as more distant than they actually were (Philbeck & Loomis, 1997). Figure 6 also gives the
results of another extended touch study (Chan & Turvey, 1991, Experiment 1), obtained using a different response
method. Accuracy was on the same order as that in the current study.
Perceptual precision was assessed for both height and distance by computing the absolute deviation between the
participant’s response and the mean response, averaged over participants. These mean absolute deviations are
closely related to the standard errors for height and distance in Figure 5. Small values indicate high precision. The
ANOVA on height deviations produced effects of height, F(1,13) = 47.39, p < 0.001, η2p = 0.78, and height x
perception condition, F(3,39) = 3.09, p = .038, η2p = 0.19. The ANOVA on the distance deviations showed effects of perception condition, F(1,13) = 10.96, p < 0.001, η2p = 0.46, with means of 21, 30, 17, and 13 cm for hand, long pole, short pole, and vision, respectively. Paired t-tests were used to compare vision to each other mode. These were significant (2-tailed) for hand (p < .05) and long pole (p < 0.001), but not short pole (p > .20). Thus, not
surprisingly, vision is slightly more precise.
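The precision measure is likewise a mean absolute deviation from the across-participant mean response; a brief sketch (ours, with hypothetical data) follows.

import numpy as np

def mean_absolute_deviation(responses):
    """Precision: mean absolute deviation of each response from the mean
    response across participants; smaller values indicate higher precision."""
    r = np.asarray(responses, dtype=float)
    return float(np.mean(np.abs(r - r.mean())))

# Hypothetical walked distances (m) to a 1.75-m target.
print(mean_absolute_deviation([1.62, 1.80, 1.71, 1.91]))   # ~0.095 m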
A secondary purpose of the experiment was to provide evidence that a spatial image is formed of the point of
contact between the probe and the target being explored. While we believe that the usual blind walking/ gesturing
involves updating of a spatial image, requiring the participants to sidestep precluded them from simply making an
estimate of the contact point and performing the blind walking/gesturing response in ballistic fashion. Instead they
were forced to form a representation of the contact point (a spatial image) and update this location in order to
perform the subsequent response. Although sidestepping introduced some systematic bias, the fact that the response
centroids are close to the targets in height and distance is evidence of the ability of participants to form a mental
representation of the contact point.
Experiment 2
Although the instructions to the participant in Experiment 1 were for the purpose of minimizing any auditory cues
produced when the pole struck the target or its supporting stand, weak auditory cues might still have been available
in the two extended touch conditions. If they were, such cues would vitiate any claim we could make about extended
touch conditions based on haptic cues alone. Experiment 2 is a control experiment to confirm that the fairly accurate and precise performance in the two pole conditions of Experiment 1 was not the result of unintended auditory cues to the target locations. Here we compared performance with the long and short probe under the same
exploratory conditions as Experiment 1 and under exploratory conditions that eliminated auditory cues. Only direct
walking responses were required, since the issue is whether auditory cues facilitate the perception of target location,
a process that is independent of whether subsequent responses are direct or indirect. If no reliable differences are
observed between conditions, as is expected, then we have good evidence that the results of Experiment 1 were
driven solely by extended haptic exploration with the probes.
Method
Participants. Twelve new participants (6 female), ages 18 to 29 (M = 20.7, SD = 3.1), took part in the study. The
research was approved by the University of Maine’s local ethics committee and written informed consent was
obtained from all participants, who received monetary compensation for their time.
Apparatus and procedure. The apparatus used here was the same as Experiment 1, except that noise cancelling
headphones (Ryobi Tek, model RP4530) fed with white noise were worn during the perception phase on half of the
trials; in pilot testing, the noise level was chosen so as to block out all sounds in the lab, including those made when
the pole contacted the target or stand. The practice, exploration/perception, and testing procedures were the same as
in Experiment 1 except for a few modifications. Because our primary interest here was whether auditory cues might
have influenced extended touch in Experiment 1, we only used the 1-m and 2-m pole conditions. Half of the trials
with each pole were done without auditory masking, as in Experiment 1, and the other half included auditory
masking, per above. Practice trials employed masking and did not include experimental target positions. Masking
and no-masking trials were blocked, with long and short probe trials alternating. To keep the number of trials to a
minimum, only targets for 0° azimuth were used. Also, rather than requiring indirect walking from a left or right
drop-off point, as in Experiment 1, testing here was done using direct blind walking from the origin followed by
gesturing of the target position, similar to the method used in the previously mentioned studies of visual distance
perception (Ooi & He, 2006; Ooi et al., 2001, 2006; Wu, Ooi, & He, 2004). As in Experiment 1, the target heights
were 0.5 and 1.5 meters and the target distances were 1.25 and 1.75 meters. In Experiment 2, we added four
distracter trials in each condition, consisting of a 1-m-high target placed at distances of 1, 1.25, 1.5, and 1.75 meters.
The data from the distracter trials were not analyzed.
Results and Discussion
As in Experiment 1, the results are based on the locations of the 3-D gesturing responses. Here the four conditions
were long pole without noise mask, long pole with noise mask, short pole without noise mask, and short pole with
noise mask. As in Experiment 1, we assessed accuracy by calculating the signed errors between response locations
and the targets for both the distance and height axes, and we assessed precision by calculating the absolute
deviations of the response values from the corresponding means for the two axes. Accuracy is reflected in Figure 7
by the proximity of the response centroids to the targets, and precision is reflected in the size of the error bars
(standard errors of the mean). Separate ANOVAs were conducted to assess whether accuracy and precision were
affected by length of pole (1 m vs. 2 m), presence of a noise mask, which eliminated potential auditory localization
cues, and height and distance of the target. Here we concentrate on effects involving the noise mask factor,
potentially modulated by the pole and target locations.
Figure 7. Side view of the targets and responses in two exploration conditions (long pole, short pole), with auditory cues
present (as in Experiment 1, compare to Figure 5) versus masked by noise. Here a side profile of a “participant” is shown
and in front are the four possible physical targets (1.25m and 1.75m distances in combination with 0.5m and 1.5m
heights). Associated with each centroid are the standard errors of the means in the height and distance directions. There
were 12 participants in this experiment.
The ANOVA on signed error, the inverse of accuracy, showed no significant effects involving noise mask for either
height or distance. The ANOVA on absolute deviations, corresponding to the inverse of precision, indicated that
noise masking, which eliminated auditory localization cues, in some cases slightly improved the precision of the
response height for near targets, as indicated by a significant noise mask x distance interaction, F(1,11) = 5.46, p =
0.039, η2p = 0.33. In the analysis of absolute deviations, a slight increase in precision caused by the noise mask was
evident only with the short pole, leading to a significant interaction of noise x distance x pole length, F(1,11) = 7.14,
p = 0.022, η2p = 0.33.
The results for accuracy and precision reveal no facilitation due to auditory localization cues in this experiment; if anything, precision increased slightly in selected noise-masking conditions. We conclude that haptic
information alone is responsible for the generally high accuracy and precision observed with the short and long pole
conditions in both Experiments 1 and 2.
General Discussion
Previous research has demonstrated the ability of people to perceive and update haptic stimuli within arm’s reach.
This is true for both single targets (Barber & Lederman, 1988; Hollins & Kelley, 1988) and configurations of a small
number of targets (Giudice et al., 2009; Giudice et al., 2011; Pasqualotto, Finucane, & Newell, 2005). Rotational
updating of locations perceived beyond arm’s reach, i.e. by touching them with a cane, has also been demonstrated
(May & Vogeley, 2006). The present experiments used the blind walking/gesturing method of Ooi and her
colleagues (Ooi et al., 2001, 2006; Wu et al., 2004) to investigate the ability of people to perceive targets sensed
with a long probe, which extended the reach of the arm by 1 or 2 m. The 2-m pole condition was of greatest interest,
for it involved purely haptic exploration with the hand and arm.
Because we were primarily interested in the perception of targets varying in height and distance using extended
touch, we focused our analysis more on the targets that were initially straight ahead. In Experiment 1 both accuracy
and precision were best for those trials (Figures 3 and 4); in Experiment 2, which showed that auditory localization
cues were not a factor in extended touch, only straight-ahead targets were used. Figures 5 and 7 give these results for Experiments 1 and 2, respectively. As expected from prior work (Ooi & He, 2006; Ooi et al., 2001, 2006; Wu et al.,
2004), vision led to responses with the greatest accuracy (Figure 5). Precision was also slightly better. The most
notable result is the remarkably good accuracy and precision with which people perceive targets using a 2 m pole
(Figures 5 and 7). This result extends the research of Chan and Turvey (1991), which showed that people wielding a
probe were able to perceive the vertical distance to a horizontal surface up to 80 cm away. It is also noteworthy that
the short pole and hand touch conditions resulted in comparably good performance. Both of these conditions involve
haptic input from hand and arm and proprioceptive input associated with walking forward to reach the target and
then walking backward to the origin. Previous work investigating haptic learning during ambulation and hand
exploration, where blindfolded participants were guided through a room-sized layout of six sequentially exposed
objects, has shown accurate learning (Yamamoto & Shelton, 2005, 2007).
Although the manipulation of target azimuth in Experiment 1 resulted in significant azimuthal errors (Figures 3 and 4), participants nevertheless showed the ability to update the locations of targets initially lying in all directions about
the starting orientation. Although we did not report the results for the other azimuths, the accuracy and precision of the responses to height and distance for targets straight ahead were only slightly better than those for the other directions.
noteworthy that in all four of the perception conditions, participants were able to update the locations of targets
directly behind them while sidestepping. Indeed, in a study that controlled for differences in body turning during the
perception and response phases, Horn and Loomis (2004) showed that updating performance is about as good in
back as in front.
As discussed in the introduction, phenomenological reports indicate that the most salient aspect of the experience of
contacting a surface or point target with a wielded probe is not the vibrations and forces in the probe but the
perception of the point of contact. We maintain that the participant does indeed perceive the point of contact in
external space, not unlike perceiving the location of a target with the bare hand. Moreover, as other research has
shown with visual and auditory targets (for a summary, see Loomis et al., in press), accompanying the perceived
location is a more abstract spatial representation of the same location (spatial image) that continues even after the
stimulus and its corresponding percept have ceased. The experiments here provide evidence of a 3-D spatial image
corresponding to the perceived contact point between target and probe. In particular, requiring the participants to
sidestep precluded them from simply making an estimate of the contact point and performing the blind
walking/gesturing response in a ballistic fashion. Instead they needed to form a representation of the contact point (a
spatial image) and update this location in order to perform the subsequent response.
Our notion of the spatial image, as built up from extended touch, is that of a representation of the object at the end of the probe, established through a linkage between what is perceived from the external world and an internal model of the extension (Loomis, 1992). This idea is somewhat different from other views positing that use of a tool actually leads to a
change in the body schema, such that the representation of the limb expands to encompass the tool (Maravita &
Iriki, 2004). In other words, the perceptual-motor expansion of peripersonal space afforded by tool use leads to
expansion of the neural representation of the arm in our body schema, as evidenced by monkeys (Iriki, Tanaka, &
Iwamura, 1996) and humans (Cardinali et al., 2009). While these alterations of body schema are known to occur
after extended training, the current results were demonstrated almost immediately after initial perception with the
probe. We interpret these findings as supporting the development of a spatial image of surrounding space based on
accurate perception of the tool and an internal model of its extent, rather than inducing a more enduring modification
of the body schema. As the spatial image is postulated as representing the perceived contact point of the probe, it
can readily support spatial behaviors after the percept has ceased, such as the spatial updating performance shown in
this paper. This notion is congruent with the phenomenological claim that people experience the point of contact at
the end of the probe, rather than the probe itself (Gibson, 1966; Katz, 1925/1989) and is in agreement with the
suggestion that the tip of the tool is what is being represented, rather than considering it as an extension of the arm’s
representation in the body schema (Holmes & Spence, 2004). This interpretation is also consistent with the view that
we may have separate representations of the hand and tool which are co-registered during context-appropriate actions
(Povinelli et al., 2011).
Acknowledgements
This research was funded by NIH grant 1R01EY016817. The authors thank Mick Ramos for assistance in recruiting
and running participants and Tim McGrath for assistance with coding the experimental scripts.
Footnote
1. Practice with blind walking is not usually provided in studies using blind walking or blind walking/gesturing
techniques to assess distance perception. Even without it, research has shown that the mean indicated distances
using blind walking to visual targets viewed with full distance cues are very accurate (for a summary, see Loomis &
Philbeck, 2008). The ability to perform blind walking and blind walking/gesturing must require some calibration of
the overall scaling factor for walked distance. Normally, this calibration surely is provided by observing the
correspondence between walking speed and observed changes in the visual scene during normal ambulation
(Loomis & Philbeck, 2008, p. 20). Even with such calibration, large systematic performance errors are readily
apparent when distance cues are impoverished, errors that are traceable to errors in perceiving distance (for
summary, see Loomis & Philbeck, 2008). However, because we wished to minimize errors in perceived
displacements associated with blind walking, we chose to provide practice and feedback with only this component
of the response. Thus, the practice served only to precisely calibrate perceived displacements associated with
walking.
References
Barac-Cikoja D, Turvey MT (1991) Perceiving aperture size by striking. Journal of Experimental Psychology:
Human Perception and Performance 17: 330-346
Barber PO, Lederman SJ (1988) Encoding direction in manipulatory space and the role of visual experience. Journal
of Visual Impairment & Blindness 82: 99-106
Cabe PA, Wright CD, Wright MA (2003) Descartes’s blind man revisited: Bimanual triangulation of distance using
static hand-held rods. The American Journal of Psychology 116: 71-98
Cardinali L, Frassinetti F, Brozzoli C, Urquizar C, Roy AC, Farne A (2009) Tool-use induces morphological
updating of the body schema. Current Biology 19: R478-479
Carello C, Fitzpatrick P, Turvey MT (1992) Haptic probing: Perceiving the length of a probe and the distance of a
surface probed. Perception & Psychophysics 51: 580-598
Chan TC, Turvey MT (1991) Perceiving the vertical distances of surfaces by means of a hand-held probe. Journal of Experimental Psychology: Human Perception and Performance 17: 347-358
Epstein W, Hughes B, Schneider SL, Bach-y-Rita P (1986) Is there anything out there? A study of distal attribution
in response to vibrotactile stimulation. Perception 15: 275-284
Gibson JJ (1966) The senses considered as perceptual systems. Houghton-Mifflin, Boston
Giudice NA, Betty MR, Loomis JM (2011) Functional equivalence of spatial images from touch and vision:
Evidence from spatial updating in blind and sighted individuals. Journal of Experimental Psychology: Learning,
Memory and Cognition 37: 621-634
Giudice NA, Klatzky RL, Loomis JM (2009) Evidence for amodal representations after bimodal learning:
Integration of haptic-visual layouts into a common spatial image. Spatial Cognition & Computation 9: 287-304
Hollins M, Kelley EK (1988) Spatial updating in blind and sighted people. Perception & Psychophysics 43: 380-388
Holmes NP, Spence C (2004) The body schema and the multisensory representation(s) of peripersonal space.
Cognitive Processing 5: 94-105
Horn DL, Loomis JM (2004) Spatial updating of targets in front and behind. Paidéia 14: 75-81
Huttenlocher J, Newcombe N, Sandberg E (1994) The coding of spatial location in young children. Cognitive
Psychology 27: 115-147
Iriki A, Tanaka M, Iwamura Y (1996) Coding of modified body schema during tool use by macaque postcentral
neurones. Neuroreport 7: 2325-2330
Katz D (1989) The world of touch (L. E. Krueger, trans.). Lawrence Erlbaum Associates, Hillsdale, NJ (Originally published, 1925)
Klatzky RL, Lippa Y, Loomis JM, Golledge RG (2003) Encoding, learning, and spatial updating of multiple object
locations specified by 3-D sound, spatial language, and vision. Experimental Brain Research 149: 48-61
Li Z, Durgin FH (2012) A comparison of two theories of perceived distance on the ground plane: The angular
expansion hypothesis and the intrinsic bias hypothesis. iPerception 3: 368-383
Loomis JM (1992) Distal attribution and presence. Presence: Teleoperators and Virtual Environments 1: 113-119
Loomis JM, Klatzky RL, Giudice NA (in press) Representing 3D space in working memory: Spatial images from
vision, touch, hearing, and language. In: Lacey S, Lawson R (eds) Multisensory Imagery: Theory & Applications.
Springer, New York
Loomis JM, Klatzky RL, McHugh B, Giudice NA (2012) Spatial working memory for 3-D locations specified by
vision and audition. Attention, Perception, & Psychophysics 74:1260–1267
Loomis JM, Lippa Y, Golledge RG, Klatzky RL (2002) Spatial updating of locations specified by 3-d sound and
spatial language. Journal of Experimental Psychology: Learning, Memory, and Cognition 28: 335-345
Loomis JM, Philbeck JW (2008) Spatial perception with spatial updating and action. In: Klatzky RL, Behrmann M, MacWhinney B (eds) Embodiment, ego-space, and action. Taylor & Francis, New York, pp 1-43
Lotze H (1894) Microcosmus: An essay concerning man and his relation to the world (E. Hamilton and E. E. C. Jones, trans.), Vol. I, Book V. T & T Clark, Edinburgh, p 588
Maravita A, Iriki A (2004) Tools for the body (schema). Trends in Cognitive Sciences 8: 79-86
May M, Vogeley K (2006) Remembering where on the basis of spatial action and spatial language. Paper presented at the International Workshop on Knowing that and Knowing how in Space, Bonn, Germany
Ooi TL, He ZJ (2006) Localizing suspended objects in the intermediate distance range (>2 meters) by observers with normal and abnormal binocular vision. Journal of Vision 6: 422a
Ooi TL, Wu B, He ZJ (2001) Distance determined by the angular declination below the horizon. Nature 414: 197-
200
Ooi TL, Wu B, He ZJ (2006) Perceptual space in the dark affected by the intrinsic bias of the visual system.
Perception 35: 605-624
Pasqualotto A, Finucane CM, Newell FN (2005) Visual and haptic representations of scenes are updated with
observer movement. Experimental Brain Research 166: 481-488
Philbeck JW, Loomis JM (1997) Comparison of two indicators of visually perceived egocentric distance under full-
cue and reduced-cue conditions. Journal of Experimental Psychology: Human Perception and Performance 23: 72-
85
Philbeck JW, Loomis JM, Beall AC (1997) Visually perceived location is an invariant in the control of action.
Perception & Psychophysics 59: 601-612
Polanyi M (1966) The tacit dimension. Doubleday, Garden City, NY
Povinelli DJ, Reaux JE, Frey SH (2011) Chimpanzees' context-dependent tool use provides evidence for separable
representations of hand and tool even during active use within peripersonal space. Neuropsychologia 48: 243-247
Siegle JH, Warren WH (2010) Distal attribution and distance perception in sensory substitution. Perception 39: 208-223
Speigle JM, Loomis JM (1993) Auditory distance perception by translating observers. Proceedings of IEEE
Symposium on Research Frontiers in Virtual Reality, San Jose, CA, October 25-26, 1993
Turvey MT (1996) Dynamic touch. American Psychologist 51: 1134-1152
Weber EH (1978) Der Tastsinn (D. J. Murray, trans.). Academic Press, New York (Originally published, 1846)
Wu B, Ooi TL, He ZJ (2004) Perceiving distances accurately by a directional process of integrating ground
information. Nature 428: 73-77
Yamamoto N, Shelton AL (2005) Visual and proprioceptive representations in spatial memory. Memory &
Cognition 33: 140-150
Yamamoto N, Shelton AL (2007) Path information effects in visual and proprioceptive spatial learning. Acta
Psychologica 125: 346-360