Active manual control of object views facilitates visual recognition
Brief Communication
Karin L. Harman, G. Keith Humphrey and Melvyn A. Goodale
Active exploration of large-scale environments leads
to better learning of spatial layout than does passive
observation [1–3]. But active exploration might also
help us to remember the appearance of individual
objects in a scene. In fact, when we encounter new
objects, we often manipulate them so that they can be
seen from a variety of perspectives. We present here
the first evidence that active control of the visual input
in this way facilitates later recognition of objects.
Observers who actively rotated novel, three-
dimensional objects on a computer screen later
showed more efficient visual recognition than
observers who passively viewed the exact same
sequence of images of these virtual objects. During
active exploration, the observers focused mainly on
the side or front views of the objects (see also [4–6]).
The results demonstrate that how an object is
represented for later recognition is influenced by
whether or not one controls the presentation of visual
input during learning.
Address: Psychology Department, University of Western Ontario,
London, Ontario N6A 5C2, Canada.
Correspondence: G. Keith Humphrey
E-mail: keith@julian.uwo.co
Received: 9 August 1999
Revised: 20 September 1999
Accepted: 8 October 1999
Published: 8 November 1999
Current Biology 1999, 9:1315–1318
0960-9822/99/$ see front matter
© 1999 Elsevier Science Ltd. All rights reserved.
Results
Recognition performance
We measured the response latency and accuracy of sub-
jects as they performed an old/new discrimination
between two classes of object: ones they had seen during a
study period and ones they had never seen before. The
objects were novel, computer-generated, three-dimen-
sional objects. As Figure 1 illustrates, the objects were
constructed of geon-like parts [7] and were elongated
along a single axis. During the earlier study period, each
subject viewed half the objects using active exploration
and half using passive observation. A yoked-control design
was used such that the passive viewing sequence for a par-
ticular object viewed by a subject during the study period
was simply a replay of an active exploration of that same
object by another subject.
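The yoked-control assignment described above can be sketched in code. This is a minimal illustration, not the authors' procedure: the pairing of subjects, the random split of objects, and the names `build_yoked_schedule`, `subject_ids` and `object_ids` are all assumptions made for the example. The source states only that each passive sequence replayed another subject's active exploration of the same object.

```python
import random


def build_yoked_schedule(subject_ids, object_ids, seed=0):
    """Hypothetical yoking scheme: subjects are paired, and each member of a
    pair passively views replays of the objects the other member explored."""
    if len(subject_ids) % 2 or len(object_ids) % 2:
        raise ValueError("need an even number of subjects and objects")

    rng = random.Random(seed)
    objects = list(object_ids)
    rng.shuffle(objects)
    half = len(objects) // 2
    set_a, set_b = objects[:half], objects[half:]

    schedule = {}
    for first, second in zip(subject_ids[0::2], subject_ids[1::2]):
        # Each passive entry is (object, subject whose exploration is replayed).
        schedule[first] = {
            "active": list(set_a),
            "passive": [(obj, second) for obj in set_b],
        }
        schedule[second] = {
            "active": list(set_b),
            "passive": [(obj, first) for obj in set_a],
        }
    return schedule
```

Under this scheme every subject studies half the objects actively and half passively, and the visual input for a passive object is identical to what the yoked partner produced while exploring it.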
A within-subjects (between-objects) analysis of variance
demonstrated that actively explored objects were recog-
nized faster than were passively viewed objects
(F(1,21) = 16.1, p < 0.001). As can be seen in Figure 2, the
active–passive difference in the speed of recognition of the
studied objects (that is, correctly responding old to studied
objects) was evident in three of the four different views of
the objects that were presented during the old/new task.
Specifically, active exploration facilitated recognition of the
front (p < 0.004), side (p < 0.003) and the three-quarter back
(p < 0.03) view of the objects. Speed of recognition of the
other three-quarter view, the so-called canonical view, did
not depend on whether or not the object had been studied
actively. The same pattern of results was seen when a
between-subjects analysis (yoked subjects, within objects)
was carried out (F(1,18) = 3.8, p < 0.02).
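For a within-subjects factor with only two levels (active versus passive), the repeated-measures F statistic equals the square of the paired t statistic, with degrees of freedom (1, n − 1). The sketch below shows that computation under stated assumptions: the function name `paired_anova_f` and its inputs (one mean response latency per subject per condition) are hypothetical, and it is not a reconstruction of the authors' analysis.

```python
from statistics import mean, variance


def paired_anova_f(active_rt, passive_rt):
    """Return (F, (df1, df2)) for a two-level within-subjects comparison.

    active_rt and passive_rt hold one mean latency per subject, in the
    same subject order. F is computed as the squared paired t statistic.
    """
    if len(active_rt) != len(passive_rt) or len(active_rt) < 2:
        raise ValueError("need two equal-length lists with at least 2 subjects")

    # Per-subject difference between conditions.
    diffs = [p - a for a, p in zip(active_rt, passive_rt)]
    n = len(diffs)
    diff_var = variance(diffs)  # sample variance (n - 1 in the denominator)
    if diff_var == 0:
        raise ValueError("differences have zero variance; F is undefined")

    f_value = n * mean(diffs) ** 2 / diff_var
    return f_value, (1, n - 1)
```

The p value would then be read from the F distribution with the returned degrees of freedom, for example with a statistics package.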
A within-subjects analysis of variance showed no effect of
active exploration on the accuracy of recognition. Accuracy
was, however, affected by the view of the object that was
presented during the old/new task (F(3,21) = 10.77,
p < 0.0001). As is evident in Figure 3, the front or foreshort-
ened view of objects was recognized best. The same
pattern of results was found with a between-subjects analy-
sis (F(3,20) = 9.9, p < 0.0001). It is interesting that accuracy
in general was quite low in both conditions, probably
because of the similarity among the target and distracter
items. In fact, we deliberately designed the old/new task to
be difficult so that we could increase the response latency
enough to reveal a difference between study conditions.
But why accuracy was not sensitive to the active–passive
manipulation is unclear. Of course, continuous measures,
such as reaction time, are often more sensitive than simple
accuracy scores.
Figure 1
(a–j) Examples of the novel, computer-rendered, three-dimensional
objects used in the present study. (k–n) Examples of the views used
during the old/new test session. (k) Front or foreshortened view, where
the principal axis of elongation is perpendicular to the viewer's line of
sight. (l) Side view, where the axis of elongation is parallel to the
viewer's line of sight. (m) Three-quarter back view, a 45° or
intermediate view between a side and a back view. (n) Three-quarter
front view (sometimes called canonical view [7]), a 45° intermediate
view between the front and the side views.
Exploration data analyses
We also examined how subjects distributed their looking
time in the active exploration condition. In particular, we
examined the amount of time that subjects spent on dif-
ferent views of the objects. We calculated peak dwell
times for each subject and found that, when these values
were averaged across subjects, a distinct pattern of explo-
ration emerged. Rather than exploring the objects in an
idiosyncratic manner, the subjects spent most of their
time studying only four views of the objects, all of which
were rotations about the vertical axis (see Figure 4).
These four views corresponded to the front, back and two
side views of the objects. Subjects tended to spend very
little time studying particular intermediate views between
these angles.
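One way to summarize how looking time is distributed over views is to bin the object's orientation and accumulate the time spent in each bin. The sketch below is illustrative only: the source does not describe how dwell times were computed, so the sample format (time in seconds, rotation about the vertical axis in degrees), the 10° bin width, and the names `dwell_time_by_view` and `top_views` are assumptions.

```python
from collections import defaultdict


def dwell_time_by_view(samples, bin_width_deg=10):
    """Accumulate viewing time per orientation bin.

    samples is a time-ordered list of (time_sec, yaw_deg) pairs. Each
    orientation is assumed to stay on screen until the next sample, so
    the final sample only marks the end of the recording.
    """
    dwell = defaultdict(float)
    for (t0, yaw), (t1, _) in zip(samples, samples[1:]):
        # Wrap the angle into [0, 360) and snap it to the start of its bin.
        bin_start = int((yaw % 360) // bin_width_deg) * bin_width_deg
        dwell[bin_start] += t1 - t0
    return dict(dwell)


def top_views(dwell, k=4):
    """Return the k orientation bins with the longest total dwell time."""
    return sorted(dwell, key=dwell.get, reverse=True)[:k]
```

Averaging such per-subject histograms across subjects would show whether dwell time clusters around a few views, as reported above for the front, back and two side views.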
Discussion and conclusions
The results provide the first demonstration that active
control of visual input during perceptual learning leads to
more efficient object recognition. We found that subjects
who actively rotated novel, three-dimensional objects on a
computer screen recognized objects more rapidly than did
subjects who passively viewed the exact same sequence of
images of these virtual objects. In addition, we found that,
when exploring such novel objects, subjects concentrated
on particular views.
Although other studies have demonstrated that active
exploration can improve scene recognition through the
detection of changes in a stimulus array [3], our study pro-
vides convincing evidence that fundamental mechanisms
mediating object recognition can be influenced by active
exploration. In other words, active control over the way in
which the different views of an object are revealed leads
to faster recognition. Just why this occurs is not clear. It
could be that direct manual control over the sequence of
views provides efference copy and/or proprioceptive infor-
mation (see also [3]) that helps to integrate the different
views by allowing subjects to anticipate the upcoming
view and relate it to the previous view. Alternatively, or at
the same time, active exploration could allow subjects to
test predictions about the expected deformations in the
image that would occur when the object is rotated in a par-
ticular way. The advantage observed with active explo-
ration in our experiment might have depended critically
on the fact that the movement of the object on the com-
puter screen was, in some ways, an isomorphic reflection
of the movement of the trackball. This relationship
between visual input and manual control resembles, in
some respects, the way in which we might visually inspect
an actual object that we are holding in our hands.
Of course, integrating views and/or testing hypotheses
about the structure of an object would involve attention.
But attentional resources would not necessarily be distrib-
uted the same way in the two study conditions. In other
words, subjects in the active exploration condition might
have deployed their attention strategically, increasing
their attention when a particular view of the object was on
the screen. Indeed, they might have anticipated the need
to increase their attention at this time. At other times, their
attention might not have been as well focused. This strate-
gic manipulation of attention would be expected to occur
less often in the passive viewing condition where attention
Current Biology Vol 9 No 22
Figure 2
Response latencies to target objects during the test session. Actively
studied objects were recognized faster than passively studied objects,
except for the three-quarter front view. Note that generalization to a
less-studied view (three-quarter back view, see also Figure 4) was
greater for the active group than for the passive group. This
generalization difference between the two study groups was, however,
less pronounced for the three-quarter front view (see also Figure 4).
Error bars indicate one standard error above the mean.
[Bar chart: response time (msec), 0 to 1,600, by test view (front, side, three-quarter back, three-quarter front) for the active and passive conditions.]
Figure 3
The percentage of correctly recognized target objects as a function of
test angle. The front view is recognized more accurately than the other
test views. Error bars indicate one standard error above t