The use of modality in the design of verbal aids in computer-based learning environments



Interacting with Computers 20 (2008) 545–561


journal homepage: www.elsevier.com/locate/intcom


Emilio Sanchez 1, Hector Garcia-Rodicio *

Department of Developmental and Educational Psychology, University of Salamanca, Avenida de la Merced, 109-131, 37005 Salamanca, Spain

Article info

Article history:
Received 23 May 2008
Received in revised form 31 July 2008
Accepted 4 August 2008
Available online 15 August 2008

Keywords:
Computer-based learning environments
Verbal aids
Visual modality
Auditory modality

0953-5438/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.intcom.2008.08.001

* Corresponding author. Tel.: +34 923 29 44 00x3294; fax: +34 923 29 47 08.
E-mail addresses: [email protected] (E. Sanchez), [email protected] (H. Garcia-Rodicio).
1 Tel.: +34 923 29 44 00x3309; fax: +34 923 29 47 08.

Abstract

Computer-based learning environments include verbal aids that help learners gain a deep understanding. These aids can be presented in either the visual or the auditory modality. The problem is that it is not clear-cut how to present them, for two reasons: the modality principle [Mayer, R.E., 2001. Multimedia Learning. Cambridge University Press, New York] is not applicable because verbal aids do not usually come with related pictures, and the little empirical research on the question provides diverging results. Our aim was twofold: to present a research framework, which makes it possible to reinterpret prior findings, and to test it empirically, as it provides guidelines about how to present verbal aids. It distinguishes between two types of verbal aids: regulatory, which guide the learners’ decision-making process during learning, and explanatory, which help learners to revise their understanding of the to-be-learned contents. The framework suggests that explanatory aids should be presented visually and regulatory aids should be presented auditorily. In two experiments participants learned from a computer-based learning environment on plate tectonics and solved retention and inference questions afterwards. They received verbal aids presented in different modalities depending on the condition. Participants receiving visual explanatory aids outperformed those receiving auditory explanatory aids in both retention and inference questions. Participants receiving auditory regulatory aids showed no advantage; the same pattern was obtained in the second experiment, in which the auditory aids were given by a pedagogical agent. Results have practical implications for the design of computer-based materials.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Because computer-based learning environments (CBLs) have multiple capabilities, designers must make many decisions about their construction. One decision is whether to incorporate some kind of assistance or aid into the CBL. Another important decision designers must make is how to present those aids: should they use the visual or the auditory modality? This paper is concerned with this question. Before addressing the question in depth, it is important to define the capabilities of CBLs and to explain why it is so important to include aids in them.

1.1. The capabilities of computer-based learning environments

CBLs are flexible tools of instruction. To describe their multiple instructional capabilities, three levels of analysis can be taken into account (Schnotz, 2005): the presentation format level, the sensory level, and the component level. The presentation format level distinguishes between two types of information that can be conveyed through CBLs: verbal and pictorial information. Thus, this level refers to information such as texts or narrations as well as illustrations or animations. The sensory level distinguishes between two modalities: the visual and the auditory. The visual modality uses the eyes to enter information into the cognitive system; conversely, the auditory modality uses the ears.2 Finally, the component level distinguishes between two components of CBLs: to-be-learned contents and aids. To-be-learned contents consist of information regarding a subject matter learners have to assimilate; it is of interest that subject matters in CBLs are mainly scientific topics, e.g., geometry (Aleven and Koedinger, 2002), physics (Conati and VanLehn, 2000), a car’s braking system (Mayer and Anderson, 1992), or the human circulatory system (Azevedo et al., 2004). Aids, in turn, help learners construct a coherent representation of the to-be-learned contents: they are devices that help learners execute the learning processes. Thus, an example of to-be-learned contents might be the following: “In the Himalaya two continental plates collide whereas in the Andes there is one continental and one oceanic plate. Moreover, the Himalaya has no volcanoes whereas the Andes does.” A sentence setting the aim of that material (“The purpose of this material is to define the differences between the Andes and the Himalaya ranges”) would function as an aid here (e.g., Loman and Mayer, 1983), as it adds no new information to the passage but serves to set a standard of comprehension, which facilitates the comprehension monitoring process. Overall, the wide range of possibilities (i.e., multiple formats, modalities, and components) makes CBLs versatile devices of instruction. At the same time, these multiple possibilities require designers to make decisions about how to use them.

2 These first two levels correspond with those suggested by Schnotz (2005), who stressed the multiple meanings of the term “multimedia”. According to Schnotz, multimedia refers to (a) multiple delivery media such as computers, screens, and loudspeakers, (b) multiple forms of representation such as texts and animations, and (c) multiple senses such as the eye and the ear. The latter two are the levels used here.

1.2. Should computer-based learning environments include aids?

As will be shown, learning from CBLs poses several difficulties to learners, which makes the inclusion of aids in CBLs indispensable. The first difficulty is that scientific topics are complex (i.e., comprise many ideas and interconnections) and expressed in technical vocabulary (Graesser et al., 2002), which requires learners to draw many ideas from the materials and establish the corresponding links among them. Second, related information is presented both as words and pictures, which requires the translation of one format into the other (Ainsworth et al., 2002; Schnotz and Bannert, 2003). Third, using CBLs demands highly self-regulated learning on the part of learners, that is, CBLs require learners to set learning goals, determine which strategies to execute, monitor their emerging understanding, and so on. However, in most cases learners do not exhibit such a self-regulated process (Azevedo et al., 2008; Commander and Stanwyck, 1997; Otero and Campanario, 1990). Fourth, human processing capacity is limited, which means that learning from CBLs entails a risk of working memory overload (Chandler and Sweller, 1996; Sweller et al., 1998). Finally, although prior knowledge is critical for learning (Shapiro, 2004), learners sometimes lack correct prior knowledge of the domain the material is about; they hold misconceptions about it instead (Vosniadou and Brewer, 1992, 1994). Overall, learners have to (a) mentally integrate multiple ideas and information presented in different formats and (b) self-regulate their learning process; furthermore, there are two cognitive constraints they have to deal with: a limited processing capacity and the influence of misconceptions. These drawbacks are serious enough to warrant the use of aids.

What exactly are these aids? As was stated, aids are devices that help learners execute all learning processes and overcome cognitive constraints. Accordingly, aids can be classified on the basis of the learning process they facilitate (integration processes or self-regulation processes) or their ability to alleviate the influence of the cognitive constraints mentioned before. With respect to the aids to integration processes, instructional explanations inserted into worked-out examples help learners to see the rationale behind the solution steps (Renkl, 2002), menu-based tools assist learners in generating explanations to repair their flawed mental representations (Aleven and Koedinger, 2002), splicing in correct pieces of information helps learners to construct correct and complete mental representations (Graesser et al., 2004), some hints guide learners in establishing connections between words and pictures (Seufert, 2003), and so on. Regarding the aids to self-regulation, corrective feedback helps learners to identify flaws in their mental representations (Moreno and Mayer, 2005); similarly, prompts to explain some question in more depth help learners to monitor their understanding (Graesser et al., 2004). Finally, with respect to the cognitive constraints, refutational texts help learners to remove misconceptions (Diakidoy et al., 2003; Mikkilä-Erdmann, 2001) whereas strategies such as presenting pictures and words in a physically integrated fashion prevent memory overload (Chandler and Sweller, 1992). All these experiments have shown that aids incorporated into CBLs are indeed helpful.

According to their format, aids in CBLs can be pictorial (e.g., Mautone and Mayer, 2007) or verbal, the latter being those we focus on. Verbal aids are typically based on human tutoring performance (e.g., Chi et al., 2001; Graesser et al., 1996, 1995) and they are usually given by animated pedagogical agents or assistants (e.g., Dehn and van Mulken, 2000; Johnson et al., 2000). Verbal aids can also be presented in either the visual or the auditory modality: this will be examined thoroughly in the next section.

1.3. The use of modality in the design of verbal aids

Another important decision is how to use modality in designing verbal aids. Is it better to present them in the visual or the auditory modality? Whereas the use of modality is clear-cut in the design of to-be-learned contents, it is not so for verbal aids. Regarding to-be-learned contents, the modality principle (Mayer, 2001; Sweller et al., 1998) suggests that verbal contents accompanying pictures should be presented auditorily: use narrations rather than texts. This is done in order to (a) avoid visual channel overload, which occurs when both words and pictures are processed by the eye, and (b) expand the effective size of working memory, which is possible because of the use of two processors in working memory (Mousavi et al., 1995; Moreno and Mayer, 1999). These two circumstances take place only when words come with related pictures (either sequentially or simultaneously). The modality principle has been widely supported by empirical research (see Ginns (2005) for a meta-analysis).

There is no such principle in the design of verbal aids, for two reasons. First, verbal aids do not usually come with related pictures, so the modality principle is not applicable. Note that visual channel overload takes place when learners must process both words and pictures through the eye, whereas memory expansion is possible when one processor in working memory processes words and the other one pictures. Second, the little empirical research available provides diverging findings. To our knowledge only three sets of experiments have been carried out so far. Moreno and her colleagues (Moreno and Mayer, 2002; Moreno et al., 2001) incorporated explanatory feedback (enhancing the integration among the contents) into a CBL teaching botany. The feedback was presented either visually or auditorily. Participants receiving auditory feedback outperformed those receiving visual feedback. Atkinson (2002) introduced an agent into a CBL teaching how to solve mathematics word problems through worked-out examples. The agent provided explanatory aids (clarifying some aspects of the solution steps) in either the visual or the auditory modality. In Experiment 1, the auditory modality was better than the visual modality on near transfer, with no differences in far transfer; in Experiment 2, there were no differences between modalities in either near or far transfer. More recently, Graesser et al. (2003) incorporated several tutoring aids (such as prompting learners to generate explanations or jumping in and providing correct explanations) into a CBL teaching computer literacy. Tutoring aids were presented either visually or auditorily (among other conditions). There were no differences in performance as a function of the modality of verbal aids. Taken together, the auditory modality was sometimes better than and sometimes equal to the visual modality; hence, there is no consensus regarding how to present verbal aids.

2. Current proposal: the functional approach

Our aim is twofold. First, to present a research framework that makes it possible to reinterpret the diverging findings and to formulate new hypotheses. Second, to test it empirically, gathering support for the framework and extending the little research on the topic in question. The framework is called the functional approach and it is based on several assumptions.

First assumption. The visual and the auditory modalities have specific advantages. One of the assumptions is that each modality has specific advantages. The visual modality is controllable, i.e., it allows the learner to regulate the pace of reading, to reread, and to focus on any of the segments of the material. This control can be beneficial. In fact, there is evidence showing that fixing the pace of reading hinders comprehension (Kintsch and Keenan, 1973), rereading is beneficial for comprehension (Rawson and Kintsch, 2005), and preventing readers from viewing prior segments of the text is detrimental to comprehension (Just et al., 1982). The auditory modality has expressiveness, i.e., it draws on prosody to convey information about, among other things, both the communicative intention (illocutionary acts such as stating, requesting, promising, warning, and so on) and the speaker’s attitude towards the message (value, urgency, and so on). There is evidence showing that people are sensitive to these prosodic cues (e.g., Juzczyck et al., 1992) and make inferences relying on them. For instance, people can infer a speaker’s confidence in what he/she is saying (Brennan and Williams, 1995), a speaker’s emotional state towards the message (Baum and Nowicki, 1998), or a speaker’s cooperative attitude (Haskard et al., 2008) from prosody.

Second assumption. Verbal aids accomplish different functions: regulatory and explanatory. The functional approach further assumes that there are two types of tutoring verbal aids: some fulfilling a regulatory function and others fulfilling an explanatory function. These types of aid differ in two respects. First, they differ in the learning process they facilitate. The regulatory aids help learners in self-regulating their learning. That is to say, they assist them in setting goals, tell them when, why, and which strategy they should execute, or check whether the understanding they are attaining is appropriate. Therefore, the regulatory aids are intended to guide learners’ decision making, and they do so by providing instructions on how to proceed with learning. Examples of regulatory aids are: setting a goal (“now you must think about why plants have water, air, and soil”), proposing a strategy to reach a goal (“first of all you must gather all the things you know, then you have to organize them”), detecting misconceptions (“are you sure oaks spin to the sun?”), giving positive/negative feedback (“. . .and water, that’s good!”, “no, no, petals are not part of trees”, respectively),3 or those mentioned above (Graesser et al., 2004; Moreno and Mayer, 2005). The explanatory aids, in turn, help learners in executing integration processes. That is, the explanatory aids are intended to revise and repair learners’ mental representation when the understanding they have gained is not appropriate, which is common when learning scientific topics (e.g., Chi, 2000). They do so by providing elaborations on the to-be-learned contents. Examples of explanatory aids are: reformulating information from prior contributions (“. . .so plants need water to survive and to make their food”), adding critical pieces of information not explicitly mentioned before (“. . .without water, air, and soil plants die”), or those aids mentioned above (Aleven and Koedinger, 2002; Renkl, 2002; Seufert, 2003).

Second, the regulatory and the explanatory aids differ in the way that learners process them. Learners receiving regulatory aids have to modify their behavior and do what the aids ask them to do (i.e., revise their learning process). This means they have to stop doing something (e.g., “no, that is not what we were looking for”) and start performing a different action (e.g., “remember we were looking for plants’ needs”). Conversely, learners receiving explanatory aids have to elaborate a mental representation from the contents given through the explanation (e.g., “. . .so plants need water, soil, and sunlight to live”) without being expected to perform any specific action. Note that this difference can also be observed between the two types of speech acts distinguished by the linguist Givón (1984). When receiving “manipulative acts”, which are those manipulating the behavior of others (e.g., “pass me the salt”), hearers are expected to change their behavior; on the other hand, when receiving “informative acts”, which are those transferring information to others (e.g., “yesterday it rained”), hearers are expected to change their mental representation of the world.

3 Examples extracted from real classroom interactions between a teacher and her students (Sánchez et al., 2008). It was a lesson on plants.

In sum, regulatory aids help learners to self-regulate their learning by providing instructions on how to proceed, whereas explanatory aids help learners to revise their understanding of the ideas from the to-be-learned contents by providing elaborations on them. Learners, in turn, have to decide whether to perform the action suggested by the regulatory aid or to process the elaboration in the explanatory aid.

Third assumption. There is an interaction between modalities and types of verbal aid. The third assumption is that the specific advantages of each modality are relevant or not depending on the type (or function) of the tutoring aid. Recall that the first assumption suggested that each modality has particular advantages; now the question is: are those advantages always equally critical? The answer proposed here is that it depends on the type of the verbal aid.

When provided with explanatory aids, learners have to assimilate contents. More accurately, they have to deal with complex explanations, which are difficult to understand, describing complex phenomena. In so doing, the possibility of controlling the text could be helpful. Regulatory aids are not difficult to understand, as they give simple instructions about what to do. Therefore, control could be less helpful in regulatory aids. Simply put, control might be critical only when interpreting explanatory aids.

When provided with regulatory aids, learners have to change their behavior following the instructions the tutor is giving. Learners then have to decide whether or not it is worthwhile to follow the instructions. In so doing, the additional information conveyed through prosodic expressiveness, such as the speaker’s communicative intention (as it makes it easy to recognize whether the utterance is a request or a statement) and attitude toward the action to perform (value, urgency, and so on), could be helpful since it clarifies whether the action is worthwhile or not. Explanatory aids do not expect the learner to decide whether to perform a specific action. Therefore, expressiveness could be less helpful in explanatory aids. Simply put, prosodic expressiveness might be critical only when following regulatory aids.
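The interaction described in the third assumption can be read as a lookup from the function of an aid to the modality the framework favors. The sketch below is only an illustration of that reading: the names `AidFunction`, `Modality`, and `hypothesized_modality` are hypothetical and do not come from the study, and the mapping encodes the framework's prediction, not an established result.

```python
from enum import Enum


class AidFunction(Enum):
    """The two functions of verbal aids distinguished by the functional approach."""
    REGULATORY = "regulatory"    # instructions on how to proceed with learning
    EXPLANATORY = "explanatory"  # elaborations on the to-be-learned contents


class Modality(Enum):
    VISUAL = "visual"      # controllable: pace, rereading, selective focus
    AUDITORY = "auditory"  # expressive: prosody conveys intention and attitude


# Hypothesized pairing (an assumption of the framework, to be tested empirically):
# control matters for complex explanations, expressiveness for instructions.
HYPOTHESIZED_MODALITY = {
    AidFunction.EXPLANATORY: Modality.VISUAL,
    AidFunction.REGULATORY: Modality.AUDITORY,
}


def hypothesized_modality(function: AidFunction) -> Modality:
    """Return the modality the framework predicts to be more helpful."""
    return HYPOTHESIZED_MODALITY[function]
```

For instance, `hypothesized_modality(AidFunction.EXPLANATORY)` returns `Modality.VISUAL`, which mirrors the claim that control might be critical only when interpreting explanatory aids.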

2.1. Overview of the study and hypotheses

A set of experiments in which participants learned from a CBL was carried out. Both regulatory and explanatory aids were incorporated into a CBL teaching plate tectonics. There were different versions of the CBL. Each version had the two types of help: regulatory and explanatory aids. CBL versions differed in the way the aids were presented. One experimental variable was the modality of regulatory aids (visual/auditory), and another was the modality of explanatory aids (visual/auditory). Both variables were factorially combined. Therefore, there were four versions of the CBL aids: auditory-regulation and visual-explanation, visual-regulation and auditory-explanation, visual-regulation and visual-explanation, and auditory-regulation and auditory-explanation. After using the CBL, participants answered retention and inference questions about plate tectonics.
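The factorial combination of the two variables can be enumerated mechanically. The following is a minimal sketch, assuming each version is fully described by the modality of its regulatory aids and the modality of its explanatory aids; the function names `cbl_versions` and `label` are hypothetical, and the short labels follow the pattern of the condition abbreviations used in the paper (e.g., AR-VE).

```python
from itertools import product

MODALITIES = ("auditory", "visual")


def cbl_versions():
    """Cross the modality of regulatory aids with that of explanatory aids (2 x 2)."""
    return [
        {"regulation": regulation, "explanation": explanation}
        for regulation, explanation in product(MODALITIES, repeat=2)
    ]


def label(version):
    """Build a short condition label such as 'AR-VE' from a version description."""
    regulation_initial = version["regulation"][0].upper()
    explanation_initial = version["explanation"][0].upper()
    return f"{regulation_initial}R-{explanation_initial}E"


if __name__ == "__main__":
    for version in cbl_versions():
        print(label(version), version)
```

Crossing two two-level variables yields exactly the four versions listed above, one per combination of regulation and explanation modality.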

In the light of the framework suggested above, several predictions were formulated.

– Visual explanatory aids will be better than auditory explanatory aids. This is called henceforth the visual-explanation superiority hypothesis.

– Auditory regulatory aids will be better than visual regulatory aids. This is called henceforth the auditory-regulation superiority hypothesis.

More specifically, we expected participants receiving visual explanatory aids to outperform those receiving auditory explanatory aids; we further expected those receiving auditory regulatory aids to outperform those receiving visual regulatory aids. Note that every participant was provided with aids; nevertheless, only those receiving configurations consistent with the functional approach would fully profit from them. This would be true of performance on retention questions, since aids would allow participants to clarify and reinforce basic concepts regarding plate tectonics. It would also be true of performance on inference questions, since aids would allow participants to revise their mental models (Chi, 2000) in such a way that it would be possible to apply the acquired knowledge to novel situations (Kintsch, 1994, 1998).

2.2. Reinterpreting prior research on modality and verbal aids

According to the framework presented above, it is possible to reinterpret prior research on the topic. This reinterpretation may also be useful in stressing the novelty of the study reported here. As was mentioned before, Moreno and her colleagues (Moreno and Mayer, 2002; Moreno et al., 2001) provided participants with verbal aids in the form of explanatory feedback. Participants designed a plant that was expected to survive under certain conditions. Once the CBL described these conditions, participants were asked to choose among several options in order to design the plant. They chose between different types of roots, stems, and leaves. Then they received feedback on each choice explaining why their choices were correct/incorrect and providing the building blocks of the correct answer. The explanatory feedback was presented either visually or auditorily. It follows from our theoretical framework that visual explanations should be better than auditory ones, but this was not the case: actually, the auditory explanations were better than the visual ones. A careful examination of these explanations may resolve the problem: explanations were simple and easy to understand. They may be considered simple because they comprised few ideas and relations between them (see Appendix A); similarly, they may be considered easy because they were easily deducible from the description of the conditions (under which the plants would grow). Therefore, we explain the visual non-advantage in Moreno et al.’s studies as the result of using simple and easy explanations, which made the control available in the visual modality less helpful.

The same might be true for the results in Atkinson (2002). Participants learned to solve word problems on proportion while receiving explanatory aids. More specifically, they were given worked-out examples, which were intended to elicit learning. The explanatory aids clarified the rationale behind the solution states. After the learning phase, participants were asked to solve similar and different problems as a measure of near and far transfer, respectively. Visual explanatory aids were worse in near transfer and equal in far transfer to auditory explanatory aids (Experiment 1), or there were no differences in near or far transfer (Experiment 2). The non-advantage of the visual explanatory aids could be explained in the same fashion as before: explanations may be considered both simple and easy to understand. They were simple as they comprised few ideas and relations among them (see Appendix A); they were easy because they displayed rationales easily deducible from the solution states (in fact, people learn rapidly and effectively from worked-out examples; e.g., Sweller and Cooper, 1985).

This interpretation is in line with prior research on text comprehension. Easy and/or simple texts are comprehended equally well in the visual and the auditory modality (Gernsbacher et al., 1990; Kintsch and Kozminsky, 1977; Kintsch et al., 1975; Smiley et al., 1977) but, when texts are difficult and/or complex, the visual presentation yields better performance than the auditory presentation (see Green, 1981). Although this interpretation is consistent with prior research (and even with common sense), it needs to be further supported for several reasons. First, the data available come from only one experiment (the one reported in Green, 1981). Second, the experiment was carried out using expository texts rather than explanatory aids in CBLs, which are the concern here.

As we already mentioned, Graesser et al. (2003) provided participants with tutoring verbal aids through either the visual or the auditory modality. Participants learned computer literacy with the help of verbal aids. Aids were diverse. For instance, there were aids such as splicing in a correct response or elaborating upon learners’ contributions as well as aids such as prompting the learner to generate an explanation or giving reinforcing feedback (see Person et al. (2000) for a detailed description). Results showed that the modality of these aids was not significant. This requires an explanation. According to our explanatory/regulatory distinction, aids in Graesser et al. can be categorized; in so doing, the problem may be resolved. When participants gave an incorrect or an incomplete answer to a query from the CBL tutor, the tutor provided either the correct response or an elaboration on the learner’s response, respectively. These may be considered explanatory aids since they bear on the to-be-learned contents in the CBL, reinforcing the integration among its ideas. When the CBL tutor prompted the participants, it wanted them to continue their answering process; similarly, when it gave reinforcing feedback, it wanted the participants to stay on the same path of reasoning. In both cases the CBL tutor did not draw on prior contents; it regulated participants’ learning process instead. If this argument is correct, then an interpretation of Graesser et al.’s results is that they were giving explanatory and regulatory aids without tracking the extent to which each type was being employed. Confounding the types of aid made it difficult to interpret the results.

The present study extends prior research because it (a) used complex and difficult explanations, (b) focused on verbal aids rather than texts, and (c) kept the use of regulatory and explanatory aids under experimental control. The explanatory aids incorporated into the learning materials were complex, as they comprised a large number of ideas and interconnections (see Appendix A); and we expected them to be difficult, as they dealt with unfamiliar contents and treated them in depth (note that these contents were typically misunderstood by learners; see below, Section 3.5). Controlling the use of regulatory and explanatory aids was useful in determining the causes of our results.

3. Experiment 1: The effect of the visual and the auditory modalities of different verbal aids on learning

In this experiment the impact of the modality of verbal aids on learning was examined. The participants learned plate tectonics from a CBL and received regulatory and explanatory aids presented in different modalities. Then they solved retention and inference tests. Those participants receiving aids presented in a fashion compatible with the framework presented above were expected to outperform their counterparts.


4 One could argue that this misunderstanding is not the result of a failure in learning but a problem in the CBL we have designed. However, this is not the case as the same misunderstanding has been identified using different materials.

E. Sanchez, H. Garcia-Rodicio / Interacting with Computers 20 (2008) 545–561 549

3.1. Participants and design

Fifty-two undergraduate students enrolled in educational psychology courses at the University of Salamanca (Spain) participated in this study. They were randomly assigned to one of the four conditions. Thirteen participants served in the auditory-regulation and visual-explanation condition (AR-VE), 13 in the visual-regulation and auditory-explanation condition (VR-AE), 13 in the visual-regulation and visual-explanation condition (VR-VE), and 13 in the auditory-regulation and auditory-explanation condition (AR-AE). The mean age of the sample was 21. Eighty-one percent of the sample consisted of female students and 19% of male students. Although there was a majority of women, this might not represent a problem for two reasons. First, the sample is representative, as it corresponds to the proportion of women in the faculty where the participants were recruited. Second, different studies conducted in our lab suggested that gender is not a critical variable (a finding also supported in Buisine and Martin (2007)). All participants had experience in using computers.

The experiment had a 2 × 2 factorial design with the modality of the regulatory aids (visual/auditory) and the modality of the explanatory aids (visual/auditory) as the between-subjects factors. Retention and inference questions were the dependent variables. General reading comprehension skill and prior domain knowledge were used as control variables.
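The crossing of the two factors and a balanced allocation of participants can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used in the study: the function names, the seed, and the shuffled-list approach are assumptions made for the example.

```python
import itertools
import random

# The two between-subjects factors of the 2 x 2 design.
REGULATORY = ("A", "V")   # modality of the regulatory aid (auditory/visual)
EXPLANATORY = ("A", "V")  # modality of the explanatory aid (auditory/visual)


def build_conditions():
    """Cross the two factors into condition labels such as 'AR-VE'."""
    return [f"{r}R-{e}E" for r, e in itertools.product(REGULATORY, EXPLANATORY)]


def assign(n_participants, seed=0):
    """Hypothetical balanced random assignment (the seed is illustrative)."""
    conditions = build_conditions()
    per_cell, remainder = divmod(n_participants, len(conditions))
    if remainder:
        raise ValueError("participants cannot be split evenly across cells")
    slots = conditions * per_cell
    random.Random(seed).shuffle(slots)
    return slots  # slots[i] is the condition of participant i


allocation = assign(52)
cell_sizes = {c: allocation.count(c) for c in build_conditions()}
```

With 52 participants and four cells, each condition receives 13 participants, which matches the cell sizes reported above.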

One might argue that there was more than one difference between the visual and the auditory conditions (which would put the experimental control into question). The visual condition (a) provides control to learners and (b) has no expressiveness. The auditory condition (a) provides no control to learners and (b) has expressiveness. Therefore, there are two differences, rather than one, between the conditions. This might indicate that the experimental control is not warranted. However, there are two reasons why this is not problematic. First, the auditory modality could not have the same level of control as the visual modality, whereas the visual modality could not have the same level of expressiveness as the auditory modality. This means that control and visual modality cannot be separated; and the same goes for expressiveness and auditory modality. Second, our goal was more to contrast the impact of control and expressiveness in different situations (i.e., type of aids) than to assess the specific impact of each feature.

3.2. Materials

For each participant, materials consisted of a prior knowledge test, a general reading comprehension skill test, the computerized learning materials, a recall questions test, and an inference questions test.

The prior knowledge test was a paper-and-pencil test comprising six open-ended questions. Questions were about geology, testing directly on the lesson’s contents and on similar issues. Questions were: “What is a plate? Where are plates located in the Earth’s internal structure?”, “Is it possible for continents to be permanently moving? Explain why.”, “Is it possible to find fossils on the summit of a mountain? Explain why.”, “Explain how volcanoes are formed.”, “How are mountains formed?” One additional question was a naming task in which students had to name the parts of an illustration. The Earth’s internal structure was shown, distinguishing its three layers.

The general reading comprehension skill test consisted of eight multiple-choice questions about a narration, which was presented on the computer screen. It was the Gernsbacher and Varner (1988) test, translated by Díez and Fernández (1997). The narration was about a trader who travelled to Cabo Verde looking for new trade routes. The trader helped people in Cabo Verde to kill the rats living there. Participants solved questions about the narration such as “How did the King of Cabo Verde feel when the cats killed all the rats?” or “Why were there soldiers holding lances during the dinner?”

The computerized learning materials consisted of a CBL including two sections: a lesson on plate tectonics, in which participants learned geology, and an aid episode, in which participants received support for improving their learning (see Fig. 1). CBLs were developed using the ToolBook II 6.1 (Asymetrix Corporation, 2001) software for hypermedia design. The first section, i.e., the lesson, consisted of a verbal–pictorial presentation on geology (specifically, on plate tectonics), which was the same for the four groups. It included animations with concurrent narrations and was presented on the computer (see Fig. 1). The lesson had a total duration of 386 s. It described how plate tectonics work, including the following data: the Earth’s layers; magma currents; ridge mechanisms; specific features, processes, and effects of the plate collision in the Andes; specific features, processes, and effects of the plate collision in the Himalaya. The target mental model we wanted the participants to construct was: “The Earth’s core is a big sphere made of iron which is very warm. The core makes the magma in the mantle heat up and, thus, approach the Earth’s surface. The magma inside the Earth surfaces through ridges and pushes the plates, which are moving away from each other. When moving, plates can collide with other plates. There are different kinds of collisions depending on the plates engaged in the crashes. Plates can collide and one can sink or subduct inside the Earth originating mountains with volcanoes; or they can collide and move vertically originating mountains without volcanoes. When a plate sinks it is destroyed and becomes magma again. The magma surfacing through the ridges gets colder and solidifies. This creates new plate material.”

The second section, i.e., the aid episode, was given after the lesson. It comprised a regulatory aid followed by an explanatory aid (see Fig. 2). The aid episode was designed to point out and revise a typical misunderstanding that learners develop when learning plate tectonics for the first time. Prior experiments carried out in our lab (see the pilot study below, Section 3.5, and Sánchez et al., accepted) allowed us to identify the learners’ typical misunderstandings. Specifically, one of them is that students learning plate tectonics for the first time tend to underestimate the differences between the Andes and the Himalaya plate collisions (see footnote 4). This makes learners merge the specific features of each type of collision. It is worth mentioning that the misunderstanding is not held prior to reading but created during reading. The regulatory aid assumed that students would fall into this flawed idea and pointed it out, stressing its inaccuracy. Therefore, it was intended to facilitate the self-regulation process of monitoring understanding (e.g., Moreno and Mayer, 2005). The subsequent explanatory aid clarified the differences between the two types of plate collisions. That is, it provided learners with the building blocks required to carry out the revision of their mental representations (Chi, 2000). The explanation stressed the features of the plates involved in the Andes and the Himalaya collisions, the processes each of the plates follows, and the consequences those processes have on the Earth’s surface. Words included in both the regulatory aid and the explanatory aid were the same for the four versions (namely, AR-VE, VR-AE, VR-VE, AR-AE). The only difference was the modality in which they were presented.

There are three aspects of the aid episode that require a close examination. First, in order to develop the auditory sections, an experienced teacher was recruited (the first author of this paper). He recorded both the regulations and the explanations and the resulting audio recordings were attached to the CBL. In both types of aid, the teacher who made the recordings attempted to employ auditory expressiveness. Two judges selected those recordings which they considered more natural and convincing. The criteria for considering a recording convincing were that it must clarify the speaker’s communicative intention and provide additional cues about his/her attitude toward the message. In the regulatory part of the aid episode, a convincing recording was one clarifying that the communicative intention was to warn about the possible misunderstanding (regarding differences between the Andes and the Himalaya) and showing an attitude of urgency and importance (i.e., the speaker stressed the urgency of revising the misunderstanding and the seriousness of the problems related to thinking about differences between the Andes and the Himalaya as something trivial). In the explanatory part, a convincing recording was one talking about the differences between the Andes and the Himalaya at an intelligible pace and in an intelligible accent, clarifying that the intention was to make some statements (not to warn, request, or others), and showing a neutral attitude. A second aspect regarding the design of the aid episode is that there were no pictures related to it. This means that learners did not have to mentally integrate verbal information in the aid episode with any picture. This is the reason why the modality principle was not applicable in this situation. Finally, the flaws in participants’ mental representations were not detected by a test but directly assumed. This might be considered inappropriate at first glance. However, we assumed learners to have misunderstood the materials as this is very common (83% in prior studies) and, moreover, there is evidence indicating that it is an effective strategy (see Section 3.5).

Fig. 1. Screenshots from the computer-based learning environment used in the experiments. Screenshots (a) and (b) correspond to the lesson on plate tectonics, which was identical for all conditions. Screenshots (c) and (d) correspond to the aid episode either in the auditory modality (c) or in the visual modality (d).

The retention test was a paper-and-pencil test containing eight open-ended questions. Questions were about the contents presented in the learning materials. They asked the learners to recall the critical aspects of the material: “Which are the main plate movements?”, “What is a trench?”, “Which are the similarities between the Andes and the Himalaya plate collisions?”, “Is any type of plate material destroyed in the subduction?”, “What are the differences between the Andes and the Himalaya plate collisions?”, “What happened to the continental plate in the Andes collision?”, “What happened to the ocean floor in the Himalaya collision?”, “Why does the oceanic plate sink in the Andes collision?”

The transfer test was also a paper-and-pencil test and comprised eight questions. They described a hypothetical scenario within which the student had to predict the results. Questions were: “How could a mountain like the Himalaya be formed in Holland?”, “If the Andes’ volcanoes stopped erupting, what would you think about it?”, “If new plate material is created permanently in the ridges, why does the Earth’s surface remain the same size?”, “Is it possible for the Himalaya mountains to have volcanoes in the future?”, “What should the coast near the Andes chain look like?”, “Imagine you find a fossil on the summit of a mountain, is it a mountain like the Andes or the Himalaya?”, “Is it possible for the Himalaya chain to have a ridge in the future?”, “Imagine the mountains in the Andes stop growing, what would happen?” There was also a task including an illustration in which participants had to place a volcano and explain why it was there.

3.3. Procedure

Participants were tested simultaneously in groups of 10–15 participants per session. Each participant was seated at an individual computer with headphones. Learning materials were presented on Acer personal computers (Intel Pentium III processor), which included 17 in. flat monitors. We used Sony headphones. After receiving some basic instructions, prior knowledge tests were delivered and participants solved them in silence. This took no more than 12 min. Basic instructions were as follows: “Thank you for participating in this experiment. We are interested in how people learn from computer-based materials. You will be asked to use a computer-based learning environment on plate tectonics. After viewing the materials you will have to solve two sets of questions. Please, solve all the tests in silence. Do not forget to request your tests when you have finished viewing the presentation. Before using the computer materials, we want you to fill in a prior knowledge test on geology. Please, try to remember all the things you know about it. You must do it silently.” When everyone had finished filling in the prior knowledge test, participants started using the CBL. An experimenter ran the computerized materials on the participant’s computer. The experimenter selected the version of the CBL at random, so that participants were randomly assigned to each condition. The CBL lasted about 8 min, considering both the lesson (386 s) and the aid episode. Participants first viewed the lesson (to-be-learned) contents; afterwards they received the aid episode including the regulatory and the explanatory aids. The regulatory aid was intended to detect a typical misunderstanding usually displayed by students learning plate tectonics (based on prior studies) whereas the explanatory aid was intended to revise it.

Fig. 2. The aid episode presented in visual modality. The slide (a) corresponds to the regulatory aid, which identifies a typical misunderstanding and stresses its inaccuracy. The slide (b) corresponds to the explanatory aid, which provides an elaboration on the to-be-learned contents in order to clarify some aspects.

Whenever an aid was presented visually, participants could use it as follows. Because we were interested in exploring the effect of the control afforded by the visual modality, participants viewing aids on the screen were allowed to take as long as they wanted in doing so (times were recorded). Moreover, the whole text was presented all at once on the screen so that participants could freely manage their time, exploring the segment of the text they wanted. As was argued before, although participants in visual conditions could have control over visual aids, this possibility would be useful only for explanations (not for regulations).

Whenever an aid was presented auditorily, participants could use it as follows. Audio recordings were played once. The auditory regulatory aid lasted 33 s whereas the auditory explanatory aid lasted 66 s. Audio recordings were created by the first author and they were judged as natural and convincing by blind judges, as was explained before (see Section 3.2). Recordings were intended to show maximum expressiveness.

Participants in the AR-VE condition listened to the regulatory aid through their headphones. The audio recording was played once. Then, participants viewed the explanatory aid on the computer screen. They could freely manage their time. Participants in the VR-AE condition viewed the regulatory aid on the computer screen, having control over their reading. Then, they listened to the explanatory aid once. Participants in the VR-VE condition viewed both the regulatory and the explanatory aids on the screen, having control over their reading. Participants in the AR-AE condition listened to both the regulatory and the explanatory aids. Audio recordings were played once.

After using the CBL (including the lesson and the aid episode) participants were given the recall questions test. Solving the test took no more than 10 min. Then they were given the inference questions test and they had no more than 15 min to solve it. Finally, they were tested on general reading comprehension skill. They read a story about an explorer on the computer screen and then answered a set of multiple-choice questions on paper. After this final test was collected, participants were dismissed. Each session lasted about 60 min.

3.4. Scoring

A rater scored all the questionnaires unaware of the condition of each participant. Approximately 30% of the tests were also scored by a second rater. Interrater agreement was above .85 in the prior knowledge, the retention, and the transfer tests. Disagreements were resolved by consensus.

A template with possible answers was developed for the prior knowledge test. It included accurate, correct but incomplete, and incorrect answers. They yielded 2, 1, or 0 points, respectively. Accurate answers for the question about the plates in the prior knowledge test were those stating that they are blocks of the superficial layer of the Earth. Accurate answers for the plates’ location question included saying that they are above the mantle, as they are part of the superficial layer. Accurate answers for the question about continents moving included stating that continents move because they are also plates or part of a plate. Accurate answers for the question about fossils were those stating that mountains comprise blocks that were under the sea in the past. Accurate answers for the question about volcano formation included mentioning plate collision and cracks in one of the plates involved as the causes. Accurate answers for the question about mountain formation were those mentioning plate collision. Each element correctly named in the question about the Earth’s layers yielded 1 point. Total scores ranged from 0 to 15 in the prior knowledge test.

Each right answer in the general reading comprehension skill test yielded 1 point. Wrong answers yielded 0 points. Total scores ranged from 0 to 8.

A template with possible answers was created for the retention test. Accurate answers yielded 2 points, accurate but incomplete answers yielded 1 point, and incorrect answers yielded 0 points. Accurate answers for the question about main plate movements included mentioning divergent and convergent movements (distinguishing the Andes and the Himalaya types). Accurate answers for the question about the trench were those stating that it is a trough where the oceanic plate material is destroyed. Accurate answers for the question about similarities between the Andes and the Himalaya included stating that in both cases two plates collide and new mountains are formed. Accurate answers for the question about plate destruction were those saying that the oceanic plate is destroyed in the Andes collision. Accurate answers for the question about differences and similarities between the Andes and the Himalaya collisions included mentioning the plates involved, the processes, and the effects on the Earth’s surface of each collision type. Accurate answers for the question about the continental plate in the Andes were those stating that it suffers from oceanic pressure and, hence, cracks are formed on it. Accurate answers for the question about the ocean floor in the Himalaya included saying that it was destroyed, dragged, and finally integrated into the mountain formed. Accurate answers for the question about the sinking of the oceanic plate in the Andes were those stating that the oceanic plate sinks as a result of its weight. The range of total scores was 0–16 in the retention tests.

A template with possible answers was also developed for the transfer test. Once again accurate answers yielded 2 points, accurate but incomplete answers yielded 1 point, and incorrect answers yielded 0 points. Accurate answers for the question about locating a volcano in a picture included locating it and explaining the reason behind that location. Accurate answers for the question about a mountain in Holland included mentioning that there should be two continental plates colliding and finally folding upwards. Accurate answers for the question about volcanoes without eruptions included saying that the collision is over so there is no longer subduction, no pressure, and, then, no cracks through which magma emerges. Accurate answers for the question about volcanoes in the Himalaya were those arguing that there will be no volcanoes in the Himalaya because no plate sinks and, hence, there is no pressure on the continental plates forming cracks, so that it is not possible for the magma to surface. Accurate answers for the question about the coast in the Andes included mentioning that it should have craggy cliffs. Accurate answers for the question about fossils were those choosing the Himalaya as the place where it is possible to find fossils and explaining why. Accurate answers for the question about a trench in the Himalaya were those mentioning that it is not possible for the Himalaya to have a trench because the two plates involved in the collision are equal and, hence, none of them sinks. The range of total scores was 0–16 in the transfer test.
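The 2/1/0 rubric used for the open-ended tests can be sketched in a few lines. This is only an illustration of how totals follow from the rubric: the rating labels, function names, and the example ratings are hypothetical and do not come from the study.

```python
# Points awarded per answer category, following the 2/1/0 rubric in the text.
POINTS = {"accurate": 2, "incomplete": 1, "incorrect": 0}


def score_test(ratings):
    """Sum the points for a list of per-question ratings."""
    return sum(POINTS[rating] for rating in ratings)


def max_score(n_questions):
    """Highest possible total for a test scored with the 2/1/0 rubric."""
    return n_questions * POINTS["accurate"]


# Hypothetical ratings for one participant on an eight-question test.
example_ratings = [
    "accurate", "incomplete", "incorrect", "accurate",
    "accurate", "incomplete", "incomplete", "incorrect",
]
example_total = score_test(example_ratings)
```

Eight items scored this way give a maximum of 16 points, which is the upper bound reported for the retention and transfer tests.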

3.5. Pilot study

A pilot experiment was carried out with the goal of testing the effectiveness of the aid episode and creating a list of typical misunderstandings committed by learners. It should be noted that there is prior evidence of the effectiveness of the aid episode (Sánchez et al., in press). However, an additional test was conducted in order to reinforce that finding. In the same study (Sánchez et al.) it was also possible to collect evidence about learners’ common misunderstandings. Thus, the aim was again to reinforce these prior findings.

In the pilot experiment participants similar to those participating in the experiments below were recruited. Similarity refers here to age, demographic aspects, general reading comprehension skill, and prior knowledge on plate tectonics. They were randomly assigned to one of two conditions: a CBL including the lesson on plate tectonics and the aid episode (AID) and a CBL including the same lesson on plate tectonics without the aid episode (noAID). Thirteen participants served in the AID condition and 13 in the noAID condition.

Each participant received the same set of materials. The set comprised a prior knowledge test, a general reading comprehension skill test, the computerized learning materials, a retention test, and an inference test. Prior knowledge, general reading comprehension skill, retention, and inference tests were identical to those described before. The computer learning materials were the AR-VE version (already described) for the AID condition and an identical version, except that the aid episode (including the regulatory and the explanatory aids) was removed, for the noAID condition.

The experimental and the scoring procedures were the same as those described in prior sections. All the variables were analyzed using a t-test and an alpha of .05 (throughout this article). Tests were two-tailed. Results indicated that there were no significant differences in prior knowledge scores between the AID condition (M = 6.77, SD = 2.68) and the noAID condition (M = 5.92, SD = 2.66), t(24) = 0.808, p > .05, MSE = 7.135. The same goes for general reading comprehension skill performance, with participants in the AID condition (M = 5.15, SD = 1.63) scoring similarly to those in the noAID condition (M = 4.62, SD = 1.56), t(24) = 0.863, p > .05, MSE = 2.532. With respect to performance in the retention test, participants in the AID condition (M = 10.31, SD = 3.25) recalled basic concepts better than their counterparts (M = 7.00, SD = 1.53), t(24) = 3.321, p < .01, MSE = 6.449, η² = .31. The same was true in the inference test: the AID condition (M = 9.62, SD = 2.96) outperformed the noAID condition (M = 4.23, SD = 2.09), t(24) = 5.361, p < .001, MSE = 6.558, η² = .54. This indicates that participants in the AID condition constructed better mental representations and were better able to apply them to solve the inference tasks. To rule out the possibility that differences in retention and inference scores were due to differences in prior knowledge and/or general reading comprehension skill we conducted an analysis of covariance. Prior knowledge and general reading comprehension skill were entered as covariates. Results were the same as before: the impact of the aid episode remained significant both in retention, F(1,22) = 9.045, p < .01, MSE = 5.961, η² = .02, and in inference scores, F(1,22) = 27.561, p < .001, MSE = 5.683, η² = .10.
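The t statistics for the retention and inference tests can be approximately recomputed from the reported means, standard deviations, and group sizes. The sketch below assumes a pooled-variance independent-samples t-test and the conversion η² = t²/(t² + df); both are standard formulas but are assumptions about how the reported values were obtained, and the function names are illustrative.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic with pooled variance.

    Returns the t value, its degrees of freedom, and the pooled variance.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df, pooled_var


def eta_squared_from_t(t_value, df):
    """Eta-squared for a two-group comparison: t^2 / (t^2 + df)."""
    return t_value ** 2 / (t_value ** 2 + df)


# Summary statistics taken from the pilot study (AID vs. noAID, n = 13 each).
t_ret, df_ret, var_ret = pooled_t(10.31, 3.25, 13, 7.00, 1.53, 13)
t_inf, df_inf, var_inf = pooled_t(9.62, 2.96, 13, 4.23, 2.09, 13)
eta_ret = eta_squared_from_t(t_ret, df_ret)
eta_inf = eta_squared_from_t(t_inf, df_inf)
```

Under these assumptions the recomputed t values, pooled variances, and effect sizes agree with the reported figures up to rounding of the published means and standard deviations.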

Common misunderstandings were also analyzed. One of the likeliest misunderstandings was the confusion between the Andes and the Himalaya collisions. More precisely, 83% of the participants underestimated the specific aspects of each collision. This proportion comprises only those participants who did not receive the aid episode. The proportion includes both participants in the noAID group of this pilot experiment and of previous ones (Sánchez et al., in press). The confusion between the Andes and the Himalaya collisions was coded as follows: the participant expressed (in his/her answers to either the retention or the inference tests) that the Himalaya is the result of a crash between a continental and an oceanic plate, the participant expressed that it is possible to find volcanoes in the Himalaya, or the participant did not distinguish between continental and oceanic plates or mixed up their features. Participants in the AID condition were less likely to show this misunderstanding, χ²(1) = 27.30, p < .001.

One might argue that misunderstandings are the result of a poorly designed CBL. Nevertheless, the same failures in comprehension have been identified using different CBLs. This is the reason why we believe misunderstandings are due to learners’ limitations, not materials’ limitations.

To sum up, participants receiving the aid episode (in the form of an auditory regulatory aid plus a visual explanatory aid) learned more from the CBL and constructed better mental representations. This was true in retention and in inference tests, and also when examining misunderstandings. Specifically, participants who did not receive the aid episode mixed up the features of the Andes and the Himalaya plate collisions. Given that the aids were effective, it seemed reasonable to explore which configuration of the modality of those aids would be the best.


Table 1. Means (M) and standard deviations (SD) of all conditions in control and learning measures: Experiment 1

| Condition | Prior knowledge M | Prior knowledge SD | General reading comprehension skill M | General reading comprehension skill SD | Retention test M | Retention test SD | Inference test M | Inference test SD |
|---|---|---|---|---|---|---|---|---|
| AR-VE | 5.88 | 2.87 | 3.85 | 1.63 | 8.35 | 2.76 | 7.23 | 2.86 |
| VR-VE | 5.92 | 2.29 | 5.46 | 0.97 | 8.46 | 1.63 | 6.85 | 1.95 |
| AR-AE | 6.00 | 2.24 | 5.38 | 0.77 | 7.54 | 0.85 | 4.23 | 2.65 |
| VR-AE | 6.73 | 2.89 | 5.54 | 1.20 | 8.00 | 2.01 | 4.08 | 1.98 |

Note. The maximum scores were 15 for prior knowledge, 8 for general reading comprehension skill, 16 for retention test, and 16 for inference test.


3.6. Results

The variables under analysis were prior knowledge, general reading comprehension skill, retention, and transfer test scores. Times taken on the visual regulatory and explanatory aids were recorded and are reported below. Retention and transfer test scores were analyzed using a factorial analysis of variance with the modality of the regulatory aid (visual/auditory) and the modality of the explanatory aid (visual/auditory) as the between-subjects factors. Prior knowledge and general reading comprehension skill scores were entered as covariates. Eta-squared (η²) was calculated as a measure of the effect size whenever there was a significant effect. All the scores are shown in Table 1.

3.6.1. Control variables

Performances of all conditions in prior knowledge and general reading comprehension skill tests are shown in Table 1. Two one-way analyses of variance were conducted in order to ensure that all conditions had the same level in these variables. With respect to prior knowledge there were no significant differences, F(3,48) = 0.311, p > .05, MSE = 6.704. With respect to general reading comprehension skill there were significant differences, F(3,48) = 6.092, p < .01, MSE = 1.401, η² = .03. This is the reason why we decided to enter this variable as a covariate in the subsequent analyses. To rule out the possibility that prior knowledge was affecting performance in retention and inference we decided to include it as a covariate as well (even though there were no significant differences between conditions in this variable).
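The degrees of freedom of these tests follow directly from the sample size, the number of conditions, and the number of covariates. A small sketch, using the standard formulas for between-subjects designs (the function names are illustrative, not taken from the study):

```python
def anova_df(n_total, n_groups):
    """Degrees of freedom (between, within) for a one-way ANOVA."""
    return n_groups - 1, n_total - n_groups


def ancova_error_df(n_total, n_cells, n_covariates):
    """Error degrees of freedom for a between-subjects ANCOVA."""
    return n_total - n_cells - n_covariates


# Experiment 1: 52 participants, four conditions, two covariates.
one_way = anova_df(52, 4)
factorial_error = ancova_error_df(52, 4, 2)

# Pilot study: 26 participants, two conditions, two covariates.
pilot_error = ancova_error_df(26, 2, 2)
```

These values correspond to the F(3,48) tests on the control variables and to the error terms F(1,46) and F(1,22) reported for the covariance analyses in Experiment 1 and in the pilot study.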

3.6.2. Retention test

The conditions including the visual explanatory aid (namely, the AR-VE and the VR-VE conditions) scored higher than their counterparts (namely, the AR-AE and the VR-AE conditions) in the retention test. An analysis of covariance revealed that the modality of the explanatory aid was a significant factor, F(1,46) = 4.822, p < .05, MSE = 2.916, η² = .09. This indicates that participants receiving visual explanatory aids recalled basic concepts about plate tectonics better than those who received auditory explanatory aids. Scores in retention tests were similar between the auditory regulatory aid conditions (namely, the AR-VE and the AR-AE conditions) and the visual regulatory aid conditions (namely, the VR-VE and the VR-AE conditions). Not surprisingly, an analysis of covariance showed that the modality of the regulatory aid was not significant, F(1,46) = 0.536, p > .05. This means that participants receiving the auditory regulatory aid did not recall basic concepts better than the participants receiving the visual regulatory aid. The interaction between the two factors was not significant, F(1,46) = 2.134, p > .05. This indicates that the results reported above were not qualified by any interaction.

3.6.3. Inference questions

Regarding inference questions, scores were not equal for all conditions, with participants in the visual explanatory aid conditions performing better than those in the auditory explanatory aid conditions. An analysis of covariance revealed that the difference was reliable, F(1,46) = 21.259, p < .01, MSE = 5.209, η² = .32. That is to say, participants in the visual explanatory aid conditions were better able to revise and apply their mental representations to solve tasks than those in the auditory explanatory aid conditions. Scores in inference questions were not different between the conditions including the auditory regulatory aid and those including the visual regulatory aid. This was confirmed by an analysis of covariance, F(1,46) = 1.100, p > .05. Accordingly, participants receiving the auditory regulatory aid and those receiving the visual regulatory aid were equally able to revise and apply their mental representations on plate tectonics. As occurred in retention questions, the interaction between the two factors was not significant, F(1,46) = 0.549, p > .05. This indicates that the results we have just reported were not qualified by any interaction.
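The effect sizes reported for the modality of the explanatory aid can be related to the F ratios with a short calculation. The sketch assumes that the reported values behave like partial eta-squared, computed as F·df_effect / (F·df_effect + df_error); the article only states that eta-squared was calculated, so this is an assumption made for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of the explanatory-aid modality (F values from the text).
retention_effect = partial_eta_squared(4.822, 1, 46)
inference_effect = partial_eta_squared(21.259, 1, 46)
```

Under this assumption the two values round to the effect sizes given above for the retention and the inference tests.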

3.6.4. Time-recordings

Participants in the visual regulatory aid conditions (M = 24.76 s, SD = 3.63) spent less time viewing the regulatory aids than those in the auditory regulatory aid conditions (33 s). The same goes for the explanatory aid: those in the visual explanatory aid conditions (M = 46.67 s, SD = 4.96) took less time viewing it than those in the auditory explanatory aid conditions (66 s).

3.7. Discussion

Overall, support has been gathered for the visual-explanation superiority hypothesis but not for the auditory-regulation superiority hypothesis. Both results need further comment.

Participants receiving the explanatory aid presented in visual modality outperformed those receiving the same aid presented in auditory modality. This was true both in retention and inference scores. In the light of the functional approach, we expected complex/difficult explanations to be better comprehended when presented visually, since the visual modality provides control to learners. The explanation comprised by the explanatory aid was complex and difficult, as argued above. The results are consistent with our predictions. Several explanations can be suggested for why this was the case. First, one might think that participants in the visual-explanation conditions took more time viewing the explanatory aid than those listening to it. But this was not so. As time-recordings showed, participants having control over the explanatory aid took not more but less time: approximately 25% less than their counterparts on average. Another possibility is that participants receiving the visual explanatory aid read it completely more than once. As we did not count the number of readings, it is not possible to rule out this possibility. However, it seems less likely that they did so, as they spent 330 ms (46,670 ms/141 words) per word on average. According to some evidence, readers need about 350 ms to access the meaning of a word (see for example Kintsch (1998)). A third possible explanation is that the control allowed these participants to freely manage the time available, even if it was short. This is consistent with patterns of visual search during reading (Just et al., 1982; Rayner et al., 2005): readers fixate longer on certain words, review prior segments when they need it, and so on. The point is not the total time spent but the way this time is distributed. Therefore, participants in the visual explanatory aid conditions distributed the time available in an efficient way. This is particularly relevant when learners are provided with complex and difficult texts, such as the explanatory aids used here.
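The per-word estimate in the preceding paragraph is simple arithmetic, and a minimal sketch in Python reproduces it from the figures given in the text. The function and constant names are illustrative only and are not part of the original materials; the 350 ms value is the approximate figure attributed above to Kintsch (1998).

```python
# Illustrative check of the per-word reading time discussed above.
# All names here are hypothetical; the numbers come from the text.
WORD_ACCESS_MS = 350  # approximate time needed to access a word's meaning


def ms_per_word(mean_viewing_s: float, n_words: int) -> float:
    """Average viewing time per word, in milliseconds."""
    return mean_viewing_s * 1000 / n_words


# Mean viewing time of the visual explanatory aid (46.67 s) and its length (141 words).
per_word = ms_per_word(46.67, 141)

# Fraction of one complete reading that the viewing time would allow
# if every word needed WORD_ACCESS_MS; a value below 1 argues against re-reading.
readings_possible = per_word / WORD_ACCESS_MS

print(f"{per_word:.0f} ms per word; {readings_possible:.2f} complete readings possible")
```

On these figures the per-word time rounds to about 331 ms, which the text reports as 330 ms; either way it falls below the 350 ms reference value, so a second complete reading would not fit into the recorded time.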

The results regarding the advantage of the visual modality in explanatory aids are in line with prior research on text comprehension. As was mentioned before, the auditory and the visual modalities present no differences when readers are provided with simple/easy texts to comprehend (Gernsbacher et al., 1990; Kintsch and Kozminsky, 1977; Kintsch et al., 1975; Smiley et al., 1977). But they do when texts are complex/difficult, as found in the experiment of Green (1981): in these cases visual texts are better than auditory ones. In our experiment, the explanatory aid was a complex/difficult text and, thus, the visual modality was better than the auditory modality. This extends the work of Green (1981) in two ways. First, it replicates her findings. Second, whereas she used expository texts, we employed explanations in the form of verbal aids incorporated into a CBL.

We found no support for the auditory-regulation superiority hypothesis. This was indicated by the fact that participants in auditory regulatory aid conditions did not outperform those who served in the visual regulatory aid conditions. The same pattern was found in the retention and the inference tests. We predicted the auditory-regulation to be better than the visual-regulation because the former provided additional information, such as the communicative intention and the speaker’s attitude. This additional information would be relevant only when dealing with regulatory aids, since they assist learners in selecting appropriate learning strategies, monitoring their emerging understanding, and so on. In those cases, knowing what the speaker or assistant wants and his/her attitude towards the action he/she is asking the learner to perform would be beneficial. However, this was not true in this experiment. Why was the auditory modality in regulation not as decisive as we expected? One possibility is that the auditory modality needs further cues to make the additional information it provides more trustworthy. For instance, the auditory modality would need not only expressive cues (i.e., prosody) but a visible speaker or agent. This is in line with the following arguments. Moreno and her colleagues (Moreno and Flowerday, 2006; Moreno et al., 2001) have suggested that agents incorporated into CBLs promote a sense of social agency, which means that learners are more engaged in learning when they think they are interacting with another human. This rationale is in line with the notion of persona effect (Dehn and van Mulken, 2000; Johnson et al., 2000). Accordingly, all cues that enhance the sense of social agency can foster learning (e.g., visibility of the agent). Thinking of the assistant or agent as a human would make the expressive additional information more trustworthy. This rationale is described more thoroughly in the next section.

The explanatory aids used here consisted of complex and difficult to understand explanations. This explains why the visual modality was better than the auditory modality when giving explanations. This is different from prior experiments. It was argued above that Moreno et al. (Moreno and Mayer, 2002; Moreno et al., 2001) and Atkinson (2002) used simple and easy explanations (instead of complex and difficult ones). It is not clear whether they would obtain a visual non-advantage for complex/difficult explanatory aids. Graesser and his colleagues (2003) gave both regulatory and explanatory aids without controlling for the distinction between them. This would explain why they found no differences between the visual and the auditory modalities. It is not clear that the same results would be obtained for materials in which the use of different aids is kept under control. In our experiment, different aids were distinguished and, thus, different results were obtained for each one.

4. Experiment 2: Exploring the effect of auditory regulatory aids including a pedagogical agent

The second experiment had two goals. First, to explore more thoroughly the impact of the auditory modality when giving regulatory aids. Second, to replicate the results found in the first experiment regarding the superiority of visual explanatory aids.

In this experiment, it was hypothesized that the visibility and physical presence of the pedagogical agent would make the additional information delivered through the auditory modality more trustworthy and, hence, effective. According to the social agency theory (Moreno and Flowerday, 2006; Moreno et al., 2001), when learners think about the pedagogical agent as a human, their engagement is enhanced and, thus, learning is promoted. The more cues (e.g., visibility, presence), the more humanity is perceived (also in line with the notion of persona effect, Dehn and van Mulken, 2000; Johnson et al., 2000). We argued that this sense of humanity would be beneficial only for regulatory aids. As was explained before, when dealing with explanatory aids, learners have to assimilate the contents comprised by them without making special decisions. Conversely, regulatory aids are given when learners have to make decisions (e.g., why, when, which learning strategy to use). This means that learners provided with regulatory aids have to stop doing something and perform the action expressed in the aid. In so doing they have to assess whether or not it is worthwhile to carry out that action. The additional information delivered through the expressiveness of the auditory modality may facilitate such decision making. The point is that for such additional information to be trustworthy, a real life-like agent should provide the regulatory aids. When this condition is attained, the agent’s intentions and attitudes become more believable and reliable. As a consequence, the impact of regulatory aids should be more powerful as well.

In the light of the above, we decided to introduce a real agent into the CBL presentation. The agent was not an animated virtual agent but an actual human being (an experienced teacher). This person accompanied the participants through their session using the CBL and took part whenever they received aids presented in auditory modality. Therefore, auditory aid recordings from Experiment 1 were replaced by actual utterances from a human being, who uttered them in a trustworthy fashion. The agent, then, was not only visible but physically present in the experimental session. Both visibility and presence were considered cues of humanity. It is interesting that prior studies have also introduced a human being as a pedagogical agent in the experimental session (e.g., Azevedo et al., 2004, 2008), although they had a different purpose.

It should be noted that the impact of agents’ visibility has already been tested empirically, with somewhat discouraging results (Dehn and van Mulken, 2000). There are several studies on this question, which present some relevant differences. First, they differ on the function the pedagogical agent accomplishes: the agent can either deliver the to-be-learned contents or provide learners with some kind of assistance or aids. Second, the studies differ on the dependent measures used: these measures can either be subjective (such as likeability or engagement) or objective (learning measures such as retention or performance on transfer tasks). Overall, the effect of agent’s visibility has been weak. Regarding the studies using the agent as a contents deliverer, the effects of its visibility on learning are discouraging, whereas those found on subjective measures are more encouraging. For instance, Craig et al. (2002) conducted an experiment in which the participants learned about the lightning formation with the help of a multimedia presentation. It was an agent who provided the explanations accompanying the animations about the lightning process. There were three conditions: invisible agent, visible agent, visible agent showing gestures. The visibility was not significant in any of the dependent variables, which were retention and transfer. The same pattern was found on the measure persona perception. Mayer et al. (2003) carried out a similar experiment using materials on the topic of electronics. The participants clicked on different parts of an illustration to activate animations with concurrent narrations delivered by a pedagogical agent. There were two conditions: visible agent and invisible agent. The participants in both conditions performed equally on a transfer task. van Mulken et al. (1998) asked some participants to learn engineering from a multimedia presentation. There were some illustrations accompanied by narrations, which were given either by a visible or an invisible agent. The participants learned equally well from the two conditions but those in the visible agent condition found the materials more entertaining. In a recent experiment, positive results on the learning measures have been found. Buisine and Martin (2007) had participants learn how to use a remote control from a multimedia presentation. An illustration was shown and an agent delivered the corresponding narrations. The agent was either visible and showing gestures, visible but static, or invisible. Those in the first condition scored better than their counterparts in a recall task and also reported higher ratings of likeability of the agent.

Regarding the studies using the agent as an aid provider, the findings are quite similar to those we have just reviewed.
Atkinson (2002) manipulated the visibility of an agent who helped participants (giving explanatory aids) in learning from worked-out examples. In his first experiment, the impact of the visibility was not significant in any of the dependent variables, i.e., perceived difficulty of the CBL, near, and far transfer (although agent’s visibility was significant in far transfer in his Experiment 2). Moreno and her colleagues (2001) found the same pattern of results, also in dependent variables such as motivation or perceived understandability: the visibility of an agent giving explanatory feedback was not significant. The same was true both for virtual and human-like agents. Baylor and Ryu (2003) had participants solve psychopedagogical consultation problems with the help of multiple agents. These agents gave different aids, such as corrective feedback (“that’s a good idea”) or explanatory feedback (“but... the students won’t get it just from the definitions and the laws... they need to interact with the information”; examples extracted from Baylor (2002)). There were three types of agent: visible and animated, static, and invisible. The participants in the first condition scored higher than their counterparts on the engaging and person-like ratings; however, there were no differences on the learning measures. Mitrovic and Suraweera (2000) asked their participants to learn how to design search engines for databases. In their learning environment there was an agent providing participants with multiple aids similar to those in Baylor and Ryu’s study. Mitrovic and Suraweera manipulated the visibility of the agent, finding that participants seeing the agent scored higher than their counterparts in the likeability ratings; however, there were no data available concerning performances on the learning measures. Finally, in Graesser et al. (2003) it was found that seeing the face of the agent had no impact on learning either. Their agent gave both explanatory and regulatory aids, as was argued before.

When considering agents fulfilling an assistance function (aid providers rather than contents deliverers) and their effect on learning measures, why was the agents’ visibility ineffective? We argue that visibility is relevant only for regulatory aids, not for explanations. Perceiving additional cues (such as agent’s attitude) is not beneficial in learning from an explanation, as the learner has no decisions to make while doing so. However, these cues would be useful when having to make decisions, as they facilitate the decision making. In the experiments we have just described, designers gave visibility to agents providing explanations but not regulations (in Atkinson and Moreno et al.) or to an agent providing the two types of aid without controlling for the distinction (in Baylor and Ryu and Graesser et al.). This would explain why those researchers found no effect of agents’ visibility. Additionally, it could be argued that agents’ visibility is not strong enough; hence, cues such as physical presence are also required.

The second experiment differed from earlier work in two ways. We attempted to solve the problems described above by (a) examining the impact of agent’s features either in explanation or regulation and (b) providing our agent with an additional cue besides visibility (i.e., he was physically present).

We had the same predictions as in Experiment 1. This means that the auditory modality and humanity of the verbal aids were expected to be helpful only for regulations but not for explanations. The control of the visual modality would be helpful only when dealing with explanatory aids. Therefore, we expected participants in the visual explanatory aid conditions to outperform those in the auditory explanatory aid conditions; we further expected participants in the auditory regulatory aid conditions to outperform those in the visual regulatory aid conditions. This would be true both on retention and inference scores.

4.1. Participants and design

Sixty-one undergraduate students enrolled in educational psychology courses at the University of Salamanca (Spain) participated in this experiment. They were randomly assigned to one of the four conditions. Sixteen participants served in the auditory-regulation and visual-explanation condition (AR-VE), 14 in the visual-regulation and auditory-explanation condition (VR-AE), 16 in the visual-regulation and visual-explanation condition (VR-VE), and 15 in the auditory-regulation and auditory-explanation condition (AR-AE). The mean age of the sample was 22. Ninety-one percent of the sample consisted of female students and 9% of male students. As was argued in Experiment 1 (see Section 3.1), this proportion of women may not represent a problem. All participants had experience in using computers.

As in Experiment 1, we used a 2 × 2 factorial design with the modality of the regulatory aids (visual/auditory) and the modality of the explanatory aids (visual/auditory) as the between-subjects factors. In this experiment, the auditory modality consisted of a visible human agent providing the aids. Retention and inference questions were again the dependent variables. General reading comprehension skill and prior domain knowledge were used as control variables.
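The design above fixes the degrees of freedom of the tests that follow. As a minimal sketch, assuming a full factorial model (both factors, their interaction, and both covariates), the bookkeeping can be written out as below. The helper names are hypothetical; the cell sizes are those given in Section 4.1.

```python
# Hypothetical helpers for the degrees of freedom implied by the design.
# Assumption: full 2 x 2 factorial model with two covariates.

def one_way_df(n_participants: int, n_groups: int) -> tuple[int, int]:
    """(between, within) degrees of freedom for a one-way ANOVA over all conditions."""
    return n_groups - 1, n_participants - n_groups


def ancova_error_df(n_participants: int, n_cells: int, n_covariates: int) -> int:
    """Error degrees of freedom for a factorial ANCOVA with all cells and covariates."""
    return n_participants - n_cells - n_covariates


# Cell sizes as reported in Section 4.1.
cell_sizes = {"AR-VE": 16, "VR-AE": 14, "VR-VE": 16, "AR-AE": 15}
n_total = sum(cell_sizes.values())

control_df = one_way_df(n_total, len(cell_sizes))
error_df = ancova_error_df(n_total, len(cell_sizes), n_covariates=2)

print(n_total, control_df, error_df)
```

With 61 participants this gives (3, 57) for a one-way comparison of the four conditions and 55 error degrees of freedom for the covariance analyses; each two-level factor and their interaction take one numerator degree of freedom.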

4.2. Materials

For each participant, materials consisted of a prior knowledge test, a general reading comprehension skill test, the computerized learning materials, a recall questions test, and an inference questions test. All the materials were identical to those used in Experiment 1 except for the following changes.

The computerized learning materials were the same as those used in Experiment 1, except for the aids presented in the auditory modality. The CBL consisted of the same presentation including two sections: a lesson on plate tectonics and an aid episode. The aid episode was given after the lesson, as was done before. In this experiment, auditory aids (both regulatory and explanatory) were provided by a human agent who was in the same room as the participants. In both types of aid, the human agent attempted to make use of auditory expressiveness. As was done in Experiment 1, two judges assessed the naturalness of the agent’s utterances. A convincing regulatory aid was one clarifying that the communicative intention was to warn about the possible misunderstanding and showing an attitude of urgency and importance. In the explanatory part, a convincing aid was one clarifying the differences between the Andes and the Himalaya at an intelligible pace and accent, expressing that the communicative intention was to make some statements, and showing a neutral attitude. The judges were present in the experimental sessions in order to ensure this was the case. Furthermore, they checked the words the agent used to see if they were exactly the same as in the visual modality. Because the words were the same, the variable under consideration was the modality, not the contents of the verbal aids.

As was done in Experiment 1, there were no pictures related to the aid episode (see Section 3.2). This prevented learners from having to mentally integrate verbal information in the aid episode with any picture. This is the reason why the modality principle was not applicable in this situation.

4.3. Procedure

The procedure was identical to that used in Experiment 1 except for some changes we will describe next. Participants were again tested simultaneously in groups of 10–15 participants per session. Each participant was seated in front of his/her individual computer and headphones. When everybody finished the prior knowledge test, participants started using the CBL. An experimenter ran the computerized materials on the participant’s computer. Participants were randomly assigned to each condition. This was done in a different fashion with respect to Experiment 1. In this second experiment, all participants in a session were assigned to the same condition (because the human agent gave the aids to all the participants in a session). A given session was randomly designated to use a certain experimental condition. Participants first viewed the lesson contents, then they received the aid episode including the regulatory and the explanatory aids.
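Assigning whole sessions, rather than individual participants, to conditions can be sketched as follows. This is only an illustration under stated assumptions: the session labels, the seed, and the cycling rule that keeps conditions balanced across sessions are hypothetical, since the text says only that each session was randomly designated to one condition.

```python
import random

CONDITIONS = ["AR-VE", "VR-AE", "VR-VE", "AR-AE"]


def assign_sessions(session_ids, conditions=CONDITIONS, seed=None):
    """Randomly map whole sessions to conditions.

    Sessions are shuffled and conditions are then dealt out in turn, so every
    participant in a session shares one condition and conditions stay balanced
    across sessions (the balancing rule is an assumption, not from the text).
    """
    rng = random.Random(seed)
    shuffled = list(session_ids)
    rng.shuffle(shuffled)
    return {sid: conditions[i % len(conditions)] for i, sid in enumerate(shuffled)}


# Hypothetical session labels; the number of sessions is not reported in the text.
plan = assign_sessions(["session-1", "session-2", "session-3", "session-4"], seed=0)
for session, condition in sorted(plan.items()):
    print(session, condition)
```

Because the unit of randomization here is the session, every participant tested together hears the same human agent deliver the same auditory aids.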

Whenever an aid was presented in the visual modality, participants viewed it on the screen, taking as long as they wanted in doing so. The whole text was presented in an all-at-once fashion on the screen so that participants could freely manage their time, exploring the segment of the text they wanted. A human agent presented the auditory aids. He uttered them once, attempting to make use of the auditory expressiveness. Two judges were present in the experimental sessions to check several aspects: first, whether the agent was convincing; second, whether the words uttered were the same as those included in the visual aids; and third, whether the times recorded were the same as those of the recordings in Experiment 1. Judges reported that all aspects were appropriate.

Participants in the AR-VE condition listened to the regulatory aid, which was provided by the human agent once. Then, participants viewed the explanatory aid on the computer screen, having control over their reading. Participants in the VR-AE condition viewed the regulatory aid on the computer screen, having control over their reading. Then, they listened to the explanatory aid, which was provided by the human agent once. Participants in the VR-VE condition viewed both the regulatory and the explanatory aids on the screen, having control over their reading. Participants in the AR-AE condition listened to both the regulatory and the explanatory aids. They were provided by the human agent once.

After using the CBL, participants were given the recall and inference tests. They had no more than 10 and 15 min to solve them, respectively. Finally, participants were tested on general reading comprehension skill. Each session lasted about 60 min.

4.4. Scoring

Scoring procedures were identical to those used in Experiment 1.

4.5. Results

The variables under analysis were prior knowledge, general reading comprehension skill, retention, and inference test scores. Times taken on the visual regulatory and explanatory aids were recorded (see below). Retention and inference test scores were analyzed using a factorial analysis of variance with the modality of the regulatory aid (visual/auditory) and the modality of the explanatory aid (visual/auditory) as the between-subjects factors. Prior knowledge and general reading comprehension skill scores were entered as covariates. Eta-squared (η²) was calculated as a measure of the effect size. All the scores are shown in Table 2.

Table 2
Means (M) and standard deviations (SD) of all conditions in control and learning measures: Experiment 2

            Prior          General reading        Retention      Inference
            knowledge      comprehension skill    test           test
Condition   M      SD      M      SD              M      SD      M       SD
AR-VE       6.63   3.57    5.63   1.36            6.50   1.55    12.06   2.79
VR-VE       6.53   2.47    5.31   1.66            6.44   1.75    10.94   2.86
AR-AE       7.94   2.42    5.47   0.99            6.34   1.11     8.80   2.08
VR-AE       8.00   2.65    4.71   1.07            5.57   1.65     8.07   2.06

Note. The maximum scores were 15 for prior knowledge, 8 for general reading comprehension skill, 16 for retention test, and 16 for inference test.

4.5.1. Control variables

Performances of all conditions in prior knowledge and general reading comprehension skill tests are shown in Table 2. Two one-way analyses of variance were carried out. There were no significant differences in prior knowledge, F(3,57) = 1.226, p > .05, MSE = 7.990. The same was true for general reading comprehension skill, F(3,57) = 1.343, p > .05, MSE = 1.715. To rule out the possibility that prior knowledge or general reading comprehension skill were affecting performance in retention and inference, we decided to include them as covariates in subsequent analyses (although there were no significant differences between conditions in these variables).

4.5.2. Retention test

The conditions scored equally in the retention test. An analysis of covariance revealed that the modality of the explanatory aid was not significant, F(1,55) = 0.712, p > .05, MSE = 1.938. This means that participants receiving visual explanatory aids did not recall basic concepts about plate tectonics better than those who received auditory explanatory aids. This is not consistent with the results in Experiment 1 and it will be discussed below. An analysis of covariance showed that the modality of the regulatory aid was not significant, F(1,55) = 0.523, p > .05. This indicates that participants receiving the auditory regulatory aid did not recall basic concepts better than the participants receiving the visual regulatory aid. The interaction between the two factors was not significant, F(1,55) = 1.789, p > .05. This indicates that the results reported above were not qualified by any interaction.

4.5.3. Inference questions

With respect to inference questions, participants in the visual explanatory aid conditions outperformed those in the auditory explanatory aid conditions. An analysis of covariance confirmed this result, F(1,55) = 20.421, p < .001, MSE = 4.630, η² = .02. This indicates that participants in the visual explanatory aid conditions were better able to revise and apply their mental representations than those in the auditory explanatory aid conditions. An analysis of covariance showed that there were no significant differences between conditions as a function of the modality of the regulatory aid, F(1,55) = 1.941, p > .05. That is to say, participants receiving the auditory regulatory aid and those receiving the visual regulatory aid were equally able to revise and apply their mental representations on plate tectonics. The interaction between the two factors was not significant either, F(1,55) = 0.007, p > .05, meaning that the results we have just reported were not qualified by any interaction.
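The effect sizes in this paper can be related to the reported F-ratios. The sketch below assumes that the η² values are partial eta-squared, which is an assumption on our part because the text says only "eta-squared"; the helper name is hypothetical and the inputs are the F-ratios and degrees of freedom reported in the two experiments.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared recovered from an F-ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) for the explanatory-aid effects reported in the text.
reported_tests = {
    "Exp. 1 retention": (4.822, 1, 46),
    "Exp. 1 inference": (21.259, 1, 46),
    "Exp. 2 inference": (20.421, 1, 55),
}

implied = {
    label: round(partial_eta_squared(*values), 2)
    for label, values in reported_tests.items()
}

for label, value in implied.items():
    print(f"{label}: implied partial eta-squared = {value:.2f}")
```

Under that assumption the two Experiment 1 effects come out at .09 and .32, as printed, whereas the F-ratio reported above for the explanatory aid in Experiment 2 would correspond to a value of about .27.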

4.5.4. Time-recordings

Participants in the visual regulatory aid conditions (M = 25.98 s, SD = 4.18) spent less time viewing the regulatory aids than those in the auditory regulatory aid conditions (33 s). Those in the visual explanatory aid conditions (M = 47.80 s, SD = 5.24) took less time viewing it than those in the auditory explanatory aid conditions (66 s).

4.6. Discussion

Overall, we have gathered support for the visual-explanation superiority hypothesis but not for the auditory-regulation superiority hypothesis. The former result is consistent with the findings of Experiment 1 and with our predictions. The latter is also consistent with the findings in Experiment 1 but not with our expectations.

Participants receiving visual explanatory aids outperformed those receiving auditory explanatory aids in inference scores. We predicted that complex and difficult explanatory aids would be comprehended better in visual modality, since this makes it possible to control the presentation. This is what has been found and it is in line with the results found before. As was done in Experiment 1, this result will receive further consideration. The visual advantage could be explained by three alternative arguments. One possibility is that participants receiving visual explanatory aids took more time viewing them than those in the auditory explanatory aid conditions. Again, time-recordings showed that the former took not more but less time (25% less time, approximately). Another possibility is that these participants read the full explanatory aid twice. This might be possible but seems unlikely, as they spent 339 ms (47,800 ms/141 words) per word on average. Finally, it could be argued that these participants freely managed the time available, focusing on the segment they wanted. This is consistent with patterns of visual movements during reading (Just et al., 1982; Rayner et al., 2005).

The visual advantage represents additional evidence consistent with the results in Green (1981). Research in text comprehension has demonstrated that when texts are easy and/or simple, the modality (visual/auditory) of presentation is not relevant (Gernsbacher et al., 1990; Kintsch and Kozminsky, 1977; Kintsch et al., 1975; Smiley et al., 1977). Nevertheless, when they are difficult and/or complex, the visual modality outperforms the auditory modality (Green, 1981). Our result is in line with these findings. Moreover, our results extend Green’s findings because she used texts whereas we used verbal explanatory aids introduced into a CBL. Note that the visual advantage was found only in explanatory aids, which are complex and difficult to understand, not in regulatory aids, which are simple and easy.

Despite the advantage of the visual explanatory aid in inference scores, there was no effect in retention scores. This needs close examination. Inference tests are designed to elicit the use of the knowledge acquired in a productive way. They pose novel and problematic situations and learners must reason about them using the knowledge they have just gained. In sum, inference tests elicit problem-solving processes in learners. This is different from the process that is elicited by retention tests. Retention tests ask learners to recall basic information. This makes it possible for learners to reproduce what they have learned without deeply reflecting on it. Therefore, retention tests should not be considered measures of deep learning, at least in a strict sense (see for example Kintsch, 1994 for a more thorough discussion). On the contrary, inference tests provide learners with tasks in which they have to demonstrate whether they have achieved deep learning. The aids we introduced in the CBL were intended mainly to promote deep learning. This is the reason why the effects might be observable only in deep learning measures. In fact, there are experiments in which the impact of learning aids led to an increase in performance in deep learning tasks but not in other tasks (e.g., McNamara et al., 1996; Vidal-Abarca et al., 2000).

In this experiment it was not possible to find support for the auditory-regulation superiority hypothesis, as happened in Experiment 1. The auditory modality was enriched by adding visibility and physical presence to the agent. It was hypothesized that this would make the additional information delivered by the auditory expressiveness more trustworthy and, hence, more effective. But this was not the case. Several explanations may be suggested for these results. One possibility is that, although learners usually exhibit difficulties when having to self-regulate their learning, in this case they could deploy self-regulation skills as the learning materials were not so demanding for them. The learning materials did not demand so many mental resources, so learners had room for self-regulating their learning to a certain extent. This is the reason why receiving regulatory aids either in the form of visual aids or auditory aids was enough for them. That is, they profited from regulatory aids regardless of their modality of presentation. But learners either facing more demanding learning materials or with poorer self-regulation skills would need additional cues to profit from the regulatory aids, such as auditory expressiveness or the trustworthiness conveyed by the agent’s visibility and presence. If this is true, then participants provided with more demanding materials or exhibiting low self-regulation skills would take advantage of regulatory aids presented in auditory modality. This possibility requires future investigation. Another possibility is that the auditory-regulation superiority hypothesis is simply wrong. We will discuss this question in the next section.

Prior research on agents’ visibility has shown that seeing the agent has a neutral effect on learning. Here it was argued that this was the case because the agents were providing learners with explanatory aids rather than regulatory ones (Atkinson, 2002; Moreno et al., 2001) or giving both types of aid without being aware of it (Baylor and Ryu, 2003; Graesser et al., 2003). Results did not confirm our argument. Overall, this could indicate that agents’ features are less important than we expected. Further work is needed to investigate this possibility.

5. General discussion

Computer-based learning environments (CBLs) are flexible tools of instruction. They include verbal and pictorial information, they use different modalities, and they comprise verbal aids besides to-be-learned contents. Because CBLs have multiple capabilities, designers must make several decisions about their construction. For example, designers decide whether verbal aids have to be included. One reason for this inclusion is that, although CBLs require learners to integrate information from multiple representations and to self-regulate their learning by deciding when and which strategy to apply, learners seldom do so (e.g., Ainsworth et al., 2002; Azevedo et al., 2008). Another decision designers must make is how to present to-be-learned contents regarding their modality. The modality principle (Mayer, 2001; Sweller et al., 1998) suggests that words should be presented auditorily whereas pictures should be presented visually, thus preventing memory overload in learners. A further decision is how to present verbal aids regarding their modality. This is relevant to the extent that the modality principle is not applicable in these cases, since verbal aids have no related pictures. The answer in this case is not clear-cut: there is little research in this respect and it provides diverging results (Atkinson, 2002; Graesser et al., 2003; Moreno and Mayer, 2002; Moreno et al., 2001). The aim of the present paper was twofold: first, to present the functional approach, a theoretical framework that made it possible to reinterpret prior findings and formulate hypotheses; second, to provide it with empirical support.

The functional approach assumes that the visual and the auditory modalities have specific advantages. The visual modality is controllable whereas the auditory modality is expressive. The approach also assumes that there are different types of verbal aids. Some are regulatory, guiding learners’ behavior and, in so doing, helping them self-regulate their learning. Others are explanatory, elaborating on the to-be-learned contents in order to revise learners’ understanding. Finally, the approach assumes that the advantages of each modality become particularly helpful in some situations but not in others. It was argued that the control of the visual modality may be useful when receiving explanatory aids but not when receiving regulatory aids (the visual-explanation superiority hypothesis). This is because explanatory aids are complex and difficult to understand. The expressiveness of the auditory modality may be useful when receiving regulatory aids but not when receiving explanatory aids (the auditory-regulation superiority hypothesis). This is because regulatory aids ask the learner to start performing a specific action, which requires deciding whether it is worth executing. In two experiments participants learned from a CBL on plate tectonics and were provided with regulatory and explanatory aids to improve their learning. Each type of aid was presented either visually or auditorily; hence, a 2 × 2 factorial design was used (regulatory/explanatory, auditory/visual). After using the CBL participants solved retention and inference tests. Those receiving aids designed according to the functional approach were expected to outperform their counterparts.
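The factorial design just described can be made concrete with a small sketch. It assumes that the two crossed factors are the modality of the regulatory aid and the modality of the explanatory aid; the names `MODALITIES`, `RECOMMENDED`, and `build_conditions` are illustrative and do not come from the experimental software.

```python
from itertools import product

MODALITIES = ("visual", "auditory")

# Modality the functional approach recommends for each aid type
# (visual-explanation and auditory-regulation superiority hypotheses).
RECOMMENDED = {"regulatory": "auditory", "explanatory": "visual"}


def build_conditions():
    """Cross the modality of the regulatory aid with that of the explanatory aid."""
    conditions = []
    for regulatory, explanatory in product(MODALITIES, repeat=2):
        conditions.append({
            "regulatory": regulatory,
            "explanatory": explanatory,
            # True only when both aids use the recommended modality.
            "matches_functional_approach": (
                regulatory == RECOMMENDED["regulatory"]
                and explanatory == RECOMMENDED["explanatory"]
            ),
        })
    return conditions


if __name__ == "__main__":
    for condition in build_conditions():
        print(condition)
```

Under these assumptions the sketch yields four conditions, and only the one combining an auditory regulatory aid with a visual explanatory aid is flagged as designed according to the functional approach.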

Results indicated that explanatory aids were more beneficial when they were presented in the visual modality, confirming the visual-explanation superiority hypothesis. This was true in retention (Experiment 1) and inference (Experiments 1 and 2) tests. Our interpretation is that participants in the visual explanatory aid conditions freely managed their time, which allowed them to process the explanation better. This is in line with research on eye movements during reading (Just et al., 1982; Rayner et al., 2005). These results also extend prior research on text comprehension (Green, 1981). In the past it was found that the visual modality is better than the auditory modality only when complex or difficult texts are used. This is what we found using explanatory aids rather than expository texts.

Results indicated that regulatory aids were not more beneficial when presented in the auditory modality, which is not in line with the auditory-regulation superiority hypothesis. In the second experiment, it was hypothesized that the additional information provided through auditory expressiveness needs a human agent to become trustworthy. This is in line with social agency theory (Moreno and Flowerday, 2006). This theory suggests that learners are more engaged when they think of the CBL as a human, and that such engagement promotes learning. As a consequence, the more cues of humanity the agent has, the more effective it is. We argued that cues of humanity would be useful only when giving regulatory aids but not when giving explanatory aids. Results, however, indicated that those features did not make the auditory regulatory aid better than the visual regulatory aid. One possible explanation for these results is that learners had the opportunity to deploy their self-regulation skills to some extent. This was possible because the learning materials were not so demanding. As a consequence, participants had extra mental resources available, which were devoted to self-regulating learning. These extra resources would have been useful in profiting from regulatory aids even when presented in the visual modality. That is, they did not need the additional information provided through the auditory modality. However, it is not clear what would have happened if the materials had been more demanding or if the participants had had lower self-regulation skills. It could be argued that the more demanding the material is or the fewer self-regulation skills the learners have, the more beneficial the additional cues of auditory regulatory aids become. Additional cues refer to auditory expressiveness and/or the agent’s visibility or presence. If this is true, then one might expect learners provided with demanding materials and/or with low self-regulation skills to take advantage of additional cues. That is to say, the auditory-regulation superiority hypothesis would be true only for those learners who are facing very demanding learning materials or have low self-regulation skills. In fact, a study recently conducted in our lab provided evidence in line with this argument (García-Rodicio and Sánchez, 2008). Another possibility is that the auditory-regulation superiority hypothesis is wrong. But we still believe in the hypothesis, as the evidence from that study suggests it could be true. In any case, further work is needed to clarify this question.

Our results extend prior research in some ways. Atkinson (2002) and Moreno and her colleagues (Moreno and Mayer, 2002; Moreno et al., 2001) found that the visual modality was equal to, and sometimes worse than, the auditory modality. It was argued that this was the case because their explanatory aids were simple and easy (see Appendix A). It is not clear whether the same result would be obtained for materials including complex or difficult explanatory aids. In the present study we used this kind of explanatory aid and we found a visual advantage. In the experiment of Graesser and his colleagues (2003) regulatory and explanatory aids were not distinguished and, thus, not controlled. It is not clear what would have happened if they had kept the different aids under experimental control. In the present study experimental control was kept and results were different for explanatory and regulatory aids. In sum, we claim that it is important (a) to consider the level of complexity and difficulty of the aids used and (b) to make a distinction between different aids, as there are different effects for each type.

Past research demonstrated that the impact of some features of pedagogical agents is perhaps not so impressive (Dehn and van Mulken, 2000). In our second experiment the agent’s visibility and physical presence were not beneficial for learning, even when they were incorporated into the regulatory aid. Further work is required to explore the effectiveness of different agents’ features (such as these) under different circumstances.

There are some limitations in this study. First, only learners with low prior knowledge were studied. As indicated by their scores in prior knowledge tests, participants knew little about plate tectonics before the experiment. It could be argued that high prior knowledge learners would not have taken advantage of the aids. High prior knowledge learners would not have underestimated the differences between the Andes and the Himalaya plate collisions, since it is likely that they have complete and coherent mental representations in this respect. Hence, aids would not be beneficial. Second, we have tested the impact on learning of aids presented in different modalities, but a comparison between verbal and pictorial aids (such as those in Mautone and Mayer (2007)) has not been made. It would be interesting to explore the effectiveness of both verbal and pictorial aids in the future. Finally, in both experiments the computer-based materials were on plate tectonics. More research is needed to explore the modality of verbal aids in other scientific topics.

Acknowledgments

Emilio Sánchez is supported by a program from the “Ministerio de Educación” of Spain (project SEJ2006-13464). Héctor García-Rodicio is supported by a grant (FPI) from the “Fondo Social Europeo” and the “Junta de Castilla & León”. J. Ricardo García and Wolfgang Schnotz provided helpful comments on earlier versions of this paper. We would like to thank Santiago R. Acuña for his assistance in programming the computer-based learning environment and conducting the experiments.

Appendix A

Complexity analyses of the aids in the present experiments and in prior research (following the system of Graesser and Goodman (1985) for constructing conceptual structures).

A.1. Example of explanatory feedback in Moreno et al. (2001)

“A short stem here in this shade is dangerous for the plant, because its leaves won’t get any sunlight. The stem should be long enough to put the leaves in the sun.” (p. 184).

Analysis:

• Statement node 1: A short stem is dangerous for the plant (in this shade).
• Statement node 2: Leaves (from the stem) will not get any sunlight.
• Statement node 3: The stem should be long enough.
• Statement node 4: Long stems put the leaves in the sun.
• Consequence arc: Statement node 2 [causes] Statement node 1.
• Consequence arc: Statement node 3 [causes] Statement node 4.

The explanatory aid involves four statement nodes and two causal links.
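The counting behind this analysis can be sketched in code. The following is a minimal illustration, not the procedure of Graesser and Goodman (1985): it assumes that an aid is reduced to a set of statement nodes plus directed consequence arcs, and that complexity is summarized by counting both. The class `ConceptualGraph` and the function `moreno_example` are hypothetical names introduced only for this example; the node texts and arcs are those listed above.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualGraph:
    """Statement nodes plus directed consequence arcs (cause -> effect)."""
    nodes: dict = field(default_factory=dict)  # node id -> statement text
    arcs: list = field(default_factory=list)   # (cause id, effect id) pairs

    def add_node(self, node_id, text):
        self.nodes[node_id] = text

    def add_arc(self, cause, effect):
        # Refuse arcs that point at statements missing from the graph.
        if cause not in self.nodes or effect not in self.nodes:
            raise KeyError("both ends of an arc must be existing nodes")
        self.arcs.append((cause, effect))

    def complexity(self):
        """Return (number of statement nodes, number of causal links)."""
        return len(self.nodes), len(self.arcs)


def moreno_example():
    """Encode the explanatory feedback quoted from Moreno et al. (2001)."""
    graph = ConceptualGraph()
    graph.add_node(1, "A short stem is dangerous for the plant (in this shade).")
    graph.add_node(2, "Leaves (from the stem) will not get any sunlight.")
    graph.add_node(3, "The stem should be long enough.")
    graph.add_node(4, "Long stems put the leaves in the sun.")
    graph.add_arc(2, 1)  # node 2 [causes] node 1
    graph.add_arc(3, 4)  # node 3 [causes] node 4
    return graph


if __name__ == "__main__":
    print(moreno_example().complexity())
```

The same structure could hold the other aids analysed in this appendix, so that their node and link counts can be compared on equal terms.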

A.2. Example of explanatory aids in Atkinson (2002)

“We need to set up another proportional relationship to determine the production time.” (p. 420).

Analysis:

• Statement node 1: We need to set up another proportional relationship.
• Statement node 2: The setting up determines the production time.
• Consequence arc: Statement node 1 [causes] Statement node 2.

The explanatory aid involves two statement nodes and one causal link.

A.3. Analyses of our aids

A.3.1. The regulatory aid (87 words in Spanish)

“Usually the people who watch this presentation tend to elaborate a simplified conception of the plate collisions process; thus, probably you only saw that plates collide so that mountains are formed both in the Andes and in the Himalaya by the same principle. However, there are important differences between both plate collisions that play a big role in clarifying what plate tectonics is. You must consider the following ideas.”

Analysis:

• Statement node 1: Usually people (those who watch this presentation) tend to elaborate a simplified conception of the plate collisions process.
• Statement node 2: (Probably) You saw that plates collide so that mountains are formed.
• Statement node 3: There are important differences between plate collisions.
• Statement node 4: Differences play a big role in clarifying what plate tectonics is.
• Statement node 5: You must consider the following ideas.
• Consequence arc: Statement node 1 [causes] Statement node 2.

The regulatory aid involves five statement nodes and one causal link.

A.3.2. The explanatory aid (141 words in Spanish)

“There are two main differences. First, in the Andes one plate sinks because of its weight. In this collision, there is one continental plate and one oceanic plate, each with its own composition. The denser plate, namely, the oceanic plate, sinks and returns to the mantle. In the Himalaya, both plates are continental and so have the same composition. When the plates collide they push each other and neither sinks. Second, the effects of these two types of collisions are quite different. In the Andes the sinking plate pushes the plate above, putting pressure upon it and forming cracks. Magma emerges through the cracks and thus volcanoes are formed. In contrast, in the Himalaya magma does not emerge through the cracks; instead, both plates push each other without producing cracks so that a mountain without volcanoes is formed.”

Analysis:

• Statement node 1: There are two main differences (between the Andes and the Himalaya).
• Statement node 2: In the Andes one plate sinks.
• Statement node 3: The sinking plate has certain weight.
• Statement node 4: There is one continental and one oceanic plate.
• Statement node 5: Each one has its own composition.
• Statement node 6: The oceanic plate sinks and returns to the mantle.
• Statement node 9: The oceanic plate is the denser plate.
• Statement node 10: In the Himalaya there are two continental plates.
• Statement node 11: Continental plates have the same composition.
• Statement node 12: When plates collide they push each other and neither sinks.
• Statement node 13: Effects are quite different.
• Statement node 14: In the Andes the sinking plate pushes the plate above.
• Statement node 15: The pushing puts pressure and forms cracks.
• Statement node 16: Magma emerges through the cracks.
• Statement node 17: Volcanoes are formed.
• Statement node 18: In the Himalaya magma does not emerge through the cracks.
• Statement node 19: Both plates push each other without producing cracks.
• Statement node 20: A mountain without volcanoes is formed.
• Consequence arc: Statement node 4 [causes] Statement node 3.
• Consequence arc: Statement node 10 [causes] Statement node 11.
• Consequence arc: Statement node 13 [causes] Statement node 14.
• Consequence arc: Statement node 14 [causes] Statement node 15.
• Consequence arc: Statement node 15 [causes] Statement node 16.
• Consequence arc: Statement node 18 [causes] Statement node 17.
• Consequence arc: Statement node 17 [causes] Statement node 19.

The explanatory aid involves 19 statement nodes and seven causal links.

References

Ainsworth, S., Bibby, P., Wood, D., 2002. Examining the effects of different multiple representational systems in learning primary mathematics. Journal of the Learning Sciences 11, 25–61.

Aleven, V., Koedinger, K.R., 2002. An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science 26, 147–179.

Asymetrix Corporation, Toolbook Instructor II, 2001. [Computer Program, PC], Asymetrix Learning Systems, Bellevue, WA.

Atkinson, R.K., 2002. Optimizing learning from examples using animated pedagogical agents. Journal of Educational Psychology 94, 416–427.

Azevedo, R., Cromley, J.G., Seibert, D., 2004. Does adaptive scaffolding facilitate students’ ability to regulate their learning with hypermedia? Contemporary Educational Psychology 29, 344–370.

Azevedo, R., Moos, D.C., Greene, J.A., Winters, F.I., Cromley, J.G., 2008. Why is externally-facilitated regulated learning more effective than self-regulated learning with hypermedia? Instructional Science 56, 45–72.

Baum, K.M., Nowicki, S.J., 1998. Perception of emotion: measuring decoding accuracy of adult prosodic cues varying in intensity. Journal of Nonverbal Behavior 22, 89–107.

Baylor, A., 2002. Agent-based learning environments as a research tool for investigating teaching and learning. Journal of Educational Computing Research 26, 249–270.

Baylor, A., Ryu, J., 2003. Does the presence of image and animation enhance pedagogical agent persona? Journal of Educational Computing Research 28, 373–395.

Brennan, S.E., Williams, M., 1995. The feeling of another’s knowing: prosody and filled pauses as cues to listeners about the metacognitive states of speakers. Journal of Memory and Language 34, 383–398.

Buisine, S., Martin, J.C., 2007. The effects of speech–gesture cooperation in animated agents’ behavior in multimedia presentations. Interacting with Computers 19, 484–493.

Chandler, P., Sweller, J., 1996. Cognitive load while learning to use a computer program. Applied Cognitive Psychology 10, 151–170.

Chandler, P., Sweller, J., 1992. The split-attention effect as a factor in the design of instruction. British Journal of Educational Psychology 62, 233–246.

Chi, M.T.H., 2000. Self-explaining expository texts: the dual processes of generating inferences and repairing mental models. In: Glaser, R. (Ed.), Advances in Instructional Psychology. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 161–238.

Chi, M.T.H., Siler, S.A., Jeong, H., Yamauchi, T., Hausmann, R.G., 2001. Learning from human tutoring. Cognitive Science 25, 471–533.

Commander, N.E., Stanwyck, D.J., 1997. Illusion of knowing in adult readers: effects of reading skill and passage length. Contemporary Educational Psychology 22, 39–52.

Conati, C., VanLehn, K., 2000. Further results from the evaluation of an intelligent computer tutor to coach self-explanation. In: Gauthier, G., Frasson, C., VanLehn, K. (Eds.), Proceedings 5th International Conference ITS’2000. Springer-Verlag, Montreal, Canada, pp. 304–313.

Craig, S.D., Gholson, B., Driscoll, D., 2002. Animated pedagogical agents in multimedia educational environments: effects of agent properties picture features and redundancy. Journal of Educational Psychology 94, 428–434.

Dehn, D.M., van Mulken, S., 2000. The impact of animated interface agents: a review of empirical research. International Journal of Human–Computer Studies 52, 1–22.

Diakidoy, I.A., Kendeou, P., Ioannides, C., 2003. Reading about energy: the effects of text structure in science learning and conceptual change. Contemporary Educational Psychology 28, 335–356.

Díez, E., Fernández, A., 1997. Batería multimedia de comprensión (versión abreviada) [Comprehension multimedia inventory (summarized version)], University of Salamanca.

García-Rodicio, H., Sánchez, E., 2008. The use of modality in the design of verbal aids in computer-based learning environments, Tilburg, Netherlands. In: Proceedings of EARLI SIG 2008 Conference on Comprehension of Texts and Graphics, 2008, August, pp. 59–63.

Gernsbacher, M.A., Varner, K.R., 1988. The multi-media comprehension battery (Tech. Rep. No. 88-3), Eugene, OR, Institute of Cognitive and Decision Sciences.

Gernsbacher, M.A., Varner, K.R., Faust, M.E., 1990. Investigating differences in general comprehension skill. Journal of Experimental Psychology: Learning, Memory and Cognition 16, 430–445.

Ginns, P., 2005. Meta-analysis of the modality effect. Learning and Instruction 15, 313–331.

Givón, T., 1984. Prolegomena to discourse pragmatics. Journal of Pragmatics 8, 489–516.

Graesser, A.C., Leon, J.A., Otero, J.C., 2002. Introduction to the psychology of science text comprehension. In: Otero, J., Leon, J.A., Graesser, A.C. (Eds.), The Psychology of Science Text Comprehension. Erlbaum, Mahwah, NJ, pp. 1–15.

Graesser, A.C., Baggett, W., Williams, K., 1996. Question-driven explanatory reasoning. Applied Cognitive Psychology 10, S17–S32.

Graesser, A.C., Goodman, S.H., 1985. How to construct conceptual graph structures. In: Britton, B.K., Black, J.B. (Eds.), Understanding Expository Text. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 363–383.

Graesser, A.C., Lu, S., Jackson, G.T., Mitchell, H., Ventura, M., Olney, A., Louwerse, M., 2004. AutoTutor: a tutor with dialogue in natural language. Behavioral Research Methods, Instruments, and Computers 36, 180–192.

Graesser, A.C., Moreno, K., Marineau, J., Adcock, A., Olney, A., Person, N., 2003. AutoTutor improves learning of computer literacy: is it the dialog or the talking head? In: Hoppe, U., Verdejo, F., Kay, J. (Eds.), Proceedings of Artificial Intelligence in Education. IOS Press, Amsterdam, pp. 47–54.

Graesser, A.C., Person, N., Magliano, J., 1995. Collaborative dialog patterns in naturalistic one-on-one tutoring. Applied Cognitive Psychology 9, 495–522.

Green, R., 1981. Remembering ideas from text: the effect of presentation. British Journal of Educational Psychology 51, 83–89.

Haskard, K.B., Williams, S.L., DiMatteo, M.R., Heritage, J., Rosenthal, R., 2008. The provider’s voice: patient satisfaction and the content-filtered speech of nurses and physicians in primary medical care. Journal of Nonverbal Behavior 32, 1–20.

Johnson, W.L., Rickel, J.W., Lester, J.C., 2000. Animated pedagogical agents: face-to-face interaction in interactive learning environments. International Journal of Artificial Intelligence in Education 11, 47–78.

Just, M.A., Carpenter, P.A., Wolley, J.D., 1982. Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General 111, 228–238.

Juzczyck, P.W., Hirsh-Pasek, K., Kemler Nelson, D.G., Kennedy, L.K., Woodward, A., Piwoz, J., 1992. Perception of acoustic correlates of major phrasal units by young infants. Cognitive Psychology 24, 252–293.

Kintsch, W., 1998. Comprehension: A Paradigm for Cognition. Cambridge University Press, New York.

Kintsch, W., 1994. Text comprehension memory and learning. American Psychologist 49, 294–303.

Kintsch, W., Keenan, J., 1973. Reading rate and retention as a function of number of propositions in base structure of sentences. Cognitive Psychology 5, 257–274.

Kintsch, W., Kozminsky, E., 1977. Summarizing stories after reading and listening. Journal of Educational Psychology 69, 491–499.

Kintsch, W., Kozminsky, E., Streby, W., McKoon, G., Keenan, J.M., 1975. Comprehension and recall of text as a function of content variables. Journal of Verbal Learning and Verbal Behavior 14, 196–214.

Loman, N.L., Mayer, R.E., 1983. Signaling techniques that increase the understandability of expository prose. Journal of Educational Psychology 75, 402–412.

Mautone, P.D., Mayer, R.E., 2007. Cognitive aids for guiding graph comprehension. Journal of Educational Psychology 99, 640–652.

Mayer, R.E., 2001. Multimedia Learning. Cambridge University Press, New York.

Mayer, R.E., Anderson, R.B., 1992. The instructive animation: helping students build connections between words and pictures in multimedia learning. Journal of Educational Psychology 84, 444–452.

Mayer, R.E., Dow, G.T., Mayer, S., 2003. Multimedia learning in an interactive self-explaining environment: what works in the design of agent-based microworlds? Journal of Educational Psychology 95, 806–813.

McNamara, D.S., Kintsch, E., Songer, N., Kintsch, W., 1996. Are good texts always better? Interactions of text coherence background knowledge and levels of understanding in learning from text. Cognition and Instruction 14, 1–43.

Mikkilä-Erdmann, M., 2001. Improving conceptual change concerning photosynthesis through text design. Learning and Instruction 11, 241–257.

Mitrovic, A., Suraweera, P., 2000. Evaluating an animated pedagogical agent. Lecture Notes in Computer Science 1839, 73–82.

Moreno, R., Flowerday, T., 2006. Students’ choice of animated pedagogical agents in science learning: a test of the similarity attraction hypothesis on gender and ethnicity. Contemporary Educational Psychology 31, 186–207.

Moreno, R., Mayer, R.E., 1999. Cognitive principles of multimedia learning: the role of modality and contiguity. Journal of Educational Psychology 91, 358–368.

Moreno, R., Mayer, R.E., 2002. Learning science in virtual reality multimedia environments: role of methods and media. Journal of Educational Psychology 94, 598–610.

Moreno, R., Mayer, R.E., 2005. Role of guidance reflection and interactivity in an agent-based multimedia game. Journal of Educational Psychology 97, 117–128.

Moreno, R., Mayer, R.E., Spires, H.A., Lester, J.C., 2001. The case for social agency in computer-based teaching: do students learn more deeply when they interact with animated pedagogical agents? Cognition and Instruction 19, 177–213.

Mousavi, S.Y., Low, R., Sweller, J., 1995. Reducing cognitive load by mixing auditory and visual presentation modes. Journal of Educational Psychology 87, 319–334.


Otero, J.C., Campanario, J.M., 1990. Comprehension evaluation and regulation in learning from science texts. Journal of Research in Science Teaching 27, 447–460.

Person, N.K. et al., The Tutoring Research Group, 2000. Dialog move generation and conversation management in AutoTutor. In: Proceedings of the AAAI Fall Symposium: Building Dialogue Systems for Tutorial Applications. AAAI Press, Falmouth, MA, 2000, pp. 45–51.

Rawson, K.A., Kintsch, W., 2005. Rereading effects depend on time of test. Journal of Educational Psychology 97, 70–80.

Rayner, K., Juhasz, B.J., Pollatsek, A., 2005. Eye movements during reading. In: Hulme, C., Snowling, M. (Eds.), Handbook of Reading Research. Blackwell, Oxford, pp. 79–97.

Renkl, A., 2002. Learning from worked-out examples: instructional explanations supplement self-explanations. Learning and Instruction 12, 529–556.

Sánchez, E., García, J.R., de Sixte, R., Castellano, N., Rosales, J., 2008. El análisis de la práctica educativa y las propuestas instruccionales: integración y enriquecimiento mutuo (The analysis of teaching practice and instructional proposals: integration and mutual enrichment). Infancia & Aprendizaje 31, 233–258.

Sánchez, E., García-Rodicio, H., Acuña, S.R., in press. Are instructional explanations more effective in the context of an impasse? Instructional Science.

Schnotz, W., 2005. An integrated model of text and picture comprehension. In: Mayer, R.E. (Ed.), The Cambridge Handbook of Multimedia Learning. Cambridge University Press, New York, pp. 49–69.

Schnotz, W., Bannert, M., 2003. Construction and interference in learning from multiple representation. Learning and Instruction 13, 141–156.

Seufert, T., 2003. Supporting coherence formation in learning from multiple representations. Learning and Instruction 13, 227–237.

Shapiro, A., 2004. Prior knowledge must be included as a subject variable in learning outcomes research. American Educational Research Journal 41, 159–189.

Smiley, S.S., Oakley, D.D., Worthen, D., Campione, J.C., Brown, A.L., 1977. Recall of thematically relevant material by adolescent good and poor readers as a function of written versus oral presentation. Journal of Educational Psychology 69, 187–381.

Sweller, J., Cooper, G.A., 1985. The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction 2, 58–89.

Sweller, J., van Merriënboer, J.J.G., Paas, F., 1998. Cognitive architecture and instructional design. Educational Psychology Review 10, 251–296.

van Mulken, S., André, E., Müller, J., 1998. The persona effect: how substantial is it? In: Proceedings of HCI on People and Computers XIII, 1998, January.

Vidal-Abarca, E., Martínez, G., Gilabert, R., 2000. Two procedures to improve instructional text: effects on memory and learning. Journal of Educational Psychology 92, 1–10.

Vosniadou, S., Brewer, W.F., 1994. Mental models of the day/night cycle. Cognitive Science 18, 123–184.

Vosniadou, S., Brewer, W.F., 1992. Mental models of the earth: a study of conceptual change in childhood. Cognitive Psychology 24, 535–585.