Assessing Reasoning 1
Assessing Mathematical Reasoning:
An Action Research Project
Jill Thompson
Michigan State University – TE 855
December 2, 2006
Introduction
To fully participate in 21st century society, it will be necessary for our students to possess the
ability to “analyze, reason and communicate effectively as they pose, solve and interpret
mathematical problems in a variety of situations…” (OECD, 2004, p. 37). Yet, traditional and
large-scale tests historically have not measured the full range of important mathematics.
According to Achieve Inc., the most challenging standards and objectives are the ones that are
either underrepresented or left out, and those that call for high-level reasoning are often omitted
in favor of much simpler cognitive processes (Rothman, Slattery, Vranek & Resnick, 2002, p. 8).
This is an important observation, because it has long been recognized that what is tested has a
significant influence on what is taught (NCTM, 1991, p. 9). Even if large-scale tests can evolve
to the point where they effectively assess important high-level standards, they constitute only one
piece of the assessment puzzle. A balanced approach to assessment is needed so that, in addition
to serving accountability purposes, the assessments also can influence instructional decisions.
Assessment for this purpose must be done frequently, and must consist of items designed to
measure both content and cognitive processes (NCTM, 1991, p. 9). Perhaps one of the most
important of the process standards is reasoning, since “mathematical reasoning is as fundamental
to knowing and using mathematics as comprehension of text is to reading”
(Ball & Bass, 2003, p. 29).
Focus
Because reasoning is such a critical cognitive skill, it is important for teachers to know how to
measure it. Teachers regularly use a variety of assessment practices in their classrooms,
including observations, discussion, performance tasks, and more traditional assessments such as
multiple-choice or short answer tests. Because formal assessment plays such an important role in
assigning course grades or evaluating the effectiveness of instruction both within the classroom
and increasingly within school districts, the purpose of this study is to seek an answer to this
question: what can formal assessment tasks tell us about students’ mathematical reasoning?
Review of Literature
In order to measure reasoning, it is first necessary to define it, yet there is no universal
agreement on its definition. Research clearly points to formal proof and argument as central
aspects of mathematical reasoning. This is consistent with Thompson’s (1996) description of
reasoning as “purposeful inference, deduction, induction and association in the areas of quantity
and structure” (Yackel & Hanna, 2003, p. 228). NCTM includes a reasoning and proof standard
that emphasizes formal argumentation processes, but also includes the process of making and
investigating mathematical conjectures and supporting conclusions with mathematically sound
arguments (2000, p. 342). The National Research Council describes mathematical reasoning as
“the ability to think logically about the relationships among concepts and situations”
(CFE, 2001, p. 129). Finally, researchers such as Lampert (2001) take a more social view of
reasoning as an important and public process arising from the interaction of learners as they
solve problems together.
In addition to providing insight into what constitutes mathematical reasoning, research can also
guide the design of assessment tasks. The Harvard Research Group Balanced Assessment Project
developed a program of assessment that consisted of tasks that would provide learning
opportunities for students while also serving as a basis for accountability measures. These tasks
were field tested, and fall into one of three categories. “Performance tasks” are short answer
constructed response items that measure students’ proficiency using skills and procedures.
“Problems” require extended responses, and provide information about a student’s ability to
model, infer and generalize. “Projects” can take from 20 minutes to an entire class period, and
provide information about a student’s ability to analyze, organize and model complexity
(Balanced Assessment, 1995, p. 18). Because of the complexity of the learning tasks, an
extensive scoring rubric was designed to assure validity and reliability.
Some of the most interesting research related to task design revolves around the use of a
framework for evaluating mathematical tasks based on the level of cognitive demand required by
the task. This framework was developed as part of a project intended to help teachers match
mathematical tasks to learning goals (Arbaugh & Brown, 2005, p. 506). Tasks were rated
according to the Level of Cognitive Demand (LCD) criteria at four levels. The two lowest levels
are memorization and using procedures without connections to meaning. Tasks at these levels
require recall and repetition of previously learned facts or routines, are generally unambiguous,
have no connections to concepts, and require explanations that focus only on the procedure that
was used to solve a problem. The two higher levels of cognitive demand involve using
procedures with connections to meaning or actually doing mathematics. Tasks at these levels
require use of procedures that are closely connected to underlying concepts, may involve
multiple representations or pathways to a solution, require cognitive effort, and require complex
and non-algorithmic thinking (Arbaugh & Brown, 2005, p. 530). The results of the QUASAR
study that used this framework showed that when teachers choose and implement learning tasks
at the higher levels of cognitive demand, students show an increased level of understanding and
reasoning (Arbaugh & Brown, 2005, p. 527).
Research Questions
Following a review of the literature, the research question was narrowed to focus on the actual
assessment task design. This study seeks to answer the following questions:
1. What can formal assessment tasks tell us about students’ mathematical reasoning?
2. What can be learned about reasoning from multiple-choice versus constructed response
questions?
3. What can be learned about reasoning from tasks at various levels of cognitive demand?
Mode of Inquiry
The mode of inquiry employed in this project can best be described as authentic practitioner
research within the grounded theory tradition. The intent of the project is to explore the
effectiveness of different types of learning tasks, both traditional and authentic, at varying levels
of cognitive demand. It is a grounded theory study because it involves detailed analysis of data
from more than one perspective, and its ideal outcome would be a theory which may or may not
need to be explored further using more empirical evidence (Creswell, 1998, p. 58).
Context
This study was conducted in one classroom and involved 60 students in two sections of
Algebra 2. Normal classroom practice includes both formal and informal assessment at frequent
intervals. The formal assessments typically consist of a mixture of short constructed response
items (the majority) and some multiple-choice items. The questions are developed by the teacher
and are loosely based on test items provided in the supplemental materials to the course
textbook. Tests always include one or two items that present an entirely new context within
which students are asked to apply the knowledge they have gained during the unit of instruction.
Typical classroom instruction involves daily opportunities for students to discuss their work,
primarily with each other in small group interactions. At least one class period each week begins
with opportunities for students to relate their recently acquired knowledge to other ideas and
prior knowledge. These activities are generally “warm-up” activities and precede direct
instruction. During whole class discussions, students are frequently asked to explain and justify
their conclusions or solutions. This style of discourse can encourage mathematical reasoning
because it creates an environment where it is important for students to communicate, explain,
justify, and form relationships between ideas (Yackel & Hanna, 2003, p. 229). However, it is
important to note that the curriculum materials are traditional and emphasize the development of
procedural knowledge.
Innovation
The innovation selected for this inquiry involved designing and implementing a set of learning
tasks that included standardized test questions drawn from different sources. The tasks were
deliberately selected at different levels of cognitive demand, based on the LCD criteria (Arbaugh
& Brown, 2005, p. 530). For purposes of this study, the tasks were classified at either the lower
level (memorization or procedures without connections to meaning) or the higher level
(procedures with connections or doing mathematics). They consisted of a mixture of constructed
response and multiple-choice items.
Two of the multiple-choice items were drawn from the Virginia End of Course Exam for
Algebra 2 (VDOE, 2000, p. 15). The third was an extended response item from the NAEP
(IES, 2006, p. E-33) and was rated at the moderate level of complexity for that examination,
which corresponds to the higher level of cognitive demand using the LCD criteria.
Three of the constructed response items were created with the intent of assessing students’
ability to reason through deduction, inference, and association (Yackel & Hanna, 2003, p. 228).
The remaining constructed response item was drawn from the course text. It required students to
compute and compare two quantities, and select the appropriate multiple-choice response based
on the comparison. Although the text presented the problem as a multiple-choice “test
preparation” item (Glencoe, 1998, p. 251), for purposes of this exercise, students were also asked
to show and explain how they approached the problem. A summary of the types of questions is
shown below, and the tasks and reasons for their classifications are shown in Appendix A.
Tasks by Format and Level of Cognitive Demand
                                   Multiple Choice     Constructed Response
Lower level of cognitive demand    Problems 1 and 2    Problem 5
Higher level of cognitive demand   Problem 7           Problems 3, 4 and 6
A simple rubric was used to evaluate student responses to the constructed response items. It
involved a system of assigning up to four points:
SCORE CRITERIA
4 Response is substantially correct and complete
3 Response includes one significant error or omission
2 Response is partially correct with more than one significant error or omission
1 Response is largely incomplete but includes at least one correct argument
0 Response is based on incorrect process or argument, or no response is given
The assessment items, which all covered content related to the ongoing study of matrix
algebra, provided an opportunity for students to demonstrate their reasoning ability. Students are
best able to demonstrate reasoning ability when they possess a sufficient knowledge base, the
tasks are motivating, and the context is familiar (CFE, 2001, p. 129). This assignment was made
at the end of the unit of instruction, the format of the assignment was similar to previous class
projects, and the students were motivated both by the novelty of the tasks as well as the
communicated purpose of reviewing for their test. The students were given one class period to
complete the tasks, although some did not finish and required additional time outside the
classroom.
Data
The data used for this study included the responses of 48 students who completed the seven
learning tasks, and also their scores on the summative assessment for the unit. The data were
analyzed from three perspectives. First, the responses on the learning tasks were analyzed as to
evidence and quality of mathematical reasoning. Second, a comparison was made between
student performance on the learning tasks and their performance on the unit test using a simple
linear correlation. Third, student responses were reviewed for the existence of common themes.
A few problems emerged in the process of assigning the tasks and collecting the data. One of
these resulted from the fact that not all students were present or handed in the learning tasks on
the day they were assigned, so only 48 of the 60 students in the two Algebra 2 classes are
represented in the study. Some of the students requested additional time to complete the
assignment at home because they were concerned about the grade they would earn on the
assignment. Their responses (handed in the next day) were not included in the analysis because
the students were given additional time and resources to complete the project.
Also, while the project was intended to be an individual project, many students requested
clarification, either from the teacher or from each other, on two of the constructed response
items. Ultimately, students discussed and collaborated on the most difficult item, which asked
them to determine and justify their conclusion as to whether the commutative properties of
addition and multiplication apply to matrix addition and multiplication. Thus, student responses
on the learning tasks do not necessarily represent their own interpretation and approach.
Further, it was difficult to ascertain whether the papers that did not include responses to the last
few questions were unfinished, or whether students did not know how to approach the problems.
The non-response rate to the last few problems of the assignment may reflect a lack of time
rather than a lack of understanding.
Finally, during the analysis process, it became evident that the rubric designed to evaluate
student performance did not anticipate the wide variation seen in the students’ responses. To
adjust for this problem, the student responses were analyzed in two different ways. The rubric
was used only to assign a score for purposes of comparing student performance with their
performance on the summative unit assessment. Other analysis focused primarily on the
characteristics of the student responses, which were unique to each task.
Findings
The percentages of students who earned full credit for their responses on the various
assessment items were as follows:
Percentage of Students with Correct Responses or Rating of 4 on the Rubric
Level of Cognitive Demand    Multiple Choice    Constructed Response
Lower                        96%, 98%           73%
Higher                       25%                0%, 60%, 83%
A discussion of students’ performance on each group of items follows, based on the format and
the level of cognitive demand required by the tasks.
Questions at Lower Level of Cognitive Demand – Multiple-Choice
As might be expected near the end of a unit of instruction, there was a high rate of success on
the lower demand, multiple-choice items (problems 1 and 2), both of which were, coincidentally,
standardized test items. One involved solving a linear system in three variables and the other
finding the inverse of a 2x2 matrix; both required application of learned procedures.
Questions at Lower Level of Cognitive Demand - Constructed Response
The constructed response item at the lower level of cognitive demand (problem 5) required
students to identify whether a hypothetical student, “Ken”, correctly multiplied a 2x2
determinant by a constant, and to justify their conclusion. The success rate on this problem was
fairly high, although not as high as on the multiple-choice items. Despite the fact that 73% of the
students responded correctly on this item, only 58% of the students offered an explanation. Many
students thought it was adequate to show work as the explanation. Others provided more detail as
to the source of the error in Ken’s work, as in the response below:
The students who responded incorrectly repeated the error and arrived at an answer of “80”.
Applying the order of operations to determinants was a frequent topic of discussion during the
unit, yet almost 23% of the students still repeated the mistake.
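The exact wording of the “Ken” item is not reproduced above, but the order-of-operations issue it targets can be illustrated with a hedged sketch. The matrix and constant below are hypothetical, not taken from the task: evaluating k times a 2x2 determinant means computing the determinant first and then multiplying, whereas multiplying every entry by k first scales a 2x2 determinant by k squared.

```python
import numpy as np

# Hypothetical example (the original "Ken" item is not reproduced in the text):
# evaluating k * det(A) requires computing the determinant first,
# then multiplying by the constant.
A = np.array([[3.0, 1.0],
              [2.0, 4.0]])   # det(A) = 3*4 - 1*2 = 10
k = 2.0

correct = k * np.linalg.det(A)      # 2 * 10 = 20
# A common error: multiplying every entry by k first,
# which scales a 2x2 determinant by k**2, not k.
mistaken = np.linalg.det(k * A)     # 2**2 * 10 = 40

print(round(correct), round(mistaken))  # 20 40
```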
Questions at Higher Level of Cognitive Demand - Multiple-choice
Problem 7 was originally designed to be a multiple-choice question and is reproduced below:
Although this problem was taken from the student textbook and was designated as standardized
test practice, it is clear that the design of the problem is flawed. It is possible for students to
incorrectly use “1” as the solution for Quantity A, or else make a calculation error, and still
arrive at the correct answer. In fact, while 25% of the students gave the correct response “A”,
almost half of them did so either because they assumed the determinant value of “1” was also
equal to the value of the unknown or because they used an incorrect formula to arrive at a number
that was still larger than y = -6. Further, the students who did not complete the determinant part
of the problem (23%) and the students who stated there was not enough information by choosing
answer D (6%) actually may have demonstrated a greater level of understanding than some of
the students who accidentally chose the correct answer, because they recognized the placement
of the unknown quantity inside the determinant as something requiring a novel approach. Using
this problem as a constructed response item provided much more insight into student
understanding than simply looking at the percentage who responded correctly.
Questions at Higher Level of Cognitive Demand - Constructed Response
The constructed response items at the higher level of cognitive demand showed widely variable
success rates, from 0% to 83%. Not surprisingly, these problems also provided the most insight
into students’ thought processes.
Problem 3 actually consisted of two parts: the first was the solution of a 3x3 system (similar to
question 1), and the second was an explanation of how students could tell their solution was
correct. While 83% of the students accurately determined that the system had no solution, only
69% of the students stated this explicitly. The other 14% of the students stated that the
determinant was equal to zero rather than stating that the system had no solution. Because
students had often encountered 2x2 systems in class (solved without graphing calculator
technology), they usually checked the value of the determinant first before trying to find the
inverse of the coefficient matrix.
Evaluation of students’ explanations of the results raises some interesting observations about
their understanding. Of those who presented a reasonable argument, a majority (60%) of the
students identified either that the value of the determinant was zero, or that the 3rd row of the
matrix was a multiple of the 1st row. Only one student presented both of these reasons,
recognizing that one caused the other. None of the students compared the result to the concept of
parallel lines in space, which had been a topic of discussion in a previous class session.
The remaining students presented no explanation (6%), had an incorrect response to the initial
problem (8%), or relied either on repetition or the calculator’s result of “Singular Matrix” as their
explanation (23%). This was probably the most informative of any of the learning tasks about
students’ perceptions about what it means to justify a conclusion or solution to a problem. The
results also raise the possibility that students are relying heavily on learned procedures, such as
using the calculator or finding the value of the determinant before attempting the multiplication
(as was done frequently in class when solving 2x2 systems without the calculator), to solve this
type of problem.
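The classroom strategy described above (checking the determinant before attempting to invert the coefficient matrix) can be sketched in a few lines. The matrix values here are illustrative, chosen only to mirror the structure described in the text, with the third row a multiple of the first:

```python
import numpy as np

# Illustrative coefficient matrix whose 3rd row is a multiple of the 1st,
# mirroring the structure of the "no solution" task described in the text.
A = np.array([[1.0, 2.0, -1.0],
              [2.0, 1.0,  3.0],
              [2.0, 4.0, -2.0]])   # row 3 = 2 * row 1
b = np.array([4.0, 1.0, 5.0])     # inconsistent: 2 * 4 != 5

det = np.linalg.det(A)
if abs(det) < 1e-10:
    # Matches the calculator's "Singular Matrix" result: the inverse of A
    # does not exist, so the system has no unique solution.
    print("determinant is 0: no unique solution")
else:
    print(np.linalg.solve(A, b))
```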
In examining students’ work on the NAEP application test item, it was interesting to note that
only 50% of the students used matrix algebra. The other 10% who were successful on this
problem reverted to strategies such as algebraic substitution or “guess and check”. This problem
actually provides more information about students’ problem-solving approach than it does about
their mathematical reasoning, but is representative of the typical items included on exams used to
compare the performance of US students to international standards.
The most interesting question to analyze was question 6, where none of the students responded
in a completely satisfactory manner. This question asked students to analyze matrix addition and
multiplication to determine whether the commutative property might apply:
The example shown above represents a partially correct response, but is significant because of
the attempt at generalization. Only two students arrived at a general argument about
multiplication using the dimensions of the matrices. While this response shows a high level of
reasoning in the connection between the matrix multiplication algorithm and the commutative
property, the response does not indicate whether the two students thought of a formal reasoning
process involving a counterexample, which was the approach selected by 27% of the students.
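A counterexample of the kind those students used can be written in a few lines. The matrices below are illustrative, not drawn from the student work; note that the single addition example does not by itself prove that matrix addition always commutes, which echoes the generalization theme discussed later.

```python
import numpy as np

# Illustrative matrices (not taken from the student responses):
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Matrix addition commutes here (one example is not a general proof)...
assert np.array_equal(A + B, B + A)

# ...but a single counterexample disproves commutativity of multiplication:
print(A @ B)   # [[2 1], [4 3]]
print(B @ A)   # [[3 4], [1 2]]
assert not np.array_equal(A @ B, B @ A)
```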
Given that it is more difficult to offer proof that a statement is true, it is not surprising that
none of the students were successful in providing a reasonable argument about addition. Only
15% of the students even attempted to address addition, and most did so using a specific
example, as shown in the response below:
It is important to note that, while students had previous experience with algebraic proofs in this
class, they had not been asked to prove general arguments about quantities. Therefore, this task
was completely novel, and drew primarily on the prior knowledge and reasoning capacity of the
students rather than the accumulated effect of recent instruction. Some of the incorrect responses
contained some very interesting connections, as in the cases where students tried to extend real
number properties to matrices because matrices are composed of real numbers. Expanding on
this reasoning could ultimately have led to a valid and defensible argument, and these students
demonstrated a novel approach and sophisticated connection between ideas that was not evident
in the responses of students who followed a more conventional path. It was on this problem that
the scoring rubric was the least helpful in making a quantitative assessment of student reasoning.
Not only was this task informative about reasoning, it also made it possible to evaluate
students’ procedural knowledge. In the example, close examination reveals that the student has
problems arriving at the 2nd row 2nd column element each time she multiplies, possibly indicating
a problem with the matrix multiplication algorithm itself. Every student’s response provided a
wealth of information that would not have been evident in tasks at lower levels of cognitive
demand or multiple-choice format.
Themes
Adaptive reasoning includes students’ capacity for logical thought, explanation and
justification (CFE, 2001, p. 129) and the ability to use accumulated knowledge to solve new and
diverse problems (Ball & Bass, 2003, p. 28). In examining student work for evidence of
reasoning ability, a number of common themes and assumptions emerged that indicate problems
with mathematical reasoning. These were identified in the responses to item 3 (the explanation)
and item 6. They were:
- Some students believe that showing work is sufficient explanation;
- Some students believe that obtaining the same result through repetition is adequate
assurance that the conclusion or result is correct; and
- Some students believe that giving one example provides a basis for generalizing that
something is always true.
A Comparison of Performances
To determine whether the data about student performance on the reasoning tasks were
consistent with student performance on the summative assessment, a simple linear correlation
was completed. Student performance on the unit assessment was very high, with a number of
students earning scores of 100% or more (due to inclusion of an extra credit question). However,
close examination of the summative assessment revealed that all but two of the questions on the
test could be rated at the lower levels of cognitive demand. Results of the correlation show an r2
value of 29%, indicating that 29% of the variation in the summative assessment score could be
explained by student performance on the learning tasks. When test scores of 100% or more were
excluded, the r2 value rose to 39%.
Other variables causing the relatively low r2 values could include variations in the approach to
scoring the tasks or a mismatch between the types of questions included in the learning tasks as
opposed to those on the summative assessment. However, it is interesting to note that only four of
the 48 students earned a high score (above the median) on the assessment tasks and a low score
(below the median) on the summative assessment. A plot of the data is shown below.
[Figure: Test Scores Related to Practice Sheet Scores. Scatter plot with Practice Sheet Scores
(0 to 20) on the horizontal axis and Percentage on Test (0 to 120) on the vertical axis.]
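The simple linear correlation used in this comparison can be sketched in a few lines. The score arrays below are purely illustrative, since the study’s raw data are not reproduced here:

```python
import numpy as np

# Illustrative score pairs (the study's raw data are not reproduced here):
practice = np.array([8, 12, 15, 6, 18, 10, 14, 9], dtype=float)
test_pct = np.array([70, 85, 92, 60, 98, 75, 88, 72], dtype=float)

# r^2 from the Pearson correlation coefficient, as in the study's
# "simple linear correlation":
r = np.corrcoef(practice, test_pct)[0, 1]
r_squared = r ** 2

# The proportion of variation in the test score explained by the
# practice-sheet score:
print(f"r^2 = {r_squared:.0%}")
```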
Conclusions and Implications
The findings clearly indicate that the constructed response items were the most valuable in
providing information about student reasoning. Also, the performance tasks designed at the
higher levels of cognitive demand generated the most variation in student responses and thus the
most insight into their thinking. Therefore, it is evident that designing constructed response items
at the highest levels of cognitive demand provides the best measure of student reasoning, despite
the inherent difficulties in scoring these items.
This scoring difficulty highlights a perplexing problem that has been facing the educational
community for some time. Assigning scores for assessment purposes becomes more difficult as
the complexity of assessment items increases, and it is inevitable that subjectivity in scoring will
enter the process, decreasing the reliability of the assessment. According to NCTM, assessment
should “promote valid inferences about mathematics learning”, and one threat to this validity is
the introduction of bias into the scoring process (1995, p. 19). Based on this problem, it is
possible to argue that classroom assessments should serve as the primary means of evaluating
students’ mathematical reasoning. The difficulty lies in defining what role this evaluation can
play in the current high-stakes testing environment and its reliance on standardized tests.
A surprising result of this study highlights the importance of using assessment to evaluate and
modify instructional practice. It was clear from the results of this study that the understanding of
many students could be described as procedural. If students are instructed with a focus on
mathematics as procedure, they are likely to rely on procedure in attempting to explain their
approach to a task, as evidenced by common themes identified in the student work. It is clear that
instruction must be modified to help students develop a view of mathematics as reasoning rather
than procedure. To this end, problems such as the one dealing with the commutative property
present a unique opportunity if they are used to help develop reasoning in the classroom
community as a whole. All of the elements of a sound mathematical argument were present in
the students’ responses, yet students had no opportunity to share them with each other.
Vygotsky’s theory of social constructivism suggests that learning occurs when one shares ideas
with more capable peers, lending support to the idea that group tasks are useful in helping
students develop understanding (NCRMSE, 1991, p. 12) and therefore reasoning.
To return to the one issue that may have the most alarming implications, the difficulty in
scoring the complex assessment items raises questions about whether standardized test items will
ever be able to adequately measure reasoning. NCTM’s coherence standard argues for a
balanced approach to assessment where the various types and phases of assessment are matched
to their purpose (1995, p. 21). In order to measure the important standard of mathematical
reasoning, students must be provided with instruction that allows them to invent, test and support
their own ideas, and assessment items must measure whether students are able to apply their
knowledge in novel situations (Battista, 1999, p. 15). Yet, with the unprecedented pressure of the
accountability movement and massive revisions to state curriculum standards, there is evidence
that instruction is increasingly focused on test preparation and that the quality of instruction is
actually decreasing (Popham, 2004, p. 31). One can only hope that future studies confirm the
value of focusing instruction on important standards such as reasoning, and that doing this will
naturally lead to better performance on standardized tests.
Bibliography
Arbaugh, F. & Brown, C.A. (2005). Analyzing mathematical tasks: A catalyst for change?
Journal of Mathematics Teacher Education, 8, 499-536.
Balanced Assessment in Mathematics Project (1995). Assessing Mathematical Understanding
and Skills Effectively: An Interim Report of the Harvard Group.
Retrieved online 10/2006 from http://balancedassessment.concord.org/amuse.html
Ball, D.L. & Bass, H. (2003). Making mathematics reasonable in school. A Research Companion
to Principles and Standards for School Mathematics, (27-44). Reston, VA: NCTM
Battista, M. (1999). The miseducation of America’s youth. Phi Delta Kappan Online.
Retrieved 10/28/2006 from http://www.pdkintl.org/kappan/kbat9902.htm
Center for Education of the National Research Council (CFE), (2001). The strands of
mathematical proficiency. Adding it up: Helping children learn mathematics (115-156).
Retrieved online 9/2006 from http://www.nap.edu/books/0309069955/html
Creswell, J.W. (1998). Qualitative inquiry and research design: Choosing among five traditions.
Thousand Oaks, CA: Sage Publications.
Glencoe (1998). Algebra 2. Columbus, OH: McGraw-Hill.
Institute of Educational Sciences (IES), (2006). Comparing mathematical content in the NAEP,
TIMSS and PISA 2003 assessments: Technical report. USDOE: National Center for
Education Statistics, Institute of Education Sciences, NCES 2006-029
Lampert, M. (2001). Teaching problems and the problems of teaching. New Haven, CT: Yale
University Press
National Council of Teachers of Mathematics (NCTM), (1991). Assessment: Myths, models,
good questions and practical suggestions. Reston, VA: NCTM
National Council of Teachers of Mathematics (NCTM), (2000). Principles and standards for
school mathematics. Reston, VA: NCTM.
National Center for Research in Mathematical Sciences Education (NCRMSE), (1991). A
framework for authentic assessment in mathematics. NCRMSE Research Review,1(1).
Retrieved online 10/28/2006 from
http://www.wcer.wisc.edu/NCISLA/Publications/Newsletters/NCRMSE/Vol1Num1.pdf
Organization for Economic Cooperation and Development (OECD), (2004). Learning for
tomorrow: First results from PISA 2003. Retrieved online 10/16/2006 from
http://www.pisa.oecd.org/dataoecd/1/60/34002216.pdf
Popham, W.J. (2004). Curriculum matters. American School Board Journal, November, 2004,
30-33. Reprinted with permission online, retrieved 10/28/2006 from ASCD.org.
Rothman, R., Slattery, J., Vranek, L., & Resnick, L. (2002). Benchmarking and alignment of
standards and testing: CSE technical report 566. Retrieved online 12/1/2006 from
http://achieve.org/files/TR566.pdf
Virginia Department of Education (VDOE). (2000). Released test items 2000: Algebra II end-of-course
examination. Retrieved November 1, 2006 from
http://www.pen.k12.va.us/VDOE/Assessment/release2000.algebra2.pdf
Yackel, E., & Hanna, G. (2003). Reasoning and proof. In A research companion to
Principles and Standards for School Mathematics (pp. 227-249). Reston, VA: NCTM.
Appendices
Appendix A: Learning Tasks, Sources, and Reasons for Ratings Using Level of Cognitive Demand (LCD) Criteria.
Task 1: Solve a 3x3 linear system with a missing variable
Format: multiple-choice
Source: Virginia End of Course Exam for Algebra 2: Released Items (VDOE, 2000)
Level of Cognitive Demand: Lower (procedures without connections to meaning), because students are required to execute a procedure evident from prior instruction in order to solve for the missing variable.
Task 2: Find the inverse of a 2x2 matrix with a determinant not equal to 1
Format: multiple-choice
Source: Virginia End of Course Exam for Algebra 2: Released Items (VDOE, 2000)
Level of Cognitive Demand: Lower (procedures without connections to meaning), because students need to execute a procedure evident from prior instruction in order to find the inverse of the matrix.
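As an illustrative aside (this is a hypothetical instance, not a released VDOE item), the procedure Task 2 targets is the 2x2 inverse formula, applied here to a matrix whose determinant is 10:

```latex
A = \begin{pmatrix} 4 & 7 \\ 2 & 6 \end{pmatrix}, \qquad
\det A = 4\cdot 6 - 7\cdot 2 = 10, \qquad
A^{-1} = \frac{1}{10}\begin{pmatrix} 6 & -7 \\ -2 & 4 \end{pmatrix}
       = \begin{pmatrix} 0.6 & -0.7 \\ -0.2 & 0.4 \end{pmatrix}
```

Because the determinant is not 1, students cannot simply swap and negate entries; they must also divide by the determinant, which is the step the procedure rehearses.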
Task 3: Solve a 3x3 linear system given in standard equation form, with a result of "no solution", and explain the results
Format: constructed response
Source: teacher
Level of Cognitive Demand: Higher (procedures with connections to meaning), because students need to engage with underlying concepts to explain why the system has no solution, and they need to make connections within mathematics to interpret the results.
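A hypothetical system of the kind Task 3 describes (not the actual teacher-written item) is one in which the left side of one equation is a multiple of another's, but the right side is not:

```latex
\begin{aligned}
x + y + z &= 3\\
2x - y + z &= 4\\
2x + 2y + 2z &= 10
\end{aligned}
```

The third row of the coefficient matrix is twice the first, so the determinant is 0; and since 2(3) = 6 differs from 10, the system is inconsistent and has no solution. Noticing either feature is the kind of connection to meaning the rating refers to, and both appear among the student explanations tallied in Appendix B.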
Task 4: Solve a problem with two variables and two constraints set in a real-life context
Format: extended constructed response
Source: NAEP, Example 24 (IES, 2006, p. E32)
Level of Cognitive Demand: Moderate according to NAEP, corresponding to Higher (procedures with connections to meaning), because students need to interpret the real-life context, represent the constraints in symbolic form, and then execute one of several possible procedures to arrive at a reasonable solution, which should consist of integers given the context of the problem.
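The NAEP item itself is not reproduced here; a generic problem of the same shape, with hypothetical numbers, would be: a class sells 100 tickets, student tickets at $3 and adult tickets at $5, for a total of $380. Written in matrix form:

```latex
\begin{pmatrix} 1 & 1 \\ 3 & 5 \end{pmatrix}
\begin{pmatrix} s \\ a \end{pmatrix} =
\begin{pmatrix} 100 \\ 380 \end{pmatrix}
\quad\Rightarrow\quad s = 60,\; a = 40
```

A matrix solution, substitution, or guess-and-check all work, which matches the variety of approaches reported in Appendix B, and the context forces an integer answer.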
Appendix A (continued)
Task 5: Determine whether a worked example calculating the product of a scalar and a 2x2 determinant is correct, and then explain why the solution presented is wrong.
Format: constructed response
Source: teacher
Level of Cognitive Demand: Lower (procedures without connections to meaning), because students need only apply the order of operations, which was evident from prior instruction, and make a straightforward argument about which step was incorrectly applied in the illustrated solution.
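To illustrate the kind of error Task 5 targets (a hypothetical instance, not the teacher's actual item): distributing the scalar over every entry of a 2x2 determinant before evaluating changes the result by a factor of the scalar squared, rather than multiplying the determinant's value by the scalar:

```latex
3\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 3(1\cdot 4 - 2\cdot 3) = -6,
\qquad\text{whereas}\qquad
\begin{vmatrix} 3 & 6 \\ 9 & 12 \end{vmatrix} = 36 - 54 = -18 = 3^{2}(-2)
```

Identifying which of these two computations respects the order of operations is the straightforward argument the rating describes.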
Task 6: Justify the application of the commutative properties of multiplication and addition to matrix operations.
Format: extended constructed response
Source: teacher
Level of Cognitive Demand: Higher (doing mathematics), because students were asked to engage in complex, non-algorithmic thinking in an entirely new context. The task required exploring mathematical contexts and applying connected ideas from different mathematical settings, and it could produce some anxiety, since a correct approach or solution was not evident.
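The counterexamples this task invites can be quite small; for instance (illustrative only, not drawn from the student work):

```latex
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\quad
B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}:\qquad
AB = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \neq
BA = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
```

By contrast, A + B = B + A holds for all same-size matrices because addition is entrywise and real-number addition commutes; this is the general argument that, as Appendix B reports, no student produced.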
Task 7: Calculate the value of an unknown in a 3x3 determinant given the value of the determinant, and compare this to the value of "y" in the solution of a 3x3 linear system to determine which quantity is greater.
Format: multiple-choice
Source: Glencoe standardized test practice (Glencoe, 1998, p. 251)
Level of Cognitive Demand: Higher (procedures with connections to meaning), because the task required a degree of cognitive effort due to the placement of the unknown value inside the 3x3 determinant. Although students could use previously learned procedures, they could not do so mindlessly. Additional complexity was introduced by requiring the comparison of two quantities to select the correct multiple-choice response.
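A hypothetical item of the Task 7 type (not the Glencoe original) asks for the unknown given the determinant's value; expanding along the first row:

```latex
\begin{vmatrix} a & 1 & 2 \\ 0 & 3 & 1 \\ 1 & 0 & 2 \end{vmatrix}
= a(3\cdot 2 - 1\cdot 0) - 1(0\cdot 2 - 1\cdot 1) + 2(0\cdot 0 - 3\cdot 1)
= 6a - 5 = 7
\quad\Rightarrow\quad a = 2
```

The familiar cofactor-expansion procedure still applies, but because the unknown sits inside the determinant, students must carry it symbolically through the expansion and then solve, which is why the procedure cannot be executed mindlessly.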
Appendix B: Detailed Evaluation of Student Responses on Performance Tasks
Item 1: Solve 3x3 system
Student Responses # % of Total
Correct Response 46 95.8%
Incorrect Response 2 4.2%
Item 2: Find 2x2 inverse
Student Responses # % of Total
Correct Response 47 97.9%
Incorrect Response 1 2.1%
Item 3: Problem: 3x3 system with no solution
Student Responses # % of Total
Correct (No solution) 33 68.8%
Correct, but stated determinant = 0 7 14.6%
Incorrect response (found solution or A inverse) 4 8.3%
No response 4 8.3%
Item 3: Explanation of how students know if they are correct
Student Responses # % of Total
Correct; multiple reasons given 1 2.1%
Correct; determinant = 0 20 41.7%
Correct; 3rd row is multiple of 1st row 9 18.8%
Incorrect, or reliance on calculator or repetition 15 31.3%
No response 3 6.3%
Item 4: NAEP application problem, 2 variables and 2 constraints
Student Responses # % of Total
Correct response 29 60.4%
Incorrect response; 2 correct constraints 9 18.8%
Incorrect response; 1 correct constraint 5 10.4%
No response 5 10.4%

Of the correct responses, all but five students used matrices; three used algebraic substitution and two used "guess and check". One student with an incorrect response set up the constraints as inequalities.
Item 5: Product of scalar and determinant; explanation of error in order of operations
Student Responses # % of Total
Correct conclusion with explanation 28 58.3%
Correct conclusion; no explanation, but work shown 6 12.5%
Correct conclusion; no supporting work 1 2.1%
Incorrect conclusion 11 22.9%
No attempt 2 4.2%
Appendix B (continued)
Item 6: Commutative property application to matrix multiplication and addition
Student Responses # % of Total
General reasoning for addition; counterexample for multiplication 0 0.0%
Example for addition; counterexample for multiplication 7 14.6%
No conclusion for addition; counterexample for multiplication 6 12.5%
No conclusion for addition; used dimensions for multiplication 2 4.2%
Example for addition; nothing for multiplication 6 12.5%
Conclusion too general, or incorrect reasoning 7 14.6%
No attempt 20 41.7%
Two students concluded that since matrices are composed of real numbers, real-number properties should apply to matrices. Five students used the identity matrix to show that the commutative property of multiplication worked in that one instance and concluded that matrix multiplication is commutative. None of the thirteen students who argued that the commutative property holds for matrix addition provided a general argument; they relied on a single example that worked. Some students concluded that neither property holds because there are instances where matrices cannot be added or multiplied because of their dimensions. Two students presented a general argument against the commutativity of matrix multiplication using the dimensions of rectangular matrices, showing that the order of multiplication could not be reversed. Two students referenced the idea of a determinant, indicating they were thinking about inverses.
Item 7: Comparison of unknown element in determinant and one variable in 3x3 system
Student Responses # % of Total
Correct response 7 14.6%
Correct response; incorrect calculations or assumptions 5 10.4%
Incorrect response; concluded "not enough information" 3 6.3%
No response; problem partially attempted (3x3 system) 11 22.9%
No response; no attempt evident 22 45.8%
Students who made an incorrect assumption about the given information could still have arrived at the correct answer to this problem, while students who concluded there was "not enough information" probably had a better understanding of the problem.