Assessing Outcomes and Processes of Student Collaboration
Peter F. Halpin
April 19, 2016
Joint work with: Alina von Davier, Yoav Bergner, Jiangang Hao, Lei Liu (ETS); Jacqueline Gutman (NYU)
Outline
Part 1: Wherefore assessments involving collaboration?
- Set up the current perspective: performance assessments
- Selective review of research on small group productivity
Part 2: Outcomes of collaboration
- Combining psychometric models with research on small group productivity
- Testing models against observed team performance
Part 3: Processes of collaboration
- Focus on chat data (for now!)
- Modeling engagement among collaborators using temporal point processes [1]
[1] Halpin, von Davier, Hao, & Liu (under review). Journal of Educational Measurement.
Part 1: Why?
- 21st-century skills, non-cognitive skills, soft skills, hard-to-measure skills, social skills, ...
- Theme: traditional educational tests target a relatively narrow set of constructs
- Analyses of US labour markets indicate that such skills are valued by employers (Burrus et al., 2013; Deming, 2015)
- There is a salient demand for assessments of a broader range of student competencies
Self-reports
- Self-report measures often do not require the respondent to exhibit the skills about which we wish to make inferences
→ Unsuitable for supporting consequential decisions in educational settings [2]
[2] cf. Duckworth & Yeager (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educational Researcher, 44(4), 237-251.
Educational assessments
☺ Reliability and generalizability in traditional content domains
☹ Current psychometric models don’t seem entirely appropriate to “next generation assessments”
  - e.g., IRT models don’t use process data
☹ Collateral damage: teaching to the test, test anxiety, bubble-filling, ...
  - NY opt-out movement: 20% of students (parents) boycotted state test last year [3]
[3] www.wnyc.org/story/new-york-city-students-make-modest-gains-state-tests-opt-out-numbers-triple/
Performance assessments [4]
[4] Davey, Ferrara, Holland, Shavelson, Webb, & Wise (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Princeton, NJ. p. 10
Collaboration as a modality of performance assessment
- Small group interactions are a highly valued educational practice
  - The Jigsaw Classroom (Aronson et al., 1978; jigsaw.org)
  - Group-worthy tasks (Cohen et al., 1999)
- The use of information technology to support student collaboration is well established
  - CSCL (e.g., Hmelo-Silver et al., 2013)
- The use of group work in assessment contexts has a relatively long-standing history
  - e.g., Webb, 1995; 2015
Intellective tasks
- Defined as having a demonstrably “correct” answer with respect to an agreed-upon system of knowledge
- Differentiated from decision / judgement tasks on a continuum of demonstrability (Laughlin, 2011)
- Differentiated from mixed-motive tasks in that the goals and outcomes are the same for all members
[Figure: McGrath’s (1984) group task circumplex]
Lorge & Solomon 1955 [5]
[5] Two models of group behavior in the solution of Eureka-type problems. Psychometrika, 1955, 20 (2), p. 141
Smoke and Zajonc 1962 [7]
If p is the probability that a given individual member is correct, the group has a probability h(p) of being correct, where h(p) is a function of p depending upon the type of decision scheme accepted by the group. We shall call h(p) a decision function. Intuitively, it would seem that a decision scheme is desirable to the extent that it surpasses p.
[7] On the reliability of group judgements and decisions. In Mathematical methods for small group processes (Eds. Criswell, Solomon, Suppes), p. 322
Schiflett 1979 [8]
[8] Towards a general model of group productivity. Psychological Bulletin, 86 (1), pp. 67-68
Summary
- Building on research on small groups:
  - Intellective tasks (vs decision tasks)
  - Cooperative group interactions (vs competitive or mixed-motive)
  - Describing group outcomes via decision functions that depend on characteristics of individuals
- But with a focus on:
  - Letting probability of success vary over individuals (e.g., via ability)
  - Describing relevant task characteristics (e.g., via difficulty)
  - The performance of individual groups rather than groups in aggregate
Outcomes of collaboration: A basic scenario
- Two students each write a conventional math assessment
- Their math ability is estimated to be $\theta_j$ and $\theta_k$
- The two students then work together on a second conventional math assessment
- What do we expect about their performance on the second test, based on the first?
Collaboration as a psychometric question
- Traditional psychometric models assume conditional independence of the items
$$p(x_j \mid \theta_j) = \prod_{i=1}^{N} p(x_{ij} \mid \theta_j) \tag{1}$$
- Traditional psychometric models also assume that the responses of two (or more) persons are independent
$$p(x_j, x_k \mid \theta_j, \theta_k) = p(x_j \mid \theta_j)\, p(x_k \mid \theta_k) \tag{2}$$
- When people work together, does equation (2) hold?
“Working together” in terms of scoring rules [9]
- For binary items and pairs of responses, consider:
- The conjunctive rule
$$x_{ijk} = \begin{cases} 1 & \text{if } x_{ij} = 1 \text{ and } x_{ik} = 1 \\ 0 & \text{otherwise} \end{cases}$$
- The disjunctive rule
$$x_{ijk} = \begin{cases} 0 & \text{if } x_{ij} = 0 \text{ and } x_{ik} = 0 \\ 1 & \text{otherwise} \end{cases}$$
- More possibilities, especially for items with > 2 responses or groups with > 2 collaborators
[9] cf. Steiner’s 1966 classification of task types
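A minimal sketch of the two scoring rules for binary items and a pair of respondents. The function names `conjunctive` and `disjunctive` and the lookup `table` are illustrative and do not come from the slides.

```python
def conjunctive(x_ij: int, x_ik: int) -> int:
    """Pair is scored correct only if both members respond correctly."""
    return int(x_ij == 1 and x_ik == 1)


def disjunctive(x_ij: int, x_ik: int) -> int:
    """Pair is scored correct if at least one member responds correctly."""
    return int(not (x_ij == 0 and x_ik == 0))


# Enumerate all four response pairs: (x_ij, x_ik) -> (conjunctive, disjunctive).
table = {
    (a, b): (conjunctive(a, b), disjunctive(a, b))
    for a in (0, 1)
    for b in (0, 1)
}
```

The two rules disagree only when exactly one member of the pair is correct.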
Scoring rules vs decision functions
- Scoring rules describe what “counts” as a correct group response
  - Under control of the test designer [10]
- Decision functions describe the strategies adopted by a team
  - Under control of the team
- Basic research strategy
  - Assume a certain scoring rule
  - Consider plausible models for team strategies
  - Test the models against data
[10] Maris & van der Maas (2012). Speed-accuracy response models: scoring rules based on response time and accuracy. Psychometrika, 77 (4), 615-633
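Defining successful pairwise collaboration
- The independence model
$$E_{\text{ind}}[x_{ijk} \mid \theta_j, \theta_k] = E[x_{ij} \mid \theta_j]\, E[x_{ik} \mid \theta_k]$$
- Successful collaboration
$$E[x_{ijk} \mid \theta_j, \theta_k] > E_{\text{ind}}[x_{ijk} \mid \theta_j, \theta_k]$$
- Unsuccessful collaboration
$$E[x_{ijk} \mid \theta_j, \theta_k] < E_{\text{ind}}[x_{ijk} \mid \theta_j, \theta_k]$$
- Note: these definitions are item- and dyad-specific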
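Some models for successful collaboration
- Minimum individual performance (disruptive team member)
$$E_{\min}[x_{ijk} \mid \theta_j, \theta_k] = \min\{E[x_{ij} \mid \theta_j],\, E[x_{ik} \mid \theta_k]\}$$
- Maximum individual performance (cheating / tutor)
$$E_{\max}[x_{ijk} \mid \theta_j, \theta_k] = \max\{E[x_{ij} \mid \theta_j],\, E[x_{ik} \mid \theta_k]\}$$
- “True collaboration”
$$E[x_{ijk} \mid \theta_j, \theta_k] \ge \max\{E[x_{ij} \mid \theta_j],\, E[x_{ik} \mid \theta_k]\}$$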
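As a rough illustration of how the independence, minimum, and maximum models order themselves for one item, the sketch below assumes a 2PL item response function for the individual expectations. The abilities (-0.5 and 1.0), the item parameters, and all function names are hypothetical.

```python
import math


def irf_2pl(theta, alpha=1.0, beta=0.0):
    """2PL item response function: probability of a correct individual response."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def e_ind(p_j, p_k):
    """Independence model: product of the two individual expectations."""
    return p_j * p_k


def e_min(p_j, p_k):
    """Minimum model: the pair does as well as its weaker member."""
    return min(p_j, p_k)


def e_max(p_j, p_k):
    """Maximum model: the pair does as well as its stronger member."""
    return max(p_j, p_k)


# Hypothetical abilities for students j and k on a single item.
p_j = irf_2pl(-0.5)
p_k = irf_2pl(1.0)
models = {"ind": e_ind(p_j, p_k), "min": e_min(p_j, p_k), "max": e_max(p_j, p_k)}
```

Because both expectations lie in [0, 1], the product can never exceed the minimum, so the independence model always sits at or below the other two.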
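A model for “true collaboration”
- An additive model
$$E_{\text{add}}[x_{ijk} \mid \theta_j, \theta_k] = E[x_{ij} \mid \theta_j] + E[x_{ik} \mid \theta_k] - E[x_{ijk} \mid \theta_j, \theta_k]$$
- Recalling $E[x_{ijk} \mid \theta_j, \theta_k] > E[x_{ij} \mid \theta_j]\, E[x_{ik} \mid \theta_k]$, define an additive independence (AI) model
$$E_{\text{AI}}[x_{ijk} \mid \theta_j, \theta_k] = E[x_{ij} \mid \theta_j] + E[x_{ik} \mid \theta_k] - E[x_{ij} \mid \theta_j]\, E[x_{ik} \mid \theta_k] \ge E_{\text{add}}[x_{ijk} \mid \theta_j, \theta_k]$$
- AI is an upper bound on any “more interesting” additive model for successful collaboration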
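More on AI model
- Can also be written as:
$$E_{\text{AI}}[x_{ijk} \mid \theta_j, \theta_k] = E[x_{ij} \mid \theta_j]\,(1 - E[x_{ik} \mid \theta_k]) + E[x_{ik} \mid \theta_k]\,(1 - E[x_{ij} \mid \theta_j]) + E[x_{ij} \mid \theta_j]\, E[x_{ik} \mid \theta_k]$$
- Which has an interpretation in terms of three cases (only j succeeds, only k succeeds, or both succeed)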
More on AI model
- And is also equivalent to Lorge & Solomon’s Model A
$$E_{\text{AI}}[x_{ijk} \mid \theta_j, \theta_k] = 1 - (1 - E[x_{ij} \mid \theta_j])(1 - E[x_{ik} \mid \theta_k])$$
- Except the “probability an individual can solve the problem” now depends on both the individual and the problem
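The three ways of writing the AI model (inclusion-exclusion, the three-case sum, and the Lorge & Solomon Model A form) can be checked numerically. This is a small sketch over a grid of hypothetical individual success probabilities; the function names are illustrative.

```python
def ai_inclusion_exclusion(p_j, p_k):
    """AI model as p_j + p_k - p_j * p_k."""
    return p_j + p_k - p_j * p_k


def ai_three_cases(p_j, p_k):
    """AI model as: only j succeeds + only k succeeds + both succeed."""
    return p_j * (1 - p_k) + p_k * (1 - p_j) + p_j * p_k


def ai_lorge_solomon(p_j, p_k):
    """AI model as one minus the probability that both members fail."""
    return 1 - (1 - p_j) * (1 - p_k)


# Largest disagreement between the three forms over a grid of probabilities.
grid = [i / 10 for i in range(11)]
max_gap = max(
    max(
        abs(ai_inclusion_exclusion(a, b) - ai_three_cases(a, b)),
        abs(ai_inclusion_exclusion(a, b) - ai_lorge_solomon(a, b)),
    )
    for a in grid
    for b in grid
)
```

Any gap is floating-point rounding only, since the three expressions are algebraically identical.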
More on AI model [11]
- We probably want some constraints on what counts as a good collaborative IRF
- Easy to show that AI satisfies latent monotonicity, if the individual IRFs do (trivial for other models also)
[11] Holland & Rosenbaum (1986). Conditional Association and Unidimensionality in Monotone Latent Variable Models. The Annals of Statistics, 14 (4), 1523-1543
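AI: latent monotonicity
Assumptions:
$$f(x) \ge f(x') \text{ for } x > x' \quad \text{and} \quad 0 \le g(y) \le 1$$
Show:
$$f(x) + g(y) - f(x)\, g(y) \ge f(x') + g(y) - f(x')\, g(y)$$
Contradiction: suppose instead that
$$f(x) + g(y) - f(x)\, g(y) < f(x') + g(y) - f(x')\, g(y)$$
$$\rightarrow\; f(x) - f(x') < g(y)\,\bigl(f(x) - f(x')\bigr)$$
which cannot hold, because $f(x) - f(x') \ge 0$ and $g(y) \le 1$.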
AI: example IRF [12]
[Figure: AI item response function; axes: theta1, prob]
[12] Using 2PL model for individual IRFs with $\alpha = 1$ and $\beta = 0$
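The lost figure can be approximated by tabulating the AI item response function over a grid of abilities. The sketch below uses the 2PL settings from the footnote ($\alpha = 1$, $\beta = 0$); the grid range and the names `ai_irf`, `surface`, and `monotone` are assumptions made for illustration.

```python
import math


def irf_2pl(theta, alpha=1.0, beta=0.0):
    """2PL item response function for one individual."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def ai_irf(theta_j, theta_k, alpha=1.0, beta=0.0):
    """AI collaborative IRF: p_j + p_k - p_j * p_k."""
    p_j = irf_2pl(theta_j, alpha, beta)
    p_k = irf_2pl(theta_k, alpha, beta)
    return p_j + p_k - p_j * p_k


# Hypothetical ability grid: -3.0, -2.5, ..., 3.0.
thetas = [-3 + 0.5 * i for i in range(13)]
# surface[r][c] is the AI probability at theta_j = thetas[r], theta_k = thetas[c].
surface = [[ai_irf(tj, tk) for tk in thetas] for tj in thetas]

# Latent monotonicity check: non-decreasing in theta_j for every fixed theta_k.
monotone = all(
    surface[r + 1][c] >= surface[r][c]
    for r in range(len(thetas) - 1)
    for c in range(len(thetas))
)
```

With both abilities at 0, each individual probability is 0.5, so the AI value is 0.5 + 0.5 - 0.25 = 0.75.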
Models abound!
- Basic idea: write down IRFs for collaboration based on assumed-to-be-known individual abilities (and item parameters)
- But how do we characterize empirical team performance?
Empirical team performance
- We have
  - Observed collaborative responses $x_{jk} = (x_{1jk}, x_{2jk}, \ldots, x_{mjk})$
  - A model for individual performance on the m (conventional) math items
- So we can get “team theta,” e.g.,
$$\hat{\theta}_{jk} = \arg\max_{\theta} \{L_0(x_{jk} \mid \theta)\} \tag{3}$$
- Where $L_0$ is the likelihood of the model calibrated on individual performance (reference model)
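A minimal sketch of equation (3), assuming a 2PL reference model and a simple grid search in place of a proper optimizer. The item parameters, the response pattern, and the grid bounds are hypothetical and are not taken from the study.

```python
import math


def irf_2pl(theta, alpha, beta):
    """2PL item response function of the reference (individual) model."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def log_lik_ref(responses, theta, items):
    """Log of L_0: reference-model log-likelihood of a binary response pattern."""
    total = 0.0
    for x, (alpha, beta) in zip(responses, items):
        p = irf_2pl(theta, alpha, beta)
        total += x * math.log(p) + (1 - x) * math.log(1 - p)
    return total


def team_theta(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximizer of L_0 over theta (step 0.01 with the defaults)."""
    grid = [lo + (hi - lo) * g / (steps - 1) for g in range(steps)]
    return max(grid, key=lambda t: log_lik_ref(responses, t, items))


# Hypothetical (alpha, beta) pairs and one collaborative response pattern.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
x_jk = [1, 1, 0]
theta_hat = team_theta(x_jk, items)
```

For all-correct or all-incorrect patterns the maximum sits on the grid boundary, so a bounded or penalized estimator would be needed in practice.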
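Proposed method for testing models
- Testing of different models against reference model
$$D_{\text{model}} = -2 \ln \frac{L_{\text{model}}(x_{jk} \mid \theta_j, \theta_k)}{L_0(x_{jk} \mid \hat{\theta}_{jk})} \tag{4}$$
- Also a “direct test” of effect of collaboration for each individual
$$D_0 = -2 \ln \frac{L_0(x_{jk} \mid \theta_j)}{L_0(x_{jk} \mid \hat{\theta}_{jk})} \tag{5}$$
with effect size $\delta_{jk} = (\hat{\theta}_{jk} - \theta_j) / \sigma_\theta$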
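The statistics in (4) and (5) can be sketched as follows, taking the AI model as the candidate model and a 2PL reference model. All item parameters, abilities, responses, and the value of $\sigma_\theta$ are hypothetical; the grid search for team theta is an assumption, not the method used in the study.

```python
import math


def irf_2pl(theta, alpha, beta):
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def log_lik(responses, probs):
    """Bernoulli log-likelihood given per-item success probabilities."""
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p)
               for x, p in zip(responses, probs))


def ref_probs(theta, items):
    """Per-item probabilities under the reference (individual) model."""
    return [irf_2pl(theta, a, b) for a, b in items]


def ai_probs(theta_j, theta_k, items):
    """Per-item probabilities under the AI collaboration model."""
    out = []
    for a, b in items:
        p_j, p_k = irf_2pl(theta_j, a, b), irf_2pl(theta_k, a, b)
        out.append(p_j + p_k - p_j * p_k)
    return out


def team_theta(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid-search estimate of team theta under the reference model."""
    grid = [lo + (hi - lo) * g / (steps - 1) for g in range(steps)]
    return max(grid, key=lambda t: log_lik(responses, ref_probs(t, items)))


# Hypothetical inputs.
items = [(1.0, -1.0), (1.0, -0.5), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
x_jk = [1, 1, 1, 0, 1]
theta_j, theta_k, sigma_theta = 0.0, 0.5, 1.0

theta_hat = team_theta(x_jk, items)
ll_ref = log_lik(x_jk, ref_probs(theta_hat, items))
d_model = -2.0 * (log_lik(x_jk, ai_probs(theta_j, theta_k, items)) - ll_ref)  # eq. (4)
d_0 = -2.0 * (log_lik(x_jk, ref_probs(theta_j, items)) - ll_ref)              # eq. (5)
delta_jk = (theta_hat - theta_j) / sigma_theta
```

`d_0` is non-negative here because `theta_j` lies on the search grid; `d_model` can take either sign, since the candidate and reference models are not nested.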
Proposed method: reference distribution
- Ind and AI models are not nested with reference model → No Wilks’ theorem
- Can use Vuong’s (1989) [13] results for LR with non-nested models, but asymptotic in m
- Good news: we can bootstrap a null distribution for (4) and (5) pretty easily
[13] Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2), 307-333.
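Bootstrapping the reference distribution
Assuming known item parameters and $\theta_j$, $\theta_k$. For $r = 1, \ldots, R$:
Step 1 Generate collaborative response patterns $x^{(r)}_{jk}$ from $E_{\text{model}}[x_{ijk} \mid \theta_j, \theta_k]$
Step 2 Compute $L_{\text{model}}(x^{(r)}_{jk} \mid \theta_j, \theta_k)$
Step 3 Estimate $\hat{\theta}^{(r)}_{jk}$ for each $x^{(r)}_{jk}$; save $L_0(x^{(r)}_{jk} \mid \hat{\theta}^{(r)}_{jk})$
Step 4 Compute $D^{(r)}_{\text{model}}$ or $D^{(r)}_0$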
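The four steps can be sketched as a parametric bootstrap. The version below assumes the AI model as the candidate, a 2PL reference model, and a coarse grid search for team theta. The item parameters, abilities, number of replications, seed, and the observed statistic `d_obs` are all hypothetical.

```python
import math
import random


def irf_2pl(theta, alpha, beta):
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def log_lik(responses, probs):
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p)
               for x, p in zip(responses, probs))


def ref_probs(theta, items):
    return [irf_2pl(theta, a, b) for a, b in items]


def ai_probs(theta_j, theta_k, items):
    out = []
    for a, b in items:
        p_j, p_k = irf_2pl(theta_j, a, b), irf_2pl(theta_k, a, b)
        out.append(p_j + p_k - p_j * p_k)
    return out


def bootstrap_d_model(theta_j, theta_k, items, R=200, seed=0):
    """Null distribution of D_model when the AI model generated the data."""
    rng = random.Random(seed)
    grid = [-4.0 + 0.05 * g for g in range(161)]
    probs = ai_probs(theta_j, theta_k, items)
    null = []
    for _ in range(R):
        x = [1 if rng.random() < p else 0 for p in probs]      # Step 1
        ll_model = log_lik(x, probs)                           # Step 2
        ll_ref = max(log_lik(x, ref_probs(t, items)) for t in grid)  # Step 3
        null.append(-2.0 * (ll_model - ll_ref))                # Step 4
    return null


# Hypothetical inputs and a made-up observed statistic.
items = [(1.0, -1.0), (1.0, -0.5), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
null = bootstrap_d_model(0.0, 0.5, items)
d_obs = 1.0
p_value = sum(d > abs(d_obs) for d in null) / len(null)
```

The same loop gives a null distribution for $D_0$ if Step 2 evaluates the reference model at $\theta_j$ instead of the candidate model.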
Example 1
- Design
  - Pool of pre-calibrated math items (grade 12 NAEP, modified to be numeric response)
  - Individual “pre-test” → estimate individual abilities
  - Collaborative “post-test” → evaluate models, estimate $\delta_{jk}$
  - Modality of collaboration: online chat
- Limitations:
  - Small calibration sample; crowd workers
  - Individual and collaborative forms were not counterbalanced (neither in order nor content)
NAEP grade 12 math items, deployed via OpenEdx
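AMT crowdworkers (calibration sample)

| Variable | Levels | n | % | Cumulative % |
|---|---|---|---|---|
| Gender | Female | 155 | 46.5 | 46.5 |
| | Male | 178 | 53.5 | 100.0 |
| Age | 18-30 | 117 | 35.2 | 35.2 |
| | 30-40 | 129 | 38.9 | 74.1 |
| | 40-55 | 71 | 21.4 | 95.5 |
| | 55+ | 15 | 4.5 | 100.0 |
| Education | Some Grade School | 3 | 0.9 | 0.9 |
| | High School Diploma | 49 | 14.7 | 15.6 |
| | Some College | 118 | 35.4 | 51.0 |
| | Bachelor’s Degree | 132 | 39.6 | 90.7 |
| | Master’s Degree | 22 | 6.6 | 97.3 |
| | Ph.D or Advanced Degree | 9 | 2.7 | 100.0 |
| Country | United States | 313 | 94.0 | 94.0 |
| | India | 16 | 4.8 | 98.8 |
| | Canada | 3 | 0.9 | 99.7 |
| | United Kingdom | 1 | 0.3 | 100.0 |
| English First Lang | Yes | 321 | 96.4 | 96.4 |
| | No | 12 | 3.6 | 100.0 |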
Deltas
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta, ticks at -2, 0, 2]
Model tests: Sanity check using individual pre-test
Figure reports $P(D_{\text{model}} > |\text{obs}|)$ for individual pre-tests scored using conjunctive scoring rule
Model tests: Collaborative data
Figure reports $P(D_{\text{model}} > |\text{obs}|)$ for collaborative tests scored using conjunctive scoring rule
Ind Model
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta. Highlighted pairs: 3, 4, 12, 15, 16, 35]
Min Model
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta. Highlighted pairs: 7, 8, 9, 13, 29, 36]
Max Model
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta. Highlighted pairs: 2, 5, 10, 14, 17, 19, 22, 24, 25, 26, 27, 28, 30, 38, 44, 45]
AI Model
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta. Highlighted pairs: 6, 18, 20, 21, 23, 32, 33, 37, 39, 40, 41, 42, 43]
Not one of our four models
[Figure: Collaborative vs Individual Performance; x-axis: Individual Theta, y-axis: Collaborative Theta. Highlighted pairs: 1, 11, 31, 34]
Summary of collaborative outcomes
- Can define, estimate, and test models of collaboration on academic performance using IRT-based methods
- But how distinct are these models, really?
- Models do not cover all cases
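Possible next step – one model to rule them all!
Let $w_j, w_k \in [0, 1]$ and define the weighted additive independence (WAI) model
$$E_{\text{WAI}}[x_{ijk} \mid \theta_j, \theta_k] = w_j P_i(\theta_j) Q_i(\theta_k) + w_k P_i(\theta_k) Q_i(\theta_j) + P_i(\theta_j) P_i(\theta_k)$$
where $P_i(\theta)$ is the individual IRF for item i and $Q_i(\theta) = 1 - P_i(\theta)$.
- Includes original four and everything in between
- Includes $(P_i(\theta_j) + P_i(\theta_k))/2$ when $w_j = w_k = 0.5$
- Weights describe how well each individual obtains his/her “optimal collaboration level”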
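A short sketch of how the WAI model recovers the earlier models at particular weights, taking $Q_i = 1 - P_i$. The individual probabilities 0.4 and 0.7 are hypothetical, and the name `e_wai` is illustrative.

```python
import math


def e_wai(p_j, p_k, w_j, w_k):
    """Weighted additive independence model for one item and one pair."""
    q_j, q_k = 1.0 - p_j, 1.0 - p_k
    return w_j * p_j * q_k + w_k * p_k * q_j + p_j * p_k


# Hypothetical individual success probabilities, with j the weaker member.
p_j, p_k = 0.4, 0.7
special_cases = {
    "ind": e_wai(p_j, p_k, 0, 0),       # p_j * p_k
    "min": e_wai(p_j, p_k, 1, 0),       # full weight on the weaker member only
    "max": e_wai(p_j, p_k, 0, 1),       # full weight on the stronger member only
    "ai": e_wai(p_j, p_k, 1, 1),        # p_j + p_k - p_j * p_k
    "mean": e_wai(p_j, p_k, 0.5, 0.5),  # (p_j + p_k) / 2
}
```

Which weight pattern yields the min or max model depends on which member has the higher individual probability; here j is the weaker member.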
Part 3: What are process data? [14]
- Any task-related actions of a respondent performed during the completion of a task
- In ed tech context, typically associated with time-stamped user logs (“trace data”)
- All the stuff IRT ignores:
$$p(x \mid \theta) = \prod_{i} p(x_i \mid \theta)$$
[14] Halpin & von Davier 2013, Hao, & Liu (under review). Journal of Educational Measurement.
Part 3: What are collaborative process data?
- Ideally a richly detailed recording of the sequence of actions taken by each team member during the completion of a task
  - ATC21S collaborative problem solving prototype items [15]
  - CPS frame [16]
- Focus today: chat messages sent between online collaborators
[15] http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf
[16] In alpha at Computational Psychometrics lab at ETS
Two perspectives on the analysis of chat / email / etc.
I Text-based analysis of strategy and sentiment
I e.g., Howley, Mayfield, & Rose, 2013; Liu, Hao, von Davier, Kyllonen, & Zapata-Rivera, 2015
I Time series analysis of sending times
I e.g., Barabasi, 2005; Ebel, Mielsch, & Bornholdt, 2002; Halpin & De Boeck, 2013
65 / 89
![Page 66: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/66.jpg)
Temporal point process: basic idea (more tomorrow)¹⁷
I Data: events that have negligible duration relative to a period of observation
I Contrast events with states, regimes
I Basic idea: model the Bernoulli probability of an event happening in a small window of time $[t, t + \Delta)$, conditional on the events that have happened before $t \in \mathbb{R}^+$.
I “Instantaneous probability” of an event, denoted p(t)
¹⁷ Daley, D. J., & Vere-Jones, D. (2003). An introduction to the theory of point processes: Elementary theory and methods (2nd ed., Vol. 1). New York: Springer.
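The window-based idea on this slide can be illustrated numerically. The sketch below is only an illustration under strong assumptions: it uses a constant event rate (`rate`, a hypothetical parameter), so the event probability in each window is `rate * delta` and does not depend on history. In the history-dependent case discussed next, that probability would change with the events observed before t.

```python
import numpy as np

def simulate_bernoulli_windows(rate, horizon, delta, seed=0):
    """Approximate a constant-rate point process on [0, horizon) by
    independent Bernoulli draws in windows [t, t + delta)."""
    p = rate * delta  # event probability per window (assumed constant)
    if not 0.0 <= p <= 1.0:
        raise ValueError("rate * delta must be a probability; use a smaller delta")
    rng = np.random.default_rng(seed)
    n_windows = int(round(horizon / delta))
    hits = rng.random(n_windows) < p
    # Report each event at the left edge of its window
    return np.flatnonzero(hits) * delta

events = simulate_bernoulli_windows(rate=2.0, horizon=1000.0, delta=0.001)
empirical_rate = len(events) / 1000.0  # should be close to the assumed rate
```

As `delta` shrinks, the chance of two events falling in one window becomes negligible, which is why a Bernoulli (0/1) outcome per window is a reasonable description.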
66 / 89
![Page 67: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/67.jpg)
Temporal point process in interpersonal context
I Modeling p(t) to describe
I How the probability of each person’s actions changes in continuous time
I How this depends on their previous actions
I Emergent or group-level phenomena like coordination, reciprocity, ...
67 / 89
![Page 68: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/68.jpg)
Chat engagement via Hawkes processes
I Hawkes process provides a means of modeling instantaneous probabilities in a multivariate context
I Halpin et al. (under review) suggest the response intensity parameter as a measure of engagement of student j with k
$$\alpha_{jk} > n_{jk} / n_k \qquad (6)$$
I $n_{jk}$ is the expected total number of responses made by student j to student k (inferred from model)
I $n_k$ is the number of actions of student k (observed)
I Lower bound is tight in practice; not necessary for computations
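For readers unfamiliar with Hawkes processes, here is a minimal sketch of a multivariate conditional intensity. It assumes a generic textbook form with an exponential response kernel; the kernel choice and the names `mu` (baseline rates) and `beta` (decay rate) are assumptions for illustration and are not taken from Halpin et al. Because the kernel integrates to 1, `alpha[j][k]` can be read as the expected number of responses by j per action of k, which is the sense in which it relates to the ratio in Equation 6.

```python
import numpy as np

def hawkes_intensity(t, j, event_times, mu, alpha, beta):
    """Conditional intensity of person j at time t.

    event_times[k] holds the send times of person k.
    mu[j] is a baseline rate; alpha[j][k] is the response intensity of j to k;
    beta is the decay rate of an exponential kernel (an assumption here).
    """
    lam = mu[j]
    for k, times in enumerate(event_times):
        past = np.asarray(times, dtype=float)
        past = past[past < t]  # only events before t can influence p(t)
        # beta * exp(-beta * u) is a density on u > 0, so it integrates to 1
        lam += alpha[j][k] * np.sum(beta * np.exp(-beta * (t - past)))
    return float(lam)

# Hypothetical dyad: person 1 (index 1) sent one message at time 0
example_events = [np.array([]), np.array([0.0])]
example_mu = [0.1, 0.2]
example_alpha = [[0.0, 0.5], [0.3, 0.0]]
lam_0 = hawkes_intensity(1.0, 0, example_events, example_mu, example_alpha, beta=1.0)
```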
68 / 89
![Page 69: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/69.jpg)
Chat engagement via Hawkes processes
I Aggregating to team (dyad) level
$$\alpha \equiv \frac{\alpha_{12} n_2 + \alpha_{21} n_1}{n_1 + n_2} \qquad (7)$$
I Interpretation: the proportion of all group members’ actions, $n_1 + n_2$, that were responded to by any other member during a collaboration
I See paper for more details, including initial results on SEs of $\alpha_{jk}$
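The dyad-level aggregation in Equation 7 is a weighted average of the two response intensities, which a few lines make concrete. The function name and the input values below are hypothetical and chosen only for illustration.

```python
def team_engagement(alpha_12, alpha_21, n_1, n_2):
    """Team-level index from Equation 7 for a dyad.

    alpha_12: response intensity of student 1 to student 2
    alpha_21: response intensity of student 2 to student 1
    n_1, n_2: observed numbers of actions of students 1 and 2
    """
    total = n_1 + n_2
    if total == 0:
        raise ValueError("the dyad has no observed actions")
    # alpha_12 * n_2 approximates the number of student 2's actions answered by student 1
    return (alpha_12 * n_2 + alpha_21 * n_1) / total

# Made-up values: student 1 sent 60 chats, student 2 sent 40
team_alpha = team_engagement(alpha_12=0.4, alpha_21=0.2, n_1=60, n_2=40)
```

With these made-up inputs, 16 of student 2's 40 actions and 12 of student 1's 60 actions are expected to have drawn a response, so the index is 28 out of 100 actions.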
69 / 89
![Page 70: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/70.jpg)
Example: Tetralogue
I A simulation-based science game with an embedded assessment recently developed at ETS (Hao, Liu, von Davier, & Kyllonen, 2015)
1 Dyads work together to learn and make predictions about volcano activity
2 At various points in the simulation, the students are asked to individually submit their responses to an assessment item without discussing the item
3 Following submission of responses from both students, they are invited to discuss the question and their answers
4 Lastly, they are given an opportunity to revise their responses to the item, with the final answers counting towards the team’s score
70 / 89
![Page 71: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/71.jpg)
Example: Full sample
I 286 dyads solicited via AMT and randomly paired (based on arrival in queue)
I Median reported age was 31.5 years
I 52.5% reported that they were female
I 79.2% reported that they were White.
I Additionally, all participants were required to
I Have an IP address located in the United States
I Self-identify as speaking English as their primary language
I Self-identify as having at least one year of college education
71 / 89
![Page 72: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/72.jpg)
Example: Estimating chat engagement
[Figure, four panels. Engagement Index: count (y-axis) by Alpha (x-axis). Standard Error Against Number of Chats: Standard Error (y-axis) by Number of Chats of Partner (x-axis), shown by Method (Hessian, Lower Bound). Relation with Partner's Index: Alpha (y-axis) by Alpha (x-axis). Relation with Number of Chats: Difference in Number of Chats (y-axis) by Alpha (x-axis).]
Note: Alpha denotes the estimated response intensities from Equation 6. Hessian denotes standard errors obtained via the Hessian of the log-likelihood. See appendix of Halpin et al. for Lower Bound. Difference in Number of Chats was scaled using the log of the absolute value of the difference.
72 / 89
![Page 73: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/73.jpg)
Example: Relation with revision on embedded assessment
[Figure: Measures of Chat Engagement vs Item Revisions. Mean Engagement (y-axis) for Alpha, Partner Alpha, and Team Alpha (x-axis), shown separately for the No Revisions and Revisions groups.]
Note: Comparison of mean levels of engagement indices for individuals who either did or did not revise at least one response after discussion with their partners. Alpha denotes the estimated response intensities from Equation 6; Partner’s Alpha denotes the partner’s response intensity; Team Alpha denotes the team-level index in Equation 7. For the latter, the data are reported for dyads, not individuals, and no revisions means that both individuals on the team made no revisions. Error bars are 95% confidence intervals on the means.
73 / 89
![Page 74: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/74.jpg)
Example: Relation with revision on embedded assessment
Table 1: Summary of group differences.

| Index | Group | Mean | SD | N | Hedges’ g | r |
|---|---|---|---|---|---|---|
| Alpha | No Revisions | 0.31 | 0.13 | 82 | – | |
| Alpha | Revisions | 0.36 | 0.10 | 66 | 0.40 | .20 |
| Partner’s Alpha | No Revisions | 0.31 | 0.14 | 82 | – | |
| Partner’s Alpha | Revisions | 0.37 | 0.14 | 66 | 0.44 | .21 |
| Team Alpha | No Revisions | 0.27 | 0.11 | 26 | – | |
| Team Alpha | Revisions | 0.37 | 0.13 | 48 | 0.84 | .38 |

Note: Alpha denotes the estimated response intensities from Equation 6; Partner’s Alpha denotes the engagement index of the individual’s partner; Team Alpha denotes the team-level index in Equation 7. Hedges’ g used the correction factor described by Hedges (1981) and r denotes the point-biserial correlation.
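As a rough check on how the effect sizes in Table 1 arise, the sketch below recomputes Hedges’ g from the reported means, SDs, and Ns. It assumes a pooled standard deviation and the common approximation 1 - 3/(4df - 1) to the small-sample correction factor; the exact computation used for the table is not given on the slide. Because the inputs are rounded to two decimals, the results only approximate the tabled values.

```python
import math

def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Bias-corrected standardized mean difference (group 2 minus group 1)."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / df)
    d = (mean_2 - mean_1) / pooled_sd
    # Approximate small-sample correction (an assumption; see lead-in)
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d

# Inputs are the rounded summary statistics from Table 1
g_alpha = hedges_g(0.31, 0.13, 82, 0.36, 0.10, 66)
g_partner = hedges_g(0.31, 0.14, 82, 0.37, 0.14, 66)
g_team = hedges_g(0.27, 0.11, 26, 0.37, 0.13, 48)
```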
74 / 89
![Page 75: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/75.jpg)
Summary of collaborative processes
I Hawkes processes are a feasible model for process data obtained on collaborative tasks
I Resulting measures of chat engagement are meaningfully related to task performance
I Future modeling work
I Random effects models for simultaneous estimation over multiple groups
I Inclusion of model parameters describing task characteristics
I Analytic expressions for standard errors of model parameters
I Methods for improving optimization with relatively small numbers of events
I Integration with text-based analyses (e.g., using marks / time-varying covariates)
75 / 89
![Page 76: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/76.jpg)
What’s next
I Integration of task design, outcomes, processes, ... and theory!!
76 / 89
![Page 77: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/77.jpg)
Contact: [email protected]
Support: This research was funded by a postdoctoral fellowship from the Spencer Foundation and an Education Technology grant from NYU Steinhardt.
77 / 89
![Page 78: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/78.jpg)
References not already included in footnotes
Aronson, E., Blaney, N., Stephan, C., Sikes, J., & Snapp, M. (1978). The jigsaw classroom. Beverly Hills, CA: Sage.
Burrus, J., Carlson, J., Bridgeman, B., Golub-Smith, M., & Greenwood, R. (2013). Identifying the Most Important 21st Century Workforce Competencies: An Analysis of the Occupational Information Network (O*NET) (ETS RR-13-21). Princeton, NJ.
Cohen, E. G., Lotan, R. A., Scarloss, B. A., & Arellano, A. R. (1999). Complex instruction: Equity in cooperative learning classrooms. Theory Into Practice, 38, 80-86.
Davey, T., Ferrara, S., Holland, P. W., Shavelson, R. J., Webb, N. M., & Wise, L. L. (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Princeton, NJ.
Deming, D. J. (2015). The Growing Importance of Social Skills in the Labor Market. National Bureau of Economic Research Working Paper Series, (21473).
Griffin, P., & Care, E. (2015). Assessment and teaching of 21st century skills: Methods and approach. New York: Springer.
Hmelo-Silver, C. E., Chinn, C. A., Chan, C. K., & O’Donnell, A. M. (2013). International handbook of collaborative learning. New York: Taylor and Francis.
McGrath, J. E. (1984). Groups: Interaction and performance. Englewood Cliffs, NJ: Prentice-Hall.
Organisation for Economic Co-operation and Development. (2013). PISA 2015 Draft Collaborative Problem Solving Framework. Retrieved from http://www.oecd.org/pisa/pisaproducts/DraftPISA2015CollaborativeProblemSolvingFramework.pdf
Webb, N. M. (1995). Group Collaboration in Assessment: Multiple Objectives, Processes, and Outcomes. Educational Evaluation and Policy Analysis, 17(2), 239-261.
78 / 89
![Page 79: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/79.jpg)
Bootstrapping the reference distribution for $D_{\text{model}}$
Assuming known item parameters and $\theta_j$, $\theta_k$. For $r = 1, \dots, R$:
Step 1 Generate collaborative response patterns $x_{jk}^{(r)}$ from $E_{\text{model}}[x_{ijk} \mid \theta_j, \theta_k]$
Step 2 Compute $L_{\text{model}}(x_{jk}^{(r)} \mid \theta_j, \theta_k)$
Step 3 Estimate $\theta_{jk}^{(r)}$ for each $x_{jk}^{(r)}$; save $L_0(x_{jk}^{(r)} \mid \theta_{jk})$
Step 4 Compute $D_{\text{model}}^{(r)}$ or $D_0^{(r)}$
79 / 89
![Page 80: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/80.jpg)
Instructions
80 / 89
![Page 81: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/81.jpg)
Jigsaw / information sharing items
81 / 89
![Page 85: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/85.jpg)
Hints / information requesting items
85 / 89
![Page 88: Assessing Outcomes and Processes of Student Collaboration](https://reader033.fdocuments.us/reader033/viewer/2022061101/629c2d41c521a777233e4b85/html5/thumbnails/88.jpg)
Multiple answer / negotiation items
88 / 89