2015EDM: Feature-Aware Student Knowledge Tracing Tutorial


Transcript of 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Page 1: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Introduction to Feature-Aware Student Knowledge Tracing (FAST) Model and Toolkit

José P. González-Brenes, Pearson
Yun Huang, University of Pittsburgh
Acknowledging: Peter Brusilovsky, University of Pittsburgh

Page 2: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
•  Toolkit 1-2-3
•  Walk-through examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 3: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Motivation

•  Personalize learning for students
   – For example, teach students new material as they learn, so we don’t teach students material they already know
•  How? Typically with Knowledge Tracing

Page 4: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

[Figure: example student practice sequences, shown as incorrect (✗) and correct (✓) responses over time]

Page 5: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

[Figure: practice sequences (✗/✓) annotated with whether each student learns the skill]

Learns a skill or not

•  Knowledge Tracing fits a two-state HMM per skill
•  Binary latent variables indicate the student’s knowledge of the skill
•  Four parameters:
   1.  Initial Knowledge
   2.  Learning (transition)
   3.  Guess (emission)
   4.  Slip (emission)
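The four parameters above drive a simple forward update over each student's practice sequence. A minimal sketch (the parameter values here are illustrative assumptions, not fitted to any dataset):

```python
# Minimal Knowledge Tracing forward pass. Parameter values are
# illustrative assumptions, not fitted to any dataset.
P_INIT, P_LEARN, P_GUESS, P_SLIP = 0.3, 0.1, 0.2, 0.1

def predict_correct(p_know):
    """P(correct) before observing the next outcome."""
    return p_know * (1 - P_SLIP) + (1 - p_know) * P_GUESS

def kt_update(p_know, correct):
    """Condition knowledge on the observed outcome, then apply learning."""
    if correct:
        post = p_know * (1 - P_SLIP) / predict_correct(p_know)
    else:
        post = p_know * P_SLIP / (1 - predict_correct(p_know))
    return post + (1 - post) * P_LEARN  # transition: may learn after practice

p = P_INIT
for outcome in [0, 0, 1, 1]:  # e.g. a sequence ✗ ✗ ✓ ✓
    p = kt_update(p, outcome)
```

After a run of correct responses, `p` rises toward mastery; the same update is what lets the model decide when to stop teaching a skill.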

Page 6: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What’s wrong?

•  Only uses performance data (correct or incorrect)
•  We are now able to capture feature-rich data
   – MOOCs and intelligent tutoring systems are able to log fine-grained data
   – Used a hint, watched a video, practiced after hours…
•  … these features can carry information about, or intervene on, learning

Page 7: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What’s a researcher gotta do?

•  Modify the Knowledge Tracing algorithm
•  For example, in just a small-scale literature survey, we find at least nine different flavors of Knowledge Tracing

Page 8: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Are all of those models sooooo different?
•  No! We identify three main variants
•  We call them the “Knowledge Tracing Family”

Page 9: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Knowledge Tracing Family

Features can enter through the emission (guess/slip), the transition (learning), or both; the basic model has no features.

•  Emission (guess/slip):
   – Item difficulty (Gowda et al. ’11; Pardos et al. ’11)
   – Student ability (Pardos et al. ’10)
   – Subskills (Xu et al. ’12)
   – Help (Sao Pedro et al. ’13)
•  Transition (learning):
   – Student ability (Lee et al. ’12; Yudelson et al. ’13)
   – Item difficulty (Schultz et al. ’13)
   – Help (Becker et al. ’08)

[Diagrams: graphical models for each variant; k = knowledge state, y = observed response, f = features]

Page 10: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

•  Each model is successful for an ad hoc purpose only
   – Hard to compare models
   – Doesn’t help to build a cognition theory

Page 11: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

•  Learning scientists have to worry about both features and modeling

Page 12: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

•  These models are not scalable:
   – They rely on Bayes net conditional probability tables
   – Memory grows exponentially with the number of features
   – Runtime grows exponentially with the number of features (with exact inference)

Page 13: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Example: Emission probabilities with no features:

Knowledge   p(Correct)
False       (1) 0.10 (guess)
True        (2) 0.85 (1-slip)

2^(0+1) = 2 parameters!

Page 14: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Example: Emission probabilities with 1 binary feature:

Knowledge   Hint    p(Correct)
False       False   (1) 0.06
True        False   (2) 0.75
False       True    (3) 0.25
True        True    (4) 0.99

2^(1+1) = 4 parameters!

Page 15: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Example: Emission probabilities with 10 binary features:

Knowledge   F1      …   F10     p(Correct)
False       False   …   False   (1) 0.06
…                                …
True        True    …   True    (2048) 0.90

2^(10+1) = 2048 parameters!

Page 16: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 17: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Something old…

[Diagram: k–y–f graphical model]

•  Uses the most general model in the Knowledge Tracing Family
•  Parameterizes learning or emission (guess and slip) probabilities

Page 18: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Something new…

[Diagram: k–y–f graphical model]

•  Instead of using inefficient conditional probability tables, we use logistic regression [Berg-Kirkpatrick et al. ’10]
•  Exponential complexity → linear complexity
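The idea can be sketched as follows: the emission probability becomes a logistic function of the features, with a separate weight vector per knowledge state, instead of one CPT row per feature combination. The weight values below are arbitrary assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One weight vector per knowledge state (values are arbitrary,
# for illustration); the last entry is the bias/intercept.
W_LEARNED   = [0.8, 1.7]    # feature weight, bias -> high P(correct)
W_UNLEARNED = [0.5, -1.4]   # feature weight, bias -> low P(correct), i.e. guess

def p_correct(learned, features):
    """P(correct | knowledge state, features) via logistic regression."""
    w = W_LEARNED if learned else W_UNLEARNED
    z = sum(wi * fi for wi, fi in zip(w, features + [1.0]))  # append bias input
    return sigmoid(z)
```

Adding a feature now adds one weight per state, rather than doubling the table.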

Page 19: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Example (guess and slip):

# of features   # of parameters in KT   # of parameters in FAST*
0               2                       2
1               4                       4
10              2,048                   22
25              67,108,864              52

52 features are not that many, and yet they can become intractable with the Knowledge Tracing Family

* Parameterizing guess and slip probabilities without sharing features.
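The counts in the table follow directly: a CPT needs one row per feature combination and knowledge state, while the logistic regression needs one weight plus one bias per state. A quick check (formulas inferred from the table, assuming no feature sharing):

```python
def kt_params(k):
    """CPT size: 2 knowledge states x 2^k feature combinations."""
    return 2 ** (k + 1)

def fast_params(k):
    """Logistic regression: (k weights + 1 bias) per knowledge state."""
    return 2 * (k + 1)

# Reproduce the slide's table rows.
table = {k: (kt_params(k), fast_params(k)) for k in (0, 1, 10, 25)}
```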

Page 20: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Something blue?

[Diagram: k–y–f graphical model]

•  Not a lot of changes to implement prediction
•  Training requires quite a bit of changes
   – We use a recent modification of the Expectation-Maximization algorithm proposed for Computational Linguistics problems [Berg-Kirkpatrick et al. ’10]

Page 21: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

KT uses Expectation-Maximization

[Diagram: the E-step (Forward-Backward algorithm) performs Latent Knowledge Estimation; the M-step (Maximum Likelihood) performs the Conditional Probability Table Lookup / Update]

Page 22: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

FAST uses a recent E-M algorithm [Berg-Kirkpatrick et al. ’10]

[Diagram: the E-step performs Latent Knowledge Estimation; the M-step performs Logistic Regression Weights Estimation, replacing the “Conditional Probability Table” Lookup / Update]

Page 23: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Slip/guess lookup:

Mastery     p(Correct)
False (1)
True (2)

Use the multiple parameters of logistic regression to fill the values of a conditional probability table! [Berg-Kirkpatrick et al. ’10]

Page 24: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

FAST uses a recent E-M algorithm [Berg-Kirkpatrick et al. ’10]

[Diagram: Latent Knowledge Estimation alternates with Logistic Regression Weights Estimation, in place of the “Conditional Probability Table” Lookup / Update]

Page 25: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

FAST uses a recent E-M algorithm [Berg-Kirkpatrick et al. ’10]

•  Latent Knowledge Estimation provides the Instance Weights for the Logistic Regression
•  Train a weighted logistic regression!
•  P(hidden | observed), i.e., P(Learned_t|O): the probability of being in the Learned state at the t-th practice, given a student’s practice sequence.

Page 26: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Feature Design Matrix for slip/guess logistic regression

[Figure: each of the n observations appears twice, with three groups of feature columns — features active when in the Learned state, features active when in the Unlearned state, and always-active features. The instance weight for the Learned-state copy of observation t is P(Learned_t|O); for the Unlearned-state copy it is P(Unlearned_t|O).]
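The construction above can be sketched in a few lines. The data, the E-step posteriors, and the hand-rolled gradient-ascent trainer below are all illustrative assumptions standing in for the toolkit's actual L-BFGS training.

```python
import numpy as np

# Toy data: 2 observations, one binary feature (e.g. "used a hint").
feats = np.array([[0.0], [1.0]])
outcomes = np.array([0.0, 1.0])          # incorrect, correct
post_learned = np.array([0.3, 0.6])      # assumed E-step posteriors P(Learned_t|O)

n, k = feats.shape
ones = np.ones((n, 1))
zeros = np.zeros((n, k + 1))
# Each observation appears twice: features (+ per-state bias) are
# active only in that copy's state (block layout).
X = np.vstack([np.hstack([feats, ones, zeros]),    # Learned-state copy
               np.hstack([zeros, feats, ones])])   # Unlearned-state copy
y = np.concatenate([outcomes, outcomes])
w = np.concatenate([post_learned, 1.0 - post_learned])  # instance weights

def fit_weighted_logreg(X, y, w, lr=0.5, iters=2000):
    """Weighted logistic regression by plain gradient ascent (sketch)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (w * (y - p)) / w.sum()  # weighted log-likelihood gradient
    return beta

beta = fit_weighted_logreg(X, y, w)
```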

Page 27: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Parameterization example

•  To model the impact of example usage, we construct a binary example feature E_t: whether a student clicked an example before the current practice.
•  This feature affects the Guess and Slip probabilities: when a student has checked an example, does he/she have a higher probability of guessing (and a lower probability of slipping)?

Page 28: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Parameterization example

•  Mary attempted a problem twice.
•  On the 1st attempt, she failed.
•  She checked an example. On the 2nd attempt, she succeeded.

The design matrix has four feature columns: feature E_t for Slip, bias for Slip, feature E_t for Guess, bias for Guess (coefficients s2, s1, g2, g1).

Original data, for the Learned state:
  attempt 1:  0 1 0 0   outcome: incorrect   weight P(Learned_1|O) = 0.3
  attempt 2:  1 1 0 0   outcome: correct     weight P(Learned_2|O) = 0.6
A copy of the data, for the Unlearned state:
  attempt 1:  0 0 0 1   outcome: incorrect   weight P(Unlearned_1|O) = 0.7
  attempt 2:  0 0 1 1   outcome: correct     weight P(Unlearned_2|O) = 0.4

Instead of the standard logistic regression weight for each observation, a weighted logistic regression is trained to obtain the coefficients.

Page 29: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Parameterization example

Page 30: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

[Figure: the same feature design matrix for the Slip/Guess logistic regression, with per-state feature copies and instance weights given by the probabilities of mastering / not mastering]

When FAST uses only intercept terms as features for the two levels of mastery, it is equivalent to Knowledge Tracing!
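To see the equivalence concretely: with only per-state intercepts, the logistic regression outputs exactly two constants, which play the roles of guess and 1-slip. The bias values below are chosen to match the earlier no-features table (0.10 and 0.85), as an assumption.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Intercept-only emission model: one bias per mastery level.
BIAS_UNLEARNED = math.log(0.10 / 0.90)  # chosen so sigmoid gives 0.10
BIAS_LEARNED   = math.log(0.85 / 0.15)  # chosen so sigmoid gives 0.85

guess = sigmoid(BIAS_UNLEARNED)          # P(correct | unlearned) = guess
one_minus_slip = sigmoid(BIAS_LEARNED)   # P(correct | learned) = 1 - slip
```

Since the features never vary, the "table" the regression fills in has exactly the two entries Knowledge Tracing already had.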

Page 31: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

[Figure: execution time (min.) vs. # of observations (7,100; 11,300; 15,500; 19,800). BNT-SM (no feat.): 23, 28, 46, 54 min. FAST (no feat.): 0.08, 0.10, 0.12, 0.15 min.]

FAST is 300x faster than BNT-SM!

(On an old laptop, no parallelization, nothing fancy)

Page 32: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

BNT-SM vs FAST

•  BNT-SM contains other functionalities that FAST doesn’t have. For example, it allows different ways to learn parameters.
•  We recommend exploring different tools to find the best fit.

Page 33: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 34: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What kind of features can we put?

•  Item Dummies: incorporating item difficulties
•  Student Dummies: incorporating student abilities
•  Item and Student Dummies: Temporal Item Response Theory, incorporating both item difficulties and student abilities
•  Subskill Dummies: incorporating subskill difficulties

Page 35: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What kind of features can we put?

•  Binary hint features → consider whether a student requested a hint or not
•  Binary example features → consider whether a student checked an example or not
…

Page 36: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit Setup
•  Examples
   1.  Item difficulty
   2.  Multiple subskills
   3.  Temporal Item Response Theory
•  Conclusion

Page 37: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What can we parameterize?
•  For example, to model the impact of example usage, we can consider the following parameterizations with a binary example feature:

(Huang et al. ’15)

Page 38: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 39: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Beyond higher predictive performance….

•  FAST promises higher predictive performance than Knowledge Tracing, with proper feature engineering.
•  Moreover, it increases model plausibility and consistency.
•  Details are in our EDM 2015 paper (http://www.educationaldatamining.org/EDM2015/uploads/papers/paper_164.pdf). A quick introduction to how the FAST toolkit addresses these issues:
   – By specifying the number of random restarts, it automatically picks the one with the maximum log likelihood on the train set,
   – It outputs plausibility evaluation metrics.

Page 40: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 41: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Toolkit Setup -- input
1.  Download the latest release from https://github.com/ml-smores/fast/releases
2.  Decompress the file (fast-2.1.0-release.zip). The main files for starting are:
    •  fast-2.1.0-final.jar
    •  data/item_exp/FAST+item1.conf (configuration file)
    •  data/item_exp/train0.csv, test0.csv (data)
3.  Open a terminal, go to the directory where fast-2.1.0-final.jar is located, and type:
    java -jar fast-2.1.0-final.jar ++data/item_exp/FAST+item1.conf

Details can be found in our wiki: https://github.com/ml-smores/fast/wiki

Page 42: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Toolkit Setup -- output (data/item_exp/)

•  XXX_Prediction.csv
   – P(Correct)
   – Knowledge estimation: P(Learned|O) …
•  XXX_Evaluation.csv
   – Overall AUC, mean AUC …
•  XXX_Parameters.csv
   – Non-parameterized probabilities
   – Parameterized: feature weights
•  Runtime.log

Page 43: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 44: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Modeling item difficulty

•  Within the same skill, students may perform well on easier items (problems) and worse on harder items.
•  Probably harder items have a lower guess and a higher slip probability?
•  We use binary item dummies (indicators) as features to parameterize the guess and slip probabilities.
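The item-dummy encoding can be sketched as follows (the item ids are made-up, and the FAST toolkit can generate such dummies itself from the input columns):

```python
# One binary indicator column per distinct item within a skill.
items = ["sq_area_1", "sq_area_2", "sq_area_1", "sq_area_3"]  # assumed item ids
vocab = sorted(set(items))
dummies = [[1 if item == v else 0 for v in vocab] for item in items]
```

Each observation then activates exactly one item column, so the logistic regression learns one guess/slip adjustment per item.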

Page 45: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Results on the Java dataset

                    Overall AUC   Mean AUC
Knowledge Tracing   .71 ± .01     .58
FAST+item           .75 ± .01     .68

(6% and 17% improvements, respectively; 95% confidence intervals)

•  Java tutoring system Quizjet (Hsiao et al. ’10)
•  20,808 observations, 19 skills, 110 students, 70% correct
•  Randomly selected 80% in train, 20% in test
•  Parameterizing emission

Page 46: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Current experiment
•  Here, we experiment on a public dataset from PSLC Datashop, the Geometry dataset (Koedinger et al. ’10).
•  5,055 observations, 18 skills, 59 students, 75% correct.
•  Randomly selected 80% of students in train, remaining in test.

Model        #Random restarts   Parameterization
KT1          1                  /
KT2          20                 /
FAST+item1   1                  emission
FAST+item2   20                 emission
FAST+item3   20                 initial, transition, emission

Page 47: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Have a look at the input data…

[Screenshot of data/item_exp/train0.csv: the required columns for KT and FAST, and the feature columns for FAST models]

Page 48: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Have a look at the configuration file KT1.conf (data/item_exp/KT1.conf)

modelName KT1
parameterizing false
parameterizingInit false
parameterizingTran false
parameterizingEmit false
forceUsingAllInputFeatures false
nbRandomRestart 1
inDir ./data/item_exp/
outDir ./data/item_exp/
trainInFilePrefix train
testInFilePrefix test
inFileSuffix .csv
EMMaxIters 500
LBFGSMaxIters 50
EMTolerance 1.0E-6
LBFGSTolerance 1.0E-6

Page 49: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Let’s run the Knowledge Tracing baseline first…
•  java -jar fast-2.1.0-final.jar ++data/item_exp/KT1.conf
•  Open KT1_Evaluation.csv (data/item_exp/)

Model   #restarts   Overall AUC   Mean AUC   Time(s)
KT1     1           .71           .55        1

•  KT2.conf only changes nbRandomRestart to 20
•  java -jar fast-2.1.0-final.jar ++data/item_exp/KT2.conf
•  Open KT2_Evaluation.csv

Model   #restarts   Overall AUC   Mean AUC   Time(s)
KT2     20          .71           .56        11

Page 50: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Let’s run FAST with item features …

modelName FAST+item1
parameterizing true
parameterizingInit false
parameterizingTran false
parameterizingEmit true
forceUsingAllInputFeatures true
nbRandomRestart 1
inDir ./data/item_exp/
outDir ./data/item_exp/
trainInFilePrefix train
testInFilePrefix test
inFileSuffix .csv
EMMaxIters 500
LBFGSMaxIters 50
EMTolerance 1.0E-6
LBFGSTolerance 1.0E-6

data/item_exp/FAST+item1.conf

Page 51: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Let’s run FAST+item, parameterizing the emission probabilities…

•  Run FAST+item1: java -jar fast-2.1.0-final.jar ++data/item_exp/FAST+item1.conf
•  Run FAST+item2 (nbRandomRestart=20): java -jar fast-2.1.0-final.jar ++data/item_exp/FAST+item2.conf
•  Open FAST+item1_Evaluation.csv, FAST+item2_Evaluation.csv

Model        #restarts   Overall AUC   Mean AUC   Time(s)
KT2          20          .71           .56        11
FAST+item1   1           .71           .58        10
FAST+item2   20          .72           .60        145

(7% improvement in mean AUC)

Page 52: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What about parameterizing all the probabilities?

modelName FAST+item3
parameterizing true
parameterizingInit true
parameterizingTran true
parameterizingEmit true
forceUsingAllInputFeatures true
nbRandomRestart 1
inDir ./data/item_exp/
outDir ./data/item_exp/
trainInFilePrefix train
testInFilePrefix test
inFileSuffix .csv
EMMaxIters 500
LBFGSMaxIters 50
EMTolerance 1.0E-6
LBFGSTolerance 1.0E-6

data/item_exp/FAST+item3.conf

Page 53: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

What about parameterizing all the probabilities?

•  java -jar fast-2.1.0-final.jar ++data/item_exp/FAST+item3.conf
•  Open FAST+item3_Evaluation.csv
•  Running with all probabilities parameterized and 20 restarts needs more than 7 minutes, yet we can get the same result with only 1 restart (FAST+item3).

Model        #restarts   Parameterization                Overall AUC   Mean AUC   Time(s)
KT2          20          /                               .71           .56        11
FAST+item3   1           initial, transition, emission   .72           .62        27

(11% improvement in mean AUC)

Page 54: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 55: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Two paradigms: (50 years of research in 1 slide)
•  Knowledge Tracing
   – Allows learning
   – Every item = same difficulty
   – Every student = same ability
•  Item Response Theory
   – NO learning
   – Models item difficulties
   – Models student abilities

Page 56: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Can FAST help merging the paradigms?

Page 57: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Item Response Theory

•  In its simplest form, it is the Rasch model
•  The Rasch model can be formulated in many ways:
   – Typically using latent variables
   – As a logistic regression:
      •  a feature per student
      •  a feature per item
•  We end up with a lot of features! Good thing we are using FAST ;-)
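In its logistic-regression form, the Rasch model predicts P(correct for student s on item i) = sigmoid(theta_s - b_i): a dummy feature per student with weight theta_s, and one per item with weight -b_i. A minimal sketch (the ability and difficulty values are assumptions for illustration):

```python
import math

theta = {"s1": 1.2, "s2": -0.4}   # student abilities (assumed values)
b = {"i1": 0.5, "i2": -1.0}       # item difficulties (assumed values)

def rasch_p_correct(student, item):
    """Rasch model: P(correct) = sigmoid(ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(theta[student] - b[item])))
```

Stronger students and easier items both raise the predicted probability; what Rasch lacks, and FAST adds, is any change in the student over time.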

Page 58: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Results on the Java dataset

                    Overall AUC   Mean AUC
Knowledge Tracing   .67 ± .03     .56
FAST + IRT          .76 ± .03     .70

(13% and 25% improvements, respectively)

•  Java tutoring system Quizjet (Hsiao et al. ’10)
•  6,549 observations (first attempts), 60% correct
•  Randomly selected 50% of students for train; for the remaining students, place the first half of their practices in train and predict the rest
•  Only consider parameterizing emission

Page 59: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Current experiment
•  We choose one skill, “Nested Loops”, from the Java dataset (Caution: this is a private dataset, please don’t distribute this subset).
•  Randomly select 50% of students for train; for the remaining students, place the first half of their practices in train and predict the rest
•  Only consider parameterizing emission
•  The toolkit can automatically generate student and item dummy features according to the “student” and “problem” columns of the train and test sets.
•  Here, we force both hidden states to share features, which means the student ability or item difficulty remains the same whether the student is in the learned or unlearned state.

Page 60: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Have a look at the input data…

•  Train set: training datapoints
•  Test set: training datapoints followed by test datapoints
•  We need to put the entire skill-student sequence in the test set (using the “fold” column to differentiate datapoints from train or test). This allows the toolkit to update its knowledge estimates from the historical practices.

Page 61: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Have a look at the configuration file for FAST …

modelName FAST+IRT1
parameterizing true
parameterizingInit false
parameterizingTran false
parameterizingEmit true
forceUsingAllInputFeatures true
generateStudentDummy true
generateItemDummy true
nbRandomRestart 1
inDir ./data/item_exp/
outDir ./data/item_exp/
trainInFilePrefix train
testInFilePrefix test
inFileSuffix .csv
EMMaxIters 500
LBFGSMaxIters 50
EMTolerance 1.0E-6
LBFGSTolerance 1.0E-6

data/IRT_exp/FAST+IRT1.conf

Page 62: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Let’s run baselines and FAST+IRT models …

•  Type the following command consecutively for 1) KT1.conf, 2) KT2.conf, 3) FAST+IRT1.conf, 4) FAST+IRT2.conf:
   java -jar fast-2.1.0-final.jar ++data/IRT_exp/XXX.conf
•  Open XXX_Evaluation.csv

Model       #restarts   AUC   Time(s)
KT1         1           .60   <1
KT2         20          .59   1
FAST+IRT1   1           .71   3
FAST+IRT2   20          .73   39

(24% improvement)

Page 63: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Outline
•  Introduction
•  FAST – Feature-Aware Student Knowledge Tracing
   1.  Low Complexity with Many Features
   2.  Flexible Feature Engineering
   3.  Flexible Parameterization
   4.  High Predictive Performance, Plausibility and Consistency
•  Toolkit 1-2-3
•  Walk-through Examples
   1.  Item difficulty
   2.  Temporal Item Response Theory
•  Conclusion

Page 64: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Comparison of existing techniques

                                             allows features   slip/guess   recency/ordering   learning
FAST                                         ✓                 ✓            ✓                  ✓
PFA (Pavlik et al. ’09)                      ✓                 ✗            ✗                  ✓
Knowledge Tracing (Corbett & Anderson ’95)   ✗                 ✓            ✓                  ✓
Rasch Model (Rasch ’60)                      ✓                 ✗            ✗                  ✗

Page 65: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

•  FAST lives by its name
•  FAST provides high flexibility in utilizing features, and as our studies show, even simple features improve significantly over Knowledge Tracing

Page 66: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

•  The effect of features depends on how smartly they are designed and on the dataset
•  We are looking forward to more clever uses of feature engineering for FAST in the community.

Page 67: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Thank you!

Page 68: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Multiple subskills

•  Experts annotated items (questions) with a single skill and multiple subskills

Page 69: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Multiple subskills & Knowledge Tracing
•  Original Knowledge Tracing cannot model multiple subskills
•  Most Knowledge Tracing variants assume equal importance of subskills during training (and then adjust it during testing)
•  The state-of-the-art method, LR-DBN [Xu and Mostow ’11], assigns importance in both training and testing

Page 70: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

FAST can handle multiple subskills

•  Parameterize learning •  Parameterize slip and guess

•  Features: binary variables that indicate presence of subskills

Page 71: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

FAST vs Knowledge Tracing: Slip parameters of subskills

•  Conventional Knowledge Tracing assumes that all subskills have the same difficulty (red line)
•  FAST can identify different difficulties between subskills
•  Does it matter?

[Figure: slip parameters for the subskills within a skill]

Page 72: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

State of the art (Xu & Mostow’11)

•  The 95% confidence intervals are within ±.01 points

Model           AUC
LR-DBN          .71
KT - Weakest    .69
KT - Multiply   .62

Page 73: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Benchmark

Model             AUC
LR-DBN            .71
Single-skill KT   .71
KT - Weakest      .69
KT - Multiply     .62

•  The 95% confidence intervals are within ±.01 points
•  We are testing on non-overlapping students; LR-DBN was designed/tested on overlapping students and didn’t compare to single-skill KT


Page 75: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Benchmark

•  The 95% confidence intervals are within ±.01 points

Model             AUC
FAST              .74
LR-DBN            .71
Single-skill KT   .71
KT - Weakest      .69
KT - Multiply     .62

Page 76: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Have a look at the input data…

data/others/FAST+subskill_train0.csv

Page 77: 2015EDM: Feature-Aware Student Knowledge Tracing Tutorial

Let’s run FAST+subskill models …

•  Move to the “data/others/” folder
•  Copy fast-2.1.0-final.jar into this folder
•  Create an “input” folder under this folder, and put FAST+subskill_train0.txt and FAST+subskill_test0.txt under it.
•  Type the following command:
   java -jar fast-2.1.0-final.jar ++FAST+subskill.conf