Putting the Spotlight on Identifying Students … Fora/Forum2011...Attrition within Australian...

AAIR 2011 Forum Putting the Spotlight on Identifying Students at Risk of Attrition A Case Study in Applied Data Analytics © 2011 IBM Corporation Nathan Banks, IBM Dean Ward, Edith Cowan University

Transcript of Putting the Spotlight on Identifying Students … Fora/Forum2011...Attrition within Australian...

Page 1

AAIR 2011 Forum

Putting the Spotlight on Identifying Students at Risk of Attrition
A Case Study in Applied Data Analytics

© 2011 IBM Corporation

Nathan Banks, IBM
Dean Ward, Edith Cowan University

Page 2

Attrition within Universities

All universities are exposed to attrition.

Large resources are devoted to reducing attrition, with varying impacts and results.

Attrition occurs at various points, with some major points being:
o Applications Offered to Enrolments Commenced
o Enrolment Commenced to First Census Date
o First Census to end of 1st Semester
o 1st Semester to End of 1st Year
o 1st Year to 2nd Year
o 2nd Year to 3rd Year

Each University’s “profile” is different; however, the following is a generalised view:

Page 3

Attrition within Australian Universities – Completion Rate

[Chart: completion rate (0% to 100%) for three series, Lowest, Mid and Highest, at each stage: Commence; Commence and make it to first Census Date; Make it through the following Year; Make it through to the end of Year 2; Make it through to the end of Year 3]

Page 4

Predictive Analytics and Attrition

Lots of activity in the sector:

1) New York – Mayor Bloomberg – “get the Completion Rate up” from 46.5% in 2005 – Analytics used at some Unis – now 61% (October, 2011)

2) Models deployed and in use at a number of Universities in the USA – US Coast Guard Academy, Arizona/Nevada Unis implement and improve retention by 4% – “We have limited resources in terms of our student assistance program, and we want to make sure that we engage the right students and are not spending time on students that really don’t need the help.”

3) Bill & Melinda Gates Foundation – awarded $1M (USD) to unify data from five Unis to “demonstrate the use of predictive analytics methods for improving student outcomes.” (May, 2011, Western Interstate Commission for Higher Education)
• 640,000 anonymous student records
• 3,000,000 Unit records
• 33 common variables
• First findings were due to be released on the 28th October

Page 5

ECU and IBM – Predictive Analytics

ECU, in partnership with IBM, sought to:

1) Determine a probability of attrition for major student cohorts – Domestic Students, Undergraduate and certain Post Graduates – through the use of a 450 variable dataset

2) Provide a drill-through to how the probability was arrived at, i.e. the driving variables

Page 6

Overview

Agreed Scope of Modelling – Degree Type

Agreed Scope of Modelling – Semester of Study

Dealing with False Attrition – Combining/Excluding Course switchers

Model Building Approach

Predictive Characteristics Before 1st Semester

Attrition Model before Semester 1

Attrition Timing During 1st Semester

Predictive Characteristics Before 2nd Semester

Attrition During 2nd Semester

Model Score Distributions

Recommendations

Key Findings

Page 7

Agreed Scope of Modelling – Degree Type

Students 2009-2011: 39,655
o Higher Education: 38,307 (97%)
   o Undergraduate: 23,132 (58%)
      o Dip, Adv Dip & Ass Dip: 108 (0%)
      o Bachelor Pass: 21,808 (96%)
      o Ba Hons, grad entry, Ass Deg: 5,045 (8%)
   o Enabling: 3,021 (8%)
   o Postgraduate: 11,881 (30%)
      o Postgraduate Research: 506 (1%)
      o Postgraduate Course work: 11,375 (29%)
         o Grad Cert: 2,316 (20%)
         o Grad Dip & PG Dip: 3,348 (30%)
         o Masters Coursework: 5,696 (50%)
         o Doctorate Coursework: 15 (0%)
   o Cross Institution & Other: 1,180 (3%)
o VET, Other: 1,348 (3%)

Page 8

Agreed Scope of Modelling – Semester of Study

[Chart: Attrition Rate* (0% to 18%) by Semester of Study (1 to 5)]

* Attrition is for students enrolled between 2009-2010 in major campuses and excludes students who start a second course at ECU in that time.

Page 9

Scope of Modelling – Semester of Study

Over 10% of students move course in the observation period.

If a course code changes from A to B, then a student’s results need to be combined; failure to do so creates false attrition.

If a student leaves course A for a completely different course Z, then the student’s results should not be combined, or progress in course Z will be overestimated.

There are many cases where the degree of overlap between courses attempted by a student is uncertain, e.g. from double degree to single degree, same course but different major, same field of study but different course.

Dean and I made a case-by-case call to combine 735 courses and exclude 4000 students who switched courses, guided by the following information:
o Course code
o Course description
o Course status (active, inactive, pre-expire)
o Course field of study, same or different
o Number of students moving between courses
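The combine/exclude rule above can be illustrated with a minimal sketch. It is not the procedure used in the study, where each call was made case by case. The `COMBINED_COURSES` mapping, the function names and the `(student_id, course_code, semester)` record layout are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from a replaced course code to the code it was
# combined with; it stands in for the case-by-case decisions described above.
COMBINED_COURSES = {"A": "B"}


def canonical_course(code, mapping=COMBINED_COURSES):
    """Follow combine decisions until a code with no replacement is reached."""
    seen = set()
    while code in mapping and code not in seen:
        seen.add(code)
        code = mapping[code]
    return code


def find_attrition(enrolments, last_semester):
    """enrolments: iterable of (student_id, course_code, semester) tuples.

    Returns (attrited, switched) where
      attrited is the set of (student, course) pairs last seen before
               last_semester, ignoring excluded students, and
      switched is the set of students enrolled in more than one course
               after combining, who are excluded from modelling.
    """
    last_seen = defaultdict(int)
    courses = defaultdict(set)
    for student, code, semester in enrolments:
        course = canonical_course(code)
        courses[student].add(course)
        key = (student, course)
        last_seen[key] = max(last_seen[key], semester)
    switched = {s for s, c in courses.items() if len(c) > 1}
    attrited = {
        key for key, semester in last_seen.items()
        if semester < last_semester and key[0] not in switched
    }
    return attrited, switched


records = [
    (1, "A", 1), (1, "B", 2),  # code change A -> B: same course, no attrition
    (2, "A", 1),               # never returns: genuine attrition
    (3, "A", 1), (3, "Z", 2),  # moves to unrelated course Z: excluded
]
attrited, switched = find_attrition(records, last_semester=2)
```

Without the mapping, student 1 would appear to leave course A after one semester, which is the false attrition the slide warns about.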

Page 10

Model Building Approach

1. Remove exclusions, and split students completing their first, second and third semester into separate datasets.

2. Split each modelling dataset into a training (75%) and validation (25%) dataset.

3. Classify students as “attrite before next semester” or “don’t attrite before next semester”.

4. Apply modelling techniques (neural networks, support vector machines, rule induction, logistic regression) to predict student attrition based on characteristics of students in the training dataset.

5. Choose the best performing models based on their ability to predict attrition of students in the validation sample, which were not used to build the model.
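Steps 2 and 5 can be sketched as follows, as an illustration only. Model fitting (step 4) is omitted: each model is represented by an already-fitted scoring function. Using Gini (2 × AUC − 1) as the validation measure is an assumption, and all function names are hypothetical.

```python
import random


def split_train_validation(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into training and validation lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def gini(scores, labels):
    """Gini = 2 * AUC - 1, with AUC taken over all attrite/non-attrite pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return 2.0 * wins / (len(pos) * len(neg)) - 1.0


def choose_best_model(models, validation):
    """models: dict of name -> fitted scoring function (features -> score).
    validation: list of (features, label) pairs that were held out of training.
    """
    labels = [y for _, y in validation]
    results = {
        name: gini([score(x) for x, _ in validation], labels)
        for name, score in models.items()
    }
    best = max(results, key=results.get)
    return best, results


train, valid = split_train_validation(range(100))
best, results = choose_best_model(
    {"good": lambda x: x, "bad": lambda x: -x},  # toy stand-ins for fitted models
    [(0.9, 1), (0.1, 0), (0.8, 1), (0.3, 0)],
)
```

Scoring only on the held-out 25% means a model that has merely memorised the training students gains no advantage in the comparison.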

Page 11

Predictive Variables Before 1st Semester

[Bar chart: Predictivity (Information Value), axis 0.00 to 0.35, for each variable below, listed in the order shown on the chart]

o Detailed Basis Of Admission
o Narrow Funding Category
o Age At Enrolment
o Attendance Type
o Broad Citizenship
o Advanced Standing
o Birth Country
o Course Credit Points Required
o Course Total EFTSL
o Year Arrival
o Detailed Course Level
o Home Campus
o Years To Complete Course
o Home Language
o Year Completed School
o Course Size at Enrolment
o Postgraduate/Undergraduate
o Broad Basis Of Admission
o Broad Field Of Education
o National SES Status
o Non-English Speaking Background
o Highest Parent Education
o Scholarship
o Competence In English
o Secondary School National SES
o Current Course Offer Round
o Year 12 Student Count
o Tertiary Entrance Rank (TER)
o Start Semester (1 or 2)
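The slides do not say how Information Value was calculated. A minimal sketch, assuming the common definition IV = Σ (p_stay − p_attrite) × ln(p_stay / p_attrite) summed over the categories of one variable, is given below. The function name and the `smoothing` constant, added so that empty cells do not produce a division by zero, are assumptions.

```python
import math
from collections import Counter


def information_value(categories, attrited, smoothing=0.5):
    """Information Value of one categorical variable against the attrition flag.

    categories: the variable's value for each student
    attrited:   truthy if that student attrited, falsy otherwise
    smoothing:  assumed pseudo-count added to every cell to avoid log(0)
    """
    stay = Counter(c for c, a in zip(categories, attrited) if not a)
    leave = Counter(c for c, a in zip(categories, attrited) if a)
    levels = set(stay) | set(leave)
    total_stay = sum(stay.values()) + smoothing * len(levels)
    total_leave = sum(leave.values()) + smoothing * len(levels)
    iv = 0.0
    for level in levels:
        p_stay = (stay[level] + smoothing) / total_stay
        p_leave = (leave[level] + smoothing) / total_leave
        iv += (p_stay - p_leave) * math.log(p_stay / p_leave)
    return iv


# Toy data: attendance type unrelated to attrition, then fully aligned with it.
unrelated = information_value(["FT", "FT", "PT", "PT"], [0, 1, 0, 1])
aligned = information_value(["FT", "FT", "PT", "PT"], [0, 0, 1, 1])
```

A variable whose categories have the same mix of staying and attriting students scores zero, and the score grows as the mixes diverge, which is what allows variables to be ranked on a single axis.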

Page 12

Attrition Model before Semester 1

Due to the non-linear, interactive and categorical nature of the predictive characteristics, a CHAID (Chi-squared Automatic Interaction Detector) tree was found to perform best.

The natural first split of the CHAID tree is narrow funding category.

We instead force the first split to be undergraduate/postgraduate, which is highly correlated with narrow funding category.
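CHAID chooses each split using chi-squared tests between a candidate variable and the outcome. The sketch below computes only the Pearson chi-squared statistic for one candidate. A full CHAID implementation also merges similar categories and compares adjusted p-values, which is omitted here. The function name and table layout are hypothetical.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table: one row per category of the candidate variable, each row being
           [count_stayed, count_attrited]. Every row and column total is
           assumed to be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Toy counts: a variable unrelated to attrition, then one fully aligned with it.
no_signal = chi_squared([[10, 10], [10, 10]])
strong_signal = chi_squared([[20, 0], [0, 20]])
```

Because the statistic grows with the number of categories, candidates with different degrees of freedom are compared on p-values rather than on the raw statistic.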

Funding Category                    Postgrad  Undergrad  Overall %
Commonwealth Government Supported   35%       86%        70%
Domestic Tuition Fee                47%       0%         15%
Fee-paying Overseas                 18%       13%        15%
UNKNOWN                             1%        0%         1%
Total                               100%      100%       100%

Page 13

First Semester Undergraduate – First 13 Nodes

Page 14

First Semester Postgraduate – First 14 Nodes

Page 15

Predictivity of Postgrad/Undergrad Models

[Bar chart: Semester 1 Predictivity of the CHAID Undergrad and Postgrad models; axis: Predictivity (Gini), 0 to 0.4]

Page 16

Detailed Basis Of Admission

Page 17

Detailed Basis Of Admission

Page 18

Narrow Funding Category

Commonwealth Government supported students make up the majority of the population (80%)

Domestic Tuition fee paying students

Funding category is closely related to undergraduate/postgraduate

Unknown represents incomplete data for 120 students (<1%)

Page 19

Age At Enrolment (Undergraduate)

Page 20

Age At Enrolment (Postgraduate)

Page 21

Attendance Type

Page 22

Broad Citizenship

Page 23

Advanced Standing

Page 24

Birth Country
