Special Topics in Educational Data Mining
description
Transcript of Special Topics in Educational Data Mining
![Page 1: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/1.jpg)
Special Topics in Educational Data Mining
HUDK5199Spring term, 2013January 23, 2013
![Page 2: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/2.jpg)
Wow
• Welcome!
• There’s a lot of you– It’s great to see so much interest in EDM
![Page 3: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/3.jpg)
Administrative Stuff
• Is everyone signed up for class?
• If not, and you want to receive credit, please talk to me after class
![Page 4: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/4.jpg)
Class Schedule
![Page 5: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/5.jpg)
Class Schedule
• Updated versions will be available on the course webpage
• PDF files are also available there for publicly available readings
• Other readings will be made available in a course Dropbox
![Page 6: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/6.jpg)
Class Schedule
• After I made the schedule– Two days a week, 100 minutes per session
• I was told that I’m teaching double the content of a typical class at TC
![Page 7: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/7.jpg)
Class Schedule
• After I made the schedule– Two days a week, 100 minutes per session
• I was told that I’m teaching double the content of a typical class at TC
• Oops!– Newbie mistake!
![Page 8: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/8.jpg)
My Solution
• I have cut the course schedule to be only 50% more content than usual
• These cuts also coincide with when I need to travel to the LAK, CREA, and AERA conferences– Might as well solve two problems at once…
• See the course schedule…
![Page 9: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/9.jpg)
Required Texts
• Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques.
![Page 10: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/10.jpg)
Readings
• This is a graduate class
• I expect you to decide what is crucial for you
• And what you should skim to be prepared for class discussion and for when you need to know it in 8 years
![Page 11: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/11.jpg)
Readings
• That said
![Page 12: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/12.jpg)
Readings and Participation
• It is expected that you come to class– I will not be taking attendance
• It is expected that you be prepared for class by skimming the readings to the point where you can participate effectively in class discussion– I will not be giving quizzes
• This is your education, make the most of it!
![Page 13: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/13.jpg)
Course Goals
• This course covers methods from the emerging area of educational data mining.
• You will learn how to execute these methods in standard software packages
• And the limitations of existing implementations of these methods.
• Equally importantly, you will learn when and why to use these methods.
![Page 14: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/14.jpg)
Course Goals
• Discussion of how EDM differs from more traditional statistical and psychometric approaches will be a key part of this course
• In particular, we will study how many of the same statistical and mathematical approaches are used in different ways in these research communities.
![Page 15: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/15.jpg)
Assignments
• There will be 10 homeworks
• You choose 8 of them to complete– 4 from the first 5 (e.g. HW 1-5)– 4 from the second 5 (e.g. HW 6-10)
![Page 16: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/16.jpg)
Assignments
• Homeworks will be due at least 3 hours before the beginning of class (e.g. noon) on the due date
• Since you have a choice of homeworks, extensions will only be granted for instructor error or extreme circumstances– Outside of these situations, late = 0 credit
• Homeworks will be due before the class session where their topic is discussed
![Page 17: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/17.jpg)
Why?
• These are not your usual homeworks
• Most homework is assigned after the topic is discussed in class, to reinforce what is learned
• This homework is due before the topic is discussed in class, to enable us to talk more concretely about the topic in class
![Page 18: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/18.jpg)
These homeworks
• These homeworks will not require flawless, perfect execution
• They will require personal discovery and learning from text resources
• Giving you a base to learn more from class discussion
![Page 19: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/19.jpg)
Because of that
• You must be prepared to discuss your work in class
• You do not need to create slides
• But be prepared – to have your assignment projected– to discuss aspects of your assignment in class
![Page 20: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/20.jpg)
I’m not your textbook
• I want you to learn what you can from the readings and homework
• And then we’ll leverage my experience in discussing the issues the readings and homeworks bring forth
![Page 21: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/21.jpg)
Homework
• All assignments for this class are individual assignments– You must turn in your own work– It cannot be identical to another student’s work– The goal is to get diverse solutions we can discuss in class
• However, you are welcome to discuss the readings or technical details of the assignments with each other
![Page 22: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/22.jpg)
Examples
• Buford can’t figure out the UI for the software tool. Alpharetta helps him with the UI.– OK!
• Deanna is struggling to understand the item parameter in PFA to set up the mathematical model. Carlito explains it to her.– OK!
![Page 23: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/23.jpg)
Examples
• Fernando and Evie do the assignment together from beginning to end, but write it up separately. – Not OK
• Giorgio and Hannah do the assignment separately, but discuss their (fairly different) approaches over lunch – OK!
![Page 24: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/24.jpg)
Plagiarism and Cheating: Boilerplate Slide
• Don’t do it
• If you have any questions about what it is, talk to me before you turn in an assignment that involves either of these
• University regulations will be followed to the letter
• That said, I am not really worried about this problem in this class
![Page 25: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/25.jpg)
Grading
• 8 of 10 Assignments – 10% each (up to a maximum of 80%)
• Class participation 20%
• PLUS: For every homework, there will be a special bonus of 20% for the best hand in. ‐“Best” will be defined in each assignment.
![Page 26: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/26.jpg)
Examinations
• None
![Page 27: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/27.jpg)
Accommodations for Students with Disabilities
• See syllabus and then see me
![Page 28: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/28.jpg)
Questions
• Any questions on the syllabus, schedule, or administrative topics?
![Page 29: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/29.jpg)
Who are you
• And why are you here?
• What kind of methods do you use in your research/work?
• What kind of methods do you see yourself wanting to use in the future?
![Page 30: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/30.jpg)
This Class
![Page 31: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/31.jpg)
EDM
“Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in.”
(www.educationaldatamining.org)
![Page 32: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/32.jpg)
32
EDM is…
• “… escalating the speed of research on many problems in education.”
• “Not only can you look at unique learning trajectories of individuals, but the sophistication of the models of learning goes up enormously.”
• Arthur Graesser, Editor, Journal of Educational Psychology
![Page 33: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/33.jpg)
33
EDM is…
• “… great.”
• Me
![Page 34: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/34.jpg)
34
Types of EDM method(Baker & Yacef, 2009)
• Prediction– Classification– Regression– Density estimation
• Clustering• Relationship mining
– Association rule mining– Correlation mining– Sequential pattern mining– Causal data mining
• Distillation of data for human judgment• Discovery with models
![Page 35: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/35.jpg)
35
Types of EDM method(Baker & Siemens, in preparation)
• Prediction– Classification– Regression– Latent Knowledge Estimation
• Structure Discovery– Clustering– Factor Analysis– Domain Structure Discovery– Network Analysis
• Relationship mining– Association rule mining– Correlation mining– Sequential pattern mining– Causal data mining
• Distillation of data for human judgment• Discovery with models
![Page 36: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/36.jpg)
Prediction
• Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables)
• Which students are off-task?• Which students will fail the class?
![Page 37: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/37.jpg)
Structure Discovery
• Find structure and patterns in the data that emerge “naturally”
• No specific target or predictor variable
![Page 38: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/38.jpg)
Structure Discovery
• Different kinds of structure discovery algorithms find…
![Page 39: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/39.jpg)
Structure Discovery
• Different kinds of structure discovery algorithms find… different kinds of structure– Clustering: commonalities between data points– Factor analysis: commonalities between variables– Domain structure discovery: structural
relationships between data points (typically items)– Network analysis: network relationships between
data points (typically people)
![Page 40: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/40.jpg)
Relationship Mining
• Discover relationships between variables in a data set with many variables– Association rule mining– Correlation mining– Sequential pattern mining– Causal data mining
![Page 41: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/41.jpg)
Discovery with Models
• Pre-existing model (developed with EDM prediction methods… or clustering… or knowledge engineering)
• Applied to data and used as a component in another analysis
![Page 42: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/42.jpg)
Distillation of Data for Human Judgment
• Making complex data understandable by humans to leverage their judgment
![Page 43: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/43.jpg)
Why now?
• Just plain more data available
• Education can start to catch up to research in Physics and Biology…
![Page 44: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/44.jpg)
Why now?
• Just plain more data available
• Education can start to catch up to research in Physics and Biology… from the year 1985
![Page 45: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/45.jpg)
Why now?
• In particular, the amount of data available from educational software is orders of magnitude more than was available just a decade ago
• Supported by open educational data bases like the PSLC DataShop (next week)
![Page 46: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/46.jpg)
Learning Analytics
• A closely related community
• Who here has heard of Learning Analytics?
![Page 47: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/47.jpg)
Two communities
• Society for Learning Analytics Research– First conference: LAK2011
• International Educational Data Mining Society– First event: EDM workshop in 2005 (at AAAI)– First conference: EDM2008– Publishing JEDM since 2009
![Page 48: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/48.jpg)
Learning Analytics
“… the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.”
![Page 49: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/49.jpg)
Two communities
• Joint goal of exploring the “big data” now available on learners and learning
• To promote– New scientific discoveries & to advance learning sciences– Better assessment of learners along multiple dimensions
• Social, cognitive, emotional, meta-cognitive, etc.• Individual, group, institutional, etc.
– Better real-time support for learners
![Page 50: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/50.jpg)
Key Distinctions(Siemens & Baker, 2012)
![Page 51: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/51.jpg)
Key Distinctions: Origins
• LAK– Semantic web, intelligent curriculum, social
networks, outcome prediction, and systemic interventions
• EDM– Educational software, student modeling, course
outcomes
![Page 52: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/52.jpg)
Key Distinctions: Modes of Discovery
• LAK– Leveraging and supporting human judgment is key;
automated discovery is a tool to accomplish this goal– Information distilled and presented to human decision-
maker
• EDM– Automated discovery is key; leveraging human judgment is
a tool to accomplish this goal– Humans provide labels which are used in classifiers
![Page 53: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/53.jpg)
Key Distinctions: Guiding Philosophy
• LAK– Stronger emphasis on understanding systems as
wholes, in their full complexity– “Holistic” approach
• EDM– Stronger emphasis on reducing to components
and analyzing individual components and relationships between them
![Page 54: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/54.jpg)
Key Distinctions: Adaptation and Personalization
• LAK– Greater focus on informing and empowering
instructors and learners and influencing the design of the education system
• EDM– Greater focus on automated adaption (e.g. by the
computer with no human in the loop) and influencing the design of interactions
![Page 55: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/55.jpg)
Questions? Comments?
![Page 56: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/56.jpg)
Tools• There are a bunch of tools you can use in this class
– I don’t have strong requirements about which tools you choose to use
• We’ll talk about them throughout the semester
• You may want to think about downloading or setting up accounts for– RapidMiner (I prefer 4.6. 5.2 is fine, I just will not be able to give as much tech
support)– SAS OnDemand for Academics– Weka– Microsoft Excel– Java– Matlab
• No hurry, but keep it in mind…
![Page 57: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/57.jpg)
Next Class• Monday, January 8• 3pm-4:40pm
• Bayesian Knowledge Tracing
• Corbett, A.T., Anderson, J.R. (1995) Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, 253-278.
• Baker, R.S.J.d., Corbett, A.T., Aleven, V. (2008) More Accurate Student Modeling Through Contextual Estimation of Slip and Guess Probabilities in Bayesian Knowledge Tracing. Proceedings of the 9th International Conference on Intelligent Tutoring Systems, 406-415
![Page 58: Special Topics in Educational Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062410/56816381550346895dd46564/html5/thumbnails/58.jpg)
The End