Generating Semantic Concept Map for MOOCs · 2016-06-28 · Generating Semantic Concept Map for...
Transcript of Generating Semantic Concept Map for MOOCs · 2016-06-28 · Generating Semantic Concept Map for...
Generating Semantic Concept Map for MOOCs
Zhuoxuan Jiang, Peng Li, Yan Zhang, Xiaoming Li
School of Electronics Engineering and Computer Science, Peking University, China
Methodology
MOOC Materials Need Reorganization• Improve experience of learning process (adaptive learning)
• Knowledge management across domains
• Concept map, useful but manually costly (experts required)
Semantic Concept Map (SCM)Definition: SCM = {C, R}. C is set of concepts {c1,c2,...,cn}. R is
set of relationships (edges) between concepts {r11,...,rij,...,rnn}
• Consider only the semantic relationship
• Concise for program to run large-scalely
• Universal to be scaled to cross-domain courses
Applications of SCM• Discrimination of approximate concepts (e.g. for quiz)
• Crowdsourcing of course Wiki for social learning
• Concept hints in forum discussions
Two-phase SCM generation approach• Phase 1: Concept extraction
• Phase 2: Relationship establishment
Introduction
Textual Preprocessing• Tokenization, filtering, word segment, etc.
• Conflation, data are randomly shuffled
Concept Extraction• Sequence labeling problem, each word ϵ {B, I, O}
• Course- and instructor-agnostic features: stylistic, structural,
contextual, semantic and dictionary features
• CRF + semi-supervised machine learning framework
• Self-training, to reduce the cost of labeling
Definition of Node and Edge• Node weight for Fundamentality: term frequency(tf)
• Node weight for Importance: term frequency and inverted
document frequency (tf-idf)
• Edge weight: semantic similarity (cosine distance of concept
word vectors)
Learning Path Generation• For each concept, iteratively choose top k most semantically
similar concepts
• Regard the most fundamental/important ones as successors
• Consider locational order of appearance in Lecture Notes
Experiments
Data SetsTeaching materials of an interdisciplinary course conducted on
Coursera, including lecture notes, PPTs and questions
Baselines to Extract Concepts• Term Frequency (TF), a statistic baseline
• Bootstrapping (BT), a rule-based iterative algorithm
• Supervised Concept-CRF (SC-CRF), a supervised CRF
• Semi-Supervised Concept-CRF (SSC-CRF), semi-supervised
version of SC-CRF
ResultsTable 2: Performance of
Baselines and Our Approach
Figure 1: Procedure of Semantic Concept Map Generation
Figure 2: Comparison between SC-CRF
and SSC-CRF
Table 1: Statistics of the Course: People and Network
Figure 3: Two Kinds of Semantic Concept Map and Learning Path Generated
(b) For importance(a) For fundamentality
Conclusion• Technical and pedagogical feasibility of the novel SCM
• A universal two-phase approach to re-organize MOOC
materials with a good efficacy
• The SCMs and learning paths generated can be manually
revised up to requirements
• To be scaled to more across-domain courses in the future
Teaching
Materials
Textual
Preprocessing
Concept
Extraction
Semantic
Concept
Map
Learning Path
Generation
Definition of
Node and Edge
Phase 1 Phase 2
Two demo learning paths:(a) Node → Edge → Element → Set → Alternative → Vote
(b) PageRank → PageRankAlgorithm → SmallWorld → Balance → NashBalance