Generating Semantic Concept Map for MOOCs · 2016-06-28 · Generating Semantic Concept Map for...

Post on 01-Apr-2020

13 views 0 download

Transcript of Generating Semantic Concept Map for MOOCs · 2016-06-28 · Generating Semantic Concept Map for...

Generating Semantic Concept Map for MOOCs

Zhuoxuan Jiang, Peng Li, Yan Zhang, Xiaoming Li

School of Electronics Engineering and Computer Science, Peking University, China

Methodology

MOOC Materials Need Reorganization• Improve experience of learning process (adaptive learning)

• Knowledge management across domains

• Concept map, useful but manually costly (experts required)

Semantic Concept Map (SCM)Definition: SCM = {C, R}. C is set of concepts {c1,c2,...,cn}. R is

set of relationships (edges) between concepts {r11,...,rij,...,rnn}

• Consider only the semantic relationship

• Concise for program to run large-scalely

• Universal to be scaled to cross-domain courses

Applications of SCM• Discrimination of approximate concepts (e.g. for quiz)

• Crowdsourcing of course Wiki for social learning

• Concept hints in forum discussions

Two-phase SCM generation approach• Phase 1: Concept extraction

• Phase 2: Relationship establishment

Introduction

Textual Preprocessing• Tokenization, filtering, word segment, etc.

• Conflation, data are randomly shuffled

Concept Extraction• Sequence labeling problem, each word ϵ {B, I, O}

• Course- and instructor-agnostic features: stylistic, structural,

contextual, semantic and dictionary features

• CRF + semi-supervised machine learning framework

• Self-training, to reduce the cost of labeling

Definition of Node and Edge• Node weight for Fundamentality: term frequency(tf)

• Node weight for Importance: term frequency and inverted

document frequency (tf-idf)

• Edge weight: semantic similarity (cosine distance of concept

word vectors)

Learning Path Generation• For each concept, iteratively choose top k most semantically

similar concepts

• Regard the most fundamental/important ones as successors

• Consider locational order of appearance in Lecture Notes

Experiments

Data SetsTeaching materials of an interdisciplinary course conducted on

Coursera, including lecture notes, PPTs and questions

Baselines to Extract Concepts• Term Frequency (TF), a statistic baseline

• Bootstrapping (BT), a rule-based iterative algorithm

• Supervised Concept-CRF (SC-CRF), a supervised CRF

• Semi-Supervised Concept-CRF (SSC-CRF), semi-supervised

version of SC-CRF

ResultsTable 2: Performance of

Baselines and Our Approach

Figure 1: Procedure of Semantic Concept Map Generation

Figure 2: Comparison between SC-CRF

and SSC-CRF

Table 1: Statistics of the Course: People and Network

Figure 3: Two Kinds of Semantic Concept Map and Learning Path Generated

(b) For importance(a) For fundamentality

Conclusion• Technical and pedagogical feasibility of the novel SCM

• A universal two-phase approach to re-organize MOOC

materials with a good efficacy

• The SCMs and learning paths generated can be manually

revised up to requirements

• To be scaled to more across-domain courses in the future

Teaching

Materials

Textual

Preprocessing

Concept

Extraction

Semantic

Concept

Map

Learning Path

Generation

Definition of

Node and Edge

Phase 1 Phase 2

Two demo learning paths:(a) Node → Edge → Element → Set → Alternative → Vote

(b) PageRank → PageRankAlgorithm → SmallWorld → Balance → NashBalance