DMLP
8/9/2019 DMLP
Aca Format X
BAHUBALI COLLEGE OF ENGINEERING, SHRAVANABELAGOLA
Lesson/Session Plan Template
Department of Information Science & Engineering
Sub code: 06IS74
Sub: DATA MINING Sem: VII
SN Date Content Activity
UNIT - 1
INTRODUCTION, DATA 1: What is Data Mining? Motivating Challenges; The origins of data
mining; Data Mining Tasks. Types of Data; Data Quality.
6 Hours
1 What is Data Mining? Participation & discussion
2 Motivating Challenges;
3 The origins of data mining; Explain with an example
4 Data Mining Tasks. Discussions
5 Types of Data;
6 Types of Data Discussions
7 Data Quality.
Measurement & data Collection Issues
Issues Related To Application
8 QP
Assignment
UNIT - 2
DATA 2: Data Preprocessing; Measures of Similarity and Dissimilarity
6 Hours
9 Data Preprocessing
Aggregation
Sampling
Participation & discussion
10 Data Preprocessing cont.
Dimensionality Reduction
Feature Subset Selection
Feature Creation
Discussions
11 Data Preprocessing cont.
Discretization & Binarization
Variable Transformation
Explain with an example
12 Measures of Similarity and Dissimilarity
Basics
Similarity & Dissimilarity between Simple Attributes
Dissimilarities Between Data Objects
13 Measures of Similarity and Dissimilarity
Similarities Between Data Objects
Examples of Proximity Measures
Explain with an example
14 Issues in Proximity Calculation
Selecting The Right Proximity Measure
15 Question Paper
16 Question Paper
UNIT - 7
FURTHER TOPICS IN DATA MINING: Multidimensional analysis and descriptive mining of
complex data objects; Spatial data mining; Multimedia data mining; Text mining; Mining the WWW. Outlier analysis.
7 Hours
17 Multidimensional analysis and descriptive mining of complex data
objects
Generalization of Structured Data
Aggregation and Approximation in Spatial and Multimedia
Data Generalization
Generalization of Object Identifiers and Class/subclass
Hierarchies
Participation & discussion
18 Multidimensional analysis and descriptive mining of complex data
objects cont.
Generalization of Class Composition Hierarchies
Construction and Mining of Object Cubes
Generalization Based Mining of Plan Databases by Divide
and Conquer
19 Spatial data mining;
Spatial data Cube Construction and Spatial OLAP
Mining Spatial Association and Co-location Patterns
Discussions
20 Spatial Clustering Methods
Spatial Classification and Spatial Trend Analysis
21 Multimedia data mining; Mining Raster Databases
Similarity Search in Multimedia Data
Multidimensional Analysis of Multimedia Data
22 Multimedia data mining; Mining Raster Databases cont.
Classification and Prediction Analysis of Multimedia Data
Mining Association in Multimedia Data
Audio & Video Data Mining
23 Text mining
Text Data Analysis and Information Retrieval
Dimensionality Reduction for Text
Text Mining Approach
24 Mining the WWW.
Mining the Web page layout structure
Mining the Web link Structure to Identify Authoritative Web Pages
25 Mining Multimedia Data on the Web
Automatic Classification of Web Documents
Web Usage Mining
UNIT - 8
APPLICATIONS: Data mining applications; Data mining system products and research prototypes;
Additional themes on Data mining; Social impact of Data mining; Trends in Data mining.
6 Hours
26 Data mining applications;
Data mining for Financial Data Analysis
Retail Industry
Telecommunication Industry
Participation & discussion
27 Biological Data Analysis
Other Scientific Application
Intrusion Detection
28 Data mining system products and research prototypes;
How to Choose a Data mining System
Examples of Commercial Data Mining Systems
Explain with an example
29 Additional themes on Data mining
Theoretical Foundation of Data Mining
Statistical Data Mining
Discussions
30 Visual and Audio Data Mining
Data Mining, Privacy, and Data Security
31 Social impact of Data mining;
Ubiquitous and Invisible Data Mining
Data Mining, Privacy, and Data Security
32 Trends in Data mining.
33 Question Bank Discussions
UNIT - 3
CLASSIFICATION: Preliminaries; General approach to solving a classification problem; Decision
tree induction; Rule-based classifier; Nearest-neighbor classifier.
8 Hours
34 General approach to solving a classification problem; Participation & discussion
35 Decision tree induction;
How a Decision Tree Works
How To Build A Decision Tree
Method for expressing attribute test conditions
Discussions
36 Decision tree induction cont. Discussions
Measure for selecting the best split
Algorithm for decision tree induction
An example : web robot detection
Characteristics of decision tree induction
37 Rule-based classifier
How a rule based classifier works
Rule ordering schemes
How to build a rule based classifier
Discussions
38 Rule-based classifier cont.
Direct methods for rule extraction
Indirect method for rule extraction
Characteristics of rule based classifier
39 Nearest-neighbor classifier.
Algorithm
Characteristics Of Nearest Neighbor Classifier
Discussions
40 Question Paper Assignment
UNIT - 4
ASSOCIATION ANALYSIS 1: Problem Definition; Frequent Itemset generation; Rule
Generation; Compact representation of frequent itemsets; Alternative methods for generating frequent
itemsets.
6 Hours
41 Problem Definition; Participation & discussion
42 Frequent Itemset generation;
The Apriori Principle
Frequent Itemset Generation in the Apriori Algorithm
Candidate Generation and Pruning
Support Counting
Computational Complexity
43 Rule Generation;
Confidence Based Pruning
Rule Generation in Apriori Algorithm
An Example: Congressional Voting Records
Discussions
44 Compact representation of frequent itemsets;
Maximal Frequent Itemsets
Closed Frequent Itemsets
45 Alternative methods for generating frequent itemsets. Discussions
46 Alternative methods for generating frequent itemsets.
47 Question paper Assignment
UNIT - 5
ASSOCIATION ANALYSIS 2: FP-Growth algorithm, Evaluation of association patterns; Effect
of skewed support distribution; Sequential patterns.
6 Hours
48 FP-Growth algorithm,
FP-Tree Representation
Frequent Itemset Generation in FP-Growth Algorithm
Participation & discussion
49 Evaluation of association patterns;
Objective Measures of Interestingness
Measures beyond Pairs of Binary Variables
Simpson's Paradox
Discussions
50 Effect of skewed support distribution;
51 Problem Formulation
Sequential Pattern Discovery
Timing Constraints
Alternative Counting Schemes
Explain with an example
52 Sequential patterns
53 Question paper Assignment
UNIT - 6
CLUSTER ANALYSIS: Overview, K-means, Agglomerative hierarchical clustering, DBSCAN,
Overview of Cluster Evaluation.
7 Hours
54 Overview,
What Is Cluster Analysis
Different Types of Clustering
Different Types of Clusters
Participation & discussion
55 K-means,
The basic K-means Algorithm
K-means: Additional issues
Bisecting K-Means
K-Means and Different Types of Clusters
Strengths and Weaknesses
K-means as an Optimization Problem
56 Agglomerative hierarchical clustering
Basic Agglomerative Hierarchical Clustering Algorithm
Specific Techniques
The Lance-Williams Formula for Cluster Proximity
Key Issues in Hierarchical Clustering
Strengths and Weaknesses
Discussions
57 DBSCAN
Traditional Density: Center-Based Approach
The DBSCAN Algorithm
Strengths and Weaknesses
58 Overview of Cluster Evaluation.
Overview
Unsupervised Cluster Evaluation Using Cohesion and
Separation
Unsupervised Cluster Evaluation Using Proximity Matrix
Unsupervised Evaluation of Hierarchical Clustering
Discussions
59 Overview of Cluster Evaluation.
Determining the correct Number of Clusters
Clustering Tendency
Supervised Measures of Cluster Validity
Assessing the Significance of Cluster Validity Measures
60 Question paper Assignment
TEXT BOOKS:
1. Introduction to Data Mining - Pang-Ning Tan, Michael Steinbach, Vipin Kumar,
Pearson Education, 2007
2. Data Mining Concepts and Techniques - Jiawei Han and Micheline Kamber, 2nd
Edition, Morgan Kaufmann, 2006.
REFERENCE BOOKS:
1. Insight into Data Mining Theory and Practice - K.P.Soman, Shyam Diwakar,
V.Ajay, PHI, 2006