DMLP

  • 8/9/2019 DMLP

    Aca Format X

    BAHUBALI COLLEGE OF ENGINEERING, SHRAVANABELAGOLA

    Lesson/Session Plan Template

    Department of Information Science & Engineering

    Sub code: 06IS74

    Sub: DATA MINING Sem: VII

    SN Date Content Activity

    UNIT - 1

    INTRODUCTION, DATA 1: What is Data Mining? Motivating Challenges; The origins of data

    mining; Data Mining Tasks. Types of Data; Data Quality.

    6 Hours

    1 What is Data Mining? Participation & discussion

    2 Motivating Challenges;

    3 The origins of data mining; Explain with an example

    4 Data Mining Tasks. Discussions

    5 Types of Data;

    6 Types of Data Discussions

    7 Data Quality.

    Measurement & data Collection Issues

    Issues Related To Application

    8 QP

    Assignment

    UNIT 2

    DATA 2: Data Preprocessing; Measures of Similarity and Dissimilarity

    6 Hours

    9 Data Preprocessing

    Aggregation

    Sampling

    Participation & discussion

    10 Data Preprocessing cont..

    Dimensionality Reduction

    Feature Subset Selection

    Feature Creation

    Discussions

    11 Data Preprocessing cont..

    Discretization & Binarization

    Variable Transformation

    Explain with an example

    12 Measures of Similarity and Dissimilarity

    Basics

    Similarity & Dissimilarity between Simple Attributes

    Dissimilarities Between Data Objects

    13 Measures of Similarity and Dissimilarity

    Similarities Between Data Objects

    Examples of Proximity Measures

    Explain with an example

    14 Issues in Proximity Calculation

    Selecting The Right Proximity Measure

    15 Question Paper

    16 Question Paper

    UNIT - 7

    FURTHER TOPICS IN DATA MINING: Multidimensional analysis and descriptive mining of

    complex data objects; Spatial data mining; Multimedia data mining; Text mining; Mining the WWW. Outlier analysis.

    7 Hours

    17 Multidimensional analysis and descriptive mining of complex data

    objects

    Generalization of Structured Data

    Aggregation and Approximation in Spatial and Multimedia Data Generalization

    Generalization of Object Identifiers and Class/subclass Hierarchies

    Participation & discussion

    18 Multidimensional analysis and descriptive mining of complex data

    objects cont..

    Generalization of Class Composition Hierarchies

    Construction and Mining of Object Cubes

    Generalization-Based Mining of Plan Databases by Divide and Conquer

    19 Spatial data mining;

    Spatial data Cube Construction and Spatial OLAP

    Mining Spatial Association and Co-location Patterns

    Discussions

    20 Spatial Clustering Methods

    Spatial Classification and Spatial Trend Analysis

    21 Multimedia data mining; Mining Raster Databases

    Similarity Search in Multimedia Data

    Multidimensional Analysis of Multimedia Data

    22 Multimedia data mining; Mining Raster Databases cont..

    Classification and Prediction Analysis of Multimedia Data

    Mining Association in Multimedia Data

    Audio & Video Data Mining

    23 Text mining

    Text Data Analysis and Information Retrieval

    Dimensionality Reduction for Text

    Text Mining Approach

    24 Mining the WWW.

    Mining the Web page layout structure

    Mining the Web link Structure to Identify Authoritative Web Pages

    25 Mining Multimedia Data on the Web

    Automatic Classification of Web Documents

    Web Usage Mining

    UNIT - 8

    APPLICATIONS: Data mining applications; Data mining system products and research prototypes;

    Additional themes on Data mining; Social impact of Data mining; Trends in Data mining.

    6 Hours

    26 Data mining applications;

    Data mining for Financial Data Analysis

    Retail Industry

    Telecommunication Industry

    Participation & discussion

    27 Biological Data Analysis

    Other Scientific Application

    Intrusion Detection

    28 Data mining system products and research prototypes;

    How to Choose a Data mining System

    Examples of Commercial Data Mining Systems

    Explain with an example

    29 Additional themes on Data mining

    Theoretical Foundation of Data Mining

    Statistical Data Mining

    Discussions

    30 Visual and Audio Data Mining

    Data Mining Privacy and Data Security

    31 Social impact of Data mining;

    Ubiquitous and Invisible Data Mining

    Data Mining Privacy and Data Security

    32 Trends in Data mining.

    33 Question Bank Discussions

    UNIT 3

    CLASSIFICATION: Preliminaries; General approach to solving a classification problem; Decision

    tree induction; Rule-based classifier; Nearest-neighbor classifier.

    8 Hours

    34 General approach to solving a classification problem; Participation & discussion

    35 Decision tree induction;

    How a Decision Tree Works

    How To Build A Decision Tree

    Method for expressing attribute test conditions

    Discussions

    36 Decision tree induction cont.. Discussions

    Measure for selecting the best split

    Algorithm for decision tree induction

    An example: web robot detection

    Characteristics of decision tree induction

    37 Rule-based classifier

    How a rule based classifier works

    Rule ordering schemes

    How to build a rule based classifier

    Discussions

    38 Rule-based classifier cont..;

    Direct methods for rule extraction

    Indirect method for rule extraction

    Characteristics of rule based classifier

    39 Nearest-neighbor classifier.

    Algorithm

    Characteristics Of Nearest Neighbor Classifier

    Discussions

    40 Question Paper Assignment

    UNIT - 4

    ASSOCIATION ANALYSIS 1: Problem Definition; Frequent Itemset generation; Rule

    Generation; Compact representation of frequent itemsets; Alternative methods for generating frequent

    itemsets.

    6 Hours

    41 Problem Definition; Participation & discussion

    42 Frequent Itemset generation;

    The Apriori Principle

    Frequent Itemset Generation in the Apriori Algorithm

    Candidate Generation and Pruning

    Support Counting

    Computational Complexity

    43 Rule Generation;

    Confidence Based Pruning

    Rule Generation in Apriori Algorithm

    An Example: Congressional Voting Records

    Discussions

    44 Compact representation of frequent itemsets;

    Maximal Frequent Itemsets

    Closed Frequent Itemsets

    45 Alternative methods for generating frequent itemsets. Discussions

    46 Alternative methods for generating frequent itemsets.

    47 Question paper Assignment

    UNIT - 5

    ASSOCIATION ANALYSIS 2: FP-Growth algorithm, Evaluation of association patterns; Effect

    of skewed support distribution; Sequential patterns.

    6 Hours

    48 FP-Growth algorithm,

    FP Tree Representation

    Frequent Itemset Generation in FP Growth Algorithm

    Participation & discussion

    49 Evaluation of association patterns;

    Objective Measures of Interestingness

    Measures beyond Pairs of Binary Variables

    Simpson's Paradox

    Discussions

    50 Effect of skewed support distribution;

    51 Problem Formulation

    Sequential Pattern Discovery

    Timing Constraints

    Alternative Counting Schemes

    Explain with an example

    52 Sequential patterns

    53 Question paper Assignment

    UNIT - 6

    CLUSTER ANALYSIS: Overview, K-means, Agglomerative hierarchical clustering, DBSCAN,

    Overview of Cluster Evaluation.

    7 Hours

    54 Overview,

    What Is Cluster Analysis

    Different Types of Clustering

    Different Types of Clusters

    Participation & discussion

    55 K-means,

    The basic K-means Algorithm

    K-means: Additional issues

    Bisecting K-Means

    K-Means and Different Types of Cluster

    Strengths and Weaknesses

    K-means as an Optimization Problem

    56 Agglomerative hierarchical clustering

    Basic Agglomerative Hierarchical Clustering Algorithm

    Specific Techniques

    The Lance-Williams Formula for Cluster Proximity

    Key Issues in Hierarchical Clustering

    Strengths & Weaknesses

    Discussions

    57 DBSCAN

    Traditional Density: Center-Based Approach

    The DBSCAN Algorithm

    Strengths and Weaknesses

    58 Overview of Cluster Evaluation.

    Overview

    Unsupervised Cluster Evaluation Using Cohesion and Separation

    Unsupervised Cluster Evaluation Using Proximity Matrix

    Unsupervised Evaluation of Hierarchical Clustering

    Discussions

    59 Overview of Cluster Evaluation.

    Determining the correct Number of Clusters

    Clustering Tendency

    Supervised Measures of Cluster Validity

    Assessing the Significance of Cluster Validity Measures

    60 Question paper Assignment

    TEXT BOOKS:

    1. Introduction to Data Mining - Pang-Ning Tan, Michael Steinbach, Vipin Kumar,

    Pearson Education, 2007

    2. Data Mining Concepts and Techniques - Jiawei Han and Micheline Kamber, 2nd

    Edition, Morgan Kaufmann, 2006.

    REFERENCE BOOKS:

    1. Insight into Data Mining Theory and Practice - K.P.Soman, Shyam Diwakar,

    V.Ajay, PHI, 2006
