Data Mining
By
Dave Maung
What is Data Mining?
The process of automatically searching large volumes of data for patterns.
Also known as Knowledge Discovery in Databases (KDD).
Different types of Data Mining
Relational data mining
Text mining
Web mining
Relational Data Mining
Data mining technique for relational databases
Relational data mining algorithms look for patterns among multiple tables
Uses classification rules and association rules
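As a rough illustration of what an association rule measures, the sketch below computes the two standard measures, support and confidence, for the rule "bread implies milk". The transactions, item names, and function names are hypothetical and do not come from the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical market-basket data (an assumption, not from the slides).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3))
```

A rule is usually kept only when both measures exceed thresholds chosen by the analyst.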
Classification
Predicting an item class
Finding rules that partition the given data into disjoint groups
A popular classification method is the decision tree
Decision Tree
A graph of decisions and their possible consequences
Decision trees are constructed to help make decisions.
A decision tree uses a tree structure.
Example of Decision Tree
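A small hypothetical tree can also be written out in code. The sketch below is not the tree from the original slide. The attributes, values, and class labels are invented for illustration. Each internal node tests one attribute, and each leaf holds a predicted class.

```python
# Hypothetical tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following the branch for each tested value."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


label = classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": "no"})
print(label)
```

Every record follows exactly one path to a leaf, which is how the tree partitions the data into disjoint groups.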
Text Mining
The process of extracting interesting, non-trivial information and knowledge from unstructured text.
Text Mining (continued)
Also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text.
Web Mining
The extraction of interesting, potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web.
Web Mining (continued)
Three knowledge discovery domains pertain to web mining:
Web Content Mining
Web Structure Mining
Web Usage Mining
Web Content Mining
An automatic process that goes beyond keyword extraction.
There are two groups of web content mining strategies:
Those that mine the content of documents
Those that improve on the content search of other tools, like search engines
Web Structure Mining
The link structure of the World Wide Web can reveal more information than just the information contained in documents.
Web Structure Mining (example)
Links pointing to a document indicate the popularity of the document.
Links coming out of a document indicate the richness or perhaps the variety of topics covered in the document.
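The two link counts described above can be sketched as follows. The page names and the link graph are hypothetical. Counting raw links is only the simplest possible indicator and is not a full ranking algorithm.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "home.html": ["about.html", "products.html", "contact.html"],
    "about.html": ["home.html"],
    "products.html": ["home.html", "contact.html"],
    "contact.html": ["home.html"],
}

# Outgoing links per page: a rough signal of the variety of topics covered.
out_degree = {page: len(targets) for page, targets in links.items()}

# Incoming links per page: a rough signal of popularity.
in_degree = Counter(target for targets in links.values() for target in targets)

most_linked = max(in_degree, key=in_degree.get)
print(most_linked, in_degree[most_linked], out_degree["home.html"])
```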
Web Usage Mining
Web servers record and accumulate data about user interactions whenever requests for resources are received.
Web usage mining analyzes the web access logs of different web sites.
Web Usage Mining (continued)
There are two main tendencies in Web Usage Mining:
General Access Pattern Tracking
Customized Usage Tracking
General Access Pattern Tracking
Analyzes the web logs to understand access patterns and trends
Suggests better structure and grouping of resource providers
Can be used to restructure sites into more efficient groupings and to target specific users with specific ads
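A minimal sketch of general access pattern tracking, assuming the server writes its log in the Common Log Format. The log lines, addresses, and the names `LOG_PATTERN` and `page_hits` are hypothetical. The sketch counts successful requests per page to show which resources are visited most.

```python
import re
from collections import Counter

# Matches one Common Log Format line (assumed format, illustrative pattern).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical log lines using documentation-only IP addresses.
log_lines = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 5120',
    '192.0.2.1 - - [10/Oct/2023:13:56:40 +0000] "GET /products.html HTTP/1.1" 200 5120',
    '192.0.2.3 - - [10/Oct/2023:13:57:12 +0000] "GET /missing.html HTTP/1.1" 404 209',
    '192.0.2.2 - - [10/Oct/2023:13:58:03 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.3 - - [10/Oct/2023:13:59:47 +0000] "GET /products.html HTTP/1.1" 200 5120',
]


def page_hits(lines):
    """Count successful (status 200) requests per requested path."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


hits = page_hits(log_lines)
print(hits.most_common(1))
```

Real logs are far larger and noisier, so a practical analysis would also filter crawlers and group requests into sessions.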
Customized Usage Tracking
Analyzes individual trends in order to customize web sites to users
The success of such applications depends on what and how much valid and reliable knowledge one can discover from the large raw log data.
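Customized usage tracking can be sketched in the same spirit by building one usage profile per visitor. The request records below are hypothetical and already parsed. Identifying a visitor by IP address is a simplifying assumption, since real sites often rely on sessions or logins instead.

```python
from collections import Counter, defaultdict

# Hypothetical (visitor, path) pairs, as if already extracted from a log.
requests = [
    ("192.0.2.1", "/index.html"),
    ("192.0.2.1", "/products.html"),
    ("192.0.2.1", "/products.html"),
    ("192.0.2.2", "/index.html"),
    ("192.0.2.2", "/support.html"),
    ("192.0.2.2", "/support.html"),
]


def usage_profiles(records):
    """Build a per-visitor count of requested pages."""
    profiles = defaultdict(Counter)
    for visitor, path in records:
        profiles[visitor][path] += 1
    return profiles


def favorite_page(profiles, visitor):
    """Return the page this visitor requested most often."""
    return profiles[visitor].most_common(1)[0][0]


profiles = usage_profiles(requests)
print(favorite_page(profiles, "192.0.2.1"))
print(favorite_page(profiles, "192.0.2.2"))
```

A site could use such a profile to decide which content to show first to a returning visitor.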
Web Mining Architecture