1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to...
![Page 1: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/1.jpg)
1
Data Mining & Machine Learning: Introduction
Intelligent Systems Lab.
Soongsil University
Thanks to Raymond J. Mooney (University of Texas at Austin) and Isabelle Guyon
Artificial Intelligence (AI): Research Areas
• Paradigm: Rationalism (Logical), Empiricism (Statistical), Connectionism (Neural), Evolutionary (Genetic), Biological (Molecular)
• Application: Intelligent Agents, Information Retrieval, Electronic Commerce, Data Mining, Bioinformatics, Natural Language Proc., Expert Systems
• Research: Learning Algorithms, Inference Mechanisms, Knowledge Representation, Intelligent System Architecture
Artificial Intelligence (AI): Paradigms
• Symbolic AI: Rule-Based Systems
• Connectionist AI: Neural Networks
• Evolutionary AI: Genetic Algorithms
• Molecular AI: DNA Computing
What is Machine Learning?
[Figure: TRAINING DATA → Learning algorithm → Trained machine; a Query to the trained machine produces an Answer]
Definition of learning
[Figure: a Learning Program, given Experience E, Task T, and Performance P, produces a Learned Program that is applied to the Task and measured on Performance]
• Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
What is Learning?
• Herbert Simon: “Learning is any process by which a system improves performance from experience.”
Machine Learning
• Supervised Learning
– Estimate an unknown mapping from known input-output pairs
– Learn $f_{\mathbf{w}}$ from training set $D = \{(\mathbf{x}, y)\}$ s.t. $f_{\mathbf{w}}(\mathbf{x}) = y = f(\mathbf{x})$
– Classification: y is discrete
– Regression: y is continuous
• Unsupervised Learning
– Only input values are provided
– Learn $f_{\mathbf{w}}$ from $D = \{(\mathbf{x})\}$ s.t. $f_{\mathbf{w}}(\mathbf{x}) = \mathbf{x}$
– Clustering
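The two settings can be contrasted with a minimal sketch. It is only an illustration under stated assumptions: the supervised part fits a straight line by ordinary least squares (one possible choice of f_w), and the unsupervised part runs a two-cluster k-means on one-dimensional inputs. The function names `fit_line` and `two_means` and all data values are made up for this example.

```python
# Supervised (regression): fit f_w(x) = w0 + w1*x to labeled pairs (x, y).
def fit_line(pairs):
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1


# Unsupervised (clustering): group unlabeled 1-D inputs into two clusters
# with a few iterations of k-means (k = 2).
def two_means(xs, iterations=10):
    c0, c1 = min(xs), max(xs)  # initial centers: the two extremes
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        if not g0 or not g1:
            break
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


labeled = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # generated from y = 1 + 2x
w0, w1 = fit_line(labeled)

unlabeled = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]  # no y values at all
centers = two_means(unlabeled)
```

The supervised learner needs the y values to recover the mapping; the clustering routine only ever sees x.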
Why Machine Learning?
• Recent progress in algorithms and theory
• Growing flood of online data
• Computational power is available
• Knowledge engineering bottleneck: develop systems that are too difficult/expensive to construct manually because they require specific detailed skills or knowledge tuned to a specific task
• Budding industry
Niches using machine learning
• Data mining from large databases
– Market basket analysis (e.g. diapers and beer)
– Medical records → medical knowledge
• Software applications we can’t program by hand
– Autonomous driving
– Speech recognition
• Programs that customize themselves to individual users
– Spam mail filter
– Personalized tutoring
– Newsreader that learns user interests
Trends leading to Data Flood
• More data is generated:
– Bank, telecom, other business transactions ...
– Scientific data: astronomy, biology, etc.
– Web, text, and e-commerce
Big Data Examples
• Europe's Very Long Baseline Interferometry (VLBI) has 16 telescopes, each of which produces 1 Gigabit/second of astronomical data over a 25-day observation session
– storage and analysis are a big problem
• AT&T handles billions of calls per day
– so much data that it cannot all be stored; analysis has to be done “on the fly”, on streaming data
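Analysis "on the fly" means each record is seen once and then discarded. As a hedged illustration (not a description of any AT&T system), the sketch below keeps a running mean and variance with Welford's online update, so no past values need to be stored. The class name `RunningStats` and the call durations are hypothetical.

```python
class RunningStats:
    """Mean and variance of a stream, updated one value at a time (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one value has arrived.
        return self._m2 / self.count if self.count else 0.0


stats = RunningStats()
for call_duration in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # hypothetical values
    stats.update(call_duration)  # the value can be dropped right after this call
```

Memory use stays constant no matter how long the stream runs, which is the property streaming analysis depends on.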
Largest databases in 2007
• Commercial databases:
– AT&T: 312 TB
– World Data Centre for Climate: 220 TB
– YouTube: 45 TB of videos
– Amazon: 42 TB (250,000 full textbooks)
– Central Intelligence Agency (CIA): ?
Data Growth
In 2 years, the size of the largest database TRIPLED!
Machine Learning / Data Mining Application areas
• Science
– astronomy, bioinformatics, drug discovery, …
• Business
– CRM (Customer Relationship Management), fraud detection, e-commerce, manufacturing, sports/entertainment, telecom, targeted marketing, health care, …
• Web
– search engines, advertising, web and text mining, …
• Government
– surveillance, crime detection, profiling tax cheaters, …
Data Mining for Customer Modeling
• Customer Tasks:
– attrition prediction
– targeted marketing:
  • cross-sell, customer acquisition
– credit-risk
– fraud detection
• Industries
– banking, telecom, retail sales, …
Customer Attrition: Case Study
• Situation: The attrition rate for mobile phone customers is around 25-30% a year!
• With this in mind, what is our task?
– Assume we have customer information for the past N months.
Customer Attrition: Case Study
Task:
• Predict who is likely to attrite next month.
• Estimate customer value and determine the most cost-effective offer to make to this customer.
Customer Attrition Results
• Verizon Wireless built a customer data warehouse
• Identified potential attriters
• Developed multiple, regional models
• Targeted customers with high propensity to accept the offer
• Reduced attrition rate from over 2%/month to under 1.5%/month (huge impact, with >30 M subscribers)
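As a rough check on the scale of that impact, taking the slide's bounds at face value (more than 30 M subscribers, a drop from over 2% to under 1.5% per month), the number of customers retained each month is at least:

$$30 \times 10^{6} \times (0.02 - 0.015) = 150{,}000 \ \text{customers per month}$$

This is a lower bound, since each quoted figure is itself a bound in the favorable direction.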
Assessing Credit Risk: Case Study
• Situation: Person applies for a loan
• Task: Should a bank approve the loan?
• Note: People who have the best credit don’t need the loans, and people with the worst credit are not likely to repay. A bank’s best customers are in the middle.
Credit Risk - Results
• Banks develop credit models using a variety of machine learning methods.
• Mortgage and credit card proliferation is the result of being able to successfully predict whether a person is likely to default on a loan
• Widely deployed in many countries
Successful e-commerce – Case Study
• Task: Recommend other books (products) this person is likely to buy
• Amazon does clustering based on books bought:
– customers who bought “Advances in Knowledge Discovery and Data Mining” also bought “Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations”
• Recommendation program is quite successful
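A "customers who bought X also bought Y" list can be approximated by counting co-purchases. The sketch below is a deliberately simple stand-in, not Amazon's method: it counts how often two items appear in the same basket and ranks the partners of a given item. The names `co_purchase_counts` and `also_bought`, and the basket data, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def also_bought(item, baskets, top_n=3):
    """Items most often bought together with `item`, most frequent first."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


baskets = [
    ["book_A", "book_B"],
    ["book_A", "book_B", "book_C"],
    ["book_A", "book_C"],
    ["book_A", "book_B"],
    ["book_D"],
]
recommendations = also_bought("book_A", baskets)
```

Here `book_B` shares three baskets with `book_A` and `book_C` shares two, so `book_B` is ranked first.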
Security and Fraud Detection - Case Study
• Credit card fraud detection
• Detection of money laundering
– FAIS (US Treasury)
• Securities fraud
– NASDAQ KDD system
• Phone fraud
– AT&T, Bell Atlantic, British Telecom/MCI
• Bio-terrorism detection at Salt Lake Olympics 2002
Example Problem: Handwritten Digit Recognition
• Handcrafted rules will result in a large number of rules and exceptions
• Better to have a machine that learns from a large training set
[Figure: wide variability of the same numeral]
Chess Game
In 1997, Deep Blue (IBM) beat Garry Kasparov (Russia).
– IBM’s stock value rose by $18 billion that year
Some Successful Applications of Machine Learning
• Learning to drive an autonomous vehicle
– Train computer-controlled vehicles to steer correctly
– Drive at 70 mph for 90 miles on public highways
– Associate steering commands with image sequence
– 1200 computer-generated images as training examples
  • Half-hour training
– Additional information from the previous image indicates the darkness or lightness of the road
Some Successful Applications of Machine Learning
• Learning to recognize spoken words
– Speech recognition/synthesis
– Natural language understanding/generation
– Machine translation
Example 1: Visual object categorization
• A classification problem: predict category y based on image x.
• Little chance to “hand-craft” a solution, without learning.
• Applications: robotics, HCI, web search (a real image Google..)
Face Recognition - 1
• Given multiple angles/views of a person, learn to identify them.
• Learn to distinguish male from female faces.
Face Recognition - 2
Learn to recognize emotions, gestures
Li, Ye, Kambhametta, 2003
Robot
• Sony AIBO robot
– Available on June 1, 1999
– Weight: 1.6 kg
– Adaptive learning and growth capabilities
– Simulates emotions such as happiness and anger
Robot
• Honda ASIMO (Advanced Step in Innovative MObility)
– Born on 31 October, 2001
– Height: 120 cm, Weight: 52 kg
http://blog.makezine.com/archive/2009/08/asimo_avoids_moving_obstacles.html?CMP=OTC-0D6B48984890
Biomedical / Biometrics
• Medicine:
– Screening
– Diagnosis and prognosis
– Drug discovery
• Security:
– Face recognition
– Signature / fingerprint
– DNA fingerprinting
Computer / Internet
• Computer interfaces:
– Troubleshooting wizards
– Handwriting and speech
– Brain waves
• Internet
– Spam filtering
– Text categorization
– Text translation
– Recommendation
Classification
• Assign object/event to one of a given finite set of categories.
– Medical diagnosis
– Credit card applications or transactions
– Fraud detection in e-commerce
– Worm detection in network packets
– Spam filtering in email
– Recommended articles in a newspaper
– Recommended books, movies, music, or jokes
– Financial investments
– DNA sequences
– Spoken words
– Handwritten letters
– Astronomical images
Problem Solving / Planning / Control
• Performing actions in an environment in order to achieve a goal.
– Solving calculus problems
– Playing checkers, chess, or backgammon
– Driving a car or a jeep
– Flying a plane, helicopter, or rocket
– Controlling an elevator
– Controlling a character in a video game
– Controlling a mobile robot
Applications
[Figure: application domains (Bioinformatics, Ecology, OCR, HWR, Market Analysis, Text Categorization, Machine Vision, System diagnosis) placed on two axes, "inputs" and "training examples", each running from 10 to 10^5]
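Disciplines Related to Machine Learning
• Artificial intelligence
– Learning symbolic representations, search problems, problem solving, use of prior knowledge
• Bayesian methods
– Basis for computing hypothesis probabilities, naïve Bayes classifier, estimating the values of unobserved variables
• Computational complexity theory
– Theoretical basis for measuring computational efficiency, training-data size, number of errors, etc.
• Control theory
– Learning control processes that optimize predefined objectives and learning to predict the next state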
Disciplines Related to Machine Learning (2)
• Information theory
– Measures of entropy and information content, Minimum Description Length, the relationship between optimal codes and optimal training
• Philosophy
– Occam’s Razor, analysis of the validity of generalization
• Psychology and neurobiology
– Neural network models
• Statistics
– Characterizing the errors that arise when estimating hypothesis accuracy, confidence intervals, statistical tests
Definition of learning
[Figure: a Learning Program, given Experience E, Task T, and Performance P, produces a Learned Program that is applied to the Task and measured on Performance]
• Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
Example: checkers
Task T: Playing checkers.
Performance measure P: % of games won.
Training experience E: Practice games by playing against itself.
Example: Recognizing handwritten letters
Task T: Recognizing and classifying handwritten words within images.
Performance measure P: % words correctly classified.
Training experience E: A database of handwritten words with given classifications.
Example: Robot driving
Task T: Driving on public four-lane highway using vision sensors.
Performance measure P: Average distance traveled before an error (as judged by a human overseer).
Training experience E: A sequence of images and steering commands recorded while observing a human driver.
Designing a learning system
Task T: Playing checkers.
Performance measure P: % of games won.
Training experience E: Practice games by playing against itself.
– What does this mean?
– And what can we learn from it?
Measuring Performance
• Classification accuracy
• Solution correctness
• Solution quality (length, efficiency)
• Speed of performance
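Of these measures, classification accuracy is the simplest to state precisely: the fraction of examples whose predicted class matches the true class. A minimal sketch follows; the function name and the spam/ham labels are illustrative assumptions.

```python
def classification_accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


accuracy = classification_accuracy(
    ["spam", "ham", "spam", "ham"],   # hypothetical classifier output
    ["spam", "ham", "ham", "ham"],    # hypothetical true labels
)
```

Three of the four predictions match, so `accuracy` is 0.75.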
Designing a Learning System
1. Choose the training experience
2. Choose exactly what is to be learned, i.e. the target function.
3. Choose how to represent the target function.
4. Choose a learning algorithm to infer the target function from the experience.
[Figure: Environment/Experience → Learner → Knowledge → Performance Element, with train examples and test examples as inputs]
Designing a Learning System
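1. Choosing the Training Experience
• Key Attributes
– Does it provide direct or indirect feedback?
  • Direct feedback: checkers states and correct move
  • Indirect feedback: move sequences and final outcomes of various games
  • Credit assignment problem
– Degree of control over the sequence of training examples
  • The learner's autonomy; how much help the learner gets from a teacher when obtaining training information
– Distribution of examples
  • The issue of the distribution of train examples versus the distribution of test examples
  • Training examples should closely reflect the distribution of the test examples on which system performance is evaluated
  • Are special cases reflected? (Would the special situations encountered by the Checkers World Champion be represented in the distribution?)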
Training Experience
• Direct experience: Given sample input and output pairs for a useful target function.
– Checker boards labeled with the correct move,
  • e.g. extracted from record of expert play
• Indirect experience: Given feedback which is not direct I/O pairs for a useful target function.
– Potentially arbitrary sequences of game moves and their final game results.
– Credit/Blame Assignment Problem: How to assign credit/blame to individual moves given only indirect feedback?
Source of Training Data
• Provided random examples outside of the learner’s control (examples are drawn at random, outside the learner's or algorithm's judgment).
– Negative examples available or only positive?
• Good training examples selected by a “benevolent teacher” (from a teacher).
• Learner can construct an arbitrary example and query an oracle for its label (the system constructs examples itself).
– Learner can design and run experiments directly in the environment without any human guidance (the system poses new problems without human help).
– Learner can query an oracle about the class of an unlabeled example in the environment (asking for the answer to an incomplete example, i.e. (x, ___), where the input is given but the output value is missing).
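The "construct an example and query an oracle" setting can be illustrated with a toy learner. The sketch assumes a one-dimensional concept of the form "x is positive exactly when x ≥ t" for an unknown threshold t in [0, 1]; the learner chooses each query point itself and halves its uncertainty with every label. The name `learn_threshold` and the hidden value 0.37 are hypothetical.

```python
def learn_threshold(oracle, low=0.0, high=1.0, queries=20):
    """Estimate t, where oracle(x) is True exactly when x >= t.

    Assumes oracle(low) is False and oracle(high) is True.
    """
    for _ in range(queries):
        mid = (low + high) / 2   # the learner constructs its own example
        if oracle(mid):          # and asks the oracle for its label
            high = mid           # threshold is at or below mid
        else:
            low = mid            # threshold is above mid
    return (low + high) / 2


hidden_threshold = 0.37  # known only to the oracle
estimate = learn_threshold(lambda x: x >= hidden_threshold)
```

With 20 chosen queries the estimate is within about one millionth of the true threshold; randomly provided examples would typically need far more labels for the same precision.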
Training vs. Test Distribution
• Generally assume that the training and test examples are independently drawn from the same overall distribution of data.
– IID: Independently and identically distributed
• If examples are not independent, collective classification is required (e.g. when building relational models of communication networks, financial transaction networks, or social networks).
• If the test distribution is different, transfer learning is required, that is, achieving cumulative learning.
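In practice, a common way to approximate the same-distribution assumption is to shuffle one pool of examples and then split it, so that train and test sets are random samples of the same data. The sketch below shows that idea only; the function name, the 25% test fraction, and the data are assumptions, and shuffling cannot repair examples that are dependent on each other.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples, then cut it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


examples = [(x, x % 2) for x in range(20)]  # hypothetical (input, label) pairs
train, test = train_test_split(examples)
```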
Transfer learning
Note: Transfer learning
• Transfer learning is what happens when someone finds it much easier to learn to play chess having already learned to play checkers; or to recognize tables having already learned to recognize chairs; or to learn Spanish having already learned Italian.
• Achieving significant levels of transfer learning across tasks -- that is, achieving cumulative learning -- is perhaps the central problem facing machine learning.
Designing a Learning System
1. Choose the training experience
2. Choose exactly what is to be learned, i.e. the target function.
3. Choose how to represent the target function.
4. Choose a learning algorithm to infer the target function from the experience.
[Figure: Environment/Experience → Learner → Knowledge → Performance Element]
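Designing a Learning System
2. Choosing a Target Function
• What knowledge (function) is to be learned, and how will it be used by the evaluation (performance) system?
• Assumption: “In the board game, there is a function that generates the legal moves from the current board, and the best move is to be selected.”
– Could learn a function:
  1. ChooseMove : B → M (the best move)
  or
  2. Evaluation function, V : B → R
    * Assigns each board position a score for how favorable it is.
    * V can be used to choose among moves according to the score of the position each move leads to,
    * so that the move leading to the highest score is ultimately selected.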
Designing a Learning System
2. Choosing the Target Function
• A function that chooses the best move M for any B
– ChooseMove : B → M
– Difficult to learn
• It is useful to reduce the problem of improving performance P at task T, to the problem of learning some particular target function.
• An evaluation function that assigns a numerical score to any B
– V : B → R
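Once an evaluation function V : B → R is available, ChooseMove does not need to be learned separately: it can be derived by scoring the board that each legal move leads to and taking the best. The sketch below shows that reduction on a toy stand-in for a game; `choose_move` and the three toy helpers are hypothetical names, not part of any checkers program.

```python
def choose_move(board, legal_moves, apply_move, evaluate):
    """Return the legal move whose resulting board has the highest V score."""
    return max(
        legal_moves(board),
        key=lambda move: evaluate(apply_move(board, move)),
    )


# Toy stand-in for a game: the "board" is an integer, a move adds to it,
# and V prefers boards close to 10.
def toy_moves(board):
    return [1, 2, 3]


def toy_apply(board, move):
    return board + move


def toy_value(board):
    return -abs(10 - board)


best = choose_move(8, toy_moves, toy_apply, toy_value)
```

From board 8 the candidate boards are 9, 10, and 11, and the move 2 reaches the highest-valued one.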
The start of the learning work
Instead of learning ChooseMove we establish a value function:
target function, V : B → R
that maps any legal board state in B to some real value in R.
From any position, the move is chosen that maximizes the score the position leads to.
1. if b is a final board state that is won, then V(b) = 100.
2. if b is a final board state that is lost, then V(b) = −100.
3. if b is a final board state that is drawn, then V(b) = 0.
4. if b is not a final board state, then V(b) = V(b′), where b′ is the best final board state that can be achieved starting from b (assuming the opponent also plays optimally).
Unfortunately, this did not take us any further!
![Page 58: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/58.jpg)
58
Approximating V(b)
• Computing V(b) is intractable since it involves searching the complete exponential game tree.
• Therefore, this definition is said to be non-operational.
• An operational definition can be computed in reasonable (polynomial) time.
• Need to learn an operational approximation to the ideal evaluation function.
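To see why the ideal definition is non-operational, it can help to write it down as code. The following is a minimal Python sketch on a toy game chosen only for illustration (it is not the board game from the slides): two players alternately remove 1 or 2 stones from a pile, and whoever takes the last stone wins. The names `ideal_value`, `WIN`, and `LOSS` are hypothetical, and the toy game has no drawn positions.

```python
WIN, LOSS = 100, -100

def ideal_value(stones: int, our_turn: bool) -> int:
    """Ideal V(b) for the toy game, computed by searching the whole game tree."""
    if stones == 0:
        # Final state: whoever moved last took the final stone and won.
        return LOSS if our_turn else WIN
    outcomes = [
        ideal_value(stones - take, not our_turn)
        for take in (1, 2)
        if take <= stones
    ]
    # We pick the best outcome for us; an optimal opponent picks the worst.
    return max(outcomes) if our_turn else min(outcomes)

if __name__ == "__main__":
    print([ideal_value(n, True) for n in range(1, 7)])
```

Each call branches into every legal move, so the number of calls grows exponentially with the length of the game. That is affordable for a pile of a few stones and hopeless for a real board game, which is the reason an operational approximation has to be learned.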
![Page 59: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/59.jpg)
59
Designing a Learning System
1. Choose the training experience
2. Choose exactly what is to be learned, i.e. the target function.
3. Choose how to represent the target function.
4. Choose a learning algorithm to infer the target function from the experience.
(Diagram: Environment/Experience → Learner → Knowledge → Performance Element)
![Page 60: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/60.jpg)
60
3. Choosing a Representation for the Target Function
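• Describing the function
– Tables
– Rules
– Polynomial functions
– Neural nets
• Trade-off in choice
– Expressive power (how well the function can be represented)
– Size of training data (number of examples)
• The more expressive the representation, the better the result: a function that approximates the target more closely can be found.
• However, obtaining an accurate function then requires more training examples, i.e. a larger training set.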
![Page 61: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/61.jpg)
61
Approximate representation
w1 - w6: weights
![Page 62: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/62.jpg)
62
Linear Function for Representing V(b)
• Use a linear approximation of the evaluation function.
$\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$
$w_0$: an additive constant
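The linear form above can be sketched in a few lines of Python. The function name `v_hat`, the weight values, and the feature vector are all illustrative assumptions; the slides only fix the shape of the function (one constant plus six weighted features).

```python
from typing import Sequence

def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation: w0 + w1*x1 + ... + wn*xn."""
    if len(weights) != len(features) + 1:
        raise ValueError("expected one weight per feature plus the constant w0")
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(rest, features))

if __name__ == "__main__":
    # Hypothetical weights w0..w6 and an illustrative feature vector x1..x6.
    weights = [0.5, 2.0, -2.0, 4.0, -4.0, -1.0, 1.0]
    features = [3, 0, 1, 0, 0, 0]
    print(v_hat(weights, features))
```

Learning then amounts to choosing the seven numbers in `weights`; the features themselves stay fixed.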
![Page 64: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/64.jpg)
64
4. Choosing a Function Approximation Algorithm
A training example is represented as an ordered pair $\langle b, V_{train}(b) \rangle$
b: board state
$V_{train}(b)$: training value for b
Instance: "black has won the game": <<x1=3, x2=0, x3=1, x4=0, x5=0, x6=0>, +100>, where x2 = 0 indicates that white has no remaining pieces.
Estimating training values for intermediate board states:
$V_{train}(b_i) \leftarrow \hat{V}(b_{i+1}) = \hat{V}(\mathrm{Successor}(b_i))$
$\hat{V}$: the current approximation to V (i.e., the learned function, the hypothesis)
$\mathrm{Successor}(b_i)$: the next board state, i.e. $b_{i+1}$
* The training value for the current board state $b_i$ is taken from the hypothesis's value for the next board state $b_{i+1}$.
![Page 65: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/65.jpg)
65
Estimating Training Values
DESIGNING A LEARNING SYSTEM
![Page 66: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/66.jpg)
66
Supplementary note: Temporal Difference Learning
• Estimate training values for intermediate (non-terminal) board positions by the estimated value of their successor in an actual game trace:
$V_{train}(b) = \hat{V}(\mathrm{successor}(b))$
where successor(b) is the next board position at which it is the program's move in actual play.
• Values towards the end of the game are initially more accurate, and continued training slowly "backs up" accurate values to earlier board positions.
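The successor rule can be sketched in Python as follows. This is an illustration under stated assumptions: a game trace is taken to be the list of board positions at which the program had to move, each given as a feature tuple, and the last position is paired with the known final result. The names `estimate_training_values`, `trace`, and `final_value` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Board = Sequence[float]  # a board position described by its feature values

def estimate_training_values(
    trace: List[Board],
    final_value: float,
    v_hat: Callable[[Board], float],
) -> List[Tuple[Board, float]]:
    """Pair every board in a game trace with its training value."""
    examples = []
    for i, board in enumerate(trace):
        if i + 1 < len(trace):
            # Intermediate position: V_train(b_i) <- V_hat(Successor(b_i)).
            target = v_hat(trace[i + 1])
        else:
            # Last position of the game: use the actual outcome.
            target = final_value
        examples.append((board, target))
    return examples

if __name__ == "__main__":
    # Stand-in hypothesis for the demo: simply add up the features.
    toy_v_hat = lambda board: float(sum(board))
    trace = [(3, 3), (3, 2), (3, 0)]
    for board, target in estimate_training_values(trace, 100.0, toy_v_hat):
        print(board, target)
```

Only the last example carries real information at first; repeated training is what moves that information back towards earlier positions.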
![Page 67: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/67.jpg)
67
How to learn?
![Page 69: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/69.jpg)
69
How to change the weights?
![Page 71: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/71.jpg)
71
Obtaining Training Values
• Direct supervision may be available for the target function.
• With indirect feedback, training values can be estimated using temporal difference learning (used in reinforcement learning where supervision is delayed reward).
![Page 72: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/72.jpg)
72
Learning Algorithm
• Uses training values for the target function to induce a hypothesis definition that fits these examples and hopefully generalizes to unseen examples.
• In statistics, learning to approximate a continuous function is called regression.
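• Attempts to minimize some measure of error (loss function) such as the mean squared error over the n training examples:
$E = \frac{1}{n} \sum_{i=1}^{n} \left[ V_{train}(b_i) - \hat{V}(b_i) \right]^2$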
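As a small illustration, the mean squared error of a hypothesis on a set of training examples can be computed as below. The function name `mean_squared_error` and the example data are assumptions made for this sketch; each example is a pair of a board (as a feature tuple) and its training value.

```python
from typing import Callable, Sequence, Tuple

Board = Sequence[float]

def mean_squared_error(
    examples: Sequence[Tuple[Board, float]],
    v_hat: Callable[[Board], float],
) -> float:
    """Average of (V_train(b) - V_hat(b))^2 over all training examples."""
    if not examples:
        raise ValueError("need at least one training example")
    squared = [(v_train - v_hat(board)) ** 2 for board, v_train in examples]
    return sum(squared) / len(squared)

if __name__ == "__main__":
    # Stand-in hypothesis for the demo: twice the first feature.
    toy_v_hat = lambda board: 2.0 * board[0]
    examples = [((1,), 3.0), ((2,), 2.0)]
    print(mean_squared_error(examples, toy_v_hat))
```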
![Page 73: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/73.jpg)
73
The LMS(Least Mean Square) weight update rule
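• The following update rule is well motivated mathematically: for each training example $\langle b, V_{train}(b) \rangle$, every weight is moved a small step in the direction that reduces the squared error.
$w_i \leftarrow w_i + \eta \left[ V_{train}(b) - \hat{V}(b) \right] x_i$
where $\eta$ is a small constant (the learning rate) and $x_i$ is the value of the i-th feature of b.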
![Page 74: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/74.jpg)
74
LMS Discussion
• Intuitively, LMS executes the following rules:
– If the output for an example, $\hat{V}(b)$, is correct, make no change.
– If $\hat{V}(b)$ is too high, lower each weight in proportion to the value of its feature, so that the overall output for the example decreases.
– If $\hat{V}(b)$ is too low, raise each weight in proportion to the value of its feature, so that the overall output for the example increases.
• Under the proper weak assumptions, LMS can be proven to eventually converge to a set of weights that minimizes the mean squared error.
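A single LMS step can be sketched in Python as follows, assuming the standard textbook form of the rule, w_i ← w_i + η (V_train(b) − V̂(b)) x_i, with x_0 = 1 paired with the constant w_0. The function name `lms_update` and the learning rate `eta` are illustrative choices, not values taken from the slides.

```python
from typing import List, Sequence

def lms_update(
    weights: Sequence[float],
    features: Sequence[float],
    v_train: float,
    eta: float = 0.01,
) -> List[float]:
    """Return the weights after one LMS step on a single training example."""
    xs = [1.0] + list(features)  # x0 = 1 goes with the constant w0
    prediction = sum(w * x for w, x in zip(weights, xs))
    error = v_train - prediction  # > 0: output too low, < 0: output too high
    # Each weight moves in proportion to the error and to its own feature value.
    return [w + eta * error * x for w, x in zip(weights, xs)]

if __name__ == "__main__":
    weights = [0.0, 0.0, 0.0]
    for _ in range(3):
        weights = lms_update(weights, [1, 2], v_train=10.0, eta=0.1)
        print(weights)
```

The three intuitive cases fall out of the sign of `error`: zero leaves the weights alone, a negative error lowers them, and a positive error raises them. Weights whose feature value is 0 are never changed by that example.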
![Page 75: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/75.jpg)
75
Lessons Learned about Learning
• Learning uses direct or indirect experience to approximate a chosen target function.
• Function approximation is a search through a space of hypotheses for the hypothesis that best fits the training data.
• Different learning methods assume different hypothesis spaces (representation languages) and/or employ different search techniques.
![Page 76: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/76.jpg)
76
Various Function Representations
• Numerical functions
– Linear regression
– Neural networks
– Support vector machines
• Symbolic functions
– Decision trees
– Rules in propositional logic
– Rules in first-order predicate logic
• Instance-based functions
– Nearest-neighbor
– Case-based
• Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
– Hidden-Markov Models (HMMs)
– Probabilistic Context Free Grammars (PCFGs)
– Markov networks
![Page 77: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/77.jpg)
77
Various Search Algorithms
• Gradient descent
– Perceptron
– Backpropagation
• Dynamic Programming
– HMM Learning
– Probabilistic Context Free Grammars (PCFGs) Learning
• Divide and Conquer
– Decision tree induction
– Rule learning
• Evolutionary Computation
– Genetic Algorithms (GAs)
– Genetic Programming (GP)
– Neuro-evolution
![Page 78: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/78.jpg)
78
Evaluation of Learning Systems
• Experimental
– Conduct controlled cross-validation experiments to compare various methods on a variety of benchmark datasets.
– Gather data on their performance, e.g. test accuracy, training-time, testing-time.
– Analyze differences for statistical significance.
• Theoretical
– Analyze algorithms mathematically and prove theorems about their:
  • Computational complexity
  • Ability to fit training data
  • Sample complexity (number of training examples needed to learn an accurate function)
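The experimental side can be illustrated with a bare-bones k-fold cross-validation loop in plain Python. This is only a sketch: the helper names `k_fold_indices` and `cross_validate`, and the `fit` and `accuracy` callables supplied by the caller, are hypothetical, and a real experiment would also shuffle the data and test the differences between methods for statistical significance.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield k (train, test) index lists; every index is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]

def cross_validate(
    examples: Sequence,
    k: int,
    fit: Callable[[list], object],
    accuracy: Callable[[object, list], float],
) -> float:
    """Average test accuracy of a learning method over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(examples), k):
        model = fit([examples[j] for j in train_idx])  # learn on k-1 folds
        scores.append(accuracy(model, [examples[j] for j in test_idx]))
    return sum(scores) / k

if __name__ == "__main__":
    # Toy method for the demo: always predict the most common training label.
    def fit(train):
        labels = [label for _, label in train]
        return max(set(labels), key=labels.count)

    def accuracy(model, test):
        return sum(1 for _, label in test if label == model) / len(test)

    data = [(x, 1) for x in range(6)]
    print(cross_validate(data, 3, fit, accuracy))
```

Running the same loop for several learning methods on the same folds gives the paired accuracy figures that the comparison relies on.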
![Page 79: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/79.jpg)
79
Core Parts of a Machine Learning System
Many machine learning systems can be usefully characterized in terms of these four generic modules.
(Figure: the four modules and the data passed between them; surviving labels: "Game history", "Initial game board", and $\hat{V}$)
![Page 80: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/80.jpg)
80
Four Components of a Learning System (2)
• Generalizer
– Input: training examples
– Output: hypothesis (estimate of the target function)
– Generalizes from the specific training examples
– Hypothesizes a general function
• Experiment generator
– Input: current hypothesis
– Output: a new problem
– Picks a new practice problem that maximizes the learning rate
![Page 81: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/81.jpg)
81
Four Components of a Learning System(1)
• Performance system
– Solves the given performance task
– Uses the learned target function
– Input: a new problem; output: a trace of its solution
• Critic
– Outputs a set of training examples of the target function
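How the four modules fit together can be sketched as a small training loop. The Python below is an illustration under several assumptions, not the system from the slides: it uses a toy game (take 1 or 2 stones, the last stone wins), a random opponent, a single hand-picked feature, and LMS updates in the generalizer. All function names are hypothetical.

```python
import random

def features(pile):
    # One hand-picked feature: is the pile we leave behind a multiple of 3?
    return [1.0 if pile % 3 == 0 else 0.0]

def v_hat(weights, pile):
    xs = [1.0] + features(pile)
    return sum(w * x for w, x in zip(weights, xs))

def experiment_generator(weights, rng):
    """Input: current hypothesis (unused here). Output: a new problem."""
    return rng.randint(4, 12)  # a random starting pile

def performance_system(pile, weights, rng):
    """Input: a new problem. Output: trace of piles left after our moves, result."""
    trace = []
    while True:
        moves = [t for t in (1, 2) if t <= pile]
        pile -= max(moves, key=lambda t: v_hat(weights, pile - t))
        trace.append(pile)
        if pile == 0:
            return trace, 100.0   # we took the last stone
        pile -= rng.choice([t for t in (1, 2) if t <= pile])
        if pile == 0:
            return trace, -100.0  # the opponent took the last stone

def critic(trace, result, weights):
    """Input: game trace. Output: training examples (board, V_train)."""
    targets = [v_hat(weights, nxt) for nxt in trace[1:]] + [result]
    return list(zip(trace, targets))

def generalizer(examples, weights, eta=0.01):
    """Input: training examples. Output: new hypothesis (via LMS updates)."""
    for pile, v_train in examples:
        xs = [1.0] + features(pile)
        error = v_train - v_hat(weights, pile)
        weights = [w + eta * error * x for w, x in zip(weights, xs)]
    return weights

def train(games=200, seed=0):
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(games):
        problem = experiment_generator(weights, rng)
        trace, result = performance_system(problem, weights, rng)
        weights = generalizer(critic(trace, result, weights), weights)
    return weights

if __name__ == "__main__":
    print(train())
```

The loop in `train` is the whole architecture: a problem goes to the performance system, its trace goes to the critic, the critic's examples go to the generalizer, and the new hypothesis is used for the next game.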
![Page 82: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/82.jpg)
82
History of Machine Learning
• 1950s:
– Samuel's checker player
– Selfridge's Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston's arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan's ID3
– Michalski's AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
![Page 83: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/83.jpg)
83
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant's PAC Learning Theory
– Focus on experimental methodology
• 1990s:
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
![Page 84: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/84.jpg)
84
History of Machine Learning (cont.)
• 2000s:
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
  • Compilers
  • Debugging
  • Graphics
  • Security (intrusion, virus, and worm detection)
– E-mail management
– Personalized assistants that learn
– Learning in robotics and vision
![Page 85: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/85.jpg)
85
Reminder
• “Learning as search in a space of possible hypotheses”
• Learning methods are characterized by their search strategies and by the underlying structure of the search spaces.
![Page 86: 1 Data Mining & Machine Learning Introduction Intelligent Systems Lab. Soongsil University Thanks to Raymond J. Mooney in the University of Texas at Austin,](https://reader035.fdocuments.us/reader035/viewer/2022062314/56649e7f5503460f94b839a6/html5/thumbnails/86.jpg)
86
Issues in Machine Learning
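• Developing algorithms that can learn a general target function from specific training examples
• How much training data is sufficient?
• When knowledge about the learning task is available, does human involvement help?
• Improving the complexity of the learning problem at each stage of training
• Reducing the learning effort needed for function approximation
• Automatically improving or altering the representation of the target function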