© 2008 SRI International
Systems Learning for Complex Pattern Problems
Omid Madani
AI Center, SRI International
Foundations of Intelligence: Concepts (Categories)
• Intelligent systems categorize their perceptions (objects, events, relations)
• Categorization involves substantial abstraction: you rarely see the exact same thing again…
• Categorization is necessary for intelligence
• Categories are complex: they have adaptive structure and are composed of parts, of abstractions, …
• High intelligence (advanced animals) requires myriad categories
What are the principles behind such learning and development?
• Assumptions/Evidence: These (perceptual) categories are developed mainly in an unsupervised manner
– It is doubtful that they are all programmed in; many are not (in particular, for humans)
– An explicit teacher is absent
Example Perceptual Concepts
• In text, every word, phrase, expression: “book”, “new”, “a”, …
• Single characters are primitive concepts: “a”, “b”, …, “1”, “2”, “;”, …
• Concepts can be composed of other concepts:
– “n” + “e” = “ne”
– “new” + “york” = “new york”
• Concepts can be abstractions:
– week-day = {Monday, Tuesday, …}
– Digits = {1, 2, 3, 4, …}
• Area code is a concept that involves both composition and abstraction:
– Composition of 3 digits
– A digit is a grouping, i.e., the set {0, 1, 2, …, 9} (2 is a digit)
• Other examples: phone number, address, resume page, face (in visual domain), etc.
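The two ways of building concepts listed above (composition and abstraction) can be sketched in Python. This is only an illustration under simplifying assumptions, not the representation used in the author's system: the class names `Abstraction` and `Composition` and the method `matches` are hypothetical, and abstractions are assumed to group single characters.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Abstraction:
    """A concept that groups alternatives, e.g. digit = {0, ..., 9}."""
    name: str
    members: FrozenSet[str]

    def matches(self, text: str) -> bool:
        return text in self.members


@dataclass(frozen=True)
class Composition:
    """A concept made of parts in sequence, e.g. "n" + "e" = "ne"."""
    name: str
    parts: Tuple[Union[str, Abstraction], ...]

    def matches(self, text: str) -> bool:
        pos = 0
        for part in self.parts:
            if isinstance(part, Abstraction):
                # Assumption: an abstraction consumes exactly one character.
                if not part.matches(text[pos:pos + 1]):
                    return False
                pos += 1
            else:
                # A literal part must appear at the current position.
                if not text.startswith(part, pos):
                    return False
                pos += len(part)
        # Every character of the text must be accounted for.
        return pos == len(text)


digit = Abstraction("digit", frozenset("0123456789"))
ne = Composition("ne", ("n", "e"))
# Area code: a composition of three instances of the digit abstraction.
area_code = Composition("area-code", (digit, digit, digit))
```

Here `area_code.matches("650")` holds because each character is a member of the digit grouping, while a string with a non-digit or with the wrong length is rejected.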
Acquiring and Developing Concepts
• Higher intelligence, such as “advanced” pattern recognition/generation (e.g. vision), may require
– Long-term learning (weeks, months, years, …)
– Cumulative learning (learn these first, then these, then these, …)
– Massive learning: myriad inter-related categories/concepts
– Systems learning: multiple algorithms working together
– Autonomy (relatively little human involvement)
What are the learning processes?
Applications: learning to segment words in a speech stream in any language, visual object recognition, learning to play Go/Chess
Learning by Repeatedly Predicting in a Rich World
• In a nutshell, we seek a system such that:
[Figure: a Prediction System reads an input stream (…. 0011101110000 ….) and repeatedly predicts, then observes & updates. At first it works with low-level or “hard-wired” categories (input, say, text: characters, …; or vision: edges, curves, …). After a while (much learning), the same predict / observe & update loop works with higher-level categories (bigger chunks), e.g. words, digits, phrases, phone numbers, faces, visual objects, home pages, sites, …]
Prediction Games in Infinitely Rich Worlds, AAAI FSS07
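The predict / observe & update loop on this slide can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the learning algorithm from the cited paper: the predictor here simply counts which symbol followed each fixed-length context, and the names `run_prediction_game` and `context_len` are hypothetical.

```python
from collections import Counter, defaultdict


def run_prediction_game(stream, context_len=2):
    """Repeatedly predict the next symbol, observe it, and update counts."""
    counts = defaultdict(Counter)  # context -> counts of the symbols that followed it
    correct = 0
    total = 0
    for t in range(context_len, len(stream)):
        context = stream[t - context_len:t]
        seen = counts[context]
        # Predict: the symbol seen most often after this context (None if unseen).
        prediction = seen.most_common(1)[0][0] if seen else None
        # Observe: the symbol that actually comes next.
        observed = stream[t]
        if prediction is not None:
            total += 1
            correct += prediction == observed
        # Update: record what followed this context.
        counts[context][observed] += 1
    return correct, total, counts


if __name__ == "__main__":
    # The bit stream shown on the slide.
    correct, total, _ = run_prediction_game("0011101110000")
    print(f"{correct} correct out of {total} predictions")
```

The loop is incremental: each symbol is predicted before it is seen, and only local counts are updated afterwards.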
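Example Category Node (processed Jane Austen’s online books)
[Figure: the category node “ther ”. Categories appearing before it include “ bro”, “ far”, “toge”, and “nei”; prediction weights shown in the figure include 0.087, 0.07, 0.057, and 0.052. Other categories labeled in the figure are “and ”, “heart”, “love ”, and “by ”, with values 0.13, 0.11, and 0.10. The node keeps local statistics (values shown: 7.1, 0.41).]
(Exploring Massive Learning via a Prediction System, AAAI FSS’07)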
Some Challenges or Features of the Task
• Lots of
– features/predictors (input dimensionality),
– classes (output dimensionality),
– instances (episodes)
• Uncertainty in the value of features, classes, adequate segmentation, …
– No one segments them for us! (what about written language?)
• Require algorithms that are primarily:
– incremental, handle nonstationarities, uncertainty, asymptotic convergence, efficient sample complexity
• Objectives and evaluation criteria?
Many-Class Learning (.. A Wiring Problem)
• The questions raised during this research:
1. Given the need to quickly classify a given instance into one of myriad classes (e.g. millions), how can this be done?
– How about space efficiency?
2. How can we efficiently learn such efficient classification systems?
[Figure: many-class learning produces a classification system that maps an instance $x \in \mathbb{R}^n$ to a class, $x \to\ ?$]
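A Solution: Index Learning
Input: tripartite graph (features, instances, categories)
learn
Output: an index = sparse weighted bipartite graph (features, categories), where the edge from feature $f_i$ to category $c_j$ has weight $w_{ij}$
Output: a (sparse) matrix $W$ with entries $w_{ij}$ (0 where there is no edge)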
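The two views of the output on this slide (a sparse weighted bipartite graph and a sparse matrix W) can be related with a small Python sketch. It only shows the data structure, not how the index is learned; the function names `dense_to_index` and `edge_count` are hypothetical and the weights are made up for illustration.

```python
def dense_to_index(W, feature_names, class_names):
    """Convert a dense weight matrix into a sparse index: feature -> {class: weight}."""
    index = {}
    for i, row in enumerate(W):
        # Keep only nonzero entries; each one is an edge of the bipartite graph.
        edges = {class_names[j]: w for j, w in enumerate(row) if w != 0}
        if edges:
            index[feature_names[i]] = edges
    return index


def edge_count(index):
    """Number of edges stored, i.e. nonzero entries of W."""
    return sum(len(edges) for edges in index.values())


# Illustrative weights only (not taken from the slides).
W = [
    [0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 0.5],
]
index = dense_to_index(W, ["f1", "f2", "f3"], ["c1", "c2", "c3"])
```

Storing only the nonzero entries is what keeps the index small when there are very many features and classes but each feature connects to few classes.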
Classification/Prediction (retrieval & scoring)
Given an instance $x = \{f_2, f_3\}$:
1. Features are “activated”
2. Edges are activated
3. Receiving classes are activated
4. Classes sorted/ranked
[Figure: bipartite graph from features f1–f4 to classes c1–c5; the activated edges carry weights 0.4, 0.3, 0.2, 0.1, 0.1]
sorted list: $(c_4, 0.5), (c_3, 0.4), (c_5, 0.1), (c_1, 0.1)$
see omadani.net for the learning algorithms
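The four retrieval and scoring steps can be sketched in Python. Two things here are assumptions rather than facts from the slide: which feature each edge belongs to (the `INDEX` below is a hypothetical assignment of weights) and the rule that a class's score is the sum of the weights on its activated edges. The name `rank_classes` is also hypothetical.

```python
from collections import defaultdict

# Hypothetical index: feature -> {class: weight}. The edge assignment is assumed.
INDEX = {
    "f1": {"c2": 0.5},
    "f2": {"c3": 0.4, "c4": 0.3, "c1": 0.1},
    "f3": {"c4": 0.2, "c5": 0.1},
    "f4": {"c2": 0.3},
}


def rank_classes(index, active_features):
    """Score and rank the classes reachable from the active features."""
    scores = defaultdict(float)
    for f in active_features:                  # 1. features are activated
        for c, w in index.get(f, {}).items():  # 2. their edges are activated
            scores[c] += w                     # 3. receiving classes are activated
    # 4. classes are sorted by score, highest first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_classes(INDEX, ["f2", "f3"])
```

Only classes connected to an active feature are ever touched, so the cost depends on the number of activated edges rather than on the total number of classes.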
Summary
• Encouraging signs that elements of unsupervised (more “autonomous”) long-term learning systems are developing:
– For instance, efficient many-class learning is a good possibility
– Good progress in machine learning (e.g. some evidence that hierarchical networks are useful)
• Our work stresses large-scale and long-term learning
– A “systems” approach (compared to traditional neural network approaches): we need to solve multiple problems and need multiple algorithms
– Many challenges: uncertainties (e.g. feature noise and label noise); nonstationarities (concepts evolve, the system evolves and develops); what the system objective(s) should be; avoiding accumulation of error, local minima, slow learning; understanding the interaction between different modules (segmentation and concept learning, etc.)
• Driven by the goal of robustly solving practical problems (versus driven by “modeling” the brain), but problems that we think intelligence in the biological world solves.
Expedition (a 1st System)
[Figure: a window containing context and target slides over the stream “… New Jersey in …”. The predictors are the active categories in the context; the target is the category to predict. At the next time step the window moves on, giving new predictors and a new target.]
In this example, context contains one category on each side
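The sliding window of predictors and target on this slide can be sketched in Python. This is a simplified illustration, not the Expedition system itself: it assumes the stream has already been segmented into a list of categories, and the names `prediction_episodes` and `context_size` are hypothetical.

```python
def prediction_episodes(categories, context_size=1):
    """Slide a window over a segmented stream, yielding (predictors, target) pairs."""
    episodes = []
    for t in range(context_size, len(categories) - context_size):
        left = categories[t - context_size:t]            # context before the target
        right = categories[t + 1:t + 1 + context_size]   # context after the target
        # The target is the category to predict; the context supplies the predictors.
        episodes.append((left + right, categories[t]))
    return episodes


# Hypothetical segmentation of a short stream into categories.
stream = ["loves", "New", "Jersey", "in"]
episodes = prediction_episodes(stream, context_size=1)
```

With `context_size=1` each target gets one predictor on each side, as in the slide's example; each step of the loop corresponds to the window moving to the next time step.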
.. Some Time Later ..
[Figure: the stream “… loves New York life …” with a window containing context and target; the predictors surround the target (category to predict).]
In terms of supervised learning/classification, in this learning activity (prediction games):
• The set of concepts grows over time
• Same for features/predictors (concepts ARE the predictors!)
• Instance representation (segmentation of the data stream) changes/grows over time ..
On Learning a Task (or a Dilemma of AI!)
Program It!
Learn It!
Program to Learn It!
Program to Learn to Learn It!...
A View of ML: On the Source of Classes
(A Spectrum of Feedback-Driven (“Supervised”) Learning)
• Classes are (1) human defined and (2) human/explicitly assigned (human procures training data): classic supervised learning
– Annotator/editorial label assignment (Reuters RCV1, ODP, …), controlled image tagging, ~Mechanical Turk, explicit personalization (news filtering, spam, …)
• Classes are (1) human defined and (2) implicitly assigned (by the “world” or a “natural” activity, or by machine)
– Predict a word using context in text, the Newsgroup data set, image tagging in Flickr, users as classes, queries as classes, predict clicks, …
• Classes are (1) machine defined and (2) implicitly assigned (by the “world” or a “natural” activity/machine): autonomous learning systems
– Systems acquiring and developing their own concepts, prediction games, complex sensory input streams, cumulative learning, life-long learning, development, …
• Moving along the spectrum: more machine autonomy (less human involvement), more noise/uncertainty, more training data, more classes, more open problems, more interesting!
Summary
See omadani.net/publications.html