Cooperating Intelligent Systems Learning from observations Chapter 18, AIMA Machine Learning.
Cooperating Intelligent Systems
Learning from observations (Chapter 18, AIMA)
Machine Learning
Two types of learning in AI
Deductive: Deduce rules/facts from already known rules/facts. (We have already dealt with this)
Inductive: Learn new rules/facts from a data set D.
D = {(x_n, y_n)}, n = 1, …, N
We will be dealing with the latter, inductive learning, now
Two types of inductive learning
Supervised: The machine has access to a teacher who corrects it.
Unsupervised: No access to teacher. Instead, the machine must search for “order” and “structure” in the environment.
Two tracks
• Regression: Learning function values
• Classification: Learning categories
Inductive learning - example A
• f(x) is the target function.
• An example is a pair [x, f(x)].
• Learning task: find a hypothesis h such that h(x) ≈ f(x), given a training set of examples D = {[x_i, f(x_i)]}, i = 1, 2, …, N.
x = (0, 0, 1, 0, 1, 0, 1, 1, 1), f(x) = 1
x = (0, 0, 1, 1, 1, 0, 0, 1, 1), f(x) = 1
x = (0, 1, 0, 1, 1, 0, 0, 1, 1), f(x) = 0
Etc...
Inspired by a slide from V. Pavlovic
Inductive learning – example B
• Construct h so that it agrees with f.
• The hypothesis h is consistent if it agrees with f on all observations.
• Ockham’s razor: Select the simplest consistent hypothesis.
• How do we achieve good generalization?
(Figures: a consistent linear fit; a consistent 7th-order polynomial fit; an inconsistent linear fit; a consistent 6th-order polynomial fit; a consistent sinusoidal fit.)
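The notion of consistency above can be sketched directly in code. This is a minimal illustration: the helper name `is_consistent` and the toy parity data are invented for the example, not taken from AIMA.

```python
# Minimal sketch: a hypothesis h is "consistent" if it agrees with the
# target f on every observation in the training set.

def is_consistent(h, examples):
    """True if h(x) == f(x) for every (x, f_x) pair in the training set."""
    return all(h(x) == f_x for x, f_x in examples)

# Toy training set: f is the parity (XOR) of two bits (invented example).
D = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

h_parity = lambda x: x[0] ^ x[1]   # agrees with f everywhere: consistent
h_or = lambda x: x[0] | x[1]       # wrong on (1, 1): inconsistent

print(is_consistent(h_parity, D))  # True
print(is_consistent(h_or, D))      # False
```

Ockham’s razor then says: among the (usually many) consistent hypotheses, prefer the simplest.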
Inductive learning – example C

(Figures: the same x–y data set fitted with increasingly complex hypotheses.)

Example from V. Pavlovic @ Rutgers

Sometimes a consistent hypothesis is worse than an inconsistent one.
The idealized inductive learning problem
(Figure: the hypothesis space H containing h_opt(x), the true function f(x) outside it, and the error between them.)

Find an appropriate hypothesis space H, and find the h(x) ∈ H with minimum “distance” to f(x) (the “error”).
The learning problem is realizable if f(x) ∈ H.
The real inductive learning problem

Data is never noise-free and never available in infinite amounts, so we get variation in both data and model. The generalization error is a function of both the training data and the hypothesis selection method.

Find an appropriate hypothesis space H and minimize the expected distance to f(x) (the “generalization error” E_gen).

(Figure: an ensemble {f(x)} of possible targets and the corresponding ensemble {h_opt(x)} of selected hypotheses, separated by E_gen.)
Hypothesis spaces (examples)
f(x) = 0.5 + x + x² + 6x³

H₁ = {a + bx} (linear); H₂ = {a + bx + cx²} (quadratic); H₃ = {a + bx + cx² + dx³} (cubic)
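These hypothesis spaces can be explored numerically. The following sketch (assuming NumPy is available) fits H₁, H₂, and H₃ to samples of the f(x) above by least squares:

```python
# Sketch: least-squares fits of the linear, quadratic and cubic hypothesis
# spaces to noise-free samples of f(x) = 0.5 + x + x^2 + 6x^3.
import numpy as np

x = np.linspace(-1.0, 1.0, 50)
y = 0.5 + x + x**2 + 6 * x**3        # samples of the target f

for degree in (1, 2, 3):             # H1, H2, H3
    coeffs = np.polyfit(x, y, degree)
    max_err = np.max(np.abs(np.polyval(coeffs, x) - y))
    print(f"H{degree}: max fit error = {max_err:.3g}")
```

Since f itself lies in H₃ (the problem is realizable there), only the cubic fit drives the error to numerical zero; H₁ and H₂ leave a residual however their coefficients are chosen.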
Learning problems
• The hypothesis takes as input a set of attributes x and returns a ”decision” h(x) = the predicted (estimated) output value for the input x.
• Discrete-valued function ⇒ classification
• Continuous-valued function ⇒ regression
Classification
Order into one out of several classes
The classifier maps the input space X to the output (category) space C, with x ∈ X ⊆ ℝ^D and K categories:

x = (x₁, x₂, …, x_D)ᵀ ∈ X,  c = (c₁, c₂, …, c_K)ᵀ ∈ C

With one-out-of-K coding, each c has a single 1; e.g. c = (0, 1, 0, …, 0)ᵀ denotes class 2.
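One-out-of-K coding can be written out directly; a minimal sketch (the function name is illustrative):

```python
# Sketch: one-out-of-K (one-hot) coding of a category index.

def one_hot(k, K):
    """Encode class index k (0-based) as a K-dimensional indicator vector."""
    c = [0] * K
    c[k] = 1
    return c

print(one_hot(1, 3))  # [0, 1, 0]: class 2 of K = 3, as above
```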
Example: Robot color vision
Classify the Lego pieces into red, blue, and yellow. Classify white balls, black sideboard, and green carpet. Input = pixel in image, output = category.
Regression
The “fixed regressor model”:

f(x) = g(x) + ε

x: observed input; f(x): observed output; g(x): true underlying function; ε: i.i.d. noise process with zero mean.
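The fixed regressor model is easy to simulate. In this sketch, the choice of g and the noise level sigma are illustrative, not from the slides:

```python
# Sketch of the fixed regressor model f(x) = g(x) + eps, where eps is
# i.i.d. noise with zero mean (Gaussian here, as an example).
import random

def g(x):
    return 2.0 * x + 1.0               # "true" underlying function (example)

def observe(x, sigma=0.1):
    eps = random.gauss(0.0, sigma)     # zero-mean i.i.d. noise
    return g(x) + eps

# A data set of observed (x, f(x)) pairs:
D = [(x / 10.0, observe(x / 10.0)) for x in range(100)]
```

The learner only ever sees the noisy f(x), never g(x) itself, which is why a hypothesis that fits the observations exactly can be worse than one that does not.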
Example: Predict price for cotton futures
Input: Past history of closing prices and trading volume.
Output: Predicted closing price.
Now...
let’s look at a classification problem: predicting whether a certain person will wait for a table at a restaurant.
Method: Decision trees
• “Divide and conquer”: Split data into smaller and smaller subsets.
• Splits are usually made on a single variable
x1 > ?
yes no
x2 > ? x2 > ?
yes yesno no
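The diagram corresponds to nested single-variable threshold tests. A minimal hand-written sketch, with thresholds invented for illustration:

```python
# Sketch: each internal node of the tree tests one variable against a
# threshold; leaves return the decision. Thresholds are illustrative.

def predict(x):
    if x[0] > 0.5:        # root: x1 > ?
        if x[1] > 0.5:    # left subtree: x2 > ?
            return "yes"
        return "no"
    else:                 # right subtree: x2 > ?
        if x[1] > 0.3:
            return "no"
        return "yes"

print(predict((0.9, 0.8)))  # "yes"
print(predict((0.9, 0.1)))  # "no"
```

Learning a decision tree amounts to choosing which variable to test at each node (and where to split), which is what the entropy criterion that follows is for.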
The wait@restaurant decision tree
This is our true function. Can we learn this tree from examples?
Inductive learning of decision tree
• Simplest: Construct a decision tree with one leaf for every example. This is memory-based learning, with poor generalization.
• Advanced: Split on each variable so that the purity of each split increases (i.e., each branch contains, as far as possible, only yes or only no).
• Purity is measured, e.g., with entropy.
Entropy = −P(yes) log P(yes) − P(no) log P(no)

General form: Entropy = −Σᵢ P(vᵢ) log P(vᵢ)

(Here log denotes the base-10 logarithm, which matches the numerical values in the examples that follow; changing the base only rescales the entropy.)
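As a sketch, the entropy of a node can be computed from its class counts. Base-10 logarithms are used because they reproduce the numerical values in the worked example that follows:

```python
# Sketch: entropy of a node from its class counts (base-10 logs, matching
# the slide numbers; the base only changes the scale, not which split wins).
import math

def entropy(counts):
    """-sum_i p_i log10(p_i) over the nonzero class counts."""
    total = sum(counts)
    # "+ 0.0" turns the -0.0 of a pure node into a clean 0.0
    return -sum((c / total) * math.log10(c / total) for c in counts if c > 0) + 0.0

print(f"{entropy([6, 6]):.2f}")  # 0.30: maximal, both classes equally likely
print(f"{entropy([6, 0]):.2f}")  # 0.00: a pure node has zero entropy
```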
The entropy is maximal whenall possibilities are equallylikely.
The goal of the decision treeis to decrease the entropy ineach node.
Entropy is zero in a pure ”yes”node (or pure ”no” node).
The second law of thermodynamics:Elements in a closed system tend to seek their most probable distribution; in a closed system entropy always increases
Entropy is a measure of ”disorder” in a system.
Decision tree learning algorithm
• Create pure nodes whenever possible.
• If pure nodes are not possible, choose the split that leads to the largest decrease in entropy.
Decision tree learning example
10 attributes:
1. Alternate: Is there a suitable alternative restaurant nearby? {yes, no}
2. Bar: Is there a bar to wait in? {yes, no}
3. Fri/Sat: Is it Friday or Saturday? {yes, no}
4. Hungry: Are you hungry? {yes, no}
5. Patrons: How many are seated in the restaurant? {none, some, full}
6. Price: Price level {$, $$, $$$}
7. Raining: Is it raining? {yes, no}
8. Reservation: Did you make a reservation? {yes, no}
9. Type: Type of food {French, Italian, Thai, Burger}
10. Wait: Estimated waiting time {0–10 min, 10–30 min, 30–60 min, >60 min}
Decision tree learning example
T = True, F = False. The 12 examples contain 6 True and 6 False.

Entropy = −(6/12) log(6/12) − (6/12) log(6/12) ≈ 0.30
Decision tree learning example
Split on Alternate: Yes → 3 T, 3 F; No → 3 T, 3 F.

Entropy = (6/12)[−(3/6) log(3/6) − (3/6) log(3/6)] + (6/12)[−(3/6) log(3/6) − (3/6) log(3/6)] ≈ 0.30

Entropy decrease = 0.30 − 0.30 = 0
Decision tree learning example
Split on Bar: Yes → 3 T, 3 F; No → 3 T, 3 F.

Entropy = (6/12)[−(3/6) log(3/6) − (3/6) log(3/6)] + (6/12)[−(3/6) log(3/6) − (3/6) log(3/6)] ≈ 0.30

Entropy decrease = 0.30 − 0.30 = 0
Decision tree learning example
Split on Sat/Fri: Yes → 2 T, 3 F; No → 4 T, 3 F.

Entropy = (5/12)[−(2/5) log(2/5) − (3/5) log(3/5)] + (7/12)[−(4/7) log(4/7) − (3/7) log(3/7)] ≈ 0.29

Entropy decrease = 0.30 − 0.29 = 0.01
Decision tree learning example
Split on Hungry: Yes → 5 T, 2 F; No → 1 T, 4 F.

Entropy = (7/12)[−(5/7) log(5/7) − (2/7) log(2/7)] + (5/12)[−(1/5) log(1/5) − (4/5) log(4/5)] ≈ 0.24

Entropy decrease = 0.30 − 0.24 = 0.06
Decision tree learning example
Split on Raining: Yes → 2 T, 2 F; No → 4 T, 4 F.

Entropy = (4/12)[−(2/4) log(2/4) − (2/4) log(2/4)] + (8/12)[−(4/8) log(4/8) − (4/8) log(4/8)] ≈ 0.30

Entropy decrease = 0.30 − 0.30 = 0
Decision tree learning example
Split on Reservation: Yes → 3 T, 2 F; No → 3 T, 4 F.

Entropy = (5/12)[−(3/5) log(3/5) − (2/5) log(2/5)] + (7/12)[−(3/7) log(3/7) − (4/7) log(4/7)] ≈ 0.29

Entropy decrease = 0.30 − 0.29 = 0.01
Decision tree learning example
Split on Patrons: None → 2 F; Some → 4 T; Full → 2 T, 4 F.

Entropy = (2/12)·0 + (4/12)·0 + (6/12)[−(2/6) log(2/6) − (4/6) log(4/6)] ≈ 0.14
(The None and Some branches are pure, so their entropy is zero.)

Entropy decrease = 0.30 − 0.14 = 0.16
Decision tree learning example
Split on Price: $ → 3 T, 3 F; $$ → 1 T, 3 F; $$$ → 2 T.

Entropy = (6/12)[−(3/6) log(3/6) − (3/6) log(3/6)] + (4/12)[−(1/4) log(1/4) − (3/4) log(3/4)] + (2/12)·0 ≈ 0.23

Entropy decrease = 0.30 − 0.23 = 0.07
Decision tree learning example
Split on Type: French → 1 T, 1 F; Italian → 1 T, 1 F; Thai → 2 T, 2 F; Burger → 2 T, 2 F.

Entropy = 2·(2/12)[−(1/2) log(1/2) − (1/2) log(1/2)] + 2·(4/12)[−(2/4) log(2/4) − (2/4) log(2/4)] ≈ 0.30

Entropy decrease = 0.30 − 0.30 = 0
Decision tree learning example
Split on estimated waiting time: 0–10 → 4 T, 2 F; 10–30 → 2 F; 30–60 → 1 T, 1 F; >60 → 1 T, 1 F.

Entropy = (6/12)[−(4/6) log(4/6) − (2/6) log(2/6)] + (2/12)·0 + (2/12)[−(1/2) log(1/2) − (1/2) log(1/2)] + (2/12)[−(1/2) log(1/2) − (1/2) log(1/2)] ≈ 0.24

Entropy decrease = 0.30 − 0.24 = 0.06
Decision tree learning example
Split on Patrons: None → 2 F; Some → 4 T; Full → 2 T, 4 F.

The largest entropy decrease (0.16) is achieved by splitting on Patrons, so it becomes the root split. Continue like this on the impure Full branch, making new splits and always purifying the nodes.
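The whole comparison of candidate root splits can be reproduced in a few lines. The branch (T, F) counts below are the ones from the example above, and base-10 logs match the slide numbers:

```python
# Sketch: choose the split with the largest entropy decrease. Branch
# counts (T, F) per attribute value are taken from the example above.
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log10(c / total) for c in counts if c > 0) + 0.0

def remainder(branches):
    """Weighted entropy after splitting into the given (T, F) branches."""
    n = sum(sum(b) for b in branches)
    return sum((sum(b) / n) * entropy(b) for b in branches)

splits = {
    "Alternate": [(3, 3), (3, 3)],
    "Hungry": [(5, 2), (1, 4)],
    "Patrons": [(0, 2), (4, 0), (2, 4)],        # None / Some / Full
    "Type": [(1, 1), (1, 1), (2, 2), (2, 2)],   # French/Italian/Thai/Burger
}
base = entropy([6, 6])                          # ~0.30 before any split
for name, branches in splits.items():
    print(f"{name}: entropy decrease = {base - remainder(branches):.2f}")
# Patrons wins with a decrease of 0.16, so it becomes the root.
```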
Decision tree learning example
Induced tree (from examples)
Decision tree learning example
True tree
Decision tree learning example
Induced tree (from examples)
The induced tree cannot be made more complex than what the data supports.
How do we know it is correct?
How do we know that h ≈ f? (Hume’s Problem of Induction)
– Try h on a new test set of examples (cross-validation)
– ...and assume the ”principle of uniformity”: the result we get on this test data should be indicative of results on future data. Causality is constant.
Inspired by a slide by V. Pavlovic
Learning curve for the decision tree algorithm on 100 randomly generated examples in the restaurant domain. The graph summarizes 20 trials.
Cross-validation
Use a “validation set”: split your data set into two parts, D_train for training your model and D_val for validating it. The error on the validation data is called the “validation error”, E_val, and it estimates the generalization error: E_gen ≈ E_val.
K-fold cross-validation
More accurate than using only one validation set: rotate the validation fold D_val over K different splits of the data and average the validation errors:

E_gen ≈ E_val = (1/K) Σₖ₌₁ᴷ E_val(k)
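A sketch of the K-fold procedure; the `train` and `error` callables are placeholders for whatever model and error measure are in use:

```python
# Sketch: K-fold cross-validation, estimating E_gen as the average of the
# K validation errors E_val(k).

def k_fold_indices(n, K):
    """Partition indices 0..n-1 into K (nearly) equal contiguous folds."""
    folds, start = [], 0
    for k in range(K):
        size = n // K + (1 if k < n % K else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, K, train, error):
    e_vals = []
    for fold in k_fold_indices(len(data), K):
        held_out = set(fold)
        d_val = [data[i] for i in fold]                               # D_val
        d_train = [data[i] for i in range(len(data)) if i not in held_out]
        model = train(d_train)                                        # fit on D_train
        e_vals.append(error(model, d_val))                            # E_val(k)
    return sum(e_vals) / K                                            # ~ E_gen
```

For example, with `train = lambda d: sum(d) / len(d)` (predict the training mean) and an absolute-error measure, `cross_validate` returns the averaged validation error of that trivial model.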
PAC
• Any hypothesis that is consistent with a sufficiently large set of training (and test) examples is unlikely to be seriously wrong; it is probably approximately correct (PAC).
• What is the relationship between the generalization error and the number of samples needed to achieve this generalization error?
The error

(Figure: instance space X, with the region where f and h disagree shaded.)

X = the set of all possible examples (the instance space).
D = the distribution from which the examples are drawn.
H = the hypothesis space (h ∈ H).
N = the number of training data.

error(h) = P[h(x) ≠ f(x) | x drawn from D]

Image adapted from F. Hoffmann @ KTH
Probability for bad hypothesis
Suppose we have a bad hypothesis h, with error(h) > ε. What is the probability that it is consistent with N samples?

• Probability of being inconsistent with one sample = error(h) > ε.
• Probability of being consistent with one sample = 1 − error(h) < 1 − ε.
• Probability of being consistent with N independently drawn samples < (1 − ε)^N.
What is the probability that the set H_bad of bad hypotheses, with error(h) > ε, contains a consistent hypothesis?

P(H_bad contains a consistent h) ≤ |H_bad| (1 − ε)^N ≤ |H| (1 − ε)^N

If we want this to be less than some constant δ, then

ln|H| + N ln(1 − ε) ≤ ln δ

which gives

N ≥ [ln|H| + ln(1/δ)] / [−ln(1 − ε)], satisfied whenever N ≥ (1/ε)[ln|H| + ln(1/δ)]

Don’t expect to learn very well if H is large.
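Plugging numbers into the bound makes the warning concrete. This sketch uses the approximate form N ≥ (1/ε)(ln|H| + ln(1/δ)) from above; the choice n = 10 is illustrative:

```python
# Sketch: PAC sample bound N >= (1/eps) * (ln|H| + ln(1/delta)).
# For all boolean functions of n attributes, |H| = 2^(2^n), so
# ln|H| = 2^n * ln 2 and the bound grows exponentially in n.
import math

def pac_samples(ln_H, eps, delta):
    """Training-set size sufficient for error <= eps w.p. >= 1 - delta."""
    return math.ceil((ln_H + math.log(1.0 / delta)) / eps)

n = 10                              # boolean attributes (example value)
ln_H = 2**n * math.log(2.0)         # ln|H| for the full boolean space
print(pac_samples(ln_H, eps=0.1, delta=0.05))  # 7128
```

Even with only 10 boolean attributes, the unrestricted hypothesis space already demands thousands of examples, which is exactly why the final slide argues for simple, constrained hypothesis spaces.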
How do we make learning work?

• Use simple hypotheses
– Always start with the simple ones first
• Constrain H with priors
– Do we know something about the domain?
– Do we have reasonable a priori beliefs on parameters?
• Use many observations
– Easy to say...
• Cross-validation...