Department of Statistics Florida State University
The Artificial Prediction Market
Adrian Barbu
Department of Statistics
Florida State University
Joint work with Nathan Lay, Siemens Corporate Research
Overview
Main contributions:
- A mathematical theory for Artificial Prediction Markets
  - Loss function
  - Relation to existing methods: linear aggregation, SVM, logistic regression
- Extension to regression estimation
- Experimental comparison with Random Forest and Adaboost
Motivation
Main goal: classification
- Let Ω ⊂ R^F be the instance space
- K possible classes (outcomes) {1,…,K}
Supervised learning: given training examples (x_i, y_i) ∈ Ω × {1,…,K}, learn a function f(x) = (f_1(x),…,f_K(x)) such that f_k(x) is a good approximation of p(Y=k|x).
The Iowa Electronic Market
Market setup:
- Contracts for each outcome are bought and sold at a market price 0 < c < 1.
- Each contract pays $1 if the outcome is realized.
- The market price of a contract is a good approximation of the probability that the corresponding event occurs.
The Artificial Prediction Market
Goal: predict the class probability p(y|x)
Market formulation:
- Simulate the Iowa Electronic Market
- Market participants = classifiers
- Solve the market price equations to obtain total budget conservation; there are no price fluctuations
Train the market using training examples (x_i, y_i) ∈ Ω × {1,…,K}:
- Participants bet on instance x_i
- Wins are based on contracts purchased for the correct class y_i
- Participants become rich or poor based on prediction ability
- The trained market predicts better than the untrained one
Other Prediction Markets
Perols 2009:
- Parimutuel betting with odds update
- Participants are not trained (have equal budgets)
- Evaluated on UCI datasets
Using the market maker (Chen and Vaughan, 2010; Abernethy et al., 2011):
- Participants enter the market sequentially
- Participants are paid according to a scoring rule
- See Tuesday’s tutorial
Machine Learning Markets (Storkey 2011):
- Participants bet to maximize a utility function
- Equilibrium price is computed by optimization
The Artificial Prediction Market
A simulation of the Iowa Electronic Market:
- Each class k = 1,…,K corresponds to a contract type.
- The market price is a vector c = (c_1,…,c_K). We enforce Σ_k c_k = 1.
- A contract for class k sells at market price 0 < c_k < 1 and pays 1 if the outcome is k.
A market participant is not a human, but a pair of:
1. A budget (or weight) β_m, based on past ability in predicting the correct class.
2. A betting function: the percentage of its budget that the participant allocates to each class.
Constant Betting Functions
Allocate the same amount independent of the price.
[Figure: plot of constant betting functions; both axes range from 0 to 1]
Linear Betting Functions
[Figure: plot of linear betting functions; both axes range from 0 to 1]
Aggressive Betting Functions
Buy/sell based on the classifier's estimate of p(y|x).
[Figure: plot of aggressive betting functions; both axes range from 0 to 1]
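The three families of betting functions can be sketched in a few lines of Python. The formulas did not survive in this transcript, so the forms below are assumptions chosen to match the slide descriptions: constant betting ignores the price, linear betting scales the bet by `(1 - c_k)`, and aggressive betting bets fully while the price is at least `eps` below the classifier's estimate and stops once the price exceeds it. The names `h_k` (classifier output for class k), `eta` and `eps` are illustrative.

```python
def constant_bet(h_k, c_k, eta=1.0):
    """Fraction of the budget bet on class k; independent of the price c_k."""
    return eta * h_k


def linear_bet(h_k, c_k):
    """Bet shrinks linearly as the contract price c_k rises (assumed form)."""
    return (1.0 - c_k) * h_k


def aggressive_bet(h_k, c_k, eps=0.1):
    """Bet fully while the contract is cheap, nothing once it is overpriced.

    Assumed form: a linear ramp of width eps joins the two regimes.
    """
    if c_k <= h_k - eps:
        return h_k
    if c_k > h_k:
        return 0.0
    return h_k * (h_k - c_k) / eps
```

Each function returns the fraction of a participant's budget placed on one class, so a participant's bets over all classes are obtained by calling it once per class.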
Artificial Prediction Market Diagram
[Figure: the input (x, y) enters the prediction market. Each market participant m = 1,…,M consists of a classifier h_m(x), a betting function, and a budget. The equilibrium price c is obtained from the price equations and gives the estimated probability p(y|x) = c.]
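Market Update (x,y)
1. Compute the equilibrium price c based on the price equations.
2. For each m = 1,…,M, update participant m’s budget: subtract the amounts bet on all classes and add the winnings from the contracts purchased for the correct class y.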
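The budget update in step 2 can be written out explicitly. The sketch below is a reconstruction, not the slide's own formula: it assumes `phi[m][k]` is the fraction of participant m's budget bet on class k at the current price, so the participant spends `beta * phi[m][k]` on class k and buys `beta * phi[m][k] / price[k]` contracts, each paying 1 if class k is realized. The function and argument names are hypothetical.

```python
def update_budgets(budgets, phi, price, y):
    """Return the budgets after the market observes the true class y.

    budgets: list of beta_m
    phi:     phi[m][k], fraction of beta_m bet on class k (assumed notation)
    price:   equilibrium price vector c
    y:       index of the realized class
    """
    new_budgets = []
    for beta, bets in zip(budgets, phi):
        spent = beta * sum(bets)           # paid for contracts on all classes
        won = beta * bets[y] / price[y]    # contracts on class y pay 1 each
        new_budgets.append(beta - spent + won)
    return new_budgets


# Two participants betting their whole budget; the price used here is the
# budget-weighted average of their bets.
example = update_budgets([1.0, 1.0], [[0.7, 0.3], [0.4, 0.6]], [0.55, 0.45], y=0)
```

In this example the participant that put more weight on the realized class ends up richer, while the total budget stays at 2.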
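Price Equations
Main requirement: the total budget must remain the same after each market update, independent of the outcome y. This means that the total amount bet by all participants must equal the total amount paid out for the contracts on class y.
This must hold for any y, since the market price c must depend only on x for prediction purposes.
We also have the normalization Σ_k c_k = 1.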
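Written out, budget conservation gives one equation per class. The display below is a reconstruction under assumed notation, since the equations did not survive in this transcript: β_m is the budget of participant m and φ_m^k(x,c) is the fraction of that budget bet on class k at price c.

```latex
\sum_{m=1}^{M}\sum_{k=1}^{K}\beta_m\,\phi_m^k(x,c)
  \;=\; \frac{1}{c_y}\sum_{m=1}^{M}\beta_m\,\phi_m^y(x,c)
  \qquad \text{for all } y\in\{1,\dots,K\},
\qquad\qquad
\sum_{k=1}^{K} c_k = 1 .
```

The left-hand side is the total amount bet; the right-hand side is the total paid out if class y is realized, because β_m φ_m^y / c_y is the number of class-y contracts held by participant m.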
Solving the Price Equations
Price uniqueness:
- Under a monotonicity condition on the betting functions, the price c is unique.
- The condition holds for our betting functions.
Solving the price equations:
- Analytically when possible: for the constant market and for the two-class linear market.
- Numerically: double bisection method, or Mann iteration (faster).
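A numerical solve can be sketched as a relaxed fixed-point iteration of the Mann type. This is an illustration under stated assumptions, not the authors' implementation: it assumes linear betting of the form `(1 - c_k) * h_m[k]`, rewrites the price equations as the fixed point `c_k = (bets on k) / (all bets)`, and averages the old and new price with a constant weight `relax`. The slides do not give step sizes; `relax=0.5` is a choice made here, and all names are hypothetical.

```python
def linear_market_price(budgets, h, relax=0.5, iters=200, tol=1e-12):
    """Equilibrium price for an assumed linear-betting market.

    budgets: list of beta_m
    h:       h[m][k], classifier m's probability estimate for class k
    """
    num_classes = len(h[0])
    c = [1.0 / num_classes] * num_classes   # start from the uniform price
    for _ in range(iters):
        # Total amount bet on each class at the current price.
        bets = [
            sum(beta * (1.0 - c[k]) * hm[k] for beta, hm in zip(budgets, h))
            for k in range(num_classes)
        ]
        total = sum(bets)
        g = [b / total for b in bets]        # fixed-point map, sums to 1
        new_c = [(1.0 - relax) * ck + relax * gk for ck, gk in zip(c, g)]
        if max(abs(a - b) for a, b in zip(new_c, c)) < tol:
            return new_c
        c = new_c
    return c


price = linear_market_price([2.0, 1.0], [[0.9, 0.1], [0.2, 0.8]])
```

The averaging step matters: for two classes the plain iteration `c = g(c)` tends to oscillate around the equilibrium, while the relaxed version settles within a few iterations.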
Constant Betting is Linear Aggregation
In the case of constant betting functions, the equilibrium price is a linear aggregation of the classifiers, of the kind found in Adaboost, Random Forest, etc.
We also obtain a new online learning rule for linear aggregation.
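The connection to linear aggregation can be illustrated with a short sketch. It assumes each participant bets the fraction `eta * h_m[k]` of its budget on class k, with the classifier outputs `h_m` summing to 1 over classes; under that assumption the price equations reduce to a budget-weighted average, and the general budget update reduces to a one-line rule. The function names and the default `eta` are illustrative, not taken from the slides.

```python
def constant_market_price(budgets, h):
    """Budget-weighted average of the classifier outputs (linear aggregation)."""
    total = sum(budgets)
    num_classes = len(h[0])
    return [
        sum(beta * hm[k] for beta, hm in zip(budgets, h)) / total
        for k in range(num_classes)
    ]


def constant_market_update(budgets, h, y, eta=0.1):
    """One online update: lose eta * beta, win eta * beta * h[y] / c[y]."""
    c = constant_market_price(budgets, h)
    return [beta + eta * beta * (hm[y] / c[y] - 1.0)
            for beta, hm in zip(budgets, h)]


budgets = [1.0, 1.0]
outputs = [[0.7, 0.3], [0.4, 0.6]]
price = constant_market_price(budgets, outputs)         # about [0.55, 0.45]
updated = constant_market_update(budgets, outputs, y=0)
```

Repeating the update over a training set shifts weight toward participants that assign more probability to the correct class than the market does, while the sum of the weights stays fixed.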
Logistic Regression Market
If x ∈ R^M, then a suitable choice of betting functions gives a price equilibrium equation whose solution is the logistic regression model.
Logistic Regression Market Update
- This market has its own budget update rule, which obeys a conservation property.
- It resembles the online logistic regression update rule.
[Figure: an example of logistic betting]
Kernel Method for the Market
- Each instance x_i is a participant, with its own betting function.
- The resulting market has a decision boundary.
Kernel Method for the Market
Decision boundary:
- Can use the RBF kernel trick for nonlinear boundaries
- No margin, though
Maximum Likelihood
- The Constant Market maximizes the log likelihood L(β).
- The batch update can be viewed as gradient ascent on L(β).
- The Market (incremental) update is stochastic gradient ascent.
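The gradient-ascent reading can be checked with a short derivation. It is a reconstruction under stated assumptions, not the slide's own: constant betting with bets η h_m^k(x), classifier outputs that sum to 1 over classes, and total budget B = Σ_m β_m.

```latex
c_y(x) = \frac{1}{B}\sum_{m=1}^{M}\beta_m\,h_m^y(x),
\qquad
\frac{\partial \ln c_y(x)}{\partial \beta_m}
  = \frac{h_m^y(x)}{\sum_{j}\beta_j\,h_j^y(x)} - \frac{1}{B}
  = \frac{1}{B}\left(\frac{h_m^y(x)}{c_y(x)} - 1\right),
\\[6pt]
\beta_m \leftarrow \beta_m + \eta\,\beta_m\left(\frac{h_m^y(x)}{c_y(x)} - 1\right)
  = \beta_m + \eta\,B\,\beta_m\,\frac{\partial \ln c_y(x)}{\partial \beta_m}.
```

Under these assumptions each market update moves β_m along the gradient of the per-example log likelihood ln c_y(x), scaled by β_m, which is consistent with viewing the incremental update as a stochastic ascent step.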
Batch vs Incremental Market Updates
[Figure: loss functions for both the batch and Market (incremental) updates]
Specialization
In Boosting and Random Forest, all classifiers are aggregated for any observation x ∈ Ω.
The Market participants can be specialized:
- A participant can predict very well on a subregion of Ω.
- It will not bet on any x outside its region.
- For each observation, a different subset of classifiers could participate in betting.
- Example: a leaf node of a random tree
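Specialization can be sketched by letting a participant abstain. The example below is a minimal illustration assuming constant betting, where the price is the budget-weighted average over the participants that actually bet on x; a classifier signals abstention by returning `None`. The function name and the abstention convention are assumptions made for the example.

```python
def specialized_price(budgets, classifiers, x):
    """Price computed only from participants whose domain contains x.

    classifiers[m](x) returns a list of class probabilities, or None when
    x lies outside participant m's region (it then places no bet).
    """
    active = []
    for beta, classify in zip(budgets, classifiers):
        h = classify(x)
        if h is not None:
            active.append((beta, h))
    if not active:
        return None                      # nobody bets on this instance
    total = sum(beta for beta, _ in active)
    num_classes = len(active[0][1])
    return [sum(beta * h[k] for beta, h in active) / total
            for k in range(num_classes)]


# A specialist that only bets when x < 0, and a generalist that always bets.
specialist = lambda x: [0.9, 0.1] if x < 0 else None
generalist = lambda x: [0.5, 0.5]
inside = specialized_price([1.0, 3.0], [specialist, generalist], x=-1.0)
outside = specialized_price([1.0, 3.0], [specialist, generalist], x=1.0)
```

Inside the specialist's region both participants shape the price; outside it the price comes from the generalist alone, and only participants that bet would have their budgets updated.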
Decision Tree Rules as Specialized Classifiers
Decision tree rules (leaves) can perfectly classify training data in their specialized domain.
Real Data Results
- 21 datasets from the UC Irvine Machine Learning repository; many are small (≈ 200 examples).
- Training and test sets are randomly subsampled, 90% for training and 10% for testing.
- Exceptions are the satimage and poker datasets, with test sets of size 2000 and 10^6 respectively.
- All results are averaged over 100 runs.
- Significance comparison tests (significance level < 0.01):
  - Mean differences from the RF results of Breiman’01
  - Paired t-tests with our RF implementation
Results on UCI Data
- ADB and RFB are Adaboost and Random Forest from Breiman’01.
- CB (constant betting) and AB (aggressive betting) perform best and significantly outperform RF in many cases.
- Trained markets never performed significantly worse than RF.
Application: Lymph Node Detection
- About 2000 candidate lymph node centers are obtained with a trained detector (Barbu et al, 2012).
- At each candidate, a segmentation is obtained.
- From each segmentation, 17000 features are extracted; ~30 are selected by Adaboost.
[Figure: detected lymph node candidates]
Example Axillary Region
[Figure panels: detected LN candidates; detected lymph nodes; detected lymph nodes]
A Market of Classifier Bins
- Adaboost is based on histogram classifiers with 64 bins.
- Converted to a Constant Market:
  - Each bin is a specialized participant bidding for one class.
  - Initial budgets are the Adaboost coefficients.
  - 2048 participants in total.
  - Weighted update with w+ = 0.5/N+, w− = 0.5/N−
Lymph Node Detection Results
- Detection rate at 3 FP/vol (clinically acceptable)
- Six-fold cross-validation
[Figure: detection rate at 3 FP/vol (y-axis, 0.79 to 0.87) versus epoch (x-axis, 0 to 30); curves: Train Market, Train Adaboost, Test Market, Test Adaboost]
Lymph Node Detection Results
- Market performance at 7 epochs
- p-value 0.028
[Figure: detection rate (y-axis, 0.5 to 0.9) versus false positives per volume (x-axis, 0 to 5); curves: Train Market 7 Epochs, Train Adaboost, Test Market 7 Epochs, Test Adaboost]
The Regression Market
- Extend class labels to have “uncountably many” labels.
- Participants’ bets and prices become conditional densities.
- Equilibrium price and updates generalize.
- As with the Classification Market, it maximizes the log likelihood and minimizes an approximation of E[KL(p(y|x), c(y|x; β))].
The Regression Market
- The betting function gives the proportion of the budget spent on contracts for “class” y at price c.
- The number of contracts purchased for y is the amount spent divided by the price.
- Introduce a reward kernel that rewards “almost” correct predictions (e.g. Gaussian, Dirac delta).
Constant Betting Update Rule
With constant betting functions, this gives the budget update rule.
- η caps the total proportion bet.
- This prevents instantaneous bankruptcies (i.e. β_m = 0).
- η is also the learning rate.
Constant Betting Update Rule: Delta Update
When the reward kernel is a Dirac delta:
- Same update rule as the classification market.
- Still improves aggregation but is prone to overfitting.
Constant Betting Update Rule: Gaussian Update
When the reward kernel is Gaussian:
- An integral has to be evaluated; use Gauss-Hermite quadrature.
- The quadrature uses the Hermite-Gauss nodal points and weights.
- The kernel width σ should reflect the noise level of the training data.
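The quadrature step can be illustrated in isolation. The sketch below approximates the integral of f(t) against a Gaussian kernel centered at y with width `sigma`, using the standard three-point Gauss-Hermite rule. The integrand of the actual market update is not preserved in this transcript, so `f` is an arbitrary placeholder, and the three-point rule is chosen only to keep the example short.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-t^2).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def gaussian_expectation(f, y, sigma):
    """Approximate the integral of f(t) * N(t; y, sigma^2) over t.

    The substitution t = y + sqrt(2) * sigma * u turns the Gaussian density
    into the Hermite weight exp(-u^2) divided by sqrt(pi).
    """
    total = 0.0
    for node, weight in zip(NODES, WEIGHTS):
        total += weight * f(y + math.sqrt(2.0) * sigma * node)
    return total / math.sqrt(math.pi)


# For f(t) = t^2 the exact value is y^2 + sigma^2.
second_moment = gaussian_expectation(lambda t: t * t, y=2.0, sigma=0.5)
```

A three-point rule is exact for polynomials up to degree 5; a smoother or more sharply peaked integrand would call for more nodes.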
Loss Examples
Training, test RMSD and loss for abalone and cpu-performance data sets
Real Data Results
- RFB is Regression Forest from Breiman’01.
- GM and DM perform best and significantly outperform RF in most cases.
Clustering Regression Tree
- Want to “regress” multimodal responses (e.g. circle).
- Generalize the Regression Tree to cluster Y values.
- Use the Market to “weed out” poorly clustered branches of a forest.
[Figure: a single clustering regression tree on the spiral data]
Conclusion
A theory for Artificial Prediction Markets based on the Iowa Electronic Market:
- Aggregates classifiers, regressors, and densities.
- Very simple update rules.
- Includes Logistic Regression and Kernel methods.
- Can be used for both online and offline learning.
- Significantly outperforms Random Forest in many cases, in both prediction and probability estimation.
Future Work
- Generalization error and VC dimension of the Market
- Feature (participant) selection
- Learning betting functions
- Regression Market applications in Computer Vision and Medical Imaging
- Other types of Market participants
References
- N. Lay, A. Barbu. Supervised Aggregation of Classifiers using Artificial Prediction Markets. ICML 2010.
- N. Lay, A. Barbu. The Artificial Regression Market. arXiv.
- A. Barbu, M. Suehling, X. Xu, D. Liu, S. K. Zhou, D. Comaniciu. Automatic Detection and Segmentation of Lymph Nodes from CT Data. IEEE Trans. Medical Imaging, 31(2), 240–250, 2012.
- A. Barbu, N. Lay. An Introduction to Artificial Prediction Markets. JMLR 2012.