
Learning in Bayesian Networks

Known Structure, Complete Data

Known Structure, Incomplete Data

Unknown Structure, Complete Data

Unknown Structure, Incomplete Data

The Learning Problem

Known Structure, Complete Data

Known Structure, Incomplete Data

Unknown Structure, Complete Data

Unknown Structure, Incomplete Data

Known Structure

Method A → CPTs A

Method B → CPTs B

Known Structure

Structure + CPTs A = Pr_A

Structure + CPTs B = Pr_B

Which probability distribution should we choose?

Common criterion: choose the distribution that maximizes the likelihood of the data.

Known Structure

Structure + CPTs A = Pr_A

Structure + CPTs B = Pr_B

Data D: d1, …, dm

Pr_A(D) = Pr_A(d1) · … · Pr_A(dm)   (likelihood of the data given Pr_A)

Pr_B(D) = Pr_B(d1) · … · Pr_B(dm)   (likelihood of the data given Pr_B)

Maximizing Likelihood of Data

• Complete data: a unique set of CPTs maximizes the likelihood of the data.

• Incomplete data: no unique set of CPTs maximizes the likelihood of the data.


Known Structure, Complete Data

Data D: d1, …, dm

θ_{d|bc} = Count(dbc; D) / Count(bc; D)

Estimated parameter: (number of data points d_i containing d, b, c) / (number of data points d_i containing b, c)

Known Structure, Complete Data

Data D: d1, …, dm

θ_{d|bc} = Count(dbc; D) / Count(bc; D) = [ Σ_{j=1..m} I(dbc; d_j) ] / [ Σ_{j=1..m} I(bc; d_j) ]

where I(x; d_j) = 1 if data point d_j is consistent with the instantiation x, and 0 otherwise. (A minimal coding sketch of this estimation follows below.)
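The following Python sketch illustrates the relative-frequency rule above. Only the counting rule θ_{x|u} = Count(xu; D) / Count(u; D) comes from the slides; the variable names and the toy six-record dataset are hypothetical.

```python
# Minimal sketch: maximum-likelihood CPT estimation from complete data.
from collections import Counter

def estimate_cpt(data, child, parents):
    """Estimate Pr(child | parents) by relative frequency over complete records."""
    joint = Counter()    # counts of (parent instantiation, child value)
    marg = Counter()     # counts of the parent instantiation alone
    for record in data:  # each record assigns a value to every variable
        u = tuple(record[p] for p in parents)
        joint[(u, record[child])] += 1
        marg[u] += 1
    return {key: n / marg[key[0]] for key, n in joint.items()}

# Example: estimate theta_{d|bc} = Count(dbc; D) / Count(bc; D) from six data points.
data = [
    {"B": 1, "C": 0, "D": 1}, {"B": 1, "C": 0, "D": 0},
    {"B": 0, "C": 1, "D": 1}, {"B": 1, "C": 1, "D": 1},
    {"B": 0, "C": 0, "D": 0}, {"B": 1, "C": 0, "D": 1},
]
print(estimate_cpt(data, child="D", parents=["B", "C"]))
```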

Complexity

• Network with:
  – Nodes: n
  – Parameters: k
  – Data points: m

• Time complexity: O(m · k) (straightforward implementation)

• Space complexity: O(k + m · n) (parameter count + space for data)

Known Structure, Incomplete Data

θ^{i+1}_{d|bc} = [ Σ_{j=1..m} Pr^i(dbc | d_j) ] / [ Σ_{j=1..m} Pr^i(bc | d_j) ]

Estimated parameters at iteration i+1 (using the CPTs at iteration i); Pr^0 corresponds to the initial Bayesian network (random CPTs).

Known Structure, Incomplete Data

EM Algorithm (Expectation-Maximization):
– Initialize CPTs to random values
– Repeat until convergence:
  – Estimate parameters using the current CPTs (E-step)
  – Update the CPTs using the estimates (M-step)

(A minimal coding sketch of one iteration follows below.)
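The sketch below shows one E-step/M-step pass on a toy two-node network A → B in which A is occasionally missing. The network, data, and initialization are illustrative assumptions; only the expected-count update from the previous slide is taken from the deck.

```python
# Minimal EM sketch for a two-node network A -> B with A sometimes missing.
def em_step(data, pA, pB_given_A):
    """One E-step + M-step. data: list of (a, b) pairs; a may be None (missing)."""
    cA = {0: 0.0, 1: 0.0}                                  # expected counts of A
    cAB = {(a, b): 0.0 for a in (0, 1) for b in (0, 1)}    # expected counts of (A, B)
    for a, b in data:
        if a is None:                       # E-step: distribute the record over A
            w = {x: pA[x] * pB_given_A[(x, b)] for x in (0, 1)}
            z = w[0] + w[1]
            for x in (0, 1):
                cA[x] += w[x] / z
                cAB[(x, b)] += w[x] / z
        else:                               # fully observed record counts as 1
            cA[a] += 1.0
            cAB[(a, b)] += 1.0
    m = len(data)                           # M-step: re-normalize expected counts
    new_pA = {a: cA[a] / m for a in (0, 1)}
    new_pB = {(a, b): cAB[(a, b)] / cA[a] for a in (0, 1) for b in (0, 1)}
    return new_pA, new_pB

# Repeat until the change in parameters (or in the data likelihood) becomes very small.
data = [(1, 1), (None, 1), (0, 0), (None, 0), (1, 0)]
pA = {0: 0.5, 1: 0.5}
pB = {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.7}
for _ in range(20):
    pA, pB = em_step(data, pA, pB)
print(pA, pB)
```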

EM Algorithm

• The likelihood of the data cannot decrease after an iteration.

• The algorithm is not guaranteed to return the network that globally maximizes the likelihood of the data.

• It is guaranteed to return a local maximum; random restarts help avoid poor local maxima.

• The algorithm is stopped when
  – the change in likelihood becomes very small, or
  – the change in parameters becomes very small.

Complexity

• Network with:
  – Nodes: n
  – Parameters: k
  – Data points: m
  – Treewidth: w

• Time complexity (per iteration): O(m · k · n · 2^w) (straightforward implementation)

• Space complexity: O(k + n·m + n · 2^w) (parameter count + space for data + space for inference)

Collaborative Filtering

• Collaborative Filtering (CF) finds items of interest to a user based on the preferences of other, similar users.
  – Assumes that human behavior is predictable.

Where is it used?

• E-commerce
  – Recommend products based on previous purchases or click-stream behavior
  – Ex: Amazon.com

• Information sites
  – Rate items based on previous user ratings
  – Ex: MovieLens, Jester

Example vote database ("-" = no vote):

John   5  -  3  2
Sam    -  4  1  5
Cindy  3  -  5  -
Bob    5  1  -  -

CF fills in Bob's missing votes:

Bob    5  1  3.5  1.7

Memory-based Algorithms

• Use the entire database of user ratings to make predictions.
  – Find users whose voting histories are similar to the active user's.
  – Use these users' votes to predict ratings for products the active user has not voted on.

(A minimal coding sketch follows below.)
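The sketch below is a rough memory-based prediction in the spirit of the vector-similarity ("VecSim") baseline that appears in the results: missing votes are treated as 0 when computing similarity, and the prediction is the active user's mean vote plus a similarity-weighted sum of the other users' deviations from their own means. The exact weighting used in the slides is not spelled out, so this scheme, and the title names T1..T4, are assumptions.

```python
# Minimal sketch of a memory-based (vector-similarity) collaborative-filtering prediction.
import math

def cosine(u, v, titles):
    """Vector similarity between two users' vote vectors; missing votes count as 0."""
    dot = sum(u.get(t, 0) * v.get(t, 0) for t in titles)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(active, title, votes):
    """Predict the active user's vote on `title` from similar users' votes."""
    titles = {t for user in votes.values() for t in user}
    mean_a = sum(votes[active].values()) / len(votes[active])
    num = den = 0.0
    for u, uv in votes.items():
        if u == active or title not in uv:
            continue
        w = cosine(votes[active], uv, titles)
        mean_u = sum(uv.values()) / len(uv)
        num += w * (uv[title] - mean_u)   # weighted deviation from that user's mean
        den += abs(w)
    return mean_a + num / den if den else mean_a

# The vote database from the slide ("-" entries are simply absent here).
votes = {
    "John":  {"T1": 5, "T3": 3, "T4": 2},
    "Sam":   {"T2": 4, "T3": 1, "T4": 5},
    "Cindy": {"T1": 3, "T3": 5},
    "Bob":   {"T1": 5, "T2": 1},
}
print(predict("Bob", "T3", votes))   # predicted vote for one of Bob's missing titles
```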

Model-based Algorithms

• Construct a model from the vote database.

• Use the model to predict the active user's ratings.

Bayesian Clustering

• Use a Naïve Bayes network to model the vote database.

• m vote variables, one for each title.
  – Represent discrete vote values.

• 1 "cluster" variable.
  – Represents user personalities.

[Slide figure: example CPTs — a prior Pr(C) over personality types c1..c4 and conditional vote probabilities Pr(v_k | C).]

Naïve Bayes

[Slide figure: network structure with the cluster variable C as parent of V1, V2, V3, …, Vm, annotated with the example CPTs Pr(C) and Pr(v_k | C).]

• Inference
  – Evidence: known votes v_k for titles k ∈ I
  – Query: title j whose vote we need to predict

• Expected value of the vote:

  p_j = Σ_{h=1..w} h · Pr(v_j = h | v_k : k ∈ I)

(A minimal coding sketch of this computation follows below.)
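The sketch below implements this inference step directly: the posterior over the cluster variable C is computed from the prior and the observed votes, and the predicted vote is the expectation over the queried title's vote values. The CPT numbers, title names, and two-cluster setup are made up for illustration.

```python
# Minimal sketch of naive-Bayes (Bayesian clustering) vote prediction:
# Pr(C | evidence) ∝ Pr(C) * prod_k Pr(v_k | C), then take the expected vote.
def expected_vote(prior, cpts, evidence, query, vote_values):
    """prior: Pr(C); cpts[title][(c, v)] = Pr(V_title = v | C = c);
    evidence: {title: observed vote}; query: title to predict."""
    # Posterior over the cluster variable C given the observed votes.
    post = {}
    for c, pc in prior.items():
        w = pc
        for title, v in evidence.items():
            w *= cpts[title][(c, v)]
        post[c] = w
    z = sum(post.values())
    post = {c: w / z for c, w in post.items()}
    # Expected vote: sum_h h * sum_c Pr(c | evidence) * Pr(v_query = h | c)
    return sum(h * sum(post[c] * cpts[query][(c, h)] for c in prior)
               for h in vote_values)

# Two personality clusters, binary votes on three titles (hypothetical CPT values).
prior = {"c1": 0.6, "c2": 0.4}
cpts = {
    "V1": {("c1", 1): 0.9, ("c1", 0): 0.1, ("c2", 1): 0.2, ("c2", 0): 0.8},
    "V2": {("c1", 1): 0.7, ("c1", 0): 0.3, ("c2", 1): 0.3, ("c2", 0): 0.7},
    "V3": {("c1", 1): 0.8, ("c1", 0): 0.2, ("c2", 1): 0.1, ("c2", 0): 0.9},
}
print(expected_vote(prior, cpts, evidence={"V1": 1, "V2": 0}, query="V3",
                    vote_values=(0, 1)))
```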

Learning

• Simplified Expectation-Maximization (EM) algorithm with partial data.

• Initialize the CPTs with random values subject to the following constraints:

  θ_c = Pr(C = c) with Σ_{c ∈ C} θ_c = 1

  θ_{v_k|c} = Pr(V_k = v_k | C = c) with Σ_{v_k} θ_{v_k|c} = 1

(A one-iteration coding sketch follows below.)
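The sketch below shows one iteration of this simplified EM for the clustering model: the cluster variable C is never observed, and each user's record covers only the titles they actually voted on. The variable names, data, and the smoothing-free updates are illustrative assumptions.

```python
# Minimal sketch: one EM iteration for the naive-Bayes clustering model with partial data.
def em_iteration(users, prior, cpts, vote_values):
    """users: list of {title: vote}; prior: Pr(C); cpts[title][(c, v)] = Pr(v | c)."""
    cC = {c: 0.0 for c in prior}                                   # expected cluster counts
    cV = {t: {(c, v): 0.0 for c in prior for v in vote_values} for t in cpts}
    for votes in users:
        # E-step: posterior over the hidden cluster C given this user's observed votes.
        post = {c: prior[c] for c in prior}
        for t, v in votes.items():
            for c in post:
                post[c] *= cpts[t][(c, v)]
        z = sum(post.values())
        post = {c: w / z for c, w in post.items()}
        for c, w in post.items():
            cC[c] += w
            for t, v in votes.items():
                cV[t][(c, v)] += w
    # M-step: re-normalize the expected counts into new CPTs.
    new_prior = {c: cC[c] / len(users) for c in prior}
    new_cpts = {t: {(c, v): cV[t][(c, v)] / sum(cV[t][(c, u)] for u in vote_values)
                    for c in prior for v in vote_values} for t in cpts}
    return new_prior, new_cpts

# Tiny example: 2 clusters, binary votes on 2 titles, 3 users with partial vote records.
prior = {"c1": 0.5, "c2": 0.5}
cpts = {"V1": {("c1", 1): 0.8, ("c1", 0): 0.2, ("c2", 1): 0.3, ("c2", 0): 0.7},
        "V2": {("c1", 1): 0.6, ("c1", 0): 0.4, ("c2", 1): 0.2, ("c2", 0): 0.8}}
users = [{"V1": 1, "V2": 0}, {"V1": 1}, {"V2": 1}]
for _ in range(10):                      # repeat until the change becomes very small
    prior, cpts = em_iteration(users, prior, cpts, vote_values=(0, 1))
print(prior, cpts)
```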

Datasets

• MovieLens
  – 943 users; 1,682 titles; 100,000 votes (1..5); explicit voting

• MS Web (website visits)
  – 610 users; 294 titles; 8,275 votes (0, 1); null votes => 0: 179,340 votes; implicit voting

[Figure: learning curve for the MovieLens dataset — total absolute change vs. iteration.]

Protocols

• The user database is divided into an 80% training set and a 20% test set.
  – One by one, select a user from the test set to be the active user.
  – Predict some of their votes based on the remaining votes.

• All-But-One

• Given-{Two, Five, Ten}

[Slide figure: for the active user Ia, which votes are used as evidence (e) and which as queries (Q) under the All-But-One and Given-N protocols.]

Evaluation Metric

• Average Absolute Deviation:

  S_a = (1 / |P_a|) Σ_{j ∈ P_a} |p_{a,j} − v_{a,j}|

  where P_a is the set of titles predicted for active user a, p_{a,j} is the predicted vote, and v_{a,j} is the actual vote.

• Ranked Scoring

(A small coding sketch of the deviation metric follows below.)
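A small sketch of the Average Absolute Deviation computation; the prediction and actual values below are made up for illustration.

```python
# Minimal sketch: Average Absolute Deviation, S_a = (1/|P_a|) * sum_j |p_{a,j} - v_{a,j}|.
def avg_abs_deviation(predictions, actual):
    """predictions, actual: {title: vote} over the titles predicted for one user."""
    return sum(abs(predictions[j] - actual[j]) for j in predictions) / len(predictions)

# Example: one user's predicted vs. actual votes on two held-out titles.
print(avg_abs_deviation({"T3": 3.5, "T4": 1.7}, {"T3": 3, "T4": 2}))
```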

Results

• Experiments were run 5 times and averaged.

• MovieLens

Algorithm     Given-Two   Given-Five   Given-Ten   All-But-One
Correlation   1.019       .916         .865        .806
VecSim        .948        .878         .843        .799
BC(9)         .771        .765         .763        .753

• MS Web

Algorithm     Given-Two   Given-Five   Given-Ten   All-But-One
Correlation   0.105       0.0911       0.0844      0.0673
VecSim        0.101       0.0885       0.0818      0.0675
BC(9)         0.0652      0.0652       0.0649      0.0507

Computational Issues

• Prediction time: 10 minutes per experiment (memory-based); 2 minutes (model-based)

• Learning time: 20 minutes per iteration

• n: number of data points; m: number of titles; w: number of votes per title; |C|: number of personality types

Algorithm Prediction Time Learning Time Space

Memory-based O(n*m) N/A O(n*m)

Model-based O(|C|*m) O(n*m*|C|*w) O(|C|*m*w)

Demo of SamIam

• Building networks:
  – Nodes, edges
  – CPTs

• Inference:
  – Posterior marginals
  – MPE
  – MAP

• Learning: EM

• Sensitivity Engine