Probabilistic Information Retrieval [ ProbIR ] Suman K Mitra DAIICT, Gandhinagar...
Probabilistic Information Retrieval
[ ProbIR ]
Suman K Mitra
DAIICT, Gandhinagar
Acknowledgment
Alexander Dekhtyar, University of Maryland
Mandar Mitra, ISI, Kolkata
Prasenjit Majumder, DAIICT, Gandhinagar
Why use Probabilities?
• Information Retrieval deals with uncertain information
• Probability is a measure of uncertainty
• Probabilistic Ranking Principle
  – provable
  – minimization of risk
• Probabilistic Inference
  – to justify your decision
Basic IR System

Document Collection → Document Representation
Query → Query Representation

1. How good is the representation?
2. How exact is the representation?
3. How well is the query matched?
4. How relevant is the result to the query?
Approaches and main Contributors
Probability Ranking Principle – Robertson 1970 onwards
Information Retrieval as Probabilistic Inference – Van Rijsbergen et al. 1970 onwards
Probabilistic Indexing – Fuhr et al. 1980 onwards
Bayesian Nets in Information Retrieval – Turtle, Croft 1990 onwards
Probabilistic Logic Programming in Information Retrieval – Fuhr et al. 1990 onwards
Probability Ranking Principle

1. Collection of documents
2. Representation of documents
3. User issues a query
4. Representation of the query
5. A set of documents to return

Question: In what order should documents be presented to the user?
Logically: best document first, then the next best, and so on.
Requirement: a formal way to judge the goodness of documents with respect to the query.
Possibility: the probability of relevance of the document with respect to the query.
Probability Ranking Principle
If a retrieval system’s response to each request is a ranking of the documents in the collections in order of decreasing probability of goodness to the user who submitted the request ...
… where the probabilities are estimated as accurately as possible on the basis of whatever data made available to the system for this purpose ...
… then the overall effectiveness of the system to its users will be the best that is obtainable on the basis of that data.
— W. S. Cooper
Probability Basics: Bayes' Rule

Let a and b be two events. From $p(a \mid b)\,p(b) = p(a, b) = p(b \mid a)\,p(a)$:

$$p(a \mid b) = \frac{p(b \mid a)\,p(a)}{p(b)}, \qquad p(b \mid a) = \frac{p(a \mid b)\,p(b)}{p(a)}$$

The odds of an event a are defined as

$$O(a) = \frac{p(a)}{p(\bar{a})} = \frac{p(a)}{1 - p(a)}$$
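The rule and the odds transform can be checked with a couple of lines of Python (all numbers below are made up for illustration):

```python
# Toy illustration of Bayes' rule and odds; the probabilities are invented.
p_a = 0.3          # p(a): prior probability of event a
p_b_given_a = 0.8  # p(b|a)
p_b = 0.5          # p(b)

# Bayes' rule: p(a|b) = p(b|a) p(a) / p(b)
p_a_given_b = p_b_given_a * p_a / p_b

# Odds: O(a) = p(a) / (1 - p(a))
odds_a = p_a / (1 - p_a)

print(p_a_given_b)       # 0.48
print(round(odds_a, 4))  # 0.4286
```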
Conditional probability satisfies all axioms of probability:

(i) $0 \le p(a \mid b) \le 1$
(ii) $p(S \mid b) = 1$
(iii) if the $a_i$ are mutually exclusive events, then $p\big(\bigcup_i a_i \mid b\big) = \sum_i p(a_i \mid b)$

For (i): $p(a \mid b) = \frac{p(ab)}{p(b)}$, with $p(ab) \ge 0$ and $p(b) > 0$, so $p(a \mid b) \ge 0$; and since $ab \subseteq b$, $p(ab) \le p(b)$, hence $p(a \mid b) \le 1$. Hence (i).

For (ii): $p(S \mid b) = \frac{p(Sb)}{p(b)} = \frac{p(b)}{p(b)} = 1$. Hence (ii).
[Figure: an event b intersecting mutually exclusive events a1, a2, a3, a4]

For (iii):

$$p\Big(\bigcup_i a_i \,\Big|\, b\Big) = \frac{p\big((\bigcup_i a_i)\,b\big)}{p(b)} = \frac{p\big(\bigcup_i (a_i b)\big)}{p(b)} = \frac{\sum_i p(a_i b)}{p(b)} = \sum_i p(a_i \mid b)$$

[the events $a_i b$ are all mutually exclusive]. Hence (iii).
Probability Ranking Principle

Let x be a document in the collection. Let R represent relevance of a document w.r.t. a given (fixed) query, and let NR represent non-relevance.

p(R|x): probability that a retrieved document x is relevant.
p(NR|x): probability that a retrieved document x is non-relevant.

$$p(R \mid x) = \frac{p(x \mid R)\,p(R)}{p(x)}, \qquad p(NR \mid x) = \frac{p(x \mid NR)\,p(NR)}{p(x)}$$

p(R), p(NR): prior probabilities of retrieving a relevant and a non-relevant document, respectively.
p(x|R), p(x|NR): probability that if a relevant (non-relevant) document is retrieved, it is x.
Probability Ranking Principle

$$p(R \mid x) = \frac{p(x \mid R)\,p(R)}{p(x)}, \qquad p(NR \mid x) = \frac{p(x \mid NR)\,p(NR)}{p(x)}$$

Ranking Principle (Bayes' Decision Rule): if p(R|x) > p(NR|x), then x is relevant; otherwise x is not relevant.
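A minimal sketch of this decision rule in Python (the probabilities are hypothetical; since p(x) divides both posteriors, it cancels out of the comparison):

```python
# Bayes' decision rule for relevance; all input probabilities are invented.
def is_relevant(p_x_given_R, p_x_given_NR, p_R, p_NR):
    """Decide relevance of document x: relevant iff p(R|x) > p(NR|x).

    p(x) divides both posteriors, so comparing the unnormalised
    products p(x|R) p(R) and p(x|NR) p(NR) gives the same decision."""
    return p_x_given_R * p_R > p_x_given_NR * p_NR

# A rare relevant class, but a strong likelihood in its favour:
print(is_relevant(0.09, 0.01, 0.2, 0.8))  # True: 0.018 > 0.008
```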
Probability Ranking Principle

Does the PRP minimize the average probability of error?

$$p(\text{error} \mid x) = \begin{cases} p(R \mid x) & \text{if we decide } NR \\ p(NR \mid x) & \text{if we decide } R \end{cases}$$

$$p(\text{error}) = \sum_x p(\text{error} \mid x)\,p(x)$$

p(error) is minimal when all p(error|x) are minimal. Bayes' decision rule minimizes each p(error|x).

Decision vs. actual outcome:
• decide x is R when x is actually NR → error (case 2)
• decide x is NR when x is actually R → error (case 1)
$$p(\text{error}) = \sum_x p(\text{error} \mid x)\,p(x)$$

Since x is decided to be either in R or in NR:

$$p(\text{error}) = \sum_{x \in R} p(NR \mid x)\,p(x) + \sum_{x \in NR} p(R \mid x)\,p(x)$$

Writing $\sum_{x \in R} p(NR \mid x)\,p(x) = \sum_x p(NR \mid x)\,p(x) - \sum_{x \in NR} p(NR \mid x)\,p(x)$:

$$p(\text{error}) = \sum_{x \in NR} \{p(R \mid x) - p(NR \mid x)\}\,p(x) + \underbrace{\sum_x p(NR \mid x)\,p(x)}_{\text{Constant}}$$

Symmetrically,

$$p(\text{error}) = \sum_{x \in R} \{p(NR \mid x) - p(R \mid x)\}\,p(x) + \text{Constant}$$
Minimization of p(error) ⇔ minimization of 2p(error). Adding the two forms above (the constants drop out of the minimization):

$$\text{Min} \left[ \sum_{x \in R} \{p(NR \mid x) - p(R \mid x)\}\,p(x) + \sum_{x \in NR} \{p(R \mid x) - p(NR \mid x)\}\,p(x) \right]$$

p(x) is the same for any x. Define

$$S_1 = \{x : p(R \mid x) - p(NR \mid x) \ge 0\}, \qquad S_2 = \{x : p(R \mid x) - p(NR \mid x) < 0\}$$

$S_1$ and $S_2$ are nothing but the decisions for x to be in R and NR respectively. Each summand is then non-positive exactly when the decision matches $S_1$, $S_2$.

Hence the decision $S_1$ and $S_2$ minimizes p(error).
Probability Ranking Principle: Issues

• How do we compute all those probabilities?
  – We cannot compute exact probabilities; we have to use estimates from the (ground truth) data.
  – (Binary Independence Retrieval)
  – (Bayesian Networks??)
• Assumptions:
  – "Relevance" of each document is independent of the relevance of other documents.
  – Most applications are for the Boolean model.
Probability Ranking Principle

• Simple case: no selection costs.
• x is relevant iff p(R|x) > p(NR|x) (Bayes' Decision Rule).
• PRP: rank all documents by p(R|x).
Probability Ranking Principle

• More complex case: retrieval costs.
  – C: cost of retrieval of a relevant document
  – C′: cost of retrieval of a non-relevant document
  – let d be a document
• Probability Ranking Principle: if

$$C \cdot p(R \mid d) + C' \cdot (1 - p(R \mid d)) \;\le\; C \cdot p(R \mid d') + C' \cdot (1 - p(R \mid d'))$$

for all d′ not yet retrieved, then d is the next document to be retrieved.
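One way to read this criterion: retrieve documents in order of increasing expected cost. A small Python sketch with hypothetical costs and relevance probabilities:

```python
# Cost-based PRP sketch; the probabilities and costs are invented.
def expected_cost(p_rel, C, C_prime):
    """Expected cost of retrieving a document with relevance probability
    p_rel: cost C if it is relevant, C' if it is non-relevant."""
    return C * p_rel + C_prime * (1 - p_rel)

docs = {"d1": 0.9, "d2": 0.4, "d3": 0.7}  # hypothetical p(R|d) estimates
C, C_prime = 0.0, 1.0                      # retrieving a relevant doc is free

# Retrieve in order of increasing expected cost.
ranking = sorted(docs, key=lambda d: expected_cost(docs[d], C, C_prime))
print(ranking)  # ['d1', 'd3', 'd2']
```

With C = 0 and C′ = 1 this reduces to ranking by decreasing p(R|d), i.e. the plain PRP.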
Binary Independence Model
Binary Independence Model

• Traditionally used in conjunction with the PRP.
• "Binary" = Boolean: documents are represented as binary vectors of terms, $x = (x_1, \ldots, x_n)$, where $x_i = 1$ iff term i is present in document x.
• "Independence": terms occur in documents independently.
• Different documents can be modeled as the same vector.
Binary Independence Model

• Queries: binary vectors of terms.
• Given query q:
  – for each document d, we need to compute p(R|q,d);
  – replace this with computing p(R|q,x), where x is the vector representing d.
• We are interested only in the ranking.
• Use odds:

$$O(R \mid q, x) = \frac{p(R \mid q, x)}{p(NR \mid q, x)} = \frac{p(R \mid q)}{p(NR \mid q)} \cdot \frac{p(x \mid R, q)}{p(x \mid NR, q)}$$
Binary Independence Model

• Using the independence assumption:

$$\frac{p(x \mid R, q)}{p(x \mid NR, q)} = \prod_{i=1}^{n} \frac{p(x_i \mid R, q)}{p(x_i \mid NR, q)}$$

• So:

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{i=1}^{n} \frac{p(x_i \mid R, q)}{p(x_i \mid NR, q)}$$

The first factor is constant for each query; the product needs estimation.
Binary Independence Model

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{i=1}^{n} \frac{p(x_i \mid R, q)}{p(x_i \mid NR, q)}$$

• Since each $x_i$ is either 0 or 1:

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{x_i = 1} \frac{p(x_i = 1 \mid R, q)}{p(x_i = 1 \mid NR, q)} \cdot \prod_{x_i = 0} \frac{p(x_i = 0 \mid R, q)}{p(x_i = 0 \mid NR, q)}$$

• Let $p_i = p(x_i = 1 \mid R, q)$ and $r_i = p(x_i = 1 \mid NR, q)$.
• Assume $p_i = r_i$ for all terms not occurring in the query ($q_i = 0$).
Binary Independence Model

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{x_i = q_i = 1} \frac{p_i}{r_i} \cdot \prod_{x_i = 0,\; q_i = 1} \frac{1 - p_i}{1 - r_i}$$

(first product: all matching terms; second product: non-matching query terms). Multiplying and dividing by $\prod_{x_i = q_i = 1} \frac{1 - p_i}{1 - r_i}$:

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{x_i = q_i = 1} \frac{p_i (1 - r_i)}{r_i (1 - p_i)} \cdot \prod_{q_i = 1} \frac{1 - p_i}{1 - r_i}$$

(first product: all matching terms; second product: all query terms).
Binary Independence Model

$$O(R \mid q, x) = O(R \mid q) \cdot \prod_{x_i = q_i = 1} \frac{p_i (1 - r_i)}{r_i (1 - p_i)} \cdot \prod_{q_i = 1} \frac{1 - p_i}{1 - r_i}$$

The last product is constant for each query; the middle product is the only quantity to be estimated for ranking.

• Retrieval Status Value:

$$RSV = \log \prod_{x_i = q_i = 1} \frac{p_i (1 - r_i)}{r_i (1 - p_i)} = \sum_{x_i = q_i = 1} \log \frac{p_i (1 - r_i)}{r_i (1 - p_i)}$$
Binary Independence Model

• Everything boils down to computing the RSV:

$$RSV = \log \prod_{x_i = q_i = 1} \frac{p_i (1 - r_i)}{r_i (1 - p_i)} = \sum_{x_i = q_i = 1} \log \frac{p_i (1 - r_i)}{r_i (1 - p_i)}$$

$$RSV = \sum_{x_i = q_i = 1} c_i, \qquad c_i = \log \frac{p_i (1 - r_i)}{r_i (1 - p_i)}$$

So, how do we compute the $c_i$'s from our data?
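A direct translation of the RSV sum into Python, with toy values for the $p_i$ and $r_i$ (in practice these have to be estimated from data):

```python
import math

# RSV sketch for the Binary Independence Model; p_i, r_i values are invented.
def rsv(x, q, p, r):
    """RSV = sum over matching terms (x_i = q_i = 1) of
    log [ p_i (1 - r_i) / ( r_i (1 - p_i) ) ]."""
    score = 0.0
    for xi, qi, pi, ri in zip(x, q, p, r):
        if xi == 1 and qi == 1:
            score += math.log(pi * (1 - ri) / (ri * (1 - pi)))
    return score

x = [1, 0, 1, 1]           # document term vector
q = [1, 1, 1, 0]           # query term vector
p = [0.8, 0.6, 0.5, 0.3]   # p_i = p(x_i = 1 | R, q)
r = [0.3, 0.4, 0.2, 0.3]   # r_i = p(x_i = 1 | NR, q)
print(round(rsv(x, q, p, r), 4))
```

Only terms 1 and 3 contribute here: term 2 is absent from the document, and term 4 is absent from the query.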
Binary Independence Model

• Estimating the RSV coefficients: for each term i, look at the following table:

Documents | Relevant | Non-Relevant  | Total
x_i = 1   | s        | n − s         | n
x_i = 0   | S − s    | N − n − S + s | N − n
Total     | S        | N − S         | N

• Estimates:

$$p_i \approx \frac{s}{S}, \qquad r_i \approx \frac{n - s}{N - S}$$

$$c_i \approx K(N, n, S, s) = \log \frac{s\,(N - n - S + s)}{(S - s)(n - s)}$$
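These estimates are straightforward to compute; a small Python sketch with invented counts (no smoothing, so it assumes s < S, s < n and a non-empty fourth cell):

```python
import math

# Estimating p_i, r_i and c_i from the contingency table; counts are invented.
def term_weight(s, S, n, N):
    """s: relevant docs containing the term, S: relevant docs,
    n: docs containing the term, N: all docs in the collection."""
    p_i = s / S                # p(x_i = 1 | R)
    r_i = (n - s) / (N - S)    # p(x_i = 1 | NR)
    c_i = math.log((s * (N - n - S + s)) / ((S - s) * (n - s)))
    return p_i, r_i, c_i

p_i, r_i, c_i = term_weight(s=8, S=10, n=20, N=100)
print(round(p_i, 2), round(r_i, 4), round(c_i, 4))
```

With zero counts in any cell the log blows up, which is exactly the problem the 0.5 smoothing on the next slides addresses.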
PRP and BIR: The lessons

• Getting reasonable approximations of probabilities is possible.
• Simple methods work only with restrictive assumptions:
  – term independence
  – terms not in the query do not affect the outcome
  – Boolean representation of documents/queries
  – document relevance values are independent
• Some of these assumptions can be removed.
Probabilistic weighting scheme

$$\log \frac{s\,(N - n - S + s)}{(n - s)(S - s)}$$

The log of a ratio of probabilities may go to positive or negative infinity when a count is zero. Adding 0.5 to each term avoids this:

$$\log \frac{(s + 0.5)(N - n - S + s + 0.5)}{(n - s + 0.5)(S - s + 0.5)}$$
Probabilistic weighting scheme [S. Robertson]

In general form, the weighting function is

$$w = \frac{(k_3 + 1)\,q}{k_3 + q} \cdot \frac{(k_1 + 1)\,f}{k_1 L + f} \cdot \log \frac{(s + 0.5)(N - n - S + s + 0.5)}{(n - s + 0.5)(S - s + 0.5)}$$

where
$k_1$, $k_3$: constants,
q: within-query frequency (wqf),
f: within-document frequency (wdf),
n: number of documents in the collection indexed by this term,
N: total number of documents in the collection,
s: number of relevant documents indexed by this term,
S: total number of relevant documents,
L: normalised document length (i.e. the length of this document divided by the average length of documents in the collection).
Probabilistic weighting scheme [S. Robertson]

BM11

Stephen Robertson's BM11 uses the general form for the weights

$$w = \frac{(k_3 + 1)\,q}{k_3 + q} \cdot \frac{(k_1 + 1)\,f}{k_1 L + f} \cdot \log \frac{(s + 0.5)(N - n - S + s + 0.5)}{(n - s + 0.5)(S - s + 0.5)}$$

but adds an extra item to the sum of term weights to give the overall document score:

$$k_2 \cdot n_q \cdot \frac{1 - L}{1 + L}$$

where
$n_q$: number of terms in the query (the query length),
$k_2$: another constant.

This extra term is 0 when L = 1.
Probabilistic weighting scheme

BM15

$$w = \frac{(k_3 + 1)\,q}{k_3 + q} \cdot \frac{(k_1 + 1)\,f}{k_1 + f} \cdot \log \frac{(s + 0.5)(N - n - S + s + 0.5)}{(n - s + 0.5)(S - s + 0.5)}$$

plus the same extra query-length term $k_2 \cdot n_q \cdot \frac{1 - L}{1 + L}$.

BM15 is the same as BM11 with the term $k_1 L + f$ replaced by $k_1 + f$.
Probabilistic weighting scheme

BM25

BM25 combines BM11 and BM15 with a scaling factor, b, which turns BM15 into BM11 as it moves from 0 to 1: $k_1$ in the denominator is replaced by

$$K = k_1 (bL + (1 - b))$$

giving

$$w = \frac{(k_3 + 1)\,q}{k_3 + q} \cdot \frac{(k_1 + 1)\,f}{K + f} \cdot \log \frac{(s + 0.5)(N - n - S + s + 0.5)}{(n - s + 0.5)(S - s + 0.5)}$$

b = 1 → BM11; b = 0 → BM15.
Default values used: $k_1 = 1$, $k_2 = 0$, $k_3 = 1$, $b = 0.5$.
General form: $k_2 = 0$, $b = 1$.
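Putting the pieces together, a sketch of the BM25-style term weight in the notation of these slides (the extra $k_2$ query-length term is omitted since the default is $k_2 = 0$; all input values are invented):

```python
import math

# BM25-style term weight in the notation of the slides; inputs are invented.
def bm25_weight(q, f, n, N, s, S, L, k1=1.0, k3=1.0, b=0.5):
    """q: within-query freq, f: within-document freq, L: normalised length."""
    K = k1 * (b * L + (1 - b))        # b=1 -> BM11, b=0 -> BM15
    qtf_part = (k3 + 1) * q / (k3 + q)
    tf_part = (k1 + 1) * f / (K + f)
    idf_part = math.log(((s + 0.5) * (N - n - S + s + 0.5)) /
                        ((n - s + 0.5) * (S - s + 0.5)))
    return qtf_part * tf_part * idf_part

w = bm25_weight(q=1, f=3, n=20, N=100, s=8, S=10, L=1.2)
print(round(w, 4))
```

Note that for a document of average length (L = 1) the value of b makes no difference, which is consistent with the BM11 extra term vanishing at L = 1.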
Bayesian Networks
1. Independence assumption: $x_1, x_2, \ldots, x_n$ are independent. Can they be dependent?

2. Binary assumption: $x_i$ = 0 or 1. Can it be 0, 1, 2, ….., n?

A possible way forward could be Bayesian Networks.
Bayesian Network (BN) Basics

A Bayesian Network (BN) is a hybrid system of probability theory and graph theory. Graph theory lets the user build an interface to model highly interacting variables; probability theory ensures that the system as a whole is consistent, and provides ways to interface models to data.

BN: modeling of a joint probability distribution (JPD) in a compact way, using graph(s) to reflect conditional independence relationships.

BN: Representation
• Nodes ↔ random variables
• Arcs ↔ conditional independence (or causality)
• Undirected arcs → MRF; directed arcs → BNs

BN: Advantages
• An arc from node A to node B implies: A causes B
• Compact representation of the JPD of the nodes
• Easier to learn (fit to data)
G = global structure (DAG, Directed Acyclic Graph) that contains
• a node for each variable $x_i \in U$, i = 1, 2, ….., n
• edges representing the probabilistic dependencies between the nodes of the variables $x_i \in U$

M = set of local structures {M1, M2, …., Mn}: n mappings for n variables. $M_i$ maps each value of $\{x_i, Par(x_i)\}$ to a parameter $\theta$. Here $Par(x_i)$ denotes the set of parent nodes of $x_i$ in G.

The joint probability distribution over U can be decomposed by the global structure G as

$$P(U \mid B) = \prod_i P(x_i \mid Par(x_i), \theta, M_i, G)$$

[Figure: example DAG over nodes x1, x2, x3, x4]
BN: An Example (* by Kevin Murphy)

Nodes: Cloud (True/False) → Sprinkler (On/Off) and Rain (Yes/No) → Wet Grass (Yes/No).

p(C=T) | p(C=F)
1/2    | 1/2

C | p(S=On) | p(S=Off)
T | 0.1     | 0.9
F | 0.5     | 0.5

C | p(R=Y) | p(R=N)
T | 0.8    | 0.2
F | 0.2    | 0.8

S   | R | p(W=Y) | p(W=N)
On  | Y | 0.99   | 0.01
On  | N | 0.9    | 0.1
Off | Y | 0.9    | 0.1
Off | N | 0.0    | 1.0

By the chain rule: p(C,S,R,W) = p(C) p(S|C) p(R|C,S) p(W|C,S,R)
Using the conditional independences encoded in the graph: p(C,S,R,W) = p(C) p(S|C) p(R|C) p(W|S,R)
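The factorised joint can be evaluated directly from the tables; a short Python sketch:

```python
# Joint probability of the sprinkler network via its factorisation:
# p(C,S,R,W) = p(C) p(S|C) p(R|C) p(W|S,R)   (tables from the slide)
p_C = {True: 0.5, False: 0.5}
p_S_given_C = {True: {"On": 0.1, "Off": 0.9}, False: {"On": 0.5, "Off": 0.5}}
p_R_given_C = {True: {"Y": 0.8, "N": 0.2}, False: {"Y": 0.2, "N": 0.8}}
p_W_given_SR = {("On", "Y"): {"Y": 0.99, "N": 0.01},
                ("On", "N"): {"Y": 0.9, "N": 0.1},
                ("Off", "Y"): {"Y": 0.9, "N": 0.1},
                ("Off", "N"): {"Y": 0.0, "N": 1.0}}

def joint(c, s, r, w):
    return p_C[c] * p_S_given_C[c][s] * p_R_given_C[c][r] * p_W_given_SR[(s, r)][w]

# p(C=T, S=Off, R=Y, W=Y) = 0.5 * 0.9 * 0.8 * 0.9
print(round(joint(True, "Off", "Y", "Y"), 6))  # 0.324
```

Four small tables (1 + 2 + 2 + 4 = 9 free parameters) replace the 15 free parameters of the full joint over four binary variables, which is the compactness the slides refer to.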
BN: As Model
• Hybrid system of probability theory and graph theory
• Graph theory lets the user build an interface to model the highly interacting variables
• Probability theory provides ways to interface the model to data

BN: Notations
$B = (B_s, \theta)$
$B_s$: structure of the network
$\theta$: set of parameters that encode the local probability distributions

$B_s$ again has two parts, G and M:
G is the global structure (DAG);
M is the mapping for the n variables (arcs), $\{x_i, Par(x_i)\}$.
BN: Learning

$B_s$ — Structure: DAG; Parameter: CPD

Structure: known or unknown
Parameter: known or unknown

Parameters can only be learnt if the structure is either known or has been learnt earlier.
BN: Learning

Structure is known and full data is observed (nothing missing):
parameter learning → MLE, MAP

Structure is known and full data is NOT observed (data missing):
parameter learning → EM
BN: Learning

Learning Structure
• To find the best DAG that fits the data

Objective function:

$$p(B_s \mid D) = \frac{p(D \mid B_s)\,p(B_s)}{p(D)}$$

$$\log(p(B_s \mid D)) = \log(p(D \mid B_s)) + \log(p(B_s)) - \log(p(D))$$

The last term is constant and independent of $B_s$.

Search algorithm: NP-hard.
Performance criteria used: AIC, BIC.
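Since log p(D) is constant across structures, candidate DAGs can be compared by log-likelihood plus log-prior alone. A toy sketch (the likelihoods and priors below are invented, not computed from data):

```python
import math

# Comparing two candidate structures by log-posterior (up to the constant
# log p(D)): log p(Bs|D) = log p(D|Bs) + log p(Bs) - log p(D).
candidates = {
    "G1": {"log_likelihood": -120.0, "prior": 0.6},  # hypothetical scores
    "G2": {"log_likelihood": -118.0, "prior": 0.4},
}

def score(c):
    """log p(D|Bs) + log p(Bs); the shared -log p(D) term is dropped."""
    return c["log_likelihood"] + math.log(c["prior"])

best = max(candidates, key=lambda name: score(candidates[name]))
print(best)  # G2
```

Here the better data fit of G2 outweighs its lower prior. In practice a penalised criterion such as BIC plays the role of this score, since exhaustive search over DAGs is intractable.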
References (basics)

1. S. E. Robertson, "The Probability Ranking Principle in IR", Journal of Documentation, 33, 294-304, 1977.
2. K. Spärck Jones, S. Walker and S. E. Robertson, "A probabilistic model of information retrieval: development and comparative experiments - Part 1", Information Processing and Management, 36, 779-808, 2000.
3. K. Spärck Jones, S. Walker and S. E. Robertson, "A probabilistic model of information retrieval: development and comparative experiments - Part 2", Information Processing and Management, 36, 809-840, 2000.
4. S. E. Robertson and H. Zaragoza, "The Probabilistic Relevance Framework: BM25 and Beyond", Foundations and Trends in Information Retrieval, 3, 333-389, 2009.
5. S. E. Robertson, C. J. Van Rijsbergen and M. F. Porter, "Probabilistic models of indexing and searching", Information Retrieval Research, Oddy et al. (Eds.), 35-56, 1981.
6. N. Fuhr and C. Buckley, "A probabilistic learning approach for document indexing", ACM Transactions on Information Systems, 9, 223-248, 1991.
7. H. R. Turtle and W. B. Croft, "Evaluation of an inference network-based retrieval model", ACM Transactions on Information Systems, 9, 187-222, 1991.
Thank You
Discussions