CSA4080: Adaptive Hypertext Systems II
1 of [email protected] University of Malta
CSA4080: Topic 6© 2004- Chris Staff
CSA4080: Adaptive Hypertext Systems II
Dr. Christopher Staff
Department of Computer Science & AI
University of Malta
Topic 6: Information and Knowledge Representation
Aims and Objectives
• Models of Information Retrieval
  – Vector Space Model
  – Probabilistic Model
• Relevance Feedback
• Query Reformulation
Aims and Objectives
• Dealing with General Knowledge
• Programs that reason
• Conceptual Graphs
• Intelligent Tutoring Systems
Background
• We’ve talked about how user information can be represented
• We need to be able to represent information about the domain so that we can reason about what the user’s interests are, etc.
• We covered the difference between data, information, and knowledge in CSA3080...
Background
• In 1945, Vannevar Bush writes “As We May Think”
  – Gives rise to seeking “intelligent” solutions to information retrieval, etc.
• In 1949, Warren Weaver writes that if Chinese is English + codification, then machine translation should be possible
  – Leads to surface-based/statistical techniques
Background
• Even today, nearly 60 years later, there is significant effort in both directions
• For years, intelligent solutions were hampered by the lack of fast enough hardware, software
  – Doesn’t seem to be an issue any longer, and the Semantic Web may be testimony to that
  – But there are sceptics
Background
• Take IR as an example
• At the “dumb” end we have “reasonable” generic systems; at the other end, systems are domain-specific and more expensive, but do they give “better” results?
Background
• At what point does it cease to be cost effective to attempt more intelligent solutions to the IR problem?
Background
• Is “Information” Retrieval a misnomer?
  – Consider your favourite Web-based IR system... does it retrieve information?
  – Can you ask “Find me information about all flights between Malta and London”?
    • And what would you get back?
  – Can you ask “Who was the first man on the moon?”
Background
• With many IR systems that we use, the “intelligence” is firmly rooted in the user
  – We must learn how to construct our queries so that we get the information we seek
  – We sift through relevant and non-relevant documents in the results list
• What we can hope for is that “patterns” can be identified to make life easier for us - e.g., recommender systems
Background
• Surface-based techniques tend to look for and re-use patterns as heuristics, without attempting to encode “meaning”
• The Semantic Web, and other “intelligent” approaches, try to encode meaning so that it can be reasoned with and about
• Cynics/sceptics/opponents believe that there is more success to be had in giving users more support than in encoding meaning into documents to support automation
However...
• We will cover both surface-based and some knowledge-based approaches to supporting the user in his or her task
Information Retrieval
• We will discuss two IR models...
  – Vector Space Model
  – Probabilistic Model
• ... and surface-based techniques that can improve their usability
  – Relevance Feedback
  – Query Reformulation
  – Question-Answering
Knowledge
• Conceptual graphs support the encoding and matching of concepts
• Conceptual graphs are more “intelligent” and can be used to overcome some problems like the Vocabulary Problem
Reasoning on the Web
• REWERSE (FP6 NoE) is an attempt to represent meaning contained in documents and to reason with and about it so that a single high-level user request may be carried out even if it contains several sub-tasks
  – E.g., “Find me information about cheap flights between Malta and London”
Vector-Space Model
• Recommended Reading
  – p18-wong (Generalised Vector Space Model).pdf - look at refs [1],[2],[3] for original work
Vector-Space Model
• Documents are represented as m-dimensional vectors or “bags of words”
• m is the size of the vocabulary
• wk = 1, indicates term is present in document
• wk = 0, indicates term is absent
• dj = <1,0,0,1,...,0,0>
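The binary representation can be sketched in a few lines of Python. The vocabulary and document below are hypothetical toy data, and `binary_vector` is an illustrative helper name, not something from the course material.

```python
def binary_vector(document_terms, vocabulary):
    """Return the m-dimensional binary vector for one document.

    Position k is 1 if vocabulary[k] occurs in the document, else 0.
    """
    present = set(document_terms)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical toy vocabulary (m = 5) and one document as a bag of words.
vocabulary = ["adaptive", "graph", "hypertext", "query", "user"]
d_j = binary_vector(["adaptive", "hypertext", "hypertext", "user"], vocabulary)
print(d_j)  # [1, 0, 1, 0, 1]
```

Note that the repeated word "hypertext" still contributes a single 1: a binary vector records presence, not frequency.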
Vector-Space Model
• The query is then plotted into m-dimensional space and the nearest neighbours are the most relevant
• However, the results set is usually presented as a list ranked by similarity to the query
Vector-Space Model
• Cosine Similarity Measure (from IR vector space model.pdf)
$$\mathrm{sim}(Q,D) = \frac{\sum_{k=1}^{m} w_{qk} \cdot w_{dk}}{\sqrt{\sum_{k=1}^{m} (w_{qk})^2 \cdot \sum_{k=1}^{m} (w_{dk})^2}}$$
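A minimal sketch of the cosine similarity measure, used here to rank a few documents against a query. The vectors are hypothetical, and returning 0.0 for an all-zero vector is a choice made for this sketch rather than part of the measure.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between query vector q and document vector d."""
    dot = sum(wq * wd for wq, wd in zip(q, d))
    norm = math.sqrt(sum(wq * wq for wq in q) * sum(wd * wd for wd in d))
    # Assumption for this sketch: an all-zero vector is similar to nothing.
    return dot / norm if norm else 0.0


# Hypothetical binary term weights over a 4-term vocabulary.
query = [1, 0, 1, 0]
docs = {"d1": [1, 0, 1, 0], "d2": [1, 1, 0, 0], "d3": [0, 1, 0, 1]}

# Present the results as a list ranked by similarity to the query.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)
print(ranked)  # ['d1', 'd2', 'd3']
```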
Vector-Space Model
• Calculating term weights
  – Term weights may be binary, integers, or reals
  – Binary values are thresholded, rather than simply indicating presence or absence
  – Integers or reals will be a measure of the relative significance of the term in the document
• Usually, term weight is TFxIDF
Vector-Space Model
• Steps in calculating term weights
  – Remove stop words
  – Stem remaining words
  – Count term frequency (TF)
  – Count number of documents containing term (DF)
  – Invert it (log(C/DF)), where C is total number of documents in collection
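The steps can be sketched as follows. This is an illustration under stated assumptions: the stop list is a tiny hypothetical one, `crude_stem` is a placeholder that only strips a trailing "s" (a real system would use a proper stemmer), and the natural logarithm is used because the slides do not fix a base.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "is", "in"}  # hypothetical, tiny stop list


def crude_stem(word):
    # Placeholder for a real stemmer: strips a trailing "s" only.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def tokenize(text):
    """Remove stop words, then stem the remaining words."""
    return [crude_stem(w) for w in text.lower().split() if w not in STOP_WORDS]


def tf_idf(collection):
    """Return one {term: TF * log(C/DF)} dictionary per document."""
    docs = [Counter(tokenize(text)) for text in collection]  # TF per document
    df = Counter(term for tf in docs for term in tf)         # DF per term
    c = len(collection)                                      # C: collection size
    return [{t: f * math.log(c / df[t]) for t, f in tf.items()} for tf in docs]


collection = ["the cat chases the dog", "dogs and cats", "a query about hypertext"]
weights = tf_idf(collection)
print(weights[0])
```

In this toy collection "chase" occurs in one document of three and so receives a higher weight than "cat" or "dog", which each occur in two.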
Vector-Space Model
• Normalising weights for vector length
  – Documents with longer vectors have a better chance of being retrieved than short ones (simply because there are a larger number of terms that they will match in a query)
  – IR should treat all relevant documents as important for retrieval purposes
  – Solution: divide each weight by the length of the vector, where w is the weight of term t and w' is its normalised value

$$w' = \frac{w}{\sqrt{\sum_{i \in \mathrm{vector}} (w_i)^2}}$$
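A short sketch of length normalisation, assuming the vector length is the Euclidean norm. The weights are hypothetical; leaving an all-zero vector unchanged is a choice made for this sketch.

```python
import math


def normalise(weights):
    """Divide every term weight by the Euclidean length of the vector."""
    length = math.sqrt(sum(w * w for w in weights))
    # Assumption for this sketch: an all-zero vector is returned unchanged.
    return [w / length for w in weights] if length else list(weights)


short_doc = [3.0, 4.0]        # hypothetical weights, length 5
long_doc = [3.0, 4.0, 12.0]   # hypothetical weights, length 13

for vector in (short_doc, long_doc):
    unit = normalise(vector)
    print(unit, math.sqrt(sum(w * w for w in unit)))
```

After normalisation both vectors have length 1, so the longer document no longer gains an advantage from the sheer number of terms it contains.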
Vector-Space Model
• Why does this work?
  – Term discrimination
  – Assumes that terms with high TF and low DF are good discriminators of relevant documents
  – Because documents are ranked, documents do not need to contain precisely the terms expressed in the query
  – We cannot say anything (in VSM) about terms that occur in relevant and non-relevant documents - though we can in probabilistic IR
Vector-Space Model
• Vector-Space Model is also used by Recommender Systems to index user profiles and product, or item, features
• Apart from ranking documents, results lists can be controlled (to list top n relevant documents), and query can be automatically reformulated based on relevance feedback
Relevance Feedback
• When a user is shown a list of retrieved documents, user can give relevance judgements
• System can take original query and relevance judgements and re-compute the query
• Rocchio...
Relevance Feedback
• Basic Assumptions
  – Similar docs are near each other in vector space
  – Starting from some initial query, the query can be reformulated to reflect subjective relevance judgements given by the user
  – By reformulating the query we can move the query closer to more relevant docs and further away from nonrelevant docs
Relevance Feedback
• In VSM, reformulating query means re-weighting terms in query
• Not failsafe: may move query towards nonrelevant docs!
Relevance Feedback
• The Ideal Query
• If we know the answer set rel, then the ideal query is:
$$Q_{opt} = \frac{1}{R} \sum_{D_i \in \mathit{rel}} D_i - \frac{1}{N - R} \sum_{D_i \in \mathit{nonrel}} D_i$$
Relevance Feedback
• In reality, a typical interaction will be:
  – User formulates query and submits it
  – IR system retrieves set of documents
  – User selects R’ and N’
  – Query is reformulated:

$$Q_{i+1} = \alpha Q_i + \frac{\beta}{|R'|} \sum_{D_i \in R'} D_i - \frac{\gamma}{|N'|} \sum_{D_i \in N'} D_i$$

where 0 <= α, β, γ <= 1 (and vector magnitude usually dropped...)

$$Q_{i+1} = \alpha Q_i + \beta \sum_{D_i \in R'} D_i - \gamma \sum_{D_i \in N'} D_i$$
Relevance Feedback
• What are the values of β and γ?
  – is typically given a value of 0.75, but this can vary. Also, after a number of iterations, the original weights of terms can be highly reduced
  – If β and γ have equal weight, then relevant and nonrelevant docs make equal contribution to reformulated query
  – If β = 1, γ = 0, then only relevant docs are used in reformulated query
  – Usually, use β = 0.75, γ = 0.25
Relevance Feedback
• Example
  Q: (5, 0, 3, 0, 1)
  R: (2, 1, 2, 0, 0)
  N: (1, 0, 0, 0, 2)
  α = 0.75, β = 0.50, γ = 0.25
  Q’ = 0.75Q + 0.5R – 0.25N
     = 0.75(5, 0, 3, 0, 1) + 0.5(2, 1, 2, 0, 0) – 0.25(1, 0, 0, 0, 2)
     = (4.5, 0.5, 3.25, 0, 0.25)
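The worked example can be checked with a small sketch of the Rocchio update in its magnitude-dropped form. The function name `rocchio` and the parameter names `alpha`, `beta` and `gamma` are labels chosen here for the three weights 0.75, 0.5 and 0.25 used in the example.

```python
def rocchio(query, relevant, nonrelevant, alpha=0.75, beta=0.5, gamma=0.25):
    """Return alpha*Q + beta*sum(relevant) - gamma*sum(nonrelevant).

    `relevant` and `nonrelevant` are lists of document vectors judged by
    the user; the sums are not divided by the number of documents.
    """
    new_query = [alpha * w for w in query]
    for doc in relevant:
        new_query = [q + beta * w for q, w in zip(new_query, doc)]
    for doc in nonrelevant:
        new_query = [q - gamma * w for q, w in zip(new_query, doc)]
    return new_query


q_new = rocchio([5, 0, 3, 0, 1], [[2, 1, 2, 0, 0]], [[1, 0, 0, 0, 2]])
print(q_new)  # [4.5, 0.5, 3.25, 0.0, 0.25]
```

The second term, absent from the original query, now has a positive weight because it occurs in the relevant document, while the fifth term is pushed down because it occurs mainly in the nonrelevant one.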
Relevance Feedback
• How many docs to use in R’ and N’?
  – Use all docs selected by user
  – Use all rel docs and highest ranking nonrel docs
  – Usually, user selects only relevant docs...
• Should entire document vector be used?
  – Really want to identify the significant terms...
    • Use terms with high-frequency/weight
    • Use terms in doc adjacent to terms from query
    • Use only common terms in R’ (and N’)
Automatic Relevance Feedback
• Users tend not to select nonrelevant documents, and rarely choose more than one relevant document (http://www.dlib.org/dlib/november95/11croft.html)
  – This makes it difficult to use relevance feedback
• Current research uses automatic relevance feedback techniques...
Automatic Relevance Feedback
• Two main approaches
  – To improve precision
  – To improve recall
Automatic Relevance Feedback
• Reasons for low precision
  – Documents contain query terms, but documents are not “about” the “concept” or “topic” the user is interested in
  – E.g., user wants documents in which a cat chases a dog but the query <cat, chase, dog> also retrieves docs in which dogs chase cats
  – Term ambiguity
Automatic Relevance Feedback
• Improving precision
  – Want to promote relevant documents in the results list
  – Assume that top-n (typically 20) documents are relevant, and assume docs ranked 500-1000 are nonrelevant
  – Choose co-occurring discriminatory terms
  – Re-rank docs ranked 21-499 using (modified) Rocchio method
p206-mitra.pdf
Automatic Relevance Feedback
• Improving precision
  – Does improve precision by 6%-13% at p-21 to p-100
  – But remember that precision is to do with the ratio of relevant to nonrelevant documents retrieved
  – There may be many relevant documents that were never retrieved (i.e., low recall)
Automatic Relevance Feedback
• Reasons for low recall
  – “Concept” or “topic” that user is interested in can be described using terms additional to those expressed by the user in the query
  – E.g., think of all the different ways in which you can express “car”, including manufacturers’ names (e.g., Ford, Vauxhall, etc.)
  – There is only a small probability that user and author use the same term to describe the same concept
Automatic Relevance Feedback
• Reasons for low recall
  – “Imprudent” query term “expansion” improves recall, simply because more documents are retrieved, but hurts precision!
Automatic Relevance Feedback
• Improving recall
  – Manually or automatically generated thesaurus used to expand query terms before query is submitted
  – We’re currently working on other techniques to pick synonyms that are likely to be relevant
  – Semantic Web attempts to encode semantic meaning into documents
p61-voorhees.pdf, qiu94improving.pdf, MandalaSigir99EvComboWordNet.pdf
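Thesaurus-based expansion can be sketched as below. The thesaurus is a hypothetical hand-built dictionary, and `expand_query` is an illustrative name; this is the naive form of expansion, with no attempt to pick only the synonyms likely to be relevant.

```python
# Hypothetical hand-built thesaurus; a real one may be generated automatically.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "cheap": ["inexpensive"],
}


def expand_query(terms, thesaurus=THESAURUS):
    """Add every thesaurus entry for each query term, keeping term order."""
    expanded = []
    for term in terms:
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:  # avoid duplicate query terms
                expanded.append(candidate)
    return expanded


print(expand_query(["cheap", "car", "hire"]))
# ['cheap', 'inexpensive', 'car', 'automobile', 'vehicle', 'hire']
```

Every added term widens the set of documents that can match, which is why expansion of this kind tends to raise recall at the expense of precision.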
Indexing Documents
• Obviously, comparing a query vector to each document vector to determine the similarity is expensive
• So how can we do it efficiently, especially for gigantic document collections, like the Web?
Indexing Documents
• Inverted indices
  – An inverted index is a list of terms in the vocabulary together with a postings list for each term
  – A postings list is a list of documents containing the term
Indexing Documents
• Inverted index
• Several pieces of information can be stored in the postings list
  – term weight
  – location of the term in the document (to support proximity operators)
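A minimal in-memory sketch of an inverted index whose postings record term positions, with a Boolean AND implemented as a set intersection. The documents are hypothetical, and the dictionary layout is one possible choice rather than a prescribed format; a real index would also store term weights.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to its postings: {doc_id: [positions in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def docs_with_all(index, terms):
    """Boolean AND: intersect the postings lists of the given terms."""
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings) if postings else set()


docs = {1: "cat chases dog", 2: "dog chases cat", 3: "cat sleeps"}
index = build_inverted_index(docs)
print(index["cat"])                                  # {1: [0], 2: [2], 3: [0]}
print(sorted(docs_with_all(index, ["cat", "dog"])))  # [1, 2]
```

Only the postings lists of the query terms are touched, so the query never has to be compared against every document vector in the collection.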
Indexing Documents
• Results set is obtained using set operators
• Once documents in results set are known, their vectors can be retrieved to perform ranking operations on them
• The document vectors also allow automatic query reformulation to occur following relevance feedback
• See brin.pdf and p2-arasu.pdf
Probabilistic IR
• VSM assumes that a document that contains some term x is about that term
• PIR compares the probability of seeing term x in a relevant document as opposed to a nonrelevant document
• Binary Independence Retrieval Model proposed by Robertson & Sparck Jones, 1976
robertson97simple.pdf, SparckJones98.pdf
BIR
• BIR Fundamentals:
  – Given a user query there is a set of documents which contains exactly the relevant documents and no other:
    • the “ideal” answer set
  – Given the ideal answer set, a query can be constructed that retrieves exactly this set
• Assumes that relevant documents are “clustered”, and that terms used adequately discriminate against non-relevant documents
BIR
• We do not know what are, in general, the properties of the ideal answer set
• All we know is that documents have terms which “capture” semantic meaning
• When user submits a query, “guess” what might be the ideal answer set
• Allow user to interact, to describe the probabilistic description of the ideal answer set (by marking docs as relevant/non-relevant)
BIR
• Probabilistic Principle: Assumption
  – Given a user query q and a document dj in the collection:
    • Estimate the probability that the user will find dj relevant to q
  – Rank documents in order of their probability of relevance to the query (Probability Ranking Principle)
BIR
• Model assumes that probability of relevance depends on q and doc representations only
• Assumes that there is an ideal answer set!
• Assumes that terms are distributed differently in relevant and non-relevant documents
BIR
• Whether or not a document x is retrieved depends on:
  – Pr(rel|x): the probability that x is relevant
  – Pr(nonrel|x): ... that x isn’t relevant
BIR
• Document Ranking Function: document x will be retrieved if

$$a_2 \Pr(\mathit{rel} \mid x) \ge a_1 \Pr(\mathit{nonrel} \mid x)$$

where a2 is the cost of not retrieving a relevant document, and a1 is the cost of retrieving a non-relevant document
• If we knew Pr(rel|x) (or Pr(nonrel|x)), solution would be trivial, but...
BIR
• Use Bayes Theorem to rewrite Pr(rel|x):

$$\Pr(\mathit{rel} \mid x) = \frac{\Pr(x \mid \mathit{rel})\, P(\mathit{rel})}{\Pr(x)}$$

  – Pr(x): probability of observing x
  – P(rel): a priori probability of relevance (i.e., probability of observing a set of relevant documents)
  – Pr(x | rel): probability that x is in the given set of relevant docs
BIR
• Can do the same for Pr(nonrel | x):

$$\Pr(\mathit{nonrel} \mid x) = \frac{\Pr(x \mid \mathit{nonrel})\, P(\mathit{nonrel})}{\Pr(x)}$$
BIR
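• The document ranking function can be rewritten as:

$$\log g(x) = \log \left( \frac{\Pr(x \mid \mathit{rel})\, \Pr(\mathit{rel})}{\Pr(x)} \cdot \frac{\Pr(x)}{\Pr(x \mid \mathit{nonrel})\, \Pr(\mathit{nonrel})} \right)$$

• and simplified as:

$$\log g(x) = \log \frac{\Pr(x \mid \mathit{rel})}{\Pr(x \mid \mathit{nonrel})} + \log \frac{\Pr(\mathit{rel})}{\Pr(\mathit{nonrel})}$$

• Pr(x | rel) and Pr(x | nonrel) are still unknown, so we will replace them in terms of keywords in the document!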
BIR
• We assume that terms occur independently in relevant and non-relevant docs...

$$\log g(x) = \sum_{i=1}^{t} \log \frac{\Pr(x_i \mid \mathit{rel})}{\Pr(x_i \mid \mathit{nonrel})} + C$$

• Pr(xi | rel): probability that term xi is present in a document randomly selected from the ideal answer set
• Pr(xi | nonrel): probability that term xi is present in a document randomly selected from outside the ideal answer set
BIR
• Considering document D = d1, d2, ..., dt, where di is the weight of term i,

$$\log g(x) = \sum_{i=1}^{t} \log \frac{\Pr(x_i = d_i \mid \mathit{rel})}{\Pr(x_i = d_i \mid \mathit{nonrel})} + C$$

• where Pr(xi = di | rel) is the probability that a relevant document contains term xi (similarly for Pr(xi = di | nonrel))
BIR
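• When di = 0 we want the contribution of term i to g(x) to be 0:

$$\log g(x) = \sum_{i=1}^{t} \log \left( \frac{\Pr(x_i = d_i \mid \mathit{rel})}{\Pr(x_i = d_i \mid \mathit{nonrel})} \cdot \frac{\Pr(x_i = 0 \mid \mathit{nonrel})}{\Pr(x_i = 0 \mid \mathit{rel})} \right) + C$$

$$= \sum_{i=1}^{t} \log \frac{p_i (1 - q_i)}{q_i (1 - p_i)} + C$$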
BIR
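• The term relevance weight of term xi is:

$$tr_i = \log \frac{p_i (1 - q_i)}{q_i (1 - p_i)} = \log \left( \frac{\Pr(x_i = d_i \mid \mathit{rel})}{\Pr(x_i = d_i \mid \mathit{nonrel})} \cdot \frac{\Pr(x_i = 0 \mid \mathit{nonrel})}{\Pr(x_i = 0 \mid \mathit{rel})} \right)$$

• Weight of term i in document j is:

$$w_{ij} = tf_{ij} \times tr_i$$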
BIR
• Estimation of term occurrence probability
  – Given a query, a document collection can be partitioned into a relevant and non-relevant set
  – The importance of a term j is its discriminatory power in distinguishing between relevant and nonrelevant documents
BIR
• With complete information about the relevant & non-relevant document sets we can estimate pj and qj:

$$p_j = \frac{r_j}{R}, \qquad q_j = \frac{df_j - r_j}{N - R}$$

$$tr_j = \log \left( \frac{r_j / R}{(df_j - r_j) / (N - R)} \cdot \frac{1 - (df_j - r_j) / (N - R)}{1 - r_j / R} \right) = \log \left( \frac{r_j}{R - r_j} \cdot \frac{N - df_j - R + r_j}{df_j - r_j} \right)$$

• Approximation:

$$tr_j = \log \left( \frac{r_j}{R} \cdot \frac{N}{df_j} \right)$$
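The two forms of the term relevance weight can be compared numerically. In this sketch N is the collection size, R the number of relevant documents, df_j the document frequency of term j and r_j the number of relevant documents containing it; the counts are hypothetical, and the function names are illustrative.

```python
import math


def tr_exact(r_j, R, df_j, N):
    """log[ r_j / (R - r_j) * (N - df_j - R + r_j) / (df_j - r_j) ]

    Undefined when r_j is 0, r_j equals R, or df_j equals r_j.
    """
    return math.log((r_j / (R - r_j)) * ((N - df_j - R + r_j) / (df_j - r_j)))


def tr_approx(r_j, R, df_j, N):
    """log[ (r_j / R) * (N / df_j) ]"""
    return math.log((r_j / R) * (N / df_j))


# Hypothetical counts: 1000 docs, 10 relevant; the term occurs in 50 docs,
# 5 of which are relevant.
N, R, df_j, r_j = 1000, 10, 50, 5
print(tr_exact(r_j, R, df_j, N))   # log(21)
print(tr_approx(r_j, R, df_j, N))  # log(10)
```

Both weights are positive here because the term is far more common among the relevant documents (5 of 10) than among the rest (45 of 990).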
BIR
• Term Occurrence Probability Without Relevance Information
  – What do we do when we don’t know rj?
  – qj = dfj / N: since most docs are nonrelevant
  – pj = 0.5 (arbitrary)
  – trj = log((N / dfj) − 1): does this remind you of anything?
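A quick numerical comparison suggests an answer to the question: with no relevance information the term relevance weight is very close to the inverse document frequency log(N/df) used in the vector space model (written log(C/DF) earlier, with C the collection size). The collection size below is hypothetical.

```python
import math


def tr_no_feedback(df_j, N):
    """tr_j = log(N/df_j - 1), from p_j = 0.5 and q_j = df_j / N."""
    return math.log(N / df_j - 1)


def idf(df_j, N):
    """Inverse document frequency as used in the vector space model."""
    return math.log(N / df_j)


N = 1000  # hypothetical collection size
for df_j in (1, 10, 100):
    print(df_j, round(tr_no_feedback(df_j, N), 3), round(idf(df_j, N), 3))
```

The two values differ only by log(1 − df/N), which is small whenever a term occurs in a small fraction of the collection.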
BIR
• Reminder... Ranking Function

$$\log g(x) = \sum_{i=1}^{t} \log \frac{p_i (1 - q_i)}{q_i (1 - p_i)} + C$$

where,
pi = Pr(xi=di|rel)
qi = Pr(xi=di|nonrel)
and di is the weight of term i
Relevance Feedback in BIR
• Want to add more terms to the query so the query will resemble documents marked as relevant (note difference from VSM)
• How do we select which terms to add to the query?
Relevance Feedback in BIR
• Rank terms in marked documents and add the first m terms
• where:
N: no. of docs in the collectionni: document frequency of term iR: no. of relevant docs selectedri: no. of docs in R containing term i
• Compares frequency of occurrence of term in R with document frequency
€
wi = logri N − ni − R + ri( )
R − ri( ) ni − ri( )
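The term-selection step can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: the helper names are hypothetical, and the constant `k = 0.5` added to each factor is a common smoothing choice introduced here only to avoid division by zero when r_i = R or n_i = r_i (the slide's formula has no smoothing).

```python
import math

def term_weight(N, n_i, R, r_i, k=0.5):
    """w_i from the slide, with an assumed smoothing constant k on each factor."""
    numerator = (r_i + k) * (N - n_i - R + r_i + k)
    denominator = (R - r_i + k) * (n_i - r_i + k)
    return math.log(numerator / denominator)

def select_expansion_terms(stats, N, R, m):
    """stats maps term -> (n_i, r_i); return the m highest-weighted terms."""
    weights = {t: term_weight(N, n, R, r) for t, (n, r) in stats.items()}
    return sorted(weights, key=weights.get, reverse=True)[:m]

# Hypothetical statistics: collection of 1000 docs, 10 marked relevant.
stats = {"lunar": (20, 8), "the": (990, 10), "apollo": (15, 9)}
expansion = select_expansion_terms(stats, N=1000, R=10, m=2)
```

A term such as "the", which occurs in nearly every document, ends up with a negative weight even though it appears in all relevant documents, so it is not selected.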
![Page 66: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/66.jpg)
Question-Answering on the Web
• Two aspects to IR:
  – Coverage (find all relevant documents)
  – Question-Answering (find the answer to a specific query)
• In QA we want one answer to our question
• How much NLP do we need to use to answer fact-based questions?
  – Answers that require reasoning are much harder!
![Page 67: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/67.jpg)
Question Answering
• Most IR tasks assume that the user can predict what terms a relevant document will contain
• But sometimes what we want is the answer to a direct question
  – “Who was the first man on the moon?”
• Do we really want a list of millions of documents that contain first, man, moon?
• And do we really want to have to read them to find the answer?
![Page 68: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/68.jpg)
Question Answering
• All we want is one document, or one statement, that contains the answer
• Can we take advantage of IR on the Web to do this?
• Taking advantage of redundancy on the Web
  – E.g., Mulder, Dumais
![Page 69: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/69.jpg)
Mulder
• Uses Web as collection of answers to factual questions
  – “Who was the first man on the moon?”
  – “What is the capital of Italy?”
  – “Where is the Taj Mahal?”
kwok01scaling.pdf
![Page 70: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/70.jpg)
Mulder
• Three parts to a QA system:
  – Retrieval Engine
    • Indexes documents in a collection and retrieves them
  – Query Formulator
    • Converts NL question into formal query
  – Answer Extractor
    • Locates answer in text
![Page 71: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/71.jpg)
Mulder
• Six parts to Mulder:
  – Question Parsing
  – Question Classification
  – Query Formulation
  – Search Engine
  – Answer Extraction
  – Answer Selection
![Page 72: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/72.jpg)
Dumais et al
• Takes advantage of multiple, differently phrased, answer occurrences on Web
• Doesn’t need to find all answer phrases
  – Just the ones that match the query pattern
• Rules for converting questions, finding answers are mostly handwritten
p291-dumais
![Page 73: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/73.jpg)
Dumais et al
• Steps
  – Rewrite question into weighted query patterns
    • Use POS tagger + lexicon to seek alternative word forms
  – Search
  – Mine N-grams in summaries
  – Filter and re-weight N-grams
  – Tile N-grams to yield longer answers
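The mining and filtering steps can be illustrated with a small sketch. It is not the Dumais et al. system: the function names, whitespace tokenisation, and the rule "drop any n-gram that contains a question word" are simplifying assumptions, and query rewriting, re-weighting and tiling are left out.

```python
from collections import Counter

def mine_ngrams(snippets, max_n=3):
    """Count, per snippet, every 1..max_n-gram (each n-gram once per snippet)."""
    counts = Counter()
    for snippet in snippets:
        tokens = snippet.lower().split()
        seen = set()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(" ".join(tokens[i:i + n]))
        counts.update(seen)
    return counts

def filter_ngrams(counts, question_words):
    """Assumed filter: discard n-grams that share a word with the question."""
    return Counter({g: c for g, c in counts.items()
                    if not set(g.split()) & question_words})

# Hypothetical search-result summaries for "Who was the first man on the moon?"
snippets = [
    "neil armstrong was the first man on the moon",
    "the first man on the moon was neil armstrong",
    "first man on the moon neil armstrong 1969",
]
question_words = {"who", "was", "the", "first", "man", "on", "moon"}
candidates = filter_ngrams(mine_ngrams(snippets), question_words)
```

Because the answer is phrased redundantly across snippets, the n-grams that survive filtering with the highest counts are good answer candidates.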
![Page 74: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/74.jpg)
Azzopardi
• Joel Azzopardi, 2004, “Template-Based Fact Finding on the Web” FYP report, CSAI
• Can find factoids about a series of queries relating to a particular topic using majority polling (voting) to decide amongst competing answers
• Series of topic sensitive query patterns stored in template
![Page 75: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/75.jpg)
Azzopardi
• Template is learned by comparing a sample of documents about a topic
• Commonly occurring phrases (trigrams) extracted and turned into partial query in template, together with answer “type”
![Page 76: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/76.jpg)
Azzopardi
• When user wants information regarding a topic, use appropriate template together with subject (e.g., person’s name)
• Subject is appended to partial queries in template - queries are submitted to Google
• Top-n documents retrieved and processed to identify candidate answers
• Uses voting to decide on most frequently occurring answer
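The final voting step might look like the sketch below. It only illustrates majority polling over candidate answers; the function name and the normalisation (trimming and lower-casing) are assumptions, and the candidate lists are made up rather than taken from the FYP report.

```python
from collections import Counter

def vote(candidates):
    """Return (answer, votes) for the most frequent candidate, or None if empty."""
    counts = Counter(c.strip().lower() for c in candidates)
    if not counts:
        return None
    return counts.most_common(1)[0]

# Hypothetical candidate answers extracted from the top-n documents
# for one query pattern in a template.
winner = vote(["1879", "1879 ", "1878", "1879"])
```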
![Page 77: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/77.jpg)
Summary
• We’ve discussed a couple of popular models of IR that are more “intelligent” than plain old Extended Boolean Information Retrieval
• They still treat terms as atoms that are representative of the semantic meaning of the document
![Page 78: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/78.jpg)
Summary
• But word order generally insignificant (“bag of words”)
  – Cannot distinguish between “dog chased cat” and “cat chased dog”
    • unless phrase matching also used, but then cannot tell that “cat chased dog” and “dog was chased by cat” are semantically equivalent
• What about information extraction?
  – George W. Bush = President of the United States of America
![Page 79: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/79.jpg)
Summary
• More “intelligent” approaches have been used
• And more “intelligence” is being put “into” the Web
• Personalisation and user-adaptivity also require high accuracy in determining which documents are relevant to a user
![Page 80: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/80.jpg)
Summary
• Sowa’s conceptual graphs and McCarthy’s Generality in AI/Notes on Contextual Reasoning are seminal works that underpin much that is happening in the Semantic Web
• CGs represent semantic content of utterances in interchangeable format (KIF)
• McCarthy claims that it is hard to make correct inferences in the absence of contextual information
![Page 81: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/81.jpg)
Summary
• Because of the expense of CGs, they are still very much domain specific
  – SemWeb hopes that by bringing massive numbers of people together there will be a proliferation of “ontologies” to make it happen
• Guha did his PhD “Contexts: A Formalisation and Some Applications” at Stanford, under John McCarthy. His work on Cyc underpins RDF, DAML+OIL
![Page 82: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/82.jpg)
Dealing with General Knowledge
• Why did “Mary hit the piggy bank with a hammer”?
![Page 83: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/83.jpg)
Dealing with General Knowledge
• Do computer systems need general knowledge?
• How do computer systems represent general knowledge?
![Page 84: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/84.jpg)
Dealing with General Knowledge
• Do we need general knowledge?
• How do we represent general knowledge?
![Page 85: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/85.jpg)
Dealing with General Knowledge
• As usual, has its roots in philosophy (epistemology)
  – Early (i.e., Greek) theories revolved around Absolute and Universal Ideas and Forms (Plato)
  – Aristotle: Logic for representing and reasoning about knowledge
http://pespmc1.vub.ac.be/EPISTEMI.html
![Page 86: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/86.jpg)
Dealing with General Knowledge
• Following Renaissance, two main schools of thought
• Empiricists
  – Knowledge as product of sensory perception
• Rationalists
  – Product of rational reflection
![Page 87: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/87.jpg)
Dealing with General Knowledge
• Kantian synthesis of empiricism and rationalism
  – Knowledge results from the organization of perceptual data on the basis of inborn cognitive structures, called "categories".
  – Categories include space, time, objects and causality.
  – (cf. Chomsky’s Universal Grammar)
![Page 88: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/88.jpg)
Dealing with General Knowledge
• Pragmatism
  – Knowledge consists of models that attempt to represent the environment to simplify problem-solving
  – Assumption: Models are “rich”. No model can ever hope to capture all relevant information, and even if such a complete model would exist, it would be too complicated to use in any practical way.
![Page 89: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/89.jpg)
Dealing with General Knowledge
• Pragmatism (contd.)
  – The model which is to be chosen depends on the problems that are to be solved (context).
    • But see also discussions on pragmatic vs. cognitive contexts! (Topic 3)
  – Basic criterion: model should produce correct (or approximate) (testable) predictions or problem-solutions, and be as simple as possible.
• This is the approach mainly used in CS/AI today
![Page 90: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/90.jpg)
Dealing with General Knowledge
• “The first theories of knowledge stressed its absolute, permanent character, whereas the later theories put the emphasis on its relativity or situation-dependence, its continuous development or evolution, and its active interference with the world and its subjects and objects. The whole trend moves from a static, passive view of knowledge towards a more and more adaptive and active one”.
http://pespmc1.vub.ac.be/EPISTEMI.html
![Page 91: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/91.jpg)
Dealing with General Knowledge
• We’ll look at four overviews of and approaches to knowledge in computer systems
  – McCarthy (1959, mcc.pdf)
  – Sowa (1979, p79-1010.pdf)
  – McCarthy (1987, p1030-mccarthy.pdf)
  – Brézillon & Pomerol (2001, is-context-a-kind.pdf)
![Page 92: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/92.jpg)
Dealing with General Knowledge
• McCarthy, J. 1959. “Programs with Common Sense”
• “a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows”.
![Page 93: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/93.jpg)
Dealing with General Knowledge
• Objective: “to make programs that learn from their experience as effectively as humans do”
• To learn to improve how to learn
• And to do it in logic using a logical representation
![Page 94: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/94.jpg)
Dealing with General Knowledge
• Minimum features required of a machine that can evolve intelligence approaching that of humans
  – Representation of all behaviours
  – Interesting changes in behaviour must be expressible
  – All aspects of behaviour must be improvable
  – Must have notion of partial success
  – System must be able to create/learn subroutines
![Page 95: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/95.jpg)
Dealing with General Knowledge
• Bar-Hillel’s biggest complaint (in my opinion) is
  – “A deductive argument, where you have first to find out what are the relevant premises, is something which many humans are not always able to carry out successfully. I do not see the slightest reason to believe that at present machines should be able to perform things that humans find trouble in doing”
  – We’ll return to this in Closed vs. Open World Assumption
![Page 96: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/96.jpg)
Dealing with General Knowledge
• Sowa, J. 1979. “Semantics of Conceptual Graphs”
• Logic used by McCarthy as representation of “statements about the world” as well as “theorem prover” to infer/deduce new knowledge (assumptions) about the world
• Sowa uses CG as “a language for representing knowledge and patterns for constructing models”
![Page 97: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/97.jpg)
Dealing with General Knowledge
• Sowa proposes CGs as better alternative to semantic networks and predicate calculus
  – SemNets have no well-defined semantics
  – PC is “adequate for describing mathematical theories with a closed set of axioms... But the real world is messy, incompletely explored, and full of unexpected surprises”
![Page 98: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/98.jpg)
Dealing with General Knowledge
• CGs serve two purposes:
  – They can be used as canonical representations of meaning in Natural Language
  – They can be used to construct abstract structures that serve as models in the model-theoretic sense (e.g., microtheories)
![Page 99: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/99.jpg)
Dealing with General Knowledge
• To understand a sentence:
  1. Convert utterance to CG
  2. Join CG to graphs that help resolve ambiguities and incorporate background information
  3. Resulting graph is nucleus for constructing models (of worlds) in which utterance is true
  4. Laws of world block illegal extensions
  5. If model could be extended infinitely, result would be complete standard model
![Page 100: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/100.jpg)
Dealing with General Knowledge
• “Mary hit the piggy bank with a hammer”
![Page 101: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/101.jpg)
Dealing with General Knowledge
• Linearizing the conceptual graph
[PERSON:Mary]->(AGNT)->[HIT:c1]<-(INST) <- [HAMMER]
[HIT:c1]<-(PTNT)<-[PIGGY-BANK:i22103]
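As a rough illustration of how the linear form above could be held in a program, the sketch below stores each concept as a typed node and each conceptual relation as a triple linking the HIT act to its participant. The `Concept` class, the `"*"` marker for an unspecified referent, and the `role` helper are assumptions made for this example; arrow direction from the linear notation is not modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    type: str
    referent: str = "*"  # "*" stands for an unspecified individual (assumption)

mary = Concept("PERSON", "Mary")
hit = Concept("HIT", "c1")
hammer = Concept("HAMMER")
bank = Concept("PIGGY-BANK", "i22103")

# One (relation, act, participant) triple per conceptual relation.
graph = [
    ("AGNT", hit, mary),    # agent of the hitting
    ("INST", hit, hammer),  # instrument
    ("PTNT", hit, bank),    # patient
]

def role(graph, act, relation):
    """Return the concepts attached to `act` through `relation`."""
    return [c for rel, a, c in graph if a == act and rel == relation]
```

With this structure, `role(graph, hit, "INST")` answers "what was used to hit?" by following the INST relation.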
![Page 102: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/102.jpg)
Dealing with General Knowledge
• Context-sensitive logical operators
  – Allow building models of possible worlds and checking their consistency
  – Def: “A sequent is a collection of conceptual graphs divided into two sets, called the conditions u1,..., un and the assertions v1,..., vm. It is written u1,..., un ->v1,..., vm.”
![Page 103: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/103.jpg)
Dealing with General Knowledge
• Cases of sequents:
  – simple assertion: no conditions, one assertion (->v)
  – disjunction: no conditions, one or more assertions (->v1,..., vm)
  – simple denial: one condition, no assertions (u->)
  – compound denial: 2 or more conditions, no assertions (u1,..., un->)
  – conditional assertion: u1,..., un ->v1,..., vm
  – empty clause: ->
  – Horn clause: anything with at most one assertion (inc. 0)
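The cases above depend only on how many conditions and assertions a sequent has, which the following sketch makes explicit. The function name is illustrative, graphs are stood in for by plain strings, and one choice is an assumption: a sequent with no conditions and exactly one assertion is labelled "simple assertion", with "disjunction" reserved for two or more assertions.

```python
def classify_sequent(conditions, assertions):
    """Return (case name, is_horn) for a sequent u1..un -> v1..vm."""
    n, m = len(conditions), len(assertions)
    if n == 0 and m == 0:
        case = "empty clause"
    elif n == 0 and m == 1:
        case = "simple assertion"
    elif n == 0:
        case = "disjunction"
    elif m == 0 and n == 1:
        case = "simple denial"
    elif m == 0:
        case = "compound denial"
    else:
        case = "conditional assertion"
    # Horn clause: at most one assertion (including zero).
    return case, m <= 1
```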
![Page 104: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/104.jpg)
Dealing with General Knowledge
• McCarthy, J. 1987. “Generality in Artificial Intelligence” (1971 Turing Award Lecture)
• “no one knows how to make a general database of commonsense knowledge that could be used by any program that needed the knowledge”
• Examples: robots moving things around, what we know about families, buying and selling...
![Page 105: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/105.jpg)
Dealing with General Knowledge
• “In my opinion, getting a language [my italics] for expressing general commonsense knowledge for inclusion in a general database is the key problem of generality in AI.”
![Page 106: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/106.jpg)
Dealing with General Knowledge
• How can we write programs that can learn to modify their own behaviour, including improving the way they learn?
  – Friedberg (A Learning Machine, c. 1958)
  – Newell, Simon, Shaw (General Problem Solver, c. 1957-1969)
  – Newell, Simon (Production Machines, 1950-1972)
  – McCarthy (Logical Representation, c. 1958)
  – McCarthy (Formalising Context, 1987)
![Page 107: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/107.jpg)
Dealing with General Knowledge
• A Learning Machine
  – Learns by making random modifications to a program
  – Discard flawed programs
  – Learnt to move a bit from one memory cell to another
  – In 1987, was demonstrated to be inferior to simply re-writing the entire program
![Page 108: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/108.jpg)
Dealing with General Knowledge
• General Problem Solver
  – Represent problems of some class as problems of transforming one expression into another using a set of allowed rules
  – First system to separate problem structure from the domain
  – McCarthy claims problem in representing commonsense knowledge as transformations
![Page 109: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/109.jpg)
Dealing with General Knowledge
• Production (Expert) Systems
  – Represent knowledge as facts and rules
  – Facts contain no variables or quantifiers
  – New facts are produced by inference, observation and user input
  – Rules are usually coded by programmer/expert
  – Rules are usually not learnt or generated by system (but see data mining)
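A minimal forward-chaining sketch shows how new facts can be produced by inference from hand-coded rules. It is an illustration only: facts are plain strings with no variables, each rule is a hypothetical (premises, conclusion) pair, and the example rules are invented.

```python
def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises all hold until nothing new is added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and set(premises) <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical rules written by a programmer/expert.
rules = [
    (("has_fever", "has_cough"), "suspect_flu"),
    (("suspect_flu",), "recommend_rest"),
]
derived = forward_chain({"has_fever", "has_cough"}, rules)
```

The rule set itself never changes while the system runs, which is the limitation the last bullet points at.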
![Page 110: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/110.jpg)
Dealing with General Knowledge
• Logical Representation
  – Representing information declaratively
  – Although Prolog can represent facts in logical representation and reason using logic, it cannot do universal generalization, and so cannot modify its own behaviour enough
  – So McCarthy built Lisp...
![Page 111: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/111.jpg)
Dealing with General Knowledge
• Logical Representation
  – McCarthy’s “dream” is that commonsense knowledge possessed by humans could be written as logical sentences and stored in a db
  – Facts about the effects of actions are essential (when we hear the squeal of tyres we expect a bang...)
  – Necessary to say that an action changes only features of the situation to which it refers
![Page 112: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/112.jpg)
Dealing with General Knowledge
• Context
  – We understand under-qualified utterances because we understand them in context
  – “The book is on the table”
  – “Where is the book?”
![Page 113: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/113.jpg)
Dealing with General Knowledge
• Context
  – “Can you fetch me the book, please?”
  – Up until the last utterance, the physical location of the book was not significant, and we were able to have a short dialogue about it
  – Fully qualified utterances are too unwieldy to use in conversation
  – Occasionally gives rise to misunderstandings...
![Page 114: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/114.jpg)
Dealing with General Knowledge
• Context
  – “The book is on the table” is valid for a large number of different contexts, in which the specific book and the specific table, and perhaps even the location of the specific table can be significant and can also change over time
  – Utterances are understood in context
![Page 115: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/115.jpg)
Dealing with General Knowledge
• Is Context a ... collective Tacit Knowledge?
  – How does data become knowledge?
![Page 116: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/116.jpg)
Dealing with General Knowledge
• Is Context a ... collective Tacit Knowledge?
  – Context is “the collection of relevant conditions and surrounding influences that make a situation unique and comprehensible”
![Page 117: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/117.jpg)
Dealing with General Knowledge
– Where is context?
![Page 118: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/118.jpg)
Dealing with General Knowledge
• Closed world vs. Open World assumption
  – Closed World
    • I assume that anything I don’t know the truth of is false: I know everything that is true
  – Open World
    • I assume that anything I don’t know the truth of is unknown: Some things I don’t know may be true: I don’t know everything
![Page 119: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/119.jpg)
Dealing with General Knowledge
• Prolog, for instance, will return “false” about any fact that is missing from its database, or for which it cannot derive a truth-value
• A three-valued logic permits assertions to be true, false, or unknown
• However, reasoning and truth-maintenance become expensive in the open world
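The difference between the two assumptions can be sketched with a toy knowledge base. This is only an illustration: the fact strings and function names are invented, and `None` is used here to stand for the third truth value, "unknown".

```python
def closed_world(known_true, fact):
    """Closed world: anything not known to be true is taken to be false."""
    return fact in known_true

def open_world(known_true, known_false, fact):
    """Open world: True, False, or None (unknown) in a three-valued reading."""
    if fact in known_true:
        return True
    if fact in known_false:
        return False
    return None

known_true = {"capital(italy, rome)"}
known_false = {"capital(italy, milan)"}
query = "capital(malta, valletta)"  # simply missing from the knowledge base
```

Under the closed-world reading the missing fact is reported as false, as Prolog would do; under the open-world reading it stays unknown until more information arrives.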
![Page 120: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/120.jpg)
Dealing with General Knowledge
• The Web is an open world so the Semantic Web needs to reason within an open world (perhaps even across ontologies)
• Doesn’t mean that to solve some problems, SW cannot temporarily assume a closed-world (within an agreed ontology)
ekaw2004.pdf
![Page 121: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/121.jpg)
Teaching Knowledge
• Intelligent Tutoring Systems need to model both the user and the domain to create a learning path based on the student’s prior knowledge and goals, and to monitor the student’s progress
• AHSs developed partly by using hypertext systems as domain representations for ITSs - basically, when intelligent tutoring moved to the Web
![Page 122: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/122.jpg)
Intelligent Tutoring Systems
• Overview
• Modern ITS development began in 1987, after a review by Wenger
  – Wenger, E. (1987). Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Los Altos, CA: Morgan Kaufmann Publishers, Inc.
• This was the first attempt to examine the implicit and explicit goals of ITS designers
![Page 123: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/123.jpg)
Intelligent Tutoring Systems
• Wenger described ITS as a part of "knowledge communication" and his review focused on cognitive and learning aspects as well as the AI issues
![Page 124: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/124.jpg)
Intelligent Tutoring Systems
• "... consider again the example of books: they have certainly outperformed people in the precision and permanence of their memory, and the reliability of their patience. For this reason, they have been invaluable to humankind. Now imagine active books that can interact with the reader to communicate knowledge at the appropriate level, selectively highlighting the interconnectedness and ramifications of items, recalling relevant information, probing understanding, explaining difficult areas in more depth, skipping over seemingly known material ... intelligent knowledge communication systems are indeed an attractive dream." (p. 6).
![Page 125: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/125.jpg)
Intelligent Tutoring Systems
• Motivations underlying ITSs (and education in general):
  – to teach about something (abstract)
  – to teach how to do something (practical)
![Page 126: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/126.jpg)
Intelligent Tutoring Systems
• How can learning be achieved?
  – By rote
  – By mimicry (observation)
  – By application
![Page 127: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/127.jpg)
Intelligent Tutoring Systems
• When student performs task correctly, assume student understands concept and/or its application
  – When student performs task incorrectly, how can the tutor help?
  – Simply tell the student the correct answer
  – Tell student the correct answer and state why it's correct
  – Explain to the student why his/her answer is incorrect
![Page 128: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/128.jpg)
Intelligent Tutoring Systems
• Explanation-based correction is HARD!
• Tutor must first understand why the student gave the incorrect answer
  – Student lacks knowledge
  – Incorrect application of correct procedure
  – Misinterpretation of task
  – Misconception of principle
![Page 129: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/129.jpg)
Intelligent Tutoring Systems
• How to tutor?
  – Originally Computer-Aided Instruction (CAI) used non-interactive "classroom" techniques.
  – All students were taught in the same manner (e.g., through flash cards) and then assessed.
  – If a student failed, student had to work through the same material again, to "learn it better"
  – Access to human tutor to address difficulties
  – This type of learning, although self-paced, is ineffective
![Page 130: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/130.jpg)
Intelligent Tutoring Systems
• The goal of an ITS
  – A student learns from ITS by solving problems.
  – The ITS selects a problem and compares its solution with that of the student
  – It performs a diagnosis based on the differences.
  – After giving feedback, system reassesses and updates the student skills model and entire cycle is repeated.
![Page 131: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/131.jpg)
Intelligent Tutoring Systems
• The goal of an ITS (continued):
  – As the system assesses what the student knows, it also considers what the student needs to know, which part of the curriculum is to be taught next, and how to present the material.
  – It then selects the next problem/s.
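The cycle described on the two slides above (select a problem, compare solutions, diagnose, give feedback, update the student model, repeat) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Problem`, `StudentModel`, `select_problem`, `diagnose` and `tutor_step`, the "lowest mastery first" selection policy, and the simple mastery update rule are all assumptions made for the example, not part of the original slides.

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    skill: str      # hypothetical label for the skill this problem exercises
    question: str
    solution: int   # the ITS's own solution to the problem


@dataclass
class StudentModel:
    # Estimated mastery per skill in [0, 1]; a missing skill counts as 0.0.
    mastery: dict = field(default_factory=dict)

    def update(self, skill: str, correct: bool, rate: float = 0.5) -> None:
        # Assumed update rule: move the estimate toward 1 on success, toward 0 on failure.
        old = self.mastery.get(skill, 0.0)
        target = 1.0 if correct else 0.0
        self.mastery[skill] = old + rate * (target - old)


def select_problem(problems, model):
    # Assumed policy: pick the problem whose skill has the lowest estimated mastery.
    return min(problems, key=lambda p: model.mastery.get(p.skill, 0.0))


def diagnose(problem, answer):
    # Simplest possible diagnosis: compare the student's answer with the ITS's solution.
    return answer == problem.solution


def tutor_step(problems, model, ask):
    # One pass through the cycle: select, compare, diagnose, give feedback, update.
    problem = select_problem(problems, model)
    answer = ask(problem)  # `ask` stands in for the interface to the student
    correct = diagnose(problem, answer)
    feedback = "Correct." if correct else f"Expected {problem.solution}."
    model.update(problem.skill, correct)
    return problem, feedback
```

Calling `tutor_step` repeatedly with the same `StudentModel` gives the repeated cycle; a real ITS would replace the equality test in `diagnose` with an analysis of why the answers differ.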
![Page 132: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/132.jpg)
Intelligent Tutoring Systems
Basic issues in knowledge communication
![Page 133: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/133.jpg)
Intelligent Tutoring Systems
• Domain Expertise
  – Rather than being represented by chunks of information, the domain should be represented using a model and a set of rules which allow the system to "reason"
  – Typical domain model representations (make closed world assumption!)
    • If-Then Rules
    • If-Then Rules with uncertainty measures
    • Semantic Networks
    • Frame-based representations
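As a rough illustration of the first representation listed above, the following Python sketch applies If-Then rules by forward chaining and answers queries under the closed world assumption (anything that cannot be derived is treated as false). The function names `forward_chain` and `holds`, and the facts and rules in the usage note, are hypothetical and chosen only for the example.

```python
def forward_chain(facts, rules):
    """Apply If-Then rules repeatedly until no new fact can be derived.

    `rules` is a list of (conditions, conclusion) pairs, where `conditions`
    is a set of facts that must all be known for the conclusion to fire.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known


def holds(fact, known):
    # Closed world assumption: a fact that is not known or derivable is false.
    return fact in known
```

For example, with the hypothetical rules `[({"knows_addition", "knows_place_value"}, "can_add_multidigit"), ({"can_add_multidigit"}, "ready_for_multiplication")]` and the two starting facts `knows_addition` and `knows_place_value`, the system derives `ready_for_multiplication`, while `holds("knows_subtraction", known)` is false simply because nothing states or implies it.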
![Page 134: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/134.jpg)
Intelligent Tutoring Systems
• Student Model
  – According to Wenger, student models have three tasks. They must
    • Gather information about the student (implicitly or explicitly)
    • Create a representation of the student's knowledge and learning process (often as “buggy” models)
    • Perform a diagnosis to determine what the student knows and to determine how the student should be taught and to identify misconceptions
![Page 135: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/135.jpg)
Intelligent Tutoring Systems
• Student model architectures
  – Overlay student models
  – Differential student models
  – Perturbation student models
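The slide only names the three architectures. The sketch below shows one common way of reading them, using plain Python sets: an overlay model treats the student's knowledge as a subset of the expert's, a differential model only counts gaps in material the student has already been exposed to, and a perturbation model additionally records misconceptions drawn from a library of known bugs. These readings, the function names, and the example concepts are assumptions made for illustration, not definitions taken from the slides.

```python
def overlay(domain, known):
    # Overlay: student knowledge is a subset of the expert (domain) knowledge.
    return {"known": known & domain, "missing": domain - known}


def differential(domain, known, exposed):
    # Differential: only material the student has been exposed to counts as a gap.
    return {
        "known": known & domain,
        "missing": (exposed & domain) - known,
        "not_yet_taught": domain - exposed,
    }


def perturbation(domain, known, beliefs, bug_library):
    # Perturbation: overlay plus misconceptions that match a library of known bugs.
    model = overlay(domain, known)
    model["misconceptions"] = beliefs & bug_library
    return model
```

With a hypothetical domain `{"carry", "borrow", "place_value"}`, a student who knows `place_value` and has only been taught `place_value` and `carry` has two missing concepts under the overlay model but only one (`carry`) under the differential model.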
![Page 136: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/136.jpg)
Intelligent Tutoring Systems
• Student model diagnosis
  – Performance measuring
  – Model tracing
  – Issue tracing
  – Expert systems
![Page 137: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/137.jpg)
Intelligent Tutoring Systems
• Pedagogical expertise
  – Used to decide how to:
    • present/sequence information
    • answer questions/give explanations
    • provide help/guidance/remediation
![Page 138: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/138.jpg)
Intelligent Tutoring Systems
• According to Wenger, when "learning is viewed as successive transitions between knowledge states, the purpose of teaching is accordingly to facilitate the student's traversal of the space of knowledge states." (p. 365)
• The ITS must model the student's current knowledge and support the transition to a new knowledge state.
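One way to picture "traversal of the space of knowledge states" is to treat each state as the set of concepts the student has mastered and each teaching step as adding one concept whose prerequisites are already met. The Python sketch below, under those assumptions, uses breadth-first search to find a shortest teaching sequence from the current state to a goal state. The prerequisite table `PREREQS` and the function names are hypothetical, and real systems need not plan this way.

```python
from collections import deque

# Hypothetical prerequisite structure: concept -> concepts required first.
PREREQS = {
    "counting": set(),
    "addition": {"counting"},
    "multiplication": {"addition"},
}


def next_states(state):
    # A transition teaches one new concept whose prerequisites are all mastered.
    for concept, prereqs in PREREQS.items():
        if concept not in state and prereqs <= state:
            yield concept, state | {concept}


def teaching_path(start, goal):
    # Breadth-first search over knowledge states; returns the concepts to teach, in order.
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:
            return path
        for concept, new_state in next_states(state):
            if new_state not in seen:
                seen.add(new_state)
                queue.append((new_state, path + [concept]))
    return None  # goal state is not reachable with the known concepts
```

Here the student model supplies `start`, the curriculum supplies `goal`, and the returned list is the sequence of transitions the tutor would try to support.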
![Page 139: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/139.jpg)
Intelligent Tutoring Systems
• ITSs must alternate between diagnostic and didactic support.
• Diagnostic support
  – Information about a student's state is inferred on 3 levels
  – Behavioural - ignores learner's knowledge, and concentrates on observed behaviour
  – Epistemic - attempts to infer learner's knowledge state based on learner's behaviour
  – Individual - cognitive model of learner's state, attitudes (to self, world, ITS), motivation
![Page 140: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/140.jpg)
Intelligent Tutoring Systems
• Didactic support
  – Concerned with the "delivery" aspect of teaching
![Page 141: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/141.jpg)
Intelligent Tutoring Systems
• Interface
  – The interface is the layer through which the learner and ITS communicate
  – The design of an interface which enhances learning is essential
  – Web-based ITSs tend to rely on the Web browser to provide the interface
  – Hypermedia-based ITSs in general must provide adaptive presentation and adaptive navigation facilities, if they are to extend beyond knowledge exploration environments
![Page 142: CSA4080: Adaptive Hypertext Systems II](https://reader035.fdocuments.us/reader035/viewer/2022070409/5681446f550346895db101e8/html5/thumbnails/142.jpg)
Intelligent Tutoring Systems