Machine Learning Tutorial.1 Boolean algebra, probabilities.
-
Upload
joel-craig -
Category
Documents
-
view
239 -
download
0
description
Transcript of Machine Learning Tutorial.1 Boolean algebra, probabilities.
Machine Learning
Tutorial.1Boolean algebra, probabilities
Boolean Algebra Tutorial(Duncan Fyfe Gillies slides)
• Computer hardware works with binary numbers, but binary arithmetic is much older than computers– Ancient Chinese Civilisation (3000 BC)– Ancient Greek Civilisation (1000 BC)– Boolean Algebra (1850)
Propositional Logic
• The Ancient Greek philosophers created a system to formalize arguments called propositional logic.– A proposition is a statement that could be TRUE or
FALSE– Propositions could be compounded by means of the
operators AND, OR and NOT
Propositional Calculus Example
• Propositions may be TRUE or FALSE:– it is raining– the weather forecast is bad
• A combined proposition:– it is raining OR the weather forecast is bad
Propositional Calculus Example• We can assign values to propositions, for example:
I will take an umbrella if and only ifit is raining OR the weather forecast is bad
• In other words the proposition “I will take an umbrella” is the result of the Boolean combination (OR) between two conditions:
raining weather forecast being bad.
• In fact we could write:I will take an umbrella =
it is raining OR the weather forecast is bad
Diagrammatic representation
• We can think of the umbrella proposition as a result that we calculate from the weather forecast and the fact that it is raining by means of a logical or.
OR
Rain
Bad forecastTake an umbrella
Computer Organization class diagrams
Truth Tables
• Since propositions can only take two values, we can express all possible outcomes of the umbrella proposition by a table:– True = 1– False = 0
Rain Bad Forecast
Umbrella
False False FalseFalse True TrueTrue False TrueTrue True True
Rain Bad Forecast
Umbrella
0 0 00 1 11 0 11 1 1
More complex propositions
• We can make our propositions more complex, for example:
(Take Umbrella ) =( NOT (Take Car ) ) AND ( (Bad Forecast ) OR (Raining ) )
OR
Rain
Bad forecast
ANDTake an umbrellaTake a car NOT
Boolean Algebra
• We could write propositional statements as above but to perform calculations quickly and efficiently we can use an equivalent, but more succinct notation.
• We also need a to have a well-defined semantics for all the “operators”, or connectives that we intend to use.
• The system we will employ is called Boolean Algebra (introduced by the English mathematician Boole in1850) and satisfies the criteria above.
Fundamentals of Boolean Algebra
• The truth values are replaced by 1 and 0:– TRUE = 1– FALSE = 0
• Propositions are replaced by variables (for example):– it is raining = R– The weather forecast is bad = W
• Operators are replaced by symbols– NOT = – OR = disjunction – AND = conjunction
Truth Tables
All possible outcomes of the operators can be written as truth tables
A B R
0 0 00 1 11 0 11 1 1
A B R
0 0 00 1 01 0 01 1 1
A R0 11 0
A AND B A OR B NOR A
For n variables there are 2n possibilities.
Theory of Probability
Olga Kubassova, School of Computing
http://www.comp.leeds.ac.uk/olga
7th February 2007
Probability Basics0≤P≤1
• Experiment – a process that yields outcome.
• Sample space – the set of all possible outcomes.
• Event – a subset of the sample space.– E.g. getting heads when tossing a coin.
• Mutually exclusive – if events cannot occur simultaneously.
Example:
• Rolling a fair six-sided die: • Sample Space: S={1,2,3,4,5,6}, |S| = 6
– Independent events• Event: obtaining 4
Probability notation• P(A) – the prob. of event A occurring
• P(A∪B) – prob. of A or B– Probability of rolling 1 or 6
• P(A∩B) – prob. of A and B– Probability or rolling 1 and 2 – Probability of rolling an even number grater than 3
• P(¬A) – prob. of A not occurring (the complement of A)
Pay attention to the AND/OR notation (set theory)
Even numbers on both dice?
Sum is greater than 5?
1,1 1,2 1,3 1,4 1,5 1,6
2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6
4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6
6,1 6,2 6,3 6,4 6,5 6,6
Are the dice identical?How many different events?
Even numbers on both dice
ANDSum is greater
than 5
Even numbers on both dice
ORSum is greater
than 5
Example:
• Rolling a fair six-sided die: • Sample Space: S={1,2,3,4,5,6}, |S| = 6
– Independent events• Event: obtaining 4
• P(S) =
– Sum of probabilities of all (independent!) events– P(eventi)=1/6 (uniform distribution)
1)(||
1
S
iieventp
Finite probability• Experiment that has a finite number of
outcomes.• Therefore, each event has a finite
probability.
• Experiment:Throw a die. A = number > 4. B = even number.Find P(A), P(B), P(A∪B) and P(A∩B)
Mutually Exclusive Events
• Events are mutually exclusive if they are disjoint, or cannot occur simultaneously
A ∩ B = 0
Examples?
Even numbers on both dice
ANDSum is less
than 1
Uniform distributions • Every outcome in the sample space is
equally probable.
• E.g. tossing coins rolling dice drawing a card from a deck
Conditional probability
)()()|(
BPBAPBAP
• P(A | B) – the prob. of A happening given that B has occurred.
Example of the Conditional Probability
Imagine that 5% of people of a given population own at least one dog. 2% of people own at least one dog and at least one cat. What is the probability that someone will own a cat, given that they also have a dog?
Let A = “dog owner”, B = “cat owner”, then: P(A) = 0.05; P(A ∩ B) = 0.02
P(B | A) = P(A ∩B)/P(A)
Independence• If events A and B do not influence
each other then:
P(A | B) = P(A)
and
P(A∩B) = P(A)P(B)
Example (Independence)• Let throw the coin 2 times. Probability that
the first time it is heads and the second it is tail do not influence each other
Chain rule
),,,( 21 nAAAP
),|(),|()|()( 11213121 nn AAAPAAAPAAPAP
Chain rule allows computing probabilities of sequences of the events
Chain rule (II)• Let us compute sequences of events.• E.g. prob. of sequence of letters, ‘t’,
‘h’ and ‘e’:
P(t,h,e) = P(t)P(h | t)P(e | th)
Bayes’ theorem
Bayes’ theorem (II)• Meningitis causes stiff neck 50% cases• Prob. of meningitis is 1/50000• Prob. of stiff neck is 1/20
• What is the prob. of having meningitis if a patient has a stiff neck?
Additional resources• Andy Roberts homepage: http://andy-roberts.net/teaching/index.html• Google directory:
http://directory.google.com/Top/Science/Math/Probability
• Wikipedia:http://en.wikipedia.org/wiki/Probability
Exponentials – What are they?
• Simply: a number (base) raised to a power (exponent).
• E.g. the area of a square whose sides are of length 3…
• = 32 • = 3 squared, or 3 to the power 2 • = 3 × 3
• Easy enough to calculate:
• ab = a × a × a … × a • So there are ‘b’ lots of ‘a’• What is 43?
• 4 × 4 × 4 = 64
Exponentials – A quick explanation
b
Exponentials in Computer Science
• Have you ever noticed the following numbers popping up in your computer science studies?
• 1, 2, 4, 8, 16, 32, 64, 128…1024…• What do they all have in common?• They are all powers of 2!!
Exponentials in Computer Science
• The powers of 2 should be (very) familiar to you by now.
• All the computers you have been using work in binary bits (at the lowest level)…
• 2 values: 0/1, true/false etc• all ‘data sizes’ must be expressed in powers of
2.
Exponentials in Computer Science
• 1 kilobyte ≠ 1000 bytes (‘standard’ use of kilo)• 1 kilobyte = 1024 bytes• Because…• 29 = 512• 210 = 1024• 211 = 2048• So 210 is the closest we can get to 1000
Working with Exponentials
• First, some easy exponentials to remember:
• a0 = 1
• a1 = a
Logarithms
• This is because log(arithm)s are just ‘reversed’ exponentials
• E.g. if 24 = 16• log216 = 4
– base is 2• Logarithms ‘map’ large numbers onto smaller
numbers
Logarithmic Scale
0
1
2
3
4
5
6
7
8
9
10
1 2 3 4 5 6 7 8 9 10
x
2 x̂
log(x,2)
Logarithms - Bases
• There are several common bases:– 10: very common base, Richter scale etc.– e: used by a lot of scientists– 2: very common in computer science. WHY?
Logarithms – An example
• Imagine a (small) football tournament of 8 teams.
• It is a knockout tournament, so every time that 2 teams play the losing team is eliminated from the tournament and the winning team goes on to the next round.
• How many rounds must be played to determine an overall winner?
Logarithms – An example
• Imagine a (small) football tournament of 8 teams.
• It is a knockout tournament, so every time that 2 teams play the losing one is removed from the tournament and the winning one goes on to the next round.
• How many rounds must be played to determine an overall winner?
Logarithms – An example
Round 0: 8 teams
Round 1: 4 teams
Round 2: 2 teams
Round 3: 1 team
3 rounds needed!
Logarithms – Example explained
• 3 rounds are needed to determine the winner of 8 teams, competing 2 at a time (i.e. one-on-one)
• This can be easily calculated using logs.• 2 teams play at a time, so the base is 2.
(i.e. 2x = 8, so we need to use log2)
• So, log2 8 = ….• 3