CSCI 590: Machine Learning

Lecture 2: Basics of Probability

Instructor: Murat Dundar

Acknowledgement:

Some of these slides are taken from the course textbook website:

http://research.microsoft.com/~cmbishop/prml/

Axioms of Probability

In probability, the set of all possible outcomes of an experiment is called the sample space and usually denoted by 𝝮.

• Tossing a coin: 𝝮={head, tail}

• Rolling a die: 𝝮={1, 2, 3, 4, 5, 6}

Let P(A) denote the probability of an event A. The number P(A) should satisfy the following three conditions:

1. P(A)≥0

2. P(𝝮)=1

3. If A ∩ B=∅ then P(A ∪ B)=P(A) + P(B)
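The three conditions can be checked on the die example. The sketch below assumes a fair die (equal probability for each outcome), which the slide does not state; `prob` is a name introduced here for illustration only.

```python
from fractions import Fraction

# Sample space for one roll of a die. Fairness is an assumption of this sketch.
omega = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """P(A) for a fair die: each outcome has probability 1/6."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(omega))

even = {2, 4, 6}
low_odd = {1, 3}                      # disjoint from `even`

p_even = prob(even)                   # condition 1: non-negative
p_omega = prob(omega)                 # condition 2: equals 1
# Condition 3: for disjoint events the probabilities add.
additive = prob(even | low_odd) == prob(even) + prob(low_odd)
```

Exact fractions are used instead of floats so that the equality in condition 3 can be tested without rounding error.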

σ-Algebra (1)

Events are subsets of 𝝮, but not all subsets of 𝝮 can be considered events. When 𝝮 has uncountably many outcomes, it is in general impossible to assign probabilities consistently to all of its subsets. Such subsets are called non-measurable.

The collection of sets over which a measure can be defined is called a σ-algebra (or σ-field). A σ-field 𝝨 is a nonempty class of subsets of 𝝮 such that

• If A 𝝐 𝝨 then Aᶜ 𝝐 𝝨 (complement)

• If A1, A2, … 𝝐 𝝨 then A1 ∪ A2 ∪ … 𝝐 𝝨 (countable union)

The pair (𝝮, 𝝨) is called a measurable space.

σ-Algebra (2)

These two conditions are the minimal set. We can show that a σ-field also satisfies the following.

• If A1, A2, … 𝝐 𝝨 then A1 ∩ A2 ∩ … 𝝐 𝝨 (countable intersection)

• 𝝮 𝝐 𝝨 and ∅ 𝝐 𝝨

Example: If 𝝮={a,b,c,d} one possible σ-algebra on 𝝮 is 𝝨={∅, {a,b}, {c,d}, {a,b,c,d}}

The smallest possible σ-field is a collection of just two sets, {∅, 𝝮}. The largest possible σ-field is the collection of all subsets of 𝝮, also called the power set of 𝝮.
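For a finite 𝝮 the defining conditions can be checked mechanically. The following is a minimal sketch; `is_sigma_algebra` is a hypothetical helper, and it relies on the fact that for a finite collection, closure under pairwise unions already gives closure under countable unions.

```python
from itertools import combinations

def is_sigma_algebra(omega, sigma):
    """Check the two defining conditions on a finite sample space."""
    omega = frozenset(omega)
    sigma = {frozenset(s) for s in sigma}
    if not sigma:
        return False                      # must be a nonempty class of sets
    if any(not s <= omega for s in sigma):
        return False                      # every member must be a subset of omega
    closed_under_complement = all(omega - s in sigma for s in sigma)
    closed_under_union = all(a | b in sigma for a, b in combinations(sigma, 2))
    return closed_under_complement and closed_under_union

omega = {"a", "b", "c", "d"}
example = [set(), {"a", "b"}, {"c", "d"}, omega]   # the example from the slide
trivial = [set(), omega]                           # the smallest sigma-field
not_closed = [set(), {"a"}, omega]                 # complement {b, c, d} is missing
```

Calling `is_sigma_algebra(omega, example)` and `is_sigma_algebra(omega, trivial)` returns `True`, while `is_sigma_algebra(omega, not_closed)` returns `False`.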

Borel σ-Algebra

Borel set: A Borel set of 𝝮 is any set that can be formed from open sets (or closed sets) through the operations of countable union, countable intersection, and relative complement.

The collection of all Borel sets on 𝝮 forms a special σ-algebra known as the Borel σ-algebra (or Borel algebra).

The Borel algebra on 𝝮 is the smallest σ-algebra containing all open sets.

The Real Line

Let 𝝮 be the set of all real numbers. One candidate collection of events is every set of points on the real line. Because the real line has uncountably many elements, probabilities cannot in general be assigned consistently to all of these subsets.

So how do we construct an event set on 𝝮 to which probabilities can be assigned?

We start with all open intervals (x1 < x < x2), add all countable unions, countable intersections, and relative complements, and continue this process until the relevant closure properties are achieved.

The set of these events forms the Borel field. It contains all half-lines x ≤ xi, which can be assigned probabilities. The probabilities of all other events can be derived from these probabilities.
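As a worked example of deriving other probabilities from half-line probabilities, the sketch below uses the additivity condition for disjoint events. The symbol F for the half-line probability is notation assumed here, not taken from the slides.

```latex
% Notation assumed for this sketch: F(a) = P(x \le a) is the probability of a half-line.
\begin{align*}
\{x \le x_2\} &= \{x \le x_1\} \cup \{x_1 < x \le x_2\}, \qquad x_1 < x_2 \\
P(x \le x_2) &= P(x \le x_1) + P(x_1 < x \le x_2) \\
P(x_1 < x \le x_2) &= F(x_2) - F(x_1) \\
P(x > x_1) &= 1 - F(x_1)
\end{align*}
```

The second line holds because the two sets on the right of the first line are disjoint. The last line follows in the same way from P(𝝮)=1.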

Probability Theory

Marginal Probability

Conditional Probability

Joint Probability

Probability Theory

Sum Rule

Product Rule

The Rules of Probability

Sum Rule

Product Rule
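The equations for these slides did not survive extraction. In their usual form the sum rule reads p(X) = Σ_Y p(X, Y) and the product rule reads p(X, Y) = p(Y | X) p(X). The sketch below checks both on a small joint table; the table values and the function names are made up for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution p(X, Y) over X in {0, 1} and Y in {"a", "b"}.
joint = {
    (0, "a"): Fraction(1, 10),
    (0, "b"): Fraction(3, 10),
    (1, "a"): Fraction(2, 10),
    (1, "b"): Fraction(4, 10),
}

def marginal_x(x):
    """Sum rule: p(X) = sum over Y of p(X, Y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(y, x):
    """Conditional probability p(Y | X) = p(X, Y) / p(X)."""
    return joint[(x, y)] / marginal_x(x)

# Product rule: p(X, Y) = p(Y | X) p(X) holds for every cell of the table.
product_rule_holds = all(
    conditional_y_given_x(y, x) * marginal_x(x) == p
    for (x, y), p in joint.items()
)
```

Here `marginal_x(0)` evaluates to 2/5 and `conditional_y_given_x("b", 0)` to 3/4.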

Probability Densities

Transformed Densities

If y = g⁻¹(x) and x ~ p_x(x)
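The formula on this slide was lost in extraction. The standard change-of-variables result for x = g(y) is p_y(y) = p_x(g(y)) |g′(y)|, and the sketch below checks it numerically. The choices of a uniform p_x on (0, 1) and g(y) = y² are assumptions made only for this example.

```python
import math
import random

def p_x(x):
    """Density of x: uniform on (0, 1), chosen for illustration."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g(y):
    """x = g(y); here g(y) = y**2, so y = g^{-1}(x) = sqrt(x)."""
    return y * y

def g_prime(y):
    """Derivative of g."""
    return 2.0 * y

def p_y(y):
    """Change of variables: p_y(y) = p_x(g(y)) * |g'(y)|."""
    return p_x(g(y)) * abs(g_prime(y))

rng = random.Random(0)
samples_y = [math.sqrt(rng.random()) for _ in range(100_000)]

# Empirical P(y <= 0.5) versus the integral of p_y(y) = 2y from 0 to 0.5.
empirical = sum(y <= 0.5 for y in samples_y) / len(samples_y)
predicted = 0.5 ** 2
```

With this choice p_y(y) = 2y on (0, 1), so the empirical fraction should be close to 0.25.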

From P(x) to uniform distribution

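Given a RV x with a continuous, strictly increasing cdf P(x), the RV u=P(x) is uniformly distributed in the interval [0,1].

Proof: Let u₀ = P(x₀) for a fixed value x₀. If u=P(x), from the monotonicity of P it follows that u ≤ u₀ iff x ≤ x₀. Hence the cdf of u at u₀ is

Pr(u ≤ u₀) = Pr(x ≤ x₀) = P(x₀) = u₀

which is the cdf of the uniform distribution on [0,1].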
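This result can be checked empirically. The sketch below assumes an exponential distribution with rate 1 as the example P(x); any continuous cdf would serve.

```python
import math
import random

def cdf(x, rate=1.0):
    """cdf of an exponential distribution, P(x) = 1 - exp(-rate * x)."""
    return 1.0 - math.exp(-rate * x)

rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(100_000)]   # samples of x
us = [cdf(x) for x in xs]                             # transformed samples u = P(x)

# If u is uniform on [0, 1], the fraction of samples below any u0 is close to u0.
fractions_below = {
    u0: sum(u <= u0 for u in us) / len(us) for u0 in (0.1, 0.5, 0.9)
}
```

Each value in `fractions_below` should lie close to its key, up to sampling noise.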

From uniform distribution to P(x)

This is how we can generate samples with cdf P(x).

Given a uniformly distributed RV u in the interval [0,1], we can generate samples of a RV x with cdf P(x) by transforming u through the inverse cdf, x = P⁻¹(u).
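A minimal sketch of this sampling procedure, assuming an exponential target distribution because its inverse cdf has a closed form. `inverse_cdf` and `sample` are names introduced for this example.

```python
import math
import random

def inverse_cdf(u, rate=1.0):
    """P^{-1}(u) for the exponential cdf P(x) = 1 - exp(-rate * x)."""
    return -math.log(1.0 - u) / rate

def sample(n, rate=1.0, seed=0):
    """Draw n samples by pushing uniform draws through the inverse cdf."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u is never zero and the log is defined.
    return [inverse_cdf(rng.random(), rate) for _ in range(n)]

xs = sample(100_000, rate=2.0)
sample_mean = sum(xs) / len(xs)   # should be near 1 / rate = 0.5
```

When P⁻¹ has no closed form, it would have to be evaluated numerically, which this sketch does not cover.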

Expectations

Conditional Expectation (discrete)

Approximate Expectation (discrete and continuous)
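The equation for this slide is missing from the transcript. A common approximation, assumed here, replaces the expectation by an average over N samples, E[f] ≈ (1/N) Σ f(xₙ). The sketch uses a standard Gaussian and f(x) = x² as an illustrative case.

```python
import random

def approximate_expectation(f, draw, n):
    """Estimate E[f] by averaging f over n samples: (1/N) * sum_n f(x_n)."""
    return sum(f(draw()) for _ in range(n)) / n

rng = random.Random(0)
# E[x^2] for a standard Gaussian equals its variance, which is 1.
estimate = approximate_expectation(
    lambda x: x * x,                 # f
    lambda: rng.gauss(0.0, 1.0),     # one draw of x
    200_000,
)
```

The same function works for a discrete variable if `draw` returns samples from a discrete distribution.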

Variances and Covariances

Independence

If two events A and B are independent then P(A,B)=P(A)P(B)

In general, if n random variables xi are independent and identically distributed (i.i.d.) according to some distribution p(xi), then we can write

p(x1, x2, …, xn) = ∏_{i=1}^{n} p(xi)

If two random variables are independent they are not correlated. The reverse is not necessarily true!
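A small counterexample shows why the reverse fails. In the sketch below, x is uniform on {-1, 0, 1} and y = x², a construction chosen for illustration: the covariance is zero even though y is completely determined by x.

```python
from fractions import Fraction

# x is uniform on {-1, 0, 1} and y = x**2, so y is a deterministic function of x.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(f):
    """E[f(x, y)] under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

# cov(x, y) = E[xy] - E[x] E[y]
covariance = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

p_x0 = sum(p for (x, _), p in joint.items() if x == 0)   # p(x = 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)   # p(y = 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))                 # p(x = 0, y = 0)

uncorrelated = covariance == 0
independent = p_x0_y0 == p_x0 * p_y0
```

Here `uncorrelated` is `True` while `independent` is `False`, since p(x=0, y=0) = 1/3 but p(x=0) p(y=0) = 1/9.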

The Gaussian Distribution

Gaussian Mean and Variance
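The slide equations are not in the transcript. The sketch below assumes the usual univariate density N(x | μ, σ²) = (2πσ²)^(-1/2) exp(-(x-μ)²/(2σ²)) and checks numerically that it integrates to 1, has mean μ, and has variance σ². The parameter values and the `integrate` helper are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """N(x | mu, sigma^2) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def integrate(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

mu, sigma2 = 1.5, 4.0
lo, hi = mu - 40.0, mu + 40.0   # wide enough that the tails are negligible

total = integrate(lambda x: gaussian_pdf(x, mu, sigma2), lo, hi)
mean = integrate(lambda x: x * gaussian_pdf(x, mu, sigma2), lo, hi)
variance = integrate(lambda x: (x - mu) ** 2 * gaussian_pdf(x, mu, sigma2), lo, hi)
```

The three integrals should come out close to 1, μ = 1.5, and σ² = 4 respectively.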

The Gaussian Distribution
