©2003/04 Alessandro Bogliolo Background Information theory Probability theory Algorithms.
Istituto di Scienze e Tecnologie dell'Informazione
Background
Information theory Probability theory
Algorithms
Outline
1. Information theory
   1. Definitions
   2. Encodings
   3. Digitalization
2. Probability theory
   1. Random process
   2. Probability (marginal, joint, conditional), Bayes theorem
   3. Probability distribution
   4. Sampling theory and estimation
3. Algorithms
   1. Definition
   2. Equivalence
   3. Complexity
Information
• Information: reduction of uncertainty
• The minimum uncertainty is given by two alternatives
• The elementary choice between 2 alternatives contains the minimum amount of information
• Bit: binary digit encoding the elementary choice between 2 alternatives (information unit)
Strings
• String: sequence of N characters (or digits) taken from a given finite alphabet of S symbols
• There are $S^N$ different strings (configurations) of N characters taken from the same alphabet of S symbols
• A binary string is composed of bits, defined over the binary alphabet $B = \{0, 1\}$
• Byte: binary string of 8 bits
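The count of strings can be checked by brute force. The following is a minimal Python sketch, not part of the original slides; the helper name `all_strings` is hypothetical. It enumerates every string of N characters over an alphabet of S symbols and confirms that there are S**N of them.

```python
from itertools import product

def all_strings(alphabet, n):
    """Return every string of n characters taken from the given alphabet."""
    return ["".join(chars) for chars in product(alphabet, repeat=n)]

binary = "01"                      # binary alphabet B = {0, 1}, so S = 2
strings = all_strings(binary, 3)   # N = 3
print(strings)       # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(strings))  # 8, i.e. S**N = 2**3
```

With N = 8 the same call returns the 256 possible values of a byte.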
Encodings
• Encoding: assignment of strings to the elements of a set according to a given rule
• Properties:
  – Irredundant: each element is assigned a unique string
  – Constant length: all codewords have the same length
  – Exact: all elements are encoded and no two elements are associated with the same string
[Figure: example codeword sets {0, 10, 110, 1110}, {00, 01, 10, 11} and {000, 001, 010, 011, 100, 101, 110}]
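Encoding finite sets
• The minimum number of digits of a constant-length exact encoding of a set of M elements is
  $N = \lceil \log_S M \rceil$, so that $S^N \ge M$

Number of elements $M = S^N$ that can be encoded with N digits:

N     S=2     S=4
1     2       4
2     4       16
3     8       64
4     16      256
5     32      1024
6     64      4096
7     128     16384
8     256     65536
9     512     262144
10    1024    1048576

[Figure: N as a function of M (M from 0 to 1200), for S=2 and S=4]

• Properties:
  – $\log_S (M_1 M_2) = \log_S M_1 + \log_S M_2 = N_1 + N_2$, since $M_1 M_2 = S^{N_1} S^{N_2} = S^{N_1 + N_2}$
  – $\log_S M^k = k \log_S M = kN$, since $M^k = (S^N)^k = S^{kN}$
  – $\log_{S_1} M = \frac{\log_{S_2} M}{\log_{S_2} S_1}$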
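The minimum code length can also be computed directly. This is a small Python sketch under the assumption that the set has at least one element and the alphabet at least two symbols; the function name `min_digits` is hypothetical. It uses integer arithmetic only, which avoids rounding problems with floating-point logarithms.

```python
def min_digits(m, s):
    """Smallest N such that s**N >= m (integer arithmetic, no rounding issues)."""
    if m < 1 or s < 2:
        raise ValueError("need m >= 1 and s >= 2")
    n, capacity = 0, 1
    while capacity < m:
        capacity *= s
        n += 1
    return n

print(min_digits(1000, 2))  # 10, since 2**10 = 1024 >= 1000 > 512
print(min_digits(1000, 4))  # 5, since 4**5 = 1024 >= 1000 > 256
print(min_digits(256, 2))   # 8
```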
Encoding unlimited sets (limitation)
• An unlimited set contains infinitely many elements
  Example: integer numbers
• Infinite sets cannot be exactly encoded
• In order to be digitally encoded, the set must be restricted to a finite subset
• In most cases this is done by encoding only the elements within given lower and upper bounds
  Example: integer numbers between 0 and 999
• The limited subset may be exactly encoded
Example: positional notation
• The base-b positional representation of an integer number of n digits has the form
  $c_{n-1} c_{n-2} \ldots c_1 c_0$
• The value of the number is
  $c_{n-1} b^{n-1} + \ldots + c_2 b^2 + c_1 b^1 + c_0 b^0$
• n digits encode all integer numbers from 0 to $b^n - 1$
Example: b=2, n=5
  10011 = 1*16 + 1*2 + 1*1 = 19
  11111 = 31 = $2^5 - 1$
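The two directions of positional notation can be sketched in Python as follows. The function names `value` and `to_digits` are hypothetical, and digits are assumed to be given as a list with the most significant digit first.

```python
def value(digits, b):
    """Value of a base-b number given its digits c_{n-1} ... c_0 (most significant first)."""
    total = 0
    for c in digits:
        total = total * b + c   # Horner's rule: equivalent to summing c_i * b**i
    return total

def to_digits(number, b, n):
    """n-digit base-b representation of an integer in the range 0 .. b**n - 1."""
    if not 0 <= number < b ** n:
        raise ValueError("number not representable with n digits")
    digits = []
    for _ in range(n):
        digits.append(number % b)   # extract the least significant digit
        number //= b
    return digits[::-1]

print(value([1, 0, 0, 1, 1], 2))  # 19
print(to_digits(31, 2, 5))        # [1, 1, 1, 1, 1]
```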
Encoding continuous sets (discretization)
• A continuous set contains infinitely many elements
  Example: real numbers in a given interval, points on a plane
• In order to be digitally encoded, the set needs to be discretized: partitioned into a finite number of subsets
• Codewords are associated with subsets
• The resulting encoding is approximate
Example: gray levels
[Figure: scale of 16 gray-level intervals labeled with the 4-bit codes 1111, 1110, ..., 0001, 0000]
Encoding associates a unique code with each interval of gray levels
All gray levels within an interval are associated with the same code, thus losing information
The original gray level cannot be exactly reconstructed from the code
Decoding associates each code with a unique gray level (representative of a class)
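The loss of information can be made concrete with a short Python sketch. It assumes gray levels are real numbers in [0, 1], split into 16 equal intervals, and that the representative of each interval is its midpoint; these choices and the names `encode` and `decode` are illustrative assumptions, not taken from the slides.

```python
N_LEVELS = 16  # 4-bit codes, matching the 16 codes shown on the slide

def encode(gray):
    """Map a gray level in [0.0, 1.0] to the index of the interval containing it."""
    if not 0.0 <= gray <= 1.0:
        raise ValueError("gray level out of range")
    return min(int(gray * N_LEVELS), N_LEVELS - 1)

def decode(code):
    """Map a code back to one representative gray level: the interval midpoint."""
    return (code + 0.5) / N_LEVELS

for g in (0.30, 0.31):
    code = encode(g)
    print(g, format(code, "04b"), decode(code))
```

Both input levels fall in the same interval, receive the code 0100 and decode to 0.28125, so the original values cannot be told apart after encoding.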
Example: 2D images
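[Figure: image of $n_x \times n_y$ pixels on the x and y axes, each pixel taking one of $n_{lev}$ gray levels]
$size = n_x \cdot n_y \cdot \log_2 n_{lev}$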
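As a worked instance of the image size formula, here is a minimal Python sketch. The function name and the 640x480, 256-level image are illustrative assumptions; the number of bits per pixel is rounded up to a whole number of bits.

```python
def image_size_bits(nx, ny, nlev):
    """Bits needed for an nx-by-ny image with nlev gray levels per pixel."""
    bits_per_pixel = (nlev - 1).bit_length()  # equals ceil(log2(nlev)) for nlev >= 1
    return nx * ny * bits_per_pixel

bits = image_size_bits(640, 480, 256)
print(bits, bits // 8)  # 2457600 bits, 307200 bytes
```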
Analog and digital signals
• Signal: time-varying physical quantity
  – Analog: continuous-time, continuous-value
  – Digital: discrete-time, discrete-value
• The digital encoding of a continuous signal entails:
  – Sampling (i.e., time discretization)
  – Quantization (i.e., value discretization)
$size = s_{rate} \cdot T \cdot s_{size}$
where $s_{rate}$ is the sampling rate, T is the duration and $s_{size}$ is the sample size
Example: time series
[Figure: time series, value versus time]
$size = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{lev}$
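Example: video
$size = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot n_x \cdot n_y \cdot \log_2 n_{col}$
$s_{rate}$ = frame rate
$n_{col}$ = number of colors
$n_x \cdot n_y$ = frame size
[Figure: sequence of $n_x \times n_y$ color frames along the time axis]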
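The size formulas for sampled signals and video can be evaluated with a few lines of Python. This is only a sketch: the function names and the numeric values (25 frames/s, 10 s, 320x240 pixels, 2**24 colors, 44100 samples/s at 16 bits) are illustrative assumptions.

```python
def bits_per_value(n):
    """Whole bits needed to distinguish n values; equals ceil(log2(n))."""
    return (n - 1).bit_length()

def signal_size_bits(s_rate, duration, sample_bits):
    """size = s_rate * T * s_size for a sampled signal."""
    return s_rate * duration * sample_bits

def video_size_bits(s_rate, duration, nx, ny, ncol):
    """size = s_rate * T * nx * ny * log2(ncol) for a video."""
    return signal_size_bits(s_rate, duration, nx * ny * bits_per_value(ncol))

# 10 s of video at 25 frames/s, 320x240 pixels, 2**24 colors (illustrative values)
print(video_size_bits(25, 10, 320, 240, 2 ** 24))  # 460800000 bits
# 1 s of a time series sampled 44100 times per second with 16-bit samples
print(signal_size_bits(44100, 1, 16))              # 705600 bits
```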
Redundancy
• Redundant encoding: encoding that makes use of more than the minimum number of digits required by an exact encoding
  $N > \lceil \log_S M \rceil$
• Motivations for redundancy:
  – Providing more expressive/natural encoding/decoding rules
  – Reliability (error detection)
    Ex: parity encoding
  – Noise immunity / fault tolerance (error correction)
    Ex: triplication
Redundancy: examples
• Parity encoding:
  – A parity bit is used to guarantee that all codewords have an even number of 1's
  – Single errors are detected by means of a parity check
  Example: the irredundant codeword 0010 becomes 00101
    00101: parity check gives 0
    01101: parity check gives 1 (error)
• Triple redundancy:
  – Each character is repeated 3 times
  – Single errors are corrected by means of majority voting
  Example: the irredundant codeword 0010 becomes 000000111000
    000000111010 (error): voting result 0 0 1 0
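Both schemes can be reproduced with a short Python sketch that works on strings of '0' and '1' characters. The function names are hypothetical, and the parity bit is assumed to be appended at the end of the codeword, as in the example 0010 → 00101.

```python
def add_parity(bits):
    """Append a parity bit so that the codeword has an even number of 1's."""
    return bits + str(bits.count("1") % 2)

def parity_check(codeword):
    """Return 0 if the number of 1's is even (no error detected), 1 otherwise."""
    return codeword.count("1") % 2

def triplicate(bits):
    """Repeat each character 3 times."""
    return "".join(c * 3 for c in bits)

def majority_vote(codeword):
    """Decode a triplicated codeword, correcting one error per group of 3."""
    groups = [codeword[i:i + 3] for i in range(0, len(codeword), 3)]
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)

print(add_parity("0010"))             # 00101
print(parity_check("00101"))          # 0
print(parity_check("01101"))          # 1 -> error detected
print(triplicate("0010"))             # 000000111000
print(majority_vote("000000111010"))  # 0010
```

Note that the parity check only detects the error, while majority voting also recovers the original codeword, at the cost of three times as many digits.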
Compression
• Lossy compression
  – Compression achieved at the cost of reducing the accuracy of the representation
  – The original representation cannot be restored
• Lossless compression
  – Compression achieved by either removing redundancy or leveraging content-specific opportunities
  – The original representation can be restored
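The difference between the two kinds of compression can be illustrated in Python. This sketch uses the standard `zlib` module as one example of a lossless compressor and a simple reduction from 8 to 4 bits per sample as one example of a lossy scheme; both choices and the sample data are illustrative assumptions.

```python
import zlib

data = b"0010" * 1000                 # highly redundant content

# Lossless: the original bytes are restored exactly.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), len(packed), restored == data)

# Lossy: keep only the 4 most significant bits of each 8-bit sample.
samples = [200, 201, 202, 37]
reduced = [s >> 4 for s in samples]   # 4 bits per sample instead of 8
approx = [r << 4 for r in reduced]    # best-effort reconstruction
print(reduced, approx)                # [12, 12, 12, 2] [192, 192, 192, 32]
```

After the lossy step the samples 200, 201 and 202 are indistinguishable, so the original values cannot be restored.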
2. Probability theory
Random process
• A random process:
  – Can be repeated infinitely many times
  – May provide mutually exclusive results
  – Provides an unpredictable result at each trial
• Elementary event (e): each possible result of a single trial
• Event space (X): the set of all elementary events
• Event (E): any set of elementary events (any subset of the event space)
[Figure: event space X containing an event E and an elementary event e]
Probability
• Relative frequency of E over N trials: ratio of the number of occurrences of E ($n_E$) to the number of trials N:
  $f_N(E) = n_E / N$
• Probability of E (empirical definition):
  $p(E) = \lim_{N \to \infty} f_N(E)$
• Probability of E (axiomatic definition):
  p(E) = 0 if E is empty
  p(E) = 1 if E = X
  $p(E_1 \cup E_2) = p(E_1) + p(E_2)$ if $E_1 \cap E_2 = \emptyset$
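The empirical definition can be explored with a small simulation. The Python sketch below assumes a fair six-sided die as the random process and the event "the result is even"; the function name `relative_frequency` and the fixed seed are illustrative choices.

```python
import random

def relative_frequency(event, trial, n, rng):
    """f_N(E) = n_E / N over n independent trials."""
    n_e = sum(1 for _ in range(n) if trial(rng) in event)
    return n_e / n

rng = random.Random(0)                     # fixed seed for repeatability
die = lambda r: r.randint(1, 6)            # one trial: roll a fair die
even = {2, 4, 6}                           # event E, with p(E) = 1/2

for n in (10, 1000, 100000):
    print(n, relative_frequency(even, die, n, rng))
```

As N grows, the printed relative frequency is expected to settle near 0.5, although any single run only approximates the limit.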
Probability (properties)
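1. If all elementary events have the same probability, the probability of an event is given by its "relative size":
   p(E) = card(E)/card(X)
2. $p(E_1 \cap \bar{E}_2) = p(E_1) - p(E_1 \cap E_2)$
3. If $E_1$ is a subset of $E_2$: $p(E_1) \le p(E_2)$
4. $p(\bar{E}) = 1 - p(E)$
5. $p(E_1 \cup E_2) = p(E_1) + p(E_2) - p(E_1 \cap E_2)$
[Figure: Venn diagram of the events E1 and E2 within the event space X]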
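These properties can be verified on a small example. The Python sketch below assumes a fair die, so that all elementary events are equiprobable and property 1 applies; the events E1 and E2 are arbitrary illustrative choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

X = set(range(1, 7))                  # event space of a fair die

def p(event):
    """p(E) = card(E) / card(X), valid when elementary events are equiprobable."""
    return Fraction(len(event & X), len(X))

E1, E2 = {1, 2, 3}, {2, 4, 6}

assert p(E1 - E2) == p(E1) - p(E1 & E2)          # E1 and not E2
assert p({2}) <= p(E2)                           # {2} is a subset of E2
assert p(X - E1) == 1 - p(E1)                    # complement
assert p(E1 | E2) == p(E1) + p(E2) - p(E1 & E2)  # union
print(p(E1 | E2))  # 5/6
```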
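Conditional probability
• Joint probability of two events, outcomes of two random experiments
  $p(E_1 \cap E_2) = \lim_{N \to \infty} \frac{n_{E_1 E_2}}{N}$
• Marginal probability
  $p(E_1) = p(E_1 \cap E_2) + p(E_1 \cap \bar{E}_2)$
• Conditional probability
  $p(E_1 | E_2) = \frac{p(E_1 \cap E_2)}{p(E_2)}$
• Decomposition
  $p(E_1) = p(E_2)\, p(E_1 | E_2) + p(\bar{E}_2)\, p(E_1 | \bar{E}_2)$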
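The relations between joint, marginal and conditional probability can be checked numerically. In the Python sketch below the joint probabilities are made-up values chosen only for illustration, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities p(E1 and E2) for the four combinations of
# E1 / not E1 and E2 / not E2 (keys are (E1 occurs, E2 occurs)).
joint = {
    (True, True): F(3, 10), (True, False): F(1, 10),
    (False, True): F(2, 10), (False, False): F(4, 10),
}

def p_e1(e1):
    """Marginal probability of E1 (or of its complement when e1 is False)."""
    return joint[(e1, True)] + joint[(e1, False)]

def p_e2(e2):
    """Marginal probability of E2 (or of its complement when e2 is False)."""
    return joint[(True, e2)] + joint[(False, e2)]

def p_e1_given_e2(e1, e2):
    """Conditional probability p(E1 | E2) = p(E1 and E2) / p(E2)."""
    return joint[(e1, e2)] / p_e2(e2)

print(p_e1(True))                 # 2/5
print(p_e1_given_e2(True, True))  # 3/5
# Decomposition: p(E1) = p(E2) p(E1|E2) + p(not E2) p(E1|not E2)
total = (p_e2(True) * p_e1_given_e2(True, True)
         + p_e2(False) * p_e1_given_e2(True, False))
print(total == p_e1(True))        # True
```

In this example p(E1|E2) = 3/5 differs from p(E1) = 2/5, so observing E2 changes the probability of E1.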
Independent events
• The joint probability of two independent events is equal to the product of their marginal probabilities
  $p(E_1 \cap E_2) = p(E_1)\, p(E_2)$
• $E_1$ and $E_2$ are independent events if and only if
  $p(E_1 | E_2) = p(E_1)$
Bayes Theorem
• Bayes theorem: given two events E1 and E2, the conditional probability of E1 given E2 can be expressed as:
  $p(E_1 | E_2) = \frac{p(E_2 | E_1)\, p(E_1)}{p(E_2)}$
• The theorem provides the basis for statistical diagnosis: it gives the probability of a possible cause (E1) given an observed effect (E2)
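A diagnosis-style use of the theorem can be sketched in Python. All probabilities below are made-up values for illustration, not data from the slides; the probability of the effect is obtained by decomposition over the cause and its complement.

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """p(E1|E2) = p(E2|E1) * p(E1) / p(E2), with E1 the cause and E2 the effect."""
    return p_effect_given_cause * p_cause / p_effect

# Illustrative numbers (assumptions, not from the slides):
p_cause = 0.01                  # p(E1): prior probability of the cause
p_effect_given_cause = 0.95     # p(E2|E1)
p_effect_given_no_cause = 0.05  # p(E2|not E1)

# p(E2) by decomposition over E1 and its complement
p_effect = (p_effect_given_cause * p_cause
            + p_effect_given_no_cause * (1 - p_cause))

print(round(bayes(p_effect_given_cause, p_cause, p_effect), 3))  # 0.161
```

With these numbers, a rare cause remains fairly unlikely even after the effect is observed, because the effect also occurs often without the cause.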
Random variable
• Random variable x: variable representing the outcome of a random experiment
• Probability distribution function of x: $F_x(\xi) = p(x \le \xi)$
• Probability density function of x:
  $f_x(a) = \left. \frac{dF_x(\xi)}{d\xi} \right|_{\xi = a}$
Sampling and estimation
• Parent population of a random variable x: ideal set of the outcomes of all possible trials of a random process
• Sample of x: set of the outcomes of N trials of the random process
• Sample parameters can be used as estimators of parameters of the parent population
• Example:
  Expected value of x: $E[x] = \sum_{e_i \in X} e_i \, p(e_i)$
  Sample average: $\bar{x} = \frac{1}{N} \sum_{j=1}^{N} x_j$
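The relation between the expected value and its estimator can be sketched in Python, assuming a fair die as the random process. The function names, the sample size and the fixed seed are illustrative choices.

```python
import random
from fractions import Fraction

def expected_value(distribution):
    """E[x] = sum of e_i * p(e_i) over all elementary events e_i."""
    return sum(e * p for e, p in distribution.items())

def sample_average(samples):
    """Sample average: (1/N) * sum of x_j."""
    return sum(samples) / len(samples)

die = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die
print(expected_value(die))                            # 7/2

rng = random.Random(0)                                # fixed seed for repeatability
samples = [rng.randint(1, 6) for _ in range(10000)]
print(sample_average(samples))                        # close to 3.5
```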
Confidence of an estimator
• The quality of the estimator P' of a given parameter P can be expressed in terms of:
  – Confidence interval: limiting distance d between the estimator and the actual parameter
  – Confidence level: probability c of finding the actual parameter within the confidence interval
    $c = \mathrm{Prob}(|P - P'| \le d)$
• The smaller the confidence interval d and the higher the confidence level c, the better the estimator
• The quality of an estimator grows with the number of samples N
• For a fixed confidence level c, the size of the confidence interval d is inversely proportional to the square root of N
Variance, covariance, correlation
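• Variance: $\sigma^2 = \frac{1}{N} \sum_{j=1}^{N} (x_j - \bar{x})^2$
• Standard deviation: $\sigma = \sqrt{\sigma^2}$
• Covariance: $Cov(x, y) = \frac{1}{N} \sum_{j=1}^{N} (x_j - \bar{x})(y_j - \bar{y})$
• Correlation: $Corr(x, y) = \frac{Cov(x, y)}{\sigma_x \sigma_y}$
• The confidence interval of an estimator is proportional to $\sigma / \sqrt{N}$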
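These sample statistics translate directly into code. The Python sketch below follows the 1/N normalization used in the formulas of this slide; the function names and the data are illustrative.

```python
from math import sqrt

def mean(xs):
    """Sample average of a list of numbers."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Cov(x, y) = (1/N) * sum of (x_j - mean_x) * (y_j - mean_y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """Variance is the covariance of x with itself."""
    return covariance(xs, xs)

def correlation(xs, ys):
    """Corr(x, y) = Cov(x, y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / sqrt(variance(xs) * variance(ys))

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # y = 2x, perfectly correlated with x
print(variance(x))         # 1.25
print(covariance(x, y))    # 2.5
print(correlation(x, y))   # 1.0
```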
3. Algorithms
Algorithm
• Definition: finite description of a finite sequence of unambiguous instructions that can be executed in finite time to solve a problem or provide a result
• Key properties:
  – Finite description
  – Non-ambiguity
  – Finite execution
• Algorithms take input data and provide output data
• Domain: set of all allowed configurations of the input data
Complexity
• Complexity: measure of the number of elementary steps required by the algorithm to solve a problem
• The number of execution steps usually depends on the configuration of the input data (i.e., on the instance of the problem)
• The complexity of an algorithm is usually expressed as a function of the size n of its input data, retaining the type of behavior while neglecting additive and multiplicative constants
• Example: $O(n)$, $O(n^2)$, $O(2^n)$
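The growth of the step count with the input size can be observed by counting steps explicitly. The Python sketch below contrasts a single pass over the input with a comparison of every pair of items; the two procedures are illustrative stand-ins, not algorithms from the slides.

```python
def linear_steps(items):
    """Count elementary steps of a single pass over the input: O(n)."""
    steps = 0
    for _ in items:
        steps += 1
    return steps

def pairwise_steps(items):
    """Count elementary steps when every pair of items is compared: O(n^2)."""
    steps = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            steps += 1
    return steps

for n in (10, 100, 1000):
    data = list(range(n))
    print(n, linear_steps(data), pairwise_steps(data))
# 10 10 45
# 100 100 4950
# 1000 1000 499500
```

The pairwise count is exactly n(n-1)/2, which is reported as O(n^2) once the constant factor and the lower-order term are neglected.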
Equivalence
• Two algorithms are said to be equivalent if:
  – they are defined on the same domain
  – they provide the same result at all points of the domain
• In general, there are many equivalent algorithms that solve the same problem, possibly with different complexity
• Complexity is a property of an algorithm, not an inherent property of the problem
• The complexity of the most efficient known algorithm that solves a given problem is commonly considered to be the complexity of the problem
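As one illustration of equivalent algorithms with different complexity, the Python sketch below computes the sum of the integers from 1 to n in two ways. The example and function names are assumptions chosen for brevity; checking a finite range of inputs only samples the domain and does not prove equivalence.

```python
def sum_loop(n):
    """Sum of the integers 1..n by repeated addition: n steps."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    """Same result from the closed form n(n+1)/2: a constant number of steps."""
    return n * (n + 1) // 2

# Same domain (non-negative integers) and same result at every point checked
print(all(sum_loop(n) == sum_formula(n) for n in range(200)))  # True
```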