Probability and Statistics
Source: mwfy.gsm.pku.edu.cn/miao_files/ProbStat/lecture1.pdf


1

Probability and Statistics

2

Teaching Assistants

• 高钰婧 (Gao Yujing)

• 高昊 (Gao Hao)

• 郑翔宇 (Zheng Xiangyu)

3

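• The class average on a math exam is 80 points, with a standard deviation of 10 points. What does this mean? If you scored 90, roughly where do you rank?

• Suppose you are a marketing manager. Given the statistics from an advertising campaign, can you evaluate the effect of the advertising?

• When you see a headline such as "Eating raw yogurt can help you live to 100", is the claim reasonable? How would you verify it?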
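The first question can be made concrete with a small sketch. It assumes the exam scores are approximately normally distributed, which the slide does not state, and it uses a hypothetical class size of 100 purely for illustration.

```python
from statistics import NormalDist

# Assumption (not stated on the slide): scores are roughly normal.
scores = NormalDist(mu=80, sigma=10)

share_below = scores.cdf(90)      # fraction of the class scoring below 90
share_above = 1 - share_below     # fraction of the class scoring above 90

class_size = 100                  # hypothetical class size, for illustration only
approx_rank = round(share_above * class_size) + 1

print(f"Share scoring above 90: {share_above:.3f}")
print(f"Approximate rank in a class of {class_size}: {approx_rank}")
```

Under this assumption a score of 90 sits one standard deviation above the mean, so roughly 16% of the class scores higher. With a differently shaped score distribution the answer could differ noticeably.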

4

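• The polio vaccine study of the 1950s.

The vaccine was tried on a large group of children: 200,000 children took part in the experiment. The researchers also used the same number of children as a control group. Children in the control group received only a placebo.

In the following "polio season", 138 children in the control group contracted the disease. In the vaccinated group, 56 children contracted polio. These counts are of course subject to randomness. Is the difference between 56 and 138 caused by random variation in the data, or did the vaccine work?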
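One way to probe this question is a rough sketch that goes beyond what the slide itself does. It assumes the two groups are the same size and that, if the vaccine had no effect, each of the 194 cases would be equally likely to fall in either group. The helper `binomial_lower_tail` is a name introduced here for illustration.

```python
from math import comb

def binomial_lower_tail(k, n, p=0.5):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

vaccine_cases, control_cases = 56, 138
total_cases = vaccine_cases + control_cases

# Chance of seeing 56 or fewer of the 194 cases in the vaccine group
# if the vaccine made no difference (assumed "no effect" model).
p_value = binomial_lower_tail(vaccine_cases, total_cases)
print(f"P(at most {vaccine_cases} of {total_cases} cases in the vaccine group) = {p_value:.2e}")
```

Under these assumptions the probability is far below one in a million, so chance alone is a poor explanation for the gap.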

5

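• It is plain for all to see that Chinese workers work hard, work long hours, and earn low wages. Diligence is recognized around the world as a Chinese virtue. But does this diligence translate into high efficiency for companies?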

6

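Fortune Global 500 companies (industry: petroleum refining)

Global 500 rank  Company                                      Headquarters    Revenue (USD millions)  Employees  Revenue per employee (USD millions)
1                Royal Dutch Shell                            Netherlands     484489.0                90,000     5.38
2                Exxon Mobil                                  United States   452926.0                99,100     4.57
4                BP                                           United Kingdom  386463.0                83,400     4.63
5                China Petrochemical Corporation (Sinopec)    China           375214.0                1,021,979  0.37
6                China National Petroleum Corporation (CNPC)  China           352338.0                1,668,072  0.21

Re-rank the Fortune Global 500 by average revenue per employee. The Netherlands ranks first, with 12 companies and an average revenue per employee of USD 13.8 million. China has 79 companies in the Global 500, with an average revenue per employee of USD 0.6 million, ranking 26th. The gap is a factor of 23.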
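The last column of the table and the quoted gap can be reproduced with a few lines of arithmetic. The figures below are copied from the slide; the English company names are the usual translations.

```python
# (company, revenue in USD millions, employees), figures copied from the slide
companies = [
    ("Royal Dutch Shell", 484489.0, 90_000),
    ("Exxon Mobil", 452926.0, 99_100),
    ("BP", 386463.0, 83_400),
    ("Sinopec", 375214.0, 1_021_979),
    ("CNPC", 352338.0, 1_668_072),
]

revenue_per_employee = {
    name: revenue / employees for name, revenue, employees in companies
}
for name, value in revenue_per_employee.items():
    print(f"{name}: {value:.2f} USD millions per employee")

# Country-level averages quoted on the slide (USD millions per employee)
netherlands_avg, china_avg = 13.8, 0.6
gap = netherlands_avg / china_avg
print(f"Gap between the Netherlands and China: about {gap:.0f}x")
```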

7

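Perhaps in the past, companies could make decisions off the top of their heads and ignore sound statistics, because they could still get by on following industry "conventions". Today, using statistical data effectively is a prerequisite for competing. More importantly, finding and fully exploiting statistical data before your competitors do will be the key to building your competitive advantage.

Management in Chinese companies is now shifting from an extensive style to a fine-grained one. Statistical analysis can deliver value, and companies that respect data and invest early will be better placed to gain the upper hand in future competition.

Source: Harvard Business Review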

8

• Textbook: Probability and Statistics, Revised Edition, by Xiangzhong Fang, Ligang Lu, and Dongfeng Li. Higher Education Press.

• Prerequisites: calculus, algebra

• Lecture notes, homework, and reading materials: http://mwfy.gsm.pku.edu.cn/

9

Classroom Rules

• Do not arrive late or leave early

• Take an active part in class discussion

• Turn off all electronic devices

10

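Course Assessment

• Weekly homework counts for 30% in total. Late submissions lose points.

• The midterm exam counts for 30%.

• The final exam counts for 40%.

• Cheating or plagiarism voids the corresponding grade.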

11

Introduction to Probability

12

Experiment (试验)

Suppose a coin is tossed once and the up face is recorded. The result we see and record is called an observation, or measurement, and the process of making an observation is called an experiment.

The point is that a statistical experiment can be almost any act of observation as long as the outcome is uncertain.

13

Experiment

Definition: An experiment is an act or process of observation that leads to a single outcome that cannot be predicted with certainty.

E.g., consider the experiment of counting the number of customers at a restaurant on a particular day. The basic possible outcomes of this experiment are 0, 1, 2, 3, …

14

Experiment

The features of an experiment are:

• each of its possible outcomes can be specified before the experiment is performed;

• one and only one of the possible experimental outcomes will occur;

• there is uncertainty associated with which one will occur.

15

Sample Point, Sample Space, Event

• A basic possible outcome of an experiment is called a sample point (样本点).

• The sample space (样本空间) for an experiment is the set of all sample points.

• An event (事件) is a specific collection of sample points.

16

Experiment                                                   Sample Space                Event
The number of customers at a restaurant on a particular day  {0, 1, 2, …}                The number of customers is at least 100
Select a part for inspection                                 {Defective, Nondefective}   Defective
The height of a randomly chosen student (cm)                 [120, 220]                  The height is no less than 175 cm

17

• A given event is said to have occurred if the outcome of the experiment is one of the outcomes in the event.

• E.g., toss a die: S = {1, 2, 3, 4, 5, 6}. Event A: the toss shows an even number, i.e. A = {2, 4, 6}. When we observe s = 2, we say that event A has occurred.
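The die example maps directly onto Python's built-in sets. This is a minimal sketch; the helper name `occurred` is introduced here for illustration.

```python
sample_space = {1, 2, 3, 4, 5, 6}
event_even = {2, 4, 6}    # event A: the toss shows an even number

def occurred(event, outcome):
    """An event has occurred if the observed outcome is one of its sample points."""
    return outcome in event

print(event_even <= sample_space)   # True: an event is a subset of the sample space
print(occurred(event_even, 2))      # True: observing s = 2 means A occurred
print(occurred(event_even, 5))      # False: observing s = 5 means A did not occur
```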

18

Probability as Numerical Measure of Uncertainty

Probability is a numerical measure of the likelihood that an event will occur.

Probabilities can be used as measures of the degree of uncertainty.

19

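Probability

• Relative frequency (objective probability):

When an experiment is repeated many times, the probability of an event is the proportion of repetitions in which the event occurs.

• Personal probability (subjective probability):

Most events in life cannot be replayed. A personal probability is an individual's subjective estimate of how likely an outcome is.

A die can be rolled again and again.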
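The relative-frequency reading can be illustrated by simulating repeated die rolls. This is only a sketch: it assumes a fair die, uses Python's pseudo-random generator in place of a physical die, and `relative_frequency` is a name introduced here.

```python
import random

def relative_frequency(event, n_rolls, rng):
    """Fraction of n_rolls simulated fair-die rolls whose outcome lies in event."""
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) in event)
    return hits / n_rolls

rng = random.Random(0)    # fixed seed so that the run is repeatable
even = {2, 4, 6}

for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(even, n, rng))
```

As the number of rolls grows, the printed proportions should settle near 1/2, the probability of an even number on a fair die.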

20

The KP&L Problem

• KP&L company is starting a project designed to increase the generating capacity of one of its plants. The project is divided into two sequential stages: stage 1 (design) and stage 2 (construction). Management cannot predict beforehand the exact time required to complete each stage of the project. An analysis of similar construction projects has shown completion times for the design stage of 2, 3, or 4 months and completion times for the construction stage of 6, 7, or 8 months.

• Because of the critical need for additional electrical power, management has set a goal of 10 months for the completion of the entire project, and management wants to know the probability that the project will be finished on time.

21

Sample Space for the KP&L Problem

There are (3)(3) = 9 outcomes in the sample space.

Completion Time (months)
Stage 1 (Design)  Stage 2 (Construction)  Experimental Outcome  Total Project Completion Time
2                 6                       (2,6)                 8
2                 7                       (2,7)                 9
2                 8                       (2,8)                 10
3                 6                       (3,6)                 9
3                 7                       (3,7)                 10
3                 8                       (3,8)                 11
4                 6                       (4,6)                 10
4                 7                       (4,7)                 11
4                 8                       (4,8)                 12
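The nine outcomes can be enumerated mechanically as the Cartesian product of the two stages. A minimal sketch, with variable names chosen here for illustration:

```python
from itertools import product

design_months = (2, 3, 4)          # possible stage 1 completion times
construction_months = (6, 7, 8)    # possible stage 2 completion times

# Each sample point is a pair (stage 1 time, stage 2 time).
sample_space = list(product(design_months, construction_months))
total_time = {outcome: sum(outcome) for outcome in sample_space}

print(len(sample_space), "outcomes")
for outcome, months in total_time.items():
    print(outcome, "->", months, "months")
```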

22

Probabilities

• Probability can be interpreted as the relative frequency of the occurrence of an event if the experiment is repeated a large number of times.

• The KP&L problem:

Completion Time (months)
Stage 1  Stage 2  Outcome  Total Time  Number of Past Projects With These Completion Times  Probability
2        6        (2,6)    8           6                                                    6/40 = .15
2        7        (2,7)    9           6                                                    6/40 = .15
2        8        (2,8)    10          2                                                    2/40 = .05
3        6        (3,6)    9           4                                                    4/40 = .10
3        7        (3,7)    10          8                                                    8/40 = .20
3        8        (3,8)    11          2                                                    2/40 = .05
4        6        (4,6)    10          2                                                    2/40 = .05
4        7        (4,7)    11          4                                                    4/40 = .10
4        8        (4,8)    12          6                                                    6/40 = .15
Total                                  40                                                   1.00

The probability of the project being finished on time (a total time of at most 10 months) is .15 + .15 + .05 + .10 + .20 + .05 = .70
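The on-time probability can be recomputed from the past-project counts. The sketch below uses exact fractions to avoid rounding; the dictionary simply restates the counts from the slide.

```python
from fractions import Fraction

# Number of past projects observed with each (stage 1, stage 2) outcome
past_projects = {
    (2, 6): 6, (2, 7): 6, (2, 8): 2,
    (3, 6): 4, (3, 7): 8, (3, 8): 2,
    (4, 6): 2, (4, 7): 4, (4, 8): 6,
}
total = sum(past_projects.values())

# Relative-frequency probability of each outcome
probability = {o: Fraction(count, total) for o, count in past_projects.items()}

# Event "finished on time": total completion time of at most 10 months
on_time = {o for o in past_projects if sum(o) <= 10}
p_on_time = sum(probability[o] for o in on_time)

print(sorted(on_time))
print(float(p_on_time))    # 0.7
```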

23

1.4 Set Theory

Probability Theory                           Set Theory
sample space of an experiment                the set of all possible outcomes S
  e.g. the number of customers at a            S = {0, 1, 2, 3, …}
  restaurant on a particular day
an outcome (a sample point)                  an element in the set S
  e.g. 100 customers                           100
an event                                     a subset of possible outcomes in S
  e.g. at least 100 customers                  A = {100, 101, 102, …}

24

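Relations of Set Theory

• An outcome $s$ is a member of $S$: $s \in S$

• Event $A$ is contained in event $B$, written $A \subset B$: every outcome that belongs to the subset defining $A$ also belongs to the subset defining $B$

  $A \subset B,\ B \subset A \Rightarrow A = B$

  $A \subset B,\ B \subset C \Rightarrow A \subset C$

• Empty set $\emptyset$: the subset that contains no outcomes

  $\emptyset \subset A \subset S$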
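These relations can be checked on a small example with Python sets, where `<=` tests containment. The sketch uses a finite stand-in for the restaurant sample space (capped at 200 customers, an assumption made only so that the set can be built), and the events A, B, C are hypothetical.

```python
S = set(range(0, 201))            # finite stand-in for {0, 1, 2, ...}; the cap is an assumption
A = {n for n in S if n >= 150}    # at least 150 customers
B = {n for n in S if n >= 100}    # at least 100 customers
C = {n for n in S if n >= 50}     # at least 50 customers
EMPTY = set()

relation_checks = {
    "100 is a member of S": 100 in S,
    "A is contained in B": A <= B,
    "containment both ways means equality": (A <= B and B <= A) == (A == B),
    "A in B and B in C imply A in C": (not (A <= B and B <= C)) or A <= C,
    "empty set is in A, and A is in S": EMPTY <= A <= S,
}
for name, holds in relation_checks.items():
    print(f"{name}: {holds}")
```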

25

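Operations of Set Theory

• Unions. The union of A and B is defined to be the event containing all outcomes that belong to either A or B, or the event that either A or B would occur.

e.g. $A = \{1, 2, \ldots, 100\},\ B = \{51, 52, \ldots, 150\} \Rightarrow A \cup B = \{1, 2, \ldots, 150\}$

$A \cup B = B \cup A,\quad A \cup A = A,\quad A \cup \emptyset = A,\quad A \cup S = S,\quad A \subset B \Rightarrow A \cup B = B$

$A \cup B \cup C = (A \cup B) \cup C = A \cup (B \cup C)$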
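The union identities can be spot-checked with Python's `|` operator. The sets A and B are the ones from the example; C and S are hypothetical additions needed for the associativity and sample-space identities. A check on one example illustrates the identities but does not prove them.

```python
A = set(range(1, 101))      # {1, 2, ..., 100}
B = set(range(51, 151))     # {51, 52, ..., 150}
C = set(range(140, 161))    # hypothetical third event
S = set(range(1, 201))      # assumed sample space containing A, B and C
EMPTY = set()

union_checks = {
    "example: A | B is {1, ..., 150}": A | B == set(range(1, 151)),
    "commutative": A | B == B | A,
    "A | A == A": A | A == A,
    "A | empty == A": A | EMPTY == A,
    "A | S == S": A | S == S,
    "subset absorbed by superset": {1, 2} | A == A,    # {1, 2} is contained in A
    "associative": (A | B) | C == A | (B | C),
}
for name, holds in union_checks.items():
    print(f"{name}: {holds}")
```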

26

Some Notation

• $\bigcup_{i=1}^{n} A_i$: the union of n events is defined to be the event which contains all outcomes that belong to at least one of these n events, or the event that at least one of these n events would occur.

• $\bigcup_{i=1}^{\infty} A_i$: the union of an infinite sequence of events.

27

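• Intersections. The intersection of A and B is defined to be the event containing all outcomes that belong both to A and B, or the event that both A and B would occur.

e.g. $A = \{1, 2, \ldots, 100\},\ B = \{51, 52, \ldots, 150\} \Rightarrow A \cap B = \{51, 52, \ldots, 100\}$

$A \cap B = B \cap A,\quad A \cap A = A,\quad A \cap \emptyset = \emptyset,\quad A \cap S = A,\quad A \subset B \Rightarrow A \cap B = A$

$A \cap B \cap C = (A \cap B) \cap C = A \cap (B \cap C)$

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$

$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$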
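The intersection identities, including the two distributive laws, can be spot-checked with Python's `&` operator. A and B come from the example; C and S are hypothetical additions. As before, this illustrates the identities on one example and is not a proof.

```python
A = set(range(1, 101))      # {1, 2, ..., 100}
B = set(range(51, 151))     # {51, 52, ..., 150}
C = set(range(90, 121))     # hypothetical third event
S = set(range(1, 201))      # assumed sample space containing A, B and C
EMPTY = set()

intersection_checks = {
    "example: A & B is {51, ..., 100}": A & B == set(range(51, 101)),
    "commutative": A & B == B & A,
    "A & A == A": A & A == A,
    "A & empty == empty": A & EMPTY == EMPTY,
    "A & S == A": A & S == A,
    "associative": (A & B) & C == A & (B & C),
    "intersection distributes over union": A & (B | C) == (A & B) | (A & C),
    "union distributes over intersection": A | (B & C) == (A | B) & (A | C),
}
for name, holds in intersection_checks.items():
    print(f"{name}: {holds}")
```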

28

Some Notation

29

• Complements. The complement of A is defined to be the event containing all outcomes in S that do not belong to A, or the event that event A would not occur.

e.g. $S = \{1, 2, 3, 4, 5, 6\},\ A = \{2, 4, 6\} \Rightarrow A^c = \{1, 3, 5\}$

$(A^c)^c = A,\quad \emptyset^c = S,\quad S^c = \emptyset,\quad A \cup A^c = S,\quad A \cap A^c = \emptyset$
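Python has no built-in complement operator, because a complement only makes sense relative to a sample space. The sketch below therefore defines a small helper, `complement`, a name introduced here, and checks the identities on the die example.

```python
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
EMPTY = set()

def complement(event, sample_space):
    """Outcomes of the sample space that do not belong to the event."""
    return sample_space - event

A_c = complement(A, S)
print(sorted(A_c))    # [1, 3, 5]

complement_checks = {
    "complement of the complement is A": complement(A_c, S) == A,
    "complement of the empty set is S": complement(EMPTY, S) == S,
    "complement of S is the empty set": complement(S, S) == EMPTY,
    "A union its complement is S": A | A_c == S,
    "A and its complement share no outcomes": A & A_c == EMPTY,
}
for name, holds in complement_checks.items():
    print(f"{name}: {holds}")
```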

30

• Disjoint events (不相交事件, 互斥事件). A and B are disjoint, or mutually exclusive, if they have no outcomes in common: $A \cap B = \emptyset$

• More generally, a collection of events $A_1, A_2, \ldots, A_n$ is said to be disjoint/mutually exclusive if no two of them have any outcomes in common.
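For a collection of events, "disjoint" means pairwise disjoint, which is stronger than the overall intersection being empty. A minimal sketch, with `are_disjoint` as a hypothetical helper name:

```python
from itertools import combinations

def are_disjoint(*events):
    """True if no two of the given events have an outcome in common."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))

print(are_disjoint({1, 2}, {3}, {4, 5}))         # True
print(are_disjoint({1, 2}, {2, 3}, {1, 3}))      # False: each pair overlaps

# The second collection has an empty overall intersection, yet it is not disjoint.
print({1, 2} & {2, 3} & {1, 3} == set())         # True
```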

31

1.5 Definition of Probability

• Definition: A probability measure, or simply a probability, on a sample space S is a specification of numbers Pr(A) for all events A that satisfy Axioms 1, 2, and 3.
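The transcript refers to Axioms 1, 2, and 3 without reproducing them. In the standard Kolmogorov formulation they read as follows; the numbering is assumed to match the course textbook.

```latex
% Axiom 1 (non-negativity): for every event A,
\Pr(A) \ge 0
% Axiom 2 (normalization): for the sample space S,
\Pr(S) = 1
% Axiom 3 (countable additivity): for every infinite sequence
% of disjoint events A_1, A_2, \ldots,
\Pr\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} \Pr(A_i)
```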

32

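Comment

• In 1933 the great Soviet mathematician Kolmogorov formulated this axiomatic system of probability theory. It does not define what probability is. It only lays down the rules that calculations with probability must obey. This sidesteps the difficult question of how to define probability, and on this foundation the grand edifice of probability theory was built.

• Today, whether one takes the objective or the subjective view, discussions of probability largely follow Kolmogorov's axioms. This has a great advantage: however much people differ on the nature of probability, everyone follows the same accepted rules when calculating and reasoning, rather than each going their own way.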

33

• Theorem 1.5.1 $\Pr(\emptyset) = 0$

34

• Theorem 1.5.2 For any finite sequence of disjoint events $A_1, \ldots, A_n$,

$\Pr\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \Pr(A_i)$

35

• Theorem 1.5.3 For any event A, $\Pr(A^c) = 1 - \Pr(A)$

36

• Theorem 1.5.4 For any event A, $0 \le \Pr(A) \le 1$

37

• Theorem 1.5.5 If $A \subset B$, then $\Pr(A) \le \Pr(B)$

38

• Theorem 1.5.6 For any two events A and B, $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(AB)$
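Theorems 1.5.1 to 1.5.6 can be spot-checked on a concrete probability measure. The sketch below assumes a fair six-sided die, so that every outcome has probability 1/6. The function `pr` and the events A and B are illustrative choices, and passing these checks on one example is not a proof.

```python
from fractions import Fraction

S = frozenset(range(1, 7))    # sample space of one die roll

def pr(event):
    """Probability of an event under the assumed equally likely model on S."""
    return Fraction(len(set(event) & S), len(S))

A = {2, 4, 6}    # even number
B = {4, 5, 6}    # at least 4

theorem_checks = {
    "1.5.1": pr(set()) == 0,
    "1.5.2": pr({1} | {2} | {3}) == pr({1}) + pr({2}) + pr({3}),
    "1.5.3": pr(S - A) == 1 - pr(A),
    "1.5.4": 0 <= pr(A) <= 1,
    "1.5.5": pr({4, 6}) <= pr(A),    # {4, 6} is contained in A
    "1.5.6": pr(A | B) == pr(A) + pr(B) - pr(A & B),
}
for theorem, holds in theorem_checks.items():
    print(f"Theorem {theorem}: {holds}")
```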

39

Example: Diagnosing Diseases

• A patient arrives at a doctor's office with a sore throat and low-grade fever. After an exam, the doctor decides that the patient has either a bacterial infection, or a viral infection, or both.

• The doctor decides that there is a probability of 0.7 that the patient has a bacterial infection and a probability of 0.4 that the person has a viral infection.

• What is the probability that the patient has both infections?

40

• B: the event that the patient has a bacterial infection.

• V: the event that the patient has a viral infection.

• Known: $\Pr(B) = 0.7$, $\Pr(V) = 0.4$, $S = B \cup V$

• Question: $\Pr(BV) = ?$

• Solution:

$\Pr(B \cup V) = \Pr(B) + \Pr(V) - \Pr(BV)$

$\Rightarrow 1 = 0.7 + 0.4 - \Pr(BV)$

$\Rightarrow \Pr(BV) = 0.1$
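The same rearrangement of Theorem 1.5.6 can be written as a tiny helper. `prob_both` is a name introduced here for illustration, and exact fractions are used so that the result is not affected by floating-point rounding.

```python
from fractions import Fraction

def prob_both(p_b, p_v, p_union=Fraction(1)):
    """Solve Pr(B or V) = Pr(B) + Pr(V) - Pr(BV) for Pr(BV)."""
    return p_b + p_v - p_union

# The doctor is certain that at least one infection is present, so Pr(B or V) = 1.
p_bv = prob_both(Fraction(7, 10), Fraction(4, 10))
print(float(p_bv))    # 0.1
```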

41

Homework

• See http://mwfy.gsm.pku.edu.cn/

42

See you next time!