1_introduction to Statistics_June-22, 2011 [Compatibility Mode]


  • 8/3/2019 1_introduction to Statistics_June-22, 2011 [Compatibility Mode]


    Quantitative Methods and Business Statistics for Decision Making (MSA606)

    Ramesh [email protected]

    Department of Mechanical Engineering, NIT Calicut, Kerala, India - 673 601.

    Dedicated to Professor S. G. Deshmukh

    Objectives of this course

    Appreciate the role of statistics in various decision-making situations.
    Summarize data with frequency distributions and graphic presentation.
    Interpret descriptive statistics for central tendency, dispersion and location.
    Define and interpret probability.
    Utilize discrete and continuous probability distributions to determine probabilities in various managerial applications.
    Apply the central limit theorem to determine probabilities of sample means, and compute and interpret point and interval estimates.
    Conduct hypothesis tests for means.
    Utilize linear regression to estimate and predict variables.
    Understand the basic concepts of design of experiments.
    Understand the importance of non-parametric tests.

    Lab/tutorial

    The laboratory content requires prior experience of working with Excel. There will be quizzes/assignments every week. The lab assignments are to be submitted on the same day. Students will also be required to visit and consult useful web resources.

    Mode of Evaluation and Grades

    Grades are based on total points earned from tests 1 & 2, lab/tutorial/assignments, the mini-project and the end-semester examination.

    Component                                       Weight
    Test 1                                          30 %
    Article critique & presentation                 10 %
    End semester                                    40 %
    Lab/tutorial/quizzes/assignments (every week)   10 %
    Mini-project                                    10 %

    Reference

    Meyer PL, Introductory Probability and Statistical Applications, Oxford and IBH Publishers
    Miller IR, Freund JE, Johnson R, Probability and Statistics for Engineers, Prentice-Hall (I) Ltd
    Walpole RE and Myers RH, Probability & Statistics for Engineers and Scientists, Macmillan
    Levin RI and Rubin DS, Statistics for Management, Pearson Education


    Statistics..

    Plays an important role in many facets of human endeavour
    Occurs remarkably frequently in our everyday lives
    Is often incorrectly thought of as just a collection of data, graphs and diagrams

    Statistics in Business

    Accounting: auditing and cost estimation
    Economics: regional, national, and international economic performance
    Finance: investments and portfolio management
    Management: human resources, compensation, and quality management
    Management Information Systems (ERP): performance of systems which gather, summarize, and disseminate information to various managerial levels
    Marketing: market analysis and consumer research
    International Business: market and demographic analysis

    What is Statistics?

    Science of gathering, analyzing, interpreting, and presenting data
    Branch of mathematics
    Facts and figures
    Measurement taken on a sample

    Statistics is the scientific method that enables us to make decisions as responsibly as possible.

    Statistics

    The science of data to answer research questions
    Formulate a research question(s) (hypothesis)
    Collect data
    Analyze and summarize data
    Draw conclusions to answer research questions

    Statistical inference: in the presence of variation

    Answers Questions from Everyday Life

    Business: Will a new marketing strategy be profitable?
    Industry: Will a product's life exceed the warranty period?
    Medicine: Will this year's flu vaccine reduce the chance of flu?
    Education: Will technology improve learning?
    Government: Will a change in interest rates affect inflation?

    Statistics: Science of variability..?

    Virtually everything varies
    Variation occurs among individuals
    Variation occurs within any one individual as time passes


    Can Statistics Be Trusted?

    There are three kinds of lies: lies, damned lies, and statistics. --Mark Twain

    It is easy to lie with statistics. But it is easier to lie without them. --Frederick Mosteller

    Figures won't lie, but liars will figure. --Charles Grosvenor

    Population Versus Sample

    Population: the whole
      a collection of persons, objects, or items under study
      the entire group of individuals in a statistical study we want information about
    Census: gathering data from the entire population
    Sample: a portion of the whole
      a subset of the population
      a part of the population from which we actually collect information, used to draw conclusions about the whole (statistical inference)

    Statistics can be split into two broad categories

    1. Descriptive statistics

    2. Statistical inference

    Descriptive Statistics

    Collect data (ex. survey)
    Present data (ex. tables and graphs)
    Characterize data (ex. sample mean)

    $$\bar{X} = \frac{\sum X_i}{n}$$
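    As a minimal sketch of the sample mean formula, the Python below averages a small made-up sample. The weights are hypothetical values chosen for illustration and are not course data.

```python
# Sample mean: add up the observations and divide by the sample size n.
# The weights below are hypothetical values used only for illustration.
weights = [118.0, 121.5, 119.0, 123.5, 120.0]

n = len(weights)
sample_mean = sum(weights) / n

print(f"n = {n}, sample mean = {sample_mean:.1f}")
```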

    Descriptive statistics..

    Encompasses the following:
    Graphical or pictorial display
    Condensation of large masses of data into a form such as tables
    Preparation of summary measures to give a concise description of complex information (e.g. an average figure)
    Exhibition of patterns that may be found in sets of information

    Inferential Statistics

    Estimation
      ex. Estimate the population mean weight using the sample mean weight
    Hypothesis testing
      ex. Test the claim that the population mean weight is 120 pounds

    Drawing conclusions and/or making decisions concerning a population based on sample results.


    Inferential Statistics..

    Especially relates to:
    Determining whether characteristics of a situation are unusual or if they have happened by chance
    Estimating values of numerical quantities and determining the reliability of those estimates
    Using past occurrences to attempt to predict the future

    Process of Inferential Statistics

    [Diagram: select a random sample from the population (parameter $\mu$); calculate the sample statistic $\bar{x}$ to estimate $\mu$.]

    Population vs. Sample

    Measures used to describe the population are called parameters
    Measures computed from sample data are called statistics

    Parameter vs. Statistic

    Parameter: descriptive measure of the population
      Usually represented by Greek letters
    Statistic: descriptive measure of a sample
      Usually represented by Roman letters

    Symbols for Population Parameters

    $\mu$ denotes population mean
    $\sigma^2$ denotes population variance
    $\sigma$ denotes population standard deviation

    Symbols for Sample Statistics

    $\bar{x}$ denotes sample mean
    $S^2$ denotes sample variance
    $S$ denotes sample standard deviation


    Types of Variables

    Categorical (qualitative) variables have values that can only be placed into categories, such as yes and no.

    Numerical (quantitative) variables have values that represent quantities.

    Types of Variables

    Data
      Categorical (defined categories)
        Examples: Marital Status, Political Party, Eye Color
      Numerical
        Discrete (counted items)
          Examples: Number of Children, Defects per hour
        Continuous (measured characteristics)
          Examples: Weight, Voltage

    Levels of Data Measurement

    Nominal: lowest level of measurement
    Ordinal
    Interval
    Ratio: highest level of measurement

    Levels of Measurement

    A nominal scale classifies data into distinct categories in which no ranking is implied.

    Categorical Variables          Categories
    Personal Computer Ownership    Yes / No
    Type of Stocks Owned           Growth / Value / Other
    Internet Provider              Microsoft Network / AOL

    Levels of Measurement

    An ordinal scale classifies data into distinct categories in which ranking is implied.

    Categorical Variable             Ordered Categories
    Student class designation        Freshman, Sophomore, Junior, Senior
    Product satisfaction             Satisfied, Neutral, Unsatisfied
    Faculty rank                     Professor, Associate Professor, Assistant Professor, Instructor
    Standard & Poor's bond ratings   AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
    Student grades                   A, B, C, D, F

    Levels of Measurement

    An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.

    A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.
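    To make the interval/ratio distinction concrete, here is a small sketch using temperature. Temperature is an illustration chosen here, not an example from the slides: Celsius has no true zero (interval scale), while Kelvin does (ratio scale).

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
# Kelvin has a true zero point, so it behaves as a ratio scale.
def celsius_to_kelvin(c):
    return c + 273.15

t1_c, t2_c = 10.0, 20.0

# A difference of 10 degrees is the same size on both scales.
difference_c = t2_c - t1_c
difference_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# The Celsius ratio is 2.0, but 20 C is not "twice as hot" as 10 C;
# the Kelvin ratio shows the physically meaningful comparison.
ratio_c = t2_c / t1_c
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(difference_c, round(difference_k, 2))
print(ratio_c, round(ratio_k, 3))
```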


    Interval and Ratio Scales

    Usage Potential of Various Levels of Data

    [Diagram: the four levels of data, labelled Nominal, Ordinal, Interval and Ratio.]

    Data Level, Operations, and Statistical Methods

    Data Level   Meaningful Operations                               Statistical Methods
    Nominal      Classifying and counting                            Nonparametric
    Ordinal      All of the above plus ranking                       Nonparametric
    Interval     All of the above plus addition, subtraction         Parametric
    Ratio        All of the above plus multiplication and division   Parametric

    Data preparation rules

    Data presented must be:
      factual
      relevant

    Before presentation always check:
      the source of the data
      that the data has been accurately transcribed
      that the figures are relevant to the problem

    Methods of visual presentation of data

    Table

             1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
    East     20.4      27.4      90        20.4
    West     30.6      38.6      34.6      31.6
    North    45.9      46.9      45        43.9

    Methods of visual presentation of data

    Graphs

    [Figure: chart of the East, West and North values by quarter (1st Qtr to 4th Qtr); vertical axis from 0 to 90.]


    Methods of visual presentation of data

    Pie chart

    [Figure: pie chart with one slice per quarter (1st Qtr to 4th Qtr).]

    Methods of visual presentation of data

    Multiple bar chart

    [Figure: bar chart of East, West and North by quarter (1st Qtr to 4th Qtr); value axis from 0 to 100.]

    Methods of visual presentation of data

    Simple pictogram

    [Figure: pictogram of East, North and West by quarter (1st Qtr to 4th Qtr); value axis from 0 to 100.]

    Frequency distributions

    Frequency tables

    Class Interval   Frequency   Cumulative Frequency
    < 20             13          13

    Example of Ungrouped Data

    Ages of a Sample of Managers from XYZ

    42  30  53  50  52  30  55  49  61  74
    26  58  40  40  28  36  30  33  31  37
    32  37  30  32  23  32  58  43  30  29
    34  50  47  31  35  26  64  46  40  43
    57  30  49  40  25  50  52  32  60  54

    Frequency Distribution of Ages

    Class Interval   Frequency
    20-under 30      6
    30-under 40      18
    40-under 50      11
    50-under 60      11
    60-under 70      3
    70-under 80      1
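    The grouping of the 50 ages into classes can be sketched in Python as follows. This is one possible way to tally the classes, assuming the class width of 10 and the lower bound of 20 used in the table; the variable names are illustrative.

```python
from collections import Counter

# Ages of the 50 sampled managers, as listed on the ungrouped-data slide.
ages = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

CLASS_WIDTH = 10   # width used in the frequency table
LOWER_BOUND = 20   # lower endpoint of the first class

# Map each age to the lower endpoint of its class ("20-under 30" -> 20).
counts = Counter(
    LOWER_BOUND + ((age - LOWER_BOUND) // CLASS_WIDTH) * CLASS_WIDTH
    for age in ages
)

# One (lower, upper, frequency) row per class, in ascending order.
frequency_table = [
    (lo, lo + CLASS_WIDTH, counts[lo]) for lo in range(20, 80, CLASS_WIDTH)
]

for lo, hi, freq in frequency_table:
    print(f"{lo}-under {hi}: {freq}")
```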

    Data Range

    For the 50 ages in the sample, Smallest = 23 and Largest = 74.

    Range = Largest - Smallest
          = 74 - 23
          = 51

    Number of Classes and Class Width

    The number of classes should be between 5 and 15.
      Fewer than 5 classes cause excessive summarization.
      More than 15 classes leave too much detail.

    Class Width
      Divide the range by the number of classes for an approximate class width
      Round up to a convenient number

    $$\text{Approximate Class Width} = \frac{51}{6} = 8.5$$

    $$\text{Class Width} = 10$$
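    The range and class-width arithmetic can be sketched as below. The rounding step of 10 is an assumption made here to reproduce the slide's "convenient number"; the slides do not prescribe a specific rounding rule.

```python
import math

smallest, largest = 23, 74   # extremes of the ages sample
number_of_classes = 6        # the slide's choice, within the 5-15 guideline

data_range = largest - smallest
approximate_width = data_range / number_of_classes

# "Round up to a convenient number": here, up to the next multiple of `step`.
# step = 10 is an assumption that matches the final width on the slide.
step = 10
class_width = math.ceil(approximate_width / step) * step

print(data_range, approximate_width, class_width)
```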

    Class Midpoint

    $$\text{Class Midpoint} = \frac{\text{beginning class endpoint} + \text{ending class endpoint}}{2} = \frac{30 + 40}{2} = 35$$

    $$\text{Class Midpoint} = \text{class beginning point} + \frac{1}{2}(\text{class width}) = 30 + \frac{1}{2}(10) = 35$$

    Relative Frequency

    Class Interval   Frequency   Relative Frequency
    20-under 30      6           .12   (= 6/50)
    30-under 40      18          .36   (= 18/50)
    40-under 50      11          .22
    50-under 60      11          .22
    60-under 70      3           .06
    70-under 80      1           .02
    Total            50          1.00


    Cumulative Frequency

    Class Interval   Frequency   Cumulative Frequency
    20-under 30      6           6
    30-under 40      18          24   (= 18 + 6)
    40-under 50      11          35   (= 11 + 24)
    50-under 60      11          46
    60-under 70      3           49
    70-under 80      1           50
    Total            50

    Class Midpoints, Relative Frequencies, and Cumulative Frequencies

    Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
    20-under 30      6           25         .12                  6
    30-under 40      18          35         .36                  24
    40-under 50      11          45         .22                  35
    50-under 60      11          55         .22                  46
    60-under 70      3           65         .06                  49
    70-under 80      1           75         .02                  50
    Total            50                     1.00

    Cumulative Relative Frequencies

    Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
    20-under 30      6           .12                  6                      .12
    30-under 40      18          .36                  24                     .48
    40-under 50      11          .22                  35                     .70
    50-under 60      11          .22                  46                     .92
    60-under 70      3           .06                  49                     .98
    70-under 80      1           .02                  50                     1.00
    Total            50          1.00
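    As a sketch, every derived column in these tables (midpoint, relative frequency, cumulative frequency, cumulative relative frequency) can be computed from the class endpoints and frequencies alone. The variable names are illustrative.

```python
from itertools import accumulate

# Class endpoints and frequencies from the ages frequency distribution.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)

midpoints = [(lo + hi) / 2 for lo, hi in classes]       # (begin + end) / 2
relative = [f / total for f in frequencies]             # frequency / total
cumulative = list(accumulate(frequencies))              # running sum
cumulative_relative = [c / total for c in cumulative]   # running sum / total

rows = zip(classes, frequencies, midpoints, relative, cumulative, cumulative_relative)
for (lo, hi), f, m, r, c, cr in rows:
    print(f"{lo}-under {hi}  {f:>3}  {m:>4.0f}  {r:.2f}  {c:>3}  {cr:.2f}")
```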

    Common Statistical Graphs

    Histogram -- vertical bar chart of frequencies
    Frequency Polygon -- line graph of frequencies
    Ogive -- line graph of cumulative frequencies
    Pie Chart -- proportional representation for categories of a whole
    Stem and Leaf Plot
    Pareto Chart
    Scatter Plot

    Histogram

    Class Interval   Frequency
    20-under 30      6
    30-under 40      18
    40-under 50      11
    50-under 60      11
    60-under 70      3
    70-under 80      1

    [Figure: histogram of the frequencies; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20).]
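    Since the chart image is not reproduced in this transcript, a rough text rendering of the same histogram can be sketched as below, with one `#` per observation. This is only an illustration of the idea, not the plotting method used in the course (which relies on Excel).

```python
# Frequency table for the ages sample (class label, frequency).
frequency_table = [
    ("20-under 30", 6),
    ("30-under 40", 18),
    ("40-under 50", 11),
    ("50-under 60", 11),
    ("60-under 70", 3),
    ("70-under 80", 1),
]

# One text bar per class; bar length equals the class frequency.
bars = [f"{label:<12}{'#' * freq} ({freq})" for label, freq in frequency_table]

print("\n".join(bars))
```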

    Histogram Construction

    [Figure: histogram built from the same frequency table; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20).]

    Frequency Polygon

    Class Interval   Frequency
    20-under 30      6
    30-under 40      18
    40-under 50      11
    50-under 60      11
    60-under 70      3
    70-under 80      1

    [Figure: frequency polygon; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20).]

    Ogive

    Class Interval   Cumulative Frequency
    20-under 30      6
    30-under 40      24
    40-under 50      35
    50-under 60      46
    60-under 70      49
    70-under 80      50

    [Figure: ogive; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 60).]

    Relative Frequency Ogive

    Class Interval   Cumulative Relative Frequency
    20-under 30      .12
    30-under 40      .48
    40-under 50      .70
    50-under 60      .92
    60-under 70      .98
    70-under 80      1.00

    [Figure: relative frequency ogive; horizontal axis Years (0 to 80), vertical axis Cumulative Relative Frequency (0.00 to 1.00).]

    Complaints by Passengers

    COMPLAINT           NUMBER   PROPORTION   DEGREES
    Stations, etc.      28,000   .40          144.0
    Train Performance   14,700   .21          75.6
    Equipment           10,500   .15          54.0
    Personnel           9,800    .14          50.4
    Schedules, etc.     7,000    .10          36.0
    Total               70,000   1.00         360.0
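    Each slice angle in a pie chart is the category's proportion of the total multiplied by 360 degrees. A minimal sketch that recomputes the proportion and degrees columns from the complaint counts (the dictionary layout is an illustrative choice):

```python
# Number of complaints per category, from the Complaints by Passengers table.
complaints = {
    "Stations, etc.": 28_000,
    "Train Performance": 14_700,
    "Equipment": 10_500,
    "Personnel": 9_800,
    "Schedules, etc.": 7_000,
}

total = sum(complaints.values())

# Proportion of the whole, then the slice angle out of 360 degrees.
proportions = {name: count / total for name, count in complaints.items()}
degrees = {name: p * 360 for name, p in proportions.items()}

for name in complaints:
    print(f"{name:<18} {proportions[name]:.2f}  {degrees[name]:6.1f}")
```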

    Complaints by Passengers

    [Figure: pie chart. Stations, etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, etc. 10%.]

    Second Quarter Truck Production

    Company   2d Quarter Truck Production
    A         357,411
    B         354,936
    C         160,997
    D         34,099
    E         12,747
    Totals    920,190


    Second Quarter Truck Production

    [Figure: pie chart. A 39%; B 39%; C 17%; D 4%; E 1%.]

    Pie Chart Calculations for Company A

    Company   2d Quarter Truck Production   Proportion   Degrees
    A         357,411                       .388         140
    B         354,936                       .386         139
    C         160,997                       .175         63
    D         34,099                        .037         13
    E         12,747                        .014         5
    Totals    920,190                       1.000        360

    $$\frac{357{,}411}{920{,}190} = .388$$

    $$.388 \times 360 \approx 140$$

    Pareto Chart

    [Figure: Pareto chart with categories Poor Wiring, Short in Coil, Defective Plug and Other; left axis Frequency from 0 to 100, right axis percentage from 0% to 100%.]

    Scatter Plot

    Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
    5                              60
    15                             120
    9                              90
    15                             140
    7                              60

    [Figure: scatter plot; horizontal axis Registered Vehicles (0 to 20), vertical axis Gasoline Sales (0 to 200).]

    Principles of Excellent Graphs

    The graph should not distort the data.
    The graph should not contain unnecessary adornments (sometimes referred to as chart junk).
    The scale on the vertical axis should begin at zero.
    All axes should be properly labeled.
    The graph should contain a title.
    The simplest possible graph should be used for a given set of data.

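    Graphical Errors: Chart Junk

    Bad Presentation
    [Figure: decorated graphic titled Minimum Wage. 1960: $1.00; 1970: $1.60; 1980: $3.10; 1990: $3.80.]

    Good Presentation
    [Figure: plain chart titled Minimum Wage; horizontal axis 1960 to 1990, vertical axis $ from 0 to 4.]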


    Graphical Errors: Compressing the Vertical Axis

    Bad Presentation
    [Figure: Quarterly Sales, Q1 to Q4; vertical axis $ from 0 to 200.]

    Good Presentation
    [Figure: Quarterly Sales, Q1 to Q4; vertical axis $ from 0 to 50.]

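    Graphical Errors: No Zero Point on the Vertical Axis

    Bad Presentation
    [Figure: Monthly Sales, months J F M A M J; vertical axis $ marked 36, 39, 42, 45, with no zero point.]

    Good Presentations
    [Figure: Monthly Sales, months J F M A M J; vertical axis $ marked 0, 36, 39, 42, 45.]

    Graphing the first six months of sales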

    Thank You

    http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html

    http://www.ilir.uiuc.edu/courses/lir593/