Frequency Distributions Quantitative Methods in HPELS 440:210.

Post on 31-Mar-2015


Frequency Distributions

Quantitative Methods in HPELS

440:210

Agenda

- Basic Concepts
- Frequency Distribution Tables
- Frequency Distribution Graphs
- Percentiles and Percentile Ranks

Basic Concepts

- Frequency distribution: An organized tabulation of the number of individuals located in each category on the scale of measurement
- Frequency distributions can be in table or graph format
- There are two elements in a frequency distribution:
  - The set of categories that make up the scale of measurement
  - The record of the frequency of individuals in each category

Basic Concepts

There are two reasons to construct frequency distributions:
- Assists with choosing the appropriate test statistic (parametric vs. nonparametric)
- Assists with identification of outliers

Basic Concepts

- Parametric statistics require a normal distribution
- Frequency distributions provide a “picture” of the data for determination of normality
- If the data are normally distributed, use a parametric statistic, assuming an INTERVAL or RATIO scale
- If the data are not normally distributed, use a nonparametric statistic regardless of scale of measurement

The Normal Distribution Characteristics:

1. Horizontally symmetrical

2. Mode, median and mean coincide
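The second characteristic can be checked on a small data set. The sketch below uses Python's standard `statistics` module on two made-up sets of scores (both are illustrative assumptions, not textbook data): one symmetrical, one right skewed.

```python
from statistics import mean, median, mode

# Hypothetical, perfectly symmetrical scores (not from the textbook).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
print(mean(symmetric), median(symmetric), mode(symmetric))

# Hypothetical right-skewed scores: the long right tail pulls the mean
# above the median, and the median above the mode.
skewed = [1, 1, 1, 2, 2, 3, 4, 9, 13]
print(mean(skewed), median(skewed), mode(skewed))
```

Equal values of the three measures are consistent with a normal shape but do not prove it; any symmetrical distribution with a single peak shows the same agreement.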

Non-Normal Distributions

Heavy tailed Light tailed

Left skewed Right skewed

Normal Distribution

How to determine if a distribution is normal. Several methods:

- Qualitative assessment
- Quantitative assessment:
  - Kolmogorov-Smirnov
  - Shapiro-Wilk
  - Q-Q plots

Interpretation of the Q-Q Normal Plot

Normal Heavy tailed Light tailed

Left skew Right skew

Bottom Line: Parametric or Nonparametric?

Is the scale of measurement at least interval?
- No → Nonparametric
- Yes → Answer next question

Is the distribution normal?
- No → Nonparametric
- Yes → Parametric
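The two questions can be written as a small decision function. This is a minimal sketch; the function name `choose_statistic` and its arguments are hypothetical, and the normality judgment is assumed to have been made beforehand.

```python
def choose_statistic(scale: str, is_normal: bool) -> str:
    """Return 'parametric' or 'nonparametric' following the two questions."""
    scale = scale.lower()
    if scale not in {"nominal", "ordinal", "interval", "ratio"}:
        raise ValueError(f"unknown scale of measurement: {scale}")
    # Question 1: is the scale of measurement at least interval?
    if scale not in {"interval", "ratio"}:
        return "nonparametric"
    # Question 2: is the distribution normal?
    return "parametric" if is_normal else "nonparametric"


print(choose_statistic("ratio", is_normal=True))
print(choose_statistic("ordinal", is_normal=True))
```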

Basic Concepts

The frequency distribution can assist with the identification of outliers

Outlier: An individual data point that is substantially different from the values obtained from other individuals in the same data set

Outliers can have drastic effects on the test statistic

Basic Concepts

Outliers may occur naturally or may be due to some form of error:
- Measurement error → throw out
- Input error → correct the error
- Lack of effort or purposeful deceit on the part of the subject → throw out
- Natural occurrence → keep the data

Agenda

- Basic Concepts
- Frequency Distribution Tables
- Frequency Distribution Graphs
- Percentiles and Percentile Ranks

Frequency Distribution Tables

FDT contain the following information:
- Scale of measurement (measurement categories)
- Frequency of each point along the scale of measurement

FDT are in row/column format:
- Simple frequency distribution tables
- Grouped frequency distribution tables

Simple Frequency Distribution Tables

Process:
- List all measurement categories from lowest to highest (unless nominal) in a column (X)
- List the frequency with which each category occurred in the next column (f)
- Example 2.1 (p 37)

Note that Σf = N, where N = total number of individuals.
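A simple frequency distribution table can be sketched in a few lines of Python. The scores below are made up for illustration and are not the data from Example 2.1. The final check confirms that the frequency column sums to N.

```python
from collections import Counter

# Hypothetical scores (not the textbook data).
scores = [3, 5, 4, 2, 5, 4, 4, 1, 3, 4, 5, 2, 4, 3, 3]

counts = Counter(scores)

# X column: every category from lowest to highest, including any with f = 0.
# f column: how often each category occurred.
table = [(x, counts[x]) for x in range(min(scores), max(scores) + 1)]

print(" X   f")
for x, f in table:
    print(f"{x:2d} {f:3d}")

N = len(scores)
total_f = sum(f for _, f in table)
print("sum of f =", total_f, "and N =", N)
```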

Simple Frequency Distribution Tables

Obtaining ΣX from an FDT

Process:
- Create a third column called (fX)
- Multiply the (f) column by the (X) column and enter the product in the new (fX) column
- ΣX = ΣfX
- See Table on page 38
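The identity can be verified directly: summing the fX column of the table gives the same total as adding up the raw scores one by one. The scores are hypothetical.

```python
from collections import Counter

# Hypothetical scores (not the textbook data).
scores = [3, 5, 4, 2, 5, 4, 4, 1, 3, 4, 5, 2, 4, 3, 3]
counts = Counter(scores)

# Build the X, f and fX columns.
rows = [(x, f, f * x) for x, f in sorted(counts.items())]

print(" X   f  fX")
for x, f, fx in rows:
    print(f"{x:2d} {f:3d} {fx:3d}")

sum_fx = sum(fx for _, _, fx in rows)   # total of the fX column
sum_x = sum(scores)                     # total of the raw scores
print("sum of fX =", sum_fx, "and sum of X =", sum_x)
```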

Simple Frequency Distribution Tables

Obtaining Proportions and Percentages:
- Proportion (p): The fraction of the total group associated with each score, where (p) = f/N
- Percentage (%) = p*100
- Example 2.2 (p 37)
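As a sketch of both formulas, the following adds p and % columns to a made-up frequency table (the X and f values are assumptions chosen so that N = 20).

```python
# Hypothetical frequency table (X -> f); not taken from the textbook.
freq = {1: 2, 2: 3, 3: 8, 4: 5, 5: 2}
N = sum(freq.values())

proportions = {x: f / N for x, f in freq.items()}          # p = f/N
percentages = {x: p * 100 for x, p in proportions.items()}  # % = p*100

print(" X   f     p      %")
for x in sorted(freq):
    print(f"{x:2d} {freq[x]:3d} {proportions[x]:5.2f} {percentages[x]:6.1f}")
```

The proportions always sum to 1 and the percentages to 100, which is a quick check on the arithmetic.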

Grouped Frequency Distribution Tables

If the data cover a wide range of values, there are disadvantages to listing each individual score:
- Cumbersome
- Difficult to interpret

Grouped FDT creates groups (class intervals) of scores

Grouped Frequency Distribution Tables

There are several rules to help with the construction of grouped FDT:
- Rule 1: Use ~10 class intervals
  - Too few: Lost information
  - Too many: Complicated
- Rule 2: Width/size of each class interval should be simple
  - Easy to count by 2, 5 or 10
- Rule 3: The bottom score in each class interval should be a multiple of the width/size of the class interval
  - Example: Width/size = 5 → Each interval should start with 5, 10, 15 . . .
- Rule 4: Each class interval should be the same width/size
- Example 2.3 (p 40) and Table 2.2 (p 41)
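The rules can be illustrated with a short sketch. It assumes whole-number scores and a width chosen by hand (Rule 2); the function name `grouped_frequency_table` and the scores are hypothetical.

```python
def grouped_frequency_table(scores, width):
    """Return (lowest score, highest score, f) for each class interval.

    Assumes whole-number scores, so an interval of width 5 that starts
    at 50 contains the scores 50 through 54.
    """
    # Rule 3: the bottom score of each interval is a multiple of the width.
    start = (min(scores) // width) * width
    table = []
    while start <= max(scores):
        top = start + width - 1  # Rule 4: every interval has the same width
        f = sum(1 for x in scores if start <= x <= top)
        table.append((start, top, f))
        start += width
    return table


# Hypothetical scores (not the data from Example 2.3).
scores = [53, 58, 61, 64, 67, 70, 72, 73, 75, 78,
          80, 81, 84, 85, 88, 90, 91, 93, 94]

table = grouped_frequency_table(scores, width=5)
for low, high, f in table:
    print(f"{low}-{high}: {f}")
```

With these scores a width of 5 yields nine class intervals, close to the ten suggested by Rule 1; a width of 2 would give more than twenty intervals and a width of 10 only five.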

Agenda

- Basic Concepts
- Frequency Distribution Tables
- Frequency Distribution Graphs
- Percentiles and Percentile Ranks

Frequency Distribution Graphs

Graphs contain the same information as the frequency distribution table:
- Scale of measurement or measurement categories
- Frequency of each category

Frequency Distribution Graphs

Format is different:
- Scale of measurement is located along the horizontal x-axis (abscissa)
  - Values should increase from left to right
- Frequency is along the vertical y-axis (ordinate)
  - Values should increase from bottom to top

Frequency Distribution Graphs

Generally speaking:
- The point where the two axes intersect should have a value of zero
- The height (y-axis) of the graph should be approximately 2/3 to 3/4 of its length (x-axis)
- Figure 2.2 (p 44)

Frequency Distribution Graphs

There are several types of FDG:
- Histograms (Interval/Ratio)
- Polygons (Interval/Ratio)
- Stem and leaf displays (Interval/Ratio)
- Bar graphs (Nominal/Ordinal)

FDG: Histograms (I/R)

Process:
- List the numerical scores along the x-axis
- Draw a bar above each X value so that:
  - Height: Corresponds to the frequency
  - Width: Extends to the real limits of the value

Real limits:
- Upper and lower
- Separate adjacent scores along a number line
- Example: The real limits of 150
  - Lower limit = 149.5
  - Upper limit = 150.5
- Figure 1.7 (p 19)
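The real limits in the example lie half a measurement unit below and above the score. A minimal sketch, assuming scores recorded in whole units unless stated otherwise (the function name `real_limits` and the `unit` parameter are hypothetical):

```python
def real_limits(score, unit=1):
    """Return (lower, upper) real limits: half a unit either side of the score."""
    half = unit / 2
    return score - half, score + half


lower, upper = real_limits(150)
print(lower, upper)

# The upper real limit of one score is the lower real limit of the next,
# which is why adjacent histogram bars touch.
print(real_limits(150)[1] == real_limits(151)[0])
```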

FDG: Histograms (I/R)

- Bars should be in contact with each other
  - Extend to real limits
- Figure 2.2a (p 44)

FDG: Histograms (I/R)

Variations:
- Histogram from grouped frequency table: Figure 2.2b (p 45)
- Modified histogram: Figure 2.4 (p 45)

FDG: Polygons (I/R)

Process:
- List the numerical scores along the x-axis
- Place dot above scores corresponding to frequency
- Connect dots with continuous line
- Draw two lines from the extreme dots to the x-axis:
  - One category below the lowest score
  - One category above the highest score
- Figure 2.5 (p 46)

FDG: Polygons (I/R)

Variations:
- Polygon from grouped data: Figure 2.6 (p 46)

FDG: Stem and Leaf Displays (I/R)

Introduction:
- Simple plot designed by J.W. Tukey (1977)
- Two parts:
  - Stem: First digit
  - Leaf: Last digit(s)
- Table 2.3 (p 59)

FDG: Stem and Leaf Displays (I/R)

Process:
- List all stems that occur (no duplicates)
- List each leaf beside its stem (duplicates allowed)

Variation:
- Double stems for greater detail
  - First of two stems associated with leaves (0-4)
  - Second stem with leaves (5-9)
- Table 2.4 (p 60)
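The basic (single-stem) process can be sketched as follows. The example assumes two-digit whole-number scores, so the stem is the tens digit and the leaf is the ones digit; the function name `stem_and_leaf` and the scores are hypothetical, not the data from Table 2.3.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Map each stem (tens digit) to the list of its leaves (ones digits)."""
    display = defaultdict(list)
    for x in sorted(scores):
        # Each stem appears once; a repeated score repeats its leaf.
        display[x // 10].append(x % 10)
    return dict(display)


# Hypothetical two-digit scores.
scores = [23, 27, 31, 35, 35, 38, 42, 44, 47, 47, 49, 51, 56, 58, 62]

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Turned on its side, the display has the same shape as a histogram with class intervals of width 10, while still showing every individual score.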

FDG: Bar Graph (N/O)

Process:
- Same as histogram
- Spaces between the bars (no real limits)
- Figure 2.7 (p 47)

Nominal vs. Ordinal Data:
- Nominal data: The order of the categories is arbitrary
- Ordinal data: Logical progression of categories
  - Example: Dislike, mod. dislike, no opinion, mod. like, like

Agenda

- Basic Concepts
- Frequency Distribution Tables
- Frequency Distribution Graphs
- Percentiles and Percentile Ranks

Percentiles and Percentile Ranks

Introduction:
- Useful when comparing scores relative to other scores
- Determine the relative position of scores within the data set
- Rank or percentile rank: Percentage of scores at or below the particular value
- Percentile: When a score is identified by its percentile rank

Percentiles and Percentile Ranks

Process:
- Within simple distribution table
- Create new column (cf) cumulative frequency
- Count # of scores AT or BELOW the category

Interpretation:
- Cumulative frequency of 20 = 20 scores fall at or below the category
- Example 2.4 (p 52)
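The cumulative frequency column is a running total of the f column, taken from the lowest category upward. A short sketch with a made-up table (not the data from Example 2.4):

```python
from itertools import accumulate

# Hypothetical simple frequency table, categories listed lowest to highest.
categories = [1, 2, 3, 4, 5]
frequencies = [2, 3, 8, 5, 2]

# cf for a category = number of scores AT or BELOW that category.
cumulative = list(accumulate(frequencies))

print(" X   f  cf")
for x, f, cf in zip(categories, frequencies, cumulative):
    print(f"{x:2d} {f:3d} {cf:3d}")
```

The cf value of the highest category always equals N, because every score falls at or below it.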

Percentiles and Percentile Ranks

Process continued:
- Same table: Add new column (c%) cumulative percentage or percentile rank
- Divide (cf) value by N and multiply by 100: c% = (cf/N)*100

Interpretation:
- Percentile rank of 95% = 95% of the scores fall at or below the category
- Example 2.5 (p 53)
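Converting cumulative frequencies to cumulative percentages takes one more step: divide each cf by N and express the result as a percentage. The table below is made up for illustration and is not the data from Example 2.5.

```python
from itertools import accumulate

# Hypothetical simple frequency table, categories listed lowest to highest.
categories = [1, 2, 3, 4, 5]
frequencies = [2, 3, 8, 5, 2]

N = sum(frequencies)
cumulative = list(accumulate(frequencies))

# c% = cf / N, expressed as a percentage (the percentile rank of the category).
percentile_ranks = [100 * cf / N for cf in cumulative]

print(" X   f  cf     c%")
for x, f, cf, c_pct in zip(categories, frequencies, cumulative, percentile_ranks):
    print(f"{x:2d} {f:3d} {cf:3d} {c_pct:6.1f}")
```

In this table a score of 4 has a percentile rank of 90%, meaning 90% of the scores fall at or below 4.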

Textbook Problem Assignment

Problems: 1, 8, 16, 17, 20a, 20c, 24, 25