Chapter Twelve Quality Control and Initial Analysis of Data.

Copyright © Houghton Mifflin Company. All rights reserved. 12 | 2

Chapter Objectives

• Define editing and distinguish between a field edit and an office edit

• Define coding and outline the steps it involves

• Compute measures of central tendency and dispersion of the data for each variable in a data set

• State the potential uses of frequency distribution or one-way tables

Data Analysis at Rockbridge Associates: Data Integrity

• Data integrity is the foundation for successful marketing research

• Rockbridge ensures integrity in the collection and processing of the data by a number of quality control checks for

– mail surveys

– telephone surveys

– web surveys

• Rockbridge ensures data integrity in how the results are interpreted and explained to management

Editing

• Editing is the process of examining completed data collection forms and taking whatever corrective action is needed to ensure the data are of high quality

– Preliminary or field edit

– Final or office edit

Field Edit

• A field edit, or preliminary edit, is a quick examination of completed data collection forms, usually on the same day they are filled out

• Objectives

– Ensure that proper procedures are being followed in selecting respondents, interviewing them, and recording their responses

– Fix fieldwork deficiencies before they turn into major problems

Office Edit

• A final edit, or office edit, verifies response consistency and accuracy

– Makes necessary corrections

– Determines whether some or all parts of a data collection form should be discarded

What Is Wrong With this Response…

• A respondent said he was 18 years old but indicated that he had a Ph.D. when asked for his highest level of education.

Editing Can Help Uncover

• Improper field procedures

• Incomplete interviews

• Improperly conducted interviews

• Technical problems with the questionnaire or interview

• Respondent rapport problems

• Consistency problems that can be isolated and reconciled

Improper Field Procedures

• Wrong questionnaire form used

• Interview inadvertently not taken

Incomplete Interviews

• Questions not asked

• Directions not followed (proper segments of the questionnaire were not administered)

Improperly Conducted Interviews

• The wrong respondent interviewed (e.g., son instead of father)

• Questions misinterpreted by interviewer or respondent

• Evidence of bias or influencing of answers

• Failure to probe for adequate answers or the use of poor probes

• Interviewer's illegible writing and/or style

• Interviewer recorded information which identified a respondent whose anonymity should have been protected

Improperly Conducted Interviews (Cont’d)

• Interviewer apparently does not understand what type of responses constitute an answer to the actual question asked

• Interviewer does not understand what the objective of the question is and thus accepts an improper frame of reference for the respondent's answer

• Other evidence of need for training or instructions to be given to interviewer

– Failure to write down probes, wrong abbreviations, failure to follow directions

Technical Problems With the Questionnaire or Interview

• Space was not provided for needed information

• The presence of unanticipated or unusually frequent extreme responses to questions, indicating a possible need for rewording of certain questions

• Inappropriate or unworkable interviewer instructions not detected in the pretest

• The order in which questions were asked introduces confusion, resentment, or bias into the respondent's answers

Respondent Rapport Problems

• Frequent refusal to answer certain questions

• Reports of abnormal termination of the interview (or presence of hostility) due to sensitive questions

• Evidence that respondent and interviewer are playing the "game" of "What answer do you want me to give?"

• Evidence that the presence of other people in the interview situation is causing problems

Consistency Problems That Can Be Isolated and Reconciled

• Contradictory answers

– Reports no savings in one section of the interview but reports interest from bank accounts in another section

• Misclassification

– Mortgage debt improperly reported as installment debt

• Impossible answers

– Reports paying $600 for a new Edsel in 1970 (the car should have been recorded as a "used" car), or weekly income reported on the income-per-month line

Consistency Problems That Can Be Isolated and Reconciled (Cont’d)

• Unreasonable (and probably erroneous) responses

– Respondent reports borrowing $2,000 for two years to buy a car, but the reported monthly payments multiplied by 24 months are less than $2,000

– Respondent reports that the house value is $90,000 while income is $2,000 per year and the respondent claims less than a high school education
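The consistency problems listed on these slides lend themselves to automated checks during the office edit. The following is a minimal sketch only; the field names (`has_savings`, `bank_interest`, `loan_amount`, and so on) are hypothetical and not taken from the chapter, and the rules simply mirror the slide examples.

```python
# Hypothetical field names; the rules mirror the slide examples only.
def consistency_flags(record):
    """Return a list of human-readable flags for one completed questionnaire."""
    flags = []

    # Contradictory answers: no savings reported, yet interest from bank accounts.
    if record.get("has_savings") is False and record.get("bank_interest", 0) > 0:
        flags.append("reports no savings but reports bank interest")

    # Unreasonable response: total repayments smaller than the amount borrowed.
    loan = record.get("loan_amount")
    payment = record.get("monthly_payment")
    months = record.get("loan_months")
    if None not in (loan, payment, months) and payment * months < loan:
        flags.append("total payments are less than the amount borrowed")

    # Implausible combination: an 18-year-old reporting a Ph.D.
    if record.get("age", 99) <= 18 and record.get("education") == "PhD":
        flags.append("age and education level look inconsistent")

    return flags


example = {
    "has_savings": False, "bank_interest": 40,
    "loan_amount": 2000, "monthly_payment": 75, "loan_months": 24,
    "age": 18, "education": "PhD",
}
flags = consistency_flags(example)
for flag in flags:
    print("CHECK:", flag)
```

A flag is a prompt for the editor to reconcile or discard the response, not an automatic correction.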

Preventing Errors

• Careful planning before fieldwork begins

• Automating data entry

Coding

• Coding broadly refers to the set of all tasks associated with transforming edited responses into a form that is ready for analysis

• Steps

– Transforming responses to each question into a set of meaningful categories

– Assigning numerical codes to the categories

– Creating a data set suitable for computer analysis

Transforming Responses into Meaningful Categories

• A structured question is pre-categorized

• Responses to a nonstructured or open-ended question need to be grouped into a meaningful and manageable set of categories

The Best Way to Treat "Don't Know" Responses

• Infer an actual response – dubious validity

• Classify the "don't know's" as a separate response category for each question

Missing-Value Category

• A missing value can stem from

– A respondent's refusal to answer a question

– An interviewer's failure to ask a question or record an answer or a "don't know" that does not seem legitimate

• Best way to treat missing value responses

– Sound questionnaire design

– Tight control over fieldwork

Assigning Numerical Codes

• Assign appropriate numerical codes to responses that are not already in quantified form

• When assigning numerical codes, the researcher should choose them so as to facilitate computer manipulation and analysis of the responses

Coding Multiple Response

• Which of the following countries have you visited during the past 12 months?

________ Canada

________ England

________ France

________ Germany

________ Japan

________ Mexico

• Need six variables, each relating to a specific country and having two possible values. For example, 1= “No” and 2 = “Yes”

• Six columns must be set aside in the data spreadsheet to record responses to this question
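As an illustration of the six-variable scheme, the sketch below turns one respondent's checked countries into six coded columns using the slide's codes (1 = "No", 2 = "Yes"). The function name and the example respondent are assumptions made for this sketch.

```python
# Country list and 1/2 codes follow the slide; everything else is illustrative.
COUNTRIES = ["Canada", "England", "France", "Germany", "Japan", "Mexico"]


def code_multiple_response(checked):
    """Return one coded variable per country: 1 = "No", 2 = "Yes"."""
    return {country: (2 if country in checked else 1) for country in COUNTRIES}


# Hypothetical respondent who checked Canada and Japan.
row = code_multiple_response({"Canada", "Japan"})
print(list(row.values()))
```

Each respondent contributes exactly six values, regardless of how many countries were checked, which keeps every row of the data spreadsheet the same width.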

Multiple Response Question – Rank Order Question

• Please rank the following fast-food restaurants by placing a 1 beside the restaurant you think is best overall, a 2 beside the restaurant you think is second best, and so on.

__________ Burger King

__________ McDonald's

__________ Wendy's

__________ Whataburger

• This question requires as many variables (and columns) as there are objects to be ranked

• 4 separate variables are needed
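A rank-order question can be coded the same way, with one variable per restaurant holding the rank it received. The sketch below also adds an edit check, which is an assumption of this example rather than something the slide prescribes: it rejects a response unless the ranks 1 through 4 are each used exactly once.

```python
RESTAURANTS = ["Burger King", "McDonald's", "Wendy's", "Whataburger"]


def code_ranking(ranks):
    """Return four rank variables, in RESTAURANTS order, after a validity check."""
    values = [ranks.get(name) for name in RESTAURANTS]
    expected = list(range(1, len(RESTAURANTS) + 1))
    # Every rank from 1 to 4 must appear exactly once; blanks or ties fail.
    if sorted(v for v in values if v is not None) != expected:
        raise ValueError(f"ranks must be a permutation of {expected}: {values}")
    return values


# Hypothetical respondent's ranking.
coded = code_ranking(
    {"Burger King": 3, "McDonald's": 1, "Wendy's": 2, "Whataburger": 4}
)
print(coded)
```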

Creating a Data Set

• Organized collection of data records

• Each sample unit within the data set is called a case or observation

• Structure of a Data Set

– The number of observations is n

– The total number of variables embedded in the questionnaire is m

– The data set is then an n × m matrix of numbers
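To make the n × m structure concrete, here is a small sketch that arranges coded cases into a matrix with one row per observation and one column per variable. The variable names and the four cases are hypothetical.

```python
# Hypothetical variables and cases, for illustration only.
variables = ["years_used", "overall_quality", "visited_canada"]

cases = [
    {"years_used": 4, "overall_quality": 8, "visited_canada": 2},
    {"years_used": 3, "overall_quality": 7, "visited_canada": 1},
    {"years_used": 1, "overall_quality": 9, "visited_canada": 1},
    {"years_used": 4, "overall_quality": 6, "visited_canada": 2},
]

# One row per case (observation), one column per variable.
data_set = [[case[name] for name in variables] for case in cases]

n = len(data_set)       # number of observations
m = len(data_set[0])    # number of variables
print(f"data set is a {n} x {m} matrix")
```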

Table 12.3 Structure of a Data Sheet

Preliminary Data Analysis: Basic Descriptive Statistics

• Preliminary data analysis examines the central tendency and the dispersion of the data on each variable in the data set

Table 12.4 Measures of Central Tendency and Dispersion for Different Types of Variables

Measurement Level of Data Pertaining to Variable – Nominal

• Measures of Central Tendency

– Mode: Most frequently occurring response

• Measures of Dispersion

– Strictly speaking, the concept of dispersion is not meaningful for nominal data

– An idea about the distribution of responses can be obtained by examining their relative frequencies of occurrence

Measurement Level of Data Pertaining to Variable – Ordinal

• Measures of Central Tendency

– Median: 50th percentile response

• Measures of Dispersion

– Range: Defined by the highest and lowest response values

– Interquartile range: Difference between the 75th and 25th percentile responses

Measurement Level of Data Pertaining to Variable – Interval

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9

Measurement Level of Data Pertaining to Variable – Ratio

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9
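The pairing of measurement level with summary measures across the four preceding slides can be sketched as a single helper. This is an illustration under stated assumptions: the function name `summarize` is invented here, the quartiles come from Python's `statistics.quantiles` with its default method (other quartile conventions can give slightly different values), and the data are the 20 coded responses shown later in Table 12.6.

```python
import statistics


def summarize(values, level):
    """Return the descriptive measures the slides pair with each level."""
    if level == "nominal":
        return {"mode": statistics.mode(values)}
    if level == "ordinal":
        q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
        return {
            "median": statistics.median(values),
            "range": (min(values), max(values)),  # lowest and highest values
            "interquartile_range": q3 - q1,
        }
    if level in ("interval", "ratio"):
        return {
            "mean": statistics.mean(values),
            "standard_deviation": statistics.stdev(values),  # n - 1 divisor
        }
    raise ValueError(f"unknown measurement level: {level}")


# Coded length-of-use responses (1-4), an ordinal variable.
years_code = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 1, 1]
ordinal_summary = summarize(years_code, "ordinal")
print(ordinal_summary)
```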

Mode

• The value that occurs most frequently

Table 12.5 How Long Have You Been Using the Services of National? – Computing Mode

Median

• The observation below which 50 percent of the observations fall

Table 12.6 Length of Time Service Used – Responses from 20 Customers

How long have you been using the services of National?

4 3 4 1 4 4 4 4 4 4 3 4 4 3 4 4 4 3 1 1

1 = Less than a year; 2 = 1 to less than 2 years; 3 = 2 to less than 5 years; 4 = 5 years or more

Arranging the 20 values in ascending order:

1 1 1 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4

Because the sample size = 20, there are two middle values: 4 and 4. The median is, therefore, the average of the two middle values = 4.
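The mode and median for these 20 responses can be checked with a few lines of Python. This is a quick sketch using the standard `statistics` module, not the procedure used to build the chapter's tables.

```python
import statistics

# The 20 coded responses from Table 12.6.
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 1, 1]

ordered = sorted(responses)
mode = statistics.mode(responses)      # most frequently occurring code
median = statistics.median(responses)  # average of the 10th and 11th values

print("ordered:", ordered)
print("mode:", mode, "median:", median)
```

With an even number of observations, `statistics.median` averages the two middle values, matching the reasoning on the slide.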

Table 12.7 Computing Median for Length of Time Service Used

Mean

$n$ = number of units in the sample

$x_i$ = data obtained from each sample unit $i$

$\bar{x}$ = sample mean value, given by

\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\]

Table 12.8 Overall Quality of Services Provided by National – Computing Mean

Measures of Dispersion

• Range

• Variance

• Standard Deviation

Range

• Range is the difference between the largest and smallest value

• The simplest measure of dispersion

Variance

• Variance of a set of data is a measure of deviation of the data around the arithmetic mean

\[
S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
\]

Standard Deviation

• Standard deviation is the square root of the variance

\[
S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
\]
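The mean, range, variance, and standard deviation can be worked through directly from their definitions. The ratings below are hypothetical (the contents of Tables 12.8 and 12.9 are not reproduced in this transcript), and the 1-to-10 scale is an assumption made for the sketch.

```python
import math
import statistics

# Hypothetical overall-quality ratings on an assumed 1-10 scale.
ratings = [7, 8, 6, 9, 10, 8, 7, 9]

n = len(ratings)
mean = sum(ratings) / n                          # arithmetic average
value_range = max(ratings) - min(ratings)        # largest minus smallest
variance = sum((x - mean) ** 2 for x in ratings) / (n - 1)  # n - 1 divisor
std_dev = math.sqrt(variance)                    # square root of the variance

# Cross-check the hand computation against the standard library.
assert math.isclose(variance, statistics.variance(ratings))
assert math.isclose(std_dev, statistics.stdev(ratings))

print(f"mean={mean}, range={value_range}, "
      f"variance={variance:.3f}, std dev={std_dev:.3f}")
```

The divisor n - 1 matches the sample variance formula on the slide; `statistics.pvariance` would divide by n instead.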

Table 12.9 Overall Quality of Services Provided by National: Computing Range, Variance, and Standard Deviation

Frequency Distribution: One-Way Tabulation

• One-way tabulation is a table showing the distribution of data pertaining to categories of a single variable
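A one-way table is straightforward to build from coded responses. The sketch below tabulates the 20 length-of-use responses listed under Table 12.6; the layout of the printed table is an assumption of this example and is not meant to reproduce Table 12.10.

```python
from collections import Counter

# The 20 coded responses from Table 12.6 and their category labels.
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 1, 1]
labels = {
    1: "Less than a year",
    2: "1 to less than 2 years",
    3: "2 to less than 5 years",
    4: "5 years or more",
}

counts = Counter(responses)
total = len(responses)

# One row per category: label, frequency, percentage of all responses.
table = [
    (labels[code], counts.get(code, 0), 100 * counts.get(code, 0) / total)
    for code in sorted(labels)
]

for label, frequency, percent in table:
    print(f"{label:<24} {frequency:>3} {percent:>6.1f}%")
```

Categories with no responses are listed with a frequency of zero so that gaps in the distribution stay visible.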

Table 12.10 Age and Length of Time Service Used

Table 12.10 Age and Length of Time Service Used (Cont’d)

Why Averages May be Misleading

• Researchers tested a new sauce product and found

– Mean rating of the taste test was close to the middle of the scale, which had "very mild" and "very hot" as its bipolar adjectives

• Researchers' conclusion

– Consumers need neither really hot nor really mild sauce

Why Averages May be Misleading (Cont’d)

• Deeper examination revealed

– The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot

• Moral of the story

– A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences
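The sauce example can be reproduced with made-up numbers. In the sketch below the ratings are entirely hypothetical and the 7-point scale (1 = "very mild", 7 = "very hot") is an assumption; the point is only that a mean at the scale midpoint can coexist with very few respondents actually choosing the middle.

```python
import statistics
from collections import Counter

# Hypothetical taste-test ratings on an assumed 7-point scale
# (1 = "very mild", 7 = "very hot"); two clusters, few in the middle.
ratings = [1] * 8 + [2] * 10 + [3] * 2 + [4] * 2 + [5] * 2 + [6] * 10 + [7] * 8

mean = statistics.mean(ratings)
counts = Counter(ratings)

# Share of respondents at either end of the scale (ratings 1-2 or 6-7).
extremes = sum(counts[r] for r in (1, 2, 6, 7)) / len(ratings)

print("mean rating:", mean)
for rating in range(1, 8):
    print(rating, "#" * counts[rating])
print(f"share preferring mild or hot: {extremes:.0%}")
```

Looking at the frequency distribution alongside the mean is what exposes the two groups.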