Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as...

22
Using Stata for Questionnaire Development Theodore Pollari and Phil Schumm Research Computing Group Department of Health Studies University of Chicago August 23, 2004 (Supported by National Institute on Aging grant 1R01 AG021487-01A1)

Transcript of Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as...

Page 1: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Using Stata for Questionnaire Development

Theodore Pollari and Phil Schumm

Research Computing GroupDepartment of Health Studies

University of Chicago

August 23, 2004

(Supported by National Institute on Aging grant 1R01 AG021487-01A1)

Page 2: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Creating a QuestionnaireStandard MethodA Better Alternative

Explanation by ExampleSchema for Individual QuestionsCross-References Between ItemsControl Logic: Contingencies and Blocks

Discussion of Suggested Questionnaire Development SystemReview of Our SystemReview of Advantages of Our SystemFuture Plans

Page 3: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Standard method of creating a questionnaire

Questionnaires are usually created using a word-processor. Thishas several disadvantages:

I Editing is inefficient (e.g., changing response categories formultiple questions, re-ordering questions, specifying controllogic, etc.)

I Not amenable to version control

I Must be manually re-entered into CAPI

I No mechanism for transferring metadata to final data set

Page 4: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

A Better AlternativeFixing Problems with a Systematic Process

Stata .do FileDescribing

Questionnaire

Stata Dataset With Complete Set of Variables & Metadata but

No Observations

Finished Product

I Create questionnaire as Stata .do file (or files) according to aspecific schema

I Use .do files to create a Stata dataset representing a completequestionnaire along with all relevant metadata.

Page 5: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Explanation by ExampleNational Social Life, Health and Aging Project

I National survey of 3,000 older adults

I Supported by a grant from the National Institute on Aging

I Fielded by NORC using commercial CAPI software (SPSSMR)

I Interdisciplinary project involving sociologists, economists,psychologists, and physicians

I Special challenges in integrating different research questionsinto a single instrument

I Effective collaboration both essential and sometimes difficultto achieve

Page 6: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: An Individual QuestionStata Code Specification

I Represent questions as individual variables

I Use characteristics to represent question text and control logic

I Use notes and additional characteristics to represent metadata

Page 7: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: An Individual QuestionDate of Birth : Stata Code

* Generate vars

gen dob = .

...

char dob[qt] In what month, day, and year were you born?char dob[mask] _____ (month) _____ (day) _____ (year)lab var dob "date of birth"char dob[src] 1992 HRSchar dob[dom] sociodemognote dob: 8/25/2003 - gpk

Page 8: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

A Better AlternativeFixing Problems with a Systematic Process

Stata .do FileDescribing

Questionnaire

Stata Dataset With Complete Set of Variables & Metadata but

No Observations

Human Readable

Output with Metadata for

Web/Development

_build.ado

Page 9: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: An Individual QuestionDate of Birth : Human Readable Form

Page 10: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: An Individual QuestionStata Code Specification

I Represent questions as individual variables

I Use characteristics to represent question text and control logic

I Use notes and additional characteristics to represent metadata

I Represent response categories as value labels

Page 11: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Response CategoriesHispanic Origin: Stata Code

* Generate varsgen hispanic = ....

lab def yesno 1 "Yes" 2 "No" 98 "Don’t know" 99 "Refused"...

char hispanic[qt] Do you consider yourself Hispanic or Latino?lab val hispanic yesnolab var hispanic "Hispanic/Latino ethnicity"char hispanic[src] 1992 HRSchar hispanic[dom] sociodemognote hispanic: 8/26/2003 - gpk

Page 12: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Response CategoriesHispanic Origin: Human Readable Form

Page 13: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Setup for Cross-ReferencingCode from the .do File

* Generate varsgen dob = .gen hispanic = .

...

* Assign question numbers for referencinglocal i 1foreach v of varlist _all {

char ‘v’[qnum] ‘i’local ++i

}

Page 14: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Control Logic – Cross-Referencing & BlocksPast Experience in Paid Work: Stata Code

* Generate varsgen pastpaid = .

...

char pastpaid[qt] Have you ever worked for pay?lab var pastpaid "past experience in paid work"char pastpaid[block] pastblkchar _dta[pastblk] ASK THESE QUESTIONS ONLY IF RESPONDENT/*

*/ ANSWERS "Yes" to ‘disab[qnum]’ or ‘othemp[qnum]’ /**/ OR IF RESPONDENT ANSWERS BOTH "Yes" to /**/ ‘homemk[qnum]’ AND "No" to ‘curremp[qnum]’

char pastpaid[src] modification of 1992 HRSchar pastpaid[dom] sociodemognote pastpaid: 9/1/2003 - gpk

Page 15: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Control Logic – Cross-Referencing & BlocksPast Experience in Paid Work: Human Readable Form

Page 16: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Control Logic – ContingenciesSelf-Described ’Born-Again’ Experience: Stata Code

* Generate varsgen bornagain = .

...

char bornagain[qt] Would you say you have been "born again" or have /**/ had a "born again" experience--that is, a turning point in your /**/ life when you committed yourself to Christ?

lab var bornagain "self-described ’born-again’ experience"char bornagain[c] relig 2 3 5char bornagain[src] NHSLSchar bornagain[dom] sociodemognote bornagain: 8/26/2003 - gpknote bornagain: 9/26/2003 - omitted per Linda’s e-mail 9/24/2003 - gpknote bornagain: 11/8/2003 - reinserted per Ed’s request 11/3/2003

Page 17: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Example: Control Logic – ContingenciesSelf-Described ’Born-Again’ Experience: Human Readable Form

Page 18: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

A Better AlternativeFixing Problems with a Systematic Process

Stata .do FileDescribing

Questionnaire

Stata Dataset With Complete Set of Variables & Metadata but

No Observations

Human Readable

Output with Metadata for

Web/Development

_build.ado

Page 19: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Brief Review

Create Questionnaires as Stata .do File (or Files)

I Represent questions as variables and response categories asvalue labels

I Use characteristics to represent question text and control logicsuch as contingencies, loops and blocks

I Use notes and additional characteristics to represent metadata

I .ado file available to generate web-accessible, human readableversion of questionnaire

I Finished .do file creates a Stata Dataset with complete set ofvariables and metadata but no observations

Page 20: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Advantages Over Traditional Methods

I Editing of questions is much more systematic and efficient

I Easy to use – proven useable on a major study with minimaltraining

I Flexible and powerful development and collaboration

I Questionnaire easily transfered to other systems and formats

I Preserves metadata along entire course of development andanalysis

Page 21: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Advantages Over Traditional Methods(Continued)

I Creates a ready-made documented dataset to appendcollected data onto

Stata .do FileDescribing

Questionnaire

Stata Dataset With Complete Set of Variables & Metadata but

No Observations

Human Readable

Output with Metadata for

Web/Development

_build.adoData Collection

Instrument (CAPI/CATI/CASI, pen & paper, etc.)

Final, Complete Dataset

-append-

Page 22: Using Stata for Questionnaire Developmentrepec.org/nasug2004/nasug2004_qdev.pdf · such as contingencies, loops and blocks I Use notes and additional characteristics to represent

Future Plans

Stata .do FileDescribing

Questionnaire

Stata Dataset With Complete Set of Variables & Metadata but

No Observations

Human Readable

Output with Metadata for

Web/Development

_build.adoData Collection

Instrument (CAPI/CATI/CASI, pen & paper, etc.)

Final, Complete Dataset

-append-