IHSN DDI General Template


  • 8/6/2019 IHSN DDI General Template


    TEMPLATE NAME:

    DDI_GeneralStudyTemplate

    DESCRIPTION:

    International Household Survey Network standard DDI template to be used for surveys, censuses and administrative records. This is an upgrade from the previous format. The main changes are: - Disable Cube Setup and Cell Notes for Editor 1.1

    DOCUMENT METADATA

    Metadata Preparation

    Item Must Description Note

    Study Title

    The title is the official name of the survey as it is stated on the questionnaire or as it appears in the design documents. The following items should be noted:
    - Include the reference year(s) of the survey in the title.
    - Do not include the abbreviation of the survey name in the title.
    - As the survey title is a proper noun, the first letter of each word should be capitalized (except for prepositions and conjunctions).
    - Including the country name in the title is optional.
    Examples:
    - National Household Budget Survey 2002-2003
    - Popstan Multiple Indicator Cluster Survey 2002

    Metadata Producer

    Name of the person(s) or organization(s) who documented the dataset. Use the "role" attribute to distinguish different stages of involvement in the production process. Examples:
    - Name: National Statistics Office (NSO); Role: Documentation of the study
    - Name: International Household Survey Network (IHSN); Role: Review of the metadata

    Date of Production

    This is the date (in ISO format YYYY-MM-DD) the DDI document was produced (not distributed or archived). This date will be filled in automatically when you save the file.

    DDI Document Version

    Documenting a dataset is not a trivial exercise. Producing "perfect" metadata is probably impossible. It may therefore happen that, having identified errors in a DDI document or having received suggestions for improvement, you decide to modify the Document even after a first version has been disseminated. This element is used to identify and describe the current version of the document. It is good practice to provide a version number (and date), and information on what distinguishes this version from the previous one(s) if relevant. Example: Version 1.1 (July 2006). This version is identical to version 1.0, except for the section on Data Appraisal which was updated.

    DDI Document ID Number

    The ID number of a DDI document is a unique identifier for this DDI file. Define a consistent scheme and use it throughout. Such an ID could be constructed as follows: DDI-country-producer-survey-year-version, where
    - country is the 3-letter ISO country abbreviation
    - producer is the abbreviation of the producing agency
    - survey is the survey abbreviation
    - year is the reference year (or the year the survey started)
    - version is the DDI document version number
    Example: The DDI file related to the Demographic and Health Survey documented by staff from the Uganda Bureau of Statistics in 2005 would have the following ID: DDI-UGA-UBOS-DHS-2005-v01. If the same survey is documented by a staff member from the IHSN, this would be DDI-UGA-IHSN-DHS-2005-v01.

    23/07/2010file://C:\Users\Owner\AppData\Local\Temp\sxm7FD8.htm

    STUDY METADATA

    Identification

    Item Must Description Note

    Title TRUE

    The title is the official name of the survey as it is stated on the questionnaire or as it appears in the design documents. The following items should be noted:
    - Include the reference year(s) of the survey in the title.
    - Do not include the abbreviation of the survey name in the title.
    - As the survey title is a proper noun, the first letter of each word should be capitalized (except for prepositions and conjunctions).
    - Including the country name in the title is optional.
    The title will in most cases be identical to the Document Title (see above). Examples:
    - National Household Budget Survey 2002-2003
    - Popstan Multiple Indicator Cluster Survey 2002
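    The title guidelines can be checked mechanically to a first approximation. The sketch below is illustrative only: the function name and the list of small words are assumptions, and it cannot detect an embedded abbreviation.

```python
import re

# Hypothetical list of small words allowed to stay lowercase in a title.
SMALL_WORDS = {"a", "an", "and", "for", "in", "of", "on", "or", "the", "to"}


def title_problems(title):
    """Return a list of guideline violations found in a survey title."""
    problems = []
    # The title should carry the reference year(s), e.g. 2002 or 2002-2003.
    if not re.search(r"\b(19|20)\d{2}\b", title):
        problems.append("no reference year")
    # Every word except small words should start with a capital letter.
    for word in title.split():
        if word[0].isalpha() and word.lower() not in SMALL_WORDS and not word[0].isupper():
            problems.append(f"'{word}' should be capitalized")
    return problems


print(title_problems("National Household Budget Survey 2002-2003"))
print(title_problems("national budget survey"))
```

    An empty list means no problem was detected by these two checks; it does not guarantee that the title matches the questionnaire.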

    Subtitle

    Subtitle is optional and rarely used. A subtitle can be used to add information usually associated with a sequential qualifier for a survey. Example: Title: Welfare Monitoring Survey 2007; Subtitle: Fifth round

    Translated Title

    In countries with more than one official language, a translation of the title may be provided. Likewise, the translated title may simply be a translation into English from a country's own language. Special characters should be properly displayed (such as accents and other stress marks or different alphabets).

    Abbreviation

    The abbreviation of a survey is usually the first letter of each word of the survey title. The survey reference year(s) may be included. Examples:
    - DHS 2000 for "Demographic and Health Survey 2000"
    - HIES 2002-2003 for "Household Income and Expenditure Survey 2002-2003"
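    As an illustration of the initial-letter rule, the sketch below derives an abbreviation from a title. The function name and the set of skipped words are assumptions; official abbreviations do not always follow the rule, so treat the output as a suggestion.

```python
import re

# Assumed list of words skipped when taking initials.
SKIP = {"a", "an", "and", "for", "in", "of", "on", "or", "the", "to"}


def abbreviate(title):
    """Build an abbreviation from word initials plus the trailing reference year(s)."""
    # Split off a trailing year or year range such as 2000 or 2002-2003.
    match = re.search(r"\s+(\d{4}(?:-\d{4})?)$", title)
    years = match.group(1) if match else ""
    words = title[: match.start()] if match else title
    initials = "".join(w[0].upper() for w in words.split() if w.lower() not in SKIP)
    return f"{initials} {years}".strip()


print(abbreviate("Demographic and Health Survey 2000"))
print(abbreviate("Household Income and Expenditure Survey 2002-2003"))
```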

    ID Number

    The ID number of a dataset is a unique identifier for a particular survey. Define a consistent scheme and use it throughout. Such an ID could be constructed as follows: country-producer-survey-year-version, where
    - country is the 3-letter ISO country abbreviation
    - producer is the abbreviation of the producing agency
    - survey is the survey abbreviation
    - year is the reference year (or the year the survey started)
    - version is the dataset version number (see Version Description below)
    Example: The Demographic and Health Survey implemented by the Uganda Bureau of Statistics in 2005 could have the following ID: UGA-UBOS-DHS-2005-v01.
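    A minimal sketch of such a scheme follows, covering both the dataset ID and the DDI document ID (the same pattern with a DDI- prefix). The function names, the character classes in the pattern and the two-digit version are assumptions made for illustration, not part of the template.

```python
import re

# Illustrative pattern: [DDI-]country-producer-survey-year-vNN.
# Character classes and the two-digit version are assumptions.
ID_PATTERN = re.compile(
    r"^(?P<ddi>DDI-)?(?P<country>[A-Z]{3})-(?P<producer>[A-Z0-9]+)"
    r"-(?P<survey>[A-Z0-9]+)-(?P<year>\d{4})-v(?P<version>\d{2})$"
)


def build_id(country, producer, survey, year, version, ddi=False):
    """Assemble an ID such as UGA-UBOS-DHS-2005-v01 (DDI- prefix optional)."""
    core = f"{country.upper()}-{producer.upper()}-{survey.upper()}-{year}-v{version:02d}"
    return f"DDI-{core}" if ddi else core


def parse_id(identifier):
    """Split an ID into its components, or raise ValueError if it does not fit."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a valid ID: {identifier!r}")
    parts = match.groupdict()
    parts["ddi"] = parts["ddi"] is not None
    return parts


print(build_id("UGA", "UBOS", "DHS", 2005, 1))
print(parse_id("DDI-UGA-IHSN-DHS-2005-v01"))
```

    Validating IDs on entry helps catch slips such as a three-digit year before the identifier is published in a catalog.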

    Study Type

    The study type or survey type is the broad category defining the survey. This item has a controlled vocabulary (you may customize the IHSN template to adjust this controlled vocabulary if needed).

    Series Information

    A survey may be repeated at regular intervals (such as an annual labour force survey), or be part of an international survey program (such as the MICS, CWIQ, DHS, LSMS and others). The Series information is a description of this "collection" of surveys. A brief description of the characteristics of the survey, including when it started, how many rounds were already implemented, and who is in charge should be provided here. If the survey does not belong to a series, leave this field empty.

    Example: The Multiple Indicator Cluster Survey, Round 3 (MICS3) is the third round of MICS surveys, previously conducted around 1995 (MICS1) and 2000 (MICS2). MICS surveys are designed by UNICEF, and implemented by national agencies in participating countries. MICS was designed to monitor various indicators identified at the World Summit for Children and the Millennium Development Goals. Many questions and indicators in MICS3 are consistent and compatible with the prior round of MICS (MICS2) but less so with MICS1, although there have been a number of changes in definition of indicators between rounds. Round 1 covered X countries, Round 2 covered Y countries, and Round Z covered N countries.

    Version

    Item Must Description Note

    Description

    The version description should contain a version number followed by a version label. The version number should follow a standard convention to be adopted by the institute. We recommend that larger series be defined by a number to the left of a decimal and iterations of the same series by a sequential number that identifies the release. Larger series will typically include (0) the raw, unedited dataset; (1) the edited dataset, non-anonymized, for internal use at the data producing agency; and (2) the edited dataset, prepared for dissemination to secondary users (possibly anonymized). Examples:
    - v0.1: Basic raw data, obtained from data entry (before editing).
    - v1.2: Edited data, second version, for internal use only.
    - v2.1: Edited, anonymous dataset for public distribution.
    A brief description of the version should follow the numerical identification.
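    The series.release convention can be read programmatically. The following sketch assumes tags of the form vS.R and uses hypothetical names; the labels paraphrase the three series listed above.

```python
# Series labels paraphrased from the recommendation above.
SERIES_LABELS = {
    0: "raw, unedited dataset",
    1: "edited dataset, not anonymized, for internal use",
    2: "edited dataset, prepared for dissemination to secondary users",
}


def describe_version(tag):
    """Split a tag such as 'v1.2' into (series, release, series label)."""
    series_text, release_text = tag.lstrip("v").split(".")
    series, release = int(series_text), int(release_text)
    return series, release, SERIES_LABELS[series]


print(describe_version("v1.2"))
```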

    Notes

    Version notes should provide a brief report on the changes made through the versioning process. The note should indicate how this version differs from other versions of the same dataset.

    Overview

    Item Must Description Note

    Abstract

    The abstract should provide a clear summary of the purposes, objectives and content of the survey. It should be written by a researcher or survey statistician familiar with the survey.

    Kind of Data

    This field is a broad classification of the data and it is associated with a drop-down box providing a controlled vocabulary. That controlled vocabulary includes 9 items but is not limited to them.

    Unit of Analysis

    Basic unit(s) of analysis or observation that the study describes: individuals, families/households, groups, facilities, institutions/organizations, administrative units, physical locations, etc. Examples:
    - A living standards survey with community-level questionnaire would have the following units of analysis: individuals, households, and communities.
    - An economic survey could have the firm and establishment as units of analysis.

    Scope

    Item Must Description Note

    Description of Scope

    The scope is a description of the themes covered by the survey. It can be viewed as a summary of the modules that are included in the questionnaire. The scope does not deal with geographic coverage. Example: The scope of the Multiple Indicator Cluster Survey includes:
    - HOUSEHOLD: Household characteristics, household listing, orphaned and vulnerable children, education, child labour, water and sanitation, household use of insecticide treated mosquito nets, and salt iodization, with optional modules for child discipline, child disability, maternal mortality and security of tenure and durability of housing.
    - WOMEN: Women's characteristics, child mortality, tetanus toxoid, maternal and newborn health, marriage, polygyny, female genital cutting, contraception, and HIV/AIDS knowledge, with optional modules for unmet need, domestic violence, and sexual behavior.
    - CHILDREN: Children's characteristics, birth registration and early learning, vitamin A, breastfeeding, care of illness, malaria, immunization, and anthropometry, with an optional module for child development.

    Topics Classifications

    A topic classification facilitates referencing and searches in electronic survey catalogs. Topics should be selected from a standard thesaurus, preferably an international, multilingual thesaurus. The IHSN recommends the use of the thesaurus used by the Council of European Social Science Data Archives (CESSDA). The CESSDA thesaurus has been introduced as a controlled vocabulary in the IHSN Study Template version 1.3 (available at www.surveynetwork.org/toolkit).

    Keywords

    Keywords summarize the content or subject matter of the survey. Like topic classifications, these are used to facilitate referencing and searches in electronic survey catalogs. Keywords should be selected from a standard thesaurus, preferably an international, multilingual thesaurus. Entering a list of keywords is tedious. This option is provided for advanced users only.

    Coverage

    Item Must Description Note

    Country TRUE

    Enter the country name, even in cases where the survey did not cover the entire country. In the field "Abbreviation", we recommend that you enter the 3-letter ISO code of the country. If the dataset you document covers more than one country, enter all in separate rows.

    Geographic Coverage

    This field aims at describing at what geographic level the data are representative. Typical entries will be "National coverage", "Urban (or rural) areas only", "State of ...", "Capital city", etc. Note that we do not describe here where the data was collected. For example, a sample survey could be declared as "national coverage" even in cases where some districts were not included in the sample, as long as the sampling strategy was such that the representativeness is national.

    Universe

    We are interested here in the survey universe (not the universe of particular sections of the questionnaires or variables), i.e. in the identification of the population of interest in the survey. The universe will rarely be the entire population of the country. Sample household surveys, for example, usually do not cover the homeless, nomads, diplomats, or community households. Population censuses do not cover diplomats. Try to provide the most detailed information possible on the population covered by the survey/census. Example: The survey covered all de jure household members (usual residents), all women aged 15-49 years resident in the household, and all children aged 0-4 years (under age 5) resident in the household.

    Producers and Sponsors

    Item Must Description Note

    Primary Investigator TRUE

    The primary investigator will in most cases be an institution, but could also be an individual in the case of small-scale academic surveys. The two fields to be completed are the Name and the Affiliation fields. Generally, in a survey, the Primary Investigator will be the institution implementing the survey. If various institutions have been equally involved as main investigators, then all should be mentioned. This only includes the agencies responsible for the implementation of the survey, not its funding or technical assistance. The order in which they are listed is discretionary. It can be alphabetic or by significance of contribution. Individual persons can also be mentioned. If persons are mentioned, use the format Surname, First name.

    Other Producers

    This field is provided to list other interested parties and persons that have played a significant but not the leading technical role in implementing and producing the data. The specific fields to be completed are: Name of the organization, Abbreviation, Affiliation and Role. If any of the fields are not applicable these can be left blank. The abbreviation should be the official abbreviation of the organization. The role should be a short and succinct phrase or description of the specific assistance provided by the organization in order to produce the data. The roles should use standard vocabulary such as:
    - [Technical assistance in] questionnaire design
    - [Technical assistance in] sampling methodology / selection
    - [Technical assistance in] data collection
    - [Technical assistance in] data processing
    - [Technical assistance in] data analysis
    Do not include here the financial sponsors.

    Funding

    List the organizations (national or international) that have contributed, in cash or in kind, to the financing of the survey. The government institution that has provided funding should not be forgotten.

    Other Acknowledgments

    This optional field can be used to acknowledge any other people and institutions that have in some form contributed to the survey.

    Sampling

    Item Must Description Note

    Sampling Procedure

    This field only applies to sample surveys. Information on sampling procedure is crucial (although not applicable for censuses and administrative datasets). This section should include summary information that includes though is not limited to:
    - Sample size
    - Selection process (e.g., probability proportional to size or over sampling)
    - Stratification (implicit and explicit)
    - Stages of sample selection
    - Design omissions in the sample
    - Level of representation
    - Strategy for absent respondents/not found/refusals (replacement or not)
    - Sample frame used, and listing exercise conducted to update it
    It is useful also to indicate here what variables in the data files identify the various levels of stratification and the primary sample unit. These are crucial to the data users who want to properly account for the sampling design in their analyses and calculations of sampling errors. This section accepts only text format; formulae cannot be entered. In most cases, technical documents will exist that describe the sampling strategy in detail. In such cases, include here a reference (title/author/date) to this document, and make sure that the document is provided in the External Resources.

    Example: 5000 households were selected for the sample. Of these, 4996 were occupied households and 4811 were successfully interviewed for a response rate of 96.3%. Within these households, 7815 eligible women aged 15-49 were identified for interview, of which 7505 were successfully interviewed (response rate 96.0%), and 3242 children aged 0-4 were identified, for 3167 of whom the mother or caretaker was successfully interviewed (response rate 97.7%). These give overall response rates (household response rate times individual response rate) for the women's interview of 92.5% and for the children's interview of 94.1%.
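    The response rates in the example can be reproduced from the counts it quotes. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
def rate(completed, eligible):
    """Response rate as a fraction of eligible units."""
    return completed / eligible


household = rate(4811, 4996)   # occupied households interviewed
women = rate(7505, 7815)       # eligible women interviewed
children = rate(3167, 3242)    # children whose mother or caretaker was interviewed

# Overall rate = household response rate times individual response rate.
overall_women = household * women
overall_children = household * children

for label, value in [
    ("household", household),
    ("women", women),
    ("children", children),
    ("overall women", overall_women),
    ("overall children", overall_children),
]:
    print(f"{label}: {value:.1%}")
```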

    Deviations from Sample Design

    This field only applies to sample surveys. Sometimes the reality of the field requires a deviation from the sampling design (for example because zones are difficult to access owing to weather problems, political instability, etc). If for any reason the implemented sample deviated from the design, this should be reported here.

    Response Rates

    The response rate is the percentage of households (or other sample units) that participated in the survey, based on the original sample size. Omissions may occur due to refusal to participate, impossibility to locate the respondent, or other reasons. Sometimes, a household may be replaced by another by design. Check that the information provided here is consistent with the sample size indicated in the "Sampling procedure" field and the number of records found in the dataset (for example, if the sample design mentions a sample of 5,000 households and the data files contain data on 4,500 households, the response rate should not be 100 percent). Provide if possible the response rates by stratum. If information is available on the causes of non-response (refusal/not found/other), provide this information as well. This field can also in some cases be used to describe non-responses in population censuses.
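    The consistency check described above can be sketched as follows. The function names and the tolerance are assumptions, and the check ignores replacement of households by design, which would need to be accounted for separately.

```python
def implied_response_rate(sample_size, records_in_data):
    """Response rate implied by the designed sample size and the record count."""
    return records_in_data / sample_size


def is_consistent(reported_rate, sample_size, records_in_data, tolerance=0.005):
    """True if the reported rate matches the implied rate within the tolerance."""
    implied = implied_response_rate(sample_size, records_in_data)
    return abs(reported_rate - implied) <= tolerance


# 5,000 households sampled, 4,500 found in the data files.
print(implied_response_rate(5000, 4500))
print(is_consistent(1.00, 5000, 4500))
print(is_consistent(0.90, 5000, 4500))
```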

    Weighting

    This field only applies to sample surveys. Provide here the list of variables used as weighting coefficients. If more than one variable is a weighting variable, describe how these variables differ from each other and what the purpose of each one of them is.

    Example: Sample weights were calculated for each of the data files. Sample weights for the household data were computed as the inverse of the probability of selection of the household, computed at the sampling domain level (urban/rural within each region). The household weights were adjusted for non-response at the domain level, and were then normalized by a constant factor so that the total weighted number of households equals the total unweighted number of households. The household weight variable is called HHWEIGHT and is used with the HH data and the HL data. Sample weights for the women's data used the un-normalized household weights, adjusted for non-response for the women's questionnaire, and were then normalized by a constant factor so that the total weighted number of women's cases equals the total unweighted number of women's cases. Sample weights for the children's data followed the same approach as the women's and used the un-normalized household weights, adjusted for non-response for the children's questionnaire, and were then normalized by a constant factor so that the total weighted number of children's cases equals the total unweighted number of children's cases.
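    The weighting steps in the example (inverse selection probability, non-response adjustment, normalization) can be sketched as below. The input values are invented for illustration, and each household is assumed to carry the selection probability and response rate of its sampling domain.

```python
def household_weights(selection_probs, response_rates):
    """Return normalized weights, one per household.

    Raw weight = 1 / selection probability, divided by the domain response
    rate; the weights are then scaled by a constant factor so that they sum
    to the unweighted number of households.
    """
    raw = [1.0 / p / r for p, r in zip(selection_probs, response_rates)]
    factor = len(raw) / sum(raw)
    return [w * factor for w in raw]


# Hypothetical values for four households in two domains.
probs = [0.01, 0.01, 0.05, 0.05]
rates = [0.95, 0.95, 0.90, 0.90]
weights = household_weights(probs, rates)
print(weights)
print(sum(weights))
```

    Because of the normalization, weighted and unweighted case counts agree, which is the property the example states for the household weights.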

    Data Collection

    Item Must Description Note

    Dates of Collection

    Enter the dates (at least month and year) of the start and end of the data collection. DATES MUST BE ENTERED IN THE ISO FORMAT YYYY-MM-DD. In some cases, data collection for the same survey can be conducted in waves. In such cases, you should enter the start and end date of each wave separately, and identify each wave in the "cycle" field.
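    A date entry can be validated against the YYYY-MM-DD format before it is recorded. The helper below is a sketch with a hypothetical name; it relies on the standard library's date.fromisoformat.

```python
from datetime import date


def parse_wave(start, end):
    """Parse two ISO dates (YYYY-MM-DD) and check that start is not after end."""
    start_date = date.fromisoformat(start)  # raises ValueError on other formats
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("start date is after end date")
    return start_date, end_date


print(parse_wave("2006-09-02", "2006-10-17"))
```

    A value such as 02/09/2006 raises ValueError, which is the point at which the entry should be corrected.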

    Time Periods

    This field will usually be left empty. Time periods differ from the dates of collection in that they represent the period for which the data collected are applicable or relevant. NOTE: DATES MUST BE ENTERED IN THE FORMAT YYYY-MM-DD

    Mode of Data Collection

    The mode of data collection is the manner in which the interview was conducted or information was gathered. This field is a controlled vocabulary field. Use the drop-down button in the Toolkit to select one option. In most cases, the response will be "face to face interview". But for some specific kinds of datasets, such as rainfall data, the response will be different.

    Notes on Data Collection

    This element is provided in order to document any specific observations, occurrences or events during data collection. Consider stating such items as:
    - Was a training of enumerators held? (elaborate)
    - Any events that could have a bearing on the data quality?
    - How long did an interview take on average?
    - Was there a process of negotiation between households, the community and the implementing agency?
    - Are anecdotal events recorded?
    - Have the field teams contributed by supplying information on issues and occurrences during data collection?
    - In what language was the interview conducted?
    - Was a pilot survey conducted?
    - Were there any corrective actions taken by management when problems occurred in the field?

    Example: The pre-test for the survey took place from August 15, 2006 - August 25, 2006 and included 14 interviewers who would later become supervisors for the main survey. Each interviewing team comprised 3-4 female interviewers (no male interviewers were used due to the sensitivity of the subject matter), together with a field editor, a supervisor and a driver. A total of 52 interviewers, 14 supervisors and 14 field editors were used. Data collection took place over a period of about 6 weeks from September 2, 2006 until October 17, 2006. Interviewing took place every day throughout the fieldwork period, although interviewing teams were permitted to take one day off per week. Interviews averaged 35 minutes for the household questionnaire (excluding salt testing), 23 minutes for the women's questionnaire, and 27 minutes for the under five children's questionnaire (excluding the anthropometry). Interviews were conducted primarily in English and Mumbo-jumbo, but occasionally used local translation in double-Dutch, when the respondent did not speak English or Mumbo-jumbo. Six staff members of GenCenStat provided overall fieldwork coordination and supervision. The overall field coordinator was Mrs. Doe.

    Questionnaires

    This element is provided to describe the questionnaire(s) used for the data collection. The following should be mentioned:
    - List of questionnaires and short description of each (all questionnaires must be provided as External Resources)
    - In what language were the questionnaires published?
    - Information on the questionnaire design process (based on a previous questionnaire, based on a standard model questionnaire, review by stakeholders). If a document was compiled that contains the comments provided by the stakeholders on the draft questionnaire, or a report prepared on the questionnaire testing, a reference to these documents should be provided here and the documents should be provided as External Resources.

    Example: The questionnaires for the Generic MICS were structured questionnaires based on the MICS3 Model Questionnaire with some modifications and additions. A household questionnaire was administered in each household, which collected various information on household members including sex, age, relationship, and orphanhood status. The household questionnaire includes household characteristics, support to orphaned and vulnerable children, education, child labour, water and sanitation, household use of insecticide treated mosquito nets, and salt iodization, with optional modules for child discipline, child disability, maternal mortality and security of tenure and durability of housing.

    In addition to a household questionnaire, questionnaires were administered in each household for women age 15-49 and children under age five. For children, the questionnaire was administered to the mother or caretaker of the child. The women's questionnaire includes women's characteristics, child mortality, tetanus toxoid, maternal and newborn health, marriage, polygyny, female genital cutting, contraception, and HIV/AIDS knowledge, with optional modules for unmet need, domestic violence, and sexual behavior. The children's questionnaire includes children's characteristics, birth registration and early learning, vitamin A, breastfeeding, care of illness, malaria, immunization, and anthropometry, with an optional module for child development.

    The questionnaires were developed in English from the MICS3 Model Questionnaires, and were translated into Mumbo-jumbo. After an initial review the questionnaires were translated back into English by an independent translator with no prior knowledge of the survey. The back translation from the Mumbo-jumbo version was independently reviewed and compared to the English original. Differences in translation were reviewed and resolved in collaboration with the original translators. The English and Mumbo-jumbo questionnaires were both piloted as part of the survey pretest. All questionnaires and modules are provided as external resources.

    Data Collectors

    This element is provided in order to record information regarding the persons and/or agencies that took charge of the data collection. This element includes 3 fields: Name, Abbreviation and Affiliation. In most cases, we will record here the name of the agency, not the names of interviewers. Only in the case of very small-scale surveys, with a very limited number of interviewers, will the names of persons be included as well. The Affiliation field is optional and not relevant in all cases. Example: Name: Central Statistics Office; Abbreviation: CSO; Affiliation: Ministry of Planning

    Supervision

    This element will provide information on the oversight of the data collection. The following should be considered:
    - Were the enumerators organized in teams that included a controller and a supervisor? With how many controllers/supervisors per interviewer?
    - What were the main roles of the controllers/supervisors?
    - Were there visits to the field by upper management? How often?

    Example: Interviewing was conducted by teams of interviewers. Each interviewing team comprised 3-4 female interviewers, a field editor, a supervisor, and a driver. Each team used a 4 wheel drive vehicle to travel from cluster to cluster (and where necessary within cluster). The role of the supervisor was to coordinate field data collection activities, including management of the field teams, supplies and equipment, finances, maps and listings, to coordinate with local authorities concerning the survey plan, and to make arrangements for accommodation and travel. Additionally, the field supervisor assigned the work to the interviewers, spot checked work, maintained field control documents, and sent completed questionnaires and progress reports to the central office. The field editor was responsible for reviewing each questionnaire at the end of the day, checking for missed questions, skip errors, fields incorrectly completed, and checking for inconsistencies in the data. The field editor also observed interviews and conducted review sessions with interviewers. Responsibilities of the supervisors and field editors are described in the Instructions for Supervisors and Field Editors, together with the different field controls that were in place to control the quality of the fieldwork. Field visits were also made by a team of central staff on a periodic basis during fieldwork. The senior staff of GenCenStat also made 3 visits to field teams to provide support and to review progress.


    Data Processing

    Item Must Description Note

    Data Editing

    The Data Editing field should contain information on how the data was treated or controlled for in terms of consistency and coherence. This item does not concern the data entry phase but only the editing of data, whether manual or automatic.
    - Was a hot deck or a cold deck technique used to edit the data?
    - Were corrections made automatically (by program), or by visual control of the questionnaire?
    - What software was used?
    If materials are available (specifications for data editing, report on data editing, programs used for data editing), they should be listed here and provided as external resources.

    Example: Data editing took place at a number of stages throughout the processing, including:
    a) Office editing and coding
    b) During data entry
    c) Structure checking and completeness
    d) Secondary editing
    e) Structural checking of SPSS data files
    Detailed documentation of the editing of data can be found in the "Data processing guidelines" document provided as an external resource.

    Other Processing

    Use this field to provide as much information as possible on the data entry design. This includes such details as:
    - Mode of data entry (manual or by scanning, in the field/in regions/at headquarters)
    - Computer architecture (laptop computers in the field, desktop computers, scanners, PDA, other; indicate the number of computers used)
    - Software used
    - Use (and rate) of double data entry
    - Average productivity of data entry operators; number of data entry operators involved and their work schedule
    Information on tabulation and analysis can also be provided here. All available materials (data entry/tabulation/analysis programs; reports on data entry) should be listed here and provided as external resources.

    Example: Data were processed in clusters, with each cluster being processed as a complete unit through each stage of data processing. Each cluster goes through the following steps:
    1) Questionnaire reception
    2) Office editing and coding
    3) Data entry
    4) Structure and completeness checking
    5) Verification entry
    6) Comparison of verification data
    7) Back up of raw data
    8) Secondary editing
    9) Edited data back up
    After all clusters are processed, all data is concatenated together and then the following steps are completed for all data files:
    10) Export to SPSS in 4 files (hh - household, hl - household members, wm - women, ch - children under 5)
    11) Recoding of variables needed for analysis
    12) Adding of sample weights
    13) Calculation of wealth quintiles and merging into data
    14) Structural checking of SPSS files
    15) Data quality tabulations
    16) Production of analysis tabulations
    Details of each of these steps can be found in the data processing documentation, data editing guidelines, data processing programs in CSPro and SPSS, and tabulation guidelines.

    Data entry was conducted by 12 data entry operators in two shifts, supervised by 2 data entry supervisors, using a total of 7 computers (6 data entry computers plus one supervisors' computer). All data entry was conducted at the GenCenStat head office using manual data entry. For data entry, CSPro version 2.6.007 was used with a highly structured data entry program, using a system controlled approach that controlled entry of each variable. All range checks and skips were controlled by the program and operators could not override these. A limited set of consistency checks were also included in the data entry program. In addition, the calculation of anthropometric Z-scores was also included in the data entry programs for use during analysis. Open-ended responses ("Other" answers) were not entered or coded, except in rare circumstances where the response matched an existing code in the questionnaire. Structure and completeness checking ensured that all questionnaires for the cluster had been entered, were structurally sound, and that women's and children's questionnaires existed for each eligible woman and child. 100% verification of all variables was performed using independent verification, i.e. double entry of data, with separate comparison of data followed by modification of one or both datasets to correct keying errors by original operators who first keyed the files. After completion of all processing in CSPro, all individual


    Estimates of Sampling Error

    sample, it would have been possible to use straightforward formulae for calculating sampling errors. However, the 2005-2006 MICS sample is the result of a multi-stage stratified design, and consequently needs to use more complex formulae. The SPSS complex samples module has been used to calculate sampling errors for the 2005-2006 MICS. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. This method is documented in the SPSS file CSDescriptives.pdf found under the Help, Algorithms options in SPSS.

    Sampling errors have been calculated for a select set of statistics (all of which are proportions due to the limitations of the Taylor linearization method) for the national sample, urban and rural areas, and for each of the five regions. For each statistic, the following are reported: the estimate, its standard error, the coefficient of variation (or relative error -- the ratio between the standard error and the estimate), the design effect, and the square root design effect (DEFT -- the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used), as well as the 95 percent confidence intervals (+/-2 standard errors). Details of the sampling errors are presented in the sampling errors appendix to the report and in the sampling errors table presented in the external resources.
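The quantities named in the example above follow directly from the estimate and two standard errors. The sketch below is only an illustration of those definitions, assuming both standard errors are already known (in the example they come from the SPSS complex samples module); the function name and the input values are hypothetical and are not taken from the MICS report.

```python
def sampling_error_summary(estimate, se, se_srs):
    """Summarise the sampling error of one statistic.

    estimate: the survey estimate (a proportion or a mean)
    se:       standard error under the actual (complex) sample design
    se_srs:   standard error a simple random sample would have given
    """
    cv = se / estimate          # coefficient of variation (relative error)
    deft = se / se_srs          # square root of the design effect
    deff = deft ** 2            # design effect
    lower = estimate - 2 * se   # 95 percent interval as +/- 2 standard errors
    upper = estimate + 2 * se
    return {"cv": cv, "deft": deft, "deff": deff, "ci": (lower, upper)}


# Hypothetical inputs, for illustration only.
summary = sampling_error_summary(estimate=0.40, se=0.02, se_srs=0.016)
print(summary)
```

A DEFT above 1 indicates that the complex design yields a larger standard error than a simple random sample would.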

    Other Forms of Data Appraisal

    This section can be used to report any other action taken to assess the reliability of the data, or any observations regarding data quality. This item can include:
    - For a population census, information on the post enumeration survey (a report should be provided in external resources and mentioned here).
    - For any survey/census, a comparison with data from another source.
    - Etc.

    Example: A series of data quality tables and graphs are available to review the quality of the data and include the following:
    - Age distribution of the household population
    - Age distribution of eligible women and interviewed women
    - Age distribution of eligible children and children for whom the mother or caretaker was interviewed
    - Age distribution of children under age 5 by 3 month groups
    - Age and period ratios at boundaries of eligibility
    - Percent of observations with missing information on selected variables
    - Presence of mother in the household and person interviewed for the under 5 questionnaire
    - School attendance by single year age
    - Sex ratio at birth among children ever born, surviving and dead by age of respondent
    - Distribution of women by time since last birth
    - Scatter plot of weight by height, weight by age and height by age
    - Graph of male and female population by single years of age
    - Population pyramid

    The results of each of these data quality tables are shown in the appendix of the final report and are also given in the external resources section. The general rule for presentation of missing data in the final report tabulations is that a column is presented for missing data if the percentage of cases with missing data is 1% or more. Cases with missing data on the background characteristics (e.g. education) are included in the tables, but the missing data rows are suppressed and noted at the bottom of the tables in the report (not in the SPSS output, however).
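The 1% presentation rule in the example above is simple enough to state as a check. This is a minimal sketch, assuming the counts of missing and total cases are available for each tabulated variable; the function name and the handling of an empty table are assumptions made for illustration.

```python
def show_missing_column(n_missing, n_total, threshold_pct=1.0):
    """Return True when a 'missing' column should be shown in a tabulation.

    Applies the rule from the example: show the column when the percentage
    of cases with missing data is at or above the threshold (1% by default).
    """
    if n_total == 0:
        # Assumption: an empty table never shows a missing column.
        return False
    return 100.0 * n_missing / n_total >= threshold_pct


print(show_missing_column(12, 1000))  # 1.2% of cases are missing
print(show_missing_column(5, 1000))   # 0.5% of cases are missing
```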

    Data Access

    Item Must Description Note

    Citation Requirement

    Citation requirement is the way that the dataset should be referenced when cited in any publication. Every dataset should have a citation requirement. This will guarantee that the data producer gets proper credit, and that analytical results can be linked to the proper version of the dataset. The Access Policy should explicitly mention the obligation to comply with the citation requirement. The citation should include at least the primary investigator, the name and abbreviation of the dataset, the reference year, and the version number. Include also a website where the data or information on the data is made available by the official data depositor.

    Example: "National Statistics Office of Popstan, Multiple Indicators Cluster Survey 2000 (MICS 2000), Version 1.1 of the public use dataset (April 2001), provided by the National Data Archive. www.nda_popstan.org"
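Because the citation is assembled from a fixed set of components, an agency could generate it from its metadata instead of typing it for each dataset. The sketch below reproduces the wording of the example above; the function and parameter names are hypothetical and are not part of the template or of any IHSN tool.

```python
def build_citation(investigator, title, abbreviation, version,
                   dataset_type, release_date, distributor, url):
    """Assemble a citation requirement following the example's wording."""
    return (
        f"{investigator}, {title} ({abbreviation}), "
        f"Version {version} of the {dataset_type} ({release_date}), "
        f"provided by the {distributor}. {url}"
    )


citation = build_citation(
    investigator="National Statistics Office of Popstan",
    title="Multiple Indicators Cluster Survey 2000",
    abbreviation="MICS 2000",
    version="1.1",
    dataset_type="public use dataset",
    release_date="April 2001",
    distributor="National Data Archive",
    url="www.nda_popstan.org",
)
print(citation)
```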


    Access Conditions

    Each dataset should have an "Access policy" attached to it. The IHSN recommends three levels of accessibility:
    - Public use files, accessible to all
    - Licensed datasets, accessible under conditions
    - Datasets only accessible in a data enclave, for the most sensitive and confidential data.

    The IHSN has formulated standard, generic policies and access forms for each one of these three levels (which each country can customize to its specific needs). One of the three policies may be copied and pasted into this field once it has been edited as needed and approved by the appropriate authority. Before you fill this field, a decision has to be made by the management of the data depositor agency. Avoid writing a specific statement for each dataset.

    If the access policy is subject to regular changes, you should enter here a URL where the user will find detailed information on the access policy which applies to this specific dataset. If the datasets are sold, pricing information should also be provided on a website instead of being entered here. If the access policy is not subject to regular changes, you may enter more detailed information here. For a public use file for example, you could enter information like:

    The dataset has been anonymized and is available as a Public Use Dataset. It is accessible to all for statistical and research purposes only, under the following terms and conditions:
    1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of the [National Data Archive].
    2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organizations.
    3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to the [National Data Archive].
    4. No attempt will be made to produce links among datasets provided by the [National Data Archive], or among data from the [National Data Archive] and other datasets that could identify individuals or organizations.
    5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from the [National Data Archive] will cite the source of data in accordance with the Citation Requirement provided with each dataset.
    6. An electronic copy of all reports and publications based on the requested data will be sent to the [National Data Archive].

    The original collector of the data, the [National Data Archive], and the relevant funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.

    Access Authority

    This section is composed of several parts: Name-Affiliation-email-URI. This information identifies the contact person or entity from whom authority to access the data can be obtained. It is advisable to use a generic email contact such as [email protected] whenever possible to avoid tying access to a particular individual whose functions may change over time.

    Confidentiality

    If the dataset is not anonymized, we may indicate here what Affidavit of Confidentiality must be signed before the data can be accessed. Another option is to include this information in the next element (Access conditions). If there is no confidentiality issue, this field can be left blank. An example of such a statement could be the following:

    Confidentiality of respondents is guaranteed by Articles N to NN of the National Statistics Act of [date]. Before being granted access to the dataset, all users have to formally agree:
    1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
    2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
    3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.

    This statement does not replace a more comprehensive data agreement (see Access condition).

    Disclaimer and Copyright

    Item Must Description Note

    Disclaimer

    A disclaimer limits the liability that the Statistics Office has regarding the use of the data. A standard legal statement should be used for all datasets from the same agency. The IHSN recommends the following formulation: The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.

    Copyright

    Include here a copyright statement on the dataset, such as: (c) 2007, Popstan Central Statistics Agency

    Contacts

    Item Must Description Note

    Contact Persons

    Users of the data may need further clarification and information. This section may include the name-affiliation-email-URI of one or multiple contact persons. Avoid giving the names of individuals. The information provided here should be valid for the long term. It is therefore preferable to identify contact persons by a title. The same applies for the email field. Ideally, a "generic" email address should be provided. It is easy to configure a mail server in such a way that all messages sent to the generic email address would be automatically forwarded to some staff members.

    Example: Name: Head, Data Processing Division Affiliation: National Statistics Office Email: [email protected] URI: www.cso.org/databank

    DATASET METADATA

    Data Files

    Item Must Description Note

    Contents

    A data filename usually provides little information on its content. Provide here a description of this content. This description should clearly distinguish collected variables and derived variables. It is also useful to indicate the availability in the data file of some particular variables such as the weighting coefficients. If the file contains derived variables, it is good practice to refer to the computer program that generated it.

    Examples:
    - The file contains data related to section 3A of the household survey questionnaire (Education of household members aged 6 to 24 years). It also contains the weighting coefficient, and various recoded variables on levels of education.
    - The file contains derived data on household consumption, annualized and aggregated by category of products and services. The file also contains a regional price deflator variable and the household weighting coefficient. The file was generated using a Stata program named "cons_aggregate.do" available in the external resources.

    Producer

    Put the name of the agency that produced the data file. Most data files will have been produced by the survey primary investigator. In some cases however, auxiliary or derived files from other producers may be released with a data set. This may for example include CPI data generated by a different agency, or files containing derived variables generated by a researcher.

    Version

    A data file may undergo various changes and modifications. These file-specific versions can be tracked in this element. This field will in most cases be left empty. It is more important to fill the field identifying the version of the dataset (see above).

    Processing Checks

    Use this element if needed to provide information about the types of checks and operations that have been performed on the data file to make sure that the data are as correct as possible, e.g. consistency checking, wildcode checking, etc. Note that the information included here should be specific to the data file. Information about data processing checks that have been carried out on the data collection (study) as a whole should be provided in the "Data editing" element at the study level. You may also provide here a reference to an external resource that contains the specifications for the data processing checks (that same information may be provided also in the "Data Editing" field in the Study Description section).

    Missing Data

    Missing data can be given a specific code. A common convention is to repeat the digit "9" to fill the field. This value needs to be defined as missing in the data set and can be explained in detail in this element.

    Notes

    This field, which aims to provide information to the user on items not covered elsewhere, will in most cases be left empty.

    VARIABLES METADATA

    Description

    Item Must Description Note

    Definition

    This element provides a space to describe the variable in detail. Not all variables require definition. The following variables should always be defined when available in a questionnaire:
    - Household (attach this definition to the "household ID" variable)
    - Head of household (attach this definition to the variable "relationship to the head")
    - Urban/rural

    Universe

    The universe at the variable level reflects skip patterns within records in a questionnaire. This information can typically be copied and pasted from the survey questionnaire. Try to be as specific as possible. This information is very useful for the analyst. In many cases, a block of variables will have the same universe (for example, a block of variables on education can all relate to the population aged 6 to 24 years). The Toolkit allows you to select multiple variables and enter the universe information to all variables at once.

    Source of Information

    Enter information regarding who provided the information contained within the variable. In most cases, the source will be "Head of household" or "Household member". But it may also be:
    - GPS measure (for geographic position)
    - Interviewer's visual observation (for type of dwelling)
    - Best informant in community
    - Etc.

    Concepts

    A fuller description of the nature of the variable can be placed in this element. For example, this element can provide a clearer definition for certain variables (e.g. a variable that provides information on whether a person is a household member). In the case of household membership, a conceptual definition can be provided.

    Example: A household member is defined as any person who has been resident in the household for six months or more in a given year and takes meals together OR by default the head of household, infants under 6 months, newly wedded couples etc.

    Question

    Item Must Description Note

    Pre-Question Text

    The pre-question texts are the instructions provided to the interviewers and printed in the questionnaire before the literal question. This does not apply to all variables. Do not confuse this with instructions provided in the interviewer's manual. With this and the next two fields, one should be able to understand how the question was asked during the interview.

    Example:
    - Pre-question: Check age. If child is 3 years old or more, ask:
    - Literal question: Does (name) attend any organized learning or early childhood education programme, such as private or government facility, including kindergarten or community child care?
    - Post-question: If answer is 2 or 9 > Goto next module

    Literal Question

    The literal question is the full text of the question as the enumerator is expected to ask it when conducting the interview. This does not apply to all variables (it does not apply to derived variables). Together with the pre-question and post-question texts, it should allow one to understand how the question was asked during the interview. See the example under Pre-Question Text.

    Post-Question Text

    The post-question texts are instructions provided to the interviewers, printed in the questionnaire after the literal question. Post-question can be used to enter information on skips provided in the questionnaire. This does not apply to all variables. Do not confuse this with instructions provided in the interviewer's manual. See the example under Pre-Question Text.

    Interviewer Instructions

    Copy and paste the instructions provided to the interviewers in the interviewer's manual. In cases where some instructions relate to multiple variables, repeat the information in all variables. The Toolkit allows you to select multiple variables and enter the information to all these variables at once.

    Imputation and Derivation

    Item Must Description Note

    Imputation

    The field is provided to record any imputation or replacement technique used to correct inconsistent or unreasonable data. It is recommended that this field provide a summary of what was done and include a reference to a file in the external resources section.

    Recoding and Derivation

    This element applies to data that were obtained by recoding collected variables, or by calculating new variables that were not directly obtained from data collection. It is very important to properly document such variables. Poorly documented variables cannot (or should not) be used by researchers.

    In cases where the recoding or derivation method was very simple, a full description can be provided here. For example, if variable AGE_GRP was obtained by recoding variable S1Q3, we could simply mention "Variable obtained by recoding the age in years provided in variable S1Q3 into age groups for years 0-4, 5-9, ..., 60-64, 65 and over. Code 99 indicates unknown age."

    When the derivation method is more complex, provide here a reference to a document (and/or computer program) to be provided as an External Resource. This will be the case for example for a variable "TOT_EXP" containing the household annual total expenditure, obtained from a household budget survey. In such a case, the information provided here could be:

    "This variable provides the annual household expenditure. It was obtained by aggregating expenditure data on all goods and services, available in sections 4 to 6 of the household questionnaire. It contains imputed rental values for owner-occupied dwellings. The values have been deflated by a regional price deflator available in variable REG_DEF. All values are in local currency. Outliers have been fixed. Details on the calculations are available in Appendix 2 of the Report on Data Processing, and in the Stata program "aggregates.do" available in external resources."
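A simple recode such as the AGE_GRP example is easier to verify when the rule is also written as code. The sketch below is one possible reading of that example, not the actual program behind any dataset. The group codes 1 to 14 and the use of None for an unknown age in S1Q3 are assumptions made for illustration; only the group boundaries and the code 99 come from the example.

```python
UNKNOWN_AGE_GROUP = 99  # "Code 99 indicates unknown age"


def recode_age_group(age_years):
    """Recode age in years (S1Q3) into an age-group code (AGE_GRP).

    Assumed coding: 1 = 0-4, 2 = 5-9, ..., 13 = 60-64, 14 = 65 and over.
    An unknown age is assumed to arrive as None (or as a negative value).
    """
    if age_years is None or age_years < 0:
        return UNKNOWN_AGE_GROUP
    if age_years >= 65:
        return 14
    return age_years // 5 + 1  # five-year bands starting at age 0


# Hypothetical S1Q3 values.
ages = [0, 4, 5, 37, 64, 65, 90, None]
print([recode_age_group(a) for a in ages])
```

Keeping the rule in one small function also makes it easy to list it as an external resource alongside the variable description.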

    Others

    Item Must Description Note

    Security

    This field will be left empty in most cases. It can be used to identify variables that are direct identifiers of the respondents (or highly identifying indirect identifiers), and that should not be released.

    Notes

    This element is provided in order to record any additional or auxiliary information related to the specific variable.
