Estimating Poverty Rates in Target Populations - World Bank · 2016. 8. 15. · Notes: N indicates...
Policy Research Working Paper 7793
Estimating Poverty Rates in Target Populations
An Assessment of the Simple Poverty Scorecard and Alternative Approaches
Alexis Diamond
Michael Gill
Miguel Rebolledo Dellepiane
Emmanuel Skoufias
Katja Vinha
Yiqing Xu
Poverty and Equity Global Practice Group
August 2016
WPS7793
Produced by the Research Support Team
Abstract
The performance of the Simple Poverty Scorecard is compared against the performance of established regression-based estimators. All estimates are benchmarked against observed poverty status based on household expenditure (or income) data from household socioeconomic surveys that span nearly a decade and are representative of subnational populations. When the models all adopt the same “one-size-fits-all” training approach, there is no meaningful difference in performance and the Simple Poverty Scorecard is as good as any of the regression-based estimators. The findings change, however, when the regression-based estimators are “trained” on “training sets” that more closely resemble potential subpopulation test sets. In this case, regression-based models outperform the nationally calculated Simple Poverty Scorecard in terms of bias and variance. These findings highlight the fundamental trade-off between simplicity of use and accuracy.
This paper is a product of the Poverty and Equity Global Practice Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at [email protected].
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Estimating Poverty Rates in Target Populations: An Assessment of the Simple Poverty Scorecard and Alternative Approaches
Alexis Diamond, International Finance Corporation
Michael Gill, Harvard University
Miguel Rebolledo Dellepiane, International Finance Corporation
Emmanuel Skoufias*, The World Bank Group
Katja Vinha, The World Bank Group
Yiqing Xu, Massachusetts Institute of Technology
JEL Classification: I31, I32, I38
Keywords: Simple Poverty Scorecard, PPI, Headcount poverty rate, Training and Test data sets.
*Corresponding author: Emmanuel Skoufias, The World Bank (Mail Stop: I4-405), 1818 H Street NW, Washington DC 20433-USA. tel: (202) 458-7539. fax: (202) 522-3134. e-mail: [email protected].
Acknowledgements: We are grateful to Nobuo Yoshida and Phillippe Leite for inputs at the early stages of this study and to Mark Schreiner for discussions and clarifications provided through the process of this study.
Contents
I. Introduction
II. Data
III. Methods and Notation
A. Household-level Poverty Probabilities: Several Approaches
B. Regression-based Alternatives to the SPS
C. Group-Level Poverty Rates
D. Measures of Uncertainty via Bootstrap
IV. Results
A. Estimating National Poverty Rates
B. Stratum-Specific Poverty Rates
C. Testing Estimator Resilience over Time
V. Concluding Remarks and the Way Forward
References
Appendix 1: A Detailed Summary of the Poverty Scorecard Methodology
Appendix 2: SPS Survey Questions and Lookup Tables
Appendix 3: Additional Figures from other Countries
Appendix 4: Stratum Specific Poverty Rates
I. Introduction
The World Bank Group (WBG)’s twin goals, eliminating extreme poverty and boosting shared prosperity, have intensified the interest in measuring poverty rates in specific populations targeted by development programs worldwide. Private-sector firms and financial institutions (especially micro-finance institutions, NGOs, agriculture and agribusiness enterprises) are also increasingly selling products and services to the poor, buying from them, and seeking to estimate poverty levels in specific market segments to inform their business strategies and operations. The most rigorous poverty estimation methodologies are, however, not necessarily practical for development practitioners or private sector firms. The best sources of poverty data are based on government-run large-scale nationally-representative household surveys that collect highly detailed socioeconomic household information, cost millions of dollars, and take years to design and implement (Benin and Randriamomonjy, 2008). However useful these survey instruments and data may be for many applications, they do not enable direct estimation of poverty rates in targeted idiosyncratic populations. Even if the relevant national data happen to be available, the alternative approach of using small area estimation or survey to survey imputation methods is very data intensive and requires a level of technical sophistication that makes this approach impractical for practitioners (Elbers, et al, 2003; Christiaensen et al, 2012; Tarozzi and Deaton, 2009). The most popular solution for project-specific poverty estimation is the Simple Poverty Scorecard (SPS) described in Schreiner (2014a), which is implemented with a 10-question survey and (in its basic form) a statistical look-up table.1 The SPS is based on a logistic regression, but it departs from established econometric approaches. Despite the SPS’s widespread use there is little published academic literature assessing its performance when applied to subnational populations across countries and time-periods, as it is actually utilized by researchers.2
We seek to fill this gap in the literature by evaluating the SPS, performing several thousand statistical experiments across diverse populations and strata in nine separate countries, using a collection of surveys that are representative at the subnational level and span nearly a decade. These experiments assess the performance of the nationally calculated SPS versus the performance of established regression-based estimators.3 We benchmark all estimates against observed data on poverty status derived from government-run sub-nationally representative socioeconomic household surveys. We find that established regression-based models like ordinary least squares, logistic regression, and lasso regression (trained on “training set” data and tested on “test set” data) outperform the SPS in terms of bias and variance. In many of these experiments, our regression-based estimators perform better than the SPS because they are informed by additional and target population specific information, reflecting a basic but important point: the SPS does not take advantage of any information beyond its ten survey questions, even though researchers almost always have additional data at their disposal. Using all relevant available information is especially important when targeting groups that significantly differ from the national populations upon which SPSs are based, because the modeled relationship between household characteristics and poverty status in a given national population may not hold true for subpopulations in the country.

1 See Appendix 1 for a summary of the methodology.
2 As Schreiner (2015) states, “Like all predictive models, the scorecard here is constructed from a single sample and so misses the mark to some unknown extent when applied to a validation sample. Furthermore, it is biased when applied (in practice) to a different population….. (because the relationships between indicators and poverty change over time).”
3 It is important to clarify that this report evaluates the performance of the SPS in estimating poverty for a target population and not the performance of SPS in predicting the poverty status of an individual for targeting based on poverty status or for program eligibility. In fact the SPS can be and has been used for targeting based on individual poverty status, but the evaluation of the targeting performance of SPS is outside the scope of this report.
We advise researchers attempting to estimate poverty rates in particular samples to adopt an approach based on the established regression-based techniques explored in our experiments. Our recommended approach is more accurate than the SPS, applicable to any population of interest, and able to exploit information beyond the ten SPS survey questions. Because we utilize national household surveys that are statistically representative for specific subnational strata, our approach can be used to derive reliable poverty rate estimates for these subnational groups.
This paper is organized as follows: Section II describes the data. Section III explains the methods and models under examination, including the SPS, and Section IV assesses their relative performance. Section V summarizes lessons learned and describes paths for future work.
II. Data
Our analysis draws from 14 national socioeconomic household surveys across 9 countries (see Table 1), which were implemented by the statistical agencies of the various governments. The surveys cover subnationally representative samples ranging in size from 3,579 households (in Sierra Leone, 2003) to 293,715 households (in Indonesia, 2010). Each survey contains a core questionnaire which consists of a household roster listing the sex, age, marital status, educational attainment, household income and/or expenditure information, and labor force experience of all household members. We use the expenditure or income data along with national poverty lines to determine the poverty status of each household. We then use the poverty status to calculate our benchmark poverty rates, referred to below as “observed” poverty rates, for the various populations of interest. Hundreds of questions are asked in each survey (e.g., 612 variables total in 21 data sets associated with the Bangladesh 2010 Household Income and Expenditure Survey), and the time and cost of implementation is a major reason that so many researchers have adopted the SPS approach. The SPS and the other models we examine below have been derived from these data sets, and are thus based upon the statistical relationships between observed poverty status and household characteristics. We also use these large-scale household surveys to test the models, assessing which models come closest to the observed poverty rates.
Table 1: Representative strata and national poverty rates in SPS sample

| Country (survey) | Year | Representative strata | Number of strata | N | Poverty rate (weighted) | Poverty rate (unweighted) |
|---|---|---|---|---|---|---|
| Bangladesh (HIES) | 2010 | Region | 7 | 12,209 | 0.2848 | 0.2881 |
| Indonesia (SUSENAS) | 2010 | Districts | 498 | 293,715 | 0.1000 | 0.1047 |
| Jordan (HEIS) | 2006 | Urban/Rural | 2 | 11,639 | 0.0942 | 0.0953 |
| Jordan (HEIS) | 2008 | Urban/Rural | 2 | 10,961 | 0.1533 | 0.1548 |
| Jordan (HEIS) | 2010 | Urban/Rural | 2 | 11,223 | 0.1202 | 0.1241 |
| Nepal (NLSS) | 2010 | Analytical domains | 12 | 5,988 | 0.2000 | 0.1852 |
| Paraguay (EPH) | 2011 | Department (modified) | 7 | 4,893 | 0.2626 | 0.2669 |
| Peru (ENAHO) | 2010 | Department | 24 | 20,048 | 0.2673 | 0.3064 |
| Peru (ENAHO) | 2011 | Department | 24 | 22,978 | 0.2433 | 0.2752 |
| Peru (ENAHO) | 2012 | Department | 24 | 23,349 | 0.2254 | 0.2551 |
| Sierra Leone (SLIHS) | 2003 | Region | 4 | 3,579 | 0.6106 | 0.6767 |
| Sierra Leone (SLIHS) | 2011 | District | 14 | 6,693 | 0.4572 | 0.4715 |
| Thailand (SES) | 2011 | Region Urban/Rural | 9 | 42,083 | 0.1339 | 0.1205 |
| Uganda (UNPS) | 2009 | Region | 4 | 6,755 | 0.2436 | 0.2364 |

Notes: N indicates the sample size of the data set used to calibrate the SPS. The weighted poverty rate uses population survey weights and the unweighted does not.
III. Methods and Notation
For each data set in our sample, we estimate the national poverty rate and stratum-specific poverty rates through various approaches. Here we present the notation and a general overview of the approaches used in our analysis. There is a set of households N = 1, …, n. Let $y_i$ denote household i’s observed poverty status (i.e., in poverty or not, 1 or 0) as determined by their expenditure (or income) in reference to the national poverty line. While there are 10 questions in each poverty scorecard, some of these questions have categorical responses with more than two categories. Some care is taken to preprocess the explanatory data for analysis. Let $k_q$ denote the number of possible categorical responses for each survey question q = 1, 2, …, 10; for example, if question q = 1 had possible responses A, B, and C, this means there are three response categories to that question. If a respondent were to be asked all 10 questions, it follows that the total number of binary indicators needed to account for all survey responses is $p = \sum_{q=1}^{10} (k_q - 1)$. Hence, in our reformatted data, $x_{ij}$ is a binary indicator (1 or 0) of whether household i provided response j, for each j = 1, …, p.
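The preprocessing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the question names, the response options, and the helper `encode_household` are hypothetical, and the sketch assumes that the first listed category of each question serves as a reference category that receives no indicator.

```python
from typing import Dict, List


def encode_household(responses: Dict[str, str],
                     categories: Dict[str, List[str]]) -> List[int]:
    """Turn one household's categorical answers into 0/1 indicators.

    Assumption: the first category listed for each question is the
    reference category and gets no indicator of its own.
    """
    indicators: List[int] = []
    for question, options in categories.items():
        answer = responses[question]
        if answer not in options:
            raise ValueError(f"unexpected answer {answer!r} to {question!r}")
        # One indicator per non-reference category of this question.
        indicators.extend(1 if answer == option else 0 for option in options[1:])
    return indicators


# Hypothetical two-question scorecard: q1 has 3 categories, q2 has 2.
categories = {"q1": ["A", "B", "C"], "q2": ["no", "yes"]}

# Total number of indicators under the reference-category assumption.
p = sum(len(options) - 1 for options in categories.values())

row = encode_household({"q1": "C", "q2": "yes"}, categories)
print(p, row)
```

A real scorecard would have ten questions; the same function applies unchanged once `categories` lists all of them.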
A. Household-level Poverty Probabilities: Several Approaches
Simple Poverty Scorecard (SPS): The SPS is a poverty estimation methodology developed by Mark Schreiner of Microfinance Risk Management, L.L.C.4 Each SPS is designed for a particular country and a particular year. Design begins with a nationally representative household survey, which is taken as the benchmark for poverty estimation in that country at the time of the survey. For a given survey, half the data are used as a “training set” to develop (or train) the model that the SPS will use to estimate poverty rates, and the other half (the “test set”) are used to validate the accuracy of the constructed model. To train the model, the SPS developer repeatedly analyzes the training set, attempting to identify 10 questions from the national household survey that can reliably predict household-level poverty status. This is an iterative model-selection process that relies on both statistical methods (logistic regression) and professional judgment, in an effort to identify variables with high predictive power that can be easily collected and verified by surveyors in the field. Details on the development and calibration of SPS for particular data sets in our sample can be found in Schreiner (2010, 2011a, b, 2012a, b, c, 2013a, b, 2014b).5 SPSs have been estimated for at least 63 countries, and in many countries they have been periodically updated when new household surveys have become available.
Once an SPS has been developed, actually calculating poverty rates for a given data set is straightforward. All that is needed is basic arithmetic, pencil and paper, and a lookup table that converts each household’s survey result (the “poverty score”) into an estimated probability that that household is below a specific poverty line. After converting the results into probabilities for all respondents, the average probability is adjusted by an additive bias-adjustment factor (typically a fraction of a percentage point) and the result is the poverty-rate estimate. Technology applications to do this are also widely available and used by the private sector. One limitation of the SPS is that its look-up table (which is derived from logistic regression results) is calibrated to deliver certain discrete estimated probabilities for each household. In the case of the Bangladesh 2010 SPS, the look-up table converts any survey result into one of 18 discrete poverty rate estimates. This SPS can estimate a household-level probability of 40.9%, or 50.4%, but nothing in between (see Figure 1).
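The arithmetic described above can be illustrated with a short Python sketch. The score bands, probabilities, and adjustment value below are invented for illustration and are not taken from any published scorecard; the function names are likewise hypothetical, and a 0 to 100 score range is assumed.

```python
import bisect
from typing import List

# Hypothetical lookup table: inclusive upper score of each band and the
# poverty probability assigned to every household falling in that band.
BAND_UPPER = [24, 49, 74, 100]
BAND_PROBABILITY = [0.80, 0.50, 0.20, 0.05]


def score_to_probability(score: int) -> float:
    """Convert a poverty score into a discrete probability."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    return BAND_PROBABILITY[bisect.bisect_left(BAND_UPPER, score)]


def sps_poverty_rate(scores: List[int], bias_adjustment: float = 0.0) -> float:
    """Average the looked-up probabilities, then add the adjustment."""
    probabilities = [score_to_probability(s) for s in scores]
    return sum(probabilities) / len(probabilities) + bias_adjustment


scores = [10, 30, 60, 90]  # made-up household scores
print(sps_poverty_rate(scores))                          # unadjusted average
print(sps_poverty_rate(scores, bias_adjustment=-0.005))  # with an adjustment
```

Because every household in a band receives the same probability, the set of possible household-level outputs is limited to the number of bands, which is the coarsening discussed in the text.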
4 More details available online at http://www.microfinance.com/#Poverty_Scoring.
5 Or see Appendix 1 for a summary of the methodology.
Figure 1. Comparing the SPS and Fitted Values with a Simple Logit Model (Bangladesh, 2010). This plot compares the household-level poverty probabilities from a logistic regression model (the horizontal axis) against those obtained from the SPS (the vertical axis) using data from the Bangladesh 2010 HIES. As a result of data coarsening in the SPS lookup table, the number of unique probabilities generated by SPS is considerably lower than logit. While there is a strong positive association between the results from both approaches, the figure demonstrates there is information sacrificed in the calculation of SPS probabilities.
SPS authors claim good properties for their estimator when it is applied to data that mirror the national household samples that formed the original SPS training sets. Indeed, the SPS’s claims to reliability are, almost without exception,6 limited to SPS’s application to the corresponding national population data. The problem with this claim is that the need to use an SPS to estimate poverty rates in these target populations is not at all obvious, given that the best and easiest approach would be to derive the numbers directly from the original national data sets (or read the summary statistics published by government agencies). In practice, the SPS is typically used to estimate poverty rates in specific subnational groups; microfinance institutions and other private sector actors frequently use it in this way. Similarly, when the SPS is used by development practitioners to assess poverty rates in project areas, or to compare poverty rates before versus after projects or in treatment versus control samples, the target population is typically a specific subnational group of project participants. The SPS is sometimes used to estimate poverty rates in groups that are significantly poorer than the national average (e.g., female-headed agricultural households in a particular region of a country). The SPS authors are cognizant of the difference between the ideal conditions for SPS use and the way it is actually used by practitioners, and SPS documentation always includes a caveat to this effect, like the one quoted in footnote 2.
B. Regression-based Alternatives to the SPS
Ordinary Least Squares (OLS): Using observed survey responses for each of the ten poverty scorecard questions, we estimate the following linear probability model

$$y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \qquad i = 1, 2, \ldots, n. \tag{OLS}$$

In this context, each observation in our training sample is given equal weight toward the model’s estimation. As is well known, the regression coefficients are estimated so as to minimize the Sum of Squared Errors (SSE), i.e.,

$$\mathrm{SSE} = \sum_{i=1}^{n} \varepsilon_i^{2} = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^{2}.$$

In the event the empirical distribution of the training sample is reflective of the researcher’s population of interest, the OLS estimator may provide unbiased and consistent estimates of group-level poverty rates despite the fact that household-level poverty estimates may fall outside of the unit interval.
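To make the last point concrete, the following Python sketch fits a linear probability model by solving the normal equations on a tiny made-up data set. The function names (`solve`, `fit_ols`, `predict`) and the data are illustrative assumptions, and an intercept is assumed; the sketch is not the estimation code used in the paper.

```python
from typing import List


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_ols(X: List[List[float]], y: List[float]) -> List[float]:
    """Return [intercept, slope_1, ..., slope_p] minimizing the SSE."""
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(Z[0])
    ZtZ = [[sum(z[a] * z[b] for z in Z) for b in range(k)] for a in range(k)]
    Zty = [sum(z[a] * y_i for z, y_i in zip(Z, y)) for a in range(k)]
    return solve(ZtZ, Zty)


def predict(coef: List[float], X: List[List[float]]) -> List[float]:
    return [coef[0] + sum(b * x for b, x in zip(coef[1:], row)) for row in X]


# Made-up training data: two binary indicators, observed poverty status.
X = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]]
y = [0, 0, 0, 1, 1, 1, 1, 1]

coef = fit_ols(X, y)
fitted = predict(coef, X)
print(coef)                       # intercept and two slopes
print(max(fitted))                # one fitted "probability" exceeds 1
print(sum(fitted) / len(fitted))  # yet the average matches the observed rate
```

In this toy example a household with both indicators equal to 1 gets a fitted value above 1, while the in-sample average of the fitted values still equals the observed poverty rate, which is why group-level rates can remain usable.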
6 One exception is Mark Schreiner’s paper “Is One Simple Poverty Scorecard Enough for India?” http://www.microfinance.com/English/Papers/Scoring_Poverty_India_Segments.pdf.

Weighted Least Squares (WLS): In contrast to OLS, WLS does not treat all observations in the training set as equally influential to the estimation procedure. This feature may be desirable the more a national survey is stratified, or if the empirical distribution of survey respondents does not closely approximate the proportion of households in the researcher’s population of interest, or if the residuals of the classic least squares models are heteroskedastic (Greene, 2003). In this report, we weight the influence of each observation according to its estimated inverse-probability weight $w_i$, which is obtained from the national surveys. As in Equation (OLS), the WLS regression model assumes a household’s probability of being impoverished is a linear function of observed covariates (i.e., survey responses). However, the WLS estimator minimizes the weighted sum of squared errors (WSSE), such that

$$\mathrm{WSSE} = \sum_{i=1}^{n} w_i \varepsilon_i^{2} = \sum_{i=1}^{n} w_i \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^{2}.$$
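The effect of the weights can be seen in the simplest possible case, a model with an intercept and a single binary indicator, where the weighted least squares solution has a closed form. The sketch below is illustrative only: the function name and the data are assumptions, and the general multi-indicator case would require a matrix solver instead.

```python
from typing import List, Tuple


def weighted_simple_regression(x: List[float], y: List[float],
                               w: List[float]) -> Tuple[float, float]:
    """Closed-form WLS fit of y on an intercept and one regressor."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total  # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data: one binary indicator, poverty status, and survey weights.
x = [0, 0, 0, 1, 1, 1]
y = [0, 0, 1, 0, 1, 1]
w = [1, 1, 4, 2, 1, 1]

unweighted = weighted_simple_regression(x, y, [1.0] * len(y))
weighted = weighted_simple_regression(x, y, w)
print(unweighted)  # equal weights reproduce OLS
print(weighted)    # survey weights shift both coefficients
```

With a single binary indicator, the fitted value for each group is simply the weighted mean of the outcomes in that group, so heavily weighted households pull the estimate toward their own poverty status.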
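Logistic Regression (Logit): In addition to OLS and WLS, we implement a simple logistic regression model to compare against poverty rates generated by the poverty scorecard. We model poverty with the well-known functional form

$$\Pr(y_i = 1 \mid x_{i1}, \ldots, x_{ip}) = \frac{\exp\big(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}\big)}{1 + \exp\big(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}\big)}. \tag{Logit}$$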
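A logistic regression of this kind can be fitted in several ways. The sketch below uses plain gradient ascent on the mean log-likelihood, chosen here only because it needs no external libraries; it is not necessarily how the paper's models were estimated. The function names, learning rate, iteration count, and data are all illustrative assumptions.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X: List[List[float]], y: List[int],
              learning_rate: float = 1.0, iterations: int = 2000) -> List[float]:
    """Return [intercept, slopes...] found by gradient ascent."""
    n = len(y)
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    beta = [0.0] * len(Z[0])
    for _ in range(iterations):
        probs = [sigmoid(sum(b * z for b, z in zip(beta, row))) for row in Z]
        # Gradient of the mean log-likelihood with respect to each coefficient.
        grad = [sum((y[i] - probs[i]) * Z[i][j] for i in range(n)) / n
                for j in range(len(beta))]
        beta = [b + learning_rate * g for b, g in zip(beta, grad)]
    return beta


def predict_probability(beta: List[float], X: List[List[float]]) -> List[float]:
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            for row in X]


# Made-up data: one binary indicator; poverty rates of 25% and 75% by group.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
y = [0, 0, 0, 1, 1, 1, 0, 1]

beta = fit_logit(X, y)
probs = predict_probability(beta, X)
print(beta)
print(sum(probs) / len(probs))  # estimated poverty rate for this sample
```

Unlike the linear probability model, every fitted value here lies strictly between 0 and 1, and the household-level probabilities are continuous rather than restricted to a fixed set of lookup values.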
Lasso Penalization: The lasso is a form of penalized regression, similar to ridge regression, whereby regression coefficients are shrunk towards zero by “shrinkage factors” (Tibshirani, 1994; Hastie et al., 2009; James et al. 2013). The lasso is commonly used for feature selection in high-dimensional learning problems to decrease the variance of a particular classifier. For our ordinary least squares estimators, we apply the lasso at the training-set level such that we solve the following problem:

$$\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^{2} \quad \text{s.t.} \quad \sum_{j=1}^{p} |\beta_j| \le s \tag{OLS Lasso}$$

where $s$ is a coefficient shrinkage factor, and $\beta_j$ is a linear estimate of the marginal influence of a survey response on poverty. Philosophically, this procedure is similar to an ordinary least squares regression procedure in which the best model is determined by that which minimizes the in-sample sum of squared residuals, except regression coefficients are penalized according to a prior rule (i.e., the shrinkage factor) on the minimum coefficient size a variable is allowed to have to be included in the final classification model. The conventional penalty used in the lasso is the $\ell_1$ penalty, which is defined by $\sum_{j=1}^{p} |\beta_j|$. In Equation (OLS Lasso), the coefficient shrinkage factor is therefore defined as $s = t \cdot \sum_{j=1}^{p} |\hat{\beta}_j^{\mathrm{OLS}}|$, where $t \in [0, 1]$ is a tuning factor. The optimal level of $t$ is chosen through 10-fold cross validation.
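The shrinkage behaviour can be illustrated with a small coordinate-descent sketch. Note the swap: the code solves the penalized (Lagrangian) form of the lasso, with a penalty weight `lam` on the sum of absolute slopes, rather than the constrained form with a bound on that sum; each value of `lam` corresponds to some bound. The function names, the objective scaling by 1/(2n), the unpenalized intercept, and the data are assumptions made for illustration, and the cross-validation step is omitted.

```python
from typing import List


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold, stopping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X: List[List[float]], y: List[float],
              lam: float, sweeps: int = 200) -> List[float]:
    """Minimize (1/2n) * SSE + lam * sum(|slopes|) by coordinate descent."""
    n, p = len(y), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(sweeps):
        # The intercept is not penalized: set it to the mean partial residual.
        intercept = sum(y[i] - sum(b * x for b, x in zip(beta, X[i]))
                        for i in range(n)) / n
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - intercept
                           - sum(beta[k] * X[i][k] for k in range(p) if k != j))
                for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / scale
    return [intercept] + beta


# Same made-up data as a linear probability model might use.
X = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]]
y = [0, 0, 0, 1, 1, 1, 1, 1]

print(fit_lasso(X, y, lam=0.0))  # no penalty: ordinary least squares
print(fit_lasso(X, y, lam=0.1))  # first slope is set exactly to zero
print(fit_lasso(X, y, lam=0.2))  # both slopes are zero, intercept = mean of y
```

The middle case shows the feature-selection property: the weaker indicator is dropped from the model entirely while the stronger one is kept with a shrunken coefficient.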
Training and Test Sets: For each national survey analyzed in this report, we divide the full survey sample into a training set and a test set (a random 50 percent of the data in each set), just as the SPS’s developers do. The regression-based models (OLS, logit, etc.) described above are trained (i.e., coefficients are estimated) exclusively on the training data set. To evaluate the performance of these models, we project the estimated model parameters (and their equivalent, the SPS’s poverty scores) onto the test set. By “project” we mean we use the coefficients estimated for the training data sets and the $x_{ij}$ values in the testing sets to derive the predicted poverty rate in the test data sets. It is important to keep in mind that all the regression-based models we employ at the national and at the stratum level use, on purpose, the same 10 variables used by the SPS model.7 All results presented in Section IV, including the stratum-specific analyses, use only the test set data.
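The split-and-project procedure can be sketched as follows. This is a minimal illustration with hypothetical helper names (`split_half`, `projected_rate`), a fixed random seed for reproducibility, and made-up coefficients standing in for a model already fitted on the training half.

```python
import random
from typing import List, Sequence, Tuple


def split_half(records: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Randomly assign half of the records to training, the rest to test."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def projected_rate(coefficients: List[float],
                   X_test: List[List[float]]) -> float:
    """Apply training-set coefficients to test-set indicators and average."""
    fitted = [coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))
              for row in X_test]
    return sum(fitted) / len(fitted)


household_ids = list(range(10))            # stand-ins for survey records
train_ids, test_ids = split_half(household_ids)
print(sorted(train_ids), sorted(test_ids))

# Hypothetical linear-probability coefficients from a training half,
# applied to the indicator values of four test-set households.
rate = projected_rate([0.1, 0.5], [[0], [1], [1], [0]])
print(rate)
```

Keeping the test half entirely out of estimation is what allows the projected rate to be compared fairly with the observed rate in the same test half.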
C. Group-Level Poverty Rates

The objective of the analysis is to estimate poverty rates at various levels. Here we distinguish the poverty rate of the survey sample from the poverty rate of the national population, and we distinguish both of these quantities from the poverty rate of a particular stratum of interest (i.e., a particular subset of the national population). The utility in estimating any of these quantities will depend precisely on how closely they map to the researcher’s target population of interest.
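Table 2: Calculation of Poverty Rates for Different Populations of Interest. The weighted-national estimate of the poverty rate is the classical Horvitz-Thompson estimator (Horvitz and Thompson, 1952). For stratum-specific poverty rates, S indicates the set of individuals in a particular stratum of interest, where $S \subset N$, and $\lvert S \rvert$ denotes the number of individuals in the stratum. Readers should note we do not directly estimate $w_i$ in this technical report, so the quality of our results that rely on these weights will depend on the quality of these weights derived by prior analysis and made available in these data sets.

| Poverty Rate of Interest | Observed Rate (in Data) | Estimated Rate (Fitted) |
|---|---|---|
| Sample | $\frac{1}{n} \sum_{i=1}^{n} y_i$ | $\frac{1}{n} \sum_{i=1}^{n} \hat{y}_i$ |
| National (Weighted) | $\frac{\sum_{i=1}^{n} w_i \cdot y_i}{\sum_{i=1}^{n} w_i}$ | $\frac{\sum_{i=1}^{n} w_i \cdot \hat{y}_i}{\sum_{i=1}^{n} w_i}$ |
| Stratum | $\frac{1}{\lvert S \rvert} \sum_{i \in S} y_i$ | $\frac{1}{\lvert S \rvert} \sum_{i \in S} \hat{y}_i$ |
| Stratum (Weighted) | $\frac{\sum_{i \in S} w_i \cdot y_i}{\sum_{i \in S} w_i}$ | $\frac{\sum_{i \in S} w_i \cdot \hat{y}_i}{\sum_{i \in S} w_i}$ |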
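The four rates in Table 2 differ only in which households are included and whether survey weights are applied. A minimal Python sketch, assuming the weighted rates are weight-normalized averages and using a hypothetical helper `poverty_rate` with made-up data:

```python
from typing import List, Optional, Sequence


def poverty_rate(status: Sequence[float],
                 weights: Optional[Sequence[float]] = None,
                 members: Optional[List[int]] = None) -> float:
    """Average poverty status (observed 0/1 or fitted values).

    weights: optional survey weights, one per household.
    members: optional list of household positions forming a stratum.
    """
    indices = list(range(len(status))) if members is None else members
    if weights is None:
        return sum(status[i] for i in indices) / len(indices)
    return (sum(weights[i] * status[i] for i in indices)
            / sum(weights[i] for i in indices))


y = [1, 0, 0, 1, 0, 1]   # observed poverty status (made up)
w = [2, 1, 1, 4, 1, 1]   # survey weights (made up)
north = [0, 1, 2]        # positions of households in one stratum

print(poverty_rate(y))                             # sample rate
print(poverty_rate(y, weights=w))                  # national (weighted)
print(poverty_rate(y, members=north))              # stratum
print(poverty_rate(y, weights=w, members=north))   # stratum (weighted)
```

Passing fitted values in place of `y` gives the corresponding estimated rates in the right-hand column of the table.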
D. Measures of Uncertainty via Bootstrap

For each of the surveys, we obtain both the observed rates of poverty based on the raw national surveys ($y_i$) and the estimated poverty rates obtained through the approaches described in Section III, $\hat{y}_i$. For all confidence intervals presented in this analysis, bootstrapped standard errors of the mean (using 5000 bootstrapped samples) are calculated for estimated poverty rates of interest (Efron and Tibshirani, 1998). Bootstrapped confidence intervals are similarly generated for the observed levels of poverty in the data, given that the observed rate of poverty is itself computed from a sample and is therefore an estimate of the unobserved population rate.
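A bootstrapped standard error of a mean can be computed as sketched below. The 5000 replicates follow the text; everything else is an assumption for illustration, including the function name, the fixed seed, the made-up data, and the use of a normal-approximation interval (mean plus or minus 1.96 standard errors), which may differ from how the paper's intervals were formed.

```python
import random
import statistics
from typing import List


def bootstrap_mean_se(values: List[float], replicates: int = 5000,
                      seed: int = 0) -> float:
    """Standard deviation of the mean across resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = [sum(rng.choices(values, k=n)) / n for _ in range(replicates)]
    return statistics.stdev(means)


# Made-up sample: 30 poor households out of 100.
poverty = [1] * 30 + [0] * 70
rate = sum(poverty) / len(poverty)
se = bootstrap_mean_se(poverty)
interval = (rate - 1.96 * se, rate + 1.96 * se)
print(rate, se, interval)
```

The same function applies to fitted values from any of the estimators, so observed and estimated rates can be given intervals on the same footing.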
7 It is quite plausible that not all of these 10 variables will be the best predictors of poverty in each different stratum. In fact it is quite likely that the best predictors of poverty may vary from stratum to stratum, which implies that the composition of the set of the 10 best poverty predictors is likely to vary across strata. For the purpose of keeping the analysis simple and fair to the SPS, we have decided to stick with the same 10 variables used by SPS to predict poverty. One exception is the Lasso, which is analogous to selecting prediction variables among the 10 variables used in the model (see James et al. 2013).
IV. Results
In this section we assess the accuracy of the SPS by running multiple trials across different “test” sets, including national samples and subpopulations, to see how well the SPS poverty estimates can recover the observed poverty rates. We then subject the regression-based models to the same trials and compare SPS and regression results. We consider whether the SPS or other approaches are preferable for estimating poverty rates.

A. Estimating National Poverty Rates

We begin by exploring how effectively the SPS and other estimators can recover the observed poverty rate in a random sample drawn from national household survey data. This is a logical place to start because it is what the SPS is expressly designed to do. We run 14 tests in as many country-year data sets, with and without the observation-specific weights accompanying the national household surveys.8 We show results for trials with the weights because they produce statistically representative national samples, but we also show results for trials run without weights (the “raw sample”) because in practice, the SPS is often applied to idiosyncratic samples that do not generally have observation weights.
Figure 2 is illustrative of the findings from the exercise described above. The upper panel shows poverty estimates and bootstrapped 95% confidence intervals (Indonesia on the left, Peru on the right) applied to raw (unweighted) national “test sets” or validation sets (i.e., random samples from the national survey data that were not used to fit the regression-based models). The regression-based methods clearly dominate the SPS in the upper panel; of these, weighted least squares (WLS), which utilizes observation weights in the training set but not the test set, does the worst but still dominates the SPS. The lower panel shows how these results change when observation weights are applied to the test sets (creating nationally representative test sets). Here we see that the SPS performs better than it did in the upper panel, but the SPS still does not dominate the regression-based approaches (and in fact is clearly dominated by WLS, which performs the best). Additionally, a general feature of these results is that applying the SPS’s bias correction factor actually increases discrepancies, moving estimates away from the observed poverty rate.
8 Figures from Paraguay, Jordan, Uganda, and Thailand are presented in Appendix 3.
![Page 14: Estimating Poverty Rates in Target Populations - World Bank · 2016. 8. 15. · Notes: N indicates the sample size of the data set used to calibrate the SPS. The weighted poverty](https://reader033.fdocuments.us/reader033/viewer/2022061002/60b0bee60f36c7762f3b0336/html5/thumbnails/14.jpg)
10
Figure2.Regression‐BasedMethodsImproveUponSimplePovertyScorecardforNationalPovertyRateEstimates.Theupperpanelshowspovertyestimatesandbootstrapped90%and95%confidenceintervals—Indonesiaontheleft,Peruontheright—appliedtoraw(unweighted)national“testsets”orvalidationsets(i.e.,randomsamplesfromthenationalsurveydatathatwerenotusedtofitthemodels).Regression‐basedmethodsclearlydominatetheSPSintheupperpanel;ofthese,weightedleastsquares(WLS),whichutilizesobservationweightsinthetrainingsetbutnotinthetestset,doestheworstbutstilldominatestheSPS.Thelowerpanelshowshowtheseresults change when observation weights are applied to the test sets (creating nationallyrepresentativetestsets).HeretheSPSperformsbetterthanintheupperpanel,butSPSisstillnotdominatedbytheregression‐basedapproaches(andisclearlydominatedbyWLS,whichperformsthebest).Additionally,ageneralfeatureoftheseresultsisthatapplyingtheSPS’sbiascorrectionfactorincreasesthediscrepancy,movingestimatesawayfromtheobservedpovertyrate.
B. Stratum-Specific Poverty Rates
Results in the previous section show the SPS to be fairly reliable in percentage-point terms (though not generally the most accurate estimator) for estimating poverty rates in samples mirroring the national household samples that formed the original SPS training sets. However, it is essential to test SPS and the other methods under realistic conditions, as they would actually be used by project leaders and researchers. To this end, we assess the performance of all estimators applied to subnational groups.
We first compare results for the SPS versus the Logit-based estimator across geographic strata. In this exercise the SPS and the Logit models are both trained (or estimated) on national-level data. Figure 3 indicates how nationally representative poverty estimators, such as SPS, perform relative to logistic regression models trained on the national data. Even though the Indonesia 2010 SUSENAS survey is only representative at the district level, we split the data into 934 strata such that each of the 498 districts is partitioned into urban and rural areas, and measure the discrepancies between estimated poverty rates and actual poverty rates for each of these strata. Each of the small circles corresponds to the discrepancy for a given stratum (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue and national-level logit results in gold. Locally weighted scatterplot smoothing (Cleveland and Devlin, 1988) defines best-fit curves drawn through the points. The green vertical line shows the national poverty rate. We implement bootstrapped Kolmogorov-Smirnov tests for the equality of the poverty scorecard estimates against the stratum-specific regressions (Præstgaard, 1995), and find that the results from the two estimators are statistically distinguishable, but both estimators perform about equally poorly: they overestimate poverty rates in the richer regions and underestimate poverty in the poorer regions. In the poorest strata, average SPS discrepancies are as high as 15–25 percentage points.
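A two-sample Kolmogorov-Smirnov comparison of stratum-level discrepancies can be sketched as follows. This is a minimal illustration that resamples from the pooled discrepancies under the null hypothesis of equal distributions; it is an assumption about the general approach, not a reproduction of the exact Præstgaard (1995) procedure used in the paper, and the discrepancy values are made up.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest vertical gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def bootstrap_ks_pvalue(x, y, n_boot=1000, seed=0):
    """Resample both groups from the pooled data and compare KS statistics."""
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_boot):
        bx = [rng.choice(pooled) for _ in x]
        by = [rng.choice(pooled) for _ in y]
        if ks_statistic(bx, by) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_boot + 1)


# Hypothetical stratum discrepancies (estimated minus observed poverty rate).
sps_disc = [0.10 + i / 100 for i in range(20)]
logit_disc = [-0.10 + i / 200 for i in range(20)]

stat, p_value = bootstrap_ks_pvalue(sps_disc, logit_disc)
print(stat, p_value)
```

A small p-value only says that the two sets of discrepancies are distributed differently; as the text notes, two estimators can be statistically distinguishable and still perform about equally poorly.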
Figure 3. “One Size Fits All” National Models Perform Poorly When Applied to Subnational Strata, SPS versus Logit (Indonesia 2010). We split the Indonesia 2010 data into 934 strata such that each of the 498 districts is partitioned into urban and rural components and measure the discrepancies between the estimated poverty rates and the observed poverty rates. Each of the small circles corresponds to the discrepancy for a given stratum (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue and national-level logit results in gold. Each estimator produces results that are statistically distinguishable from the other (see the statistically significant Kolmogorov-Smirnov p-values), but both estimators perform about equally poorly, overestimating poverty rates in the richer regions and underestimating poverty in the poorer regions. In the poorest strata, average SPS discrepancies are as high as 15–25 percentage points.
The next set of figures (Figures 4–7) compares the SPS against regression-based estimators trained or estimated separately using data from each geographic stratum for which the household survey is designed to yield representative estimates. As Table 1 indicates, many household surveys are designed to be representative for different geographic strata. Peru and Bangladesh, for example, are representative at the regional level, with Peru having 24 departments (regions) and Bangladesh 7 regions. Indonesia, on the other hand, following the fiscal decentralization that took place in 2001, is designed to be representative for each of its 498 districts.
Regression-based approaches perform much better when trained on data specific to each geographic stratum. The upper subplot in Figure 4 shows that for the SUSENAS 2010 data set from Indonesia (which has the largest number of representative subnational groupings of all our data sets), no matter what the quantile, the magnitudes of discrepancies for regression-based estimators are a fraction of what they are for the SPS. For each estimator, we observe the district-level estimates of the poverty rate. We compare the relative absolute error (i.e., the absolute value of the estimated value minus the observed poverty rate in the test set, benchmarked against the error rate of the SPS) at each percentile of the absolute error rate. The lower half of this figure restricts the sample to the poorer districts (i.e., the districts with a poverty rate above the median district poverty rate), and reflects how applying the SPS to the poorest subgroups (as is often done in practice) may compare to other methods.
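The relative absolute error described above can be sketched as the ratio of an estimator's absolute-error quantile to the corresponding SPS quantile. The sketch below is one plausible reading of that benchmark, assuming linear-interpolation quantiles; the district values and function names are hypothetical.

```python
from statistics import median


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def relative_error_profile(est, sps_est, observed, percentiles=(10, 25, 50, 75, 90)):
    """Absolute-error quantiles of an estimator divided by those of the SPS."""
    abs_err = sorted(abs(e - o) for e, o in zip(est, observed))
    abs_sps = sorted(abs(e - o) for e, o in zip(sps_est, observed))
    profile = {}
    for p in percentiles:
        denom = quantile(abs_sps, p / 100)
        profile[p] = quantile(abs_err, p / 100) / denom if denom > 0 else float("nan")
    return profile


def poorer_half(values, observed):
    """Keep entries for districts whose observed rate exceeds the median rate."""
    cut = median(observed)
    return [v for v, o in zip(values, observed) if o > cut]


# Hypothetical district-level poverty rates and estimates.
observed = [0.10, 0.20, 0.30, 0.40, 0.50]
sps = [o + e for o, e in zip(observed, [0.08, 0.04, 0.02, -0.06, -0.10])]
logit = [o + e for o, e in zip(observed, [0.04, 0.02, 0.01, -0.03, -0.05])]

profile = relative_error_profile(logit, sps, observed)
profile_poor = relative_error_profile(poorer_half(logit, observed),
                                      poorer_half(sps, observed),
                                      poorer_half(observed, observed))
print(profile)
print(profile_poor)
```

A value below one at a given percentile means the estimator's error at that percentile is a fraction of the SPS error, which is the pattern Figure 4 is described as showing.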
Figure 5 shows the overlap of the 95% confidence interval of poverty rate estimates using SPS and the 95% confidence intervals of the true poverty rate estimate based on household consumption at the district level.9 There tends to be spatial correlation among districts where the SPS overestimates the poverty rate and where it underestimates the poverty rate. In about 10% of the districts, the SPS underestimates the poverty rate and in about 26% of the districts it overestimates the poverty rate. In comparison, the strata-specific logit estimator underestimates the poverty rate in about 2% of the districts and overestimates the poverty rate in about 2% of the districts.
9 The maps are based on the 2007 district boundaries. We were unable to acquire the boundary files with the most recent divisions. If a district has since split, the data reflect the poverty rate comparison only for the “mother” district. That is, if a district was created after 2007, then the map does not necessarily reflect the poverty comparison in this district, but in the district from which it split. The maps reflect the estimates using the weighted testing set.
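The interval comparison behind the district maps can be sketched as a three-way classification. The rule below is an assumption about how "overestimates" and "underestimates" are assigned: a district counts as overestimated when the estimator's interval lies entirely above the survey-based interval, and as underestimated when it lies entirely below. The intervals are made up for illustration.

```python
def classify(est_ci, obs_ci):
    """Compare an estimator's interval with the survey-based interval."""
    est_lo, est_hi = est_ci
    obs_lo, obs_hi = obs_ci
    if est_hi < obs_lo:
        return "under"    # estimator interval lies entirely below
    if est_lo > obs_hi:
        return "over"     # estimator interval lies entirely above
    return "overlap"


def shares(est_cis, obs_cis):
    """Fraction of districts falling into each category."""
    labels = [classify(e, o) for e, o in zip(est_cis, obs_cis)]
    return {k: labels.count(k) / len(labels) for k in ("under", "over", "overlap")}


# Hypothetical intervals for four districts (poverty rates as fractions).
obs_cis = [(0.10, 0.16), (0.30, 0.38), (0.05, 0.09), (0.20, 0.26)]
sps_cis = [(0.18, 0.24), (0.20, 0.27), (0.06, 0.11), (0.22, 0.29)]

result = shares(sps_cis, obs_cis)
print(result)
```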
Figure 4
Figure 5. Mapping Discrepancies Across Districts: Strata-specific Logit District Dominates the SPS for Indonesia 2010. The upper panel shows the overlap of the 95% confidence interval of poverty rate estimates using SPS and the 95% confidence intervals of the true poverty rate estimate based on household consumption at the district level. There tends to be spatial correlation among districts where the SPS overestimates the poverty rate and where it underestimates the poverty rate. In about 10% of the districts, the SPS underestimates the poverty rate and in about 26% of the districts it overestimates the poverty rate. In comparison, the strata-specific logit estimator (lower panel) underestimates the poverty rate in about 2% of the districts and overestimates the poverty rate in about 2% of the districts.
In Figure 6 we split the Peru 2010 data into 24 regional strata (departments) and measure discrepancies between the observed poverty rates and the poverty rates estimated by the stratum-specific logit and by the national SPS. Each of the small circles corresponds to the discrepancy for a given stratum. Raw discrepancies are in the upper panel and absolute discrepancies are in the lower panel, with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. Stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. Notice that, as before, the SPS overestimates poverty rates in the regions that are least poor. In the strata with the lowest poverty rates, the discrepancies are highest, averaging about 15 percentage points. Kolmogorov-Smirnov p-values show statistically significant differences between SPS and strata-specific logit results. The map in Figure 7 shows that the SPS overestimates the poverty rate for 6 regions (upper panel), whereas region-specific logit produces estimates that are within the 95% confidence interval of the true estimate in all cases (lower panel).
Figure 6. Regional Strata Poverty Rate Estimation: Strata-specific Logit Dominates the SPS for Peru 2010. We split the Peru 2010 data into 24 regional strata (departments), and measure the discrepancies between estimated poverty rates and observed poverty rates. Each of the small circles corresponds to the discrepancy for a given stratum (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. Stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. SPS overestimates poverty rates in the regions that are least poor. In the strata with the lowest poverty rates, the discrepancies are highest, averaging about 15 percentage points. Kolmogorov-Smirnov p-values show statistically significant differences between SPS and strata-specific logit results.
Figure 7. Mapping the Discrepancies Across Regional Strata: Strata-specific Logit Dominates the SPS for Peru 2010. The region-specific logit produces estimates within the 95% confidence interval of the true estimate in all cases, whereas the SPS overestimates the poverty rate for 6 regions.
The next set of figures (Figures 8–12) drills deeper than the lowest geographic level for which a survey may be representative, by comparing the performance of the national SPS and the stratum-specific Logit model with the stratum now defined by the intersection of the region or district identifier and another key socio-economic variable, such as an identifier for whether the household head is male or female or whether the household head is in agriculture or not. One important caveat associated with these comparisons is that the “true” poverty rate estimate with which the SPS and the stratum-specific Logit models are compared is likely to have a high variance, since the survey is designed to yield reliable estimates for the region or the district as a whole but not for any specific socio-economic group within these geographic areas.
We will describe each of these analyses in order.10 The map in Figure 8 shows the differences between the SPS estimates and the 14 district-specific logit estimates for Sierra Leone. The Sierra Leone Integrated Household Survey (SLIHS) in 2003 is only representative for the 4 regions of Sierra Leone, whereas the 2013 survey is representative for all 14 districts of the country. In this exercise we do not take into consideration the geographic representativity of the available survey, and we use the district identifiers to create the district strata in 2003 even though the survey is representative at the region level and not at the district level. The district-specific logit produces estimates within the 95% confidence interval of the poverty estimates in all districts,11 whereas the SPS overestimates the poverty rate for one district (the area near Freetown, the capital) and underestimates the poverty rate in three districts.
Figure 9 shows the results for Bangladesh 2010, split into 14 strata defined by the intersection of the “agricultural head of household” and the 7 “region” dummy variables. Again, the stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. Furthermore, the SPS consistently overestimates the lower poverty rates and consistently underestimates the higher poverty rates, as in Indonesia and in Peru in the earlier set of estimates. In the poorest and richest strata, average SPS discrepancies may be as high as 15–20 percentage points. Kolmogorov-Smirnov p-values indicate that differences between SPS and strata-specific logit results are statistically significant.
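Defining strata as the intersection of a geographic identifier and a household characteristic, and then computing raw and absolute discrepancies per stratum, can be sketched as below. The field names (`region`, `agri_head`, `weight`, `poor`, `p_hat`) and the sign convention (estimated minus observed, so that overestimates are positive) are assumptions made for illustration, and the household records are made up.

```python
from collections import defaultdict


def stratum_discrepancies(households):
    """Weighted observed and estimated poverty rates per (region, agri_head) stratum."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])  # weight, weighted poor, weighted p_hat
    for h in households:
        s = sums[(h["region"], h["agri_head"])]
        s[0] += h["weight"]
        s[1] += h["weight"] * h["poor"]
        s[2] += h["weight"] * h["p_hat"]
    out = {}
    for key, (w, poor, pred) in sums.items():
        observed, estimated = poor / w, pred / w
        out[key] = {
            "observed": observed,
            "raw": estimated - observed,        # positive means overestimate
            "abs": abs(estimated - observed),
        }
    return out


# Hypothetical households; p_hat is a model's predicted poverty likelihood.
households = [
    {"region": "A", "agri_head": 1, "weight": 1.0, "poor": 1, "p_hat": 0.5},
    {"region": "A", "agri_head": 1, "weight": 3.0, "poor": 0, "p_hat": 0.5},
    {"region": "A", "agri_head": 0, "weight": 2.0, "poor": 0, "p_hat": 0.25},
    {"region": "B", "agri_head": 1, "weight": 1.0, "poor": 1, "p_hat": 0.5},
]

disc = stratum_discrepancies(households)
for key in sorted(disc):
    print(key, disc[key])
```

With only a handful of households per stratum, the observed rate itself is noisy, which mirrors the caveat that the survey-based benchmark has high variance below the level at which the survey is representative.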
Figure 10 shows results for Sierra Leone 2003 data, divided into 28 strata defined by the intersection of the district/region and the “agricultural head of household” dummy variables. The upper panel reveals that SPS is reliable (discrepancies are near zero) only if the strata poverty rates are in the immediate neighborhood of the national poverty rate; once again, the SPS overestimates poverty rates in the richer districts and severely underestimates poverty in the poorer districts, with the worst performance in the poorest districts. The Kolmogorov-Smirnov p-values in the upper panel show that strata-specific logit and SPS are not distinguishable when comparing performance based on raw discrepancies. In the lower panel,
10 Figures from Paraguay, Thailand, and Uganda may be found in Appendix 4.
11 Note that since the 2003 survey is only representative at the regional level and not at the district level, we refrain from referring to these estimates as the “true” estimates of poverty in the district.
however, strata-specific logit dominates SPS (though not as much as in previous strata-specific comparisons); in both weighted and unweighted comparisons, the red line (strata-specific logit) has less absolute discrepancy than SPS for the entire range of observed stratum poverty rates. SPS discrepancies here are some of the worst across all data sets, averaging as much as 30 percentage points in the poorest regions.
Figure 11 shows results for Nepal 2010, using 28 strata defined by the intersection of the “female head-of-household” indicator and the 14 administrative zones of the country. Stratum-specific logit dominates SPS, which tends to underestimate the poverty rate across the entire domain of observed stratum-specific poverty rates. When the strata are idiosyncratic and deviate from the national sample, average SPS discrepancies may be as high as 10–20 percentage points.
Finally, Figure 12 shows results for the Indonesia 2010 data, divided into 934 strata defined by the intersection of “urban/rural” and the district identifier. Stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. SPS is reliable (discrepancies are near zero) only if the strata poverty rates are in the immediate neighborhood of the national poverty rate; once again, the SPS overestimates poverty rates in the richer districts and severely underestimates poverty in the poorer districts, with the worst performance in the poorest districts. When the strata are idiosyncratic and deviate from the national sample, average SPS discrepancies may be as high as 15–25 percentage points. Kolmogorov-Smirnov test results are statistically significant here as well.
Figure 8. Mapping Discrepancies Across District and Agricultural/Non-Agricultural Strata: Strata-specific Logit Dominates the SPS for Sierra Leone 2011. The district-specific logit produces estimates within the 95% confidence interval of the true estimate in all cases, whereas the SPS overestimates the poverty rate for one district and underestimates the poverty rate for three districts.
Figure 9. Agricultural/Non-Agricultural Household and Regional Strata: Strata-specific Logit Dominates SPS for Bangladesh 2010. We split the Bangladesh 2010 data into 14 strata, and measure the discrepancies between the estimated poverty rates and the observed poverty rates. Each of the small circles corresponds to the discrepancy for a given stratum (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. Stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. SPS consistently overestimates the lower poverty rates and consistently underestimates the higher poverty rates. In the poorest and richest strata, average SPS discrepancies may be as high as 15–20 percentage points. Kolmogorov-Smirnov p-values indicate that differences between SPS and strata-specific logit results are statistically significant.
Figure 10. Agricultural/Non-Agricultural Household and District Strata: Strata-specific Logit Dominates SPS, Sierra Leone 2003. We split the Sierra Leone 2003 data into 28 strata defined by the intersection of the district and the “agricultural head of household” dummy variables, and measure the discrepancies between the estimated poverty rates and the observed poverty rates. Each of the small circles corresponds to the discrepancy between observed and estimated poverty rates (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. The upper panel reveals that SPS is reliable (discrepancies are near zero) only if the strata poverty rates are in the immediate neighborhood of the national poverty rate; once again, the SPS overestimates poverty rates in the richer districts and severely underestimates poverty in the poorer regions, with the worst performance in the poorest districts. The Kolmogorov-Smirnov p-values in the upper panel show that strata-specific logit and SPS are not distinguishable when comparing performance based on raw discrepancies. In the lower panel, however, strata-specific logit dominates SPS (though not as much as in previous strata-specific comparisons). In both weighted and unweighted comparisons, the red line (strata-specific logit) has less absolute discrepancy than SPS for the entire range of observed stratum poverty rates. SPS discrepancies here are some of the worst across all data sets, averaging as much as 30 percentage points in the poorest regions.
Figure 11. Female Head-of-Household/Regional Strata Poverty Rate Estimation: Strata-specific Logit Dominates SPS for Nepal 2010. We split the Nepal 2010 data into 28 strata defined by the intersection of the “female head-of-household” and the administrative zone dummy variables, and measure the discrepancies between the estimated poverty rates and the observed poverty rates. Each of the small circles corresponds to the discrepancy for a given stratum (raw discrepancies in the upper panel and absolute discrepancies in the lower panel), with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. Stratum-specific logit dominates SPS, which tends to underestimate the poverty rate across the entire domain of observed stratum-specific poverty rates. When the strata are idiosyncratic and deviate from the national sample, average SPS discrepancies may be as high as 10–20 percentage points.
Figure 12. Strata-specific Logit Dominates SPS, Indonesia 2010. We split the Indonesia 2010 data into 934 strata defined by the intersection of “urban/rural” and “district” dummy variables, and measure the discrepancies between the estimated poverty rates and the observed poverty rates. Each of the small circles corresponds to the discrepancy for a given stratum. Raw discrepancies are in the upper panel and absolute discrepancies in the lower panel, with SPS results in blue, strata-specific logit in red, and the green vertical line marking the average national poverty rate. Stratum-specific logit shows strong and consistent performance, with low average discrepancies across the domain of observed stratum poverty rates. SPS is reliable if and only if the strata poverty rates are in the immediate neighborhood of the national poverty rate. When the strata are idiosyncratic and deviate from the national sample, average SPS discrepancies may be as high as 15–25 percentage points.
C. Testing Estimator Resilience over Time
National household data sets are not published every year and SPSs are not developed for every national household survey, so it is not uncommon for there to be a mismatch in time between the vintage of one’s sample and the vintage of the available SPS. Of the 63 countries for which SPSs are currently available, 48 countries offer only one SPS, ten countries offer SPSs for two different years, and five countries offer SPSs for three different years. What can happen when, for example, someone attempts to estimate the poverty rate in a sample collected in 2012 and applies the SPS when the most recent SPS in the country was calibrated to a 2010 national household survey?12
To explore this question, we took SPS and regression-based models trained on the Peru 2010 national data and tested those models using Peru 2011 and Peru 2012 test data. The upper panel in Figure 13 shows poverty estimates and bootstrapped 95% confidence intervals based on models derived from Peru 2010 data (for Peru 2011 data on the left and Peru 2012 data on the right) applied to raw (unweighted) national “test sets” or validation sets (i.e., random samples from the national survey data that were not used to fit the models). Regression-based methods clearly dominate the SPS, and weighted least squares (WLS), which utilizes observation weights in the training set but not the test set, does the worst but still dominates the SPS. The lower panel shows how these results change when observation weights are applied to the test sets (creating nationally representative test sets). As with the concurrent SPS case, a common feature of these results is that applying the SPS “Bias Correction Factor” appears to increase the bias, moving estimates away from the observed poverty rate.
12 As noted above, in footnote 4, SPS documentation includes warnings about this practice.
Figure 13. Testing the Resilience of Poverty Estimates Over Time: Regression Outperforms SPS, Peru 2010 Estimators Applied to Peru 2011 and Peru 2012 Data. This figure illustrates the discrepancies that may occur when the poverty estimator models are not contemporaneous with the targeted sample. The upper panel shows poverty estimates and bootstrapped 95% confidence intervals based on models derived from Peru 2010 data (for Peru 2011 data on the left and Peru 2012 data on the right) applied to raw (unweighted) national “test sets” or validation sets (i.e., random samples from the national survey data that were not used to fit the models). Regression-based methods clearly dominate the SPS, and weighted least squares (WLS), which utilizes observation weights in the training set but not the test set, does the worst but still dominates the SPS. The lower panel shows how these results change when observation weights are applied to the test sets (creating nationally representative test sets). A common feature of these results is that applying the SPS “Bias Correction Factor” appears to increase the bias, moving estimates away from the observed poverty rate.
We do not claim that the results in Figure 13 generalize across countries and time periods; they are merely indicative of what can happen when there is a mismatch in time between a training set and a test set. It is worth noting that mismatches in time can favor or hinder any estimator if economic trends cancel out discrepancies.
V. Concluding Remarks and the Way Forward
The accurate and efficient estimation of poverty rates is a concern for development practitioners and researchers alike. In this paper we demonstrate that an increasingly popular method for estimating poverty rates, the simple poverty scorecard, performs best when applied to the estimation of national poverty rates with nationally representative samples. However, SPS-like procedures (by their very nature and emphasis on simple operational implementation) ignore information that is commonly available to surveyors in most applied settings. Analysts generally have rich household-level covariates, such as occupation and geographic or regional information, that can provide additional information and allow researchers and practitioners to more precisely estimate poverty rates in target populations of interest. We demonstrate that both SPS-type procedures and national-level regressions “perform well” in practice (in a training and test set paradigm) when applied to targeted strata with poverty rates near the national poverty rate. But as the populations of interest become more granular (e.g., regional) or more extreme on the income distribution, SPS-type procedures perform measurably worse than prosaic statistical models tuned at the stratum level. These results are also in accordance with the growing academic literature on small area estimation and poverty mapping that advocates the estimation of region- or district-specific consumption (or income) models as long as the household survey is representative at that level (e.g., Elbers et al., 2003; Tarozzi and Deaton, 2009; Tarozzi, 2011).
The findings in this report have important implications for practitioners in the field wanting to have an estimate of the poverty rate in their target population. To begin with, it is important to understand that there is a fundamental tradeoff between simplicity of use and accuracy. Simple tools, like the SPS in its current form, are designed in favor of simplicity by estimating poverty for any possible target population using underlying parameters derived from the full sample of households in the national household survey. For example, suppose that one’s goal is to estimate poverty among female-headed household program participants, in one region of a country, in a year in which a national household survey was administered and an SPS has been estimated and made publicly available. The practitioner could then collect data for the 10 questions required by the SPS from the target female population in the specific region and apply the SPS from the same year to estimate the poverty rate for the target female population. The poverty rates for the target population would then be based on the parameters that have been estimated using national data for male- and female-headed households from all different regions in the country. This implies that if male-headed households are much more prevalent in the national household survey, as is usually the case, or if other regions have a larger population than the region of interest, then the poverty rate estimated by the SPS for the target female population may be biased in the sense
that it would not be so good at approximating the true poverty rate among females in the region of interest.
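The mechanism described above can be illustrated with a deliberately simplified sketch. It uses a single binary indicator x and estimates the poverty likelihood for each value of x as a cell frequency (which is what a logit saturated in one binary covariate would reproduce), rather than the ten-question scorecard or the regression specifications used in the paper. All counts are made up, and the function names are hypothetical.

```python
def fit_cell_rates(rows):
    """Estimate P(poor | x) for a binary indicator x as simple cell frequencies."""
    counts = {0: [0, 0], 1: [0, 0]}  # x -> [households, poor households]
    for x, poor in rows:
        counts[x][0] += 1
        counts[x][1] += poor
    return {x: poor / n for x, (n, poor) in counts.items() if n > 0}


def predicted_rate(xs, rates):
    """Average predicted poverty likelihood over the target sample."""
    return sum(rates[x] for x in xs) / len(xs)


def cell(x, n_households, n_poor):
    """Build (x, poor) rows for one cell of a made-up training survey."""
    return [(x, 1)] * n_poor + [(x, 0)] * (n_households - n_poor)


# Made-up training data: the target region is much poorer than the rest,
# and it contributes only 100 of the 1,000 households in the national survey.
region_train = cell(1, 40, 24) + cell(0, 60, 48)
rest_train = cell(1, 600, 60) + cell(0, 300, 120)

national_rates = fit_cell_rates(region_train + rest_train)
region_rates = fit_cell_rates(region_train)

# Made-up target sample from the same region: 50 households with x = 1
# (31 poor) and 50 with x = 0 (39 poor), so the observed rate is 0.70.
target_x = [1] * 50 + [0] * 50
observed = (31 + 39) / 100

national_est = predicted_rate(target_x, national_rates)
region_est = predicted_rate(target_x, region_rates)
print(observed, national_est, region_est)
```

Because the national parameters are dominated by households outside the region, the national estimate falls far below the observed rate in this constructed example, while the region-specific parameters track it closely.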
The analysis in this report implies that the prediction of poverty could be improved if the underlying parameters of the model used to predict poverty or to assign poverty scores were estimated based on the sub-sample of female-headed households in that region (extracted from the full national household survey). However, it is important to acknowledge up front that there are various challenges, especially for the practitioner, in implementing the methods employed here just for the sake of improving the accuracy of poverty predictions in the target population:
(i) Potential users (researchers, practitioners, or both) would require access to sub-national representative household-level data, which are not easily accessible nor ready for processing for the purposes of predicting poverty13;
(ii) Specialized statistical background and econometric expertise would be required;
(iii) Even if (i) and (ii) are feasible, the sample size of the specific population of interest in the national survey may be insufficient;
(iv) There may not be sufficient business appetite to individually allocate the resources needed to attain such improvements in analytical precision.
Assuming there is sufficient appetite, for the practitioners in the field relying on the SPS or SPS-like methods to estimate poverty in the population of their interest, with the data available and currently used by the SPS, there are relatively simple and low-cost ways of improving the predictions of poverty in target populations. One practical option is that the SPS method and its surrounding infrastructure get updated by considering: (i) the use of regression-based methods such as those used in this report; and (ii) the incorporation of the intermediate and more practical step of estimating regression-based models separately for the geographic strata for which the national survey is designed to be representative. Poverty estimates for target populations based on strata-specific estimates of regression-based models certainly improve upon poverty estimates based on the nationally estimated SPS. It is quite likely that region- or district-specific estimates of the SPS, depending on the country, will improve the accuracy of the poverty estimates currently based on the nationally estimated SPS.14 Therefore, we suggest that the international development and donor community take a lead in developing, refining, packaging and making available such models in a toolkit format that would be available to current users of the SPS.
13 For example, the household survey itself may be available without the consumption (or income) aggregate used for the estimation of the official poverty rate in the country.
14 Another option, available at this point only to World Bank staff, is to rely on tools designed for customized analysis for the target population in the country of interest. The SWIFT tool currently under development in the World Bank is one such option, providing target population-specific poverty estimates upon request with primary data collected through a short survey applied to the target population in the same spirit as the SPS. The difference between SWIFT and the SPS is that the underlying statistical model and methods used in SWIFT to derive the parameters needed for the prediction of poverty are different and more flexible in the sense that a menu of options is available. SWIFT, for example, is equipped to estimate region-specific consumption models in the national household survey that can be used to predict consumption in the target population. In addition to the headcount poverty rate, the availability of predicted consumption allows the estimation of the poverty gap and the severity of poverty, two poverty measures that are not possible to predict with the SPS in its current form.
References
Benin, S., and Randriamomonjy, J. (2008). “Estimating Household Income to Monitor and Evaluate Public Investment Programs in Sub-Saharan Africa.” IFPRI Discussion Paper 00771, pp. 1–24.
Christiaensen, L., Lanjouw, P., Luoto, J., and Stifel, D. (2012). “Small Area Estimation-based Prediction Methods to Track Poverty: Validation and Applications.” Journal of Economic Inequality 10: 267–297.
Cleveland, W. S., and Devlin, S. J. (1988). “Locally-Weighted Regression: An Approach to Regression Analysis by Local Fitting.” Journal of the American Statistical Association 83(403): 596–610.
Efron, B., and Tibshirani, R. (1998). An Introduction to the Bootstrap. Chapman & Hall/CRC.
Elbers, C., Lanjouw, J. O., and Lanjouw, P. (2003). “Micro-level estimation of poverty and inequality.” Econometrica 71(1): 355–364.
Greene, W. H. (2003). Econometric Analysis. 5th ed. NJ: Prentice Hall.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer.
Horvitz, D. G., and Thompson, D. J. (1952). “A Generalization of Sampling Without Replacement From a Finite Universe.” Journal of the American Statistical Association 47(260): 663–685.
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
Præstgaard, J. T. (1995). “Permutation and Bootstrap Kolmogorov-Smirnov Tests for the Equality of Two Distributions.” Scandinavian Journal of Statistics 22(3): 305–322.
Rosenbaum, P., and Rubin, D. B. (1983). “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika 70(1): 41–55.
Schreiner, M. (2010). “A Simple Poverty Scorecard for Jordan.” Mimeo.
________________ (2011a). “A Simple Poverty Scorecard for Sierra Leone.” Mimeo.
________________ (2011b). “A Simple Poverty Scorecard for Uganda.” Mimeo.
________________ (2012a). “A Simple Poverty Scorecard for Indonesia.” Mimeo.
________________ (2012b). “A Simple Poverty Scorecard for Paraguay.” Mimeo.
________________ (2012c). “A Simple Poverty Scorecard for Peru.” Mimeo.
________________ (2013a). “A Simple Poverty Scorecard for Bangladesh.” Mimeo.
________________ (2013b). “A Simple Poverty Scorecard for Nepal.” Mimeo.
________________ (2014a). “The Process of Poverty-Scoring Analysis.” Mimeo.
________________ (2014b). “A Simple Poverty Scorecard for Thailand.” Mimeo.
Tarozzi, A. (2011). “Can census data alone signal heterogeneity in the estimation of poverty maps?” Journal of Development Economics 95(2): 170–185.
Tarozzi, A., and Deaton, A. (2009). “Using Census and Survey Data to Estimate Poverty and Inequality for Small Areas.” Review of Economics and Statistics 91(4): 773–792.
Tibshirani, R. (1994). “Regression Shrinkage and Selection Via the Lasso.” Journal of the Royal Statistical Society, Series B 58: 267–288.
Appendix 1: A Detailed Summary of the Poverty Scorecard Methodology¹⁵
The poverty scorecards are based on survey data, with half of the data used to construct and calibrate the card and the other half used to validate the accuracy of the constructed card.
Scorecard construction for a 10-indicator variable card
1. Choose a large number (about 120 in Bangladesh) of candidate indicators from a representative population survey describing: family composition, education, housing characteristics, ownership of durable assets, employment, and agriculture. (Here already some decisions are made, as some variables are aggregate groups. For example, the variable “how many household members 12 or younger” is grouped into 0, 1, 2, or 3+.)
2. Choose a poverty line on which the scorecard is built. (For Bangladesh it was the $1.25/day 2005 PPP poverty line.)
3. Use Logit to build one scorecard for each candidate indicator using the construction/calibration sub-sample.
4. Order the candidate indicators based on how strongly (by themselves) they are correlated with poverty, using the entropy-based “uncertainty coefficient.” (Figure 3 in the Bangladesh document.)
5. Choose one of the one-indicator scorecards based on its: accuracy (in predicting poverty), likelihood of acceptance by users, sensitivity to change in poverty, applicability across regions, relevance for distinguishing among households at the poorer end of the expenditure distribution, and verifiability. (For Bangladesh the first 10 indicators based on the uncertainty coefficient are: number of mobile phones; number of fans; receipt of charity, gifts, royalties, help, Zakat, Fitra; TVs with DVD etc.; TV; consumption on Qurbani; landline connection or mobile phone; highest grade of male spouse; anyone worked with daily pay; main job was with daily pay. For example, of these only one of the measures of telephones (number of mobiles) was chosen. And, for example, receipt of charity was not included, possibly as it is difficult to verify.)
6. Then build a series of two-indicator scorecards, adding another indicator variable (from the set of 120) to the first one chosen in step 5.
7. Compare the two-indicator scorecards and choose one based on the uncertainty coefficients and judgment on the characteristics of the second indicator. The second indicator should be evaluated based on the same criteria as the first one and, in addition, variety among indicators should be considered.
8. Build a series of three-indicator scorecards (adding a third variable to the two-variable scorecard chosen in step 7) and again choose the third indicator based on the same statistical and non-statistical criteria as before.
9. Repeat the exercise until the scorecard has 10 measures.
10. Transform the logit coefficients from the 10-indicator scorecard into non-negative integers with a total score range from 0 (most likely below a poverty line) to 100 (least likely to be below a poverty line).
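The ranking in step 4 can be illustrated with a minimal sketch. It assumes the uncertainty coefficient is the standard entropy-based measure U = (H(poor) − H(poor | indicator)) / H(poor), computed directly from the cross-tabulation of a categorical indicator against poverty status; the function names, the toy data and the omission of survey weights are assumptions made for illustration, and the computation in the scorecard documents may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def uncertainty_coefficient(indicator, poor):
    """Share of the entropy of poverty status removed by knowing the indicator."""
    h_y = entropy(poor)
    if h_y == 0:
        return 0.0  # poverty status is constant, so there is nothing to explain
    n = len(poor)
    h_y_given_x = 0.0
    for value, count in Counter(indicator).items():
        subset = [p for x, p in zip(indicator, poor) if x == value]
        h_y_given_x += (count / n) * entropy(subset)
    return (h_y - h_y_given_x) / h_y


def rank_indicators(candidates, poor):
    """Order candidate indicator names from most to least informative."""
    scores = {name: uncertainty_coefficient(values, poor)
              for name, values in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 8 households and two candidate indicators.
poor = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = {
    "owns_fan": [0, 0, 0, 0, 1, 1, 1, 1],        # separates poor from non-poor
    "children_3plus": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to poverty status
}
ranking = rank_indicators(candidates, poor)
```

In this toy case the first indicator removes all uncertainty about poverty status and the second removes none, so the ranking places `owns_fan` first. As steps 5 to 8 stress, the statistical ranking is only one input; judgment about verifiability and acceptance also enters the choice.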
Scorecard and poverty likelihood correspondence
11. Determine the poverty score for each household in the construction/calibration subsample.
12. For each calibration subsample household, determine whether it is below/above a poverty line. (E.g. in the Bangladesh document, the status with respect to national lower, national upper, 150% national upper, 200% national upper, USAID “Extreme”, $1.25 PPP, $1.75 PPP, $2.00 PPP, $2.50 PPP is determined for each household.)
13. For each range of poverty scores (0-4; 5-9; 10-14…) determine the percentage of households (in the calibration subsample) who are below the particular poverty line for which the conversion table is being built.
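Steps 11 to 13 amount to building a conversion table from score ranges to poverty likelihoods. The sketch below is illustrative only: the function names and toy data are hypothetical, survey weights are ignored, and the treatment of a score of 100 (folded into a final 95-100 range) is an assumption.

```python
def score_bin(score, width=5, max_score=100):
    """Lower bound of the score range containing `score` (95-100 is the last range)."""
    return min(score // width * width, max_score - width)


def build_lookup_table(scores, is_poor, width=5):
    """Percentage of calibration households below the poverty line, per score range."""
    totals, poor_counts = {}, {}
    for score, poor in zip(scores, is_poor):
        lower = score_bin(score, width)
        totals[lower] = totals.get(lower, 0) + 1
        poor_counts[lower] = poor_counts.get(lower, 0) + int(poor)
    return {lower: 100.0 * poor_counts[lower] / totals[lower]
            for lower in sorted(totals)}


# Hypothetical calibration subsample: scores and poverty status for one poverty line.
scores = [2, 3, 4, 7, 8, 12, 13, 14, 14, 100]
is_poor = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
lookup = build_lookup_table(scores, is_poor)
```

A separate table would be built in the same way for each poverty line considered in step 12.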
¹⁵ Based on Schreiner (2014a).
Accuracy of SPS based poverty likelihoods
14. Determine the score for each of the households in the validation subsample.
15. Draw a bootstrap sample of n households with replacement from the validation sample. (In the Bangladesh study n=16,384.)
16. Calculate the true poverty likelihood in the bootstrap sample. That is, the share of households below a poverty line. (Needs to be calculated separately for each poverty line considered.)
17. For each score, compare this true poverty likelihood with the estimated poverty likelihood determined in step 13 (Scorecard and poverty likelihood correspondence). Record the difference.
18. Repeat 1,000 times, recording the difference between the true and estimated likelihoods for each score.
19. For each score, report the two-sided intervals containing the central 900, 950, and 990 differences between estimated and true poverty likelihoods (to get confidence intervals) to see how accurate the measure is for different poverty scores.
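The bootstrap in steps 14 to 19 can be sketched as follows. Everything here is an assumption made for illustration: the validation data are synthetic, the lookup table is hypothetical, differences are taken as estimated minus true, scores are grouped into the same 5-point ranges as the conversion table, and the bootstrap sample size is far smaller than the n=16,384 used for Bangladesh.

```python
import random

BIN_WIDTH = 5  # score ranges 0-4, 5-9, ..., with 95-100 as the last range (assumed)


def score_bin(score):
    """Lower bound of the score range containing `score`."""
    return min(score // BIN_WIDTH * BIN_WIDTH, 100 - BIN_WIDTH)


def bootstrap_differences(scores, is_poor, lookup, n, reps=1000, seed=0):
    """Per score range, collect estimated-minus-true likelihoods (percentage points)."""
    rng = random.Random(seed)
    households = list(zip(scores, is_poor))
    diffs = {lower: [] for lower in lookup}
    for _ in range(reps):
        totals, poor = {}, {}
        for score, p in rng.choices(households, k=n):  # draw with replacement
            lower = score_bin(score)
            totals[lower] = totals.get(lower, 0) + 1
            poor[lower] = poor.get(lower, 0) + p
        for lower, estimated in lookup.items():
            if lower in totals:  # skip ranges absent from this bootstrap sample
                true = 100.0 * poor[lower] / totals[lower]
                diffs[lower].append(estimated - true)
    return diffs


def central_interval(values, share):
    """Bounds of the interval containing the central `share` of the values."""
    ordered = sorted(values)
    drop = int(round(len(ordered) * (1 - share) / 2))
    return ordered[drop], ordered[len(ordered) - 1 - drop]


# Synthetic validation subsample: poverty becomes less likely as the score rises.
data_rng = random.Random(1)
val_scores = [data_rng.randrange(0, 101) for _ in range(2000)]
val_poor = [int(data_rng.random() < (100 - s) / 100) for s in val_scores]
# Hypothetical lookup table of estimated poverty likelihoods (percent).
lookup = {lower: 100.0 - (lower + 2) for lower in range(0, 100, BIN_WIDTH)}

diffs = bootstrap_differences(val_scores, val_poor, lookup, n=500)
intervals = {lower: {share: central_interval(d, share) for share in (0.90, 0.95, 0.99)}
             for lower, d in diffs.items()}
```

With 1,000 repetitions, the 0.90, 0.95 and 0.99 shares correspond to the central 900, 950 and 990 differences mentioned in step 19.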
Accuracy of SPS based poverty rate
20. To determine the poverty rate for a particular group, average the estimated poverty likelihood (from the scorecards) of all individuals in the group.
21. Calculate the true poverty rate for the 1,000 repetitions of n=16,384 bootstrap samples.
22. Calculate the difference between the estimated poverty rate and the true poverty rate for each of the 1,000 repetitions.
23. The average difference between the estimated and the true poverty rates is the “bias correction factor.”
24. The poverty rates then need to be adjusted by this “bias correction” to get the unbiased estimates. There is a unique bias correction factor for each poverty line. (In Bangladesh they range from +0.5 to -0.9 percentage points.)
25. Use the distribution of the true poverty rate estimates from the bootstrap samples to determine standard errors/confidence intervals. (I.e. the interval containing the central 900 poverty rate estimates is the 90% confidence interval.)
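A minimal sketch of steps 20 to 25 follows. It assumes the bias is measured as estimated minus true and is therefore subtracted from the raw estimate; the data are synthetic, and the lookup table is deliberately constructed to overstate poverty by 2 percentage points in every score range so that the bias is known in advance. All names are hypothetical.

```python
import random
from statistics import mean


def score_bin(score, width=5):
    """Lower bound of the score range containing `score` (95-100 is the last range)."""
    return min(score // width * width, 100 - width)


def estimated_rate(scores, lookup):
    """Step 20: average the looked-up poverty likelihoods (percent) over the group."""
    return mean(lookup[score_bin(s)] for s in scores)


def bootstrap_rate_differences(scores, is_poor, lookup, n, reps=1000, seed=0):
    """Steps 21-22: estimated minus true poverty rate in each bootstrap sample."""
    rng = random.Random(seed)
    households = list(zip(scores, is_poor))
    diffs = []
    for _ in range(reps):
        sample = rng.choices(households, k=n)  # draw with replacement
        est = estimated_rate([s for s, _ in sample], lookup)
        true = 100.0 * mean(p for _, p in sample)
        diffs.append(est - true)
    return diffs


# Synthetic validation subsample.
data_rng = random.Random(1)
val_scores = [data_rng.randrange(0, 101) for _ in range(2000)]
val_poor = [int(data_rng.random() < (100 - s) / 100) for s in val_scores]

# Hypothetical lookup table that overstates poverty by 2 points in every range.
lookup = {}
for b in sorted({score_bin(s) for s in val_scores}):
    members = [p for s, p in zip(val_scores, val_poor) if score_bin(s) == b]
    lookup[b] = 100.0 * mean(members) + 2.0

diffs = bootstrap_rate_differences(val_scores, val_poor, lookup, n=500)
bias = mean(diffs)                     # step 23: the bias correction factor
raw = estimated_rate(val_scores[:300], lookup)
corrected = raw - bias                 # step 24: adjusted poverty rate
ordered = sorted(diffs)
ci90 = (ordered[50], ordered[949])     # central 900 of the 1,000 differences
```

In this construction the recovered bias should come out close to the 2 points built into the table, which gives a simple check that the procedure behaves as described.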
Determination of standard errors for estimated samples
26. To determine the standard errors for the scorecard-based poverty rates, the direct measurement standard error formula needs to be adjusted for the fact that the scorecard is not a direct measure of poverty. The correction factor is the ratio of the standard errors derived empirically from the bootstrap samples to the standard error from the mathematical formula in the direct measurement case. (A value less than one implies confidence intervals for the poverty scoring method are smaller than those from direct measurement, i.e. they are more precise, and a value greater than one implies that they are less precise.) The correction factor is derived by using bootstrap samples of various sizes to get empirical estimates of the confidence interval and comparing them to the analytical standard errors corresponding to the same sample size. The correction factor is the average of these ratios. (In the Bangladesh case the exercise is done for 7 different sample sizes ranging from n=256 to n=16,384.)
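27. The standard error for point-in-time estimates of poverty rates via the SPS is
$$\sigma = \alpha \cdot \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$
where $\alpha$ is the correction factor, $\hat{p}$ is the estimated poverty rate, $N$ is the population size, and $n$ is the sample size.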
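The calculation in steps 26 and 27 can be sketched numerically. The sketch assumes the standard error has the form α · sqrt(p̂(1 − p̂)/n) · sqrt((N − n)/(N − 1)), that is, the direct-measurement standard error of a proportion with a finite population correction, scaled by the correction factor α. The function names and all input values are hypothetical.

```python
import math


def sps_standard_error(p_hat, n, N, alpha):
    """Standard error of a scorecard-based poverty rate (p_hat as a proportion)."""
    direct = math.sqrt(p_hat * (1 - p_hat) / n)  # direct-measurement standard error
    fpc = math.sqrt((N - n) / (N - 1))           # finite population correction
    return alpha * direct * fpc


def correction_factor(bootstrap_ses, analytic_ses):
    """Step 26: average ratio of bootstrap to analytic standard errors."""
    ratios = [b / a for b, a in zip(bootstrap_ses, analytic_ses)]
    return sum(ratios) / len(ratios)


# Hypothetical inputs: standard errors at two sample sizes, then one point estimate.
alpha = correction_factor(bootstrap_ses=[0.06, 0.03], analytic_ses=[0.05, 0.025])
se = sps_standard_error(p_hat=0.5, n=100, N=10_000, alpha=alpha)
```

A value of α above one, as in this made-up case, widens the confidence interval relative to direct measurement; a value below one narrows it.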
Estimate of change in poverty rates over time
28. A similar methodology can be used to derive the estimates of bias, precision and the correction factor α when using the (2010) SPS in other years. As above, the (2010) validation sample as well as the full sample from another year are used to generate bootstrap samples to obtain mean differences and standard errors between survey samples.
29. Note that change does not imply impact.
Targeting
30. Targeting accuracy can also be assessed by comparing the true poverty status with inclusion in/exclusion from a pro-poor program, with different scores used as the cut-off level.
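Step 30 can be illustrated by tabulating targeting outcomes at several cut-offs. In this sketch a household is treated as targeted when its score is at or below the cut-off, and the four outcome labels (inclusion, undercoverage, leakage, exclusion) are a naming assumption rather than something defined in this appendix; the data are hypothetical.

```python
def targeting_outcomes(scores, is_poor, cutoff):
    """Count households by true poverty status and targeting status at one cut-off."""
    counts = {"inclusion": 0, "undercoverage": 0, "leakage": 0, "exclusion": 0}
    for score, poor in zip(scores, is_poor):
        targeted = score <= cutoff
        if poor and targeted:
            counts["inclusion"] += 1       # poor and correctly targeted
        elif poor:
            counts["undercoverage"] += 1   # poor but not targeted
        elif targeted:
            counts["leakage"] += 1         # non-poor but targeted
        else:
            counts["exclusion"] += 1       # non-poor and correctly not targeted
    return counts


# Hypothetical households with known poverty status.
scores = [10, 20, 30, 40, 50, 60]
is_poor = [1, 1, 0, 1, 0, 0]
by_cutoff = {c: targeting_outcomes(scores, is_poor, c) for c in (20, 30, 40)}
```

Raising the cut-off reduces undercoverage at the cost of more leakage, so comparing the counts across cut-offs shows the trade-off a program faces.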
Appendix 2: SPS Survey Questions and Lookup Tables
Figure 14a: SPS for Bangladesh, 2010
Figure 14b: SPS Look-up Table for Bangladesh, 2010
Figure 15: SPS for Indonesia, 2010
Figure 16: SPS for Jordan, 2006
Figure 17: SPS for Nepal, 2010
Figure 18: SPS for Paraguay, 2011
Figure 19: SPS for Peru, 2010
Figure 20: SPS for Sierra Leone, 2003
Figure 21: SPS for Thailand, 2011
Figure 22: SPS for Uganda, 2009
Appendix 3: Additional Figures from other Countries
Appendix 4: Stratum Specific Poverty Rates