Confidence Intervals and Hypothesis Testing

Transcript
  • 1

    Chapter 6: Confidence Intervals and Hypothesis Testing

    When analyzing data, we can't just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics $\bar{x}$, $\hat{p}$ (sample mean and sample proportion), we get different answers due to variability. So we have to perform statistical inference:

    Confidence Interval: when you want to estimate a population parameter

    Significance Testing: when we want to assess the evidence provided by the data in favor of some claim about the population.

    Section 6.1: Confidence Intervals

    Confidence intervals allow us to estimate a range of values for the population mean or proportion. The true mean or proportion for the population exists and is a fixed number, but we don't know it! Using sample statistics, we get an estimate of where we expect the population parameter to be.

    If we take a single sample, our single confidence interval "net" may or may not include the population parameter.

    However, if we take many samples of the same size and create a confidence interval from each sample statistic, over the long run 95% of our confidence intervals will contain the true population parameter (if we are using a 95% confidence level).
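
    The long-run claim can be checked with a small simulation. The sketch below is only an illustration: the population (normal, mean 50, standard deviation 10), the sample size, and the function name `coverage_rate` are all assumptions chosen for the demo, and it uses the z-interval for a mean with known standard deviation.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu, sigma, n, conf_level, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and build its confidence interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mean 50, standard deviation 10.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, conf_level=0.95, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

    The printed share should land close to 0.95, although any single interval either contains the true mean or misses it.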

  • 2

    If you increase your sample size (n), you decrease your margin of error.

    If you increase your confidence level (C), then you increase your margin of error.

    A smaller margin of error is good because we get a smaller range of where to expect the true population parameter.

    Confidence interval formulas look like: estimate ± margin of error.

    We write the intervals as (lower bound, upper bound).

  • 3

    Confidence Interval for a Population Mean, $\mu$:

    $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$

    where $z^*$ is the value on the standard normal curve with area C between $-z^*$ and $z^*$.

    z* 1.645 1.960 2.576

    C 90% 95% 99%

    (Table D in the back of the book contains more values, but these are the most common.)
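
    The tabled critical values can also be recovered from the inverse of the standard normal CDF. This is a minimal sketch using Python's standard library; the helper name `z_star` is an assumption made for this example.

```python
from statistics import NormalDist

def z_star(conf_level):
    """Critical value leaving central area conf_level under the standard normal curve."""
    # Area conf_level in the middle leaves (1 - conf_level) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + conf_level / 2)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```

    Rounded to three decimals, these agree with the table values 1.645, 1.960, and 2.576.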

    Sample Size, n, for Desired Margin of Error, m:

    $n = \left( \frac{z^* \sigma}{m} \right)^2$

    Note that it is the sample size, n, that influences the margin of error. The population size has nothing to do with it.
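
    A short sketch of the sample-size calculation follows. The numbers (population standard deviation 10, desired margins 2 and 1) are hypothetical, and `required_sample_size` is a name invented for this example. The result is rounded up, since rounding down would give a margin slightly larger than requested.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, conf_level):
    """Smallest n whose z-interval margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + conf_level / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 10, 95% confidence.
n_for_margin_2 = required_sample_size(sigma=10, margin=2, conf_level=0.95)
n_for_margin_1 = required_sample_size(sigma=10, margin=1, conf_level=0.95)
print(n_for_margin_2, n_for_margin_1)
```

    Halving the desired margin of error roughly quadruples the required sample size, because m enters the formula squared.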

    Ways to reduce your margin of error:
    1.) Increase sample size
    2.) Use a lower level of confidence (smaller C)
    3.) Reduce $\sigma$
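
    Each of the three ways can be seen by changing one input at a time in the margin-of-error formula. The baseline values below (standard deviation 10, n = 25, 95% confidence) are made up for illustration, as is the helper name `margin_of_error`.

```python
from statistics import NormalDist

def margin_of_error(sigma, n, conf_level):
    """Margin of error z* . sigma / sqrt(n) for a z-interval."""
    z = NormalDist().inv_cdf(0.5 + conf_level / 2)
    return z * sigma / n ** 0.5

# Hypothetical baseline, then one change at a time.
base = margin_of_error(sigma=10, n=25, conf_level=0.95)
bigger_n = margin_of_error(sigma=10, n=100, conf_level=0.95)      # 1.) larger sample
lower_c = margin_of_error(sigma=10, n=25, conf_level=0.90)        # 2.) lower confidence
smaller_sigma = margin_of_error(sigma=5, n=25, conf_level=0.95)   # 3.) smaller sigma

for label, m in [("baseline", base), ("n = 100", bigger_n),
                 ("C = 90%", lower_c), ("sigma = 5", smaller_sigma)]:
    print(f"{label:>10}: margin = {m:.2f}")
```

    In practice the sample size is usually the easiest of the three to control; the population standard deviation is generally fixed by what is being measured.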

    Be careful!!!! You can only use the formula $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ under certain circumstances:

    Data must be an SRS from the population.

    Do not use if the sampling is anything more complicated than an SRS.

    Data must be collected correctly (no bias). The margin of error covers only random sampling errors. Undercoverage and nonresponse are not covered.

    Outliers can have a big effect on the confidence interval. (This makes sense because we use the mean and standard deviation to get a CI.)

    You must know the standard deviation of the population, $\sigma$.

  • 4

    EXAMPLE 1: A questionnaire of spending habits was given to a random sample of college students. Each student was asked to record and report the amount of money they spent on textbooks in a semester. The sample of 130 students resulted in an average of $422 with standard deviation of $57.

    a) Give a 90% confidence interval for the mean amount of money spent by college students on textbooks.

    b) Is it true that 90% of the students spent the amount of money found in the interval from part (a)? Explain your answer.

    c) What is the margin of error for the 90% confidence interval?

    d) How many students should you sample if you want a margin of error of $5 for a 90% confidence interval?
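
    One way to check the arithmetic for parts (a), (c), and (d) is sketched below. It assumes the reported standard deviation of $57 can be treated as the known population standard deviation, which the z-interval in this chapter requires; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, x_bar = 130, 422.0
sigma = 57.0  # assumption: treated as the known population standard deviation

z = NormalDist().inv_cdf(0.95)              # z* for 90% confidence (about 1.645)
margin = z * sigma / math.sqrt(n)           # part (c)
ci = (x_bar - margin, x_bar + margin)       # part (a)
n_needed = math.ceil((z * sigma / 5.0) ** 2)  # part (d), rounded up

print(f"margin of error: {margin:.2f}")
print(f"90% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"sample size for a $5 margin: {n_needed}")
```

    For part (b), note that the interval describes plausible values for the population mean, not the range in which 90% of individual students' spending falls.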

  • 5

    EXAMPLE 2: A sample of 12 STAT 301 students yields the following Exam 1 scores:

    78 62 99 85 94 53 88 90 86 92 75 92

    Assume that the population standard deviation is 10. The sample mean can be calculated using SPSS or a calculator to be 82.83. (Note: Do NOT use any SPSS confidence intervals; they are good only for Chapter 7, not this type of CI. You must get these Z confidence intervals by hand.)

    a) Find the 90% confidence interval for the mean score for STAT 301 students.

    b) Find the 95% confidence interval.

    c) Find the 99% confidence interval.

    d) How do the margins of error in (a), (b), and (c) change as the confidence level increases? Why?
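
    The hand calculations can be double-checked with a few lines of code. This sketch reads the score list as twelve values (53 and 88 as separate scores), which matches the stated sample size and the stated mean of 82.83; the dictionary name `margins` is just for this example.

```python
from statistics import NormalDist, mean

scores = [78, 62, 99, 85, 94, 53, 88, 90, 86, 92, 75, 92]
sigma = 10.0  # population standard deviation given in the example

x_bar = mean(scores)
margins = {}
for c in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(0.5 + c / 2)
    m = z * sigma / len(scores) ** 0.5
    margins[c] = m
    print(f"{c:.0%} CI: ({x_bar - m:.2f}, {x_bar + m:.2f}), margin = {m:.2f}")
```

    The margin of error grows with the confidence level because a larger central area under the normal curve requires a larger $z^*$, while the standard deviation and sample size stay the same.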

  • 6

    Section 6.2: Hypothesis Testing

    The 4 steps common to all tests of significance:
    1. State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.
    2. Calculate the value of the test statistic.
    3. Draw a picture of what $H_a$ looks like, and find the P-value.
    4. State your conclusion about the data in a sentence, using the P-value and/or comparing the P-value to a significance level for your evidence.

    STEP 1: State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.

    To do a significance test, you need 2 hypotheses:

    $H_0$, Null Hypothesis: the statement being tested, usually phrased as "no effect" or "no difference."

    $H_a$, Alternative Hypothesis: the statement we hope or suspect is true instead of $H_0$.

    Hypotheses always refer to some population or model, not to a particular outcome.

    Hypotheses can be one-sided or two-sided.

    One-sided hypothesis: covers just part of the range for your parameter

    $H_0: \mu = 10$    OR    $H_0: \mu = 10$
    $H_a: \mu > 10$          $H_a: \mu < 10$

  • 7

    Example (Exercise 6.37, p. 418): Each of the following situations requires a significance test about a population mean. State the appropriate null hypothesis $H_0$ and alternative hypothesis $H_a$ in each case:

    a. Census Bureau data shows that the mean household income in the area served by a shopping mall is $72,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.

    b. Last year, your company's service technicians took an average of 1.8 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year's data show a different average response time?

    STEP 2: Calculate the value of the test statistic.

    A test statistic measures compatibility between $H_0$ and the data. The formula for the test statistic will vary between different types of problems. In problems like those we study in Chapter 6, the test statistic will be the Z-score.

    STEP 3: Draw a picture of what $H_a$ looks like, and find the P-value.

    P-value: the probability, computed assuming that $H_0$ is true, that the test statistic would take a value as extreme or more extreme than that actually observed due to random fluctuation. It is a measure of how unusual your sample results are.

    The smaller the P-value, the stronger the evidence against $H_0$ provided by the data.

    Calculate the P-value by using the sampling distribution of the test statistic (only the normal distribution for Chapter 6).

    STEP 4: Compare your P-value to a significance level. State your conclusion about the data in a sentence.

    Compare the P-value to a significance level, $\alpha$.

    If the P-value $\le \alpha$, we can reject $H_0$.

    If you can reject $H_0$, your results are significant.

    If you do not reject $H_0$, your results are not significant.

  • 8

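    Z Test for a Population Mean

    To test the hypothesis $H_0: \mu = \mu_0$ based on an SRS of size n from a population with unknown mean $\mu$ and known standard deviation $\sigma$, compute the test statistic:

    $Z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

    The P-values for a test of $H_0$ against:

    $H_a: \mu > \mu_0$ is $P(Z \ge Z_0)$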

    $H_a: \mu < \mu_0$ is $P(Z \le Z_0)$

    $H_a: \mu \ne \mu_0$ is $2P(Z \ge |Z_0|)$

  • 9

    EXAMPLES

    1. Last year the government made a claim that the average income of the American people was $33,950. However, a sample of 50 people taken recently showed an average income of $34,076 with a population standard deviation of $324. Is the government's estimate too low? Conduct a significance test to see if the true mean is more than the reported average. Use an $\alpha$ = 0.01.

    2. An environmentalist collects a liter of water from 45 different locations along the banks of a stream. He measures the amount of dissolved oxygen in each specimen. The mean oxygen level is 4.62 mg, with the overall standard deviation of 0.92. A water purifying company claims that the mean level of oxygen in the water is 5 mg. Conduct a hypothesis test with $\alpha$ = 0.001 to determine whether the mean oxygen level is less than 5 mg.

    3. An agroeconomist examines the cellulose content of a variety of alfalfa hay. Suppose that the cellulose content in the population has a standard deviation of 8 mg. A sample of 15 cuttings has a mean cellulose content of 145 mg. A previous study claimed that the mean cellulose content was 140 mg. Perform a hypothesis test to determine if the mean cellulose content is different from 140 mg if $\alpha$ = 0.05.
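
    The three examples follow the same recipe, so a single helper can check the hand calculations. This is a sketch rather than a prescribed method: the function name `z_test`, the labels "greater", "less", and "two-sided", and the short example names are all invented here, and the stated standard deviations are treated as known population values.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative):
    """Return (z, p_value) for a one-sample z test with known sigma."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: mu > mu_0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: mu < mu_0
        p = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: mu != mu_0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p

# (name, x_bar, mu_0, sigma, n, alternative, alpha) taken from examples 1-3.
examples = [
    ("income", 34076, 33950, 324, 50, "greater", 0.01),
    ("oxygen", 4.62, 5, 0.92, 45, "less", 0.001),
    ("cellulose", 145, 140, 8, 15, "two-sided", 0.05),
]

results = {}
for name, x_bar, mu_0, sigma, n, alt, alpha in examples:
    z, p = z_test(x_bar, mu_0, sigma, n, alt)
    results[name] = (z, p, p <= alpha)
    verdict = "reject H0" if p <= alpha else "do not reject H0"
    print(f"{name}: z = {z:.2f}, P-value = {p:.4f}, alpha = {alpha} -> {verdict}")
```

    Examples 1 and 2 have test statistics of similar size, yet only the first leads to rejection, because the second uses a much stricter significance level.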

  • 10

    How does $\alpha$ relate to confidence intervals?

    If you have a 2-sided test, and if the $\alpha$ and confidence level add to 100%, you can reject $H_0$ if $\mu_0$ (the number you were checking) is not in the confidence interval.

    a) Find a 95% confidence interval for the mean cellulose content from the above example.

    b) Now try the test from example number 3 again using the confidence interval from part (a) to do the hypothesis test. (The result should be the same.)
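
    The agreement between the two approaches can be verified numerically for the cellulose example. The sketch below builds the 95% interval and the two-sided P-value from the same inputs; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, mu_0, sigma, n = 145.0, 140.0, 8.0, 15
conf_level = 0.95
alpha = 1 - conf_level  # alpha and confidence level add to 100%

std_error = sigma / math.sqrt(n)

# Confidence-interval approach: reject H0 if mu_0 falls outside the interval.
z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)
lower, upper = x_bar - z_star * std_error, x_bar + z_star * std_error
reject_by_ci = not (lower <= mu_0 <= upper)

# P-value approach for the two-sided test.
z = (x_bar - mu_0) / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_by_p = p_value <= alpha

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"P-value: {p_value:.4f}")
print(f"reject by CI: {reject_by_ci}, reject by P-value: {reject_by_p}")
```

    Since 140 lies just below the lower end of the interval, both approaches reject $H_0$ at the 5% level.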

    Annual Drinking Water Quality Report, 2004, Town of Brookston, IN

    I'm pleased to report that our drinking water is safe and meets federal and state requirements.

    Test Results (MCL is the maximum contaminant level, the highest level of a contaminant that is allowed in drinking water.)

    Contaminant            Violation Y/N   Level Detected    Unit measurement   MCL
    Beta/photon emitters   N               2.1 ± 3.2         mrem/yr            4
    Alpha emitters         N               0 ± 1.6           pCi/l              15
    Barium                 N               0.216             ppm                2
    Copper                 N               0.039 to 0.453    ppm                1.3
    Fluoride               N               0.01              ppm                4
    Sodium                 N               0.0               ppm                N/A

    One of these violation reports should actually be a "yes" instead of a "no." Which one is it and why? What hypotheses go along with these confidence intervals?

    Note: When I called the town of Brookston office to ask them about this, the water manager called the state EPA office to get more information. What they told him was that, yes, technically I was correct, but that they don't use the confidence intervals that are reported. Apparently these are the FEDERAL EPA rules. They only use the mean. I tried to get sample size or other information, but I wasn't able to learn anything more.
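
    A quick way to explore the question is to compare the upper end of each reported interval with its MCL. This sketch rests on an assumption about the report: the two-number entries for the emitters are read as estimate ± margin (2.1 ± 3.2 and 0 ± 1.6), and the copper entry as a lower-to-upper range. The row layout and the name `flagged` are invented for this example.

```python
# (contaminant, lower bound, upper bound, MCL)
# Assumption: emitter entries are estimate +/- margin; copper is a reported range.
rows = [
    ("Beta/photon emitters", 2.1 - 3.2, 2.1 + 3.2, 4.0),
    ("Alpha emitters", 0.0 - 1.6, 0.0 + 1.6, 15.0),
    ("Copper", 0.039, 0.453, 1.3),
]

# Flag any contaminant whose interval extends above the allowed maximum.
flagged = [name for name, lower, upper, mcl in rows if upper > mcl]

for name, lower, upper, mcl in rows:
    status = "interval exceeds MCL" if upper > mcl else "interval below MCL"
    print(f"{name}: upper bound {upper:.3f} vs MCL {mcl} -> {status}")
```

    Under that reading, only one interval reaches past its MCL, even though the point estimate itself is below the limit.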

  • 11

    P-values can be more informative than a reject/do not reject $H_0$ based on $\alpha$. As the P-value gets smaller, the evidence for rejecting $H_0$ gets stronger.

    Just because we use $\alpha$ = 0.05 a lot doesn't mean that's the level you have to use; it's just the most common. There's nothing particularly special about that level.

    In a large sample, even tiny deviations from the null hypothesis can be statistically significant.

    If we fail to reject $H_0$, it may be because $H_0$ is true or because our sample size is insufficient to detect the alternative.

    Plot your data and look at your P-value to determine your conclusions. Could outliers be part of the problem?

    A confidence interval actually estimates the size of an effect rather than simply asking if it is too large to reasonably occur by chance alone.

    You must have a well-designed experiment in order for statistical inference to work. Randomization is important.
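
    The role of sample size can be illustrated by holding a small deviation fixed and varying n. All numbers below are hypothetical (null mean 100, observed mean 100.5, known standard deviation 10), and `two_sided_p` is a helper name made up for this sketch.

```python
import math
from statistics import NormalDist

def two_sided_p(x_bar, mu_0, sigma, n):
    """Two-sided P-value for a one-sample z test with known sigma."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical: the same half-point deviation observed at three sample sizes.
p_values = {n: two_sided_p(x_bar=100.5, mu_0=100.0, sigma=10.0, n=n)
            for n in (25, 400, 10000)}

for n, p in p_values.items():
    print(f"n = {n:>5}: P-value = {p:.6f}")
```

    With n = 25 the half-point difference is nowhere near significant, while with n = 10000 the same difference gives a very small P-value. The deviation did not change; only the ability to detect it did.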