Confidence Intervals and Hypothesis Testing


This document tells you how to use confidence intervals and hypothesis testing in statistical methods.

Transcript of Confidence Intervals and Hypothesis Testing

  • 1

    Chapter 6: Confidence Intervals and Hypothesis Testing

    When analyzing data, we can't just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics x̄ and p̂ (sample mean and sample proportion), we get different answers due to variability. So we have to perform statistical inference:

    Confidence Interval: when you want to estimate a population parameter.

    Significance Testing: when we want to assess the evidence provided by the data in favor of some claim about the population.

    Section 6.1: Confidence Intervals

    Confidence intervals allow us to estimate a range of values for the population mean or proportion. The true mean or proportion for the population exists and is a fixed number, but we don't know it! Using sample statistics, we get an estimate of where we expect the population parameter to be.

    If we take a single sample, our single confidence interval "net" may or may not include the population parameter.

    However, if we take many samples of the same size and create a confidence interval from each sample statistic, over the long run 95% of our confidence intervals will contain the true population parameter (if we are using a 95% confidence level).
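
    The long-run claim above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is normal with a hypothetical mean of 50 and standard deviation of 10, each sample has n = 25, and each interval is the known-σ z-interval x̄ ± z*·σ/√n. None of these numbers come from the notes.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=1):
    """Share of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the hypothetical population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Count the interval as a hit when it captures the true mean.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(f"Share of intervals containing mu: {coverage():.3f}")
```

    Any single interval either contains the true mean or it does not; the confidence level describes the share of intervals that succeed over many repetitions.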

  • 2

    If you increase your sample size (n), you decrease your margin of error.

    If you increase your confidence level (C), then you increase your margin of error.

    A smaller margin of error is good because we get a smaller range of where to expect the true population parameter.

    Confidence interval formulas look like: estimate ± margin of error.

    We write the intervals as (lower bound, upper bound).
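
    A quick way to see both effects is to tabulate the margin of error for a few sample sizes and confidence levels. The sketch below assumes the known-σ margin z*·σ/√n for a mean and a hypothetical population standard deviation of 10; the function name is chosen for illustration.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, level):
    """Margin of error z* * sigma / sqrt(n) for a z-interval."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return z_star * sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    row = ", ".join(
        f"C={c:.0%}: {margin_of_error(sigma, n, c):.2f}"
        for c in (0.90, 0.95, 0.99)
    )
    print(f"n={n:3d} -> {row}")
```

    Reading down a column shows the margin shrinking as n grows; reading across a row shows it growing with C.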

  • 3

    Confidence Interval for a Population Mean, μ:

    $$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

    where z* is the value on the standard normal curve with area C between −z* and z*.

    z*    1.645    1.960    2.576

    C     90%      95%      99%

    (Table D in the back of the book contains more values, but these are the most common.)
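
    The z* values in the table can also be computed rather than looked up. The following is a minimal sketch using Python's standard library; the function names z_star and z_interval are chosen here for illustration and are not part of the notes.

```python
import math
from statistics import NormalDist


def z_star(level):
    """Critical value with central area `level` under the standard normal curve."""
    return NormalDist().inv_cdf((1 + level) / 2)


def z_interval(x_bar, sigma, n, level):
    """Return (lower bound, upper bound) of x_bar +/- z* * sigma / sqrt(n)."""
    margin = z_star(level) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


for level in (0.90, 0.95, 0.99):
    print(f"C = {level:.0%}: z* = {z_star(level):.3f}")
```

    The central area C leaves (1 − C)/2 in each tail, which is why the lookup uses the (1 + C)/2 quantile.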

    Sample Size, n, for Desired Margin of Error, m:

    $$n = \left( \frac{z^* \sigma}{m} \right)^2$$

    Note that it is the sample size, n, that influences the margin of error. The population size has nothing to do with it.
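
    As a sketch of how the sample-size formula is used in practice, the function below evaluates n = (z*·σ/m)² and rounds up, since a sample size must be a whole number and rounding down would give a margin slightly larger than requested. The inputs (σ = 10, m = 2, C = 95%) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(sigma, margin, level):
    """Smallest whole n whose z-interval margin is at most `margin`."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)


n_needed = sample_size(sigma=10.0, margin=2.0, level=0.95)
print(f"Sample size needed: {n_needed}")
```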

    Ways to reduce your margin of error:
    1.) Increase sample size
    2.) Use a lower level of confidence (smaller C)
    3.) Reduce σ

    Be careful!!!! You can only use the formula

    $$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

    under certain circumstances:

    Data must be an SRS from the population.

    Do not use if the sampling is anything more complicated than an SRS.

    Data must be collected correctly (no bias). The margin of error covers only random sampling errors. Undercoverage and nonresponse are not covered.

    Outliers can have a big effect on the confidence interval. (This makes sense because we use the mean and standard deviation to get a CI.)

    You must know the standard deviation of the population, σ.

  • 4

    EXAMPLE 1: A questionnaire of spending habits was given to a random sample of college students. Each student was asked to record and report the amount of money they spent on textbooks in a semester. The sample of 130 students resulted in an average of $422 with standard deviation of $57.

    a) Give a 90% confidence interval for the mean amount of money spent by college students on textbooks.

    b) Is it true that 90% of the students spent the amount of money found in the interval from part (a)? Explain your answer.

    c) What is the margin of error for the 90% confidence interval?

    d) How many students should you sample if you want a margin of error of $5 for a 90% confidence interval?
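
    One way to check the arithmetic for parts (a), (c), and (d) is sketched below. It assumes the reported standard deviation of $57 can be treated as the population σ, which is what the z-interval requires; the variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

# Values from Example 1; sigma is the reported SD treated as the population SD.
n, x_bar, sigma = 130, 422.0, 57.0
z_star = NormalDist().inv_cdf((1 + 0.90) / 2)

# Parts (a) and (c): margin of error and the 90% interval.
margin = z_star * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

# Part (d): sample size for a $5 margin, rounded up to a whole student.
n_needed = math.ceil((z_star * sigma / 5.0) ** 2)

print(f"margin of error: {margin:.2f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
print(f"sample size for a $5 margin: {n_needed}")
```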

  • 5

    EXAMPLE 2: A sample of 12 STAT 301 students yields the following Exam 1 scores:

    78 62 99 85 94 53 88 90 86 92 75 92

    Assume that the population standard deviation is 10. The sample mean can be calculated using SPSS or a calculator to be 82.83. (Note: Do NOT use any SPSS confidence intervals; they are good only for Chapter 7, not this type of CI. You must get these Z confidence intervals by hand.)

    a) Find the 90% confidence interval for the mean score for STAT 301 students.

    b) Find the 95% confidence interval.

    c) Find the 99% confidence interval.

    d) How do the margins of error in (a), (b), and (c) change as the confidence level increases? Why?
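
    The hand calculations for this example can be cross-checked with a short script. This is a sketch that applies the known-σ interval x̄ ± z*·σ/√n with σ = 10 as stated in the example; it is not a substitute for working the intervals by hand.

```python
import math
from statistics import NormalDist, fmean

scores = [78, 62, 99, 85, 94, 53, 88, 90, 86, 92, 75, 92]
sigma = 10.0  # population standard deviation given in the example
n = len(scores)
x_bar = fmean(scores)

margins = {}
for level in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margins[level] = z_star * sigma / math.sqrt(n)
    lower, upper = x_bar - margins[level], x_bar + margins[level]
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f}), margin = {margins[level]:.2f}")
```

    Only z* changes between the three intervals, so the margin of error grows with the confidence level while x̄, σ, and n stay fixed.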

  • 6

    Section 6.2: Hypothesis Testing

    The 4 steps common to all tests of significance:
    1. State the null hypothesis H0 and the alternative hypothesis Ha.
    2. Calculate the value of the test statistic.
    3. Draw a picture of what Ha looks like, and find the P-value.
    4. State your conclusion about the data in a sentence, using the P-value and/or comparing the P-value to a significance level for your evidence.

    STEP 1: State the null hypothesis H0 and the alternative hypothesis Ha.

    To do a significance test, you need 2 hypotheses:

    H0, Null Hypothesis: the statement being tested, usually phrased as "no effect" or "no difference."

    Ha, Alternative Hypothesis: the statement we hope or suspect is true instead of H0.

    Hypotheses always refer to some population or model, not to a particular outcome. Hypotheses can be one-sided or two-sided.

    One-sided hypothesis: covers just part of the range for your parameter.

    H0: μ = 10        OR        H0: μ = 10
    Ha: μ > 10                  Ha: μ < 10

  • 7

    Example (Exercise 6.37, p. 418): Each of the following situations requires a significance test about a population mean. State the appropriate null hypothesis H0 and alternative hypothesis Ha in each case:

    a. Census Bureau data shows that the mean household income in the area served by a shopping mall is $72,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.

    b. Last year, your company's service technicians took an average of 1.8 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year's data show a different average response time?

    STEP 2: Calculate the value of the test statistic.

    A test statistic measures compatibility between H0 and the data. The formula for the test statistic will vary between different types of problems. In problems like those we studied in Chapter 6, the test statistic will be the Z-score.

    STEP 3: Draw a picture of what Ha looks like, and find the P-value.

    P-value: the probability, computed assuming that H0 is true, that the test statistic would take a value as extreme or more extreme than that actually observed due to random fluctuation. It is a measure of how unusual your sample results are.

    The smaller the P-value, the stronger the evidence against H0 provided by the data.

    Calculate the P-value by using the sampling distribution of the test statistic (only the normal distribution for Chapter 6).

    STEP 4: Compare your P-value to a significance level. State your conclusion about the data in a sentence.

    Compare the P-value to a significance level, α.

    If the P-value ≤ α, we can reject H0.

    If you can reject H0, your results are significant.

    If you do not reject H0, your results are not significant.

  • 8

    Z Test for a Population Mean

    To test the hypothesis H0: μ = μ0 based on an SRS of size n from a population with unknown mean μ and known standard deviation σ, compute the test statistic:

    $$Z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

    The P-values for a test of H0 against:

    Ha: μ > μ0 is P(Z ≥ Z0)

    Ha:
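
    The Z test can be sketched in a few lines of Python. The upper-tail case follows the P-value given above; the lower-tail and two-sided cases use the standard textbook definitions (the lower-tail probability, and twice the tail probability beyond |Z0|) and are included here as an assumption about what the full slide covered. The function name and the alternative labels are chosen for illustration.

```python
import math
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative):
    """Return (z statistic, P-value) for H0: mu = mu0 with known sigma.

    alternative is "greater", "less", or "two-sided" (labels used only in this sketch).
    """
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)          # P(Z >= z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)              # P(Z <= z)
    elif alternative == "two-sided":
        p_value = 2 * std_normal.cdf(-abs(z))    # 2 * P(Z >= |z|)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical numbers: x_bar = 10.5, mu0 = 10, sigma = 2, n = 64.
z, p = z_test(10.5, 10.0, 2.0, 64, "greater")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```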

  • 9

    EXAMPLES

    1. Last year the government made a claim that the average income of the American people was $33,950. However, a sample of 50 people taken recently showed an average income of $34,076 with a population standard deviation of $324. Is the government's estimate too low? Conduct a significance test to see if the true mean is more than the reported average. Use α = 0.01.

    2. An environmentalist collects a liter of water from 45 different locations along the banks of a stream. He measures the amount of dissolved oxygen in each specimen. The mean oxygen level is 4.62 mg, with the overall standard deviation of 0.92. A water purifying company claims that the mean level of oxygen in the water is 5 mg. Conduct a hypothesis test with α = 0.001 to determine whether the mean oxygen level is less than 5 mg.

    3. An agro-economist examines the cellulose content of a variety of alfalfa hay. Suppose that the cellulose content in the population has a standard deviation of 8 mg. A sample of 15 cuttings has a mean cellulose content of 145 mg. A previous study claimed that the mean cellulose content was 140 mg. Perform a hypothesis test to determine if the mean cellulose content is different from 140 mg if α = 0.05.
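
    The three examples can be checked numerically with the sketch below. It assumes each stated standard deviation is the population σ (Example 2 calls it the "overall" standard deviation) and uses the Z statistic (x̄ − μ0)/(σ/√n) with upper-tail, lower-tail, and two-sided P-values respectively.

```python
import math
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """Z statistic for H0: mu = mu0 with known sigma."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))


phi = NormalDist().cdf  # standard normal cumulative distribution function

# Example 1: Ha: mu > 33,950 (upper tail), alpha = 0.01
z1 = z_statistic(34076, 33950, 324, 50)
p1 = 1 - phi(z1)

# Example 2: Ha: mu < 5 (lower tail), alpha = 0.001
z2 = z_statistic(4.62, 5, 0.92, 45)
p2 = phi(z2)

# Example 3: Ha: mu != 140 (two-sided), alpha = 0.05
z3 = z_statistic(145, 140, 8, 15)
p3 = 2 * phi(-abs(z3))

for label, z, p, alpha in (("1", z1, p1, 0.01), ("2", z2, p2, 0.001), ("3", z3, p3, 0.05)):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"Example {label}: z = {z:.2f}, P-value = {p:.4f}, alpha = {alpha}: {decision}")
```

    Examples 1 and 2 have Z statistics of similar size, yet the decisions differ because α = 0.001 is a much stricter standard than α = 0.01.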

  • 10

    How does α relate to confidence intervals?

    If you have a 2-sided test, and if α and the confidence level add to 100%, you can reject H0 if μ0 (the number you were checking) is not in the confidence interval.

    a) Find a 95% confidence interval for the mean cellulose content from the above example.

    b) Now try the test from example number 3 again, using the confidence interval from part (a) to do the hypothesis test. (The result should be the same.)
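
    The agreement between the two approaches can be verified with a short sketch. It uses the cellulose numbers from example 3 (x̄ = 145, μ0 = 140, σ = 8, n = 15, α = 0.05) and compares the decision from the 95% interval with the decision from the two-sided P-value.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 145.0, 140.0, 8.0, 15, 0.05
std_normal = NormalDist()

# 95% confidence interval: the confidence level is 1 - alpha.
z_star = std_normal.inv_cdf(1 - alpha / 2)
margin = z_star * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
reject_by_interval = not (lower <= mu0 <= upper)

# Two-sided Z test at the same alpha.
z = (x_bar - mu0) / (sigma / math.sqrt(n))
p_value = 2 * std_normal.cdf(-abs(z))
reject_by_test = p_value <= alpha

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"P-value: {p_value:.4f}")
print(f"Reject H0 by interval: {reject_by_interval}, by test: {reject_by_test}")
```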

    Annual Drinking Water Quality Report, 2004, Town of Brookston, IN

    I'm pleased to report that our drinking water is safe and meets federal and state requirements.

    Test Results (MCL is the maximum contaminant level, the highest level of a contaminant that is allowed in drinking water.)

    Contaminant             Violation Y/N   Level Detected    Unit of Measurement   MCL
    Beta/photon emitters    N               2.1 ± 3.2         mrem/yr               4
    Alpha emitters          N               0 ± 1.6           pCi/l                 15
    Barium                  N               0.216             ppm                   2
    Copper                  N               0.039 to 0.453    ppm                   1.3
    Fluoride                N               0.01              ppm                   4
    Sodium                  N               0.0               ppm                   N/A

    One of these violation reports should actually be a "yes" instead of a "no." Which one is it and why? What hypotheses go along with these confidence intervals?

    Note: When I called the town of Brookston office to ask them about this, the water manager called the state EPA office to get more information. What they told him was that, yes, technically I was correct, but that they don't use the confidence intervals that are reported. Apparently these are the FEDERAL EPA rules. They only use the mean. I tried to get sample size or other information, but I wasn't able to learn anything more.
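
    One way to explore the question is to compare the upper end of each reported interval with its MCL. The sketch below rests on an assumption about the report's notation: the two-number Level Detected entries for the beta/photon and alpha emitters are read as estimate ± margin of error. The function name is chosen for illustration.

```python
# (contaminant, estimate, margin of error, MCL), read from the report under the
# assumption that the two-number entries mean "estimate +/- margin".
reports = [
    ("Beta/photon emitters", 2.1, 3.2, 4.0),
    ("Alpha emitters", 0.0, 1.6, 15.0),
]


def interval_reaches_mcl(estimate, margin, mcl):
    """True when the upper end of estimate +/- margin is above the MCL."""
    return estimate + margin > mcl


flags = {name: interval_reaches_mcl(est, m, mcl) for name, est, m, mcl in reports}
for name, flagged in flags.items():
    print(f"{name}: upper bound above MCL? {flagged}")
```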

  • 11

    P-values can be more informative than a reject/do not reject H0 decision based on α. As the P-value gets smaller, the evidence for rejecting H0 gets stronger.

    Just because we use α = 0.05 a lot doesn't mean that's the level you have to use; it's just the most common. There's nothing particularly special about that level.

    In a large sample, even tiny deviations from the null hypothesis can be statistically significant, whether or not they matter in practice.

    If we fail to reject H0, it may be because H0 is true or because our sample size is insufficient to detect the alternative.
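
    The role of sample size can be illustrated with a sketch that holds a small observed deviation fixed and varies n. All numbers are hypothetical: μ0 = 100, σ = 10, and a sample mean of 100.1 at every sample size.

```python
import math
from statistics import NormalDist


def two_sided_p(x_bar, mu0, sigma, n):
    """Two-sided P-value of the Z test for H0: mu = mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


# Same tiny deviation (0.1) from the null value, evaluated at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    p = two_sided_p(100.1, 100.0, 10.0, n)
    print(f"n = {n:>9,}: P-value = {p:.4f}")
```

    With n = 100 the deviation is nowhere near significant, while with a million observations the same deviation gives a P-value near zero.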

    Plot your data and look at your P-value to determine your conclusions. Could outliers be part of the problem?

    A confidence interval actually estimates the size of an effect rather than simply asking if it is too large to reasonably occur by chance alone.

    You must have a well-designed experiment in order for statistical inference to work. Randomization is important.