The Distribution of Single Variables

Page 1: The Distribution of Single Variables. Two Types of Variables Continuous variables – Equal intervals of measurement – Known zero-point that is meaningful.

The Distribution of Single Variables

Page 2

Two Types of Variables

• Continuous variables
  – Equal intervals of measurement
  – Known zero-point that is meaningful

• Discrete variables
  – Simply counts of attributes
  – Generate category frequencies

Page 3

For continuous variables,

• Equal intervals means that the distance between any two adjacent values is identical

e.g., the difference between 21 and 22 years of age is identical in years to the difference between 33 and 34 years of age

• Meaningful zero-point means that a value of zero makes sense

e.g., a GPA of 0.00 means you have completed no coursework successfully

Page 4

For discrete variables,

All we can do is count the number of observations that fall into its various categories

e.g., the number of males and females in this class
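Counting category membership can be sketched in SAS as follows. This is only an illustration: the data set name roster, its variables, and the five rows are made up, not taken from the course data.

```sas
/* Hypothetical class roster: one row per student (made-up values). */
data roster;
   input name $ gender $;
   datalines;
Ana F
Ben M
Cara F
Dev M
Eli M
;
run;

/* Count how many observations fall into each category of gender. */
proc freq data=roster;
   tables gender;
   title1 'Gender Counts for a Hypothetical Class';
run;
```

For a discrete variable such as gender, the frequency table is the whole summary; statistics such as a mean would have no interpretation.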

Page 5

Using SAS to Produce Statistics for Single Variables

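/* Point SAS at the data sets and the format catalog on drive A */
libname mydata 'a:\';
libname library 'a:\';
options ps=66 nodate nonumber;

/* One-way frequency table for the discrete variable citysize */
proc freq data=mydata.cities;
   table citysize;
   title1 'One-Way Frequency Distribution';
   title2;                 /* leave the second title line blank */
   title3 'PPD 404';
run;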

 

Page 6

One-Way Frequency Distribution

PPD 404

SIZE OF CITY, DICHOTOMY

                                  Cumulative   Cumulative
CITYSIZE   Frequency   Percent     Frequency      Percent
---------------------------------------------------------
Small             45      71.4            45         71.4
Large             18      28.6            63        100.0
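Each percent in this table is the category frequency divided by the total of 63 cities (45/63 is about 71.4 percent, 18/63 about 28.6 percent), and the cumulative columns are running totals. The sketch below rebuilds the same table from the two counts alone, assuming only the frequencies printed above; the data set name citycounts and the variable n_cities are illustrative, not part of the course files.

```sas
/* Category counts copied from the frequency table above. */
data citycounts;
   input citysize $ n_cities;
   datalines;
Small 45
Large 18
;
run;

/* WEIGHT treats each row as n_cities observations;       */
/* ORDER=DATA keeps Small before Large, as in the output. */
proc freq data=citycounts order=data;
   tables citysize;
   weight n_cities;
   title1 'Frequency Table Rebuilt from Counts';
run;
```

The WEIGHT statement is handy whenever only tabulated counts are available rather than one row per city.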

Page 7

libname mydata 'a:\';
libname library 'a:\';
options ps=66 nodate nonumber;

/* Vertical bar chart; DISCRETE gives one bar per distinct value of spending */
proc chart data=mydata.cities;
   vbar spending / discrete;
   title1 'Vertical Bar Chart';
run;

Page 8

Vertical Bar Chart

[Vertical bar chart: Frequency (vertical axis, 1 to 23) for POLICE EXPENDITURE, TRICHOTOMY, with one bar each for Low, Medium, and High. The three bars reach 23, 22, and 18; the transcript does not preserve which category has which height.]

Page 9

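libname mydata 'a:\';
libname library 'a:\';
options ps=66 nodate nonumber;

/* PLOT adds the stem-and-leaf, box, and normal probability plots; */
/* NORMAL adds a test of normality                                 */
proc univariate data=mydata.cities plot normal;
   var populat;
   title1 'Univariate and EDA Statistics';
run;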

Page 10

Univariate and EDA Statistics

PPD 404

Univariate Procedure

Variable=POPULAT     NUMBER OF RESIDENTS, IN 1,000S

                  Moments

N                63   Sum Wgts          63
Mean       587.4127   Sum            37007
Std Dev    1114.554   Variance     1242231
Skewness   5.090201   Kurtosis    30.74326
USS        98756687   CSS         77018305
CV         189.7395   Std Mean    140.4206
T:Mean=0   4.183237   Pr>|T|        0.0001
Num ^= 0         63   Num > 0           63
M(Sign)        31.5   Pr>=|M|       0.0001
Sgn Rank       1008   Pr>=|S|       0.0001
W:Normal   0.468356   Pr<W          0.0001
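Several entries in the Moments table follow from N, Sum, and USS (the uncorrected sum of squares) alone. The data step below is a minimal sketch of that arithmetic, assuming the three printed values above as inputs; the variable names are illustrative.

```sas
data _null_;
   /* Inputs copied from the Moments table above. */
   n_obs   = 63;
   total   = 37007;        /* Sum */
   uss_val = 98756687;     /* USS: sum of squared values */

   mean_x   = total / n_obs;
   css_val  = uss_val - total**2 / n_obs;   /* CSS: squares about the mean */
   var_x    = css_val / (n_obs - 1);        /* Variance */
   std_x    = sqrt(var_x);                  /* Std Dev */
   std_mean = std_x / sqrt(n_obs);          /* Std Mean (standard error) */
   cv_x     = 100 * std_x / mean_x;         /* CV, in percent */
   t_stat   = mean_x / std_mean;            /* T:Mean=0 */

   /* Write the results to the SAS log for comparison with the table. */
   put mean_x= css_val= var_x= std_x= std_mean= cv_x= t_stat=;
run;
```

The values written to the log should agree with the table up to the rounding of the printed figures.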

Page 11

            Quantiles(Def=5)

100% Max     7896      99%     7896
 75% Q3       641      95%     1949
 50% Med      278      90%      906
 25% Q1       100      10%       72
  0% Min       56       5%       60
                        1%       56

Range        7840
Q3-Q1         541
Mode           56

               Extremes

Lowest    Obs      Highest    Obs
    56(   30)         1511(   56)
    56(   24)         1949(   55)
    58(   46)         2816(   54)
    60(   21)         3367(   53)
    65(   51)         7896(   52)

Page 12

Univariate and EDA Statistics

PPD 404

Stem Leaf                                      #  Boxplot
   7 9                                         1
   7
   6
   6
   5
   5
   4
   4
   3
   3 4                                         1     *
   2 8                                         1     *
   2
   1 59                                        2     0
   1 2                                         1     |
   0 555556666777778889                       18  +--+--+
   0 111111111111111111111111222222333344444  39  *-----*
     ----+----+----+----+----+----+----+----
Multiply Stem.Leaf by 10**+3

Page 13

Univariate and EDA Statistics

PPD 404

Univariate Procedure

Variable=POPULAT     NUMBER OF RESIDENTS, IN 1,000S

[Normal Probability Plot: values of POPULAT (vertical axis, 250 to 7750) against normal quantiles (horizontal axis, -2 to +2). Asterisks mark the observations and plus signs the reference line. Most asterisks lie along the bottom rows near 250, with isolated points near 3250 and 7750.]

Page 14

Exercise 2  Identify which of the following are variables and which are constants. 

a. Size of cities
b. Denver
c. Gender status
d. Computer literacy
e. Male
f. College graduate
g. Political party preference
h. Grades on an examination

 

Page 15

Answers  Identify which of the following are variables and which are constants. 

V  a. Size of cities
C  b. Denver
V  c. Gender status
V  d. Computer literacy
C  e. Male
C  f. College graduate
V  g. Political party preference
V  h. Grades on an examination

 

Page 16

Identify which of the following are discrete variables and which are continuous variables. 

a. Region of the country: North, South, etc.
b. TV viewing: number of hours per week
c. Agency size: number of full-time employees
d. Crime rate: serious crimes per 1,000 population
e. Your hometown: Pasadena, Newport Beach, etc.
f. Political conservatism: percent voting Republican
g. Contest results: first place, second place, etc.
h. Opinion: five-point scale from "Strongly Agree" to "Strongly Disagree"

 

Page 17

Answers

Identify which of the following are discrete variables and which are continuous variables.

D       a. Region of the country: North, South, etc.
C       b. TV viewing: number of hours per week
C       c. Agency size: number of full-time employees
C       d. Crime rate: serious crimes per 1,000 population
D       e. Your hometown: Pasadena, Newport Beach, etc.
C       f. Political conservatism: percent voting Republican
C or D  g. Contest results: first place, second place, etc.
C or D  h. Opinion: five-point scale from "Strongly Agree" to "Strongly Disagree"