Class 2

ECON 0592: Applied Statistics [50 marks; Credit 6] (Computer Application Based Course)

Summarizing Data: Descriptive Statistics and Histograms

Key Concepts

• Central tendency, Dispersion, Histogram, Kurtosis, Mean, Median, Mode, Range, Sample variance, Skewness, Standard deviation, and Standard error.

Descriptive Statistics

• Input the following data into Column A cells A1 through A15: 7, 6, 5, 4, 5, 6, 2, 3, 4, 1, 6, 9, 8, 7, 2

• From the Data tab, choose the Data Analysis function.

• Select the Descriptive Statistics option under Analysis Tools.

• Click OK.

• Enter the range of data in Input Range.

• Choose whether the data are grouped by rows or by columns.

• Select Output Range. (Make sure you click inside the box first, or you will create problems in your Input Range.)

• Click Summary Statistics.

• The default Confidence Level for Mean is 95%. However, you may change it to any value between 0 and 100.

• If you want the 3rd largest value in the data set, enter 3 in the Kth Largest option. Values may range from 1 to the number of data points in your data set.

• Similarly, Kth Smallest will give you the kth smallest value in the data set.

• Both the Kth Largest and Kth Smallest options may be left unchecked.

• Click OK.
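
As a cross-check on the Descriptive Statistics output, the same summary measures can be reproduced outside Excel. The sketch below is a minimal Python version using only the standard library. The variable and function names are illustrative, and the skewness and kurtosis lines assume the bias-adjusted sample formulas documented for Excel's SKEW and KURT functions.

```python
import math
import statistics

# Data from cells A1:A15 in the example above
data = [7, 6, 5, 4, 5, 6, 2, 3, 4, 1, 6, 9, 8, 7, 2]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)       # sample standard deviation
std_error = std_dev / math.sqrt(n)     # standard error of the mean
data_range = max(data) - min(data)

# Sample skewness and excess kurtosis (assumed to match Excel's SKEW and KURT)
z = [(x - mean) / std_dev for x in data]
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def kth_largest(values, k):
    """Return the k-th largest value, counting duplicates separately."""
    return sorted(values, reverse=True)[k - 1]


def kth_smallest(values, k):
    """Return the k-th smallest value, counting duplicates separately."""
    return sorted(values)[k - 1]


print(f"Mean {mean}, Median {median}, Mode {mode}, Range {data_range}")
print(f"Sample variance {variance:.4f}, Std dev {std_dev:.4f}, Std error {std_error:.4f}")
print(f"Skewness {skewness:.4f}, Kurtosis {kurtosis:.4f}")
print(f"3rd largest {kth_largest(data, 3)}, 3rd smallest {kth_smallest(data, 3)}")
```

For this data set the mean and median are both 5, the mode is 6, and the range is 8, which should agree with the corresponding rows of the Excel output.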

Histograms

• Excel provides a table with the frequency data as well as the actual histogram graph.

• Remember that a histogram has no spaces between its bars, because both the x and y axes are numeric scales showing quantitative data.

• Suppose you have collected some scores for your friends in the department.

• The scale ranges from 0 to 1000.

• You decide to graph the data in a histogram.

• The next step is to set the bin/class range.

• A histogram usually has 5 to 15 bins.

• Let's choose 7 bins in this example.

• Bin/class range = Range / 7 = 97.28

• Because this decimal value would be awkward to work with, we can round it up to a whole number, say 100.

• So the first bin value is 100, the second is 200, and so on.
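
The bin calculation, and the frequency table Excel builds from the bins, can be sketched in a few lines of Python. The scores below are hypothetical (the original data set is not reproduced here) and were chosen only so that range / 7 comes out near the 97.28 in the example. The sketch also assumes that the bin limits continue in steps of 100 up to the top of the 0 to 1000 scale, and that each bin counts values above the previous limit up to and including its own limit, with a final "More" row for anything larger.

```python
import math
from bisect import bisect_left

# Hypothetical scores on the 0-1000 scale (not the data from the example)
scores = [250, 310, 385, 420, 455, 500, 540, 575, 610, 640, 700, 745, 800, 860, 931]

n_bins = 7
raw_width = (max(scores) - min(scores)) / n_bins   # range / number of bins
bin_width = math.ceil(raw_width / 100) * 100       # round up to the next 100

# Upper bin limits: 100, 200, ... up to the top of the scale (assumed to be 1000)
bin_limits = list(range(bin_width, 1000 + 1, bin_width))

# Count each score in the first bin whose upper limit is >= the score;
# the extra last slot plays the role of the "More" row.
frequencies = [0] * (len(bin_limits) + 1)
for score in scores:
    frequencies[bisect_left(bin_limits, score)] += 1

for limit, count in zip(bin_limits + ["More"], frequencies):
    print(limit, count)
```

Rounding the width up rather than to the nearest value keeps the bins from ending up narrower than the computed class range.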

• Now you’re ready to create a histogram.

• Under the Data tab click on the Data Analysis function.

• Select Histogram from the list of options and click OK.

• Select the Pareto (sorted histogram) check box to sort the bins in descending order of frequency.

• Selecting the Cumulative Percentage check box tells Excel to plot a line showing cumulative percentages in your histogram.

• You have to check the Chart Output box to get a histogram chart; otherwise, only the frequency distribution is displayed.
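
To see what the Pareto and Cumulative Percentage options add to the output, the sketch below starts from a hypothetical frequency table (bin limits and counts invented for illustration) and derives both the cumulative percentages and the frequency-sorted order. It assumes the Pareto view simply reorders the bins from most to least frequent.

```python
from itertools import accumulate

# Hypothetical frequency table: (upper bin limit, frequency)
table = [(300, 1), (400, 2), (500, 3), (600, 2), (700, 3), (800, 2), (900, 1), (1000, 1)]
total = sum(freq for _, freq in table)


def cumulative_percentages(rows):
    """Running share of observations, in percent, in the order given."""
    running = accumulate(freq for _, freq in rows)
    return [round(100 * r / total, 2) for r in running]


# Cumulative percentage in bin order (the line overlaid on the chart)
cum_pct = cumulative_percentages(table)

# Pareto view: bins sorted by descending frequency (ties keep bin order)
pareto = sorted(table, key=lambda row: row[1], reverse=True)
pareto_cum_pct = cumulative_percentages(pareto)

for (limit, freq), pct in zip(table, cum_pct):
    print(f"{limit:>5} {freq:>3} {pct:6.2f}%")
```

In either ordering the last cumulative percentage is 100, which is a quick way to confirm that no observation was dropped.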

• To close the gaps between bars, right-click any column in the graph and choose Format Data Series from the menu.

• Click Series Options => Gap Width and reduce the gap width to 0%.

• The other options can be used to get the desired histogram.

Normal Distribution

• NORM.DIST calculates the probability, that is, the share of the distribution lying to the left of your value. It requires the mean and the standard deviation.

• In effect it returns a percentile (what percentage of the data lies to the left of your value), not a value in standard units.

• The format for NORM.DIST is NORM.DIST(x value, mean, standard dev, 1).

• The argument 1 (TRUE) tells Excel to compute the normal cumulative distribution. If the last argument is 0 (FALSE), Excel instead returns the height of the normal curve at x, i.e. the value of the probability density function.

• NORM.INV will convert the percentile values to measured units.
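
The difference between the two settings of the last argument, and the way NORM.INV undoes the cumulative calculation, can be illustrated outside Excel. This is a small Python sketch that uses `statistics.NormalDist` from the standard library as a stand-in for the Excel functions; the standard normal (mean 0, standard deviation 1) and the x value are chosen purely for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)   # standard normal, chosen for illustration
x = 1.0

cumulative = dist.cdf(x)               # like NORM.DIST(x, 0, 1, 1): area to the left of x
density = dist.pdf(x)                  # like NORM.DIST(x, 0, 1, 0): height of the curve at x
recovered = dist.inv_cdf(cumulative)   # like NORM.INV(probability, 0, 1)

print(f"cumulative  = {cumulative:.4f}")   # about 0.8413
print(f"density     = {density:.4f}")      # about 0.2420
print(f"recovered x = {recovered:.4f}")    # back to 1.0000
```

Only the cumulative value is a probability; the density is a curve height and is the quantity needed when drawing the bell curve.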

• These functions are most useful for:

– Percentile calculation problems, including area charts (NORM.DIST)

– Converting percentiles to measured units (NORM.INV)

– Converting measured units to z scores (STANDARDIZE)

– Rank and Percentile

• Percentile calculation problems

• Example: A transport company provides delivery service 7 days a week to stores selling fire logs. The data you collected for the months of Dec and Jan seem to follow a normal distribution. The average number of stores requiring deliveries on any given day is 100, and the standard deviation is 15 stores.

• Calculating the area to the left of a value.

– What % of the time did the transport company deliver to fewer than 90 stores during last Dec and Jan? In other words, how much of the data is in the left tail?

• Click on an empty cell.

• Use the NORM.DIST function.

• x = 90, mean = 100, std = 15, cumulative = 1

• Ans = 0.2525, or about 25%

• Calculating the area between 2 values.

– What % of the time did the transport company deliver to between 90 and 120 stores during last Dec and Jan?

– Sol: =NORM.DIST(120,100,15,1) - NORM.DIST(90,100,15,1)

• Calculating the area to the right of a value.

– What % of the time did the transport company deliver to 130 or more stores during last Dec and Jan?

– Sol: =1 - NORM.DIST(130,100,15,1)

– Ans: about 2.3%
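
The three area calculations for the delivery example (left of a value, between two values, right of a value) can be checked with a short Python sketch. It uses `statistics.NormalDist` as a stand-in for NORM.DIST with the cumulative argument set to 1; the variable names are illustrative.

```python
from statistics import NormalDist

# Daily deliveries: mean 100 stores, standard deviation 15 stores
deliveries = NormalDist(mu=100, sigma=15)

left_of_90 = deliveries.cdf(90)                                 # fewer than 90 stores
between_90_and_120 = deliveries.cdf(120) - deliveries.cdf(90)   # 90 to 120 stores
right_of_130 = 1 - deliveries.cdf(130)                          # 130 or more stores

print(f"P(X < 90)       = {left_of_90:.4f}")
print(f"P(90 < X < 120) = {between_90_and_120:.4f}")
print(f"P(X >= 130)     = {right_of_130:.4f}")
```

The cumulative function always measures area from the far left, so an area between two values is a difference of two cumulative values, and a right-tail area is 1 minus a cumulative value.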

• Graphing a normal distribution (Area Chart)

• There are three steps in this process:

– Step 1: Sort the data into sequential order.

– Step 2: Calculate the height of the normal distribution for each x value (NORM.DIST).

– Step 3: Create the graph.

• Example: You would like to graph the distribution of the cost of airplane tickets purchased by employees over the past three months. The average cost is Rs 485 and the standard deviation is Rs 260. You know from historical data that ticket prices are normally distributed.

• Input the following ticket prices in column A:

• In the next column, calculate the height of the curve for each price: =NORM.DIST(A2, mean, std, 0), with mean = 485 and std = 260.

• Highlight the second column of values generated.

• Insert an area chart.
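
The first two graphing steps (sorting, then computing a curve height for each price) can be sketched in Python as follows. The mean and standard deviation come from the example, but the ticket prices are hypothetical because the example's own list is not reproduced here. `NormalDist.pdf` plays the role of NORM.DIST with the last argument set to 0.

```python
from statistics import NormalDist

tickets = NormalDist(mu=485, sigma=260)   # mean and standard deviation from the example

# Hypothetical ticket prices in Rs (not the list from the example)
prices = [900, 150, 485, 620, 300, 1100, 745, 50]

# Step 1: sort the prices; Step 2: compute the curve height at each price
sorted_prices = sorted(prices)
heights = [tickets.pdf(p) for p in sorted_prices]

# Step 3 would plot heights against sorted_prices as an area chart
for price, height in zip(sorted_prices, heights):
    print(f"{price:>5} {height:.6f}")
```

The heights peak at the price closest to the mean and fall away on both sides, which is what gives the area chart its bell shape.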

• Converting percentiles to measured units (NORM.INV)

– Let's use the same example as earlier, where a transport company provides delivery service 7 days a week to stores selling fire logs. The average number of stores requiring deliveries on any given day is 100 and the standard deviation is 15 stores.

– Calculate the number of stores corresponding to the 99th percentile.

• Use the NORM.INV() function.

• Probability = 0.99, mean = 100, std = 15

• The answer is 134.8952, which means that the 99th percentile for this data set is about 135 stores. In other words, on 99% of the days in the period, about 135 stores or fewer required deliveries.
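
The percentile-to-units conversion can be verified with a brief Python sketch, using `NormalDist.inv_cdf` as a stand-in for NORM.INV. The variable names are illustrative.

```python
from statistics import NormalDist

deliveries = NormalDist(mu=100, sigma=15)

stores_99th = deliveries.inv_cdf(0.99)   # counterpart of NORM.INV(0.99, 100, 15)
print(f"{stores_99th:.4f}")              # about 134.8952

# Sanity check: the area to the left of that value is 0.99 again
print(f"{deliveries.cdf(stores_99th):.2f}")
```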

• Convert the measured value of 135 stores to a z score.

• Use the STANDARDIZE() function.

• x = 135, mean = 100, std = 15.
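
A z score is the distance of a value from the mean, measured in standard deviations. The sketch below is a hypothetical Python helper that mirrors that calculation for the values above; the function name is borrowed from the Excel function for readability only.

```python
def standardize(x, mean, std_dev):
    """z score: how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev


z = standardize(135, 100, 15)
print(f"z = {z:.4f}")   # 2.3333
```

So 135 stores lies about 2.33 standard deviations above the mean, consistent with its position near the 99th percentile.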

• Calculate Rank and Percentile

• Example: You want to calculate the rank and percentile of the costs of the airplane tickets you purchased. The average cost is Rs 485 and the standard deviation is Rs 260.

• Use the data given alongside.

• Go to Data => Data Analysis => Rank and Percentile.
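
A rough idea of what the Rank and Percentile tool reports can be had from the Python sketch below. The ticket prices are hypothetical (the example's own list is not reproduced here), and the sketch assumes that rank 1 goes to the largest value, that ties share a rank, and that the percentage is the share of the other values lying below a value, in the style of PERCENTRANK.

```python
# Hypothetical ticket prices in Rs (not the list from the example)
prices = [620, 150, 485, 900, 300, 485, 1100, 745]


def rank_and_percentile(values):
    """Return (point, value, rank, percent) rows, largest value first.

    Rank 1 is the largest value and ties share a rank; percent is the share
    of the other values lying below the value, i.e. count_below / (n - 1).
    """
    n = len(values)
    rows = []
    for point, value in enumerate(values, start=1):
        count_above = sum(1 for v in values if v > value)
        count_below = sum(1 for v in values if v < value)
        rows.append((point, value, count_above + 1, 100 * count_below / (n - 1)))
    return sorted(rows, key=lambda row: row[2])


table = rank_and_percentile(prices)
for point, value, rank, percent in table:
    print(f"{point:>5} {value:>6} {rank:>4} {percent:6.2f}%")
```

The "point" column records each value's original position in the list, so a row can be traced back to the input after sorting.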
