Histograms REVIEWED

Post on 19-Jan-2016


NY Central Park: monthly mean temperature anomalies

Albany Airport: monthly mean temperature anomalies

Histograms are more than just an illustrative summary of the data sample. Typical examples are shown below (in R, see help(hist) for the use and adjustment of histogram plots, or class09a.R for some advanced uses of the hist() function).

Shown are 31 data points from Albany’s monthly mean temperature anomalies (with respect to the climatological seasonal cycle 1981-2010).

The sample x_i (i = 1…N) has a range defined by the minimum and maximum values in the sample.

Step 1: Find a large-enough range that covers all sample points.

Step 2: Break up the range into equal-sized bins from x_k to x_{k+1}, each of width Δx.

Step 3: Count the number of samples falling into each bin (for the bin highlighted in the figure, h_k = 3).
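The three steps above can be sketched in R without calling hist(). This is a minimal illustration, not the code from class09a.R: the sample x is synthetic (random normal numbers standing in for the 31 anomalies), and the bin width dx = 1 and the variable names are assumptions made for the example.

```r
# Synthetic stand-in for the 31 anomalies (NOT the Albany data).
set.seed(1)
x <- rnorm(31, mean = 0, sd = 3)

# Step 1: a range wide enough to cover all sample points.
lo <- floor(min(x))
hi <- floor(max(x)) + 1     # strictly above the maximum

# Step 2: equal-sized bins of width dx; 'breaks' holds the edges x_k.
dx <- 1
breaks <- seq(lo, hi, by = dx)

# Step 3: count the samples per bin, using x_k <= x < x_(k+1).
k <- findInterval(x, breaks)                   # bin index of each sample
h <- tabulate(k, nbins = length(breaks) - 1)   # counts h_k

print(h)
```

The same counts should come back from hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts. Note that hist() closes bins on the right by default, so right = FALSE is needed to match the left-closed convention used here.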

Shown is a histogram of Albany’s monthly mean temperature anomalies using n = 31 samples (with respect to the climatological seasonal cycle 1981-2010).

Breakpoints from -9 to +9 deg C: 19 breakpoints, 18 bins of width Δx = 1 deg C.

Frequency h_k is the number of samples falling into the k-th bin:

x_k ≤ x_obs < x_{k+1}

h_k counts the number of ‘events’.

Note that n is the sample size, and the sum of the frequencies h_k adds up to n only if the bins cover the full range of the data sample.

The relative frequency h_k / n is a measure of the probability of the event that a sample x_obs falls into the k-th bin.
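A short R sketch of frequency and relative frequency, using the 19 breakpoints from -9 to +9 quoted above. The sample is a deterministic synthetic stand-in (evenly spaced normal quantiles with an assumed spread of 2.5), not the Albany data. The second set of breaks, breaks_narrow, is a hypothetical choice that shows the counts no longer summing to n once the bins fail to cover the sample.

```r
# Synthetic stand-in for n = 31 anomalies (NOT the Albany data).
x <- 2.5 * qnorm(ppoints(31))
n <- length(x)

# 19 breakpoints from -9 to +9 give 18 bins of width 1.
breaks <- seq(-9, 9, by = 1)

# Frequency h_k: samples with x_k <= x < x_(k+1).
# cut() returns NA for values outside the breaks; table() drops them.
h <- as.vector(table(cut(x, breaks = breaks, right = FALSE)))

# Relative frequency h_k / n, an estimate of the bin probability.
rel <- h / n

print(sum(h))     # equals n because the bins cover the whole sample
print(sum(rel))

# Bins that do not cover the full range lose samples.
breaks_narrow <- seq(-2, 2, by = 1)
h_narrow <- as.vector(table(cut(x, breaks = breaks_narrow, right = FALSE)))
print(sum(h_narrow) < n)
```

cut() is used instead of hist() here because hist() stops with an error when the breaks do not span the data, whereas cut() silently marks the uncovered samples as missing.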

Shown are histograms based on 360 data points of Albany’s monthly mean temperature anomalies.

In this example, the blue histogram shows the density f_k calculated from a sample of n = 31 data points with 18 bins of width 1 deg C. The black line shows the density calculated with sample size n = 360 and bin width 0.5 deg C. In red is the density estimate based on n = 720 samples with bin width 1/3 deg C.
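The slides do not spell out how the density f_k is obtained from the counts. The sketch below assumes the usual histogram normalization f_k = h_k / (n Δx), which is also what hist() reports in its density component for equal-width bins. The helper hist_density and the synthetic normal samples are hypothetical stand-ins for the three Albany samples.

```r
# Histogram density for equal bin width dx: f_k = h_k / (n * dx).
# Assumed normalization; the bin range is taken from the sample itself.
hist_density <- function(x, dx) {
  lo <- floor(min(x))
  hi <- floor(max(x)) + 1
  breaks <- seq(lo, hi, length.out = round((hi - lo) / dx) + 1)
  h <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts
  list(breaks = breaks, h = h, f = h / (length(x) * dx))
}

# Synthetic samples (NOT the Albany data) with the sample sizes and
# bin widths quoted in the text.
set.seed(3)
d31  <- hist_density(rnorm(31,  sd = 2.5), dx = 1)
d360 <- hist_density(rnorm(360, sd = 2.5), dx = 0.5)
d720 <- hist_density(rnorm(720, sd = 2.5), dx = 1 / 3)

# The area under each density histogram is 1, whatever n and dx are,
# which is what makes the three curves comparable on one plot.
areas <- sapply(list(d31, d360, d720),
                function(d) sum(d$f * diff(d$breaks)))
print(areas)
```

Dividing by n Δx rather than by n alone is the design choice that matters: relative frequencies shrink as the bins get narrower, while densities stay on the same scale.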

Shown are density plots from Albany’s monthly mean temperature anomalies (with respect to the climatological seasonal cycle 1981-2010).

Note: The exact mathematical formalism will not be discussed in this introductory course.

The density plot of monthly mean temperature anomalies shows a typical shape: unimodal, with characteristic symmetric flanks where the density increases towards the center, and very low density in the tails.

Consider a random number formed by summing (or averaging) independent random variables with arbitrary probability density distributions.

The more random variables enter the summation (averaging), the more closely the distribution of the newly formed random variable approaches a Gaussian distribution.

Average over 5 uniformly distributed random variables (repeated 10000 times)
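The experiment named above can be reproduced with a few lines of R. This is a sketch under one assumption the slide does not state: the uniform variables are taken on the interval [0, 1]. For that choice each variable has mean 1/2 and variance 1/12, so the average of m = 5 of them should have mean 1/2 and standard deviation sqrt(1 / (12 * 5)).

```r
set.seed(4)
m    <- 5       # number of uniform variables averaged
reps <- 10000   # number of repetitions

# One row per repetition, one column per uniform variable on [0, 1].
u    <- matrix(runif(m * reps), nrow = reps, ncol = m)
xbar <- rowMeans(u)

# Compare sample moments with the values expected for the average.
print(mean(xbar))              # theory: 0.5
print(sd(xbar))                # theory: sqrt(1 / (12 * m))
print(sqrt(1 / (12 * m)))

# Density histogram with the matching Gaussian curve on top.
if (interactive()) {
  hist(xbar, breaks = 40, freq = FALSE,
       main = "Average of 5 uniform random variables")
  curve(dnorm(x, mean = 0.5, sd = sqrt(1 / (12 * m))), add = TRUE)
}
```

Changing m to 1 or 2 gives a flat or triangular histogram, which makes the approach to the Gaussian shape with growing m easy to see.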