Environmental Data Analysis with MatLab Lecture 2: Looking at Data.
![Page 1: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/1.jpg)
Environmental Data Analysis with MatLab
Lecture 2: Looking at Data
![Page 2: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/2.jpg)
SYLLABUS
Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectra
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps
![Page 3: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/3.jpg)
purpose of the lecture
get you started looking critically at data
![Page 4: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/4.jpg)
Objectives when taking a first look at data
Understand the general character of the dataset.
Understand the general behavior of individual parameters.
Detect obvious problems with the data.
![Page 5: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/5.jpg)
Tools for Looking at Data covered in this lecture
reality checks
time plots
histograms
rate information
scatter plots
![Page 6: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/6.jpg)
Black Rock Forest Temperature
I downloaded the weather station data from the International Research Institute (IRI) for Climate and Society at Lamont-Doherty Earth Observatory, which is the data center used by the Black Rock Forest Consortium for its environmental data. About 20 parameters were available, but I downloaded only hourly averages of temperature. My original file, brf_raw.txt, has time in a format that I thought would be hard to work with, so I wrote a MatLab script, brf_convert.m, that converted it into time in days, and wrote the results into the file that I gave you.
![Page 7: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/7.jpg)
format conversion
calendar date/time: 0100-0159 2 Jan 1997
days from start of first year of data: 1.042
a sequential time variable is needed for data analysis, but
format conversions provide opportunity for error to creep into the dataset
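The conversion in the example can be sketched with MatLab's built-in datenum function. This is only an illustration and not the brf_convert.m script, whose contents are not shown here; the variable names are hypothetical.

```matlab
% Sketch only: not the author's brf_convert.m, whose contents are not shown.
% Convert a calendar date/time into days from the start of the first year of data.
year0 = 1997;                          % first year of data (taken from the slide's example)
t0 = datenum(year0, 1, 1, 0, 0, 0);    % serial date number of 00:00 on 1 Jan 1997
ts = datenum(1997, 1, 2, 1, 0, 0);     % start of the 0100-0159 interval on 2 Jan 1997
tdays = ts - t0;                       % elapsed time in days
fprintf('%.3f\n', tdays);              % prints 1.042
```

One full day plus one hour is 1 + 1/24 days, which rounds to the 1.042 shown in the example.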
![Page 8: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/8.jpg)
Reality Checks
properties that your experience tells you that the data must have
check your expectations against the data
![Page 9: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/9.jpg)
Reality Checks: What do you expect the data to look like?
hourly measurements
thirteen years of data
location in New York (moderate climate)
![Page 10: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/10.jpg)
take a moment ...
to sketch a plot of what you expect the data to look like
![Page 11: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/11.jpg)
Reality Checks: What do you expect the data to look like?
hourly measurements
thirteen years of data
location in New York (moderate climate)
time increments by 1/24 day per sample
about 24*365*13 = 113880 lines of data
temperatures in the -20 to +35 deg C range
diurnal and seasonal cycles
![Page 12: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/12.jpg)
Does time increment by 1/24 day per sample?
D(1:5,:)
0 17.2700
0.0417 17.8500
0.0833 18.4200
0.1250 18.9400
0.1667 19.2900
1/24 = 0.0417
Yes
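Rather than reading the increments by eye, the same check can be made numerically. A minimal sketch, using only the five rows shown on the slide; the tolerance tol is an assumption chosen to allow for the four-decimal rounding of the printed times.

```matlab
% Sketch: check the sampling interval using the five rows shown on the slide.
D = [0      17.27;
     0.0417 17.85;
     0.0833 18.42;
     0.1250 18.94;
     0.1667 19.29];
dt = diff(D(:,1));                    % time step between successive rows, days
tol = 1e-3;                           % tolerance (an assumption), allows for rounding
bad = find(abs(dt - 1/24) > tol);     % steps that are not 1/24 day
fprintf('%d of %d steps differ from 1/24 day\n', length(bad), length(dt));
```

Applied to the whole table, a non-empty bad would point at places where rows are missing.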
![Page 13: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/13.jpg)
Are there about 24*365*13 = 113880 lines of data?
length(D)
110430
Yes
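How close is "about"? A small sketch that turns the two numbers on this slide into a percentage; the variable names are hypothetical.

```matlab
% Sketch: compare the actual number of rows with the expected number.
Nexpected = 24*365*13;                              % hourly samples for thirteen years
Nactual = 110430;                                   % value of length(D) reported on the slide
shortfall = 100*(Nexpected - Nactual)/Nexpected;    % percent of expected rows that are absent
fprintf('expected %d, found %d, shortfall %.1f%%\n', Nexpected, Nactual, shortfall);
```

With the table loaded, size(D,1) gives the row count directly. length(D) returns the longest dimension, which is the same number here because the table has far more rows than columns.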
![Page 14: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/14.jpg)
temperatures in the -20 to +35 deg C range?
diurnal and seasonal cycles?
![Page 15: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/15.jpg)
[Figure: time plot of the temperature data, annotated with: annual cycle, cold spikes, hot spike, data drop-outs, -20 to +35 range]
Temperatures in the -20 to +35 deg C range? Mostly.
Diurnal and seasonal cycles? Certainly seasonal.
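The range check can also be done by counting. A minimal sketch with made-up values (not the Black Rock Forest data); dlo and dhi are hypothetical names for the expected limits.

```matlab
% Sketch with made-up values: flag temperatures outside the expected range.
d = [17.27; 17.85; -35.0; 18.94; 48.2; 19.29];    % hypothetical sample, deg C
dlo = -20;  dhi = 35;                             % expected range from the reality check
out = find(d < dlo | d > dhi);                    % indices of suspicious values
fprintf('%d of %d values fall outside [%d, %d] deg C\n', length(out), length(d), dlo, dhi);
```

The indices in out say where to look in the time plot for spikes.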
![Page 16: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/16.jpg)
Data Drop-outs: common in datasets
the instrument wasn’t working for a while …
take two forms:
missing rows of table
data set to some default value
0, n/a, and -999 are all common
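Both forms of drop-out can be searched for automatically. A minimal sketch with made-up values; the flag value of -999 and the gap threshold of 1.5 samples are assumptions for illustration.

```matlab
% Sketch with made-up values: detect both forms of data drop-out.
t = [0; 1; 2; 3; 7; 8; 9]/24;                     % hypothetical times, days (three hours missing)
d = [17.3; 17.9; -999; -999; 18.4; 18.9; 19.3];   % hypothetical temperatures, deg C
flag = -999;                                      % assumed default value marking bad data
% form 1: missing rows show up as time steps longer than one sample
gaps = find(diff(t) > 1.5/24);
% form 2: default values can be matched directly and replaced by NaN
isbad = (d == flag);
d(isbad) = NaN;
fprintf('%d gap(s), %d flagged value(s)\n', length(gaps), sum(isbad));
```

A default value of 0 needs more care than -999, since 0 deg C is also a perfectly plausible temperature.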
![Page 17: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/17.jpg)
[Figure: two time plots of temperature, 50 days of data from winter and 50 days of data from summer, annotated with: cold spike, diurnal cycle, data drop-out]
![Page 18: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/18.jpg)
Histograms
determines the range of the majority of data values
quantifies the frequency of occurrence of data at different data values
makes it easy to spot over-represented and under-represented values
![Page 19: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/19.jpg)
MatLab code for Histogram
Lh = 100; % number of bins
dmin = min(d);
dmax = max(d);
bins = dmin+(dmax-dmin)*[0:Lh-1]'/(Lh-1); % bin centers from dmin to dmax
dhist = hist(d, bins)'; % counts in each bin
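To try the histogram code without the data file, it can be run on synthetic data. A sketch, assuming a made-up temperature series with a seasonal and a diurnal cycle (this is not the Black Rock Forest data).

```matlab
% Sketch: the histogram code applied to synthetic data (not the Black Rock Forest data).
t = [0:24*365-1]'/24;                           % one year of hourly times, days
d = 10 + 12*sin(2*pi*t/365) + 5*sin(2*pi*t);    % made-up seasonal plus diurnal cycle, deg C
Lh = 100;
dmin = min(d);
dmax = max(d);
bins = dmin+(dmax-dmin)*[0:Lh-1]'/(Lh-1);       % bin centers
dhist = hist(d, bins)';                         % counts in each bin
% every sample lands in exactly one bin, so the counts must sum to the number of samples
fprintf('%d samples, %d counted\n', length(d), sum(dhist));
% plot(bins, dhist, 'k-'); xlabel('temperature, deg C'); ylabel('counts');
```

Checking that the counts add up to the number of samples is itself a small reality check on the histogram.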
![Page 20: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/20.jpg)
[Figure: Histogram of Black Rock Forest temperatures; axes: temperature, ºC and counts]
![Page 21: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/21.jpg)
[Figure: Alternate ways of displaying a histogram, panels A and B; axes: temperature, ºC and counts]
![Page 22: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/22.jpg)
Moving-Window Histograms
Series of histograms, each on a relatively short time interval of data
Advantage: Shows the way that the frequency of occurrence of data varies with time
Disadvantage: Each histogram is computed using less data, and so is less accurate
![Page 23: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/23.jpg)
[Figure: Moving-Window Histogram of Black Rock Forest temperatures; axes: time, days (0 to 5000) and temperature, C (-60 to 40)]
![Page 24: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/24.jpg)
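good use of FOR loop
offset=1000; % number of samples in each window
Lw=floor(N/offset)-1; % number of windows
Dhist = zeros(Lh, Lw);
for i = [1:Lw];
j=1+(i-1)*offset; % first sample of window i
k=j+offset-1; % last sample of window i
Dhist(:,i) = hist(d(j:k), bins)';
end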
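The loop can be exercised on synthetic data to see what it produces. A sketch, assuming a made-up temperature series (not the Black Rock Forest data) and a smaller number of bins than on the histogram slide.

```matlab
% Sketch: moving-window histogram of synthetic data (not the Black Rock Forest data).
t = [0:24*365-1]'/24;                           % one year of hourly times, days
d = 10 + 12*sin(2*pi*t/365) + 5*sin(2*pi*t);    % made-up seasonal plus diurnal cycle, deg C
N = length(d);
Lh = 50;                                        % number of bins (an arbitrary choice)
bins = min(d)+(max(d)-min(d))*[0:Lh-1]'/(Lh-1);
offset = 1000;                                  % samples per window
Lw = floor(N/offset)-1;                         % number of windows
Dhist = zeros(Lh, Lw);
for i = [1:Lw]
    j = 1+(i-1)*offset;                         % first sample of window i
    k = j+offset-1;                             % last sample of window i
    Dhist(:,i) = hist(d(j:k), bins)';
end
% each column is the histogram of one window, so it sums to the window length
disp(sum(Dhist,1));
% imagesc(Dhist); axis xy;   % one way to display the result as an image
```

Each column of Dhist uses only offset samples, which is the loss of accuracy noted as the disadvantage of the method.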
![Page 25: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/25.jpg)
Rate Information
how fast a parameter is changing with time
or with distance
![Page 26: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/26.jpg)
finite-difference approximation to derivative
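The formula on this slide did not survive extraction. Judging from the MatLab code for the derivative given later in the lecture, it is presumably the standard forward difference, written here with $d_i$ as the $i$-th datum and $t_i$ as its time (this notation is an assumption):

```latex
\left.\frac{dd}{dt}\right|_{i} \approx \frac{d_{i+1} - d_{i}}{t_{i+1} - t_{i}},
\qquad i = 1, \dots, N-1
```

There are only $N-1$ differences for $N$ data, so the derivative is one sample shorter than the data.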
![Page 27: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/27.jpg)
![Page 28: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/28.jpg)
MatLab code for derivative
N=length(d);
dddt=(d(2:N)-d(1:N-1))./(t(2:N)-t(1:N-1));
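A quick way to gain confidence in the derivative code is to apply it to a function whose derivative is known. A sketch, assuming made-up test data d = t.^2; the names tmid and err are hypothetical.

```matlab
% Sketch: check the finite-difference code on a function with a known derivative.
t = [0:0.5:5]';                   % sample times (made up for this check)
d = t.^2;                         % test data, exact derivative is 2*t
N = length(d);
dddt = (d(2:N)-d(1:N-1))./(t(2:N)-t(1:N-1));
tmid = (t(2:N)+t(1:N-1))/2;       % each estimate belongs to the midpoint of its interval
err = max(abs(dddt - 2*tmid));    % zero for a quadratic, up to rounding error
fprintf('dddt has %d values, max error %g\n', length(dddt), err);
```

Note that dddt has N-1 values, one fewer than d. The built-in expression diff(d)./diff(t) gives the same result.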
![Page 29: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/29.jpg)
[Figure: hypothetical storm event, two panels plotted against time, days (0 to 10): discharge, cfs (0 to 1000) and d/dt discharge, cfs / day (-500 to 500); annotations: rain, draining of land]
note that more of the time has negative dd/dt
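The asymmetry between rain and draining can be reproduced with a toy hydrograph. A sketch under stated assumptions: a one-day linear rise followed by an exponential decay with a two-day time constant, all numbers invented for illustration.

```matlab
% Sketch: a made-up storm hydrograph with a fast rise and a slow recession.
t = [0:0.25:10]';                               % time, days
d = 100 + 900*t;                                % rain: discharge rises for one day
rec = (t > 1);
d(rec) = 100 + 900*exp(-(t(rec)-1)/2);          % draining of land: slow decay
N = length(d);
dddt = (d(2:N)-d(1:N-1))./(t(2:N)-t(1:N-1));
fracneg = sum(dddt < 0)/length(dddt);           % fraction of time with falling discharge
fprintf('dd/dt is negative %.0f%% of the time\n', 100*fracneg);
```

In this toy example the discharge rises during 4 of the 40 time steps and falls during the other 36.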
![Page 30: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/30.jpg)
![Page 31: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/31.jpg)
Hypothesis: rate of change in discharge correlates with amount of discharge
logic
a river is bigger when it has high discharge
a big river flows faster than a small river
a river that flows faster drains away water faster (might only be true after the rain has stopped)
![Page 32: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/32.jpg)
MatLab Script
purpose: make two separate plots, one for times of increasing discharge, one for times of decreasing discharge
pos = find(dddt>0);
neg = find(dddt<0);
- - -
plot(d(pos),dddt(pos),'k.');
- - -
plot(d(neg),dddt(neg),'k.');
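The mechanics of the split can be tried on a toy hydrograph. A sketch with invented numbers: the recession is an exponential decay, so the correlation during falling discharge is built in by construction and says nothing about real rivers. The use of corrcoef is an addition, not part of the script above.

```matlab
% Sketch: split a made-up storm hydrograph into rising and falling times.
t = [0:0.25:10]';                               % time, days
d = 100 + 900*t;                                % rising limb (rain)
rec = (t > 1);
d(rec) = 100 + 900*exp(-(t(rec)-1)/2);          % falling limb (draining of land)
N = length(d);
dddt = (d(2:N)-d(1:N-1))./(t(2:N)-t(1:N-1));
pos = find(dddt>0);                             % times of increasing discharge
neg = find(dddt<0);                             % times of decreasing discharge
% correlation between discharge and its rate of change while discharge falls
R = corrcoef(d(neg), dddt(neg));
fprintf('%d rising, %d falling, r = %.2f when falling\n', length(pos), length(neg), R(1,2));
% subplot(1,2,1); plot(d(pos),dddt(pos),'k.');
% subplot(1,2,2); plot(d(neg),dddt(neg),'k.');
```

Because dddt has one fewer element than d, the indices in pos and neg pair each rate with the discharge at the start of its time step.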
![Page 33: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/33.jpg)
![Page 34: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/34.jpg)
Atlantic Rock Dataset
I downloaded rock chemistry data from PetDB’s website at www.petdb.org. Their database contains chemical information about ocean floor igneous and metamorphic rocks. I extracted all samples from the Atlantic Ocean that had the following chemical species: SiO2, TiO2, Al2O3, FeO (total), MgO, CaO, Na2O and K2O. My original file, rocks_raw.txt, included a description of the rock samples, their geographic location and other textual information. However, I deleted everything except the chemical data to make the file rocks.txt, so that it would be easy to read into MatLab. The columns are in the order given above, and the units are weight percent.
![Page 35: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/35.jpg)
Using scatter plots to look for correlations among pairs of the eight chemical species
8! / [2! (8-2)!] = 28 plots
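The count of plots, and the list of pairs itself, can be generated with MatLab's nchoosek function. A minimal sketch; the cell array names and the table name D in the final comment are hypothetical.

```matlab
% Sketch: enumerate the pairs of chemical species that need a scatter plot.
names = {'SiO2','TiO2','Al2O3','FeOtotal','MgO','CaO','Na2O','K2O'};   % column order from the text
M = length(names);
pairs = nchoosek(1:M, 2);         % each row holds the column indices of one pair
fprintf('%d plots\n', size(pairs,1));
for p = 1:3                       % list the first few pairs
    fprintf('%s vs %s\n', names{pairs(p,1)}, names{pairs(p,2)});
end
% with the data in a table D: plot(D(:,pairs(p,1)), D(:,pairs(p,2)), 'k.');
```

Looping over the rows of pairs is a convenient way to make all the plots without writing out each combination by hand.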
![Page 36: Environmental Data Analysis with MatLab Lecture 2: Looking at Data.](https://reader034.fdocuments.us/reader034/viewer/2022052414/56649c915503460f9494b403/html5/thumbnails/36.jpg)
[Figure: four interesting scatter plots, panels A to D, with axes drawn from Al2O3, TiO2, SiO2, K2O, FeO and MgO]