Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or...

38
Validation of CSN Data Sean Raffuse Nicholas Spada Nicole Hyslop Air Quality Group UC Davis 2016 National Ambient Air Monitoring Conference St. Louis, MO August 8-11, 2016 8/16/2016 1

Transcript of Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or...

Page 1: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Validation of CSN DataSean RaffuseNicholas SpadaNicole Hyslop

Air Quality GroupUC Davis

2016 National Ambient Air Monitoring ConferenceSt. Louis, MOAugust 8-11, 2016

8/16/20161

Page 2: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Outline1. Introduction2. Current CSN Routine Validation3. Case Studies4. Future Developments

8/16/20162

Page 3: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

CSN Measurements• 33 elements by x-ray fluorescence (XRF) of PTFE filters

• S, K, Cl*• Soil elements (Fe, Al, Si, …) • Trace metals (Ni, V, Mg, …)

• Ions by ion chromatography (IC) of nylon filters• Cations

• Ammonium, Sodium, Potassium• Anions

• Nitrate, Sulfate, Chloride*• Carbon by thermal optical reflectance (TOR) of quartz filters

• Organic carbon• Elemental carbon

8/16/20163

* Chloride ion and Cl are being examined for suspected contaminationCl is invalidated and chloride is not yet being reported

Page 4: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

The CSN Validation Process – Many Involved

8/16/20164

Operator

Sample handling

Analytical labs

State/local analysts

Page 5: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Our Data Validation Philosophy• All data should be validated• Definitive evidence is required to invalidate records• Do not censor the data• Revisit and improve checks over time

8/16/20165

Page 6: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Checks- Data integrity- Automated sampler operating data- Sample shipping conditions- Comparison of measurements by different analytical

techniques- Filter swaps between sequential dates

8/16/20166

Page 7: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Ratio Analysis – Sulfur/Sulfate

8/16/20167

Sulfur (by XRF) correlates with Sulfate (by IC)• If (3*S)/Sulfate > 1.5 OR

(3*S)/Sulfate < 0.66 then flag as outlier (code 5) and examine

• An indication that either XRF or IC may be invalid on this sample date

• However, if no reason for discrepancy can be found, the data are not invalidated

Page 8: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Ratio Analysis – Sulfur/Sulfate

8/16/20168

Page 9: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Ratio Analysis – Sulfur/Sulfate

8/16/20169

Page 10: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Ratio Analysis – K/K+

8/16/201610

When well above detection limits, Potassium/Potassium ion ratios also show good agreement thus far• K+ = 1.16 * K• R2 = 0.96• No flagging criteria established yet

Page 11: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Reconstructed Mass

8/16/201611

SoilOrganic matterAmmonium sulfateAmmonium nitrateElemental carbonSea salt

An estimate of total PM2.5 mass based on speciated measurements and assumptions of chemical composition

Page 12: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Reconstructed Mass - CalculationWe reconstruct total PM2.5 mass from chemical speciation measurements as:PM2.5 = Ammonium sulfate + Ammonium Nitrate + Elemental Carbon +

Organic Mass + 1.8 Chlorine + Soilwhere• Ammonium Sulfate = 4.125 * Sulfur• Ammonum Nitrate = 1.29 * Nitrate• Organic Mass = 1.4 * Organic Carbon• Soil = 2.2 Al + 2.49 Si + 1.63 Ca + 2.42 Fe + 19.4 Ti

This reconstructed mass is compared to PM2.5 mass from nearby FEM and FRM monitors

8/16/201612

Page 13: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Reconstructed Mass – Network Wide

8/16/201613

Page 14: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Reconstructed Mass

8/16/201614

Yellow = AirNowBlack = Reconstructed

Page 15: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Case studiesHigh potassium in PhoenixHigh sulfate in Philadelphia

8/16/201615

Page 16: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Phoenix – High Potassium

8/16/201616

Page 17: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Phoenix – January 1

Mg

K

S

Zn

Cu

Al

Fe

Si

Page 18: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

8/16/201618 Courtesy of Imgur

Page 19: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Philadelphia – High Sulfate

8/16/201619

• Sulfate is high compared to neighbors and nearby days

• Is this reasonable?

Page 20: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Ratio Analysis – Sulfur/Sulfate

8/16/201620

Page 21: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Southeastern Pennsylvania – Sulfur/Sulfate Ratios

8/16/201621

Page 22: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Two Odd Sites – Sulfur and Potassium Ratios

8/16/201622

Sulfur/Sulfate

Potassium/Potassium Ion

Page 23: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Philadelphia – High Sulfate• It would appear that elements were low for two sites on January 25• We requested reanalysis for both XRF (elements) and IC (ions)• Reanalysis returned very similar resultsActions• Added qualifier flag for values outside of sulfur/sulfate ratio• Did not invalidate results• Continued investigation

• How frequently does this occur?• Is this isolated to specific sites or conditions?• What are the potential mechanisms?

8/16/201623

Page 24: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Possible Filter Swap between Consecutive Dates

8/16/201624

Ratio plots can indicate a swap may have occurred

• Swaps can occur in the field or the lab

• By swapping dates, data may look better

• But need evidence to adjust values

• Instances are examined as potential swaps

• Comments added to samples

Page 25: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Improvements in Progress• Historical context – percentile checks• Spatial outliers• Optical Transmissometer measurements vs. TOR carbon• Feedback with state and local validators

8/16/201625

Page 26: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Historical Outliers• We are developing a

database of the CSN historical archive

• Will allow us to routinely compare measured values with the historical range

• Can automatically flag and examine samples that are outside the norm for the site

8/16/201626

Page 27: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Spatial Outliers

8/16/201627

• Many pollutants are spatially autocorrelated (concentrations at one site tend to be similar to neighboring sites)

• We can highlight and examine sites that are different from their neighbors for autocorrelated species

Sulfate

Page 28: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Spatial Outliers

8/16/201628

We can highlight and examine sites that are different from their neighbors for autocorrelated species

Page 29: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Optical Measurement vs. TOR Carbon• We can corroborate elements with ions using sulfur/sulfate and

potassium/potassium ion ratios• Currently, we cannot corroborate carbon with anything• A new Transmissometer is now in testing and will begin operational

analysis soon• For details, see Warren White’s talk:

8/16/201629

• Filter Transmittance Measurements: Experience with IMPROVE and Plans for CSN

• 9:30am Wednesday -Chemical Speciation Technical Session

Page 30: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Feedback from State and Local ExpertsMany of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process.We’re interested in hearing your ideas.

[email protected] – Sean Raffuse – Data Management [email protected] – Nick Spada – CSN Data [email protected] – Support list watched by STI, EPA, and UC Davis

Thank you!

8/16/201630

Page 31: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Challenges Unique to CSN• We are removed from sample operators

• Local conditions or sampler issues may not be known• No data history with the current process

• Although we have historical data, it was handled by a different lab

• No optical Transmissometer™ measurements yet reported • Optical measurements added with the new contract so we

don’t have much data to compare against the TOR carbon measurements

• Sample loadings are often low• Many measurements are below detection

8/16/201631

Page 32: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Data Integrity – Import ValidationWhen electronic data are ingested, the import script performs several checks• Is the file well formed (correct columns found)?• Do all filter records match with existing sites?• Are number columns numbers; date columns dates?• Do records with the same ID already exist in our database?

8/16/201632

Page 33: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Data Integrity – Typo ChecksOperational data from field data sheet are manually entered and mistakes are inevitable (~ 1%)• Date Checks

• Sample Start Date – Intended Use Date = 0• Sample End Date – Sample Start Date = 24 hours• For inconsistent dates, request sample handling lab to check

original sheets for data entry typos• Sample Start Date = 2015-12-20• Sample End Date = 2015-11-21• Intended Use Date = 2015-11-20

• Flow Rates• Flow rates outside of the normal range on samples still marked

as ‘valid’ are cross-checked with sample handling lab

8/16/201633

Page 34: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

Automated FlaggingQualifier flags are added to records that fail certain criteria• Sample delivery temperature < 4°C

• Flag as TT (Transport temperature out of spec.)• Measured concentration below MDL

• Flag as MD (Value less than MDL)• Negative measured concentration

• Flag as 9 (Negative value detected – zero reported)• Flow rate

• Flag as AH (Flow rate average out of spec.) • This is terminal (i.e., the sample is invalid)

8/16/201634

Page 35: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

High Vanadium

8/16/201635

• Here we see a single V concentration, well above the others• Is it real?

Page 36: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

High Vanadium – Checking the Ratios

8/16/201636

• Both sulfur/sulfate and K/K+

are as expected for the date in question (Dec 20)

• No sample-wide problem detected

Page 37: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

High Vanadium – Historical RangeDoes the measured value have precedence at the site?

8/16/201637

0.06

0.93 The measured value is at about the 93rd percentile for this site over the last 10 years.

High, but within the normal range.

Page 38: Validation of CSN Data - US EPA · Many of you have site specific, state specific, CSN specific or general knowledge that could improve our validation process. We’re interested

High Vanadium – Broader ContextAdding January data makes the high value in December look more reasonable.

8/16/201638