David C. Wong and Daniel Tong *



1. Introduction

Numerical instability in PM simulations has been reported in earlier releases of the Community Multiscale Air Quality (CMAQ) model [1, 2, 3]. Tong and Mauzerall [3] demonstrated that a tiny change in NOx emissions (0.5 moles/sec) in one grid cell over Middlesex County, Connecticut, can trigger a change of up to 1 µg/m³ in PM2.5 concentrations in the Ohio Valley and southern California in less than 48 hours. Recent efforts by the community have substantially reduced this numerical instability, but it remains a problem in PM simulations [3].

We report here the results of our tests on version 4.5.1 of CMAQ. Unlike earlier studies on this subject, this work focuses on the effect of numerical noise that compounds the instability issue, something we call a butterfly effect.

Small perturbations can be introduced by parallel execution, through a different sequence of arithmetic operations or through numerical truncation. Aggressive compiler optimization can be a contributing factor. When such perturbations are introduced into CMAQ, science processes such as CHEM, AERO, and CLDPROC amplify them.
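The effect of a different arithmetic sequence can be illustrated with a minimal sketch. It is not CMAQ code: the function names `serial_sum` and `partitioned_sum` and the three input values are assumptions chosen only to show that floating-point addition is not associative, so summing the same numbers as one sequence or as per-processor partial sums can give results that differ in the last bits.

```python
def serial_sum(values):
    """Add the values left to right, as a single processor would."""
    total = 0.0
    for v in values:
        total += v
    return total


def partitioned_sum(values, split):
    """Hypothetical two-processor sum: each part is summed separately,
    then the two partial sums are combined."""
    left = serial_sum(values[:split])
    right = serial_sum(values[split:])
    return left + right


values = [0.1, 0.2, 0.3]          # illustrative inputs, not model data
one_proc = serial_sum(values)     # (0.1 + 0.2) + 0.3
two_proc = partitioned_sum(values, 1)  # 0.1 + (0.2 + 0.3)
noise = abs(one_proc - two_proc)
print(one_proc, two_proc, noise)
```

The two results agree to about 16 significant digits, which is the kind of seed perturbation that nonlinear processes can then amplify.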

NUMERICAL NOISE IN PM SIMULATION IN CMAQ

David C. Wong and Daniel Tong*

Atmospheric Sciences Modeling Division+, Air Resources Laboratory, National Oceanic and Atmospheric Administration

+In partnership with U.S. Environmental Protection Agency, Research Triangle Park, NC 27711

*On assignment from Science and Technology Corporation, 10 Basil Sawyer Drive, Hampton, VA 23666-1393

Email: [email protected] (David Wong)

Acknowledgements: The research presented here was performed under the Memorandum of Understanding between the U.S. Environmental Protection Agency (EPA) and the U.S. Department of Commerce's National Oceanic and Atmospheric Administration (NOAA) and under agreement number DW13921548. This work constitutes a contribution to the NOAA Air Quality Program. Although it has been reviewed by EPA and NOAA and approved for publication, it does not necessarily reflect their policies or views.

2. Experiment Design

Tong and Mauzerall [3] report that a small adjustment of emissions at an east coast location affects PM concentrations on the west coast within a few hours. This experiment examined whether parallel execution or the compiler introduces numerical noise that contributes to the PM instability. In general, any subset of the four science processes VDIFF, HADV, ZADV, and HDIFF is known not to cause differences between simulation results from executions that use different parallel decompositions. Table 1 summarizes the general experiment setup. Test cases 1 and 2 focus on identifying the origin of the numerical noise with respect to the transport processes. Case 3 is a remedial solution. The only difference in this case is an additional compilation flag, -C, which creates an executable that performs run-time array bounds checking. Full model runs, cases 2p and 3p, extend cases 2 and 3, respectively, so that the impact of the numerical noise and of the proposed solution can be examined.

Since the mass-conserving yamo scheme was introduced as an alternative to the PPM scheme in CMAQ 4.5, we also tested cases 2 and 3 with PPM. Two cases were run with both a baseline emission (B) input and a new emission (N) input, which is identical to the baseline emission input except for 0.5 moles/sec of NOx added in one surface grid cell over South Carolina. Table 2 shows which runs were conducted in the experiment.

The experiment was run with 1x1, 2x2, and 4x4 processor configurations on two Linux clusters and on an IBM eServer 1600. We compiled the code with pgf90 5.0 and 6.0, both with the -Mfixed -Mextend -O2 options, on the Linux clusters, and with xlf90 -qfixed=132 -O3 -qstrict -Q on the IBM.
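To make the processor configurations concrete, the following sketch splits a horizontal grid into column-by-row subdomains and lists the internal processor boundaries. It is an assumption-laden illustration: the helper names `split_axis`, `decompose`, and `internal_boundaries`, the even-split rule, and the 10x8 grid size are hypothetical and are not taken from CMAQ's own decomposition code.

```python
def split_axis(n, parts):
    """Split n cells into `parts` contiguous (start, stop) ranges,
    as evenly as possible (assumed rule, for illustration only)."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose(ncols, nrows, npcol, nprow):
    """Return one (column_range, row_range) pair per processor."""
    cols = split_axis(ncols, npcol)
    rows = split_axis(nrows, nprow)
    return [(c, r) for r in rows for c in cols]


def internal_boundaries(n, parts):
    """Cell indices at which one subdomain ends and the next begins."""
    return [stop for _, stop in split_axis(n, parts)[:-1]]


# Hypothetical 10-column by 8-row grid under the three configurations.
for npcol, nprow in [(1, 1), (2, 2), (4, 4)]:
    subdomains = decompose(10, 8, npcol, nprow)
    print(npcol, nprow, len(subdomains), internal_boundaries(10, npcol))
```

A 1x1 run has no internal boundaries, so comparing it with 2x2 and 4x4 runs isolates any differences that originate where subdomains meet.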

3. Results

Layer 1 total aerosol nitrate, ANO3, was selected for illustration. Similar noise behaviour is exhibited by other species as well. Results from the 1st, 8th, and 24th hours of simulation are used in the figures to show the progression of the numerical noise.

There is no difference among the results from Case 1. However, once HDIFF is included, as in Case 2, differences were found in the simulations using the yamo scheme. Figures 1(a) and 1(b) show where the differences are (value = 1) when comparing the 1x1 and 2x2 processor configuration runs and the 1x1 and 4x4 processor configuration runs, respectively. The differences cluster around the processor boundaries and then spread across each processor domain as time progresses. For the IBM runs, no differences were found in the same test case for either the yamo or the PPM scheme.

Figure 1: Location of differences comparing 1x1 vs 2x2 (a) and 1x1 vs 4x4 (b)
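A difference map of the kind described for Figure 1 (value = 1 where two runs disagree) could be computed along the following lines. This is a sketch under stated assumptions: `difference_mask` and `count_differences` are hypothetical helper names, the comparison is an exact cell-by-cell inequality test, and the small arrays are made-up values rather than model output.

```python
def difference_mask(field_a, field_b):
    """Return a 0/1 grid: 1 where the two fields differ, 0 where identical."""
    return [
        [1 if a != b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(field_a, field_b)
    ]


def count_differences(mask):
    """Number of grid cells flagged as different."""
    return sum(sum(row) for row in mask)


# Made-up 2x4 fields standing in for one layer of ANO3 from two runs.
run_1x1 = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
run_2x2 = [[1.0, 2.0000001, 3.0, 4.0], [5.0, 6.0, 7.0000001, 8.0]]

mask = difference_mask(run_1x1, run_2x2)
print(mask, count_differences(mask))
```

An exact test is used because the question here is whether two decompositions reproduce each other bit for bit, not whether they agree within a tolerance.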

Figures 2 and 3 show the differences in Case 2p between the outputs of the 1x1 and 2x2 processor configuration runs. Noise appears in the western portion of the domain within a few hours.

4. Conclusion and Future Work

1) Additional numerical noise was observed in the latest release of CMAQ using the Yamo advection scheme when compiled with pgf90 (5.0 or 6.0) on a Linux cluster; 2) We recommend adding the compiler option -C to pgf90 to eliminate the extra numerical noise; however, this option will slow down CMAQ runs.

Future work is needed 1) to identify the exact numerical calculation in the Yamo scheme that causes the pgf90 compiler to generate the noise; and 2) to pinpoint additional sources of the numerical noise as the butterfly problem remains in CMAQ, mostly in the upper layers.

References

[1] Bhave, P., S.J. Roselle, F.S. Binkowski, C.G. Nolte, S. Yu, G.L. Gipson, and K.L. Schere, CMAQ aerosol module development: recent enhancements and future plans, Models-3 User’s Workshop, October 18th - 20th, 2004, Chapel Hill, NC.

[2] Tong, D.Q., CMAQ in regulatory applications and remaining questions, Models-3 User’s Workshop, October 18th - 20th, 2004, Chapel Hill, NC.

[3] Tong, D.Q. and D.L. Mauzerall, Numerical instability in the Community Multiscale Air Quality model and its impacts on aerosol and ozone simulations, submitted to Atmospheric Environment, 2006.

The noise is amplified when the new emissions are used. Figure 4 shows changes in particulate nitrate in response to a 0.5 moles/sec NOx increase in a South Carolina (SC) grid cell, using a 2x2 processor configuration. After 24 hours, nitrate changes are found in the west coast region with a magnitude comparable to that near the source area.
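One way to judge whether such a remote response is distinguishable from decomposition noise is sketched below. This diagnostic is our illustration, not a procedure described on the poster: the helper names `subtract` and `cells_within_noise` are hypothetical, and the 2x2 arrays are made-up values. The response is taken as the new-emission run minus the baseline run, and the noise as the baseline run on one decomposition minus the baseline run on another.

```python
def subtract(field_a, field_b):
    """Cell-by-cell difference field_a - field_b."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(field_a, field_b)
    ]


def cells_within_noise(response, noise):
    """Flag (1) cells where noise is present and the emission response
    is no larger in magnitude than that noise."""
    return [
        [1 if n != 0.0 and abs(r) <= abs(n) else 0 for r, n in zip(row_r, row_n)]
        for row_r, row_n in zip(response, noise)
    ]


# Made-up values: baseline (B) on two decompositions, new emission (N) on 2x2.
base_1x1 = [[2.0, 4.0], [6.0, 8.0]]
base_2x2 = [[2.0, 4.0], [6.0, 8.5]]
new_2x2 = [[2.0, 4.0], [6.5, 9.0]]

response = subtract(new_2x2, base_2x2)   # N minus B, same decomposition
noise = subtract(base_2x2, base_1x1)     # same input, different decomposition
flags = cells_within_noise(response, noise)
print(response, noise, flags)
```

In this toy case the response in the lower-left cell stands clear of the noise, while the equally large response in the lower-right cell cannot be separated from it.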

Once the code was compiled with the pgf90 compiler on the Linux clusters using the -C option, the numerical noise was completely eliminated. However, the overall execution time was much longer (up to 4-fold). The -C flag is computationally very expensive and is normally used only for debugging. None of the runs were conducted on a dedicated platform, so the observed timings may not reflect the actual magnitude of the slowdown.

Figure 2: Amplification of the numerical noise in the full model: ANO3

Figure 3: Amplification of the numerical noise in the full model: ASO4

Figure 4: Changes in ANO3 with the new emission