Severe weather data analysis

Severe Weather Data Analysis Nicholas Brooks STAT 4001 12/8/2015

Transcript of Severe weather data analysis

Page 1: Severe weather data analysis

Severe Weather Data Analysis

Nicholas Brooks STAT 4001 12/8/2015

Page 2: Severe weather data analysis

Objective

Analyze severe weather data to identify patterns or trends and so better understand the behavior of severe weather

Page 3: Severe weather data analysis

The Problem

Over recent decades there have been noticeable changes in the behavior of severe weather. Severe weather is becoming increasingly violent and frequent compared to the past.

Page 4: Severe weather data analysis

The Data

Original dataset had 1.3 million observations

Tedious cleaning and aggregating of this massive dataset

Final dataset consists of severe weather observations from 1950 to 2015

Includes variables such as Damage ($), Event type, Temperature, and Total Count of severe weather events
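The slides do not show the cleaning and aggregation code. A minimal sketch of that kind of step in base R follows. The data frame, its column names (YEAR, MONTH, EVENT_TYPE) and its values are hypothetical and are not the Storm Events Database schema; treating missing damage as zero is also an assumption.

```r
# Hypothetical raw event records; column names are illustrative assumptions,
# not the actual Storm Events Database schema.
events <- data.frame(
  YEAR            = c(1995, 1995, 1995, 1996, 1996),
  MONTH           = c(4, 4, 5, 4, 4),
  EVENT_TYPE      = c("Hail", "Tornado", "Hail", "Flood", "Hail"),
  DAMAGE_PROPERTY = c(1000, 250000, 500, 40000, NA)
)

# Treat missing damage as zero (an assumption; other choices are possible).
events$DAMAGE_PROPERTY[is.na(events$DAMAGE_PROPERTY)] <- 0

# Total property damage per year and month.
damage <- aggregate(DAMAGE_PROPERTY ~ YEAR + MONTH, data = events, FUN = sum)

# Number of events per year and month.
counts <- aggregate(EVENT_TYPE ~ YEAR + MONTH, data = events, FUN = length)
names(counts)[names(counts) == "EVENT_TYPE"] <- "TotalCount"

# One row per year-month with both summaries.
monthly <- merge(damage, counts, by = c("YEAR", "MONTH"))
print(monthly)
```

Each row of monthly then carries one damage total and one event count, which is the shape the later regression slides appear to use.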

Page 5: Severe weather data analysis

Descriptive Statistics I

Page 6: Severe weather data analysis

Descriptive Statistics I

Page 7: Severe weather data analysis

Descriptive Statistics II

Page 8: Severe weather data analysis

Descriptive Statistics II

Page 9: Severe weather data analysis

Descriptive Statistics Summary

Total event count plotted over time shows a rapid increase in severe weather frequency

Property damage ($) plotted over time shows a noticeable increase in damage dealt

Alarming increases start in the 1990s

At this rate, 2010 – 2019 could prove over the next 5 years to be the decade with the most severe weather activity
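A decade-level comparison like the one described above could be sketched as follows. The yearly counts are made-up values for illustration, not figures from the dataset, and the column names are assumptions.

```r
# Hypothetical yearly event counts (made-up values for illustration only).
yearly <- data.frame(
  YEAR       = c(1988, 1989, 1990, 1995, 2001, 2009, 2010, 2014),
  TotalCount = c(10, 12, 20, 35, 50, 60, 80, 90)
)

# Map each year to the first year of its decade, e.g. 1995 -> 1990.
yearly$DECADE <- (yearly$YEAR %/% 10) * 10

# Total events per decade.
by_decade <- aggregate(TotalCount ~ DECADE, data = yearly, FUN = sum)
print(by_decade)
```

A decade that is still in progress has fewer years of data than a complete one, so its total is a lower bound when compared with earlier decades.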

Page 10: Severe weather data analysis

Time series forecast

The time series plot of Property Damage ($) is shown before log transformation on the left and after log transformation on the right
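The before-and-after comparison can be reproduced in outline with base R. This is a sketch on made-up damage values; the start year of 1950 is taken from the data range stated earlier, and everything else is an assumption.

```r
# Hypothetical yearly property damage in dollars (made-up values).
damage <- c(1.2e6, 3.5e6, 9.0e6, 4.1e7, 2.3e8, 1.9e9)

# Yearly time series starting in 1950.
damage_ts <- ts(damage, start = 1950, frequency = 1)

# log() compresses large values; it requires strictly positive data.
stopifnot(all(damage_ts > 0))
log_damage_ts <- log(damage_ts)

# Side-by-side plots: raw on the left, log-transformed on the right.
op <- par(mfrow = c(1, 2))
plot(damage_ts, main = "Property damage ($)", ylab = "Damage ($)")
plot(log_damage_ts, main = "log(Property damage)", ylab = "log damage")
par(op)
```

The log transformation is a common way to stabilize variance when a series grows roughly multiplicatively, which is one likely reason for applying it here.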

Page 11: Severe weather data analysis

Information for linear modeling

The data used in the following simple and multiple linear regression models are from the month of April of each year from 1955 to 2015

Unfortunately, these models do not treat the yearly April data as a time series

We proceed with the simple and multiple linear models without accounting for time

Page 12: Severe weather data analysis

Linear modeling I

lm(formula = DAMAGE_PROPERTY ~ TotalCount)

Coefficients:
(Intercept)   TotalCount
  -27213230       214440

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -27213230   78320213  -0.347    0.729
TotalCount     214440      26280   8.160 2.94e-11 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Multiple R-squared: 0.5302, Adjusted R-squared: 0.5222
F-statistic: 66.58 on 1 and 59 DF, p-value: 2.939e-11

Page 13: Severe weather data analysis

Linear modeling I

Total Count of severe weather events is moderately correlated with Property Damage ($) (R-squared = 0.5302)

> AIC(modelfff)
[1] 2607.031

An AIC value is only meaningful relative to other models fitted to the same response, so this value serves as the baseline for comparison with the multiple linear regression model
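In a simple linear regression, R-squared equals the squared Pearson correlation between the two variables, so the reported R-squared of 0.5302 corresponds to a correlation of about 0.73. The sketch below checks that identity and computes an AIC on simulated data. The simulated values stand in for the April series and are not the author's data.

```r
set.seed(1)
# Simulated data standing in for the April series (not the author's data).
n <- 61
TotalCount <- round(runif(n, 100, 6000))
DAMAGE_PROPERTY <- 2e5 * TotalCount + rnorm(n, sd = 4e8)

fit <- lm(DAMAGE_PROPERTY ~ TotalCount)

# In simple linear regression, R-squared equals the squared Pearson correlation.
r <- cor(TotalCount, DAMAGE_PROPERTY)
r2 <- summary(fit)$r.squared
print(c(r = r, r_squared = r2, r_times_r = r^2))

# AIC is only comparable between models fitted to the same response values.
print(AIC(fit))
```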

Page 14: Severe weather data analysis

Linear modeling I

> library(MASS)  # boxcox() comes from the MASS package
> b=boxcox(modelfff)
> y1<-DAMAGE_PROPERTY^.2
> modelffh=lm(y1~TotalCount)
> modelffh

Call:
lm(formula = y1 ~ TotalCount)

Coefficients:
(Intercept)   TotalCount
  34.784599     0.004806

Page 15: Severe weather data analysis

Linear modeling I

lm(formula = y1 ~ TotalCount)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.478e+01  1.824e+00  19.076  < 2e-16 ***
TotalCount  4.806e-03  6.119e-04   7.855 9.62e-11 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Multiple R-squared: 0.5112, Adjusted R-squared: 0.5029
F-statistic: 61.7 on 1 and 59 DF, p-value: 9.619e-11

> AIC(modelffh)
[1] 462.8133
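The AIC of the power-transformed model is computed on the scale of y1, so it cannot be compared directly with the AIC of the untransformed model. The sketch below shows one way to choose the Box-Cox exponent from a grid and to put the two AIC values on a common scale by adding the Jacobian term of the transformation. It runs on simulated data, and the variable names fit_raw, fit_pow and aic_pow_original_scale are illustrative, not the author's objects.

```r
library(MASS)  # provides boxcox()

set.seed(2)
# Simulated positive response; not the author's data.
n <- 61
TotalCount <- round(runif(n, 100, 6000))
DAMAGE_PROPERTY <- (35 + 0.005 * TotalCount + rnorm(n, sd = 5))^5

fit_raw <- lm(DAMAGE_PROPERTY ~ TotalCount)

# Profile log-likelihood over a grid of lambda values (no plot).
bc <- boxcox(fit_raw, lambda = seq(-1, 1, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)

# Refit on the power-transformed response, here with lambda = 0.2 as in the slides.
lambda <- 0.2
y1 <- DAMAGE_PROPERTY^lambda
fit_pow <- lm(y1 ~ TotalCount)

# AIC(fit_pow) is on the scale of y1. Adding the Jacobian term of the
# transformation puts it on the scale of DAMAGE_PROPERTY for comparison.
jacobian <- sum(log(lambda * DAMAGE_PROPERTY^(lambda - 1)))
aic_pow_original_scale <- AIC(fit_pow) - 2 * jacobian
print(c(raw = AIC(fit_raw), power = aic_pow_original_scale))
```

With this adjustment, the smaller of the two printed values indicates the model preferred by AIC on the original dollar scale.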

Page 16: Severe weather data analysis

Linear modeling II

The multiple regression model predicts property damage ($) using more independent variables than the simple linear model

lm(formula = DAMAGE_PROPERTY ~ DAMAGE_CROPS + Hail + Tornado + Heavy.Rain + Flash.Flood + Flood + Wildfire + High.Wind + Heavy.Snow + Drought + Heat + Tropical.Storm)

Coefficients:
   (Intercept)    DAMAGE_CROPS            Hail         Tornado      Heavy.Rain
    -3.389e+08      -9.500e-01       1.491e+05       3.922e+06      -8.237e+06
   Flash.Flood           Flood        Wildfire       High.Wind      Heavy.Snow
    -2.674e+06       3.911e+06      -1.542e+07       1.549e+06      -3.781e+06
       Drought            Heat  Tropical.Storm
     2.616e+06      -7.827e+06       9.985e+08

Page 17: Severe weather data analysis

Linear modeling II

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)    -3.389e+08  7.462e+07  -4.541 3.77e-05 ***
DAMAGE_CROPS   -9.500e-01  4.109e-01  -2.312 0.025103 *
Hail            1.491e+05  8.191e+04   1.821 0.074903 .
Tornado         3.922e+06  5.065e+05   7.743 5.40e-10 ***
Heavy.Rain     -8.237e+06  3.769e+06  -2.185 0.033771 *
Flash.Flood    -2.674e+06  1.077e+06  -2.484 0.016540 *
Flood           3.911e+06  9.482e+05   4.124 0.000147 ***
Wildfire       -1.542e+07  5.306e+06  -2.906 0.005519 **
High.Wind       1.549e+06  5.276e+05   2.936 0.005085 **
Heavy.Snow     -3.781e+06  1.532e+06  -2.468 0.017187 *
Drought         2.616e+06  7.477e+05   3.499 0.001017 **
Heat           -7.827e+06  3.053e+06  -2.563 0.013553 *
Tropical.Storm  9.985e+08  3.893e+08   2.565 0.013504 *

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Page 18: Severe weather data analysis

Linear modeling II

Multiple R-squared: 0.8522, Adjusted R-squared: 0.8153
F-statistic: 23.07 on 12 and 48 DF, p-value: 6.5e-16

> AIC(modelhhh)
2558.475 < 2607.031

The AIC of the multiple linear regression is lower than that of the simple linear regression, which indicates a better trade-off between fit and model complexity. AIC does not measure multicollinearity, so that would need a separate check
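Multicollinearity is usually assessed with variance inflation factors (VIF) rather than AIC. The sketch below computes VIFs by hand in base R on simulated predictors. The data are invented, with Flood deliberately built to correlate with Flash.Flood, and the object names X and vif are illustrative.

```r
set.seed(3)
# Simulated predictors; Flood is built to correlate with Flash.Flood.
n <- 61
Hail        <- rpois(n, 200)
Tornado     <- rpois(n, 100)
Flash.Flood <- rpois(n, 50)
Flood       <- Flash.Flood + rpois(n, 5)
X <- data.frame(Hail, Tornado, Flash.Flood, Flood)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif, 2))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, though the threshold is a convention and not a formal test.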

Page 19: Severe weather data analysis

Conclusion

Analysis of the severe weather data supports the belief that patterns and relationships exist within the data.

Although rapid increases in observations over time were shown graphically, the catalyst or catalysts that explain this behavior in the data remain enigmatic

In the future we should anticipate that severe weather may occur more frequently and violently

Page 20: Severe weather data analysis

Works Cited

Storm Events Database. National Oceanic and Atmospheric Administration. Web. 7 Dec. 2015. http://www.ncdc.noaa.gov/stormevents/ftp.jsp