Summary of NDM Data Sample Analysis


© British Gas Trading Limited 2011

Summary of NDM Data Sample Analysis

Option C: Regression Analysis

Contents

• Regression Analysis per LDZ

• In-Sample Results

• Out-of-Sample Model fit

• CWV Contribution

• Conclusion

Regression Analysis

• Regression Model as follows:

• Dummy variables (Bank Holidays, Easter, Christmas and so forth).

• Weather variables introduced as per DESC meeting on 4th April (e.g. Temperature, Global Radiation, Rainfall and so forth).

• Time intervals used based on office hours and domestic habits.

• Slot 1 from 5am to 8am

• Slot 2 from 9am to 4pm

• Slot 3 from 5pm to 10pm

• Slot 4 from 11pm to 4am
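The time intervals above can be expressed as a lookup on the hour of day. This is a minimal sketch, assuming the hours listed on the slide are inclusive hour-of-day labels (so Slot 1 covers hours 05 to 08 and Slot 4 wraps around midnight). The function name `slot_for_hour` is hypothetical and does not come from the slides.

```python
def slot_for_hour(hour: int) -> int:
    """Return the slot number (1-4) for an hour of day in 0..23.

    Assumption: the slide's hour ranges are inclusive hour labels.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if 5 <= hour <= 8:
        return 1  # 5am to 8am
    if 9 <= hour <= 16:
        return 2  # 9am to 4pm
    if 17 <= hour <= 22:
        return 3  # 5pm to 10pm
    return 4      # 11pm to 4am, wrapping around midnight


# Every hour of the day falls into exactly one slot.
slots_by_hour = {h: slot_for_hour(h) for h in range(24)}
```

Under this reading the four slots cover all 24 hours with no gaps or overlaps.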

Regression Analysis

• Data were normalised by AQ because erratic level changes were observed year on year. The yearly cut-off date is 1st April, owing to the time span of the original files and the data deletion process.

• Binary permutations of the candidate variables were used to search for the best regression fit at the p ≤ 5% significance level.
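The binary-permutation search can be sketched as follows: every candidate variable is either switched on or off, each resulting subset is fitted, and the subset with the best fit among those passing the significance test is kept. This is an illustrative sketch only, not the code used for the analysis. The names `binary_permutations`, `select_best` and `evaluate` are hypothetical, and `evaluate` stands in for whatever regression routine returns an R2 value and the p-values of the fitted coefficients.

```python
from itertools import product


def binary_permutations(names):
    """Yield every on/off combination of `names` as a tuple of the chosen names."""
    for mask in product((0, 1), repeat=len(names)):
        yield tuple(name for name, bit in zip(names, mask) if bit)


def select_best(names, evaluate, alpha=0.05):
    """Return (subset, r2) with the highest R2 among subsets whose
    coefficients are all significant at level `alpha`.

    `evaluate(subset)` is a caller-supplied (hypothetical) function that
    fits the regression and returns (r2, list_of_p_values).
    """
    best_subset, best_r2 = None, float("-inf")
    for subset in binary_permutations(names):
        r2, p_values = evaluate(subset)
        if all(p <= alpha for p in p_values) and r2 > best_r2:
            best_subset, best_r2 = subset, r2
    return best_subset, best_r2


# Toy stand-in for a real fit: made-up R2 and p-values, for illustration only.
def fake_evaluate(subset):
    table = {
        (): (0.90, []),
        ("Windspeed",): (0.91, [0.20]),
        ("Temp",): (0.93, [0.01]),
        ("Temp", "Windspeed"): (0.94, [0.01, 0.30]),
    }
    return table[subset]


chosen, chosen_r2 = select_best(["Temp", "Windspeed"], fake_evaluate)
```

In the toy example the two-variable subset has the highest R2 but is rejected because one coefficient fails the 5% test, so `("Temp",)` is selected. The number of subsets doubles with each added candidate variable.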

Regression Analysis Models used

• A benchmark model was used for each LDZ, as follows:

Normalised Consumption = Intercept + a0 * CWV

• Using binary permutations, the optimised linear regression model (the one with the best R2 fit) is chosen. The linear regression is of the form:

Normalised Consumption = Intercept + a0 * CWV + a1 * Temperature + a2 * Windspeed + a3 * Solar Radiation + …

• In-Sample data runs from April 2008 to March 2011 whereas Out-of-Sample data spans from April 2011 to March 2012.

• These models were applied to End-User Category 1 only (EUC1).
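The benchmark model and the sample split can be illustrated with a short sketch. It assumes ordinary least squares for the single-variable fit and reads "April 2008 to March 2011" as 1 April 2008 to 31 March 2011 inclusive (likewise for the out-of-sample year). The function names and the toy data are hypothetical and are not taken from the analysis.

```python
from datetime import date


def fit_benchmark(cwv, consumption):
    """Least-squares fit of: Normalised Consumption = Intercept + a0 * CWV."""
    n = len(cwv)
    if n != len(consumption) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(cwv) / n
    mean_y = sum(consumption) / n
    sxx = sum((x - mean_x) ** 2 for x in cwv)
    if sxx == 0:
        raise ValueError("CWV values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cwv, consumption))
    a0 = sxy / sxx
    intercept = mean_y - a0 * mean_x
    return intercept, a0


def is_in_sample(d: date) -> bool:
    """Assumed in-sample window: 1 April 2008 to 31 March 2011."""
    return date(2008, 4, 1) <= d <= date(2011, 3, 31)


def is_out_of_sample(d: date) -> bool:
    """Assumed out-of-sample window: 1 April 2011 to 31 March 2012."""
    return date(2011, 4, 1) <= d <= date(2012, 3, 31)


# Synthetic data lying exactly on a line, for illustration only.
toy_cwv = [0.0, 5.0, 10.0, 15.0]
toy_consumption = [0.007 - 0.0004 * x for x in toy_cwv]
intercept, a0 = fit_benchmark(toy_cwv, toy_consumption)
```

Because the toy points lie on a straight line, the fit recovers the intercept and slope used to generate them (up to floating-point rounding). The coefficients would be estimated on in-sample days only and then scored on the out-of-sample days.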

Regression Analysis Parameters (1 of 2)

(A dash marks a cell that is blank in the original table.)

Parameters                  EA          EM          NE          NO          NW          SC          SE          SW          WM          WS
Intercept                   0.006914    0.006223    0.005724    0.005377    0.006518    0.005753    0.007031    0.007086    0.006198    0.006572
CWV                         -0.00039    -0.00038    -0.00032    -0.00033    -0.00029    -0.00033    -0.0004     -0.00044    -0.00039    -0.00035
mean_Temp                   -0.00009    0.00011     0.000075    -0.00002    -0.00014    0.000019    -0.00006    0.000024    0.00000351  -0.00005
mean_Windspeed              0.000025    -           -           -           -           -           0.000015    -           -           0.00006
mean_WindDirection          -7.55E-07   -           -           -1.03E-06   -           -7.45E-07   -2.85E-07   -           -           -
mean_Humidity               -1.16E-07   0.00000266  -           0.000012    -           0.00000427  -           -           0.00000174  -
mean_Global_Radiation       -           -5.13E-07   -2.52E-07   -           -1.15E-06   -           4.23E-07    -           -           6.97E-07
mean_Rainfall               0.00024     0.000178    0.000179    -           0.000476    -           -           0.000145    -           -
mean_Temp_lag1              -           -0.00001    -8.53E-06   -0.00001    -0.00002    -8.35E-06   -           -8.83E-06   0.0000034   -
mean_Windspeed_lag1         -           0.000022    0.000026    0.000014    0.000041    0.00003     0.000012    0.000014    -           -
mean_WindDirection_lag1     -           -           4.623E-07   -           -           -           -           3.49E-07    3.48E-07    -
mean_Humidity_lag1          -7.05E-07   -2.05E-06   -           -2.06E-06   -           -1.87E-06   -           -           -           -
mean_Global_Radiation_lag1  -           1.376E-07   -           -           -           -1.47E-07   6.75E-08    -           -           -
mean_Rainfall_lag1          -           0.000113    -           -           -           -0.00007    -           -           -           0.000096
WeekEnd                     -           -           -0.00005    -0.00014    -           -           0.000094    -           0.000073    0.000071
Mon_Fri                     -0.00005    -0.00005    -           -           -           -           -           -           0.00000264  -
WeekEnd_from__Friday        -           -           -           -           -           -           -           0.000041    -           -
Bank__Hols                  -           -           -           -           -           -           -           -           0.000064    -
School_Hols                 0.000066    -           -           0.00007     0.00012     -           -           -           -2.35E-06   -
Mon_Thurs                   -           -           -0.00009    -0.00013    -           -0.00003    -           -           -           -
Slot1_Windspeed             -           -           -           -           -           -           -           -           -           -
Slot1_Rainfall              -           -           -           -           -0.0001     -           -           -0.00006    -           -

Regression Analysis Parameters (2 of 2)

(A dash marks a cell that is blank in the original table.)

Parameters                  EA          EM          NE          NO          NW          SC          SE          SW          WM          WS
Slot1_GlobalRadiation       2.293E-07   -           3.061E-07   -           0.00000351  -           -           -           -           -
Slot1_Temp                  0.000032    -0.00005    -0.00004    -           -           -           0.000021    -           -           -
Slot1_WindDirection         -           -           -3.68E-07   -           -           2.86E-07    -           -           -           -
Slot1_Humidity              -           -           -3.08E-06   -           -           -           -           -           -           -
Slot2_Windspeed             -           -           9.999E-06   -           -           -           -           -           -           -0.00003
Slot2_Rainfall              -           -           -           -           -           0.000033    -           -           -           -
Slot2_GlobalRadiation       -3.51E-08   -           -           -           -           -           -2.45E-07   -           -           -4.49E-07
Slot2_Temp                  -           -0.00004    -0.00002    -           0.000059    -0.00003    -           -           -           -
Slot2_WindDirection         5.639E-07   -           3.091E-07   -           -           -           -           -4.25E-07   -           -
Slot2_Humidity              -           -           3.617E-06   -           -           -           -           -           -           -
Slot3_Temp                  0.000053    -           -           0.000028    0.000029    0.000013    0.000043    -           -           0.000022
Slot3_Windspeed             -7.84E-06   -           -           -           -           -           -           -           -           -
Slot3_GlobalRadiation       1.357E-08   2.51E-07    -           -           -           -           -           -           -           -
Slot3_Rainfall              -0.00009    -           -           -           -           -           -           -           -           0.000061
Slot3_WindDirection         -           -           -           4.538E-07   -           -           -           -           -           -
Slot3_Humidity              3.044E-06   -           -           -           -           -           -           -           -           -
Slot4_Temp                  0.000015    -           -           -           -           -           0.000013    -           -           0.000021
Slot4_WindDirection         -           -           -           -           -           -           -           3.71E-07    -           -
Slot4_Humidity              -1.75E-06   -           -           -5.51E-06   -           -2.28E-06   -           -           -           -
Slot4_GlobalRadiation       -           -           -           -           -           -           -           -           -           -
Slot4_Windspeed             -           -           -           -           -           -           -           -           -           -
Slot4_Rainfall              0.000031    -           -           -           -           -           -           -           -           0.000072

In-Sample MAPE Results

In-Sample R2 Results

Out-of-Sample MAPE Results

Out-of-Sample R2 Results
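The slides report MAPE and R2 without spelling out the formulas. The sketch below assumes the usual textbook definitions: MAPE as the mean of absolute percentage errors, and R2 as one minus the ratio of residual to total sum of squares. The function names and the sample numbers are illustrative only and are not results from the analysis.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values must not all be identical")
    return 1.0 - ss_res / ss_tot


# Made-up numbers, for illustration only.
toy_actual = [100.0, 200.0]
toy_predicted = [110.0, 180.0]
toy_mape = mape(toy_actual, toy_predicted)
toy_r2 = r_squared(toy_actual, toy_predicted)
```

Applying the same two functions to in-sample and out-of-sample days separately gives the four sets of results named on these slides.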

Analysis of Contribution of CWV in Optimised Models

Conclusion

• Adding weather and/or calendar effects on top of CWV improves on the benchmark results.

• Whether the Weekend, Weekday and Bank Holiday variables are significant is very much LDZ-specific.

• Global Radiation is a significant variable in all LDZs.

• Time intervals (i.e. Slots 1 to 4) and the Monday-to-Thursday dummy variable help explain customer behaviour in some LDZs.

• Relative Humidity stands out in almost every LDZ.

• CWV contributes heavily to the optimised models obtained.

• No cross-effects were used in the regression models.

• LDZs SO and NT need further investigation.