Towards Self-Managing Demand Side Management
PhD Thesis
Fahad Javed
2006-03-0042
Advisor: Dr. Naveed Arshad
Department of Computer Science
Syed Babar Ali School of Science and Engineering
Lahore University of Management Sciences
Dedicated to those who stood behind me and endured
with me: my parents, my wife and my children.
Lahore University of Management Sciences
Syed Babar Ali School of Science and Engineering
CERTIFICATE
I hereby recommend that the thesis prepared under my supervision by Fahad Javed titled
Towards Self-Managing Demand Side Management be accepted in partial fulfillment
of the requirements for the degree of Doctor of Philosophy in Computer Science.
Dr. Naveed Arshad (Advisor)
Recommendation of Examiners' Committee:
Name Signature
Dr. Asim Karim ________________
Dr. Mian Muhammad Awais ________________
Dr. Jahangir Ikram ________________
Dr. Waqar Mahmood ________________
Acknowledgements
First and foremost, I thank God Almighty for providing me with the strength, determination
and everything else needed to make this work possible.
I was fortunate enough to be surrounded by some very wonderful people during my
Ph.D. who provided me with invaluable feedback, critiques and comments without which
this work might not have been possible. My advisor, Dr. Naveed Arshad, provided me with
guidance and advice from the very first day, steering me in the right direction. I would also
like to acknowledge Dr. Shahid Masood for providing me with encouragement and guidance
at some very critical junctures.
I am also indebted to my tea buddies: Saqib Ilyas, Zeeshan Rana, Junaid Akhtar, Umer
Sulaiman, Aadil Zia Khan, Khurram Junejo, and Malik Tahir Hassan. The discussions, the
arguments, the critiques and the lame jokes helped me formulate my ideas and concepts and
are an integral part of my work. The RICE lab weekly meetings also deserve a mention
here. Dr. Awais and his group's feedback was instrumental in focusing on the specific
research questions that I attempt to answer in this thesis.
Last but not least, I would like to acknowledge the support of my wife. Her support through
thick and thin, her encouragement and her perseverance carried me to this point.
Contents
1 Introduction 2
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Demand Side Management and Demand Response . . . . . . . . . . . . . . . 5
1.3 Smart Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 DSM in Smart Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.2 Challenges for DSM in Smart Grid . . . . . . . . . . . . . . . . . . . 13
1.4 Limitations, Assumptions and Scope . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 Literature Survey 19
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Demand Side Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.1 Critical DSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.2 DSM for Price Responsive Systems . . . . . . . . . . . . . . . . . . . 27
2.2.3 Distributed Generation Supported by DSM . . . . . . . . . . . . . . . 33
2.3 Short Term Load Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Statistical and Time Series Techniques . . . . . . . . . . . . . . . . . 35
2.3.2 Artificial Intelligence Techniques . . . . . . . . . . . . . . . . . . . . . 37
2.3.3 Hybrid Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.4 STLF for Buildings and Micro grids . . . . . . . . . . . . . . . . . . . 40
2.4 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5 Self-managing Energy Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5.1 Server Farms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5.2 Home Energy Management . . . . . . . . . . . . . . . . . . . . . . . . 44
3 System Architecture 46
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2 Proposed Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.1 Collection Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.2 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3.3 Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.5 Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.6 Actuators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4 Forecasting Energy Load for Individual Consumers 55
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Problem Description: Issues in house level forecasting . . . . . . . . . . . . 58
4.3 STMLF Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.1 STLF Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.3.2 STLF for Independent House Forecast . . . . . . . . . . . . . . . . . 64
4.3.3 STMLF1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3.4 STMLF2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3.5 Model Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.1 Forecasting Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.2 Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.4.3 Experimental Data Source . . . . . . . . . . . . . . . . . . . . . . . . 74
4.4.4 Experimental Environment . . . . . . . . . . . . . . . . . . . . . . . . 74
4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.1 AI Based Experiment Results . . . . . . . . . . . . . . . . . . . . . . 75
4.5.2 Multiple STLFs vs. STMLF . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.3 Effect of Anthropologic and Structural Data . . . . . . . . . . . . . . 78
4.6 Discussion on Mis-Forecasted Combinations . . . . . . . . . . . . . . . . . . 80
4.7 Short Term Forecasting Techniques for STMLF . . . . . . . . . . . . . . . . 84
4.8 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5 Disaggregating Heavy Loads from Forecast 90
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.3 Evaluation Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.4 Disaggregation Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.5.1 Noiseless Forecast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.5.2 Forecast with Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6 Demand Side Management Planning 99
6.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3 Clustered Frequency Based Algorithm . . . . . . . . . . . . . . . . . . . . . 105
6.3.1 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.3.2 Linear Programming Based Planning . . . . . . . . . . . . . . . . . . 107
6.3.3 Spike Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.3.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.4 Adaptable Optimization - AdOpt . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4.1 Self-Optimizing techniques . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4.2 System Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.4.3 Case-Based Reasoning Engine . . . . . . . . . . . . . . . . . . . . . . 128
6.4.4 Framework Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.4.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5 Adaptable Modeling Framework . . . . . . . . . . . . . . . . . . . . . . . . 143
6.5.1 Structure of the Mathematical Meta-Model . . . . . . . . . . . . . . . 144
6.5.2 Modeling at Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.5.3 Running Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.5.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.5.5 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.5.6 Future Dimensions of Runtime Modeling . . . . . . . . . . . . . . . . 159
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7 Conclusion and Future Work 161
7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.1.1 Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.1.2 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.1.3 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.1.4 Putting It All Together . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.2 Lessons Learnt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.3.1 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.3.2 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7.3.3 Planning and Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 170
List of Figures
1.1 Consumption of the State of California, USA, on 2nd April 2013. . . . . . . 5
1.2 Goals of DSM as proposed by Gellings [Gellings, 1985]. . . . . . . . . . . . 7
2.1 Comfort Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2 Contractually bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Explicit Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Incentive based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 TOU based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6 Incentive with storage based . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.7 Incentive with renewable with/without storage based . . . . . . . . . . . . . 36
3.1 Self-managing demand side management architecture. . . . . . . . . . . . . 50
4.1 Box and whisker plot of consumer load over a 24 hour period for 204 houses
from Eskilstuna, Sweden. Whiskers mark the maximum load for the hour,
the lower and upper box edges are the 25th and 75th percentiles respectively,
and the line in the box is the median. The X axis is time at intervals of one
hour and the Y axis is load in KWH. . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Classification of survey questions. We classified questions as anthropologic
(human centric), structural (building specific), or pseudo-anthropologic, which
concern occupants' impact on or usage of structural facilities. . . . . . . . . 62
4.3 ANN models for the three forecasters. (a) is the ANN model for a single house
where only load and global invariants are provided for the forecast. (b) is the
ANN model for STMLF1. (c) is the ANN model for STMLF2. . . . . . . . . 69
4.4 Mean squared error for four test weeks (a. Week of January b. Week of
April c. Week of July d. Week of October) comparing STMLF with multiple
STLFs. The blue line is STMLF and the red line is the average MSE of all
STLFs. Days of the week are on the X axis and MSE on the Y axis. . . . . . 77
4.5 Scatter plot of forecast against actual load for the 7 day test period of January.
The top plot in each figure is the forecast through structural and anthropologic
data and the bottom one uses house-Id as the discriminant. In all figures the
actual load is on the X axis and the forecast is on the Y axis. . . . . . . . . . 79
4.6 Mean squared error for four test weeks (a. Week of January b. Week of April
c. Week of July d. Week of October). Days of week are on X axis and MSE
values on Y axis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7 MLR forecast error for the 9 day evaluation period. Each day has 4896 forecasts.
The darker part of each bar represents correctly forecasted loads and the lighter
shade represents mis-forecasted loads. A correct forecast is one within the
range defined by equation 4.9. . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.1 Heater load and load profile of a single house. The red line represents the main
load value and the blue dots represent the hours in which the heater was on. . 95
6.1 Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.2 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.3 Typical Supply spike in system . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4 Spike handling LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.5 System response for spike at 20 < t < 30 . . . . . . . . . . . . . . . . . . . . 112
6.6 Typical demand spike in system . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.7 Reserve margin lower limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.8 Hourly planning BP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.9 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.10 System Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.11 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.12 Summary of Results of Table 6.5. The topmost figure shows results of AdOpt,
the middle one shows results from Interior Point and the bottom one shows
results of the Simplex method. . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.13 AdOpt and Simplex comparison on 7 day CAISO data . . . . . . . . . . . . 141
6.14 Supply demand and comparison for day 6 CAISO data . . . . . . . . . . . . 142
6.15 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.16 Consumption profile of California for a day as published by CAISO (Consumption in MWh) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6.17 Response time for dynamic and standard modeler in comparison to demand.
(Response time in seconds) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.18 Comparison between demand, clusters and active users for 24 hour period as
observed in Sollentuna, Sweden . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.19 Solver time for 24 hours CAISO data. (Response time in seconds) . . . . . . 157
6.20 Solver efficiency for 24 hours CAISO data. (Power allocation in Kilowatts) . 158
List of Tables
4.1 Comparison of volatility measure of individual loads, micro grid loads and
standard grid loads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Results of 3 measures of forecast through multiple STLFs and STMLF. In
addition, the average load for that week is provided to show a relationship
between MSE and average load in that week. . . . . . . . . . . . . . . . . . . 76
4.3 Results of 3 measures for forecasts based on a model constructed through
anthropologic and structural data and forecasts based on house-Id. In addition,
the average load for that week is provided to show a relationship between
MSE and average load in that week. . . . . . . . . . . . . . . . . . . . . . . . 78
4.4 Repeat count of error and Cumulative accuracy error for 7 day period. . . . 84
4.5 Mean Squared Error (in KWH) for 9 day STMLF using multiple linear
regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.1 Confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.2 Confusion matrices for noiseless forecast. a) Artificial neural network (ANN).
b) Support vector machines (SVM). c) (ANN OR SVM). . . . . . . . . . . . 96
5.3 Confusion matrices for noisy forecast. a) Artificial neural network (ANN). b)
Support vector machines (SVM). c) (ANN OR SVM). . . . . . . . . . . . . . 97
6.1 Classification of household appliances according to power and usage profile . 101
6.2 Total time for analyze/plan (error threshold = 0.1) . . . . . . . . . . . . . . 117
6.3 Total time for analyze/plan (error threshold = 0.01) . . . . . . . . . . . . . . 117
6.4 Input combinations for all three algorithms to generate the initial case-base . 134
6.5 Adaptability and Efficiency Test Results . . . . . . . . . . . . . . . . . . . . 136
6.6 Summary of simulation results for pair-wise testing . . . . . . . . . . . . . . 138
Abstract
Efficient energy management is considered the most important resource for meeting future
energy needs and for the continuation of human progress. One of the most promising methods
to optimize electric energy, the largest energy-saving component in the future, is planning
consumption to maximize the throughput of energy, or demand side management (DSM).
Since domestic consumers contribute up to 50% of electric demand, it is important that
their demand is managed in an optimal way. However, implementing DSM for domestic
consumers is a complex and human-intensive task. Due to these reasons, state of the art
DSM systems for domestic consumers have realized only 5% savings according to surveys.
In this thesis we present a self-managing demand side management infrastructure to handle
these complexities and reduce dependence on human operators while managing demand
for optimal distribution of energy.
This self-managing DSM is composed of three components. The first component is short term
load forecasting to predict household energy needs for the next 24 hours. The complexity
of this task stems from the volatility of individual home energy consumption patterns.
To resolve this we show that, through an innovative forecasting modeling paradigm and the use
of anthropologic and structural data, we can increase our forecast accuracy by as much as
50%. The second component of our self-managing DSM is a load disaggregation mechanism.
This mechanism identifies the devices which can be managed for optimal scheduling. We
show that we can disaggregate the loads with only 3% error. The third component is the
planning algorithm. Since scheduling the devices is an NP-complete problem, an exact
solution for many typical scenarios is not feasible. We first present an aggregating mechanism
to convert the scheduling problem from a binary decision to the frequency domain and then solve
the optimization problem with a bounded error caused by the transformation. Furthermore,
since the size of the problem varies over time, to maximize the exactness of the solution and
reduce the bounded error wherever possible we present Adaptable Optimization, or AdOpt.
AdOpt gracefully scales the system, trading exactness of the solution against computation
time to deliver the best possible plan.
Our results show that the combination of the three techniques can result in up to 30%
reduction in peak power under varied operational conditions.
Acronyms

DSM  Demand Side Management
AdOpt  Adaptable Optimization
DR  Demand Response
HAN  Home Area Network
HVAC  Heating, Ventilation and Air Conditioning
STLF  Short Term Load Forecast
NYISO  New York Independent System Operator
ToU  Time of Use
EES  Electric Energy Storage
DG  Distributed Generation
AI  Artificial Intelligence
ANN  Artificial Neural Network
SVM  Support Vector Machines
SoM  Self-organizing Maps
GPRS  General Packet Radio Service
GSM  Global System for Mobile
FDAP  Forecast-Disaggregate-Analyze-Plan
AMR  Automated Meter Reading
STMLF  Short Term Multiple Load Forecast
MLR  Multiple Linear Regression
KWH  Kilowatt-Hour
MSE  Mean Squared Error
Acc  Accuracy
Var  Variability
NILM  Non-Intrusive Load Monitoring
REDD  Reference Energy Disaggregation Dataset
KHz  Kilohertz
LP  Linear Programming
SAPE  Sense Analyze Plan Execute
BIP  Binary Integer Programming
BP  Binary Programming
CBR  Case-Based Reasoning
CRE  Case-based Recommendation Engine
RM  Runtime Modeler
MT  Mathematical Toolbox
API  Application Programming Interface
CAISO  California Independent System Operator
UP  Unutilized Power
MWH  Megawatt-Hour
DAS  Dynamically Adaptable System
PHEV  Plug-in Hybrid Electric Vehicles
Cid  Consumer id
GARCH  Generalized Autoregressive Conditional Heteroskedasticity
Chapter 1
Introduction
Energy is instrumental in human progress; thus its efficient use is very important for the
continuation of the progress made by humanity. In this chapter we discuss the issues
with efficient energy management, in particular with the usage of electricity by domestic
consumers. We first discuss how demand side management in the existing grid can increase
the efficiency of energy usage, and the reasons it has failed as a viable option. We
introduce the smart grid and how DSM within the smart grid can achieve the desired results. We
then introduce autonomic computing as the computational paradigm that can realize DSM
in the smart grid. We close the chapter with a brief account of our contributions in setting up an
autonomic DSM.
1.1 Introduction
Energy is the key to the continuation of human progress. Various studies have shown
that energy consumption is directly proportional to the growth of human civilization
technologically, socially and even at the personal level [Ozturk, 2010]. However, as the need for
energy grows exponentially, the available energy sources are depleting at an alarming rate. In such
a situation, managing available resources to maximize their usage is of critical importance.
Such energy management can reduce energy prices, benefit the environment through a lower
carbon footprint and prolong the existing resources until new sources of energy can be found and
utilized.
Electricity is by far the largest consumer of energy, accounting for 56% of the energy
consumed in the world. Therefore, there are various ways in which electric grids are being made
more efficient. Efficiency in transmission, generation and distribution has been a topic
of research for more than a hundred years. But a very important and effective electricity
efficiency measure, management of the demand for electricity, has been overlooked till now
due to the complex dynamics of electricity production and consumption and also due to the
lack of computing technologies to resolve these complexities.
Energy management is the task of managing the supply-demand equation in the grid.
The goal is to always have sufficient supply to meet the demand. If the supply is at any time
insufficient for the demand, the result is catastrophic for the distribution grids.
An overwhelming demand destabilizes the system. The consequences range from inefficiency in
the generation plants, transmission lines and home devices to the breakdown of plants
and the melting and burning of cables and house appliances. Thus it is imperative that the
supply-demand equation is always maintained in the positive.
Since the demand is produced by hundreds of thousands of devices, it is close to impossible
in the existing grid to control the demand. Therefore, wherever possible, utility providers try
to affect the supply part of the equation through over-provisioning of electricity. But since
the demand for energy varies over a day, a season and years, setting up plants for the maximum
power needed over an entire year is usually not feasible. To illustrate this, figure 1.1 shows
the electricity demand for the state of California in the USA. To meet this demand, different
types of power plants are used. The more efficient production units, in terms of cost and
carbon footprint, are used for the base energy needs. For peak demands, older, costlier units
are commissioned at certain times of the day, since it makes more sense to
use the cheaper producing units more regularly.
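This merit-order style of dispatch (run the cheap base-load units continuously and commit the costlier peaking units only when demand requires them) can be sketched in a few lines. The plant names, capacities and costs below are invented for illustration and are not data from this thesis:

```python
# Illustrative merit-order dispatch sketch (plants and numbers are assumed,
# not real data): cheapest units are committed first, peakers fill the rest.

PLANTS = [                          # (name, capacity in MW, cost per MWh)
    ("coal_base", 600, 30),
    ("gas_combined_cycle", 300, 55),
    ("diesel_peaker", 200, 140),
]

def dispatch(demand_mw):
    """Allocate demand to plants in ascending cost order (merit order)."""
    allocation = {}
    remaining = demand_mw
    for name, capacity, _cost in sorted(PLANTS, key=lambda p: p[2]):
        take = min(capacity, remaining)
        allocation[name] = take
        remaining -= take
    return allocation

# Off-peak demand is met by the base plant alone; the expensive diesel
# peaker runs only when demand exceeds the cheaper units' combined capacity.
off_peak = dispatch(500)
peak = dispatch(1000)
```

Under this toy dispatch, the 500 MW off-peak demand is served entirely by the base unit, while the 1000 MW peak forces the diesel peaker online, mirroring the higher cost and carbon footprint of peak supply discussed above.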
Furthermore, peak power is unpredictable to a degree. This requires utilities to have
instantaneous power production infrastructure, mostly provided through diesel or
furnace oil based generation units. These units, in comparison to standard electricity
production units, drive the generator directly through combustion rather than through steam, as is
the case for the majority of industrial power production units. But energy from these generation
units has the double negative effect of higher cost and higher carbon footprint.
In certain situations, such over-provisioning is not possible due to a lack of generation
resources. In such situations utilities are forced to shut down power to sections of the grid
to balance the power equations. The Indian blackout of 2012 was caused by a severe shortage
of energy [Romero, 2012]. In developing countries, where recent industrial and local
consumption has outstripped the growth of production plants, this lack of supply is a
common phenomenon. This is true for Pakistan, India and China, where the demand
for energy is growing at more than 20% a year while production is growing at 15%. This
leads to scheduled load shedding, sometimes up to 12 hours in a day. It is imperative, then,
that instead of providing expensive and environmentally unhealthy power or cutting power
to consumers altogether, demand is somehow managed in such situations to at least reduce
the effect of peak power. This is called peak shaving or demand shaping. Peak
shaving refers specifically to reducing the load at peak times, whereas demand shaping may also
mean increasing loads at some times and reducing loads at others.

Figure 1.1: Consumption of the State of California, USA, on 2nd April 2013.

There are two strategies used for this mitigation: demand side management and demand response.
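The difference between the two shaping goals can be illustrated with a small sketch. Assuming a toy hourly load profile and an arbitrary capacity threshold (both invented for illustration), peak shaving simply clips the peak, while demand shaping additionally performs valley filling by moving the clipped energy into low-demand hours:

```python
# Illustrative sketch (not from this thesis): peak shaving vs. demand
# shaping on a toy 8-hour integer load profile with an assumed threshold.

def peak_shave(load, threshold):
    """Cut every hour's load down to the threshold; the excess is lost."""
    return [min(x, threshold) for x in load]

def demand_shape(load, threshold):
    """Cut peaks to the threshold, then redistribute the clipped energy
    into the hours with the most spare capacity (valley filling)."""
    shaped = [min(x, threshold) for x in load]
    excess = sum(load) - sum(shaped)
    # Fill valleys greedily, one unit at a time, into the lowest hour.
    while excess > 0:
        i = shaped.index(min(shaped))
        shaped[i] += 1
        excess -= 1
    return shaped

demand = [3, 2, 2, 4, 9, 10, 9, 4]   # toy profile with a peak at hours 4-6
shaved = peak_shave(demand, 6)        # shaving: peak capped, energy reduced
shaped = demand_shape(demand, 6)      # shaping: peak capped, energy preserved
```

The shaved profile serves less total energy (some demand is simply cut), whereas the shaped profile serves the same total energy with the peak moved into the valleys, matching the distinction drawn above.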
1.2 Demand Side Management and Demand Response
Demand side management (DSM) and demand response (DR) are the two primary terms
used in the literature for managing demand in an electric grid. The terms are often used
interchangeably, but a subtle difference exists, as we discuss below.
Demand response is the most commonly used strategy. DR is a reactive process in which, at
the time of peak consumption, demand is curtailed through the explicit shutdown of end user devices.
In almost all cases this shutdown is a blanket one, in that all the devices that participate in
the demand response program are switched off. This sudden drop in demand is sufficient in
some cases to mitigate the peak power issue. However, as the events of the Indian blackout
have shown, for a large scale supply-demand shortfall, demand response can cause a cascading
failure of catastrophic nature.
Demand side management is usually taken as a proactive, user dependent method to curtail
power consumption at peak times over the long run. Gellings, who coined the term in 1976,
stated six goals for DSM, as shown in figure 1.2 [Gellings, 1985]. As can be seen, peak shaving
was one of them. Technically, demand response is thus a form of DSM, but for our purposes we
define DSM as:

A proactive measure to shape demand over a comparatively longer
window (24 hour) period for optimal device usage given supply conditions
and user preferences.
For our study, we define DR in comparison as:

A reactive peak shaving measure to curtail demand during critical
peak demand, which does not explicitly consider user preferences
or long term supply conditions in its planning.
DSM measures until recently have been limited to providing incentive pricing to end users
to shift their loads to lower demand times. This has the double effect of "valley filling" and
"peak shaving" as described by Gellings. Different costing plans have been offered by energy
suppliers, giving incentives to users to reduce their load at specific times of the day. One
example of such an incentive is the power7 plan in the United Kingdom: the energy price for a
fixed 7 hours in a day is higher, but for the remaining 17 hours the price is nominal.
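To make the incentive concrete, the bill under such a two-tier time-of-use tariff can be computed as below. The rates, the peak-hour window and the consumption profiles are assumptions for illustration only, not the actual power7 figures:

```python
# Illustrative sketch: cost of one day's consumption under a two-tier
# time-of-use tariff with 7 expensive "peak" hours. All numbers here
# (window, rates, profiles) are assumptions, not real tariff data.

PEAK_HOURS = set(range(17, 24))   # assume the 7 costly hours are 17:00-24:00
PEAK_RATE = 0.30                  # assumed peak price per kWh
OFF_PEAK_RATE = 0.08              # assumed nominal off-peak price per kWh

def daily_cost(hourly_kwh):
    """Sum hour-by-hour cost for a 24-entry consumption profile."""
    return sum(
        kwh * (PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE)
        for hour, kwh in enumerate(hourly_kwh)
    )

# A consumer who shifts load out of the peak window pays less for the
# same total energy, which is exactly the incentive the tariff creates.
flat = [1.0] * 24                 # 1 kWh every hour
shifted = [1.0] * 24
for h in PEAK_HOURS:
    shifted[h] = 0.5              # halve peak usage...
shifted[2] += 3.5                 # ...and move that 3.5 kWh to 02:00
```

Both profiles consume the same 24 kWh, but the shifted one is noticeably cheaper, which is the "valley filling" behavior the supplier is paying for.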
As is obvious from the description, there are various pros and cons to each of the two
methods. Existing DSM plans do not consider the real time energy situation and are rigid.
The timings for energy prices are fixed and remain in force even on days when there is no need for
peak shaving. On the other hand, if a peak occurs at times other than the specifically designated
hours, the utility will be in a fix. Secondly, such methods require the end user to know
exactly when it is beneficial to use energy. Another aspect is that although the energy provider is
giving incentives, it is up to the user to take advantage of those incentives or not. This uncertainty
is not acceptable to utility providers, as they need some sort of guarantee that the demand
will be shaped according to their needs.

Figure 1.2: Goals of DSM as proposed by Gellings [Gellings, 1985].
Demand response historically has been preferred, since it provides a stronger guarantee
of demand, but it has issues such as forcing a user's device to shut down even when the user
might need it. Secondly, to date most such controllers are blanket load shedders: either all of
the controllers switch off the devices they are connected to, or none do. This results in
unnecessarily overshooting the DR targets. Third, the state of the art of this method does not
consider the needs of the end user. A user buys a washing machine with the intention of using
it at specific times. However, if the timings of the user's need clash with peak power regularly,
the machine will be turned off at the most inopportune time. This may have serious
implications for users' acceptance of such methods. Fourth, DR does not plan the power but
rather expects that offloading demand from now to the future will somehow work. This is
sufficient when the peak demand is of short duration and the offloaded workload is relatively
small, but when the peak lasts longer or the offloaded workload is significant, this has a
domino effect, as is the case with load shedding in Pakistan. The critical peak power in
Pakistan extends for several hours and the blanket load shedding defers a significant portion
of the load to the future. But when the offloaded workload comes back online, it again strains
the system. To recover from this new peak, the system again sheds load from some other part
of the network, resulting in a continuously oscillating system which on some days never
reaches a steady state.
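This oscillation can be reproduced in a toy simulation. Assuming, purely for illustration, that all shed load returns in full in the next hour (a simplification of the rebound behavior described above), a single hour of excess demand keeps the system pinned at its capacity limit for several subsequent hours:

```python
# Toy simulation (an illustration, not a model from this thesis) of the
# domino effect of blanket load shedding: any demand above capacity is
# shed, and the shed load is assumed to return, in full, the next hour.

def simulate_shedding(base_demand, capacity, hours):
    served, deferred = [], 0.0
    for h in range(hours):
        demand = base_demand[h % len(base_demand)] + deferred
        if demand > capacity:
            deferred = demand - capacity  # shed load offloaded to next hour
            served.append(capacity)
        else:
            deferred = 0.0
            served.append(demand)
    return served

# One spike above capacity at hour 2 keeps the grid saturated through
# hour 5 while the deferred workload slowly works itself off.
profile = [90, 90, 130, 90, 90, 90, 90, 90]
served = simulate_shedding(profile, 100, 8)
```

In this toy run a single 30-unit overshoot is paid back over four saturated hours; with a longer or larger peak (or load returning in bursts), the deferred demand never fully drains and the system keeps oscillating, as observed in the Pakistani case.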
The lessons learnt from the two programs thus far are:

1. Energy consumption management can result in lower costs and is beneficial.

2. Incentives are a good way to convince end users to manage energy better, but
1) they are harder for the end user to manage, and 2) they do not provide guaranteed
consumption portfolios for utilities.

3. Automated shutdowns are effective, but 1) they have lower acceptance from end users due
to their intrusiveness and lack of understanding of users' needs, and 2) existing methods
(DR) do not effectively manage power but rather put off existing loads for later.
Although DR and DSM are applied in existing grids, their application has not realized
the savings that were anticipated. This has generally been attributed to the overload of
thinking and planning placed on the user by DSM and the authoritarian application of DR.
However, in recent times the advent of the smart grid can provide technologies to get the best
of both techniques. In the next section we first discuss the concept of smart grids
and then discuss how DSM will feature in a smart grid.
1.3 Smart Grid
Existing power grids are complex centralized networks. However, growth in demand
and the requirement for greater grid reliability, security and efficiency necessitated a "quantum" leap
in the way grids are managed [Moslehi and Kumar, 2010]. The proposed changes to the
existing grid required harnessing communication and information technologies. This new
management framework towards a "smarter" grid is now widely referred to as the "smart grid".
The smart grid is an initiative that aims at incorporating IT in the electric grid for more control,
visibility and sustainability. One of the goals of the smart grid is to strengthen and support
demand side management programs [Rahimi and Ipakchi, 2010].
DSM programs benefit from the smart grid initiative both directly and indirectly,
since several smart grid technologies resolve the difficulties of implementing
demand side management. First is the provisioning of variable pricing for consumers.
The second is distributed generation, specifically in micro-grids. The third is the concept of
the micro-grid itself, and the fourth, and most important, is advances in the protocols and
technologies of the home automation network (HAN).
We will first discuss the concept of variable pricing and its implications for DSM. Variable
pricing is the model in which the price of electricity varies based on the actual cost of
production at that time. In most existing grids across the world, the price
of electricity is fixed irrespective of time of use. There are bracket pricing methods, but these
differ from variable pricing. For example, in Pakistan, the first 300 units of a bill cost
half as much as the next 200 units, and any energy beyond 500 units costs three times the
base price. This pricing controls the total consumption of electricity in a billing cycle but does not
charge based on the hour of usage.
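The slab structure just described can be sketched as a tiered billing function. This is a hypothetical illustration: the function name and the nominal base price of 10 per unit are our own, not taken from an actual tariff.

```python
def slab_bill(units: float, base_price: float = 10.0) -> float:
    """Bill under a hypothetical three-slab tariff: the first 300 units
    at half the base price, the next 200 at the base price, and any
    consumption beyond 500 units at three times the base price."""
    bill = 0.0
    # First slab: up to 300 units at half the base price.
    bill += min(units, 300) * base_price * 0.5
    # Second slab: units 301-500 at the base price.
    if units > 300:
        bill += (min(units, 500) - 300) * base_price
    # Third slab: anything above 500 units at triple the base price.
    if units > 500:
        bill += (units - 500) * base_price * 3
    return bill
```

Note that this structure is blind to the hour of use: a unit consumed at 9 AM and one at 9 PM are billed identically, which is exactly the gap variable pricing addresses.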
But energy production cost depends on the real demand at each time of day.
Even the slab system does not reflect the fact that energy produced at, say, 9 AM has a
different price than energy produced at 9 PM. The current billing method averages projected
prices, and this average becomes the flat rate for the bill.
Variable pricing would charge users the price of power that was applicable in that
hour. This would, theoretically, give users an incentive to change their usage habits or pay extra:
in a way, a blunt and crude demand side management incentive. But it is unrealistic to expect
a user to sit all day observing energy prices and to optimize her daily electric consumption
based on hourly energy prices.
Distributed generation, on the other hand, is the futuristic concept in which energy production
is distributed so that small production plants are situated closer to the point of consumption.
This provides better control as well as lower line losses. To manage such smaller
self-contained energy modules, the concept of the micro-grid has been suggested. A micro-grid is a
self-contained grid of smaller size connected to the external grid through some interface. A
micro-grid manages its supply-demand equations internally and presents itself to the conventional
grid as a single demand/supply source. Due to the smaller size of a micro-grid, it is possible
to micro-analyze the load and manage DSM and DR programs within the micro-grid more
efficiently. Secondly, due to distributed generation, the needs of the micro-grid may well be
met independent of the conventional grid, providing a much more fluid and efficient management
framework. However, such fluid and dynamic systems require self-management, since
intelligent automation is needed to manage the real-time micro-grid system.
Lastly, recent advances in wireless technologies and the definition of protocols for the home
automation network (HAN), such as ZigBee, have provided an opportunity for communication
and control of devices for better energy management applications [Alliance, 2006]. Through
these technologies it is now possible to surgically manage individual devices instead of
resorting to blanket load shedding.
The combination of distributed generation and micro-grids, coupled with variable pricing
and control of demand through the HAN, can open many avenues for DSM programs. Reducing
the size of the effective grid and having localized generating resources available make it possible
to realize surgical DSM, where individual devices can be monitored, analyzed and planned
for.
However, so far, research in efficient energy management for smart grids is generally
restricted to managing a single house's energy consumption, where the house is either fitted with
renewable resources or has access to variable pricing, or both. Our findings, on the other
hand, show that savings can be greatly increased if we consider a whole micro-grid for planning.
Even simple synchronization of devices or pooling of renewable energy can greatly enhance the
effectiveness of DSM. To our understanding, one of the reasons for this lack of concerted
effort is the lack of a holistic view of the situation. In this thesis we try to present this
holistic view for implementing demand side management in future smart grids.
1.3.1 DSM in Smart Grid
A networked electric grid with the ability to control end user devices can, in theory, drastically
increase the efficacy of DSM. DSM in the smart grid can potentially exploit demand
elasticity, the natural leverage in device usage. For example, when a user puts his dishes in the
dish washer, the system identifies an opportunity to optimize. It informs the user, through
some method such as a panel on the dish washer, a wall-mounted display, or a mobile
phone, that he usually puts the dishes in at 8 AM and does not use them until 4 PM.
If he allows the system to manage the machine, he can save $X or more every day. The user
clicks to agree and the machine is scheduled to run at the optimal time. When the user
wishes to override this plan for any reason, he simply clicks the override button
on the same panel. On the utility side, the system can forecast the load and the load elasticity,
and schedule the elasticity in such a way that demand is maintained at the optimal level.
However, such a system requires heavy investment from the consumer and the utility.
A more feasible system is an extension of forced load shedding. If, instead of shutting off
the total power, only the high-consumption devices are controlled by the utility, and the utility
offers contracts for minimum acceptable service, then a win-win situation may be attained.
This gives the utility similar control; however, instead of blanket load sheds, with
the smart grid infrastructure the utility can intelligently shed load in such a way that the
service level guarantee is not violated. For instance, a user may require that at certain hours
of the day, say between midnight and 5 AM, the HVAC should not be switched off for
more than 30 minutes in an hour. However, it is too cumbersome, in fact infeasible, for
a consumer to plan the usage of devices based on the global energy demand. What is required
is a level of intelligent management which can aid the consumer in making the best decisions
and then automate these decisions on the devices. According to studies, this is a much
more feasible solution at the present time. To this end, in this thesis we present a system,
an architecture and its associated methods to make this self-managing. 1
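As a minimal sketch of how such a service level guarantee could be validated, the following hypothetical function checks a 24-hour shed plan against a contract of the kind described above. The function name, data layout and default values are our own illustration, not part of the proposed system.

```python
def plan_respects_contract(off_minutes_per_hour, window=(0, 5), max_off=30):
    """Check a 24-entry shed plan (minutes the HVAC is off in each hour
    of the day) against a contract limiting curtailment to `max_off`
    minutes per hour inside the protected window [window[0], window[1])."""
    for hour, off in enumerate(off_minutes_per_hour):
        # Only hours inside the protected window are constrained.
        if window[0] <= hour < window[1] and off > max_off:
            return False
    return True
```

For example, a plan that curtails 20 minutes in every hour respects a midnight-to-5 AM contract, while one that curtails 45 minutes at 2 AM violates it.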
Our hypothesis is that:
If we instrument DSM as a self-managing system then we can increase
DSM's effectiveness and make it practical for utilities to increase
DSM's efficiency and applicability in smart grids.
But to construct such a system, various research challenges exist which hitherto have
not been addressed. The challenges discussed below break our hypothesis down into
research questions from the software and algorithmic perspectives of the cyber-physical DSM
system.
1.3.2 Challenges for DSM in Smart Grid
Our approach to DSM is a pro-active strategy that plans consumption so that the peak
load is preempted. This has the distinct advantage of having a provably optimal solution, and
it can avoid cascading failures such as were observed in the Indian blackout of 2012. However,
pro-active planning for the next 24 hours requires that data for future states be available.
Since energy consumption at future times is not readily available, the first research challenge
for this optimal planning is forecasting household loads. In existing grids, since the
controlling plane is a region, a forecast of the region was deemed sufficient. But since DSM
controls devices in a house, it is essential that forecasts of individual devices, or at least of
1Self-management, or the autonomic system, is a paradigm geared at providing intelligent solutions to replicate human interactions in mundane and oft-repeated tasks. Though this was the initial dream, the state of the art in autonomic computing has not only replaced the human operator but, through its techniques, allowed more diverse and elaborate management of resources than was hitherto possible with human operators.
the house, are available for planning. There are three questions which need to be answered before
using the existing forecasting methods for self-managing DSM.
1. Can current short term load forecasting (STLF) models work efficiently for forecasting
individual household loads?
2. Can additional data enhance the forecasting accuracy of individual consumer loads?
3. Can we find the consumption of the relevant devices by disaggregating household
loads with high accuracy?
In chapters 4 and 5 we attempt to answer these questions by proposing forecasting and
disaggregation methods for our system.
The second research challenge is to construct a plan for the system which balances supply
and demand over a future window at the device level. Various planning algorithms exist
in the literature, but to select the appropriate algorithm two practical considerations need to
be weighed. First is scalability, since potentially hundreds of devices would need to
be planned. Since our planning is built on a forecast which may be inaccurate, the second
practical consideration is the robustness of the planner. Either the planner should make a robust plan
which considers the expected perturbations in the system, or it should be adaptable
enough to re-calculate a solution if the forecast fails over time. Chapter 6 provides methods
to address this concern.
The third research challenge is to make the planning of the second challenge adaptable and
scalable. Since our system constructs a plan for an evolving environment, it should be able
to model the environment at runtime. This means that if devices are added to or removed
from the system, the system is able to adapt its model without explicit intervention by
administrators. Chapter 6, section 3 provides a modeling method to answer this challenge.
Fourth, given the various components that work together to deliver the solution, a defined
architecture should exist to facilitate the development and integration of the components.
In chapter 3 we present an architecture which ties the various cogs together.
1.4 Limitations, Assumptions and Scope
We specifically limit the scope of our DSM system to households and small consumers. As
will be discussed in chapter 2, DSM for industry and large-scale consumers has been applied
to good effect. But with 40 to 50% of the load being contributed by households, DSM for the
domestic consumer is an important target which hitherto has not been looked at in great
detail.
Second, classification studies have identified certain classes of devices which contribute most
to the peak load. These are usually high-energy devices for heating and cooling purposes.
To maximize the impact of the DSM strategy and minimize the impact on the consumer's life,
the system is specifically scoped to model and control the high-impact devices only.
This thesis is particularly interested in identifying the algorithmic and software components
needed for self-managing DSM. To this end, we restrict our argument to the software
components of the cyber-physical DSM system. We assume that a viable communication
medium exists to communicate the consumption of, and plans for, devices. We also assume
that the required legislation and physical deployment of hardware will be carried out for
practical application of the proposed solution.
A limitation follows from this assumption, in that we assume that the data provided by
the devices is perfect. Changes would be required in the planning and forecasting models to
make the system robust to noisy data.
Another limitation of the thesis is related to the data. We have specifically used data from
the California Independent System Operator and from the city of Eskilstuna, Sweden. The results
may not be exactly replicable for other regions, and the techniques may require adequate
modification. Furthermore, the planning is specifically constrained to the tropical hot climate
conditions of the location of the author's university.
1.5 Summary of Contributions
Here we briefly introduce the contributions made in this thesis towards achieving self-managing
demand side management. The contributions are:
Closed loop autonomic DSM framework for smart grids
DSM measures have been around since 1985 [Gellings, 1985]. Gellings' description of DSM
was holistic for the then-existing grid. However, since 2006 most of the research in DSM has been
within the ambit of smart grids. But for each contribution, the authors assume a new, undefined
architecture within which their contribution can work. To our knowledge, in this thesis we
present the first holistic closed-loop DSM architecture which can incorporate a variety of
DSM strategies proposed in the literature and in this thesis.
Furthermore, existing demand side management solutions have been passive in their interactions.
The user is provided with some incentives and is expected to adjust her
energy use according to the incentives. On the utility side as well, it is hoped that the
incentives will be used as much as possible. However, research has shown that this task
is too tedious for end users to perform and results in only partial use of the incentives
[Kim and Shcherbakova, 2011]. Future smart grids are expected to be even more complicated.
In such a scenario, self-management can be critical to the implementation of DSM. To
ameliorate this, we propose autonomic DSM. Autonomic DSM automates parts of the
system to reduce the conscious effort required of the user to avail DSM measures.
Forecasting Household Energy Consumption for DSM
Future smart grids are expected to manage energy consumption at the household level.
To achieve this goal, a good forecast of the controlling plane is required. In this thesis we
show that the existing forecasting methods, and the emergent forecasting strategy proposed
by some researchers, are not very efficient. In comparison, we show that a multi-dimensional
model considering the anthropological and structural aspects of the house is 1. more accurate
and 2. less computationally expensive.
Planning and Modeling for Self-managed DSM
Scheduling a large number of events while satisfying constraints is in general an NP-hard
problem. It has been shown in the literature that demand side management is reducible to a
scheduling problem, and finding an exact solution for large enough systems is not tractable. In most
cases, an approximate algorithm is used to resolve such a problem. However, for small
enough problems the system can be solved exactly.
In this thesis we present a system which is self-managing at two levels.
First, we propose a self-managing DSM system which intelligently automates the management
of devices in a grid in such a way that the user is provided optimal service while the
load curtailment goals are met. While many algorithms have been proposed which achieve
similar goals, our proposed solution, in addition to optimal service, also provides a service level
guarantee which limits the maximum amount of load curtailed for each consumer. In this
way we guarantee fairness to all the consumers of the service.
Second, we propose a self-managing optimization engine which autonomically selects the
correct algorithm based on the system statistics. That is, when the system size is small, the
self-managing engine selects an exact solution method, and when the system size is large or
the time constraints are hard, an appropriate approximate solution is selected. This allows
the system to provide optimal service based on the system dynamics.
To achieve such dynamic algorithm selection, we propose a dynamic system modeling
methodology which constructs a system model at runtime from the available raw data. This
model is then provided to the planning algorithm for optimized scheduling.
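The selection logic described above can be sketched as a simple dispatch function. The threshold, deadline and solver labels are illustrative assumptions, not the actual criteria used by the engine.

```python
def select_solver(num_devices: int, deadline_s: float,
                  exact_limit: int = 50) -> str:
    """Pick a solution method from system statistics. The threshold of
    50 devices and the 10-second deadline are invented for illustration."""
    # Small instances with a generous time budget can be solved exactly.
    if num_devices <= exact_limit and deadline_s >= 10.0:
        return "exact"          # e.g. an integer/linear programming solver
    # Large instances or hard time constraints fall back to a heuristic.
    return "approximate"        # e.g. a greedy or metaheuristic scheduler
```

At runtime the engine would feed the current system model's statistics into such a dispatcher before planning each window.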
Chapter 2
Literature Survey
Building autonomic demand side management is a multifaceted, multidisciplinary undertaking.
Establishing DSM in future grids requires an amalgamation of techniques from domains
across energy management, artificial intelligence and software engineering. In this chapter
we first look at the literature that exists on planning of demand side management systems,
specifically those intended for smart grids. We then look at the literature on short term
load forecasting, followed by a survey of load disaggregation. Finally, we look at the literature
on self-managing systems for optimization and planning of energy systems, to give a
survey of self-managing systems for demand side management.
2.1 Introduction
Electric grids of the future are envisioned to be cyber-physical systems where information
technology components will be integrated with physical hardware components to make energy
cheaper, more efficient, cleaner and perhaps more sustainable. This vision has been given the title
of the smart grid [Coll-Mayor et al., 2007].
One of the primary applications of the smart grid is envisioned to be a more effective
and efficient demand side management program [Rahimi and Ipakchi, 2010]. Due to the
lack of communication and algorithmic support, DSM has historically been applied to demand
shaping for the critical peak loads of industrial consumers [Albadi and El-Saadany, 2008,
Cappers et al., 2010, Strbac, 2008, Saele and Grande, 2011]. Since the industrial units had
the human resources and infrastructural support, such DSM was possible.
Since domestic consumers are the largest electricity-consuming sector, close to 45% in
most regions [of Finance Pakistan., 2009], it has been hoped that DSM can be deployed
for this sector [Rahimi and Ipakchi, 2010]. With the advances in information technology, it
is now possible to achieve this goal, as has been discussed in chapter 1.
However, to deploy a demand side management program for domestic consumers, it is
necessary to resolve four problems. The first is to develop planning algorithms for household-level
DSM. The second is to have sufficient forecasting and measurement algorithms to support
the planning. The third is a way to distinguish which load is being used from this forecast.
The fourth is to have a level of self-management that relieves the consumer of minute decision-making
details.
In this chapter we first discuss the literature that exists on planning demand
side management at the household level. We classify the existing work into three broad
categories based on the criticality of need. We further sub-divide these classes based on different
social and technical parameters. Next we look at the forecasting algorithms that can be used
for forecasting energy in the near future. In the literature, such a forecast is called short term load
forecasting (STLF). We present a survey of existing STLF techniques and categorize them
according to algorithm class. We then present a survey of the load disaggregation
research. Last, we discuss various self-managing energy management systems present in the
literature.
2.2 Demand Side Management
Demand side management is the management of end user consumption to manage the energy
supply-demand equation. It was first proposed by Gellings in 1985 [Gellings, 1985]. Since
then, DSM has been applied in different regions and in different scenarios. There are a
number of studies citing the benefits and pitfalls of DSM programs across the globe. For
example, Walawalkar and colleagues provide an overview of the evolution of DR programs in the PJM
and NYISO markets [Walawalkar et al., 2010]. They also analyze current opportunities that
exist in these markets for DSM expansion. Strbac discussed the benefits of implementing DSM
in the UK [Strbac, 2008]. Saele and Grande presented results of a user-executed DSM strategy
in Norway showing a reduction in energy use at peak times [Saele and Grande, 2011]. Cappers
and colleagues studied demand side management programs in the United States which resulted
in 38,000 MW of peak load reduction [Cappers et al., 2010]. Albadi and El-Saadany provide
a summary of demand response in electricity markets [Albadi and El-Saadany, 2008].
Though these works point to the effectiveness of demand side management, DSM in most
existing energy systems targets large commercial and industrial settings, for a variety
of reasons. Different studies have identified these reasons. Studies such as those by Kim and
Shcherbakova [Kim and Shcherbakova, 2011], Lisovich and Wicker [Lisovich and Wicker., 2008],
Greening [Greening, 2010] and Breukers and colleagues [Breukers et al., 2011] point to different
socio-cultural and economic aspects of this lack of effectiveness. We can classify the
problems in three categories:
1. Technical shortcomings, such as algorithm design, network setup and hardware design.
2. Social aspects, such as users' willingness to participate and privacy issues.
3. Economic issues, such as cost of deployment and benefit-sharing strategies.
Such studies point to the issues that need to be resolved for a DSM program to succeed. To
resolve these issues, various algorithms have been proposed, which we discuss below. One
interesting observation in this regard is that a DSM program's response to the objections
above is strongly related to the criticality or utility of DSM. In some instances the need for
DSM is so severe that social aspects are ignored. But when the need for DSM is less severe,
social aspects such as privacy and satisficing behavior [Sheth and Parvatiyar, 1995] become
significant drivers in algorithm design. Based on need, we classify DSM programs into
three classes: Critical, where DSM is a necessity; Renewable integration, where DSM optimizes
and accentuates the use of renewable energy sources; and TOU optimization, where DSM uses
the utility-provided incentives to increase savings.
2.2.1 Critical DSM
Critical DSM programs are those demand side management programs where energy conservation is
critical enough that end user devices are controlled automatically; otherwise system stability
would be compromised. Since the utility is pro-actively controlling consumer devices, the control is generally
restricted to heavy loads, such as HVAC, to minimize consumer impact. Secondly, since
switching off any device at any time is a very intrusive and perhaps rude method, systems
generally try to limit utility-driven switching. Based on this aspect, three types
of algorithms for DSM have been proposed. The first type is user comfort driven, where the user's preference
is modeled explicitly and the DSM algorithm tries to meet DSM goals within the comfort
bounds of the user. The second set of algorithms is contractually bounded, where the amount of
time a device can be switched off is limited. The third type is explicit feedback, where the user is explicitly
asked to provide preferences or to select from a range of choices for DSM.
User Comfort Driven
The algorithms in this category try to capture the comfort of the user using different models.
Based on these comfort ranges, the DSM algorithm attempts to reduce load whenever needed.
There are various models used to ascertain user comfort. Venkatesan and colleagues
plan by utilizing consumer behavior modeling, considering different scenarios and levels of
consumer rationality while observing the voltage profile and losses [Venkatesan et al., 2012]. Fan
used lessons learnt from research on internet traffic and proposed a DSM where
user preferences were modeled as willingness to pay [Fan, 2011]. Molderink and colleagues
presented a three-tier demand side management system which planned device usage using
forecasts of devices and matching the consumption with the global supply equations
[Molderink et al., 2010]. However, it is unclear how the savings will be achieved. Furthermore,
the DSM program plans only for the forecasted load, and if the consumer behavior
differs from the forecast, then the DSM planning is over-ruled by the consumer demand. This
results in failure of DSM at times of bad forecasts. Du and Lu proposed an HVAC
management plan where the heating and cooling loads are modeled using the thermal dynamics
of heating and cooling [Du and Lu, 2011a]. The algorithm then uses a
comfort index in conjunction with the thermal-dynamically modeled system to plan power cuts.
Daoxin and colleagues first model all the major market participants together with the
constraints of transmission and generation. Then the energy market is analyzed with RER
uncertainties and demand response [Daoxin et al., 2012].
A generalized model of these techniques is provided in figure 2.1. The DSM system
elicits comfort information from the devices, either through rationality arguments, willingness to pay,
Figure 2.1: Comfort Modeling
or through thermal dynamics, etc. This model is used to provide a forecast to the DSM
planning algorithm, which sends a signal to device actuators to implement the DSM scheme.
The power supply from the utility was, in all of these cases, not directly affected.
Contractually Bound Systems
Contractually bound systems are those systems where consumers are provided with different
contracts. These contracts limit the extent of DSM load shedding. Figure 2.2 shows
a model for such a system. The user consumption data is forecasted using some forecaster.
The DSM is provided with the contractual bounds within which it can plan the
DSM strategy. The planner schedules loads such that DSM goals are met while staying
within the contractual obligations. One instance of such a system is presented by Javed and
Arshad, who propose a linear programming solution to manage HVAC loads in
a city. This algorithm models the problem as a series of equations which provide
a fair and service-level-bound consumption for the users while maintaining the load management
goals [Javed and Arshad, 2009a]. Kwag and Kim propose using customer information,
namely the registration and participation information of DR, to provide indices for evaluating
customer response, such as DR magnitude, duration, frequency and marginal cost
[Kwag and Kim., 2012]. Pedrasa and colleagues, on the other hand, limit the amount of load
shed based on its effect on the consumer [Pedrasa et al., 2010]. In both cases the contracts
can vary.
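A heavily simplified sketch of contract-bound curtailment (not the linear programming formulation of [Javed and Arshad, 2009a]) would greedily shed load to meet a curtailment target while respecting each consumer's contractual cap. The function name, greedy ordering and units are our own assumptions.

```python
def shed_within_contracts(loads_kw, caps_kw, target_kw):
    """Greedy sketch: curtail load to achieve `target_kw` of total
    reduction without curtailing any consumer beyond their contractual
    cap `caps_kw[i]`. Returns the per-consumer shed amounts."""
    sheds = [0.0] * len(loads_kw)
    remaining = target_kw
    # Visit the largest loads first so fewer consumers are touched.
    for i in sorted(range(len(loads_kw)), key=lambda i: -loads_kw[i]):
        if remaining <= 0:
            break
        # Never exceed the contract cap, the actual load, or the target.
        cut = min(caps_kw[i], loads_kw[i], remaining)
        sheds[i] = cut
        remaining -= cut
    return sheds
```

An LP formulation would replace the greedy loop with an objective (e.g. fairness or cost) optimized subject to the same cap and target constraints.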
Explicit Feedback
DSM programs in this category expect the user to provide guidelines on his preferences.
Figure 2.3 presents a model of such a system. Here the constraints are explicitly provided by the
consumer through some interface. These constraints are the limitations for the DSM planner,
similar to the contractual case. Various algorithms have been proposed for
explicit feedback critical DSM systems. Escrivá and colleagues evaluated different control
strategies to reduce the cost of HVAC using contractual clauses [Escrivá-Escrivá et al., 2010].
Here the contracts are the explicit guidelines by the user to control the load. Kim and Poor
proposed a Markov decision process based approach for scheduling elastic loads, where elastic and
non-elastic loads are assumed to be known [Kim and Poor, 2011].
Since such management is very cumbersome, some DSM programs reduce the burden on
consumers by simplifying the priority scheme. Ranade and Beal proposed colorPower, where
consumers assign colors to their devices [Ranade and Beal, 2010]. Each color is assigned a priority.
The scheduling algorithm prioritizes the shutdown of devices based on these color-based
priorities. Du and Lu [Du and Lu, 2011b] propose an appliance commitment algorithm that schedules
thermostatically controlled household loads based on price and consumption forecasts,
considering users' comfort settings, to meet an optimization objective such as minimum payment
or maximum comfort.
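In the spirit of such color-based priority schemes, a priority-ordered shutdown can be sketched as follows. The color-to-priority mapping, tuple layout and tie-breaking rule are our own illustration, not the colorPower algorithm itself.

```python
# Lower number means shed first; the mapping is an assumed example.
PRIORITY = {"red": 0, "yellow": 1, "green": 2}

def shutdown_order(devices):
    """Order (name, color, load_kw) tuples so that devices with the
    lowest-priority color are shed first, and within a color the
    largest loads are shed first."""
    return sorted(devices, key=lambda d: (PRIORITY[d[1]], -d[2]))
```

A scheduler would then walk this ordering, switching devices off until the curtailment target is met.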
Figure 2.2: Contractually bound
Mohsenian-Rad and colleagues presented a theoretical result where they posed DSM as
a game and proved that, if the utility can fix prices at certain levels, then the consumers'
choice of strategies admits a Nash equilibrium [Mohsenian-Rad et al., 2010].
This means that, under the constraints defined, if each consumer picks his best strategy then
all the consumers will benefit.
Faria and Vale present DemSi, a demand response simulator that allows a utility
provider to evaluate the effects of demand response and to study demand response actions and
schemes in distribution networks [Faria and Vale, 2011].
2.2.2 DSM for Price Responsive Systems
Demand side management has historically been a passive program where utility providers
cajole or convince users to consume energy in a manner which reduces cost to the utility.
There are two methods generally applied: positive feedback, providing rebates to consumers
who reduce energy at peak demand times, and a negative feedback mechanism which
penalizes consumption at peak timings. However, such systems were rigid, since the timings
were fixed and not based on actual supply-demand, and resulted in very low user response
[Kim and Shcherbakova, 2011]. With the advent of smart grids and the integration of information
technology with the electric grid, it is possible to make these feedbacks reflective of the actual
supply-demand situation. Furthermore, due to automation and large-scale collaboration,
user fatigue can be avoided, which various researchers consider the most important factor
in the lack of response to existing DSM programs [Kim and Shcherbakova, 2011].
The incentives proposed in the smart grid literature are of two types: fixed price incentives,
where the price of electricity is fixed for all days and some automation method is proposed
to reduce user fatigue, and time of use systems, where the price of electricity changes dynamically
based on actual demand and supply. Another category of systems are those where
some storage is available to the user. This storage is used as a buffer for buying electricity
Figure 2.3: Explicit Feedback
at lower-price times, to be consumed, or sold back to the system, at higher-price times. Below
are the various works which fall under these categories.
Fixed Price Incentive
Fixed price incentive systems usually provide a set of prices that are applied by different
utility providers as DSM incentives. There are two types of such systems: global systems, which
propose a price-fixing mechanism, and automation mechanisms, which automate the consumption
to maximize the incentive benefits. Aalami and colleagues, for instance, proposed an incentive-defining
mechanism for demand response programs which penalized customers for
not responding to load reduction messages [Aalami et al., 2010]. In comparison, Moghaddam
and colleagues presented an economic model for the response of consumers based on their flexible
demand [Moghaddam et al., 2011]. They proposed a customer benefit function which is used
for managing the load of each house, responding to the incentives such that the load
reduction goals are achieved while maximizing the consumer benefit.
On the other side, various algorithms have been proposed to automate the consumption to
maximize the benefits of the incentives. Fan proposed a user response algorithm to respond
to fixed price incentives and presented results on a simulated environment [Fan, 2011].
Finn and colleagues proposed a dish washer scheduling algorithm, extendable to
other loads of a similar nature, to exploit DSM incentives [Finn et al., 2012]. Giorgio and
Pimpinella proposed an event-driven Smart Home Controller enabling consumer economic
savings by exploiting the DSM incentives [Giorgio and Pimpinella, 2012]. Rastegar and colleagues
proposed an optimal and automatic residential load commitment (LC) framework
to achieve the household's minimum payment, again in a fixed peak load pricing regime
[Rastegar et al., 2012].
Figure 2.4 provides a model for these systems. The incentives are computed using the
first set of algorithms and provided to the DSM planners. These incentives are constant
over time. The consumption is forecast and the planner uses the incentives to maximize
the consumers' savings.
Time of Use
The more recent advancement is Time Of Use (TOU) pricing, where the utility charges
consumers the price of energy at the time of consumption. This results in a very complex
scenario where forecasting energy prices and adjusting consumption accordingly becomes
too difficult a task for most consumers. Figure 2.5 represents DSM systems of this type.
Here, instead of a fixed incentive, a cost calculation mechanism is added which calculates the
cost of energy instantaneously. To facilitate the consumer, different algorithms are thus
proposed to plan her consumption. Ramchurn and colleagues proposed a decentralized
agent-based algorithm to coordinate deferment of loads through collaboration of agents
[Ramchurn et al., 2011]. Lee and Lee proposed a scheduling algorithm using TOU to reduce
cost and showed positive results on a simulation [Lee and Lee, 2011]. Datchanamoorthy
and colleagues proposed a TOU pricing model for monopolies and applied the algorithm
on a simulation [Datchanamoorthy et al., 2011]. Chen and colleagues proposed a stochastic
algorithm for management of residential loads in response to TOU pricing
[Chen et al., 2012]. Fuller and colleagues [Fuller et al., 2011] and Valenzuela and colleagues
[Valenzuela et al., 2012] proposed simulation setups where they used various economic
market models to observe the effect of TOU pricing and consumers' response following
economic modeling techniques.
Incentive with Storage
A third stream in this research is to use storage devices to store energy at lower prices and
resell or use this stored energy at times of higher prices. Figure 2.6 models such systems. In
these systems a physical storage device is available to the system to store the electric energy.
Figure 2.4: Incentive based
Figure 2.5: TOU based
This is incorporated in the planner algorithm. Some of the interesting works in this domain
have used electric vehicles' batteries as strategic storage. Shao and colleagues proposed
using electric vehicles' storage banks as buffers for scheduling DSM while providing an
interface to accommodate customer choice [Shao et al., 2012]. Xu, Xie and Chen proposed
an optimal electric energy storage (EES) scheduling algorithm which uses day-ahead pricing
and forecasted energy load to schedule the EES. The proposed system then uses a model
predictive controller to adjust the schedule to handle the inaccuracies of the forecast
[Xu et al., 2010].
2.2.3 Distributed Generation Supported by DSM
A new facet of demand side management systems has been the usage of DSM for maximizing
utilization of distributed generation facilities. In most cases this distributed generation is
through renewable sources. The goal of these systems is to maximize the utilization of
renewable resources to reduce the cost of overall energy. Figure 2.7 models the systems of
this category. Here, in addition to the incentives, the system has physical generation units
to supply energy. In most cases there is some storage medium as well, but this is not always
the case. Finn and colleagues proposed a device management system which integrates the
renewable sources to reduce the cost of energy [Finn et al., 2012]. Livengood and Larson
proposed the Energy Box, a smart controller which used stochastic dynamic programming
to schedule device consumption while considering the generation from renewable resources
[Livengood and Larson, 2009].
In some regions, the utility provider buys back extra energy generated from renewable
sources. Gudi and colleagues used this facility for optimal management of distributed
renewable resources where extra energy was also sold to the utility [Gudi et al., 2011].
Xiaohong and colleagues proposed a coordinating strategy for energy sources and loads for
low-energy buildings [Guan et al., 2010]. Jiang and Fei proposed a distributed generation
integration mechanism using hierarchical agents [Jiang and Fei, 2011].

Figure 2.6: Incentive with storage based
2.3 Short Term Load Forecasting
Short term load forecasting is the task of predicting a single system's load for a short future
window [Gross and Galiana, 1987]. The data for STLF is usually in a time series format,
that is, each consecutive value represents the next observed value of the system. The future
window ranges from one hour to a week, and the forecasting granularity varies from
15-minute windows to an hour. We will discuss the application of each technique here. The
underlying systems being modeled vary, but there are a few commonalities in almost all of
them. First, the systems are non-linear. Second, they exhibit diurnal trends, with usually
two peaks in a day corresponding to midday and late evening. Third, since historically
STLF has been applied to forecast the load of an entire grid, the patterns are usually
smooth. Two fundamental classes of techniques have been applied to construct this forecast:
statistical and time series models, and AI techniques. Some researchers have gone further
to combine AI optimization models to tune the statistical as well as AI forecasting
techniques, and proposed hybrid systems.
The aforementioned techniques have been applied with great success in grid-level systems.
However, since the advent of smart grids, forecasting small-scale loads such as those of
micro grids, buildings and houses has also seen growing interest. In this section we will
first discuss the time series and basic AI techniques, followed by the hybrid methods. We
will then discuss STLF techniques for small-scale loads, since this is the forecast that our
planning requires.
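The time-series framing described above can be sketched in a few lines. The following is a minimal illustration, not any cited author's method: each training example pairs a day of hourly lags with the value to forecast one hour ahead. The load values are synthetic and only meant to mimic the diurnal double-peak shape mentioned earlier.

```python
def make_windows(series, n_lags=24):
    """Turn a load time series into (lag-window, next-value) pairs."""
    examples = []
    for t in range(n_lags, len(series)):
        examples.append((series[t - n_lags:t], series[t]))
    return examples

# Two days of synthetic hourly loads (illustrative kW values) with the
# midday and late-evening peaks typical of grid-level STLF data.
day = [2, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10,
       11, 10, 9, 8, 8, 9, 11, 12, 10, 7, 4, 3]
history = day * 2
windows = make_windows(history)
# 48 hourly readings minus 24 lags leaves 24 training examples.
```

Any of the statistical or AI models surveyed below can then be fit to such (window, target) pairs.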
2.3.1 Statistical and Time Series Techniques
Statistical techniques initially relied on smoothing and averaging to build a model. The
hypothesis is that the system has a central tendency which can be captured using these
techniques. Papalexopoulos and Hesterberg used regression for STLF [Papalexopoulos and Hesterberg, 1990]
and Christiaanse used exponential smoothing [Christiaanse, 1971]. Lauret and colleagues
used a Gaussian process model [Lauret et al., 2012]. Irisarri applied Kalman filters for the
same task [Irisarri et al., 1982]. Amjady used an ARIMA model [Amjady, 2001]; however,
the non-linearity of the underlying system resulted in low accuracy.

Figure 2.7: Incentive with renewable with/without storage based

To resolve this issue Hagan used the Box-Jenkins method [Hagan and Behr, 1987] and
Garcia and colleagues applied GARCH [Garcia et al., 2005]. Weron has described various
methods for applying statistical methods in [Weron, 2006]. Although statistical and time-
series methods produce sufficiently good results, they have certain limitations. One of
the reasons is that time-series methods consider only the load data; any other supporting
information, such as temperature, is not considered. This results in inaccurate forecasts
for times when these factors significantly alter the general course of the time series. For
this reason AI techniques have been applied to STLF. Amaral used a smooth transition
periodic autoregressive method [Amaral et al., 2008] as a non-linear method for time-series
forecasting in STLF.
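As a concrete illustration of the smoothing family cited above, the following is a minimal sketch of simple exponential smoothing; the series and the smoothing constant are illustrative assumptions, not taken from [Christiaanse, 1971].

```python
def ses_forecast(series, alpha=0.5):
    """Simple exponential smoothing: the one-step-ahead forecast is a
    weighted average that discounts older observations geometrically."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level  # forecast for the next period
```

The single parameter alpha trades responsiveness against smoothness, which is exactly why such models struggle when an unmodeled factor such as temperature shifts the series.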
2.3.2 Arti�cial Intelligence Techniques
Various AI techniques have been applied to STLF since the early days of the field. The
major focus in the field is on artificial neural networks, but pattern recognition, fuzzy
systems and SVMs have been used as well.
Arti�cial Neural Networks
Artificial neural networks attempt to replicate biological thinking patterns to optimize
and predict systems. ANNs have been used for a long time for STLF. Hippert reviewed the
artificial neural network techniques applied to STLF in 2001 [Hippert et al., 2001]. Since
then different ANN algorithms in different configurations have been applied. Fuzzy neural
networks were applied by Bakirtzis and colleagues [Bakirtzis et al., 1995] and a self-organizing
fuzzy neural network was applied by Dash and colleagues [Dash et al., 1998].
Abdel-Aal used a committee of neural networks for STLF [Abdel-Aal, 2005]. Chen and
colleagues and Yao and colleagues partitioned the source load into components using the
wavelet transform and then trained an ANN for each component for a more accurate forecast
[Chen et al., 2010, Yao et al., 2000]. Amjady and Keynia in comparison used the wavelet
transform but with an evolutionary ANN for STLF [Amjady and Keynia, 2009]. Lauret used
a Bayesian ANN for auto-tuned STLF [Lauret et al., 2008]. AlFuhaid used cascaded neural
networks for STLF [AlFuhaid et al., 1997].
Support Vector Machines
Support vector machines project data into higher-dimensional spaces in order to find
discriminating planes. This method has been used for regression and forecasting as well.
Mohandes showed that SVMs outperform standard auto-regressive models and certain ANNs
[Mohandes, 2002]. Zhang applied a different kernel for SVM with better results [Zhang, 2005].
Chen and colleagues reported on the results of applying SVM to STLF in the EUNITE
competition [Chen et al., 2004].
Since tuning parameters for an SVM has a significant effect on the outcome, most SVMs
are used in some hybrid system where an optimization algorithm tunes the SVM for
forecasting. We will look at these systems below.
Pattern Recognition
Pattern recognition is the task of identifying specific patterns in data. The basic intuition
for pattern recognition in STLF is to find patterns in history and forecast the future based
on these patterns. Dehdashti used this technique for STLF in 1982 [Dehdashti et al., 1982].
Dai and Wang used a similar technique in conjunction with neural networks for STLF
[Dai and Wang, 2007].
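The pattern-matching intuition above can be sketched very simply: find the historical day whose load profile is closest to today's, and reuse the day that followed it as the forecast. The data and the squared-error distance are illustrative assumptions, not the cited authors' formulations.

```python
def nearest_pattern_forecast(history_days, today):
    """history_days: chronological list of equal-length daily load profiles.
    Returns the day that followed the historical day most similar to today."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Only days that have a successor in the history can be matched.
    best = min(range(len(history_days) - 1),
               key=lambda i: dist(history_days[i], today))
    return history_days[best + 1]

# Toy three-hour profiles; day 0 is the closest match to `today`,
# so the forecast is day 1's profile.
days = [[2, 5, 3], [2, 6, 3], [9, 9, 9], [1, 1, 1]]
forecast = nearest_pattern_forecast(days, today=[2, 5, 4])
```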
Other Methods
In addition to the traditional machine learning methods, some researchers have applied other
artificial intelligence algorithms to STLF. Rahman and Bhatnagar proposed an expert-
system-based method [Rahman and Bhatnagar, 1988] where input from experts is elicited
to form a knowledge base and then distance measures are used to identify the appropriate
scenario for forecasting. However, expert-system-based forecasting is not commonly used.
Yang and Huang [Yang and Huang, 1998] used fuzzy logic. Mastorocostas and colleagues
used fuzzy logic with constrained optimization [Mastorocostas et al., 2000] for STLF, with
varied results. Fuzzy logic, though, is used more in simulation than in forecasting. Chen
and colleagues, on the other hand, used an agent-based distributed forecasting mechanism
[Chen et al., 1993].
2.3.3 Hybrid Techniques
Hybrid techniques are those where multiple techniques are used in combination to increase
the accuracy of the forecast. There are four variations: time series techniques tuned by AI,
time series techniques integrated with AI for better results, AI techniques tuning the
parameters of another AI technique, and AI techniques integrated with another AI technique
for better results.
Examples of the first class, where AI tunes a time series model, are the works of Desouky
and colleagues, who use an ANN to tune ARIMA [El Desouky and Elkateb, 2000], and of
Nie and colleagues, who tune ARIMA with an SVM [Nie et al., 2012]. In comparison, He
and colleagues use ARIMA as the base forecaster while an SVM forecasts specific non-linear
points in the data for increased accuracy [He et al., 2006]. Lu and colleagues proposed a
similar system where they used an ANN instead of an SVM to support the ARIMA forecast
[Lu et al., 2004].
There are various hybrid models where an AI technique is used to tune another AI
technique. Carpinteiro and colleagues used a self-organizing map (SOM), a type of ANN,
to optimize another SOM, where the second SOM is used for forecasting
[Carpinteiro et al., 2004]. Hippert and Taylor used a Bayesian inference system to tune an
ANN forecaster [Hippert and Taylor, 2010]. Fan and Chen used a SOM to tune an SVM
[Fan and Chen, 2006]. Sun and Zou tuned an ANN with particle swarm optimization (PSO)
[Sun and Zou, 2007]. Yun and colleagues tuned an ANN with a neuro-fuzzy optimizer
(ANFIS) [Yun et al., 2008]. Pai and Hong tuned an SVM for forecasting through simulated
annealing [Pai and Hong, 2005].
The last category is where an AI technique is integrated with another AI technique
for better results. Dai and Wang used an ANN for forecasting but used pattern recognition
for the patterns where the ANN failed [Dai and Wang, 2007]. In comparison, Jain and
Satish used clustering to partition the data and then forecasted the partitions using an ANN
[Jain and Satish, 2009].
2.3.4 STLF for Buildings and Micro grids
Recent advances in the smart grid have forced a new dimension in STLF research. Previously,
STLF methods were focused on large population sets, but due to the prevalence of smart
grid ideas, research has recently focused on STLF for small-scale systems as well. STLF
for small-scale systems has proven to be a much harder problem than for large-scale systems,
as explained by Amjady and colleagues in [Amjady et al., 2010]. [Amjady et al., 2010]
and [Gurguis and Zeid, 2005a] have proposed solutions which work better than standard
STLF at a micro-grid or building-level granularity. However, the accuracy of these systems
still does not match that of large-scale STLF due to volatility issues.
2.4 Load Disaggregation
Load disaggregation is the task of identifying individual loads by observing the total load of
the system. In essence, we disaggregate the individual loads from the aggregated load that
is reported by the main meter. Zeifman and Roth presented a good survey of the technique
[Zeifman and Roth, 2011]. The load disaggregation systems reported in this survey and
elsewhere, such as by Marceau and Zmeureanu [Marceau and Zmeureanu, 2000], almost
entirely apply load disaggregation for non-intrusive load monitoring or event detection,
that is, to identify loads in the house without instrumenting the devices with sensors
[Hart, 1992]. It is assumed that for such disaggregation a live feed of the load is available.
There are two classes of algorithms for load disaggregation, based on data frequency.
Algorithms for low frequency data, at the rate of one sample per second or slower, are
generally for heavy load detection and use active and reactive power and other macroscopic
parameters [Cole and Albicki, 1998, Farinaccio and Zmeureanu, 1999, Norford and Leeb, 1996,
Powers et al., 1991]. The results from these studies, however, are not very accurate due to
the severe complexity of the problem. On closer inspection, it can be observed that the
larger loads have a higher detection accuracy than the rest. This specific fact is of great
importance to us, as we will see in chapter 5. Algorithms for high frequency data, at
sub-second rates, attempt to find almost all possible loads from the aggregated data
[Chan et al., 2000, Gupta et al., 2010, Leeb et al., 1995, Srinivasan et al., 2006]. Recently,
Lam and colleagues constructed a taxonomy of devices based on load signatures
[Lai et al., 2010]. This strategy has resulted in a drastic improvement of results, and
accuracy of up to 92% has been reported [Hassan et al., 2013].
Our task, in contrast, is to find the events of heavy load usage in the forecasted load, where
the forecasted load is at an hourly frequency. Since we plan to manage only the largest loads
of the house, there is hope that we may be able to identify those loads with a degree of
confidence.
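The combinatorial core of the low-frequency case above can be sketched as a subset search: given the rated powers of the candidate heavy devices, find the combination whose summed draw best explains an observed step in the aggregate load. The device names and ratings below are hypothetical, and real disaggregators use far richer features than rated power alone.

```python
from itertools import combinations

def explain_step(step_watts, ratings):
    """ratings: dict of device name -> rated power (W). Returns the subset
    of devices whose summed rating is closest to the observed step."""
    names = list(ratings)
    best, best_err = (), float("inf")
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            err = abs(step_watts - sum(ratings[d] for d in combo))
            if err < best_err:
                best, best_err = combo, err
    return set(best)

devices = {"ac": 1500, "fridge": 150, "pump": 750}
# An observed jump of 2250 W is best explained by the AC plus the pump.
```

The search is exponential in the number of devices, which is tolerable here precisely because only the few largest loads are considered.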
2.5 Self-managing Energy Systems
Self-managing systems are systems which are able to manage their situation themselves.
By managing their situation it is implied that they are able to self-configure, self-heal,
self-protect or self-optimize under normal operating conditions. By this definition, a
self-managing energy system should be able to identify points of optimization on its own
and reduce its cost or increase its throughput accordingly. This self-awareness should in
theory be the differentiating factor between a DSM algorithm and a self-managing energy
system. However, such self-managing systems are very few. There are two domains in
which such systems have been investigated by the self-managing community. Server farm
operations and their energy needs are by far the more studied area in the self-managing
community; however, a few instances of self-managing systems for home consumers also
exist. In this section we present a survey of these self-managing systems. We will first look
at the server farm solutions, followed by self-managing systems for home consumers.
2.5.1 Server Farms
Reducing the energy load of a server farm holds great importance for the server farm
operator. In most cases server farms host third-party applications. These applications have
varying load. The third-party vendors pay the server farms according to the traffic they
receive; thus it becomes important that resources are scaled according to load as well. Given
these constraints, managing the energy of server farms becomes a very complex and involved
task. A plethora of studies exist on its optimum management, but since the scope of this
thesis is limited to self-managing energy systems for home consumers, we will only mention
some of the key works which represent the domain of server farm energy management
systems. The goal is not to provide an exhaustive list of energy management systems for
server farms; rather, it is to present a holistic picture of what is done in server farm energy
management, to draw parallels with home consumer energy management.
We can divide the self-managing energy systems for server farms into three categories:
those which plan, model or visualize the energy consumption. We will look at each in turn.
Modeling
The main task of a server farm is to respond to requests. Since responding to these requests
expends energy, most of the studies that model consumption have linked requests with
energy consumption. For instance, Yuan and colleagues [Yuan et al., 2011] modeled power
consumption by observing requests in multi-tier service-oriented systems. Similarly, Leite
and colleagues [Leite et al., 2010] proposed a stochastic model for a web-hosting cluster
focusing on control of power and tardiness.
Planning
A server farm or a service provider of a server farm can tune a series of parameters to plan
a more energy-efficient system. The scale can vary from assigning loads to different server
farms, to assigning loads to different machines in a server farm, all the way down to
managing devices on a machine optimally. Various researchers have proposed solutions at
each layer of control. Ilyas and colleagues proposed allocation of loads to server farms to
reduce total energy cost [Ilyas et al., arch]. Deng and colleagues proposed a self-managing
methodology to reduce carbon footprint [Deng et al., ].
At the device or virtual machine level, Shen and colleagues proposed CloudScale for
resource provisioning in the cloud [Shen et al., 2011]. Similarly, Zhang and colleagues
proposed dynamic provisioning in clouds, again at the virtual machine level [Zhang et al., 2012].
In comparison, some researchers planned work allocation by minimizing the cooling
requirements. Das and colleagues used utility functions to plan allocation while keeping the
cooling loads of server farms as the optimization goal [Das et al., 2010]. Vasic and colleagues
proposed thermal-aware scheduling of workloads [Vasic et al., 2010]. These planning efforts
were at the machine level within a single server farm.
A plethora of work exists on device throttling, such as that by David and colleagues, who
propose memory power management via voltage/frequency scaling [David et al., 2011].
A work encompassing all three levels is 1000 Islands, proposed by Zhu and colleagues.
They proposed hierarchical planners which optimized resource allocation at each level for
lower energy consumption [Zhu et al., 2008].
Visualization
Visualization systems provide feedback to human operators and self-managing agents to
observe the energy consumption of a server farm. WattApp, by Koller and colleagues,
provides an interesting self-managing application to visualize energy consumption by
deployed applications. This aids in charging the applications based on power consumption
instead of requests alone [Koller et al., 2010].
2.5.2 Home Energy Management
A self-managing energy system should in principle be able to optimize the energy needs of
the consumer by looking at the consumer's needs, the available power and the tariff, and
then implement this plan with minimal intervention. Following are some self-managing
energy systems for home consumers.
Planning
The most interesting work in this regard is by Beal and colleagues. They first presented
ColorPower [Ranade and Beal, 2010], in which they proposed a stochastic model for
coordinated demand side management. This was followed by ColorPower II, which enhanced
some features of the original system [Beal et al., 2012]. Ramchurn and colleagues also
propose a decentralized DSM based on agent planning [Ramchurn et al., 2011]. However,
the technique is too general, and the simulation too broad and generic, to merit a deeper
study at the moment.
Modeling
Modeling the energy consumption of end users is a very complex task since, compared to
server farms, metrics relating energy consumption to other observable trends are not
available. Javed and colleagues presented a study showing that a latent relationship exists
between anthropologic and structural data and energy consumption [AE10]. Other studies
look at the movement of occupants and energy consumption, such as that by Hoelzl and
colleagues [Hoelzl et al., 2012]. Tarzia and colleagues, in comparison, analyzed display power
management policies against different user preferences [Tarzia et al., 2010].
Chapter 3
System Architecture
Demand side management for home consumers in the smart grid is a complex problem
requiring forecasting, demand disaggregation, planning and control of devices. Different
researchers have proposed algorithms to resolve these tasks. However, very few architectures
provide a way to integrate the different cogs into a single cohesive and integrated
system. In this chapter we first discuss our proposed strategy for self-managing demand
side management for home consumers in future smart grids. We then describe the cogs that
are needed to deliver the functionality and present an architecture within which these cogs
integrate to provide the demand side management services. The details of the cogs are
discussed in the next three chapters.
3.1 Introduction
Demand side management has historically been deployed for large-scale consumers. In the
pre-smart-grid era this made sense, since managing hundreds of thousands of devices was
feasible neither network-wise nor algorithmically. Large-scale consumers provided a simple
interface. The forecasting mechanism needed for such systems only required the total
consumption of the entire grid. Since the demand was smooth and showed low volatility
across the relevant attributes, it was easier to forecast. The demand curtailment goals
could be negotiated with the industrial consumers beforehand, even with explicit human
involvement.
In comparison, managing the demand of hundreds of thousands of household devices is
complex for two reasons: the volatility of the loads and the scale of the problem. Although
technologies are being proposed which forecast loads with a measure of accuracy despite
the volatility, and new algorithms are being developed to plan for large numbers of devices,
a concrete framework or architecture within which these technologies can be integrated and
deployed has not been observed.
In this chapter we describe an architecture to implement demand side management in
future smart grids. Our intention is to show how one can integrate the forecasting, load
disaggregation, modeling, and planning algorithms discussed in this thesis in a single
system, since to our knowledge no such architecture exists within which we can place our
technology. We start our discussion with a description of our target energy system and the
architecture to implement the DSM program. This is followed by the proposed architecture
and a brief introduction of the technologies that work together to deliver DSM. Details of
the technologies are discussed in the subsequent chapters.
3.2 Proposed Strategy
We started with the hypothesis in chapter 1 that if we instrument DSM as an autonomic
system then we can increase DSM's effectiveness and make it practical for utilities to increase
DSM's efficiency and applicability in smart grids. We further limited the scope of the system
to managing the loads of the highest-consuming devices: air conditioners and refrigerators.
It is much more feasible to control these devices as they are fewer in number and are the
only ones which have elasticity of use. Due to their high consumption they have sufficient
impact for the DSM goals. This can be observed as well from a number of DSM systems
discussed in chapter 2 which specifically focus on such devices for their DSM implementation.
The goal is to plan the energy in such a way that it affects the end user in an acceptable
way. Since the DSM requirement is critical, we chose the contractually bound system model.
Our system offers different contractual service-level guarantees to the user. For example,
we offer that no air conditioner will be turned off for more than 30 minutes in an hour. A
user may request a tighter guarantee at certain hours for an additional price. We instrument
the sockets of these devices with GSM chips and relays as actuators to implement the plan.
Our strategy is to plan 24 hours in advance when each device should be scheduled to run.
This schedule considers the supply goals and the service-level guarantees to formulate the
plan. This plan is then propagated to device controllers for execution.
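A simplified sketch of the scheduling idea above: in each hour (four 15-minute slots), shed enough air-conditioner load to meet a supply cap while honouring the contractual guarantee that no unit is off for more than 30 minutes (two slots) per hour. The loads, cap, and greedy rotation policy below are illustrative assumptions, not the planner developed in this thesis.

```python
def schedule_hour(ac_loads, supply_cap, max_off_slots=2):
    """ac_loads: dict of unit name -> kW draw when running. Returns, per
    15-minute slot, the set of units switched OFF; shedding rotates so
    no unit exceeds max_off_slots off-slots in the hour."""
    plan, off_count = [], {name: 0 for name in ac_loads}
    for _ in range(4):  # four 15-minute slots in the hour
        off = set()
        excess = sum(ac_loads.values()) - supply_cap
        # Prefer units that have been off least, to spread the discomfort.
        for name in sorted(ac_loads, key=lambda n: off_count[n]):
            if excess <= 0:
                break
            if off_count[name] < max_off_slots:
                off.add(name)
                off_count[name] += 1
                excess -= ac_loads[name]
        plan.append(off)
    return plan

# Three 2 kW units against a 4.5 kW cap: one unit is shed per slot,
# rotating so every unit honours the 30-minutes-per-hour guarantee.
plan = schedule_hour({"house_a": 2.0, "house_b": 2.0, "house_c": 2.0},
                     supply_cap=4.5)
```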
Although the users are concerned about their privacy, the price differential is so significant
that they are willing to share some data with the utility. However, sharing data among
other users was not acceptable. With tens of houses containing thousands of devices it was
not feasible to control each and every device; therefore we focused on the devices which
consume the most energy. Given the long and severe summers, air conditioning is the
biggest consumer. As per our assumption, most of the houses equip some rooms with split
or window AC units. The load elasticity of air conditioning is sufficiently good that we can
pre-cool as well as slow down cooling at critical times. We also assume that the houses are
already fitted with automated meter reading (AMR). The AMR can use GPRS or GSM to
communicate the monitoring data, as discussed in [Omer et al., 2010].
3.3 System Architecture
The strategy discussed in the previous section requires planning for hundreds of thousands
of devices. To achieve this, however, we first require measurements of the devices for which
we are planning. The device-level energy usage is very volatile, so instead we considered
the option of forecasting the relatively less volatile household load and disaggregating the
high-energy loads from this forecasted load. The architecture of the resulting self-managing
demand side management system is shown in figure 3.1. We call this a forecast-disaggregate-
analyze-plan (FDAP) loop. The detailed description of each step is given in the next three
chapters. The collection infrastructure and actuator motes are outside the scope of the
research work discussed here and are only described to complete the picture.
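The FDAP loop can be sketched as a simple pipeline; every stage below is a stub standing in for the concrete algorithms described in the next three chapters, and all names, baselines and thresholds are hypothetical.

```python
def forecast(history):            # STLF stage: predict the next-day load
    return history[-24:]          # stub: naive "same as yesterday"

def disaggregate(load_forecast):  # pick out the heavy-device component
    return [max(0.0, x - 1.0) for x in load_forecast]  # stub 1 kW baseline

def analyze(device_loads, supply_cap):  # find hours needing curtailment
    return [i for i, x in enumerate(device_loads) if x > supply_cap]

def plan(critical_hours):         # emit actuation commands per critical hour
    return {hour: "defer_ac" for hour in critical_hours}

def fdap_cycle(history, supply_cap):
    """One pass of the forecast-disaggregate-analyze-plan loop."""
    return plan(analyze(disaggregate(forecast(history)), supply_cap))
```

Running one cycle on a day of hourly readings with a single heavy-usage hour yields a plan entry for just that hour; the real system replaces each stub with the corresponding chapter's technique.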
3.3.1 Collection Infrastructure
Self-managing DSM collects its consumption data from automated meter reading (AMR)
[Wallin et al., 2005]. There is a variety of AMR infrastructures available, varying in price,
data collection and transmission frequency, and accuracy. Since our system specifically
targets high-consumption devices only, and that at a coarse granularity, we deployed
MicroTech International's meter, which collects data at one-minute intervals and transmits
the average consumption every 15 minutes. The data is transmitted through an SMS-based
protocol developed over the GSM network. This solution is the most cost-effective, as shown
by Omer and colleagues [Omer et al., 2010]. The data is received by a web-service based
application which processes the data and stores it in a database [Liaqat et al., 2012]. The
data is maintained in the data store, from which it is retrieved for forecasting and analysis.

Figure 3.1: Self-managing demand side management architecture.
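The AMR aggregation described above, one-minute readings averaged into the 15-minute values the meter transmits, can be sketched as follows; the readings are illustrative, not actual meter data.

```python
def quarter_hour_averages(minute_readings):
    """Average consecutive 15-minute windows of one-minute readings (kW)."""
    return [sum(minute_readings[i:i + 15]) / 15
            for i in range(0, len(minute_readings), 15)]

hour = [1.0] * 30 + [2.0] * 30   # one hour of synthetic one-minute readings
```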
3.3.2 Forecasting
As discussed in chapter 2, a plethora of energy forecasting algorithms exist for forecasting dis-
trict wide loads over a period of 24 hours [Rahman and Bhatnagar, 1988, Yang and Huang, 1998,
Chen et al., 1993, Mastorocostas et al., 2000, Hippert et al., 2001, Bakirtzis et al., 1995, Dehdashti et al., 1982,
Dash et al., 1998, Abdel-Aal, 2005, Chen et al., 2010, Yao et al., 2000, Amjady and Keynia, 2009,
Lauret et al., 2008, AlFuhaid et al., 1997, Papalexopoulos and Hesterberg, 1990, Christiaanse, 1971,
Lauret et al., 2012, Irisarri et al., 1982, Amjady, 2001]. Generally these are called short term
load forecasting systems. However, since our control plane consists of individual devices, we
require a forecast which can provide sufficient insight into the consumption habits of
individual devices.
Two questions arise in using this body of city-level STLF knowledge to forecast the energy
load of individual houses or devices. First, can current short term load forecasting (STLF)
models work efficiently for forecasting individual households? Second, does extended data
enhance the forecasting accuracy of individual consumer loads?
We will show in chapter 4 that forecasting with existing STLF models is extremely error-
prone. The traditional model forecasts using a model for a single time series. That is, to
forecast the load of a house, the algorithm takes the historical data of that house and tries
to reconstruct the future loads from it. However, the data of a single house is extremely
volatile, which results in very low accuracy.
For the second question, it was hoped that additional data about the user or the house
might yield a better forecast, but our results show that no significant improvement results
when richer data is used with the existing model [Abaravicius and , 2007, Wilhite and , 2000].
Our initial forecasting analysis thus did not show promising results, and the main reason
was the volatility of the data. The energy profile of a region or a city is relatively smooth,
since different loads attenuate or neutralize one another to give a smooth curve. As the
scope of the system gets smaller, volatility greatly increases. This has been observed and
reported by various researchers [Amjady et al., 2010, Javed et al., 2012]. To our knowledge,
very few solutions have been proposed to cater to this volatility. The most relevant are
Amjady and colleagues' forecasting system for micro grids [Amjady et al., 2010] and Gurguis
and Zeid's STLF method for buildings [Gurguis and Zeid, 2005a]. A recent study by
Ardakanian and colleagues attempted to forecast household loads using Markov models,
but the solution targets a different problem and is barely applicable to the DSM domain
[Ardakanian et al., 2011]. The basic method of forecasting here is to construct a single
model for a single structure, be it a region, a city or a micro grid. However, we observed
that forecasting for a single house becomes intractable due to volatility.
In contrast, we introduce a new modeling method in which we train a single model using the anthropologic and structural attributes of all the houses [Javed et al., 2012]. We then use the anthropologic and structural parameters of a house to forecast its future load. This builds on the basic premise that different houses resemble one another, but that this connection is temporal. For example, houses with school-going children will show a similar rate of change in electricity use early in the morning on weekdays, whereas houses with senior citizens will show a similar pattern around 11 AM. Similarly, houses with good insulation will have lower energy usage during extreme weather than those with ordinary insulation, irrespective of the age group of the occupants. We build a model which leverages these temporal interdependences for a crisper forecast. We call this short term multiple load forecasting, or STMLF. The forecast produced by STMLF is provided to the analyzer for further processing.
3.3.3 Disaggregation
The general problem of identifying individual device consumption from the total consumption is called load disaggregation. There are different algorithms available for this purpose [Marceau and Zmeureanu, 2000, Lai et al., 2010, Liang et al., 2010]. However, there is a major difference between the standard load disaggregation problem and our task. Load disaggregation is usually applied to actual data at high frequency to identify devices in real time. In comparison, we require disaggregation of only the high energy consuming devices from the forecasted loads. Since our forecast is at the granularity of an hour, our load disaggregation is also done on an hourly basis. In chapter 4 we answer the critical question: is it possible to disaggregate the device consumption prediction from the total household load prediction on an hourly scale with high accuracy? We achieve this by modeling the disaggregation problem using a combination of Artificial Neural Networks and Support Vector Machines.
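The combination can be sketched as a two-stage pipeline: an SVM classifies whether the high-consumption device is on in a given hour, and an ANN regresses its consumption when it is. The sketch below uses scikit-learn as a stand-in engine; the feature layout (hour, weekday, temperature, household forecast) and the synthetic data are illustrative assumptions, not the thesis implementation.

```python
# Sketch: hourly device-level disaggregation from a forecasted household load.
# Stage 1 (SVM) decides whether the device is on; stage 2 (ANN) estimates its
# consumption. Features and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic hourly records: [hour, weekday, temperature, household forecast]
X = rng.uniform([0, 0, -5, 0.5], [23, 6, 35, 5.0], size=(500, 4))
device_on = (X[:, 2] > 25).astype(int)            # assume AC runs when hot
device_kwh = device_on * (0.1 * X[:, 2] - 1.5)    # its draw when it is on

on_clf = SVC().fit(X, device_on)                  # stage 1: on/off (SVM)
reg = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000, random_state=0)
reg.fit(X[device_on == 1], device_kwh[device_on == 1])  # stage 2: ANN

def disaggregate(record):
    """Device share of the forecasted hourly household load."""
    x = np.asarray(record, dtype=float).reshape(1, -1)
    return float(reg.predict(x)[0]) if on_clf.predict(x)[0] == 1 else 0.0

print(disaggregate([8, 1, -5.0, 1.0]))   # cold hour: device off -> 0.0
```

Running the classifier first keeps the regressor trained only on hours in which the device actually consumed energy, which mirrors the hourly, device-level framing described above.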
3.3.4 Analysis
As discussed in chapter 1, the demand side management system is a dynamic system in which the number of actively used devices varies over time, even during the course of a single day. If we construct a static model then it must cater for all the devices which could possibly exist in the system. However, as we have discussed, the scheduling problem is NP-complete and requires approximation to arrive at a solution in tractable time. This approximation introduces a drop in the accuracy of the system. To maximize accuracy, we propose a dynamic modeling method which constructs the system model at runtime. The system dimensions, or its level of approximation, are identified by evaluating the system size (the number of devices requiring management), the time available for scheduling, and the historical record of computational time for different system sizes. The details of this analysis mechanism and modeling method are discussed in chapter 6.
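The selection logic above can be sketched as a lookup over historical timing data: given the current number of devices and the time available before the schedule is due, pick the most accurate model that history suggests will finish in time. The candidate levels and timings below are illustrative assumptions, not measurements from the thesis.

```python
# Sketch: pick a level of approximation such that, based on historical
# runtimes, planning finishes within the available time budget.
# The timing table and candidate levels are illustrative assumptions.
HISTORY = {  # level -> seconds per 1000 devices (averaged from past runs)
    "exact":        120.0,   # integer program, only viable for small systems
    "clusters_100":   4.0,   # LP over 100 clusters
    "clusters_10":    0.5,   # LP over 10 clusters (coarsest)
}

def choose_level(num_devices, seconds_available):
    """Return the most accurate level expected to finish in time, or None."""
    scale = num_devices / 1000.0
    for level in ("exact", "clusters_100", "clusters_10"):  # accuracy order
        if HISTORY[level] * scale <= seconds_available:
            return level
    return None

print(choose_level(500, 90))       # small system -> exact
print(choose_level(200_000, 120))  # large system -> clusters_10
```

The same shape of lookup generalizes to any set of approximation levels, since the analysis phase only needs a monotone trade-off between accuracy and expected runtime.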
3.3.5 Planning
The planning component takes the dynamically analyzed and constructed model and constructs a plan accordingly. For our system we optimize this plan to maximize the usability of energy; that is, our system maximizes the number of devices that can be used while staying within the electricity supply limit. To build this model we used three different algorithms: an integer programming formulation for the exact solution, and two variants of linear programming for the clustered solution, namely the interior point method and the simplex method.

If an exact solution is required then we use an integer program, similar to a two-dimensional knapsack problem, to decide the schedule. If the approximate algorithm is chosen, then as an approximation we apply a clustering algorithm to combine loads which adhere to similar contracts. The goal of clustering is to combine loads which share their profile. This transforms our problem from a binary selection problem to a frequency evaluation problem: instead of deciding the on or off state of each machine, we calculate the number of machines which should be switched on in each of the clusters. The result can be computed using a linear program and the real-valued results rounded off. This adds a round-off error; however, we proved that this error will be less than 6%. On the other hand, this increased our scalability to hundreds of thousands of machines, whereas the integer programming solution was limited to hundreds. The choice of clustering and the error threshold for clustering are decided in the analysis phase, since they have ramifications on the total time of execution. The generated plan is passed on to the actuators, which are dispersed in the users' homes to enforce demand side management. This algorithm is discussed in detail in chapter 6 section 1.
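The clustered relaxation can be sketched with SciPy's `linprog`: one decision variable per cluster counts the machines switched on, the objective maximizes the total count, and the supply limit is the single capacity constraint; the real-valued optimum is then rounded off. The cluster powers, sizes and supply limit below are made-up illustrative values.

```python
# Sketch: LP relaxation of the clustered scheduling problem.
# Maximize total machines switched on, subject to the electricity supply
# limit; one variable per cluster. Numbers are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

power = np.array([1.0, 2.0])   # kW drawn per machine in each cluster
size = np.array([10, 5])       # machines available in each cluster
supply_limit = 12.0            # kW of supply to stay within

# linprog minimizes, so negate the objective to maximize machines on.
res = linprog(c=-np.ones(2), A_ub=[power], b_ub=[supply_limit],
              bounds=list(zip(np.zeros(2), size)))

counts = np.floor(res.x + 1e-9).astype(int)   # round off the real solution
print(counts.tolist(), int(counts.sum()))     # machines per cluster, total
```

Rounding down keeps the schedule feasible with respect to the supply limit, which is why the round-off error in the text is one-sided and bounded.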
3.3.6 Actuators
The plan constructed through FDAP is distributed to the device-controlling motes. A device controller is simply a relay with a GSM device. Based on the control signal, the air-conditioner or the refrigerator is switched on or off. We developed a series of motes for this purpose. The motes use SMS over GSM for communication and are equipped with a standard controller and a relay to control power; details can be found in [Omer et al., 2010].
3.4 Discussion
In this chapter we have presented an architecture which adapts autonomic computing's Monitor-Analyze-Plan-Execute loop for self-managing demand side management. The architecture integrates the forecasting and disaggregation components with the analysis and planning components. Clean interfaces are defined across the components for a smooth integration of the different cogs needed for self-managing DSM.
Chapter 4
Forecasting Energy Load for Individual
Consumers
In chapter 1 we discussed how demand side management can increase energy efficiency, thereby reducing the cost and carbon footprint of our energy consumption. However, for individual end consumers to be part of this scheme, a dependable forecast of their behavior is a must. In this chapter we answer two main questions for forecasting loads for individual consumers. First, can current short term load forecasting (STLF) models work efficiently for forecasting individual households? Second, do anthropologic and structural variables enhance the forecasting accuracy of individual consumer loads? We will show how a single multi-dimensional model forecasting for all houses using anthropologic and structural data variables is more accurate than a forecast based on traditional global measures. We provide extensive empirical evidence to support our claims.
4.1 Introduction
Demand side management is the process of managing end user consumption. Various DSM planning strategies have been proposed for smart grids, but to implement such planning methods, knowledge of the energy demand at the house level is a must. This requires a short term load forecast for houses, and in some cases even for devices. To this end, in this chapter we propose two unique concepts for short term load forecasting of houses through which the accuracy of house-level load forecasts can increase by as much as 50%. This provides an important cog in our proposed smart grid architecture for demand side management discussed in chapters 4 and 5.
Forecasting for larger loads, such as a city or the entire grid, has been achieved with relatively high accuracy [Alfares and Nazeeruddin, 2002]. But for smaller populations, such as a building or a micro-grid, the dynamics change so drastically that standard STLF tools require certain re-adjustments [Amjady et al., 2010]. For even smaller consumer groups, such as individual houses, the volatility in dynamics is even more pronounced, as can be seen from the discussion in section 4.2. To forecast for such a system we need to look at STLF modeling, tools, and data. We answer two pertinent questions in this chapter to engineer these re-adjustments for STLF of individual houses. First, can we forecast energy load using the existing short term load forecasting model? Second, is the knowledge used for existing forecasting models sufficient?
Kim and Shcherbakova point to the lack of data about users as a cause of failure for DSM and DR programs, but our initial results showed that the simple correlation between user and house characteristics is weak and that the strongest influence on demand is weather [Kim and Shcherbakova, 2011]. This was observed on anthropologic and structural data collected from 205 houses in Eskilstuna, Sweden. However, we observed a subtle relationship between user characteristics and consumption.
To observe and use this relationship for forecasting, we trained a single model for the 205 houses and used the richer dataset as the differentiating factor between houses. In essence, our model is a short term forecasting model for multiple loads.
To illustrate how this works, let us take the example of two houses, one with school-going occupants and the other without. The bulk of energy consumption in both cases will be driven by the weather pattern: the colder it is, the more energy will be used. But for the house with school-going occupants, the energy usage in the early hours of weekdays will differ from the other. Furthermore, this will be common to all houses with school-going children.
The idea is that we train a single multi-dimensional model using the data from all the houses. On its own this would mean that the forecast is the average load for each hour across all houses. This is where our second contribution comes in: we augment this single model by adding the anthropologic and structural data. This additional information allows the modeler to form sub-groups within the model for particular anthropologic and structural population groups. In our example, the modeler will be able to identify the relationship between a house having school-going occupants and extra energy consumption in the early hours on weekdays. This allows the model to add a premium to consumption over what the weather pattern alone would forecast. Since all houses with school-going children will have similar trends, if a single house shows a different trend, for instance because a child is sick and missing school, then a global modeler will not over-fit the model and will still forecast accurately when the local temporal phenomenon expires. Note that an exponentially large number of sub-classes exists for the population, but a combined model adds and subtracts premiums over the base forecast to derive a crisper load forecast for each house.
This modeling method is inherently different from modeling each house independently (STLF). It is also different from modeling all the houses without the anthropologic and structural data. We would like to stress here that the forecasting engine (ANN and MLR) is not part of the contribution. The contribution is the new modeling paradigm, short term MULTIPLE load forecasting (STMLF), and the use of anthropologic and structural data within STMLF. As we will show, this combination increases forecast efficiency for both AI-based and statistical forecasters. To stress the improvement due to our contribution and avoid engine-specific enhancements, we use the simplest of statistical and AI forecasters.
The rest of the chapter is organized as follows. In section 4.2 we discuss the problem and show, through various results, the volatility of the data which necessitates the use of anthropologic and structural data under STMLF. In section 4.3 we introduce STMLF as an extension of the basic STLF model. In section 4.7, implementation issues for STMLF are discussed and STLF techniques which can be complementary to STMLF are identified. Next, the experimental setup, including a discussion of the data, the forecasting engine, and the measurements used to verify the correctness of the forecast, is described in section 4.4. In section 4.5.1, results gained through the use of anthropologic and structural data are shown, followed by results of STMLF in comparison to multiple STLFs in section 4.5.2, followed by the conclusion and future work.
4.2 Problem Description: Issues in House Level Forecasting
When we look at forecasting for a large number of aggregated loads, we see that it is significantly simpler, since loads within a large system tend to neutralize or attenuate the total demand. For individual loads this is not the case, which results in highly volatile load data that is difficult to forecast. Even when we consider loads across the population at a particular time instance we see a large variation. If we consider the standard time domain volatility measure, the standard deviation of the rate of change, we find that the volatility of individual loads is an order of magnitude greater than that of a traditional grid [Amjady et al., 2010]. To illustrate this, table 4.1 compares the time domain volatility measure of individual loads with the micro grid, city-wide and regional grid volatility measures from the study of Amjady and colleagues [Amjady et al., 2010].
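One plausible formulation of this measure, the standard deviation of the hour-to-hour change normalized by the mean load, can be computed as follows; the exact normalization used in the cited study may differ, and the two series below are synthetic illustrations.

```python
# Sketch: standard deviation of the normalized rate of change of a load
# series, the volatility measure compared in table 4.1. The normalization
# (differences divided by the mean load) and both series are assumptions.
import numpy as np

def volatility(load):
    """Std of the step-to-step change, normalized by the mean load."""
    load = np.asarray(load, dtype=float)
    rate_of_change = np.diff(load) / load.mean()
    return rate_of_change.std()

smooth_city = 100 + np.sin(np.linspace(0, 2 * np.pi, 24))    # aggregated load
single_house = np.array([0.2, 3.1, 0.4, 2.8, 0.3, 3.5] * 4)  # volatile house

# A single house is far more volatile than an aggregated profile.
print(volatility(single_house) > 10 * volatility(smooth_city))  # True
```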
Though this volatility in individual loads depends on global phenomena such as time of day and temperature, loads vary significantly even within the same hour. This is illustrated by the box and whisker plot in Figure 4.1 for a 24 hour period of load data from a single day for 204 houses in Eskilstuna, Sweden. The whiskers show the maximal value in a given hour and the box encloses 50% of the total data (the top edge represents the 75th percentile, the bottom edge the 25th percentile, and the middle line the median). If we construct a model using only global phenomena then, irrespective of the forecasting engine, there is no way to differentiate between loads which are close to the mean and those which are not. This causes a drop in accuracy and an increase in mean square error (MSE).
From such systems it can be inferred that any model trained only on global variants is bound to mis-forecast a significant number of individual loads, since global variants can only identify the general trend of the system and do not provide sufficient discriminants to differentiate between individual loads. To differentiate individual loads, we require a deeper understanding of them.
There are two types of data variables which affect load consumption: anthropologic
System:                    Individual loads | Micro-grid (University of Calgary) | Alberta's power system | Ontario's power system
Standard deviation of
normalized rate of change: 1.82             | 3.83 × 10^-2                       | 1.84 × 10^-2           | 2.69 × 10^-2

Table 4.1: Comparison of the volatility measure of individual loads, micro grid loads and standard grid loads
aspects and structural variants. Anthropologic aspects are occupant characteristics, such as the number of occupants, their ages, and so on, while structural variants capture the physical characteristics of the house. To construct a forecasting model which can differentiate between consumers, we conducted a survey consisting of both anthropologic and structural questions. The details of the survey are provided in figure 4.2.

This questionnaire combines a mixture of anthropologic (column 1), structural (column 2) and pseudo-anthropologic (column 3) questions. These questions are aimed at capturing a variety of information, ranging from the ages of the occupants and their general occupation behavior to the type of walls, heating equipment, covered area of the property, etc.
Beyond the structural and anthropologic data, another important consideration for modeling this system is the scalability of the proposed technique. Since this forecast is needed for each house, it is impractical to have a dedicated forecaster per house. First, this would require significant computing resources for each load. Second, since each forecaster would only have access to the data of a single user, cross-cutting patterns of usage would not assist the forecast. Intuitively, a method which can process various loads together is better suited. In the next section we discuss such a paradigm, which can model multiple time-series. It turns out that such a modeling framework has better accuracy and lower error than an STLF for each individual load, as illustrated in section 4.5.
4.3 STMLF Model
The need for STMLF is born out of the inherent shortcomings of existing short term load forecasting models when forecasting household loads. These shortcomings stem from the fact that, until recently, grid energy control did not provide detailed control of the demand side. The demand side, although made up of individual loads with their own profiles, was considered as a single large chunk.

Figure 4.1: Box and whisker plot of consumer loads over a 24 hour period for 204 houses from Eskilstuna, Sweden. The whiskers mark the maximum load for the hour, the upper and lower box edges are the 75th and 25th percentiles respectively, and the line in the box is the median. The X axis is time at intervals of one hour and the Y axis is load in kWh.

Some researchers acknowledged this
diversity of patterns in load data, but they only used the sub-patterns to forecast the total load of the system and were less concerned with forecasting individual loads. For example, [Tan et al., 2010] and [Nguyen and Nabney, 2010] leveraged this fact by identifying these patterns through wavelet transforms and forecasting the crisper sub-patterns rather than a complex combined pattern. These sub-patterns were then combined to form a single forecast for the entire system. The break-down of the wavelet was also only to the degree needed for large-system forecasting, not to forecast the independent components of the load.
However, for our proposed ADSM in micro-grids the need is no longer for an aggregation of all loads; rather, our interest is to find the individual load value of each house for DSM. But the existing methods used for STLF are explicitly limited to a single time series. There are two options: either we use an existing STLF for each house, or we appropriately transform the STLF model to work for forecasting multiple loads. We found that transforming STLF to STMLF not only increases running-time efficiency but also increases the accuracy of the forecast. To understand this transformation, we will first define STLF as an abstract system and then use this abstract model to explain the transformations that realize STMLF.

Figure 4.2: Classification of survey questions. We classified questions as anthropologic (human centric), structural (building specific), or pseudo-anthropologic, which concern the occupants' impact on or usage of structural facilities.
4.3.1 STLF Operations
To understand the working of STLF and reason about the need for STMLF, we will first make a brief digression into STLF's working at an abstract level. STLF is usually a two-step process. First, an STLF modeler builds a model based on the time-series of consumption. This time-series is usually complemented by other environmental variants which affect energy load. These may include temperature, time of day, season, day of week, etc. In addition, each model requires some tuning parameters and constants, such as weights for the algorithm, which are specific to the algorithm and the input data. These are the invariants, or variables which do not change over time.

Formally, we can say that the STLF modeler is a function given by:

STLF(T_(1..j, 0..t-1), P_(0..t-1), E) = M    (4.1)

where T_(1..j, 0..t-1) is the time-series of j environmental variants such as temperature, wind, solar radiance, etc., P_(0..t-1) is the historical time-series of load data, and E contains the local invariants and tuning parameters, such as the weights given to parameters.
For most forecasting engines the input is streamed as a series of data tuples. Each tuple consists of j + 1 + |E| values: j values for the environment variables, one value for the load, and |E| values for the invariants. For example, for the fifth time quantum there will be four tuples representing readings from the first four time quanta, and so on.
Based on this input, STLF creates a model M. M can be simulated such that the effect of the environment variants T_(1..j,t) and the invariants E for a specific time t over this model produces the load P_t. That is:

simulate(M, t, T_(1..j,t), E) = P_t    (4.2)

Here P_t is the forecast for the system at time t. The modeler usually associates the variants with specific load values. This creates a model of the system to be forecasted. When a new forecast is required, the model is simulated by providing it with the variant and invariant data for the forecasting period, and the model simulation produces the load value associated with the input data.
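The two-step process of equations 4.1 and 4.2 can be expressed as a minimal interface: a modeler that consumes (environment, load) tuples to build M, and a simulate step that maps new variant values to a load. The nearest-neighbour lookup below is only a stand-in for a real forecasting engine; the feature tuples are illustrative assumptions.

```python
# Sketch of the abstract STLF interface: build M from historical tuples
# (eq. 4.1), then simulate M with new variant values to get P_t (eq. 4.2).
# The nearest-neighbour lookup stands in for a real forecasting engine.
import math

def stlf(history):
    """history: list of (environment_tuple, load) pairs -> model M."""
    return list(history)  # a real engine would fit parameters here

def simulate(model, environment):
    """Map environment variants for time t to the associated load P_t."""
    nearest = min(model, key=lambda rec: math.dist(rec[0], environment))
    return nearest[1]

# Tuples of (temperature, hour-of-day) with observed load, as in eq. 4.1.
M = stlf([((30.0, 14), 5.1), ((10.0, 7), 2.0), ((28.0, 15), 4.8)])
print(simulate(M, (29.0, 14)))  # closest past conditions -> 5.1
```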
4.3.2 STLF for Independent House Forecast
STLF forecasts are for a single system. To forecast for a number of houses, this translates into having an STLF modeler and simulator for each house. Following the general convention of STLF, the input to each modeler will then be a series of t tuples, where t is the length of the training period. Each tuple will contain the environment variable values, the load of the house, and the invariants for the modeler. There are two problems with this method of forecasting, which we have discussed briefly before and delve into in more detail here.

First, such a large number of modelers requires large computational resources. Either each house requires computing resources to store its data and run a computationally complex model for every forecast, or the utility requires numerous computing resources to achieve this goal.

Secondly, as we pointed out in the previous section, the load curve of a house is an order of magnitude more volatile than any other system that STLF has been applied to. There are two further issues with modeling such volatile systems. First, sufficient data attributes
should be available to discriminate the root causes of volatility, and second, sufficient data should be provided to avoid over-fitting. Over-fitting is the phenomenon in which a forecaster captures outliers, or out-of-the-ordinary incidents, and considers them part of normal operation, thus increasing the forecast error. The first issue relates to the number of data attributes and the second to the number of good examples for each attribute combination.

Applications of STLF to house loads with existing data suffer from both of these problems. As we will show in our experiments, the existing global variants are insufficient to discriminate house loads. This is because the house data is too volatile and the environmental variants of the system are insufficient to associate a load value with the input. This is evident from our evaluation results later, which show STLFs to be ineffective in forecasting the loads of houses.
To illustrate this point further, let us take the example of houses in a neighborhood. We know from previous studies that temperature, day of week and hour of day are the major factors for energy load within the same season. For a forecast of the aggregation of loads, these input parameters are sufficient. But each house has its own anthropologic and structural characteristics. For instance, if one house has school-going children and another has only office-going residents then, though the bulk of the load is decided by global factors, the house with children will start consuming more energy a few hours or minutes earlier than the one without this peculiarity.
Secondly, an STLF at the house level will over-fit the data, since it has insufficient data. If we move this STLF to the neighborhood level then we have sufficient data to avoid over-fitting, but such a model cannot capture the differences in load variations, since it lacks the discriminating attributes to capture the volatility of the sub-systems. The input vector to this modeler is the total (or average) load value, the global variants and the system invariants. The result is a forecast for the average load of all the houses. This is an inaccurate forecast both for the house with school-going children and for those without this peculiar characteristic. Thus we need a forecast which has a considerable amount of data to avoid over-fitting and sufficient attributes to differentiate between different load patterns.
4.3.3 STMLF1
To ameliorate this problem we propose STMLF, a modeling framework for combining multiple time-series. We propose two paradigm shifts from STLF for this model.

First, instead of creating the load model from a single time-series, we use all the available time-series as training data. This is different from a sum of loads, where all the loads are summed and STLF forecasts the sum (or average). Rather, each load and its attributes are passed to the modeler as a tuple. That is, instead of providing one value for each time period, we provide n tuples for each time period, where n is the number of houses. This resolves the issue of over-fitting, since sufficiently diverse data smooths out-of-the-ordinary events.

But just combining the time-series in a single system is not sufficient. As we have discussed above, we need to provide a discriminating attribute for the modeler to associate the learned output value with the input values.
Our first attempt was to use houseIds as the discriminating attribute. Such a model can be expressed as:

STMLF1(T_(1..j, 0..t-1), P_(1, 0..t-1), .., P_(n, 0..t-1), E, houseId) = M1    (4.3)

Here T and E are the same as in single load forecasting, but for each load i a time series P_i is also considered.

The resultant model M1 can be simulated to map the time t, the environmental variants T, the invariants E, and the index of load i to the predicted load P_(i,t). That is:

simulate(M1, t, T_(1..j,t), E, houseId) = P_(i,t)    (4.4)

In this model an input tuple, in addition to the load value, the environmental variants and the system invariants, also contains the houseId flags. For house number x, the xth flag is set to one and the rest to zeros.
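The flag scheme amounts to one-hot encoding the houseId into each input tuple, which is why the tuple width grows with the number of houses. A small sketch of the tuple construction (the environment features shown are illustrative assumptions):

```python
# Sketch: building an STMLF1 input tuple with one-hot houseId flags.
# The tuple width is j environment values + the load + n flags, so it
# grows linearly with the number of houses n.
def stmlf1_tuple(environment, load, house_index, num_houses):
    flags = [0] * num_houses
    flags[house_index] = 1            # the x-th flag set to one
    return list(environment) + [load] + flags

t = stmlf1_tuple(environment=(21.5, 7, 2),  # temp, hour, weekday (assumed)
                 load=1.8, house_index=2, num_houses=5)
print(t)        # [21.5, 7, 2, 1.8, 0, 0, 1, 0, 0]
print(len(t))   # 3 + 1 + 5 = 9
```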
This scheme has two drawbacks. First, houseId is too vague an attribute for the modeler to associate load patterns with. We will demonstrate this empirically in a later experiment, in which our STMLF2 model outperforms the STMLF1 model based on houseId discrimination. We see that the forecast is strongly dependent on the global variants and insufficiently aided by the discriminating attribute.

Second, this scheme is computationally complex and not scalable, as we discuss later in this section. A graphical representation of STMLF using houseIds as discriminants in a neural network is shown in figure 4.3(b). Each input attribute of the tuple corresponds to a neuron of the first layer. The number of neurons grows with the number of houses we try to model, so the size of the neural model is on the order of the number of houses: as the number of houses grows, the neural network grows as well. This neural network analogy is equally applicable to other forecasting models such as Bayesian belief networks, time-series analysis, etc.
4.3.4 STMLF2
Instead of this complex and inaccurate model, our second paradigm shift is to consider richer data for the forecast. This richer data incorporates the anthropologic and structural data discussed in section 4.2, and resolves both of the problems we faced in using independent STLFs and in using a combined model with houseIds.

In this methodology, the modeler is provided with the local invariants in addition to the global variants to construct its model. An input tuple for STMLF2 consists of the j environment variants, the load data for the house, the system invariants and, in addition, the local invariants of the house which correspond to the load P.
STMLF2, considering this richer data, is expressed as follows:

STMLF2(T_(1..j, 0..t-1), P_(1..n, 0..t-1), E, E'_(1..n, 1..k)) = M2    (4.5)

Here E'_(1..n, 1..k) maps the k invariants of each house to its load time series. A forecasting engine will create a model which associates T, P, and E' with the output. Simulating this model is a bit different: instead of providing a house flag, the invariants E'_(1..m) of the houses of interest are used, in addition to t and T, to construct a forecast for all the houses with the E'_(1..m) characteristics.

simulate(M2, t, T_(1..j,t), E'_(1..m), E) = P_(1..m, t)    (4.6)
We will first discuss its graphical representation in the neural network model and then discuss why it is better than STLF and STMLF1. A graphical representation of STMLF using richer data as discriminants in a neural network is shown in figure 4.3(c). Each training record of our model is a tuple consisting of the global variants (hour of day, day of week, temperature, etc.), the house variants (number of occupants, number of school-going children, wall types, etc.), and the load value for that house under those variants. Each input attribute corresponds to a neuron of the first layer, and the trainer associates weights with each neuron.

In this model, weights are assigned to the different input parameters, or their combinations, according to the training data. Temperature and time of day may receive higher weights, but statistics such as the number of children add their weight to the output as well. This weight can be positive or negative, and modulates the temperature-driven load on the basis of local characteristics.

To explain this further, let us reconsider the example discussed above. When the input for the number of school-going children is positive, and the time and date indicate early morning on a weekday, the internal node connected with these input neurons will add positive weight to the output. So for all the houses with these characteristics, in addition to the load forecasted due to weather conditions, an additional load will be added.
Figure 4.3: ANN models for the three forecasters. (a) is the ANN model for a single house, where only the load and global invariants are provided for the forecast. (b) is the ANN model for STMLF1. (c) is the ANN model for STMLF2.
In comparison, houses with no school-going children will only be affected by weather conditions. We add another twist to this example. For houses with senior citizens, consumption may be low early in the morning but high around 10 AM. For houses containing senior citizens, a load will be added to the base load at 10 AM; for those with school-going children, the addition will be at 7 AM. But if a house has both, then it will borrow from both models and will register the specific consumption patterns for both 7 AM and 10 AM. In this way we can potentially construct a model from a subset of houses and use this model to forecast for houses with similar trends and traits.
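The premium-over-base-load behavior described above can be sketched with a single shared model fitted on records from all houses, each record carrying global variants plus house attributes. The least-squares engine and the feature set below are illustrative stand-ins (the thesis uses ANN and MLR engines on real survey data):

```python
# Sketch: STMLF2-style training. One model is fitted on records from ALL
# houses; each record = global variants + house attributes (eq. 4.5), and
# simulation supplies house attributes instead of a houseId (eq. 4.6).
# Feature choice, synthetic data and the engine are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_records = 1000
temp = rng.uniform(-5, 30, n_records)          # global variant
early_weekday = rng.integers(0, 2, n_records)  # global variant
school_kids = rng.integers(0, 2, n_records)    # house attribute E'

# Houses with school-going children draw extra load early on weekdays.
load = 5.0 - 0.1 * temp + 1.5 * school_kids * early_weekday

X = np.column_stack([np.ones(n_records), temp, early_weekday,
                     school_kids, school_kids * early_weekday])
w, *_ = np.linalg.lstsq(X, load, rcond=None)   # fit one shared model

def forecast(temp, early_weekday, school_kids):
    x = np.array([1.0, temp, early_weekday, school_kids,
                  school_kids * early_weekday])
    return float(x @ w)

# Same weather, different households -> different forecasts.
print(round(forecast(10, 1, 1) - forecast(10, 1, 0), 2))  # school-kid premium
```

The single weight vector plays the role of M2: the weather terms carry the base forecast, while the house-attribute terms add or subtract a premium per sub-group, exactly the behavior described in the example.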
4.3.5 Model Complexity
We have discussed three models: first, an STLF for each house; second, STMLF using houseId as the discriminant; and last, STMLF with house attributes as the discriminant.
The complexity of a modeler is generally expressed in terms of the extra time the system takes as input parameters are added. Traditionally, O() (big-oh) analysis considers the worst case, that is, the most time the algorithm can ever take, and this is the academic and industrial standard in the computational sciences. Here we would like to clarify that O(1) does not mean a small execution time. Rather, it means that as we increase the number of houses, the expected worst-case completion time of the algorithm remains unchanged for a single modeler. A scalable modeler is one whose cost is at most some polynomial function of the input variables, since power series or exponential series are intractable for large data sets. Without loss of generality we can take the complexity, or efficiency, of a modeler to be O(x^γ), where x is the number of input parameters and γ is the modeler efficiency.
Considering the complexity of our modelers, an STLF for each house will have the complexity:

O_stlf = n × O(j^γ) = O(n) × O(j^γ)

That is, we will have n STLFs and each STLF will require O(j^γ) computations. Here j is the number of environment variables.

Since j does not depend on the number of houses, j^γ is a constant for a system with fixed E and γ, thus O(j^γ) = O(1):

O_stlf = O(n) × O(1) = O(n)
In comparison, STMLF1 with houseId as discriminant will have

O_stmlf1 = O((j + n)^γ) = O(j^γ + n^γ) = O(1) + O(n^γ)

Here the input parameters are the j environment variables and the n houseId flags. Although we require only one model, since the number of houseId flags grows with the number of houses, the complexity of the system is worse than STLF as long as γ > 1.
The third model is STMLF with k house attributes used to model the load. The worst case analysis for this model is:

O_stmlf2 = O((j + k)^γ)

That is, the forecasting engine's time increases with the number of house attributes and environment variables, but the number of houses does not affect the running time of the algorithm. Since both k and j are constant for a system:

O_stmlf2 = O(1)

This means that this model is not affected by the number of houses. The running time will be similar irrespective of the number of houses that are being modeled and forecasted for.
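As an informal check of these bounds, one can count schematic model evaluations as n grows. The cost functions below are stand-ins that plug arbitrary assumed constants (γ = 2, j = 5, k = 10) into the O(x^γ) abstraction; they are not measurements of any real forecasting engine.

```python
# Schematic cost counts for the three modelers, using the O(x^gamma)
# abstraction from the text with arbitrary illustrative constants.
GAMMA, J, K = 2, 5, 10   # modeler efficiency, env. variables, attributes

def cost_stlf(n):
    # One model per house: n * O(j^gamma) -> linear in n.
    return n * J ** GAMMA

def cost_stmlf1(n):
    # One model, but houseId flags grow with n: O((j + n)^gamma).
    return (J + n) ** GAMMA

def cost_stmlf2(n):
    # One model over a fixed attribute set: O((j + k)^gamma), flat in n.
    return (J + K) ** GAMMA

for n in (100, 1000, 10000):
    print(n, cost_stlf(n), cost_stmlf1(n), cost_stmlf2(n))
```

Running this shows STLF growing linearly, STMLF1 growing super-linearly (since γ > 1), and STMLF2 staying flat, matching the derivation above.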
4.4 Experimental Setup
This section discusses the experimental setup. First our forecasting engine is described, followed by the measures used to assess the effects of STMLF and richer data on forecasting. We then discuss the anthropologic and structural data that was collected for this experiment.
4.4.1 Forecasting Engine
We discuss in section 4.7 in detail the issues with existing forecasting methods for forecasting in a multivariate environment. The issue is with building a multidimensional model in higher dimensions. Secondly, our focus in this chapter is to show the efficacy of our modeling paradigm and the effects of richer data on forecast. For this reason we select the two base forecasting algorithms used for forecasting, namely regression and neural networks. It is easy to see that since most state-of-the-art forecasting engines are extensions of these two basic engines, a proof of increased accuracy on the archetypical engine implies effectiveness of STMLF with richer data for the enhancements as well. For neural networks we use the basic resilient back propagation algorithm of ANN proposed by Riedmiller and Braun [Riedmiller and Braun, 1993]. For regression we use multiple linear regression (MLR) since it is able to construct a model for a multivariate input stream.
The forecasting engine is constructed in Matlab. A three-layered back propagation neural network is trained on three weeks of data. The ANN consists of three layers: an input layer (L1) of 60 neurons representing the input, a second layer (L2) of 20 neurons, and an output layer of a single neuron representing the forecast. The trained model is used to forecast the power load for each hour of the next day.
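A rough sketch of this 60-20-1 network is given below in NumPy. Plain batch gradient descent stands in for the resilient backpropagation (Rprop) update used in the thesis, and the training data is synthetic; only the layer sizes follow the text.

```python
import numpy as np

# Sketch of the 60-20-1 network described above. Plain gradient descent
# replaces Rprop, and the data is synthetic; treat this as illustrative.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 60, 20, 1

W1 = rng.normal(0.0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, 0.1, (n_hid, n_out)); b2 = np.zeros(n_out)

def forward(X):
    h = np.tanh(X @ W1 + b1)          # hidden layer (L2)
    return h, h @ W2 + b2             # linear output neuron: the forecast

def train_step(X, y, lr=0.05):
    global W1, b1, W2, b2
    h, p = forward(X)
    err = p - y
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1.0 - h ** 2)   # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
    return float(np.mean(err ** 2))

# Three weeks of hourly rows with 60 inputs; the target is an arbitrary
# learnable function of the first three inputs (purely illustrative).
X = rng.normal(size=(21 * 24, n_in))
y = 0.1 * X[:, :3].sum(axis=1, keepdims=True)
losses = [train_step(X, y) for _ in range(200)]
```

The training loss falls over the 200 steps, which is all this sketch is meant to demonstrate; a production Rprop implementation would adapt a per-weight step size instead of using a single learning rate.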
MLR was also implemented in Matlab using Matlab's regression toolbox. The toolbox
implements the algorithm proposed by Chatterjee and Hadi [Chatterjee and Hadi, 1986].
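For reference, ordinary multiple linear regression can be sketched in a few lines of NumPy. This plain least-squares fit is a stand-in for the Matlab toolbox step; the Chatterjee-Hadi variant itself is not reproduced, and the data below is synthetic.

```python
import numpy as np

# Plain ordinary-least-squares MLR as a rough stand-in for the Matlab
# regression toolbox step. Synthetic data, illustrative only.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 12))                 # 12 input parameters per row
y = X @ rng.normal(size=12) + rng.normal(scale=0.1, size=500)

Xb = np.column_stack([np.ones(len(X)), X])     # add intercept column
beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # fitted coefficients

pred = Xb @ beta                               # in-sample forecast
in_sample_mse = float(np.mean((pred - y) ** 2))
```

Because the synthetic target is itself linear plus small noise, the in-sample MSE is close to the noise variance, which is the behavior one expects from a well-specified MLR fit.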
4.4.2 Measurements
Measuring success for multiple individual forecasts is more involved than measuring success of a single system. Three measures are usually used for such systems: (1) precision, (2) accuracy and (3) stability or certainty. These measurements are more appropriate when measuring forecasts for multiple objects. Traditional measures such as percentage error and even MSE are not considered the most appropriate for numerous forecasted data, as they can be over-influenced by some very bad examples and can overshadow a good forecast for the majority of the population. For example, if consumption for a house is zero for a particular hour then any forecast other than zero will be infinitely erroneous if we consider percentage error. Similarly a forecast of 0.2 for a consumption of 0.1 will be hundred percent inaccurate though the actual miss-forecast is only 0.1. When we consider numerous forecasts, the more appropriate measure is accuracy, which weighs the number of wrong forecasts against the number of correct forecasts. This will be discussed in more detail below.
Precision
Precision is the measure of how close we are able to forecast to the actual load. To measure precision we use the mean squared error, given by:

MSE_t = (1/n) Σ_{i=1}^{n} (L_{i,t} − P_{i,t})²   (4.7)

where L_{i,t} is the observed load and P_{i,t} is the forecasted load for load i at time t.
Accuracy
Accuracy is the measure of how many correct forecasts the forecasting engine makes. Correctness is a user-defined parameter. It is preferred to define a correct forecast as a value within a percentage range of the actual load. However, for low loads, a percentage range becomes insignificant. For a load of 0.1 KWH, a 20% range would be 0.08 to 0.12, and a forecast of 0.2 would be considered extremely wrong. However, practically a forecast of 0.2 is not very unsuitable provided that such loads are not the majority of the population. To avoid this false loss of accuracy we use two scales to measure accuracy. We set a 15% range of error for accuracy, but if the forecast is smaller than 3 KWH then we consider a range of ±0.5 KWH as the range of acceptable forecast.

So accuracy for time t is given as the fraction of forecasts within the acceptable range:

Acc_t = (1/n) |{ i : (P_{i,t} > 3 ∧ |L_{i,t} − P_{i,t}| ≤ 0.15 · P_{i,t}) ∨ (P_{i,t} ≤ 3 ∧ |L_{i,t} − P_{i,t}| ≤ 0.5) }|   (4.8)
Accuracy is a specifically important measure when evaluating success over multiple forecasts.
Stability
The third measure of correctness is certainty, or stability, that is, the variance in error. It is given by:

var_t = Σ_{i=1}^{n} (P̄_t − P_{i,t})² / (n − 1)   (4.9)

Here P̄_t is the average forecasted load for time t.
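The three measures can be implemented directly. The sketch below follows equations 4.7 to 4.9; the function names are our own, and the low-load branch of the accuracy rule conditions on the forecast value P as equation 4.8 does.

```python
import numpy as np

# Sketch of the three forecast-quality measures from the text.
def precision_mse(L, P):
    """Eq. 4.7: mean squared error between observed loads L and forecasts P."""
    L, P = np.asarray(L, float), np.asarray(P, float)
    return float(np.mean((L - P) ** 2))

def accuracy(L, P, pct=0.15, low_load=3.0, abs_band=0.5):
    """Eq. 4.8: share of forecasts within 15% of the load, or within
    +/-0.5 KWH when the forecast is at or below the 3 KWH threshold."""
    L, P = np.asarray(L, float), np.asarray(P, float)
    err = np.abs(L - P)
    ok = np.where(P > low_load, err <= pct * P, err <= abs_band)
    return float(ok.mean())

def stability(P):
    """Eq. 4.9: sample variance of the forecasts for one hour."""
    return float(np.var(np.asarray(P, float), ddof=1))
```

For example, a forecast of 0.2 against a load of 0.1 counts as correct under the ±0.5 KWH band, while a forecast of 13 against a load of 10 falls outside the 15% band and counts as wrong.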
4.4.3 Experimental Data Source
The data for these experiments was provided by the Department of Energy,
Building and Environment, Malardalen University, Sweden. We greatly appre-
ciate their e�ort in collecting and sharing this data.
A survey of 204 houses was conducted in Eskilstuna, a small town 100 KM from Stockholm, Sweden. The main goal of the survey was to collect structural data of each house and anthropologic data of its occupants. In addition, these 204 houses were fitted with AMR meters which recorded the power consumed each hour. Weather data was collected from the local meteorological department for forecasting as well. The questionnaire collected from occupants contained the questions discussed in section 4.2. To represent seasonality and season-specific patterns we conducted our experiments over a 7 day period in each season. That is, forecasts were made for a week of January, April, July and October to represent the four seasonal variations.
4.4.4 Experimental Environment
The simulations for the experiments described below were run on an Intel Core 2 Duo processor with a clock speed of 1.3 GHz and 2 GB of memory. Matlab's Neural Networks Toolbox was used to implement the ANN.
4.5 Results
The results in this section show the effectiveness of STMLF with richer data against the use of STLF with richer data and STMLF without richer data. That is, there are three configurations: STLF with anthropologic and structural data, STMLF with global parameters only, and STMLF with anthropologic and structural data. Our claim is that STMLF with anthropologic and structural data is a more robust technique and will have a higher accuracy than the other two combinations. Here the first configuration represents using the existing modeling paradigm with richer data and the second represents our modeling paradigm without the richer data. To validate this claim we run the experiment on two existing forecasting techniques. The decision for these two techniques is based on the evaluation of existing load forecasting techniques discussed in section 4.7. As has been discussed, the goal of these experiments is to show that STMLF outperforms STLF for AI based and statistical based techniques.
4.5.1 AI Based Experiment Results
4.5.2 Multiple STLFs vs. STMLF
This experiment is designed to compare forecasts made by an STLF for each load against an STMLF for the entire population. If STLF is applied to independent consumers then for each consumer an STLF engine must be run in each forecasting cycle. To simulate this we trained and forecasted 204 STLFs, one for each load for each day. For the 28 day testing period we executed 5712 STLFs. The results were compared with an STMLF executed for each day.
To analytically validate our results we compared the results of STMLF with the aggregated output of multiple STLFs using the three measures discussed previously, namely precision, accuracy and stability. Table 4.2 lists MSE, variance and accuracy for the 4 test weeks of STMLF against STLF.
As can be observed, STMLF is more precise than the aggregated STLF results. Average MSE for STMLF is almost 42% lower than that of the aggregated STLFs. Especially in the autumn and winter months, when consumption is relatively higher, MSE for STLF is up to 2.7 times the MSE of STMLF. We can correlate this increase in performance for all three measures to the higher consumption of energy in these months in Sweden.
Similarly, STMLF is more accurate than the aggregated STLFs. This increase is as much as eight to ten percentage points in the autumn and winter months (59.7% vs 48.9% and 59.9% vs 51.2% respectively) due to similar reasons stated above.
Stability of the forecast shows the same trend, as variance for STMLF is lower than for the aggregated STLFs in all four months.
Month     |  STMLF              |  STLF               | Average Load
          | Var   MSE   Acc     | Var   MSE   Acc     |
Jan       | 4.23  1.59  59.9%   | 5.53  2.29  51.2%   | 4.21
April     | 2.70  0.93  52.5%   | 3.08  1.10  49.0%   | 2.21
July      | 1.93  0.62  65.0%   | 2.62  1.12  62.1%   | 1.12
October   | 2.69  0.95  59.7%   | 3.39  2.61  48.9%   | 2.61

Table 4.2: Results of 3 measures of forecast through multiple STLFs and STMLF. In addition the average load for that week is provided to show the relationship between MSE and average load in that week.
As can be seen, STMLF outperforms STLF for each of the 28 test dates across the 4 seasons. STMLF is as much as 17% more accurate on some days, in addition to avoiding the scalability concerns discussed in section 4.2.
Figure 4.4: Mean squared error for the four test weeks (a. week of January, b. week of April, c. week of July, d. week of October) comparing STMLF with multiple STLFs. The blue line is STMLF and the red line is the average MSE of all STLFs. Days of the week are on the X axis and mean squared error is on the Y axis.
4.5.3 E�ect of Anthropologic and Structural Data
This experiment compares results of the STMLF1 model with the STMLF2 model and shows how anthropologic and structural data in the STMLF2 model can increase accuracy of forecast in comparison to the STMLF1 model using house-Ids as the discriminating attribute. Figure 4.5 shows scatter plots of forecast against actual load for the seven day period in January. For each day, two scatter plots are presented. The top scatter plot presents the forecast using anthropologic and structural data whereas the bottom graph shows the scatter plot of forecast through house-Id only. As anticipated, since house-id is insufficient to differentiate houses, only the global variants play an active role in forecasting. This is evident from the lower scatter plot, as the forecasts resemble a horizontal line; that is, for each day, the STMLF1 forecaster predicts the mean load of the day for all the (house, hour) combinations. In comparison, through anthropologic and structural data a crisper model is created which is able to differentiate the inherent variation in loads, and thus the forecast is closer to the (x = y) line representing correct forecast.
To analytically validate our results we conducted three tests of correctness, that is, precision (MSE), accuracy and stability (variance). Figure 4.6(a-d) plots the MSE for each day of the four experimental weeks. As can be observed, the MSE of STMLF2 is always better than that of STMLF1. Table 4.3 provides average MSE for the four weeks. Average
Month     |  STMLF2 (anthropologic and structural) |  STMLF1 (control)   | Average Load
          | Var   MSE   Acc                        | Var   MSE   Acc     |
Jan       | 4.23  1.59  59.9%                      | 7.31  3.39  36.4%   | 4.21
April     | 2.70  0.93  52.5%                      | 3.69  1.57  35.2%   | 2.21
July      | 1.93  0.62  65.0%                      | 4.86  1.12  49.5%   | 1.12
October   | 2.69  0.95  54.7%                      | 5.26  1.92  37.6%   | 2.61

Table 4.3: Results of 3 measures for forecast based on the model constructed through anthropologic and structural data and forecast based on house-Id. In addition the average load for that week is provided to show the relationship between MSE and average load in that week.
Figure 4.5: Scatter plots of forecast against actual load for the 7 day test period of January. The top plot in each figure is the forecast through structural and anthropologic data and the bottom one uses house-Id as discriminant. In all figures the actual load is on the X axis and the forecast is on the Y axis.
MSE for the 28 day period for the STMLF2 model was 1.02 as compared to 2.0 for STMLF1. This means the richer data consisting of anthropologic and structural data reduces error by close to 50%.
Similarly, accuracy showed improvement with richer data. As can be seen from the data listed in table 4.3, the forecast using the STMLF2 model is on average fifteen or more percentage points better than that of STMLF1. If we consider accuracy across the 4 weeks, the total accuracy for the STMLF2 model is 58.25% and for STMLF1 is roughly 40%.
The third test for correctness of forecast is its certainty. A forecast with low variance means high stability, which entails a more meaningful or trustworthy forecast. Table 4.3 lists the variance of both experiment sets for the 4 experimental weeks. As can be observed, variance is high in both cases, as is expected for forecasts of individual loads due to the volatility of the underlying system. However, the variance of the STMLF2 model is nearly half of the STMLF1 output (0.63 and 1.02 respectively for the entire experiment). This shows that the STMLF2 model using anthropologic and structural data increases the accuracy, precision and stability of consumer load forecasts.
4.6 Discussion on Miss-Forecasted Combinations
The results in previous section compares and contrasts the use of anthropological and struc-
tural data against the use of global variants as well as application of our modeling framework
-STMLF- for end user load forecasting. In the results of experiments we observed that a small
proportion of [load,hour] combinations contributed signi�cantly more to miss-forecasting.
For example consider the test result for the week in July as shown in table 4.2. For 204
houses, we have 204× 24 = 4896 [load,hour] combination for each day. For 7 day period the
total data points are 34272 [load,hour,day] combinations. With close to 35% error, we have
11698[load,hour,day] combinations which are miss-forecasted.
Figure 4.6: Mean squared error for the four test weeks (a. week of January, b. week of April, c. week of July, d. week of October). Days of the week are on the X axis and MSE values on the Y axis.
Focusing on the recurrence of error on each day, we observed that many [load,hour] combinations are repeated on each day of the week. That is, a load x at hour y is miss-forecasted on more than one day. Table 4.4 lists the recurrence frequency of [load,hour] combinations over the 7 day period. Here column 2 is the count of instances where load x at hour y fails for exactly z (the value in column 1) days in the week. Column 3 is the cumulative percentage of combinations failing z or more days out of the daily combinations (e.g. 188 out of 4896), and column 4 is the cumulative percentage of the week's errors due to these instances (188 × 7 out of the 11698 total errors).
As can be observed, 188 combinations are miss-forecasted on each of the 7 days. This means that a certain house x is miss-forecasted at time y on each of the 7 days, and there are 188 such instances. If we consider [load,hour] combinations which are miss-forecasted for 6 or more days out of 7, or 85% of the time, then we have 426 [load,hour] combinations out of 4896 (or 9%) in this range. That is, for these specific 9% (or 426) [load,hour] combinations, we can be extremely sure that we will miss-forecast the house, since on at least 6 days out of 7 we are miss-forecasting it. The probability of a correct forecast (0.15) is very low. Additionally, since these houses are forecasted incorrectly almost every day of the week, they contribute roughly 24% of the weekly error rate.
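These percentages can be reproduced from the counts in column 2 of table 4.4. The short check below assumes, as the table does, 4896 daily combinations and 11698 weekly miss-forecasts.

```python
# Cross-check of Table 4.4: cumulative population and error percentages
# recomputed from the per-frequency counts (column 2).
daily_combinations = 204 * 24        # 4896 [load,hour] pairs per day
weekly_errors = 11698                # total miss-forecasts in the July week

# combinations miss-forecasted on exactly z days of the week
counts = {7: 188, 6: 238, 5: 379, 4: 583, 3: 727, 2: 819, 1: 908}

cum_pop = cum_err = 0
rows = {}
for z in range(7, 0, -1):
    cum_pop += counts[z]             # combinations missed on >= z days
    cum_err += counts[z] * z         # miss-forecasts those combinations cause
    rows[z] = (100 * cum_pop / daily_combinations,
               100 * cum_err / weekly_errors)
```

Here rows[7] reproduces (3.8%, 11.25%) and rows[6] reproduces (8.7%, 23.5%); the per-frequency counts also sum exactly to the 11698 weekly errors, so the cumulative error column ends at 100%.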
A small yet consistent set of combinations being miss-forecasted so regularly points to some inherent trends in these loads at those hours. Either these trends are not captured by our model or by our data. It may be that some critical anthropologic or structural information was not collected, leading to error, or the pattern of usage was too volatile for our STMLF engine to capture and forecast.
To understand this error further, consider the method of creating the STMLF model. STMLF builds a model based on the time series of multiple loads. It uses the load attributes E′ to build these models, so that different value combinations of the vector E′ are weighted to identify different patterns of loads. These patterns are then simulated to forecast the load under E′ and the global invariants. However, this is under the assumption that the combination of attributes discriminates the various patterns completely, in that variations in E′ are able to explicitly differentiate all the different types of patterns that exist for loads in the system.
However, if the available attributes are not sufficient to differentiate between load patterns then this results in ambiguity in the forecaster. In such a situation the forecaster is not able to differentiate between patterns. For instance, two houses may have all k attributes of E′ exactly the same but differ in some occupant habit, such as working hours; their consumption patterns will then also differ, at least for the hours where their work hours differ. But if this critical load attribute is not collected then the forecaster has no way to distinguish between these loads. Since it is realistically impossible to capture all the attributes that affect a load, it is imperative that some alternate method is identified to mitigate this problem. Here we discuss our observations, which can aid in arriving at a solution for this problem.
Let us label the dominant pattern within a particular combination of values of E′ as Υn and the sub-patterns as Υe. With no way to distinguish Υe from Υn, as shown in the discussion above, the forecasting engine forecasts a weighted average of both Υn and Υe. For Υe, however, this is an incorrect forecast. Efficiency can be increased if we can identify Υe at the time of forecast. We can then separate Υe prior to model creation through different methods and reduce the error caused by the lack of discriminating data.
To identify these patterns various classification techniques can be used. We have performed some experiments using support vector machines (SVM) to identify Υe. Our results are encouraging, but discussion of the SVM model and its effect is beyond the scope of this thesis. The next question is how to forecast for Υe. There are two options that one can apply to these combinations. One is that an appropriate upper or lower bound is assigned to Υe. Another method that seems more promising is to have a multi-level forecaster where Υe is repeatedly removed from the training data and the model is trained on dominant trends only. Various thresholds can be used to implement this cleansing of the input data. However, discussion of such methods is beyond the scope of this work and will be taken up in the future.
Our intention here is to state the issue of ambiguity of patterns even with richer data and to identify methods to mitigate this problem. We leave this as future work.
4.7 Short Term Forecasting Techniques for STMLF
We now discuss the algorithms used for traditional STLF and their application to STMLF. There are three concerns for using a forecaster for STMLF. First, it should be able to handle at least k input parameters. Our results show that this k should be significantly large to distinguish between house characteristics. Second, as is shown in section 4.2, a significant portion of our forecasted data is far from the mean. Therefore, the forecasting technique should not ignore or suppress outliers. Third, the technique should be able to handle a highly volatile system, since consumer loads are highly volatile as discussed earlier.
We now examine various STLF techniques in light of the STMLF requirements stated above and identify which techniques can be used for STMLF. This discussion is important in identifying the forecasting engine that we use for STMLF, since many existing forecasting techniques do not support the computation required for STMLF.
Load forecasting historically has been used to forecast large scale monolithic systems such
# of miss-forecasts in 7 days | Count | Cumulative % of population | Cumulative % of error
7                             | 188   | 3.8%                       | 11.25%
6                             | 238   | 8.7%                       | 23.5%
5                             | 379   | 16.4%                      | 40%
4                             | 583   | 28.3%                      | 60%
3                             | 727   | 43.2%                      | 78%
2                             | 819   | 59.9%                      | 92%
1                             | 908   | 78.4%                      | 100%

Table 4.4: Repeat count of error and cumulative accuracy error for the 7 day period.
as power loads of a city or region, or the cost of energy in a market. There are three fundamental techniques which have been applied to such forecasts for a single system: 1) statistical techniques focused on smoothing and averaging such as regression [Papalexopoulos and Hesterberg, 1990], exponential smoothing [Christiaanse, 1971], Kalman filters [Irisarri et al., 1982], stochastic models [Wang et al., 2011], etc.; 2) time series methods such as the linear univariate model [Cuaresma et al., 2004], ARIMA [Amjady, 2001], Box-Jenkins [Hagan and Behr, 1987], in combination with econometrics models [D. and Uri, 1978], GIGARCH [Diongue et al., 2009], GARCH [Garcia et al., 2005] and hybrid models such as the combination of ARIMA and GARCH using wavelet transform [Tan et al., 2010], etc.; and 3) AI techniques such as ANN [Hippert et al., 2001], ANN with radial basis function [Lin et al., 2010], pattern recognition-based techniques [Dehdashti et al., 1982], expert system-based techniques [Rahman and Bhatnagar, 1988], particle swarm optimization [AlRashidi and EL-Naggar, 2010] and fuzzy system-based techniques [Yang and Huang, 1998], etc.
Recently, due to the prevalence of smart grid ideas, research has focused on STLF for small scale systems. STLF for small scale systems has proven to be a much harder problem than for large scale systems, as has been explained by Amjady and colleagues [Amjady et al., 2010]. Amjady and colleagues [Amjady et al., 2010] and Gurguis and Zeid [Gurguis and Zeid, 2005b] have proposed solutions which work better than standard STLF at micro-grid or building level granularity. However, the accuracy of these systems still does not match that of a large scale STLF due to the volatility of the underlying system.
We will look at each of the three classes of algorithms to identify methods which can be used for STMLF and also point out the reasons why an algorithm is not usable for STMLF. We see that most statistical techniques are not applicable for STMLF for two reasons. First, these techniques are based on smoothing data around the mean. As we have shown in section 4.2, for a large, highly volatile data-set the mean is not a good forecast. Regression, exponential smoothing and Kalman filters thus are not appropriate for such a forecast. Secondly, most of these techniques are not capable of handling the higher input dimensions required for the forecast. This is true for the above methods and the stochastic technique presented in [Wang et al., 2011]. To test our first claim we used multiple linear regression (MLR) for STMLF, since MLR is able to cater for the k dimensions in its model. As expected, the forecast has a high error rate. Table 4.5 shows the mean squared error (MSE) value for each of the 9 days of the experiment. The results show a high MSE, with an average of 2.73 and on some days as high as 3.42. For a value in the range of zero to fifteen, this is relatively very high. When we evaluated the same results from the perspective of miss-forecasted loads, we found that roughly 74% of loads were beyond the acceptable range of the forecasted value.
Figure 4.7 shows the number of miss-forecasted loads over the 9 day period. In this figure the darker shaded part of each bar represents the miss-forecasted loads and the lighter part represents the correctly forecasted loads. As can be seen, for each day a sizeable number of forecasts are beyond our acceptable range when forecasted using MLR.
It is well known that time series analysis techniques are neither scalable to higher dimensions nor effective on highly volatile data [Box and Jenkins, 1994]. Usually time-series analyses are limited to 4 or 5 input variables, which is insufficient for our requirements. For this reason, time series methods such as the linear univariate model [Cuaresma et al., 2004], ARIMA [Amjady, 2001], Box-Jenkins [Hagan and Behr, 1987], in combination with econometrics models [D. and Uri, 1978], GIGARCH [Diongue et al., 2009], GARCH [Garcia et al., 2005] and hybrid models such as the combination of ARIMA and GARCH using wavelet transform [Tan et al., 2010] were not considered for STMLF.
In comparison, AI techniques such as artificial neural networks, through their hidden layers, and SVMs, through their projection into hyper-dimensions, seem much more capable of solving an STMLF model. These techniques are able to identify hidden trends, thereby finding similar trends in different time series. Furthermore, ANNs and SVMs are proven to scale to the dimensional needs of STMLF. However, their ability to handle such a
Day | Mean Squared Error
1   | 2.72
2   | 2.70
3   | 2.51
4   | 2.43
5   | 2.53
6   | 2.53
7   | 3.23
8   | 3.42
9   | 2.56

Table 4.5: Mean Squared Error (in KWH) for the 9 day STMLF using multiple linear regression
Figure 4.7: MLR forecast error for the 9 day evaluation period. Each day has 4896 forecasts. The darker part of each bar represents the miss-forecasted loads and the lighter shade represents the correctly forecasted loads. A correct forecast is a forecast within the range defined by equation 4.8.
volatile data set is still unknown. In section 4.4 we discussed the use of ANN for the experiments comparing STLF with STMLF and quantifying the effect of anthropologic and structural data on consumer load forecasting. In summary, we believe that of the existing short term forecasting techniques only AI methods with the ability to scale in input dimensions are applicable for STMLF.
4.8 Conclusion and Future Work
In this chapter we first introduced autonomic demand side management (ADSM) as a paradigm to provide DSM and DR in micro-grids. We identified forecasting of individual users' loads as an important cog of ADSM and attempted to answer two important questions for making this forecast. The first question is:
Do current STLF models and techniques work appropriately for forecasting individual households, or are adjustments needed in the modeling paradigm for forecasting individual consumer loads?
We found that the STLF model has some shortcomings in forecasting loads of individual consumers. STLF models are built to forecast for monolithic or single load systems. To forecast for hundreds of thousands of loads, an STLF would be required for each load. This poses a scalability problem. To overcome this shortcoming, we proposed a short term multiple load forecasting (STMLF) model which combines individual load time-series into a succinct model for forecasting many loads with a single model. Moreover, we showed through our results that STMLF is up to seven percentage points more accurate than individual short term single load forecasts for each load. Furthermore, we identified techniques (ANN and SVM) which can compute forecasts based on the STMLF model. For our experiments we used a basic ANN algorithm to prove the effect of anthropologic and structural data on STMLF. As future work this ANN engine can be replaced with more sophisticated ANNs to increase the efficiency of the forecast. Our second question was:
Do the anthropologic and structural variables enhance the forecasting accuracy of individual consumer loads?
We showed through experiments that a combination of anthropologic data and structural data of houses can greatly enhance forecasting of an individual consumer's load. This richer data can reduce error by up to 50% in some cases. However, we did not correlate the questions with the efficiency of the system. A more detailed analysis of the effect of anthropologic and structural data on forecast accuracy is required.
Lastly, we made observations regarding the miss-forecasts of STMLF. We observed that a pattern exists which can be exploited to increase the accuracy and precision of this forecast. As future work we are exploring ways to design filters to identify and separate out these miss-forecasts. It remains to be investigated how to mitigate these miss-forecasted combinations once they are differentiated.
In conclusion, we recommend short term multiple load forecasting and the use of anthropologic and structural data for smart grid applications where highly accurate behavior of individual consumers is required, such as in demand response and demand side management.
Chapter 5
Disaggregating Heavy Loads from Forecast
In chapter 4 we described a methodology to forecast household energy loads with higher accuracy than existing methods. However, the demand side management strategy aims at controlling high consumption devices such as heating and air-conditioning units. In this chapter we answer the question: is it possible to disaggregate the device consumption prediction from the total household load prediction on an hourly scale with high accuracy? We will show how we disaggregate the load of these high consumption devices from the total house load forecast. Due to the large difference in consumption when the target device is on and when it is not, we will show that even with the worst forecast error rate reported in chapter 4, we can still achieve an accuracy of 97%.
5.1 Introduction
Load disaggregation is the task of identifying individual device loads by observing the total load of the entire system. Its main application is in the non-intrusive load monitoring (NILM) domain, in which individual load consumption profiles are identified from a single meter installed at the house entry point [Hart, 1992]. NILM is primarily concerned with realtime identification and monitoring of loads. Thus there are certain features in almost all of the load disaggregation algorithms which are specific to this NILM analysis.
First, the algorithms are usually applied to high frequency data. The data frequency ranges from 1 reading per second to 16 KHz, that is, 16 thousand readings per second. Second, all the algorithms are applied on realtime time-series data. This realtime constraint together with the high data rate forces algorithms to be fast and scalable rather than accurate and precise, as is discussed in the survey by Zeifman and Roth [Zeifman and Roth, 2011]. The algorithms vary in their approach and in the scope of devices targeted for disaggregation, but usually they target all the main appliances in the house.
Our system requirements, however, are very different from those of these load disaggregation
systems. First, our data is not realtime measured data but the forecast of the load for the
next 24-hour session. An inherent problem of a forecast is that it will have errors; as we
presented in chapter 4, the error can be as much as a 48% loss in accuracy with a variation
of roughly 20%. Second, we do not wish to disaggregate all the devices; we are only concerned
with the heating and cooling load. Third, our data is at a coarser granularity of one reading
per hour. Lastly, our main concern is not accuracy alone: as we shall see in this chapter, our
goal is to reduce false negatives, since false negatives affect the correctness of our planning
the most. These characteristics make our load disaggregation problem very different from the
load disaggregation algorithms discussed in the literature. Consequently, our technique is
almost incomparable to other disaggregation techniques.
As discussed above, load disaggregation from forecasted data has its complexities. On the
other hand, this form of load disaggregation has certain advantages as well. Since by definition
we are targeting the largest possible loads, the differential between the total load when heating
or air conditioning is on and when it is not is significant enough for the classifiers to
identify. Secondly, since we are mainly concerned with keeping false negatives low and not
overly concerned about false positives, we can combine different algorithms to cover the
maximum range of the solution space.
In the next section we discuss the data that we use for our experiments, followed by a
discussion of the evaluation criteria. We then present the results of the ANN-SVM based
technique for load disaggregation and close with concluding remarks.
5.2 Data
To conduct the disaggregation experiment we used the Reference Energy Disaggregation
Dataset (REDD), a publicly available data-set for load disaggregation [Kolter and Johnson, 2011].
This data-set contains detailed energy usage information of several homes over extended
periods and is available at a high frequency of 16 kHz (sixteen thousand readings per second)
and at a low frequency of 1 Hz (one reading per second). Since we required readings at one
hour intervals to simulate hourly loads, we calculated the net energy consumed in each
one-hour window using the 1 Hz data. In this data-set the data is collected both at the mains
and at the individual devices. This provides us with the opportunity to test our strategy for
identifying the times when the heating load is on.
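The windowing step described above can be sketched as follows. This is an illustrative implementation, not the thesis code; the function name and the assumption that the 1 Hz readings are instantaneous power in watts are ours:

```python
def hourly_energy(readings_1hz):
    """Sum 1 Hz power readings (watts) into per-hour energy (watt-hours).

    Each hour window holds 3600 one-second readings; summing watts over
    seconds gives watt-seconds, so divide by 3600 to get watt-hours.
    Trailing readings that do not fill a whole hour are dropped.
    """
    hours = []
    for start in range(0, len(readings_1hz) - 3599, 3600):
        window = readings_1hz[start:start + 3600]
        hours.append(sum(window) / 3600.0)  # Wh consumed in this hour
    return hours

# Two hours of data: a constant 1000 W load, then a constant 500 W load.
profile = [1000.0] * 3600 + [500.0] * 3600
print(hourly_energy(profile))  # -> [1000.0, 500.0]
```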
This last information, the device usage pattern, was not available in the Swedish data-set;
therefore the Swedish data was not usable for the demand disaggregation evaluation. To
simulate an hour-level STMLF forecast we added artificial noise corresponding to the error
levels of the STMLF results.
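The thesis does not specify the exact noise model, so purely as an illustration, here is one plausible way to perturb hourly loads with multiplicative Gaussian noise at a chosen relative error level (the function name and parameter values are our assumptions):

```python
import random

def add_forecast_noise(hourly_loads, rel_error=0.20, seed=42):
    """Perturb measured hourly loads with multiplicative Gaussian noise.

    rel_error is the relative error level (e.g. 0.20 for roughly 20%
    variation); each hour is scaled by a factor drawn around 1.0.
    Negative loads are clipped to zero.
    """
    rng = random.Random(seed)
    noisy = []
    for load in hourly_loads:
        factor = rng.gauss(1.0, rel_error)
        noisy.append(max(0.0, load * factor))
    return noisy

clean = [1.2, 0.8, 2.5, 3.1]
noisy = add_forecast_noise(clean)
```

Seeding makes the perturbed data-set reproducible across experiment runs.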
Thus we conduct the experiment on two data-sets: disaggregation on clean data, and
disaggregation on noisy data.
5.3 Evaluation Criterion
The goal of load disaggregation from the forecasted load is to identify the houses where the
high load device will be used. This information is then used to plan these forecasted
devices for load management. To illustrate the value of a correct forecast we discuss the
evaluation with reference to the confusion matrix. A confusion matrix is a table layout that
allows visualization of the performance of an algorithm. Since we have two classes, the table
has four cells, representing the following:
• True positive: predicted true and actual is true.
• False negative: predicted false but actual is true.
• True negative: predicted false and actual is false.
• False positive: predicted true but actual is false.
Our main concern is that all the elements that are true, that is, all the houses where
heavy loads will be used, are identified. A small number of false positives in our system is
less detrimental than a false negative. This is because if we schedule load management for a
device but the device is not used, then we may have marginally more energy available than we allocated
                          Predicted class
                          Used              Not used
Actual class   Used       True positive     False negative
               Not used   False positive    True negative

Table 5.1: Confusion matrix
but system stability will not be compromised. In comparison, a false negative would mean
that a device that is not scheduled by our planner is switched on. If a sufficient number of
false negatives exists then system stability can be affected. Although we have a method
to recover from this situation, as we will discuss in chapter 6, the result will be
sub-optimal.
The main metric for evaluation thus will be accuracy, which we define as:

accuracy = truePositive / (truePositive + falseNegative)
In the next section we present the disaggregation strategy. The strategy specifically
aims at reducing false negatives at the cost of more false positives. However, as discussed
above, a relatively larger false positive rate is acceptable as long as it is not too large.
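The accuracy metric defined above is, in standard terminology, the recall of the "used" class: the share of actual heavy-load hours that were identified. A minimal sketch (illustrative only):

```python
def accuracy(tp, fn):
    """Accuracy as defined in this chapter: truePositive divided by
    (truePositive + falseNegative), i.e. recall over the 'used' class."""
    return tp / (tp + fn)

# Example with hypothetical counts: 25 true positives, 5 false negatives.
print(accuracy(25, 5))  # -> 0.8333333333333334
```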
5.4 Disaggregation Strategy
Figure 5.1 shows the hourly load of a house from the REDD data-set. The red line shows the
total load at the mains of the house and the blue dots mark the hours in which the heating
load is on. As can be observed, the correlation between high total consumption and the heating
load being present in that hour is very high.
To classify the timings in which the heating load is present, we applied two classification
algorithms: an artificial neural network (ANN) and a support vector machine (SVM).
We applied two strategies to maximize the likelihood of reducing false negatives.
First, we biased the training data by labeling an hour as a heat load hour if the heating
load was present in any second of that hour. For evaluation, however, we labeled an hour as
a heat load hour only if the net energy over the hour represented a heating load. The
difference is that if the heat load was present for only a couple of minutes, we do not count
it as a heat load hour for evaluation.
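The two labeling rules can be sketched as follows; the threshold values below are hypothetical placeholders, not values from the thesis:

```python
def label_any_second(second_loads_w, on_threshold_w=100.0):
    """Training label: the hour counts as a heat load hour if the device
    drew power above on_threshold_w in ANY second (biases toward positives)."""
    return any(w > on_threshold_w for w in second_loads_w)

def label_net_energy(second_loads_w, energy_threshold_wh=200.0):
    """Evaluation label: the hour counts only if the net energy over the
    hour is large enough to represent a real heating load."""
    net_wh = sum(second_loads_w) / 3600.0
    return net_wh >= energy_threshold_wh

# A heater that runs for only 2 minutes of the hour at 2000 W: positive
# for the biased training rule, negative for the net-energy rule.
burst = [2000.0] * 120 + [0.0] * 3480
print(label_any_second(burst), label_net_energy(burst))  # -> True False
```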
Figure 5.1: Heater load and load profile of a single house. The red line represents the main load value and the blue dots represent the hours in which the heater was on.
The second strategy is to apply an OR operator to the outputs of the ANN and the SVM. ANN and
SVM are two of the most widely used classifiers but follow somewhat different strategies to
build their models. Whereas SVM is a large-margin classifier that aims to produce a more
generalizable result, ANN attempts to model the system accurately as a mathematical model.
By combining their positive results we can benefit from both methods and reduce our false
negatives. The side effect of this strategy is a higher false positive rate. However, as
discussed above, we prefer false positives over false negatives, and we consider this
trade-off favorable.
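A minimal sketch of the OR combination (illustrative only; the classifiers themselves are trained separately). Because an hour is flagged when either model fires, the combined false negatives can only shrink relative to each individual model, while false positives may grow:

```python
def combine_or(pred_a, pred_b):
    """Element-wise OR of two classifiers' boolean predictions:
    an hour is flagged if EITHER model predicts the heavy load is on."""
    return [a or b for a, b in zip(pred_a, pred_b)]

# Hypothetical per-hour predictions from the two models:
ann = [True, False, True, False]
svm = [False, False, True, True]
print(combine_or(ann, svm))  # -> [True, False, True, True]
```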
5.5 Results
In this section we present the results of load disaggregation from the forecasted loads. We
present results for a perfect forecast and for a forecast with noise equivalent to the
forecast error presented in chapter 4.
5.5.1 Noiseless Forecast
Our first set of experiments assumed a perfect forecast. We applied two classification
algorithms, ANN and SVM, and a third classification in which we combined the predictions of
ANN and SVM. Table 5.2 shows the confusion matrices for this experiment. The values are
interpreted as follows: the first value is the true positive count and the second is the
false negative count; on the second line, the first value is the false positive count and the
second is the true negative count. To derive a percentage we divide a cell value by the sum
of the values in its row. This scheme applies to all the confusion matrices presented in this study.
As can be seen, the false negative rate is 17% (5 out of 30) for ANN and 10% (3 out of 30) for
SVM. But when we combine the two, the false negative rate goes down to 0%. That is, we
correctly identify all the hours in which heating was used. The accuracy of the system thus
is 100%.
Although the false positive rate of ANN is lower, when we use the OR operator the false
positive rate is the same as that of SVM. The false positive rate of the combined forecast is
28% (20 out of 71).
(a) ANN                        Predicted class
                               Used    Not used
    Actual class   Used          25           5
                   Not used       6          65

(b) SVM                        Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      20          51

(c) ANN OR SVM                 Predicted class
                               Used    Not used
    Actual class   Used          30           0
                   Not used      20          51

Table 5.2: Confusion matrices for noiseless forecast. a) Artificial neural network (ANN). b) Support vector machines (SVM). c) (ANN OR SVM).
5.5.2 Forecast with Noise
Our second set of experiments adds noise to the measured values to simulate the forecast
error presented in chapter 4. We applied the same strategy of ANN, SVM and their combination
for these experiments as well. Table 5.3 shows the confusion matrices for this experiment.
The interpretation is the same as before.
Interestingly, the error improves slightly in this experiment, but this can be attributed to
the random nature of the setup; the variation is small enough to ignore. The false negative
rate for ANN in this experiment is 10% and for SVM it is 13%. But when we combine the two,
the false negative rate goes down to 0%. That is, we correctly identify all the hours in
which heating was used.
Thus the accuracy of (SVM OR ANN) is not affected by the forecast error presented in chapter 4.
The false positive rate is marginally better as well, but only marginally (27%).
(a) ANN                        Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      19          52

(b) SVM                        Predicted class
                               Used    Not used
    Actual class   Used          26           4
                   Not used       6          65

(c) ANN OR SVM                 Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      19          52

Table 5.3: Confusion matrices for noisy forecast. a) Artificial neural network (ANN). b) Support vector machines (SVM). c) (ANN OR SVM).
5.6 Discussion
In this chapter we have shown that highly accurate prediction of high consumption devices
is possible from the forecasted load of the whole house. We have shown that by using a
combination of SVM and ANN we can achieve 100% accuracy. Furthermore, we have shown that
there is no visible change in accuracy even if our forecast is faulty within the error range
of chapter 4.
Chapter 6
Demand Side Management Planning
We have shown in chapter 5 that we can construct an accurate forecast of heavy loads from
the short term load forecast of individual houses. In this chapter we show how we can use
this device forecast to construct a demand side management plan. We answer two research
questions in this chapter which were posed in chapter 1. First is the issue of constructing
a scalable and robust plan for DSM scheduling; this is discussed in section 6.3.
Second is the variability of the size of the system. The system size changes over time,
which allows us to modulate the approximation for a more exact optimization. We first
introduce this variation in system size in section 6.4. There we introduce adaptable
optimization, or AdOpt, which leverages this variation in size to maximize the exactness of
the optimization based on runtime constraints. We then discuss the dynamic modeling method
that supports AdOpt in section 6.5. Our results show that we can achieve our load curtailment
targets by using the combination of the aforementioned strategies.
6.1 Motivation
Our main motivation for autonomic demand side management stems from the critical need for
management in electric power distribution. A typical power distribution system provides
electricity to a locality consisting of tens of thousands of consumers. However, due to the
power crises in developing countries, if the demand for power outstrips supply the power
company cuts off power completely to one or more neighborhoods to keep supply above demand.
A second challenge for energy management systems are spikes. The power supplied to the
grid can increase or decrease at any time depending on the availability of electricity
generation sources. When the power supply drops, the grid managers are forced to shut down
the power supply to some areas. Such abrupt, unscheduled shutdowns are damaging and an
inconvenience for the customers and their appliances.
To plan a strategy for optimized power allocation and to handle spikes, we look at consumer
usage patterns. In a typical household we can divide electric devices into four broad
categories, shown in table 6.1. The first category includes devices that are low powered and
low usage: these devices consume relatively little power, i.e. less than 500 watts, and are
used seldom. In the second category are devices that are low powered but are used more
frequently or for a longer duration of time, e.g. electric fans, lights etc. The third
category contains devices that are high powered but used seldom, such as microwave ovens,
washing machines etc. Finally, the fourth category contains devices that use more power,
i.e. more than 500 watts, and are also used for a longer duration of time, typically an hour
or more. Devices such as refrigerators and air conditioners fall into this category.
Our hypothesis in this work is that if we can somehow optimize the use of the devices in
the fourth category, we could eliminate or at least reduce the gap between supply and demand.
              Low usage                      High usage
Low power     Vacuum cleaner   200 watts     TV              70 watts
              Shaver            15 watts     Fan             50 watts
                                             Computer       150 watts
High power    Microwave       1000 watts     Air conditioner  2000 watts
              Toaster         1500 watts

Table 6.1: Classification of household appliances according to power and usage profile
In tropical countries, due to very long and hot summers, air conditioners of different types
make up most of the devices in this category. In this chapter we present results of
simulations in which we manage the usage of air conditioners to optimize the distribution of
electricity.
What this optimization entails is that a customer gets a full electricity supply for devices
of types 1, 2 and 3, while the air conditioners are regulated by the power company. The power
company is able to remotely switch off the electricity to the air conditioners for short
durations. Each such duration is short enough to retain the cooling effect produced by the
air conditioner and long enough to save electricity at the grid station level.
For fairness, this scheme provides a service-level guarantee to each household. Since such a
system has to keep up with the demand and supply pattern of electricity and also has to
ensure service-level guarantees for hundreds of thousands of heavy duty electric appliances,
we used a linear programming model of the system to apply self-optimization. The use of
linear programming in self-optimization problems can be complex depending on the dynamics of
such a system.
In a nutshell, our optimization scheme turns off high-powered devices for a small duration
of time, typically determined by a service-level agreement between the electricity company
and the consumer. This methodology optimizes the supply of electricity to high-powered
devices based on the overall supply/demand situation, the service-level guarantee and other
factors. An hour of usage for each high powered device is divided into six ten-minute slots.
At times when supply exceeds demand, all devices are powered for all six time slots.
However, as demand outstrips supply, devices are turned off based on a fair scheme such as
round robin. The maximum time a device can be turned off is based on the service level
guarantee between the electric company and the consumer. For simplicity, in this work we use
a two-slot service guarantee for all consumers. This means that a device is to be turned on
for at least twenty minutes of every hour. Therefore, the optimization goal is to find a plan
for the next hour for each device in the system.
Since a plan is generated for each hour, there is no need to recalculate the plan during the
course of the hour unless one of two situations occurs: a sharp increase in demand or a
sharp decrease in supply. In both cases the plan has to be recalculated for the rest of the hour.
But implementing such electricity optimization has many challenges. First, with the present
infrastructure there is no way for a power company to turn the air conditioners on or off
remotely. However, recent advances in smart homes and smart grid networking technology have
provided sufficient tools to implement such a plan. A survey by Yan and colleagues provides
sufficient information in this regard [Yan et al., 2013], and a study by Omer and colleagues
provides a financial assessment of the different available technologies [Omer et al., 2010].
Second, the number of air conditioners is enormous. In a typical locality thousands of these
devices are present. Therefore, we need a self-optimization technique that can scale to a
large number of devices without a significant cost overhead.
Third, both the electricity supply and its demand can vary, i.e. spikes can occur in our
system. Therefore, the methodology must be dynamic enough to act quickly on supply and
demand spikes and take the system back to an acceptable state. A supply and demand graph is
shown in figure 6.3; it depicts a typical supply and demand situation on a summer day for a
locality. Assuming that such historical data is available, we plan the electricity
optimization for the next hour.
The usage of electricity is very dynamic. Therefore, in order to cater for any short and
temporary spike, we define a reserve margin between the peak demand and the supply. The
reserve margin is a buffer between the maximum anticipated demand and the supply that is
available to the system. This margin is maintained to cater for any growth in demand beyond
the supply; the motivation for it and the way to calculate it are given in section 6.3.3.
In our scheme, if we do need to replan the optimization of electricity, this reserve margin
provides the time necessary to replan the distribution of electricity.
For our methodology this margin is margin ≤ calcTime × Δ, where Δ is the slope as the global
demand function approaches the maximum supply, and calcTime is the maximum time taken to
analyze, plan and execute the plan. The derivation of this equation is provided in the
evaluation section. We subtract the margin from the electricity supply value and use the new
number to plan the electricity optimization. The supply value minus the reserve margin gives
what we call the adjusted supply.
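The adjusted-supply computation is a one-liner; the numbers below are hypothetical, chosen only to illustrate the formula margin ≤ calcTime × Δ:

```python
def adjusted_supply(supply_kw, calc_time_h, delta_kw_per_h):
    """Adjusted supply = raw supply minus the reserve margin, where
    margin = calcTime * delta, i.e. the worst-case demand growth during
    one analyze/plan/execute cycle."""
    margin = calc_time_h * delta_kw_per_h
    return supply_kw - margin

# Hypothetical numbers: 2600 kW supply, a 0.2 h planning cycle, and a
# 2000 kW/h demand ramp near the supply limit.
print(adjusted_supply(2600.0, 0.2, 2000.0))  # -> 2200.0
```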
Fourth, the methodology must ensure service-level guarantees, i.e. the promise of the power
company to the consumer that an air conditioner will not be turned off for more than x
minutes in a given hour. This requires a very dynamic self-optimization technique.
6.2 Approach
To plan for such a dynamic system with soft-realtime constraints, we applied linear
programming to schedule our devices. To achieve this, a linear system of equations is
developed based on the entities in the system. Since the decision is zero/one (a device is
switched off or switched on) the problem is easily reducible to the 2-dimensional knapsack
problem, which is a known NP-complete problem. There are two limitations to this approach.
First, the demand for machines in a power distribution system depends on consumers turning
their devices on and off, so a fixed linear set of equations is not enough. Second, the
approach is not scalable, since for a large number of entities no tractable solution to the
2-dimensional knapsack problem is known.
In order to solve the first problem we used a meta model to generate the linear set of
equations at runtime. This set of linear equations is based largely on the state of the
system, i.e. the number of electric devices consuming power at the time of optimization.
Once the linear system of equations is generated, an optimization algorithm such as simplex
is used to solve the knapsack problem [Hillier and Lieberman, 2001]. This system is solvable
for small problem sizes when the time constraints are not stringent.
However, when either the system size grows or the response time is small, as is the case in
response to a spike, the simple 2-dimensional knapsack formulation is not feasible. To
counter this problem we use a clustering technique to cluster the entities based on a given
variance. The clusters are then used to generate the equations, so a relatively small set of
equations is generated. Secondly, since we now have cluster frequencies instead of zero/one
decisions, we can use linear programming instead of the integer programming needed for the
2-dimensional knapsack problem. In solving this approximated problem, linear programming
proved to be considerably fast even for an ultra-large dataset consisting of one hundred
thousand entities. We discuss the clustering algorithm in section 6.3.
However, since we were using clusters, the solution speed came at a penalty of unutilized
power that could have been utilized. For the most part, when a solution is required
instantly a small quantity of unutilized power is acceptable, but when the need for
distributing the load scaled up or down sharply, we found that we could do a better job with
an algorithm that solves the problem with virtually no unutilized power in the system.
To get the best of both algorithms, and possibly other approaches, we propose adaptable
optimization, or AdOpt. AdOpt adapts the optimization model and method based on the system
state. AdOpt observes the number of entities and the soft-realtime constraints and, based on
a soft-computing technique, identifies the optimal model and technique to be used. If the
time requirements are stringent or the number of entities is large, it uses clustered
optimization. If the size is very large, it uses fewer clusters, with a higher penalty but a
quicker response time. But if the entities are not many and, based on historical evidence,
the knapsack style problem can be solved, then AdOpt models the system as such and uses
integer programming to derive the optimal answer.
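The selection logic can be sketched as a simple rule. The thesis uses a soft-computing technique with historical evidence, so the thresholds and cluster-count heuristics below are purely illustrative assumptions:

```python
def choose_method(num_entities, deadline_s,
                  ip_limit=500, tight_deadline_s=5.0):
    """Pick an optimization route from the system state (simplified).

    - small system, relaxed deadline -> exact 0/1 integer program
    - otherwise -> clustered LP, with fewer clusters as pressure grows
    """
    if num_entities <= ip_limit and deadline_s > tight_deadline_s:
        return ("integer-program", None)
    # Cap clusters at 1% of entities, with a floor of 10 clusters.
    clusters = max(10, num_entities // 100)
    if deadline_s <= tight_deadline_s:
        clusters = max(10, clusters // 2)  # fewer clusters for speed
    return ("clustered-lp", clusters)

print(choose_method(200, 60.0))     # -> ('integer-program', None)
print(choose_method(100000, 60.0))  # -> ('clustered-lp', 1000)
```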
However, changing algorithms at runtime is not possible directly, because other algorithms
can only be applied if the system is abstracted in a different model. This means that in
order to use multiple optimization algorithms, the runtime models have to be generated at
runtime too. Therefore, to use multiple optimization algorithms we need a methodology that
analyzes the state of the system, recommends an optimization algorithm, generates a runtime
model of the system and uses the chosen optimization algorithm to produce a plan for the
power distribution. To achieve this goal we developed a dynamic modeling strategy, discussed
in section 6.5.
In the next sections we first discuss the approximation algorithm, which uses clustering to
convert the 0/1 knapsack problem into a frequency domain linear programming problem. We then
discuss AdOpt, which decides between the clustered frequency approach and the 0/1 knapsack
approach based on system statistics. Last, we briefly present the dynamic modeling framework
used in AdOpt to model the system at runtime.
6.3 Clustered Frequency Based Algorithm
In this section we discuss our approximate algorithm, which solves the scheduling problem in
a relatively short time but with the penalty of a rounding-off error. The scheduling problem
with our given constraints is a two dimensional knapsack problem. It is desirable to fill
each time slot with the maximal number of loads. On the other hand, it is desired that each
load is provided the maximum number of operative cycles. On the load dimension there is the
constraint of minimum loading, that is, each load must be scheduled at least 3 times in an hour.
The 0/1 knapsack problem for any dimension, where a weight must be selected as a whole and
not as a fraction, is an NP-complete problem. If we are allowed to select partial weights,
the problem is solvable in polynomial time. However, for scheduling air-conditioning loads a
partial value is not viable, since the state can only be on or off.
In the clustered-frequency based algorithm our main idea is to transform the problem into a
frequency domain so that we can solve it in polynomial time. We make this transformation by
clustering loads with similar characteristics. The goal of the clustering is to reduce the
intra-cluster variance so that the mean or maximum is representative of the loads in the cluster.
We then model the system using the cluster frequencies and their representative values, and
schedule a number of devices from each cluster for each time period rather than individual
devices. This provides us with two advantages. First, the size of the problem is reduced to
the number of clusters rather than the number of devices; the number of clusters can be
tuned through the variance goal of the clustering algorithm. Secondly, since we now have
cluster frequencies, we can use linear programming to solve the 2-dimensional knapsack
problem. We take the floor of the frequency values in the resulting schedule. This results
in sub-optimal scheduling, where for each <cluster, timeslot> pair we can miss a load that
was scheduled by the linear program. However, this error is bounded by the term
clusters × timeslots, which is less than 6% of the total loads. Given that our other options
are an exponential-time algorithm or blanket load shedding, this is an acceptable error for
now. In this section we first describe the clustering algorithm, followed by the planning LP
formulation, and then results and discussion.
6.3.1 Clustering
We adopt an incremental clustering approach since it provides the best control over the
intra-cluster variance [Hillier and Lieberman, 2001]. To cluster, we first sort the power
profiles in increasing order. For each data point, we include it in the current cluster and
then calculate the variance (σ²) of the current cluster. If σ² < errorThreshold then we
proceed to the next data point; otherwise we pull out the data point and create a new
cluster for it. Here errorThreshold is a user defined parameter limiting the variance of a
cluster. As the values are sorted prior to clustering, we are always sure that the variance
of each cluster will be less than the threshold.
To make the best of the clustering we had to reduce the intra-cluster variance. Therefore,
we set the cutoff criterion for the clustering to restrict the variance (σ²) of a cluster
within a threshold. This means that each value in the cluster is in the range
µ ± errorThreshold, where µ is the mean of the specific cluster.
We could also have used another clustering algorithm such as k-means. But k-means fixes the
number of clusters (k), whereas our requirement was to stabilize the system with respect to
the intra-cluster variance; the number of clusters is not important.
The output of the analysis phase is hence the category 4 device usage information divided
into clusters. This information is then used to plan the actual optimizations.
6.3.2 Linear Programming Based Planning
The clustered data is then passed to the linear programming engine. This data includes the
cluster means, the cluster frequencies, and the available electric power supply. The goal of
this LP formulation is to plan shutdowns in a pre-defined scheme.
Figure 6.2 defines the set of equations of the LP. The cost function (eq. 6.1) maximizes
the total frequency of the system over all time periods. Here Xi,t represents the ith
[clusters] = clusterize(n)
  sort(n)
  k = 1
  for i = 1..n
    clusters[k].insert(n[i])
    if (variance(clusters[k]) > errorThreshold)
      clusters[k].remove(n[i])
      k = k + 1
      clusters[k].insert(n[i])
(Where: n = data points to be clustered; clusters = two dimensional array, each 1D array is a cluster; k = number of clusters)
Figure 6.1: Clustering Algorithm
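The pseudocode of figure 6.1 translates to Python roughly as follows. This is an illustrative sketch: the population variance is used, and the threshold and sample values are arbitrary:

```python
def variance(xs):
    """Population variance of a non-empty list."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def clusterize(points, error_threshold):
    """Incremental clustering over sorted power profiles: grow the current
    cluster point by point; when adding a point pushes the variance past
    error_threshold, pull the point out and start a new cluster with it."""
    clusters = [[]]
    for p in sorted(points):
        clusters[-1].append(p)
        if len(clusters[-1]) > 1 and variance(clusters[-1]) > error_threshold:
            clusters[-1].pop()
            clusters.append([p])
    return clusters

loads = [1000, 1020, 980, 2000, 2050, 1990]
print(clusterize(loads, error_threshold=500.0))
# -> [[980, 1000, 1020], [1990, 2000], [2050]]
```

Because the points are sorted first, each cluster covers a contiguous range of load values, which keeps the cluster mean representative.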
cluster in the tth time period. As we do not have any priority among clusters, all machines
have an equal chance of being selected. Z gives us the total number of 'on' machines and the
value of Xi,t gives us the number of machines to switch on in the ith cluster at the tth
time period. Equation 6.2 represents the service level guarantee constraint that in every
time period at least one third of the systems should be in the powered-on state. Equation
6.3 limits the allocation to the maximum available supply, and equation 6.4 puts the
technical constraint that the number of allocated consumers in a cluster should not exceed
the cluster size.
For some of the controlled devices it is recommended that a delay of 10 minutes be given
between power cycles. To cater for this requirement, we divide the total time (1 hour) into
six chunks of 10 minutes. Each cluster in each 10 minute time period is represented by a
decision variable. So if we have 100 clusters and t is 6 then we have 600 decision
variables, each variable defining the number of consumers that should be powered on in that
10 minute period. The optimization function is to maximize the number of consumers getting
the supply over the entire duration. It should be noted here that for other problems t can
be increased or decreased accordingly.
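Assembling the decision variables and constraint families can be sketched as below. This only builds the model in list form; handing it to a simplex solver is out of scope here, and all names and sample values are our own:

```python
def build_model(cluster_means, cluster_sizes, supply, slots=6):
    """Assemble the hourly LP of figure 6.2 in list form (sketch only).

    Variables are Xi,t, flattened to index i * slots + t. Returns the
    objective vector and the three constraint families.
    """
    k = len(cluster_means)
    n_vars = k * slots

    def idx(i, t):
        return i * slots + t

    # Objective (6.1): maximize total on-frequency -> coefficient 1 everywhere.
    c = [1.0] * n_vars

    # (6.2) service guarantee: Xi,t >= MAXi / 3 for every cluster and slot.
    guarantee = [(idx(i, t), cluster_sizes[i] / 3.0)
                 for i in range(k) for t in range(slots)]

    # (6.3) supply: sum over i,t of mu_i * Xi,t <= supply.
    supply_row = [cluster_means[j // slots] for j in range(n_vars)]

    # (6.4) cluster size: Xi,t <= MAXi.
    caps = [(idx(i, t), cluster_sizes[i])
            for i in range(k) for t in range(slots)]

    return c, guarantee, (supply_row, supply), caps

# Hypothetical system: 2 clusters (mean loads 1.5 and 2.0 kW, sizes 40 and 60).
c, guarantee, (supply_row, supply_cap), caps = build_model([1.5, 2.0], [40, 60], supply=150.0)
print(len(c))  # 2 clusters x 6 slots -> 12 decision variables
```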
Linear programming assumes real values for all variables. This means that the LP outputs
non-integer values for the state frequencies of each cluster; for example, the LP can output
5.2 devices to be in the 'on' state at time t. We take the floor of the cluster values. For
our problem this adds an element of error, but the error is of order kt, where k is the
number of clusters and t is the number of time periods. An integer programming solution
would contain the same allocation, but some of these kt machines would be allocated 'on' and
some 'off'. By switching off all of these kt machines we might be under-allocating the
resource. However, since our maximum k is 1% of the loads and t = 6, our total missed
allocation will not exceed 6%.
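The flooring step and its error bound can be illustrated directly; at most one device per <cluster, timeslot> cell is lost, so the total loss is bounded by k × t. The sample solution is hypothetical:

```python
import math

def floor_schedule(lp_solution):
    """Round the fractional LP frequencies down to whole devices and
    count how many cells lose a (fractional) scheduled device."""
    floored = [[math.floor(x) for x in row] for row in lp_solution]
    missed = sum(x != math.floor(x) for row in lp_solution for x in row)
    return floored, missed

# 2 clusters x 3 slots of fractional frequencies from the LP:
sol = [[5.2, 4.0, 6.9], [3.1, 2.0, 7.5]]
floored, missed = floor_schedule(sol)
print(floored)  # -> [[5, 4, 6], [3, 2, 7]]
print(missed)   # -> 4 (of at most k * t = 6 cells)
```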
The plan given by the LP thus consists of the machines to be turned off in each ten minute
period of the hour in each cluster. It is therefore an hourly optimization plan, to be
implemented over one hour. This plan is provided to the execution module as input.
6.3.3 Spike Handling
A spike is a sudden upward or downward surge in supply or demand. Figure 6.3 shows a typical
power usage pattern at the grid level. The power provider guarantees a 2200 KW power supply,
but this supply can drop arbitrarily. A fall in power supply when demand is low does not
affect the system much; but when demand matches or exceeds supply, as in the 11th time
period, it creates problems.
A demand spike occurs when the predicted maximum load is crossed. Figure 6.6 shows the
predicted and actual load of a system in real time. The maximum predicted load was 2600 KW,
of which four hundred kilowatts was a reserve margin. Even so, the demand outstrips the supply.
A downward spike in demand or an upward spike in supply does not affect our system; for the
sake of simplicity we do not handle these two types of spikes.
To deal with an upward spike in demand and a downward spike in supply we use two separate
SAPE cycles. Some of the modules, such as sense, are used in an almost similar fashion, but
the analyze and plan modules are designed differently.
Maximize  Z = Σ_{i,t} Xi,t                      (6.1)
∀t ∀i:  Xi,t ≥ MAXi / 3                         (6.2)
Σ_{i,t} µi Xi,t ≤ supply                        (6.3)
∀i,t:  Xi,t ≤ MAXi                              (6.4)

Figure 6.2: Hourly planning LP equations
Figure 6.3: Typical Supply spike in system
Supply-side Spike
A supply side spike occurs when the power supply company faces the sudden loss of a power
generation source, such as a unit in a thermal power plant. This kind of spike occurs almost
instantly; however, sometimes there is a margin of a few seconds, and we assume that we can
use these few seconds to handle the spike.
As soon as the system senses a supply side spike we initiate a replanning process.
For the clustering we assume that our initial prediction and clustering were correct.
For the LP based planning, however, we use a proactive approach: we calculate the minimum
threshold power that is needed in every 10 minute time period without violating the service
guarantee, and we do so immediately after calculating the hourly plan. That is, at the start
of each hour, in addition to the main plan, we have five additional plans for a failure at
the 10th, 20th, 30th, 40th, and 50th minutes.
Minimize Z = Σ_{i,t} μ_i X_{i,t}              (6.5)
∀t ∀i:  X_{i,t} ≥ MAX_i / 3                   (6.6)
Σ_{i,t} μ_i X_{i,t} ≤ supplyNew               (6.7)
∀i,t:  X_{i,t} ≤ MAX_i                        (6.8)
∀i, ∀t′ < currentTime:  X_{i,t′} = allocX_{i,t′}   (6.9)

Figure 6.4: Spike handling LP equations
The planning for each 10-minute period uses the LP in figure 6.4. Here we find the plan
which minimizes the required power supply without violating the service-level agreements
(eq. 6.5).
Assume that the spike occurs at time t′. As at time t′ all the power allocations for time
slices before t′ have already been implemented, we treat the decision variables for timings
before t′ as constants. This is represented as equation 6.9. Our demand constraint and
guarantee constraint remain the same (equations 6.6 and 6.8). Our optimization function
chooses the set of decision variables for which the total power needed is minimum
(equation 6.5).
If at time t′ the power does go below the amount promised at the start of the hour,
we have a plan ready which can be propagated to the system instantaneously. We assume
here that our communication network is fast and stable enough to propagate the new plan.
As mentioned previously, the execution phase is implemented using the same communication
network, i.e., SMS.
Since our planning is at discrete time intervals, we revert the system to the most recent
plan made before the current time. For example, if a drop in supply occurs in the 24th
minute then we revert to plan 2, made for t = 20, as shown in figure 6.5.
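The reversion rule amounts to integer division of the spike time by the planning period. A small illustrative helper, not from the thesis code:

```python
def contingency_plan_index(spike_minute, period=10):
    """Pick which precomputed contingency plan to revert to when a
    supply spike hits mid-hour: the plan made for the last completed
    10-minute boundary (a spike in minute 24 reverts to the t = 20
    plan).  Names are illustrative."""
    return (spike_minute // period) * period
```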
Demand-side Spike
A demand-side spike occurs when the demand from the consumers outstrips the projected
demand calculated at the start of the hour. Figure 6.6 shows a typical hot summer day when
the real demand outstrips the projected demand. We calculate demand on a per-home
basis; the sum of these demands gives us a global picture of how much energy is needed.
One thing to note here is that, unlike a supply-side spike, a demand-side spike almost always
grows smoothly. This gives us breathing room to deal with this kind of spike in a more robust
way. Therefore, we use a reactive method in this approach.
Handling a demand-side spike is a two-step process. First, we evaluate how fast our
demand is approaching our supply, since demand growth is non-deterministic. An increase
in demand may not be a continuous increase, in which case we would like to wait and
see if replanning is really required. To do this we take the demand data every minute
and use linear regression to find a fit for our global demand. We then use the relation defined
in equation 6.10, where ∆ is the slope of our linear regression fit and calcTime is
the time to cluster, plan, and propagate the updated plan. If the relationship does not hold,
then we proceed to step two of the analysis.
This relationship is the trigger for replanning. As we measure the rate of growth and
the current state, and relate them to the time it takes for us to react, we ensure that we
always have a plan ready for an eventuality.
We start our step two by pulling the data from all the device units. A demand different
Figure 6.5: System response for spike at 20 < t < 30
Figure 6.6: Typical demand spike in system
reserveMargin ≤ ∆ × calcTime        (6.10)

Figure 6.7: Reserve margin lower limit
from the projected demand requires a re-clustering of the air conditioners' usage data. To
do a better job at clustering we use the historical data of the air conditioners and take the
max of the two pieces of data for each device. Here we use the same incremental clustering
algorithm with a similar threshold.
We then use an LP which borrows from the two LPs discussed so far. As our goal here is
the same as that of the program in figure 6.9, that is, to maximize the number of users, we
use the optimization function from that LP (equation 6.27). The constraint equations are
mathematically the same as those in figure 6.4; however, the constants now change differently.
Since in this scenario the cluster means (μ) have changed while the supply is the same as
before, our LP has a changed left-hand side (μ) instead of right-hand side (supply). As
these values are input parameters, we do not need to change the actual LP equations;
updating the constants is sufficient. We can also infer that the running time and
complexity will be the same for both LPs.
6.3.4 Evaluation
We have evaluated our methodology against the challenges mentioned in the previous
section: scalability, spike handling, and ensuring service-level guarantees. The linear
programming equations have already proven the effectiveness of the technique in ensuring
supply-side guarantees. Therefore, in this section we discuss the simulation results that
prove the scalability of our approach and a mathematical derivation that shows that our
approach is effective in handling spikes.
We used two different methodologies to evaluate our system and support our hypothesis.
We first conducted simulations with varied data sets and sizes to test the scalability and
correctness of our system. As our spike handling uses the same clustering and a similar LP,
whose scalability and correctness have already been shown, we used mathematical
derivation to evaluate and prove the correctness and resilience of our spike handler.
We start with a discussion of the complexity of clustering and LP with respect to scalability.
We then evaluate our test results for the scalability of the system, showing that the running
time for an increasing number of controlled devices can be managed and thus answering the
scalability question. Finally, we prove the correctness of the spike mitigation system, which
answers the question of effectively handling un-scheduled updates.
Is the Solution Scalable?
Since we are controlling individual devices in this methodology, the sheer number of
devices requires a very scalable solution. This also means that we can in no way afford a
non-polynomial solution.
The scalability of our methodology depends for the most part on the analyze and plan
phases, so we evaluate these two phases here; the remaining SAPE phases are discussed
elsewhere.
In analyze we use an incremental clustering technique. Our algorithm tries to insert
each element into one cluster only, and then calculates the variance of that cluster. Thus
our complexity depends on the length of the largest cluster l, the number of clusters
k, and the number of elements n. The order of this incremental clustering algorithm is
O(nkl² + n log n), where n log n is the complexity of sorting. A point to note here is that
the number of clusters and the size of the clusters are closely related to the error threshold
and the variance of the distribution. We discuss this relationship and its effects shortly.
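A minimal sketch of such an incremental scheme, assuming one-dimensional consumption values and a fixed radius threshold (the actual thesis algorithm also tracks cluster variance; names here are illustrative):

```python
def incremental_cluster(values, threshold):
    """Greedy incremental clustering: sort the values, then grow the
    current cluster while each new value stays within `threshold` of
    the running cluster mean (a simplified quality-threshold scheme).
    The sort is the O(n log n) term in the complexity above."""
    clusters = []     # each cluster is a list of member values
    current = []
    for v in sorted(values):
        if current and abs(v - sum(current) / len(current)) > threshold:
            clusters.append(current)      # close the current cluster
            current = []
        current.append(v)
    if current:
        clusters.append(current)
    return clusters
```

Each element is tested against at most one candidate cluster per step, which is what keeps the scheme linear in n apart from the initial sort.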
In the plan phase we use a linear programming (LP) algorithm. The complexity of
LP depends on the number of decision variables. Our LP decides the number of
machines that should be on for each cluster. We used Matlab's LIPSOL algorithm
[Zhang, 1997], which is based on Mehrotra's predictor-corrector interior point
algorithm [Mehrotra, 1992]. The complexity of a modification of this algorithm in big-O
terms was calculated by Salahi and colleagues and found to be O(k²L) [Salahi et al., 2007],
where k is the number of variables and L is the length of the string needed to encode the
input. Recall that k is the number of clusters created in the previous step.
A grid station typically supports a few thousand consumers. As the consumption of
individual devices can vary, we used random data generated between an upper and a lower
bound to mimic actual device power consumption.
The scalability is affected by three variables: 1) the number of device usage profiles, n;
2) the variance σ² of the usage profiles; and 3) the error threshold e that we tolerate
within a cluster.
To evaluate the system we considered between 10,000 and 100,000 device usage profiles,
used data with a variance between 1 kilowatt and 50 kilowatts, and varied the threshold
within a given cluster between 0.01 and 0.1.
For these variables, the clustering results with an error threshold of 0.1 are given in
table 6.2 and with an error threshold of 0.01 in table 6.3.
It can be observed that for a typical run of 50,000 values with an error threshold of 0.01,
the clustering time is well under a minute. In fact our worst clustering time (151.8 seconds)
was with the lowest variance (1) of data and the highest error threshold (0.1).
Our observation is that clustering in fact takes the major chunk of the time, as the time to
calculate the plan is extremely small. For example, with k = 1045 the plan is calculated in just
2.1 seconds. The total time required for both clustering and LP is given in tables 6.2 and 6.3;
it can be seen that LP adds only a fraction of the total calculation time. In contrast, an integer
programming solution takes close to 30 minutes for a dataset of size 30!
Is the Solution Effective in Handling Spikes?
As discussed previously, our system will encounter two types of spikes. We discuss the
effectiveness and response time for each source individually. Since our supply-side spike
handling only uses an LP similar to the one discussed in the previous section, and our
demand-side spike handling uses the same clustering algorithm discussed previously, we do
not re-evaluate clustering or LP scalability here. Instead our focus in this section is to prove
that the spike handling mechanism is robust and able to maintain the guarantees, if possible,
in real time.
Supply side spike  A sudden dip in supply is very much a possibility. Since our system
requirements stipulate that a machine that is shut down should not be restarted within the
next 10 minutes, any change in supply is immediately transferred to the customers. For the
next cycle, however, due to our proactive approach we already have a plan ready and this
plan simply replaces the current plan; hence we seamlessly integrate the change. A dip in
power in the later stages of an hour might leave us in a situation where meeting a guarantee
is not possible. This case is a policy decision and is beyond the scope of our work. If the
system managers would like to lower their guarantee with some penalty, then such a plan
can also be calculated a priori and enforced in such a situation.
Variance   Size      Cluster count   Clustering time   LP time   Total time
1          10,000    7               1.96              0.06      2.02
1          50,000    7               41.76             0.05      41.71
1          100,000   6               151.8             0.09      151.89
10         10,000    64              0.76              0.16      0.92
10         50,000    76              6.48              0.21      6.69
10         100,000   80              21.49             0.23      21.72
50         10,000    304             0.5               0.37      0.87
50         50,000    348             3.8               0.56      4.36
50         100,000   372             9.12              0.88      10

Table 6.2: Total time for analyze/plan (error threshold = 0.1)
Variance   Size      Cluster count   Clustering time   LP time   Total time
1          10,000    23              1.12              0.12      1.24
1          50,000    24              14.5              0.09      14.59
1          100,000   26              58                0.11      58.11
10         10,000    201             0.56              0.43      0.99
10         50,000    220             4.2               0.34      4.54
10         100,000   234             11.2              0.43      11.63
50         10,000    857             0.5               1.40      1.9
50         50,000    1004            3.86              1.45      5.31
50         100,000   1045            8                 2.04      10.04

Table 6.3: Total time for analyze/plan (error threshold = 0.01)
In a nutshell, at the time of a supply-side spike the system degrades to guaranteeing the
minimum service level. That is, the system will try to conserve energy and only ration 20
minutes of usage per device. However, if the power dip does not reach our minimum level,
we can always recalculate the plan using the LP in figure 6.9 for the next time period and
update the plan likewise. Since the maximum time to calculate a plan is less than 4 minutes,
and our time period is 10 minutes, we have the ability to update a plan if needed.
Demand side spike  As discussed previously, a demand-side spike is the result of
consumption growing beyond predictions. A point to note is that demand growth over
time is not as drastic as a supply-side spike, especially considering that we have tens of
thousands of consumers. In order to predict demand-side spikes we need to find out when
the SAPE process should start as demand changes. Another related aspect is the reserve
margin that we need to subtract from the supply to get the adjusted supply.
For the trigger mechanism we must ensure that we have enough time to run the SAPE
process before demand reaches an unacceptable level. Planning ahead is not possible, as the
growth in demand is quite non-deterministic. Our limitation is that we need to start our
process calcTime seconds before we reach our maximum supply point. We derive a formula
for the trigger as follows:
Let calcTime be the time to analyze, plan, and execute our SAPE cycle. If we want
enough margin to trigger our calculations before we overrun our supply, then our trigger
will be:
t′ − t ≤ calcTime        (6.11)
where t is the current time and t′ is the time when, according to current estimates, our
demand will reach our supply.
We do a regression analysis on the demand values from the previous five minutes to approximate
the movement of our demand. The regression analysis gives us the equation:
currentDemand = ∆t + c        (6.12)
we can say that t′ can be given as:
supply = ∆t′ + c (6.13)
Replacing the values in 6.11 using equations 6.12 and 6.13, we get:

(supply − currentDemand) / ∆ ≤ calcTime        (6.14)
Simplifying, we get our trigger as:

currentDemand ≥ supply − ∆ × calcTime        (6.15)

That is, when our currentDemand reaches or exceeds supply − ∆ × calcTime we will initiate
our demand-spike planning module.
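The trigger in equation 6.15 can be sketched as follows, fitting the slope ∆ by ordinary least squares over the recent per-minute demand readings; the function and parameter names are illustrative assumptions, not the thesis implementation:

```python
import statistics

def spike_trigger(demand_history, supply, calc_time):
    """Demand-spike trigger (eq. 6.15): fit a line to the last few
    per-minute demand readings and fire when the current demand is
    within slope * calc_time of the supply."""
    times = range(len(demand_history))
    # Least-squares slope (delta) of demand over time.
    t_mean = statistics.mean(times)
    d_mean = statistics.mean(demand_history)
    delta = sum((t - t_mean) * (d - d_mean)
                for t, d in zip(times, demand_history)) / \
            sum((t - t_mean) ** 2 for t in times)
    current_demand = demand_history[-1]
    return current_demand >= supply - delta * calc_time
```

With demand growing at 10 KW per minute against a 200 KW supply, a 5-minute calculation budget does not yet fire at a demand of 140 KW, while a 7-minute budget does.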
The second value that we need to know is an acceptable reserve margin. Recall from an
earlier section that we related the margin to the demand slope as margin ≤ calcTime × ∆.
If our predictions were correct, then at the time of peak usage we will not trigger spike
mitigation if our margin is greater than or equal to calcTime × ∆.
To ensure that at peak time our demand-spike mechanism is not started while we are
still following our demand pattern, we put a lower limit on our margin.
reserveMargin ≥ f(t′)− f(t) (6.16)
where f(t) is the maximum demand and f(t′) is the hypothetical demand at time t′ if the
demand had continued according to a regression analysis over a window from five minutes
before the maximum demand until the maximum demand. We use regression analysis on the
predicted demand function to find f(x). The equation given by the regression analysis is:
f(x) = demand(x) = ∆x+ c (6.17)
we can hence write equation 6.16 as
reserveMargin ≥ ∆(t′ − t) (6.18)
We know that (t′ − t) is bounded by calcTime. For calculating the margin limit we use
the lower bound for calcTime, hence we can say that:
reserveMargin ≥ ∆× calcT ime (6.19)
This gives us the lower bound for the reserveMargin.
Using these mathematical proofs we can say with confidence that as long as our
reserveMargin is more than ∆ × calcTime and our trigger is evaluated regularly, we will
always have enough time to calculate a new solution for an increasing demand.
6.3.5 Discussion
In this section we discuss some observations and frequently asked questions about our
methodology.
• Is the solution cost effective? We have calculated that to control each air conditioner
remotely we need equipment worth about $30. This includes relays, a GSM cell phone,
circuits, wiring, etc. The cost of running the system is also quite low because many
GSM companies provide packages of unlimited SMS for just $2 a month. Moreover,
considering that the available alternative solutions, where they exist, are much more
expensive, this solution is definitely cost effective.
• Who will pay the cost/ who will be responsible for setup? We envision that our system
will be adopted by townships and/or power distribution companies. A city council or
a power providing company will decide on setting up a system. In such a case, either
the power company or the city council will enforce such a system and will devise a
cost breakdown mechanism. Such an adoption will have legal and social implications;
however, this discussion is beyond the scope of this work.
• Is our solution generic? Our main contribution in this work is the idea of clustering
a large data set and then using linear programming to solve an otherwise complex
problem. This has a large range of applicability and can be applied to most problems
where a decision has to be made on the states of elements based on some constraint
and a global goal. We also provide a way to bound the error that is introduced
by clustering and the subsequent real-valued LP solution. However, given that the
original problem was intractable by a big margin, it can be argued that the error
margin is negligible. Our technique is not a generic approximation solution for NP-
hard problems. It is a technique for near-optimal solutions which can decide the states
of tens of thousands of variables while staying within constraints and bounds.
• What is the learning curve of applying this technique? Our technique uses basic
clustering and mathematics. The modeling effort required to adapt it to a new problem is
much less than many other proposed techniques [Diao et al., 2003, Abrahao et al., 2006,
Abdelwahed et al., 2004, Lefurgy et al., 2007, Wang et al., 2006]. There are open-source
tools (e.g., the GNU Linear Programming Kit) available for solving LPs. Indeed, the
methodology we have proposed is not only simple but also commonly used by economists,
managers, engineers and planners.
• Can we use other techniques to solve this problem? As discussed in previous sections,
the task of determining the state of individual elements under constraints is an NP-hard
problem. Our algorithm is a transformation of this problem into a solvable domain.
Though this transformation introduces an error, we have already defined a bound
for this error in section 6.3.2. A binary programming solution for this problem is on
the order of 2^n, where n is the number of elements. For our basic case this is
2^10,000: an intractable problem!
6.4 Adaptable Optimization - AdOpt
To realize the goal of using multiple optimizations on a given system, a whole autonomic
framework is required. The basic building blocks of this framework are the optimization
algorithms that eventually optimize the system. The selection of an optimization algorithm
is partly dependent on the optimization problem in the system and its eventual goals. We
first describe the optimization algorithms that we selected to solve our optimization
problem. Some of these algorithms have been discussed in the previous section. Here we also
discuss the implementation details of the algorithms, since the decision to choose a particular
implementation is important for AdOpt. This is followed by a discussion of the selection
methodology and the formulation of the problem in BIP and LP terms. We then discuss
the architecture, followed by results and discussion.
6.4.1 Self-Optimizing techniques
We used three optimization methods to calculate the plan for the optimization of electricity.
In this section we describe the scalability and applicability of these three optimization
methods.
Binary Programming
In our optimization problem we need to find a plan of whether to turn on or turn off an
electric machine. Binary programming (BP) is an ideal solution to this problem because
each device has only an 'on' or an 'off' state during a slot [Hillier and Lieberman, 2001].
The advantage of BP is that it gives an exact solution and does not have any rounding-off
error. This means that there is no unutilized power in the distribution system. However, on
the downside, the running time of BP degrades exponentially as more devices are added to
the system. Therefore, BP can only be used if the system has a small number of devices.
BP is a known NP-hard problem. Finding the optimal solution takes the problem a bit
further and makes it a Σ₂ (Sigma-2) problem, a class of problems known to be more complex
than NP. There is no known polynomial-time algorithm to solve these problems; the only
known way is to enumerate all possible solutions. However, applying a combination of
state-of-the-art branch-and-bound techniques and linear programming can solve small
problems in a relatively short time.
The formulation of BP encodes each machine–time-slot tuple as a single variable. BP
decides the state of each tuple subject to the service-level guarantee and the supply and
demand constraints. Therefore, the solution provides the state of each machine in the system
for the six or so time slots. The formulation of the runtime system model that represents our
power distribution system is discussed in a later section.
Linear Programming
Linear programming (LP) is also used to solve the optimization problem of the power
distribution system. We used two algorithms in linear programming: the simplex method
and the interior point method. As mentioned before, to solve the system using linear
programming while allowing for a large number of machines, we clustered the data based on
a distance threshold over the power consumption profiles. The clusters are then used to
generate the equations that
represent the system from a meta-model. Once the equations are generated, one of the two
aforementioned methods is used to solve the linear system of equations and find a plan for
electricity optimization. Because of the inherent modeling of the system, both the simplex
method and the interior point method result in an error margin, i.e., un-utilized power. The
main differentiating factor between simplex and interior point is the underlying method of
optimization. Simplex traverses along the edges of the feasible region and changes direction
when a constraint is encountered. This is slower but in general allows a variable to reach its
maximum in terms of optimality before other variables are optimized.
The interior point method is the state-of-the-art LP solving method, used and improved
since it was invented in 1984 by Karmarkar. The interior point method traverses the interior
of the feasible region in search of the optimal point. In doing so it changes the values of a
larger number of variables at the same time. Hence at the end of the optimization, for a
degenerate problem, more variables may contribute marginally to the optimal point than
in a simplex-solved solution. A degenerate solution is one where more than one combination
of values yields the same optimal value.
In short, simplex has a lower error rate but is slower than interior point. On the other
hand, interior point is fast but can give a larger error at the end.
6.4.2 System Models
Multi-dimensional Multi-Knapsack Model
To use binary programming we modeled our problem as two interwoven multi-knapsack
problems. A knapsack problem is a formulation where hypothetical sacks have a maximum
weight and smaller weights have to be fitted into the sacks so that each sack is maximally
utilized. In our problem, because of the service-level guarantee, we have to select at least
two time slots for each machine. Therefore, the allocation of time slots for each machine
becomes a weight. The number of devices becomes the number of sacks of the knapsack
problem, as shown in figure 6.8.
Concurrently, for each time slot we have to switch on devices such that the number of
devices is maximized while the total power consumed stays within the maximum power
supply available. This again is a knapsack problem; here our sacks are the time slots. These
two interwoven problems can be expressed independently and then merged into a single
equation matrix for one of our solution methods.
The first knapsack problem requires a more dynamic solution than the latter one. As we
do not know the number of devices, the number of equations in this problem is not known
at design time. For each sack, that is, for each device, we create an equation with six boolean
decision variables representing each time period. The sum of these variables should be
greater than or equal to the service-level guarantee that the specific device is calibrated to.
All of these sack equations are aggregated to form two matrices: the variable-counting
left-hand side and the service-level-guarantee bound as the right-hand side, as shown in
figure 6.8.
The second knapsack has time slots as sacks. This means that for our current setup this
problem has six sacks, though the number of these "sacks" may vary at runtime. For each of
these six sacks, a sum of products of decision variables and consumption values is calculated
as the left-hand side. The boolean decision variables here are the same as those used in the
previous knapsack problem. This reuse, or double use, of decision variables weaves the two
knapsacks together. The right-hand side for each time period is the amount of resource, in
our case the power available to the system in that time slot.
Figure 6.8 is the template for the binary programming planning equations. In this figure,
eq. 6.21 represents a sack of the first interwoven knapsack. As each device is considered a
sack in that problem, equation 6.21 is generated for each device in our system. In
comparison, equation 6.22 represents a sack of the second knapsack problem. As in this
problem each time slot is considered a sack, equation 6.22 is generated for each time slot
in our system.
Clustered-frequency Based Modeling
A linear program is a mathematical modeling and solving technique for planning scarce
resources across multiple demands. The system is modeled as a series of linear equations.
These equations define the whole system, including the constraints, cost functions, and
decision variables, and are usually derived from technical requirements as well as logical
considerations.
LP solutions are in the real domain and cannot be restricted to integer or binary values;
doing so makes LP equivalent to integer programming, which is an NP-hard problem.
Thus LP over a binary decision problem is not scalable. To derive an answer within our
time constraints, we instead transformed our problem from the binary-decision domain to
the frequency-determination domain. This was done by reducing the dimensions of the
problem through clustering. A simplified quality-threshold algorithm was used to cluster
the data [Heyer et al., 1999].
As we use the mean of a cluster as its representative element, restricting the radius of a
cluster makes that value more meaningful. Through this transformation we change the
problem from deciding whether a machine should be kept on or off in a time slice to
determining the optimal frequency for each cluster of machines. We round off the
per-cluster values to arrive at a sub-optimal, but maximal, plan for our system.
The resulting clustered problem has two logical constraints and one technical constraint.
Maximize Z = Σ_{i,t} X_{i,t}                  (6.20)
∀i:  Σ_t X_{i,t} ≥ guarantee                  (6.21)
∀t:  Σ_i μ_i X_{i,t} ≤ supply                 (6.22)

Figure 6.8: Hourly planning BP equations
Maximize Z = Σ_{i,t} X_{i,t}                  (6.27)
∀t ∀i:  X_{i,t} ≥ MAX_i / guarantee           (6.28)
Σ_{i,t} μ_i X_{i,t} ≤ supply                  (6.29)
∀i,t:  X_{i,t} ≤ MAX_i                        (6.30)

Figure 6.9: Hourly planning LP equations
The first logical constraint is that the total energy consumed, as planned by the LP, should
not exceed the available power supply. This is represented as equation 6.29. The second
constraint is that the number of machines powered "on" in each cluster for each time
period should be at least x% of the total number of machines in that cluster, where x is
the minimum service-level guarantee for the specific cluster. For example, if the guarantee
for the system is that no machine will be off for more than 20 minutes (or 33% of the time),
then the constraint is that 33% of the consumers in each cluster shall be powered "on"
in each cycle to ensure the 20-minute guarantee (equation 6.28). The technical constraint
is that the number of machines powered on in each cycle should not exceed the number of
machines in that cluster, as given in equation 6.30.
Figure 6.9 defines the complete LP meta-model for our problem. The cost function (i.e.,
equation 6.27) maximizes the total number of machines in each cluster over all time periods.
Here X_{i,t} represents the ith cluster in the tth time period. As we do not prioritize
clusters, all machines have an equal chance of being selected. Z gives us the total
number of 'on' machines, and the value of X_{i,t} gives us the number of machines to switch on
in the ith cluster in the tth time period. Equation 6.28 represents the service-level guarantee
constraint that in every time period at least one third of the systems should be powered on.
Equation 6.29 limits the allocation to the maximum available supply, and equation
6.30 adds the technical constraint that the number of allocated consumers in a cluster should
not exceed the cluster size.
The plan given by the LP thus consists of the machines to be turned off in each ten-minute
period of an hour in each cluster. Therefore, this is an hourly optimization plan to be
implemented every hour.
Spike in Supply or Demand
In addition to the hourly planning we also developed models that can be used to replan
during an hour. This is necessary because if, contrary to the planned optimization, there is a
sudden upward trend in demand or a sudden downward trend in supply, the system
should be able to handle it gracefully. These erratic fluctuations in the demand or supply
pattern, called spikes, are handled using a model similar to those in figures 6.8 and 6.9.
The only difference is that when optimization planning is performed during an hour, the
time window of that planning is smaller and the system has to honor the service-level
guarantees still pending for certain consumer devices. Due to the similarity of the
spike-related meta-model to the hourly planning model we do not discuss its formulation;
however, we will discuss the spike-related results in the evaluation section.
6.4.3 Case-Based Reasoning Engine
In planning an optimization the system has to select a method from the available
optimization methods. Because of competing factors it is not always possible to use a simple
rule-based system for this selection. Therefore, there is a need for a recommendation
engine that suggests an optimization method based on historical data and user preferences.
To this end we have used a recommendation engine based on Case-Based Reasoning (CBR)
to find the right method when an optimization is required [Aamodt and Plaza, 1994].
Case-Based Reasoning (CBR) is a technique for deriving a new solution by considering
similar past solutions. The CBR engine maintains a case base developed from historical
data. Given a new situation, the CBR engine compares the new situation with old situations
and derives the solution that is closest to it. The engine has the ability to revise and update
its case base. We use CBR to select the optimizer for our base problem. The inputs to the
CBR are statistics of the system along with user policies, constraints, etc., and the output
is the recommended optimization method, i.e., BP, simplex, or interior point.
Another motivation to use a recommendation engine is that we do not have any hard
boundaries on the size of a given problem. This means that given a new situation it is not
possible for us to select one of the three methods unless we have previously evaluated their
performance on a similar situation. The CBR-based recommendation engine requires an
initial case base of historical data; when new cases arrive it uses previous experience to
recommend an optimization method. The new cases, depending on the error rate and speed
of their solution, are used to improve the set of cases in the CBR engine.
Some constraints such as user policies and service-level guarantees are already fed into
the system at design time. At runtime the CBR recommendation engine takes three inputs:
• Total gap between power supply and power demand.
• Total number of machines to be managed.
• Approximate time available for calculating the optimization plan.
Any CBR engine requires an initial case base. We developed ours using the following methodology: we ran our three solver algorithms on sample data, varying the size, spread, and supply-demand gap of the data. These variations are listed in table 6.4 and resulted in a total of 54 cases. The resulting time and un-utilized power margin produced by each run, along with the inputs, were saved in the case base.
Once this initial case base is populated, at runtime when an optimization is requested, CBR finds the case closest to the input parameters and selects the solver that can solve the problem within the available time while minimizing the un-utilized power. For online learning, a feedback system is incorporated in the CBR recommendation engine.
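The retrieve step of the recommendation engine can be sketched as a nearest-neighbor lookup over the three runtime inputs. The sample cases, field names, and normalization constants below are invented for illustration; the thesis builds its actual case base with FreeCBR.

```python
import math

# Illustrative case base: (devices, shortfall %, time budget s) -> solver.
# These entries are NOT the thesis case-base; they only sketch the idea.
case_base = [
    {"devices": 500,   "gap": 5,  "time": 60, "solver": "BP"},
    {"devices": 2000,  "gap": 30, "time": 5,  "solver": "simplex"},
    {"devices": 50000, "gap": 50, "time": 20, "solver": "interior"},
]

def distance(query, case):
    # Normalized Euclidean distance over the three runtime inputs.
    scales = {"devices": 50000.0, "gap": 50.0, "time": 200.0}
    return math.sqrt(sum(((query[k] - case[k]) / s) ** 2
                         for k, s in scales.items()))

def recommend(devices, gap, time):
    # Retrieve the nearest stored case and reuse its solver choice.
    query = {"devices": devices, "gap": gap, "time": time}
    return min(case_base, key=lambda c: distance(query, c))["solver"]

print(recommend(devices=1800, gap=25, time=10))  # → simplex
```

The revise step described above would then compare the solver's measured time and error against the retrieved case and update or add cases accordingly.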
6.4.4 Framework Architecture
We designed our framework so that it is easily integrated into a SAPE cycle. Therefore, we assume that each electric device has a mechanism to send information about its status, including whether it is powered on by the consumer and its present power consumption. Assuming all this data is received by the system, our framework plans an optimization.
The system state data received from the sensors is fed into a CBR Recommendation Engine (CRE). The CRE processes this data to select the optimization algorithm for the given optimization problem. This decision acts as an input to the Runtime Modeler (RM). The runtime model generated by the RM is an input for a mathematical toolbox (MT) that applies the appropriate optimization algorithm to the model. The toolbox also measures statistics of the solution, such as the time taken to solve and the amount of power left un-utilized, for feedback to the CRE.
The plans generated by the MT are propagated to the devices that make up the system, and executors implement this planning. Again, sensors and executors are assumed to be present, and our framework fits between the two to perform self-optimization.
In this section we describe a particular implementation of AdOpt used to run the simu-
lations of a real power distribution system.
CBR Recommendation Engine: The CRE consists of a mathematical summarizer, a CBR engine, a result evaluator sub-module, and a case base, as shown in figure 6.11. Raw data is fed to the engine for summarization. This summary is used, in conjunction with the case base, to generate the input for subsequent modules. At the other end, the CRE has a result evaluator module which receives the statistics of the MT at the end of an optimization cycle. These statistics include the number of devices, the supply-demand gap, the solver selected, the solver time, and the error rate. If the real-time execution conforms with the case base then no action is taken. But if the real-time execution reveals that the solver time was incorrect then the case base is updated. If the number of devices is too far from any of the cases in the case base then the result is added as a new case.
Runtime Modeler: Details of our Runtime Modeler (RM) are discussed in previous sections. Architecturally, the dynamic mathematical modeler consists of three components: a Knapsack modeler, an LP modeler, and a clustering component for dimension reduction, as discussed previously.
Mathematical Toolbox: Our mathematical solvers are standard operations research tools. We treat the mathematical solvers for the three optimization methods as black boxes. The inputs to the solvers in the MT are the runtime model(s) generated by the RM, and the output is an optimization plan. Details of the specific mathematical solvers are given when we describe our evaluation setup.
6.4.5 Evaluation
The promise of autonomic systems in general is to reduce the load on the operator. To this end, an autonomic system should handle as many conditions as it possibly can without operator intervention. In addition, a self-optimizing system should deliver a better result in general cases than a system where optimization is performed manually.
We claim that the AdOpt framework caters to these demands of a self-optimizing system. In fact, our system can outperform other self-optimization techniques in a rapidly changing system by leveraging the very changes that cause sub-optimal behavior.
To evaluate our system against these two core requirements we pose the following questions:
Figure 6.10: System Flow
Figure 6.11: System Architecture
1. Does our system leverage changes in the dynamics of the system, in terms of its size and the available time?
2. How much did we optimize, or in other words, what percentage of power is utilized?
We evaluate our system against two different evaluation suites. Our first evaluation suite tested the system for effectiveness, that is, whether we are able to provide a plan irrespective of variations in the system. We compare our results with the three techniques running without adaptation and compare effectiveness and savings.
Our second evaluation suite is built on data from a power distribution network, where we compare our framework's results with an existing technique. Our savings, or profitability, come from allocating as much power to devices as we can. In this test suite we compare the un-allocated power of the AdOpt framework with the un-allocated power of existing techniques.
In this section we first describe our evaluation environment. Then we discuss the results of the two different sets of evaluations we conducted on our framework.
Evaluation Environment
We used a shared 2.4 GHz Pentium Core 2 Duo processor with a total of 2.00 GB of RAM to conduct our simulations. The CRE is developed using FreeCBR. The APIs of FreeCBR are integrated with Matlab, which uses its internal JDK to call FreeCBR functions. We wrote our own code for the RM and for the clustering of raw data. The mathematical solvers in the MT use Tomlab's CPlex1 solver to solve BP and Matlab's optimization toolbox for LP optimizations using the simplex or interior point methods.
CBR Initial Case-base
To develop our CBR case base we varied the supply-demand gap and the size of the problem. We applied this data to all three algorithms. Due to practical considerations we limited BP to sizes of 2000 users. The variations in the case base at the start of the evaluation are given in table 6.4. For each data size, we considered three random values of resource provisioning, ranging from close to the minimum requirement (1/3rd) to 88%.

1http://tomopt.com/tomlab/products/cplex/
Experimental Simulations
We evaluated the claims about our system on two sets of data. The first set consists of a hypothetical system with which we demonstrate the adaptability and efficiency of our system. To validate the results further, we tested our system on real data obtained from the California Independent System Operator (CAISO).
Adaptability and E�ciency Evaluation
In these experiments we evaluated our system on three key variations of the system: the electric devices present in the system, the gap between the supply and demand of electricity, and the time available to calculate the plan. We made equivalence classes of these three variations to develop the set of experiments for testing the system under various combinations of variations.
We used five equivalence classes for the number of machines present in the system, i.e. 500, 1000, 2000, 10000 and 50000. Three variations in the demand-supply gap are considered, i.e. 5%, 30% and 50% more demand than available electricity. The available time to calculate the plan is also varied; this is the maximum time available to find a plan. For hourly
Sizes   Resource provisioning (%)
100     72   54   35
1000    88   65   44
2000    80   65   40
5000    60   40   35
10000   80   62   42
50000   67   50   34

Table 6.4: Input combinations for all three algorithms to generate the initial case-base
planning this may not matter because we have a whole hour at maximum to calculate the plan. However, at any time when there is a spike in demand or supply, a plan needs to be recalculated for the hour. Therefore, we considered four upper boundaries for the time available to recalculate: 5, 20, 60 and 200 seconds.
A cross product of these three variations results in sixty cases, and showing the results of all sixty is not possible here. Therefore, we used orthogonal arrays to generate the twenty combinations of experimental results that we show. According to a study by NIST, using orthogonal arrays covers 98% of all cases in real situations [Wallace and Kuhn, ]. These twenty experiments are summarized in table 6.5.
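The reduction from the full cross product to a small covering suite can be sketched as follows. This is a generic greedy pairwise-covering sketch, not the orthogonal-array/ALLPAIRS procedure actually used; the parameter values are taken from the text.

```python
from itertools import product, combinations

# Equivalence classes from the text: 5 sizes x 3 gaps x 4 time budgets.
sizes = [500, 1000, 2000, 10000, 50000]   # number of devices
gaps = [5, 30, 50]                        # % shortfall
times = [5, 20, 60, 200]                  # seconds available

all_cases = list(product(sizes, gaps, times))
assert len(all_cases) == 60               # the full cross product

def pairs(case):
    # All (parameter-position, value) pairs that one test case covers.
    return set(combinations(enumerate(case), 2))

# Greedy covering: repeatedly pick the case covering the most
# still-uncovered parameter pairs until every pair is covered.
uncovered = set().union(*(pairs(c) for c in all_cases))
suite = []
while uncovered:
    best = max(all_cases, key=lambda c: len(pairs(c) & uncovered))
    suite.append(best)
    uncovered -= pairs(best)

print(len(suite))  # a pairwise suite needs far fewer than 60 cases
```

Every pair of parameter values still occurs in some selected case, which is the coverage property pairwise testing relies on.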
The first three columns of table 6.5 show the three variations for each experimental run. The "# of devices" column shows the number of electric machines present in the system for which a service-level guarantee is required. "Shortfall" is the difference between demand and supply. "A-Time" is the maximum time available to calculate the plan.
Using these variations, in each test we ran the three optimization techniques independently and then used AdOpt to see if AdOpt provides a better optimization by selecting one of the optimization techniques dynamically.
"C-Time" is the time consumed by AdOpt to find the plan, from receiving the raw data in the CRE to producing a plan as output. "UP" is the un-utilized power in the system that an optimization is unable to distribute amongst the electric devices.
Discussion
During the experimental runs all optimization methods produced a plan except for a few cases of binary programming (BP). This is because BP is only applicable at small scales and runs out of memory in test runs where the number of devices is larger than 2000. Therefore, we did not run BP on problems involving more than 2000 devices.
It is clear from the results that AdOpt has picked the best optimization method to solve
Exp #  # of     Short-  A-Time  AdOpt           Simplex         Interior Point   Binary Programming
       Devices  fall            C-Time   UP     C-Time  UP      C-Time  UP       C-Time   UP
1      500      5%      5       0.31     1%     0.08    1.5%    0.062   1.27%    0.20     1%
2      500      30%     20      0.33     1%     0.1     2.5%    0.078   3%       0.17     1%
3      500      50%     60      2.78     1%     0.06    7.4%    0.06    7.4%     2.57     1%
4      1000     30%     5       4.54     1%     0.14    2.7%    0.13    3.99%    4.15     1%
5      1000     5%      20      0.30     1%     0.13    1%      0.11    1%       0.59     1%
6      1000     5%      60      0.31     1%     0.13    1%      0.14    1%       0.59     1%
7      1000     50%     200     21.42    1%     0.19    6.9%    0.13    7.9%     1.31     1%
8      2000     50%     5       0.33     6.7%   0.20    6.7%    0.20    6.7%     102.06   1%
9      2000     5%      20      0.37     1%     0.22    1%      0.23    1%       2.09     1%
10     2000     30%     60      46.47    1%     0.23    2.8%    0.18    3%       46.63    1%
11     10000    50%     20      0.98     2.3%   0.86    2.4%    0.86    2.4%     na       na
12     10000    5%      200     0.95     1%     0.89    1%      0.84    0%       na       na
13     10000    30%     5       1.11     1%     0.83    1%      0.83    7.7%     na       na
14     50000    30%     200     4.74     2.2%   4.38    2.2%    4.46    2.2%     na       na
15     50000    5%      5       4.52     1%     4.35    1%      4.54    1%       na       na
16     50000    50%     20      4.6      3.7%   4.31    3.7%    4.31    3.7%     na       na
17     500      30%     200     0.23     1%     0.05    5.6%    0.08    5.6%     0.22     1%
18     2000     5%      200     2.08     1%     0.22    1%      0.20    1%       1.92     1%
19     10000    30%     60      0.98     1%     0.84    1%      0.81    1%       na       na
20     50000    5%      60      4.5      3.7%   4.38    3.7%    4.42    3.7%     na       na

Table 6.5: Adaptability and Efficiency Test Results
Figure 6.12: Summary of results of Table 6.5. The topmost figure shows the results of AdOpt, the middle one shows results from Interior Point, and the bottom one shows results of the Simplex method.
the optimization problem.
Our first evaluation suite is based on pair-wise testing. This evaluation covers 95% of the input space, giving us confidence that most of the problems encountered can be handled.
Pair-wise testing generates test cases to cover all operational equivalence partitions. These partitions cater to the variations that are inherent in the system. We can partition based on our optimization system's classes or on our ME's variation classes. Since
our technique selects its behavior based on learned information, it is more appropriate to partition the system on the basis of the managed element's behavior. Variations in our ME are observed through three parameters: the size of the problem, the supply-demand gap, and the time allowed to calculate the results.
The variations in our data are closely related to the system description in the previous section. In our previous system, we provided three SAPE cycles to cater to different types of optimizations. The time to calculate in an ME varies with the SAPE cycle for which optimization is being performed; the times are approximately 10 seconds, 1 minute and 10 minutes.
Historically, in the authors' region, the supply-demand gap varies from 5% to 50%. Our partitions for the gap are 5%, 30% and 50%.
The size of the problem usually lies between a few hundred in winter and tens of thousands in summer. Our partitions for size are 500, 1000, 5000, and 50,000.
We used satisfice.com's ALLPAIRS tool to generate test cases. For the variations listed above, 20 test cases were generated.
A summary of the simulation results is provided in table 6.6. We compared our technique against stand-alone executions of the simplex, interior point and BIP solvers.
Our conclusions are as follows:
• AdOpt is effective in 100% of classes, whereas in comparison:
  – BIP is not applicable for 50% of classes
  – Simplex is not applicable for 10% of classes
Method           Average time (s)  Average error
Simplex          1.14              3.3%
Interior Point   1.15              3.4%
BIP              12.63             0.1%
Adaptable        2.47              2.28%

Table 6.6: Summary of simulation results for pair-wise testing
• AdOpt, though slower than simplex and interior point, is as effective as or better than any independent technique. In addition, AdOpt provides better utilization for 35% of classes.
It should be noted that the results and running times for all three algorithms are highly dependent on the input values. Secondly, the results in this test suite were obtained over equivalence classes; the percentage results for a running system will depend on the frequency of each class.
This comprehensive testing routine provides assurance that our system can handle, optimally, any variation of system behavior.
Our conclusions from this set of tests are:
• As size increases, un-utilized power increases. Time is a critical factor only if size > 5.
• In almost all of the cases UP is minimum with AdOpt, except in test case 8 of table 6.5 (2000 devices, 5 seconds). This is because the available time for this run is very small and the shortfall is 50%; the combination of these two key factors has an adverse effect on time.
• The efficiency of AdOpt is inversely proportional to size and shortfall, and directly proportional to time. That is, as size increases, efficiency decreases; similarly, an increase in shortfall has an adverse effect on efficiency, but as the allowed time increases, efficiency increases too.
• BP is not able to handle all the situations, but in situations where it is applicable its results are close to 1%. This is why we have not shown a time-versus-devices graph for BP.
CAISO Data-set evaluation
In our second set of experiments, we applied AdOpt to data collected in the state of California, USA. The California Independent System Operator provides its actual and predicted demand data online.
We collected hourly data for 7 days. The demand in this system varied between 16,000 MWh and 25,000 MWh. We interpolated this data onto our local problem. Power generation in the authors' region is usually 30% to 60% less than required. To simulate such a situation, we set the power supply at 10,000 MWh, which roughly equaled the 30-60% window.
Our motivation for testing the system with CAISO data is that California's data is available and the weather pattern in California closely resembles that of Pakistan. With this correlation, we can approximate the behavior of users in our country and simulate the results.
Discussion
We applied AdOpt and the simplex algorithm for hourly planning. Since the results for simplex and interior point are very close for large data-sets, we used only simplex for comparison in this experiment. The size of the problem in some cases is too big for BP to solve; hence BP cannot be considered as a stand-alone technique to solve the problem.
A summary of the results for the 7 days is shown in fig. 6.13. AdOpt provided better resource utilization on every day. For the 7-day period we generated plans with 27% better utilization compared to the LP solution.
To better illustrate our system, we focus on the busiest day for AdOpt: day 6. Fig. 6.14 shows the supply and demand variations on the 6th day of our data. As can be seen, the demand for power at the start of the day is low; it grows steadily until mid-day and then tapers off towards night time. If we use an LP-based solver to assign power, the un-utilized power stays in the range of 6% to 9.5%. In comparison, AdOpt's un-utilized power starts off close to 0% when the gap between power demand and supply was low; as the power shortfall grew, AdOpt updated its selection of algorithm to manage the increase in size. At the end of the day, when demand again dropped sharply, AdOpt also scaled down, applied a more conserving algorithm, and increased its saving.
Our comparison with LP yielded a 27% more efficient energy allocation. This leverage was achieved by applying a more conservative, albeit slower, technique at low-consumption times, mainly at night. In the daytime, AdOpt adapted to a more robust LP-based algorithm to handle the larger size/gap situation.
An interesting point to note here is that AdOpt was able to scale up and down immediately. The system did not take any convergence time when it grew or shrank sharply. This is in contrast to control-theoretic approaches, which aim for stability, so sudden changes are smoothed over and sudden jumps in the input space do not translate to a drastic change in the solution space.
We can summarize our findings as:
• AdOpt provides a 27% better service than LP.
• AdOpt can provide solutions to 100% of problems, whereas BP is not able to solve all the situations in a 24-hour period.
• AdOpt adapts to sudden change in real time. If a change in input necessitates an adaptation, then AdOpt's change is instantaneous.
Figure 6.13: AdOpt and Simplex comparison on 7 day CAISO data
6.4.6 Discussion
In this section we answer some of the questions that a reader may have regarding our framework. Are the simulations realistic? The data for our simulation comes from a live
Figure 6.14: Supply and demand comparison for day 6 of the CAISO data
system. Users can download usage data from the website; hence the projection of demand is actual. We know that in places such as Pakistan the supply-demand gap is as much as 50%, so our power provisioning of roughly 80-55% is a realistic scenario for the authors' country.
What is the breaking point of the system? We do not guarantee a plan if the supply is not enough for the guarantees. In such a case, the guarantees will need to be scaled down before a plan can be calculated. This scaling down is an administrative decision. Our framework, however, is able to incorporate such a change.
Savings? Our saving, or increase in profit, comes from allowing more users to consume electricity, thereby increasing user satisfaction and, indirectly, yielding more profit from higher and tighter consumption.
What about spike handling? Spikes are a reality in our systems. We cannot ignore them, but due to paucity of space we did not discuss the details of spike handling. Our all-pairs evaluations, however, did cater for spikes by testing the system for 10- and 60-second durations.
Why AdOpt when simplex works? It is true that we were able to allocate up to 92% of electricity through simplex alone, but with AdOpt we have increased this availability to 97%. This minor advantage means that in an area of 10,000 devices, 500 more customers will be satisfied than with LP alone.
Can this system be implemented? In some cities of Germany, systems which micro-manage heating to individual housing have been successfully implemented.
6.5 Adaptable Modeling Framework
Traditionally, models created for the optimization of systems are expressed as abstract mathematical models defined in a standard mathematical lexicon. When a system is to be deployed, its model is realized as code segments, equation matrices, or equation arrays, based on the solver being used for optimization. The dimensions of these matrices and the cardinality of the variables are usually defined at the time of deployment and are hard-coded in code segments, matrix dimensions, etc.
In comparison, for systems such as a DAS, system dimensions at the time of deployment are meaningless, because such a system can grow as well as shrink over time. To handle such changes, a measure of self-aware modeling integrated with self-optimization is necessary to manage a DAS. This self-aware optimization can leverage the change in the dimensions of the DAS at runtime to attain scalability and a performance boost according to the runtime state of the DAS.
Various systems have been optimized through mathematical models. However, in all of the applications of mathematical techniques seen so far by the authors, the constraints and tuning parameters were known when the system was being implemented [Femal and Freeh, 2005, Javed and Arshad, 2008, Jabr et al., 2000]. We have not observed any detailed work on engineering a system model that exhibits variability in the size of its constraints and control features.
Therefore, in our modeling framework we use the abstract mathematical model as a meta-model to create an on-demand, instantaneous model of the system based on system statistics. In this section we define our modeling framework for constructing an instantaneous model of a system at runtime.
6.5.1 Structure of the Mathematical Meta-Model
In practice, mathematical models are developed and expressed as abstract models. Mathematical models represent a system in the form of decision variables and constraints. Decision variables are the controlling parameters that change the system state, whereas constraints are the limitations of the system. Since in mathematics a variable can take any numeric value, it is important that we specify the limits of our decision variables as well.
To model a system, the control parameters and limitations of the system are analyzed. A system can be composed of many control parameters, but usually there exist logical groupings with which these control parameters can be abstracted into a single entity or class. Usually this also means that similar constraints apply to each element of the grouping. It also means that a single abstract equation with appropriate quantifiers can suffice to contain the behavior of all the variables within a group. Since these are logical groupings and resemble a set-like structure, we call these variable abstractions the ontologies of our system.
Hence an ontology is a group of control parameters which have a similar logical structure and are subjected to similar constraints. Like sets, ontologies can be grouped together to form more inclusive notation. Mathematically, this means that while two different logical groups of variables, or ontologies, are subjected to their own constraints, there can also be a set of constraints that is applicable to both groups. Hence our decision variables can be part of a multitude of ontologies. Here a subscript defines the specific element within an ontology. We call such a grouping of ontologies an ontological class. Figure 6.15 describes the abstract model that we will discuss in detail here.
Adaptive Modeling for AdOpt
We consider making a meta-model for planning in AdOpt. We divide our devices into ontologies according to their consumption profiles and time periods. Our task is to maximize the number of machines from each set which can be kept in the "on" state for a particular period in an hour without violating the service-level guarantee. Here the number of machines to keep in the "on" state in a particular time period is our tuning parameter or "decision variable". For each tuning parameter there are two ontologies. First, there are different sets of machines; each type is represented by a subscript i. The second attribute is time, that is, which time period a specific decision variable represents; these are represented by a subscript t. Hence i and t represent two ontologies combined in a single decision variable X_{i,t}.
The system in figure 6.15 is subject to three classes of constraints. Each class is represented as a single abstract equation. Notice that equation 6.29 is only applicable to one ontology, the time t, while the other two are subject to both. To demonstrate our framework we will consider the example of equation 6.28 in detail. This equation constrains the system by enforcing a minimum service level: it states that for every time period t, the number of machines switched on in every machine class i should not be less than 1/3rd of the total number of machines in that class.
During implementation these abstract models are expanded according to the available system statistics. If our system had fixed machine classes, say 10, and 6 time periods (t), the abstract equation 6.28 would have been expanded into 60 equations, each representing one specific (t, i) tuple.
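The expansion just described can be sketched as a pair of nested loops over the two ontologies; the per-class supply values below are illustrative placeholders.

```python
# Expanding meta-equation 6.28 -- for all t, for all i:
#   X[i,t] >= supply[i] / 3
# -- into one concrete inequality per (i, t) tuple, as described in
# the text (10 machine classes x 6 time periods = 60 equations).
# The supply values are illustrative placeholders.
n_classes, n_periods = 10, 6
supply = [90 + 10 * i for i in range(n_classes)]

constraints = []
for t in range(n_periods):
    for i in range(n_classes):
        constraints.append((f"X[{i},{t}]", ">=", supply[i] / 3))

print(len(constraints))  # → 60
```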
Mathematical models for systems which do not exhibit a change in cardinality from the abstract model to the implemented model can be modeled effectively. That is, if at the time of implementation or deployment we can enumerate how many machines and how many time segments we have, then generating an actual model of the system from the abstract model is straightforward.
However, if the cardinality cannot be evaluated at the time of implementation, then modeling becomes a difficult task. A naive modeling technique is to consider the worst-case scenario: for example, in the sample model above, we limit i, the number of device classes, to say 1000 and then make a model for that many classes.
For a grid-level electric distribution network this solution is not feasible. First, the number of device classes cannot be predicted; new types of machines are being added every day and limiting this growth is not possible. Second, a worst-case setup is highly inefficient: by always calculating for 1000 classes, we consume many more resources than the fraction of these calculations we might actually need. Third, because we always assume a large data-set, the choice of algorithms is limited. There are algorithms which are more efficient for small to medium-sized data-sets; if we can evaluate and model at runtime, it is possible to derive a better result by using more accurate algorithms.
6.5.2 Modeling at Runtime
Various techniques exist for creating a runtime model of a system. These efforts are usually intended for architectural and operational runtime modeling. We observed that these modeling frameworks have some commonality in how they process their task: usually a runtime modeling framework defines a set of primitive artifacts with defined semantics; at runtime these artifacts are instantiated and replicated, and relationships among the artifacts are established [Pickering et al., 2009, Goldsby and Cheng, 2008, Kuhn and Verwaest, 2008]. There are various methods to extract information from a system and various uses for the modeled systems, but these are beyond the scope of runtime model generation.

    Maximize  Z = Σ_{i,t} X_{i,t}            (6.27)
    ∀t ∀i:    X_{i,t} ≥ supply_i / 3         (6.28)
    ∀t:       Σ_i µ_i X_{i,t} ≤ supply_t     (6.29)
    ∀i ∀t:    X_{i,t} ≤ MAX_i                (6.30)

Figure 6.15: Hourly planning LP equations
The underlying architecture of our framework is similar to these runtime modelers. The difference is that we use the components of abstract mathematical models as our primitive artifacts. Specifically, the abstract mathematical model defined for the system is used as a meta-model. The primitive artifacts for us are the ontological classes. When we observe an object, or a variable, belonging to a specific ontology, we create a corresponding ontology object for it in our mathematical model. This process is covered in the modeling-of-ontologies step (step 1, defined below).
The equations of our meta-model define the relationships between the different variables. Once we determine the cardinality of the ontological classes, we develop the relationships of ontologies by exploring the equations one by one, setting up the constraints and limitations of the system in the process. This process, and the production of the complete model, takes place in the modeling phase.
This runtime modeling is a three-step process. Our framework first determines the system statistics to define the cardinality of the ontologies. In the second step, it determines the cardinality of the relationships and the number of equations each meta-equation will generate. The third step uses these cardinalities to create an instantaneous model. The second and third steps are closely related and their implementations are intertwined. However, since step 2 is platform independent and step 3 depends on the solvers, merging the two steps is avoided wherever possible.
The description of the phases is given below.
Modeling of Ontologies
Modeling of ontologies is a two-step process. First we pre-process our data to reduce the dimensions of the input data.
The input to our system consists of raw usage data for devices. In pre-processing we reduce the dimensionality of the raw usage data using a clustering algorithm. The details of this dimension reduction are discussed in our previous work [Javed and Arshad, 2009b]. This pre-processing is required due to the nature of the problem. In other works, such as Femal and Freeh's use of LP, such pre-processing would not be required [Femal and Freeh, 2005]; for such models, direct evaluation is possible.
Modeling of ontologies determines the cardinality of each ontological class. In our model, there is only one ontological class, X. This ontology in turn is composed of two co-dependent ontologies: the time interval, represented by subscript t, and the instance of a cluster, represented by subscript i. We consider 6 time intervals for our problem; however, this number can also be changed at runtime.
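As a rough sketch of this pre-processing step, the following groups synthetic 24-hour usage profiles into device classes with a plain k-means; the profiles and k = 3 are illustrative, and the thesis uses its own clustering technique from [Javed and Arshad, 2009b] rather than this one.

```python
import random

# Toy sketch: cluster raw per-device usage profiles into a small
# number of device classes (the ontology index i). Profiles are
# synthetic 24-hour consumption curves around three mean levels.
random.seed(0)
profiles = [[random.gauss(mu, 5) for _ in range(24)]
            for mu in [20] * 40 + [60] * 40 + [120] * 40]

def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each profile to its nearest center.
            j = min(range(k), key=lambda c: sum((a - b) ** 2
                        for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster; keep the
        # old center if a cluster happens to be empty.
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl
                   else centers[j] for j, cl in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans(profiles, k=3)
print([len(c) for c in clusters])   # 120 devices grouped into 3 classes
```

The resulting cluster count then fixes the cardinality of the i ontology for the steps that follow.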
Modeling of Relationships
A mathematical model is a representation of a system in terms of equality and inequality equations. These equations define the constraints and limits of the system.
Our framework first distinguishes between equality and inequality equations. Though both are evaluated in the same way, in the construction step a different matrix is generated for each of these equation genres.
In this step our framework uses the cardinalities of the ontological classes to expand the quantifiers. Each quantifier expands some ontological classification. For example, a ∀X_i quantifier translates to one equation per instance of ontology i within the ontological class X. In addition, the coefficients and right-hand sides of these equations are also determined in this step, as constants are sometimes associated with a specific instance of an ontology.
Similarly, equation 6.28 has a (∀t∀i) quantifier. Hence this meta-equation is expanded into i × t equations, since an equation is created for each (i, t) tuple. The equation states that the coefficient of the (i, t)th decision variable is 1. So for each new equation expanded from meta-equation 6.28, the coefficient for variable Xi,t will be one and all other variables will have coefficients of zero. The equation also states that its right hand side will have the constant value supplyi/3, where supplyi is the size of the cluster set Xi as determined in step one. Hence for each equation the correct corresponding value of supplyi/3 is placed.
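This expansion can be sketched as follows. The `supply` values below are illustrative, not taken from our data set; in the framework they come from the pre-processing step.

```python
# Hedged sketch of quantifier expansion for a meta-equation of the form
# (forall t)(forall i):  1 * X[i,t] < supply[i] / 3.
# The supply values here are illustrative; the real ones come from step one.
def expand_forall(supply, n_intervals):
    rows = []
    for i, s in enumerate(supply):
        for t in range(n_intervals):
            # one expanded equation per (i, t) tuple
            rows.append({"var": (i, t), "coeff": 1.0, "rhs": s / 3.0})
    return rows

rows = expand_forall(supply=[18, 9, 30], n_intervals=6)
print(len(rows))        # 3 clusters x 6 intervals = 18 expanded equations
print(rows[0]["rhs"])   # 18 / 3 = 6.0
```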
Model Construction
A mathematical model can be represented in different forms. One of the most commonly used forms to represent mathematical models in computing systems is the matrix form. Since arrays and matrices are realizations of the same phenomenon, we will discuss how we created matrices from the results of the previous steps.
In matrix notation, a series of linear inequality equations are represented as:
A× x < b
and a series of linear equality constraints as:
Ae× x = be
Here x is a vector representing the variables, b is a vector of right hand side constants for the inequality constraints, and be a vector of right hand side constants for the equality constraints. Similarly, A is the matrix of coefficients of x for the inequality constraints and Ae for the equality constraints.
Similar generalizations exist for non-linear systems but are beyond the scope of this work.
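The matrix form above can be made concrete with a small sketch. The helper below is an illustrative stand-in for what a solver does when checking a candidate solution against A × x < b and Ae × x = be; it is not part of our framework.

```python
# Minimal sketch of the matrix representation A*x < b and Ae*x = be,
# in pure Python (the thesis implementation used Matlab's toolbox).
def satisfies(A, b, Ae, be, x, tol=1e-9):
    def dot(row):
        return sum(a * v for a, v in zip(row, x))
    # every inequality row must hold strictly (up to a tolerance)
    ineq_ok = all(dot(row) < rhs + tol for row, rhs in zip(A, b))
    # every equality row must hold exactly (up to a tolerance)
    eq_ok = all(abs(dot(row) - rhs) < tol for row, rhs in zip(Ae, be))
    return ineq_ok and eq_ok

# Two variables: x0 + x1 < 10 (inequality), x0 - x1 = 2 (equality).
A, b = [[1.0, 1.0]], [10.0]
Ae, be = [[1.0, -1.0]], [2.0]
print(satisfies(A, b, Ae, be, [4.0, 2.0]))  # True
print(satisfies(A, b, Ae, be, [9.0, 7.0]))  # False: 9 + 7 exceeds 10
```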
Though the equality and inequality constructs are almost identical, solvers accept them as two different sets of matrices. We construct both sets of matrices in a similar fashion.
The process of constructing the matrices is as follows. We first determine the vector x. We use the word determine because x is not constructed in matrix form per se; rather, x is considered as an ordering of the decision variables. Decision variables, if we recall, are the instances of the various ontological classes created in step 1. Fixing the order does not change the execution of the algorithm, so any convention which completely covers the ontological class space is sufficient. However, fixing an order is necessary, as this order determines the placement of coefficients in matrices A and Ae.
Our model has a single ontological class of decision variables, Xi,t. We fix an order for expanding the two dimensional space of X by arranging rows before columns. This step fixes our x vector.
Our framework proceeds by processing the equations determined in step 2. For each inequality constraint, a row is added to matrix A and one to b; a similar step is executed for each equality constraint, but on matrices Ae and be. In a newly added row of A, all elements are zero except the ones specified by the equation. The constant values for the coefficients of A and the value in b are then placed. This step is repeated for all the equations generated in step 2, at the end of which the complete matrices A, Ae, b and be are produced.
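The ordering and row-construction steps above can be sketched as follows. The names are illustrative, and the "rows before columns" convention shown here is one possible choice; its column numbering may differ from the numbering used in the running example below.

```python
# Sketch of the construction step: fix a row-major ordering of X[i,t],
# then append one row per expanded equation (all-zero except the
# coefficients specified by that equation).
def var_index(i, t, n_intervals):
    # "rows before columns": all intervals of cluster 0, then cluster 1, ...
    return i * n_intervals + t

def add_row(A, b, n_vars, entries, rhs):
    row = [0.0] * n_vars            # all elements zero by default
    for col, coeff in entries:      # place only the specified coefficients
        row[col] = coeff
    A.append(row)
    b.append(rhs)

n_clusters, n_intervals = 50, 6
n_vars = n_clusters * n_intervals   # 300 columns in A and Ae
A, b = [], []
# e.g. an expanded equation 1 * X[9][1] < 6 (0-based indices):
add_row(A, b, n_vars, [(var_index(9, 1, n_intervals), 1.0)], 6.0)
print(len(A[0]), sum(A[0]))  # 300 columns, a single non-zero coefficient
```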
6.5.3 Running Example
We now describe the construction of a row for equation 6.28. Let us assume that 50 clusters were created during our pre-processing and that we have 6 time slots. This means that step 1 will provide us with the value of 300; this is the number of decision variables in our system. For the model generating step, this means that the size of the x vector will be 1 × 300 and matrices A and Ae will have 300 columns.
Let us assume that cluster number 10 has 18 elements. Our equation for the second time period from step 2 will then look like the following:
1 × X10,2 < 6
Our model construction will construct the following row for this equation in matrix A.
Column   1 .. 61   62   63 .. 300
Value    0 .. 0    1    0 .. 0
In addition, it will add a row in matrix b and put the value 6 in the newly added row.
A complete matrix A thus has i × t columns and contains i × t equations for meta-equation 6.28, t equations for meta-equation 6.29, i × t equations for meta-equation 6.30, and a solitary equation for meta-equation 6.27.
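The row count for the running example can be tallied directly from these counts:

```python
# Worked tally of the equations generated for the running example:
# 50 clusters (i) and 6 time slots (t).
i, t = 50, 6
rows_628 = i * t   # one equation per (i, t) tuple for meta-equation 6.28
rows_629 = t       # one equation per time slot for meta-equation 6.29
rows_630 = i * t   # one equation per (i, t) tuple for meta-equation 6.30
rows_627 = 1       # a solitary equation for meta-equation 6.27
print(rows_628 + rows_629 + rows_630 + rows_627)  # 607 rows in total
```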
6.5.4 Evaluation
We have designed a framework for modeling the optimization of large scale power systems. Conservation of power through optimizing the usage of end user devices is not a new concept. However, to our knowledge very few techniques are available which are scalable and efficient enough to achieve this goal. So far the major work in this field has been performed on fixed-size systems where the number of devices is known at design time. The models of such systems are built before deployment, based on the largest possible or worst case deployment of the system [Ashok, 2007, Galus and Andersson, 2008].
Our system engineers the model at runtime instead of populating the variables of a fixed model. Therefore our evaluation compares the existing modeling methods for a similar smart-grid application with our runtime modeling results. We claim improved performance using two key metrics: first, our response time is faster than that of a fixed model; second, we claim better efficiency in achieving the goal of the optimization, i.e. in distributing power to the consumers.
The aforementioned 'efficiency' of our electric distribution refers to the unutilized power (UP) in the system that an optimization is unable to distribute amongst the electric devices. The details of why such unutilized power exists are discussed in our previous work [Javed and Arshad, 2009b]. We would like to state here that the increased efficiency in our example system arises because we modeled it such that a decrease in model size increases efficiency. Thus our efficiency results are applicable when the system can be, and is, modeled in a way which relates efficiency to the size of the model.
Our evaluation thus tests the following hypothesis: does modeling at runtime, for a system that varies in size and structure, result in benefits in terms of time or efficiency? To test this hypothesis we used two sets of real data collected from two different sources. We split the evaluation in this way for two reasons: first, consumption data of individual users for a city is not readily available; second, this split analysis proves the applicability of our framework for systems both large and small.
Our first set is a small but detailed study of household energy use in Sollentuna, Sweden, performed over the course of two years. Experiments on this data are used to show a correlation between total consumption, time to calculate, and the number of users. Our second experiment uses data from the state of California, USA. In this experiment we apply our modeling framework to a large scale data set and observe the benefits in terms of efficiency.
Evaluation Setup
For our evaluations we used a shared 2.4 GHz Pentium Core 2 Duo processor with a total of 2.00 GB of RAM. The mathematical solvers used were from Matlab's optimization toolbox.
Evaluation Data Details
Our first experiment uses hourly consumption data from approximately 700 houses collected in Sollentuna, Sweden for the years 2005-2006. Through this experiment we validated the following:
• There exists a strong correlation between the time taken for optimization by the dynamic modeler and the consumption of energy.
• There is a weak correlation between a fixed model optimization and the consumption of energy.
• There exists a strong correlation between the total demand for energy and the number of consumer clusters.
Whereas the first two claims support the case for dynamic modeling, the last claim helps us construct a more powerful scenario for validating the scalability and applicability of our modeling framework.
Our modeling framework can model and optimize systems which vary in size. The real benefit of the system is attained when the variation in size is considerable and the scale of the optimization is large, since a small scale LP optimization in itself takes insignificant time. To test our framework on a large scale realistic system we use data published by CAISO. This data consists of the daily usage of electricity in the state of California, USA; a sample is provided in figure 6.16. However, this data is incomplete for our modeling, since we require the usage patterns of individual users and not just the total consumption of the system. To overcome this problem, we artificially constructed the clusters of users by dividing the total consumption according to a Gaussian distribution. A Gaussian distribution was used because it was the most appropriate and simple distribution to represent the natural behavior of a large number of users. Though the distribution of load has only a minor impact on the overall performance, we still consider modeling and evaluating the system with different distributions as part of our future work. To validate this construction further, we used results from our first experiment set: even though it intuitively makes sense that increased consumption means an increase in the number of consumers, we base our argument for constructing the usage patterns of individual users on the correlation found between consumption and users in our first experiment.
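The synthetic-cluster construction can be sketched as follows. The cluster count and the Gaussian parameters (`mu`, `sigma`) here are illustrative assumptions, not the values used in our experiments.

```python
# Hedged sketch of the synthetic-cluster construction: an hourly total is
# divided over clusters whose shares follow a Gaussian profile, then
# rescaled so the shares sum back to the published total.
import random

def split_total(total_mwh, n_clusters, mu=1.0, sigma=0.25, seed=42):
    rng = random.Random(seed)
    # draw positive Gaussian weights, one per artificial cluster
    weights = [max(rng.gauss(mu, sigma), 0.01) for _ in range(n_clusters)]
    scale = total_mwh / sum(weights)
    return [w * scale for w in weights]

loads = split_total(total_mwh=30000.0, n_clusters=50)
print(len(loads), round(sum(loads), 3))  # 50 clusters summing to the total
```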
In the following sections we describe the standard modeler, i.e. the modeler simulating the prevalent modeling methods in the smart-grid literature, and our dynamic modeler, using the aforementioned sets of data. The first set of data validates the correlations and the second set validates the scalability and efficiency of our framework in a large scale environment.
Standard Modeler
Smart-grid techniques which focus on global optimizations, such as [Ashok, 2007], [Izquierdo et al., 2008], [Jabr et al., 2000], and [Wang et al., 2002], build models for the worst case scenario. Without a runtime modeling framework this is necessary, because updating the system model manually at runtime is not possible.
Figure 6.16: Consumption profile of California for a day as published by CAISO (consumption in MWh)
For a system such as our micro-management application for smart-grids, a model using the standard method means constructing a model for the worst possible day throughout the life cycle of the system. Instead of simulating this scenario, we only consider the cluster configuration for the worst hour of the day on which we conducted our experiments. Note that this is not the worst case or largest configuration over the system life cycle; however, it provides a sufficient comparison, since our technique proves faster even against it. We use the number of clusters as the metric here because the size of the model depends on the number of clusters for each hour. We used standard k-means clustering on the input data, where k is the worst-case cluster count for the day. These k clusters and their frequencies populate the fixed input matrix for the optimizer.
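The standard modeler's setup can be sketched as below. This is a minimal 1-D Lloyd's iteration for illustration, not the clustering implementation used in our experiments; the sample values are invented.

```python
# Illustrative sketch of the standard modeler's setup: k-means over the
# hourly consumption values, with k fixed to the worst-case cluster count.
def kmeans_1d(values, k, iters=20):
    # crude initialization: spread centroids across the sorted values
    centroids = sorted(values)[:: max(len(values) // k, 1)][:k]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # move each centroid to the mean of its assigned values
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    freqs = [len(g) for g in groups]
    return centroids, freqs

values = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.9, 10.1]
centroids, freqs = kmeans_1d(values, k=3)
print(freqs)  # the cluster frequencies populate the fixed input matrix
```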
6.5.5 Evaluation Results
Swedish Household Consumption Data
Our first experiment uses the data collected from Sollentuna, Sweden. The data consists of the electricity consumption of a suburb of Sollentuna, collected at an interval of 1 hour. We use these consumption profiles as input for both our dynamic modeling framework and the standard modeler. We conducted the experiments multiple times and considered the mean of the runs to deal with operating-system-related noise in the response time; the response time for the small data-set is small enough to be affected by background processes of the operating system.
The execution times for the dynamic modeler and the standard modeler, together with the total demand of the system, are shown in figure 6.17. Here the line with square points represents the time for the standard modeler in seconds, the line with diamond points the response time of the dynamic modeler, and a third line the total energy demand in MWh. As can be observed, there is a correlation between the demand and the response time of the dynamic modeler: the Pearson correlation coefficient for these observations is 0.75. On the other hand, the relation between the response time of the standard modeler and the demand comes out as weakly inverse (-0.3, Pearson). This validates our first two claims: a strong correlation exists between the time taken by the dynamic solver and the total demand of the system, and a fixed size modeler is not able to benefit from changes in demand.
Our third claim is illustrated by the graph in figure 6.18. Here the dotted line represents the total demand for each hour, the line with square points represents the number of users, and the solid line with triangle points represents the cluster count. Here we can see the relation between the number of consumption clusters and the total consumption: a strong correlation exists between the number of clusters and the total consumption (Pearson coefficient 0.83).
We can thus conclude from this experiment that a strong correlation exists between the consumption, the number of users, and the time taken by the dynamic modeler. Furthermore, no such correlation was observed between the standard modeler and the total consumption.
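The Pearson coefficients quoted above are the standard sample correlation; a minimal sketch of the computation, on invented data, is:

```python
# Sketch of the Pearson correlation used to quantify the relationships
# above (dynamic-modeler time vs. demand, cluster count vs. demand).
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfectly linear pair correlates at 1.0; an inverse pair at -1.0.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # ≈ 1.0
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # ≈ -1.0
```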
CAISO Data
Our second evaluation compares our modeling framework with the standard modeling method on the criteria of running time and efficiency, as if we were to distribute electricity in the state of California using our method. We evaluated our system by running both systems on 24 hours of data from a power distributor's profile.
We observed that our framework's execution time was considerably lower than that of the standard modeler. Figure 6.19 plots our framework's time against the standard modeler's time. Here the squares represent the time in seconds the standard modeler required to model and optimize the data for that specific time period, and the diamonds represent the time in seconds for our runtime framework. It can be observed that runtime modeling is considerably faster throughout, except in two cases, the 6th and the 21st periods. These are the cases where the size of the runtime model was at its maximum and both models were of similar size. We witnessed on average a 56% better response time than the standard system.
Figure 6.17: Response time for dynamic and standard modeler in comparison to demand (response time in seconds)
Figure 6.18: Comparison between demand, clusters and active users for a 24 hour period as observed in Sollentuna, Sweden
Figure 6.19: Solver time for 24 hours of CAISO data (response time in seconds)
Figure 6.20: Solver efficiency for 24 hours of CAISO data (power allocation in kilowatts)
Our second evaluation goal was to achieve better performance. Figure 6.20 plots the power allocated by the runtime framework and by the standard modeler. Here diamonds represent the runtime framework's allocation of power in megawatts and squares represent the standard modeler's results. We observed a marginal improvement in the allocation of power: the total increase in power allocation was close to 2%, which is significant for a large scale system.
Our results show that our runtime modeling framework is faster than a static modeling method; our runtime modeler is approximately 50% faster than the standard modeler. Furthermore, we observed that we can achieve better performance for our specific model through the use of dynamic runtime modeling.
6.5.6 Future Dimensions of Runtime Modeling
Dynamic modeling intuitively leads to a more efficient optimization. Since the model only consists of variables and constraints that are applicable at that instance, a more streamlined and concise model is constructed, resulting in faster optimization and better results.
From the study of applications in smart grids, cloud computing and other fields where adaptable behavior is anticipated, we see that rigidity of structure will not be guaranteed in our future systems. We have seen optimization applications in smart-grids, such as applications for Plug-in Hybrid Electric Vehicles (PHEVs) [Galus and Andersson, 2008], where the demand pattern of the users is an evolving phenomenon. From a modeling perspective this means that the relationships and constraints for the system will be described at runtime. An even more appropriate comparison is the work of Ogston and colleagues, who define an adaptive clustering method to group together various devices [Ogston et al., 2007]. The technique is scalable for clustering the devices in a city, and the resulting clusters, their patterns, frequencies and shapes will emerge at runtime. If we are to use this data to manage these devices, then runtime engineering of a model that considers the new clusters and patterns will be necessary.
Our modeling framework provides the basis for engineering models for such techniques of the future. Although our existing work caters for LP, the three step engineering process described in section 6.5.2 for creating models is more or less the same for modeling non-linear, integer and some heuristic optimizations. Our work not only provides a solution for the smart grid problem but also provides a foundation for future dynamic modeling with these modeling techniques.
Our current framework is a proof of concept and requires engineering to integrate a meta-model into our framework. In our future work we look at ways to bridge this gap. We are working on evolving a method to define abstract mathematical models in a language which our framework can understand and create a meta-model from. To this end we are evaluating
various modeling languages and are planning to include a translation engine which will translate abstract mathematical equations into a meta-model. Such work will streamline the integration of our framework with existing optimization platforms.
Our second direction looks at ways of determining constraints from system statistics. In our current framework, the cardinality of the system constraints is determined solely by the cardinality of the quantifiers. However, systems which can "sense" constraints through statistical analysis can produce much more powerful modelers.
Our third direction of interest is the integration of our framework and optimizers with physical infrastructure to implement the optimization of resources. A running system of this sort will be of real benefit to society.
6.6 Conclusion
In this chapter we have provided three methods for a scalable, dynamically modeled scheduling component which self-optimizes its accuracy based on the size of the system. The proposed methods have three distinct advantages over the static planners proposed in the literature. First, the system is able to plan for large scale scheduling. Second, the system is able to trade accuracy against size for optimal scheduling. Third, the system is able to model itself at runtime to increase its accuracy without the involvement of human operators.
Chapter 7
Conclusion and Future Work
Demand side management (DSM) is the task of managing end user consumption for the optimal provisioning of electricity. Whereas DSM for large customers, or at a coarser control granularity, has been deployed for some years now, applying the same strategies to domestic devices through manual control has been considered infeasible due to the complexity of the task and user fatigue [Kim and Shcherbakova, 2011]. What is required is a self-managing system which can automate the task of energy management. However, fine-grained control of energy devices is a very hard problem due to the complexities of volatility and size.
The demand for energy in a house is extremely volatile when compared to the loads of a city or of large scale industry. Forecasting such load using existing modeling paradigms yielded inaccurate results, and additional data about the consumer did not provide any increase in accuracy. Even if a perfect forecast were available, scheduling devices for optimal use reduces to an NP-complete scheduling problem, making it intractable.
In this thesis we have presented methods and techniques to resolve the volatility and size issues gracefully and deduce the best possible planning for a DSM system. These techniques work within a self-managing control loop which seamlessly integrates them into a self-managing demand side management system. In this chapter we summarize the results from the different cogs that together make up a comprehensive self-managing DSM system.
7.1 Summary
In chapter 1 we argued that we require self-managing demand side management for domestic consumers. This self-management is needed to intelligently automate the task of DSM, which would relieve the human consumer and operator from the continuous monitoring and planning of devices. Our strategy is to control the heavy, high-usage loads, since they have the biggest impact on consumption. To provide a fair and equitable distribution, we restrict our plans to the limits prescribed by service level agreements. A service level agreement is a contract between a utility provider and a consumer limiting the maximum load shedding that can be done over a period of time.
To plan such devices we require a scalable planning strategy, which we discussed in chapter 6. To construct such a plan we require a forecast of device consumption; due to severe volatility, constructing such a profile is non-trivial. In comparison, forecasting for a house and then disaggregating the peak load was found to be a more feasible solution. In chapter 4 we discussed the forecasting paradigm that we introduced to forecast household loads, and in chapter 5 we discussed the disaggregation results proving the effectiveness of this strategy for providing heavy-load data for planning. In chapter 3 we presented an architecture which ties these cogs together to deliver self-managing demand side management. In this chapter we will in turn summarize the results for the three cogs of the architecture.
7.1.1 Planning
Scheduling devices under a set of constraints is an NP-complete problem; thus scheduling loads exactly, even for a small population, is not possible. To resolve this issue we transformed our problem into a frequency demand through clustering and then applied linear programming to find the frequency for each clustered group (chap. 6, sec. 2). This transformation introduced an error of up to 6% in the optimality of the solution but made the problem scalable. There are two conclusions from this strategy. First, we can reduce loads by as much as 30% through this strategy. Second, we showed that through this transformation we are able to plan for hundreds of thousands of devices within the time constraints. We also show that the transformation and optimization are fast enough for us to replan the system in case the underlying system changes and the data becomes invalid.
An observation from the energy system is that the number of devices in the system is not constant. Since the heating and cooling loads depend on the weather, the number of devices requiring energy varies over time. Such variations provide us with an opportunity to improve our optimization performance. Though the transformation is necessary for large scale optimization, for a small scale problem of close to 500 devices an exact solution is possible if sufficient time is provided to the optimizer. Our results showed that through this adaptable optimization we are able to perform better than the transformation strategy in 75% of the cases (chap. 6, sec. 3). Furthermore, AdOpt is capable of handling all the scalability variations in the system, whereas the exact integer programming solution can only cover 60% of the cases.
Since the size of the system varies, the model dimensions should also vary. However, generating a model at runtime is a non-trivial problem. In chapter 6, section 4, we presented a dynamic modeling technique. This technique varies the dimensions of the model by observing the types of systems and their frequencies and builds an optimal model at runtime. Our results show that such modeling marginally reduces the size of the system, resulting in faster calculations and, in the case of the transformation, a lower error.
The task of planning is strongly dependent upon knowledge of the consumption data of the devices. The planning requires as input the number of controllable devices that are predicted to be in use in the next hour. But forecasting at the device level is a very difficult task due to the volatility of the data. However, if we can forecast the house and disaggregate the load to find the usage of each device, then the planning system can be deployed. Next we discuss the forecasting methodology, followed by the disaggregation results.
7.1.2 Forecasting
Forecasting energy load in general is a non-trivial task. However, advances in forecasting have resulted in good strategies for forecasting regional or city-wide loads. The major innovation which led to this advance was incorporating weather and other related data which impact the consumption of electricity in a region. But for household loads, the extreme volatility of the data makes the forecast difficult. Though it can be argued that this volatility can be characterized to a certain degree by observing the occupants and the structure of the house, studies correlating consumption with these attributes did not yield positive results.
However, we observed a subtle but strong temporal relationship between the attributes and the consumption. By temporal we mean that the attributes showed varying correlation with the consumption over time. Although it was difficult to quantify this, we used this subtle relationship to improve the forecasting accuracy. Existing short-term load forecasting methods usually apply a technique to a single house; but due to the extreme volatility and small sample space, the model was inaccurate and showed signs of over-fitting. In some cases, forecasters grouped together similar houses based on some parameter and then forecasted for the entire group; since the correlation across attributes for all times was weak, the forecast accuracy in such cases was also low. We instead built a multi-dimensional model where each attribute, including the hour of day, served as a dimension of the model (chap. 4). Data from all the houses was used to train the model, where the hour of the day and the house attributes became the parameters of the model. We used a back propagation neural network for this task. The internal mechanism of the neural network identified the temporal relation of loads with attributes and provided us with better results than the existing modeling paradigm.
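The multi-dimensional input encoding can be sketched as below. The attribute names, the cyclic hour encoding, and the tiny randomly-initialized network are illustrative assumptions only; they do not reproduce the network used in chapter 4.

```python
# Hedged sketch of the multi-dimensional input: hour of day and house
# attributes together form one feature vector per training tuple.
import math, random

def encode(hour, attributes):
    # encode the hour cyclically so hour 23 and hour 0 are neighbours
    angle = 2 * math.pi * hour / 24
    return [math.sin(angle), math.cos(angle)] + list(attributes)

def forward(x, w_hidden, w_out):
    # one hidden layer with a sigmoid activation, linear output
    def sig(z):
        return 1.0 / (1.0 + math.exp(-z))
    hidden = [sig(sum(wi * xi for wi, xi in zip(row, x))) for row in w_hidden]
    return sum(wo * h for wo, h in zip(w_out, hidden)), hidden

rng = random.Random(0)
# hypothetical attributes: floor area, occupant count, electric-heating flag
x = encode(hour=18, attributes=[120.0, 4, 1])
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in x] for _ in range(5)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(5)]
prediction, _ = forward(x, w_hidden, w_out)
print(len(x))  # 5 input dimensions: 2 for the hour, 3 house attributes
```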
This multi-dimensional model was better able to capture the temporal correlation of attributes with consumption. This resulted in an increase in accuracy by as much as 50% and a reduction in mean squared error by as much as 39%. However, even with such improvement our results are not very accurate: the maximum achieved accuracy is only 65% and the least variance is close to 2. Secondly, our planning algorithm requires the heating and air-conditioning device states and not the total energy profile. To identify heating and cooling loads, we disaggregated the forecasted load, which we discuss next.
7.1.3 Load Disaggregation
Load disaggregation is the task of identifying an individual device's load profile or state by observing only the total load. Load disaggregation is usually used for non-intrusive load monitoring (NILM), identifying device usage in real time. The data for NILM is usually a time series at a very high frequency, ranging from 16 kHz down to a reading every second. However, our data is not realtime but is rather a forecast, at the level of one reading per hour. On the positive side, we only require disaggregation for the heavy heating and cooling loads, which are very pronounced in the load signatures.
To achieve this disaggregation we applied a combination of neural networks and support vector machines (SVM) to disaggregate the heating load from the forecasted data (chap. 5). Our main concern in this task was to find all the high consumption devices which are switched on. Thus the important parameter is accuracy: the percentage of times we identified the usage of a device. Our results show that we can attain an accuracy of 99% or more with the ANN-SVM combination for disaggregating our target loads.
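The accuracy figure above is the fraction of correctly identified on/off states. A minimal sketch of the metric, on invented state sequences, is:

```python
# Sketch of the accuracy metric used above: the fraction of hours for
# which the on/off state of a heavy load was identified correctly.
def state_accuracy(predicted, actual):
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

predicted = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # illustrative on/off states
actual    = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(state_accuracy(predicted, actual))  # 0.9
```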
However, our forecast is not perfect: its accuracy varies between 52.5% and 65%, with variance ranging between 2 and 4.23. To simulate the forecasting error and characterize the disaggregation results over an inaccurate forecast, we incorporated noise corresponding to the worst forecast numbers into the load data. Our results show that even with a faulty forecast we can achieve an accuracy of 97%.
7.1.4 Putting It All Together
In this thesis we presented a method to forecast energy load with higher accuracy than existing methods. We then presented our findings on the ability to identify heavy loads from this forecast. We then showed that if we have the list of devices which are under our DSM plan, then we can reduce energy demand. Furthermore, we can adapt this planning based on system size, yielding an exact solution or gracefully reducing accuracy in favor of a scalable solution for large scale data. In chapter 3 we presented a way to integrate these technologies to deliver a single self-managing DSM solution.
7.2 Lessons Learnt
In this section, we present a list of lessons we learned while working on this thesis.
Adaptability is key to scalability
Scalability is a major concern for future smart grids. To mitigate the scalability issue there are two traditional solutions. One is to limit the size of the system and use exact or close-to-exact solutions. The other is to use distributed algorithms; such algorithms for DSM generally sacrifice accuracy but can resolve larger system configurations. We learnt through our experience in this thesis that the key to maintaining scalability while providing accurate solutions is adaptability. If the system is dynamic in size, that is, the size of the system varies over time, then this dynamic nature can be leveraged to vary the accuracy of the system. This way we can deliver timely solutions in every scenario, and can also increase our accuracy whenever the system configuration allows.
Forecasting is an anthro-structuro-temporal phenomenon
Short term electricity load forecasting is traditionally modeled with global phenomena such as temperature, day of week, etc. This is acceptable for regional load modeling, since these variables impact the majority of the population. However, when forecasting for a single house, care should be taken in choosing the features which define the variability of the house. We found that for our system, energy consumption has an intricate relation with the social, anthropologic and structural features of the house. The relation was not visible over the entire period but rather was temporal; that is, different features showed a relationship with energy consumption at different hours of the day or days of the week. This led us to think of energy forecast modeling in a very different way, leading to STMLF.
Volatility is the biggest challenge for forecasting
Volatility is by far the biggest challenge in making an accurate forecast. In this regard we learnt that it is important to capture and model data which reduces the volatility. Attempting to forecast the extremely volatile device data would not have yielded acceptable results. However, by capturing household consumption data we were able to make a better forecast, due to the relatively lower volatility of this data. Since our forecast was more accurate, we were able to reconstruct the device consumption data through disaggregation.
Holistic picture helps in identifying key parameters
Various researchers have attempted to resolve the components of the DSM problem in isolation. However, we observed that a holistic approach is more practical. For instance, the issue of forecasting device loads is still not resolved, but looking at it holistically from the system perspective, the household forecast-disaggregate route produces much better results.
7.3 Future Work
For each of the contributions in this thesis we have a list of future directions that can be explored.
7.3.1 Forecasting
The proposed short term load forecast method is a proof of concept anthropologic and
structural data can bene�t forecasting. We have three proposed strategy to move forecasting
forward.
- First we would like to group together houses on attribute-temporal axis and then con-
struct the forecast either in the sub-groups or by weighting the attributes according to
their groupings.
- Second, we would like to explore the mis-forecasted tuples as discussed in chap. 4 sec.
6. Our conjecture is that we do not have the attributes to differentiate these regularly
mis-forecasted tuples from the rest of the data, which is why the forecast is so regularly
inaccurate. We would like to isolate these tuples and devise a strategy to infer
the missing attributes and reconstruct the forecast using these pseudo-attributes.
- Third, we would like to test the generalization quality of our system. We would like
to see if we can reconstruct attributes of a house by observing its load profile and
matching it with our labeled data.
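The second direction above, isolating regularly mis-forecasted tuples, could start with a simple filter: flag any tuple whose absolute forecast error exceeds a threshold in most of the observed periods. The sketch below is illustrative only; the function name, thresholds, and house identifiers are hypothetical.

```python
def regularly_missed(errors_by_tuple, threshold=0.2, min_rate=0.7):
    """Flag tuples whose absolute forecast error exceeds `threshold`
    in at least `min_rate` of the observed periods."""
    flagged = []
    for tuple_id, errors in errors_by_tuple.items():
        rate = sum(abs(e) > threshold for e in errors) / len(errors)
        if rate >= min_rate:
            flagged.append(tuple_id)
    return flagged

# Hypothetical per-house absolute forecast errors over five days.
errors = {
    "house_a": [0.05, 0.02, 0.08, 0.01, 0.04],   # well forecast
    "house_b": [0.35, 0.41, 0.12, 0.38, 0.44],   # regularly missed
}
print(regularly_missed(errors))  # → ['house_b']
```

The flagged subset would then be the input for inferring the missing pseudo-attributes proposed above.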
7.3.2 Load Disaggregation
Our current load disaggregation solution aims at reducing false negatives only. This has the
repercussion of a high false-positive rate. Though this does not affect the correctness of our system,
it may result in a sub-optimal solution by presenting a larger pool of demand than actually exists.
One future direction in this work is the application of algorithms that reduce the false-positive
rate without compromising the false-negative rate.
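Reducing false positives without sacrificing false negatives is, at its simplest, a threshold-tuning problem. The sketch below (synthetic detection scores and labels, hypothetical names) computes both rates for a device-on detector at several thresholds to show the trade-off:

```python
def rates(scores, labels, threshold):
    # Predict "device on" when the score exceeds the threshold,
    # then compare against the ground-truth labels.
    fp = sum(s > threshold and not y for s, y in zip(scores, labels))
    fn = sum(s <= threshold and y for s, y in zip(scores, labels))
    neg = labels.count(False)
    pos = labels.count(True)
    return fp / neg, fn / pos

# Synthetic detector scores and ground truth for eight events.
scores = [0.9, 0.8, 0.75, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, True, False, True, False, False, False]

# A permissive threshold keeps false negatives low at the cost of
# false positives; a stricter one trades the other way.
for th in (0.15, 0.5, 0.7):
    fp, fn = rates(scores, labels, th)
    print(f"threshold {th}: FP rate {fp:.2f}, FN rate {fn:.2f}")
```

The future-work goal amounts to finding detectors or features that push this curve down, i.e. lower false positives at the same (near-zero) false-negative operating point.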
7.3.3 Planning and Modeling
- Our current adaptive model has only two stages: either we construct an exact solution
or an approximate one. A more robust solution may partially transform the problem
based on its size and real-time constraints. In such a system, instead of transforming all
the decision variables from the binary domain to the frequency domain, we may adapt only
as many variables as the real-time constraints require. This would provide a spectrum of
adaptation instead of the current two-state model.
- Our current solution only looks at a small subset of contractual possibilities. We would
like to model other types of contracts as constraints as well, to provide different types
of options to the consumers.
- Our current solution does not incorporate time-of-use pricing or other financial-incentive-
based DSM strategies. We would like to extend our planning algorithm to these
strategies as well.
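The partial-transformation idea in the first item above can be sketched as a budget-driven decision: estimate solve time under a toy cost model (the per-variable costs below are hypothetical, not measured) and keep only as many variables binary as the real-time budget allows, relaxing the rest.

```python
def plan_relaxation(n_binary, budget_ms, cost_per_binary_ms=2.0,
                    cost_per_relaxed_ms=0.1):
    """Decide how many decision variables to keep binary so that the
    estimated solve time fits the real-time budget. Toy cost model:
    a binary variable is assumed ~20x costlier than a relaxed one."""
    keep = n_binary
    while keep > 0:
        est = (keep * cost_per_binary_ms
               + (n_binary - keep) * cost_per_relaxed_ms)
        if est <= budget_ms:
            break
        keep -= 1
    return keep, n_binary - keep

kept, relaxed = plan_relaxation(n_binary=1000, budget_ms=500)
print(f"keep {kept} binary, relax {relaxed}")
```

Sweeping the budget yields the "spectrum of adaptation" described above: a generous budget keeps the problem exact, a tight one degrades it gradually rather than switching wholesale to the approximate formulation.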
170
Bibliography
[Aalami et al., 2010] Aalami, H., Moghaddam, M. P., and Youse�, G. (2010). Demand
response modeling considering interruptible/curtailable loads and capacity market pro-
grams. Applied Energy, 87(1):243 � 250.
[Aamodt and Plaza, 1994] Aamodt, A. and Plaza, E. (1994). Case-based reasoning: founda-
tional issues, methodological variations, and system approaches. AI Commun., 7(1):39�59.
[Abaravicius and , 2007] Abaravicius, J. and , Sernhed, K. a. P. J. (2007). More or less
about data-analyzing load demand in residential houses. In ACEEE 2006 Summer Study,
Paci�c Grove, California (2007).
[Abdel-Aal, 2005] Abdel-Aal, R. (2005). Improving electric load forecasts using network
committees. Electric Power Systems Research, 74(1):83 � 94.
[Abdelwahed et al., 2004] Abdelwahed, S., Kandasamy, N., and Neema, S. (2004). A control-
based framework for self-managing distributed computing systems. In WOSS '04: Pro-
ceedings of the 1st ACM SIGSOFT workshop on Self-managed systems, pages 3�7, New
York, NY, USA. ACM Press.
[Abrahao et al., 2006] Abrahao, B., Almeida, V., and Almeida, J. (2006). Self-adaptive sla-
driven capacity management for internet services. In in 17th IFIP/IEEE International
Workshop on Distributed Systems: Operations and Management, DSOM.
171
[Albadi and El-Saadany, 2008] Albadi, M. and El-Saadany, E. (2008). A summary of de-
mand response in electricity markets. Electric Power Systems Research, 78(11):1989 �
1996. A summary of demand response in electricity markets.
[Alfares and Nazeeruddin, 2002] Alfares, H. K. and Nazeeruddin, M. (2002). Electric load
forecasting: Literature survey and classi�cation of methods. International Journal of
Systems Science, 33(1):23�34.
[AlFuhaid et al., 1997] AlFuhaid, A., El-Sayed, M., and Mahmoud, M. (1997). Cascaded
arti�cial neural networks for short-term load forecasting. Power Systems, IEEE Transac-
tions on, 12(4):1524 �1529.
[Alliance, 2006] Alliance, Z. (2006). Zigbee speci�cations, version 1.0 r13. ZigBee Alliance,
http://www. zigbee. org.
[AlRashidi and EL-Naggar, 2010] AlRashidi, M. and EL-Naggar, K. (2010). Long term elec-
tric load forecasting based on particle swarm optimization. Applied Energy, 87(1):320 �
326.
[Amaral et al., 2008] Amaral, L. F., Souza, R. C., and Stevenson, M. (2008). A smooth tran-
sition periodic autoregressive (stpar) model for short-term load forecasting. International
Journal of Forecasting, 24(4):603 � 615. <ce:title>Energy Forecasting</ce:title>.
[Amjady, 2001] Amjady, N. (2001). Short-term hourly load forecasting using time-series
modeling with peak load estimation capability. Power Systems, IEEE Transactions on,
16(3):498 �505.
[Amjady and Keynia, 2009] Amjady, N. and Keynia, F. (2009). Short-term load forecasting
of power systems by combination of wavelet transform and neuro-evolutionary algorithm.
Energy, 34(1):46 � 57.
172
[Amjady et al., 2010] Amjady, N., Keynia, F., and Zareipour, H. (2010). Short-term load
forecast of microgrids by a new bilevel prediction strategy. Smart Grid, IEEE Transactions
on, 1(3):286 �294.
[Ardakanian et al., 2011] Ardakanian, O., Keshav, S., and Rosenberg, C. (2011). Markovian
models for home electricity consumption.
[Ashok, 2007] Ashok, S. (2007). Optimised model for community-based hybrid energy sys-
tem. Renewable Energy, 32(7):1155 � 1164.
[Bakirtzis et al., 1995] Bakirtzis, A., Theocharis, J., Kiartzis, S., and Satsios, K. (1995).
Short term load forecasting using fuzzy neural networks. Power Systems, IEEE Transac-
tions on, 10(3):1518 �1524.
[Beal et al., 2012] Beal, J., Berliner, J., and Hunter, K. (2012). Fast precise distributed
control for energy demand management. In Self-Adaptive and Self-Organizing Systems
(SASO), 2012 IEEE Sixth International Conference on, pages 187�192. IEEE.
[Box and Jenkins, 1994] Box, G. E. P. and Jenkins, G. M. (1994). Time series analysis.
Forecasting and control. Englewood Cli�s, NJ: Prentice-Hall,.
[Breukers et al., 2011] Breukers, S., Heiskanen, E., Brohmann, B., Mourik, R., and Feenstra,
C. (2011). Connecting research to practice to improve energy demand-side management
(dsm). Energy, 36(4):2176 � 2185.
[Cappers et al., 2010] Cappers, P., Goldman, C., and Kathan, D. (2010). Demand response
in u.s. electricity markets: Empirical evidence. Energy, 35(4):1526 � 1535.
[Carpinteiro et al., 2004] Carpinteiro, O. A., Reis, A. J., and da Silva, A. P. (2004). A
hierarchical neural model in short-term load forecasting. Applied Soft Computing, 4(4):405
� 412.
173
[Chan et al., 2000] Chan, W., So, A. T., and Lai, L. (2000). Harmonics load signature
recognition by wavelets transforms. In Electric Utility Deregulation and Restructuring
and Power Technologies, 2000. Proceedings. DRPT 2000. International Conference on,
pages 666�671. IEEE.
[Chatterjee and Hadi, 1986] Chatterjee, S. and Hadi, A. S. (1986). In�uential observations,
high leverage points, and outliers in linear regression. Statistical Science, 1(3):379�393.
[Chen et al., 2004] Chen, B.-J., Chang, M.-W., and lin, C.-J. (2004). Load forecasting using
support vector machines: a study on eunite competition 2001. Power Systems, IEEE
Transactions on, 19(4):1821 � 1830.
[Chen et al., 1993] Chen, J.-L., Tsai, R., and Liang, S.-S. (1993). A distributed problem
solving system for short-term load forecasting. Electric Power Systems Research, 26(3):219
� 224.
[Chen et al., 2010] Chen, Y., Luh, P., Guan, C., Zhao, Y., Michel, L., Coolbeth, M., Fried-
land, P., and Rourke, S. (2010). Short-term load forecasting: Similar day-based wavelet
neural networks. Power Systems, IEEE Transactions on, 25(1):322 �330.
[Chen et al., 2012] Chen, Z., Wu, L., and Fu, Y. (2012). Real-time price-based demand
response management for residential appliances via stochastic optimization and robust
optimization. Smart Grid, IEEE Transactions on, PP(99):1 �9.
[Christiaanse, 1971] Christiaanse, W. (1971). Short-term load forecasting using general
exponential smoothing. Power Apparatus and Systems, IEEE Transactions on, PAS-
90(2):900 �911.
[Cole and Albicki, 1998] Cole, A. I. and Albicki, A. (1998). Data extraction for e�ective non-
intrusive identi�cation of residential power loads. In Instrumentation and Measurement
174
Technology Conference, 1998. IMTC/98. Conference Proceedings. IEEE, volume 2, pages
812�815. IEEE.
[Coll-Mayor et al., 2007] Coll-Mayor, D., Paget, M., and Lightner, E. (2007). Future intel-
ligent power grids: Analysis of the vision in the european union and the united states.
Energy Policy, 35(4):2453 � 2465.
[Cuaresma et al., 2004] Cuaresma, J. C., Hlouskova, J., Kossmeier, S., and Obersteiner,
M. (2004). Forecasting electricity spot-prices using linear univariate time-series models.
Applied Energy, 77(1):87 � 106.
[D. and Uri, 1978] D., N. and Uri (1978). Forecasting peak system load using a combined
time series and econometric model. Applied Energy, 4(3):219 � 227.
[Dai and Wang, 2007] Dai, W. and Wang, P. (2007). Application of pattern recognition
and arti�cial neural network to load forecasting in electric power system. In Natural
Computation, 2007. ICNC 2007. Third International Conference on, volume 1, pages 381
�385.
[Daoxin et al., 2012] Daoxin, L., Lingyun, L., Yingjie, C., and Ming, Z. (2012). Market
equilibrium based on renewable energy resources and demand response in energy engi-
neering. Systems Engineering Procedia, 4(0):87 � 98. <ce:title>Information Engineering
and Complexity Science - Part II</ce:title>.
[Das et al., 2010] Das, R., Kephart, J. O., Lenchner, J., and Hamann, H. (2010). Utility-
function-driven energy-e�cient cooling in data centers. In Proceedings of the 7th interna-
tional conference on Autonomic computing, ICAC '10, pages 61�70, New York, NY, USA.
ACM.
175
[Dash et al., 1998] Dash, P., Satpathy, H., and Liew, A. (1998). A real-time short-term
peak and average load forecasting system using a self-organising fuzzy neural network.
Engineering Applications of Arti�cial Intelligence, 11(2):307 � 316.
[Datchanamoorthy et al., 2011] Datchanamoorthy, S., Kumar, S., Ozturk, Y., and Lee, G.
(2011). Optimal time-of-use pricing for residential load control. In Smart Grid Commu-
nications (SmartGridComm), 2011 IEEE International Conference on, pages 375 �380.
Time of use pricing model for monopolies. On simulation no reboud.
[David et al., 2011] David, H., Fallin, C., Gorbatov, E., Hanebutte, U. R., and Mutlu, O.
(2011). Memory power management via dynamic voltage/frequency scaling. In Proceedings
of the 8th ACM international conference on Autonomic computing, ICAC '11, pages 31�40,
New York, NY, USA. ACM.
[Dehdashti et al., 1982] Dehdashti, A., Tudor, J., and Smith, M. (1982). Forecasting of
hourly load by pattern recognition a deterministic approach. Power Apparatus and Sys-
tems, IEEE Transactions on, PAS-101(9):3290 �3294.
[Deng et al., ] Deng, N., Stewart, C., Kelley, J., Gmach, D., and Arlitt, M. Adaptive green
hosting.
[Diao et al., 2003] Diao, Y., Eskesen, F., Froehlich, S., L. Hellerstein, J., Spainhower, L. F.,
and Surendra, M. (2003). Generic online optimization of multiple con�guration parameters
with application to a database server. In Distributed Systems, Operations and Management
(DSOM), volume 2867/2004 of Lecture Notes in Computer Science. Springer Berlin /
Heidelberg.
[Diongue et al., 2009] Diongue, A., Guagan, D., and Vignal, B. (2009). Forecasting elec-
tricity spot market prices with a k-factor gigarch process. Applied Energy, 86(4):505 �
510.
176
[Du and Lu, 2011a] Du, P. and Lu, N. (2011a). Appliance commitment for household load
scheduling. Smart Grid, IEEE Transactions on, 2(2):411 �419.
[Du and Lu, 2011b] Du, P. and Lu, N. (2011b). Appliance commitment for household load
scheduling. Smart Grid, IEEE Transactions on, 2(2):411 �419.
[El Desouky and Elkateb, 2000] El Desouky, A. and Elkateb, M. (2000). Hybrid adaptive
techniques for electric-load forecast using ann and arima. Generation, Transmission and
Distribution, IEE Proceedings-, 147(4):213 �217.
[Escrivá-Escrivá et al., 2010] Escrivá-Escrivá, G., Segura-Heras, I., and Alcázar-Ortega, M.
(2010). Application of an energy management and control system to assess the potential
of di�erent control strategies in hvac systems. Energy and Buildings, 42(11):2258 � 2267.
[Fan and Chen, 2006] Fan, S. and Chen, L. (2006). Short-term load forecasting based on an
adaptive hybrid method. Power Systems, IEEE Transactions on, 21(1):392 � 401.
[Fan, 2011] Fan, Z. (2011). Distributed demand response and user adaptation in smart grids.
In Integrated Network Management (IM), 2011 IFIP/IEEE International Symposium on,
pages 726 �729. Internet tra�c method for DR. Modeling user preference as willingness
to pay. Simulation based. no rebound calculations.
[Faria and Vale, 2011] Faria, P. and Vale, Z. (2011). Demand response in electrical energy
supply: An optimal real time pricing approach. Energy, 36(8):5374 � 5384. demand
response simulator that allows studying demand response actions and schemes in distri-
bution networks.
[Farinaccio and Zmeureanu, 1999] Farinaccio, L. and Zmeureanu, R. (1999). Using a pattern
recognition approach to disaggregate the total electricity consumption in a house into the
major end-uses. Energy and Buildings, 30(3):245�259.
177
[Femal and Freeh, 2005] Femal, M. and Freeh, V. (13-16 June 2005). Boosting data center
performance through non-uniform power allocation. Autonomic Computing, 2005. ICAC
2005. Proceedings. Second International Conference on, pages 250�261.
[Finn et al., 2012] Finn, P., OConnell, M., and Fitzpatrick, C. (2012). Demand side man-
agement of a domestic dishwasher: Wind energy gains, �nancial savings and peak-time
load reduction. Applied Energy, (0):�.
[Fuller et al., 2011] Fuller, J., Schneider, K., and Chassin, D. (2011). Analysis of residential
demand response and double-auction markets. In Power and Energy Society General
Meeting, 2011 IEEE, pages 1 �7.
[Galus and Andersson, 2008] Galus, M. and Andersson, G. (2008). Demand management
of grid connected plug-in hybrid electric vehicles (phev). Energy 2030 Conference, 2008.
ENERGY 2008. IEEE, pages 1�8.
[Garcia et al., 2005] Garcia, R., Contreras, J., van Akkeren, M., and Garcia, J. (2005). A
garch forecasting model to predict day-ahead electricity prices. Power Systems, IEEE
Transactions on, 20(2):867 � 874.
[Gellings, 1985] Gellings, C. (1985). The concept of demand-side management for electric
utilities. Proceedings of the IEEE, 73(10):1468 � 1470.
[Giorgio and Pimpinella, 2012] Giorgio, A. D. and Pimpinella, L. (2012). An event driven
smart home controller enabling consumer economic saving and automated demand side
management. Applied Energy, 96(0):92 � 103. <ce:title>Smart Grids</ce:title>.
[Goldsby and Cheng, 2008] Goldsby, H. J. and Cheng, B. H. (2008). Automatically gen-
erating behavioral models of adaptive systems to address uncertainty. In Model Driven
Engineering Languages and Systems, pages 568�583. Springer.
178
[Greening, 2010] Greening, L. A. (2010). Demand response resources: Who is responsible
for implementation in a deregulated market? Energy, 35(4):1518 � 1525. wider implemen-
tation will need to accrue from coordinated actions along the electricity supply chain.
[Gross and Galiana, 1987] Gross, G. and Galiana, F. (1987). Short-term load forecasting.
Proceedings of the IEEE, 75(12):1558 � 1573.
[Guan et al., 2010] Guan, X., Xu, Z., and Jia, Q.-S. (2010). Energy-e�cient buildings fa-
cilitated by microgrid. Smart Grid, IEEE Transactions on, 1(3):243 �252. Coordinate
energy sources and loads for low energy building.
[Gudi et al., 2011] Gudi, N., Wang, L., Devabhaktuni, V., and Depuru, S. (2011). A
demand-side management simulation platform incorporating optimal management of dis-
tributed renewable resources. In Power Systems Conference and Exposition (PSCE), 2011
IEEE/PES, pages 1 �7.
[Gupta et al., 2010] Gupta, S., Reynolds, M. S., and Patel, S. N. (2010). Electrisense: single-
point sensing using emi for electrical event detection and classi�cation in the home. In
Proceedings of the 12th ACM international conference on Ubiquitous computing, pages
139�148. ACM.
[Gurguis and Zeid, 2005a] Gurguis, S. A. and Zeid, A. (2005a). Towards autonomic web
services: achieving self-healing using web services. SIGSOFT Softw. Eng. Notes, 30(4):1�
5.
[Gurguis and Zeid, 2005b] Gurguis, S. A. and Zeid, A. (2005b). Towards autonomic web
services: achieving self-healing using web services. SIGSOFT Softw. Eng. Notes, 30(4):1�
5.
[Hagan and Behr, 1987] Hagan, M. T. and Behr, S. M. (1987). The time series approach to
short term load forecasting. Power Systems, IEEE Transactions on, 2(3):785 �791.
179
[Hart, 1992] Hart, G. W. (1992). Nonintrusive appliance load monitoring. Proceedings of
the IEEE, 80(12):1870�1891.
[Hassan et al., 2013] Hassan, T., Javed, F., and Arshad, N. (2013). An empirical investi-
gation of vi trajectory based load signatures for non-intrusive load monitoring. arXiv
preprint arXiv:1305.0596.
[He et al., 2006] He, Y., Zhu, Y., and Duan, D. (2006). Research on hybrid arima and
support vector machine model in short term load forecasting. In Intelligent Systems
Design and Applications, 2006. ISDA '06. Sixth International Conference on, volume 1,
pages 804 �809.
[Heyer et al., 1999] Heyer, L. J., Kruglyak, S., and Yooseph, S. (1999). Exploring expression
data: Identi�cation and analysis of coexpressed genes. Genome Res., 9(11):1106�1115.
[Hillier and Lieberman, 2001] Hillier, F. and Lieberman, G. (2001). Introduction to opera-
tions research. McGraw-Hill.
[Hippert et al., 2001] Hippert, H., Pedreira, C., and Souza, R. (2001). Neural networks for
short-term load forecasting: a review and evaluation. Power Systems, IEEE Transactions
on, 16(1):44 �55.
[Hippert and Taylor, 2010] Hippert, H. S. and Taylor, J. W. (2010). An evaluation of
bayesian techniques for controlling model complexity and selecting inputs in a neural
network for short-term load forecasting. Neural Networks, 23(3):386 � 395.
[Hoelzl et al., 2012] Hoelzl, G., Kurz, M., Halbmayer, P., Erhart, J., Matscheko, M., Ferscha,
A., Eisl, S., and Kaltenleithner, J. (2012). Locomotion@ location: When the rubber hits
the road.
180
[Ilyas et al., arch] Ilyas, M., Raza, S., Chen, C.-C., Uzmi, Z., and Chuah, C.-N. (March).
Red-bl: Energy solution for loading data centers. In INFOCOM, 2012 Proceedings IEEE,
pages 2866�2870.
[Irisarri et al., 1982] Irisarri, G., Widergren, S., and Yehsakul, P. (1982). On-line load fore-
casting for energy control center application. Power Apparatus and Systems, IEEE Trans-
actions on, PAS-101(1):71 �78.
[Izquierdo et al., 2008] Izquierdo, M. D. Z., Jiménez, J. J. S., and del Sol, A. M. (2008).
Matlab software to determine the saving in parallel pumps optimal operation systems, by
using variable speed. In Energy 2030 Conference, 2008. ENERGY 2008. IEEE, pages 1�8.
IEEE.
[Jabr et al., 2000] Jabr, R., Coonick, A., and Cory, B. (2000). A homogeneous linear pro-
gramming algorithm for the security constrained economic dispatch problem. Power Sys-
tems, IEEE Transactions on, 15(3):930�936.
[Jain and Satish, 2009] Jain, A. and Satish, B. (2009). Clustering based short term load
forecasting using arti�cial neural network. In Power Systems Conference and Exposition,
2009. PSCE '09. IEEE/PES, pages 1 �7.
[Javed and Arshad, 2008] Javed, F. and Arshad, N. (2008). On the use of linear program-
ming in optimizing energy costs. In IWSOS, pages 305�310.
[Javed and Arshad, 2009a] Javed, F. and Arshad, N. (2009a). Adopt: An adaptive optimiza-
tion framework for large-scale power distribution systems. In Proceedings of the 2009 Third
IEEE International Conference on Self-Adaptive and Self-Organizing Systems, SASO '09,
pages 254�264, Washington, DC, USA. IEEE Computer Society.
[Javed and Arshad, 2009b] Javed, F. and Arshad, N. (2009b). A penny saved is a penny
earned: Applying optimization techniques to power management. In 16th IEEE Interna-
181
tional Conference on the Engineering of Computer-Based Systems (ECBS 2009), 13-16
April 2009, San Francisco, CA, USA.
[Javed et al., 2012] Javed, F., Arshad, N., Wallin, F., Vassileva, I., and Dahlquist, E. (2012).
Forecasting for demand response in smart grids: An analysis on use of anthropologic and
structural data and short term multiple loads forecasting. Applied Energy, 96(0):150 �
160. <ce:title>Smart Grids</ce:title>.
[Jiang and Fei, 2011] Jiang, B. and Fei, Y. (2011). Dynamic residential demand response and
distributed generation management in smart microgrid with hierarchical agents. Energy
Procedia, 12(0):76 � 90. <ce:title>The Proceedings of International Conference on Smart
Grid and Clean Energy Technologies (ICSGCE 2011</ce:title>.
[Kim and Shcherbakova, 2011] Kim, J.-H. and Shcherbakova, A. (2011). Common failures
of demand response. Energy, 36(2):873 � 880.
[Kim and Poor, 2011] Kim, T. and Poor, H. (2011). Scheduling power consumption with
price uncertainty. Smart Grid, IEEE Transactions on, 2(3):519 �527.
[Koller et al., 2010] Koller, R., Verma, A., and Neogi, A. (2010). Wattapp: an applica-
tion aware power meter for shared data centers. In Proceedings of the 7th international
conference on Autonomic computing, ICAC '10, pages 31�40, New York, NY, USA. ACM.
[Kolter and Johnson, 2011] Kolter, J. Z. and Johnson, M. J. (2011). Redd: A public data
set for energy disaggregation research. In proceedings of the SustKDD workshop on Data
Mining Applications in Sustainability, pages 1�6.
[Kuhn and Verwaest, 2008] Kuhn, A. and Verwaest, T. (2008). Fame, a polyglot library for
metamodeling at runtime. Models @ Runtime 2008, pages 57�66.
182
[Kwag and Kim., 2012] Kwag, H.-G. and Kim., J.-O. (2012). Optimal combined scheduling
of generation and demand response with demand resource constraints. Applied Energy,
(0):�.
[Lai et al., 2010] Lai, H. W., Fung, G., Lam, H., and Lee, W. (2010). Disaggregate loads by
particle swarm optimization method for non-intrusive load monitoring. In International
Conference on Electrical Engineering, July 2007.
[Lauret et al., 2012] Lauret, P., David, M., and Calogine, D. (2012). Nonlinear models
for short-time load forecasting. Energy Procedia, 14(0):1404 � 1409. Gaussean process
regression and NN for STLF.
[Lauret et al., 2008] Lauret, P., Fock, E., Randrianarivony, R. N., and Manicom-Ramsamy,
J.-F. (2008). Bayesian neural network approach to short time load forecasting. Energy
Conversion and Management, 49(5):1156 � 1166.
[Lee and Lee, 2011] Lee, J.-W. and Lee, D.-H. (2011). Residential electricity load schedul-
ing for multi-class appliances with time-of-use pricing. In GLOBECOM Workshops (GC
Wkshps), 2011 IEEE, pages 1194 �1198. Scheduling algorithm using time of use to reduce
cost. No rebound in simulation.
[Leeb et al., 1995] Leeb, S. B., Shaw, S. R., and Kirtley Jr, J. L. (1995). Transient event
detection in spectral envelope estimates for nonintrusive load monitoring. Power Delivery,
IEEE Transactions on, 10(3):1200�1210.
[Lefurgy et al., 2007] Lefurgy, C., Wang, X., and Ware, M. (11-15 June 2007). Server-level
power control. Autonomic Computing, 2007. ICAC '07. Fourth International Conference
on, pages 4�4.
[Leite et al., 2010] Leite, J. C., Kusic, D. M., Mossé, D., and Bertini, L. (2010). Stochastic
approximation control of power and tardiness in a three-tier web-hosting cluster. In Pro-
183
ceedings of the 7th international conference on Autonomic computing, ICAC '10, pages
41�50, New York, NY, USA. ACM.
[Liang et al., 2010] Liang, J., Ng, S., Kendall, G., and Cheng, J. (2010). Load signature
study part ii: Disaggregation framework, simulation, and applications. Power Delivery,
IEEE Transactions on, 25(2):561 �569.
[Liaqat et al., 2012] Liaqat, M. D., Javed, F., and Arshad, N. (2012). Towards a self-
managing tool for optimizing energy usage in buildings. In SEB '2012: Proceedings of
the International Conference on Sustainability in Energy and Buildings, Stockholms Swe-
den.
[Lin et al., 2010] Lin, W.-M., Gow, H.-J., and Tsai, M.-T. (2010). An enhanced radial basis
function network for short-term electricity price forecasting. Applied Energy, 87(10):3226
� 3234.
[Lisovich and Wicker., 2008] Lisovich, M. and Wicker., S. (2008). Privacy concerns in up-
coming residential and commercial demand-response systems. In 2008 Clemson University
Power Systems Conference., Clemson University,.
[Livengood and Larson, 2009] Livengood, D. and Larson, R. (2009). Energy box: locally
automated optimal control of residential electricity usage. Service Science, 1(1):1 �16.
[Lu et al., 2004] Lu, J.-C., Niu, D.-X., and Jia, Z.-Y. (2004). A study of short-term load
forecasting based on arima-ann. In Machine Learning and Cybernetics, 2004. Proceedings
of 2004 International Conference on, volume 5, pages 3183 � 3187 vol.5.
[Marceau and Zmeureanu, 2000] Marceau, M. and Zmeureanu, R. (2000). Nonintrusive load
disaggregation computer program to estimate the energy consumption of major end uses
in residential buildings. Energy Conversion and Management, 41(13):1389 � 1403.
184
[Mastorocostas et al., 2000] Mastorocostas, P., Theocharis, J., Kiartzis, S., and Bakirtzis,
A. (2000). A hybrid fuzzy modeling method for short-term load forecasting. Mathematics
and Computers in Simulation, 51:221 � 232.
[Mehrotra, 1992] Mehrotra, S. (1992). On the implementation of a primal-dual interior point
method. SIAM Journal on Optimization, 2(4):575�601.
[Moghaddam et al., 2011] Moghaddam, M. P., Abdollahi, A., and Rashidinejad, M. (2011).
Flexible demand response programs modeling in competitive electricity markets. Applied
Energy, 88(9):3257 � 3269.
[Mohandes, 2002] Mohandes, M. (2002). Support vector machines for short-term electrical
load forecasting. International Journal of Energy Research, 26(4):335�345.
[Mohsenian-Rad et al., 2010] Mohsenian-Rad, A., Wong, V., Jatskevich, J., Schober, R., and
Leon-Garcia, A. (2010). Autonomous demand-side management based on game-theoretic
energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions
on, 1(3):320 �331.
[Molderink et al., 2010] Molderink, A., Bakker, V., Bosman, M., Hurink, J., and Smit, G.
(2010). Management and control of domestic smart grid technology. Smart Grid, IEEE
Transactions on, 1(2):109 �119.
[Moslehi and Kumar, 2010] Moslehi, K. and Kumar, R. (2010). Smart grid - a reliability
perspective. pages 1 �8.
[Nguyen and Nabney, 2010] Nguyen, H. T. and Nabney, I. T. (2010). Short-term electricity
demand and gas price forecasts using wavelet transforms and adaptive models. Energy,
35(9):3674 � 3685.
185
[Nie et al., 2012] Nie, H., Liu, G., Liu, X., and Wang, Y. (2012). Hybrid of arima and svms
for short-term load forecasting. Energy Procedia, 16, Part C(0):1455 � 1460. <ce:title>2012
International Conference on Future Energy, Environment, and Materials</ce:title>.
[Norford and Leeb, 1996] Norford, L. K. and Leeb, S. B. (1996). Non-intrusive electrical load
monitoring in commercial buildings based on steady-state and transient load-detection
algorithms. Energy and Buildings, 24(1):51�64.
[of Finance Pakistan., 2009] of Finance Pakistan., M. (2009). Economic survey of pakistan.
Economic survey of Pakistan.
[Ogston et al., 2007] Ogston, E., Zeman, A., Prokopenko, M., and James, G. (2007). Clus-
tering distributed energy resources for large-scale demand management. In Self-Adaptive
and Self-Organizing Systems, 2007. SASO'07. First International Conference on, pages
97�108. IEEE.
[Omer et al., 2010] Omer, A., Javed, F., and Arshad, N. (2010). A case study of imple-
menting a localized smart grid in developing countries. In ICAE '2010: Proceedings of the
Second International Conference on Applied Energy, Singapore.
[Ozturk, 2010] Ozturk, I. (2010). A literature survey on energy�growth nexus. Energy
Policy, 38(1):340�349.
[Pai and Hong, 2005] Pai, P.-F. and Hong, W.-C. (2005). Support vector machines with
simulated annealing algorithms in electricity load forecasting. Energy Conversion and
Management, 46(17):2669 � 2688.
[Papalexopoulos and Hesterberg, 1990] Papalexopoulos, A. and Hesterberg, T. (1990). A
regression-based approach to short-term system load forecasting. Power Systems, IEEE
Transactions on, 5(4):1535 �1547.
186
[Pedrasa et al., 2010] Pedrasa, M., Spooner, T., and MacGill, I. (2010). Coordinated
scheduling of residential distributed energy resources to optimize smart home energy ser-
vices. Smart Grid, IEEE Transactions on, 1(2):134 �143.
[Pickering et al., 2009] Pickering, B., Robert, S., M?noret, S., and Mengusoglu, E. (2009).
Model-driven management of complex systems. Technical Report COMP COMP-005-2008
Lancaster University, page 117.
[Powers et al., 1991] Powers, J., Margossian, B., and Smith, B. (1991). Using a rule-based
algorithm to disaggregate end-use load pro�les from premise-level data. Computer Appli-
cations in Power, IEEE, 4(2):42�47.
[Rahimi and Ipakchi, 2010] Rahimi, F. and Ipakchi, A. (2010). Demand response as a market
resource under the smart grid paradigm. Smart Grid, IEEE Transactions on, 1(1):82 �88.
Introductory case building paper.
[Rahman and Bhatnagar, 1988] Rahman, S. and Bhatnagar, R. (1988). An expert system
based algorithm for short term load forecast. Power Systems, IEEE Transactions on,
3(2):392 �399.
[Ramchurn et al., 2011] Ramchurn, S., Vytelingum, P., Rogers, A., and Jennings, N. (2011).
Agent-based control for decentralised demand side management in the smart grid. In The
Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS
2011), pages 5�12.
[Ranade and Beal, 2010] Ranade, V. V. and Beal, J. (2010). Distributed control for small
customer energy demand management. In SASO'10, pages 11�20.
[Rastegar et al., 2012] Rastegar, M., Fotuhi-Firuzabad, M., and Aminifar, F. (2012). Load
commitment in a smart home. Applied Energy, 96(0):45 � 54.
187
[Riedmiller and Braun, 1993] Riedmiller, M. and Braun, H. (1993). A direct adaptive
method for faster backpropagation learning: the rprop algorithm. In Neural Networks,
1993., IEEE International Conference on, pages 586 �591 vol.1.
[Romero, 2012] Romero, J. J. (2012). Blackouts illuminate india's power problems. Spec-
trum, IEEE, 49(10):11�12.
[Saele and Grande, 2011] Saele, H. and Grande, O. (2011). Demand response from household
customers: Experiences from a pilot study in norway. Smart Grid, IEEE Transactions
on, 2(1):102 �109. Manual DR in Norway showing reduction in energy at peak times.
[Salahi et al., 2007] Salahi, M., Peng, J., and Terlaky, T. (2007). On mehrotra-type
predictor-corrector algorithms. SIAM J. on Optimization, 18(4):1377�1397.
[Shao et al., 2012] Shao, S., Pipattanasomporn, M., and Rahman, S. (2012). Grid integra-
tion of electric vehicles and demand response with customer choice. Smart Grid, IEEE
Transactions on, 3(1):543 �550.
[Shen et al., 2011] Shen, Z., Subbiah, S., Gu, X., and Wilkes, J. (2011). Cloudscale: elastic
resource scaling for multi-tenant cloud systems. In Proceedings of the 2nd ACM Symposium
on Cloud Computing, SOCC '11, pages 5:1�5:14, New York, NY, USA. ACM.
[Sheth and Parvatiyar, 1995] Sheth, J. and Parvatiyar, A. (1995). Relationship marketing in
consumer markets: Antecedents and consequences. Journal of the Academy of Marketing
Science, 23:255�271. 10.1177/009207039502300405.
[Srinivasan et al., 2006] Srinivasan, D., Ng, W., and Liew, A. (2006). Neural-network-based
signature recognition for harmonic source identi�cation. Power Delivery, IEEE Transac-
tions on, 21(1):398�405.
188
[Strbac, 2008] Strbac, G. (2008). Demand side management: Bene�ts and challenges. Energy
Policy, 36(12):4419 � 4426. major bene�ts and challenges of electricity demandsideman-
agement (DSM) are discussed in the context of the UK electricity system.
[Sun and Zou, 2007] Sun, W. and Zou, Y. (2007). Short term load forecasting based on bp
neural network trained by pso. In Machine Learning and Cybernetics, 2007 International
Conference on, volume 5, pages 2863 �2868.
[Tan et al., 2010] Tan, Z., Zhang, J., Wang, J., and Xu, J. (2010). Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Applied Energy, 87(11):3606–3610.
[Tarzia et al., 2010] Tarzia, S. P., Dinda, P. A., Dick, R. P., and Memik, G. (2010). Display power management policies in practice. In Proceedings of the 7th International Conference on Autonomic Computing, ICAC '10, pages 51–60, New York, NY, USA. ACM.
[Valenzuela et al., 2012] Valenzuela, J., Thimmapuram, P. R., and Kim, J. (2012). Modeling and simulation of consumer response to dynamic pricing with enabled technologies. Applied Energy, 96(0):122–132.
[Vasic et al., 2010] Vasic, N., Scherer, T., and Schott, W. (2010). Thermal-aware workload scheduling for energy efficient data centers. In Proceedings of the 7th International Conference on Autonomic Computing, ICAC '10, pages 169–174, New York, NY, USA. ACM.
[Venkatesan et al., 2012] Venkatesan, N., Solanki, J., and Solanki, S. K. (2012). Residential demand response model and impact on voltage profile and losses of an electric distribution network. Applied Energy.
[Walawalkar et al., 2010] Walawalkar, R., Fernands, S., Thakur, N., and Chevva, K. R. (2010). Evolution and current status of demand response (DR) in electricity markets: Insights from PJM and NYISO. Energy, 35(4):1553–1560.
[Wallace and Kuhn, 2000] Wallace, D. R. and Kuhn, D. R. (2000). Converting system failure histories into future win situations. NIST.
[Wallin et al., 2005] Wallin, F., Bartusch, C., Thorin, E., Bäckström, T., and Dahlquist, E. (2005). The use of automatic meter readings for a demand-based tariff. pages 1–6.
[Wang et al., 2011] Wang, J., Botterud, A., Bessa, R., Keko, H., Carvalho, L., Issicaba, D., Sumaili, J., and Miranda, V. (2011). Wind power forecasting uncertainty and unit commitment. Applied Energy, 88(11):4014–4023.
[Wang et al., 2006] Wang, M., Kandasamy, N., Guez, A., and Kam, M. (2006). Adaptive performance control of computing systems via distributed cooperative control: Application to power management in computing clusters. Autonomic Computing, 2006. ICAC '06. IEEE International Conference on, pages 165–174.
[Wang et al., 2002] Wang, X., Song, Y.-H., and Lu, Q. (2002). A coordinated real-time optimal dispatch method for unbundled electricity markets. Power Systems, IEEE Transactions on, 17(2):482–490.
[Weron, 2006] Weron, R. (2006). Modeling and Forecasting Electricity Loads and Prices: A
Statistical Approach (The Wiley Finance Series). Wiley.
[Wilhite et al., 2000] Wilhite, H., Shove, E., Lutzenhiser, L., and Kempton, W. (2000). Twenty years of energy demand management: we know more about individual behavior but how much do we really know about demand? In 2000 Summer Study Proceedings of the American Council for an Energy-Efficient Economy, Washington, DC, pages 8435–8453.
[Xu et al., 2010] Xu, Y., Xie, L., and Singh, C. (2010). Optimal scheduling and operation of load aggregator with electric energy storage in power markets. In North American Power Symposium (NAPS), 2010, pages 1–7. IEEE.
[Yan et al., 2013] Yan, Y., Qian, Y., Sharif, H., and Tipper, D. (2013). A survey on smart grid communication infrastructures: Motivations, requirements and challenges. Communications Surveys Tutorials, IEEE, 15(1):5–20.
[Yang and Huang, 1998] Yang, H.-T. and Huang, C.-M. (1998). A new short-term load forecasting approach using self-organizing fuzzy ARMAX models. Power Systems, IEEE Transactions on, 13(1):217–225.
[Yao et al., 2000] Yao, S., Song, Y., Zhang, L., and Cheng, X. (2000). Wavelet transform and neural networks for short-term electrical load forecasting. Energy Conversion and Management, 41(18):1975–1988.
[Yuan et al., 2011] Yuan, L., Lu, G., Zhan, J., Wang, H., and Wang, L. (2011). PowerTracer: Tracing requests in multi-tier services to diagnose energy inefficiency.
[Yun et al., 2008] Yun, Z., Quan, Z., Caixin, S., Shaolan, L., Yuming, L., and Yang, S. (2008). RBF neural network and ANFIS-based short-term load forecasting approach in real-time price environment. Power Systems, IEEE Transactions on, 23(3):853–858.
[Zeifman and Roth, 2011] Zeifman, M. and Roth, K. (2011). Nonintrusive appliance load monitoring: Review and outlook. Consumer Electronics, IEEE Transactions on, 57(1):76–84.
[Zhang, 2005] Zhang, M.-G. (2005). Short-term load forecasting based on support vector machines regression. In Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on, volume 7, pages 4310–4314, vol. 7.
[Zhang et al., 2012] Zhang, Q., Zhani, M. F., Zhu, Q., Zhang, S., Boutaba, R., and Hellerstein, J. (2012). Dynamic energy-aware capacity provisioning for cloud computing environments. In Proceedings IEEE/ACM International Conference on Autonomic Computing (ICAC).
[Zhang, 1997] Zhang, Y. (1997). Solving large-scale linear programs by interior-point methods under the MATLAB environment.
[Zhu et al., 2008] Zhu, X., Young, D., Watson, B., Wang, Z., Rolia, J., Singhal, S., McKee, B., Hyser, C., Gmach, D., Gardner, R., Christian, T., and Cherkasova, L. (2008). 1000 islands: Integrated capacity and workload management for the next generation data center. pages 172–181.