Towards Self-Managing Demand Side Management
PhD Thesis
Fahad Javed
2006-03-0042
Advisor: Dr. Naveed Arshad
Department of Computer Science
Syed Babar Ali School of Science and Engineering
Lahore University of Management Sciences
Dedicated to those who stood behind me and endured
with me: my parents, my wife and my children.
Lahore University of Management Sciences
Syed Babar Ali School of Science and Engineering
CERTIFICATE
I hereby recommend that the thesis prepared under my supervision by Fahad Javed titled
Towards Self-Managing Demand Side Management be accepted in partial fulfillment
of the requirements for the degree of Doctor of Philosophy in Computer Science.
Dr. Naveed Arshad (Advisor)
Recommendation of Examiners' Committee:
Name Signature
Dr. Asim Karim ________________
Dr. Mian Muhammad Awais ________________
Dr. Jahangir Ikram ________________
Dr. Waqar Mahmood ________________
Acknowledgements
First and foremost, I thank God Almighty for providing me with the strength, determination
and everything else needed to make this work possible.
I was fortunate enough to be surrounded by some very wonderful people during my
Ph.D. who provided me with invaluable feedback, critiques and comments without which
this work might not have been possible. My advisor, Dr. Naveed Arshad, provided me with
guidance and advice from the very first day, steering me in the right direction. I would also
like to acknowledge Dr. Shahid Masood for providing me with encouragement and guidance
at some very critical junctures.
I am also indebted to my tea buddies: Saqib Ilyas, Zeeshan Rana, Junaid Akhtar, Umer
Sulaiman, Aadil Zia Khan, Khurram Junejo, and Malik Tahir Hassan. The discussions, the
arguments, the critiques and the lame jokes helped me formulate my ideas and concepts and
are an integral part of my work. The RICE lab weekly meetings also deserve a mention
here. Dr. Awais and his group's feedback was instrumental in focusing on the specific
research questions that I attempt to answer in this thesis.
Last but not least, I would like to acknowledge the support of my wife. Her support through
thick and thin, her encouragement and her perseverance carried me to this point.
Contents
1 Introduction 2
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Demand Side Management and Demand Response . . . . . . . . . . . . . . . 5
1.3 Smart Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 DSM in Smart Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.2 Challenges for DSM in Smart Grid . . . . . . . . . . . . . . . . . . . 13
1.4 Limitations, Assumptions and Scope . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 Literature Survey 19
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Demand Side Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.1 Critical DSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.2 DSM for Price Responsive Systems . . . . . . . . . . . . . . . . . . . 27
2.2.3 Distributed Generation Supported by DSM . . . . . . . . . . . . . . . 33
2.3 Short Term Load Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Statistical and Time Series Techniques . . . . . . . . . . . . . . . . . 35
2.3.2 Artificial Intelligence Techniques . . . . . . . . . . . . . . . . . . . . . 37
2.3.3 Hybrid Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.4 STLF for Buildings and Micro grids . . . . . . . . . . . . . . . . . . . 40
2.4 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5 Self-managing Energy Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5.1 Server Farms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5.2 Home Energy Management . . . . . . . . . . . . . . . . . . . . . . . . 44
3 System Architecture 46
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2 Proposed Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.1 Collection Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.2 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3.3 Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.5 Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.6 Actuators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4 Forecasting Energy Load for Individual Consumers 55
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Problem Description: Issues in house level forecasting . . . . . . . . . . . . 58
4.3 STMLF Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.1 STLF Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.3.2 STLF for Independent House Forecast . . . . . . . . . . . . . . . . . 64
4.3.3 STMLF1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3.4 STMLF2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3.5 Model Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.1 Forecasting Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.2 Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.4.3 Experimental Data Source . . . . . . . . . . . . . . . . . . . . . . . . 74
4.4.4 Experimental Environment . . . . . . . . . . . . . . . . . . . . . . . . 74
4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.1 AI Based Experiment Results . . . . . . . . . . . . . . . . . . . . . . 75
4.5.2 Multiple STLFs vs. STMLF . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.3 Effect of Anthropologic and Structural Data . . . . . . . . . . . . . . 78
4.6 Discussion on Mis-Forecasted Combinations . . . . . . . . . . . . . . . . . . 80
4.7 Short Term Forecasting Techniques for STMLF . . . . . . . . . . . . . . . . 84
4.8 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5 Disaggregating Heavy Loads from Forecast 90
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.3 Evaluation Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.4 Disaggregation Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.5.1 Noiseless Forecast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.5.2 Forecast with Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6 Demand Side Management Planning 99
6.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3 Clustered Frequency Based Algorithm . . . . . . . . . . . . . . . . . . . . . 105
6.3.1 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.3.2 Linear Programming Based Planning . . . . . . . . . . . . . . . . . . 107
6.3.3 Spike Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.3.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.4 Adaptable Optimization - AdOpt . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4.1 Self-Optimizing techniques . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4.2 System Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.4.3 Case-Based Reasoning Engine . . . . . . . . . . . . . . . . . . . . . . 128
6.4.4 Framework Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.4.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5 Adaptable Modeling Framework . . . . . . . . . . . . . . . . . . . . . . . . 143
6.5.1 Structure of the Mathematical Meta-Model . . . . . . . . . . . . . . . 144
6.5.2 Modeling at Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.5.3 Running Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.5.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.5.5 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.5.6 Future Dimensions of Runtime Modeling . . . . . . . . . . . . . . . . 159
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7 Conclusion and Future Work 161
7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.1.1 Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.1.2 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.1.3 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.1.4 Putting It All Together . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.2 Lessons Learnt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.3.1 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.3.2 Load Disaggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7.3.3 Planning and Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 170
List of Figures
1.1 Consumption of the State of California, USA, on 2nd April 2013. . . . . . . 5
1.2 Goals of DSM as proposed by Gellings [Gellings, 1985]. . . . . . . . . . . . 7
2.1 Comfort Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2 Contractually bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Explicit Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Incentive based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 TOU based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6 Incentive with storage based . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.7 Incentive with renewable with/without storage based . . . . . . . . . . . . . 36
3.1 Self-managing demand side management architecture. . . . . . . . . . . . . 50
4.1 Box and whisker plot of consumer load over a 24 hour period for 204 houses
from Eskilstuna, Sweden. Whiskers mark the maximum load for the hour,
the lower and upper box edges are the 25th and 75th percentiles respectively,
and the line in the box is the median. The X axis is time at intervals of one
hour and the Y axis is load in KWH. . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Classification of survey questions. We classified questions as anthropologic
(human centric), structural (building specific), or pseudo-anthropologic, which
concern occupants' impact on or usage of structural facilities. . . . . . . . . 62
4.3 ANN models for the three forecasters. (a) is the ANN model for a single house
where only load and global invariants are provided for the forecast. (b) is the
ANN model for STMLF1. (c) is the ANN model for STMLF2. . . . . . . . . 69
4.4 Mean squared error for four test weeks (a. Week of January b. Week of
April c. Week of July d. Week of October) comparing STMLF with multiple
STLFs. The blue line is STMLF and the red line is the average MSE of all
STLFs. Days of the week are on the X axis and MSE on the Y axis. . . . . . 77
4.5 Scatter plot of forecast against actual load for the 7 day test period of January.
The top plot in each figure is the forecast through structural and anthropologic
data and the bottom one uses house-Id as the discriminant. In all figures the
actual load is on the X axis and the forecast is on the Y axis. . . . . . . . . . 79
4.6 Mean squared error for four test weeks (a. Week of January b. Week of April
c. Week of July d. Week of October). Days of week are on X axis and MSE
values on Y axis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7 MLR forecast error for the 9 day evaluation period. Each day has 4896 forecasts.
The darker part of each bar represents correctly forecasted loads and the lighter
shade represents mis-forecasted loads. A correct forecast is one within the
range defined by equation 4.9. . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.1 Heater load and load profile of a single house. The red line represents the main
load value and the blue dots represent the hours in which the heater was on. . 95
6.1 Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.2 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.3 Typical Supply spike in system . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4 Spike handling LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.5 System response for spike at 20 < t < 30 . . . . . . . . . . . . . . . . . . . . 112
6.6 Typical demand spike in system . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.7 Reserve margin lower limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.8 Hourly planning BP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.9 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.10 System Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.11 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.12 Summary of Results of Table 6.5. The topmost figure shows results of AdOpt,
the middle one shows results from Interior Point and the bottom one shows
results of the Simplex method. . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.13 AdOpt and Simplex comparison on 7 day CAISO data . . . . . . . . . . . . 141
6.14 Supply demand and comparison for day 6 CAISO data . . . . . . . . . . . . 142
6.15 Hourly planning LP equations . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.16 Consumption profile of California for a day as published by CAISO (Consumption in MWh) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6.17 Response time for dynamic and standard modeler in comparison to demand.
(Response time in seconds) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.18 Comparison between demand, clusters and active users for 24 hour period as
observed in Sollentuna, Sweden . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.19 Solver time for 24 hours CAISO data. (Response time in seconds) . . . . . . 157
6.20 Solver efficiency for 24 hours CAISO data. (Power allocation in Kilowatts) . 158
List of Tables
4.1 Comparison of volatility measure of individual loads, micro grid loads and
standard grid loads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Results of 3 measures of forecast through multiple STLFs and STMLF. In
addition, the average load for that week is provided to show a relationship
between MSE and average load in that week. . . . . . . . . . . . . . . . . . . 76
4.3 Results of 3 measures for forecasts based on a model constructed through
anthropologic and structural data and forecasts based on house-Id. In addition,
the average load for that week is provided to show a relationship between
MSE and average load in that week. . . . . . . . . . . . . . . . . . . . . . . . 78
4.4 Repeat count of error and Cumulative accuracy error for 7 day period. . . . 84
4.5 Mean Squared Error (in KWH) for 9 day STMLF using multiple linear
regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.1 Confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.2 Confusion matrices for noiseless forecast. a) Artificial neural network (ANN).
b) Support vector machines (SVM). c) (ANN OR SVM). . . . . . . . . . . . 96
5.3 Confusion matrices for noisy forecast. a) Artificial neural network (ANN). b)
Support vector machines (SVM). c) (ANN OR SVM). . . . . . . . . . . . . . 97
6.1 Classification of household appliances according to power and usage profile . 101
6.2 Total time for analyze/plan (error threshold = 0.1) . . . . . . . . . . . . . . 117
6.3 Total time for analyze/plan (error threshold = 0.01) . . . . . . . . . . . . . . 117
6.4 Input combinations for all three algorithms to generate the initial case-base . 134
6.5 Adaptability and Efficiency Test Results . . . . . . . . . . . . . . . . . . . . 136
6.6 Summary of simulation results for pair-wise testing . . . . . . . . . . . . . . 138
Abstract
Efficient energy management is considered the most important resource for meeting future
energy needs and for the continuation of human progress. One of the most promising methods
to optimize electric energy, the largest energy-saving component in the future, is planning
consumption to maximize the throughput of energy, or demand side management (DSM).
Since domestic consumers contribute up to 50% of electric demand, it is important that
their demand is managed in an optimal way. However, implementing DSM for domestic
consumers is a complex and human-intensive task. Due to these reasons, state of the art
DSM systems for domestic consumers have realized only 5% savings according to surveys.
In this thesis we present a self-managing demand side management infrastructure to handle
these complexities and reduce dependence on human operators while managing demand
for optimal distribution of energy.
This self-managing DSM is composed of three components. The first component is short term
load forecasting to predict household energy needs for the next 24 hours. The complexity
of this task stems from the volatility of individual home energy consumption patterns.
To resolve this we show that, through an innovative forecasting modeling paradigm and the use
of anthropologic and structural data, we can increase our forecast accuracy by as much as
50%. The second component of our self-managing DSM is a load disaggregation mechanism.
This mechanism identifies the devices which can be managed for optimal scheduling. We
show that we can disaggregate the loads with only 3% error. The third component is the
planning algorithm. Since scheduling the devices is an NP-complete problem, an exact
solution for many typical scenarios is not feasible. We first present an aggregating mechanism
to convert the scheduling problem from a binary decision to the frequency domain and then solve
the optimization problem with a bounded error caused by the transformation. Furthermore,
since the size of the problem varies over time, to maximize the exactness of the solution and
reduce the bounded error wherever possible we present Adaptable Optimization, or AdOpt.
AdOpt gracefully scales the system, trading exactness of the solution against computation
time to deliver the best possible plan.
Our results show that the combination of the three techniques can result in up to 30%
reduction in peak power under varied operational conditions.
Acronyms

DSM  Demand Side Management
AdOpt  Adaptable Optimization
DR  Demand Response
HAN  Home Area Network
HVAC  Heating, Ventilation and Air Conditioning
STLF  Short Term Load Forecast
NYISO  New York Independent System Operator
ToU  Time of Use
EES  Electric Energy Storage
DG  Distributed Generation
AI  Artificial Intelligence
ANN  Artificial Neural Network
SVM  Support Vector Machines
SoM  Self-organizing Maps
GPRS  General Packet Radio Service
GSM  Global System for Mobile
FDAP  Forecast-Disaggregate-Analyze-Plan
AMR  Automated Meter Reading
STMLF  Short Term Multiple Load Forecast
MLR  Multiple Linear Regression
KWH  Kilowatt-Hour
MSE  Mean Squared Error
Acc  Accuracy
Var  Variability
NILM  Non-Intrusive Load Monitoring
REDD  Reference Energy Disaggregation Dataset
KHz  Kilohertz
LP  Linear Programming
SAPE  Sense Analyze Plan Execute
BIP  Binary Integer Programming
BP  Binary Programming
CBR  Case-Based Reasoning
CRE  Case-based Recommendation Engine
RM  Runtime Modeler
MT  Mathematical Toolbox
API  Application Programming Interface
CAISO  California Independent System Operator
UP  Unutilized Power
MWH  Megawatt-Hour
DAS  Dynamically Adaptable System
PHEV  Plug-in Hybrid Electric Vehicles
Cid  Consumer id
GARCH  Generalized Autoregressive Conditional Heteroskedasticity
Chapter 1
Introduction
Energy is instrumental in human progress; thus its efficient use is very important for the
continuation of the progress made by humanity. In this chapter we discuss the issues
with efficient energy management, in particular with the usage of electricity by domestic
consumers. We first discuss how demand side management in the existing grid can increase
the efficiency of energy usage, and the reasons it has failed as a viable option. We
introduce the smart grid and how DSM within the smart grid can achieve the desired results. We
then introduce autonomic computing as the computational paradigm that can realize DSM
in the smart grid. We close the chapter with a brief account of our contributions in setting up an
autonomic DSM.
1.1 Introduction
Energy is the key to the continuation of human progress. Various studies have shown
that energy consumption is directly proportional to the growth of human civilization
technologically, socially and even at the personal level [Ozturk, 2010]. However, as the need for
energy grows exponentially, the available energy sources are depleting at an alarming rate. In such
a situation, managing available resources to maximize their usage is of critical importance.
Such energy management can reduce energy prices, benefit the environment through a lower
carbon footprint and prolong the existing resources until new sources of energy can be found and
utilized.
Electricity is by far the largest consumer of energy, accounting for 56% of the energy
consumed in the world. Therefore, there are various ways in which electric grids are being made
more efficient. Efficiency in transmission, generation and distribution has been a topic
of research for more than a hundred years. But a very important and effective electricity
efficiency measure, management of the demand for electricity, has been overlooked till now
due to the complex dynamics of electricity production and consumption and also due to the
lack of computing technologies to resolve these complexities.
Energy management is the task of managing the supply-demand equation in the grid.
The goal is to always have sufficient supply to meet the demand. If the supply is at any time
insufficient for the demand, the result is catastrophic for the distribution grids.
An overwhelming demand destabilizes the system. The consequences range from inefficiency in
the generation plants, transmission lines and home devices to the breakdown of plants
and the melting and burning of cables and house appliances. Thus it is imperative that the
supply-demand equation is always maintained in the positive.
Since the demand is produced by hundreds of thousands of devices, it is close to impossible
in the existing grid to control the demand. Therefore, wherever possible, utility providers try
to affect the supply part of the equation through over-provisioning of electricity. But since
the demand for energy varies over a day, a season and years, setting up plants for the maximum
power needed over an entire year is usually not feasible. To illustrate this, figure 1.1 shows
the electricity demand for the state of California in the USA. To meet this demand, different
types of power plants are used. The more efficient production units, in terms of cost and
carbon footprint, are used for the base energy needs. For peak demands, older, costlier units
are commissioned at certain times of the day, since it makes more sense to
use the cheaper producing units more regularly.
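This merit-order style of dispatch (run the cheap base-load units continuously and commit the costlier peaking units only when demand requires them) can be sketched in a few lines. The plant names, capacities and costs below are invented for illustration and are not data from this thesis:

```python
# Illustrative merit-order dispatch sketch (plants and numbers are assumed,
# not real data): cheapest units are committed first, peakers fill the rest.

PLANTS = [                          # (name, capacity in MW, cost per MWh)
    ("coal_base", 600, 30),
    ("gas_combined_cycle", 300, 55),
    ("diesel_peaker", 200, 140),
]

def dispatch(demand_mw):
    """Allocate demand to plants in ascending cost order (merit order)."""
    allocation = {}
    remaining = demand_mw
    for name, capacity, _cost in sorted(PLANTS, key=lambda p: p[2]):
        take = min(capacity, remaining)
        allocation[name] = take
        remaining -= take
    return allocation

# Off-peak demand is met by the base plant alone; the expensive diesel
# peaker runs only when demand exceeds the cheaper units' combined capacity.
off_peak = dispatch(500)
peak = dispatch(1000)
```

Under this toy dispatch, the 500 MW off-peak demand is served entirely by the base unit, while the 1000 MW peak forces the diesel peaker online, mirroring the higher cost and carbon footprint of peak supply discussed above.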
Furthermore, peak power is unpredictable to a degree. This requires utilities to have
instantaneous power production infrastructure, mostly provided through diesel or
furnace oil based generation units. These units, in comparison to standard electricity
production units, drive the generator directly through combustion rather than through steam, as is
the case for the majority of industrial power production units. But energy from these generation
units has the double negative effect of higher cost and higher carbon footprint.
In certain situations, such over-provisioning is not possible due to a lack of generation
resources. In such situations utilities are forced to shut down power to sections of the grid
to balance the power equations. The Indian blackout of 2012 was caused by a severe shortage
of energy [Romero, 2012]. In developing countries, where recent industrial and local
consumption has outstripped the growth of production plants, this lack of supply is a
common phenomenon. This is true for Pakistan, India and China, where the demand
for energy is growing at more than 20% a year while production is growing at 15%. This
leads to scheduled load shedding, sometimes up to 12 hours in a day. It is imperative, then,
that instead of providing expensive and environmentally unhealthy power or cutting power
to consumers altogether, demand is somehow managed in such situations to at least reduce
the effect of peak power. This is called peak shaving or demand shaping. Peak
shaving refers specifically to reducing the load at peak times, whereas demand shaping may also
mean increasing loads at some times and reducing loads at others.

Figure 1.1: Consumption of the State of California, USA, on 2nd April 2013.

There are two strategies used for this mitigation: demand side management and demand response.
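The difference between the two shaping goals can be illustrated with a small sketch. Assuming a toy hourly load profile and an arbitrary capacity threshold (both invented for illustration), peak shaving simply clips the peak, while demand shaping additionally performs valley filling by moving the clipped energy into low-demand hours:

```python
# Illustrative sketch (not from this thesis): peak shaving vs. demand
# shaping on a toy 8-hour integer load profile with an assumed threshold.

def peak_shave(load, threshold):
    """Cut every hour's load down to the threshold; the excess is lost."""
    return [min(x, threshold) for x in load]

def demand_shape(load, threshold):
    """Cut peaks to the threshold, then redistribute the clipped energy
    into the hours with the most spare capacity (valley filling)."""
    shaped = [min(x, threshold) for x in load]
    excess = sum(load) - sum(shaped)
    # Fill valleys greedily, one unit at a time, into the lowest hour.
    while excess > 0:
        i = shaped.index(min(shaped))
        shaped[i] += 1
        excess -= 1
    return shaped

demand = [3, 2, 2, 4, 9, 10, 9, 4]   # toy profile with a peak at hours 4-6
shaved = peak_shave(demand, 6)        # shaving: peak capped, energy reduced
shaped = demand_shape(demand, 6)      # shaping: peak capped, energy preserved
```

The shaved profile serves less total energy (some demand is simply cut), whereas the shaped profile serves the same total energy with the peak moved into the valleys, matching the distinction drawn above.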
1.2 Demand Side Management and Demand Response
Demand side management (DSM) and demand response (DR) are the two primary terms
used in the literature for managing demand in an electric grid. The terms are often used
interchangeably, but a subtle difference exists, as we discuss below.
Demand response is the most commonly used strategy. DR is a reactive process in which, at
the time of peak consumption, demand is curtailed through the explicit shutdown of end user devices.
In almost all cases this shutdown is a blanket one, in that all the devices that participate in
the demand response program are switched off. This sudden drop in demand is sufficient in
some cases to mitigate the peak power issue. However, as the events of the Indian blackout
have shown, for a large scale supply-demand shortfall, demand response can cause a cascading
failure of catastrophic nature.
Demand side management is usually taken as a proactive, user dependent method to curtail
power consumption at peak times over the long run. Gellings, who coined the term in 1976,
stated six goals for DSM, as shown in figure 1.2 [Gellings, 1985]. As can be seen, peak shaving
was one of them. Technically, demand response is thus a form of DSM, but for our purposes we
define DSM as:

A proactive measure to shape demand over a comparatively longer
window (24 hour) period for optimal device usage given supply conditions
and user preferences.
For our study, we define DR in comparison as:

A reactive peak shaving measure to curtail demand during critical
peak demand, which does not explicitly consider user preferences
or long term supply conditions in its planning.
DSM measures until recently have been limited to providing incentive pricing to end users
to shift their loads to lower demand times. This has the double effect of "valley filling" and
"peak shaving" as described by Gellings. Different costing plans have been offered by energy
suppliers, giving incentives to users to reduce their load at specific times of the day. One
example of such an incentive is the power7 plan in the United Kingdom: the energy price for a
fixed 7 hours in a day is higher, but for the remaining 17 hours the price is nominal.
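To make the incentive concrete, the bill under such a two-tier time-of-use tariff can be computed as below. The rates, the peak-hour window and the consumption profiles are assumptions for illustration only, not the actual power7 figures:

```python
# Illustrative sketch: cost of one day's consumption under a two-tier
# time-of-use tariff with 7 expensive "peak" hours. All numbers here
# (window, rates, profiles) are assumptions, not real tariff data.

PEAK_HOURS = set(range(17, 24))   # assume the 7 costly hours are 17:00-24:00
PEAK_RATE = 0.30                  # assumed peak price per kWh
OFF_PEAK_RATE = 0.08              # assumed nominal off-peak price per kWh

def daily_cost(hourly_kwh):
    """Sum hour-by-hour cost for a 24-entry consumption profile."""
    return sum(
        kwh * (PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE)
        for hour, kwh in enumerate(hourly_kwh)
    )

# A consumer who shifts load out of the peak window pays less for the
# same total energy, which is exactly the incentive the tariff creates.
flat = [1.0] * 24                 # 1 kWh every hour
shifted = [1.0] * 24
for h in PEAK_HOURS:
    shifted[h] = 0.5              # halve peak usage...
shifted[2] += 3.5                 # ...and move that 3.5 kWh to 02:00
```

Both profiles consume the same 24 kWh, but the shifted one is noticeably cheaper, which is the "valley filling" behavior the supplier is paying for.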
As is obvious from the description, there are various pros and cons to each of the two
methods. Existing DSM plans do not consider the real time energy situation and are rigid.
The timings for energy prices are fixed and remain in force even on days when there is no need for
peak shaving. On the other hand, if a peak occurs at times other than the specifically designated
hours, the utility will be in a fix. Secondly, such methods require the end user to know
exactly when it is beneficial to use energy. Another aspect is that although the energy provider is
giving incentives, it is up to the user to take advantage of those incentives or not. This uncertainty
is not acceptable to utility providers, as they need some sort of guarantee that the demand
will be shaped according to their needs.

Figure 1.2: Goals of DSM as proposed by Gellings [Gellings, 1985].
Demand response historically has been preferred, since it provides a stronger guarantee
of demand, but it has issues such as forcing a user's device to shut down even when the user
might need it. Secondly, to date most such controllers are blanket load shedders: either all of
the controllers switch off the devices they are connected to, or none do. This results in
unnecessarily overshooting the DR targets. Third, the state of the art of this method does not
consider the needs of the end user. A user buys a washing machine with the intention of using
it at specific times. However, if the timings of the user's need clash with peak power regularly,
the machine will be turned off at the most inopportune time. This may have serious
implications for users' acceptance of such methods. Fourth, DR does not plan the power but
rather expects that offloading demand from now to the future will somehow work. This is
sufficient when the peak demand is of short duration and the offloaded workload is relatively
small, but when the peak lasts longer or the offloaded workload is significant, this has a
domino effect, as is the case with load shedding in Pakistan. The critical peak power in
Pakistan extends for several hours and the blanket load shedding defers a significant portion
of the load to the future. But when the offloaded workload comes back online, it again strains
the system. To recover from this new peak, the system again sheds load from some other part
of the network, resulting in a continuously oscillating system which on some days never
reaches a steady state.
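This oscillation can be reproduced in a toy simulation. Assuming, purely for illustration, that all shed load returns in full in the next hour (a simplification of the rebound behavior described above), a single hour of excess demand keeps the system pinned at its capacity limit for several subsequent hours:

```python
# Toy simulation (an illustration, not a model from this thesis) of the
# domino effect of blanket load shedding: any demand above capacity is
# shed, and the shed load is assumed to return, in full, the next hour.

def simulate_shedding(base_demand, capacity, hours):
    served, deferred = [], 0.0
    for h in range(hours):
        demand = base_demand[h % len(base_demand)] + deferred
        if demand > capacity:
            deferred = demand - capacity  # shed load offloaded to next hour
            served.append(capacity)
        else:
            deferred = 0.0
            served.append(demand)
    return served

# One spike above capacity at hour 2 keeps the grid saturated through
# hour 5 while the deferred workload slowly works itself off.
profile = [90, 90, 130, 90, 90, 90, 90, 90]
served = simulate_shedding(profile, 100, 8)
```

In this toy run a single 30-unit overshoot is paid back over four saturated hours; with a longer or larger peak (or load returning in bursts), the deferred demand never fully drains and the system keeps oscillating, as observed in the Pakistani case.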
The lessons learnt from the two programs thus far are:

1. Energy consumption management can result in lower costs and is beneficial.

2. Incentives are a good way to convince end users to manage energy better, but
1) they are harder for the end user to manage, and 2) they do not provide guaranteed
consumption portfolios for utilities.

3. Automated shutdowns are effective, but 1) they have lower acceptance from end users due
to their intrusiveness and lack of understanding of users' needs, and 2) existing methods
(DR) do not effectively manage power but rather put off existing loads for later.
Although DR and DSM are applied in existing grids, their application has not realized
the savings that were anticipated. This has generally been attributed to the overload of
thinking and planning placed on the user by DSM and the authoritarian application of DR.
However, in recent times the advent of the smart grid can provide technologies to get the best
of both techniques. In the next section we first discuss the concept of smart grids
and then discuss how DSM will feature in a smart grid.
1.3 Smart Grid
Existing power grids are complex centralized networks. However, growth in demand
and the requirement for greater grid reliability, security and efficiency necessitated a "quantum" leap
in the way grids are managed [Moslehi and Kumar, 2010]. The proposed changes to the
existing grid required harnessing communication and information technologies. This new
management framework towards a "smarter" grid is now widely referred to as the "smart grid".
The smart grid is an initiative that aims at incorporating IT in the electric grid for more control,
visibility and sustainability. One of the goals of the smart grid is to strengthen and support
demand side management programs [Rahimi and Ipakchi, 2010].
DSM programs benefit from the smart grid initiative both directly and indirectly,
since several smart grid technologies resolve the difficulties of implementing
demand side management. First is the provisioning of variable pricing for consumers.
The second is distributed generation, specifically in micro-grids. The third is the concept of
the micro-grid itself, and the fourth, and most important, is advances in the protocols and
technologies of the home automation network (HAN).
We will first discuss the concept of variable pricing and its implications for DSM. Variable
pricing is the model in which the price of electricity varies based on the actual cost of
production at that time. In most existing grids across the world, the price
of electricity is fixed irrespective of time of use. There are bracket pricing methods, but these
differ from variable pricing. For example, in Pakistan, the first 300 units of a bill cost
half as much as the next 200 units, and any energy beyond 500 units costs three times the
base price. This pricing controls the total consumption of electricity in a billing cycle but does not
charge based on the hour of usage.
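The slab structure just described can be sketched as a tiered billing function. This is a hypothetical illustration: the function name and the nominal base price of 10 per unit are our own, not taken from an actual tariff.

```python
def slab_bill(units: float, base_price: float = 10.0) -> float:
    """Bill under a hypothetical three-slab tariff: the first 300 units
    at half the base price, the next 200 at the base price, and any
    consumption beyond 500 units at three times the base price."""
    bill = 0.0
    # First slab: up to 300 units at half the base price.
    bill += min(units, 300) * base_price * 0.5
    # Second slab: units 301-500 at the base price.
    if units > 300:
        bill += (min(units, 500) - 300) * base_price
    # Third slab: anything above 500 units at triple the base price.
    if units > 500:
        bill += (units - 500) * base_price * 3
    return bill
```

Note that this structure is blind to the hour of use: a unit consumed at 9 AM and one at 9 PM are billed identically, which is exactly the gap variable pricing addresses.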
But energy production cost depends on the real demand at each time of day.
Even the slab system does not reflect the fact that energy produced at, say, 9 AM has a
different price than energy produced at 9 PM. The current billing method averages projected
prices, and this average becomes the flat rate for the bill.
Variable pricing would charge users the price of power that was applicable in that
hour. This would, theoretically, give users an incentive to change their usage habits or pay extra:
in a way, a blunt and crude demand side management incentive. But it is unrealistic to expect
a user to sit all day observing energy prices and to optimize her daily electric consumption
based on hourly energy prices.
Distributed generation, on the other hand, is the futuristic concept in which energy production
is distributed so that small production plants are situated closer to the point of consumption.
This provides better control as well as lower line losses. To manage such smaller
self-contained energy modules, the concept of the micro-grid has been suggested. A micro-grid is a
self-contained grid of smaller size connected to the external grid through some interface. A
micro-grid manages its supply-demand equations internally and presents itself to the conventional
grid as a single demand/supply source. Due to the smaller size of a micro-grid, it is possible
to micro-analyze the load and manage DSM and DR programs within the micro-grid more
efficiently. Secondly, due to distributed generation, the needs of the micro-grid may well be
met independent of the conventional grid, providing a much more fluid and efficient management
framework. However, such fluid and dynamic systems require self-management, since
intelligent automation is needed to manage the real-time micro-grid system.
Lastly, recent advances in wireless technologies and the definition of protocols for the home
automation network (HAN), such as ZigBee, have provided an opportunity for communication
and control of devices for better energy management applications [Alliance, 2006]. Through
these technologies it is now possible to surgically manage individual devices instead of
resorting to blanket load shedding.
The combination of distributed generation and micro-grids, coupled with variable pricing
and control of demand through the HAN, can open many avenues for DSM programs. Reducing
the size of the effective grid and having localized generating resources available make it possible
to realize surgical DSM, where individual devices can be monitored, analyzed and planned
for.
However, so far, research in efficient energy management for smart grids is generally
restricted to managing a single house's energy consumption, where the house is either fitted with
renewable resources or has access to variable pricing, or both. Our findings, on the other
hand, show that savings can be greatly increased if we consider a whole micro-grid for planning.
Even simple synchronization of devices or pooling of renewable energy can greatly enhance the
effectiveness of DSM. To our understanding, one of the reasons for this lack of concerted
effort is the lack of a holistic view of the situation. In this thesis we try to present this
holistic view for implementing demand side management in future smart grids.
1.3.1 DSM in Smart Grid
A networked electric grid with the ability to control end user devices can, in theory, drastically
increase the efficacy of DSM. DSM in the smart grid can potentially exploit demand
elasticity, the natural leverage in device usage. For example, when a user puts his dishes in the
dish washer, the system identifies an opportunity to optimize. It informs the user, through
some method such as a panel on the dish washer, a wall-mounted display, or a mobile
phone, that he usually puts the dishes in at 8 AM and does not use them until 4 PM.
If he allows the system to manage the machine, he can save $X or more every day. The user
clicks to agree and the machine is scheduled to run at the optimal time. When the user
wishes to override this plan for any reason, he simply clicks the override button
on the same panel. On the utility side, the system can forecast the load and the load elasticity,
and schedule the elasticity in such a way that demand is maintained at the optimal level.
However, such a system requires heavy investment from the consumer and the utility.
A more feasible system is an extension of forced load shedding. If, instead of shutting off
the total power, only the high-consumption devices are controlled by the utility, and the utility
offers contracts for minimum acceptable service, then a win-win situation may be attained.
This gives the utility similar control; however, instead of blanket load sheds, with
the smart grid infrastructure the utility can intelligently shed load in such a way that the
service level guarantee is not violated. For instance, a user may require that at certain hours
of the day, say between midnight and 5 AM, the HVAC should not be switched off for
more than 30 minutes in an hour. However, it is too cumbersome, in fact infeasible, for
a consumer to plan the usage of devices based on the global energy demand. What is required
is a level of intelligent management which can aid the consumer in making the best decisions
and then automate these decisions on the devices. According to studies, this is a much
more feasible solution at the present time. To this end, in this thesis we present a system,
an architecture and its associated methods to make this self-managing. 1
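As a minimal sketch of how such a service level guarantee could be validated, the following hypothetical function checks a 24-hour shed plan against a contract of the kind described above. The function name, data layout and default values are our own illustration, not part of the proposed system.

```python
def plan_respects_contract(off_minutes_per_hour, window=(0, 5), max_off=30):
    """Check a 24-entry shed plan (minutes the HVAC is off in each hour
    of the day) against a contract limiting curtailment to `max_off`
    minutes per hour inside the protected window [window[0], window[1])."""
    for hour, off in enumerate(off_minutes_per_hour):
        # Only hours inside the protected window are constrained.
        if window[0] <= hour < window[1] and off > max_off:
            return False
    return True
```

For example, a plan that curtails 20 minutes in every hour respects a midnight-to-5 AM contract, while one that curtails 45 minutes at 2 AM violates it.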
Our hypothesis is that:
If we instrument DSM as a self-managing system then we can increase
DSM's effectiveness and make it practical for utilities to increase
DSM's efficiency and applicability in smart grids.
But to construct such a system, various research challenges exist which hitherto have
not been addressed. The challenges discussed below break our hypothesis down into
research questions from the software and algorithmic perspectives of the cyber-physical DSM
system.
1.3.2 Challenges for DSM in Smart Grid
Our approach to DSM is a pro-active strategy that plans consumption so that the peak
load is preempted. This has the distinct advantage of having a provably optimal solution, and
it can avoid cascading failures such as were observed in the Indian blackout of 2012. However,
pro-active planning for the next 24 hours requires that data for future states be available.
Since energy consumption at future times is not readily available, the first research challenge
for this optimal planning is forecasting household loads. In existing grids, since the
controlling plane is a region, a forecast of the region was deemed sufficient. But since DSM
controls devices in a house, it is essential that forecasts of individual devices, or at least of
1Self-management, or the autonomic system, is a paradigm geared at providing intelligent solutions to replicate human interactions in mundane and oft-repeated tasks. Though this was the initial dream, the state of the art in autonomic computing has not only replaced the human operator but, through its techniques, allowed more diverse and elaborate management of resources than was hitherto possible with human operators.
the house, are available for planning. There are three questions which need to be answered before
using the existing forecasting methods for self-managing DSM.
1. Can current short term load forecasting (STLF) models work efficiently for forecasting
individual household loads?
2. Can additional data enhance the forecasting accuracy of individual consumer loads?
3. Can we find the consumption of the relevant devices by disaggregating household
loads with high accuracy?
In chapters 4 and 5 we attempt to answer these questions by proposing forecasting and
disaggregation methods for our system.
The second research challenge is to construct a plan for the system which balances supply
and demand over a future window at the device level. Various planning algorithms exist
in the literature, but to select the appropriate algorithm two practical considerations need to
be weighed. First is scalability, since potentially hundreds of devices would need to
be planned. Since our planning is built on a forecast which may be inaccurate, the second
practical consideration is the robustness of the planner. Either the planner should make a robust plan
which considers the expected perturbations in the system, or it should be adaptable
enough to re-calculate a solution if the forecast fails over time. Chapter 6 provides methods
to address this concern.
The third research challenge is to make the planning of the second challenge adaptable and
scalable. Since our system constructs a plan for an evolving environment, it should be able
to model the environment at runtime. This means that if devices are added to or removed
from the system, the system is able to adapt its model without explicit intervention by
administrators. Chapter 6, section 3 provides a modeling method to answer this challenge.
Fourth, given the various components that work together to deliver the solution, a defined
architecture should exist to facilitate the development and integration of the components.
In chapter 3 we present an architecture which ties the various cogs together.
1.4 Limitations, Assumptions and Scope
We specifically limit the scope of our DSM system to households and small consumers. As
will be discussed in chapter 2, DSM for industry and large-scale consumers has been applied
to good effect. But with 40 to 50% of the load being contributed by households, DSM for the
domestic consumer is an important target which hitherto has not been looked at in great
detail.
Second, classification studies have identified certain classes of devices which contribute most
to the peak load. These are usually high-energy devices for heating and cooling purposes.
To maximize the impact of the DSM strategy and minimize the impact on the consumer's life,
the system is specifically scoped to model and control the high-impact devices only.
This thesis is particularly interested in identifying the algorithmic and software components
needed for self-managing DSM. To this end, we restrict our argument to the software
components of the cyber-physical DSM system. We assume that a viable communication
medium exists to communicate the consumption of, and plans for, devices. We also assume
that the required legislation and physical deployment of hardware will be carried out for
practical application of the proposed solution.
A limitation follows from this assumption, in that we assume that the data provided by
the devices is perfect. Changes would be required in the planning and forecasting models to
make the system robust to noisy data.
Another limitation of the thesis is related to the data. We have specifically used data from
the California Independent System Operator and from the city of Eskilstuna, Sweden. The results
may not be exactly replicable for other regions, and the techniques may require adequate
modification. Furthermore, the planning is specifically constrained to the tropical hot climate
conditions of the location of the author's university.
1.5 Summary of Contributions
Here we briefly introduce the contributions made in this thesis towards achieving self-managing
demand side management. The contributions are:
Closed loop autonomic DSM framework for smart grids
DSM measures have been around since 1985 [Gellings, 1985]. Gellings' description of DSM
was holistic for the then-existing grid. However, since 2006 most of the research in DSM has been
within the ambit of smart grids. But for each contribution, the authors assume a new, undefined
architecture within which their contribution can work. To our knowledge, in this thesis we
present the first holistic closed-loop DSM architecture which can incorporate a variety of
DSM strategies proposed in the literature and in this thesis.
Furthermore, existing demand side management solutions have been passive in their interactions.
The user is provided with some incentives and is expected to adjust her
energy use according to the incentives. On the utility side as well, it is hoped that the
incentives will be used as much as possible. However, research has shown that this task
is too tedious for end users to perform and results in only partial use of the incentives
[Kim and Shcherbakova, 2011]. Future smart grids are expected to be even more complicated.
In such a scenario, self-management can be critical to the implementation of DSM. To
ameliorate this, we propose autonomic DSM. Autonomic DSM automates parts of the
system to reduce the conscious effort required of the user to avail DSM measures.
Forecasting Household Energy Consumption for DSM
Future smart grids are expected to manage energy consumption at the household level.
To achieve this goal, a good forecast of the controlling plane is required. In this thesis we
show that the existing forecasting methods, and the emergent forecasting strategy proposed
by some researchers, are not very efficient. In comparison, we show that a multi-dimensional
model considering the anthropological and structural aspects of the house is 1. more accurate
and 2. less computationally expensive.
Planning and Modeling for Self-managed DSM
Scheduling a large number of events while satisfying constraints is in general an NP-hard
problem. It has been shown in the literature that demand side management is reducible to a
scheduling problem, and finding an exact solution for large enough systems is not tractable. In most
cases, an approximate algorithm is used to resolve such a problem. However, for small
enough problems the system can be solved exactly.
In this thesis we present a system which is self-managing at two levels.
First, we propose a self-managing DSM system which intelligently automates the management
of devices in a grid in such a way that the user is provided optimal service while the
load curtailment goals are met. While many algorithms have been proposed which achieve
similar goals, our proposed solution, in addition to optimal service, also provides a service level
guarantee which limits the maximum amount of load curtailed for each consumer. In this
way we guarantee fairness to all the consumers of the service.
Second, we propose a self-managing optimization engine which autonomically selects the
correct algorithm based on the system statistics. That is, when the system size is small, the
self-managing engine selects an exact solution method, and when the system size is large or
the time constraints are hard, an appropriate approximate solution is selected. This allows
the system to provide optimal service based on the system dynamics.
To achieve such dynamic algorithm selection, we propose a dynamic system modeling
methodology which constructs a system model at runtime from the available raw data. This
model is then provided to the planning algorithm for optimized scheduling.
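The selection logic described above can be sketched as a simple dispatch function. The threshold, deadline and solver labels are illustrative assumptions, not the actual criteria used by the engine.

```python
def select_solver(num_devices: int, deadline_s: float,
                  exact_limit: int = 50) -> str:
    """Pick a solution method from system statistics. The threshold of
    50 devices and the 10-second deadline are invented for illustration."""
    # Small instances with a generous time budget can be solved exactly.
    if num_devices <= exact_limit and deadline_s >= 10.0:
        return "exact"          # e.g. an integer/linear programming solver
    # Large instances or hard time constraints fall back to a heuristic.
    return "approximate"        # e.g. a greedy or metaheuristic scheduler
```

At runtime the engine would feed the current system model's statistics into such a dispatcher before planning each window.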
Chapter 2
Literature Survey
Building autonomic demand side management is a multifaceted, multidisciplinary undertaking.
Establishing DSM in future grids requires an amalgamation of techniques from domains
across energy management, artificial intelligence and software engineering. In this chapter
we first look at the literature that exists on planning of demand side management systems,
specifically those intended for smart grids. We then look at the literature on short term
load forecasting, followed by a survey of load disaggregation. Finally, we look at the literature
on self-managing systems for optimization and planning of energy systems, to give a
survey of self-managing systems for demand side management.
2.1 Introduction
Electric grids of the future are envisioned to be cyber-physical systems where information
technology components will be integrated with physical hardware components to make energy
cheaper, more efficient, cleaner and perhaps more sustainable. This vision has been given the title
of the smart grid [Coll-Mayor et al., 2007].
One of the primary applications of the smart grid is envisioned to be a more effective
and efficient demand side management program [Rahimi and Ipakchi, 2010]. Due to the
lack of communication and algorithmic support, DSM has historically been applied to demand
shaping for the critical peak loads of industrial consumers [Albadi and El-Saadany, 2008,
Cappers et al., 2010, Strbac, 2008, Saele and Grande, 2011]. Since the industrial units had
the human resources and infrastructural support, such DSM was possible.
Since domestic consumers are the largest electricity-consuming sector, close to 45% in
most regions [of Finance Pakistan., 2009], it has been hoped that DSM can be deployed
for this sector [Rahimi and Ipakchi, 2010]. With the advances in information technology, it
is now possible to achieve this goal, as has been discussed in chapter 1.
However, to deploy a demand side management program for domestic consumers, it is
necessary to resolve four problems. The first is to develop planning algorithms for household-level
DSM. The second is to have sufficient forecasting and measurement algorithms to support
the planning. The third is a way to distinguish which load is being used from this forecast.
The fourth is to have a level of self-management that relieves the consumer of minute decision-making
details.
In this chapter we first discuss the literature that exists on planning demand
side management at the household level. We classify the existing work into three broad
categories based on the criticality of need. We further sub-divide these classes based on different
social and technical parameters. Next we look at the forecasting algorithms that can be used
for forecasting energy in the near future. In the literature, such a forecast is called short term load
forecasting (STLF). We present a survey of existing STLF techniques and categorize them
according to algorithm class. We then present a survey of the load disaggregation
research. Last, we discuss various self-managing energy management systems present in the
literature.
2.2 Demand Side Management
Demand side management is the management of end user consumption to manage the energy
supply-demand equation. It was first proposed by Gellings in 1985 [Gellings, 1985]. Since
then, DSM has been applied in different regions and in different scenarios. There are a
number of studies citing the benefits and pitfalls of DSM programs across the globe. For
example, Walawalkar and colleagues provide an overview of the evolution of DR programs in the PJM
and NYISO markets [Walawalkar et al., 2010]. They also analyze current opportunities that
exist in these markets for DSM expansion. Strbac discussed the benefits of implementing DSM
in the UK [Strbac, 2008]. Saele and Grande presented results of a user-executed DSM strategy
in Norway showing a reduction in energy use at peak times [Saele and Grande, 2011]. Cappers
and colleagues studied demand side management programs in the United States which resulted
in 38,000 MW of peak load reduction [Cappers et al., 2010]. Albadi and El-Saadany provide
a summary of demand response in electricity markets [Albadi and El-Saadany, 2008].
Though these works point to the effectiveness of demand side management, DSM in most
existing energy systems targets large commercial and industrial settings, for a variety
of reasons. Different studies have identified these reasons. Studies such as those by Kim and
Shcherbakova [Kim and Shcherbakova, 2011], Lisovich and Wicker [Lisovich and Wicker., 2008],
Greening [Greening, 2010] and Breukers and colleagues [Breukers et al., 2011] point to different
socio-cultural and economic aspects of this lack of effectiveness. We can classify the
problems in three categories:
1. Technical shortcomings, such as algorithm design, network setup and hardware design.
2. Social aspects, such as users' willingness to participate and privacy issues.
3. Economic issues, such as cost of deployment and benefit-sharing strategies.
Such studies point to the issues that need to be resolved for a DSM program to succeed. To
resolve these issues, various algorithms have been proposed, which we discuss below. One
interesting observation in this regard is that a DSM program's response to the objections
above is strongly related to the criticality or utility of DSM. In some instances the need for
DSM is so severe that social aspects are ignored. But when the need for DSM is less severe,
social aspects such as privacy and satisficing behavior [Sheth and Parvatiyar, 1995] become
significant drivers in algorithm design. Based on need, we classify DSM programs into
three classes: Critical, where DSM is a necessity; Renewable integration, where DSM optimizes
and accentuates the use of renewable energy sources; and TOU optimization, where DSM uses
the utility-provided incentives to increase savings.
2.2.1 Critical DSM
Critical DSM programs are those demand side management programs where energy conservation is
critical enough that end user devices are controlled automatically; otherwise system stability
would be compromised. Since the utility is pro-actively controlling consumer devices, the control is generally
restricted to heavy loads, such as HVAC, to minimize consumer impact. Secondly, since
switching off any device at any time is a very intrusive and perhaps rude method, systems
generally try to limit utility-driven switching. Based on this aspect, three types
of algorithms for DSM have been proposed. The first type is user comfort driven, where the user's preference
is modeled explicitly and the DSM algorithm tries to meet DSM goals within the comfort
bounds of the user. The second set of algorithms is contractually bounded, where the amount of
time a device can be switched off is limited. The third type is explicit feedback, where the user is explicitly
asked to provide preferences or to select from a range of choices for DSM.
User Comfort Driven
The algorithms in this category try to capture the comfort of the user using different models.
Based on these comfort ranges, the DSM algorithm attempts to reduce load whenever needed.
There are various models used to ascertain user comfort. Venkatesan and colleagues
plan by utilizing consumer behavior modeling, considering different scenarios and levels of
consumer rationality while observing the voltage profile and losses [Venkatesan et al., 2012]. Fan
used lessons learnt from research on internet traffic and proposed a DSM where
user preferences were modeled as willingness to pay [Fan, 2011]. Molderink and colleagues
presented a three-tier demand side management system which planned device usage using
forecasts of devices and matching the consumption with the global supply equations
[Molderink et al., 2010]. However, it is unclear how the savings will be achieved. Furthermore,
the DSM program plans only for the forecasted load, and if the consumer behavior
differs from the forecast, then the DSM planning is over-ruled by the consumer demand. This
results in failure of DSM at times of bad forecasts. Du and Lu proposed an HVAC
management plan where the heating and cooling loads are modeled using the thermal dynamics
of heating and cooling [Du and Lu, 2011a]. The algorithm then uses a
comfort index in conjunction with the thermal-dynamically modeled system to plan power cuts.
Daoxin and colleagues first model all the major market participants together with the
constraints of transmission and generation. Then the energy market is analyzed with RER
uncertainties and demand response [Daoxin et al., 2012].
A generalized model of these techniques is provided in figure 2.1. The DSM system
elicits comfort information from the devices, either through rationality arguments, willingness to pay,
Figure 2.1: Comfort Modeling
or through thermal dynamics, etc. This model is used to provide a forecast to the DSM
planning algorithm, which sends a signal to device actuators to implement the DSM scheme.
The power supply from the utility was, in all of these cases, not directly affected.
Contractually Bound Systems
Contractually bound systems are those systems where consumers are provided with different
contracts. These contracts limit the extent of DSM load shedding. Figure 2.2 shows
a model for such a system. The user consumption data is forecasted using some forecaster.
The DSM is provided with the contractual bounds within which it can plan the
DSM strategy. The planner schedules loads such that DSM goals are met while staying
within the contractual obligations. One instance of such a system is presented by Javed and
Arshad, who propose a linear programming solution to manage HVAC loads in
a city. This algorithm models the problem as a series of equations which provide
a fair and service-level-bound consumption for the users while maintaining the load management
goals [Javed and Arshad, 2009a]. Kwag and Kim propose using customer information,
namely the registration and participation information of DR, to provide indices for evaluating
customer response, such as DR magnitude, duration, frequency and marginal cost
[Kwag and Kim., 2012]. Pedrasa and colleagues, on the other hand, limit the amount of load
shed based on its effect on the consumer [Pedrasa et al., 2010]. In both cases the contracts
can vary.
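A heavily simplified sketch of contract-bound curtailment (not the linear programming formulation of [Javed and Arshad, 2009a]) would greedily shed load to meet a curtailment target while respecting each consumer's contractual cap. The function name, greedy ordering and units are our own assumptions.

```python
def shed_within_contracts(loads_kw, caps_kw, target_kw):
    """Greedy sketch: curtail load to achieve `target_kw` of total
    reduction without curtailing any consumer beyond their contractual
    cap `caps_kw[i]`. Returns the per-consumer shed amounts."""
    sheds = [0.0] * len(loads_kw)
    remaining = target_kw
    # Visit the largest loads first so fewer consumers are touched.
    for i in sorted(range(len(loads_kw)), key=lambda i: -loads_kw[i]):
        if remaining <= 0:
            break
        # Never exceed the contract cap, the actual load, or the target.
        cut = min(caps_kw[i], loads_kw[i], remaining)
        sheds[i] = cut
        remaining -= cut
    return sheds
```

An LP formulation would replace the greedy loop with an objective (e.g. fairness or cost) optimized subject to the same cap and target constraints.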
Explicit Feedback
DSM programs in this category expect the user to provide guidelines on his preferences.
Figure 2.3 presents a model of such a system. Here the constraints are explicitly provided by the
consumer through some interface. These constraints are the limitations for the DSM planner,
similar to the contractual case. Various algorithms have been proposed for
explicit feedback critical DSM systems. Escrivá and colleagues evaluated different control
strategies to reduce the cost of HVAC using contractual clauses [Escrivá-Escrivá et al., 2010].
Here the contracts are the explicit guidelines by the user to control the load. Kim and Poor
proposed a Markov decision process based approach for scheduling elastic loads, where elastic and
non-elastic loads are assumed to be known [Kim and Poor, 2011].
Since such management is very cumbersome, some DSM programs reduce the burden on
consumers by simplifying the priority scheme. Ranade and Beal proposed colorPower, where
consumers assign colors to their devices [Ranade and Beal, 2010]. Each color is assigned a priority.
The scheduling algorithm prioritizes the shutdown of devices based on these color-based
priorities. Du and Lu [Du and Lu, 2011b] propose an appliance commitment algorithm that schedules
thermostatically controlled household loads based on price and consumption forecasts,
considering users' comfort settings, to meet an optimization objective such as minimum payment
or maximum comfort.
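In the spirit of such color-based priority schemes, a priority-ordered shutdown can be sketched as follows. The color-to-priority mapping, tuple layout and tie-breaking rule are our own illustration, not the colorPower algorithm itself.

```python
# Lower number means shed first; the mapping is an assumed example.
PRIORITY = {"red": 0, "yellow": 1, "green": 2}

def shutdown_order(devices):
    """Order (name, color, load_kw) tuples so that devices with the
    lowest-priority color are shed first, and within a color the
    largest loads are shed first."""
    return sorted(devices, key=lambda d: (PRIORITY[d[1]], -d[2]))
```

A scheduler would then walk this ordering, switching devices off until the curtailment target is met.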
Figure 2.2: Contractually bound
Mohsenian-Rad and colleagues presented a theoretical result where they posed DSM as
a game and proved that, if the utility can fix prices at certain levels, then the consumers'
choice of strategies admits a Nash equilibrium [Mohsenian-Rad et al., 2010].
This means that, under the constraints defined, if each consumer picks his best strategy then
all the consumers will benefit.
Faria and Vale present DemSi, a demand response simulator that allows a utility
provider to evaluate the effects of demand response and to study demand response actions and
schemes in distribution networks [Faria and Vale, 2011].
2.2.2 DSM for Price Responsive Systems
Demand side management has historically been a passive program where utility providers
cajole or convince users to consume energy in a manner which reduces cost to the utility.
There are two methods generally applied: positive feedback, providing rebates to consumers
who reduce energy at peak demand times, and a negative feedback mechanism which
penalizes consumption at peak timings. However, such systems were rigid, since the timings
were fixed and not based on actual supply-demand, and resulted in very low user response
[Kim and Shcherbakova, 2011]. With the advent of smart grids and the integration of information
technology with the electric grid, it is possible to make these feedbacks reflective of the actual
supply-demand situation. Furthermore, due to automation and large-scale collaboration,
user fatigue can be avoided, which various researchers consider the most important factor
in the lack of response to existing DSM programs [Kim and Shcherbakova, 2011].
The incentives proposed in the smart grid literature are of two types: fixed price incentives,
where the price of electricity is fixed for all days and some automation method is proposed
to reduce user fatigue, and time of use systems, where the price of electricity changes dynamically
based on actual demand and supply. Another category of systems are those where
some storage is available to the user. This storage is used as a buffer for buying electricity
Figure 2.3: Explicit Feedback
at lower-price times, to be consumed, or sold back to the system, at higher-price times. Below
are the various works which fall under these categories.
Fixed Price Incentive
Fixed price incentive systems usually provide a set of prices that are applied by different
utility providers as DSM incentives. There are two types of such systems: global systems, which
propose a price-fixing mechanism, and automation mechanisms, which automate the consumption
to maximize the incentive benefits. Aalami and colleagues, for instance, proposed an incentive-defining
mechanism for demand response programs which penalized customers for
not responding to load reduction messages [Aalami et al., 2010]. In comparison, Moghaddam
and colleagues presented an economic model for the response of consumers based on their flexible
demand [Moghaddam et al., 2011]. They proposed a customer benefit function which is used
for managing the load of each house, responding to the incentives such that the load
reduction goals are achieved while maximizing the consumer benefit.
On the other side, various algorithms have been proposed to automate the consumption to
maximize the benefits of the incentives. Fan proposed a user response algorithm to respond
to fixed price incentives and presented results on a simulated environment [Fan, 2011].
Finn and colleagues proposed a dish washer scheduling algorithm, extendable to
other loads of a similar nature, to exploit DSM incentives [Finn et al., 2012]. Giorgio and
Pimpinella proposed an event-driven Smart Home Controller enabling consumer economic
savings by exploiting the DSM incentives [Giorgio and Pimpinella, 2012]. Rastegar and colleagues
proposed an optimal and automatic residential load commitment (LC) framework
to achieve the household's minimum payment, again in a fixed peak load pricing regime
[Rastegar et al., 2012].
Figure 2.4 provides a model for these systems. The incentives are computed using the
first set of algorithms and provided to the DSM planners. These incentives are constant
over time. The consumption is forecast and the planner uses the incentives to maximize
the consumers' savings.
Time of Use
The more recent advancement is Time Of Use (TOU) pricing, where the utility charges
consumers the price of energy at the time of consumption. This results in a very complex
scenario where forecasting energy prices and adjusting consumption accordingly becomes
too difficult a task for most consumers. Figure 2.5 represents DSM systems of this type.
Here, instead of a fixed incentive, a cost calculation mechanism is added which calculates the
cost of energy instantaneously. To facilitate the consumer, different algorithms are thus
proposed to plan her consumption. Ramchurn and colleagues proposed a decentralized
agent-based algorithm to coordinate deferment of loads through collaboration of agents
[Ramchurn et al., 2011]. Lee and Lee proposed a scheduling algorithm using TOU to reduce
cost and showed positive results on a simulation [Lee and Lee, 2011]. Datchanamoorthy
and colleagues proposed a TOU pricing model for monopolies and applied the algorithm
on a simulation [Datchanamoorthy et al., 2011]. Chen and colleagues proposed a stochastic
algorithm for management of residential loads in response to TOU pricing
[Chen et al., 2012]. Fuller and colleagues [Fuller et al., 2011] and Valenzuela and colleagues
[Valenzuela et al., 2012] proposed simulation setups where they used various economic
market models to observe the effect of TOU pricing and consumers' response following
economic modeling techniques.
Incentive with Storage
A third stream in this research is to use storage devices to store energy at lower prices and
resell or use this stored energy at times of higher prices. Figure 2.6 models such systems. In
these systems a physical storage device is available to the system to store the electric energy.
Figure 2.4: Incentive based
Figure 2.5: TOU based
This is incorporated in the planner algorithm. Some of the interesting works in this domain
have used electric vehicles' batteries as strategic storage. Shao and colleagues proposed
using electric vehicles' storage banks as buffers for scheduling DSM while providing an
interface to accommodate customer choice [Shao et al., 2012]. Xu, Xie and Chen proposed
an optimal electric energy storage (EES) scheduling algorithm which uses day-ahead pricing
and forecasted energy load to schedule the EES. The proposed system then uses a model
predictive controller to adjust the schedule to handle the inaccuracies of the forecast
[Xu et al., 2010].
2.2.3 Distributed Generation Supported by DSM
A new facet of demand side management systems has been the usage of DSM for maximizing
utilization of distributed generation facilities. In most cases this distributed generation is
through renewable sources. The goal of these systems is to maximize the utilization of
renewable resources to reduce the cost of overall energy. Figure 2.7 models the systems of
this category. Here, in addition to the incentives, the system has physical generation units
to supply energy. In most cases there is some storage medium as well, but this is not always
the case. Finn and colleagues proposed a device management system which integrates the
renewable sources to reduce the cost of energy [Finn et al., 2012]. Livengood and Larson
proposed the Energy Box, a smart controller which used stochastic dynamic programming
to schedule device consumption while considering the generation from renewable resources
[Livengood and Larson, 2009].
In some regions, the utility provider buys back extra energy generated from renewable
sources. Gudi and colleagues used this facility for optimal management of distributed
renewable resources where extra energy was also sold to the utility [Gudi et al., 2011].
Xiaohong and colleagues proposed a coordinating strategy for energy sources and loads for
low-energy buildings [Guan et al., 2010]. Jiang and Fei proposed a distributed generation
integration mechanism using hierarchical agents [Jiang and Fei, 2011].

Figure 2.6: Incentive with storage based
2.3 Short Term Load Forecasting
Short term load forecasting is the task of predicting a single system's load for a short future
window [Gross and Galiana, 1987]. The data for STLF is usually in a time series format,
that is, each consecutive value represents the next observed value of the system. The future
window ranges from one hour to a week, and the forecasting granularity varies from
15-minute windows to an hour. We will discuss the application of each technique here. The
underlying systems being modeled vary, but there are a few commonalities in almost all of
them. First, the systems are non-linear. Second, they exhibit diurnal trends, with usually
two peaks in a day corresponding to midday and late evening. Third, since historically
STLF has been applied to forecast the load of an entire grid, the patterns are usually
smooth. Two fundamental classes of techniques have been applied to construct this forecast:
statistical and time series models, and AI techniques. Some researchers have gone further
to combine AI optimization models to tune the statistical as well as AI forecasting
techniques, and proposed hybrid systems.
The aforementioned techniques have been applied with great success in grid-level systems.
However, since the advent of smart grids, forecasting small-scale loads such as those of
micro grids, buildings and houses has also seen growing interest. In this section we will
first discuss the time series and basic AI techniques, followed by the hybrid methods. We
will then discuss STLF techniques for small-scale loads, since this is the forecast that our
planning requires.
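The time-series framing described above can be sketched in a few lines. The following is a minimal illustration, not any cited author's method: each training example pairs a day of hourly lags with the value to forecast one hour ahead. The load values are synthetic and only meant to mimic the diurnal double-peak shape mentioned earlier.

```python
def make_windows(series, n_lags=24):
    """Turn a load time series into (lag-window, next-value) pairs."""
    examples = []
    for t in range(n_lags, len(series)):
        examples.append((series[t - n_lags:t], series[t]))
    return examples

# Two days of synthetic hourly loads (illustrative kW values) with the
# midday and late-evening peaks typical of grid-level STLF data.
day = [2, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10,
       11, 10, 9, 8, 8, 9, 11, 12, 10, 7, 4, 3]
history = day * 2
windows = make_windows(history)
# 48 hourly readings minus 24 lags leaves 24 training examples.
```

Any of the statistical or AI models surveyed below can then be fit to such (window, target) pairs.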
2.3.1 Statistical and Time Series Techniques
Statistical techniques initially relied on smoothing and averaging to build a model. The
hypothesis is that the system has a central tendency which can be captured using these
techniques. Papalexopoulos and Hesterberg used regression for STLF [Papalexopoulos and Hesterberg, 1990]
and Christiaanse used exponential smoothing [Christiaanse, 1971]. Lauret and colleagues
used a Gaussian process model [Lauret et al., 2012]. Irisarri applied Kalman filters for the
same task [Irisarri et al., 1982]. Amjady used an ARIMA model [Amjady, 2001]; however,
the non-linearity of the underlying system resulted in low accuracy.

Figure 2.7: Incentive with renewable with/without storage based

To resolve this issue Hagan used the Box-Jenkins method [Hagan and Behr, 1987] and
Garcia and colleagues applied GARCH [Garcia et al., 2005]. Weron has described various
methods for applying statistical methods in [Weron, 2006]. Although statistical and time-
series methods produce sufficiently good results, they have certain limitations. One of
the reasons is that time-series methods consider only the load data; any other supporting
information, such as temperature, is not considered. This results in inaccurate forecasts
for times when these factors significantly alter the general course of the time series. For
this reason AI techniques have been applied to STLF. Amaral used a smooth transition
periodic autoregressive method [Amaral et al., 2008] as a non-linear method for time-series
forecasting in STLF.
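As a concrete illustration of the smoothing family cited above, the following is a minimal sketch of simple exponential smoothing; the series and the smoothing constant are illustrative assumptions, not taken from [Christiaanse, 1971].

```python
def ses_forecast(series, alpha=0.5):
    """Simple exponential smoothing: the one-step-ahead forecast is a
    weighted average that discounts older observations geometrically."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level  # forecast for the next period
```

The single parameter alpha trades responsiveness against smoothness, which is exactly why such models struggle when an unmodeled factor such as temperature shifts the series.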
2.3.2 Arti�cial Intelligence Techniques
Various AI techniques have been applied to STLF since the early days of the field. The
major focus in the field is on artificial neural networks, but pattern recognition, fuzzy
systems and SVMs have been used as well.
Arti�cial Neural Networks
Artificial neural networks attempt to replicate biological thinking patterns to optimize
and predict systems. ANNs have been used for a long time for STLF. Hippert reviewed the
artificial neural network techniques applied to STLF in 2001 [Hippert et al., 2001]. Since
then different ANN algorithms in different configurations have been applied. Fuzzy neural
networks were applied by Bakirtzis and colleagues [Bakirtzis et al., 1995] and a self-organizing
fuzzy neural network was applied by Dash and colleagues [Dash et al., 1998].
Abdel-Aal used a committee of neural networks for STLF [Abdel-Aal, 2005]. Chen and
colleagues and Yao and colleagues partitioned the source load into components using the
wavelet transform and then trained an ANN for each component for a more accurate forecast
[Chen et al., 2010, Yao et al., 2000]. Amjady and Keynia in comparison used the wavelet
transform but with an evolutionary ANN for STLF [Amjady and Keynia, 2009]. Lauret used
a Bayesian ANN for auto-tuned STLF [Lauret et al., 2008]. AlFuhaid used cascaded neural
networks for STLF [AlFuhaid et al., 1997].
Support Vector Machines
Support vector machines project data into higher-dimensional spaces in order to find
discriminating planes. This method has been used for regression and forecasting as well.
Mohandes showed that SVMs outperform standard auto-regressive models and certain ANNs
[Mohandes, 2002]. Zhang applied a different kernel for SVM with better results [Zhang, 2005].
Chen and colleagues reported on the results of applying SVM to STLF in the EUNITE
competition [Chen et al., 2004].
Since tuning parameters for an SVM has a significant effect on the outcome, most SVMs
are used in some hybrid system where an optimization algorithm tunes the SVM for
forecasting. We will look at these systems below.
Pattern Recognition
Pattern recognition is the task of identifying specific patterns in data. The basic intuition
for pattern recognition in STLF is to find patterns in history and forecast the future based
on these patterns. Dehdashti used this technique for STLF in 1982 [Dehdashti et al., 1982].
Dai and Wang used a similar technique in conjunction with neural networks for STLF
[Dai and Wang, 2007].
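The pattern-matching intuition above can be sketched very simply: find the historical day whose load profile is closest to today's, and reuse the day that followed it as the forecast. The data and the squared-error distance are illustrative assumptions, not the cited authors' formulations.

```python
def nearest_pattern_forecast(history_days, today):
    """history_days: chronological list of equal-length daily load profiles.
    Returns the day that followed the historical day most similar to today."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Only days that have a successor in the history can be matched.
    best = min(range(len(history_days) - 1),
               key=lambda i: dist(history_days[i], today))
    return history_days[best + 1]

# Toy three-hour profiles; day 0 is the closest match to `today`,
# so the forecast is day 1's profile.
days = [[2, 5, 3], [2, 6, 3], [9, 9, 9], [1, 1, 1]]
forecast = nearest_pattern_forecast(days, today=[2, 5, 4])
```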
Other Methods
In addition to the traditional machine learning methods, some researchers have applied other
artificial intelligence algorithms to STLF. Rahman and Bhatnagar proposed an expert-
system-based method [Rahman and Bhatnagar, 1988] where input from experts is elicited
to form a knowledge base and then distance measures are used to identify the appropriate
scenario for forecasting. However, expert-system-based forecasting is not commonly used.
Yang and Huang [Yang and Huang, 1998] used fuzzy logic. Mastorocostas and colleagues
used fuzzy logic with constrained optimization [Mastorocostas et al., 2000] for STLF, with
varied results. Fuzzy logic, though, is used more in simulation than in forecasting. Chen
and colleagues, on the other hand, used an agent-based distributed forecasting mechanism
[Chen et al., 1993].
2.3.3 Hybrid Techniques
Hybrid techniques are those where multiple techniques are used in combination to increase
the accuracy of the forecast. There are four variations: time series techniques tuned by AI,
time series techniques integrated with AI for better results, AI techniques tuning the
parameters of another AI technique, and AI techniques integrated with another AI technique
for better results.
Examples of the first class, where AI tunes a time series model, are the works of Desouky
and colleagues, who use an ANN to tune ARIMA [El Desouky and Elkateb, 2000], and of
Nie and colleagues, who tune ARIMA with an SVM [Nie et al., 2012]. In comparison, He
and colleagues use ARIMA as the base forecaster while an SVM forecasts specific non-linear
points in the data for increased accuracy [He et al., 2006]. Lu and colleagues proposed a
similar system where they used an ANN instead of an SVM to support the ARIMA forecast
[Lu et al., 2004].
There are various hybrid models where an AI technique is used to tune another AI
technique. Carpinteiro and colleagues used a self-organizing map (SOM), a type of ANN,
to optimize another SOM, where the second SOM is used for forecasting
[Carpinteiro et al., 2004]. Hippert and Taylor used a Bayesian inference system to tune an
ANN forecaster [Hippert and Taylor, 2010]. Fan and Chen used a SOM to tune an SVM
[Fan and Chen, 2006]. Sun and Zou tuned an ANN with particle swarm optimization (PSO)
[Sun and Zou, 2007]. Yun and colleagues tuned an ANN with a neuro-fuzzy optimizer
(ANFIS) [Yun et al., 2008]. Pai and Hong tuned an SVM for forecasting through simulated
annealing [Pai and Hong, 2005].
The last category is where an AI technique is integrated with another AI technique
for better results. Dai and Wang used an ANN for forecasting but used pattern recognition
for the patterns where the ANN failed [Dai and Wang, 2007]. In comparison, Jain and
Satish used clustering to partition the data and then forecasted the partitions using an ANN
[Jain and Satish, 2009].
2.3.4 STLF for Buildings and Micro grids
Recent advances in the smart grid have forced a new dimension in STLF research. Previously,
STLF methods were focused on large population sets, but due to the prevalence of smart
grid ideas, research has recently focused on STLF for small-scale systems as well. STLF
for small-scale systems has proven to be a much harder problem than for large-scale systems,
as explained by Amjady and colleagues in [Amjady et al., 2010]. [Amjady et al., 2010]
and [Gurguis and Zeid, 2005a] have proposed solutions which work better than standard
STLF at a micro-grid or building-level granularity. However, the accuracy of these systems
still does not match that of large-scale STLF due to volatility issues.
2.4 Load Disaggregation
Load disaggregation is the task of identifying individual loads by observing the total load of
the system. In essence, we disaggregate the individual loads from the aggregated load that
is reported by the main meter. Zeifman and Roth presented a good survey of the technique
[Zeifman and Roth, 2011]. The load disaggregation systems reported in this survey and
elsewhere, such as by Marceau and Zmeureanu [Marceau and Zmeureanu, 2000], almost
entirely apply load disaggregation for non-intrusive load monitoring or event detection,
that is, to identify loads in the house without instrumenting the devices with sensors
[Hart, 1992]. It is assumed that for such disaggregation a live feed of the load is available.
There are two classes of algorithms for load disaggregation, based on data frequency.
Algorithms for low frequency data, at the rate of one sample per second or slower, are
generally for heavy load detection and use active and reactive power and other macroscopic
parameters [Cole and Albicki, 1998, Farinaccio and Zmeureanu, 1999, Norford and Leeb, 1996,
Powers et al., 1991]. The results from these studies, however, are not very accurate due to
the severe complexity of the problem. On closer inspection, it can be observed that the
larger loads have a higher detection accuracy than the rest. This specific fact is of great
importance to us, as we will see in chapter 5. Algorithms for high frequency data, at
sub-second rates, attempt to find almost all possible loads from the aggregated data
[Chan et al., 2000, Gupta et al., 2010, Leeb et al., 1995, Srinivasan et al., 2006]. Recently,
Lam and colleagues constructed a taxonomy of devices based on load signatures
[Lai et al., 2010]. This strategy has resulted in a drastic improvement of results, and
accuracy of up to 92% has been reported [Hassan et al., 2013].
Our task, in contrast, is to find the events of heavy load usage in the forecasted load, where
the forecasted load is at an hourly frequency. Since we plan to manage only the largest loads
of the house, there is hope that we may be able to identify those loads with a degree of
confidence.
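The combinatorial core of the low-frequency case above can be sketched as a subset search: given the rated powers of the candidate heavy devices, find the combination whose summed draw best explains an observed step in the aggregate load. The device names and ratings below are hypothetical, and real disaggregators use far richer features than rated power alone.

```python
from itertools import combinations

def explain_step(step_watts, ratings):
    """ratings: dict of device name -> rated power (W). Returns the subset
    of devices whose summed rating is closest to the observed step."""
    names = list(ratings)
    best, best_err = (), float("inf")
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            err = abs(step_watts - sum(ratings[d] for d in combo))
            if err < best_err:
                best, best_err = combo, err
    return set(best)

devices = {"ac": 1500, "fridge": 150, "pump": 750}
# An observed jump of 2250 W is best explained by the AC plus the pump.
```

The search is exponential in the number of devices, which is tolerable here precisely because only the few largest loads are considered.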
2.5 Self-managing Energy Systems
Self-managing systems are systems which are able to manage their situation themselves.
By managing their situation it is implied that they are able to self-configure, self-heal,
self-protect or self-optimize under normal operating conditions. By this definition, a
self-managing energy system should be able to identify points of optimization on its own
and reduce its cost or increase its throughput accordingly. This self-awareness should in
theory be the differentiating factor between a DSM algorithm and a self-managing energy
system. However, such self-managing systems are very few. There are two domains in
which such systems have been investigated by the self-managing community. Server farm
operations and their energy needs are by far the more studied area in the self-managing
community; however, a few instances of self-managing systems for home consumers also
exist. In this section we present a survey of these self-managing systems. We will first look
at the server farm solutions, followed by self-managing systems for home consumers.
2.5.1 Server Farms
Reducing the energy load of a server farm holds great importance for the server farm
operator. In most cases server farms host third-party applications. These applications have
varying load. The third-party vendors pay the server farms according to the traffic they
receive; thus it becomes important that resources are scaled according to load as well. Given
these constraints, managing the energy of server farms becomes a very complex and involved
task. A plethora of studies exist on its optimum management, but since the scope of this
thesis is limited to self-managing energy systems for home consumers, we will only mention
some of the key works which represent the domain of server farm energy management
systems. The goal is not to provide an exhaustive list of energy management systems for
server farms; rather, it is to present a holistic picture of what is done in server farm energy
management, to draw parallels with home consumer energy management.
We can divide the self-managing energy systems for server farms into three categories:
those which plan, model or visualize the energy consumption. We will look at each in turn.
Modeling
The main task of a server farm is to respond to requests. Since responding to these requests
expends energy, most of the studies that model consumption have linked requests with
energy consumption. For instance, Yuan and colleagues [Yuan et al., 2011] modeled power
consumption by observing requests in multi-tier service-oriented systems. Similarly, Leite
and colleagues [Leite et al., 2010] proposed a stochastic model for a web-hosting cluster
focusing on control of power and tardiness.
Planning
A server farm or a service provider of a server farm can tune a series of parameters to plan
a more energy-efficient system. The scale can vary from assigning loads to different server
farms, to assigning loads to different machines in a server farm, all the way down to
managing devices on a machine optimally. Various researchers have proposed solutions at
each layer of control. Ilyas and colleagues proposed allocation of loads to server farms to
reduce total energy cost [Ilyas et al., arch]. Deng and colleagues proposed a self-managing
methodology to reduce carbon footprint [Deng et al., ].
At the device or virtual machine level, Shen and colleagues proposed CloudScale for
resource provisioning in the cloud [Shen et al., 2011]. Similarly, Zhang and colleagues
proposed dynamic provisioning in clouds, again at the virtual machine level [Zhang et al., 2012].
In comparison, some researchers planned work allocation by minimizing the cooling
requirements. Das and colleagues used utility functions to plan allocation while keeping the
cooling loads of server farms as the optimization goal [Das et al., 2010]. Vasic and colleagues
proposed thermal-aware scheduling of workloads [Vasic et al., 2010]. These planning efforts
were at the machine level within a single server farm.
A plethora of work exists on device throttling, such as that by David and colleagues, who
propose memory power management via voltage/frequency scaling [David et al., 2011].
A work encompassing all three levels is 1000 Islands, proposed by Zhu and colleagues.
They proposed hierarchical planners which optimized resource allocation at each level for
lower energy consumption [Zhu et al., 2008].
Visualization
Visualization systems provide feedback to human operators and self-managing agents to
observe the energy consumption of a server farm. WattApp, by Koller and colleagues,
provides an interesting self-managing application to visualize energy consumption by
deployed applications. This aids in charging the applications based on power consumption
instead of requests alone [Koller et al., 2010].
2.5.2 Home Energy Management
A self-managing energy system should in principle be able to optimize the energy needs of
the consumer by looking at the consumer's needs, the available power and the tariff, and
then implement this plan with minimal intervention. Following are some self-managing
energy systems for home consumers.
Planning
The most interesting work in this regard is by Beal and colleagues. They first presented
ColorPower [Ranade and Beal, 2010], in which they proposed a stochastic model for
coordinated demand side management. This was followed by ColorPower II, which enhanced
some features of the original system [Beal et al., 2012]. Ramchurn and colleagues also
propose a decentralized DSM based on agent planning [Ramchurn et al., 2011]. However,
the technique is too general, and the simulation too broad and generic, to merit a deeper
study at the moment.
Modeling
Modeling the energy consumption of end users is a very complex task since, compared to
server farms, metrics relating energy consumption to other observable trends are not
available. Javed and colleagues presented a study showing that a latent relationship exists
between anthropologic and structural data and energy consumption [AE10]. Other studies
look at the movement of occupants and energy consumption, such as that by Hoelzl and
colleagues [Hoelzl et al., 2012]. Tarzia and colleagues, in comparison, analyzed display power
management policies against different user preferences [Tarzia et al., 2010].
Chapter 3
System Architecture
Demand side management for home consumers in the smart grid is a complex problem
requiring forecasting, demand disaggregation, planning and control of devices. Different
researchers have proposed algorithms to resolve these tasks. However, very few architectures
provide a way to integrate the different cogs into a single cohesive and integrated
system. In this chapter we first discuss our proposed strategy for self-managing demand
side management for home consumers in future smart grids. We then describe the cogs that
are needed to deliver the functionality and present an architecture within which these cogs
integrate to provide the demand side management services. The details of the cogs are
discussed in the next three chapters.
3.1 Introduction
Demand side management has historically been deployed for large-scale consumers. In the
pre-smart-grid era this made sense, since managing hundreds of thousands of devices was
feasible neither network-wise nor algorithmically. Large-scale consumers provided a simple
interface. The forecasting mechanism needed for such systems only required the total
consumption of the entire grid. Since the demand was smooth and showed low volatility
across the relevant attributes, it was easier to forecast. The demand curtailment goals
could be negotiated with the industrial consumers beforehand, even with explicit human
involvement.
In comparison, managing the demand of hundreds of thousands of household devices is
complex for two reasons: the volatility of the loads and the scale of the problem. Although
technologies are being proposed which forecast loads with a measure of accuracy despite
the volatility, and new algorithms are being developed to plan for large numbers of devices,
a concrete framework or architecture within which these technologies can be integrated and
deployed has not been observed.
In this chapter we describe an architecture to implement demand side management in
future smart grids. Our intention is to show how one can integrate the forecasting, load
disaggregation, modeling, and planning algorithms discussed in this thesis in a single
system, since to our knowledge no such architecture exists within which we can place our
technology. We start our discussion with a description of our target energy system and the
architecture to implement the DSM program. This is followed by the proposed architecture
and a brief introduction of the technologies that work together to deliver DSM. Details of
the technologies are discussed in the subsequent chapters.
3.2 Proposed Strategy
We started with the hypothesis in chapter 1 that if we instrument DSM as an autonomic
system then we can increase DSM's effectiveness and make it practical for utilities to increase
DSM's efficiency and applicability in smart grids. We further limited the scope of the system
to managing the loads of the highest-consuming devices: air conditioners and refrigerators.
It is much more feasible to control these devices as they are fewer in number and are the
only ones which have elasticity of use. Due to their high consumption they have sufficient
impact for the DSM goals. This can be observed as well from a number of DSM systems
discussed in chapter 2 which specifically focus on such devices for their DSM implementation.
The goal is to plan the energy in such a way that it affects the end user in an acceptable
way. Since the DSM requirement is critical, we chose the contractually bound system model.
Our system offers different contractual service-level guarantees to the user. For example,
we offer that no air conditioner will be turned off for more than 30 minutes in an hour. A
user may request a tighter guarantee at certain hours for an additional price. We instrument
the sockets of these devices with GSM chips and relays as actuators to implement the plan.
Our strategy is to plan 24 hours in advance when each device should be scheduled to run.
This schedule considers the supply goals and the service-level guarantees to formulate the
plan. This plan is then propagated to device controllers for execution.
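A simplified sketch of the scheduling idea above: in each hour (four 15-minute slots), shed enough air-conditioner load to meet a supply cap while honouring the contractual guarantee that no unit is off for more than 30 minutes (two slots) per hour. The loads, cap, and greedy rotation policy below are illustrative assumptions, not the planner developed in this thesis.

```python
def schedule_hour(ac_loads, supply_cap, max_off_slots=2):
    """ac_loads: dict of unit name -> kW draw when running. Returns, per
    15-minute slot, the set of units switched OFF; shedding rotates so
    no unit exceeds max_off_slots off-slots in the hour."""
    plan, off_count = [], {name: 0 for name in ac_loads}
    for _ in range(4):  # four 15-minute slots in the hour
        off = set()
        excess = sum(ac_loads.values()) - supply_cap
        # Prefer units that have been off least, to spread the discomfort.
        for name in sorted(ac_loads, key=lambda n: off_count[n]):
            if excess <= 0:
                break
            if off_count[name] < max_off_slots:
                off.add(name)
                off_count[name] += 1
                excess -= ac_loads[name]
        plan.append(off)
    return plan

# Three 2 kW units against a 4.5 kW cap: one unit is shed per slot,
# rotating so every unit honours the 30-minutes-per-hour guarantee.
plan = schedule_hour({"house_a": 2.0, "house_b": 2.0, "house_c": 2.0},
                     supply_cap=4.5)
```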
Although the users are concerned about their privacy, the price differential is so significant
that they are willing to share some data with the utility. However, sharing data among
other users was not acceptable. With tens of houses containing thousands of devices it was
not feasible to control each and every device; therefore we focused on the devices which
consume the most energy. Given the long and severe summers, air conditioning is the
biggest consumer. As per our assumption, most of the houses equip some rooms with split
or window AC units. The load elasticity of air conditioning is sufficiently good that we can
pre-cool as well as slow down cooling at critical times. We also assume that the houses are
already fitted with automated meter reading (AMR). The AMR can use GPRS or GSM to
communicate the monitoring data, as discussed in [Omer et al., 2010].
3.3 System Architecture
The strategy discussed in the previous section requires planning for hundreds of thousands
of devices. To achieve this, however, we first require measurements of the devices for which
we are planning. The device-level energy usage is very volatile, so instead we considered
the option of forecasting the relatively less volatile household load and disaggregating the
high-energy loads from this forecasted load. The architecture of the resulting self-managing
demand side management system is shown in figure 3.1. We call this a forecast-disaggregate-
analyze-plan (FDAP) loop. The detailed description of each step is given in the next three
chapters. The collection infrastructure and actuator motes are outside the scope of the
research work discussed here and are only described to complete the picture.
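The FDAP loop can be sketched as a simple pipeline; every stage below is a stub standing in for the concrete algorithms described in the next three chapters, and all names, baselines and thresholds are hypothetical.

```python
def forecast(history):            # STLF stage: predict the next-day load
    return history[-24:]          # stub: naive "same as yesterday"

def disaggregate(load_forecast):  # pick out the heavy-device component
    return [max(0.0, x - 1.0) for x in load_forecast]  # stub 1 kW baseline

def analyze(device_loads, supply_cap):  # find hours needing curtailment
    return [i for i, x in enumerate(device_loads) if x > supply_cap]

def plan(critical_hours):         # emit actuation commands per critical hour
    return {hour: "defer_ac" for hour in critical_hours}

def fdap_cycle(history, supply_cap):
    """One pass of the forecast-disaggregate-analyze-plan loop."""
    return plan(analyze(disaggregate(forecast(history)), supply_cap))
```

Running one cycle on a day of hourly readings with a single heavy-usage hour yields a plan entry for just that hour; the real system replaces each stub with the corresponding chapter's technique.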
3.3.1 Collection Infrastructure
Self-managing DSM collects its consumption data from automated meter reading (AMR)
[Wallin et al., 2005]. There is a variety of AMR infrastructures available, varying in price,
data collection and transmission frequency, and accuracy. Since our system specifically
targets high-consumption devices only, and that at a coarse granularity, we deployed
MicroTech International's meter, which collects data at one-minute intervals and transmits
the average consumption every 15 minutes. The data is transmitted through an SMS-based
protocol developed over the GSM network. This solution is the most cost-effective, as shown
by Omer and colleagues [Omer et al., 2010]. The data is received by a web-service based
application which processes the data and stores it in a database [Liaqat et al., 2012]. The
data is maintained in the data store, from which it is retrieved for forecasting and analysis.

Figure 3.1: Self-managing demand side management architecture.
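The AMR aggregation described above, one-minute readings averaged into the 15-minute values the meter transmits, can be sketched as follows; the readings are illustrative, not actual meter data.

```python
def quarter_hour_averages(minute_readings):
    """Average consecutive 15-minute windows of one-minute readings (kW)."""
    return [sum(minute_readings[i:i + 15]) / 15
            for i in range(0, len(minute_readings), 15)]

hour = [1.0] * 30 + [2.0] * 30   # one hour of synthetic one-minute readings
```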
3.3.2 Forecasting
As discussed in chapter 2, a plethora of energy forecasting algorithms exist for forecasting dis-
trict wide loads over a period of 24 hours [Rahman and Bhatnagar, 1988, Yang and Huang, 1998,
Chen et al., 1993, Mastorocostas et al., 2000, Hippert et al., 2001, Bakirtzis et al., 1995, Dehdashti et al., 1982,
Dash et al., 1998, Abdel-Aal, 2005, Chen et al., 2010, Yao et al., 2000, Amjady and Keynia, 2009,
Lauret et al., 2008, AlFuhaid et al., 1997, Papalexopoulos and Hesterberg, 1990, Christiaanse, 1971,
Lauret et al., 2012, Irisarri et al., 1982, Amjady, 2001]. Generally these are called short term
load forecasting systems. However, since our control plane consists of individual devices, we
require a forecast which can provide sufficient insight into the consumption habits of
individual devices.
Two questions arise in using this body of city-level STLF knowledge to forecast the energy
load of individual houses or devices. First, can current short term load forecasting (STLF)
models work efficiently for forecasting individual households? Second, does extended data
enhance the forecasting accuracy of individual consumer loads?
We will show in chapter 4 that forecasting with existing STLF models is extremely error-
prone. The traditional model forecasts using a model for a single time series. That is, to
forecast the load of a house, the algorithm takes the historical data of that house and tries
to reconstruct the future loads from it. However, the data of a single house is extremely
volatile, which results in very low accuracy.
For the second question, it was hoped that additional data about the user or the house
might yield a better forecast, but our results show that no significant improvement results
when richer data is used with the existing model [Abaravicius and , 2007, Wilhite and , 2000].
Our initial forecasting analysis thus did not show promising results, and the main reason
was the volatility of the data. The energy profile of a region or a city is relatively smooth,
since different loads attenuate or neutralize one another to give a smooth curve. As the
scope of the system gets smaller, volatility greatly increases. This has been observed and
reported by various researchers [Amjady et al., 2010, Javed et al., 2012]. To our knowledge,
very few solutions have been proposed to cater to this volatility. The most relevant are
Amjady and colleagues' forecasting system for micro grids [Amjady et al., 2010] and Gurguis
and Zeid's STLF method for buildings [Gurguis and Zeid, 2005a]. A recent study by
Ardakanian and colleagues attempted to forecast household loads using Markov models,
but the solution targets a different problem and is barely applicable to the DSM domain
[Ardakanian et al., 2011]. The basic method of forecasting here is to construct a single
model for a single structure, be it a region, a city or a micro grid. However, we observed
that forecasting for a single house becomes intractable due to volatility.
In contrast, we introduce a new modeling method in which we train a single model using the anthropologic and structural attributes of all the houses [Javed et al., 2012]. We then use the anthropologic and structural parameters of a house to forecast its future load. This builds on the basic premise that different houses resemble one another, but that this connection is temporal. For example, houses with school-going children will show a similar rate of change in electricity use early in the morning on weekdays, whereas houses with senior citizens will show a similar pattern around 11 AM. Similarly, houses with good insulation will have lower energy usage during extreme weather than those with ordinary insulation, irrespective of the age group of the occupants. We build a model which leverages these temporal interdependences for a crisper forecast. We call this short term multiple load forecasting, or STMLF. The forecast produced by STMLF is provided to the analyzer for further processing.
3.3.3 Disaggregation
The general problem of identifying individual device consumption from the total consumption is called load disaggregation. There are different algorithms available for this purpose [Marceau and Zmeureanu, 2000, Lai et al., 2010, Liang et al., 2010]. However, there is a major difference between the standard load disaggregation problem and our task. Load disaggregation is usually applied to actual data at high frequency to identify devices in real time. In comparison, we require disaggregation of only the high energy consuming devices from the forecasted loads. Since our forecast is at the granularity of an hour, our load disaggregation is also done on an hourly basis. In chapter 4 we answer the critical question: is it possible to disaggregate the device consumption prediction from the total household load prediction on an hourly scale with high accuracy? We achieve this by modeling the disaggregation problem using a combination of Artificial Neural Networks and Support Vector Machines.
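The combination can be sketched as a two-stage pipeline: an SVM classifies whether the high-consumption device is on in a given hour, and an ANN regresses its consumption when it is. The sketch below uses scikit-learn as a stand-in engine; the feature layout (hour, weekday, temperature, household forecast) and the synthetic data are illustrative assumptions, not the thesis implementation.

```python
# Sketch: hourly device-level disaggregation from a forecasted household load.
# Stage 1 (SVM) decides whether the device is on; stage 2 (ANN) estimates its
# consumption. Features and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic hourly records: [hour, weekday, temperature, household forecast]
X = rng.uniform([0, 0, -5, 0.5], [23, 6, 35, 5.0], size=(500, 4))
device_on = (X[:, 2] > 25).astype(int)            # assume AC runs when hot
device_kwh = device_on * (0.1 * X[:, 2] - 1.5)    # its draw when it is on

on_clf = SVC().fit(X, device_on)                  # stage 1: on/off (SVM)
reg = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000, random_state=0)
reg.fit(X[device_on == 1], device_kwh[device_on == 1])  # stage 2: ANN

def disaggregate(record):
    """Device share of the forecasted hourly household load."""
    x = np.asarray(record, dtype=float).reshape(1, -1)
    return float(reg.predict(x)[0]) if on_clf.predict(x)[0] == 1 else 0.0

print(disaggregate([8, 1, -5.0, 1.0]))   # cold hour: device off -> 0.0
```

Running the classifier first keeps the regressor trained only on hours in which the device actually consumed energy, which mirrors the hourly, device-level framing described above.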
3.3.4 Analysis
As discussed in chapter 1, the demand side management system is a dynamic system in which the number of actively used devices varies over time, even during the course of a single day. If we construct a static model then it must cater for all the devices which could possibly exist in the system. However, as we have discussed, the scheduling problem is NP-complete and requires approximation to arrive at a solution in tractable time. This approximation introduces a drop in the accuracy of the system. To maximize accuracy, we propose a dynamic modeling method which constructs the system model at runtime. The system dimensions, or its level of approximation, are identified by evaluating the system size (the number of devices requiring management), the time available for scheduling, and the historical record of computational time for different system sizes. The details of this analysis mechanism and modeling method are discussed in chapter 6.
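The selection logic above can be sketched as a lookup over historical timing data: given the current number of devices and the time available before the schedule is due, pick the most accurate model that history suggests will finish in time. The candidate levels and timings below are illustrative assumptions, not measurements from the thesis.

```python
# Sketch: pick a level of approximation such that, based on historical
# runtimes, planning finishes within the available time budget.
# The timing table and candidate levels are illustrative assumptions.
HISTORY = {  # level -> seconds per 1000 devices (averaged from past runs)
    "exact":        120.0,   # integer program, only viable for small systems
    "clusters_100":   4.0,   # LP over 100 clusters
    "clusters_10":    0.5,   # LP over 10 clusters (coarsest)
}

def choose_level(num_devices, seconds_available):
    """Return the most accurate level expected to finish in time, or None."""
    scale = num_devices / 1000.0
    for level in ("exact", "clusters_100", "clusters_10"):  # accuracy order
        if HISTORY[level] * scale <= seconds_available:
            return level
    return None

print(choose_level(500, 90))       # small system -> exact
print(choose_level(200_000, 120))  # large system -> clusters_10
```

The same shape of lookup generalizes to any set of approximation levels, since the analysis phase only needs a monotone trade-off between accuracy and expected runtime.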
3.3.5 Planning
The planning component takes the dynamically analyzed and constructed model and constructs a plan accordingly. For our system we optimize this plan to maximize the usability of energy; that is, our system maximizes the number of devices that can be used while staying within the electricity supply limit. To build this model we used three different algorithms: an integer programming formulation for the exact solution, and two variants of linear programming for the clustered solution, namely the interior point method and the simplex method.

If an exact solution is required then we use an integer program, similar to a two-dimensional knapsack problem, to decide the schedule. If the approximate algorithm is chosen, then as an approximation we apply a clustering algorithm to combine loads which adhere to similar contracts. The goal of clustering is to combine loads which share their profile. This transforms our problem from a binary selection problem to a frequency evaluation problem: instead of deciding the on or off state of each machine, we calculate the number of machines which should be switched on in each of the clusters. The result can be computed using a linear program and the real-valued results rounded off. This adds a round-off error; however, we proved that this error will be less than 6%. On the other hand, this increased our scalability to hundreds of thousands of machines, whereas the integer programming solution was limited to hundreds. The choice of clustering and the error threshold for clustering are decided in the analysis phase, since they have ramifications on the total time of execution. The generated plan is passed on to the actuators, which are dispersed in the users' homes to enforce demand side management. This algorithm is discussed in detail in chapter 6 section 1.
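The clustered relaxation can be sketched with SciPy's `linprog`: one decision variable per cluster counts the machines switched on, the objective maximizes the total count, and the supply limit is the single capacity constraint; the real-valued optimum is then rounded off. The cluster powers, sizes and supply limit below are made-up illustrative values.

```python
# Sketch: LP relaxation of the clustered scheduling problem.
# Maximize total machines switched on, subject to the electricity supply
# limit; one variable per cluster. Numbers are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

power = np.array([1.0, 2.0])   # kW drawn per machine in each cluster
size = np.array([10, 5])       # machines available in each cluster
supply_limit = 12.0            # kW of supply to stay within

# linprog minimizes, so negate the objective to maximize machines on.
res = linprog(c=-np.ones(2), A_ub=[power], b_ub=[supply_limit],
              bounds=list(zip(np.zeros(2), size)))

counts = np.floor(res.x + 1e-9).astype(int)   # round off the real solution
print(counts.tolist(), int(counts.sum()))     # machines per cluster, total
```

Rounding down keeps the schedule feasible with respect to the supply limit, which is why the round-off error in the text is one-sided and bounded.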
3.3.6 Actuators
The plan constructed through FDAP is distributed to the device-controlling motes. A device controller is simply a relay with a GSM device. Based on the control signal, the air-conditioner or the refrigerator is switched on or off. We developed a series of motes for this purpose. The motes use SMS over GSM for communication and are equipped with a standard controller and a relay to control power; details can be found in [Omer et al., 2010].
3.4 Discussion
In this chapter we have presented an architecture which adapts autonomic computing's Monitor-Analyze-Plan-Execute loop for self-managing demand side management. The architecture integrates the forecasting and disaggregation components with the analysis and planning components. Clean interfaces are defined across the components for a smooth integration of the different cogs needed for self-managing DSM.
Chapter 4
Forecasting Energy Load for Individual
Consumers
In chapter 1 we discussed how demand side management can increase energy efficiency, thereby reducing the cost and carbon footprint of our energy consumption. However, for individual end consumers to be part of this scheme, a dependable forecast of their behavior is a must. In this chapter we answer two main questions for forecasting loads for individual consumers. First, can current short term load forecasting (STLF) models work efficiently for forecasting individual households? Second, do anthropologic and structural variables enhance the forecasting accuracy of individual consumer loads? We will show how a single multi-dimensional model forecasting for all houses using anthropologic and structural data variables is more accurate than a forecast based on traditional global measures. We provide extensive empirical evidence to support our claims.
4.1 Introduction
Demand side management is the process of managing end user consumption. Various DSM planning strategies have been proposed for smart grids, but to implement such planning methods, knowledge of the energy demand at the house level is a must. This requires a short term load forecast for houses, and in some cases even for devices. To this end, in this chapter we propose two unique concepts for short term load forecasting of houses through which the accuracy of house-level load forecasts can increase by as much as 50%. This provides an important cog in our proposed smart grid architecture for demand side management discussed in chapters 4 and 5.
Forecasting for larger loads, such as a city or the entire grid, has been achieved with relatively high accuracy [Alfares and Nazeeruddin, 2002]. But for smaller populations, such as a building or a micro-grid, the dynamics change so drastically that standard STLF tools require certain re-adjustments [Amjady et al., 2010]. For even smaller consumer groups, such as individual houses, the volatility in dynamics is even more pronounced, as can be seen from the discussion in section 4.2. To forecast for such a system we need to look at STLF modeling, tools, and data. We answer two pertinent questions in this chapter to engineer these re-adjustments for STLF of individual houses. First, can we forecast energy load using the existing short term load forecasting model? Second, is the knowledge used for existing forecasting models sufficient?
Kim and Shcherbakova point to the lack of data about users as a cause of failure for DSM and DR programs, but our initial results showed that the simple correlation between user and house characteristics is weak and that the strongest influence on demand is weather [Kim and Shcherbakova, 2011]. This was observed on anthropologic and structural data collected from 205 houses in Eskilstuna, Sweden. However, we observed a subtle relationship between user characteristics and consumption.
To observe and use this relationship for forecasting, we trained a single model for the 205 houses and used the richer dataset as the differentiating factor between houses. In essence, our model is a short term forecasting model for multiple loads.
To illustrate how this works, let us take the example of two houses, one with school-going occupants and the other without. The bulk of energy consumption in both cases will be driven by the weather pattern: the colder it is, the more energy will be used. But for the house with school-going occupants, the energy usage in the early hours of weekdays will differ from the other. Furthermore, this will be common to all houses with school-going children.
The idea is that we train a single multi-dimensional model using the data from all the houses. On its own this would mean that the forecast is the average load for each hour across all houses. This is where our second contribution comes in: we augment this single model by adding the anthropologic and structural data. This additional information allows the modeler to form sub-groups within the model for particular anthropologic and structural population groups. In our example, the modeler will be able to identify the relationship between a house having school-going occupants and extra energy consumption in the early hours on weekdays. This allows the model to add a premium to consumption over what the weather pattern alone would forecast. Since all houses with school-going children will have similar trends, if a single house shows a different trend, for instance because a child is sick and missing school, then a global modeler will not over-fit the model and will still forecast accurately when the local temporal phenomenon expires. Note that an exponentially large number of sub-classes exists for the population, but a combined model adds and subtracts premiums over the base forecast to derive a crisper load forecast for each house.
This modeling method is inherently different from modeling each house independently (STLF). It is also different from modeling all the houses without the anthropologic and structural data. We would like to stress here that the forecasting engine (ANN and MLR) is not part of the contribution. The contribution is the new modeling paradigm, short term MULTIPLE load forecasting (STMLF), and the use of anthropologic and structural data within STMLF. As we will show, this combination increases forecast efficiency for both AI-based and statistical forecasters. To stress the improvement due to our contribution and avoid engine-specific enhancements, we use the simplest of statistical and AI forecasters.
The rest of the chapter is organized as follows. In section 4.2 we discuss the problem and show, through various results, the volatility of the data which necessitates the use of anthropologic and structural data under STMLF. In section 4.3 we introduce STMLF as an extension of the basic STLF model. In section 4.7, implementation issues for STMLF are discussed and STLF techniques which can be complementary to STMLF are identified. Next, the experimental setup, including a discussion of the data, the forecasting engine, and the measurements used to verify the correctness of the forecast, is described in section 4.4. In section 4.5.1, results gained through the use of anthropologic and structural data are shown, followed by results of STMLF in comparison to multiple STLFs in section 4.5.2, followed by the conclusion and future work.
4.2 Problem Description: Issues in House Level Forecasting
When we look at forecasting for a large number of aggregated loads, we see that it is significantly simpler, since loads within a large system tend to neutralize or attenuate the total demand. For individual loads this is not the case, which results in highly volatile load data that is difficult to forecast. Even when we consider loads across the population at a particular time instance we see a large variation. If we consider the standard time domain volatility measure, the standard deviation of the rate of change, we find that the volatility of individual loads is an order of magnitude greater than that of a traditional grid [Amjady et al., 2010]. To illustrate this, table 4.1 compares the time domain volatility measure of individual loads with the micro grid, city-wide and regional grid volatility measures from the study of Amjady and colleagues [Amjady et al., 2010].
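One plausible formulation of this measure, the standard deviation of the hour-to-hour change normalized by the mean load, can be computed as follows; the exact normalization used in the cited study may differ, and the two series below are synthetic illustrations.

```python
# Sketch: standard deviation of the normalized rate of change of a load
# series, the volatility measure compared in table 4.1. The normalization
# (differences divided by the mean load) and both series are assumptions.
import numpy as np

def volatility(load):
    """Std of the step-to-step change, normalized by the mean load."""
    load = np.asarray(load, dtype=float)
    rate_of_change = np.diff(load) / load.mean()
    return rate_of_change.std()

smooth_city = 100 + np.sin(np.linspace(0, 2 * np.pi, 24))    # aggregated load
single_house = np.array([0.2, 3.1, 0.4, 2.8, 0.3, 3.5] * 4)  # volatile house

# A single house is far more volatile than an aggregated profile.
print(volatility(single_house) > 10 * volatility(smooth_city))  # True
```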
Though this volatility in individual loads depends on global phenomena such as time of day and temperature, loads vary significantly even within the same hour. This is illustrated by the box and whisker plot in Figure 4.1 for a 24 hour period of load data from a single day for 204 houses in Eskilstuna, Sweden. The whiskers show the maximal value in a given hour and the box encloses 50% of the total data (the top edge represents the 75th percentile, the bottom edge the 25th percentile, and the middle line the median). If we construct a model using only global phenomena then, irrespective of the forecasting engine, there is no way to differentiate between loads which are close to the mean and those which are not. This causes a drop in accuracy and an increase in mean square error (MSE).
From such systems it can be inferred that any model trained only on global variants is bound to mis-forecast a significant number of individual loads, since global variants can only identify the general trend of the system and do not provide sufficient discriminants to differentiate between individual loads. To differentiate individual loads, we require a deeper understanding of them.
There are two types of data variables which affect load consumption: anthropologic
System:                    Individual loads | Micro-grid (University of Calgary) | Alberta's power system | Ontario's power system
Standard deviation of
normalized rate of change: 1.82             | 3.83 × 10^-2                       | 1.84 × 10^-2           | 2.69 × 10^-2

Table 4.1: Comparison of the volatility measure of individual loads, micro grid loads and standard grid loads
aspects and structural variants. Anthropologic aspects are occupant characteristics, such as the number of occupants, their ages, and so on, while structural variants capture the physical characteristics of the house. To construct a forecasting model which can differentiate between consumers, we conducted a survey consisting of both anthropologic and structural questions. The details of the survey are provided in figure 4.2.

This questionnaire combines a mixture of anthropologic (column 1), structural (column 2) and pseudo-anthropologic (column 3) questions. These questions are aimed at capturing a variety of information, ranging from the ages of the occupants and their general occupation behavior to the type of walls, heating equipment, covered area of the property, etc.
Beyond the structural and anthropologic data, another important consideration for modeling this system is the scalability of the proposed technique. Since this forecast is needed for each house, it is impractical to have a dedicated forecaster per house. First, this would require significant computing resources for each load. Second, since each forecaster would only have access to the data of a single user, cross-cutting patterns of usage would not assist the forecast. Intuitively, a method which can process various loads together is better suited. In the next section we discuss such a paradigm, which can model multiple time-series. It turns out that such a modeling framework has better accuracy and lower error than an STLF for each individual load, as illustrated in section 4.5.
4.3 STMLF Model
The need for STMLF is born out of the inherent shortcomings of existing short term load forecasting models when forecasting household loads. These shortcomings stem from the fact that, until recently, grid energy control did not provide detailed control of the demand side. The demand side, although made up of individual loads with their own profiles, was considered as a single large chunk.

Figure 4.1: Box and whisker plot of consumer loads over a 24 hour period for 204 houses from Eskilstuna, Sweden. The whiskers mark the maximum load for the hour, the upper and lower box edges are the 75th and 25th percentiles respectively, and the line in the box is the median. The X axis is time at intervals of one hour and the Y axis is load in kWh.

Some researchers acknowledged this
diversity of patterns in load data, but they only used the sub-patterns to forecast the total load of the system and were less concerned with forecasting individual loads. For example, [Tan et al., 2010] and [Nguyen and Nabney, 2010] leveraged this fact by identifying these patterns through wavelet transforms and forecasting the crisper sub-patterns rather than a complex combined pattern. These sub-patterns were then combined to form a single forecast for the entire system. The break-down of the wavelet was also only to the degree needed for large-system forecasting, not to forecast the independent components of the load.
However, for our proposed ADSM in micro-grids the need is no longer for an aggregation of all loads; rather, our interest is to find the individual load value of each house for DSM. But the existing methods used for STLF are explicitly limited to a single time series. There are two options: either we use an existing STLF for each house, or we appropriately transform the STLF model to work for forecasting multiple loads. We found that transforming STLF to STMLF not only increases running-time efficiency but also increases the accuracy of the forecast. To understand this transformation, we will first define STLF as an abstract system and then use this abstract model to explain the transformations that realize STMLF.

Figure 4.2: Classification of survey questions. We classified questions as anthropologic (human centric), structural (building specific), or pseudo-anthropologic, which concern the occupants' impact on or usage of structural facilities.
4.3.1 STLF Operations
To understand the working of STLF and reason about the need for STMLF, we will first make a brief digression into STLF's working at an abstract level. STLF is usually a two-step process. First, an STLF modeler builds a model based on the time-series of consumption. This time-series is usually complemented by other environmental variants which affect energy load. These may include temperature, time of day, season, day of week, etc. In addition, each model requires some tuning parameters and constants, such as weights for the algorithm, which are specific to the algorithm and the input data. These are the invariants, or variables which do not change over time.

Formally, we can say that the STLF modeler is a function given by:

STLF(T_(1..j, 0..t-1), P_(0..t-1), E) = M    (4.1)

where T_(1..j, 0..t-1) is the time-series of j environmental variants such as temperature, wind, solar radiance, etc., P_(0..t-1) is the historical time-series of load data, and E contains the local invariants and tuning parameters, such as the weights given to parameters.
For most forecasting engines the input is streamed as a series of data tuples. Each tuple consists of j + 1 + |E| values: j values for the environment variables, one value for the load, and |E| values for the invariants. For example, for the fifth time quantum there will be four tuples representing readings from the first four time quanta, and so on.
Based on this input, STLF creates a model M. M can be simulated such that the effect of the environment variants T_(1..j,t) and the invariants E for a specific time t over this model produces the load P_t. That is:

simulate(M, t, T_(1..j,t), E) = P_t    (4.2)

Here P_t is the forecast for the system at time t. The modeler usually associates the variants with specific load values. This creates a model of the system to be forecasted. When a new forecast is required, the model is simulated by providing it with the variant and invariant data for the forecasting period, and the model simulation produces the load value associated with the input data.
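The two-step process of equations 4.1 and 4.2 can be expressed as a minimal interface: a modeler that consumes (environment, load) tuples to build M, and a simulate step that maps new variant values to a load. The nearest-neighbour lookup below is only a stand-in for a real forecasting engine; the feature tuples are illustrative assumptions.

```python
# Sketch of the abstract STLF interface: build M from historical tuples
# (eq. 4.1), then simulate M with new variant values to get P_t (eq. 4.2).
# The nearest-neighbour lookup stands in for a real forecasting engine.
import math

def stlf(history):
    """history: list of (environment_tuple, load) pairs -> model M."""
    return list(history)  # a real engine would fit parameters here

def simulate(model, environment):
    """Map environment variants for time t to the associated load P_t."""
    nearest = min(model, key=lambda rec: math.dist(rec[0], environment))
    return nearest[1]

# Tuples of (temperature, hour-of-day) with observed load, as in eq. 4.1.
M = stlf([((30.0, 14), 5.1), ((10.0, 7), 2.0), ((28.0, 15), 4.8)])
print(simulate(M, (29.0, 14)))  # closest past conditions -> 5.1
```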
4.3.2 STLF for Independent House Forecast
STLF forecasts are for a single system. To forecast for a number of houses, this translates into having an STLF modeler and simulator for each house. Following the general convention of STLF, the input to each modeler will then be a series of t tuples, where t is the length of the training period. Each tuple will contain the environment variable values, the load of the house, and the invariants for the modeler. There are two problems with this method of forecasting, which we have discussed briefly before and delve into in more detail here.

First, such a large number of modelers requires large computational resources. Either each house requires computing resources to store its data and run a computationally complex model for every forecast, or the utility requires numerous computing resources to achieve this goal.

Secondly, as we pointed out in the previous section, the load curve of a house is an order of magnitude more volatile than any other system that STLF has been applied to. There are two further issues with modeling such volatile systems. First, sufficient data attributes
should be available to discriminate the root causes of volatility, and second, sufficient data should be provided to avoid over-fitting. Over-fitting is the phenomenon in which a forecaster captures outliers, or out-of-the-ordinary incidents, and considers them part of normal operation, thus increasing the forecast error. The first issue relates to the number of data attributes and the second to the number of good examples for each attribute combination.

Applications of STLF to house loads with existing data suffer from both of these problems. As we will show in our experiments, the existing global variants are insufficient to discriminate house loads. This is because the house data is too volatile and the environmental variants of the system are insufficient to associate a load value with the input. This is evident from our evaluation results later, which show STLFs to be ineffective in forecasting the loads of houses.
To illustrate this point further, let us take the example of houses in a neighborhood. We know from previous studies that temperature, day of week and hour of day are the major factors for energy load within the same season. For a forecast of the aggregation of loads, these input parameters are sufficient. But each house has its own anthropologic and structural characteristics. For instance, if one house has school-going children and another has only office-going residents then, though the bulk of the load is decided by global factors, the house with children will start consuming more energy a few hours or minutes earlier than the one without this peculiarity.
Secondly, an STLF at the house level will over-fit the data, since it has insufficient data. If we move this STLF to the neighborhood level then we have sufficient data to avoid over-fitting, but such a model cannot capture the differences in load variations, since it lacks the discriminating attributes to capture the volatility of the sub-systems. The input vector to this modeler is the total (or average) load value, the global variants and the system invariants. The result is a forecast for the average load of all the houses. This is an inaccurate forecast both for the house with school-going children and for those without this peculiar characteristic. Thus we need a forecast which has a considerable amount of data to avoid over-fitting and sufficient attributes to differentiate between different load patterns.
4.3.3 STMLF1
To ameliorate this problem we propose STMLF, a modeling framework for combining multiple time-series. We propose two paradigm shifts from STLF for this model.

First, instead of creating the load model from a single time-series, we use all the available time-series as training data. This is different from a sum of loads, where all the loads are summed and STLF forecasts the sum (or average). Rather, each load and its attributes are passed to the modeler as a tuple. That is, instead of providing one value for each time period, we provide n tuples for each time period, where n is the number of houses. This resolves the issue of over-fitting, since sufficiently diverse data smooths out-of-the-ordinary events.

But just combining the time-series in a single system is not sufficient. As we have discussed above, we need to provide a discriminating attribute for the modeler to associate the learned output value with the input values.
Our first attempt was to use houseIds as the discriminating attribute. Such a model can be expressed as:

STMLF1(T_(1..j, 0..t-1), P_(1, 0..t-1), .., P_(n, 0..t-1), E, houseId) = M1    (4.3)

Here T and E are the same as in single load forecasting, but for each load i a time series P_i is also considered.

The resultant model M1 can be simulated to map the time t, the environmental variants T, the invariants E, and the index of load i to the predicted load P_(i,t). That is:

simulate(M1, t, T_(1..j,t), E, houseId) = P_(i,t)    (4.4)

In this model an input tuple, in addition to the load value, the environmental variants and the system invariants, also contains the houseId flags. For house number x, the xth flag is set to one and the rest to zeros.
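The flag scheme amounts to one-hot encoding the houseId into each input tuple, which is why the tuple width grows with the number of houses. A small sketch of the tuple construction (the environment features shown are illustrative assumptions):

```python
# Sketch: building an STMLF1 input tuple with one-hot houseId flags.
# The tuple width is j environment values + the load + n flags, so it
# grows linearly with the number of houses n.
def stmlf1_tuple(environment, load, house_index, num_houses):
    flags = [0] * num_houses
    flags[house_index] = 1            # the x-th flag set to one
    return list(environment) + [load] + flags

t = stmlf1_tuple(environment=(21.5, 7, 2),  # temp, hour, weekday (assumed)
                 load=1.8, house_index=2, num_houses=5)
print(t)        # [21.5, 7, 2, 1.8, 0, 0, 1, 0, 0]
print(len(t))   # 3 + 1 + 5 = 9
```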
This scheme has two drawbacks. First, houseId is too vague an attribute for the modeler to associate load patterns with. We will demonstrate this empirically in a later experiment, in which our STMLF2 model outperforms the STMLF1 model based on houseId discrimination. We see that the forecast is strongly dependent on the global variants and insufficiently aided by the discriminating attribute.

Second, this scheme is computationally complex and not scalable, as we discuss later in this section. A graphical representation of STMLF using houseIds as discriminants in a neural network is shown in figure 4.3(b). Each input attribute of the tuple corresponds to a neuron of the first layer. The number of neurons grows with the number of houses we try to model, so the size of the neural model is on the order of the number of houses: as the number of houses grows, the neural network grows as well. This neural network analogy is equally applicable to other forecasting models such as Bayesian belief networks, time-series analysis, etc.
4.3.4 STMLF2
Instead of this complex and inaccurate model, our second paradigm shift is to consider richer data for the forecast. This richer data incorporates the anthropologic and structural data discussed in section 4.2, and resolves both of the problems we faced in using independent STLFs and in using a combined model with houseIds.

In this methodology, the modeler is provided with the local invariants in addition to the global variants to construct its model. An input tuple for STMLF2 consists of the j environment variants, the load data for the house, the system invariants and, in addition, the local invariants of the house which correspond to the load P.
STMLF2, considering this richer data, is expressed as follows:

STMLF2(T_(1..j, 0..t-1), P_(1..n, 0..t-1), E, E'_(1..n, 1..k)) = M2    (4.5)

Here E'_(1..n, 1..k) maps the k invariants of each house to its load time series. A forecasting engine will create a model which associates T, P, and E' with the output. Simulating this model is a bit different: instead of providing a house flag, the invariants E'_(1..m) of the houses of interest are used, in addition to t and T, to construct a forecast for all the houses with the E'_(1..m) characteristics.

simulate(M2, t, T_(1..j,t), E'_(1..m), E) = P_(1..m, t)    (4.6)
We will first discuss its graphical representation in the neural network model and then discuss why it is better than STLF and STMLF1. A graphical representation of STMLF using richer data as discriminants in a neural network is shown in figure 4.3(c). Each training record of our model is a tuple consisting of the global variants (hour of day, day of week, temperature, etc.), the house variants (number of occupants, number of school-going children, wall types, etc.), and the load value for that house under those variants. Each input attribute corresponds to a neuron of the first layer, and the trainer associates weights with each neuron.

In this model, weights are assigned to the different input parameters, or their combinations, according to the training data. Temperature and time of day may receive higher weights, but statistics such as the number of children add their weight to the output as well. This weight can be positive or negative, and modulates the temperature-driven load on the basis of local characteristics.

To explain this further, let us reconsider the example discussed above. When the input for the number of school-going children is positive, and the time and date indicate early morning on a weekday, the internal node connected with these input neurons will add positive weight to the output. So for all the houses with these characteristics, in addition to the load forecasted due to weather conditions, an additional load will be added.
Figure 4.3: ANN models for the three forecasters. (a) is the ANN model for a single house, where only the load and global invariants are provided for the forecast. (b) is the ANN model for STMLF1. (c) is the ANN model for STMLF2.
In comparison, houses with no school-going children will only be affected by weather conditions. We add another twist to this example. For houses with senior citizens, consumption may be low early in the morning but high around 10 AM. For houses containing senior citizens, a load will be added to the base load at 10 AM; for those with school-going children, the addition will be at 7 AM. But if a house has both, then it will borrow from both models and will register the specific consumption patterns for both 7 AM and 10 AM. In this way we can potentially construct a model from a subset of houses and use this model to forecast for houses with similar trends and traits.
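The premium-over-base-load behavior described above can be sketched with a single shared model fitted on records from all houses, each record carrying global variants plus house attributes. The least-squares engine and the feature set below are illustrative stand-ins (the thesis uses ANN and MLR engines on real survey data):

```python
# Sketch: STMLF2-style training. One model is fitted on records from ALL
# houses; each record = global variants + house attributes (eq. 4.5), and
# simulation supplies house attributes instead of a houseId (eq. 4.6).
# Feature choice, synthetic data and the engine are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_records = 1000
temp = rng.uniform(-5, 30, n_records)          # global variant
early_weekday = rng.integers(0, 2, n_records)  # global variant
school_kids = rng.integers(0, 2, n_records)    # house attribute E'

# Houses with school-going children draw extra load early on weekdays.
load = 5.0 - 0.1 * temp + 1.5 * school_kids * early_weekday

X = np.column_stack([np.ones(n_records), temp, early_weekday,
                     school_kids, school_kids * early_weekday])
w, *_ = np.linalg.lstsq(X, load, rcond=None)   # fit one shared model

def forecast(temp, early_weekday, school_kids):
    x = np.array([1.0, temp, early_weekday, school_kids,
                  school_kids * early_weekday])
    return float(x @ w)

# Same weather, different households -> different forecasts.
print(round(forecast(10, 1, 1) - forecast(10, 1, 0), 2))  # school-kid premium
```

The single weight vector plays the role of M2: the weather terms carry the base forecast, while the house-attribute terms add or subtract a premium per sub-group, exactly the behavior described in the example.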
4.3.5 Model Complexity
We have discussed three models: first, an STLF for each house; second, STMLF using houseId as the discriminant; and last, STMLF with house attributes as the discriminant.
The complexity of a modeler is generally expressed in terms of the extra time the system takes as input parameters are added. Traditionally, O() (big-oh) analysis considers the worst case, that is, the most time the algorithm can ever take, and this is the academic and industrial standard in the computational sciences. Here we would like to clarify that O(1) does not mean a small execution time. Rather, it means that as we increase the number of houses, the expected worst-case completion time of the algorithm remains unchanged for a single modeler. A scalable modeler is one whose cost is at most some polynomial function of the input variables, since power series or exponential series are intractable for large data sets. Without loss of generality we can take the complexity, or efficiency, of a modeler to be O(x^γ), where x is the number of input parameters and γ is the modeler efficiency.
Considering the complexity of our modelers, an STLF for each house will have the complexity:

O_stlf = n × O(j^γ) = O(n) × O(j^γ)

That is, we will have n STLFs and each STLF will require O(j^γ) computations. Here j is the number of environment variables.

Since j does not depend on the number of houses, j^γ is a constant for a system with fixed E and γ, thus O(j^γ) = O(1):

O_stlf = O(n) × O(1) = O(n)
In comparison, STMLF1 with houseId as discriminant will have

O_stmlf1 = O((j + n)^γ) = O(j^γ + n^γ) = O(1) + O(n^γ)

Here the input parameters are the j environment variables and the n houseId flags. Although we require only one model, since the number of houseId flags grows with the number of houses, the complexity of the system is worse than STLF as long as γ > 1.
The third model is STMLF with k house attributes used to model the load. The worst case analysis for this model is:

O_stmlf2 = O((j + k)^γ)

That is, the forecasting engine's time increases with the number of house attributes and environment variables, but the number of houses does not affect the running time of the algorithm. Since both k and j are constant for a system:

O_stmlf2 = O(1)

This means that this model is not affected by the number of houses. The running time will be similar irrespective of the number of houses that are being modeled and forecasted for.
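As an informal check of these bounds, one can count schematic model evaluations as n grows. The cost functions below are stand-ins that plug arbitrary assumed constants (γ = 2, j = 5, k = 10) into the O(x^γ) abstraction; they are not measurements of any real forecasting engine.

```python
# Schematic cost counts for the three modelers, using the O(x^gamma)
# abstraction from the text with arbitrary illustrative constants.
GAMMA, J, K = 2, 5, 10   # modeler efficiency, env. variables, attributes

def cost_stlf(n):
    # One model per house: n * O(j^gamma) -> linear in n.
    return n * J ** GAMMA

def cost_stmlf1(n):
    # One model, but houseId flags grow with n: O((j + n)^gamma).
    return (J + n) ** GAMMA

def cost_stmlf2(n):
    # One model over a fixed attribute set: O((j + k)^gamma), flat in n.
    return (J + K) ** GAMMA

for n in (100, 1000, 10000):
    print(n, cost_stlf(n), cost_stmlf1(n), cost_stmlf2(n))
```

Running this shows STLF growing linearly, STMLF1 growing super-linearly (since γ > 1), and STMLF2 staying flat, matching the derivation above.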
4.4 Experimental Setup
This section discusses the experimental setup. First our forecasting engine is described, followed by the measures used to assess the effects of STMLF and richer data on forecasting. We then discuss the anthropologic and structural data that was collected for this experiment.
4.4.1 Forecasting Engine
We discuss in section 4.7 in detail the issues with existing forecasting methods for forecasting in a multivariate environment. The issue is with building a multidimensional model in higher dimensions. Secondly, our focus in this chapter is to show the efficacy of our modeling paradigm and the effects of richer data on forecast. For this reason we select the two base forecasting algorithms used for forecasting, namely regression and neural networks. It is easy to see that since most state-of-the-art forecasting engines are extensions of these two basic engines, a proof of increased accuracy on the archetypical engine implies effectiveness of STMLF with richer data for the enhancements as well. For neural networks we use the basic resilient back propagation algorithm of ANN proposed by Riedmiller and Braun [Riedmiller and Braun, 1993]. For regression we use multiple linear regression (MLR) since it is able to construct a model for a multivariate input stream.
The forecasting engine is constructed in Matlab. A three-layered back propagation neural network is trained on three weeks of data. The ANN consists of three layers: an input layer (L1) of 60 neurons representing the input, a second layer (L2) of 20 neurons, and an output layer of a single neuron representing the forecast. The trained model is used to forecast the power load for each hour of the next day.
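A rough sketch of this 60-20-1 network is given below in NumPy. Plain batch gradient descent stands in for the resilient backpropagation (Rprop) update used in the thesis, and the training data is synthetic; only the layer sizes follow the text.

```python
import numpy as np

# Sketch of the 60-20-1 network described above. Plain gradient descent
# replaces Rprop, and the data is synthetic; treat this as illustrative.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 60, 20, 1

W1 = rng.normal(0.0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, 0.1, (n_hid, n_out)); b2 = np.zeros(n_out)

def forward(X):
    h = np.tanh(X @ W1 + b1)          # hidden layer (L2)
    return h, h @ W2 + b2             # linear output neuron: the forecast

def train_step(X, y, lr=0.05):
    global W1, b1, W2, b2
    h, p = forward(X)
    err = p - y
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1.0 - h ** 2)   # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
    return float(np.mean(err ** 2))

# Three weeks of hourly rows with 60 inputs; the target is an arbitrary
# learnable function of the first three inputs (purely illustrative).
X = rng.normal(size=(21 * 24, n_in))
y = 0.1 * X[:, :3].sum(axis=1, keepdims=True)
losses = [train_step(X, y) for _ in range(200)]
```

The training loss falls over the 200 steps, which is all this sketch is meant to demonstrate; a production Rprop implementation would adapt a per-weight step size instead of using a single learning rate.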
MLR was also implemented in Matlab using Matlab's regression toolbox. The toolbox
implements the algorithm proposed by Chatterjee and Hadi [Chatterjee and Hadi, 1986].
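For reference, ordinary multiple linear regression can be sketched in a few lines of NumPy. This plain least-squares fit is a stand-in for the Matlab toolbox step; the Chatterjee-Hadi variant itself is not reproduced, and the data below is synthetic.

```python
import numpy as np

# Plain ordinary-least-squares MLR as a rough stand-in for the Matlab
# regression toolbox step. Synthetic data, illustrative only.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 12))                 # 12 input parameters per row
y = X @ rng.normal(size=12) + rng.normal(scale=0.1, size=500)

Xb = np.column_stack([np.ones(len(X)), X])     # add intercept column
beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # fitted coefficients

pred = Xb @ beta                               # in-sample forecast
in_sample_mse = float(np.mean((pred - y) ** 2))
```

Because the synthetic target is itself linear plus small noise, the in-sample MSE is close to the noise variance, which is the behavior one expects from a well-specified MLR fit.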
4.4.2 Measurements
Measuring success for multiple individual forecasts is more involved than measuring success of a single system. Three measures are usually used for such systems: (1) precision, (2) accuracy and (3) stability or certainty. These measurements are more appropriate when measuring forecasts for multiple objects. Traditional measures such as percentage error and even MSE are not considered the most appropriate for numerous forecasted data, as they can be over-influenced by some very bad examples and can overshadow a good forecast for the majority of the population. For example, if consumption for a house is zero for a particular hour then any forecast other than zero will be infinitely erroneous if we consider percentage error. Similarly a forecast of 0.2 for a consumption of 0.1 will be hundred percent inaccurate though the actual miss-forecast is only 0.1. When we consider numerous forecasts, the more appropriate measure is accuracy, which weighs the number of wrong forecasts against the number of correct forecasts. This will be discussed in more detail below.
Precision
Precision is the measure of how close we are able to forecast to the actual load. To measure precision we use the mean squared error, given by:

MSE_t = (1/n) Σ_{i=1}^{n} (L_{i,t} − P_{i,t})²   (4.7)

where L_{i,t} is the observed load and P_{i,t} is the forecasted load for load i at time t.
Accuracy
Accuracy is the measure of how many correct forecasts the forecasting engine makes. Correctness is a user-defined parameter. It is preferred to define a correct forecast as a value within a percentage range of the actual load. However, for low loads, a percentage range becomes insignificant. For a load of 0.1 KWH, a 20% range would be 0.08 to 0.12, and a forecast of 0.2 would be considered extremely wrong. However, practically a forecast of 0.2 is not very unsuitable provided that such loads are not the majority of the population. To avoid this false loss of accuracy we use two scales to measure accuracy. We set a 15% range of error for accuracy, but if the forecast is smaller than 3 KWH then we consider a range of ±0.5 KWH as the range of acceptable forecast.

So accuracy for time t is given as the fraction of forecasts within the acceptable range:

Acc_t = (1/n) |{ i : (P_{i,t} > 3 ∧ |L_{i,t} − P_{i,t}| ≤ 0.15 · P_{i,t}) ∨ (P_{i,t} ≤ 3 ∧ |L_{i,t} − P_{i,t}| ≤ 0.5) }|   (4.8)
Accuracy is a specifically important measure when evaluating success over multiple forecasts.
Stability
The third measure of correctness is certainty, or stability, that is, the variance in error. It is given by:

var_t = Σ_{i=1}^{n} (P̄_t − P_{i,t})² / (n − 1)   (4.9)

Here P̄_t is the average forecasted load for time t.
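The three measures can be implemented directly. The sketch below follows equations 4.7 to 4.9; the function names are our own, and the low-load branch of the accuracy rule conditions on the forecast value P as equation 4.8 does.

```python
import numpy as np

# Sketch of the three forecast-quality measures from the text.
def precision_mse(L, P):
    """Eq. 4.7: mean squared error between observed loads L and forecasts P."""
    L, P = np.asarray(L, float), np.asarray(P, float)
    return float(np.mean((L - P) ** 2))

def accuracy(L, P, pct=0.15, low_load=3.0, abs_band=0.5):
    """Eq. 4.8: share of forecasts within 15% of the load, or within
    +/-0.5 KWH when the forecast is at or below the 3 KWH threshold."""
    L, P = np.asarray(L, float), np.asarray(P, float)
    err = np.abs(L - P)
    ok = np.where(P > low_load, err <= pct * P, err <= abs_band)
    return float(ok.mean())

def stability(P):
    """Eq. 4.9: sample variance of the forecasts for one hour."""
    return float(np.var(np.asarray(P, float), ddof=1))
```

For example, a forecast of 0.2 against a load of 0.1 counts as correct under the ±0.5 KWH band, while a forecast of 13 against a load of 10 falls outside the 15% band and counts as wrong.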
4.4.3 Experimental Data Source
The data for these experiments was provided by the Department of Energy,
Building and Environment, Malardalen University, Sweden. We greatly appre-
ciate their e�ort in collecting and sharing this data.
A survey of 204 houses was conducted in Eskilstuna, a small town 100 KM from Stockholm, Sweden. The main goal of the survey was to collect structural data of each house and anthropologic data of its occupants. In addition, these 204 houses were fitted with AMR meters which recorded the power consumed each hour. Weather data was collected from the local meteorological department for forecasting as well. The questionnaire collected from occupants contained the questions discussed in section 4.2. To represent seasonality and season-specific patterns we conducted our experiments over a 7 day period in each season. That is, forecasts were made for a week of January, April, July and October to represent the four seasonal variations.
4.4.4 Experimental Environment
The simulations for the experiments described below were run on an Intel Core 2 Duo processor with a clock speed of 1.3 GHz and 2 GB of memory. Matlab's Neural Networks Toolbox was used to implement the ANN.
4.5 Results
The results in this section show the effectiveness of STMLF with richer data against the use of STLF with richer data and STMLF without richer data. That is, there are three configurations: STLF with anthropologic and structural data, STMLF with global parameters only, and STMLF with anthropologic and structural data. Our claim is that STMLF with anthropologic and structural data is a more robust technique and will have a higher accuracy than the other two combinations. Here the first configuration represents using the existing modeling paradigm with richer data and the second represents our modeling paradigm without the richer data. To validate this claim we run the experiment on two existing forecasting techniques. The decision for these two techniques is based on the evaluation of existing load forecasting techniques discussed in section 4.7. As has been discussed, the goal of these experiments is to show that STMLF outperforms STLF for AI based and statistical based techniques.
4.5.1 AI Based Experiment Results
4.5.2 Multiple STLFs vs. STMLF
This experiment is designed to compare forecasts made by an STLF for each load against an STMLF for the entire population. If STLF is applied to independent consumers then for each consumer an STLF engine must be run in each forecasting cycle. To simulate this we trained and forecasted 204 STLFs, one for each load for each day. For the 28 day testing period we executed 5712 STLFs. The results were compared with an STMLF executed for each day.
To analytically validate our results we compared the results of STMLF with the aggregated output of multiple STLFs using the three measures discussed previously, namely precision, accuracy and stability. Table 4.2 lists MSE, variance and accuracy for the 4 test weeks of STMLF against STLF.
As can be observed, STMLF is more precise than the aggregated STLF results. Average MSE for STMLF is almost 42% lower than that of the aggregated STLFs. Especially in the autumn and winter months, when consumption is relatively higher, MSE for STLF is up to 2.7 times the MSE of STMLF. We can correlate this increase in performance for all three measures to the higher consumption of energy in these months in Sweden.
Similarly, STMLF is more accurate than the aggregated STLFs. This increase is as much as eight to ten percentage points in the autumn and winter months (59.7% vs 48.9% and 59.9% vs 51.2% respectively) due to similar reasons stated above.
Stability of the forecast shows the same trend, as variance for STMLF is lower than for the aggregated STLFs in all four months.
Month     |  STMLF              |  STLF               | Average Load
          | Var   MSE   Acc     | Var   MSE   Acc     |
Jan       | 4.23  1.59  59.9%   | 5.53  2.29  51.2%   | 4.21
April     | 2.70  0.93  52.5%   | 3.08  1.10  49.0%   | 2.21
July      | 1.93  0.62  65.0%   | 2.62  1.12  62.1%   | 1.12
October   | 2.69  0.95  59.7%   | 3.39  2.61  48.9%   | 2.61

Table 4.2: Results of 3 measures of forecast through multiple STLFs and STMLF. In addition the average load for that week is provided to show the relationship between MSE and average load in that week.
As can be seen, STMLF outperforms STLF for each of the 28 test dates across the 4 seasons. STMLF is as much as 17% more accurate on some days, in addition to avoiding the scalability concerns discussed in section 4.2.
Figure 4.4: Mean squared error for the four test weeks (a. week of January, b. week of April, c. week of July, d. week of October) comparing STMLF with multiple STLFs. The blue line is STMLF and the red line is the average MSE of all STLFs. Days of the week are on the X axis and mean squared error is on the Y axis.
4.5.3 E�ect of Anthropologic and Structural Data
This experiment compares results of the STMLF1 model with the STMLF2 model and shows how anthropologic and structural data in the STMLF2 model can increase accuracy of forecast in comparison to the STMLF1 model using house-Ids as the discriminating attribute. Figure 4.5 shows scatter plots of forecast against actual load for the seven day period in January. For each day, two scatter plots are presented. The top scatter plot presents the forecast using anthropologic and structural data whereas the bottom graph shows the scatter plot of forecast through house-Id only. As anticipated, since house-id is insufficient to differentiate houses, only the global variants play an active role in forecasting. This is evident from the lower scatter plot, as the forecasts resemble a horizontal line; that is, for each day, the STMLF1 forecaster predicts the mean load of the day for all the (house, hour) combinations. In comparison, through anthropologic and structural data a crisper model is created which is able to differentiate the inherent variation in loads, and thus the forecast is closer to the (x = y) line representing correct forecast.
To analytically validate our results we conducted three tests of correctness, that is, precision (MSE), accuracy and stability (variance). Figure 4.6(a-d) plots the MSE for each day of the four experimental weeks. As can be observed, the MSE of STMLF2 is always better than that of STMLF1. Table 4.3 provides average MSE for the four weeks. Average
Month     |  STMLF2 (anthropologic and structural) |  STMLF1 (control)   | Average Load
          | Var   MSE   Acc                        | Var   MSE   Acc     |
Jan       | 4.23  1.59  59.9%                      | 7.31  3.39  36.4%   | 4.21
April     | 2.70  0.93  52.5%                      | 3.69  1.57  35.2%   | 2.21
July      | 1.93  0.62  65.0%                      | 4.86  1.12  49.5%   | 1.12
October   | 2.69  0.95  54.7%                      | 5.26  1.92  37.6%   | 2.61

Table 4.3: Results of 3 measures for forecast based on the model constructed through anthropologic and structural data and forecast based on house-Id. In addition the average load for that week is provided to show the relationship between MSE and average load in that week.
Figure 4.5: Scatter plots of forecast against actual load for the 7 day test period of January. The top plot in each figure is the forecast through structural and anthropologic data and the bottom one uses house-Id as discriminant. In all figures the actual load is on the X axis and the forecast is on the Y axis.
MSE for the 28 day period for the STMLF2 model was 1.02 as compared to 2.0 for STMLF1. This means the richer data consisting of anthropologic and structural data reduces error by close to 50%.
Similarly, accuracy showed improvement with richer data. As can be seen from the data listed in table 4.3, the forecast using the STMLF2 model is on average fifteen or more percentage points better than that of STMLF1. If we consider accuracy across the 4 weeks, the total accuracy for the STMLF2 model is 58.25% and for STMLF1 is roughly 40%.
The third test for correctness of forecast is its certainty. A forecast with low variance means high stability, which entails a more meaningful or trustworthy forecast. Table 4.3 lists the variance of both experiment sets for the 4 experimental weeks. As can be observed, variance is high in both cases, as is expected for forecasts of individual loads due to the volatility of the underlying system. However, the variance of the STMLF2 model is nearly half of the STMLF1 output (0.63 and 1.02 respectively for the entire experiment). This shows that the STMLF2 model using anthropologic and structural data increases the accuracy, precision and stability of consumer load forecasts.
4.6 Discussion on Miss-Forecasted Combinations
The results in previous section compares and contrasts the use of anthropological and struc-
tural data against the use of global variants as well as application of our modeling framework
-STMLF- for end user load forecasting. In the results of experiments we observed that a small
proportion of [load,hour] combinations contributed signi�cantly more to miss-forecasting.
For example consider the test result for the week in July as shown in table 4.2. For 204
houses, we have 204× 24 = 4896 [load,hour] combination for each day. For 7 day period the
total data points are 34272 [load,hour,day] combinations. With close to 35% error, we have
11698[load,hour,day] combinations which are miss-forecasted.
Figure 4.6: Mean squared error for the four test weeks (a. week of January, b. week of April, c. week of July, d. week of October). Days of the week are on the X axis and MSE values on the Y axis.
Focusing on the recurrence of error on each day, we observed that many [load,hour] combinations are repeated on each day of the week. That is, a load x at hour y is miss-forecasted on more than one day. Table 4.4 lists the recurrence frequency of [load,hour] combinations over the 7 day period. Here column 2 is the count of instances where load x at hour y fails for exactly z (the value in column 1) days in the week. Column 3 is the cumulative percentage of combinations failing z or more days out of the daily combinations (e.g. 188 out of 4896), and column 4 is the cumulative percentage of the week's errors due to these instances (188 × 7 out of the 11698 total errors).
As can be observed, 188 combinations are miss-forecasted on each of the 7 days. This means that a certain house x is miss-forecasted at time y on each of the 7 days, and there are 188 such instances. If we consider [load,hour] combinations which are miss-forecasted for 6 or more days out of 7, or 85% of the time, then we have 426 [load,hour] combinations out of 4896 (or 9%) in this range. That is, for these specific 9% (or 426) [load,hour] combinations, we can be extremely sure that we will miss-forecast the house, since on at least 6 days out of 7 we are miss-forecasting it. The probability of a correct forecast (0.15) is very low. Additionally, since these houses are forecasted incorrectly almost every day of the week, they contribute roughly 24% of the weekly error rate.
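These percentages can be reproduced from the counts in column 2 of table 4.4. The short check below assumes, as the table does, 4896 daily combinations and 11698 weekly miss-forecasts.

```python
# Cross-check of Table 4.4: cumulative population and error percentages
# recomputed from the per-frequency counts (column 2).
daily_combinations = 204 * 24        # 4896 [load,hour] pairs per day
weekly_errors = 11698                # total miss-forecasts in the July week

# combinations miss-forecasted on exactly z days of the week
counts = {7: 188, 6: 238, 5: 379, 4: 583, 3: 727, 2: 819, 1: 908}

cum_pop = cum_err = 0
rows = {}
for z in range(7, 0, -1):
    cum_pop += counts[z]             # combinations missed on >= z days
    cum_err += counts[z] * z         # miss-forecasts those combinations cause
    rows[z] = (100 * cum_pop / daily_combinations,
               100 * cum_err / weekly_errors)
```

Here rows[7] reproduces (3.8%, 11.25%) and rows[6] reproduces (8.7%, 23.5%); the per-frequency counts also sum exactly to the 11698 weekly errors, so the cumulative error column ends at 100%.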
A small yet consistent set of combinations being miss-forecasted so regularly points to some inherent trends in these loads at those hours. Either these trends are not captured by our model or by our data. It may be that some critical anthropologic or structural information was not collected, leading to error, or the pattern of usage was too volatile for our STMLF engine to capture and forecast.
To understand this error further, consider the method of creating the STMLF model. STMLF builds a model based on the time series of multiple loads. It uses the load attributes E′ to build these models, so that different value combinations of the vector E′ are weighted to identify different patterns of loads. These patterns are then simulated to forecast the load under E′ and the global invariants. However, this is under the assumption that the combination of attributes discriminates the various patterns completely, in that variations in E′ are able to explicitly differentiate all the different types of patterns that exist for loads in the system.
However, if the available attributes are not sufficient to differentiate between load patterns then this results in ambiguity in the forecaster. In such a situation the forecaster is not able to differentiate between patterns. For instance, two houses may have all k attributes of E′ exactly the same but differ in some occupant habit, such as working hours; their consumption patterns will then also differ, at least for the hours where their work hours differ. But if this critical load attribute is not collected then the forecaster has no way to distinguish between these loads. Since it is realistically impossible to capture all the attributes that affect a load, it is imperative that some alternate method is identified to mitigate this problem. Here we discuss our observations, which can aid in arriving at a solution for this problem.
Let us label the dominant pattern within a particular combination of values of E′ as Υn and the sub-patterns as Υe. With no way to distinguish Υe from Υn, as shown in the discussion above, the forecasting engine forecasts a weighted average of both Υn and Υe. For Υe, however, this is an incorrect forecast. Efficiency can be increased if we can identify Υe at the time of forecast. We can then separate Υe prior to model creation through different methods and reduce the error caused by the lack of discriminating data.
To identify these patterns various classification techniques can be used. We have performed some experiments using support vector machines (SVM) to identify Υe. Our results are encouraging, but discussion of the SVM model and its effect is beyond the scope of this thesis. The next question is how to forecast for Υe. There are two options that one can apply to these combinations. One is that an appropriate upper or lower bound is assigned to Υe. Another method that seems more promising is to have a multi-level forecaster where Υe is repeatedly removed from the training data and the model is trained on dominant trends only. Various thresholds can be used to implement this cleansing of the input data. However, discussion of such methods is beyond the scope of this work and will be taken up in the future.
Our intention here is to state the issue of ambiguity of patterns even with richer data and to identify methods to mitigate this problem. We leave this as future work.
4.7 Short Term Forecasting Techniques for STMLF
We now discuss the algorithms used for traditional STLF and their application to STMLF. There are three concerns for using a forecaster for STMLF. First, it should be able to handle at least k input parameters. Our results show that this k should be significantly large to distinguish between house characteristics. Second, as is shown in section 4.2, a significant portion of our forecasted data is far from the mean. Therefore, the forecasting technique should not ignore or suppress outliers. Third, the technique should be able to handle a highly volatile system, since consumer loads are highly volatile as discussed earlier.
We now examine various STLF techniques in light of the STMLF requirements stated above and identify which techniques can be used for STMLF. This discussion is important in identifying the forecasting engine that we use for STMLF, since many existing forecasting techniques do not support the computation required for STMLF.
Load forecasting historically has been used to forecast large scale monolithic systems such
# of miss-forecasts in 7 days | Count | Cumulative % of population | Cumulative % of error
7                             | 188   | 3.8%                       | 11.25%
6                             | 238   | 8.7%                       | 23.5%
5                             | 379   | 16.4%                      | 40%
4                             | 583   | 28.3%                      | 60%
3                             | 727   | 43.2%                      | 78%
2                             | 819   | 59.9%                      | 92%
1                             | 908   | 78.4%                      | 100%

Table 4.4: Repeat count of error and cumulative accuracy error for the 7 day period.
as power loads of a city or region, or the cost of energy in a market. There are three fundamental techniques which have been applied to such forecasts for a single system: 1) statistical techniques focused on smoothing and averaging such as regression [Papalexopoulos and Hesterberg, 1990], exponential smoothing [Christiaanse, 1971], Kalman filters [Irisarri et al., 1982], stochastic models [Wang et al., 2011], etc.; 2) time series methods such as the linear univariate model [Cuaresma et al., 2004], ARIMA [Amjady, 2001], Box-Jenkins [Hagan and Behr, 1987], in combination with econometrics models [D. and Uri, 1978], GIGARCH [Diongue et al., 2009], GARCH [Garcia et al., 2005] and hybrid models such as the combination of ARIMA and GARCH using wavelet transform [Tan et al., 2010], etc.; and 3) AI techniques such as ANN [Hippert et al., 2001], ANN with radial basis function [Lin et al., 2010], pattern recognition-based techniques [Dehdashti et al., 1982], expert system-based techniques [Rahman and Bhatnagar, 1988], particle swarm optimization [AlRashidi and EL-Naggar, 2010] and fuzzy system-based techniques [Yang and Huang, 1998], etc.
Recently, due to the prevalence of smart grid ideas, research has focused on STLF for small scale systems. STLF for small scale systems has proven to be a much harder problem than for large scale systems, as has been explained by Amjady and colleagues [Amjady et al., 2010]. Amjady and colleagues [Amjady et al., 2010] and Gurguis and Zeid [Gurguis and Zeid, 2005b] have proposed solutions which work better than standard STLF at micro-grid or building level granularity. However, the accuracy of these systems still does not match that of a large scale STLF due to the volatility of the underlying system.
We will look at each of the three classes of algorithms to identify methods which can be used for STMLF and also point out the reasons why an algorithm is not usable for STMLF. We see that most statistical techniques are not applicable for STMLF for two reasons. First, these techniques are based on smoothing data around the mean. As we have shown in section 4.2, for a large, highly volatile data-set the mean is not a good forecast. Regression, exponential smoothing and Kalman filters thus are not appropriate for such a forecast. Secondly, most of these techniques are not capable of handling the higher input dimensions required for the forecast. This is true for the above methods and the stochastic technique presented in [Wang et al., 2011]. To test our first claim we used multiple linear regression (MLR) for STMLF, since MLR is able to cater for the k dimensions in its model. As expected, the forecast has a high error rate. Table 4.5 shows the mean squared error (MSE) value for each of the 9 days of the experiment. The results show a high MSE, with an average of 2.73 and on some days as high as 3.42. For a value in the range of zero to fifteen, this is relatively very high. When we evaluated the same results from the perspective of miss-forecasted loads, we found that roughly 74% of loads were beyond the acceptable range of the forecasted value.
Figure 4.7 shows the number of miss-forecasted loads over the 9 day period. In this figure the darker shaded part of each bar represents the miss-forecasted loads and the lighter part represents the correctly forecasted loads. As can be seen, for each day a sizeable number of forecasts are beyond our acceptable range when forecasted using MLR.
It is well known that time series analysis techniques are neither scalable to higher dimensions nor effective on highly volatile data [Box and Jenkins, 1994]. Usually time-series analyses are limited to 4 or 5 input variables, which is insufficient for our requirements. For this reason, time series methods such as the linear univariate model [Cuaresma et al., 2004], ARIMA [Amjady, 2001], Box-Jenkins [Hagan and Behr, 1987], in combination with econometrics models [D. and Uri, 1978], GIGARCH [Diongue et al., 2009], GARCH [Garcia et al., 2005] and hybrid models such as the combination of ARIMA and GARCH using wavelet transform [Tan et al., 2010] were not considered for STMLF.
In comparison, AI techniques such as artificial neural networks, through their hidden layers, and SVMs, through their projection into hyper-dimensions, seem much more capable of solving an STMLF model. These techniques are able to identify hidden trends, thereby finding similar trends in different time series. Furthermore, ANNs and SVMs are proven to scale to the dimensional needs of STMLF. However, their ability to handle such a
Day | Mean Squared Error
1   | 2.72
2   | 2.70
3   | 2.51
4   | 2.43
5   | 2.53
6   | 2.53
7   | 3.23
8   | 3.42
9   | 2.56

Table 4.5: Mean Squared Error (in KWH) for the 9 day STMLF using multiple linear regression
Figure 4.7: MLR forecast error for the 9 day evaluation period. Each day has 4896 forecasts. The darker part of each bar represents the miss-forecasted loads and the lighter shade represents the correctly forecasted loads. A correct forecast is a forecast within the range defined by equation 4.8.
volatile data set is still unknown. In section 4.4 we discussed the use of ANN for the experiments comparing STLF with STMLF and quantifying the effect of anthropologic and structural data on consumer load forecasting. In summary, we believe that of the existing short term forecasting techniques only AI methods with the ability to scale in input dimensions are applicable for STMLF.
4.8 Conclusion and Future Work
In this chapter we first introduced autonomic demand side management (ADSM) as a paradigm to provide DSM and DR in micro-grids. We identified forecasting of individual users' loads as an important cog of ADSM and attempted to answer two important questions for making this forecast. The first question is:
Do current STLF models and techniques work appropriately for forecasting individual households, or are adjustments needed in the modeling paradigm for forecasting individual consumer loads?
We found that the STLF model has some shortcomings in forecasting loads of individual consumers. STLF models are built to forecast for monolithic or single load systems. To forecast for hundreds of thousands of loads, an STLF would be required for each load. This poses a scalability problem. To overcome this shortcoming, we proposed a short term multiple load forecasting (STMLF) model which combines individual load time-series into a succinct model for forecasting many loads with a single model. Moreover, we showed through our results that STMLF is up to seven percentage points more accurate than individual short term single load forecasts for each load. Furthermore, we identified techniques (ANN and SVM) which can compute forecasts based on the STMLF model. For our experiments we used a basic ANN algorithm to prove the effect of anthropologic and structural data on STMLF. As future work this ANN engine can be replaced with more sophisticated ANNs to increase the efficiency of the forecast. Our second question was:
Do the anthropologic and structural variables enhance the forecasting accuracy of individual consumer loads?
We showed through experiments that a combination of anthropologic data and structural data of houses can greatly enhance forecasting of an individual consumer's load. This richer data can reduce error by up to 50% in some cases. However, we did not correlate the questions with the efficiency of the system. A more detailed analysis of the effect of anthropologic and structural data on forecast accuracy is required.
Lastly, we made observations regarding the miss-forecasts of STMLF. We observed that a pattern exists which can be exploited to increase the accuracy and precision of this forecast. As future work we are exploring ways to design filters to identify and separate out these miss-forecasts. It remains to be investigated how to mitigate these miss-forecasted combinations once they are differentiated.
In conclusion, we recommend short term multiple load forecasting and the use of anthropologic and structural data for smart grid applications where highly accurate behavior of individual consumers is required, such as in demand response and demand side management.
Chapter 5
Disaggregating Heavy Loads from Forecast
In chapter 4 we described a methodology to forecast household energy loads with higher accuracy than existing methods. However, the demand side management strategy aims at controlling high consumption devices such as heating and air-conditioning units. In this chapter we answer the question: is it possible to disaggregate the device consumption prediction from the total household load prediction on an hourly scale with high accuracy? We will show how we disaggregate the load of these high consumption devices from the total house load forecast. Due to the large difference in consumption when the target device is on and when it is not, we will show that even with the worst forecast error rate reported in chapter 4, we can still achieve an accuracy of 97%.
5.1 Introduction
Load disaggregation is the task of identifying individual device loads by observing the total load of the entire system. Its main application is in the non-intrusive load monitoring (NILM) domain, in which individual load consumption profiles are identified from a single meter installed at the house entry point [Hart, 1992]. NILM is primarily concerned with realtime identification and monitoring of loads. Thus there are certain features in almost all of the load disaggregation algorithms which are specific to this NILM analysis.
First, the algorithms are usually applied to high frequency data. The data frequency ranges from 1 reading per second to 16 KHz, that is, 16 thousand readings per second. Second, all the algorithms are applied on realtime time-series data. This realtime constraint together with the high data rate forces algorithms to be fast and scalable rather than accurate and precise, as is discussed in the survey by Zeifman and Roth [Zeifman and Roth, 2011]. The algorithms vary in their approach and in the scope of devices targeted for disaggregation, but usually they target all the main appliances in the house.
Our system requirements, however, are very different from those of these load disaggregation
systems. First, our data is not realtime measured data but the forecast of the load for the
next 24-hour session. An inherent problem of a forecast is that it will have errors; as we
presented in chapter 4, the error can be as much as a 48% loss in accuracy with a variation
of roughly 20%. Second, we do not wish to disaggregate all the devices; we are only concerned
with the heating and cooling load. Third, our data is at a coarser granularity of one reading
per hour. Lastly, our main concern is not accuracy alone: as we shall see in this chapter, our
goal is to reduce false negatives, since false negatives affect the correctness of our planning
the most. These characteristics make our load disaggregation problem very different from the
load disaggregation algorithms discussed in the literature. Consequently, our technique is
almost incomparable to other disaggregation techniques.
As discussed above, load disaggregation from forecasted data has its complexities. On the
other hand, this form of load disaggregation has certain advantages as well. Since by definition
we are targeting the largest possible loads, the differential between the total load when heating
or air conditioning is on and when it is not is significant enough for the classifiers to
identify. Secondly, since we are mainly concerned with keeping false negatives low and not
overly concerned about false positives, we can combine different algorithms to cover the
maximum range of the solution space.
In the next section we discuss the data that we use for our experiments, followed by a
discussion of the evaluation criteria. We then present the results of the ANN-SVM based
technique for load disaggregation and close with concluding remarks.
5.2 Data
To conduct the disaggregation experiment we used the Reference Energy Disaggregation
Dataset (REDD), a publicly available data-set for load disaggregation [Kolter and Johnson, 2011].
This data-set contains detailed energy usage information of several homes over extended
periods and is available at a high frequency of 16 kHz (sixteen thousand readings per second)
and at a low frequency of 1 Hz (one reading per second). Since we required readings at one
hour intervals to simulate hourly loads, we calculated the net energy consumed in each
one-hour window using the 1 Hz data. In this data-set the data is collected both at the mains
and at the individual devices. This provides us with the opportunity to test our strategy for
identifying the times when the heating load is on.
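The windowing step described above can be sketched as follows. This is an illustrative implementation, not the thesis code; the function name and the assumption that the 1 Hz readings are instantaneous power in watts are ours:

```python
def hourly_energy(readings_1hz):
    """Sum 1 Hz power readings (watts) into per-hour energy (watt-hours).

    Each hour window holds 3600 one-second readings; summing watts over
    seconds gives watt-seconds, so divide by 3600 to get watt-hours.
    Trailing readings that do not fill a whole hour are dropped.
    """
    hours = []
    for start in range(0, len(readings_1hz) - 3599, 3600):
        window = readings_1hz[start:start + 3600]
        hours.append(sum(window) / 3600.0)  # Wh consumed in this hour
    return hours

# Two hours of data: a constant 1000 W load, then a constant 500 W load.
profile = [1000.0] * 3600 + [500.0] * 3600
print(hourly_energy(profile))  # -> [1000.0, 500.0]
```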
This last information, the device usage pattern, was not available in the Swedish data-set;
therefore the Swedish data was not usable for the demand disaggregation evaluation. To
simulate an hour-level STMLF forecast we added artificial noise corresponding to the error
levels of the STMLF results.
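The thesis does not specify the exact noise model, so purely as an illustration, here is one plausible way to perturb hourly loads with multiplicative Gaussian noise at a chosen relative error level (the function name and parameter values are our assumptions):

```python
import random

def add_forecast_noise(hourly_loads, rel_error=0.20, seed=42):
    """Perturb measured hourly loads with multiplicative Gaussian noise.

    rel_error is the relative error level (e.g. 0.20 for roughly 20%
    variation); each hour is scaled by a factor drawn around 1.0.
    Negative loads are clipped to zero.
    """
    rng = random.Random(seed)
    noisy = []
    for load in hourly_loads:
        factor = rng.gauss(1.0, rel_error)
        noisy.append(max(0.0, load * factor))
    return noisy

clean = [1.2, 0.8, 2.5, 3.1]
noisy = add_forecast_noise(clean)
```

Seeding makes the perturbed data-set reproducible across experiment runs.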
Thus we conduct the experiment on two data-sets: disaggregation on clean data, and
disaggregation on noisy data.
5.3 Evaluation Criterion
The goal of load disaggregation from the forecasted load is to identify the houses where the
high load device will be used. This information is then used to plan these forecasted
devices for load management. To illustrate the value of a correct forecast we discuss the
evaluation with reference to the confusion matrix. A confusion matrix is a table layout that
allows visualization of the performance of an algorithm. Since we have two classes, the table
has four cells, representing the following:
• True positive: predicted true and actual is true.
• False negative: predicted false but actual is true.
• True negative: predicted false and actual is false.
• False positive: predicted true but actual is false.
Our main concern is that all the elements that are true, that is, all the houses where
heavy loads will be used, are identified. A small number of false positives in our system is
less detrimental than a false negative. This is because if we schedule load management for a
device but the device is not used, then we may have marginally more energy available than we allocated
                          Predicted class
                          Used              Not used
Actual class   Used       True positive     False negative
               Not used   False positive    True negative

Table 5.1: Confusion matrix
but system stability will not be compromised. In comparison, a false negative would mean
that a device that is not scheduled by our planner is switched on. If a sufficient number of
false negatives exists then system stability can be affected. Although we have a method
to recover from this situation, as we will discuss in chapter 6, the result will be
sub-optimal.
The main metric for evaluation thus will be accuracy, which we define as:

accuracy = truePositive / (truePositive + falseNegative)
In the next section we present the disaggregation strategy. The strategy specifically
aims at reducing false negatives at the cost of more false positives. However, as discussed
above, a relatively larger false positive rate is acceptable as long as it is not too large.
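The accuracy metric defined above is, in standard terminology, the recall of the "used" class: the share of actual heavy-load hours that were identified. A minimal sketch (illustrative only):

```python
def accuracy(tp, fn):
    """Accuracy as defined in this chapter: truePositive divided by
    (truePositive + falseNegative), i.e. recall over the 'used' class."""
    return tp / (tp + fn)

# Example with hypothetical counts: 25 true positives, 5 false negatives.
print(accuracy(25, 5))  # -> 0.8333333333333334
```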
5.4 Disaggregation Strategy
Figure 5.1 shows the hourly load of a house from the REDD data-set. The red line shows the
total load at the mains of the house and the blue dots mark the hours in which the heating
load is on. As can be observed, the correlation between high total consumption and the heating
load being present in that hour is very high.
To classify the timings in which the heating load is present, we applied two classification
algorithms: an artificial neural network (ANN) and a support vector machine (SVM).
We applied two strategies to maximize the likelihood of reducing false negatives.
First, we biased the training data by labeling an hour as a heat load hour if the heating
load was present in any second of that hour. For evaluation, however, we labeled an hour as
a heat load hour only if the net energy over the hour represented a heating load. The
difference is that if the heat load was present for only a couple of minutes, we do not count
it as a heat load hour for evaluation.
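The two labeling rules can be sketched as follows; the threshold values below are hypothetical placeholders, not values from the thesis:

```python
def label_any_second(second_loads_w, on_threshold_w=100.0):
    """Training label: the hour counts as a heat load hour if the device
    drew power above on_threshold_w in ANY second (biases toward positives)."""
    return any(w > on_threshold_w for w in second_loads_w)

def label_net_energy(second_loads_w, energy_threshold_wh=200.0):
    """Evaluation label: the hour counts only if the net energy over the
    hour is large enough to represent a real heating load."""
    net_wh = sum(second_loads_w) / 3600.0
    return net_wh >= energy_threshold_wh

# A heater that runs for only 2 minutes of the hour at 2000 W: positive
# for the biased training rule, negative for the net-energy rule.
burst = [2000.0] * 120 + [0.0] * 3480
print(label_any_second(burst), label_net_energy(burst))  # -> True False
```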
Figure 5.1: Heater load and load profile of a single house. The red line represents the main load value and the blue dots represent the hours in which the heater was on.
The second strategy is to apply an OR operator to the outputs of the ANN and the SVM. ANN and
SVM are two of the most widely used classifiers but follow somewhat different strategies to
build their models. Whereas SVM is a large-margin classifier that aims to produce a more
generalizable result, ANN attempts to model the system accurately as a mathematical model.
By combining their positive results we can benefit from both methods and reduce our false
negatives. The side effect of this strategy is a higher false positive rate. However, as
discussed above, we prefer false positives over false negatives, and we consider this
trade-off favorable.
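A minimal sketch of the OR combination (illustrative only; the classifiers themselves are trained separately). Because an hour is flagged when either model fires, the combined false negatives can only shrink relative to each individual model, while false positives may grow:

```python
def combine_or(pred_a, pred_b):
    """Element-wise OR of two classifiers' boolean predictions:
    an hour is flagged if EITHER model predicts the heavy load is on."""
    return [a or b for a, b in zip(pred_a, pred_b)]

# Hypothetical per-hour predictions from the two models:
ann = [True, False, True, False]
svm = [False, False, True, True]
print(combine_or(ann, svm))  # -> [True, False, True, True]
```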
5.5 Results
In this section we present the results of load disaggregation from the forecasted loads. We
present results for a perfect forecast and for a forecast with noise equivalent to the
forecast error presented in chapter 4.
5.5.1 Noiseless Forecast
Our first set of experiments assumed a perfect forecast. We applied two classification
algorithms, ANN and SVM, and a third classification in which we combined the predictions of
ANN and SVM. Table 5.2 shows the confusion matrices for this experiment. The values are
interpreted as follows: the first value is the true positive count and the second is the
false negative count; on the second line, the first value is the false positive count and the
second is the true negative count. To derive a percentage we divide a cell value by the sum
of the values in its row. This scheme applies to all the confusion matrices presented in this study.
As can be seen, the false negative rate is 17% (5 out of 30) for ANN and 10% (3 out of 30) for
SVM. But when we combine the two, the false negative rate goes down to 0%. That is, we
correctly identify all the hours in which heating was used. The accuracy of the system thus
is 100%.
Although the false positive rate of ANN is lower, when we use the OR operator the false
positive rate is the same as that of SVM. The false positive rate of the combined forecast is
28% (20 out of 71).
(a) ANN                        Predicted class
                               Used    Not used
    Actual class   Used          25           5
                   Not used       6          65

(b) SVM                        Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      20          51

(c) ANN OR SVM                 Predicted class
                               Used    Not used
    Actual class   Used          30           0
                   Not used      20          51

Table 5.2: Confusion matrices for noiseless forecast. a) Artificial neural network (ANN). b) Support vector machines (SVM). c) (ANN OR SVM).
5.5.2 Forecast with Noise
Our second set of experiments adds noise to the measured values to simulate the forecast
error presented in chapter 4. We applied the same strategy of ANN, SVM and their combination
for these experiments as well. Table 5.3 shows the confusion matrices for this experiment.
The interpretation is the same as before.
Interestingly, the error improves slightly in this experiment, but this can be attributed to
the random nature of the setup; the variation is small enough to ignore. The false negative
rate for ANN in this experiment is 10% and for SVM it is 13%. But when we combine the two,
the false negative rate goes down to 0%. That is, we correctly identify all the hours in
which heating was used.
Thus the accuracy of (SVM OR ANN) is not affected by the forecast error presented in chapter 4.
The false positive rate is marginally better as well, but only marginally (27%).
(a) ANN                        Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      19          52

(b) SVM                        Predicted class
                               Used    Not used
    Actual class   Used          26           4
                   Not used       6          65

(c) ANN OR SVM                 Predicted class
                               Used    Not used
    Actual class   Used          27           3
                   Not used      19          52

Table 5.3: Confusion matrices for noisy forecast. a) Artificial neural network (ANN). b) Support vector machines (SVM). c) (ANN OR SVM).
5.6 Discussion
In this chapter we have shown that highly accurate prediction of high consumption devices
is possible from the forecasted load of the whole house. We have shown that by using a
combination of SVM and ANN we can achieve 100% accuracy. Furthermore, we have shown that
there is no visible change in accuracy even if our forecast is faulty within the error range
of chapter 4.
Chapter 6
Demand Side Management Planning
We have shown in chapter 5 that we can construct an accurate forecast of heavy loads from
the short term load forecast of individual houses. In this chapter we show how we can use
this device forecast to construct a demand side management plan. We answer two research
questions in this chapter which were posed in chapter 1. First is the issue of constructing
a scalable and robust plan for DSM scheduling; this is discussed in section 6.3.
Second is the variability of the size of the system. The system size changes over time,
which allows us to modulate the approximation for a more exact optimization. We first
introduce this variation in system size in section 6.4. There we introduce adaptable
optimization, or AdOpt, which leverages this variation in size to maximize the exactness of
the optimization based on runtime constraints. We then discuss the dynamic modeling method
that supports AdOpt in section 6.5. Our results show that we can achieve our load curtailment
targets by using the combination of the aforementioned strategies.
6.1 Motivation
Our main motivation for autonomic demand side management stems from the critical need for
management in electric power distribution. A typical power distribution system provides
electricity to a locality consisting of tens of thousands of consumers. However, due to the
power crises in developing countries, if the demand for power outstrips supply the power
company cuts off power completely to one or more neighborhoods to keep supply above demand.
A second challenge for energy management systems are spikes. The power supplied to the
grid can increase or decrease at any time depending on the availability of electricity
generation sources. When the power supply drops, the grid managers are forced to shut down
the power supply to some areas. Such abrupt, unscheduled shutdowns are damaging and an
inconvenience for the customers and their appliances.
To plan a strategy for optimized power allocation and to handle spikes, we look at consumer
usage patterns. In a typical household we can divide electric devices into four broad
categories, shown in table 6.1. The first category includes devices that are low powered and
low usage: these devices consume relatively little power, i.e. less than 500 watts, and are
used seldom. In the second category are devices that are low powered but are used more
frequently or for a longer duration of time, e.g. electric fans, lights etc. The third
category contains devices that are high powered but used seldom, such as microwave ovens,
washing machines etc. Finally, the fourth category contains devices that use more power,
i.e. more than 500 watts, and are also used for a longer duration of time, typically an hour
or more. Devices such as refrigerators and air conditioners fall into this category.
Our hypothesis in this work is that if we can somehow optimize the use of the devices in
the fourth category, we could eliminate or at least reduce the gap between supply and demand.
              Low usage                      High usage
Low power     Vacuum cleaner   200 watts     TV              70 watts
              Shaver            15 watts     Fan             50 watts
                                             Computer       150 watts
High power    Microwave       1000 watts     Air conditioner  2000 watts
              Toaster         1500 watts

Table 6.1: Classification of household appliances according to power and usage profile
In tropical countries, due to very long and hot summers, air conditioners of different types
make up most of the devices in this category. In this chapter we present results of
simulations in which we manage the usage of air conditioners to optimize the distribution of
electricity.
What this optimization entails is that a customer gets a full electricity supply for devices
of types 1, 2 and 3, while the air conditioners are regulated by the power company. The power
company is able to remotely switch off the electricity to the air conditioners for short
durations. Each such duration is short enough to retain the cooling effect produced by the
air conditioner and long enough to save electricity at the grid station level.
For fairness, this scheme provides a service-level guarantee to each household. Since such a
system has to keep up with the demand and supply pattern of electricity and also has to
ensure service-level guarantees for hundreds of thousands of heavy duty electric appliances,
we used a linear programming model of the system to apply self-optimization. The use of
linear programming in self-optimization problems can be complex depending on the dynamics of
such a system.
In a nutshell, our optimization scheme turns off high-powered devices for a small duration
of time, typically determined by a service-level agreement between the electricity company
and the consumer. This methodology optimizes the supply of electricity to high-powered
devices based on the overall supply/demand situation, the service-level guarantee and other
factors. An hour of usage for each high powered device is divided into six ten-minute slots.
At times when supply exceeds demand, all devices are powered for all six time slots.
However, as demand outstrips supply, devices are turned off based on a fair scheme such as
round robin. The maximum time a device can be turned off is based on the service level
guarantee between the electric company and the consumer. For simplicity, in this work we use
a two-slot service guarantee for all consumers. This means that a device is to be turned on
for at least twenty minutes of every hour. Therefore, the optimization goal is to find a plan
for the next hour for each device in the system.
Since a plan is generated for each hour, there is no need to recalculate the plan during the
course of the hour unless one of two situations occurs: a sharp increase in demand or a
sharp decrease in supply. In both cases the plan has to be recalculated for the rest of the hour.
But implementing such electricity optimization has many challenges. First, with the present
infrastructure there is no way for a power company to turn the air conditioners on or off
remotely. However, recent advances in smart homes and smart grid networking technology have
provided sufficient tools to implement such a plan. A survey by Yan and colleagues provides
sufficient information in this regard [Yan et al., 2013], and a study by Omer and colleagues
provides a financial assessment of the different available technologies [Omer et al., 2010].
Second, the number of air conditioners is enormous. In a typical locality thousands of these
devices are present. Therefore, we need a self-optimization technique that can scale to a
large number of devices without a significant cost overhead.
Third, both the electricity supply and its demand can vary, i.e. spikes can occur in our
system. Therefore, the methodology must be dynamic enough to act quickly on supply and
demand spikes and take the system back to an acceptable state. A supply and demand graph is
shown in figure 6.3; it depicts a typical supply and demand situation on a summer day for a
locality. Assuming that such historical data is available, we plan the electricity
optimization for the next hour.
The usage of electricity is very dynamic. Therefore, in order to cater for any short and
temporary spike, we define a reserve margin between the peak demand and the supply. The
reserve margin is a buffer between the maximum anticipated demand and the supply that is
available to the system. This margin is maintained to cater for any growth in demand beyond
the supply; the motivation for it and the way to calculate it are given in section 6.3.3.
In our scheme, if we do need to replan the optimization of electricity, this reserve margin
provides the time necessary to replan the distribution of electricity.
For our methodology this margin is margin ≤ calcTime × Δ, where Δ is the slope as the global
demand function approaches the maximum supply, and calcTime is the maximum time taken to
analyze, plan and execute the plan. The derivation of this equation is provided in the
evaluation section. We subtract the margin from the electricity supply value and use the new
number to plan the electricity optimization. The supply value minus the reserve margin gives
what we call the adjusted supply.
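The adjusted-supply computation is a one-liner; the numbers below are hypothetical, chosen only to illustrate the formula margin ≤ calcTime × Δ:

```python
def adjusted_supply(supply_kw, calc_time_h, delta_kw_per_h):
    """Adjusted supply = raw supply minus the reserve margin, where
    margin = calcTime * delta, i.e. the worst-case demand growth during
    one analyze/plan/execute cycle."""
    margin = calc_time_h * delta_kw_per_h
    return supply_kw - margin

# Hypothetical numbers: 2600 kW supply, a 0.2 h planning cycle, and a
# 2000 kW/h demand ramp near the supply limit.
print(adjusted_supply(2600.0, 0.2, 2000.0))  # -> 2200.0
```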
Fourth, the methodology must ensure service-level guarantees, i.e. the promise of the power
company to the consumer that an air conditioner will not be turned off for more than x
minutes in a given hour. This requires a very dynamic self-optimization technique.
6.2 Approach
To plan for such a dynamic system with soft-realtime constraints, we applied linear
programming to schedule our devices. To achieve this, a linear system of equations is
developed based on the entities in the system. Since the decision is zero/one (a device is
switched off or switched on) the problem is easily reducible to the 2-dimensional knapsack
problem, which is a known NP-complete problem. There are two limitations to this approach.
First, the demand for machines in a power distribution system depends on consumers turning
their devices on and off, so a fixed linear set of equations is not enough. Second, the
approach is not scalable, since for a large number of entities no tractable solution to the
2-dimensional knapsack problem is known.
In order to solve the first problem we used a meta model to generate the linear set of
equations at runtime. This set of linear equations is based largely on the state of the
system, i.e. the number of electric devices consuming power at the time of optimization.
Once the linear system of equations is generated, an optimization algorithm such as simplex
is used to solve the knapsack problem [Hillier and Lieberman, 2001]. This system is solvable
for small problem sizes when the time constraints are not stringent.
However, when either the system size grows or the response time is small, as is the case in
response to a spike, the simple 2-dimensional knapsack formulation is not feasible. To
counter this problem we use a clustering technique to cluster the entities based on a given
variance. The clusters are then used to generate the equations, so a relatively small set of
equations is generated. Secondly, since we now have cluster frequencies instead of zero/one
decisions, we can use linear programming instead of the integer programming needed for the
2-dimensional knapsack problem. In solving this approximated problem, linear programming
proved to be considerably fast even for an ultra-large dataset consisting of one hundred
thousand entities. We discuss the clustering algorithm in section 6.3.
However, since we were using clusters, the solution speed came at a penalty of unutilized
power that could have been utilized. For the most part, when a solution is required
instantly a small quantity of unutilized power is acceptable, but when the need for
distributing the load scaled up or down sharply, we found that we could do a better job with
an algorithm that solves the problem with virtually no unutilized power in the system.
To get the best of both algorithms, and possibly other approaches, we propose adaptable
optimization, or AdOpt. AdOpt adapts the optimization model and method based on the system
state. AdOpt observes the number of entities and the soft-realtime constraints and, based on
a soft-computing technique, identifies the optimal model and technique to be used. If the
time requirements are stringent or the number of entities is large, it uses clustered
optimization. If the size is very large, it uses fewer clusters, with a higher penalty but a
quicker response time. But if the entities are not many and, based on historical evidence,
the knapsack style problem can be solved, then AdOpt models the system as such and uses
integer programming to derive the optimal answer.
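The selection logic can be sketched as a simple rule. The thesis uses a soft-computing technique with historical evidence, so the thresholds and cluster-count heuristics below are purely illustrative assumptions:

```python
def choose_method(num_entities, deadline_s,
                  ip_limit=500, tight_deadline_s=5.0):
    """Pick an optimization route from the system state (simplified).

    - small system, relaxed deadline -> exact 0/1 integer program
    - otherwise -> clustered LP, with fewer clusters as pressure grows
    """
    if num_entities <= ip_limit and deadline_s > tight_deadline_s:
        return ("integer-program", None)
    # Cap clusters at 1% of entities, with a floor of 10 clusters.
    clusters = max(10, num_entities // 100)
    if deadline_s <= tight_deadline_s:
        clusters = max(10, clusters // 2)  # fewer clusters for speed
    return ("clustered-lp", clusters)

print(choose_method(200, 60.0))     # -> ('integer-program', None)
print(choose_method(100000, 60.0))  # -> ('clustered-lp', 1000)
```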
However, changing algorithms at runtime is not possible directly, because other algorithms
can only be applied if the system is abstracted in a different model. This means that in
order to use multiple optimization algorithms, the runtime models have to be generated at
runtime too. Therefore, to use multiple optimization algorithms we need a methodology that
analyzes the state of the system, recommends an optimization algorithm, generates a runtime
model of the system and uses the chosen optimization algorithm to produce a plan for the
power distribution. To achieve this goal we developed a dynamic modeling strategy, discussed
in section 6.5.
In the next sections we first discuss the approximation algorithm, which uses clustering to
convert the 0/1 knapsack problem into a frequency domain linear programming problem. We then
discuss AdOpt, which decides between the clustered frequency approach and the 0/1 knapsack
approach based on system statistics. Last, we briefly present the dynamic modeling framework
used in AdOpt to model the system at runtime.
6.3 Clustered Frequency Based Algorithm
In this section we discuss our approximate algorithm, which solves the scheduling problem in
a relatively short time but with the penalty of a rounding-off error. The scheduling problem
with our given constraints is a two dimensional knapsack problem. It is desirable to fill
each time slot with the maximal number of loads. On the other hand, it is desired that each
load is provided the maximum number of operative cycles. On the load dimension there is the
constraint of minimum loading, that is, each load must be scheduled at least 3 times in an hour.
The 0/1 knapsack problem for any dimension, where a weight must be selected as a whole and
not as a fraction, is an NP-complete problem. If we are allowed to select partial weights,
the problem is solvable in polynomial time. However, for scheduling air-conditioning loads a
partial value is not viable, since the state can only be on or off.
In the clustered-frequency based algorithm our main idea is to transform the problem into a
frequency domain so that we can solve it in polynomial time. We make this transformation by
clustering loads with similar characteristics. The goal of the clustering is to reduce the
intra-cluster variance so that the mean or maximum is representative of the loads in the cluster.
We then model the system using the cluster frequencies and their representative values, and
schedule a number of devices from each cluster for each time period rather than individual
devices. This provides us with two advantages. First, the size of the problem is reduced to
the number of clusters rather than the number of devices; the number of clusters can be
tuned through the variance goal of the clustering algorithm. Secondly, since we now have
cluster frequencies, we can use linear programming to solve the 2-dimensional knapsack
problem. We take the floor of the frequency values in the resulting schedule. This results
in sub-optimal scheduling, where for each <cluster, timeslot> pair we can miss a load that
was scheduled by the linear program. However, this error is bounded by the term
clusters × timeslots, which is less than 6% of the total loads. Given that our other options
are an exponential-time algorithm or blanket load shedding, this is an acceptable error for
now. In this section we first describe the clustering algorithm, followed by the planning LP
formulation, and then results and discussion.
6.3.1 Clustering
We adopt an incremental clustering approach since it provides the best control over the
intra-cluster variance [Hillier and Lieberman, 2001]. To cluster, we first sort the power
profiles in increasing order. For each data point, we include it in the current cluster and
then calculate the variance (σ²) of the current cluster. If σ² < errorThreshold then we
proceed to the next data point; otherwise we pull out the data point and create a new
cluster for it. Here errorThreshold is a user defined parameter limiting the variance of a
cluster. As the values are sorted prior to clustering, we are always sure that the variance
of each cluster will be less than the threshold.
To make the best of the clustering we had to reduce the intra-cluster variance. Therefore,
we set the cutoff criterion for the clustering to restrict the variance (σ²) of a cluster
within a threshold. This means that each value in the cluster is in the range
µ ± errorThreshold, where µ is the mean of the specific cluster.
We could also have used another clustering algorithm such as k-means. But k-means fixes the
number of clusters (k), whereas our requirement was to stabilize the system with respect to
the intra-cluster variance; the number of clusters is not important.
The output of the analysis phase is hence the category 4 device usage information divided
into clusters. This information is then used to plan the actual optimizations.
6.3.2 Linear Programming Based Planning
The clustered data is then passed to the linear programming engine. This data includes the
cluster means, the cluster frequencies, and the available electric power supply. The goal of
this LP formulation is to plan shutdowns in a pre-defined scheme.
Figure 6.2 defines the set of equations of the LP. The cost function (eq. 6.1) maximizes
the total frequency of the system over all time periods. Here Xi,t represents the ith
[clusters] = clusterize(n)
  sort(n)
  k = 1
  for i = 1..n
    clusters[k].insert(n[i])
    if (variance(clusters[k]) > errorThreshold)
      clusters[k].remove(n[i])
      k = k + 1
      clusters[k].insert(n[i])
(Where: n = data points to be clustered; clusters = two dimensional array, each 1D array is a cluster; k = number of clusters)
Figure 6.1: Clustering Algorithm
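The pseudocode of figure 6.1 translates to Python roughly as follows. This is an illustrative sketch: the population variance is used, and the threshold and sample values are arbitrary:

```python
def variance(xs):
    """Population variance of a non-empty list."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def clusterize(points, error_threshold):
    """Incremental clustering over sorted power profiles: grow the current
    cluster point by point; when adding a point pushes the variance past
    error_threshold, pull the point out and start a new cluster with it."""
    clusters = [[]]
    for p in sorted(points):
        clusters[-1].append(p)
        if len(clusters[-1]) > 1 and variance(clusters[-1]) > error_threshold:
            clusters[-1].pop()
            clusters.append([p])
    return clusters

loads = [1000, 1020, 980, 2000, 2050, 1990]
print(clusterize(loads, error_threshold=500.0))
# -> [[980, 1000, 1020], [1990, 2000], [2050]]
```

Because the points are sorted first, each cluster covers a contiguous range of load values, which keeps the cluster mean representative.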
cluster in the tth time period. As we do not have any priority among clusters, all machines
have an equal chance of being selected. Z gives us the total number of 'on' machines and the
value of Xi,t gives us the number of machines to switch on in the ith cluster at the tth
time period. Equation 6.2 represents the service level guarantee constraint that in every
time period at least one third of the systems should be in the powered-on state. Equation
6.3 limits the allocation to the maximum available supply, and equation 6.4 puts the
technical constraint that the number of allocated consumers in a cluster should not exceed
the cluster size.
For some of the controlled devices it is recommended that a delay of 10 minutes be given
between power cycles. To cater for this requirement, we divide the total time (1 hour) into
six chunks of 10 minutes. Each cluster in each 10 minute time period is represented by a
decision variable. So if we have 100 clusters and t is 6 then we have 600 decision
variables, each variable defining the number of consumers that should be powered on in that
10 minute period. The optimization function is to maximize the number of consumers getting
the supply over the entire duration. It should be noted here that for other problems t can
be increased or decreased accordingly.
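Assembling the decision variables and constraint families can be sketched as below. This only builds the model in list form; handing it to a simplex solver is out of scope here, and all names and sample values are our own:

```python
def build_model(cluster_means, cluster_sizes, supply, slots=6):
    """Assemble the hourly LP of figure 6.2 in list form (sketch only).

    Variables are Xi,t, flattened to index i * slots + t. Returns the
    objective vector and the three constraint families.
    """
    k = len(cluster_means)
    n_vars = k * slots

    def idx(i, t):
        return i * slots + t

    # Objective (6.1): maximize total on-frequency -> coefficient 1 everywhere.
    c = [1.0] * n_vars

    # (6.2) service guarantee: Xi,t >= MAXi / 3 for every cluster and slot.
    guarantee = [(idx(i, t), cluster_sizes[i] / 3.0)
                 for i in range(k) for t in range(slots)]

    # (6.3) supply: sum over i,t of mu_i * Xi,t <= supply.
    supply_row = [cluster_means[j // slots] for j in range(n_vars)]

    # (6.4) cluster size: Xi,t <= MAXi.
    caps = [(idx(i, t), cluster_sizes[i])
            for i in range(k) for t in range(slots)]

    return c, guarantee, (supply_row, supply), caps

# Hypothetical system: 2 clusters (mean loads 1.5 and 2.0 kW, sizes 40 and 60).
c, guarantee, (supply_row, supply_cap), caps = build_model([1.5, 2.0], [40, 60], supply=150.0)
print(len(c))  # 2 clusters x 6 slots -> 12 decision variables
```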
Linear programming assumes real values for all variables. This means that the LP outputs
non-integer values for the state frequencies of each cluster; for example, the LP can output
5.2 devices to be in the 'on' state at time t. We take the floor of the cluster values. For
our problem this adds an element of error, but the error is of order kt, where k is the
number of clusters and t is the number of time periods. An integer programming solution
would contain the same allocation, but some of these kt machines would be allocated 'on' and
some 'off'. By switching off all of these kt machines we might be under-allocating the
resource. However, since our maximum k is 1% of the loads and t = 6, our total missed
allocation will not exceed 6%.
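The flooring step and its error bound can be illustrated directly; at most one device per <cluster, timeslot> cell is lost, so the total loss is bounded by k × t. The sample solution is hypothetical:

```python
import math

def floor_schedule(lp_solution):
    """Round the fractional LP frequencies down to whole devices and
    count how many cells lose a (fractional) scheduled device."""
    floored = [[math.floor(x) for x in row] for row in lp_solution]
    missed = sum(x != math.floor(x) for row in lp_solution for x in row)
    return floored, missed

# 2 clusters x 3 slots of fractional frequencies from the LP:
sol = [[5.2, 4.0, 6.9], [3.1, 2.0, 7.5]]
floored, missed = floor_schedule(sol)
print(floored)  # -> [[5, 4, 6], [3, 2, 7]]
print(missed)   # -> 4 (of at most k * t = 6 cells)
```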
The plan given by the LP thus consists of the machines to be turned off in each ten minute
period of the hour in each cluster. It is therefore an hourly optimization plan, to be
implemented over one hour. This plan is provided to the execution module as input.
6.3.3 Spike Handling
A spike is a sudden upward or downward surge in supply or demand. Figure 6.3 shows a typical
power usage pattern at the grid level. The power provider guarantees a 2200 KW power supply,
but this supply can drop arbitrarily. A fall in power supply when demand is low does not
affect the system much; but when demand matches or exceeds supply, as in the 11th time
period, it creates problems.
A demand spike occurs when the predicted maximum load is crossed. Figure 6.6 shows the
predicted and actual load of a system in real time. The maximum predicted load was 2600 KW,
of which four hundred kilowatts was a reserve margin. Even so, the demand outstrips the supply.
A downward spike in demand or an upward spike in supply does not affect our system; for the
sake of simplicity we do not handle these two types of spikes.
To deal with an upward spike in demand and a downward spike in supply we use two separate
SAPE cycles. Some of the modules, such as sense, are used in an almost similar fashion, but
the analyze and plan modules are designed differently.
Maximize  Z = Σ_{i,t} Xi,t                      (6.1)
∀t ∀i:  Xi,t ≥ MAXi / 3                         (6.2)
Σ_{i,t} µi Xi,t ≤ supply                        (6.3)
∀i,t:  Xi,t ≤ MAXi                              (6.4)

Figure 6.2: Hourly planning LP equations
Figure 6.3: Typical Supply spike in system
Supply-side Spike
A supply side spike occurs when the power supply company faces the sudden loss of a power
generation source, such as a unit in a thermal power plant. This kind of spike occurs almost
instantly; however, sometimes there is a margin of a few seconds, and we assume that we can
use these few seconds to handle the spike.
As soon as the system senses a supply side spike we initiate a replanning process.
For the clustering we assume that our initial prediction and clustering were correct.
For the LP based planning, however, we use a proactive approach: we calculate the minimum
threshold power that is needed in every 10 minute time period without violating the service
guarantee, and we do so immediately after calculating the hourly plan. That is, at the start
of each hour, in addition to the main plan, we have five additional plans for a failure at
the 10th, 20th, 30th, 40th, and 50th minutes.
Minimize Z = Σ_{i,t} μ_i X_{i,t}              (6.5)
∀t ∀i:  X_{i,t} ≥ MAX_i / 3                   (6.6)
Σ_{i,t} μ_i X_{i,t} ≤ supplyNew               (6.7)
∀i,t:  X_{i,t} ≤ MAX_i                        (6.8)
∀i, ∀t′ < currentTime:  X_{i,t′} = allocX_{i,t′}   (6.9)

Figure 6.4: Spike handling LP equations
The planning for each 10-minute period uses the LP in figure 6.4. Here we find the plan
which minimizes the required power supply without violating the service-level agreements
(eq. 6.5).
Assume that the spike occurs at time t′. As at time t′ all the power allocations for time
slices before t′ have already been implemented, we treat the decision variables for timings
before t′ as constants. This is represented as equation 6.9. Our demand constraint and
guarantee constraint remain the same (equations 6.6 and 6.8). Our optimization function
chooses the set of decision variables for which the total power needed is minimum
(equation 6.5).
If at time t′ the power does go below the amount promised at the start of the hour,
we have a plan ready which can be propagated to the system instantaneously. We assume
here that our communication network is fast and stable enough to propagate the new plan.
As mentioned previously, the execution phase is implemented using the same communication
network, i.e., SMS.
Since our planning is at discrete time intervals, we revert the system to the most recent
plan made before the current time. For example, if a drop in supply occurs in the 24th
minute then we revert to plan 2, made for t = 20, as shown in figure 6.5.
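The reversion rule amounts to integer division of the spike time by the planning period. A small illustrative helper, not from the thesis code:

```python
def contingency_plan_index(spike_minute, period=10):
    """Pick which precomputed contingency plan to revert to when a
    supply spike hits mid-hour: the plan made for the last completed
    10-minute boundary (a spike in minute 24 reverts to the t = 20
    plan).  Names are illustrative."""
    return (spike_minute // period) * period
```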
Demand-side Spike
A demand-side spike occurs when the demand from the consumers outstrips the projected
demand calculated at the start of the hour. Figure 6.6 shows a typical hot summer day when
the real demand outstrips the projected demand. We calculate demand on a per-home
basis; the sum of these demands gives us a global picture of how much energy is needed.
One thing to note here is that, unlike a supply-side spike, a demand-side spike almost always
grows smoothly. This gives us breathing room to deal with this kind of spike in a more robust
way. Therefore, we use a reactive method in this approach.
Handling a demand-side spike is a two-step process. First, we evaluate how fast our
demand is approaching our supply, since demand growth is non-deterministic. An increase
in demand may not be a continuous increase, in which case we would like to wait and
see if replanning is really required. To do this we take the demand data every minute
and use linear regression to find a fit for our global demand. We then use the relation defined
in equation 6.10, where ∆ is the slope of our linear regression fit and calcTime is
the time to cluster, plan, and propagate the updated plan. If the relationship does not hold,
then we proceed to step two of the analysis.
This relationship is the trigger for replanning. As we measure the rate of growth and
the current state, and relate them to the time it takes for us to react, we ensure that we
always have a plan ready for an eventuality.
We start our step two by pulling the data from all the device units. A demand different
Figure 6.5: System response for spike at 20 < t < 30
Figure 6.6: Typical demand spike in system
reserveMargin ≤ ∆ × calcTime        (6.10)

Figure 6.7: Reserve margin lower limit
from the projected demand requires a re-clustering of the air conditioners' usage data. To
do a better job at clustering we use the historical data of the air conditioners and take the
max of the two pieces of data for each device. Here we use the same incremental clustering
algorithm with a similar threshold.
We then use an LP which borrows from the two LPs discussed so far. As our goal here is
the same as that of the program in figure 6.9, that is, to maximize the number of users, we
use the optimization function from that LP (equation 6.27). The constraint equations are
mathematically the same as those in figure 6.4; however, the constants now change differently.
Since in this scenario the cluster means (μ) have changed while the supply is the same as
before, our LP has a changed left-hand side (μ) instead of right-hand side (supply). As
these values are input parameters, we do not need to change the actual LP equations;
updating the constants is sufficient. We can also infer that the running time and
complexity will be the same for both LPs.
6.3.4 Evaluation
We have evaluated our methodology against the challenges mentioned in the previous
section: scalability, spike handling, and ensuring service-level guarantees. The linear
programming equations have already proven the effectiveness of the technique in ensuring
supply-side guarantees. Therefore, in this section we discuss the simulation results that
prove the scalability of our approach and a mathematical derivation that shows that our
approach is effective in handling spikes.
We used two different methodologies to evaluate our system and support our hypothesis.
We first conducted simulations with varied data sets and sizes to test the scalability and
correctness of our system. As our spike handling uses the same clustering and a similar LP,
whose scalability and correctness have already been shown, we used mathematical
derivation to evaluate and prove the correctness and resilience of our spike handler.
We start with a discussion of the complexity of clustering and LP with respect to scalability.
We then evaluate our test results for the scalability of the system, showing that the running
time for an increasing number of controlled devices can be managed and thus answering the
scalability question. Finally, we prove the correctness of the spike mitigation system, which
answers the question of effectively handling un-scheduled updates.
Is the Solution Scalable?
Since we are controlling individual devices in this methodology, the sheer number of
devices requires a very scalable solution. This also means that we can in no way afford a
non-polynomial solution.
The scalability of our methodology depends for the most part on the analyze and plan
phases, so we evaluate these two phases here; the remaining SAPE phases are discussed
elsewhere.
In analyze we use an incremental clustering technique. Our algorithm tries to insert
each element into one cluster only, and then calculates the variance of that cluster. Thus
our complexity depends on the length of the largest cluster l, the number of clusters
k, and the number of elements n. The order of this incremental clustering algorithm is
O(nkl² + n log n), where n log n is the complexity of sorting. A point to note here is that
the number of clusters and the size of the clusters are closely related to the error threshold
and the variance of the distribution. We discuss this relationship and its effects shortly.
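A minimal sketch of such an incremental scheme, assuming one-dimensional consumption values and a fixed radius threshold (the actual thesis algorithm also tracks cluster variance; names here are illustrative):

```python
def incremental_cluster(values, threshold):
    """Greedy incremental clustering: sort the values, then grow the
    current cluster while each new value stays within `threshold` of
    the running cluster mean (a simplified quality-threshold scheme).
    The sort is the O(n log n) term in the complexity above."""
    clusters = []     # each cluster is a list of member values
    current = []
    for v in sorted(values):
        if current and abs(v - sum(current) / len(current)) > threshold:
            clusters.append(current)      # close the current cluster
            current = []
        current.append(v)
    if current:
        clusters.append(current)
    return clusters
```

Each element is tested against at most one candidate cluster per step, which is what keeps the scheme linear in n apart from the initial sort.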
In the plan phase we use a linear programming (LP) algorithm. The complexity of
LP depends on the number of decision variables. Our LP decides the number of
machines that should be on for each cluster. We used Matlab's LIPSOL algorithm
[Zhang, 1997], which is based on Mehrotra's predictor-corrector interior point
algorithm [Mehrotra, 1992]. The complexity of a modification of this algorithm in big-O
terms was calculated by Salahi and colleagues and found to be O(k²L) [Salahi et al., 2007],
where k is the number of variables and L is the length of the string needed to encode the
input. Recall that k is the number of clusters created in the previous step.
A grid station typically supports a few thousand consumers. As the consumption of
individual devices can vary, we used random data generated between an upper and a lower
bound to mimic actual device power consumption.
The scalability is affected by three variables: 1) the number of device usage profiles, n;
2) the variance σ² of the usage profiles; and 3) the error threshold e that we tolerate
within a cluster.
To evaluate the system we considered between 10,000 and 100,000 device usage profiles,
used data with a variance between 1 kilowatt and 50 kilowatts, and varied the threshold
within a given cluster between 0.01 and 0.1.
For these variables, the clustering results with an error threshold of 0.1 are given in
table 6.2 and with an error threshold of 0.01 in table 6.3.
It can be observed that for a typical run of 50,000 values with an error threshold of 0.01,
the clustering time is well under a minute. In fact our worst clustering time (151.8 seconds)
was with the lowest variance (1) of data and the highest error threshold (0.1).
Our observation is that clustering in fact takes the major chunk of the time, as the time to
calculate the plan is extremely small. For example, with k = 1045 the plan is calculated in just
2.1 seconds. The total time required for both clustering and LP is given in tables 6.2 and 6.3;
it can be seen that LP adds only a fraction of the total calculation time. In contrast, an integer
programming solution takes close to 30 minutes for a dataset of size 30!
Is the Solution Effective in Handling Spikes?
As discussed previously, our system will encounter two types of spikes. We discuss the
effectiveness and response time for each source individually. Since our supply-side spike
handling only uses an LP similar to the one discussed in the previous section, and our
demand-side spike handling uses the same clustering algorithm discussed previously, we do
not re-evaluate clustering or LP scalability here. Instead our focus in this section is to prove
that the spike handling mechanism is robust and able to maintain the guarantees, if possible,
in real time.
Supply side spike  A sudden dip in supply is very much a possibility. Since our system
requirements stipulate that a machine that is shut down should not be restarted within the
next 10 minutes, any change in supply is immediately transferred to the customers. For the
next cycle, however, due to our proactive approach we already have a plan ready and this
plan simply replaces the current plan; hence we seamlessly integrate the change. A dip in
power in the later stages of an hour might leave us in a situation where meeting a guarantee
is not possible. This case is a policy decision and is beyond the scope of our work. If the
system managers would like to lower their guarantee with some penalty, then such a plan
can also be calculated a priori and enforced in such a situation.
Variance   Size      Cluster count   Clustering time   LP time   Total time
1          10,000    7               1.96              0.06      2.02
1          50,000    7               41.76             0.05      41.71
1          100,000   6               151.8             0.09      151.89
10         10,000    64              0.76              0.16      0.92
10         50,000    76              6.48              0.21      6.69
10         100,000   80              21.49             0.23      21.72
50         10,000    304             0.5               0.37      0.87
50         50,000    348             3.8               0.56      4.36
50         100,000   372             9.12              0.88      10

Table 6.2: Total time for analyze/plan (error threshold = 0.1)
Variance   Size      Cluster count   Clustering time   LP time   Total time
1          10,000    23              1.12              0.12      1.24
1          50,000    24              14.5              0.09      14.59
1          100,000   26              58                0.11      58.11
10         10,000    201             0.56              0.43      0.99
10         50,000    220             4.2               0.34      4.54
10         100,000   234             11.2              0.43      11.63
50         10,000    857             0.5               1.40      1.9
50         50,000    1004            3.86              1.45      5.31
50         100,000   1045            8                 2.04      10.04

Table 6.3: Total time for analyze/plan (error threshold = 0.01)
In a nutshell, at the time of a supply-side spike the system degrades to guaranteeing the
minimum service level. That is, the system will try to conserve energy and only ration 20
minutes of usage per device. However, if the power dip does not reach our minimum level,
we can always recalculate the plan using the LP in figure 6.9 for the next time period and
update the plan likewise. Since the maximum time to calculate a plan is less than 4 minutes,
and our time period is 10 minutes, we have the ability to update a plan if needed.
Demand side spike  As discussed previously, a demand-side spike is the result of
consumption growing beyond predictions. A point to note is that demand growth over
time is not as drastic as a supply-side spike, especially considering that we have tens of
thousands of consumers. In order to predict demand-side spikes we need to find out when
the SAPE process should start as demand changes. Another related aspect is the reserve
margin that we need to subtract from the supply to get the adjusted supply.
For the trigger mechanism we must ensure that we have enough time to run the SAPE
process before demand reaches an unacceptable level. Planning ahead is not possible, as the
growth in demand is quite non-deterministic. Our limitation is that we need to start our
process calcTime seconds before we reach our maximum supply point. We derive a formula
for the trigger as follows:
Let calcTime be the time to analyze, plan, and execute our SAPE cycle. If we want
enough margin to trigger our calculations before we overrun our supply, then our trigger
will be:
t′ − t ≤ calcTime        (6.11)
where t is the current time and t′ is the time when, according to current estimates, our
demand will reach our supply.
We do a regression analysis on the demand values from the previous five minutes to approximate
the movement of our demand. The regression analysis gives us the equation:
currentDemand = ∆t + c        (6.12)
we can say that t′ can be given as:
supply = ∆t′ + c (6.13)
Replacing the values in 6.11 using equations 6.12 and 6.13, we get:

(supply − currentDemand) / ∆ ≤ calcTime        (6.14)
Simplifying, we get our trigger as:

currentDemand ≥ supply − ∆ × calcTime        (6.15)

That is, when our currentDemand reaches or exceeds supply − ∆ × calcTime we will initiate
our demand-spike planning module.
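The trigger in equation 6.15 can be sketched as follows, fitting the slope ∆ by ordinary least squares over the recent per-minute demand readings; the function and parameter names are illustrative assumptions, not the thesis implementation:

```python
import statistics

def spike_trigger(demand_history, supply, calc_time):
    """Demand-spike trigger (eq. 6.15): fit a line to the last few
    per-minute demand readings and fire when the current demand is
    within slope * calc_time of the supply."""
    times = range(len(demand_history))
    # Least-squares slope (delta) of demand over time.
    t_mean = statistics.mean(times)
    d_mean = statistics.mean(demand_history)
    delta = sum((t - t_mean) * (d - d_mean)
                for t, d in zip(times, demand_history)) / \
            sum((t - t_mean) ** 2 for t in times)
    current_demand = demand_history[-1]
    return current_demand >= supply - delta * calc_time
```

With demand growing at 10 KW per minute against a 200 KW supply, a 5-minute calculation budget does not yet fire at a demand of 140 KW, while a 7-minute budget does.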
The second value that we need to know is an acceptable reserve margin. Recall from an
earlier section that we related the margin to the demand slope as margin ≤ calcTime × ∆.
If our predictions were correct, then at the time of peak usage we will not trigger spike
mitigation if our margin is greater than or equal to calcTime × ∆.
To ensure that at peak time our demand-spike mechanism is not started while we are
still following our demand pattern, we put a lower limit on our margin.
reserveMargin ≥ f(t′)− f(t) (6.16)
where f(t) is the maximum demand and f(t′) is the hypothetical demand at time t′ if the
demand had continued according to a regression analysis over a window from five minutes
before the maximum demand until the maximum demand. We use regression analysis on the
predicted demand function to find f(x). The equation given by the regression analysis is:
f(x) = demand(x) = ∆x+ c (6.17)
we can hence write equation 6.16 as
reserveMargin ≥ ∆(t′ − t) (6.18)
We know that (t′ − t) is bounded by calcTime. For calculating the margin limit we use
the lower bound for calcTime, hence we can say that:
reserveMargin ≥ ∆× calcT ime (6.19)
This gives us the lower bound for the reserveMargin.
Using these mathematical proofs we can say with confidence that as long as our
reserveMargin is more than ∆ × calcTime and our trigger is evaluated regularly, we will
always have enough time to calculate a new solution for an increasing demand.
6.3.5 Discussion
In this section we discuss some observations and frequently asked questions about our
methodology.
• Is the solution cost effective? We have calculated that to control each air conditioner
remotely we need equipment worth about $30. This includes relays, a GSM cell phone,
circuits, wiring, etc. The cost of running the system is also quite low because many
GSM companies provide packages of unlimited SMS for just $2 a month. Moreover,
considering that the available alternative solutions, where they exist, are much more
expensive, this solution is definitely cost effective.
• Who will pay the cost/ who will be responsible for setup? We envision that our system
will be adopted by townships and/or power distribution companies. A city council or
a power providing company will decide on setting up a system. In such a case, either
the power company or the city council will enforce such a system and will devise a
cost breakdown mechanism. Such an adoption will have legal and social implications;
however, this discussion is beyond the scope of this work.
• Is our solution generic? Our main contribution in this work is the idea of clustering
a large data set and then using linear programming to solve an otherwise complex
problem. This has a large range of applicability and can be applied to most problems
where a decision has to be made on the states of elements based on some constraint
and a global goal. We also provide a way to bound the error that is introduced
by clustering and the subsequent real-valued LP solution. However, given that the
original problem was intractable by a big margin, it can be argued that the error
margin is negligible. Our technique is not a generic approximation solution for NP-
hard problems. It is a technique for near-optimal solutions which can decide the states
of tens of thousands of variables while staying within constraints and bounds.
• What is the learning curve of applying this technique? Our technique uses basic
clustering and mathematics. The modeling effort required to adapt it to a new problem is
much less than many other proposed techniques [Diao et al., 2003, Abrahao et al., 2006,
Abdelwahed et al., 2004, Lefurgy et al., 2007, Wang et al., 2006]. There are open-source
tools (e.g., the GNU Linear Programming Kit) available for solving LPs. Indeed, the
methodology we have proposed is not only simple but also commonly used by economists,
managers, engineers and planners.
• Can we use other techniques to solve this problem? As discussed in previous sections,
the task of determining the state of individual elements under constraints is an NP-hard
problem. Our algorithm is a transformation of this problem into a solvable domain.
Though this transformation introduces an error, we have already defined a bound
for this error in section 6.3.2. A binary programming solution for this problem is on
the order of 2^n, where n is the number of elements. For our basic case this is
2^10,000: an intractable problem!
6.4 Adaptable Optimization - AdOpt
To realize the goal of using multiple optimizations on a given system, a whole autonomic
framework is required. The basic building blocks of this framework are the optimization
algorithms that eventually optimize the system. The selection of an optimization algorithm
is partly dependent on the optimization problem in the system and its eventual goals. We
first describe the optimization algorithms that we selected to solve our optimization
problem. Some of these algorithms have been discussed in the previous section. Here we also
discuss the implementation details of the algorithms, since the decision to choose a particular
implementation is important for AdOpt. This is followed by a discussion of the selection
methodology and the formulation of the problem in BIP and LP terms. We then discuss
the architecture, followed by results and discussion.
6.4.1 Self-Optimizing techniques
We used three optimization methods to calculate the plan for the optimization of electricity.
In this section we describe the scalability and applicability of these three optimization
methods.
Binary Programming
In our optimization problem we need to find a plan of whether to turn on or turn off an
electric machine. Binary programming (BP) is an ideal solution to this problem because
each device has only an 'on' or an 'off' state during a slot [Hillier and Lieberman, 2001].
The advantage of BP is that it gives an exact solution and does not have any rounding-off
error. This means that there is no unutilized power in the distribution system. However, on
the downside, the running time of BP degrades exponentially as more devices are added to
the system. Therefore, BP can only be used if the system has a small number of devices.
BP is a known NP-hard problem. Finding the optimal solution takes the problem a bit
further and makes it a Σ₂ (Sigma-2) problem, a class of problems known to be more complex
than NP. There is no known polynomial-time algorithm to solve these problems; the only
known way is to enumerate all possible solutions. However, applying a combination of
state-of-the-art branch-and-bound techniques and linear programming can solve small
problems in a relatively short time.
The formulation of BP encodes each machine–time-slot tuple as a single variable. BP
decides the state of each tuple subject to the service-level guarantee and the supply and
demand constraints. Therefore, the solution provides the state of each machine in the system
for the six or so time slots. The formulation of the runtime system model that represents our
power distribution system is discussed in a later section.
Linear Programming
Linear programming (LP) is also used to solve the optimization problem of the power
distribution system. We used two algorithms in linear programming: the simplex method
and the interior point method. As mentioned before, to solve the system using linear
programming while allowing for a large number of machines, we clustered the data based on
a distance threshold over the power consumption profiles. The clusters are then used to
generate the equations that
represent the system from a meta-model. Once the equations are generated, one of the two
aforementioned methods is used to solve the linear system of equations and find a plan for
electricity optimization. Because of the inherent modeling of the system, both the simplex
method and the interior point method result in an error margin, i.e., un-utilized power. The
main differentiating factor between simplex and interior point is the underlying method of
optimization. Simplex traverses along the edges of the feasible region and changes direction
when a constraint is encountered. This is slower but in general allows a variable to reach its
maximum in terms of optimality before other variables are optimized.
The interior point method is the state-of-the-art LP solving method, used and improved
since it was invented in 1984 by Karmarkar. The interior point method traverses the interior
of the feasible region in search of the optimal point. In doing so it changes the values of a
larger number of variables at the same time. Hence at the end of the optimization, for a
degenerate problem, more variables may contribute marginally to the optimal point than
in a simplex-solved solution. A degenerate solution is one where more than one combination
of values yields the same optimal value.
In short, simplex has a lower error rate but is slower than interior point. On the other
hand, interior point is fast but can give a larger error at the end.
6.4.2 System Models
Multi-dimensional Multi-Knapsack Model
To use binary programming we modeled our problem as two interwoven multi-knapsack
problems. A knapsack problem is a formulation where hypothetical sacks have a maximum
weight and smaller weights have to be fitted into the sacks so that each sack is maximally
utilized. In our problem, because of the service-level guarantee, we have to select at least
two time slots for each machine. Therefore, the allocation of time slots for each machine
becomes a weight. The number of devices becomes the number of sacks of the knapsack
problem, as shown in figure 6.8.
Concurrently, for each time slot we have to switch on devices such that the number of
devices is maximized while the total power consumed stays within the maximum power
supply available. This again is a knapsack problem; here our sacks are the time slots. These
two interwoven problems can be expressed independently and then merged into a single
equation matrix for one of our solution methods.
The first knapsack problem requires a more dynamic solution than the latter one. As we
do not know the number of devices, the number of equations in this problem is not known
at design time. For each sack, that is, for each device, we create an equation with six boolean
decision variables representing each time period. The sum of these variables should be
greater than or equal to the service-level guarantee that the specific device is calibrated to.
All of these sack equations are aggregated to form two matrices: the variable-counting
left-hand side and the service-level-guarantee bound as the right-hand side, as shown in
figure 6.8.
The second knapsack has time slots as sacks. This means that for our current setup this
problem has six sacks, though the number of these "sacks" may vary at runtime. For each of
these six sacks, a sum of products of decision variables and consumption values is calculated
as the left-hand side. The boolean decision variables here are the same as those used in the
previous knapsack problem. This reuse, or double use, of decision variables weaves the two
knapsacks together. The right-hand side for each time period is the amount of resource, in
our case the power available to the system in that time slot.
Figure 6.8 is the template for the binary programming planning equations. In this figure,
eq. 6.21 represents a sack of the first interwoven knapsack. As each device is considered a
sack in that problem, equation 6.21 is generated for each device in our system. In
comparison, equation 6.22 represents a sack of the second knapsack problem. As in this
problem each time slot is considered a sack, equation 6.22 is generated for each time slot
in our system.
Clustered-frequency Based Modeling
A linear program is a mathematical modeling and solving technique for planning scarce
resources across multiple demands. The system is modeled as a series of linear equations.
These equations define the whole system, including the constraints, cost functions, and
decision variables, and are usually derived from technical requirements as well as logical
considerations.
LP solutions are in the real domain and cannot be restricted to integer or binary values;
doing so makes LP equivalent to integer programming, which is an NP-hard problem.
Thus LP over a binary decision problem is not scalable. To derive an answer within our
time constraints, we instead transformed our problem from the binary-decision domain to
the frequency-determination domain. This was done by reducing the dimensions of the
problem through clustering. A simplified quality-threshold algorithm was used to cluster
the data [Heyer et al., 1999].
As we use the mean of a cluster as its representative element, restricting the radius of a
cluster makes that value more meaningful. Through this transformation we change the
problem from deciding whether a machine should be kept on or off in a time slice to
determining the optimal frequency for each cluster of machines. We round off the
per-cluster values to arrive at a sub-optimal, but maximal, plan for our system.
The resulting clustered problem has two logical constraints and one technical constraint.
Maximize Z = Σ_{i,t} X_{i,t}                  (6.20)
∀i:  Σ_t X_{i,t} ≥ guarantee                  (6.21)
∀t:  Σ_i μ_i X_{i,t} ≤ supply                 (6.22)

Figure 6.8: Hourly planning BP equations
Maximize Z = Σ_{i,t} X_{i,t}                  (6.27)
∀t ∀i:  X_{i,t} ≥ MAX_i / guarantee           (6.28)
Σ_{i,t} μ_i X_{i,t} ≤ supply                  (6.29)
∀i,t:  X_{i,t} ≤ MAX_i                        (6.30)

Figure 6.9: Hourly planning LP equations
The first logical constraint is that the total energy consumed, as planned by the LP, should
not exceed the available power supply. This is represented as equation 6.29. The second
constraint is that the number of machines powered "on" in each cluster for each time
period should be at least x% of the total number of machines in that cluster, where x is
the minimum service-level guarantee for the specific cluster. For example, if the guarantee
for the system is that no machine will be off for more than 20 minutes (or 33% of the time),
then the constraint is that 33% of the consumers in each cluster shall be powered "on"
in each cycle to ensure the 20-minute guarantee (equation 6.28). The technical constraint
is that the number of machines powered on in each cycle should not exceed the number of
machines in that cluster, as given in equation 6.30.
Figure 6.9 defines the complete LP meta-model for our problem. The cost function (i.e.,
equation 6.27) maximizes the total number of machines in each cluster over all time periods.
Here X_{i,t} represents the ith cluster in the tth time period. As we do not prioritize
clusters, all machines have an equal chance of being selected. Z gives us the total
number of 'on' machines, and the value of X_{i,t} gives us the number of machines to switch on
in the ith cluster in the tth time period. Equation 6.28 represents the service-level guarantee
constraint that in every time period at least one third of the systems should be powered on.
Equation 6.29 limits the allocation to the maximum available supply, and equation
6.30 adds the technical constraint that the number of allocated consumers in a cluster should
not exceed the cluster size.
The plan given by the LP thus consists of the machines to be turned off in each ten-minute
period of an hour in each cluster. Therefore, this is an hourly optimization plan to be
implemented every hour.
Spike in Supply or Demand
In addition to the hourly planning we also developed models that can be used to replan
during an hour. This is necessary because if, contrary to the planned optimization, there is a
sudden upward trend in demand or a sudden downward trend in supply, the system
should be able to handle it gracefully. These erratic fluctuations in the demand or supply
pattern, called spikes, are handled using a model similar to those in figures 6.8 and 6.9.
The only difference is that when optimization planning is performed during an hour, the
time window of that planning is smaller and the system has to honor the service-level
guarantees still pending for certain consumer devices. Due to the similarity of the
spike-related meta-model to the hourly planning model we do not discuss its formulation;
however, we will discuss the spike-related results in the evaluation section.
6.4.3 Case-Based Reasoning Engine
In planning an optimization the system has to select a method from the available
optimization methods. Because of competing factors it is not always possible to use a simple
rule-based system for this selection. Therefore, there is a need for a recommendation
engine that suggests an optimization method based on historical data and user preferences.
To this end we have used a recommendation engine based on Case-Based Reasoning (CBR)
to find the right method when an optimization is required [Aamodt and Plaza, 1994].
Case-Based Reasoning (CBR) is a technique for deriving a new solution by considering
similar past solutions. The CBR engine maintains a case base developed from historical
data. Given a new situation, the CBR engine compares the new situation with old situations
and derives the solution that is closest to it. The engine has the ability to revise and update
its case base. We use CBR to select the optimizer for our base problem. The inputs to the
CBR are statistics of the system along with user policies, constraints, etc., and the output
is the recommended optimization method, i.e., BP, simplex, or interior point.
Another motivation to use a recommendation engine is that we do not have any hard
boundaries on the size of a given problem. This means that given a new situation it is not
possible for us to select one of the three methods unless we have previously evaluated their
performance on a similar situation. The CBR-based recommendation engine requires an
initial case base of historical data; when new cases arrive it uses previous experience to
recommend an optimization method. The new cases, depending on the error rate and speed
of their solution, are used to improve the set of cases in the CBR engine.
Some constraints such as user policies and service-level guarantees are already fed into
the system at design time. At runtime the CBR recommendation engine takes three inputs:
• Total gap between power supply and power demand.
• Total number of machines to be managed.
• Approximate time available for calculating the optimization plan.
Any CBR engine requires an initial case base. We developed ours using the following methodology: we ran our three solver algorithms on sample data, varying the size, spread, and supply-demand gap of the data. These variations are listed in table 6.4 and resulted in a total of 54 cases. The resulting time and un-utilized power margin produced by each run, along with the inputs, were saved in the case base.
Once this initial case base is populated, at runtime when an optimization is requested, CBR finds the case closest to the input parameters and selects the solver that can solve the problem within the available time while minimizing the un-utilized power. For online learning, a feedback system is incorporated in the CBR recommendation engine.
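The retrieve step of the recommendation engine can be sketched as a nearest-neighbor lookup over the three runtime inputs. The sample cases, field names, and normalization constants below are invented for illustration; the thesis builds its actual case base with FreeCBR.

```python
import math

# Illustrative case base: (devices, shortfall %, time budget s) -> solver.
# These entries are NOT the thesis case-base; they only sketch the idea.
case_base = [
    {"devices": 500,   "gap": 5,  "time": 60, "solver": "BP"},
    {"devices": 2000,  "gap": 30, "time": 5,  "solver": "simplex"},
    {"devices": 50000, "gap": 50, "time": 20, "solver": "interior"},
]

def distance(query, case):
    # Normalized Euclidean distance over the three runtime inputs.
    scales = {"devices": 50000.0, "gap": 50.0, "time": 200.0}
    return math.sqrt(sum(((query[k] - case[k]) / s) ** 2
                         for k, s in scales.items()))

def recommend(devices, gap, time):
    # Retrieve the nearest stored case and reuse its solver choice.
    query = {"devices": devices, "gap": gap, "time": time}
    return min(case_base, key=lambda c: distance(query, c))["solver"]

print(recommend(devices=1800, gap=25, time=10))  # → simplex
```

The revise step described above would then compare the solver's measured time and error against the retrieved case and update or add cases accordingly.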
6.4.4 Framework Architecture
We designed our framework so that it is easily integrated into a SAPE cycle. Therefore, we assume that each electric device has a mechanism to send information about its status, including whether it is powered on by the consumer and its present power consumption. Assuming all this data is received by the system, our framework plans an optimization.
The system state data received from the sensors is fed into a CBR Recommendation Engine (CRE). The CRE processes this data to select the optimization algorithm for the given optimization problem. This decision acts as an input to the Runtime Modeler (RM). The runtime model generated by the RM is an input for a mathematical toolbox (MT) that applies the appropriate optimization algorithm to the model. The toolbox also measures statistics of the solution, such as the time taken to solve and the amount of power left un-utilized, for feedback to the CRE.
The plans generated by the MT are propagated to the devices that make up the system, and executors implement this planning. Again, sensors and executors are assumed to be present, and our framework fits between the two to perform self-optimization.
In this section we describe a particular implementation of AdOpt used to run the simu-
lations of a real power distribution system.
CBR Recommendation Engine: The CRE consists of a mathematical summarizer, a CBR engine, a result evaluator sub-module, and a case base, as shown in figure 6.11. Raw data is fed to the engine for summarization. This summary is used, in conjunction with the case base, to generate the input for subsequent modules. At the other end, the CRE has a result evaluator module which receives the statistics of the MT at the end of an optimization cycle. These statistics include the number of devices, the supply-demand gap, the solver selected, the solver time, and the error rate. If the real-time execution conforms with the case base then no action is taken. But if the real-time execution reveals that the solver time was incorrect then the case base is updated. If the number of devices is too far from any of the cases in the case base then the result is added as a new case.
Runtime Modeler: Details of our Runtime Modeler (RM) are discussed in previous sections. Architecturally, the dynamic mathematical modeler consists of three components: a Knapsack modeler, an LP modeler, and a clustering component for dimension reduction, as discussed previously.
Mathematical Toolbox: Our mathematical solvers are standard operations research tools. We treat the mathematical solvers for the three optimization methods as black boxes. The inputs to the solvers in the MT are the runtime model(s) generated by the RM, and the output is an optimization plan. Details of the specific mathematical solvers are given when we describe our evaluation setup.
6.4.5 Evaluation
The promise of autonomic systems in general is to reduce the load on the operator. To this end, an autonomic system should handle as many conditions as it possibly can without operator intervention. In addition, a self-optimizing system should deliver a better result in general cases than a system where optimization is performed manually.
We claim that the AdOpt framework caters to these demands of a self-optimizing system. In fact, our system can outperform other self-optimization techniques in a rapidly changing system by leveraging the very changes that cause sub-optimal behavior.
To evaluate our system against these two core requirements we pose the following questions:
Figure 6.10: System Flow
Figure 6.11: System Architecture
1. Does our system leverage changes in the dynamics of the system, in terms of its size and the available time?
2. How much did we optimize, or in other words, what percentage of power is utilized?
We evaluate our system against two different evaluation suites. Our first evaluation suite tested the system for effectiveness, that is, whether we are able to provide a plan irrespective of variations in the system. We compare our results with the three techniques running without adaptation and compare effectiveness and savings.
Our second evaluation suite is built on data from a power distribution network, where we compare our framework's results with an existing technique. Our savings, or profitability, come from allocating as much power to devices as we can. In this test suite we compare the un-allocated power of the AdOpt framework with the un-allocated power of existing techniques.
In this section we first describe our evaluation environment. Then we discuss the results of the two different sets of evaluations we conducted on our framework.
Evaluation Environment
We used a shared 2.4 GHz Pentium Core 2 Duo processor with a total of 2.00 GB of RAM to conduct our simulations. The CRE is developed using FreeCBR. The APIs of FreeCBR are integrated with Matlab, which uses its internal JDK to call FreeCBR functions. We wrote our own code for the RM and for the clustering of raw data. The mathematical solvers in the MT use Tomlab's CPlex1 solver to solve BP and Matlab's optimization toolbox for LP optimizations using the simplex or interior point methods.
CBR Initial Case-base
To develop our CBR case base we varied the supply-demand gap and the size of the problem. We applied this data to all three algorithms. Due to practical considerations we limited BP to sizes of 2000 users. The variations in the case base at the start of the evaluation are given in table 6.4. For each data size, we considered three random values of resource provisioning, ranging from close to the minimum requirement (1/3rd) to 88%.

1http://tomopt.com/tomlab/products/cplex/
Experimental Simulations
We evaluated the claims about our system on two sets of data. The first set consists of a hypothetical system with which we demonstrate the adaptability and efficiency of our system. To validate the results further, we tested our system on real data obtained from the California Independent System Operator (CAISO).
Adaptability and E�ciency Evaluation
In these experiments we evaluated our system on three key variations of the system: the electric devices present in the system, the gap between the supply and demand of electricity, and the time available to calculate the plan. We made equivalence classes of these three variations to develop the set of experiments for testing the system under various combinations of variations.
We used five equivalence classes for the number of machines present in the system, i.e. 500, 1000, 2000, 10000 and 50000. Three variations in the demand-supply gap are considered, i.e. 5%, 30% and 50% more demand than available electricity. The available time to calculate the plan is also varied; this is the maximum time available to find a plan. For hourly
Sizes   Resource provisioning (%)
100     72   54   35
1000    88   65   44
2000    80   65   40
5000    60   40   35
10000   80   62   42
50000   67   50   34

Table 6.4: Input combinations for all three algorithms to generate the initial case-base
planning this may not matter because we have a whole hour at maximum to calculate the plan. However, at any time when there is a spike in demand or supply, a plan needs to be recalculated for the hour. Therefore, we considered four upper boundaries for the time available to recalculate: 5, 20, 60 and 200 seconds.
A cross product of these three variations results in sixty cases, and showing the results of all sixty is not possible here. Therefore, we used orthogonal arrays to generate the twenty combinations of experimental results that we show. According to a study by NIST, using orthogonal arrays covers 98% of all cases in real situations [Wallace and Kuhn, ]. These twenty experiments are summarized in table 6.5.
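The reduction from the full cross product to a small covering suite can be sketched as follows. This is a generic greedy pairwise-covering sketch, not the orthogonal-array/ALLPAIRS procedure actually used; the parameter values are taken from the text.

```python
from itertools import product, combinations

# Equivalence classes from the text: 5 sizes x 3 gaps x 4 time budgets.
sizes = [500, 1000, 2000, 10000, 50000]   # number of devices
gaps = [5, 30, 50]                        # % shortfall
times = [5, 20, 60, 200]                  # seconds available

all_cases = list(product(sizes, gaps, times))
assert len(all_cases) == 60               # the full cross product

def pairs(case):
    # All (parameter-position, value) pairs that one test case covers.
    return set(combinations(enumerate(case), 2))

# Greedy covering: repeatedly pick the case covering the most
# still-uncovered parameter pairs until every pair is covered.
uncovered = set().union(*(pairs(c) for c in all_cases))
suite = []
while uncovered:
    best = max(all_cases, key=lambda c: len(pairs(c) & uncovered))
    suite.append(best)
    uncovered -= pairs(best)

print(len(suite))  # a pairwise suite needs far fewer than 60 cases
```

Every pair of parameter values still occurs in some selected case, which is the coverage property pairwise testing relies on.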
The first three columns of table 6.5 show the three variations for each experimental run. The "# of devices" column shows the number of electric machines present in the system for which a service-level guarantee is required. "Shortfall" is the difference between demand and supply. "A-Time" is the maximum time available to calculate the plan.
Using these variations, in each test we ran the three optimization techniques independently and then used AdOpt to see if AdOpt provides a better optimization by selecting one of the optimization techniques dynamically.
"C-Time" is the time consumed by AdOpt to find the plan, from receiving the raw data in the CRE to producing a plan as output. "UP" is the un-utilized power in the system that an optimization is unable to distribute amongst the electric devices.
Discussion
During the experimental runs all optimization methods produced a plan except for a few cases of binary programming (BP). This is because BP is only applicable at small scales and runs out of memory in test runs where the number of devices is larger than 2000. Therefore, we did not run BP on problems involving more than 2000 devices.
It is clear from the results that AdOpt has picked the best optimization method to solve
Exp #  # of     Short-  A-Time  AdOpt           Simplex         Interior Point   Binary Programming
       Devices  fall            C-Time   UP     C-Time  UP      C-Time  UP       C-Time   UP
1      500      5%      5       0.31     1%     0.08    1.5%    0.062   1.27%    0.20     1%
2      500      30%     20      0.33     1%     0.1     2.5%    0.078   3%       0.17     1%
3      500      50%     60      2.78     1%     0.06    7.4%    0.06    7.4%     2.57     1%
4      1000     30%     5       4.54     1%     0.14    2.7%    0.13    3.99%    4.15     1%
5      1000     5%      20      0.30     1%     0.13    1%      0.11    1%       0.59     1%
6      1000     5%      60      0.31     1%     0.13    1%      0.14    1%       0.59     1%
7      1000     50%     200     21.42    1%     0.19    6.9%    0.13    7.9%     1.31     1%
8      2000     50%     5       0.33     6.7%   0.20    6.7%    0.20    6.7%     102.06   1%
9      2000     5%      20      0.37     1%     0.22    1%      0.23    1%       2.09     1%
10     2000     30%     60      46.47    1%     0.23    2.8%    0.18    3%       46.63    1%
11     10000    50%     20      0.98     2.3%   0.86    2.4%    0.86    2.4%     na       na
12     10000    5%      200     0.95     1%     0.89    1%      0.84    0%       na       na
13     10000    30%     5       1.11     1%     0.83    1%      0.83    7.7%     na       na
14     50000    30%     200     4.74     2.2%   4.38    2.2%    4.46    2.2%     na       na
15     50000    5%      5       4.52     1%     4.35    1%      4.54    1%       na       na
16     50000    50%     20      4.6      3.7%   4.31    3.7%    4.31    3.7%     na       na
17     500      30%     200     0.23     1%     0.05    5.6%    0.08    5.6%     0.22     1%
18     2000     5%      200     2.08     1%     0.22    1%      0.20    1%       1.92     1%
19     10000    30%     60      0.98     1%     0.84    1%      0.81    1%       na       na
20     50000    5%      60      4.5      3.7%   4.38    3.7%    4.42    3.7%     na       na

Table 6.5: Adaptability and Efficiency Test Results
Figure 6.12: Summary of results of Table 6.5. The topmost figure shows the results of AdOpt, the middle one shows results from Interior Point, and the bottom one shows results of the Simplex method.
the optimization problem.
Our first evaluation suite is based on pair-wise testing. This evaluation covers 95% of the input space, giving us confidence that most of the problems encountered can be handled.
Pair-wise testing generates test cases to cover all operational equivalence partitions. These partitions cater to the variations that are inherent in the system. We can partition based on our optimization system's classes or on our ME's variation classes. Since
our technique selects its behavior based on learned information, it is more appropriate to partition the system on the basis of the managed element's behavior. Variations in our ME are observed through three parameters: the size of the problem, the supply-demand gap, and the time allowed to calculate the results.
The variations in our data are closely related to the system description in the previous section. In our previous system, we provided three SAPE cycles to cater to different types of optimizations. The time to calculate in an ME varies with the SAPE cycle for which optimization is being performed; the times are approximately 10 seconds, 1 minute and 10 minutes.
Historically, in the authors' region, the supply-demand gap varies from 5% to 50%. Our partitions for the gap are 5%, 30% and 50%.
The size of the problem usually lies between a few hundred in winter and tens of thousands in summer. Our partitions for size are 500, 1000, 5000, and 50,000.
We used satisfice.com's ALLPAIRS tool to generate test cases. For the variations listed above, 20 test cases were generated.
A summary of the simulation results is provided in table 6.6. We compared our technique against stand-alone executions of the simplex, interior point and BIP solvers.
Our conclusions are as follows:
• AdOpt is effective in 100% of classes, whereas in comparison:
  – BIP is not applicable for 50% of classes
  – Simplex is not applicable for 10% of classes
Method           Average time (s)  Average error
Simplex          1.14              3.3%
Interior Point   1.15              3.4%
BIP              12.63             0.1%
Adaptable        2.47              2.28%

Table 6.6: Summary of simulation results for pair-wise testing
• AdOpt, though slower than simplex and interior point, is as effective as or better than any independent technique. In addition, AdOpt provides better utilization for 35% of classes.
It should be noted that the results and running times for all three algorithms are highly dependent on the input values. Secondly, the results in this test suite were obtained over equivalence classes; the percentage results for a running system will depend on the frequency of each class.
This comprehensive testing routine provides assurance that our system can handle, optimally, any variation of system behavior.
Our conclusions from this set of tests are:
• As size increases, un-utilized power increases. Time is a critical factor only if size > 5.
• In almost all of the cases UP is minimum with AdOpt, except in test case 8 of table 6.5 (2000 devices, 5 seconds). This is because the available time for this run is very small and the shortfall is 50%; the combination of these two key factors has an adverse effect on time.
• The efficiency of AdOpt is inversely proportional to size and shortfall, and directly proportional to time. That is, as size increases, efficiency decreases; similarly, an increase in shortfall has an adverse effect on efficiency, but as the allowed time increases, efficiency increases too.
• BP is not able to handle all the situations, but in situations where it is applicable its results are close to 1%. This is why we have not shown a time-versus-devices graph for BP.
CAISO Data-set evaluation
In our second set of experiments, we applied AdOpt to data collected in the state of California, USA. The California Independent System Operator provides its actual and predicted demand data online.
We collected hourly data for 7 days. The demand in this system varied between 16,000 MWh and 25,000 MWh. We interpolated this data onto our local problem. Power generation in the authors' region is usually 30% to 60% less than required. To simulate such a situation, we set the power supply at 10,000 MWh, which roughly equaled the 30-60% window.
Our motivation for testing the system with CAISO data is that California's data is available and the weather pattern in California closely resembles that of Pakistan. With this correlation, we can approximate the behavior of users in our country and simulate the results.
Discussion
We applied AdOpt and the simplex algorithm for hourly planning. Since the results for simplex and interior point are very close for large data-sets, we used only simplex for comparison in this experiment. The size of the problem in some cases is too big for BP to solve; hence BP cannot be considered as a stand-alone technique to solve the problem.
A summary of the results for the 7 days is shown in fig. 6.13. AdOpt provided better resource utilization on every day. For the 7-day period we generated plans with 27% better utilization compared to the LP solution.
To better illustrate our system, we focus on the busiest day for AdOpt: day 6. Fig. 6.14 shows the supply and demand variations on the 6th day of our data. As can be seen, the demand for power at the start of the day is low; it grows steadily until mid-day and then tapers off towards night time. If we use an LP-based solver to assign power, the un-utilized power stays in the range of 6% to 9.5%. In comparison, AdOpt's un-utilized power starts off close to 0% when the gap between power demand and supply was low; as the power shortfall grew, AdOpt updated its selection of algorithm to manage the increase in size. At the end of the day, when demand again dropped sharply, AdOpt also scaled down, applied a more conserving algorithm, and increased its saving.
Our comparison with LP yielded a 27% more efficient energy allocation. This leverage was achieved by applying a more conservative, albeit slower, technique at low-consumption times, mainly at night. In the daytime, AdOpt adapted to a more robust LP-based algorithm to handle the larger size/gap situation.
An interesting point to note here is that AdOpt was able to scale up and down immediately. The system did not take any convergence time when it grew or shrank sharply. This is in contrast to control-theoretic approaches, which aim for stability, so sudden changes are smoothed over and sudden jumps in the input space do not translate to a drastic change in the solution space.
We can summarize our findings as:
• AdOpt provides a 27% better service than LP.
• AdOpt can provide solutions to 100% of problems, whereas BP is not able to solve all the situations in a 24-hour period.
• AdOpt adapts to sudden change in real time. If a change in input necessitates an adaptation, then AdOpt's change is instantaneous.
Figure 6.13: AdOpt and Simplex comparison on 7 day CAISO data
6.4.6 Discussion
In this section we answer some of the questions that a reader may have regarding our framework. Are the simulations realistic? The data for our simulation comes from a live
Figure 6.14: Supply and demand comparison for day 6 of the CAISO data
system. Users can download usage data from the website; hence the projection of demand is actual. We know that in places such as Pakistan the supply-demand gap is as much as 50%, so our power provisioning of roughly 80-55% is a realistic scenario for the authors' country.
What is the breaking point of the system? We do not guarantee a plan if the supply is not enough for the guarantees. In such a case, the guarantees will need to be scaled down before a plan can be calculated. This scaling down is an administrative decision. Our framework, however, is able to incorporate such a change.
Savings? Our saving, or increase in profit, comes from allowing more users to consume electricity, thereby increasing user satisfaction and, indirectly, yielding more profit from higher and tighter consumption.
What about spike handling? Spikes are a reality in our systems. We cannot ignore them, but due to paucity of space we did not discuss the details of spike handling. Our all-pairs evaluations, however, did cater for spikes by testing the system for 10- and 60-second durations.
Why AdOpt when simplex works? It is true that we were able to allocate up to 92% of electricity through simplex alone, but with AdOpt we have increased this availability to 97%. This minor advantage means that in an area of 10,000 devices, 500 more customers will be satisfied than with LP alone.
Can this system be implemented? In some cities of Germany, systems which micro-manage heating to individual housing have been successfully implemented.
6.5 Adaptable Modeling Framework
Traditionally, models created for the optimization of systems are expressed as abstract mathematical models defined in a standard mathematical lexicon. When a system is to be deployed, its model is realized as code segments, equation matrices, or equation arrays, based on the solver being used for optimization. The dimensions of these matrices and the cardinality of the variables are usually defined at the time of deployment and are hard-coded in code segments, matrix dimensions, etc.
In comparison, for systems such as a DAS, system dimensions at the time of deployment are meaningless, because such a system can grow as well as shrink over time. To handle such changes, a measure of self-aware modeling integrated with self-optimization is necessary to manage a DAS. This self-aware optimization can leverage the change in the dimensions of the DAS at runtime to attain scalability and a performance boost according to the runtime state of the DAS.
Various systems have been optimized through mathematical models. However, in all of the applications of mathematical techniques seen so far by the authors, the constraints and tuning parameters were known when the system was being implemented [Femal and Freeh, 2005, Javed and Arshad, 2008, Jabr et al., 2000]. We have not observed any detailed work on engineering a system model that exhibits variability in the size of its constraints and control features.
Therefore, in our modeling framework we use the abstract mathematical model as a meta-model to create an on-demand, instantaneous model of the system based on system statistics. In this section we define our modeling framework for constructing an instantaneous model of a system at runtime.
6.5.1 Structure of the Mathematical Meta-Model
In practice, mathematical models are developed and expressed as abstract models. Mathematical models represent a system in the form of decision variables and constraints. Decision variables are the controlling parameters that change the system state, whereas constraints are the limitations of the system. Since in mathematics a variable can take any numeric value, it is important that we specify the limits of our decision variables as well.
To model a system, the control parameters and limitations of the system are analyzed. A system can be composed of many control parameters, but usually there exist logical groupings with which these control parameters can be abstracted into a single entity or class. Usually this also means that similar constraints apply to each element of the grouping. It also means that a single abstract equation with appropriate quantifiers can suffice to contain the behavior of all the variables within a group. Since these are logical groupings and resemble a set-like structure, we call these variable abstractions the ontologies of our system.
Hence an ontology is a group of control parameters which have a similar logical structure and are subjected to similar constraints. Like sets, ontologies can be grouped together to form more inclusive notation. Mathematically, this means that while two different logical groups of variables, or ontologies, are subjected to their own constraints, there can also be a set of constraints that is applicable to both groups. Hence our decision variables can be part of a multitude of ontologies. Here a subscript defines the specific element within an ontology. We call such a grouping of ontologies an ontological class. Figure 6.15 describes the abstract model that we will discuss in detail here.
Adaptive Modeling for AdOpt
We consider making a meta-model for planning in AdOpt. We divide our devices into ontologies according to their consumption profiles and time periods. Our task is to maximize the number of machines from each set which can be kept in the "on" state for a particular period in an hour without violating the service-level guarantee. Here the number of machines to keep in the "on" state in a particular time period is our tuning parameter or "decision variable". For each tuning parameter there are two ontologies. First, there are different sets of machines; each type is represented by a subscript i. The second attribute is time, that is, which time period a specific decision variable represents; these are represented by a subscript t. Hence i and t represent two ontologies combined in a single decision variable X_{i,t}.
The system in figure 6.15 is subject to three classes of constraints. Each class is represented as a single abstract equation. Notice that equation 6.29 is only applicable to one ontology, the time t, while the other two are subject to both. To demonstrate our framework we will consider the example of equation 6.28 in detail. This equation constrains the system by enforcing a minimum service level: it states that for every time period t, the number of machines switched on in every machine class i should not be less than 1/3rd of the total number of machines in that class.
During implementation these abstract models are expanded according to the available system statistics. If our system had fixed machine classes, say 10, and 6 time periods (t), the abstract equation 6.28 would have been expanded into 60 equations, each representing one specific (t, i) tuple.
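The expansion just described can be sketched as a pair of nested loops over the two ontologies; the per-class supply values below are illustrative placeholders.

```python
# Expanding meta-equation 6.28 -- for all t, for all i:
#   X[i,t] >= supply[i] / 3
# -- into one concrete inequality per (i, t) tuple, as described in
# the text (10 machine classes x 6 time periods = 60 equations).
# The supply values are illustrative placeholders.
n_classes, n_periods = 10, 6
supply = [90 + 10 * i for i in range(n_classes)]

constraints = []
for t in range(n_periods):
    for i in range(n_classes):
        constraints.append((f"X[{i},{t}]", ">=", supply[i] / 3))

print(len(constraints))  # → 60
```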
Mathematical models for systems which do not exhibit a change in cardinality from the abstract model to the implemented model can be modeled effectively. That is, if at the time of implementation or deployment we can enumerate how many machines and how many time segments we have, then generating an actual model of the system from the abstract model is straightforward.
However, if the cardinality cannot be evaluated at the time of implementation, then modeling becomes a difficult task. A naive modeling technique is to consider the worst-case scenario: for example, in the sample model above, we limit i, the number of device classes, to say 1000 and then make a model for that many classes.
For a grid-level electric distribution network this solution is not feasible. First, the number of device classes cannot be predicted; new types of machines are being added every day and limiting this growth is not possible. Second, a worst-case setup is highly inefficient: by always calculating for 1000 classes, we consume many more resources than the fraction of these calculations we might actually need. Third, because we always assume a large data-set, the choice of algorithms is limited. There are algorithms which are more efficient for small to medium-sized data-sets; if we can evaluate and model at runtime, it is possible to derive a better result by using more accurate algorithms.
6.5.2 Modeling at Runtime
Various techniques exist for creating a runtime model of a system. These efforts are usually intended for architectural and operational runtime modeling. We observed that these modeling frameworks have some commonality in how they process their task: usually a runtime modeling framework defines a set of primitive artifacts with defined semantics; at runtime these artifacts are instantiated and replicated, and relationships among the artifacts are established [Pickering et al., 2009, Goldsby and Cheng, 2008, Kuhn and Verwaest, 2008]. There are various methods to extract information from a system and various uses for the modeled systems, but these are beyond the scope of runtime model generation.

    Maximize  Z = Σ_{i,t} X_{i,t}            (6.27)
    ∀t ∀i:    X_{i,t} ≥ supply_i / 3         (6.28)
    ∀t:       Σ_i µ_i X_{i,t} ≤ supply_t     (6.29)
    ∀i ∀t:    X_{i,t} ≤ MAX_i                (6.30)

Figure 6.15: Hourly planning LP equations
The underlying architecture of our framework is similar to these runtime modelers. The difference is that we use the components of abstract mathematical models as our primitive artifacts. Specifically, the abstract mathematical model defined for the system is used as a meta-model. The primitive artifacts for us are the ontological classes. When we observe an object, or a variable, belonging to a specific ontology, we create a corresponding ontology object for it in our mathematical model. This process is covered in the modeling-of-ontologies step (step 1, defined below).
The equations of our meta-model define the relationships between the different variables. Once we determine the cardinality of the ontological classes, we develop the relationships of ontologies by exploring the equations one by one, setting up the constraints and limitations of the system in the process. This process, and the production of the complete model, takes place in the modeling phase.
This runtime modeling is a three-step process. Our framework first determines the system statistics to define the cardinality of the ontologies. In the second step, it determines the cardinality of the relationships and the number of equations each meta-equation will generate. The third step uses these cardinalities to create an instantaneous model. The second and third steps are closely related and their implementations are intertwined. However, since step 2 is platform independent and step 3 depends on the solvers, merging the two steps is avoided wherever possible.
The description of the phases is given below.
Modeling of Ontologies
Modeling of ontologies is a two-step process. First we pre-process our data to reduce the dimensions of the input data.
The input to our system consists of raw usage data for devices. In pre-processing we reduce the dimensionality of the raw usage data using a clustering algorithm. The details of this dimension reduction are discussed in our previous work [Javed and Arshad, 2009b]. This pre-processing is required due to the nature of the problem. In other works, such as Femal and Freeh's use of LP, such pre-processing would not be required [Femal and Freeh, 2005]; for such models, direct evaluation is possible.
Modeling of ontologies determines the cardinality of each ontological class. In our model, there is only one ontological class, X. This ontology in turn is composed of two co-dependent ontologies: the time interval, represented by subscript t, and the instance of a cluster, represented by subscript i. We consider 6 time intervals for our problem; however, this number can also be changed at runtime.
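As a rough sketch of this pre-processing step, the following groups synthetic 24-hour usage profiles into device classes with a plain k-means; the profiles and k = 3 are illustrative, and the thesis uses its own clustering technique from [Javed and Arshad, 2009b] rather than this one.

```python
import random

# Toy sketch: cluster raw per-device usage profiles into a small
# number of device classes (the ontology index i). Profiles are
# synthetic 24-hour consumption curves around three mean levels.
random.seed(0)
profiles = [[random.gauss(mu, 5) for _ in range(24)]
            for mu in [20] * 40 + [60] * 40 + [120] * 40]

def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each profile to its nearest center.
            j = min(range(k), key=lambda c: sum((a - b) ** 2
                        for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster; keep the
        # old center if a cluster happens to be empty.
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl
                   else centers[j] for j, cl in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans(profiles, k=3)
print([len(c) for c in clusters])   # 120 devices grouped into 3 classes
```

The resulting cluster count then fixes the cardinality of the i ontology for the steps that follow.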
Modeling of Relationships
A mathematical model is a representation of a system in terms of equality and inequality equations. These equations define the constraints and limits of the system.
Our framework first distinguishes between equality and inequality equations. Though both are evaluated in the same way, in the construction step a different matrix is generated for each of these equation genres.
In this step our framework uses the cardinalities of the ontological classes to expand the quantifiers. Each quantifier expands some ontological classification. For example, a ∀X_i quantifier translates to one equation per instance of ontology i within the ontological class X. In addition, the coefficients and right-hand sides of these equations are also determined in this step, as constants are sometimes associated with a specific instance of an ontology.
Similarly, equation 6.28 has a (∀t∀i) quantifier. Hence this meta-equation is expanded into i × t equations, since an equation is created for each (i, t) tuple. The equation states that the coefficient of the (i, t)th decision variable is 1. So for each new equation expanded from meta-equation 6.28, the coefficient for variable Xi,t will be one and all other variables will have coefficients of zero. The equation also states that its right hand side will have the constant value supplyi/3, where supplyi is the size of the cluster set Xi as determined in step one. Hence for each equation the correct corresponding value of supplyi/3 is placed.
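This expansion can be sketched as follows. The `supply` values below are illustrative, not taken from our data set; in the framework they come from the pre-processing step.

```python
# Hedged sketch of quantifier expansion for a meta-equation of the form
# (forall t)(forall i):  1 * X[i,t] < supply[i] / 3.
# The supply values here are illustrative; the real ones come from step one.
def expand_forall(supply, n_intervals):
    rows = []
    for i, s in enumerate(supply):
        for t in range(n_intervals):
            # one expanded equation per (i, t) tuple
            rows.append({"var": (i, t), "coeff": 1.0, "rhs": s / 3.0})
    return rows

rows = expand_forall(supply=[18, 9, 30], n_intervals=6)
print(len(rows))        # 3 clusters x 6 intervals = 18 expanded equations
print(rows[0]["rhs"])   # 18 / 3 = 6.0
```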
Model Construction
A mathematical model can be represented in different forms. One of the most commonly used forms to represent mathematical models in computing systems is the matrix form. Since arrays and matrices are realizations of the same phenomenon, we will discuss how we created matrices from the results of the previous steps.
In matrix notation, a series of linear inequality equations are represented as:
A× x < b
and a series of linear equality constraints as:
Ae× x = be
Here x is a vector representing the variables, b is a vector of right hand side constants for the inequality constraints, and be a vector of right hand side constants for the equality constraints. Similarly, A is the matrix of coefficients of x for the inequality constraints and Ae for the equality constraints.
Similar generalizations exist for non-linear systems but are beyond the scope of this work.
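The matrix form above can be made concrete with a small sketch. The helper below is an illustrative stand-in for what a solver does when checking a candidate solution against A × x < b and Ae × x = be; it is not part of our framework.

```python
# Minimal sketch of the matrix representation A*x < b and Ae*x = be,
# in pure Python (the thesis implementation used Matlab's toolbox).
def satisfies(A, b, Ae, be, x, tol=1e-9):
    def dot(row):
        return sum(a * v for a, v in zip(row, x))
    # every inequality row must hold strictly (up to a tolerance)
    ineq_ok = all(dot(row) < rhs + tol for row, rhs in zip(A, b))
    # every equality row must hold exactly (up to a tolerance)
    eq_ok = all(abs(dot(row) - rhs) < tol for row, rhs in zip(Ae, be))
    return ineq_ok and eq_ok

# Two variables: x0 + x1 < 10 (inequality), x0 - x1 = 2 (equality).
A, b = [[1.0, 1.0]], [10.0]
Ae, be = [[1.0, -1.0]], [2.0]
print(satisfies(A, b, Ae, be, [4.0, 2.0]))  # True
print(satisfies(A, b, Ae, be, [9.0, 7.0]))  # False: 9 + 7 exceeds 10
```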
Though the equality and inequality constructs are almost identical, solvers accept them as two different sets of matrices. We construct both sets of matrices in a similar fashion.
The process of constructing the matrices is as follows. We first determine the vector x. We use the word determine because x is not constructed in matrix form per se; rather, x is considered as an ordering of the decision variables. Decision variables, if we recall, are the instances of the various ontological classes created in step 1. Fixing the order does not change the execution of the algorithm, so any convention which completely covers the ontological class space is sufficient. However, fixing an order is necessary, as this order determines the placement of coefficients in matrices A and Ae.
Our model has a single ontological class of decision variables, Xi,t. We fix an order for expanding the two dimensional space of X by arranging rows before columns. This step fixes our x vector.
Our framework proceeds by processing the equations determined in step 2. For each inequality constraint, a row is added to matrix A and one to b; a similar step is executed for each equality constraint, but on matrices Ae and be. In a newly added row of A, all elements are zero except the ones specified by the equation. The constant values for the coefficients of A and the value in b are then placed. This step is repeated for all the equations generated in step 2, at the end of which the complete matrices A, Ae, b and be are produced.
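The ordering and row-construction steps above can be sketched as follows. The names are illustrative, and the "rows before columns" convention shown here is one possible choice; its column numbering may differ from the numbering used in the running example below.

```python
# Sketch of the construction step: fix a row-major ordering of X[i,t],
# then append one row per expanded equation (all-zero except the
# coefficients specified by that equation).
def var_index(i, t, n_intervals):
    # "rows before columns": all intervals of cluster 0, then cluster 1, ...
    return i * n_intervals + t

def add_row(A, b, n_vars, entries, rhs):
    row = [0.0] * n_vars            # all elements zero by default
    for col, coeff in entries:      # place only the specified coefficients
        row[col] = coeff
    A.append(row)
    b.append(rhs)

n_clusters, n_intervals = 50, 6
n_vars = n_clusters * n_intervals   # 300 columns in A and Ae
A, b = [], []
# e.g. an expanded equation 1 * X[9][1] < 6 (0-based indices):
add_row(A, b, n_vars, [(var_index(9, 1, n_intervals), 1.0)], 6.0)
print(len(A[0]), sum(A[0]))  # 300 columns, a single non-zero coefficient
```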
6.5.3 Running Example
We now describe the construction of a row for equation 6.28. Let us assume that 50 clusters were created during our pre-processing and that we have 6 time slots. This means that step 1 will provide us with the value of 300; this is the number of decision variables in our system. For the model generating step, this means that the size of the x vector will be 1 × 300 and matrices A and Ae will have 300 columns.
Let us assume that cluster number 10 has 18 elements. Our equation for the second time period from step 2 will then look like the following:
1 × X10,2 < 6
Our model construction will construct the following row for this equation in matrix A.
Column   1 .. 61   62   63 .. 300
Value    0 .. 0    1    0 .. 0
In addition, it will add a row in matrix b and put the value 6 in the newly added row.
A complete matrix A thus has i × t columns and contains i × t equations for meta-equation 6.28, t equations for meta-equation 6.29, i × t equations for meta-equation 6.30, and a solitary equation for meta-equation 6.27.
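The row count for the running example can be tallied directly from these counts:

```python
# Worked tally of the equations generated for the running example:
# 50 clusters (i) and 6 time slots (t).
i, t = 50, 6
rows_628 = i * t   # one equation per (i, t) tuple for meta-equation 6.28
rows_629 = t       # one equation per time slot for meta-equation 6.29
rows_630 = i * t   # one equation per (i, t) tuple for meta-equation 6.30
rows_627 = 1       # a solitary equation for meta-equation 6.27
print(rows_628 + rows_629 + rows_630 + rows_627)  # 607 rows in total
```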
6.5.4 Evaluation
We have designed a framework for modeling the optimization of large scale power systems. Conservation of power through optimizing the usage of end user devices is not a new concept. However, to our knowledge very few techniques are available which are scalable and efficient enough to achieve this goal. So far the major work in this field has been performed on fixed-size systems where the number of devices is known at design time. The models of such systems are built before deployment, based on the largest possible or worst case deployment of the system [Ashok, 2007, Galus and Andersson, 2008].
Our system engineers the model at runtime instead of populating the variables of a fixed model. Therefore our evaluation compares the existing modeling methods for a similar smart-grid application with our runtime modeling results. We claim improved performance using two key metrics: first, our response time is faster than that of a fixed model; second, we claim better efficiency in achieving the goal of the optimization, i.e. in distributing power to the consumers.
The aforementioned 'efficiency' of our electric distribution refers to the unutilized power (UP) in the system that an optimization is unable to distribute amongst the electric devices. The details of why such unutilized power exists are discussed in our previous work [Javed and Arshad, 2009b]. We would like to state here that the increased efficiency in our example system arises because we modeled it such that a decrease in model size increases efficiency. Thus our efficiency results are applicable when the system can be, and is, modeled in a way which relates efficiency to the size of the model.
Our evaluation thus tests the following hypothesis: does modeling at runtime, for a system that varies in size and structure, result in benefits in terms of time or efficiency? To test this hypothesis we used two sets of real data collected from two different sources. We split the evaluation in this way for two reasons: first, consumption data of individual users for a city is not readily available; second, this split analysis proves the applicability of our framework for systems both large and small.
Our first set is a small but detailed study of household energy use in Sollentuna, Sweden, performed over the course of two years. Experiments on this data are used to show a correlation between total consumption, time to calculate, and the number of users. Our second experiment uses data from the state of California, USA. In this experiment we apply our modeling framework to a large scale data set and observe the benefits in terms of efficiency.
Evaluation Setup
For our evaluations we used a shared 2.4 GHz Pentium Core 2 Duo processor with a total of 2.00 GB of RAM. The mathematical solvers used were from Matlab's optimization toolbox.
Evaluation Data Details
Our first experiment uses hourly consumption data from approximately 700 houses collected in Sollentuna, Sweden for the years 2005-2006. Through this experiment we validated the following:
• There exists a strong correlation between the time taken for optimization by the dynamic modeler and the consumption of energy.
• There is a weak correlation between a fixed model optimization and the consumption of energy.
• There exists a strong correlation between the total demand for energy and the number of consumer clusters.
Whereas the first two claims support the case for dynamic modeling, the last claim helps us construct a more powerful scenario for validating the scalability and applicability of our modeling framework.
Our modeling framework can model and optimize systems which vary in size. The real benefit of the system is attained when the variation in size is considerable and the scale of the optimization is large, since a small scale LP optimization in itself takes insignificant time. To test our framework on a large scale realistic system we use data published by CAISO. This data consists of the daily usage of electricity in the state of California, USA; a sample is provided in figure 6.16. However, this data is incomplete for our modeling, since we require the usage patterns of individual users and not just the total consumption of the system. To overcome this problem, we artificially constructed the clusters of users by dividing the total consumption according to a Gaussian distribution. A Gaussian distribution was used because it was the most appropriate and simple distribution to represent the natural behavior of a large number of users. Though the distribution of load has only a minor impact on the overall performance, we still consider modeling and evaluating the system with different distributions as part of our future work. To validate this construction further, we used results from our first experiment set: even though it intuitively makes sense that increased consumption means an increase in the number of consumers, we base our argument for constructing the usage patterns of individual users on the correlation found between consumption and users in our first experiment.
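The synthetic-cluster construction can be sketched as follows. The cluster count and the Gaussian parameters (`mu`, `sigma`) here are illustrative assumptions, not the values used in our experiments.

```python
# Hedged sketch of the synthetic-cluster construction: an hourly total is
# divided over clusters whose shares follow a Gaussian profile, then
# rescaled so the shares sum back to the published total.
import random

def split_total(total_mwh, n_clusters, mu=1.0, sigma=0.25, seed=42):
    rng = random.Random(seed)
    # draw positive Gaussian weights, one per artificial cluster
    weights = [max(rng.gauss(mu, sigma), 0.01) for _ in range(n_clusters)]
    scale = total_mwh / sum(weights)
    return [w * scale for w in weights]

loads = split_total(total_mwh=30000.0, n_clusters=50)
print(len(loads), round(sum(loads), 3))  # 50 clusters summing to the total
```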
In the following sections we describe the standard modeler, i.e. the modeler simulating the prevalent modeling methods in the smart-grid literature, and our dynamic modeler, using the aforementioned sets of data. The first set of data validates the correlations and the second set validates the scalability and efficiency of our framework in a large scale environment.
Standard Modeler
Smart-grid techniques which focus on global optimizations, such as [Ashok, 2007], [Izquierdo et al., 2008], [Jabr et al., 2000], and [Wang et al., 2002], build models for the worst case scenario. Without a runtime modeling framework this is necessary, because updating the system model manually at runtime is not possible.
Figure 6.16: Consumption profile of California for a day as published by CAISO (consumption in MWh)
For a system such as our micro-management application for smart-grids, a model using the standard method means constructing a model for the worst possible day throughout the life cycle of the system. Instead of simulating this scenario, we only consider the cluster configuration for the worst hour of the day on which we conducted our experiments. Note that this is not the worst case or largest configuration over the system life cycle; however, it provides a sufficient comparison, since our technique proves faster even against it. We use the number of clusters as the metric here because the size of the model depends on the number of clusters for each hour. We used standard k-means clustering on the input data, where k is the worst-case cluster count for the day. These k clusters and their frequencies populate the fixed input matrix for the optimizer.
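The standard modeler's setup can be sketched as below. This is a minimal 1-D Lloyd's iteration for illustration, not the clustering implementation used in our experiments; the sample values are invented.

```python
# Illustrative sketch of the standard modeler's setup: k-means over the
# hourly consumption values, with k fixed to the worst-case cluster count.
def kmeans_1d(values, k, iters=20):
    # crude initialization: spread centroids across the sorted values
    centroids = sorted(values)[:: max(len(values) // k, 1)][:k]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # move each centroid to the mean of its assigned values
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    freqs = [len(g) for g in groups]
    return centroids, freqs

values = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.9, 10.1]
centroids, freqs = kmeans_1d(values, k=3)
print(freqs)  # the cluster frequencies populate the fixed input matrix
```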
6.5.5 Evaluation Results
Swedish Household Consumption Data
Our first experiment uses the data collected from Sollentuna, Sweden. The data consists of the electricity consumption of a suburb of Sollentuna, collected at an interval of 1 hour. We use these consumption profiles as input for both our dynamic modeling framework and the standard modeler. We conducted the experiments multiple times and considered the mean of the runs to deal with operating-system-related noise in the response time; the response time for the small data-set is small enough to be affected by background processes of the operating system.
The execution times for the dynamic modeler and the standard modeler, together with the total demand of the system, are shown in figure 6.17. Here the line with square points represents the time for the standard modeler in seconds, the line with diamond points the response time of the dynamic modeler, and a third line the total energy demand in MWh. As can be observed, there is a correlation between the demand and the response time of the dynamic modeler: the Pearson correlation coefficient for these observations is 0.75. On the other hand, the relation between the response time of the standard modeler and the demand comes out as weakly inverse (-0.3, Pearson). This validates our first two claims: a strong correlation exists between the time taken by the dynamic solver and the total demand of the system, and a fixed size modeler is not able to benefit from changes in demand.
Our third claim is illustrated by the graph in figure 6.18. Here the dotted line represents the total demand for each hour, the line with square points represents the number of users, and the solid line with triangle points represents the cluster count. Here we can see the relation between the number of consumption clusters and the total consumption: a strong correlation exists between the number of clusters and the total consumption (Pearson coefficient 0.83).
We can thus conclude from this experiment that a strong correlation exists between the consumption, the number of users, and the time taken by the dynamic modeler. Furthermore, no such correlation was observed between the standard modeler and the total consumption.
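The Pearson coefficients quoted above are the standard sample correlation; a minimal sketch of the computation, on invented data, is:

```python
# Sketch of the Pearson correlation used to quantify the relationships
# above (dynamic-modeler time vs. demand, cluster count vs. demand).
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfectly linear pair correlates at 1.0; an inverse pair at -1.0.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # ≈ 1.0
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # ≈ -1.0
```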
CAISO Data
Our second evaluation compares our modeling framework with the standard modeling method on the criteria of running time and efficiency, as if we were to distribute electricity in the state of California using our method. We evaluated our system by running both systems on 24 hours of data from a power distributor's profile.
We observed that our framework's execution time was considerably lower than that of the standard modeler. Figure 6.19 plots our framework's time against the standard modeler's time. Here the squares represent the time in seconds the standard modeler required to model and optimize the data for that specific time period, and the diamonds represent the time in seconds for our runtime framework. It can be observed that runtime modeling is considerably faster throughout, except in two cases, the 6th and the 21st periods. These are the cases where the size of the runtime model was at its maximum and both models were of similar size. We witnessed on average a 56% better response time than the standard system.
Figure 6.17: Response time for dynamic and standard modeler in comparison to demand (response time in seconds)
Figure 6.18: Comparison between demand, clusters and active users for a 24 hour period as observed in Sollentuna, Sweden
Figure 6.19: Solver time for 24 hours of CAISO data (response time in seconds)
Figure 6.20: Solver efficiency for 24 hours of CAISO data (power allocation in kilowatts)
Our second evaluation goal was to achieve better performance. Figure 6.20 plots the power allocated by the runtime framework and by the standard modeler. Here diamonds represent the runtime framework's allocation of power in megawatts and squares represent the standard modeler's results. We observed a marginal improvement in the allocation of power: the total increase in power allocation was close to 2%, which is significant for a large scale system.
Our results show that our runtime modeling framework is faster than a static modeling method; our runtime modeler is approximately 50% faster than the standard modeler. Furthermore, we observed that we can achieve better performance for our specific model through the use of dynamic runtime modeling.
6.5.6 Future Dimensions of Runtime Modeling
Dynamic modeling intuitively leads to a more efficient optimization. Since the model only consists of variables and constraints that are applicable at that instance, a more streamlined and concise model is constructed, resulting in faster optimization and better results.
From the study of applications in smart grids, cloud computing and other fields where adaptable behavior is anticipated, we see that rigidity of structure will not be guaranteed in our future systems. We have seen optimization applications in smart-grids, such as applications for Plug-in Hybrid Electric Vehicles (PHEVs) [Galus and Andersson, 2008], where the demand pattern of the users is an evolving phenomenon. From a modeling perspective this means that the relationships and constraints for the system will be described at runtime. An even more appropriate comparison is the work of Ogston and colleagues, who define an adaptive clustering method to group together various devices [Ogston et al., 2007]. The technique is scalable for clustering the devices in a city, and the resulting clusters, their patterns, frequencies and shapes will emerge at runtime. If we are to use this data to manage these devices, then runtime engineering of a model that considers the new clusters and patterns will be necessary.
Our modeling framework provides the basis for engineering models for such techniques of the future. Although our existing work caters for LP, the three step engineering process described in section 6.5.2 for creating models is more or less the same for modeling non-linear, integer and some heuristic optimizations. Our work not only provides a solution for the smart grid problem but also provides a foundation for future dynamic modeling with these modeling techniques.
Our current framework is a proof of concept and requires engineering to integrate a meta-model into our framework. In our future work we look at ways to bridge this gap. We are working on evolving a method to define abstract mathematical models in a language which our framework can understand and create a meta-model from. To this end we are evaluating
various modeling languages and are planning to include a translation engine which will translate abstract mathematical equations into a meta-model. Such work will streamline the integration of our framework with existing optimization platforms.
Our second direction looks at ways of determining constraints from system statistics. In our current framework, the cardinality of the system constraints is determined solely by the cardinality of the quantifiers. However, systems which can "sense" constraints through statistical analysis can produce much more powerful modelers.
Our third direction of interest is the integration of our framework and optimizers with physical infrastructure to implement the optimization of resources. A running system of this sort will be of real benefit to society.
6.6 Conclusion
In this chapter we have provided three methods for a scalable, dynamically modeled scheduling component which self-optimizes its accuracy based on the size of the system. The proposed methods have three distinct advantages over the static planners proposed in the literature. First, the system is able to plan for large scale scheduling. Second, the system is able to trade accuracy against size for optimal scheduling. Third, the system is able to model itself at runtime to increase its accuracy without the involvement of human operators.
Chapter 7
Conclusion and Future Work
Demand side management (DSM) is the task of managing end user consumption for the optimal provisioning of electricity. Whereas DSM for large customers, or at a coarser control granularity, has been deployed for some years now, applying the same strategies to domestic devices through manual control has been considered infeasible due to the complexity of the task and user fatigue [Kim and Shcherbakova, 2011]. What is required is a self-managing system which can automate the task of energy management. However, fine-grained control of energy devices is a very hard problem due to the complexities of volatility and size.
The demand for energy in a house is extremely volatile when compared to the loads of a city or of large scale industry. Forecasting such load using existing modeling paradigms yielded inaccurate results, and additional data about the consumer did not provide any increase in accuracy. Even if a perfect forecast were available, scheduling devices for optimal use reduces to an NP-complete scheduling problem, making it intractable.
In this thesis we have presented methods and techniques to resolve the volatility and size issues gracefully and deduce the best possible planning for a DSM system. These techniques work within a self-managing control loop which seamlessly integrates them into a self-managing demand side management system. In this chapter we summarize the results from the different cogs that together make up a comprehensive self-managing DSM system.
7.1 Summary
In chapter 1 we argued that we require self-managing demand side management for domestic consumers. This self-management is needed to intelligently automate the task of DSM, which would relieve the human consumer and operator from the continuous monitoring and planning of devices. Our strategy is to control the heavy, high-usage loads, since they have the biggest impact on consumption. To provide a fair and equitable distribution, we restrict our plans to the limits prescribed by service level agreements. A service level agreement is a contract between a utility provider and a consumer limiting the maximum load shedding that can be done over a period of time.
To plan such devices we require a scalable planning strategy, which we discussed in chapter 6. To construct such a plan we require a forecast of device consumption; due to severe volatility, constructing such a profile is non-trivial. In comparison, forecasting for a house and then disaggregating the peak load was found to be a more feasible solution. In chapter 4 we discussed the forecasting paradigm that we introduced to forecast household loads, and in chapter 5 we discussed the disaggregation results proving the effectiveness of this strategy for providing heavy-load data for planning. In chapter 3 we presented an architecture which ties these cogs together to deliver self-managing demand side management. In this chapter we will in turn summarize the results for the three cogs of the architecture.
7.1.1 Planning
Scheduling devices under a set of constraints is an NP-complete problem; thus scheduling loads exactly, even for a small population, is not possible. To resolve this issue we transformed our problem into a frequency demand through clustering and then applied linear programming to find the frequency for each clustered group (chap. 6, sec. 2). This transformation introduced an error of up to 6% in the optimality of the solution but made the problem scalable. There are two conclusions from this strategy. First, we can reduce loads by as much as 30% through this strategy. Second, we showed that through this transformation we are able to plan for hundreds of thousands of devices within the time constraints. We also show that the transformation and optimization are fast enough for us to replan the system in case the underlying system changes and the data becomes invalid.
An observation from the energy system is that the number of devices in the system is not constant. Since the heating and cooling loads depend on the weather, the number of devices requiring energy varies over time. Such variations provide us with an opportunity to improve our optimization performance. Though the transformation is necessary for large scale optimization, for a small scale problem of close to 500 devices an exact solution is possible if sufficient time is provided to the optimizer. Our results showed that through this adaptable optimization we are able to perform better than the transformation strategy in 75% of the cases (chap. 6, sec. 3). Furthermore, AdOpt is capable of handling all the scalability variations in the system, whereas the exact integer programming solution can only cover 60% of the cases.
Since the size of the system varies, the model dimensions should also vary. However, generating a model at runtime is a non-trivial problem. In chapter 6, section 4, we presented a dynamic modeling technique. This technique varies the dimensions of the model by observing the types of systems and their frequencies and builds an optimal model at runtime. Our results show that such modeling marginally reduces the size of the system, resulting in faster calculations and, in the case of the transformation, a lower error.
The task of planning is strongly dependent upon knowledge of the consumption data of the devices. The planning requires as input the number of controllable devices that are predicted to be in use in the next hour. But forecasting at the device level is a very difficult task due to the volatility of the data. However, if we can forecast the house and disaggregate the load to find the usage of each device, then the planning system can be deployed. Next we discuss the forecasting methodology, followed by the disaggregation results.
7.1.2 Forecasting
Forecasting energy load in general is a non-trivial task. However, advances in forecasting have resulted in good strategies for forecasting regional or city-wide loads. The major innovation which led to this advance was incorporating weather and other related data which impact the consumption of electricity in a region. But for household loads, the extreme volatility of the data makes the forecast difficult. Though it can be argued that this volatility can be characterized to a certain degree by observing the occupants and the structure of the house, studies correlating consumption with these attributes did not yield positive results.
However, we observed a subtle but strong temporal relationship between the attributes and the consumption. By temporal we mean that the attributes showed varying correlation with the consumption over time. Although it was difficult to quantify this, we used this subtle relationship to improve the forecasting accuracy. Existing short-term load forecasting methods usually apply a technique to a single house; but due to the extreme volatility and small sample space, the model was inaccurate and showed signs of over-fitting. In some cases, forecasters grouped together similar houses based on some parameter and then forecasted for the entire group; since the correlation across attributes for all times was weak, the forecast accuracy in such cases was also low. We instead built a multi-dimensional model where each attribute, including the hour of day, served as a dimension of the model (chap. 4). Data from all the houses was used to train the model, where the hour of the day and the house attributes became the parameters of the model. We used a back propagation neural network for this task. The internal mechanism of the neural network identified the temporal relation of loads with attributes and provided us with better results than the existing modeling paradigm.
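The multi-dimensional input encoding can be sketched as below. The attribute names, the cyclic hour encoding, and the tiny randomly-initialized network are illustrative assumptions only; they do not reproduce the network used in chapter 4.

```python
# Hedged sketch of the multi-dimensional input: hour of day and house
# attributes together form one feature vector per training tuple.
import math, random

def encode(hour, attributes):
    # encode the hour cyclically so hour 23 and hour 0 are neighbours
    angle = 2 * math.pi * hour / 24
    return [math.sin(angle), math.cos(angle)] + list(attributes)

def forward(x, w_hidden, w_out):
    # one hidden layer with a sigmoid activation, linear output
    def sig(z):
        return 1.0 / (1.0 + math.exp(-z))
    hidden = [sig(sum(wi * xi for wi, xi in zip(row, x))) for row in w_hidden]
    return sum(wo * h for wo, h in zip(w_out, hidden)), hidden

rng = random.Random(0)
# hypothetical attributes: floor area, occupant count, electric-heating flag
x = encode(hour=18, attributes=[120.0, 4, 1])
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in x] for _ in range(5)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(5)]
prediction, _ = forward(x, w_hidden, w_out)
print(len(x))  # 5 input dimensions: 2 for the hour, 3 house attributes
```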
This multi-dimensional model was better able to capture the temporal correlation of attributes with consumption. This resulted in an increase in accuracy by as much as 50% and a reduction in mean squared error by as much as 39%. However, even with such improvement our results are not very accurate: the maximum achieved accuracy is only 65% and the least variance is close to 2. Secondly, our planning algorithm requires the heating and air-conditioning device states and not the total energy profile. To identify heating and cooling loads, we disaggregated the forecasted load, which we discuss next.
7.1.3 Load Disaggregation
Load disaggregation is the task of identifying an individual device's load profile or state by observing only the total load. Load disaggregation is usually used for non-intrusive load monitoring (NILM), identifying device usage in real time. The data for NILM is usually a time series at a very high frequency, ranging from 16 kHz down to a reading every second. However, our data is not realtime but is rather a forecast, at the level of one reading per hour. On the positive side, we only require disaggregation for the heavy heating and cooling loads, which are very pronounced in the load signatures.
To achieve this disaggregation we applied a combination of neural networks and support vector machines (SVM) to disaggregate the heating load from the forecasted data (chap. 5). Our main concern in this task was to find all the high consumption devices which are switched on. Thus the important parameter is accuracy: the percentage of times we identified the usage of a device. Our results show that we can attain an accuracy of 99% or more with the ANN-SVM combination for disaggregating our target loads.
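The accuracy figure above is the fraction of correctly identified on/off states. A minimal sketch of the metric, on invented state sequences, is:

```python
# Sketch of the accuracy metric used above: the fraction of hours for
# which the on/off state of a heavy load was identified correctly.
def state_accuracy(predicted, actual):
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

predicted = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # illustrative on/off states
actual    = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(state_accuracy(predicted, actual))  # 0.9
```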
However, our forecast is not perfect: its accuracy varies between 52.5% and 65%, with variance ranging between 2 and 4.23. To simulate the forecasting error and characterize the disaggregation results over an inaccurate forecast, we incorporated noise corresponding to the worst forecast numbers into the load data. Our results show that even with a faulty forecast we can achieve an accuracy of 97%.
7.1.4 Putting It All Together
In this thesis we presented a method to forecast energy load with higher accuracy than existing methods. We then presented our findings on the ability to identify heavy loads from this forecast. We then showed that if we have the list of devices which are under our DSM plan, then we can reduce energy demand. Furthermore, we can adapt this planning based on system size, yielding an exact solution or gracefully reducing accuracy in favor of a scalable solution for large scale data. In chapter 3 we presented a way to integrate these technologies to deliver a single self-managing DSM solution.
7.2 Lessons Learnt
In this section, we present a list of lessons we learned while working on this thesis.
Adaptability is key to scalability
Scalability is a major concern for future smart grids. To mitigate the scalability issue there are two traditional solutions. One is to limit the size of the system and use exact or close-to-exact solutions. The other is to use distributed algorithms; such algorithms for DSM generally sacrifice accuracy but can resolve larger system configurations. We learnt through our experience in this thesis that the key to maintaining scalability while providing accurate solutions is adaptability. If the system is dynamic in size, that is, the size of the system varies over time, then this dynamic nature can be leveraged to vary the accuracy of the system. This way we can deliver timely solutions in every scenario, and can also increase our accuracy whenever the system configuration allows.
Forecasting is an anthro-structuro-temporal phenomenon
Short term electricity load forecasting is traditionally modeled with global phenomena such as temperature, day of week, etc. This is acceptable for regional load modeling, since these variables impact the majority of the population. However, when forecasting for a single house, care should be taken in choosing the features which define the variability of the house. We found that for our system, energy consumption has an intricate relation with the social, anthropologic and structural features of the house. The relation was not visible over the entire period but rather was temporal; that is, different features showed a relationship with energy consumption at different hours of the day or days of the week. This led us to think of energy forecast modeling in a very different way, leading to STMLF.
Volatility is the biggest challenge for forecasting
Volatility is by far the biggest challenge in making an accurate forecast. In this regard we learnt that it is important to capture and model data which reduces the volatility. Attempting to forecast the extremely volatile device data would not have yielded acceptable results. However, by capturing household consumption data we were able to make a better forecast, due to the relatively lower volatility of this data. Since our forecast was more accurate, we were able to reconstruct the device consumption data through disaggregation.
Holistic picture helps in identifying key parameters
Various researchers have attempted to resolve the components of the DSM problem in isolation. However, we observed that a holistic approach is more practical. For instance, the issue of forecasting device loads is still not resolved, but looking at it holistically from the system perspective, the household forecast-disaggregate route produces much better results.
7.3 Future Work
For each of the contributions in this thesis we have a list of future directions that can be explored.
7.3.1 Forecasting
The proposed short term load forecast method is a proof of concept anthropologic and
structural data can bene�t forecasting. We have three proposed strategy to move forecasting
forward.
- First we would like to group together houses on attribute-temporal axis and then con-
struct the forecast either in the sub-groups or by weighting the attributes according to
their groupings.
- Second, we would like to explore the mis-forecasted tuples as discussed in chap. 4 sec.
6. Our conjecture is that we do not have the attributes to differentiate these regularly
mis-forecasted tuples from the rest of the data, which is why the forecast is so regularly
inaccurate. We would like to isolate these tuples and devise a strategy to infer
the missing attributes and reconstruct the forecast using these pseudo-attributes.
- Third, we would like to test the generalization quality of our system. We would like
to see if we can reconstruct attributes of a house by observing its load profile and
matching it with our labeled data.
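The second direction above, isolating regularly mis-forecasted tuples, could start with a simple filter: flag any tuple whose absolute forecast error exceeds a threshold in most of the observed periods. The sketch below is illustrative only; the function name, thresholds, and house identifiers are hypothetical.

```python
def regularly_missed(errors_by_tuple, threshold=0.2, min_rate=0.7):
    """Flag tuples whose absolute forecast error exceeds `threshold`
    in at least `min_rate` of the observed periods."""
    flagged = []
    for tuple_id, errors in errors_by_tuple.items():
        rate = sum(abs(e) > threshold for e in errors) / len(errors)
        if rate >= min_rate:
            flagged.append(tuple_id)
    return flagged

# Hypothetical per-house absolute forecast errors over five days.
errors = {
    "house_a": [0.05, 0.02, 0.08, 0.01, 0.04],   # well forecast
    "house_b": [0.35, 0.41, 0.12, 0.38, 0.44],   # regularly missed
}
print(regularly_missed(errors))  # → ['house_b']
```

The flagged subset would then be the input for inferring the missing pseudo-attributes proposed above.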
7.3.2 Load Disaggregation
Our current load disaggregation solution aims at reducing false negatives only. This has the
repercussion of a high false-positive rate. Though this does not affect the correctness of our system,
it may result in a sub-optimal solution by presenting a larger pool of demand than actually exists.
One future direction in this work is the application of algorithms that reduce the false-positive
rate without compromising the false-negative rate.
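Reducing false positives without sacrificing false negatives is, at its simplest, a threshold-tuning problem. The sketch below (synthetic detection scores and labels, hypothetical names) computes both rates for a device-on detector at several thresholds to show the trade-off:

```python
def rates(scores, labels, threshold):
    # Predict "device on" when the score exceeds the threshold,
    # then compare against the ground-truth labels.
    fp = sum(s > threshold and not y for s, y in zip(scores, labels))
    fn = sum(s <= threshold and y for s, y in zip(scores, labels))
    neg = labels.count(False)
    pos = labels.count(True)
    return fp / neg, fn / pos

# Synthetic detector scores and ground truth for eight events.
scores = [0.9, 0.8, 0.75, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, True, False, True, False, False, False]

# A permissive threshold keeps false negatives low at the cost of
# false positives; a stricter one trades the other way.
for th in (0.15, 0.5, 0.7):
    fp, fn = rates(scores, labels, th)
    print(f"threshold {th}: FP rate {fp:.2f}, FN rate {fn:.2f}")
```

The future-work goal amounts to finding detectors or features that push this curve down, i.e. lower false positives at the same (near-zero) false-negative operating point.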
7.3.3 Planning and Modeling
- Our current adaptive model has only two stages: either we construct an exact solution
or an approximate one. A more robust solution may partially transform the problem
based on its size and real-time constraints. In such a system, instead of transforming all
the decision variables from the binary domain to the frequency domain, we may adapt only
as many variables as the real-time constraints require. This would provide a spectrum of
adaptation instead of the current two-state model.
- Our current solution only looks at a small subset of contractual possibilities. We would
like to model other types of contracts as constraints as well, to provide different types
of options to the consumers.
- Our current solution does not incorporate time-of-use pricing or other financial-incentive-
based DSM strategies. We would like to extend our planning algorithm to these
strategies as well.
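The partial-transformation idea in the first item above can be sketched as a budget-driven decision: estimate solve time under a toy cost model (the per-variable costs below are hypothetical, not measured) and keep only as many variables binary as the real-time budget allows, relaxing the rest.

```python
def plan_relaxation(n_binary, budget_ms, cost_per_binary_ms=2.0,
                    cost_per_relaxed_ms=0.1):
    """Decide how many decision variables to keep binary so that the
    estimated solve time fits the real-time budget. Toy cost model:
    a binary variable is assumed ~20x costlier than a relaxed one."""
    keep = n_binary
    while keep > 0:
        est = (keep * cost_per_binary_ms
               + (n_binary - keep) * cost_per_relaxed_ms)
        if est <= budget_ms:
            break
        keep -= 1
    return keep, n_binary - keep

kept, relaxed = plan_relaxation(n_binary=1000, budget_ms=500)
print(f"keep {kept} binary, relax {relaxed}")
```

Sweeping the budget yields the "spectrum of adaptation" described above: a generous budget keeps the problem exact, a tight one degrades it gradually rather than switching wholesale to the approximate formulation.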
170
Bibliography
[Aalami et al., 2010] Aalami, H., Moghaddam, M. P., and Youse�, G. (2010). Demand
response modeling considering interruptible/curtailable loads and capacity market pro-
grams. Applied Energy, 87(1):243 � 250.
[Aamodt and Plaza, 1994] Aamodt, A. and Plaza, E. (1994). Case-based reasoning: founda-
tional issues, methodological variations, and system approaches. AI Commun., 7(1):39�59.
[Abaravicius and , 2007] Abaravicius, J. and , Sernhed, K. a. P. J. (2007). More or less
about data-analyzing load demand in residential houses. In ACEEE 2006 Summer Study,
Paci�c Grove, California (2007).
[Abdel-Aal, 2005] Abdel-Aal, R. (2005). Improving electric load forecasts using network
committees. Electric Power Systems Research, 74(1):83 � 94.
[Abdelwahed et al., 2004] Abdelwahed, S., Kandasamy, N., and Neema, S. (2004). A control-
based framework for self-managing distributed computing systems. In WOSS '04: Pro-
ceedings of the 1st ACM SIGSOFT workshop on Self-managed systems, pages 3�7, New
York, NY, USA. ACM Press.
[Abrahao et al., 2006] Abrahao, B., Almeida, V., and Almeida, J. (2006). Self-adaptive sla-
driven capacity management for internet services. In in 17th IFIP/IEEE International
Workshop on Distributed Systems: Operations and Management, DSOM.
171
[Albadi and El-Saadany, 2008] Albadi, M. and El-Saadany, E. (2008). A summary of de-
mand response in electricity markets. Electric Power Systems Research, 78(11):1989 �
1996. A summary of demand response in electricity markets.
[Alfares and Nazeeruddin, 2002] Alfares, H. K. and Nazeeruddin, M. (2002). Electric load
forecasting: Literature survey and classi�cation of methods. International Journal of
Systems Science, 33(1):23�34.
[AlFuhaid et al., 1997] AlFuhaid, A., El-Sayed, M., and Mahmoud, M. (1997). Cascaded
arti�cial neural networks for short-term load forecasting. Power Systems, IEEE Transac-
tions on, 12(4):1524 �1529.
[Alliance, 2006] Alliance, Z. (2006). Zigbee speci�cations, version 1.0 r13. ZigBee Alliance,
http://www. zigbee. org.
[AlRashidi and EL-Naggar, 2010] AlRashidi, M. and EL-Naggar, K. (2010). Long term elec-
tric load forecasting based on particle swarm optimization. Applied Energy, 87(1):320 �
326.
[Amaral et al., 2008] Amaral, L. F., Souza, R. C., and Stevenson, M. (2008). A smooth tran-
sition periodic autoregressive (stpar) model for short-term load forecasting. International
Journal of Forecasting, 24(4):603 � 615. <ce:title>Energy Forecasting</ce:title>.
[Amjady, 2001] Amjady, N. (2001). Short-term hourly load forecasting using time-series
modeling with peak load estimation capability. Power Systems, IEEE Transactions on,
16(3):498 �505.
[Amjady and Keynia, 2009] Amjady, N. and Keynia, F. (2009). Short-term load forecasting
of power systems by combination of wavelet transform and neuro-evolutionary algorithm.
Energy, 34(1):46 � 57.
172
[Amjady et al., 2010] Amjady, N., Keynia, F., and Zareipour, H. (2010). Short-term load
forecast of microgrids by a new bilevel prediction strategy. Smart Grid, IEEE Transactions
on, 1(3):286 �294.
[Ardakanian et al., 2011] Ardakanian, O., Keshav, S., and Rosenberg, C. (2011). Markovian
models for home electricity consumption.
[Ashok, 2007] Ashok, S. (2007). Optimised model for community-based hybrid energy sys-
tem. Renewable Energy, 32(7):1155 � 1164.
[Bakirtzis et al., 1995] Bakirtzis, A., Theocharis, J., Kiartzis, S., and Satsios, K. (1995).
Short term load forecasting using fuzzy neural networks. Power Systems, IEEE Transac-
tions on, 10(3):1518 �1524.
[Beal et al., 2012] Beal, J., Berliner, J., and Hunter, K. (2012). Fast precise distributed
control for energy demand management. In Self-Adaptive and Self-Organizing Systems
(SASO), 2012 IEEE Sixth International Conference on, pages 187�192. IEEE.
[Box and Jenkins, 1994] Box, G. E. P. and Jenkins, G. M. (1994). Time series analysis.
Forecasting and control. Englewood Cli�s, NJ: Prentice-Hall,.
[Breukers et al., 2011] Breukers, S., Heiskanen, E., Brohmann, B., Mourik, R., and Feenstra,
C. (2011). Connecting research to practice to improve energy demand-side management
(dsm). Energy, 36(4):2176 � 2185.
[Cappers et al., 2010] Cappers, P., Goldman, C., and Kathan, D. (2010). Demand response
in u.s. electricity markets: Empirical evidence. Energy, 35(4):1526 � 1535.
[Carpinteiro et al., 2004] Carpinteiro, O. A., Reis, A. J., and da Silva, A. P. (2004). A
hierarchical neural model in short-term load forecasting. Applied Soft Computing, 4(4):405
� 412.
173
[Chan et al., 2000] Chan, W., So, A. T., and Lai, L. (2000). Harmonics load signature
recognition by wavelets transforms. In Electric Utility Deregulation and Restructuring
and Power Technologies, 2000. Proceedings. DRPT 2000. International Conference on,
pages 666�671. IEEE.
[Chatterjee and Hadi, 1986] Chatterjee, S. and Hadi, A. S. (1986). In�uential observations,
high leverage points, and outliers in linear regression. Statistical Science, 1(3):379�393.
[Chen et al., 2004] Chen, B.-J., Chang, M.-W., and lin, C.-J. (2004). Load forecasting using
support vector machines: a study on eunite competition 2001. Power Systems, IEEE
Transactions on, 19(4):1821 � 1830.
[Chen et al., 1993] Chen, J.-L., Tsai, R., and Liang, S.-S. (1993). A distributed problem
solving system for short-term load forecasting. Electric Power Systems Research, 26(3):219
� 224.
[Chen et al., 2010] Chen, Y., Luh, P., Guan, C., Zhao, Y., Michel, L., Coolbeth, M., Fried-
land, P., and Rourke, S. (2010). Short-term load forecasting: Similar day-based wavelet
neural networks. Power Systems, IEEE Transactions on, 25(1):322 �330.
[Chen et al., 2012] Chen, Z., Wu, L., and Fu, Y. (2012). Real-time price-based demand
response management for residential appliances via stochastic optimization and robust
optimization. Smart Grid, IEEE Transactions on, PP(99):1 �9.
[Christiaanse, 1971] Christiaanse, W. (1971). Short-term load forecasting using general
exponential smoothing. Power Apparatus and Systems, IEEE Transactions on, PAS-
90(2):900 �911.
[Cole and Albicki, 1998] Cole, A. I. and Albicki, A. (1998). Data extraction for e�ective non-
intrusive identi�cation of residential power loads. In Instrumentation and Measurement
174
Technology Conference, 1998. IMTC/98. Conference Proceedings. IEEE, volume 2, pages
812�815. IEEE.
[Coll-Mayor et al., 2007] Coll-Mayor, D., Paget, M., and Lightner, E. (2007). Future intel-
ligent power grids: Analysis of the vision in the european union and the united states.
Energy Policy, 35(4):2453 � 2465.
[Cuaresma et al., 2004] Cuaresma, J. C., Hlouskova, J., Kossmeier, S., and Obersteiner,
M. (2004). Forecasting electricity spot-prices using linear univariate time-series models.
Applied Energy, 77(1):87 � 106.
[D. and Uri, 1978] D., N. and Uri (1978). Forecasting peak system load using a combined
time series and econometric model. Applied Energy, 4(3):219 � 227.
[Dai and Wang, 2007] Dai, W. and Wang, P. (2007). Application of pattern recognition
and arti�cial neural network to load forecasting in electric power system. In Natural
Computation, 2007. ICNC 2007. Third International Conference on, volume 1, pages 381
�385.
[Daoxin et al., 2012] Daoxin, L., Lingyun, L., Yingjie, C., and Ming, Z. (2012). Market
equilibrium based on renewable energy resources and demand response in energy engi-
neering. Systems Engineering Procedia, 4(0):87 � 98. <ce:title>Information Engineering
and Complexity Science - Part II</ce:title>.
[Das et al., 2010] Das, R., Kephart, J. O., Lenchner, J., and Hamann, H. (2010). Utility-
function-driven energy-e�cient cooling in data centers. In Proceedings of the 7th interna-
tional conference on Autonomic computing, ICAC '10, pages 61�70, New York, NY, USA.
ACM.
175
[Dash et al., 1998] Dash, P., Satpathy, H., and Liew, A. (1998). A real-time short-term
peak and average load forecasting system using a self-organising fuzzy neural network.
Engineering Applications of Arti�cial Intelligence, 11(2):307 � 316.
[Datchanamoorthy et al., 2011] Datchanamoorthy, S., Kumar, S., Ozturk, Y., and Lee, G.
(2011). Optimal time-of-use pricing for residential load control. In Smart Grid Commu-
nications (SmartGridComm), 2011 IEEE International Conference on, pages 375 �380.
Time of use pricing model for monopolies. On simulation no reboud.
[David et al., 2011] David, H., Fallin, C., Gorbatov, E., Hanebutte, U. R., and Mutlu, O.
(2011). Memory power management via dynamic voltage/frequency scaling. In Proceedings
of the 8th ACM international conference on Autonomic computing, ICAC '11, pages 31�40,
New York, NY, USA. ACM.
[Dehdashti et al., 1982] Dehdashti, A., Tudor, J., and Smith, M. (1982). Forecasting of
hourly load by pattern recognition a deterministic approach. Power Apparatus and Sys-
tems, IEEE Transactions on, PAS-101(9):3290 �3294.
[Deng et al., ] Deng, N., Stewart, C., Kelley, J., Gmach, D., and Arlitt, M. Adaptive green
hosting.
[Diao et al., 2003] Diao, Y., Eskesen, F., Froehlich, S., L. Hellerstein, J., Spainhower, L. F.,
and Surendra, M. (2003). Generic online optimization of multiple con�guration parameters
with application to a database server. In Distributed Systems, Operations and Management
(DSOM), volume 2867/2004 of Lecture Notes in Computer Science. Springer Berlin /
Heidelberg.
[Diongue et al., 2009] Diongue, A., Guagan, D., and Vignal, B. (2009). Forecasting elec-
tricity spot market prices with a k-factor gigarch process. Applied Energy, 86(4):505 �
510.
176
[Du and Lu, 2011a] Du, P. and Lu, N. (2011a). Appliance commitment for household load
scheduling. Smart Grid, IEEE Transactions on, 2(2):411 �419.
[Du and Lu, 2011b] Du, P. and Lu, N. (2011b). Appliance commitment for household load
scheduling. Smart Grid, IEEE Transactions on, 2(2):411 �419.
[El Desouky and Elkateb, 2000] El Desouky, A. and Elkateb, M. (2000). Hybrid adaptive
techniques for electric-load forecast using ann and arima. Generation, Transmission and
Distribution, IEE Proceedings-, 147(4):213 �217.
[Escrivá-Escrivá et al., 2010] Escrivá-Escrivá, G., Segura-Heras, I., and Alcázar-Ortega, M.
(2010). Application of an energy management and control system to assess the potential
of di�erent control strategies in hvac systems. Energy and Buildings, 42(11):2258 � 2267.
[Fan and Chen, 2006] Fan, S. and Chen, L. (2006). Short-term load forecasting based on an
adaptive hybrid method. Power Systems, IEEE Transactions on, 21(1):392 � 401.
[Fan, 2011] Fan, Z. (2011). Distributed demand response and user adaptation in smart grids.
In Integrated Network Management (IM), 2011 IFIP/IEEE International Symposium on,
pages 726 �729. Internet tra�c method for DR. Modeling user preference as willingness
to pay. Simulation based. no rebound calculations.
[Faria and Vale, 2011] Faria, P. and Vale, Z. (2011). Demand response in electrical energy
supply: An optimal real time pricing approach. Energy, 36(8):5374 � 5384. demand
response simulator that allows studying demand response actions and schemes in distri-
bution networks.
[Farinaccio and Zmeureanu, 1999] Farinaccio, L. and Zmeureanu, R. (1999). Using a pattern
recognition approach to disaggregate the total electricity consumption in a house into the
major end-uses. Energy and Buildings, 30(3):245�259.
177
[Femal and Freeh, 2005] Femal, M. and Freeh, V. (13-16 June 2005). Boosting data center
performance through non-uniform power allocation. Autonomic Computing, 2005. ICAC
2005. Proceedings. Second International Conference on, pages 250�261.
[Finn et al., 2012] Finn, P., OConnell, M., and Fitzpatrick, C. (2012). Demand side man-
agement of a domestic dishwasher: Wind energy gains, �nancial savings and peak-time
load reduction. Applied Energy, (0):�.
[Fuller et al., 2011] Fuller, J., Schneider, K., and Chassin, D. (2011). Analysis of residential
demand response and double-auction markets. In Power and Energy Society General
Meeting, 2011 IEEE, pages 1 �7.
[Galus and Andersson, 2008] Galus, M. and Andersson, G. (2008). Demand management
of grid connected plug-in hybrid electric vehicles (phev). Energy 2030 Conference, 2008.
ENERGY 2008. IEEE, pages 1�8.
[Garcia et al., 2005] Garcia, R., Contreras, J., van Akkeren, M., and Garcia, J. (2005). A
garch forecasting model to predict day-ahead electricity prices. Power Systems, IEEE
Transactions on, 20(2):867 � 874.
[Gellings, 1985] Gellings, C. (1985). The concept of demand-side management for electric
utilities. Proceedings of the IEEE, 73(10):1468 � 1470.
[Giorgio and Pimpinella, 2012] Giorgio, A. D. and Pimpinella, L. (2012). An event driven
smart home controller enabling consumer economic saving and automated demand side
management. Applied Energy, 96(0):92 � 103. <ce:title>Smart Grids</ce:title>.
[Goldsby and Cheng, 2008] Goldsby, H. J. and Cheng, B. H. (2008). Automatically gen-
erating behavioral models of adaptive systems to address uncertainty. In Model Driven
Engineering Languages and Systems, pages 568�583. Springer.
178
[Greening, 2010] Greening, L. A. (2010). Demand response resources: Who is responsible
for implementation in a deregulated market? Energy, 35(4):1518 � 1525. wider implemen-
tation will need to accrue from coordinated actions along the electricity supply chain.
[Gross and Galiana, 1987] Gross, G. and Galiana, F. (1987). Short-term load forecasting.
Proceedings of the IEEE, 75(12):1558 � 1573.
[Guan et al., 2010] Guan, X., Xu, Z., and Jia, Q.-S. (2010). Energy-e�cient buildings fa-
cilitated by microgrid. Smart Grid, IEEE Transactions on, 1(3):243 �252. Coordinate
energy sources and loads for low energy building.
[Gudi et al., 2011] Gudi, N., Wang, L., Devabhaktuni, V., and Depuru, S. (2011). A
demand-side management simulation platform incorporating optimal management of dis-
tributed renewable resources. In Power Systems Conference and Exposition (PSCE), 2011
IEEE/PES, pages 1 �7.
[Gupta et al., 2010] Gupta, S., Reynolds, M. S., and Patel, S. N. (2010). Electrisense: single-
point sensing using emi for electrical event detection and classi�cation in the home. In
Proceedings of the 12th ACM international conference on Ubiquitous computing, pages
139�148. ACM.
[Gurguis and Zeid, 2005a] Gurguis, S. A. and Zeid, A. (2005a). Towards autonomic web
services: achieving self-healing using web services. SIGSOFT Softw. Eng. Notes, 30(4):1�
5.
[Gurguis and Zeid, 2005b] Gurguis, S. A. and Zeid, A. (2005b). Towards autonomic web
services: achieving self-healing using web services. SIGSOFT Softw. Eng. Notes, 30(4):1�
5.
[Hagan and Behr, 1987] Hagan, M. T. and Behr, S. M. (1987). The time series approach to
short term load forecasting. Power Systems, IEEE Transactions on, 2(3):785 �791.
179
[Hart, 1992] Hart, G. W. (1992). Nonintrusive appliance load monitoring. Proceedings of
the IEEE, 80(12):1870�1891.
[Hassan et al., 2013] Hassan, T., Javed, F., and Arshad, N. (2013). An empirical investi-
gation of vi trajectory based load signatures for non-intrusive load monitoring. arXiv
preprint arXiv:1305.0596.
[He et al., 2006] He, Y., Zhu, Y., and Duan, D. (2006). Research on hybrid arima and
support vector machine model in short term load forecasting. In Intelligent Systems
Design and Applications, 2006. ISDA '06. Sixth International Conference on, volume 1,
pages 804 �809.
[Heyer et al., 1999] Heyer, L. J., Kruglyak, S., and Yooseph, S. (1999). Exploring expression
data: Identi�cation and analysis of coexpressed genes. Genome Res., 9(11):1106�1115.
[Hillier and Lieberman, 2001] Hillier, F. and Lieberman, G. (2001). Introduction to opera-
tions research. McGraw-Hill.
[Hippert et al., 2001] Hippert, H., Pedreira, C., and Souza, R. (2001). Neural networks for
short-term load forecasting: a review and evaluation. Power Systems, IEEE Transactions
on, 16(1):44 �55.
[Hippert and Taylor, 2010] Hippert, H. S. and Taylor, J. W. (2010). An evaluation of
bayesian techniques for controlling model complexity and selecting inputs in a neural
network for short-term load forecasting. Neural Networks, 23(3):386 � 395.
[Hoelzl et al., 2012] Hoelzl, G., Kurz, M., Halbmayer, P., Erhart, J., Matscheko, M., Ferscha,
A., Eisl, S., and Kaltenleithner, J. (2012). Locomotion@ location: When the rubber hits
the road.
180
[Ilyas et al., arch] Ilyas, M., Raza, S., Chen, C.-C., Uzmi, Z., and Chuah, C.-N. (March).
Red-bl: Energy solution for loading data centers. In INFOCOM, 2012 Proceedings IEEE,
pages 2866�2870.
[Irisarri et al., 1982] Irisarri, G., Widergren, S., and Yehsakul, P. (1982). On-line load fore-
casting for energy control center application. Power Apparatus and Systems, IEEE Trans-
actions on, PAS-101(1):71 �78.
[Izquierdo et al., 2008] Izquierdo, M. D. Z., Jiménez, J. J. S., and del Sol, A. M. (2008).
Matlab software to determine the saving in parallel pumps optimal operation systems, by
using variable speed. In Energy 2030 Conference, 2008. ENERGY 2008. IEEE, pages 1�8.
IEEE.
[Jabr et al., 2000] Jabr, R., Coonick, A., and Cory, B. (2000). A homogeneous linear pro-
gramming algorithm for the security constrained economic dispatch problem. Power Sys-
tems, IEEE Transactions on, 15(3):930�936.
[Jain and Satish, 2009] Jain, A. and Satish, B. (2009). Clustering based short term load
forecasting using arti�cial neural network. In Power Systems Conference and Exposition,
2009. PSCE '09. IEEE/PES, pages 1 �7.
[Javed and Arshad, 2008] Javed, F. and Arshad, N. (2008). On the use of linear program-
ming in optimizing energy costs. In IWSOS, pages 305�310.
[Javed and Arshad, 2009a] Javed, F. and Arshad, N. (2009a). Adopt: An adaptive optimiza-
tion framework for large-scale power distribution systems. In Proceedings of the 2009 Third
IEEE International Conference on Self-Adaptive and Self-Organizing Systems, SASO '09,
pages 254�264, Washington, DC, USA. IEEE Computer Society.
[Javed and Arshad, 2009b] Javed, F. and Arshad, N. (2009b). A penny saved is a penny
earned: Applying optimization techniques to power management. In 16th IEEE Interna-
181
tional Conference on the Engineering of Computer-Based Systems (ECBS 2009), 13-16
April 2009, San Francisco, CA, USA.
[Javed et al., 2012] Javed, F., Arshad, N., Wallin, F., Vassileva, I., and Dahlquist, E. (2012).
Forecasting for demand response in smart grids: An analysis on use of anthropologic and
structural data and short term multiple loads forecasting. Applied Energy, 96(0):150 �
160. <ce:title>Smart Grids</ce:title>.
[Jiang and Fei, 2011] Jiang, B. and Fei, Y. (2011). Dynamic residential demand response and
distributed generation management in smart microgrid with hierarchical agents. Energy
Procedia, 12(0):76 � 90. <ce:title>The Proceedings of International Conference on Smart
Grid and Clean Energy Technologies (ICSGCE 2011</ce:title>.
[Kim and Shcherbakova, 2011] Kim, J.-H. and Shcherbakova, A. (2011). Common failures
of demand response. Energy, 36(2):873 � 880.
[Kim and Poor, 2011] Kim, T. and Poor, H. (2011). Scheduling power consumption with
price uncertainty. Smart Grid, IEEE Transactions on, 2(3):519 �527.
[Koller et al., 2010] Koller, R., Verma, A., and Neogi, A. (2010). Wattapp: an applica-
tion aware power meter for shared data centers. In Proceedings of the 7th international
conference on Autonomic computing, ICAC '10, pages 31�40, New York, NY, USA. ACM.
[Kolter and Johnson, 2011] Kolter, J. Z. and Johnson, M. J. (2011). Redd: A public data
set for energy disaggregation research. In proceedings of the SustKDD workshop on Data
Mining Applications in Sustainability, pages 1�6.
[Kuhn and Verwaest, 2008] Kuhn, A. and Verwaest, T. (2008). Fame, a polyglot library for
metamodeling at runtime. Models @ Runtime 2008, pages 57�66.
182
[Kwag and Kim., 2012] Kwag, H.-G. and Kim., J.-O. (2012). Optimal combined scheduling
of generation and demand response with demand resource constraints. Applied Energy,
(0):�.
[Lai et al., 2010] Lai, H. W., Fung, G., Lam, H., and Lee, W. (2010). Disaggregate loads by
particle swarm optimization method for non-intrusive load monitoring. In International
Conference on Electrical Engineering, July 2007.
[Lauret et al., 2012] Lauret, P., David, M., and Calogine, D. (2012). Nonlinear models
for short-time load forecasting. Energy Procedia, 14(0):1404 � 1409. Gaussean process
regression and NN for STLF.
[Lauret et al., 2008] Lauret, P., Fock, E., Randrianarivony, R. N., and Manicom-Ramsamy,
J.-F. (2008). Bayesian neural network approach to short time load forecasting. Energy
Conversion and Management, 49(5):1156 � 1166.
[Lee and Lee, 2011] Lee, J.-W. and Lee, D.-H. (2011). Residential electricity load schedul-
ing for multi-class appliances with time-of-use pricing. In GLOBECOM Workshops (GC
Wkshps), 2011 IEEE, pages 1194 �1198. Scheduling algorithm using time of use to reduce
cost. No rebound in simulation.
[Leeb et al., 1995] Leeb, S. B., Shaw, S. R., and Kirtley Jr, J. L. (1995). Transient event
detection in spectral envelope estimates for nonintrusive load monitoring. Power Delivery,
IEEE Transactions on, 10(3):1200�1210.
[Lefurgy et al., 2007] Lefurgy, C., Wang, X., and Ware, M. (11-15 June 2007). Server-level
power control. Autonomic Computing, 2007. ICAC '07. Fourth International Conference
on, pages 4�4.
[Leite et al., 2010] Leite, J. C., Kusic, D. M., Mossé, D., and Bertini, L. (2010). Stochastic
approximation control of power and tardiness in a three-tier web-hosting cluster. In Pro-
183
ceedings of the 7th international conference on Autonomic computing, ICAC '10, pages
41�50, New York, NY, USA. ACM.
[Liang et al., 2010] Liang, J., Ng, S., Kendall, G., and Cheng, J. (2010). Load signature
study part ii: Disaggregation framework, simulation, and applications. Power Delivery,
IEEE Transactions on, 25(2):561 �569.
[Liaqat et al., 2012] Liaqat, M. D., Javed, F., and Arshad, N. (2012). Towards a self-
managing tool for optimizing energy usage in buildings. In SEB '2012: Proceedings of
the International Conference on Sustainability in Energy and Buildings, Stockholms Swe-
den.
[Lin et al., 2010] Lin, W.-M., Gow, H.-J., and Tsai, M.-T. (2010). An enhanced radial basis
function network for short-term electricity price forecasting. Applied Energy, 87(10):3226
� 3234.
[Lisovich and Wicker., 2008] Lisovich, M. and Wicker., S. (2008). Privacy concerns in up-
coming residential and commercial demand-response systems. In 2008 Clemson University
Power Systems Conference., Clemson University,.
[Livengood and Larson, 2009] Livengood, D. and Larson, R. (2009). Energy box: locally
automated optimal control of residential electricity usage. Service Science, 1(1):1 �16.
[Lu et al., 2004] Lu, J.-C., Niu, D.-X., and Jia, Z.-Y. (2004). A study of short-term load
forecasting based on arima-ann. In Machine Learning and Cybernetics, 2004. Proceedings
of 2004 International Conference on, volume 5, pages 3183 � 3187 vol.5.
[Marceau and Zmeureanu, 2000] Marceau, M. and Zmeureanu, R. (2000). Nonintrusive load
disaggregation computer program to estimate the energy consumption of major end uses
in residential buildings. Energy Conversion and Management, 41(13):1389 � 1403.
184
[Mastorocostas et al., 2000] Mastorocostas, P., Theocharis, J., Kiartzis, S., and Bakirtzis,
A. (2000). A hybrid fuzzy modeling method for short-term load forecasting. Mathematics
and Computers in Simulation, 51:221 � 232.
[Mehrotra, 1992] Mehrotra, S. (1992). On the implementation of a primal-dual interior point
method. SIAM Journal on Optimization, 2(4):575�601.
[Moghaddam et al., 2011] Moghaddam, M. P., Abdollahi, A., and Rashidinejad, M. (2011).
Flexible demand response programs modeling in competitive electricity markets. Applied
Energy, 88(9):3257 � 3269.
[Mohandes, 2002] Mohandes, M. (2002). Support vector machines for short-term electrical
load forecasting. International Journal of Energy Research, 26(4):335�345.
[Mohsenian-Rad et al., 2010] Mohsenian-Rad, A., Wong, V., Jatskevich, J., Schober, R., and
Leon-Garcia, A. (2010). Autonomous demand-side management based on game-theoretic
energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions
on, 1(3):320 �331.
[Molderink et al., 2010] Molderink, A., Bakker, V., Bosman, M., Hurink, J., and Smit, G.
(2010). Management and control of domestic smart grid technology. Smart Grid, IEEE
Transactions on, 1(2):109 �119.
[Moslehi and Kumar, 2010] Moslehi, K. and Kumar, R. (2010). Smart grid - a reliability
perspective. pages 1 �8.
[Nguyen and Nabney, 2010] Nguyen, H. T. and Nabney, I. T. (2010). Short-term electricity
demand and gas price forecasts using wavelet transforms and adaptive models. Energy,
35(9):3674 � 3685.
185
[Nie et al., 2012] Nie, H., Liu, G., Liu, X., and Wang, Y. (2012). Hybrid of arima and svms
for short-term load forecasting. Energy Procedia, 16, Part C(0):1455 � 1460. <ce:title>2012
International Conference on Future Energy, Environment, and Materials</ce:title>.
[Norford and Leeb, 1996] Norford, L. K. and Leeb, S. B. (1996). Non-intrusive electrical load
monitoring in commercial buildings based on steady-state and transient load-detection
algorithms. Energy and Buildings, 24(1):51�64.
[of Finance Pakistan., 2009] of Finance Pakistan., M. (2009). Economic survey of pakistan.
Economic survey of Pakistan.
[Ogston et al., 2007] Ogston, E., Zeman, A., Prokopenko, M., and James, G. (2007). Clus-
tering distributed energy resources for large-scale demand management. In Self-Adaptive
and Self-Organizing Systems, 2007. SASO'07. First International Conference on, pages
97�108. IEEE.
[Omer et al., 2010] Omer, A., Javed, F., and Arshad, N. (2010). A case study of imple-
menting a localized smart grid in developing countries. In ICAE '2010: Proceedings of the
Second International Conference on Applied Energy, Singapore.
[Ozturk, 2010] Ozturk, I. (2010). A literature survey on energy�growth nexus. Energy
Policy, 38(1):340�349.
[Pai and Hong, 2005] Pai, P.-F. and Hong, W.-C. (2005). Support vector machines with
simulated annealing algorithms in electricity load forecasting. Energy Conversion and
Management, 46(17):2669 � 2688.
[Papalexopoulos and Hesterberg, 1990] Papalexopoulos, A. and Hesterberg, T. (1990). A
regression-based approach to short-term system load forecasting. Power Systems, IEEE
Transactions on, 5(4):1535 �1547.
186
[Pedrasa et al., 2010] Pedrasa, M., Spooner, T., and MacGill, I. (2010). Coordinated
scheduling of residential distributed energy resources to optimize smart home energy ser-
vices. Smart Grid, IEEE Transactions on, 1(2):134 �143.
[Pickering et al., 2009] Pickering, B., Robert, S., M?noret, S., and Mengusoglu, E. (2009).
Model-driven management of complex systems. Technical Report COMP COMP-005-2008
Lancaster University, page 117.
[Powers et al., 1991] Powers, J., Margossian, B., and Smith, B. (1991). Using a rule-based
algorithm to disaggregate end-use load pro�les from premise-level data. Computer Appli-
cations in Power, IEEE, 4(2):42�47.
[Rahimi and Ipakchi, 2010] Rahimi, F. and Ipakchi, A. (2010). Demand response as a market
resource under the smart grid paradigm. Smart Grid, IEEE Transactions on, 1(1):82 �88.
Introductory case building paper.
[Rahman and Bhatnagar, 1988] Rahman, S. and Bhatnagar, R. (1988). An expert system
based algorithm for short term load forecast. Power Systems, IEEE Transactions on,
3(2):392 �399.
[Ramchurn et al., 2011] Ramchurn, S., Vytelingum, P., Rogers, A., and Jennings, N. (2011).
Agent-based control for decentralised demand side management in the smart grid. In The
Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS
2011), pages 5�12.
[Ranade and Beal, 2010] Ranade, V. V. and Beal, J. (2010). Distributed control for small
customer energy demand management. In SASO'10, pages 11�20.
[Rastegar et al., 2012] Rastegar, M., Fotuhi-Firuzabad, M., and Aminifar, F. (2012). Load
commitment in a smart home. Applied Energy, 96(0):45 � 54.
187
[Riedmiller and Braun, 1993] Riedmiller, M. and Braun, H. (1993). A direct adaptive
method for faster backpropagation learning: the rprop algorithm. In Neural Networks,
1993., IEEE International Conference on, pages 586 �591 vol.1.
[Romero, 2012] Romero, J. J. (2012). Blackouts illuminate india's power problems. Spec-
trum, IEEE, 49(10):11�12.
[Saele and Grande, 2011] Saele, H. and Grande, O. (2011). Demand response from household
customers: Experiences from a pilot study in norway. Smart Grid, IEEE Transactions
on, 2(1):102 �109. Manual DR in Norway showing reduction in energy at peak times.
[Salahi et al., 2007] Salahi, M., Peng, J., and Terlaky, T. (2007). On mehrotra-type
predictor-corrector algorithms. SIAM J. on Optimization, 18(4):1377�1397.
[Shao et al., 2012] Shao, S., Pipattanasomporn, M., and Rahman, S. (2012). Grid integra-
tion of electric vehicles and demand response with customer choice. Smart Grid, IEEE
Transactions on, 3(1):543 �550.
[Shen et al., 2011] Shen, Z., Subbiah, S., Gu, X., and Wilkes, J. (2011). Cloudscale: elastic
resource scaling for multi-tenant cloud systems. In Proceedings of the 2nd ACM Symposium
on Cloud Computing, SOCC '11, pages 5:1�5:14, New York, NY, USA. ACM.
[Sheth and Parvatiyar, 1995] Sheth, J. and Parvatiyar, A. (1995). Relationship marketing in
consumer markets: Antecedents and consequences. Journal of the Academy of Marketing
Science, 23:255�271. 10.1177/009207039502300405.
[Srinivasan et al., 2006] Srinivasan, D., Ng, W., and Liew, A. (2006). Neural-network-based
signature recognition for harmonic source identi�cation. Power Delivery, IEEE Transac-
tions on, 21(1):398�405.
188
[Strbac, 2008] Strbac, G. (2008). Demand side management: Bene�ts and challenges. Energy
Policy, 36(12):4419 � 4426. major bene�ts and challenges of electricity demandsideman-
agement (DSM) are discussed in the context of the UK electricity system.
[Sun and Zou, 2007] Sun, W. and Zou, Y. (2007). Short term load forecasting based on bp
neural network trained by pso. In Machine Learning and Cybernetics, 2007 International
Conference on, volume 5, pages 2863 �2868.
[Tan et al., 2010] Tan, Z., Zhang, J., Wang, J., and Xu, J. (2010). Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Applied Energy, 87(11):3606–3610.
[Tarzia et al., 2010] Tarzia, S. P., Dinda, P. A., Dick, R. P., and Memik, G. (2010). Display power management policies in practice. In Proceedings of the 7th International Conference on Autonomic Computing, ICAC '10, pages 51–60, New York, NY, USA. ACM.
[Valenzuela et al., 2012] Valenzuela, J., Thimmapuram, P. R., and Kim, J. (2012). Modeling and simulation of consumer response to dynamic pricing with enabled technologies. Applied Energy, 96(0):122–132.
[Vasic et al., 2010] Vasic, N., Scherer, T., and Schott, W. (2010). Thermal-aware workload scheduling for energy efficient data centers. In Proceedings of the 7th International Conference on Autonomic Computing, ICAC '10, pages 169–174, New York, NY, USA. ACM.
[Venkatesan et al., 2012] Venkatesan, N., Solanki, J., and Solanki, S. K. (2012). Residential demand response model and impact on voltage profile and losses of an electric distribution network. Applied Energy.
[Walawalkar et al., 2010] Walawalkar, R., Fernands, S., Thakur, N., and Chevva, K. R. (2010). Evolution and current status of demand response (DR) in electricity markets: Insights from PJM and NYISO. Energy, 35(4):1553–1560.
[Wallace and Kuhn, 2000] Wallace, D. R. and Kuhn, D. R. (2000). Converting system failure histories into future win situations. NIST.
[Wallin et al., 2005] Wallin, F., Bartusch, C., Thorin, E., Bäckström, T., and Dahlquist, E. (2005). The use of automatic meter readings for a demand-based tariff. pages 1–6.
[Wang et al., 2011] Wang, J., Botterud, A., Bessa, R., Keko, H., Carvalho, L., Issicaba, D., Sumaili, J., and Miranda, V. (2011). Wind power forecasting uncertainty and unit commitment. Applied Energy, 88(11):4014–4023.
[Wang et al., 2006] Wang, M., Kandasamy, N., Guez, A., and Kam, M. (2006). Adaptive performance control of computing systems via distributed cooperative control: Application to power management in computing clusters. Autonomic Computing, 2006. ICAC '06. IEEE International Conference on, pages 165–174.
[Wang et al., 2002] Wang, X., Song, Y.-H., and Lu, Q. (2002). A coordinated real-time optimal dispatch method for unbundled electricity markets. Power Systems, IEEE Transactions on, 17(2):482–490.
[Weron, 2006] Weron, R. (2006). Modeling and Forecasting Electricity Loads and Prices: A
Statistical Approach (The Wiley Finance Series). Wiley.
[Wilhite et al., 2000] Wilhite, H., Shove, E., Lutzenhiser, L., and Kempton, W. (2000). Twenty years of energy demand management: we know more about individual behavior but how much do we really know about demand? In 2000 Summer Study Proceedings of the American Council for an Energy-Efficient Economy, Washington, DC, pages 8435–8453.
[Xu et al., 2010] Xu, Y., Xie, L., and Singh, C. (2010). Optimal scheduling and operation of load aggregator with electric energy storage in power markets. In North American Power Symposium (NAPS), 2010, pages 1–7. IEEE.
[Yan et al., 2013] Yan, Y., Qian, Y., Sharif, H., and Tipper, D. (2013). A survey on smart grid communication infrastructures: Motivations, requirements and challenges. Communications Surveys Tutorials, IEEE, 15(1):5–20.
[Yang and Huang, 1998] Yang, H.-T. and Huang, C.-M. (1998). A new short-term load forecasting approach using self-organizing fuzzy ARMAX models. Power Systems, IEEE Transactions on, 13(1):217–225.
[Yao et al., 2000] Yao, S., Song, Y., Zhang, L., and Cheng, X. (2000). Wavelet transform and neural networks for short-term electrical load forecasting. Energy Conversion and Management, 41(18):1975–1988.
[Yuan et al., 2011] Yuan, L., Lu, G., Zhan, J., Wang, H., and Wang, L. (2011). PowerTracer: Tracing requests in multi-tier services to diagnose energy inefficiency.
[Yun et al., 2008] Yun, Z., Quan, Z., Caixin, S., Shaolan, L., Yuming, L., and Yang, S. (2008). RBF neural network and ANFIS-based short-term load forecasting approach in real-time price environment. Power Systems, IEEE Transactions on, 23(3):853–858.
[Zeifman and Roth, 2011] Zeifman, M. and Roth, K. (2011). Nonintrusive appliance load monitoring: Review and outlook. Consumer Electronics, IEEE Transactions on, 57(1):76–84.
[Zhang, 2005] Zhang, M.-G. (2005). Short-term load forecasting based on support vector machines regression. In Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on, volume 7, pages 4310–4314, vol. 7.
[Zhang et al., 2012] Zhang, Q., Zhani, M. F., Zhu, Q., Zhang, S., Boutaba, R., and Hellerstein, J. (2012). Dynamic energy-aware capacity provisioning for cloud computing environments. In Proceedings IEEE/ACM International Conference on Autonomic Computing (ICAC).
[Zhang, 1997] Zhang, Y. (1997). Solving large-scale linear programs by interior-point methods under the MATLAB environment.
[Zhu et al., 2008] Zhu, X., Young, D., Watson, B., Wang, Z., Rolia, J., Singhal, S., McKee, B., Hyser, C., Gmach, D., Gardner, R., Christian, T., and Cherkasova, L. (2008). 1000 islands: Integrated capacity and workload management for the next generation data center. pages 172–181.